Tuesday, May 15, 2012

Interview questions on Lookup transformatioin?

Q. What is a lookup transformation?
Ans: The Lookup transformation is passive transformation. Lookup is a transformation to look up data in a flat file or a relational table, view, or synonym.
There are two types of Lookups in Powercenter Designer, namely:
  1. Connected Lookup
  2. Unconnected Lookup .

Q. What are the tasks of a lookup transformation?
Ans: We can use the Lookup transformation to perform following:

  • Get a related value: EMP has DEPTNO but DNAME is not there. We use Lookup to get DNAME from DEPT table based on Lookup Condition.
  • Perform a calculation: We want only those Employees who’s SAL > Average (SAL). We will write Lookup Override query.
  • Update slowly changing dimension tables: Most important use. We can use a Lookup transformation to determine whether rows already exist in the target.


Q. How do you configure a lookup transformation?
Ans: Configure the lookup transformation to perform the following types of lookups:

  • Relational or flat file lookup
  • Pipeline lookup
  • Connected or unconnected lookup
  • Cached or uncached lookup


Q. What is a pipeline lookup transformation?
Ans: A pipeline lookup transformation is used to perform lookup on application sources such as JMS, MSMQ or SAP. A pipeline lookup transformation has a source qualifier as the lookups source.

Q. What is connected and unconnected lookup transformation?
Ans: A connected lookup transformation is connected the transformations in the mapping pipeline. It receives source data, performs a lookup and returns data to the pipeline.

An unconnected Lookup transformation is separate from the pipeline in the mapping. We write an expression using the :LKP reference qualifier to call the lookup within another transformation. Unconnected lookup does not participate in mapping data flow. In fact it is not connected to the other transformation so Informatica server creates a separate cache for unconnected and processing takes place in parallel. That’s why performance increases.

The major advantage of unconnected lookup is its reusability. We can call an unconnected lookup multiple times in the mapping unlike connected lookup.

Q. What are the differences between connected and unconnected lookup transformation?

  1. Connected lookup transformation receives input values directly from the pipeline. Unconnected lookup transformation receives input values from the result of a :LKP expression in another transformation.
  2. Connected lookup transformation can be configured as dynamic or static cache. Unconnected lookup transformation can be configured only as static cache.
  3. Connected lookup transformation can return multiple columns from the same row or insert into the dynamic lookup cache. Unconnected lookup transformation can return one column from each row.
  4. If there is no match for the lookup condition, connected lookup transformation returns default value for all output ports. If you configure dynamic caching, the Integration Service inserts rows into the cache or leaves it unchanged. If there is no match for the lookup condition, the unconnected lookup transformation returns null.
  5. In a connected lookup transformation, the cache includes the lookup source columns in the lookup condition and the lookup source columns that are output ports. In an unconnected lookup transformation, the cache includes all lookup/output ports in the lookup condition and the lookup/return port.
  6. Connected lookup transformation passes multiple output values to another transformation. Unconnected lookup transformation passes one output value to another transformation.
  7. Connected lookup transformation supports user-defined values. Unconnected lookup transformation does not support user-defined default values.


Q. How do you handle multiple matches in lookup transformation? or what is "Lookup Policy on Multiple Match"?
Ans: "Lookup Policy on Multiple Match" option is used to determine which rows that the lookup transformation returns when it finds multiple rows that match the lookup condition. You can select lookup to return first or last row or any matching row or to report an error.

Q. What is "Output Old Value on Update"?
Ans: This option is used when dynamic cache is enabled. When this option is enabled, the integration service outputs old values out of the lookup/output ports. When the Integration Service updates a row in the cache, it outputs the value that existed in the lookup cache before it updated the row based on the input data. When the Integration Service inserts a new row in the cache, it outputs null values. When you disable this property, the Integration Service outputs the same values out of the lookup/output and input/output ports.

Q. What is "Insert Else Update" and "Update Else Insert"?
Ans: These options are used when dynamic cache is enabled.

  • Insert Else Update option applies to rows entering the lookup transformation with the row type of insert. When this option is enabled the integration service inserts new rows in the cache and updates existing rows when disabled, the Integration Service does not update existing rows.
  • Update Else Insert option applies to rows entering the lookup transformation with the row type of update. When this option is enabled, the Integration Service updates existing rows, and inserts a new row if it is new. When disabled, the Integration Service does not insert new rows.


Q. What are the options available to configure a lookup cache?
Ans: The following options can be used to configure a lookup cache:

  • Static cache
  • Dynamic cache
  • Persistent cache
  • Recache
  • Shared Cache


Q. What is a cached lookup transformation and uncached lookup transformation?
Ans: 

  • Cached lookup transformation: The Integration Service builds a cache in memory when it processes the first row of data in a cached Lookup transformation. The Integration Service stores condition values in the index cache and output values in the data cache. The Integration Service queries the cache for each row that enters the transformation.
  • Uncached lookup transformation: For each row that enters the lookup transformation, the Integration Service queries the lookup source and returns a value. The integration service does not build a cache.


Q. How the integration service builds the caches for connected lookup transformation?
Ans: The Integration Service builds the lookup caches for connected lookup transformation in the following ways:

  • Sequential cache: The Integration Service builds lookup caches sequentially. The Integration Service builds the cache in memory when it processes the first row of the data in a cached lookup transformation.
  • Concurrent caches: The Integration Service builds lookup caches concurrently. It does not need to wait for data to reach the Lookup transformation.


Q. How the integration service builds the caches for unconnected lookup transformation?
Ans: The Integration Service builds caches for unconnected Lookup transformations as sequentially.

Q. What is a dynamic cache?
Ans: To cache a target table or flat file source and insert new rows or update existing rows in the cache, use a Lookup transformation with a dynamic cache. The IS dynamically inserts or updates data in the lookup cache and passes data to the target. Target table is also our lookup table. No good for performance if table is huge.

Q. When you use a dynamic cache, do you need to associate each lookup port with the input port?
Ans: Yes. You need to associate each lookup/output port with the input/output port or a sequence ID. The Integration Service uses the data in the associated port to insert or update rows in the lookup cache.

Q. What are the different values returned by NewLookupRow port?
Ans: The different values are

  • 0 - Integration Service does not update or insert the row in the cache.
  • 1 - Integration Service inserts the row into the cache.
  • 2 - Integration Service updates the row in the cache.


Q. What is a persistent cache?
Ans: If the lookup table does not change between sessions, we can configure the Lookup transformation to use a persistent lookup cache. The IS saves and reuses cache files from session to session, eliminating the time required to read the lookup table.

Q. What is Recache?
Ans: If the persistent cache is not synchronized with the lookup table, we can configure the Lookup transformation to rebuild the lookup cache. If Lookup table has changed, we can use this to rebuild the lookup cache.

Q. What is a shared cache?
Ans: You can configure multiple Lookup transformations in a mapping to share a single lookup cache. The Integration Service builds the cache when it processes the first Lookup transformation. It uses the same cache to perform lookups for subsequent Lookup transformations that share the cache. There are two types of Shared Cache. They are...

  1. Unnamed cache: When Lookup transformations in a mapping have compatible caching structures, the IS shares the cache by default. You can only share static unnamed caches.
  2. Named cache: Use a persistent named cache when we want to share a cache file across mappings or share a dynamic and a static cache. The caching structures must match or be compatible with a named cache. You can share static and dynamic named caches.


Q. How do you improve the performance of lookup transformation?
Ans:

  • Create an index on the columns used in the lookup condition
  • Place conditions with equality operator first
  • Cache small lookup tables.
  • Join tables in the database: If the source and the lookup table are in the same database, join the tables in the database rather than using a lookup transformation.
  • Use persistent cache for static lookups.
  • Avoid ORDER BY on all columns in the lookup source. Specify explicitly the ORDER By clause on the required columns.
  • For flat file lookups, provide Sorted files as lookup source.

6 comments:

Related Posts Plugin for WordPress, Blogger...