Saturday, April 28, 2012

Aggregator transformation

Aggregator transformation is a connected and active transformation.

The Aggregator transformation allows us to perform aggregate calculations, such as averages and sums. Aggregator transformation allows us to perform calculations on groups.

The Aggregator transformation is created with the following components...
  1. Aggregate expression
  2. Group by port
  3. Sorted Input
  4. Aggregate cache

Aggregate expression:
  • Entered in an output port.
  • Can include non-aggregate expressions and conditional clauses.
  • The transformation language includes the following aggregate functions:
  • AVG, COUNT , MAX, MIN, SUM
  • FIRST, LAST
  • MEDIAN, PERCENTILE, STDDEV, VARIANCE
  • Single Level Aggregate Function. E.g.: MAX(SAL)
  • Nested Aggregate Function. E.g.: MAX( COUNT( ITEM ))
Group By Ports:
  • This component defines the groups for a specific port which participates for aggregations, rather than performing the aggregation across all input data. For example, we can find Maximum Salary for every Department.
  • When grouping data, the Aggregator transformation outputs the last row of each group unless otherwise specified.
  • In Aggregator Transformation, Open Ports tab and select Group By as needed.

Using Sorted Input:
  • Use to improve session performance.
  • To use sorted input, we must pass data to the Aggregator transformation sorted by group by port, either in ascending or descending order.
  • When we use this option, we tell Aggregator that data coming to it is already sorted.
  • We check the Sorted Input Option in Properties Tab of the transformation.
  • If the option is checked but we are not passing sorted data to the transformation, then the session fails.

Aggregator Caches:
  • The PowerCenter Server stores data in the aggregate cache until it completes aggregate calculations.
  • It stores group values in an index cache and row data in the data cache. If the PowerCenter Server requires more space, it stores overflow values in cache files.

Note: The PowerCenter Server uses memory to process an Aggregator transformation with sorted ports. It does not use cache memory. We do not need to configure cache memory for Aggregator transformations that use sorted ports.

Working with Aggregato transformation:
  1. In the mapping click Transformation -> Create -> Select AGGREGATOR from list. Give name and click Create. Now click done.
  2. Pass the required fields (e.g.: SAL and DEPTNO) from SQ_EMP to AGGREGATOR Transformation.
  3. Edit AGGREGATOR Transformation. Go to Ports Tab
  4. Create output ports (e.g.: 4 output ports: OUT_MAX_SAL, OUT_MIN_SAL, OUT_AVG_SAL, OUT_SUM_SAL)
  5. Open Expression Editor one by one for all output ports and give the calculations. Eg: MAX(SAL), MIN(SAL), AVG(SAL),SUM(SAL)
  6. Click Apply -> Ok.
  7. Drag target table now.
  8. Connect the output ports from Aggregator transformation to target table.
  9. Click Mapping -> Validate
  10. Repository -> Save
  11. Create Session and Workflow. Run the workflow and see the data in target table.

1 comment:

Related Posts Plugin for WordPress, Blogger...