Wednesday, July 15, 2009

A Close Look at Look Up


Pre-requisite: SQL Server Integration Services (SSIS)

Are you new to SSIS’ Look-Up?
Bing it! Or visit, http://msdn.microsoft.com/en-us/library/ms141821.aspx.

Presented is the SSIS internal about the Look-Up component.

• Memory: The Look-Up component operates in the Full Cache mode, i.e. while executing it caches the entire reference table/dataset into memory, this is done in the pre-execute phase and the database is not contacted again. The other modes being Partial Cache or No Cache.

To read more on the caching this would be a good read.
http://www.sql-server-performance.com/articles/biz/SSIS_New_Features_in_SQL_Server_2008_Part1_p1.aspx

• Case-Sensitivity: The Look-Up component is Case Sensitive. Hence, it has to be appropriately utilized based on the data. We can use the upper () or lower () functions.

Performance Tuning:Suggested are some methods that would help us use the Look-Up Transformation Component efficiently.

1. Being Choosy with Columns:While configuring the Look-Up transformation we have an option to choose either a “Table or View” or “Use Result of a SQL Query” as the Reference Data Set. It is advisable to use the latter with the desired reference columns.

Reason: If the Look-Up operates in the Full Cache mode, it is an unnecessary effort expended by SSIS (by fetching non-looked up columns) and we would be saving on the memory. :)

2. Being Choosy with Rows:If appropriate filters are applied on the SQL query, the records in the Referenced data set reduce and hence the take away would be:
o Input Data Set would be compared against a smaller result set.
o Memory required would be reduced again.
Hence, gaining an edge on the time front!! :)



3. Restrict Memory: Performing look-up on a large (reference) data set; there can be cases where the Memory Buffer Manager runs out of memory (Full Cache Mode). This may hamper the execution of the SSIS package.

The Advanced tab in the “Look-Up Transformation Editor” enables us to restrict the memory to be allocated for Look-Up. On restricting the memory, two things happen:
o The amount of memory that the Lookup is allowed to use is limited. A cache policy is used and new rows are added to the cache on demand.
o The new rows are added to the cache individually; i.e., SSIS will query the database whenever data for a row cannot be located in the internal cache kept by SSIS. These are singleton (single-row) queries, unlike the large set-based table queries that occur when memory restriction is not enabled.

Advantage: Look-Up can be performed on Large (reference) data.

Disadvantage: The cost is that singleton queries are used, which are slower than a single table query.

Suggestion: The above setting is subject to every project case. It is advisable to trial run with appropriate memory setting! :)

4. Think Re-location:
Certain resource consuming operations can be pushed to the Source component itself. For Instance:
i. Look-Up operation can be performed at the source (while reading from the table).
ii. Data Conversion tasks can be performed at the source (while reading from the table).
iii. Upper () and Lower () functions can be performed using a Derived Column
Transformation component.

Wassup with SQL Server 2008 (Code Named: Katmai)????
Cache Transformation - Introduced in SQL Server 2008, aids in performing look-up better!
To be read as home work ;)

So, wishing you luck with development and customer WoWs!! :)

References: http://blogs.msdn.com/sqlperf/archive/2007/04/24/getting-optimal-performance-with-integration-services-lookups.aspx

Add to Technorati Favorites

No comments:

Post a Comment