Using Lookup in ADF Data Flow

Introduction

SSIS developer, we all know the importance of Lookup Transform. Here in ADF, we have two types of Lookup. One is for control flow and other is for data flow. The behavior of both the lookup is not same. Control flow lookup provide only record sets but in Data flow we are using it as for looking up some records from other data sets.

In ADF data flow Lookup feature is more than the SSIS Lookup Transform. So, in this article we are thinking to demonstrate all the functionality of ADF Data Flow lookup with a simple example.

 

Hope it will be interesting.

 

Case scenario

We have a Flat file which contains Employee Code, Salary and Bonus Information. There are multiple Flat files in our Azure BLOB Container.

Now we have a JSON file which contains information about Employee Code, Employee Name and Employee Email.

Now we have to use the Lookup Transform to find the Employee Name and Employee Email from our JSON file.

 

About Data Flow Lookup

Before moving to our actual data flow solution, we must need to understand every property of Data Flow Lookup.

 


 

Primary stream 

The incoming stream of data. This stream is equivalent to the left side of a join.

Lookup stream

The data that is appended to the primary stream. Which data is added is determined by the lookup conditions. This stream is equivalent to the right side of a join.

 

Match multiple rows 

If enabled, a row with multiple matches in the primary stream will return multiple rows. Otherwise, only a single row will be returned based upon the 'Match on' condition.

 

Match on 

Only visible if 'Match multiple rows' is not selected. Choose whether to match on any row, the first match, or the last match. Any row is recommended as it executes the fastest. If first row or last row is selected, you'll be required to specify sort conditions.

 

Lookup conditions: 

Choose which columns to match on. If the equality condition is met, then the rows will be considered a match. Hover and select 'Computed column' to extract a value using the data flow expression language.

 

Non-equi joins

To use a conditional operator such as not equals (!=) or greater than (>) in your lookup conditions, change the operator dropdown between the two columns. Non-equi joins require at least one of the two streams to be broadcasted using Fixed broadcasting in the Optimize tab.

 

 

 

ADF Solution Data Flow

Here we are not using any Sync as we are going to display output in Data Preview mode.

 


 

Source – 1

 


 

Source – 2

 


 

 

Flatten Transform

 


 

Lookup

 


 

Select with Data Preview

 


 


 

 

Hope you like it.

 


Comments

  1. If you have any confusion between == and === operators, here I give you a simple explanation. == check only values and === checks values snd type. Take a simple example:
    If(12 == '12') :
    pint( 'Yes')
    else:
    print('No')
    ---> Yes

    If(12 === '12') :
    pint( 'Yes')
    else:
    print('No')
    ---> No

    Hope you understand that.

    ReplyDelete

Post a Comment

Popular Posts

Triggering Pipeline in ADF

Working with Spark – Spark RDD

Master Child Table from Flat file by using ADF Data Flow