Implementing Lookups in Power Query
Scenario: Suppose you have two source tables coming from different data sources, such as Excel or text files, as shown in in the screenshot below. The Master table has a DateStart and DateEnd columns that denotes a date range for ID. The Details table has some date (and possibly time) values. You need to look up the ID column from the Master table where the Details.Date column falls between the DateStart and DateEnd so that you can create a relationship Details.ID ð Master.ID. I probably make this more complex because of a real-life scenario where this requirement popped up. But the same approach can be applied when you need to implement simpler lookups in Power Query.
Solution: There are several options to implement this requirement:
- If tables come from a relational database, implement the lookup in the source query for the Details tale. This should be the preferred option for performance reasons.
- If the lookup requires only an equality condition on one or more columns, you can add a calculated column to the Details table and use the DAX LookupValue function. However, remember that xVelocity calculated columns are not compressed and they should be avoided for large fact tables.
- If the lookup requires inequality conditions, you can use add a calculated column that has a more convoluted DAX expression, such as the one below:
ID = CALCULATE(MIN(Master[ID]), FILTER(Master, Master[DateStart]<=Details[Date] && Master[DateEnd] >= Details[Date]))
However, when you try to create a relationship, you’ll get a “circular dependency” error so you might need to copy the table.
- Implement the lookup in Power Query before tables are imported in the data model. Power Query doesn’t currently support a lookup transformation but the following function gets the job done:
let
GetKey = (SourceDate as date) =>
let
Source = Table.SelectRows(Master, each [DateStart]<=SourceDate and SourceDate<=[DateEnd]),
Key = if Table.IsEmpty (Source) then null else Source[ID]{0}
in
Key
in
GetKey
In this case, the function takes a Date argument. When the function is invoked for each row in the Details table, the query will pass the value of the Date column to this argument. Then, Table.SelectRows filters the Master table using the inequality condition. If the filtered table is empty, the function returns null so that it doesn’t error out when attempting to get to the ID value. Otherwise, the function returns the first value in the ID column, assuming that there will be at most a single row matched. The attached sample demonstrates the solution modeled in Excel. The PQLookupData file simulates the source data, while the PQLookup file has the data model and queries.