Fabric Direct Lake: Memory Utilization with Interactive Operations
As I mentioned in my Power BI and Fabric Capacities: Thinking Outside the Box post, the memory limits of Fabric capacities can be rather restrictive for large semantic models with imported data. One relatively new option for combating out-of-memory scenarios, which deserves to be evaluated and added to your list if Fabric is in your future, is semantic models configured for Direct Lake storage. This post covers the results of limited testing I did comparing side by side the memory utilization of two identical semantic models, the first configured to import data and the second to use Direct Lake storage. If you need a Direct Lake primer, Chris Webb has done a great job covering its essentials here and here. As a disclaimer, the emphasis is on limited: these results reflect my personal observations based on some isolated tests I’ve done lately. Your results may, and probably will, vary considerably.
Understanding the Tests
My starting hypothesis was that Direct Lake on-demand loading would utilize memory much more efficiently for interactive operations, such as Power BI report execution. This is a bonus on top of the fact that Direct Lake models don’t require refresh. Eliminating refresh alone could save a tremendous amount of memory, even if you apply advanced techniques such as incremental refresh or hybrid tables to models with imported data. Therefore, the tests that follow focus on memory utilization during interactive operations.
To test my hypothesis, I imported the first three months of 2016 from the NY yellow taxi Azure open dataset into a lakehouse backed by a Fabric F2 capacity. This resulted in 34.5 million rows distributed across several Delta Parquet files. I limited the data to three months because F2 ran out of memory around the 50-million-row mark with the error: “This operation was canceled because there wasn’t enough memory to finish running it. Either reduce the memory footprint of your dataset by doing things such as limiting the amount of imported data, or if using Power BI Premium, increase the memory of the Premium capacity where this dataset is hosted. More details: consumed memory 2851 MB, memory limit 2851 MB, database size before command execution 220 MB”
The message is descriptive enough and in line with the F2 memory limit of at most 3 GB per semantic model. I used Power BI Desktop to import all that data into a YellowTaxiImported semantic model, which I published to the Power BI service and configured for the large storage format. Then, directly in the service, I created a second semantic model, YellowTaxiDirectLake, configured for Direct Lake storage and mapped directly to the data in the lakehouse. I went back to Power BI Desktop to whip up a few analytical (aggregate) queries and a few detail-level queries. Finally, I ran a few tests using DAX Studio.
Analyzing Import Mode
Even after a capacity restart, the YellowTaxiImported model immediately reported 1.4 GB of memory. My conclusion was that the primary focus of the Power BI Premium on-demand loading introduced a while back was to speed up the first query after the model was evicted from memory. Indeed, I saw that many segments were memory resident and many weren’t, but running queries that touched the non-resident columns didn’t increase the memory footprint. The following table lists the query execution times with “Clear On Run” enabled in DAX Studio (to avoid skewing from cached query data).
Naturally, the more detailed the queries get, the slower they run, because VertiPaq is a columnar database. However, the important observation is that the memory footprint remains constant. Please note that Fabric allocates additional memory to execute the queries, so the memory footprint should grow as the report load increases.
| Query | Duration (ms) |
| --- | --- |
| Analytical query 1 | 124 |
| Analytical query 2 | 148 |
| Analytical query 3 | 114 |
| Analytical query 4 | 80 |
| Detail query 1 | 844 |
| Detail query 2 | 860 |
| Detail query 3 | 1,213 |
| Detail query 4 (all columns) | 4,240 |
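Since the full query text isn’t shown, here is a sketch of the general shape of the analytical queries. The table and column names are hypothetical placeholders for illustration, not my actual schema:

```dax
// Hypothetical analytical (aggregate) query against the yellow taxi model.
// 'YellowTaxi', [PickupMonth], and [FareAmount] are illustrative names.
EVALUATE
SUMMARIZECOLUMNS (
    'YellowTaxi'[PickupMonth],
    "Trips", COUNTROWS ( 'YellowTaxi' ),        // row count per month
    "Avg Fare", AVERAGE ( 'YellowTaxi'[FareAmount] )
)
ORDER BY 'YellowTaxi'[PickupMonth]
```

An aggregate query like this touches only a couple of columns, which is why its Direct Lake memory footprint stays small, while the detail-level queries pull in many more columns.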
Analyzing Direct Lake
After another restart, the resident memory footprint of the YellowTaxiDirectLake model was only 22.4 KB! Indeed, the $System.DISCOVER_STORAGE_TABLE_COLUMN_SEGMENTS DMV showed that only system-generated RowNumber columns were memory resident.
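You can check segment residency yourself by running the DMV from DAX Studio (or SSMS) against the model. A minimal sketch, assuming your client exposes the newer residency columns:

```dax
-- Lists column segments and whether they are currently paged into memory.
-- ISRESIDENT and TEMPERATURE drive the on-demand loading/eviction behavior.
SELECT DIMENSION_NAME, COLUMN_ID, SEGMENT_NUMBER,
       ISPAGEABLE, ISRESIDENT, TEMPERATURE
FROM $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMN_SEGMENTS
```

Right after a restart, only the RowNumber segments show ISRESIDENT as true; rerunning the DMV after each test query shows which columns the query pulled into memory.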
For each query, I recorded two runs to understand how much time is spent on on-demand loading of columns into memory. The Import Mode column was added for convenience to compare the second run duration with the corresponding query duration from the Import Mode tests. Finally, the Model Resident Memory column records the memory footprint of the Direct Lake model after each query.
| Query | First Run (ms) | Second Run (ms) | Import Mode (ms) | Model Resident Memory (MB) |
| --- | --- | --- | --- | --- |
| Analytical query 1 | 79 | 75 | 124 | 14 |
| Analytical query 2 | 79 | 76 | 148 | 14.3 |
| Analytical query 3 | 382 | 133 | 114 | 68.1 |
| Analytical query 4 | 209 | 130 | 80 | 68.13 |
| Detail query 1 | 7,763 | 1,023 | 844 | 669.13 |
| Detail query 2 | 1,484 | 1,453 | 860 | 670.53 |
| Detail query 3 | 1,881 | 1,463 | 1,213 | 670.6 |
| Detail query 4 | 9,663 | 3,668 | 4,240 | 1,270 |
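If you want to track the model’s footprint the same way, DAX Studio’s VertiPaq Analyzer reports it directly, or you can query a memory-usage DMV yourself. A rough sketch (column availability may vary by version, so treat this as an assumption to verify):

```dax
-- Approximate per-object memory usage for the current database.
-- Summing OBJECT_MEMORY_NONSHRINKABLE gives a rough model footprint.
SELECT OBJECT_PARENT_PATH, OBJECT_ID,
       OBJECT_MEMORY_SHRINKABLE, OBJECT_MEMORY_NONSHRINKABLE
FROM $SYSTEM.DISCOVER_OBJECT_MEMORY_USAGE
```

Running this between queries is how you can watch the resident footprint climb as more columns are loaded on demand.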
Conclusion
To sum up this long post, the following observations can be made:
- As expected, the more columns a query touches, the higher the memory footprint of Direct Lake. For example, the last query requested all the columns, and the resulting memory footprint was on a par with import mode.
- It’s important to note that when Fabric is under memory pressure, such as when the report load increases, Direct Lake will start paging out columns with low temperature. The exact thresholds and rules are not documented, but I’d expect the eviction mechanism to be much more granular and intelligent than evicting entire datasets in import mode.
- The reason I didn’t see Direct Lake paging out memory is that I still had plenty left (1.27 GB consumed out of 3 GB). It doesn’t make sense to evict data when there is no memory pressure, since memory is the fastest storage.
- You’ll pay a certain price the first time a column is loaded on demand with Direct Lake. The more columns, the longer the wait. Subsequent runs, however, will be much faster if the column is still mapped in memory.
- Some queries will execute faster in import mode and some will execute slower. Overall, queries touching memory-resident columns should perform comparably in both modes.
Therefore, if Direct Lake is an option for you, it should be at the forefront of your efforts to combat out-of-memory errors with large datasets. On the downside, you’ll more than likely have to implement ETL processes to synchronize your data warehouse to a Fabric lakehouse, unless your data is in Fabric to start with or you use Fabric database mirroring for the currently supported data sources (Azure SQL DB, Cosmos, and Snowflake). I’m not counting the data synchronization time as a downside because it would simply replace the time you currently spend on model refresh.