How to monitor the progress of an UPDATE statement sent to SQL Server? Unfortunately, SQL Server currently doesn’t support an option to monitor the progress of DML operations. In the case of UPDATE against large tables, it might be much faster to recreate the table, e.g. with SELECT … INTO. But suppose that INSERT could take a very long time too and you prefer to update the data instead. Here is how to “monitor” the progress while the UPDATE statement is doing its job.
Suppose you are updating the entire table and it has 138,145,625 rows (consider doing the update in batches to avoid running out of log space). Let’s say the UPDATE statement changes the RowStartColumn column to the first day of the month:
UPDATE bi.FactAccountSnapshot
SET RowStartDate = DATEADD(MONTH, DATEDIFF(MONTH, 0, RowStartDate), 0);
Use these statements to monitor the remaining work by using a reverse WHERE clause. Make sure to execute both statements (SET TRAN and SELECT together) so that the SELECT statement can read the uncommitted changes.
SET TRAN ISOLATION LEVEL READ UNCOMMITTED;
SELECT CAST(1 – CAST(COUNT(*) AS DECIMAL) / 138145625 AS DECIMAL(5, 2)) AS PercentComplete,COUNT(*) AS RowsRemaining
FROM bi.FactAccountSnapshot
WHERE RowStartDate <> DATEADD(MONTH, DATEDIFF(MONTH, 0, RowStartDate), 0);
What about INSERT? If you know how many rows you are inserting, you can simply check the current count of the inserted rows by using the sp_spaceused stored procedure: sp_spaceused ‘tableName’.
…with the second edition of my “Applied Microsoft Power BI” book. After seven books and starting from scratch every time, I finally got to write a revision! Thoroughly revised to reflect the current state of Power BI, it added more than 20% new content and probably that much content was rewritten to keep up with the ever changing world of Power BI. Because I had to draw a line somewhere, Applied Microsoft Power BI (2nd Edition) covers all features that were that were released by early January 2017 (including subscriptions). As with my previous books, I’m committed to help my readers with book-related questions and welcome all feedback on the book discussion forum on the book page. While you are there, feel free to check out the book resources (sample chapter, front matter, and more). Consider also following my blog at https://prologika.com/blog and subscribing to my newsletter at https://prologika.com to stay on the Power BI latest.
Today Microsoft released a highly anticipated Power BI feature – subscribed report delivery. Similar to SSRS individual subscriptions, users can go to a Power BI report and subscribe to one or more of its pages to receive a snapshot of the page on a scheduled basis. The following scenarios are possible depending on the report data source:
Imported datasets – the subscription follows the dataset refresh schedule. You’ll get an email every time the scheduled refresh happens, so long as you haven’t gotten an email in the last 24 hours.
DirectQuery datasets – Power BI checks the data source every 15 minutes. You’ll get an email as soon as the next check happens, provided that you haven’t gotten an email in the last 24 hours (if Daily is selected), or in the last seven days (if Weekly is selected).
Live connection to SSAS – Power BI checks the data source every 15 minutes and it’s capable of detecting if the data has changed. You’ll get an email only if the data has changed if you haven’t gotten an email in the last 24 hours
Connected Excel reports – Power BI checks the data source every hour. You’ll get an email only if the data has changed if you haven’t gotten an email in the last 24 hours.
Power BI subscriptions have these limitations:
The only export option is screenshot. You can’t receive the page exported to PowerPoint, for example.
Users can create individual subscriptions only. You can’t subscribe other users as you can do with Reporting Services data-driven subscriptions.
The Power BI admin can’t see or manage subscriptions across the tenant.
I hope you’re enjoying the holiday season. I wish you all the best in 2017! The subject of this newsletter came from a Planning and Strategy assessment for a large organization. Before I get to it and speaking of planning, don’t forget to use your Microsoft planning days as they will expire at the end of your fiscal year. This is free money that Microsoft gives you to engage Microsoft Gold partners, such as Prologika, to help you plan your SQL Server and BI initiatives. Learn how the process works here.
Just like a data warehouse, Operational Data Store (ODS) can mean different things for different people. Do you remember the time when ODS and DW were conflicting methodologies and each one claimed to be superior than the other? Since then the scholars buried the hatchet and reached a consensus that you need both. I agree.
To me, ODS is nothing more than a staging database on steroids that sits between the source systems and DW in the BI architectural stack.
What’s Operational Data Store?
According to Wikipedia “an operational data store (or “ODS”) is a database designed to integrate data from multiple sources for additional operations on the data…The general purpose of an ODS is to integrate data from disparate source systems in a single structure, using data integration technologies like data virtualization, data federation, or extract, transform, and load. This will allow operational access to the data for operational reporting, master data or reference data management. An ODS is not a replacement or substitute for a data warehouse but in turn could become a source.”
OK, this is a good starting point. See also the “Operational Data Source (ODS) Defined” blog by James Serra. But how do you design an ODS? In general, I’ve seen two implementation patterns but the design approach you take would really depends on how you plan to use the data in the ODS and what downstream systems would need that data.
One to One Pull
ODS is typically implemented as 1:1 data pull from the source systems, where ETL stages all source tables required for operational reporting and downstream systems, such loading the data warehouse. ETL typically runs daily but it could run more often to meet low-latency reporting needs. The ETL process is typically just Extract and Load (it doesn’t do any transformations), except for keeping a history of changes (more on this in a moment). This results in a highly normalized schema that’s the same is the original source schema. Then when data is loaded in DW, it’s denormalized to conform to the star schema. Let’s summarize the pros and cons of the One:one Data Pull design pattern.
Pros
Cons
Table schema
Highly normalized and identical to the source system
The number of tables increase
Operational reporting
Users can query the source data as it’s stored in the original source. This offloads reporting from the source systems
No consolidated reporting if multiple source systems process same information, e.g. multiple systems to process claims
Changes to source schema
Source schema is preserved
Additional ETL is required to transform to star schema
ETL
Extraction and load from source systems (no transformations)
As source systems change, ETL needs to change
Common Format
This design is preferred when the same business data is sourced from multiple source systems, such as when the source systems might change or be replaced over time. For example, an insurance company might have several systems to process claims. Instead of ending up with three sets of tables (one for each source system), the ODS schema is standardized and the feeds from the source systems are loaded into a shared table. For example, a common Claim table stores claim “feeds” from the three systems. As long as the source endpoint (table, view, or stored procedure) returns the data according to an agreed “contract” for the feed, ODS is abstracted from source system changes. This design is much less normalized. In fact, for the most part it should mimic the DW schema so that DW tables can piggy back on the ODS tables with no or minimum ETL.
Pros
Cons
Table schema
Denormalized and suitable for reporting
The original schema is lost
Operational reporting
Relevant information is consolidated and stored in one table
Schema is denormalized and reports might not reflect how the data is stored in the source systems
Schema changes to source systems
As long as the source endpoints adhere to the contract, ODS is abstracted from schema changes
A design contract needs to be prepared and sources systems need to provide the data in the agreed format
ETL
Less, or even no ETL to transform data from ODS to DW
ETL needs to follow the contract specification so upfront design effort is required
Further Recommendations
Despite which design pattern you choose, here are some additional recommendations to take the most of your ODS:
Store data at its most atomic level – No aggregations and summaries. Your DW would need the data at its lowest level anyway.
Keep all the historical data or as long as required by your retention policy – This is great for auditing and allows you to reload the DW from ODS since it’s unlikely that source systems will keep historical data.
Apply minimum to no ETL transformations in ODS – You would want the staged data to keep the same parity with the source data so that you can apply data quality and auditing checks.
Avoid business calculations in ODS – Business calculations, such as YTD, QTD, variances, etc., have no place in ODS. They should be defined in the semantic layer, e.g. Analysis Services model. If you attempt to do so in ODS, it will surely impact performance, forcing to you to pre-aggregate data. The only permissible type of reporting in ODS is operational reporting, such as to produce the same reports as the original systems (without giving users access to the source) or to validate that the DW results match the source systems.
Maintain column changes to most columns – I kept the best for last. Treat most columns as Type 2 so that you now when a given row was changed in the source. This is great for auditing.
Here is a hypothetical Policy table that keeps Type 2 changes. In this example, the policy rate has changed on 5/8/2010. If you follow this design, you don’t have to maintain Type 2 in your DW (if you follow the Common Format pattern) and you don’t have to pick which columns are Type 2 (all of them are). It might be extreme but it’s good for auditing. Tip: use SQL Server 2016 temporal tables to simplify Type 2 date tracking.
Atlanta MS BI Group: Presentation and sponsorship by SolidQ on 2/27.
As you’d probably agree, the BI landscape is fast-moving and it might be overwhelming. If you need any help with planning and implementing your next-generation BI solution, don’t hesitate to contact me. As a Microsoft Gold Partner and premier BI firm, you can trust us to help you plan and implement your data analytics projects, and rest assured that you’ll get the best service.
Regards,
Teo Lachev Prologika, LLC | Making Sense of Data Microsoft Partner | Gold Data Analytics
What a better gift to give you than increasing your BI IQ? Don’t miss my highly-discounted ($150 per person only) Power BI Dashboard in a Day (DIAD) session this Friday! It’s one of the three precon sessions of SQL Server Saturday BI Edition 2016. There are only three days left to register and there are still a few seats available. Then, join me on Saturday at 9 AM at SQL Saturday to learn how to embed reports using Power BI Embedded. Both events are in the Microsoft Office in Alpharetta.
Power BI Dashboard in a Day (DIAD Precon Session
SQL Saturday Atlanta BI Edition is proud to announce this full day training Power BI Dashboard in a Day (DIAD) is designed to accelerate your Power BI experience with a comprehensive training program in a single day. All you have to do is bring your Windows-based laptop and we’ll supply the rest – even lunch! With DIAD you get practical hands-on training prepared by the Microsoft Power BI team. During this precon we’ll build a Power BI dashboard together. Along the way, you’ll learn:
Learn how to apply Power BI for self-service BI and organizational BI
How to connect to, import & transform data from a variety of sources
Build real data models, as well as author and publish insightful interactive reports
Customize and share your “creations” for collaboration with other groups securely inside your organization including mobile device sharing
Get your Power BI questions answered
Register your seat now to and witness the value Power BI can deliver to you and your organization!
https://prologika.com/wp-content/uploads/2016/01/logo.png00Prologika - Teo Lachevhttps://prologika.com/wp-content/uploads/2016/01/logo.pngPrologika - Teo Lachev2016-12-06 09:36:472016-12-06 09:36:47A Special Cyber Training Offer This Week!
MS BI fans, join me for the next Atlanta MS BI and Power BI Group meeting on Monday, December 5th at 6:30 PM. I’ll review the tremendous success that Microsoft BI had in 2016. We’ll took a glimpse of the new Power BI navigation experience. Julie Smith will talk about Azure Data Factory Slices for our Prototypes with Pizza. Enjoy catering by Subway, thanks for Microsoft sponsoring this event.
In the spirit of the season, join us to reflect on the state of Microsoft BI Platform at the end of year 2016. It’s been a fantastic year for both on-premises and cloud BI! I’ll revisit the most important changes. Please feel free to share your BI architectural and technical challenges so we can discuss how the latest technologies can solve them.
Speaker:
Teo Lachev is a consultant, author, and mentor, with a focus on Microsoft Business Intelligence. Through his Atlanta-based company “Prologika” (a Microsoft Gold Partner in Data Analytics) he designs and implements innovative solutions that bring tremendous value to his customers. Teo has authored and co-authored several SQL Server BI books, and he has been leading the Atlanta Microsoft Business Intelligence group since he founded it in 2010. Microsoft has recognized Teo’s expertise and contributions to the technical community by awarding him the prestigious Microsoft Most Valuable Professional (MVP) status since 2004.
Prototypes with pizza
“Azure Data Factory and Slices” by Julie Smith
Sponsor:
Microsoft
https://prologika.com/wp-content/uploads/2016/01/logo.png00Prologika - Teo Lachevhttps://prologika.com/wp-content/uploads/2016/01/logo.pngPrologika - Teo Lachev2016-12-02 22:51:072016-12-02 22:51:07Atlanta MS BI Group Meeting on December 5th
Only two weeks to register for my “Dashboard in a Day” precon on December 9th for Atlanta BI SQL Saturday. A whole day of hands-on Power BI training for only $150!
Power BI Dashboard in a Day (DIAD) is designed to accelerate your Power BI experience with a comprehensive training program in a single day. All you have to do is bring your Windows-based laptop and we’ll supply the rest – even lunch! With DIAD you get practical hands-on training prepared by the Microsoft Power BI team. During this precon we’ll build a Power BI dashboard together. Along the way, you’ll learn:
How to connect to, import & transform data from a variety of sources
Build real data models, as well as author and publish insightful reports
Customize and share your “creations” for collaboration with other groups securely inside your organization including mobile device sharing
Get up to speed with the latest Power BI features
Get your Power BI questions answered
Register your seat now and witness the value Power BI can deliver to you and your organization!
https://prologika.com/wp-content/uploads/2016/01/logo.png00Prologika - Teo Lachevhttps://prologika.com/wp-content/uploads/2016/01/logo.pngPrologika - Teo Lachev2016-11-26 20:08:192016-11-26 20:11:36Power BI Training for Atlanta BI SQL Saturday
Next to the question “How do I offload reporting effort from IT?” is “Which MS reporting tool should my users use?”. Granted, Microsoft BI is not shy of reporting tools and they number is increasing! And with Power BI reports soon to be available in SSRS, the choice is getting even more difficult. The following diagram illustrates how I position reporting tools. The Y-axis represents complexity which measures the skills required to author different report types. The X-axis is for interactivity that is related to how well the tool supports interactive data exploration. The size of the bubble represents features supported by the tool – the bigger the bubble, the more features the tool supports. Of course, “features” are subjective. For example, if you care about ability to use custom visualizations, then Power BI reports should be very appealing to you (SSRS also supports custom visuals but you have to code them in .NET). The dividing dotted line denotes who typically creates and owns the report type: IT or Business.
SSRS traditional (now called paginated) reports are the Jupiter in the MS BI Report Tools galaxy. You can rarely have a requirement that you can’t meet with SSRS reports because they are highly customizable. While they exceed in customization, paginated (also referred to as canned) reports lack in interactivity. They are typically authored and owned by IT. However, I have customers where business users have adopted Report Builder to create basic tabular and chart reports. So, don’t rule out this scenario although it requires a steeper learning curve than using Excel or Power BI Desktop for example.
As I mentioned a while back when I had a first look at the new mobile reports (they came from the Datazen acquisition), I have a positioning problem with mobile reports that debuted in SSRS 2016. That’s even more true given that that SSRS will soon support Power BI reports that can be optimized for mobile devices too. Chris Finlan (Senior Program Manager at Microsoft) is more excited about mobile reports and he listed some good usage scenarios. The problem is that in my opinion this report type is neither interactive enough nor feature-rich enough. Because of its reliance on SSRS datasets, mobile reports are typically owned by IT (at least setting up the data) and its primary usage is creating semi-interactive reports that render well on mobile device. If you’re familiar with RoamBI or former PushBI (now a part of Tibco), this is the Microsoft competing reporting technology.
Excel reports are easy to position since many business users live and die by Excel. If pivot reports are good enough for your users, look no further then Excel. The problem with Excel reports though is that they are not supported by SSRS yet (rendering Excel reports online in SSRS is on the roadmap). You can deploy Excel reports with imported data, such as in Power Pivot model, to Power BI Service (powerbi.com). However, Excel reports connected to external data sources, such as Analysis Services, are not supported in Power BI or SharePoint Online (currently, the only choice for online rendering of Excel reports connected to SSAS is deploying them to on-premises SharePoint Server).
If your users are interested in interactive analysis and data exploration, the easy choice is Power BI reports considering all the investment Microsoft is making in Power BI.
This is the Microsoft competing report tool to Tableau, Qlik, etc. These reports can be shared online by publishing them to the cloud (Power BI Service) and soon to on-premises SSRS in native mode (currently you can publish them to SSRS 2016 in native mode but users have to download and open them in Power BI Desktop).
Finally, as any other reporting tool on the market, there isn’t a Microsoft reporting tool that does it all. Most customers use multiple tools. My high-level advice is to have a limited number of strategic SSRS paginated reports and use Excel and/or Power BI for self-service interactive reports.
Remember, data integration first, then data quality, then visualizations.
From the glimpse to the first public preview…it’s great to see one of most requested feature coming to life: ability to render online Power BI reports in on-premises SSRS. Alas, Microsoft is keeping us in suspense and no official date and release vehicles have been announced yet but we can now see and test it using the VM that Microsoft put on Azure (read the Chris Finlan’s steps to get started).
At this point, the integration supports only Power BI Desktop files that connect to Analysis Services (Multidimensional and Tabular). Attempting to deploy models connected to something else or with imported data, doesn’t work and you’ll get an error. Custom visuals and R visuals are not supported yet. For the most part, the integration is limited to report viewing only (similar to what you get if you embed Power BI reports in Power BI Embedded). That’s will be probably fine to start with but it will be nice to have Q&A, Quick Insights, Analyze in Excel, and ability to create custom reports online.
It’s great that Microsoft decided to expose the connection string as a regular SSRS data source. This will allow you to change the credentials settings, such as to impersonate the user when Kerberos is not an option and the SSAS is on another server. When the Power BI Desktop file is uploaded, it’s saved in the report catalog as any regular SSRS report (now referred to a paginated report). This means that you secure and manage Power BI reports the same way you work with paginated reports. Speaking of management, I hope that at some point Microsoft will add support for subscriptions and caching. What’s need is unification among the four types of reports: paginated, mobile, Power BI, and Excel. While waiting, we get a handy bonus feature: ability to add comments to reports, such as to get someone to formally approve what they see on the report.
Everyone is asking about SSRS support for Power BI reports. Progress has been make and I hope it won’t be long before we get the real thing.
https://prologika.com/wp-content/uploads/2016/01/logo.png00Prologika - Teo Lachevhttps://prologika.com/wp-content/uploads/2016/01/logo.pngPrologika - Teo Lachev2016-10-28 20:59:242016-10-29 15:13:20Power BI Reports in SSRS Techinical Preview
Let’s face it, the larger the company, the more difficult is to achieve the dream of single enterprise data warehouse (EDW). In a typical mid-size to large organization, data is found in many data repositories and integrating all this data is difficult. I’m doing an assessment and strategy engagement now for a unit in a large organization, and they need access to at least 10 other on-premises systems, including two very large repositories. Naturally, they don’t want to import all of this data, which could be millions of rows per day, and recreate their own copy of these large corporate repositories. So what to do?
In my “QUO VADIS DATA WAREHOUSE?” newsletter, I defined a logical data warehouse (LDW), also known as data virtualization, is an emerging technology that allows you to access the data where it is. Don’t we have linked servers in SQL Server that do this? We do and they might work to a point. But what if you want scale out distributed queries to achieve better performance? In today’s SQL PASS SUMMIT keynote, Day 1 Keynote “A.C.I.D. Intelligence” (A.C.I.D stands for Algorithms, Cloud, IoT, Data), Rohan Kumar showed something that I think it’s very important and it deserves much more attention than occasional references in blogs. It showed where Microsoft is bringing PolyBase and how this technology could be the Microsoft implementation for data virtualization.
In SQL Server 2016, PolyBase allows you to access data in on-premises Hadoop cluster and in Azure Blob Storage. For example, you can store some files in HDFS and define an external PolyBase table. Then, you can have a query with heterogeneous join between a local SQL Server table and the external table. Rohan showed that Microsoft will extend PolyBase to other popular SQL and NoSQL databases. More importantly, it showed that just like an MPP appliance, such as Microsoft Analytics Platform System (APS) or Azure SQL Data Warehouse, a SQL Server node would allow you to combine multiple SQL Server instances as compute notes so that you scale out access to these data sources. For example, if you have two SQL Server compute nodes and you use PolyBase to access an Oracle database, you’ll be essentially spreading the query across these nodes in parallel and then combine the results. Of course, just like linked servers, there are technical challenges, such as cases where SQL Server might need to move data to the other node. Rohan mentioned that the SQL Server query optimizer will have smarts to optimize heterogeneous joins.
If you’re in the market for a logical data warehouse vendor, don’t rule out Microsoft. Stay tuned for more news around PolyBase and the investments Microsoft makes in this area after the Metanautix acquisition.
https://prologika.com/wp-content/uploads/2016/01/logo.png00Prologika - Teo Lachevhttps://prologika.com/wp-content/uploads/2016/01/logo.pngPrologika - Teo Lachev2016-10-26 21:06:152016-11-12 16:53:38The Future of Microsoft Logical Data Warehouse