Live SQL Server Data in Power BI: Using DirectQuery

For me, one of the most valuable features of Power BI is the ease with which I can mashup data from a variety of different data sources. This, of course, requires importing data into Power BI’s internal xVelocity engine. For many others, however, it is critical to create Power BI reports and dashboards that reflect the current state of live data as it exists in an underlying relational database. Let’s take a look at how this is accomplished, and what some of the trade-offs are.

DirectQuery

Live data can be visualized in Power BI only through what Microsoft calls “DirectQuery”. There are two ways to use DirectQuery with Power BI. One is to directly connect to the relational database with the “DirectQuery” option. As of May 2018, this can only be done with data in SQL Server, Oracle, or Teradata.

The second option is to create a DirectQuery mode tabular Analysis Services database. If Power BI can directly connect to, say, SQL Server, why would anyone go to the trouble of creating an intermediate SSAS database? That depends on whatever additional measures might need to be defined. By defining advanced measures in a tabular mode SSAS database, those measures become immediately available to any Power BI report that connects to it. Measures defined within Power BI are only accessible to reports in that instance.

Choosing “Connect Live” will establish a connection to SSAS that can be queried automatically. This is the same whether the SSAS database is DirectQuery or not.
No data will be imported when Power BI connects to SQL (or Oracle or Teradata) in DirectQuery mode.

How Does DirectQuery Work?

Whether Power BI has directly connected to SQL Server in DirectQuery mode or whether Power BI has connected to a DirectQuery SSAS database, DirectQuery works in essentially the same way. Dax queries generated by dragging and dropping data onto visualizations are converted into SQL queries for the underlying database. Since no data is saved in Power BI, the visualizations always reflect current data. (It is possible, for performance reasons, to cache query results in Power BI’s RAM.)

Since the conversion of DAX into SQL is not perfect, some Power BI operations will not work in DirectQuery mode, and some may work incorrectly. Special attention must be paid to DAX’s time intelligence functions; they will often not behave correctly in DirectQuery mode.

You can use the SQL Server Profiler to examine the SQL queries generated by DirectQuery. The following is the translation of a very simple DAX query. Note the TOP statement; DirectQuery cannot return more than one million rows, and the SQL query is modified to reflect this fact. This is not a limitation on the number of rows that can be queried and aggregated to yield the desired results; it is only a limitation on the actual number of rows returned.

SELECT
TOP (10000001) [c36],SUM([c155])
AS [a0]
FROM
(
SELECT [t7].[c36],[t7].[c155]
FROM
(
SELECT [t1].[c36] AS [c36],[t7].[SalesAmount] AS [c155]
FROM
(
((SELECT * FROM FactSales) AS [t7]
left outer join
(
SELECT [t5].[StoreKey] AS [c92],[t5].[GeographyKey] AS [c93]
FROM
(
( SELECT [dbo].[DimStore].* FROM [dbo].[DimStore] )
)
AS [t5]
)
AS [t5] on
(
[t7].[StoreKey] = [t5].[c92]
)
)
left outer join
(
SELECT [t1].[GeographyKey] AS [c31],[t1].[RegionCountryName] AS [c36]
FROM
(
( SELECT [dbo].[DimGeography].* FROM [dbo].[DimGeography] )
)
AS [t1]
)
AS [t1] on
(
[t5].[c93] = [t1].[c31]
)
)
)
AS [t7]
)
AS [t7]
GROUP BY [c36]

Creating a DirectQuery Mode SSAS Database

If you will be reporting on live relational data in a variety of ways, in the long run, it is probably better to create an SSAS database specifically for this purpose. This is essentially the same, but there are a few twists. Since DirectQuery mode does not import data, in the design environment there is no data to look at as you develop measures. This is unpleasant and confusing, but Microsoft has provided for the creation of sample data. This is done by assigning a second partition and marking it as “Sample”. In fact, you are not permitted more than a single partition in DirectQuery mode unless it is a sample partition. Data generated by the query defining the sample partition will be imported into the design environment, and from there, things are pretty much the same as designing any other tabular mode database.

Strictly speaking, it is not necessary to mark the workspace database as DirectQuery; you can convert it to DirectQuery when you deploy. This is not a good idea, however, If the workspace database is marked as DirectQuery you will be warned if you try to do something that is not permitted. Without this warning, your mistakes may be much harder to correct later on.

Another Advantage to DirectQuery

When you import data into Power BI, those data must have sets of permissions applied to them. For an administrator, this means that security for the same data must be managed in many multiple locations. Since data is not imported in DirectQuery mode, permissions are applied by the already existing definitions in the underlying relational database, reducing the admin’s workload.

Conclusions

When I am mashing up data from different sources, I cannot use DirectQuery. Since DirectQuery generates an SQL Query, there is no provision for getting data from more than one single relational source. However, the ease of creating a DirectQuery SSAS database, or the ease of creating DirectQuery connection to Oracle, SQL Server, or Teradata, make DirectQuery a terrific tool for developers and report writers who need to access live data in Power BI.

AUTHOR: Dan Buskirk

Related Training:
SQL Server Training
Business Intelligence Training

Type to search blog.learningtree.com

Do you mean "" ?

Sorry, no results were found for your query.

Please check your spelling and try your search again.