Power BI can obtain datasets from R scripts, though as of the September 2016 release of Power BI the integration is not so slick.
For example, we may wish to explore a possible log-linear relationship in our data or the potential of a polynomial relationship. While we could, in principle, add columns to a dataset using Power Query, this approach would be clumsy and lack transparency for many aspects of exploratory data analysis. It would be far easier to use R, a language specifically developed for statistical analysis, to set up data in the desired form and then import the data into Power BI.
At present, Power BI can only import data in the form of an R data frame, so you must plan your script accordingly. Data structures like matrices and vectors cannot yet be imported into Power BI, so you must either restrict your script to the use of data frames or convert matrices into data frames within your script.
I suspect it will not be too long before Power BI boasts integration with an R development environment like RStudio or Visual Studio with the R add-in. As of September 2016, the R visualizations tool can launch an R IDE, but to the best of my knowledge, it still requires a copy-and-paste to bring the script into Power BI once the editing is complete. Importing data from an R script remains at present a manual operation.
For our example, we will read a two-column CSV file into an R dataframe containing pairs of data points representing educational level and current wage. We shall then add a third column to the dataframe containing the log of the wage. Our R script looks like this:
df[,‘LogWage’] <- log(df[,‘PEERNH2’])
Once our script has been tested, there are two ways to bring the dataframe into Power BI.
Going to the Get Data | More … dialog in Power BI desktop, we see that “R Script” is one of the choices on the list. If you wish to get to that choice a little faster, “R Script” can be found in the “Other” section of the Get Data dialog.
As exciting as the “R Script” choice is, clicking on the choice disappoints a little the first time.
Pasting the entire R script into this dialog has the advantage of permanently keeping the script within the Power BI file. Without better integration of an R development environment, however, altering or maintaining R code stored in this way will prove to be a nightmare. At present, keeping the R script as a separate file seems to provide greater advantages. In this method, the script pasted into the Execute R Script dialog would simply be a one-line script invoking the R “source” function. Clearly, if you use this method you must scrupulously maintain the directory location of the scripts so they can be found by Power BI when time comes to refresh the data.
Even though you are typing in Power BI, it’s still an R script, so don’t forget to escape your backslashes!
After you click OK, the R data frame will be imported by the Power Query machinery, the same as any other data source. In fact, you can open the Power Query editor and modify the newly created dataset there if need be, using the graphical environment or writing M script.
Clearly, this is not the environment in which we would like to edit a longer R script, to say nothing of debugging one.
Although the user interface is not as slick as one might like, the ease of which R data frames can be imported into Power BI make this a technique that no serious Power BI analyst can afford to ignore.