Last week (the third week in April 2017) Microsoft released its Community Technology Preview 2.0 for SQL Server vNext, more often referred to as simply “SQL Server 2017”. Among the many new features of interest to business intelligence analysts, data scientists, and machine learning aficionados is support for running Python scripts within SQL Server procedures.
When support for R scripts was included in SQL Server 2016, it was clear that support for other scripting languages would be coming, and support for python has now, in fact, arrived. Best thing of all, there are none of the arcane hoops that one had to jump through for R support when 2016 was still in its preview stages. Clearly, R paved the way for a more seamless installation. If you have installed support for R scripts in 2016, there is nothing new in the installation procedure itself.
The familiar features dialog of the installation program provides a checkbox for Python support.
The Python distribution installed is Continuum Analytics, the folks who maintain the Anaconda series of Python implementations. For years, Anaconda has been popular among Python users, particularly those who wish to start doing math immediately rather than taking time to install a myriad of additional packages first.
The repressed zoologist in me always cringes a little when I say Anaconda Python, since an Anaconda is more closely related to the boa constrictors. Anyhow, you can run SQL Server’s Anaconda Python from the commandline if you like, though its location is not added to the Windows path environment variable by default. The Python executables are located in
C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES
Once this path is added to the environment variable, you could run Python code from geany, though I am sure Microsoft would prefer you to use the new Python tools now available in the Visual Studio.
Again, just as with R, SQL Server is not configured to run external scripts by default. Although the documentation shows using “RECONFIGURE WITH OVERRIDE” when reconfiguring, this seems to be pointless as a server restart seems to be required for the new run value to take effect.
EXEC sp_configure ‘external scripts enabled’, 1
RECONFIGURE WITH OVERRIDE — insufficient to reset run value – server restart required
Similarly, a server restart still does not automatically restart the SQL Server Launchpad service; this will have to be restarted manually.
Test Python code provided in the Microsoft documentation did not work in my hands (perhaps the authors were thinking R while writing Python).
— from documentation:
exec sp_execute_external_script @language =N’Python’,
@input_data_1 = N’SELECT 1 as col’
Curiously, it seems to take a long time for error messages to get back to the Management Studio.
A minor modification to the example code ran just fine.
Implicit in this tiny test sample is that InputDataSet and OutputDataSet are built-parameters for scripting within T-SQL. We will discuss this in greater detail when we take a closer look at using Python and pandas to manipulate datasets.
Microsoft’s touting of its “Machine Learning Server” and describing SQL Server as now having “Built-In Artificial Intelligence” may sound like more hyperbolic marketing descriptions than technological assessments. Nevertheless, it is impossible not to get excited about the new possibilities that Microsoft is opening up in SQL Server.
In the next installment, we will look at using pandas DataFrames in SQL Server Python scripts.