This quick start introduces the basic general concepts behind how to build your first workflow by using python with SQL. In this quick start, we will build an app with a connection string, and in the second approach, we will integrate Python with SQL Server. We’ll install the required packages and SQL Server features that are required to run the python script in SQL Server.
Initially, only R service was available as a feature till SQL 2016. Microsoft included python support in SQL 2017 and added Python as an in-database component, which can be used for deep data analysis and in the implementation of machine learning models in the database itself. Then Microsoft renamed R Services to Machine Learning Services, and R and Python are two options under this feature.
SQL Server Machine Learning Services:- Machine Learning Services is a feature in SQL Server that gives the ability to run Python and R scripts with relational data. You can use open-source packages and frameworks, and the Microsoft Python and R packages for predictive analytics and machine learning. The scripts are executed in-database without moving data outside SQL Server or over the network. Once the installation of the features is done, most common packages are installed with Machine Learning service, like revoscalepy, microsoftml etc.
Now I will start with a simple program that will extract data from the database using the connection string. This program needs a .json file that will contain weekdays.
JSON data should appear like below:
Also, I need a data table that will have one varchar type column
CREATE TABLE DaysOfWeek(
[DayOfWeek] [varchar](10) NOT NULL
) ON [PRIMARY]
insert into DaysOfWeek(DayOfWeek)values(‘Monday’),(‘Tuesday’),(‘Sunday’),(‘Monday’),(‘Tuesday’),(‘Wednesday’),(‘Thursday’),(‘Friday’),(‘Saturday’),(‘Friday’),(‘Monday’),(‘Sunday’)
Now, our purpose is to map the days. So I will write a code that will read JSON data and also extract the data from the table then will map JSON and data table using the Pandas Python library.
import pandas as pd
from sqlalchemy import create_engine
with open('data.json', 'r') as f:
daysMap = json.load(f)
engine = create_engine('mssql+pyodbc://DBServerName/DatabaseName?driver=SQL Server?Trusted_Connection=yes')
sql = "SELECT * FROM DaysOfWeek"
OutputDataSet = pd.read_sql_query(sql, engine)
OutputDataSet["DayOfWeekNumber"] = pd.Series([daysMap[i] for i in OutputDataSet["DayOfWeek"]])
Run the above code, Output of the program will appear like the below screenshot:
Also need to install the packages that are written at the starting of the code. So we will install the packages using the pip install <package name> on command prompt.
We are done with the first approach. Now we will move on the next that is Python integration with SQL for that we will install the features that are required to run the python in SQL Server. Let’s start with feature installation:
1. Download SQL Server 2017.
2. On the Installation tab, select New SQL Server stand-alone installation or add features to an existing installation.
3. Select these options on the Feature Selection page:
- Database Engine Services →To use R or Python with SQL Server, you must install an instance of the database engine. You can use either a default or a named instance.
- Machine Learning Services (In-Database) →This option installs the database services that support R and Python script execution.
- Python → Check this option to add the Microsoft Python packages, the Python 3.5 executable, and select libraries from the Anaconda distribution.
Feature selection screen:
4. On the Consent to Install Python page, select Accept and then Next. The Python open-source licensing agreement also covers Anaconda and related tools, plus some new Python libraries from the Microsoft development team.
5. You may be asked to restart the computer at the end of the installation.
Enable script execution
1. Open SQL Server Management Studio.
2. Connect to the instance where you installed Machine Learning Services, click New Query to open a query window, and run the following command:
On the execution of the above command, you will see the number of records. The value for the property
external scripts enabledshould be 0 at this point. By default, this feature is turned off.
3. Run the following command to enable the external scripting feature:
Exec sp_configure 'external scripts enabled', 1 RECONFIGURE WITH OVERRIDEIf this feature is already enabled then no need to re-configure.
Restart the service
Make sure, SQL Server Launchpad service is running if service is not running restart the DB engine related service will start automatically.
Otherwise manually you can restart the services by using the Services panel in Control Panel.
If Launchpad is running, you can run simple Python scripts to verify that external scripting runtimes can communicate with SQL Server.
Ready to Run
Now system configuration is done, we are ready to run the python script in SQL Server. The first time external script runtime is loaded so it will take a little while.
Now system configuration is done, we are ready for demo. Let’s run the same code in SQL Server that we ran in VS Code using the connection string. Now, this code will run as a script in SQL Server.
Open Microsoft SQL Server Management Studio and run the below code. Please make sure you already have a table in the database. For the reusability, I have put all the code in proc as you can see the below code so we can call the proc from any application and the ‘path’ parameter is passed. Run the below code it will create a stored procedure in the specified database. Once the proc is created, we can execute this proc.
Same code we have used in .py file. Now the difference is we are using that code in SQL Server using as a script, but the output will remain the same.
create PROCEDURE proc_Weekdays (
@path varchar(500) = null
set @path = ‘C:\Gurpreet\Python\Python Program\data.json’
EXECUTE sp_execute_external_script @language = N’Python’
, @script = N’
OutputDataSet = InputDataSet
f = open (filePath, “r”)
daysMap = json.loads(f.read())
OutputDataSet[“DayOfWeekNumber”] = pandas.Series([daysMap[i] for i in OutputDataSet[“DayOfWeek”]])’
, @input_data_1 = N’SELECT * FROM DaysOfWeek’
, @params = N’ @filePath varchar(500)’
, @filePath = @path
WITH RESULT SETS (([DayOfWeek] nvarchar(10) null,[DayOfWeekNumber] int null))
Execute the stored procedure:
Exec proc_Weekdays @path='*specify the path of json file*'Here we will pass the parameter to user-defined proc, basically, this parameter is the path of the JSON file. Please use the same JSON format that is defined.
Output of the above proc:
You will notice output will be the same even the code is executed from the VS Code or SQL Server Management Studio.
Looking at the sp_execute_external_script syntax, among others, it accepts also the
@input_data_1 parameter. It allows passing the input data used by the external script in the form of a T-SQL query. sp_execute_external_script is a system proc that will execute the python code. The procedure accepts parameters so we will pass language, input, script, params parameters.
External script arguments that are used above are explained as under:
@language:- It indicates the script language. ‘Python’ is a valid language for an argument.
@input:- whatever SQL script wants to execute that we can write here.
@script:- script is nvarchar(max). The python script is specified here.
@params:- A list of input parameter declarations that are used in the external script.
Note: Also, here’s the complete list of arguments that we can use for the external script, but I’m using only four.