Querying SQL Server Data from R Using RODBC

In the previous blog, we looked at some of the annoyances encountered when installing RODBC on Linux. Of course, RODBC can also be used in R running on Windows. In either case, running queries using RODBC is straightforward and without surprises.

As is always the case, the first thing we need to do is connect. We saw how to create a DSN, or Data Source Name, when we looked at LibreOffice base. RODBC can use a DSN to connect to a database; the RODBC odbcConnect() function accepts a DNS name as an argument. We can, however, specify everything, including the ODBCdriver, right in the function call and do away with the DSN entirely. Such a connection is sometimes called a “DSN-less connection”. We only need to call the odbcDriverConnect () and supply the driver name.

We will be connecting to the preview of SQL Server vNext on Linux, but once you connect, the rest of the commands won’t care what platform you are running on.

We’ve already seen that the installation process on Linux defines a driver name in /etc/odbcinst.ini.

etc/odbcinst.ini

The “DSN-Less” Connection

ch<-odbcDriverConnect("DRIVER={ODBC Driver 13 for SQL Server};SERVER=localhost;Database=Northwind;uid=SA;pwd=sa!314159")

The “ch” variables hold a reference to a connection, and will be used in virtually all our interactions with SQL Server. In this particular example we specified a database name, butthis is not necessary. The default database may be fine, but in addition you have the option of using the same “USE database” statement that you have used in the Management Studio.

odbcQuery(ch, “USE Northwind”)

Note the first argument is “ch”, the channel handle we obtained in from odbcDriverConnect(). The sqlQuery() function would have worked just as well in this example, but it is simply my habit to call odbcQuery when I do not intend to fetch any row results.

If appropriate, we can directly fetch a table using sqlFetch. The results are, of course, an R dataframe.

resultset = sqlFetch(ch,"Products")

Of course, you are much more prone to want to run an SQL query; this can be done using sqlQuery. The sqlQuery() function is just a simple wrapper function that calls odbcQuery to execute a query and then calls sqlGetresults to retrieve the rowset.

resultset <- sqlQuery(ch, “SELECT * FROM Products WHERE CategoryID = 3”)

resultset <- sqlQuery(ch,”EXECUTE TestProcedure”)

A very nice feature of sqlQuery is that we can specify a maximum number of rows to retrieve with the optional “max” parameter. This is useful when the resultset is quite large and we wish to bring over the data chunks at a time. We call sqlQuery to execute the query and retrieve the first, well, we’ll say five rows in this example.

resultset= sqlQuery(ch, "SELECT * FROM Products", max=5)

After processing the first five rows, we could use sqlGetResults to get the next 10.

resultset = sqlGetResults(ch, max=10)

The close method allows us to do the right thing and clean up after ourselves.

close(ch)

Conclusion

RODBC is a very simple library to use, and the core set of functions needed to get started querying SQL Server data from R is even simpler. However, writing complex queries in R will be cumbersome and error-prone. It is wise to define desired SQL queries as views which can be called from R very simply.

Type to search blog.learningtree.com

Do you mean "" ?

Sorry, no results were found for your query.

Please check your spelling and try your search again.