Version control for data scientists using Git and RStudio

Look at the working directory of the average data science project and you’ll see things like this: cust-churn.csv, cust-churn.R, cust-churn-old.csv, cust-churn-progess-meeting.R, cust-churn-working.R, cust-churn-working2.R, cust-churn-20160217.R, test.R, test-bk.R. Every time a change needs to be made, files are copied to preserve the working code. As changes will often be made to multiple files, it’s common to […]
SQL Server 2016: R Integration Redux

SQL Server 2016 has now reached Release Candidate 3 (RC3). One of the new features that continues to foster interest in the analytics and data mining community is the integration of SQL Server with the open-source statistics toolset R. If you fall into this category, I’m inclined to suggest you remain patient and wait for […]
SQL Server 2016: Choosing Your R-Chitecture

Enthusiastic data miners are excited about the new possibilities which have opened up with Microsoft R. Not everyone, however, has realized that the so-called “in-database” R is not the best choice for every application of R to SQL Server data. One database administrator expressed concern about R competing with SQL Server for precious RAM. A […]
A Problem with R

The value of R lies in the enormous quantity of code contributed by analysts and academic researchers over many years, providing packaged solutions not only for common analytical techniques but also for the esoteric and the obscure. The problem with R, and one that concerns many analysts dealing with large data volumes, is that […]
Introducing Microsoft R Server

Earlier this year Microsoft released Microsoft R Server. This is essentially a rebranding of Revolution R Enterprise, which came to Microsoft through its acquisition of Revolution Analytics in April 2015. However, the fact that Microsoft is backing the product makes a big difference to many potential corporate users. And with Microsoft embracing R across the company, more investment in […]
