Machine Learning using Spark and R

R is ubiquitous in the data science community. Its ecosystem of more than 8,000 packages makes it the Swiss Army knife of modeling applications. Similarly, Apache Spark has rapidly become the big data platform of choice for data scientists. Its ability to perform calculations relatively quickly (due to features like in-memory caching) makes it ideal […]

Version control for data scientists using Git and RStudio

Look at the working directory of the average data science project and you’ll see things like this:

cust-churn.csv
cust-churn.R
cust-churn-good.zip
cust-churn-old.csv
cust-churn-progess-meeting.R
cust-churn-working.R
cust-churn-working2.R
cust-churn-20160217.R
test.R
test-bk.R

Every time a change needs to be made, files are copied to preserve the working code. As changes will often be made to multiple files, it’s common to […]
