How to Predict Outcomes Using Random Forests and Spark
Random forests are an ensemble, or model of models, machine learning approach. The algorithm builds multiple decision trees, based on different subsets of the features in the data. Outcomes are then predicted by running observations through all the trees and averaging the individual predictions. Think wisdom of crowds. Spark’s machine learning library, MLlib, has support […]
Five C# 6.0 New Features to Look Forward To
With Visual Studio 2015 due to ship this month, it seems like a good time to consider how C# 6.0 is going to make our professional lives just that little bit easier. C# 6 is more evolution than revolution. It’s about smoothing off some of the rough edges of the language. So there’s something for […]
The Benefits of API-First Development
APIs (Application Programming Interfaces) were a hot topic in 2014, and that seems set to continue in 2015. Rapid growth in parallel areas, such as the Internet of Things, is likely to keep the momentum behind APIs for the foreseeable future. Some companies now have an API as their primary product. Stripe, an online payment […]
Paying by Numbers—Should Data Scientists Receive Performance Based Compensation?
A recent article suggests that, in the “near future”, data analysts will be compensated based on performance. They will receive commission-based payments, rather like salesmen, rather than being paid purely for their time. This performance will presumably be determined by the impact that the data analyst has on the key goals of the organization, e.g. […]
How to Build a Predictive Model Using Azure Machine Learning
In this article, we’ll use Microsoft’s Azure Machine Learning (ML) service to predict breast cancer diagnoses from test data. If you don’t have an Azure account, a free trial is available. Note that, at the time of writing, ML is in preview, so the details may change. However, the basic concepts should still apply. Obtaining […]