Clustering data using k-means in ML.NET
Aug 7, 2018
Microsoft recently released a preview of a machine learning framework for .NET developers—ML.NET. I needed to perform a clustering analysis from existing data in one of my applications. This is a pretty common machine learning task, so I decided to document the basic approach in this article. We’ll use the well-worn iris data set from […]
ML.NET—an open source, cross-platform, machine learning framework for .NET
Jun 13, 2018
One of the products Microsoft introduced at its 2018 Build conference was an open source, cross-platform machine learning framework for .NET developers. ML.NET will …allow .NET developers to develop their own models and infuse custom ML into their applications without prior expertise in developing or tuning machine learning models. Democratisation of machine learning, i.e. enabling people […]
SQL Server! Now with Built-In AI!
Mar 28, 2018
Microsoft asserts that SQL Server is the first database with “built-in” artificial intelligence. Of course, no phrase suffers more definitions than “artificial intelligence”, so rather than quibble about AI, we’ll just take a look at the technologies that support Microsoft’s claim.
Moving Machine Learning to the Server-Side
Great excitement followed the introduction of the integration […]
Machine Learning using Spark and R
Mar 27, 2017
R is ubiquitous in the data science community. Its ecosystem of more than 8,000 packages makes it the Swiss Army knife of modeling applications. Similarly, Apache Spark has rapidly become the big data platform of choice for data scientists. Its ability to perform calculations relatively quickly (due to features like in-memory caching) makes it ideal […]
Assumptions Can Ruin Your K-Means Clusters
Jan 31, 2017
Clustering is one of the most powerful and widely used machine learning techniques. It’s very seductive: throw some data into the algorithm and let it discover hitherto unknown relationships and patterns. K-means is the most popular of all the clustering algorithms. It’s easy to understand, and therefore to implement, so it’s available in almost all analysis […]