What is the Carbon Footprint of AI and Deep Learning?
Jul 31, 2019
Most of the recent breakthroughs in Artificial Intelligence are driven by data and computation. What is largely missing from the discussion is the energy cost. Most large AI networks require huge amounts of training data to ensure accuracy. However, these accuracy improvements depend on the availability of exceptionally large computational resources. The larger the computational resource, the more energy it consumes. This […]
Clustering data using k-means in ML.NET
Aug 7, 2018
Microsoft recently released a preview of ML.NET, a machine learning framework for .NET developers. I needed to perform a clustering analysis on existing data in one of my applications. This is a pretty common machine learning task, so I decided to document the basic approach in this article. We’ll use the well-worn iris data set from […]
Is functional programming more effective than object-oriented programming?
Apr 26, 2017
Imperative vs functional programming. It’s a debate that goes back to the birth of high-level languages: Fortran vs Lisp. In later years, it was retreaded as object-oriented vs functional programming (OOP vs FP), with OOP having become the (massively) dominant software development paradigm. And I’m a fully paid-up member. I embraced Object Pascal via Delphi 1 in 1995 […]
10 Rules for Creating Reproducible Results in Data Science
Mar 29, 2017
In recent years, evidence has been mounting that points to a crisis in the reproducibility of scientific research. Reviews of papers in the fields of psychology and cancer biology found that only 40% and 10% of the results, respectively, could be reproduced. Nature published the results of a survey of researchers in 2016 that […]
Assumptions Can Ruin Your K-Means Clusters
Jan 31, 2017
Clustering is one of the most powerful and widely used machine learning techniques. It’s very seductive. Throw some data into the algorithm and let it discover hitherto unknown relationships and patterns. K-means is the most popular of all the clustering algorithms. It’s easy to understand, and therefore to implement, so it’s available in almost all analysis […]