Parallel or Perish – An Overview
Jan 5, 2021
I’m pleased to begin a series of blog posts on parallel processing, a topic which is no longer optional in the world of machine learning and AI. In this introductory note, we’ll take an overview of the field which, for better or for worse, is becoming more complex every day. Today we are […]
Effective use of RevoScaleR Transformations
Aug 13, 2019
As mentioned previously, several important RevoScaleR functions include provisions for transforming data within the function itself, rather than requiring separate steps in addition to the function call. This is advantageous, since it means that large datasets can be read once instead of being read repeatedly by several functions. rxImport, rxDataStep, and rxSplit support […]
Using TensorFlow with R
Jun 27, 2019
When examining the available selection of machine learning environments, even the most ardent R user may suffer some pangs of Python envy. Most machine learning environments, such as Google’s TensorFlow, are programmed in C++ for maximum performance and maximum utilization of hardware resources such as GPUs (graphics processing units). The TensorFlow API, however, is designed […]
Loading New R Packages into AzureML
Jun 18, 2019
Microsoft Azure ML provides over 500 individual R packages for use in R scripts. It is almost certain, however, that at some point you will wish to use an R package not available by default. Several years ago, before Revolution Analytics was acquired by Microsoft, Andrie de Vries created a very useful package called miniCRAN. This […]
How to Interpret a Q-Q Plot
Jun 4, 2019
Statisticians have developed a remarkably powerful set of tools for analyzing normally distributed data. Too bad real data is never normally distributed. Fortunately for us, most of the time “close enough” is all we really need. But how are we to know? One quick and effective method is a look at a Q-Q plot. The […]