Randomly Sampling Rows in R
May 22, 2019
It’s impossible to imagine a data scientist who does not have to randomly sample datasets on a regular basis. Most employ the useful and easy function sample(), defined in R’s base namespace. Let’s take a closer look at sample() and then at a flexible alternative that is just as easy […]
Your Linear Regression Need Not Be Linear
Jan 31, 2019
Anyone who has ever done a linear regression in R has seen an R formula. R formulae are examples of Wilkinson notation, sometimes called Wilkinson-Rogers notation. The same notation is used by other applications, including MATLAB and Octave, to indicate relationships between variables. In most cases, R users have seen the Wilkinson notation […]
Using .Rprofile to Customize your R Environment
Jan 22, 2019
When R starts up, it looks for several files to initialize its configuration. R has its own configuration file called Rprofile.site, located in the etc folder of the R installation (R_HOME/etc). It is probably wise not to modify this file unless you are very confident in your understanding. Most R users are aware that during startup, R reads an […]
Preparing SQL Data for R Visualizations Using Power Query Pivot
Jan 11, 2019
The Data Frame: The fundamental data structure used by the majority of R functions and packages is the data frame. In a data frame, sets of related values constitute rows, while an individual column vector in a data frame contains comparable measures that can be summed, averaged, or subjected to any number of numerical manipulations. […]
GPU Processing in R: Is it worth it?
Nov 28, 2018
It would be difficult for an R user not to have heard of GPU processing. In 2006, about seven years after it invented the GPU, Nvidia released the first incarnation of CUDA, the architecture that allowed scientists, engineers, and statisticians to use high-end graphics processors as pure floating-point number-crunchers. Today, Nvidia’s CUDA platform, specific for […]