When programming in languages other than F#, I yearn for the pipeline operator (|>). The pipeline operator lets you pass an intermediate result on to the next function. So, if you wish to apply functions h, g, and f, in that order, to a value x, instead of writing
f (g (h x))
you can write
x |> h |> g |> f
Personally, I think this is easier to read. It more naturally represents the order of execution: apply h to x, then apply g to that result and, finally, apply f.
The value the pipeline operator brings in terms of readability becomes even more obvious when we look at a less abstract example. Let’s say that we want to sum the squares of the odd numbers between 1 and 10. In F#, without the pipeline operator, this might be coded as
Seq.sum (Seq.map (fun x -> x * x) (Seq.filter (fun x -> x % 2 = 1) [| 1..10 |]))
However, with the pipeline operator, we can write
[| 1..10 |] |> Seq.filter (fun x -> x % 2 = 1) |> Seq.map (fun x -> x * x) |> Seq.sum
This more cleanly separates the various stages of the calculation.
Of course, we could have written
let numbers = [| 1..10 |]
let oddNumbers = Seq.filter (fun x -> x % 2 = 1) numbers
let squaredOddNumbers = Seq.map (fun x -> x * x) oddNumbers
let result = Seq.sum squaredOddNumbers
but that doesn’t feel very functional.
I should note that |> is actually one of a pair of pipeline operators in F#. Strictly speaking, it is the pipe-forward operator. There is also a pipe-backward operator (<|), but pipe-forward is far more common.
The pipeline operator has become the way I think about code so, when I have to return to R, I find the transition jarring. However, as usual, it turns out there’s a package in R that resolves my dilemma: magrittr.
The magrittr package introduces a pipeline operator for R, written %>%. Let’s go straight to an example. Using magrittr, the calculation we made above, translated to R, is
1:10 %>% subset(. %% 2 == 1) %>% . ^ 2 %>% sum
The period (.) is a placeholder for the value(s) piped through from the left of the operator.
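To make the placeholder rule concrete, here is a minimal sketch of how magrittr decides where the piped value goes. The variable names (roots, odds, kept) are only illustrative and are not part of the original example.

```r
library(magrittr)

# Default: the left-hand side becomes the first argument of the call.
roots <- c(1, 4, 9) %>% sqrt()          # same as sqrt(c(1, 4, 9))

# A dot used directly as an argument moves the value to that position.
odds <- 2 %>% seq(1, 10, by = .)        # same as seq(1, 10, by = 2)

# A dot nested inside another expression does not switch off the default,
# so the value is still passed as the first argument as well.
kept <- 1:10 %>% subset(. %% 2 == 1)    # same as subset(1:10, 1:10 %% 2 == 1)
```

The last case is why the subset step in the one-liner above works: the vector is both the thing being subsetted and the value tested inside the condition.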
For a more realistic example we can process the mtcars dataset that is included in the standard R distribution. We’ll make use of the excellent dplyr package to manipulate the dataset. dplyr is designed to support the pipeline operator.
Let’s find the average MPG for all the cars that have under 100 horsepower.
library(dplyr)
library(magrittr)
mtcars %>% subset(hp < 100) %>% summarise(mean.mpg = mean(mpg))
The answer should be about 26.8.
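Since dplyr is built around the pipe, the same question (average MPG for cars under 100 horsepower) can also be asked using only dplyr verbs. This is a sketch rather than the only way to write it: it swaps base R's subset for dplyr's filter, and the names result and mean_mpg are my own choices.

```r
library(dplyr)  # dplyr re-exports %>% from magrittr

result <- mtcars %>%
  filter(hp < 100) %>%              # keep cars under 100 horsepower
  summarise(mean_mpg = mean(mpg))   # average miles per gallon of those cars

print(result)
```

Putting one verb per line keeps each stage of the calculation visible, much like the F# pipeline earlier.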
This has been a brief introduction to the pipeline operator and how you can use it in R. Frankly, I’m all for anything that helps improve the readability of R code without lowering oneself to the level of assigning everything to an intermediate variable.