Quantile summary using dplyr
Say you want to compute a quantile summary for multiple variables.
library(dplyr)
library(tidyr)
library(broom)
As an example, let us use the mtcars dataset.
glimpse(mtcars)
## Observations: 32
## Variables: 11
## $ mpg <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19....
## $ cyl <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, ...
## $ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 1...
## $ hp <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, ...
## $ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.9...
## $ wt <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3...
## $ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 2...
## $ vs <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, ...
## $ am <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, ...
## $ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, ...
## $ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, ...
We might be interested in some specific quantiles, here the 50th, 75th, and 95th percentiles.
p <- c(0.5, 0.75, 0.95)
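Before reshaping anything, it can help to look at what base R's quantile() returns for a single column. This is a minimal sketch using the same probabilities as above; the name `q_mpg` is introduced here purely for illustration.

```r
# Probabilities of interest (same values as in the text)
p <- c(0.5, 0.75, 0.95)

# quantile() returns a named numeric vector, one element per probability
q_mpg <- quantile(mtcars$mpg, probs = p)

# The element names are percentage labels: "50%" "75%" "95%"
names(q_mpg)
```

The percentage labels are what later end up, in a slightly mangled form, as row labels of the summary table.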
How can we do this using tidyverse syntax and pipelines?
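mtcars %>%
  # reshape to long format: one row per (variable, value) pair
  gather(variable, value, mpg:carb) %>%
  group_by(variable) %>%
  # one row of quantiles per variable; data.frame() converts the 1-row
  # matrix and makes the names syntactic ("50%" becomes "X50.")
  do(data.frame(t(quantile(.$value, p)))) %>%
  gather(quantile, value, -variable) %>%
  # back to wide format: one column per variable, one row per quantile
  spread(variable, value)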
## # A tibble: 3 x 12
##   quantile    am  carb   cyl  disp  drat  gear    hp   mpg  qsec    vs
##   <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 X50.      0     2.00  6.00   196  3.70  4.00   123  19.2  17.7  0
## 2 X75.      1.00  4.00  8.00   326  3.92  4.00   180  22.8  18.9  1.00
## 3 X95.      1.00  4.90  8.00   449  4.31  5.00   254  31.3  20.1  1.00
## # ... with 1 more variable: wt <dbl>
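In newer tidyverse releases, gather(), spread() and do() are superseded, so the same summary can probably be written with pivot_longer(), reframe() and pivot_wider() instead. The following is only a sketch under assumptions: it needs dplyr 1.1.0 or later (for reframe()) and tidyr 1.0.0 or later, and the name `quantile_summary` is made up for this example.

```r
library(dplyr)
library(tidyr)

# Probabilities of interest (same values as in the text)
p <- c(0.5, 0.75, 0.95)

quantile_summary <- mtcars %>%
  # long format: one row per (variable, value) pair
  pivot_longer(everything(), names_to = "variable", values_to = "value") %>%
  group_by(variable) %>%
  # reframe() may return several rows per group, here one per probability
  reframe(
    quantile = paste0(100 * p, "%"),
    value = stats::quantile(value, probs = p, names = FALSE)
  ) %>%
  # wide format: one column per variable, one row per quantile
  pivot_wider(names_from = variable, values_from = value)

quantile_summary
```

With this variant the row labels stay as "50%", "75%" and "95%" instead of "X50.", "X75." and "X95.", because the labels are built directly rather than passed through a data frame's column names.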