- Learners can apply ten functions from the dplyr R package to generate a subset of data for use in a table or plot.
Research Beyond the Lab: Open Science and Research Methods for a Global Engineer
2025-03-06
… based on the concepts of functions as verbs that manipulate data frames

- select: pick columns by name
- arrange: reorder rows
- filter: pick rows matching criteria
- relocate: change the order of the columns
- mutate: add new variables
- summarise: reduce variables to values
- group_by: for grouped operations

Rules of dplyr functions:
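A minimal sketch of these verbs in action, assuming a small made-up data frame (`toy`, with hypothetical columns `country`, `year`, and `percent`) rather than the course data. Each call takes a data frame as its first argument and returns a new data frame, which is why the calls can be chained with the pipe.

```r
library(dplyr)

# Hypothetical toy data, not the course dataset
toy <- tibble(
  country = c("A", "A", "B", "B"),
  year    = c(2000, 2020, 2000, 2020),
  percent = c(20, 35, 50, 80)
)

# select: pick columns by name
toy_cols <- toy |> select(country, percent)

# arrange: reorder rows (here by descending percent)
toy_sorted <- toy |> arrange(desc(percent))

# filter: pick rows matching criteria
toy_2020 <- toy |> filter(year == 2020)

# relocate: change the order of the columns (default: move to the front)
toy_moved <- toy |> relocate(year)

# mutate: add new variables
toy_prop <- toy |> mutate(proportion = percent / 100)

# group_by + summarise: reduce each group to a single value
toy_mean <- toy |>
  group_by(country) |>
  summarise(mean_percent = mean(percent))
```

The original `toy` object is never modified; each result has to be assigned to a name to be kept.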
Anatomy of a filter() call: `gapminder_2007` is the object that stores the result, `<-` is the assignment operator, `|>` is the pipe, `.data =` is the data frame argument, and `year == 2007` states what to do with the data.

| name | iso3 | year | region_sdg | varname_short | varname_long | residence | percent |
|---|---|---|---|---|---|---|---|
| Afghanistan | AFG | 2000 | Central and Southern Asia | san_bas | basic sanitation services | national | 21.9 |
| Afghanistan | AFG | 2000 | Central and Southern Asia | san_bas | basic sanitation services | rural | 19.3 |
| Afghanistan | AFG | 2000 | Central and Southern Asia | san_bas | basic sanitation services | urban | 30.9 |
| Afghanistan | AFG | 2000 | Central and Southern Asia | san_lim | limited sanitation services | national | 5.6 |
| Afghanistan | AFG | 2000 | Central and Southern Asia | san_lim | limited sanitation services | rural | 3.1 |
| Afghanistan | AFG | 2000 | Central and Southern Asia | san_lim | limited sanitation services | urban | 14.5 |

Number of rows per sanitation indicator:

| varname_short | varname_long | n |
|---|---|---|
| san_bas | basic sanitation services | 14742 |
| san_lim | limited sanitation services | 14742 |
| san_od | no sanitation facilities | 14742 |
| san_sm | safely managed sanitation services | 14742 |
| san_unimp | unimproved sanitation facilities | 14742 |
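A table like the one above can be produced by counting rows per group. The following is a hedged sketch, assuming dplyr's `count()` and a small made-up data frame (`toy_sanitation`) that borrows only the column names from the table; it is not necessarily how the slide's table was generated.

```r
library(dplyr)

# Made-up rows; only the column names and labels follow the table above
toy_sanitation <- tibble(
  varname_short = c("san_bas", "san_bas", "san_lim", "san_od"),
  varname_long  = c(
    "basic sanitation services",
    "basic sanitation services",
    "limited sanitation services",
    "no sanitation facilities"
  )
)

# count() groups by the given columns and adds a column n with the row count
toy_counts <- toy_sanitation |>
  count(varname_short, varname_long)
```

Counting by both the short and the long name is a convenient way to check that each code maps to exactly one label.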
Find the md-03a-data-transformation.qmd file and click on it to open it in the top left window.
15:00
Please get up and move! Let your emails rest in peace.
10:00
Image generated with DALL-E 3 by OpenAI
Find the md-03b-your-turn-filter.qmd file and click on it to open it in the top left window.
15:00
Anatomy of a filter() call: `residence == "national"`, etc. state what to do with the data; `sanitation_national_2020_sm` stores the result, using `<-` and `|>`.

Use the filter() function to create a subset from the sanitation data containing urban and rural estimates for Nigeria.

Name of the resulting object: sanitation_nigeria_urban_rural

Great for time series data
Use the ggplot() function to create a connected scatterplot with geom_point() and geom_line() for the data you created in Task 1.2.
Use the aes() function to map the year variable to the x-axis, the percent variable to the y-axis, and the varname_short variable to the color and group aesthetics.
Use facet_wrap() to create a separate plot for urban and rural populations.
Change the colors using scale_color_colorblind().
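
The steps above can be sketched as follows. This is an illustration under assumptions, not the course solution: `toy_sanitation` is a made-up data frame that borrows the column names shown earlier, its values are invented, and `scale_color_colorblind()` is assumed to come from the ggthemes package.

```r
library(dplyr)
library(ggplot2)
library(ggthemes)  # assumed source of scale_color_colorblind()

# Made-up values; only the column names follow the sanitation data
toy_sanitation <- tibble(
  name          = "Nigeria",
  year          = rep(c(2000, 2010, 2020), times = 6),
  residence     = rep(c("national", "rural", "urban"), each = 3, times = 2),
  varname_short = rep(c("san_bas", "san_lim"), each = 9),
  percent       = c(30, 33, 36, 25, 28, 31, 38, 40, 42,
                    10, 11, 12,  5,  6,  7, 18, 19, 20)
)

# Keep only the urban and rural estimates for Nigeria
toy_urban_rural <- toy_sanitation |>
  filter(name == "Nigeria", residence %in% c("urban", "rural"))

# Connected scatterplot: points joined by lines, one panel per residence
p <- ggplot(
  data = toy_urban_rural,
  mapping = aes(x = year, y = percent,
                color = varname_short, group = varname_short)
) +
  geom_point() +
  geom_line() +
  facet_wrap(~residence) +
  scale_color_colorblind()
```

Mapping `varname_short` to both color and group makes `geom_line()` draw one line per indicator within each panel; printing `p` renders the plot.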

Find the md-03a-data-transformation.qmd file and click on it to open it in the top left window.
25:00
Please get up and move! Let your emails rest in peace.
15:00
Find the md-03c-your-turn-summarise.qmd file and click on it to open it in the top left window.
30:00
Slides created via revealjs and Quarto: https://quarto.org/docs/presentations/revealjs/
Access slides as PDF on GitHub
All material is licensed under Creative Commons Attribution Share Alike 4.0 International.