This week I have been playing with tables
This post is an example of the kind of log that you should post to RPubs each week. The learning log is an opportunity to reflect on what you have learned each week and to think about what the next steps in your coding journey are. It should answer the following questions…
… include examples of plots/code that you have been working on.
This week the Tidy Tuesday dataset is about income disparities in the US. The student debt dataset looks particularly interesting.
library(tidyverse)  # provides read_csv(), glimpse(), as_factor() and ggplot()
debt <- read_csv("debt.csv")
glimpse(debt)
Rows: 30
Columns: 4
$ year <dbl> 2016, 2016, 2016, 2013, 2013, 2013, 2010, 201…
$ race <chr> "White", "Black", "Hispanic", "White", "Black…
$ loan_debt <dbl> 11108.410, 14224.770, 7493.999, 8363.605, 103…
$ loan_debt_pct <dbl> 0.3367511, 0.4183588, 0.2189689, 0.2845555, 0…
Looks like year is numeric, so I will make it a factor so that it works better in the plot.
debt$year <- as_factor(debt$year)
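To see what that conversion does, here is a minimal sketch. It uses base R's factor() rather than forcats' as_factor(), on three year values taken from the glimpse() output above; the variable names are only for illustration.

```r
# Three year values copied from the glimpse() output
years <- c(2016, 2013, 2010)

# Base R factor(): each distinct year becomes a discrete level,
# with levels sorted in increasing order
years_fct <- factor(years)

levels(years_fct)      # the distinct years, as character labels
is.numeric(years_fct)  # FALSE: no longer treated as a continuous number
```

With a factor on the x axis, ggplot2 uses a discrete scale, so only the years that are actually in the data show up as axis breaks.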
Looks like this df contains average family student loan debt for those aged 25-55, by race and year, normalized to 2016 dollars. Let's plot debt over time by race.
debt %>%
  ggplot(aes(x = year, y = loan_debt, colour = race, group = race)) +
  geom_point() +
  geom_line() +
  theme_classic() +
  scale_y_continuous(limits = c(0, 15000)) +
  labs(title = "Average family student loan debt by race and year",
       y = "Loan debt ($2016)",
       x = "Year")
OK, plotting is great, but I am trying to learn about tables this week. My goal is to practice making tables, so let's start by averaging this data into something that might be table worthy.
This chunk groups the debt data by year (averaging across race) and summarises the mean debt levels.
summary <- debt %>%
group_by(year) %>%
summarise(meandebt = mean(loan_debt))
summary
# A tibble: 10 x 2
year meandebt
* <fct> <dbl>
1 1989 1053.
2 1992 1014.
3 1995 1735.
4 1998 1959.
5 2001 2311.
6 2004 3225.
7 2007 4793.
8 2010 6880.
9 2013 7281.
10 2016 10942.
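As a quick sanity check on the grouping step, the 2016 row can be reproduced by hand. This sketch assumes the first three loan_debt values shown by glimpse() are the three 2016 rows (White, Black, Hispanic). It uses base R's aggregate() in place of group_by() and summarise(), and the names debt_2016 and check are made up for the example.

```r
# The three 2016 rows, copied from the glimpse() output
debt_2016 <- data.frame(
  year = factor(c(2016, 2016, 2016)),
  race = c("White", "Black", "Hispanic"),
  loan_debt = c(11108.410, 14224.770, 7493.999)
)

# Base R equivalent of group_by(year) %>% summarise(mean(loan_debt))
check <- aggregate(loan_debt ~ year, data = debt_2016, FUN = mean)
names(check)[2] <- "meandebt"
check
```

The mean comes out at about 10942.39, which matches the 2016 row of the tibble above.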
I know that there are several packages that are useful for making tables in R. My goal is to try them out and work out what the pros/cons are.
Pipe your dataframe into the kbl() function (from the kableExtra package) and you get a basic HTML table. Add kable_styling() to get a Bootstrap theme.
summary %>%
kbl()
year | meandebt |
---|---|
1989 | 1052.853 |
1992 | 1013.949 |
1995 | 1735.111 |
1998 | 1959.001 |
2001 | 2311.100 |
2004 | 3224.881 |
2007 | 4793.357 |
2010 | 6880.418 |
2013 | 7281.225 |
2016 | 10942.393 |
summary %>%
kbl() %>%
kable_styling()
year | meandebt |
---|---|
1989 | 1052.853 |
1992 | 1013.949 |
1995 | 1735.111 |
1998 | 1959.001 |
2001 | 2311.100 |
2004 | 3224.881 |
2007 | 4793.357 |
2010 | 6880.418 |
2013 | 7281.225 |
2016 | 10942.393 |
It is weird that with both of these options, when I run the Rmd chunk it doesn’t display the contents of the table; I just get a white box. But… when I knit the document, the tables render just fine. Not sure what is going on with that. I also don’t really know what kable_styling() does; the only difference in the knitted document is that the table appears centred on the page. There must be more to it….
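On the kable_styling() question, a possible next experiment is to pass it some arguments. The sketch below is only that: it rebuilds the summary table from the values printed above (debt_summary is a made-up name so it does not clash with the summary object), rounds the debt column with the digits argument of kbl(), and asks kable_styling() for striped rows, hover highlighting and a table that does not stretch across the page. The column labels are my own wording.

```r
library(kableExtra)

# Summary values copied from the table output above;
# `debt_summary` is a hypothetical name used only in this sketch
debt_summary <- data.frame(
  year = factor(c(1989, 1992, 1995, 1998, 2001, 2004, 2007, 2010, 2013, 2016)),
  meandebt = c(1052.853, 1013.949, 1735.111, 1959.001, 2311.100,
               3224.881, 4793.357, 6880.418, 7281.225, 10942.393)
)

# digits = 0 rounds the numeric column; col.names sets the header text
styled <- kbl(debt_summary, format = "html", digits = 0,
              col.names = c("Year", "Mean debt (2016 dollars)"))

# bootstrap_options switches on Bootstrap table classes;
# full_width = FALSE keeps the table only as wide as its contents
styled <- kable_styling(styled,
                        bootstrap_options = c("striped", "hover"),
                        full_width = FALSE)
styled
```

Whether the striping is visible will depend on the output format and theme of the knitted document.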
The gt package makes great tables (apparently).
summary %>%
gt()
year | meandebt |
---|---|
1989 | 1052.853 |
1992 | 1013.949 |
1995 | 1735.111 |
1998 | 1959.001 |
2001 | 2311.100 |
2004 | 3224.881 |
2007 | 4793.357 |
2010 | 6880.418 |
2013 | 7281.225 |
2016 | 10942.393 |
oooo that is…. minimalistic. It appears when I run the chunk and only takes up the necessary space. It is perhaps a little skinny though…
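If the skinny layout and the three decimal places are the main complaints, gt has functions aimed at both. This is a rough sketch rather than a recommended recipe: it rebuilds the summary from the values printed above (debt_summary is a made-up name), and the 50% width and the column labels are arbitrary choices of mine.

```r
library(gt)

# Summary values copied from the table output above;
# `debt_summary` is a hypothetical name used only in this sketch
debt_summary <- data.frame(
  year = factor(c(1989, 1992, 1995, 1998, 2001, 2004, 2007, 2010, 2013, 2016)),
  meandebt = c(1052.853, 1013.949, 1735.111, 1959.001, 2311.100,
               3224.881, 4793.357, 6880.418, 7281.225, 10942.393)
)

gt_tbl <- gt(debt_summary)

# Round the debt column to whole dollars
gt_tbl <- fmt_number(gt_tbl, columns = meandebt, decimals = 0)

# Friendlier column headers
gt_tbl <- cols_label(gt_tbl, year = "Year", meandebt = "Mean debt (2016 dollars)")

# Let the table take up half of the available width
gt_tbl <- tab_options(gt_tbl, table.width = pct(50))
gt_tbl
```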
The DT package is an interface to the JavaScript DataTables library.
summary %>%
DT::datatable()
OK, that one seems nice. I don’t love the default font (huh, the font that displays in the Rmd is different to how it renders… why is that?). It’s not too big (or too skinny) though, and the search function could be useful if you were dealing with big tables. The decimal places are a bit of a problem though.
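For the decimal places, DT has a formatRound() helper that looks like it should do the job. A minimal sketch, again rebuilding the summary from the values printed earlier under the made-up name debt_summary:

```r
# Summary values copied from the table output earlier in the post;
# `debt_summary` is a hypothetical name used only in this sketch
debt_summary <- data.frame(
  year = factor(c(1989, 1992, 1995, 1998, 2001, 2004, 2007, 2010, 2013, 2016)),
  meandebt = c(1052.853, 1013.949, 1735.111, 1959.001, 2311.100,
               3224.881, 4793.357, 6880.418, 7281.225, 10942.393)
)

# rownames = FALSE drops the extra row-number column
dt_tbl <- DT::datatable(debt_summary, rownames = FALSE)

# Display the debt column rounded to whole dollars
dt_tbl <- DT::formatRound(dt_tbl, columns = "meandebt", digits = 0)
dt_tbl
```

The rounding only affects how the numbers are displayed; the underlying values in debt_summary are unchanged.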