Support groups #3

Open
msberends opened this issue Feb 26, 2018 · 3 comments

Comments

@msberends

Hey,

This is a really cool package! Feature suggestion: add groups as used by dplyr.

Let's say I have:

library(dplyr)  # provides tibble(), %>%, group_by() and summarise()

tbl1 <- tibble(
  x = rep(LETTERS[1:6], 3),
  type = paste("Type", c(rep(1, 6), rep(2, 6), rep(3, 6))),
  value = round(runif(18) * 500)
)

And then use it with your addin:

tbl1 %>%
  group_by(x) %>%
  summarise(n = n())

This would give me 1 View, namely:

# Tab name: 1. Summarise

      x          n
1     A          3
2     B          3
3     C          3
4     D          3
5     E          3
6     F          3

But it would be great if you would support groups with the dplyr functions group_size and group_vars:

# Tab name: 1. Group by

      group_var min_size max_size
1             x        3        3

# Tab name: 2. Summarise

      x          n
1     A          3
2     B          3
3     C          3
4     D          3
5     E          3
6     F          3

In this example n() is equal to group_size, but that isn't always the case, obviously. Maybe not the best example then 😄
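
For what it's worth, the "Group by" tab could be computed roughly like this. It is only a sketch: group_overview() is a made-up helper name (it exists in neither dplyr nor ViewPipeSteps), and I'm assuming a one-row summary per grouped step is what the tab should show. Only dplyr's group_vars() and group_size() are real here.

```r
library(dplyr)

# Hypothetical helper, not part of ViewPipeSteps or dplyr:
# condenses the grouping of a (grouped) data frame into one row.
group_overview <- function(.data) {
  sizes <- group_size(.data)  # one integer per group
  tibble(
    group_var = paste(group_vars(.data), collapse = ", "),
    min_size  = min(sizes),
    max_size  = max(sizes)
  )
}

tbl1 <- tibble(
  x = rep(LETTERS[1:6], 3),
  type = paste("Type", c(rep(1, 6), rep(2, 6), rep(3, 6))),
  value = round(runif(18) * 500)
)

overview <- tbl1 %>%
  group_by(x) %>%
  group_overview()

overview
```

With several grouping variables, group_var would list them all separated by commas; for an ungrouped table it would simply be an empty string.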

@daranzolin
Owner

Thanks @msberends!

Interesting, I hadn't thought of that, and I wasn't even aware of group_size and group_vars. I'll have to think about what that would look like.

No View() tabs, but does skimr::skim from the skimr package get you what you want? It would be cool if ViewPipeSteps could pop open three tabs for the below output.

tbl1 <- tibble(
  x = rep(LETTERS[1:6], 3),
  type = paste("Type", c(rep(1, 6), rep(2, 6), rep(3, 6))),
  value = round(runif(18) * 500)
)

tbl1 %>%
  group_by(x) %>%
  skimr::skim()

Skim summary statistics
 n obs: 18 
 n variables: 3 
 group variables: x 

Variable type: character 
  x  var missing complete n min max empty n_unique
1 A type       0        3 3   6   6     0        3
2 B type       0        3 3   6   6     0        3
3 C type       0        3 3   6   6     0        3
4 D type       0        3 3   6   6     0        3
5 E type       0        3 3   6   6     0        3
6 F type       0        3 3   6   6     0        3

Variable type: numeric 
  x   var missing complete n   mean     sd min   p25 median   p75 max     hist
1 A value       0        3 3 166.33 154.12  35  81.5    128 232   336 ▇▁▇▁▁▁▁▇
2 B value       0        3 3 107.33 124.27   9  37.5     66 156.5 247 ▇▇▁▁▁▁▁▇
3 C value       0        3 3 221    208.04  92 101      110 285.5 461 ▇▁▁▁▁▁▁▃
4 D value       0        3 3 386    139.57 229 331      433 464.5 496 ▇▁▁▁▁▁▇▇
5 E value       0        3 3 124.33  53.43  92  93.5     95 140.5 186 ▇▁▁▁▁▁▁▃
6 F value       0        3 3 253    206.87  17 178      339 371   403 ▇▁▁▁▁▁▇▇

@msberends
Author

That's cool! Didn't know that package. But I like your views better 😉
Just give it some thought, and we'll see what it brings.

@joachim-gassen
Collaborator

It has been a while, but just for the sake of completeness: the extensions that I just introduced to my fork allow using skimr::skim() to explore grouping issues.

devtools::install_github("joachim-gassen/ViewPipeSteps")
library(tidyverse)
library(ViewPipeSteps)

my_print_cmd <- c(
  "message(title);",
  "skimr::skim_tee(.data = ps%d)"
)

diamonds %>%
  select(carat, cut, color, clarity, price) %>%
  group_by(color) %>%
  print_pipe_steps(my_print_cmd, all = TRUE) %>%
  summarise(n = n(), price = mean(price)) %>%
  arrange(desc(color))
