447 / 547 Visualizing Data. An introductory course by Richard Layton at Rose-Hulman Institute of Technology.
Introduction
Frequent links
Moodle video (accessible to Rose-Hulman students only)
Moodle reading (accessible to Rose-Hulman students only)
R for Data Science reading (Wickham and Grolemund, 2017)
reading hardcopy
blog reading
w | d | agenda & assignments | milestones | due |
---|---|---|---|---|
1 | M | 1.1 Getting started Introduction to visual rhetoric [slides] Syllabus highlights Syllabus About the course Doumont (2009) Designing the graph | | |
1 | T | 1.2 Assessing the structure of a data set Data structure / graph design [exercise] [hints] Structured data excerpts D1 data structure Install software | D1 data search sources | Doumont reading |
1 | W lab | 1.3 Software studio Introduction to means [slides] Software studio R basics | Software setup complete | |
1 | R | 1.4 Graphical repertoire Introduction to repertoire [slides] Portfolio highlights Sign-out Tufte reprints Reading prompts 1 | | |
2 | M | 2.1 Data basics lesson (17 min) slides tutorial exercises | D1 data identified wk1 exercises complete | |
2 | T | 2.2 Reading discussion D2 data structure Tufte (1997) Decision to launch Challenger | D2 data search sources | Reading prompts 1 |
2 | W lab | 2.3 Data studio Data studio introduction (14 min) Data studio Data sources Managing files [slides] Interacting with R Return reprints 6.1 Running code 6.2 RStudio diagnostics | | |
2 | R | 2.4 Graph basics Graph basics introduction (11 min) Graph basics [exercises] 3.10 The layered grammar of graphics | | |
3 | M | R Markdown basics introduction (18 min) [slides] Commit/pull/push regularly R Markdown tutorial RStudio tips | D2 data identified wk2 exercises complete | |
3 | T | Design basics introduction (10 min) D3 data structure Design basics tutorial Robbins (2013a) General design principles | D3 data search sources | |
3 | W lab | Portfolio studio introduction (9 min) Portfolio studio tutorial Sample portfolio skeleton Document design Document requirements Data requirements Sample portfolio entries and critiques 27.2 R Markdown basics 27.3 Text formatting with Markdown 27.4 Code chunks | | |
3 | R | D1 Distributions introduction (13 min) D1 Distributions [requirements] Strip plot tutorial [exercises] Box plot tutorial [exercises] | | |
4 | M | 4.1 Reshaping data lesson: Virginia deaths (8 min) tutorial: Virginia deaths lesson: WHO tuberculosis cases (11 min) tutorial: WHO tuberculosis cases exercises 12.2 Tidy data 12.4 Separating and uniting 12.7 Non-tidy data | D3 data identified wk3 exercises complete | |
4 | T | 4.2 Reading discussion Reading discussion introduction Wainer (2014) 15 displays about one thing | D4 data search sources | Reading prompts 2 |
4 | W lab | 4.3 Presentations, practice, & portfolio studio 3P Studio agenda (2 min) Applying the discussion notes (6 min) | D1 graph & prose Presentation prompts | |
4 | R | 4.4 D2 Multiway D2 Multiway introduction D2 Multiway [requirements] Multiway dot plot tutorial [exercises] | | |
5 | M | Introducing factors Working with factors [exercises] 15.2 Creating factors 15.4 Modifying factor order 15.5 Modifying factor levels | D4 data identified wk4 exercises complete | |
5 | T | D5 data structure Discovering stories [reading] [reflection] | D5 data search sources | |
5 | W lab | Presentations, practice, & portfolio studio | D2 graph & prose Presentation prompts Reflection on rhetoric | |
5 | R | D3 Exploring correlations [requirements] Scatterplot [exercises] 28.2 Label | | |
6 | M | Carpentry with joins [exercises] 13.4 Mutating joins | D5 data identified wk5 exercises complete | |
6 | T | D6 data structure D4 Graph injuries/fatalities ethically [requirements] Dot plot [exercises] Image magick Dragga and Voss (2001) Cruel pies | D6 data search sources | Reading prompts 3 |
6 | W lab | Presentations, practice, & portfolio studio | D3 graph & prose Presentation prompts | |
6 | R | Time and dates Time series data Line graph [exercises] 16.2 Creating date/times 16.3 Date-time components 16.4 Time spans | | |
7 | M | Graphical lies [reflection] Wainer (2000) How to display data badly | D6 data identified wk6 exercises complete | |
7 | T | D7 data structure D5 Redesign a graphical lie [requirements] Correcting graphical lies [slides] | D7 data search sources | |
7 | W lab | Presentations, practice, & portfolio studio | D4 graph & prose Presentation prompts Reflection on rhetoric | |
7 | R | Misc data carpentry [exercises] | | |
8 | M | D6 Multivariate data [requirements] Scatterplot matrix [exercises] Parallel coordinate [exercises] Conditioning plot [exercises] | D7 data identified wk7 exercises complete | |
8 | T | Kostelnick (2007) Conundrum of clarity | Reading prompts 4 | |
8 | W lab | Presentations, practice, & portfolio studio PDF scraping example | D5 graph & prose Presentation prompts | |
8 | R | Color [slides] Friendly guide to color (Rost, 2018a) Choosing colors (Rost, 2018b) Scales [slides] Robbins (2013b) Scales 28.4 Scales | | |
9 | M | D7 Learn a display [requirements] Examples and resources Revising portfolio entries [slides] | wk8 exercises complete | |
9 | T | Graph editing: points Graph editing: lines [exercises] | | |
9 | W lab | Presentations, practice, & portfolio studio | D6 graph & prose Presentation prompts | |
9 | R | Graph editing: smooth fit Graph editing: annotation | | |
10 | M | Graph design improvisation ggplot extensions Beware of poor design [slides] | wk9 exercises complete | |
10 | T | Spence (2006) Playfair & psychology of graphs | Reading prompts 5 | |
10 | W lab | Portfolio final editing Presentations, practice, & portfolio studio | D7 graph & prose Presentation prompts | |
10 | R | Data tables Rendering multiple files Course evaluations | Portfolio, final push Friday, 5pm | |
11 | M | Finals week, no class, no exam The portfolio after the term Updating the R habitat | | |
Index
course management
R & RStudio
data
graphs
portfolio
visual rhetoric and graph design
project management
software
readings
- R basics
- Interacting with R
- RStudio tips
- Updating the R habitat
- Color names in R
- R functions in tutorials
Basics
- Data basics slides
- Four basic data skills
- Data in base R and in R packages
- Reading raw data files
- Web download using import()
- Data directory write and read
- All R objects have types
- Some R objects have attributes
- All R objects have class
- Datasets in tutorials
- Time and dates
- Time series data
- PDF scraping example
Factors
- Factor type and attributes
- Factor definition
- Creating a factor variable
- Reorder factor levels manually
- Reorder factor levels by a date variable
- Reorder factor levels by a quantitative variable
- Reorder factor levels by frequency of levels
- Recode factor levels
- Remove unused levels
- Reverse factor level order
Data studio
- Classify your data structure
- Use Notepad for CSV files
- Workflow basics
- Data transformation filter(), arrange(), select(), mutate(), group_by(), and dplyr::summarize()
- Data import
- Data links
Time series
- Time series with separate year month day
- Time series with decimal dates
- Edit the date scale
- Facet by a date variable
Data carpentry
- Data in row names
- Keys and values in coordinatized data
- rowrecs_to_blocks()
- blocks_to_rowrecs()
- WHO case study in data reshaping
- Web download using import()
- select with matches()
- unpivot_to_blocks()
- drop_na()
- str_replace()
- separate()
- WHO group_by() and summarize()
- WHO graphs
- Carpentry with joins
- MIDFIELD data
- MIDFIELD joins
- MIDFIELD carpentry
- MIDFIELD design
- Swiss bank data
Data exercises
- Data basics exercises
- 4.4.1 workflow
- 4.4.2 workflow
- 4.4.3 workflow
- 5.2.4 filter()
- 5.3.1 arrange()
- 5.4.1 select()
- 5.5.2 mutate()
- 5.6.7 group_by()
- Data reshaping pivot_to_rowrecs(), unpivot_to_blocks(), rowrecs_to_blocks(), and blocks_to_rowrecs()
- Carpentry with joins
Graph tutorials
- Graph basics
- Strip plot
- Box plot
- Multiway dot plot
- Scatterplot
- Dot plot
- Line graph
- Scatterplot matrix, ggscatmat(), ggpairs(), and non-ggplot packages
- Parallel coordinate
- Conditioning plot
Graph exercises
- Graph basics exercises
- Strip plot exercises
- Box plot exercises
- Multiway dot plot exercises
- Scatterplot exercises
- Dot plot exercises
- Line graph exercises
- Scatterplot matrix exercises
- Parallel coordinate exercises
- Conditioning plot exercises
Graph elements
- Line color by group
- Panels with free y-scales
- symbol color
- symbol size
- symbol shape
- text as symbols
- text as symbols legend
- line color
- line type
- line size
- reference lines vertical
- reference lines horizontal
- reference lines sloped
- linear fit
- loess fit
- identical fit in every panel
- highlight data symbols
- highlight data with labels
- text placed arbitrarily
- text placed arbitrarily in multiple panels
Portfolio
Document design
Data requirements
Sample README
- D1 distributions
- D2 multiway
- D3 correlations
- D4 injuries or fatalities
- D5 redesign a graphical lie
- D6 multivariate
- D7 self-taught
R Markdown basics
- Rmd basics
- create an Rmd script
- set the Rmd output format
- how to format text
- initialize a report
- initialize knitr
- introductory prose
- using code chunks
- source R scripts
- data table
- include graphics
- spell check
Resources
- Set up README
- Adding links to README
- Set up reading responses
- Document design
- Media
- Fonts
- Headings
- Text color
- Emphasis
- Hyphens and dashes
- Color names in R
- Portfolio final editing
- The portfolio after the term
Citations and references
- BibTeX
- entry types
- citation keys
- fields
- notes on usage
- articles
- books
- in a book
- in proceedings
- web pages
- software
- summary of entry types
Portfolio studio
- Adding links to README
- Importing images
- Typesetting mathematics
- Create the bib file
- BibTeX entry types
- YAML bibliography argument
- Add a citation
- Add a references heading
- Format the citations and references
- Reading responses
- Presentation responses
- Color names in R
- Design basics
- Discovering stories
- Follow good design practices
- Beware Simpson’s paradox
- Adjust for inflation
- Adjust for population
- Adjust for PPP
- Adjust for lack of context
- Video links discovering stories
- Video links correcting graphical lies
Reading and reflection prompts
Copy and paste the Rmd markup into your own Rmd file(s).
- Reading prompts 1 Tufte (1997) Decision to launch Challenger
- Reading prompts 2 Wainer (2014) 15 displays about one thing
- Reading prompts 3 Dragga and Voss (2001) Cruel pies
- Reading prompts 4 Kostelnick (2007) Conundrum of clarity
- Reading prompts 5 Spence (2006) Playfair & psychology of graphs
- Reflection on rhetoric discovering stories
- Reflection on rhetoric correcting graphical lies
- Introductory slides
- Managing files
- Planning the directory structure
- Hyphens and underscores in file names
- Planning a file-naming scheme
- Using relative paths
- Searching files
Getting started
Software studio
- Set up GitHub
- Create a repo
- Invite collaborator
- Create an Rproject
- Create the Renviron
- Set up directories
- Edit gitignore
- Set up README
- Set up reading responses
- Commits
Doumont J-L (2009) Designing the graph. Trees, maps, and theorems: Effective communication for rational minds. Principiae, Kraainem, Belgium, 133–143 http://www.treesmapsandtheorems.com/
Dragga S and Voss D (2001) Cruel pies: The inhumanity of technical illustrations. Technical Communication 48(3), 265–274
Knaflic CN (2012a) Telling multiple stories (part 1). http://tinyurl.com/y4oz8vtv
Knaflic CN (2012b) Telling multiple stories (part 2). http://tinyurl.com/y4jk4jjs
Knaflic CN (2012c) And the winner is... http://tinyurl.com/y462kkbz
Knaflic CN (2013a) Logic in order. http://tinyurl.com/yxf8gspl
Knaflic CN (2013b) The right amount of detail. http://tinyurl.com/y24gn8o4
Knaflic CN (2014) Multifaceted data and story. http://tinyurl.com/yxq8xuf2
Kostelnick C (2007) The visual rhetoric of data displays: The conundrum of clarity. IEEE Transactions on Professional Communication 50(2), 280–294
Robbins N (2013a) General principles for creating effective graphs. Creating More Effective Graphs. Chart House, Wayne, NJ, 154–225 http://www.nbr-graphs.com/resources/recommended-books/
Robbins N (2013b) Scales. Creating More Effective Graphs. Chart House, Wayne, NJ, 226–291 http://www.nbr-graphs.com/resources/recommended-books/
Rost LC (2018a) Your friendly guide to colors in data visualisation. https://blog.datawrapper.de/colorguide/
Rost LC (2018b) What to consider when choosing colors for data visualization. https://blog.datawrapper.de/colors/
Spence I (2006) William Playfair and the psychology of graphs. 2006 JSM Proceedings. American Statistical Association, Alexandria, VA, 2426–2436 http://tinyurl.com/y2njxrbv
Tufte E (1997) The decision to launch the space shuttle Challenger. Visual and statistical thinking: Displays of evidence for making decisions. Graphics Press, Cheshire, CT, 16–31 https://www.edwardtufte.com/tufte/books_textb
Wainer H (2000) Graphical failures: How to display data badly. Visual revelations: Graphical tales of fate and deception from Napoleon Bonaparte to Ross Perot. Psychology Press, Mahwah, NJ, 11–40
Wainer H (2014) Fifteen displays about one thing. Medical illuminations: Using evidence, visualization, and statistical thinking to improve healthcare. Oxford University Press, Oxford, UK, 32–49
Wickham H and Grolemund G (2017) R for Data Science. O’Reilly Media, Inc., Sebastopol, CA https://r4ds.had.co.nz/