Exploratory Data Analysis

7 hours to complete
Flexible Schedule

Roger D. Peng, PhD, Jeff Leek, PhD, Brian Caffo, PhD

What You’ll Learn

Understand analytic graphics and the base plotting system in R

Use advanced graphing systems such as the Lattice system

Make graphical displays of very high-dimensional data

Apply cluster analysis techniques to locate patterns in data

Skills You’ll Gain

ggplot2, Histogram, Unsupervised Learning, Color Theory, Exploratory Data Analysis, Plot (Graphics), Data Visualization Software, Statistical Analysis, Scatter Plots, Data Analysis, Box Plots, Data Visualization, R Programming, Dimensionality Reduction, Graphing

Shareable Certificate

Earn a shareable certificate to add to your LinkedIn profile.

There are 4 modules in this course

This week covers the basics of analytic graphics and the base plotting system in R. We've also included some background material to help you install R if you haven't done so already.

Welcome to Week 2 of Exploratory Data Analysis. This week covers some of the more advanced graphing systems available in R: the Lattice system and the ggplot2 system. While the base graphics system provides many important tools for visualizing data, it was part of the original R system and lacks many features that are desirable in a plotting system, particularly when visualizing high-dimensional data. The Lattice and ggplot2 systems also simplify plot layout, making it a much less tedious process.

Welcome to Week 3 of Exploratory Data Analysis. This week covers some of the workhorse statistical methods for exploratory analysis. These methods include clustering and dimension reduction techniques that allow you to make graphical displays of very high-dimensional data (data with many, many variables). We also cover novel ways to specify colors in R so that you can use color as an important and useful dimension when making data graphics. All of this material is covered in chapters 9-12 of my book Exploratory Data Analysis with R.

This week, we'll look at two case studies in exploratory data analysis. The first involves the use of cluster analysis techniques, and the second is a more involved analysis of some air pollution data. How one goes about doing EDA is often personal, but I'm providing these videos to give you a sense of how you might proceed with a specific type of dataset.