Transform and analyze data using R and Tidyverse, producing reproducible analyses to uncover insights from real-world datasets.
Create impactful visualizations with RStudio and Quarto, effectively communicating data-driven insights for informed decision-making.
Apply ethical principles in data science by addressing algorithmic bias, safeguarding data privacy, and promoting responsible data usage.
Earn a shareable certificate to add to your LinkedIn profile.
Learn in-demand skills from university and industry experts
Master a subject or tool with hands-on projects
Develop a deep understanding of key concepts
Earn a career certificate from Duke University
This course is an introduction to data science and statistical thinking. Learners will gain experience exploring, visualizing, and analyzing data to understand natural phenomena, investigate patterns, and model outcomes in a reproducible and shareable manner. Topics covered include data visualization and transformation for exploratory data analysis. Through lecture and live-coding videos and interactive programming exercises, learners will be introduced to problems and case studies based on real-world questions and data. The course uses the R statistical computing language, with an emphasis on packages from the Tidyverse, along with the RStudio integrated development environment, Quarto for reproducible reporting, and Git and GitHub for version control. These skills will prepare learners for careers in a variety of fields, in roles such as data scientist, data analyst, quantitative analyst, and statistician, among others.
This course aims to further develop your statistics and data science toolkit. You will learn how to collect, manipulate, and transform data in R into a more readily usable format using tidyverse data pipelines, primarily with verbs from the dplyr and tidyr packages. The topics covered give you the tools to reshape data so that it is better suited for data visualization (Course 1) and for modeling, which is covered in a later course in this certificate program. We also discuss web scraping and the considerations to weigh before scraping data from the web.
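To give a flavor of what such a pipeline can look like, here is a minimal sketch that assumes the dplyr and tidyr packages are installed and R 4.1 or later (for the native pipe). The data frame, its column names, and its values are made up for illustration and are not taken from the course materials.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide-format data: one row per site, one column per year.
# All values are invented for illustration only.
temps_wide <- tibble(
  site   = c("A", "B"),
  `2022` = c(16.1, 15.8),
  `2023` = c(16.4, NA)
)

# tidyr verb: reshape from wide to long so each row is one site-year.
# dplyr verbs: fix the column type and drop missing measurements.
temps_long <- temps_wide |>
  pivot_longer(cols = -site, names_to = "year", values_to = "mean_temp_c") |>
  mutate(year = as.integer(year)) |>
  filter(!is.na(mean_temp_c))

# dplyr verbs: summarize per site, a shape that suits plotting or modeling.
site_summary <- temps_long |>
  group_by(site) |>
  summarize(avg_temp_c = mean(mean_temp_c), n_years = n())
```

The long format produced by `pivot_longer()` is generally the shape that plotting and modeling functions in the Tidyverse expect, which is why reshaping tends to come before visualization.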
This course highlights the ethical responsibilities that statisticians and data scientists have when working with data. It demonstrates situations where ethical concerns with data arise and trains you to be more aware of how data are used and of the intent behind data collection. By the end of this course, you will be able to identify misrepresentation in visualizations, describe the basics of data privacy, and recognize potential situations where algorithmic bias is at play.