CS50R - Lecture 4 - Tidying Data | CS50

This episode focuses on tidying data in R using the tidyverse, a collection of packages. It introduces the concept of tidy data: each observation is a row, each variable is a column, and each value is a cell. The discussion covers the dplyr package, including the functions select, filter, arrange, distinct, group_by, and summarize, using a storms dataset as an example. The pipe operator is introduced as a way to chain function calls into more readable code. The episode also explores the tidyr package, particularly the pivot_wider function, which reshapes data for easier analysis, and the stringr package, which cleans messy character values.

Outlines


Part 1: Introduction, Packages, and Tidyverse

00:20 Introduction to Tidy Data with R: Packages, CRAN, and the Tidyverse

Part 2: Data Transformation with dplyr

02:47 Transforming Data with dplyr: Select, Filter, and Introduction to Joins

17:32 Understanding Tibble Output and Arranging Data with dplyr

23:32 Removing Duplicates with Distinct and Saving Data to CSV

30:06 Finding the Strongest Hurricane Each Year: group_by, slice_max, and summarize

Part 3: Tidy Principles and Data Restructuring

42:55 Tidy Data Principles: Observations, Variables, and Values

48:25 Tidying Data with pivot_wider: Converting Rows to Columns

Part 4: String Cleaning and Standardization

59:37 Cleaning Character Strings with stringr: Trimming, Squishing, and Standardizing