YouTube, 06 Jun 2024
1h 12m

CS50R - Lecture 4 - Tidying Data

CS50

This episode focuses on tidying data in R with the tidyverse, a collection of packages that share a common design. It introduces the concept of tidy data: each observation is a row, each variable is a column, and each value is a cell. The discussion covers the dplyr package, including the functions select, filter, arrange, distinct, group_by, and summarize, using a storms dataset as the example. The pipe operator is introduced as a way to chain function calls so that code reads top to bottom. The episode also explores the tidyr package, particularly the pivot_wider function, which reshapes data for easier analysis, and the stringr package, which cleans messy data values.
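As a rough illustration of the functions named above, the sketch below chains dplyr verbs with the pipe, standardizes text with stringr, and widens a long table with tidyr. It is not code from the lecture: the `readings` and `long` tibbles and their values are made up for this example rather than taken from the storms dataset, and the native pipe `|>` assumes R 4.1 or later.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Hypothetical, hand-made data (not the lecture's storms dataset).
readings <- tibble(
  name   = c("Ana", "Ana", "Ana", "Bill", "Bill", "Bill"),
  status = c(" Tropical storm", "tropical  storm", "TROPICAL STORM",
             "Hurricane", "hurricane ", "Hurricane"),
  wind   = c(35, 40, 40, 65, 70, 80)
)

# stringr: collapse stray whitespace and lowercase the messy labels.
cleaned <- readings |>
  mutate(status = str_to_lower(str_squish(status)))

# dplyr: keep two columns, then drop duplicate rows.
statuses <- cleaned |>
  select(name, status) |>
  distinct()

# dplyr: filter rows, then summarize the strongest wind per storm.
strongest <- cleaned |>
  filter(wind >= 40) |>
  group_by(name) |>
  summarize(max_wind = max(wind)) |>
  arrange(desc(max_wind))

# tidyr: a long table with one row per (storm, measure) pair...
long <- tibble(
  name    = c("Ana", "Ana", "Bill", "Bill"),
  measure = c("wind", "pressure", "wind", "pressure"),
  value   = c(40, 1004, 80, 992)
)

# ...becomes one row per storm, with one column per measure.
wide <- long |>
  pivot_wider(names_from = measure, values_from = value)
```

Each step takes a data frame and returns a data frame, which is what lets the pipe pass one result straight into the next call without intermediate variables.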

Outline

Part 1: Introduction, Packages, and Tidyverse

Part 2: Data Transformation with dplyr

Part 3: Tidy Principles and Data Restructuring

Part 4: String Cleaning and Standardization