Episode 9: Adventures in Data Munging Part 1

Posted on Sunday, Aug 5, 2012 | Category: Podcast
It’s great to be back with a new episode after an eventful break! This episode begins a series on my adventures in data munging, a.k.a. data processing. I discuss three issues that demonstrate the flexibility and versatility R brings for recoding messy values, importing inconsistent data files, and pinpointing problematic observations and variables. We also have an extended listener feedback segment with an audio installment of the “pitfalls” of R contributed by listener Frans. I hope you enjoy this episode! Keep passing along your feedback to theRcast(at)gmail.com, and stop by the forums as well.

Show Notes

Episode 9 Time Stamps

00:00 The R-Podcast #009: Adventures in Data Munging Part 1
00:31 Introduction
01:38 Big news: +1
03:53 R 2.15.1 released
04:26 UseR! 2012
07:20 Hockey Summary Project
10:30 Dealing with empty files
15:18 Importing inconsistent data files
28:15 Recoding using car package
35:08 Useful functions for pinpointing issues
44:55 Listener Feedback
45:14 Daniel: Advice on data munging
55:01 Frans: Pitfalls of R
66:28 Wrapping up: subscribe to the podcast, theRcast@gmail.com, +1-269-849-9780, Twitter @theRcast, Google Plus
71:22 End


Eric Nantz

Eric Nantz is a principal research scientist at a large life sciences company, creating innovative analytical pipelines and capabilities supporting study designs and analyses. Outside of his day job, Eric is passionate about connecting with the R community as the creator/host of the R-Podcast, Shiny Developer Series, and a curator / podcast host for the R Weekly project. Plus, he likes to share his adventures with R and general computing on Twitch livestreams at twitch.tv/rpodcast.