Episode 14: Tips and Tricks for Using R-Markdown

Posted on Wednesday, Nov 18, 2015 | Category: Podcast
The R-Podcast is back up and running! In this episode I discuss some useful resources and helpful tips/extensions that have greatly enhanced my workflow in creating reproducible analysis documents via R-Markdown. I also highlight some exciting new endeavors in the R community as well as provide my take on two key events that further illustrate the rapidly growing use of R across many industries. A big thank you to all who expressed their support during the extended hiatus, and please don’t hesitate to provide your feedback and suggestions for future episodes. I hope you enjoy this episode!

Show Notes

Resources produced by RStudio:

Viewing R-Markdown output in real-time

  • Use Yihui’s servr package to preview the rendered document in the RStudio viewer in real time while editing the source file.

Creating tables in R-Markdown:

  • Pander package offers many customized table options for markdown
  • kable function in the knitr package
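As a rough illustration of the kable option above, the sketch below renders a few rows of the built-in mtcars data set as a pipe-style markdown table. The data set, the selected columns, and the digits setting are assumptions made for this example and are not taken from the episode.

```r
# Assumes the knitr package is installed.
library(knitr)

# Illustrative data: a few rows and columns of the built-in mtcars data set.
cars_preview <- head(mtcars[, c("mpg", "cyl", "hp")], 5)

# Build a markdown table; digits controls rounding of numeric columns.
tbl <- kable(cars_preview, format = "markdown", digits = 1)

print(tbl)
```

Inside an R-Markdown chunk the table can simply be returned as the last expression of the chunk. For the pander package, pander::pander(cars_preview) is the roughly equivalent starting point.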

Dealing with multiple output formats:

Insert the following code chunk at the beginning of the document:

out_type <- knitr::opts_knit$get("rmarkdown.pandoc.to")

Then use conditional logic to perform different tasks depending on the output type (docx, html, pdf, md).
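A minimal sketch of that conditional logic might look like the following. The helper name table_format and the mapping it returns are hypothetical and only illustrate the pattern. One assumption worth checking in your own setup: for PDF output the reported value is typically "latex" rather than "pdf", because pandoc produces PDF through LaTeX.

```r
# Hypothetical helper (not from the episode): choose a table format
# based on the pandoc target reported by knitr.
table_format <- function(out_type) {
  # out_type is NULL when the code runs outside rmarkdown::render()
  if (is.null(out_type)) {
    return("markdown")
  }
  switch(out_type,
    html   = "html",
    latex  = "latex",   # PDF output is produced through LaTeX
    beamer = "latex",
    "markdown"          # fallback for docx, md, and anything else
  )
}

# Typical use inside a document, guarded so the sketch also runs standalone.
if (requireNamespace("knitr", quietly = TRUE)) {
  out_type <- knitr::opts_knit$get("rmarkdown.pandoc.to")
  message("Table format: ", table_format(out_type))
}
```

The returned value could then be passed on, for example as the format argument of knitr::kable, so that a single source file produces sensible tables for each output type.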

Interactivity with R Markdown:

R Community Roundup

  • The R-Talk Podcast: Check out their interviews with David Smith and Jenny Bryan
  • Not So Standard Deviations Podcast: While not specifically focused on R, the language has come up quite a bit in their early episodes, such as in their discussion of the impact of the “Hadleyverse”
  • METACRAN: METACRAN is a (somewhat integrated) collection of small services around the CRAN repository of R packages. It contains this website, a mirror at GitHub, a database with API, package search, database of package downloads (from the RStudio mirror), tools to check R packages on GitHub, etc.
  • Hadley Wickham’s recent Reddit AMA!
  • First-ever Shiny Developer Conference to be held at Stanford University on January 30-31, 2016 (agenda)

Package Pick

  • captioner: An R package for generating figure/table numbers and captions, especially for Rmd docs
  • Using captioner vignette

News

Linux Foundation Announces R Consortium to Support Millions of Users Around the World

  • “The R language is used by statisticians, analysts and data scientists to unlock value from data. It is a free and open source programming language for statistical computing and provides an interactive environment for data analysis, modeling and visualization. The R Consortium will complement the work of the R Foundation, a nonprofit organization based in Austria that maintains the language. The R Consortium will focus on user outreach and other projects designed to assist the R user and developer communities.”

  • “Founding companies and organizations of the R Consortium include The R Foundation, Platinum members Microsoft and RStudio; Gold member TIBCO Software Inc.; and Silver members Alteryx, Google, HP, Mango Solutions, Ketchum Trading and Oracle.”

  • Hadley Wickham elected as chair of the Infrastructure Steering Committee (ISC)

  • “The R Consortium’s first grant is awarded to Gábor Csárdi, Ph.D., to implement R-Hub, a new service for developing, building, testing and validating R packages. R-Hub will be complementary to both CRAN, the major repository for open source R packages, and R-Forge, the platform supporting R package developers. R-Hub will provide build services, continuous integration for R packages and a distribution mechanism for R package sources and binaries.”

Microsoft Closes Acquisition of Revolution Analytics

  • “R is the world’s most popular programming language for statistical computing and predictive analytics, used by more than 2 million people worldwide. Revolution has made R enterprise-ready with speed and scalability for the largest data warehouses and Hadoop systems. For example, by leveraging Intel’s Math Kernel Library (MKL), the freely available Revolution R Open executes a typical R benchmark 2.5 times faster than the standard R distribution and some functions, such as linear regression, run up to 20 times faster. With its unique parallel external memory algorithms, Revolution R Enterprise is able to deliver speeds 42 times faster than competing technology from SAS.”

  • “We’re excited the work we’ve done with Revolution R will come to a wider audience through Microsoft. Our combined teams will be able to help more users use advanced analytics within Microsoft data platform solutions, both on-premises and in the cloud with Microsoft Azure. And just as importantly, the big-company resources of Microsoft will allow us to invest even more in the R Project and the Revolution R products. We will continue to sponsor local R user groups and R events, and expand our support for community initiatives. We’ll also have more resources behind our open-source R projects including RHadoop, DeployR and the Reproducible R Toolkit. And of course, we’ll be able to add further enhancements to Revolution R and bring R capabilities to the Microsoft suite of products.”

Hosts

Eric Nantz

Eric Nantz is a principal research scientist at a large life sciences company, creating innovative analytical pipelines and capabilities supporting study designs and analyses. Outside of his day job, Eric is passionate about connecting with the R community as the creator/host of the R-Podcast, Shiny Developer Series, and a curator / podcast host for the R Weekly project. Plus, he likes to share his adventures with R and general computing on Twitch livestreams at twitch.tv/rpodcast.