Automating System Data Analysis Using R

Wednesday, November 01, 2017 - 11:00 am–12:30 pm

Robert Ballance, Independent Computer Scientist

Abstract: 

Data analysis is not just about discovery; it’s about communication. The R programming language and ecosystem constitute a rich tool set for automating the reporting process with reproducible and repeatable results. This 90-minute mini-tutorial will illustrate how the R data analysis pipeline can be applied to generating and delivering reports via documents, with a quick look at related techniques for the Web. The presentation will focus on the essence: automating the process of getting tables and graphics into the hands of users. Topics will include: accessing data stored in files and databases; scripting R to automate tasks; using document generation interfaces to generate reports; and applying R packages such as `brew`, `xtable`, and `ggplot2` to make the process easy and supportable.

This mini-tutorial will:

  • motivate you to pick up R 
  • illustrate ways to simplify your life by automating data analysis and reporting
  • help you to communicate effectively with users and management using R as a platform
  • facilitate the creation of automated analyses so that you and your staff can focus on the hard problems  

Dr. Robert Ballance recently completed a White House Presidential Innovation Fellowship where he applied his skills with R to analyzing and delivering broadband deployment data to communities across the U.S.A. He first developed his R programming skills while managing large-scale High-Performance Computing systems for Sandia National Laboratories. While at Sandia, he developed several R packages used internally for system analysis and reporting. Prior to joining Sandia in 2003, Dr. Ballance managed systems at the University of New Mexico High Performance Computing Center. He has consulted, taught, and developed software, including R packages, Perl applications, C and C++ compilers, programming tools, Internet software, and Unix device drivers. He is a member of USENIX, the ACM, the IEEE Computer Society, the Internet Society, and the American Association for the Advancement of Science. He was a co-founder of the Linux Clusters Institute and recently served as Secretary of the Cray User Group. Bob received his Ph.D. in Computer Science from U.C. Berkeley in 1989.

BibTeX
@conference {207147,
author = {Robert Ballance},
title = {Automating System Data Analysis Using R},
year = {2017},
address = {San Francisco, CA},
publisher = {USENIX Association},
month = oct
}

Who should attend: 

This mini-tutorial is for system administrators who want to communicate their information to others more efficiently. Prior knowledge of R is not required but will be useful. Attendees with prior knowledge of R will find the content directly applicable to their typical tasks. Those who are new to R will find the content useful in evaluating whether R and its associated Unix tools would be beneficial in their work environment.

Take back to work: 
  • Motivation to learn R or improve R skills
  • Understanding of the role for reproducible data analysis in system administration
  • Familiarity with techniques for automating data analysis and reporting using R
  • Next steps to take in mastering R

Topics include: 

Analytics of System Data