Next Steps
Now that you have learned the fundamentals of building a data science workflow in R, here are some resources to help you continue developing your skills.
Reproducibility and Scaling
One of the key ideas from this workshop is that your workflow should be reproducible and scalable.
A good workflow allows you to:
- rerun your entire analysis from start to finish
- save outputs in a consistent and organized way
- apply the same process to larger datasets or more powerful computing environments
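One small habit that supports consistent, organized outputs is creating the output folder from inside the script, so a fresh rerun never fails on a missing directory. The snippet below is a minimal sketch in base R. The folder name `outputs` and the variable names `out_dir` and `predictions_path` are assumptions chosen for illustration, not part of the workshop code.

```r
# Assumption: results are stored in a folder named "outputs"
# inside the current working directory.
out_dir <- "outputs"

# Create the folder if it is missing; leave it alone if it already exists.
if (!dir.exists(out_dir)) {
  dir.create(out_dir, recursive = TRUE)
}

# Build file paths from the folder name instead of retyping it each time.
predictions_path <- file.path(out_dir, "lm_test_predictions.csv")
```

A path built this way can be passed to any writing function, such as `write_csv()` from the readr package.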
For example, you can save model outputs:
write_csv(eval_df, "outputs/lm_test_predictions.csv")
You can also run your entire workflow as a script:
Rscript workshop_workflow.R
This becomes especially useful when:
- working with larger datasets
- automating analyses
- running jobs on high-performance computing (HPC) systems
The same workflow you used in this workshop can scale from your laptop to much larger systems.
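One way to let a single script serve both a laptop and a larger system is to read the input file and output folder from the command line instead of hard-coding them. The following is a sketch under stated assumptions: the function `parse_workflow_args()`, the two positional arguments, and the default paths `data/input.csv` and `outputs` are hypothetical and do not come from the workshop script.

```r
# Hypothetical helper: turn command-line arguments into workflow settings.
# Assumed call pattern: Rscript workshop_workflow.R <input_csv> <output_dir>
parse_workflow_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  list(
    # Fall back to placeholder defaults when an argument is not supplied.
    input_path = if (length(args) >= 1) args[[1]] else "data/input.csv",
    output_dir = if (length(args) >= 2) args[[2]] else "outputs"
  )
}

cfg <- parse_workflow_args()

# Make sure the output folder exists before anything is written to it.
dir.create(cfg$output_dir, recursive = TRUE, showWarnings = FALSE)

message("Reading from: ", cfg$input_path)
message("Writing to:   ", cfg$output_dir)
```

With this pattern, pointing the analysis at a larger dataset means changing the arguments passed to `Rscript` (for example in a batch job script on an HPC system) while the R code itself stays the same.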
Workshops
Please visit LibCal to register for additional R workshops. Topics include:
- Effective Data Visualization in R
Online Resources
- R for Data Science (2e) by Hadley Wickham et al.
- Tidy Modeling with R by Max Kuhn and Julia Silge