Data Science Resources
Welcome to the resources page for Data Science research. Here you will find links to data, tools, tutorials, and related resources that may be helpful in your work.
Textbooks
Wickham, Hadley, and Garrett Grolemund. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media, Inc., 2016.
Silge, Julia, and David Robinson. Text Mining with R: A Tidy Approach. O’Reilly Media, Inc., 2019.
Think Python, first edition, by Allen B. Downey.
Circular Visualization in R by Zuguang Gu
An Introduction to Machine Learning with R by Laurent Gatto
Software
Python Programming Resources
- Play with code from the W3Schools Python Tutorial
- Write code locally using Jupyter Interactive Python.
- Think Python, a textbook, by Allen B. Downey. Publisher Website
- The Python Programming Language
Articles for Learning
Tutorials
- Statmethods R programming Tutorial by Datacamp
- Data Science Dojo
- Machine Learning in R for beginners (Interactive coding!)
- Your First Machine Learning Project in R Step-By-Step
- Machine learning with the “diabetes” data set in R
- Intro to Machine Learning with R & caret
- Jupyter Notebooks Gallery
- ANOVA in R | A Complete Step-by-Step Guide with Examples
- The Paired t-Test
- An Introduction to t-Tests | Definitions, Formula and Examples
- Colors for Plotting in R
- Stat545: a reference for R and programming in Analytics
- Allegheny College Department of Computer Science tutorials
- Coding Train’s Git and GitHub for Poets
- Setting Up Git on Linux
- Setting up Git
- Getting Started with Python in VS Code
- 11 Best VS Code extensions for Python (2022)
- SSH keys
- Luman’s SSH keys video tutorial
Providers of Data for Projects
- awesome-public-datasets
- Finviz
- COVID-19 Forecasts
- An interactive visualization of the exponential spread of COVID-19
- Project Tycho
- Pelletier Library at Allegheny College (online services)
- World Health Organization
- The World Bank
- Noncommunicable Disease Surveillance, Monitoring and Reporting (NCDS)
- Demographic and Health Surveys
- Harvest Choice
- Food and Agriculture Organization
- World Population Prospects
- Centers for Disease Control and Prevention (CDC)
- US Food and Drug Administration Home Page
- The US Census
- Institute for Health Metrics and Evaluation
- IBM’s collection of open-source data sets
- Google’s open-source data sets
- Data.world: data for business-based questions
- Kaggle
- IPUMS
- Kaggle’s Star Trek Scripts (Could be a cool idea!)
- Project Gutenberg: Free eBooks
- … and many others that you can find through online searches.