Data Science
CMPSC-301-00: Data Science
Academic Bulletin Description
An introduction to computational methods of data analysis with an emphasis on understanding and reflecting on the social, cultural, and political issues surrounding data and its interrogation. Participating in hands-on activities that often require teamwork, students study, design, and implement analytics software and learn how to extract knowledge from, for instance, financial, political, and scientific sources of data. Students also investigate the biases, discriminatory views, and stereotypes that may be present during the collection and analysis of data, reflecting on the ethical implications of using the resulting computational techniques. During a weekly laboratory session, students use state-of-the-art statistical software to complete projects, reporting on their findings through both written documents and oral presentations. Students are invited to use their own departmentally approved laptop in this course; a limited number of laptops are available for use during class and lab sessions. Prerequisite: FS102 or FS200, or permission of the instructor. Distribution Requirements: QR, PD.
The Lab
In order to acquire the proper skills in technical writing, critical reading, and the presentation and evaluation of technical material, it is essential for students to have hands-on experience in a laboratory. Therefore, it is mandatory for all students to attend the laboratory sessions. If you will not be able to attend a laboratory, then please see one of the course instructors at least one week in advance in order to explain your situation. Students who miss more than two unexcused laboratories will have their final grade in the course reduced by one letter grade. Students who miss more than four unexcused laboratories will automatically fail the course.
Discord
If you are already on the department’s Discord server, then you will be given access to the course’s Discord channel, called #data-science. If not, then you will need to join the department’s Discord server before you can be added to the course’s channel.
Meeting Times
- Lecture: M/W/F, 9:00 AM - 9:50 AM, 27 August 2024 - 12 December 2024, Alden Hall, 101
- Lab: W, 2:30 PM - 4:20 PM, 27 August 2024 - 12 December 2024, Alden Hall, 101
Office Hours
Syllabus and classDocs
- See README.md at classDocs/
Suggested Textbooks
- Wickham, Hadley, and Garrett Grolemund. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media, Inc., 2016.
- Silge, Julia, and David Robinson. Text Mining With R: A Tidy Approach. O’Reilly Media, Inc., 2019.
- Think Python, first edition, by Allen B. Downey.
- Circular Visualization in R, by Zuguang Gu.
Other Useful Textbooks:
- BUGS in Writing: A Guide to Debugging Your Prose (Second Edition). Lyn Dupré. Addison-Wesley Professional. ISBN-10: 020137921X and ISBN-13: 978-0201379211, 704 pages, 1998. References to the textbook are abbreviated as “BIW”.
- Writing for Computer Science (Second Edition). Justin Zobel. Springer. ISBN-10: 1852338024 and ISBN-13: 978-1852338022, 270 pages, 2004. References to the textbook are abbreviated as “WFCS”.