CSCI 3345: Spring 2025

Overview

CSCI 3345 is an introduction to machine learning, focusing on computational systems that can adaptively improve their performance through experience and data. This fundamental capability has transformed how we solve complex problems across numerous domains. From powering the recommendation systems we use daily on streaming platforms, to enabling virtual assistants that understand natural language, to advancing medical diagnosis tools and autonomous vehicles, machine learning has become an integral part of modern technology. The course provides a comprehensive foundation in machine learning, combining theoretical principles with practical applications. Students will explore core concepts and various machine learning approaches, understanding both the mathematical foundations and their real-world implementations. Through hands-on programming assignments, students will experiment with different learning algorithms, gaining practical experience in developing and evaluating machine learning solutions. A key component of the course is a project that allows students to dive deeper into a specific area of machine learning that aligns with their interests.

Learning Objectives:
After completing the course, students should be able to:
- Select and apply appropriate supervised learning algorithms for classification and regression problems (e.g., linear regression, logistic regression, ridge regression, nonparametric kernel regression, neural networks, naive Bayes, support vector machines)
- Recognize different types of unsupervised learning problems, and select and apply appropriate algorithms (e.g., k-means clustering, Gaussian mixture models, linear and nonlinear dimensionality reduction)
- Work with probability (Bayes rule, conditioning, expectations, independence), linear algebra (vector and matrix operations, eigenvectors, SVD), and calculus (gradients, Jacobians) to derive machine learning methods
- Understand machine learning principles such as model selection, overfitting, and underfitting, and techniques such as cross-validation and regularization
- Implement machine learning algorithms such as logistic regression via stochastic gradient descent, linear regression, or k-means clustering
- Run appropriate supervised and unsupervised learning algorithms on real and synthetic data sets and interpret the results
Prerequisites:
- Programming: You should be familiar with algorithms and data structures. Familiarity with Python or similar tools for numeric programming will be helpful but is not strictly required. Resource: Python (Basics).
- Probability: You should have been exposed to probability distributions, random variables, expectations, etc. Linear Algebra (Essence, Chap 1-4), Multivariate Calculus (Essence, Chap 1, 3-4, 8-9).
Lecture:
Lectures will be Tuesday and Thursday at 245 Beacon St. Room 229, from 1:30pm to 2:45pm.
Textbooks and Materials:
There is no required textbook for the course. However, the following books (available for free online) can be useful as references on relevant topics:
- Deep Learning (DL), Goodfellow, Ian and Bengio, Yoshua and Courville, Aaron, MIT Press, 2016, ISBN: 9780262035613
- Learning from Data: a Short Course, Y. S. Abu-Mostafa, M. Magdon-Ismail, H.-T. Lin.
- Mathematics for Machine Learning (MML), Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong.
- Pattern Recognition and Machine Learning (PRML), Christopher C. Bishop, Springer, 2006, ISBN: 9780387310732
- Machine Learning: A Probabilistic Perspective, Kevin Murphy.
- Dive into Deep Learning (D2L), Zhang et al.
- Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, Aurélien Géron.
You may also find the tutorial "Deep Learning with PyTorch: A 60 Minute Blitz" helpful.
Grading Policy:
- No exams.
- 30%: Homework Assignments
- 60%: Final Project (Proposal, Milestone Report, Presentation, and Final Code/Report)
- 10%: Attendance (including asking questions and quizzes)
The coding homework assignments will be in Python, and will use PyTorch on Google Colab.
Instead of a final exam, at the end of the semester you will complete a project working in groups of at most 3 students.
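As a rough illustration of how the weights above combine, here is a minimal Python sketch. The function name `course_score`, the component keys, and the 0-100 scale for each component are assumptions made for this example; they are not an official grading formula.

```python
# Weights taken from the grading policy above; everything else is hypothetical.
WEIGHTS = {"homework": 0.30, "project": 0.60, "attendance": 0.10}


def course_score(scores):
    """Return the weighted average of component scores (each assumed 0-100)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Example with made-up component scores.
example = course_score({"homework": 90, "project": 85, "attendance": 100})
print(round(example, 2))
```

Because the final project carries 60% of the weight, a change in the project score moves the total twice as much as the same change in the homework score.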

Staff

Yuan Yuan

Instructor

Daven Pelaez

Teaching Assistant

Yunhan Liu

Teaching Assistant

Tentative Schedule (subject to changes)

Date	Topic	Materials	Assignments
Tue, Jan. 14	1. Introduction	Slides	Read DL 1-1.1, 2.1-2.3
Thu, Jan. 16	2. ML Problem Formulation	Slides Math_Background.pdf Notation_Guide.pdf
Tue, Jan. 21	3. Linear Regression and Regularization	Slides Notes Notes 2	Read lecture notes and Section 1 and 1.2 of Stanford CS229 Notes
Thu, Jan. 23	4. Gradient Descent	Slides Notes	Read lecture notes and Section 1.1 of Stanford CS229 Notes
Tue, Jan. 28	5. Logistic Regression	Slides Demos: Linear/Logistic: Notebook, Desmos (3D) Convex functions Desmos Multiclass logistic Desmos Coding: Python Tutorial Colab	Warmup exercise released Warmup exercise Assignment 1 released lab1 (Due Tue, Feb. 4)
Thu, Jan. 30	6. Features	Slides	Pre-reading: FeatEng and LogReg.pdf
Tue, Feb. 4	6. Features (cont.) Nonlinear features, Underfit, Overfit, Regularization, Generalization, Hyperparameters Validation, Hand-crafting features	See previous lecture slides	Read DL 5.2-5.3
Thu, Feb. 6	7. MLE and Probabilistic Formulation of Machine Learning	Slides Pre-reading: MLE.pdf	Read MLE.pdf and MML 9-9.2.2
Tue, Feb. 11	8. Maximum Likelihood Estimation (MLE) and Maximum a Posteriori (MAP)	Slides Pre-reading: MAP.pdf Desmos: Bernoulli Likelihood and Posterior	Read MAP.pdf and MML 9.2.3-4
Thu, Feb. 13	8. Maximum a Posteriori (MAP)	See previous lecture slides
Tue, Feb. 18	9. Regularization Convex function, Ridge regression, Lasso regression	Slides	Read DL 7.1, 7.8 Assignment 2 released (Due Tue, Feb. 25)
Thu, Feb. 20	10. Neural Networks	Slides
Tue, Feb. 25	11. Multi-layer Perceptron	Slides Slides (2)
Thu, Feb. 27	12. Evaluation Metrics, Imbalance Learning Confusion Matrix, Accuracy, Precision, Recall / Sensitivity, Specificity, F-score, AU-ROC, AU-PRC, Log-loss	Slides	Read Targeted Supervised Contrastive Learning for Long-Tailed Recognition Read Artificial intelligence-enabled detection and assessment of Parkinson’s disease using nocturnal breathing signals
Tue, Mar. 4	No class (Spring Break)
Thu, Mar. 6	No class (Spring Break)
Tue, Mar. 11	13. Deep Learning: CNNs	Slides Pre-reading: CNNs.pdf	Read DL 9
Thu, Mar. 13	14. Neural Net Applications - Images	Slides
Tue, Mar. 18	14. PyTorch	Slides
Thu, Mar. 20	15. Neural Net Applications - Language	Slides	The Illustrated Word2vec. Jay Alammar Video (and code): Let's build GPT. Andrej Karpathy
Tue, Mar. 25	15. Deep Learning: Attention & Transformers	See previous lecture slides	The Illustrated {Attention → Transformer → GPT-2}. Jay Alammar Video (and code): Let's build GPT. Andrej Karpathy
Thu, Mar. 27	16. K-Means and Gaussian Mixture Models Clustering, EM Algorithm	Slides Slides 2	Notes
Tue, Apr. 1	17. PCA, Autoencoders, Feature Learning	Slides
Thu, Apr. 3	18. Reinforcement Learning: MDPs & Value Functions	Slides	Notes
Tue, Apr. 8	18. Reinforcement Learning: MDPs & Value Functions	See previous lecture slides
Thu, Apr. 10	19. Reinforcement Learning: Q-learning & Deep RL	Slides	Notes
Tue, Apr. 15	19. Reinforcement Learning: Q-learning & Deep RL	See previous lecture slides	Deep Reinforcement Learning: Pong from Pixels Colab
Thu, Apr. 17	No class (Easter Break)
Tue, Apr. 22	No class (Substitute Monday Schedule)
Thu, Apr. 24	20. Nonparametric Models Decision Tree and Nearest Neighbors	Slides	Notes
Tue, Apr. 29	Final Project Presentation (1)
Thu, May 1	Final Project Presentation (2)

Office Hours

Name	Office hours
Yuan	Tue/Fri 3-4PM @ 245 Beacon Rm. 528E
Daven	Wed 3-4PM, Thu 5:30-6:30PM @ CS Lab
Yunhan	Mon/Tue 10-11AM @ CS Lab

Office hours will take place in person (or Zoom if needed).

Course Information

This is a challenging course and we are here to help you become a more-AI version of yourself. Please feel free to reach out if you need help in any form.

1. Get help (besides office hours)

Dropbox: The lecture pdfs will be uploaded to Dropbox (follow the link) and you can ask questions there by making comments on the slides directly.
Discord: For labs/psets/final projects, we will create dedicated channels for you to ask public questions. If you cannot make your post public (e.g., due to revealing problem set solutions), please directly DM TAs or the instructor separately, or come to office hours. Please note, however, that the course staff cannot provide help debugging code, and there is no guarantee that they'll be able to answer last-minute homework questions before the deadline. We also appreciate it when you respond to questions from other students! If you have an important question that you would prefer to discuss over email, you may email the course staff, or you can contact the instructor by email directly.
Support: The university counseling services center provides a variety of programs and activities.
Accommodations for students with disabilities: If you are a student with a documented disability seeking reasonable accommodations in this course, please contact Kathy Duggan, (617) 552-8093, dugganka@bc.edu, at the Connors Family Learning Center regarding learning disabilities and ADHD, or Rory Stein, (617) 552-3470, steinr@bc.edu, in the Disability Services Office regarding all other types of disabilities, including temporary disabilities. Advance notice and appropriate documentation are required for accommodations.

2. Homework submission

All programming assignments are in Python on Colab, always due at midnight (11:59 pm) on the due date.

Install Colab on the browser: Sign in to your Google account, follow the "Link" (to be updated) to the folder of assignments, click on lab0.ipynb, click on "Open with" and "Connect more apps", install "Colaboratory".
Submission: You need to save a copy of the file in your own Google Drive so that you can save your edits. Afterwards, you can download the ipynb file and submit it to Canvas.
Final project: In lieu of a final exam, we'll have a final project. This project will be completed in small groups during the last weeks of the class. The direction for this project is open-ended: you can either choose from a list of project ideas that we distribute, or you can propose a topic of your own. A short project proposal will be due approximately halfway through the course. During the final exam period, you'll turn in a final report and give a short presentation. You may use ongoing research work for your final project, as long as it meets the requirements.

3. Academic policy

Late days: You'll have 1 late day for every lab and pset respectively over the course of the semester. Each time you use one, you may submit a homework assignment one day late without penalty. You do not need to notify us when you use a late day; we'll deduct it automatically. If you run out of late days and still submit late, your assignment will be penalized at a rate of 2% per day. If you edit your assignment after the deadline, this will count as a late submission, and we'll use the revision time as the date of submission (after a short grace period of a few minutes). We will not provide additional late time, except under exceptional circumstances, and for these we'll require documentation (e.g., a doctor's note). Please note that the late days are provided to help you deal with minor setbacks, such as routine illness or injury, paper deadlines, interviews, and computer problems; these do not generally qualify for an additional extension.
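To make the late-day arithmetic concrete, here is a minimal Python sketch under one possible reading of the policy. It assumes that partial days round up to a full day, that the free late day is used before any penalty applies, and that the 2% per day is deducted from the raw score. The function name and parameters are hypothetical, and the short grace period is ignored; the course staff's actual calculation may differ.

```python
import math

PENALTY_PER_DAY = 0.02  # 2% per penalized day, from the policy above


def late_adjusted_score(raw_score, hours_late, free_late_days=1):
    """Apply the late penalty to raw_score (assumed reading of the policy)."""
    # Partial days count as full days (assumption).
    days_late = math.ceil(max(0.0, hours_late) / 24)
    # Free late days are consumed first; only the remainder is penalized.
    penalized_days = max(0, days_late - free_late_days)
    factor = max(0.0, 1.0 - PENALTY_PER_DAY * penalized_days)
    return raw_score * factor


# Made-up example: 50 hours late counts as 3 days, 1 of them free.
print(late_adjusted_score(100, hours_late=50))
```

Under these assumptions, a submission 20 hours late keeps its full score, while one 50 hours late is penalized for two days.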
Academic integrity: While you are encouraged to discuss homework assignments with other students, your programming work must be completed individually. You may not search for solutions online or use existing implementations of the algorithms in the assignments. Thus it is acceptable to learn from another student the general idea for writing program code to perform a particular task, or the technique for solving a mathematical problem, but unacceptable for two students to prepare their assignments together and submit what are essentially two copies of identical work. If you have any uncertainty about the application of this policy, please check with me. Failure to comply with these guidelines will be considered a violation of the University policies on academic integrity. Please make sure that you are familiar with these policies. We will use the moss.pl tool to check each lab and pset for plagiarism.
AI assistants policy:
- Our policy for using ChatGPT and other AI assistants is identical to our policy for using human assistants.
- This is a deep learning class and you should try out all the latest AI assistants (they are pretty much all using deep learning). It's very important to play with them to learn what they can do and what they can't do. That's a part of the content of this course.
- Just like you can come to office hours and ask a human questions (about the lecture material, clarifications about pset questions, tips for getting started, etc), you are very welcome to do the same with AI assistants.
- But: just like you are not allowed to ask an expert friend to do your homework for you, you also should not ask an expert AI.
- If it is ever unclear, just imagine the AI as a human and apply the same norm as you would with a human.

4. Related Classes / Online Resources

Acknowledgements: This course draws heavily from MIT's 6.869: Advances in Computer Vision by Antonio Torralba, William Freeman, and Phillip Isola, and from Stanford's CS231n: Deep Learning for Computer Vision by Fei-Fei Li. It also includes lecture slides from other researchers, including Andrew Owens, Svetlana Lazebnik, Alexei Efros, Fei-Fei Li, Carl Vondrick, David Fouhey, Justin Johnson, Noah Snavely, and Ava Amini. Special thanks to Hao Wang for the insightful and generous advice.

CSCI 3345: Machine Learning

Instructor: Yuan Yuan Spring 2025 (TuTh 1:30-2:45 PM) 245 Beacon Street Room 229
