MATH 597 Mathematical Foundations of Data Science (F24)
Unlock the power of data with mathematical precision.
Course Information
- Instructor: Shen-Ning Tung (tung@math.nthu.edu.tw)
- Lecture Time: Wednesdays, 3:30 PM - 6:00 PM
- Office Hours: After each lecture, I will stay in the classroom for questions until 7:00 PM (I may leave early if no one is present).
Evaluation
- Weekly Problem Sets (20%)
- Midterm Report (30%)
- Final Project (50%)
Communication
- Primary Platform: All course communication and announcements will be made on the course Discord channel.
- Questions and Discussions: Please use public Discord posts for questions about course material. This fosters collaborative learning and lets the instructor address queries efficiently.
- Private Matters: For private discussions or questions, send a direct message to the instructor on Discord.
- Submissions: All notes and reports should be submitted via HackMD notes.
Course Project
The course project is an opportunity to delve deeper into the mathematical foundations of data science. You can work individually or in a group. Choose one of the following project types:
- Theoretical Deep Dive: Select a topic with a strong theoretical basis in data science and provide a comprehensive exploration, elucidating its key concepts, principles, and mathematical underpinnings.
- Algorithm in Action: Choose a data science algorithm, implement it, and apply it to a real-world dataset. Showcase its practical utility and interpret the results.
- Theory Meets Practice: Bridge the gap between theory and application. Introduce a topic or algorithm, explain its theoretical foundations, and then demonstrate its relevance and effectiveness by implementing it and analyzing its performance on real-world data.
Important: Please discuss your project topic with the instructor and finalize it by the end of September.
References
- Textbook: “Foundations of Data Science” by Avrim Blum, John Hopcroft, and Ravindran Kannan
- Probability and Statistics: “All of Statistics: A Concise Course in Statistical Inference” by Larry Wasserman
- Linear Algebra: “Linear Algebra and Learning from Data” by Gilbert Strang
- Recommended Reading: “Mathematical Foundations of Data Sciences” (https://mathematical-tours.github.io/book/)
Schedule
Date | Lecture | Comments |
---|---|---|
9/3 | Introduction to Data Science | Note |
9/10 | High-Dimensional Space | Note |
9/17 | Best-Fit Subspaces and Singular Value Decomposition (SVD) | Note |
9/24 | Random Walks and Markov Chains | Note |
10/2 | Typhoon Day Off | No class |
10/9 | Signal Processing | Note |
10/16 | Machine Learning | Note |
10/23 | Signature | Note |
10/30 | Algorithms for Massive Data | Note |
11/6 | Clustering | Note |
11/13 | Reinforcement Learning | Note |
11/20 | Network | Note |
11/27 | Optimization | Ref.1, Ref.2, Ref.3 |
12/4 | Operations Research | Note |
12/11 | Decentralized Finance | Ref.1, Ref.2 |
Course Projects
- Apply Q-learning on McCall Search Model by Ya-Zhu Yang
- Reinforcement Learning for Snake by Felix Uhl
- Streamlining Time Series Analysis with sktime by Ren-Shu Yang
- Leveraging Neural Networks to Detect Phases and Phase Transitions in the Ising Model by Hao-Yang Yen
- PySR: A Modern Symbolic Regression Method by YuanLong Chan
- Graph Partitioning Techniques and Applications in Community Detection by Jun-Zhi Wang