BADM 576 - Data Science and Analytics (ML II)
Program-level details: See program/curriculum.md
Status: Draft. Initial outline pending instructor review. Proposed MSBAi name: Data Science & Machine Learning (pending formal rename approval)
| Credits: 4 | Term: Fall 2027 (Weeks 1-8) | Instructor: Zilong |
Course Vision
Building on supervised learning foundations from FIN 550 (ML I), students master advanced ML techniques and the full deployment lifecycle. This course covers advanced ensembles, unsupervised learning, NLP/text analytics, time series, neural networks, and model deployment with MLOps and LLMOps. By course end, students can build, deploy, and monitor production ML systems.
Learning Outcomes (L-C-E Framework)
Literacy (Foundational Awareness)
- L1: Understand advanced ML paradigms (unsupervised learning, deep learning, NLP) and when each applies
- L2: Explain the ML deployment lifecycle (training, serving, monitoring, retraining)
- L3: Recognize ethical issues in ML deployment (bias, fairness, transparency, model drift)
Competency (Applied Skills)
- C1: Apply advanced ensemble methods and regularization techniques to improve model performance
- C2: Implement advanced unsupervised learning (K-means, DBSCAN, hierarchical clustering, PCA, t-SNE) beyond introductory segmentation
- C3: Build NLP/text analytics pipelines (TF-IDF, embeddings, text classification)
- C4: Deploy and monitor ML models in production with MLOps practices
Expertise (Advanced Application)
- E1: Build end-to-end ML systems from raw data to production deployment with monitoring
- E2: Integrate LLMOps practices (agentic AI deployment, model evaluation, prompt management) alongside traditional MLOps
- E3: Evaluate models for fairness, bias, and ethical deployment with comprehensive model documentation
Week-by-Week Breakdown
| Week | Topic | Lectures | Project Work | Studio Session | Assessment |
|---|---|---|---|---|---|
| 1 | Advanced ensembles + regularization | 2 videos | Team formation + problem scoping | Regularization deep-dive - Ridge, Lasso, ElasticNet, ensemble tuning | Weekly assignment 1 |
| 2 | Unsupervised learning: clustering + dimensionality reduction | 3 videos | EDA + initial segmentation | Clustering workshop - K-means, DBSCAN, hierarchical, PCA, t-SNE | Weekly assignment 2 + Milestone M1 |
| 3 | NLP/text analytics | 2 videos | Text analytics for project | NLP with scikit-learn - TF-IDF, word embeddings, text classification | Weekly assignment 3 |
| 4 | Time series analysis | 3 videos | Forecasting component | Time series workshop - ARIMA, Prophet, evaluation metrics | Weekly assignment 4 + Milestone M2 |
| 5 | Neural networks intro | 2 videos | Neural net model for project | Neural networks - architectures, Keras/TensorFlow basics | Weekly assignment 5 |
| 6 | Deep learning applications | 2 videos | System architecture + API design | Deep learning - CNNs for tabular data, transfer learning | Weekly assignment 6 + Milestone M3 |
| 7 | Model deployment + MLOps + LLMOps | 2 videos | Deploy model + monitoring | ML in production - Docker, APIs, monitoring, agentic AI deployment | Final deliverable work |
| 8 | Ethics, synthesis, portfolio showcase | 1 video | Final deliverable + reflection | Ethics in ML - bias, fairness, model cards + final presentations | Final deliverable + team oral defense |
Team Project: Production ML System (Team of 3)
One major team project runs across all 8 weeks. Teams build a production-ready ML system that incorporates advanced analytics, forecasting, deep learning, and full deployment with MLOps/LLMOps practices.
Problem Options:
- Customer intelligence platform (segmentation + churn prediction + demand forecasting)
- Financial analytics system (market analysis + sentiment + price forecasting)
- Operations intelligence platform (supply chain optimization + anomaly detection + demand forecasting)
- Student choice (approved)
Weekly Assignments (Weeks 1-6, Individual)
Hands-on exercises that build technical skills feeding into the team project:
| Week | Assignment | Focus |
|---|---|---|
| 1 | Ensemble methods lab | Ridge, Lasso, ElasticNet comparison; advanced ensemble tuning |
| 2 | Clustering + dimensionality reduction | K-means, DBSCAN, hierarchical clustering; PCA, t-SNE visualization |
| 3 | Text analytics pipeline | TF-IDF, embeddings, text classification with scikit-learn |
| 4 | Time series forecasting | ARIMA, Prophet models; MAE, MAPE, RMSE evaluation |
| 5 | Neural network fundamentals | Architecture design, Keras/TensorFlow basics, training documentation |
| 6 | Deep learning application | CNNs for tabular data, transfer learning, comparison with traditional methods |
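As a preview of the Assignment 1 comparison, here is a minimal sketch of Ridge vs. Lasso vs. ElasticNet in scikit-learn. The synthetic dataset and the `alpha` values are illustrative only, not assignment parameters; the point is that the L1 penalty drives irrelevant coefficients exactly to zero while L2 only shrinks them:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso, ElasticNet
from sklearn.model_selection import cross_val_score

# Synthetic data: 20 features but only 5 informative, so L1-style
# penalties have irrelevant coefficients to shrink all the way to zero.
X, y = make_regression(n_samples=100, n_features=20, n_informative=5,
                       noise=10.0, random_state=42)

models = {
    "Ridge (L2)": Ridge(alpha=1.0),
    "Lasso (L1)": Lasso(alpha=1.0),
    "ElasticNet (L1+L2)": ElasticNet(alpha=1.0, l1_ratio=0.5),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    model.fit(X, y)
    n_zero = int(np.sum(np.isclose(model.coef_, 0.0)))
    print(f"{name}: mean CV R^2 = {scores.mean():.3f}, zeroed coefficients = {n_zero}")
```

Comparing the `zeroed coefficients` count across the three models is the core of the "regularization deep-dive" in the Week 1 studio: Lasso performs feature selection, Ridge does not, and ElasticNet interpolates between them via `l1_ratio`.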
Rubric per assignment (3 dimensions):
| Dimension | Excellent (A) | Proficient (B) | Developing (C) |
|---|---|---|---|
| Technical Execution | Correct implementation, well-tuned parameters | Functional code, reasonable choices | Incomplete or poorly tuned |
| Interpretation | Clear business insights from results | Adequate explanation | Minimal interpretation |
| Code Quality | Clean, commented, reproducible | Readable code | Disorganized or undocumented |
Project Milestones (Progressive, Team)
Milestones build progressively toward the final deployed system:
| Milestone | Due | Deliverable |
|---|---|---|
| M1: Problem Scoping + Data | End of Week 2 | Problem definition, dataset selection, EDA, initial clustering/segmentation analysis, team charter |
| M2: Model Development | End of Week 4 | Trained models (ensemble + time series + baseline neural net), evaluation metrics, model comparison |
| M3: System Architecture | End of Week 6 | System design document, API specification, deployment plan, LLMOps integration plan |
Rubric per milestone (3 dimensions):
| Dimension | Excellent (A) | Proficient (B) | Developing (C) |
|---|---|---|---|
| Progress | On track, all deliverables complete | Most deliverables complete | Behind schedule or incomplete |
| Technical Depth | Rigorous analysis, justified decisions | Sound approach | Superficial or unjustified |
| Team Collaboration | Clear task division, all members contributing | Adequate collaboration | Uneven contributions |
Final Project Deliverable (Week 7-8, Team)
Deliverables:
- Model Documentation:
  - Model card (what it does, performance, limitations, ethical considerations)
  - Data sheet (dataset provenance, bias analysis)
  - System design document (inputs, outputs, dependencies)
- Deployment:
  - Containerized model (Docker)
  - REST API (Flask/FastAPI)
  - Cloud deployment (AWS Lambda, Heroku, or similar)
- Monitoring + Operations:
  - Unit tests + integration tests
  - Performance monitoring (track model accuracy over time)
  - Data drift detection (alert if the input distribution changes)
  - Retraining strategy (how often, and on what triggers, to retrain)
- LLMOps Component:
  - Integration of agentic AI concepts (how LLM-based tools fit into the ML system)
  - Prompt management and versioning strategy (if applicable)
- Fairness + Ethics Analysis:
  - Evaluate the model for bias (across demographic groups if applicable)
  - Document limitations + intended use cases
  - Identify risks + mitigation strategies
- Peer evaluation of team contributions
- GitHub repo with all code + tests + Dockerfile + documentation
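One possible shape for the REST API deliverable, sketched here with Flask (either Flask or FastAPI is acceptable). The `predict_churn` scoring rule is a hypothetical stand-in for a trained model that a real system would load at startup (e.g. via `joblib.load`); the feature names and threshold are invented for illustration:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict_churn(features):
    # Hypothetical stand-in for a trained model; a real deployment
    # would call model.predict_proba on a validated feature vector.
    score = 0.3 * features["tenure_years"] + 0.7 * features["monthly_usage"]
    return {"churn_risk": "high" if score > 5 else "low", "score": round(score, 2)}

@app.route("/predict", methods=["POST"])
def predict():
    # Accept a JSON feature payload and return the model's scoring as JSON.
    payload = request.get_json()
    return jsonify(predict_churn(payload))

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)  # in production: gunicorn inside Docker
```

A sketch like this is also a natural seam for the testing deliverable: Flask's built-in `test_client()` lets the unit tests exercise the endpoint without starting a server or a container.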
Rubric (5 dimensions):
| Dimension | Excellent (A) | Proficient (B) | Developing (C) |
|---|---|---|---|
| Deployment | Production-ready, containerized, accessible | Works on cloud | Local only |
| MLOps/LLMOps | Comprehensive monitoring, drift detection, LLM integration | Basic tracking | No monitoring |
| Model Quality | Multiple well-tuned models with rigorous comparison | Functional models | Single or poorly tuned model |
| Fairness Analysis | Thoughtful bias evaluation + mitigation | Addresses fairness | Ignores fairness |
| Documentation | Model card + system design complete | Adequate docs | Minimal documentation |
Oral Defense (Week 8, Team)
Teams present their deployed system, demonstrate the live API, walk through the model card, and answer questions on design decisions, fairness analysis, and deployment trade-offs.
Rubric (3 dimensions):
| Dimension | Excellent (A) | Proficient (B) | Developing (C) |
|---|---|---|---|
| Technical Depth | Clear explanation of architecture, model choices, and trade-offs | Adequate explanation | Superficial or confused |
| Live Demo | Confident demo of deployed system, handles edge cases | System works but limited demo | Demo fails or only screenshots |
| Q&A | Handles questions confidently, demonstrates deep understanding | Answers most questions | Unable to answer or deflects |
AI Tools Integration
Weeks 1-3 (Weekly Assignments + Project M1):
- Use Claude/ChatGPT to:
  - Explain regularization trade-offs
  - Debug clustering and NLP pipeline issues
  - Suggest dimensionality reduction approaches
  - Generate feature engineering code
Weeks 4-6 (Weekly Assignments + Project M2-M3):
- Use AI to:
  - Explain ARIMA parameter selection
  - Debug neural network training issues
  - Suggest architecture choices
  - Generate evaluation code
Weeks 7-8 (Final Deliverable + Deployment):
- Use AI to:
  - Write Docker/API code
  - Generate monitoring and drift detection code
  - Create model cards and documentation
  - Review design for production readiness
Studio Session Topics:
- Week 1: Regularization + advanced ensembles
- Week 2: Clustering + dimensionality reduction visualization
- Week 3: NLP pipelines + text feature engineering
- Week 4: Time series decomposition + ARIMA
- Week 5: Neural network training + debugging
- Week 6: Deep learning applications + transfer learning
- Week 7: Model deployment + containerization + LLMOps
- Week 8: ML ethics + fairness + team presentations
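To illustrate the Week 3 studio pipeline (TF-IDF features feeding a text classifier), here is a minimal scikit-learn sketch. The six-document corpus and its positive/negative labels are invented for illustration; the studio uses a real labeled dataset:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Tiny hypothetical customer-feedback corpus: 1 = positive, 0 = negative.
texts = [
    "great product fast shipping",
    "love the quality excellent service",
    "terrible support very disappointed",
    "awful experience never again",
    "excellent value highly recommend",
    "poor quality waste of money",
]
labels = [1, 1, 0, 0, 1, 0]

# Pipeline keeps vectorization and classification as one fit/predict object,
# which is what makes the model straightforward to serialize and deploy later.
pipe = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("clf", LogisticRegression()),
])
pipe.fit(texts, labels)

print(pipe.predict(["excellent product great quality"]))
```

Wrapping the vectorizer and classifier in a single `Pipeline` is deliberate: it prevents train/serve skew, because the exact fitted TF-IDF vocabulary travels with the model into Week 7's deployment work.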
Assessment Summary
| Component | Weight | Notes |
|---|---|---|
| Weekly assignments | 30% | Weeks 1-6, individual |
| Project milestones | 25% | M1 (Wk 2), M2 (Wk 4), M3 (Wk 6), team |
| Final project deliverable | 20% | Weeks 7-8, team |
| Oral defense | 20% | Week 8, team |
| Studio participation | 5% | Weekly attendance + peer feedback |
No traditional exam. One major team project with weekly individual skill-building assignments.
AI Usage Levels (AIAS)
| Assessment | AIAS Level | AI Permitted |
|---|---|---|
| Weekly Assignments | 2 | AI for debugging, parameter guidance, code explanation — with attribution |
| Project Milestones | 2 | AI for EDA, model selection guidance, architecture suggestions — with attribution |
| Final Project Deliverable | 3 | AI as collaborator for Docker/API code, model cards, monitoring scripts — with full disclosure |
| Oral Defense | 0 | No AI |
| Studio Participation | 1 | AI for exploration during exercises |
Technology Stack
- ML Libraries: scikit-learn, XGBoost, LightGBM, Keras/TensorFlow
- Data: pandas, NumPy, feature-engine
- Text: scikit-learn (TF-IDF), Gensim, spaCy (optional)
- Time Series: statsmodels, Prophet
- Deep Learning: Keras, TensorFlow
- Deployment: Docker, Flask/FastAPI, AWS Lambda
- Monitoring: Evidently (for ML monitoring), custom scripts
- Testing: pytest, unittest
- IDE: VS Code with GitHub Copilot; Google Colab (browser alternative)
- Notebooks: Jupyter Notebooks (via Colab or VS Code)
- Version Control: GitHub
Prerequisites
- Completion of FIN 550 (supervised ML foundations) + BADM 558 (cloud infrastructure)
- Comfortable with Python programming + SQL
Bridge Module: ML Refresher (Pre-Course, ~3 hours)
Complete before Week 1. Available in Canvas at the start of Fall 2027. There is an approximately 8-month gap between ML I (FIN 550, Fall 2026) and ML II (this course). This module helps students rebuild fluency before diving into advanced topics.
| Unit | Topics | Format | Self-Check |
|---|---|---|---|
| 1. Supervised Learning Review (1 hr) | Train/test splits, cross-validation, overfitting/underfitting, bias-variance tradeoff | Narrated Jupyter notebook walkthrough | Quiz: identify overfitting in a learning curve, explain train/test split |
| 2. Model Evaluation Refresher (1 hr) | Accuracy, precision, recall, F1, ROC/AUC, confusion matrix, regression metrics (MAE, RMSE, R²) | Interactive Jupyter exercises with pre-built models | Quiz: interpret a confusion matrix, choose the right metric for a scenario |
| 3. Core Algorithms Quick Review (1 hr) | Linear/logistic regression, decision trees, random forest, gradient boosting — when to use each | Cheat sheet + short exercises comparing model outputs | Quiz: given a problem description, recommend an algorithm and justify |
Readiness check: Students who pass all 3 self-check quizzes (70% threshold) are ready for Week 1. This module is strongly recommended for all students, not just those who feel rusty.
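As a taste of Units 1-2 in miniature, the following sketch runs the hold-out workflow end to end on scikit-learn's built-in breast cancer dataset (chosen here purely for illustration; the actual bridge notebooks may use different data). Scaling is included because logistic regression converges poorly on unscaled features:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hold out a test split the model never sees during training, then read
# the four confusion-matrix cells and F1 off the held-out predictions.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
print(f"TP={tp} FP={fp} FN={fn} TN={tn}  F1={f1_score(y_test, y_pred):.3f}")
```

Students who can explain why the split is stratified, what each confusion-matrix cell means, and when F1 beats plain accuracy have covered the core of Units 1 and 2.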
| Course Sequence: ← BADM 557 — Business Intelligence | Next: Capstone → |