BADM 576 - Data Science & Machine Learning (ML II)
Program-level details: See program/CURRICULUM.md
Credits: 4 | Term: Fall 2027 (Weeks 1-8)
Course Vision
Building on supervised learning foundations from FIN 550 (ML I), students master advanced ML techniques and the full deployment lifecycle. This course covers advanced ensembles, unsupervised learning, NLP/text analytics, time series, neural networks, and model deployment with MLOps and LLMOps. By course end, students can build, deploy, and monitor production ML systems.
Learning Outcomes (L-C-E Framework)
Literacy (Foundational Awareness)
- L1: Understand advanced ML paradigms (unsupervised learning, deep learning, NLP) and when each applies
- L2: Explain the ML deployment lifecycle (training, serving, monitoring, retraining)
- L3: Recognize ethical issues in ML deployment (bias, fairness, transparency, model drift)
Competency (Applied Skills)
- C1: Apply advanced ensemble methods and regularization techniques to improve model performance
- C2: Implement unsupervised learning (clustering, dimensionality reduction) for business segmentation
- C3: Build NLP/text analytics pipelines (TF-IDF, embeddings, text classification)
- C4: Deploy and monitor ML models in production with MLOps practices
Expertise (Advanced Application)
- E1: Build end-to-end ML systems from raw data to production deployment with monitoring
- E2: Integrate LLMOps practices (agentic AI deployment, model evaluation, prompt management) alongside traditional MLOps
- E3: Evaluate models for fairness, bias, and ethical deployment with comprehensive model documentation
Week-by-Week Breakdown
| Week | Topic | Lectures | Project Work | Studio Session | Assessment |
|---|---|---|---|---|---|
| 1 | Advanced ensembles + regularization | 2 videos | Project 1A: Advanced model building | Regularization deep-dive - Ridge, Lasso, ElasticNet, ensemble tuning | Model comparison |
| 2 | Unsupervised learning: clustering + dimensionality reduction | 3 videos | Project 1 work: Clustering analysis | Clustering workshop - K-means, DBSCAN, hierarchical, PCA, t-SNE | Cluster evaluation |
| 3 | NLP/text analytics | 2 videos | Project 1 work: Text analysis | NLP with scikit-learn and gensim - TF-IDF, word embeddings, text classification | Text pipeline |
| 4 | Time series analysis | 3 videos | Project 2A: Forecasting setup | Time series workshop - ARIMA, Prophet, evaluation metrics | Code review |
| 5 | Neural networks intro | 2 videos | Project 2 work: Neural network model | Neural networks - architectures, keras/tensorflow basics | Model training |
| 6 | Deep learning applications | 2 videos | Project 2 work: Advanced modeling | Deep learning - CNNs for tabular data, transfer learning | Model evaluation |
| 7 | Model deployment + MLOps + LLMOps | 2 videos | Project 3A: Deploy model | ML in production - Docker, APIs, monitoring, agentic AI deployment | Deployment demo |
| 8 | Ethics, synthesis, portfolio showcase | 1 video | Project 3 complete + reflection | Ethics in ML - bias, fairness, model cards + final presentations | Final presentations + team oral defense |
Projects (3 per course)
Project 1: Advanced Analytics (Weeks 1-3, Individual, 25% of grade)
Problem Statement: Apply advanced ML techniques to a business problem. Combine ensemble methods, clustering/dimensionality reduction, and text analytics to deliver comprehensive analysis.
Problem Options:
- Customer segmentation + churn prediction (e-commerce data)
- Document classification + topic modeling (business documents)
- Market segmentation + sentiment analysis (social media + financial data)
- Student choice (approved)
Deliverables:
- Advanced ensemble models, plus a comparison of regularized linear models (Ridge, Lasso, ElasticNet)
- Clustering analysis:
  - K-means, DBSCAN, or hierarchical clustering
  - Dimensionality reduction visualization (PCA, t-SNE)
  - Cluster profiles with business interpretation
- Text analytics component (TF-IDF, embeddings, classification)
- Feature importance analysis across all methods
- 3-page write-up explaining approach + business insights
- GitHub repo with all code
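To make the text analytics deliverable concrete, here is a minimal sketch of the textbook TF-IDF weighting in plain Python. It is illustrative only: the function name `tfidf` and the sample documents are hypothetical, and scikit-learn's `TfidfVectorizer` applies IDF smoothing and L2 normalization by default, so its numbers will differ from these.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using textbook tf-idf.

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical support-ticket snippets, used only for illustration.
docs = [
    "refund request for damaged order",
    "order shipped on time",
    "refund issued for late order",
]
scores = tfidf(docs)
print(sorted(scores[0].items(), key=lambda kv: kv[1], reverse=True))
```

A term that appears in every document ("order" here) gets a weight of zero, while a term unique to one document ("damaged") ranks highest. That is the behavior a text classifier relies on when TF-IDF features are fed into it.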
Rubric (5 dimensions):
| Dimension | Excellent (A) | Proficient (B) | Developing (C) |
|---|---|---|---|
| Ensemble Methods | Advanced techniques, well-justified regularization | Good model choices | Limited exploration |
| Unsupervised Analysis | Meaningful clusters with business insights | Clusters identified | Limited interpretation |
| Text Analytics | Effective NLP pipeline with clear results | Basic text analysis | Minimal text work |
| Integration | Methods combined into coherent analysis | Separate but adequate | Disjointed |
| Documentation | Clear explanation + code comments | Adequate explanation | Minimal docs |
Project 2: Time Series + Deep Learning (Weeks 4-6, Individual, 35% of grade)
Problem Statement: Build forecasting models and explore deep learning for a business application. Compare traditional time series methods with neural network approaches.
Problem Options:
- Financial forecasting (stock prices, revenue, demand)
- Operations forecasting (supply chain, energy consumption)
- Healthcare prediction (patient volumes, outcomes)
- Student choice (approved)
Deliverables:
- Time series analysis:
  - ARIMA and/or Prophet models
  - Evaluation metrics (MAE, MAPE, RMSE)
  - Forecast visualization with confidence intervals
- Neural network model:
  - Architecture design and justification
  - Training process documentation
  - Comparison with traditional methods
- 3-page write-up explaining models + business implications
- GitHub repo with code + model artifacts
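The three evaluation metrics named in the deliverables can be computed by hand before reaching for a library. The sketch below assumes plain Python lists of actual and forecast values. The function name `forecast_errors` and the sample numbers are hypothetical.

```python
import math


def forecast_errors(actual, predicted):
    """Return MAE, RMSE, and MAPE (in percent) for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    # MAPE divides by the actual values, so it is undefined when any is zero.
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is 0")
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"mae": mae, "rmse": rmse, "mape": mape}


# Hypothetical weekly demand figures, used only for illustration.
actual = [100, 200, 400]
predicted = [110, 190, 380]
metrics = forecast_errors(actual, predicted)
print(metrics)
```

RMSE penalizes the single 20-unit miss more heavily than MAE does, and MAPE rescales each error by the size of the actual value. Reporting all three makes it easier to compare an ARIMA or Prophet forecast with a neural network on equal terms.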
Rubric (5 dimensions):
| Dimension | Excellent (A) | Proficient (B) | Developing (C) |
|---|---|---|---|
| Time Series | Strong ARIMA/Prophet models, justified parameters | Functional forecasts | Basic or poorly tuned |
| Neural Networks | Well-designed architecture, justified choices | Functional model | Minimal or incorrect |
| Model Comparison | Thoughtful comparison across approaches | Basic comparison | Missing comparison |
| Evaluation | Rigorous metrics, confidence intervals | Standard metrics | Weak evaluation |
| Business Context | Explains implications for decision-makers | Mentions business | Only technical focus |
Project 3: Full ML System + Deployment (Weeks 7-8, Team of 3-4, 30% of grade)
Problem Statement: Build a production-ready ML system as a team. Deploy your best model, implement monitoring, and document with MLOps/LLMOps best practices.
Deliverables:
- Model Documentation:
  - Model card (what it does, performance, limitations, ethical considerations)
  - Data sheet (dataset provenance, bias analysis)
  - System design document (inputs, outputs, dependencies)
- Deployment:
  - Containerized model (Docker)
  - REST API (Flask/FastAPI)
  - Cloud deployment (AWS Lambda, Heroku, or similar)
- Monitoring + Operations:
  - Unit tests + integration tests
  - Performance monitoring (track model accuracy over time)
  - Data drift detection (alert if input distribution changes)
  - Retraining strategy (how often to retrain?)
- LLMOps Component:
  - Integration of agentic AI concepts (how LLM-based tools fit in the ML system)
  - Prompt management and versioning strategy (if applicable)
- Fairness + Ethics Analysis:
  - Evaluate model for bias (across demographic groups if applicable)
  - Document limitations + intended use cases
  - Identify risks + mitigation strategies
- Team oral defense: present deployed system + model card (20% of Project 3 grade)
- Peer evaluation of team contributions
- GitHub repo with all code + tests + Dockerfile + documentation
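One way to prototype the data drift deliverable is a Population Stability Index (PSI) check on a single numeric feature. This sketch is a plain-Python illustration and not the API of the `evidently` library. The function name, the equal-width binning, and the `eps` floor are all assumptions made for the example.

```python
import math


def population_stability_index(reference, current, n_bins=5, eps=1e-6):
    """PSI between two numeric samples, using equal-width bins from the reference."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if the reference is constant

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            # Clamp so values outside the reference range land in the edge bins.
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        # eps keeps empty bins from producing log(0) or a division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref_shares = bin_shares(reference)
    cur_shares = bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares))


# Hypothetical feature values: training-time data vs. two production batches.
reference = [i / 10 for i in range(100)]
psi_same = population_stability_index(reference, list(reference))
psi_shifted = population_stability_index(reference, [x + 4 for x in reference])
print(psi_same, psi_shifted)
```

An unchanged distribution scores zero and the shifted batch scores well above it. A commonly cited rule of thumb treats PSI above roughly 0.25 as a significant shift, but the alert threshold is a design decision each team should justify in its monitoring plan.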
Rubric (5 dimensions):
| Dimension | Excellent (A) | Proficient (B) | Developing (C) |
|---|---|---|---|
| Deployment | Production-ready, containerized, accessible | Works on cloud | Local only |
| MLOps/LLMOps | Comprehensive monitoring, drift detection, LLM integration | Basic tracking | No monitoring |
| Oral Defense | Clear explanation, confident demo, handles Q&A well | Adequate presentation | Unclear or unprepared |
| Fairness Analysis | Thoughtful bias evaluation + mitigation | Addresses fairness | Ignores fairness |
| Documentation | Model card + system design complete | Adequate docs | Minimal documentation |
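For the fairness analysis, one simple starting point is to compare positive prediction rates across groups, a quantity often called the demographic parity gap. The sketch below is one possible check among many, not a complete bias audit. The function names, group labels, and predictions are hypothetical.

```python
from collections import defaultdict


def positive_rate_by_group(predictions, groups):
    """Share of positive (1) predictions within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_gap(predictions, groups):
    """Difference between the highest and lowest group positive rates."""
    rates = positive_rate_by_group(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical binary predictions for two groups, used only for illustration.
predictions = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
rates = positive_rate_by_group(predictions, groups)
gap = demographic_parity_gap(predictions, groups)
print(rates, gap)
```

A gap of zero means every group receives positive predictions at the same rate. Whether a nonzero gap is acceptable depends on the use case, which is why the model card should record both the measured value and the reasoning behind any mitigation.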
AI Tools Integration
Weeks 1-3 (Advanced Analytics):
- Use Claude/ChatGPT to:
  - Explain regularization trade-offs
  - Debug clustering and NLP pipeline issues
  - Suggest dimensionality reduction approaches
  - Generate feature engineering code
Weeks 4-6 (Time Series + Deep Learning):
- Use AI to:
  - Explain ARIMA parameter selection
  - Debug neural network training issues
  - Suggest architecture choices
  - Generate evaluation code
Weeks 7-8 (Deployment + MLOps):
- Use AI to:
  - Write Docker/API code
  - Generate monitoring and drift detection code
  - Create model cards and documentation
  - Review design for production readiness
Studio Session Topics:
- Week 1: Regularization + advanced ensembles
- Week 2: Clustering + dimensionality reduction visualization
- Week 3: NLP pipelines + text feature engineering
- Week 4: Time series decomposition + ARIMA
- Week 5: Neural network training + debugging
- Week 6: Deep learning applications + transfer learning
- Week 7: Model deployment + containerization + LLMOps
- Week 8: ML ethics + fairness + team presentations
Assessment Summary
| Component | Weight | Notes |
|---|---|---|
| Project 1 (Advanced Analytics) | 25% | Weeks 1-3, individual |
| Project 2 (Time Series + Deep Learning) | 35% | Weeks 4-6, individual |
| Project 3 (ML System + Deployment) | 30% | Weeks 7-8, team (includes oral defense) |
| Studio participation | 10% | Weekly attendance + peer feedback |
There is no traditional exam. Assessment is project-based with a production focus.
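The weights above combine as a simple weighted average. The sketch below works through the arithmetic, assuming every component is scored on a 0-100 scale and that the non-oral part of Project 3 is graded as a single score. Both are assumptions for illustration, as are the function names and sample scores.

```python
# Component weights taken from the assessment table.
WEIGHTS = {"project_1": 0.25, "project_2": 0.35, "project_3": 0.30, "studio": 0.10}


def project_3_score(oral_defense, other_deliverables, oral_share=0.20):
    """Blend the oral defense (20% of Project 3) with the remaining deliverables."""
    return oral_share * oral_defense + (1 - oral_share) * other_deliverables


def course_grade(scores):
    """Weighted average of component scores on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, used only for illustration.
scores = {
    "project_1": 90,
    "project_2": 80,
    "project_3": project_3_score(oral_defense=100, other_deliverables=85),
    "studio": 100,
}
final = course_grade(scores)
# The oral defense is 20% of a 30% component, so 6% of the course grade.
oral_defense_course_share = WEIGHTS["project_3"] * 0.20
print(final, oral_defense_course_share)
```

With these sample scores, Project 3 comes to 88 and the course grade to 86.9.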
Technology Stack
- ML Libraries: scikit-learn, XGBoost, LightGBM, keras/tensorflow
- Data: pandas, numpy, feature-engine
- Text: scikit-learn (TF-IDF), gensim, spacy (optional)
- Time Series: statsmodels, prophet
- Deep Learning: keras, tensorflow
- Deployment: Docker, Flask/FastAPI, AWS Lambda
- Monitoring: evidently (for ML monitoring), custom scripts
- Testing: pytest, unittest
- Notebook: Jupyter
Prerequisites
- Completion of FIN 550 (supervised ML foundations) + BADM 558 (cloud infrastructure)
- Comfort with Python programming and SQL
Last Updated: February 2026