FIN 550 / MSBAi550 - Predictive Analytics for Business (ML I)
Program-level details: See program/CURRICULUM.md
| Credits: 4 | Term: Fall 2 (Weeks 9-16) |
Course Vision
Students learn supervised machine learning fundamentals applied to business problems. This is ML I in the MSBAi sequence — focused exclusively on regression, classification, feature engineering, and business case communication. Students build strong foundations in model evaluation and selection before advancing to unsupervised learning, NLP, and deployment in BADM 576 (ML II).
Prerequisites
- Python programming (from BADM 554 or equivalent)
- Statistics foundation required: Students must complete the following Coursera pre-program courses (or equivalent) before starting FIN 550:
  - Inferential Statistics (Duke University or equivalent)
  - Basic Statistics (University of Amsterdam or equivalent)
- These are the same statistics prerequisites used by the on-campus MSBA program
Learning Outcomes (L-C-E Framework)
Literacy (Foundational Awareness)
- L1: Understand supervised learning and explain when regression vs. classification applies
- L2: Recognize overfitting and describe train/test/validation splits
- L3: Explain evaluation metrics (MSE/RMSE, accuracy, precision, recall, F1, AUC) and why they matter for different problems
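To make L3 concrete, here is a minimal sketch (not course-provided code) that computes the listed metrics with scikit-learn on small, purely illustrative arrays:

```python
import numpy as np
from sklearn.metrics import (mean_squared_error, accuracy_score, precision_score,
                             recall_score, f1_score, roc_auc_score)

# Illustrative regression example: true vs. predicted values
y_true_reg = np.array([3.0, 5.0, 2.5, 7.0])
y_pred_reg = np.array([2.8, 5.4, 2.0, 6.5])
mse = mean_squared_error(y_true_reg, y_pred_reg)
rmse = np.sqrt(mse)  # RMSE is in the same units as the target

# Illustrative binary classification example: labels and predicted scores
y_true_clf = np.array([0, 1, 1, 0, 1])
y_score = np.array([0.2, 0.8, 0.6, 0.4, 0.3])   # model's predicted probabilities
y_pred_clf = (y_score >= 0.5).astype(int)        # hard labels via a 0.5 threshold

print(f"MSE={mse:.3f}  RMSE={rmse:.3f}")
print("accuracy :", accuracy_score(y_true_clf, y_pred_clf))
print("precision:", precision_score(y_true_clf, y_pred_clf))
print("recall   :", recall_score(y_true_clf, y_pred_clf))
print("F1       :", f1_score(y_true_clf, y_pred_clf))
print("AUC      :", roc_auc_score(y_true_clf, y_score))  # AUC uses scores, not hard labels
```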
Competency (Applied Skills)
- C1: Build regression models (linear, polynomial) using scikit-learn with proper evaluation
- C2: Build classification models (logistic regression, decision trees) with rigorous evaluation
- C3: Engineer features from raw business data and select informative features
- C4: Communicate model performance and business implications to non-technical stakeholders
Expertise (Advanced Application)
- E1: Compare multiple supervised model types and choose the best based on business criteria
- E2: Apply ensemble methods (random forest, gradient boosting) to improve model performance
- E3: Write a compelling business case connecting model results to actionable recommendations
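As an illustration of E2, a minimal sketch comparing two ensemble methods against a logistic-regression baseline on synthetic data; the dataset and the AUC scoring choice are placeholders, not course requirements:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for real business data
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

models = {
    "logistic (baseline)": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=42),
    "gradient boosting": GradientBoostingClassifier(random_state=42),
}

# 5-fold cross-validated ROC AUC for each model; in the projects the "best"
# model is chosen on business criteria, not raw AUC alone.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name:22s} AUC = {scores.mean():.3f} +/- {scores.std():.3f}")
```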
Week-by-Week Breakdown
| Week | Topic | Lectures | Project Work | Studio Session | Assessment |
|---|---|---|---|---|---|
| 9 | Intro to supervised ML + regression setup | 3 videos | Project 1A: Regression problem setup | ML fundamentals - supervised learning, sklearn workflow | Quiz (L1-L2) |
| 10 | Linear & polynomial regression + evaluation | 3 videos | Project 1 work: Build baseline model | Regression workshop - sklearn patterns, MSE/RMSE, R² | Code review |
| 11 | Logistic regression + classification | 3 videos | Project 1 work: Classification model | Classification deep-dive - logistic, decision boundaries, confusion matrix | Project 1 due |
| 12 | Decision trees + model selection | 3 videos | Project 2A: Feature engineering | Tree models workshop - interpretability, overfitting | DataCamp assignment |
| 13 | Feature engineering + cross-validation | 2 videos | Project 2 work: Complex model building | Feature engineering tactics - domain knowledge + data-driven | Mid-course checkpoint |
| 14 | Model selection + hyperparameter tuning | 2 videos | Project 2 work: Model comparison | Cross-validation deep-dive - grid search, learning curves | Model evaluation |
| 15 | Ensemble methods intro (random forest, gradient boosting) | 2 videos | Project 3A: Business case draft | Ensembles workshop - bagging vs. boosting, when to use each | Ensemble assignment |
| 16 | Business case writing + synthesis | 1 video | Project 3 complete | Final presentations - team business cases | Projects 2 & 3 due |
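For reference, a minimal sketch of the Weeks 9-10 scikit-learn workflow from the table above: split the data, fit linear and polynomial regression, and evaluate RMSE and R² on the held-out split. The synthetic data is a stand-in for a project dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data with a quadratic signal plus noise
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + rng.normal(scale=0.5, size=300)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "linear": LinearRegression(),
    "poly deg 2": make_pipeline(PolynomialFeatures(degree=2), LinearRegression()),
}

for name, model in models.items():
    model.fit(X_train, y_train)        # fit on the training split only
    pred = model.predict(X_test)       # evaluate on unseen data
    rmse = np.sqrt(mean_squared_error(y_test, pred))
    print(f"{name:10s} RMSE={rmse:.3f}  R^2={r2_score(y_test, pred):.3f}")
```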
Projects (3 per course)
Project 1: Supervised Learning Foundations (Weeks 9-11, Individual, 25% of grade)
Problem Statement: Predict a business outcome using supervised learning. Choose regression (continuous target) or classification (binary/multiclass). Real financial data preferred.
Options:
- Regression: Predict next-day stock prices, house prices, or customer lifetime value
- Classification: Predict loan default (yes/no), customer churn (yes/no), or credit card fraud (yes/no)
Datasets Available:
- US housing prices (Kaggle)
- Stock prices (Yahoo Finance API)
- Loan defaults (LendingClub)
- Credit card fraud (Kaggle)
- Student choice (approved)
Deliverables:
- Jupyter notebook with exploratory analysis
- Clean, documented code (functions, classes)
- Train/test split + cross-validation
- 2-3 model comparisons, including at least one baseline (see the sketch after this list)
- Evaluation metrics (MSE/RMSE for regression, F1/AUC for classification)
- 2-page write-up explaining model choice + business implications
- GitHub repo with all code
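A minimal sketch of the model-comparison deliverable for a classification choice (e.g. churn); the synthetic, imbalanced dataset and the F1 scoring are illustrative stand-ins for your own data and metric choice:

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Imbalanced synthetic data, roughly like churn or default labels
X, y = make_classification(n_samples=2000, n_features=15, weights=[0.8, 0.2],
                           random_state=7)

models = {
    "majority-class baseline": DummyClassifier(strategy="most_frequent"),
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(max_depth=5, random_state=7),
}

# 5-fold cross-validated F1; the naive baseline scores near zero F1, which is
# exactly why it belongs in the comparison. Report AUC as well in the write-up.
for name, model in models.items():
    f1 = cross_val_score(model, X, y, cv=5, scoring="f1")
    print(f"{name:24s} F1 = {f1.mean():.3f} +/- {f1.std():.3f}")
```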
Rubric (5 dimensions):
| Dimension | Excellent (A) | Proficient (B) | Developing (C) |
|---|---|---|---|
| Problem Understanding | Clear definition, appropriate metric chosen | Understands core problem | Vague problem statement |
| Data Handling | Thoughtful train/test/validation split, cross-validation | Train/test present | Data leakage or poor split |
| Model Building | 2-3 diverse models with justification | 2 models, basic comparison | Single model or poor comparison |
| Evaluation | Rigorous evaluation, explains metrics | Computes metrics, interpretation okay | Weak evaluation |
| Writeup | Clear explanation, connects to business | Adequate explanation | Minimal writeup |
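The "data leakage" criterion above most commonly appears as preprocessing fit on the full dataset before splitting. A minimal sketch of the leak-free pattern, assuming scaling is the only preprocessing step:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=1000, n_features=10, random_state=1)

# Because the scaler lives inside the Pipeline, cross_val_score re-fits it on
# each training fold and only applies it to the matching validation fold, so
# no information from held-out data leaks into preprocessing.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(f"leak-free 5-fold AUC: {scores.mean():.3f}")
```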
Project 2: Feature Engineering + Model Selection (Weeks 12-14, Individual, 35% of grade)
Problem Statement: Improve your Project 1 model using feature engineering, model selection, and hyperparameter tuning. Focus on systematic comparison and understanding what drives performance.
Deliverables:
- Feature engineering notebook (create 5-10 new features with justification)
- Model comparison (decision trees, random forest, gradient boosting vs. baseline)
- Hyperparameter tuning results (grid search or random search; see the sketch after this list)
- Feature importance analysis (which features matter most?)
- Evaluation metrics showing improvement over Project 1
- Learning curves (showing where you have bias vs. variance)
- 3-page write-up explaining improvements
- GitHub repo with final code + model artifacts
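A minimal sketch of the grid-search and feature-importance deliverables, using a random forest on synthetic data with placeholder feature names; the grid shown is deliberately small:

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=1500, n_features=12, n_informative=5,
                           random_state=3)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # placeholder names

# Small illustrative grid; expand it for the real project
param_grid = {
    "n_estimators": [100, 300],
    "max_depth": [None, 5, 10],
    "min_samples_leaf": [1, 5],
}

search = GridSearchCV(RandomForestClassifier(random_state=3), param_grid,
                      cv=5, scoring="roc_auc", n_jobs=-1)
search.fit(X, y)

print("best params:", search.best_params_)
print(f"best CV AUC: {search.best_score_:.3f}")

# Which features matter most, according to the tuned forest?
importances = pd.Series(search.best_estimator_.feature_importances_,
                        index=feature_names).sort_values(ascending=False)
print(importances.head(5))
```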
Rubric (5 dimensions):
| Dimension | Excellent (A) | Proficient (B) | Developing (C) |
|---|---|---|---|
| Feature Engineering | Creative features with domain insight | Standard feature creation | Minimal feature work |
| Model Selection | Systematic comparison, well-justified choices | Good model choices | Limited exploration |
| Hyperparameter Tuning | Systematic grid/random search with analysis | Some tuning attempted | Minimal tuning |
| Results | Significant improvement with clear wins | Modest improvement | Little/no improvement |
| Analysis | Explains why improvements worked | Describes metrics | Minimal analysis |
Project 3: Business Case + Presentation (Weeks 15-16, Team of 3-4, 30% of grade)
Problem Statement: Write a compelling business case for stakeholders based on your best model. Present findings and recommendations to a mock executive audience.
Deliverables:
- Business case document (5 pages):
  - Executive summary
  - Model performance + baseline comparison
  - Recommended business action
  - Limitations + risks
  - ROI calculation, if applicable (see the worked sketch after this list)
- Team oral defense: present business case + model results (20% of Project 3 grade)
- Peer evaluation of team contributions
- GitHub repo with all code + documentation
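For the ROI calculation, a worked sketch with deliberately hypothetical figures (flagged-customer count, retention cost, revenue per retained customer); every number should be replaced with values defended in the business case document:

```python
# All figures below are hypothetical, for illustration only.
n_flagged = 500            # customers the model flags as likely churners
precision = 0.60           # fraction of flagged customers who would actually churn
save_rate = 0.30           # fraction of true churners retained by the intervention
value_per_customer = 1200  # revenue preserved per retained customer, in dollars
cost_per_contact = 50      # cost of the retention offer per flagged customer

benefit = n_flagged * precision * save_rate * value_per_customer
cost = n_flagged * cost_per_contact
roi = (benefit - cost) / cost

print(f"benefit = ${benefit:,.0f}, cost = ${cost:,.0f}, ROI = {roi:.1%}")
# benefit = $108,000, cost = $25,000, ROI = 332.0%
```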
Rubric (5 dimensions):
| Dimension | Excellent (A) | Proficient (B) | Developing (C) |
|---|---|---|---|
| Business Case | Compelling, data-driven, actionable | Addresses most points | Incomplete or unclear |
| Model Understanding | Clearly explains model choices and trade-offs | Adequate technical explanation | Surface-level understanding |
| Oral Defense | Confident presentation, handles Q&A well | Adequate delivery | Unclear or unprepared |
| ROI Calculation | Thoughtful cost-benefit analysis | Estimates provided | Missing or speculative |
| Team Collaboration | Clear evidence of shared work and coordination | Adequate collaboration | Uneven contribution |
AI Tools Integration
Weeks 9-11 (Supervised Learning):
- Use Claude/ChatGPT to:
  - Explain model selection for your problem
  - Debug scikit-learn errors
  - Suggest evaluation metrics
  - Interpret model results
Weeks 12-14 (Feature Engineering + Model Selection):
- Use AI to:
  - Generate feature engineering ideas
  - Explain hyperparameter tuning strategies
  - Debug model issues
  - Suggest regularization approaches
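For the "regularization approaches" item, a minimal sketch comparing plain linear regression with ridge and lasso at a few illustrative strengths; the synthetic data and alpha values are placeholders:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

# Many features, few of them informative: a setting where regularization helps
X, y = make_regression(n_samples=300, n_features=50, n_informative=10,
                       noise=10.0, random_state=5)

models = {
    "plain linear": LinearRegression(),
    "ridge (alpha=1.0)": Ridge(alpha=1.0),
    "ridge (alpha=10)": Ridge(alpha=10.0),
    "lasso (alpha=1.0)": Lasso(alpha=1.0),
}

for name, model in models.items():
    # The "neg_" scorer is negated so that higher is better; flip the sign back
    rmse = -cross_val_score(model, X, y, cv=5, scoring="neg_root_mean_squared_error")
    print(f"{name:18s} CV RMSE = {rmse.mean():.2f}")
```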
Weeks 15-16 (Business Case):
- Use AI to:
  - Draft business case structure
  - Review ROI calculations
  - Refine presentation narrative
  - Practice Q&A scenarios
Studio Session Topics:
- Week 9: Supervised learning overview + sklearn workflow
- Week 10: Train/test/validation splits + regression evaluation
- Week 11: Classification metrics deep-dive (precision, recall, F1, ROC)
- Week 12: Decision trees + model interpretability
- Week 13: Feature engineering tactics + domain knowledge
- Week 14: Cross-validation + hyperparameter tuning strategies
- Week 15: Ensemble methods + gradient boosting
- Week 16: Team business case presentations + peer feedback
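The Week 14 studio's learning-curve material (also a Project 2 deliverable) can be sketched as follows; the deep decision tree is chosen only to make a variance gap visible on synthetic data:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=11)

# Train vs. validation accuracy as the training set grows
sizes, train_scores, val_scores = learning_curve(
    DecisionTreeClassifier(max_depth=None, random_state=11),  # deep tree: expect variance
    X, y, cv=5, scoring="accuracy",
    train_sizes=[0.1, 0.25, 0.5, 0.75, 1.0],
)

plt.plot(sizes, train_scores.mean(axis=1), marker="o", label="train")
plt.plot(sizes, val_scores.mean(axis=1), marker="o", label="validation")
plt.xlabel("training examples")
plt.ylabel("accuracy")
plt.legend()
plt.title("Large train/validation gap = variance; both curves low = bias")
plt.show()
```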
Assessment Summary
| Component | Weight | Notes |
|---|---|---|
| Project 1 (Supervised Learning) | 25% | Weeks 9-11, individual |
| Project 2 (Feature Engineering) | 35% | Weeks 12-14, individual |
| Project 3 (Business Case) | 30% | Weeks 15-16, team (includes oral defense) |
| Studio participation + DataCamp | 10% | Spread throughout course |
No traditional exam; assessment is project-based with progressive complexity.
Technology Stack
- ML Libraries: scikit-learn, XGBoost, LightGBM
- Data: pandas, numpy
- Visualization: matplotlib, seaborn, Plotly
- Notebooks: Jupyter, Google Colab
- APIs: yfinance, Kaggle API
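A minimal sketch of pulling data through the yfinance API for a stock-price regression; the ticker, date range, and next-day target construction are illustrative, and newer yfinance versions may return multi-level columns, which the sketch flattens:

```python
import pandas as pd
import yfinance as yf

# Download one year of daily AAPL prices (requires network access)
prices = yf.download("AAPL", start="2023-01-01", end="2024-01-01", progress=False)

# Recent yfinance versions may return (field, ticker) MultiIndex columns even
# for a single ticker; flatten to plain field names if so.
if isinstance(prices.columns, pd.MultiIndex):
    prices.columns = prices.columns.get_level_values(0)

# Build a simple next-day regression target from today's close
prices["close_tomorrow"] = prices["Close"].shift(-1)
prices = prices.dropna()
print(prices[["Close", "Volume", "close_tomorrow"]].head())
```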
Last Updated: February 2026