FIN 550 / MSBAi550 - Predictive Analytics for Business (ML I)
Program-level details: See program/CURRICULUM.md
| Credits: 4 | Term: Fall 2 (Weeks 9-16) |
Course Vision
Students learn supervised machine learning fundamentals applied to business problems. This is ML I in the MSBAi sequence — focused exclusively on regression, classification, feature engineering, and business case communication. Students build strong foundations in model evaluation and selection before advancing to unsupervised learning, NLP, and deployment in BADM 576 (ML II).
Prerequisites
- Python programming (from BADM 554 or equivalent)
- Statistics foundation required: Students must complete the following Coursera pre-program courses (or equivalent) before starting FIN 550:
  - Inferential Statistics (Duke University or equivalent)
  - Basic Statistics (University of Amsterdam or equivalent)
- These are the same statistics prerequisites used by the on-campus MSBA program
Learning Outcomes (L-C-E Framework)
Literacy (Foundational Awareness)
- L1: Understand supervised learning and explain when regression vs. classification applies
- L2: Recognize overfitting and describe train/test/validation splits
- L3: Explain evaluation metrics (MSE/RMSE, accuracy, precision, recall, F1, AUC) and why they matter for different problems
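To make L3 concrete, here is a minimal sketch (not course-provided code) that computes the listed metrics with scikit-learn on small, purely illustrative arrays:

```python
import numpy as np
from sklearn.metrics import (mean_squared_error, accuracy_score, precision_score,
                             recall_score, f1_score, roc_auc_score)

# Illustrative regression example: true vs. predicted values
y_true_reg = np.array([3.0, 5.0, 2.5, 7.0])
y_pred_reg = np.array([2.8, 5.4, 2.0, 6.5])
mse = mean_squared_error(y_true_reg, y_pred_reg)
rmse = np.sqrt(mse)  # RMSE is in the same units as the target

# Illustrative binary classification example: labels and predicted scores
y_true_clf = np.array([0, 1, 1, 0, 1])
y_score = np.array([0.2, 0.8, 0.6, 0.4, 0.3])   # model's predicted probabilities
y_pred_clf = (y_score >= 0.5).astype(int)        # hard labels via a 0.5 threshold

print(f"MSE={mse:.3f}  RMSE={rmse:.3f}")
print("accuracy :", accuracy_score(y_true_clf, y_pred_clf))
print("precision:", precision_score(y_true_clf, y_pred_clf))
print("recall   :", recall_score(y_true_clf, y_pred_clf))
print("F1       :", f1_score(y_true_clf, y_pred_clf))
print("AUC      :", roc_auc_score(y_true_clf, y_score))  # AUC uses scores, not hard labels
```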
Competency (Applied Skills)
- C1: Build regression models (linear, polynomial) using scikit-learn with proper evaluation
- C2: Build classification models (logistic regression, decision trees) with rigorous evaluation
- C3: Engineer features from raw business data and select informative features
- C4: Communicate model performance and business implications to non-technical stakeholders
Expertise (Advanced Application)
- E1: Compare multiple supervised model types and choose the best based on business criteria
- E2: Apply ensemble methods (random forest, gradient boosting) to improve model performance
- E3: Write a compelling business case connecting model results to actionable recommendations
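As an illustration of E2, a minimal sketch comparing two ensemble methods against a logistic-regression baseline on synthetic data; the dataset and the AUC scoring choice are placeholders, not course requirements:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for real business data
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

models = {
    "logistic (baseline)": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=42),
    "gradient boosting": GradientBoostingClassifier(random_state=42),
}

# 5-fold cross-validated ROC AUC for each model; in the projects the "best"
# model is chosen on business criteria, not raw AUC alone.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name:22s} AUC = {scores.mean():.3f} +/- {scores.std():.3f}")
```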
Week-by-Week Breakdown
| Week | Topic | Lectures | Project Work | Studio Session | Assessment |
|---|---|---|---|---|---|
| 9 | Intro to supervised ML + regression setup | 3 videos | Project 1A: Regression problem setup | ML fundamentals - supervised learning, sklearn workflow | Quiz (L1-L2) |
| 10 | Linear & polynomial regression + evaluation | 3 videos | Project 1 work: Build baseline model | Regression workshop - sklearn patterns, MSE/RMSE, R² | Code review |
| 11 | Logistic regression + classification | 3 videos | Project 1 work: Classification model | Classification deep-dive - logistic, decision boundaries, confusion matrix | Project 1 due |
| 12 | Decision trees + model selection | 3 videos | Project 2A: Feature engineering | Tree models workshop - interpretability, overfitting | DataCamp assignment |
| 13 | Feature engineering + cross-validation | 2 videos | Project 2 work: Complex model building | Feature engineering tactics - domain knowledge + data-driven | Mid-course checkpoint |
| 14 | Model selection + hyperparameter tuning | 2 videos | Project 2 work: Model comparison | Cross-validation deep-dive - grid search, learning curves | Model evaluation |
| 15 | Ensemble methods intro (random forest, gradient boosting) | 2 videos | Project 3A: Business case draft | Ensembles workshop - bagging vs. boosting, when to use each | Ensemble assignment |
| 16 | Business case writing + synthesis | 1 video | Project 3 complete | Final presentations - team business cases | Projects 2 & 3 due |
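For reference, a minimal sketch of the Weeks 9-10 scikit-learn workflow from the table above: split the data, fit linear and polynomial regression, and evaluate RMSE and R² on the held-out split. The synthetic data is a stand-in for a project dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data with a quadratic signal plus noise
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + rng.normal(scale=0.5, size=300)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "linear": LinearRegression(),
    "poly deg 2": make_pipeline(PolynomialFeatures(degree=2), LinearRegression()),
}

for name, model in models.items():
    model.fit(X_train, y_train)        # fit on the training split only
    pred = model.predict(X_test)       # evaluate on unseen data
    rmse = np.sqrt(mean_squared_error(y_test, pred))
    print(f"{name:10s} RMSE={rmse:.3f}  R^2={r2_score(y_test, pred):.3f}")
```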
Projects (3 per course)
Project 1: Supervised Learning Foundations (Weeks 9-11, Individual, 25% of grade)
Problem Statement: Predict a business outcome using supervised learning. Choose regression (continuous target) or classification (binary/multiclass). Real financial data preferred.
Options:
- Regression: Predict next-day stock prices, house prices, or customer lifetime value
- Classification: Predict loan default (yes/no), customer churn (yes/no), or credit card fraud (yes/no)
Datasets Available:
- US housing prices (Kaggle)
- Stock prices (Yahoo Finance API)
- Loan defaults (LendingClub)
- Credit card fraud (Kaggle)
- Student choice (approved)
Deliverables:
- Jupyter notebook with exploratory analysis
- Clean, documented code (functions, classes)
- Train/test split + cross-validation
- 2-3 model comparisons, including at least one baseline (see the sketch after this list)
- Evaluation metrics (MSE/RMSE for regression, F1/AUC for classification)
- 2-page write-up explaining model choice + business implications
- GitHub repo with all code
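A minimal sketch of the model-comparison deliverable for a classification choice (e.g. churn); the synthetic, imbalanced dataset and the F1 scoring are illustrative stand-ins for your own data and metric choice:

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Imbalanced synthetic data, roughly like churn or default labels
X, y = make_classification(n_samples=2000, n_features=15, weights=[0.8, 0.2],
                           random_state=7)

models = {
    "majority-class baseline": DummyClassifier(strategy="most_frequent"),
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(max_depth=5, random_state=7),
}

# 5-fold cross-validated F1; the naive baseline scores near zero F1, which is
# exactly why it belongs in the comparison. Report AUC as well in the write-up.
for name, model in models.items():
    f1 = cross_val_score(model, X, y, cv=5, scoring="f1")
    print(f"{name:24s} F1 = {f1.mean():.3f} +/- {f1.std():.3f}")
```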
Rubric (5 dimensions):
| Dimension | Excellent (A) | Proficient (B) | Developing (C) |
|---|---|---|---|
| Problem Understanding | Clear definition, appropriate metric chosen | Understands core problem | Vague problem statement |
| Data Handling | Thoughtful train/test/validation split, cross-validation | Train/test present | Data leakage or poor split |
| Model Building | 2-3 diverse models with justification | 2 models, basic comparison | Single model or poor comparison |
| Evaluation | Rigorous evaluation, explains metrics | Computes metrics, interpretation okay | Weak evaluation |
| Writeup | Clear explanation, connects to business | Adequate explanation | Minimal writeup |
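The "data leakage" criterion above most commonly appears as preprocessing fit on the full dataset before splitting. A minimal sketch of the leak-free pattern, assuming scaling is the only preprocessing step:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=1000, n_features=10, random_state=1)

# Because the scaler lives inside the Pipeline, cross_val_score re-fits it on
# each training fold and only applies it to the matching validation fold, so
# no information from held-out data leaks into preprocessing.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(f"leak-free 5-fold AUC: {scores.mean():.3f}")
```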
Project 2: Feature Engineering + Model Selection (Weeks 12-14, Individual, 35% of grade)
Problem Statement: Improve your Project 1 model using feature engineering, model selection, and hyperparameter tuning. Focus on systematic comparison and understanding what drives performance.
Deliverables:
- Feature engineering notebook (create 5-10 new features with justification)
- Model comparison (decision trees, random forest, gradient boosting vs. baseline)
- Hyperparameter tuning results (grid search or random search; see the sketch after this list)
- Feature importance analysis (which features matter most?)
- Evaluation metrics showing improvement over Project 1
- Learning curves (showing where you have bias vs. variance)
- 3-page write-up explaining improvements
- GitHub repo with final code + model artifacts
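A minimal sketch of the grid-search and feature-importance deliverables, using a random forest on synthetic data with placeholder feature names; the grid shown is deliberately small:

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=1500, n_features=12, n_informative=5,
                           random_state=3)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # placeholder names

# Small illustrative grid; expand it for the real project
param_grid = {
    "n_estimators": [100, 300],
    "max_depth": [None, 5, 10],
    "min_samples_leaf": [1, 5],
}

search = GridSearchCV(RandomForestClassifier(random_state=3), param_grid,
                      cv=5, scoring="roc_auc", n_jobs=-1)
search.fit(X, y)

print("best params:", search.best_params_)
print(f"best CV AUC: {search.best_score_:.3f}")

# Which features matter most, according to the tuned forest?
importances = pd.Series(search.best_estimator_.feature_importances_,
                        index=feature_names).sort_values(ascending=False)
print(importances.head(5))
```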
Rubric (5 dimensions):
| Dimension | Excellent (A) | Proficient (B) | Developing (C) |
|---|---|---|---|
| Feature Engineering | Creative features with domain insight | Standard feature creation | Minimal feature work |
| Model Selection | Systematic comparison, well-justified choices | Good model choices | Limited exploration |
| Hyperparameter Tuning | Systematic grid/random search with analysis | Some tuning attempted | Minimal tuning |
| Results | Significant improvement with clear wins | Modest improvement | Little/no improvement |
| Analysis | Explains why improvements worked | Describes metrics | Minimal analysis |
Project 3: Business Case + Presentation (Weeks 15-16, Team of 3-4, 30% of grade)
Problem Statement: Write a compelling business case for stakeholders based on your best model. Present findings and recommendations to a mock executive audience.
Deliverables:
- Business case document (5 pages):
  - Executive summary
  - Model performance + baseline comparison
  - Recommended business action
  - Limitations + risks
  - ROI calculation, if applicable (see the worked sketch after this list)
- Team oral defense: present business case + model results (20% of Project 3 grade)
- Peer evaluation of team contributions
- GitHub repo with all code + documentation
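For the ROI calculation, a worked sketch with deliberately hypothetical figures (flagged-customer count, retention cost, revenue per retained customer); every number should be replaced with values defended in the business case document:

```python
# All figures below are hypothetical, for illustration only.
n_flagged = 500            # customers the model flags as likely churners
precision = 0.60           # fraction of flagged customers who would actually churn
save_rate = 0.30           # fraction of true churners retained by the intervention
value_per_customer = 1200  # revenue preserved per retained customer, in dollars
cost_per_contact = 50      # cost of the retention offer per flagged customer

benefit = n_flagged * precision * save_rate * value_per_customer
cost = n_flagged * cost_per_contact
roi = (benefit - cost) / cost

print(f"benefit = ${benefit:,.0f}, cost = ${cost:,.0f}, ROI = {roi:.1%}")
# benefit = $108,000, cost = $25,000, ROI = 332.0%
```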
Rubric (5 dimensions):
| Dimension | Excellent (A) | Proficient (B) | Developing (C) |
|---|---|---|---|
| Business Case | Compelling, data-driven, actionable | Addresses most points | Incomplete or unclear |
| Model Understanding | Clearly explains model choices and trade-offs | Adequate technical explanation | Surface-level understanding |
| Oral Defense | Confident presentation, handles Q&A well | Adequate delivery | Unclear or unprepared |
| ROI Calculation | Thoughtful cost-benefit analysis | Estimates provided | Missing or speculative |
| Team Collaboration | Clear evidence of shared work and coordination | Adequate collaboration | Uneven contribution |
AI Tools Integration
Weeks 9-11 (Supervised Learning):
- Use Claude/ChatGPT to:
  - Explain model selection for your problem
  - Debug scikit-learn errors
  - Suggest evaluation metrics
  - Interpret model results
Weeks 12-14 (Feature Engineering + Model Selection):
- Use AI to:
  - Generate feature engineering ideas
  - Explain hyperparameter tuning strategies
  - Debug model issues
  - Suggest regularization approaches
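For the "regularization approaches" item, a minimal sketch comparing plain linear regression with ridge and lasso at a few illustrative strengths; the synthetic data and alpha values are placeholders:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

# Many features, few of them informative: a setting where regularization helps
X, y = make_regression(n_samples=300, n_features=50, n_informative=10,
                       noise=10.0, random_state=5)

models = {
    "plain linear": LinearRegression(),
    "ridge (alpha=1.0)": Ridge(alpha=1.0),
    "ridge (alpha=10)": Ridge(alpha=10.0),
    "lasso (alpha=1.0)": Lasso(alpha=1.0),
}

for name, model in models.items():
    # The "neg_" scorer is negated so that higher is better; flip the sign back
    rmse = -cross_val_score(model, X, y, cv=5, scoring="neg_root_mean_squared_error")
    print(f"{name:18s} CV RMSE = {rmse.mean():.2f}")
```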
Weeks 15-16 (Business Case):
- Use AI to:
  - Draft business case structure
  - Review ROI calculations
  - Refine presentation narrative
  - Practice Q&A scenarios
Studio Session Topics:
- Week 9: Supervised learning overview + sklearn workflow
- Week 10: Train/test/validation splits + regression evaluation
- Week 11: Classification metrics deep-dive (precision, recall, F1, ROC)
- Week 12: Decision trees + model interpretability
- Week 13: Feature engineering tactics + domain knowledge
- Week 14: Cross-validation + hyperparameter tuning strategies
- Week 15: Ensemble methods + gradient boosting
- Week 16: Team business case presentations + peer feedback
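The Week 14 studio's learning-curve material (also a Project 2 deliverable) can be sketched as follows; the deep decision tree is chosen only to make a variance gap visible on synthetic data:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=11)

# Train vs. validation accuracy as the training set grows
sizes, train_scores, val_scores = learning_curve(
    DecisionTreeClassifier(max_depth=None, random_state=11),  # deep tree: expect variance
    X, y, cv=5, scoring="accuracy",
    train_sizes=[0.1, 0.25, 0.5, 0.75, 1.0],
)

plt.plot(sizes, train_scores.mean(axis=1), marker="o", label="train")
plt.plot(sizes, val_scores.mean(axis=1), marker="o", label="validation")
plt.xlabel("training examples")
plt.ylabel("accuracy")
plt.legend()
plt.title("Large train/validation gap = variance; both curves low = bias")
plt.show()
```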
Assessment Summary
| Component | Weight | Notes |
|---|---|---|
| Project 1 (Supervised Learning) | 25% | Weeks 9-11, individual |
| Project 2 (Feature Engineering) | 35% | Weeks 12-14, individual |
| Project 3 (Business Case) | 30% | Weeks 15-16, team (includes oral defense) |
| Studio participation + DataCamp | 10% | Spread throughout course |
No traditional exam; assessment is project-based with progressive complexity.
Technology Stack
- ML Libraries: scikit-learn, XGBoost, LightGBM
- Data: pandas, numpy
- Visualization: matplotlib, seaborn, Plotly
- Notebooks: Jupyter, Google Colab
- APIs: yfinance, Kaggle API
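A minimal sketch of pulling data through the yfinance API for a stock-price regression; the ticker, date range, and next-day target construction are illustrative, and newer yfinance versions may return multi-level columns, which the sketch flattens:

```python
import pandas as pd
import yfinance as yf

# Download one year of daily AAPL prices (requires network access)
prices = yf.download("AAPL", start="2023-01-01", end="2024-01-01", progress=False)

# Recent yfinance versions may return (field, ticker) MultiIndex columns even
# for a single ticker; flatten to plain field names if so.
if isinstance(prices.columns, pd.MultiIndex):
    prices.columns = prices.columns.get_level_values(0)

# Build a simple next-day regression target from today's close
prices["close_tomorrow"] = prices["Close"].shift(-1)
prices = prices.dropna()
print(prices[["Close", "Volume", "close_tomorrow"]].head())
```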
Last Updated: February 2026