Last updated: April 05, 2026


BADM 554 - Enterprise Database Management

Program-level details: See program/curriculum.md

Status: In Development. Instructor actively building content; recording starts April 15, 2026. Structure stable.

Proposed MSBAi name: Data Foundations — pending formal rename approval

Credits: 4
Term: Fall 2026 (Weeks 1-8)
Instructor: Vishal

Course Vision

Students master SQL, relational data modeling, and Python data wrangling to become proficient at designing, querying, and maintaining data systems. By course end, students build a complete ETL pipeline using Python, design a normalized database schema, and understand cloud data infrastructure basics.

Learning Outcomes (L-C-E Framework)

Literacy (Foundational Awareness)

Competency (Applied Skills)

Expertise (Advanced Application)

Week-by-Week Breakdown

| Week | Topic | Lectures | Weekly Assignment | Project Milestone | Studio Session |
|---|---|---|---|---|---|
| 1 | Relational databases + SQL SELECT fundamentals | 3 videos (30 min each) | SQL fundamentals exercises + quiz (L1-L2) | M1: Team formation + project dataset selection | Project kickoff: database design walkthrough |
| 2 | JOINs, subqueries, aggregations | 3 videos | SQL query problem set + quiz (C1) | M2: ERD draft + normalization analysis | SQL deep-dive: JOIN patterns + common mistakes |
| 3 | Data modeling + entity-relationship diagrams | 2 videos + Jupyter notebook | ER diagram case study + peer review | M3: Finalized schema (CREATE TABLE) + sample queries | ER diagram workshop: using Lucidchart/DrawIO |
| 4 | Python for SQL users: pandas + DataFrames | 3 videos | DataCamp assignment + pandas lab | M4: Data source identification + extraction script | SQL→Python translation: pandas mirrors SQL (SELECT→filter, GROUP BY→groupby, JOIN→merge) |
| 5 | ETL pipelines in Python (pandas, sqlalchemy) | 3 videos | ETL coding exercises + peer code review | M5: Working ETL pipeline (extract + transform) | ETL pipeline workshop: error handling, logging |
| 6 | Data quality, cleaning + API integration | 2 videos | API integration lab + discussion post | M6: Complete ETL with error handling + load to local DB | Data quality + APIs: messy real-world data, Yelp/weather API ingestion |
| 7 | Indexing, optimization, cloud databases | 2 videos | Optimization quiz + index design exercise | M7: GCP Cloud SQL deployment + performance benchmarks | GCP Cloud SQL setup: cloud database hands-on |
| 8 | Data quality, governance, synthesis | 1 video + review | Portfolio reflection | Final deliverable + oral defense | Final presentations: team demos pipeline + Q&A |
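The Week 4 studio mapping (SELECT→filter, GROUP BY→groupby, JOIN→merge) can be sketched with toy data. The tables and values below are illustrative, not the course datasets:

```python
# Minimal SQL-to-pandas translation sketch using hypothetical data.
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 10, 20, 30],
    "amount": [50.0, 75.0, 20.0, 90.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 20, 30],
    "name": ["Ana", "Ben", "Cai"],
})

# SELECT * FROM orders WHERE amount > 40  ->  boolean filter
big = orders[orders["amount"] > 40]

# SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id  ->  groupby
totals = orders.groupby("customer_id", as_index=False)["amount"].sum()

# ... JOIN customers USING (customer_id)  ->  merge
report = totals.merge(customers, on="customer_id", how="inner")
print(report)
```

Writing the SQL version first and then its pandas mirror (as above) is the same habit the Week 4 lab builds.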

Team Project: Cloud-Deployed Data Pipeline (Team of 3, Weeks 1-8)

Problem Statement: Your team has been hired by a local e-commerce startup to design, build, and deploy a production-ready data pipeline. The company tracks customers, orders, products, and reviews in spreadsheets and scattered APIs. Your task: design a normalized database, build an ETL pipeline to populate it from multiple sources, deploy it to GCP Cloud SQL, and optimize it for performance.

Datasets Available:

Project Milestones (20-30% of grade)

| Milestone | Due | Deliverables |
|---|---|---|
| M1: Team formation + dataset selection | Week 1 | Team contract, dataset choice with justification |
| M2: ERD draft + normalization analysis | Week 2 | Draft ERD, identify 3NF violations in source data |
| M3: Finalized schema + sample queries | Week 3 | CREATE TABLE statements, 5 SQL queries demonstrating schema, GitHub repo |
| M4: Data source identification + extraction | Week 4 | Python extraction script using pandas (mirroring SQL queries), data inventory |
| M5: Working ETL pipeline (extract + transform) | Week 5 | pandas + sqlalchemy pipeline with transform logic |
| M6: Complete ETL with error handling | Week 6 | End-to-end pipeline loading to local DB, error handling + logging |
| M7: GCP Cloud SQL deployment + benchmarks | Week 7 | Live Cloud SQL database, performance baseline, index strategy |
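The M2 normalization analysis asks teams to spot 3NF violations. A minimal sketch of the idea, using an illustrative schema (not a course answer) and Python's built-in sqlite3:

```python
# Denormalized table vs. its 3NF split, run against in-memory SQLite.
import sqlite3

conn = sqlite3.connect(":memory:")

# 3NF violation: customer_name depends on customer_id, not on the key
# order_id, so it repeats (and can drift) across a customer's orders.
conn.execute("""CREATE TABLE orders_flat (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    customer_name TEXT,
    amount REAL)""")

# Normalized: the transitive dependency moves to its own table.
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    amount REAL);
""")

tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)  # ['customers', 'orders', 'orders_flat']
```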

Each milestone is assessed with brief instructor feedback to keep teams on track. Milestones build cumulatively — each incorporates and extends prior work.
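The M5 extract-transform-load shape can be sketched with pandas + sqlalchemy against an in-memory SQLite database. Table, column, and value names below are illustrative assumptions, not the course data sources:

```python
# Minimal M5-style ETL sketch: extract, clean, load with pandas + sqlalchemy.
import pandas as pd
from sqlalchemy import create_engine, text

def run_pipeline(engine):
    # Extract: a real pipeline would read CSVs or call an API here.
    raw = pd.DataFrame({
        "order_id": [1, 2, 2, 3],
        "amount": ["10.5", "20.0", "20.0", "bad"],
    })
    # Transform: drop duplicates, coerce types, discard unparseable rows.
    clean = raw.drop_duplicates()
    clean["amount"] = pd.to_numeric(clean["amount"], errors="coerce")
    clean = clean.dropna(subset=["amount"])
    # Load: write to the target database.
    clean.to_sql("orders", engine, if_exists="replace", index=False)
    return clean

engine = create_engine("sqlite:///:memory:")
loaded = run_pipeline(engine)
with engine.connect() as conn:
    n = conn.execute(text("SELECT COUNT(*) FROM orders")).scalar()
print(n)  # 2 rows survive cleaning and get loaded
```

Swapping the SQLite URL for a Cloud SQL connection string is, roughly, the M6→M7 step.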

Final Project Deliverable (15-20% of grade)

Due: Week 8

Deliverables:

Rubric (5 dimensions):

| Dimension | Excellent (A) | Proficient (B) | Developing (C) |
|---|---|---|---|
| Schema Design | 3NF normalized, handles all requirements, justified trade-offs | Mostly 3NF with minor issues | Normalization problems or missing relationships |
| ETL Pipeline | Modular, documented, robust error handling, efficient pandas/sqlalchemy | Good structure, handles happy path | Fragile pipeline, minimal error handling |
| GCP Implementation | Cloud SQL properly configured, secure, 30%+ performance gain | Cloud SQL works, 10-20% improvement | Connection/setup issues, minimal optimization |
| Documentation | Complete architecture diagram, clear setup guide, code comments | Adequate documentation | Incomplete or unclear instructions |
| Business Understanding | Explains design decisions in business terms, cost-aware | Mentions business context | No business rationale |

Oral Defense (20-25% of grade)

Week 8 studio session. Each team presents their pipeline design, demonstrates the live system, and fields Q&A from instructor and peers.

Rubric (3 dimensions):

| Dimension | Excellent (A) | Proficient (B) | Developing (C) |
|---|---|---|---|
| Technical Explanation | Clear walkthrough of design decisions, trade-offs articulated | Adequate explanation of system | Unclear or surface-level description |
| Live Demo | System works end-to-end, handles follow-up queries confidently | Demo works with minor issues | Demo fails or cannot answer questions |
| Individual Contribution | Each member explains their role with depth | Members can describe their work | Uneven participation or vague answers |

Weekly Assignments (30-40% of grade)

Practice exercises, labs, discussions, and peer reviews that build skills for the project. Completed individually unless noted.

| Week | Assignment | Format |
|---|---|---|
| 1 | SQL fundamentals exercises + quiz | Practice problems + timed quiz |
| 2 | SQL query problem set + quiz | JOINs, subqueries, aggregations |
| 3 | ER diagram case study + peer review | Design exercise + review a classmate's ERD |
| 4 | DataCamp assignment + pandas lab | DataCamp module + SQL→pandas translation exercises |
| 5 | ETL coding exercises + peer code review | Coding problems + review a classmate's pipeline code |
| 6 | Data quality + API integration lab | Clean a messy real-world dataset + build an API connector |
| 7 | Optimization quiz + index design exercise | Timed quiz + design indexes for given queries |
| 8 | Portfolio reflection | Written reflection on learning journey and skill growth |
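The Week 7 index design exercise can be previewed with Python's built-in sqlite3 (the course deploys to GCP Cloud SQL; SQLite just makes the idea runnable anywhere). The table and query below are illustrative:

```python
# Compare the query plan before and after adding an index.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(i, i % 100, float(i)) for i in range(1000)])

query = "SELECT * FROM orders WHERE customer_id = 42"

# Before indexing: the planner scans the whole table.
before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

# Add an index on the filtered column, then re-check the plan.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

print(before[-1][-1])  # e.g. a SCAN of orders
print(after[-1][-1])   # e.g. a SEARCH using idx_orders_customer
```

The same before/after habit, with real timings, is what the M7 performance benchmarks ask for.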

AI Tools Integration

Where AI Accelerates Learning:

  1. Week 1 (SQL Fundamentals): Use Claude/ChatGPT to:
    • Explain SQL errors (“Why does this JOIN return NULL?”)
    • Generate sample SQL queries for practice
    • Validate your schema design
    • Suggest refactoring for clarity
  2. Week 4-5 (Python ETL): Use AI to:
    • Debug pandas errors (“How do I reshape this DataFrame?”)
    • Optimize code performance (“How do I vectorize this loop?”)
    • Generate error handling patterns
    • Suggest pandas functions for complex transformations
  3. Week 7-8 (Cloud & Optimization): Use AI to:
    • Troubleshoot GCP Cloud SQL configuration issues
    • Suggest index strategies for slow queries
    • Review security practices
    • Generate Cloud Functions code for automation

Three-Options Prompting Protocol (Weeks 4-8, AIAS Level 2-3):

For all milestone submissions (M4-M7) and the final deliverable, students must use the three-options format when using AI for design or architectural decisions:

  1. Prompt AI for three distinct options (e.g., “Give me three different approaches to structuring this ETL pipeline”)
  2. Summarize each option briefly
  3. Select one and justify the choice in 2-3 sentences based on your specific project context

Submission format:

AI Prompt: [paste exact prompt]
Option A: [1-sentence summary]
Option B: [1-sentence summary]
Option C: [1-sentence summary]
Chosen: [A/B/C]
Rationale: [why this option fits your context better than the others]

This replaces generic AI attribution (“I used Copilot”) with evidence of evaluative judgment. Options must differ on at least one meaningful dimension (e.g., performance vs. simplicity vs. maintainability). The three-options protocol does not apply to Weeks 1-3 (pre-AI / AIAS Level 0-1 phases).

Studio Session Topics:

Assessment Summary

| Component | Weight Range | Notes |
|---|---|---|
| Weekly assignments (exercises, quizzes, labs, peer reviews) | 30-40% | Individual, skill-building |
| Project milestones (M1-M7) | 20-30% | Team, progressive deliverables |
| Final project deliverable | 15-20% | Team, culminating system |
| Oral defense | 20-25% | Team, live demo + Q&A |
| Studio participation | 5-10% | Weekly attendance + engagement |

No traditional exam. All assessment is project-based work plus participation.

AI Usage Levels (AIAS)

| Assessment | AIAS Level | AI Permitted |
|---|---|---|
| Quizzes (Weeks 1, 2, 7) | 0 | No AI |
| Weekly labs + exercises | 1 | AI for ideation and exploration only |
| Project milestones M1-M3 (schema/ERD) | 1 | AI for ideation and schema brainstorming only |
| Project milestones M4-M6 (ETL pipeline) | 2 | AI for code assistance and debugging, with attribution |
| Project milestone M7 + final deliverable (cloud deployment) | 3 | AI as full collaborator for GCP troubleshooting and code review, with attribution |
| Oral defense | 0 | No AI |
| Studio participation | 1 | AI for exploration during exercises |

Technology Stack

Prerequisites & Assumptions

Bridge Module: Python & Tools Orientation (Pre-Course, ~10 hours)

Complete before Week 1. Available in Canvas as a self-paced module with self-check quizzes.

| Unit | Topics | Format | Self-Check |
|---|---|---|---|
| 1. Python Basics (3 hrs) | Variables, data types, loops, conditionals, functions | Video walkthroughs + coding exercises in Jupyter | Quiz: write a function that filters a list |
| 2. Data Structures (2 hrs) | Lists, dictionaries, reading CSV files | Jupyter exercises with sample datasets | Quiz: load a CSV and answer 3 questions about it |
| 3. Jupyter Notebook Workflow (2 hrs) | Cells, markdown, running code, exporting notebooks | Guided walkthrough | Quiz: create a notebook with code + markdown cells |
| 4. GitHub Essentials (2 hrs) | Create account, clone repo, commit, push, write a README | Step-by-step tutorial | Quiz: push a Jupyter notebook to a new repo |
| 5. Environment Setup (1 hr) | Install Python, Jupyter, VS Code; verify pandas/sqlalchemy | Checklist + troubleshooting guide | Checkpoint: run provided test script successfully |

Readiness check: Students who pass all 5 self-check quizzes (70% threshold) are cleared for Week 1. Students who don’t pass receive targeted resources and can retake.
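For reference, a minimal sketch of the Unit 1 self-check task ("write a function that filters a list"); the function name and filtering rule are illustrative assumptions:

```python
# Illustrative Unit 1 self-check: filter a list with a list comprehension.
def keep_above(values, threshold):
    """Return only the values strictly greater than threshold."""
    return [v for v in values if v > threshold]

print(keep_above([3, 10, 7, 1], 5))  # [10, 7]
```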


Pedagogical Notes for Faculty

Design suggestions grounded in program research — not requirements. Adapt to your course and teaching style. Full references in reference/articles/.

The scenic route (cognitive friction) Pre-AI phases aren’t punishment — they’re the scenic route. Schultz’s dopamine research shows the brain registers learning through prediction errors (the gap between expectation and outcome), not through frictionless delivery. When students design their ERD by hand before asking Copilot to generate CREATE TABLE statements, that struggle is the mechanism — not the obstacle. Consider where in each week students should formulate their own approach before consulting AI. The AIAS progression in this course (0→1→2→3 across weeks) already builds this in; the key is framing it for students as investment, not restriction. → Machulla (2026), Schultz et al. (1997)

The IKEA effect (completion matters) Students value what they build — but only when labor leads to a finished artifact. Building and then discarding produces no ownership effect. The M1→M7→Final Deliverable pipeline already does this well: each milestone closes a loop. The oral defense is the ultimate completion signal — the moment where effort converts to demonstrated competence. When possible, make each milestone feel like a working thing (a queryable schema, a running pipeline), not just a checkpoint document. → Norton, Mochon & Ariely (2012)

Prompt sophistication as skill progression Anthropic data shows r > 0.92 correlation between prompt sophistication and output quality, and multi-turn interactions succeed 67% of the time vs. 49% for single-turn. This course is most students’ first encounter with AI-assisted coding (AIAS 1→3). Consider explicitly teaching prompt patterns early — not as a sidebar, but as a core data foundations skill: “asking the right question of an AI is the same skill as writing the right SQL query.” → Means (2026, “How You Ask”)

Three AI iterations before human review For project milestones M4-M7 (ETL pipeline work at AIAS 2-3), consider requiring students to iterate with AI at least 3 times before submitting for human feedback. This builds the habit of treating AI as a drafting partner, not an answer machine, and produces richer AI Attribution Logs. → Means (2026, “Practice Gap”)

Attack your assessments Before the semester starts, have a confident AI user (TA, LD, or yourself) attempt each major assignment using current AI tools from a student’s perspective. Where can AI complete the task without genuine understanding? Those are the spots to add pre-AI phases or shift weight toward the oral defense. Repeat each semester — AI capabilities shift fast. → Furze (2026)


Course Sequence: ← Previous: (first course) Next: BDI 513 — Data Storytelling