MSBAi Assessment Strategy
Purpose: Normative assessment policies for the MSBAi program – what faculty must follow when designing course assessments.
Program-level details: See program/curriculum.md. Research background: See reference/ASSESSMENT_RESEARCH.md. Student-facing rationale: See Why We Ask You to Show Your Thinking.
0. How to Read This Document
This document distinguishes program-wide commitments (a small set of consistency commitments to MSBAi students) from recommended defaults (pedagogical guidance faculty can adapt to their content and style).
Program-wide commitments
These are commitments MSBAi makes to students about what their experience will look like across every course. Faculty retain full autonomy over instrument, rubric, content, AI tooling choices, and weights within ranges — but the following structural elements remain consistent so students can rely on them:
- Every assessment declares its AIAS level (0-4) — students know up front what AI use is permitted on every assignment (§3.1)
- Every 8-week course has an individual oral component worth 20-25% of course grade — live in Studio preferred (Studio is the natural venue); async acceptable with mandatory Q&A mechanism when scheduling requires it; distributed or end-of-course at faculty discretion (§6)
- Every 8-week course includes a peer evaluation of teammates worth 5% of course grade — instrument, rubric dimensions, and the formative/summative split are faculty’s choice (§7)
- Every 8-week course includes one major team project, scaffolded across the eight weeks (§2, project_milestone_template.md)
- Live Session engagement is graded via InScribe at 3-5% of course grade: attendance is not required; engagement is verified each week through one InScribe post plus a response to one peer’s post (§2)
- One major project per course, not 2-3 — depth over breadth (§2)
Everything else in this document is a recommended default — well-supported by research and program design, but adapted by faculty to fit their course. When the doc uses language like “typically,” “recommended,” “default,” “may,” or “faculty’s choice,” that’s the autonomy lane.
Why this matters: MSBAi students take 9 courses across 15 months. The commitments above protect their ability to predict and trust the program’s structure. Everything else is yours to design.
1. Assessment Philosophy
- Validity First – Assessments generate trustworthy evidence of learning, not just evidence of AI-assisted output quality (Furze, 2026)
- Transparency Over Surveillance – Clear AI usage policies per assignment; trust-based with accountability. Students know the AIAS level and rationale for every assessment.
- Process Over Product – Document learning journey, weight revision and iteration. Rubrics reward reasoning quality over surface fluency of AI-assisted deliverables (Vendrell & Johnston, 2026, P7).
- Authentic Application – Design for reality: assessments reflect real-world AI-augmented workflows, not artificial “AI-proof” constraints (Furze, 2026)
- Assessment as Process – Build evidence chains over time (weekly assignments → milestones → deliverable → defense), not single high-stakes moments. Multiple modes: written, oral, practical, collaborative. This pipeline mirrors the structure of a DJ’s buildup: each stage loads the brain’s reward system so the next resolution actually registers. The anticipatory phase — not the payoff — determines how intensely learning lands (Salimpoor et al., 2011; Machulla, 2026).
- Cognitive Friction by Design – Preserve the productive struggle essential for deep learning. Students formulate hypotheses, construct arguments, or analyze data independently before consulting AI. AI extends thinking; it doesn’t replace it. The neuroscience is concrete: dopamine neurons fire on prediction errors (surprise), not on predicted rewards — when outcomes match expectations exactly, the brain’s teaching signal is zero (Schultz et al., 1997). Frictionless AI delivery eliminates the uncertainty and effort that make learning neurologically meaningful. Pre-AI phases are not punishment; they are the scenic route that makes the destination worth reaching (Machulla, 2026; Vendrell & Johnston, 2026, P1/P8).
- Low-Stakes Iteration with Peer Review – Projects follow a draft → peer feedback → revision cycle. Students learn as much from reviewing others’ work as from receiving feedback. Early submissions are low-stakes checkpoints (formative), not high-stakes deadlines (summative). Peer review is structured with rubrics and trained in the first studio session of each course. Each iteration closes a small gap between intention and outcome — the IKEA effect shows that labor leads to love only when it leads to completion (Norton et al., 2012). Multiple small completions build cumulative ownership of the final deliverable.
2. Standard Assessment Model
Every course follows this structure. Faculty choose specific assignment types (cases, labs, discussions, exercises) based on course content.
Reading the weight ranges. Each component shows a range, not a fixed weight. Faculty select values within the ranges so that the course grade sums to exactly 100%. The ranges constrain relative emphasis (e.g., weekly assignments must be at least 30% but no more than 40%); the choices within them tune the course to its content.
8-week, 4-credit courses
| Component | Weight | Timing | Description |
|---|---|---|---|
| Weekly assignments | 30-40% | Weeks 1-8 | Practice exercises, case analyses, discussions, labs, peer reviews — faculty choose format based on content |
| Project milestones | 20-30% | Weeks 1-7 | Proposal, drafts, peer review — scaffolded steps toward final project. Studio sessions are where milestone work advances; studio output is graded here, not separately. |
| Final project deliverable | 15-20% | Week 8 | Teams of 3 (2 or 4 in exceptional circumstances — see §7); delivered as a team presentation (recorded video or live at faculty discretion) |
| Individual oral component | 20-25% | Weeks 1-8 | Spoken explanation of project work demonstrating individual mastery. May be distributed across milestones (e.g., short video at each check-in) or delivered as a single end-of-course defense — faculty’s choice. Live preferred (Studio is the natural venue); async acceptable when scheduling constraints require it, but async submissions must include an interactive question-and-answer mechanism — see §6. No separate team oral component; team presentation is part of the final deliverable. |
| Peer evaluation of teammates | 5% | Weeks 4, 8 | Anonymous peer rating on contribution / reliability / communication / collaboration. Recommended default: 1% Week 4 formative pulse (individual-completion grade) + 4% Week 8 summative (rating-based). Faculty may shift the split (e.g., 0%/5%, 2%/3%) provided the total = 5%. See §7 for framework and Appendix E for a recommended escalation policy. |
| Live Session engagement (InScribe) | 3-5% | Weeks 1-8 | Watch Live Session (live or recording) → post one insight, question, or application to InScribe → respond to one peer’s post. Faculty provides the weekly prompt tied to that session’s content. Attendance at the live session is not required or graded. |
4-week, 2-credit courses
Same structure compressed. Individual project only (insufficient time for team formation). Individual oral component still required.
| Component | Weight | Timing | Description |
|---|---|---|---|
| Weekly assignments | 25-35% | Weeks 1-4 | Labs, exercises, readings |
| Project milestones | 20-30% | Weeks 1-3 | Progressive deliverables toward final |
| Final project deliverable | 25-35% | Week 4 | Individual; delivered with individual oral component |
| Live Session engagement (InScribe) | 3-5% | Weeks 1-4 | Same model as 8-week courses — watch → post → respond on InScribe |
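The range rules above can be checked mechanically. The following is a minimal sketch, not an official program tool: the ranges are copied from the two tables, while the function name `check_weights`, the component keys, and the example syllabus are hypothetical. It assumes whole-number percentages.

```python
from typing import Dict, List, Tuple

Ranges = Dict[str, Tuple[int, int]]

# Weight ranges (percent of course grade) copied from the two tables above.
EIGHT_WEEK_RANGES: Ranges = {
    "weekly_assignments": (30, 40),
    "project_milestones": (20, 30),
    "final_project": (15, 20),
    "individual_oral": (20, 25),
    "peer_evaluation": (5, 5),
    "live_session_engagement": (3, 5),
}
FOUR_WEEK_RANGES: Ranges = {
    "weekly_assignments": (25, 35),
    "project_milestones": (20, 30),
    "final_project": (25, 35),
    "live_session_engagement": (3, 5),
}


def check_weights(weights: Dict[str, int], ranges: Ranges) -> List[str]:
    """Return a list of problems; an empty list means the weights are valid."""
    problems: List[str] = []
    for name in ranges:
        if name not in weights:
            problems.append(f"missing component: {name}")
    for name, value in weights.items():
        if name not in ranges:
            problems.append(f"unknown component: {name}")
            continue
        low, high = ranges[name]
        if not low <= value <= high:
            problems.append(f"{name}={value}% is outside {low}-{high}%")
    total = sum(weights.values())
    if total != 100:
        problems.append(f"weights sum to {total}%, not 100%")
    return problems


# Hypothetical 8-week syllabus, for illustration only.
example = {
    "weekly_assignments": 30,
    "project_milestones": 20,
    "final_project": 20,
    "individual_oral": 20,
    "peer_evaluation": 5,
    "live_session_engagement": 5,
}
print(check_weights(example, EIGHT_WEEK_RANGES))  # prints []
```

Note that choosing the minimum of every 8-week range gives only 93%, so at least 7 percentage points must be allocated above the minimums somewhere; the 4-week minimums sum to 73%.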
Key principles
- One major project per course, not 2-3. Depth over breadth.
- Weekly assignments build skills needed for the project — they are not filler.
- Project milestones threaded throughout all weeks, ramping up toward the end.
- Team size: 3 students standard in 8-week courses (2 or 4 in exceptional circumstances; see §7). 4-week courses use individual projects only.
- Individual oral component required in every course — format and distribution across milestones at faculty discretion (see Section 6).
- Live session attendance is not graded — engagement with session content is verified through InScribe posts.
- Studio sessions are not graded separately — studio output is captured in project milestone grades.
- Faculty choose assignment types based on content: cases, labs, discussions, exercises, peer reviews.
3. AI-Aware Assessment Framework
MSBAi uses three complementary frameworks to structure AI-appropriate assessment.
3.1 AI Assessment Scale (AIAS)
Adapted from Perkins, Furze, Roe, & MacVaugh (2024). Published AIAS uses Levels 1-5; MSBAi adapts to 0-4 (Level 0 = no AI).
| Level | AI Usage | Example |
|---|---|---|
| 0 | No AI permitted | Individual oral components, proctored assessments |
| 1 | AI for brainstorming only | Idea generation, not content creation |
| 2 | AI for drafting with human revision | Code assistance, debugging, first drafts – with attribution |
| 3 | AI as collaborative tool | Full integration with disclosure; AI for code generation, narrative refinement |
| 4 | AI as subject of analysis | Build, critique, and evaluate AI systems |
Every assessment component in every course syllabus is annotated with its AIAS level. See individual course pages for per-assignment levels.
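One way to make the "every component is annotated" commitment checkable is to store the level alongside each component. The sketch below is illustrative only: the enum member names, the `unannotated` helper, and the sample syllabus are hypothetical, and apart from the Level 0 oral component (which follows the table above) the levels shown are not program assignments.

```python
from enum import IntEnum
from typing import Dict, List, Optional


class AIAS(IntEnum):
    """MSBAi's 0-4 adaptation of the AI Assessment Scale (see table above)."""
    NO_AI = 0
    BRAINSTORMING = 1
    DRAFTING = 2
    COLLABORATIVE = 3
    AI_AS_SUBJECT = 4


def unannotated(components: Dict[str, Optional[AIAS]]) -> List[str]:
    """Return the components that still lack an AIAS level.

    The check uses `is None` on purpose: Level 0 is a valid annotation,
    but it is falsy, so a plain truthiness test would wrongly flag it.
    """
    return sorted(name for name, level in components.items() if level is None)


# Hypothetical syllabus, for illustration only.
syllabus: Dict[str, Optional[AIAS]] = {
    "Weekly lab 1": AIAS.DRAFTING,
    "Project proposal": AIAS.BRAINSTORMING,
    "Individual oral component": AIAS.NO_AI,
    "Final project deliverable": None,  # not yet annotated
}
print(unannotated(syllabus))  # prints ['Final project deliverable']
```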
3.2 Pre-AI / AI-Mediated / Post-AI Sequencing
Beyond setting an AIAS level per assignment, faculty should design the sequence of engagement within activities. This prevents cognitive offloading while preserving AI’s value as a thinking partner. The neuroscience basis: the brain’s dopamine system is most engaged under uncertainty — when the outcome is genuinely unknown, not when rewards arrive on schedule (Schultz et al., 1997; Fiorillo et al., 2003). The pre-AI phase creates this uncertainty (will my hypothesis hold?); the AI-mediated phase introduces surprise (did AI find something I missed?); the post-AI phase closes the gap through reflection (what did I actually learn?). This is the “dopamine gap” — the space between expecting and receiving — and it is where motivation, competence, and meaning are built (Machulla, 2026).
| Phase | Student Activity | Purpose |
|---|---|---|
| Pre-AI (AI-free) | Formulate hypothesis, draft analysis plan, identify assumptions, construct initial argument | Preserves cognitive friction; builds independent reasoning before AI exposure |
| AI-Mediated | Use AI to extend analysis, generate alternatives, challenge assumptions, debug code, explore counterarguments | Positions AI as thinking partner; student directs the inquiry |
| Post-AI (reflection) | Evaluate what AI added vs. missed, compare AI output to own reasoning, document modifications, identify limitations | Builds evaluative judgment and metacognitive awareness |
Implementation examples:
- FIN 550 lab: Students build a baseline model by hand (pre-AI), then use Copilot to optimize hyperparameters and explore feature engineering (AI-mediated), then write a reflection comparing their intuition to AI suggestions (post-AI)
- BDI 513 case: Students draft their own data story narrative (pre-AI), ask AI to suggest alternative framings or identify gaps (AI-mediated), then defend their final narrative choice in studio (post-AI)
- BADM 557 project milestone: Students design their BI dashboard wireframe independently (pre-AI), use AI to generate DAX formulas and suggest visualizations (AI-mediated), then critique which AI suggestions they rejected and why (post-AI)
This sequencing is supported by Kosmyna et al. (2025), who found that students who engaged independently before consulting an LLM produced significantly stronger outputs than those who used AI from the start. The METR randomized trial (2025) adds a cautionary data point: experienced developers using AI were 19% slower on complex tasks yet believed they had been 20% faster — the frictionless AI experience creates a subjective sense of productivity that diverges from measurable outcomes.
Sources: Vendrell & Johnston (2026), Principles P1 and P8; Furze (2026), “design for reality” principle; Machulla (2026), dopamine prediction error and the “scenic route” framing; METR (2025), AI productivity perception gap.
3.3 AI Declaration Requirements
All major projects require students to document:
- Which AI tools were used
- What prompts were employed
- What limitations were encountered
- How human judgment modified AI outputs
See Appendix A for the AI Attribution Log template.
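The four required items can also be treated as a simple completeness check before a project is accepted for grading. This is a sketch under stated assumptions: the `AIDeclaration` class, its field names, and `missing_items` are hypothetical and only mirror the four bullets above; they are not part of the Appendix A template.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class AIDeclaration:
    """One field per required item in the list above (hypothetical names)."""
    tools_used: str = ""
    prompts_employed: str = ""
    limitations_encountered: str = ""
    human_modifications: str = ""


def missing_items(declaration: AIDeclaration) -> List[str]:
    """Return the names of required items that were left blank."""
    return [
        f.name
        for f in fields(declaration)
        if not getattr(declaration, f.name).strip()
    ]


# Hypothetical draft declaration, for illustration only.
draft = AIDeclaration(
    tools_used="Copilot",
    prompts_employed="Asked for hyperparameter suggestions",
)
print(missing_items(draft))
# prints ['limitations_encountered', 'human_modifications']
```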
3.4 Design Thinking Checklist for AI Task Design
Before finalizing how AI is integrated into any assignment or milestone, faculty should answer three diagnostic questions (adapted from Vander, 2026 — Rethinking AI in education through design thinking):
1. What is the learning outcome? Be precise. “Students need help with analysis” is too vague. “Students need to evaluate competing model specifications and justify their choice using domain knowledge” makes it possible to decide whether AI should assist with code generation, critique, exploration of alternatives — or not be used at all.
2. What cognitive work must remain with the student? Map this directly to the AIAS level. If the cognitive work that defines the outcome (formulating hypotheses, selecting evidence, defending a position) can be offloaded to AI, the assessment is measuring AI fluency, not learning.
3. What kind of assistance makes that cognitive work more visible, more rigorous, or more equitable? This distinguishes AI as material for thinking (students examine, critique, and modify AI output — cognitive work stays with the student) from AI as shortcut around cognition (students accept AI output — cognitive work is bypassed). Both look like productive use from the outside; only the design reveals the difference.
The central shift: The productive question is not “How do we use AI?” It is “What are students trying to understand, create or improve, and what kind of support would genuinely strengthen that learning?”
These three questions should be answered in writing for every major assessment before the course syllabus is finalized, and submitted as part of the T&L course design review.
3.5 Black Box Assessment: Making Learning Visible
Framework: Winstone, Gravett & Elkington (2026), “Black Box Assessment: Rethinking Integrity and Learning for a Time of Generative AI.” Assessment & Evaluation in Higher Education. Full notes: reference/articles/winstone-2026-black-box-assessment.md
Traditional assessment is a black box: educators send a task brief and receive a polished product, but how it was produced is opaque. GenAI turns that opacity into a validity crisis, because a polished product is no longer reliable evidence of learning. The solution is structural (redesign the task), not discursive (add more policy language).
The key question shift: “Did they write this?” → “What can we see about how they learned?”
The Four Components of a Black Box Assessment
The framework recommends four components for any major assessment. Faculty choose which to adopt and how:
| Component | Definition | Suggested MSBAi adaptation |
|---|---|---|
| Assessment Task | The structured assignment | Project brief / milestone set |
| Black Box Learning Outcome | ≥1 LO focused on how students engage, not just what they produce | Consider adding one explicit process LO per course |
| Black Box Windows | A few structured pieces of process evidence submitted alongside the final product | AI Attribution Log, milestone drafts, debugging notes, GitHub commits |
| Black Box Rubric | Explicitly grades quality of process — revision, error recovery, cognitive pivot | Consider adding a process dimension to major rubrics (see below) |
Black Box Windows in MSBAi Context
Select windows that combine:
- Natural evidence — code commits, debugging logs, draft histories, GitHub commit messages. These already exist in project work; make them gradeable.
- Reflective evidence — AI Attribution Log (what AI suggested, what changed, why), milestone retrospectives (I started with X, pivoted to Y because Z).
Two to three windows per major assessment are typically sufficient. Each window should reveal something about the student’s thinking that the final product conceals; it should not be busywork.
Adding a Process Learning Outcome (recommended)
Consider including at least one L-C-E outcome focused on process in each 8-week course:
Example (Competency level): “Document the evolution of a design or analytical decision across at least two milestones — including an initial approach, what prompted a revision, and what was learned from the change.”
This anchors the Black Box Rubric dimension and signals to students that cognitive change is being assessed, not just final quality.
Black Box Rubric Dimension (recommended)
A suggested dimension to add to major project rubrics:
| Dimension | Excellent (A) | Proficient (B) | Developing (C) |
|---|---|---|---|
| Learning Trajectory | Shows clear cognitive pivot or error recovery with explicit rationale; demonstrates how thinking evolved across milestones | Shows some revision with partial explanation of what changed | Final product submitted with no visible evidence of iteration or change |
This credits a student who shows “I started with approach A, realized it was wrong, and moved to B — here’s why” — stronger evidence of learning than a polished product with no visible struggle.
Relationship to AIAS and Vander (2026)
These three frameworks work together as a complete structural system:
- AIAS (Section 3.1): specifies how much AI is permitted on a given task
- Vander diagnostic questions (Section 3.4): determine what cognitive work must stay with the student
- Black Box Windows + Rubric (this section): make that cognitive work visible and assessable
AIAS without Black Box rubric design produces disclosures without evidence. Black Box without AIAS produces process documentation without guidance on appropriate AI use.
4. Program-Level Portfolio Structure
For semester-by-semester course assignments, see program/curriculum.md.
| Program Stage | Artifacts | Competencies Demonstrated |
|---|---|---|
| Foundation courses | SQL projects, data visualization, reflection | Database, visualization basics |
| Analytics courses | ML models, business case analyses | Predictive modeling, business application |
| Advanced courses | Team project deliverables, peer reviews | Collaboration, communication |
| Practicum | Practicum + comprehensive reflection | Integration, professional readiness |
5. Synchronous Assessment Components
Even in an async-first program, synchronous touchpoints are part of the assessment design; attendance expectations vary by touchpoint:
- Live Session (weekly, 90 min — lead faculty delivers the week’s content; occasional guest speakers or case discussions at faculty discretion). Attendance is not required or graded; engagement with session content is verified through the weekly InScribe post (see §2, “Live Session engagement”).
- Project Studio (weekly, 90 min — hands-on project work, live coding, peer feedback; faculty- or experienced-TA-led). Studio is also the natural live venue for project pitches, milestone defenses, peer-review discussions, and final presentations — faculty are encouraged to use Studio time across the 8 weeks for these live components whenever schedules allow (see §6). Studio is not graded as a separate line item; studio output is captured in the project milestone grades (see §2).
- Office Hours (weekly + drop-in — 1:1 and small-group; scheduled across timezones; voluntary)
- Mid-term Check-ins (15-min instructor conversation)
- Project Presentations (live via Zoom, recorded backup)
- Practicum Defense (mandatory synchronous, panel format)
Guest speakers, case discussions, and analytics current-events conversations may be incorporated into the Live Session or Project Studio at faculty discretion — they are not scheduled as a separate program-wide series.
6. Individual Oral Component Requirements
Research strongly supports oral components for verifying individual understanding in AI-enabled environments. An oral component is only meaningful if questions can be asked: one-way recorded speech doesn’t establish individual mastery; conversation does.
Key design principles:
- Live preferred — the weekly Studio session (90 min) is the natural venue for project pitches, milestone defenses, and final presentations. Faculty are encouraged to use Studio time across the 8 weeks for live oral components whenever schedules allow. Live conversation provides the question-and-answer dynamic that gives the oral component its assessment value.
- Async acceptable when scheduling constraints require it — for students unable to attend Studio at the scheduled time, async video submission is acceptable provided it is paired with an interactive Q&A mechanism (see Async Q&A Options below). Async one-way video alone does not satisfy the oral component requirement.
- Distributed or end-of-course — faculty may split the oral component across milestone check-ins (e.g., 3–5 min discussion per milestone) or deliver it as a single end-of-course defense
- Individual, always — each student must speak to the work independently, regardless of team structure
- No separate team oral component — team presentation is the delivery format for the final project deliverable, graded there
Async Q&A options (when async is needed):
The point of the oral component is to verify that the student can answer unscripted questions about the work. When async, faculty choose a mechanism that preserves that dynamic:
1. Asynchronous Q&A thread: student submits the recorded explanation; instructor (and optionally classmates) post 2-3 questions within 48 hours; student records a follow-up response video within 48 hours. Two-round minimum.
2. Recorded follow-up questions: instructor reviews the student’s video and records 2-3 targeted questions; student records a response video. One round.
3. Scheduled live mini-defense: async video is the prep; a 10-15 minute live conversation with the instructor at a mutually scheduled time is the defense. Hybrid model.
4. Studio live + async makeup option: primary delivery is in Studio; students who genuinely cannot attend submit async video + follow-up via option 1 or 2 above.
Options 3 and 4 are recommended when feasible; options 1 and 2 are fallbacks. Pure async video with no follow-up mechanism is not sufficient.
Implementation (ACTIVE – all course syllabi):
| Course Component | Oral Weight | Format |
|---|---|---|
| 8-Week Course Projects | 20-25% of course grade | Spoken explanation of project work — distributed across milestones or single defense. Live in Studio preferred; async with mandatory Q&A mechanism (see above) when scheduling requires |
| 4-Week Course Projects | Included in final project (25-35%) | Individual spoken component — Live in Studio preferred; async with Q&A mechanism when needed |
| Practicum | 25-35% of Practicum grade (min 20%) | Faculty determines format; live panel format recommended given client/external stakeholder involvement |
Practicum oral component notes:
- Faculty determine length and format within the 25-35% range
- Panel may include client sponsor for client projects
- Each student must answer questions individually, regardless of team/individual format
- Career pivoters should be assessed on ability to articulate their analytical value proposition
- See courses/practicum.md for full Practicum guidelines
See Appendix C for the standardized oral component rubric.
7. Team Assessment Guidelines
Cross-reference: design_principles.md Constraints 7 (team projects required) and 8 (oral defense weights).
Team Project Policy:
- Every 8-week course includes at least one team project (typically the final project)
- Teams of 3 students (2 or 4 in exceptional circumstances), assigned by instructor to balance skill sets
- 4-week courses are individual projects only (insufficient time for team formation)
Individual Accountability Within Teams:
- Oral defense (20-25%) is the primary individual-accountability lever — each team member must be able to answer questions on any part of the project
- Peer evaluation of teammates (5% of course grade) is the complementary signal — see “Peer Evaluation Framework” below
- Git commit history reviewed by instructor to validate individual contributions when peer evaluations flag concerns
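When a peer evaluation flags a concern, a quick first look at the commit history might start from per-author commit counts. The sketch below parses the text printed by `git shortlog -s -n HEAD`; the function name `commit_shares` and the sample output are hypothetical. Commit counts are only a rough signal (they say nothing about the size or quality of each commit), so this supports the instructor's review rather than replacing it.

```python
from typing import Dict


def commit_shares(shortlog_output: str) -> Dict[str, float]:
    """Convert `git shortlog -s -n HEAD` output into each author's share of commits.

    Each non-empty line is expected to look like "<count><whitespace><author>".
    """
    counts: Dict[str, int] = {}
    for line in shortlog_output.splitlines():
        line = line.strip()
        if not line:
            continue
        count, author = line.split(None, 1)  # split on the first whitespace run
        counts[author] = counts.get(author, 0) + int(count)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {author: round(n / total, 2) for author, n in counts.items()}


# Hypothetical shortlog output for a team of three, for illustration only.
sample = "    14\tStudent A\n     9\tStudent B\n     2\tStudent C\n"
print(commit_shares(sample))
# prints {'Student A': 0.56, 'Student B': 0.36, 'Student C': 0.08}
```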
Peer Evaluation Framework:
Program-wide commitment: peer evaluation worth 5% of course grade. Recommended default: two anonymous touchpoints using the same instrument (contribution / reliability / communication / collaboration):
| Stage | Recommended weight | Purpose | Suggested grading basis |
|---|---|---|---|
| Week 4 formative pulse | 1% of course grade | Surface team issues with time to course-correct | Individual completion — submit the pulse, earn the 1% (ratings given/received not graded; comments visible to instructor) |
| Week 8 summative evaluation | 4% of course grade | Final teammate rating | Mean of ratings received populates the 4% |
Faculty may adapt the split (e.g., 0%/5% summative only, 2%/3%, or distributed across more touchpoints) provided the total = 5%.
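The recommended default and its permitted variations reduce to a small calculation. The following is a sketch, not the program's grading formula: the function name is hypothetical, and the 1-5 rating scale and the linear mapping from mean rating to points are assumptions, since this document does not fix either.

```python
import math
from statistics import mean
from typing import Sequence


def peer_eval_points(
    submitted_pulse: bool,
    ratings_received: Sequence[float],
    formative_weight: float = 1.0,  # recommended default: 1% Week 4 pulse
    summative_weight: float = 4.0,  # recommended default: 4% Week 8 rating
    max_rating: float = 5.0,        # ASSUMPTION: ratings on a 1-5 scale
) -> float:
    """Return peer-evaluation points earned, out of 5% of the course grade."""
    if not math.isclose(formative_weight + summative_weight, 5.0):
        raise ValueError("formative + summative weights must total 5%")
    if not ratings_received:
        raise ValueError("no Week 8 ratings received; resolve by course policy")
    # Formative pulse: completion-based, ratings themselves are not graded.
    formative = formative_weight if submitted_pulse else 0.0
    # Summative: ASSUMPTION that the mean rating scales linearly to the weight.
    summative = summative_weight * mean(ratings_received) / max_rating
    return round(formative + summative, 2)


# Hypothetical student on a team of 3: pulse submitted, rated 4 and 5 by teammates.
print(peer_eval_points(True, [4, 5]))  # prints 4.6
```

A summative-only course would call the same function with `formative_weight=0.0, summative_weight=5.0`.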
Suggested implementation choices (all faculty-adaptable):
- Train students on constructive, evidence-based feedback in an early studio session (peer-review calibration exercise — see “Low-Stakes Iteration Model” below)
- Keep ratings anonymous to teammates; comments visible to instructor
- Review Week 4 pulse output the same week; use comments to triage struggling teams
- For documented free-riding revealed across Week 4 + Week 8 + studio observation, see Appendix E: Free-Rider Escalation Framework for a recommended escalation approach. Faculty retain discretion on how to handle individual cases.
MSBAi Peer Assessment Types
| Type | Description | When to Use |
|---|---|---|
| Code Review | Evaluate peer code quality and documentation | Technical courses |
| Analysis Critique | Assess methodology and conclusions | Statistics/ML courses |
| Presentation Feedback | Evaluate communication effectiveness | Practicum, storytelling |
| Team Contribution | Rate collaboration and reliability | Group projects |
Low-Stakes Iteration Model with Feedback Closure
Every multi-week project should follow a draft → feedback → revision → feedback-closure cycle:
| Stage | Timing | Stakes | Feedback Source |
|---|---|---|---|
| Draft checkpoint | Mid-project (e.g., Week 3 proposal, Week 6 analysis draft) | Low — formative only, or ≤5% of project grade | Peer review + instructor spot-check |
| Peer review | 2-3 days after draft submission | Part of the project milestone grade (studio is not graded separately; see §2) | Structured rubric (same dimensions as final rubric, simplified) |
| Feedback closure | Submitted with revision | Counts toward revision quality grade | Student documents how peer + instructor feedback was incorporated, ignored, or modified |
| Revision + final | Project deadline | Full weight (summative) | Instructor grading on final deliverable + feedback-closure quality |
Feedback closure (required): Students must submit a short feedback-response memo (½ page or one short paragraph per reviewer) with each revised draft. The memo answers, for each piece of peer and instructor feedback received:
- What was the feedback?
- How did we incorporate it — or why did we choose not to?
- What did the feedback help us see that we couldn’t see ourselves?
This makes peer review meaningful (it has visible consequences in the revision) and forces students to take peer feedback seriously alongside instructor feedback. It also creates Black Box Window evidence (§3.5) for the Learning Trajectory rubric dimension — the memo is documented cognitive change.
Recommended scope of peer review across the milestone arc:
- Proposal stage (Week 3-4): teams review 1-2 other teams’ proposals, deliver structured feedback in studio. Feedback closure submitted with Final Proposal (Week 4).
- Final presentation (Week 8 or rehearsal week): teams give and receive feedback on dry-run presentations. Feedback closure submitted with the final delivered version.
Implementation guidance:
- 8-week courses: At least one project milestone should include peer review + feedback closure before final submission
- 4-week courses: Draft checkpoints encouraged but optional (compressed timeline)
- Practicum: Part 1 (portfolio) uses Week 2 peer workshop; Part 2 (project) uses Week 7 dry run — both with feedback closure
- Peer review training: It is recommended that the first studio session of each course include a 15-minute peer review calibration exercise (students review a sample artifact together, discuss scoring, align expectations)
- Peer review rubric: A simplified version of the project’s final rubric (3 dimensions instead of 5, same language) works well
What students gain from reviewing — and from closure:
- Exposure to different approaches to the same problem
- Calibration of their own work quality against peers
- Practice giving constructive technical feedback — a workplace skill
- Practice receiving feedback well — closure trains the discipline of evaluating critique on merit rather than defensively ignoring or wholesale accepting it
8. Risk Mitigation
- AI Over-Reliance: Oral defense verifies understanding; process documentation reveals AI dependency; AIAS levels set appropriate use boundaries; pre-AI/post-AI sequencing (Section 3.2) ensures students build independent reasoning before consulting AI
- Assessment Gaming: Multiple modalities (written, code, oral, peer); progressive project complexity; individual Practicum with live defense
- Cognitive Offloading: Pre-AI phases in every activity preserve productive struggle; rubrics explicitly reward reasoning quality over output polish; AI Attribution Log makes thinking process visible. The risk is neurological, not just pedagogical: AI tools that deliver instant answers without uncertainty create the “infinite scroll” effect — engagement without completion signals, dopamine gaps that never resolve (Machulla, 2026). Pre-AI phases are the “page break” that gives the brain a stopping cue.
- Student Resistance to Process Documentation: Explain career rationale; provide templates (Appendix A); grade leniently first term, increase rigor over time
- Faculty Capacity for Oral Defenses: One major project per course limits oral defense load; use studio session time; TA support for scheduling
Faculty Assessment Validation (“Attack Your Assessments”)
Before finalizing course assessments, faculty should conduct an AI stress test (Furze, 2026):
- Attempt your own assessments with AI — Have a confident AI user (faculty member, TA, or instructional designer) complete each major assessment using current AI tools from a student’s perspective
- Identify vulnerability points — Which parts can AI complete without genuine understanding? Where does the assessment truly require human reasoning?
- Redesign where needed — Strengthen vulnerable assessments by adding pre-AI phases, requiring process documentation, or shifting weight toward oral defense
- Repeat each semester — AI capabilities change rapidly; what was AI-resistant in Fall 2026 may not be by Spring 2027
This exercise should be introduced in the faculty orientation process (presentations/faculty-orientation/) and, as noted above, repeated each semester.
Appendix A: AI Attribution Log Template
## AI Attribution Log
### Project: [Name]
### Date: [Date]
### Student: [Name]
| Date | AI Tool | Task | Prompt Summary | Output Summary | How I Modified/Validated |
|------|---------|------|----------------|----------------|--------------------------|
| | | | | | |
| | | | | | |
### Reflection on AI Use
- What tasks did AI help with most effectively?
- Where did AI outputs require significant modification?
- What would I do differently next time?
Appendix B: AIAS Level Reference
| Level | AI Permitted | Example Assignment |
|---|---|---|
| 0 | None | Certification quiz |
| 1 | Ideation only | Brainstorm features (document what AI suggested) |
| 2 | With attribution | Standard project work |
| 3 | As collaborator | Advanced analysis with AI partnership |
| 4 | As subject | Agentic AI course projects |
Appendix C: Oral Defense Rubric
| Criterion | Excellent (A) | Proficient (B) | Developing (C) |
|---|---|---|---|
| Clarity of Explanation | Explains concepts clearly to non-expert | Clear with minor gaps | Confusing or unclear |
| Technical Depth | Demonstrates deep understanding | Shows solid understanding | Surface-level knowledge |
| Response to Questions | Handles unexpected questions confidently | Answers most questions adequately | Struggles with questions |
| Methodology Justification | Explains why decisions were made | Describes what was done | Cannot explain choices |
| AI Usage Awareness | Articulates when/how AI helped vs. didn’t | Acknowledges AI use | Unclear on AI role |
Appendix D: Cross-Course Ethics Integration
| Course | Ethics Focus | Case Study Topic |
|---|---|---|
| 554 | Data Privacy | Cambridge Analytica, GDPR compliance |
| 513 | Visualization Integrity | Misleading COVID charts, election misinformation |
| 550 | Algorithmic Fairness | Biased lending algorithms, credit scoring |
| 557 | Surveillance & BI | Employee monitoring, predictive policing |
| 558 | Cloud Security | Data breaches, sovereignty, vendor lock-in |
| 576 | Model Accountability | Healthcare AI failures, autonomous vehicles |
Appendix E: Free-Rider Escalation Framework (Recommended)
Status: This is a recommended framework for handling documented free-riding, offered to faculty who want a structured, defensible approach. Faculty retain full discretion over how individual cases are handled. The framework is designed to be defensible if challenged (by a student, advising, or governance) — using it earns that defensibility; adapting or replacing it is the instructor’s call.
When it applies. The 5% peer-eval line handles normal variance in team contribution. This framework is designed for cases where documented evidence indicates a teammate’s contribution to the team project fell substantially below expectations — not when ratings simply differ.
Suggested evidence threshold. The framework is most defensible when all three of these signals align:
- Week 4 formative pulse — at least two teammates flagged the individual (rating ≤2 on contribution or reliability, or explicit comment describing the issue)
- Week 8 summative evaluation — same individual receives mean rating ≤2.5 across contribution and reliability dimensions, with corroborating teammate comments
- Independent instructor signal — Git commit history absent or trivial, studio attendance absent, milestone contributions undocumented, missed check-ins, unresponsive to outreach, etc.
Suggested handling: two-of-three signals → instructor conversation; three-of-three → instructor may consider grade adjustment.
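The threshold logic above can be sketched in a few lines of Python. This is an illustrative sketch only, not a program tool: the `TeammateEvidence` class, its field names, and the function names are hypothetical, and the instructor signal is reduced to a single yes/no judgment that the instructor makes from the kinds of evidence listed above.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class TeammateEvidence:
    """Hypothetical record of the three signals for one student."""
    week4_flags: int                    # teammates who flagged the student in the Week 4 pulse
    week8_ratings: List[float]          # Week 8 ratings on contribution and reliability
    week8_corroborating_comments: bool  # teammate comments back up the low ratings
    instructor_signal: bool             # instructor's own judgment from Git, attendance, etc.


def count_signals(e: TeammateEvidence) -> int:
    """Count how many of the three suggested signals are present."""
    week4 = e.week4_flags >= 2
    week8 = (
        bool(e.week8_ratings)
        and mean(e.week8_ratings) <= 2.5
        and e.week8_corroborating_comments
    )
    return sum([week4, week8, e.instructor_signal])


def suggested_handling(e: TeammateEvidence) -> str:
    """Map the signal count to the suggested handling; the instructor still decides."""
    n = count_signals(e)
    if n == 3:
        return "instructor may consider grade adjustment"
    if n == 2:
        return "instructor conversation"
    return "no escalation suggested"
```

For example, a student flagged by two teammates at Week 4, with Week 8 ratings averaging 2.5 plus corroborating comments, and with an independent instructor signal, lands in the three-of-three case. A single low rating never reaches either threshold on its own.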
Suggested adjustment scale. When invoking the framework, instructors may reduce the individual’s share of the team project grade (final deliverable bucket) by:
| Pattern | Suggested adjustment |
|---|---|
| Significant under-contribution despite ability and access | -10 to -25% of individual’s final-project bucket |
| Substantial non-participation (essentially absent from team work) | -25 to -50% |
| No meaningful contribution + no engagement with instructor outreach | up to -100%; potential course failure |
The peer-eval 5% line is independent of any escalation — the individual still receives whatever they earned on the 1%/4% lines.
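As a worked illustration of the adjustment scale, the sketch below applies a chosen reduction to one student's final-project bucket score. It rests on assumptions that are not stated in this document: the percentage is read as a proportional reduction of the score the student would otherwise receive in that bucket, the third pattern is given a lower bound of zero because the table states only "up to -100%", and the pattern keys and function name are hypothetical.

```python
# Suggested ranges from the table above, expressed as fractions of the bucket score.
# Assumption: the last pattern has no stated lower bound, so 0.0 is used here.
ADJUSTMENT_RANGES = {
    "significant_under_contribution": (0.10, 0.25),
    "substantial_non_participation": (0.25, 0.50),
    "no_contribution_no_engagement": (0.0, 1.00),
}


def adjusted_bucket_score(team_score: float, pattern: str, reduction: float) -> float:
    """Return the student's final-project bucket score after a proportional reduction.

    team_score: score the team earned in the final-project bucket (e.g. out of 100).
    reduction:  fraction chosen by the instructor, checked against the suggested range.
    """
    low, high = ADJUSTMENT_RANGES[pattern]
    if not low <= reduction <= high:
        raise ValueError(
            f"reduction {reduction} is outside the suggested range {low}-{high} for {pattern}"
        )
    return round(team_score * (1 - reduction), 2)


# Example: the team earned 90 in the bucket and the instructor applies a 20% reduction.
example = adjusted_bucket_score(90.0, "significant_under_contribution", 0.20)  # 72.0
```

Only the final-project bucket is touched in this sketch. The peer-evaluation points are computed separately and are left as earned, matching the independence noted above.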
Suggested procedure (faculty may adapt):
- By Week 5, if the Week 4 pulse and studio observation suggest a developing issue, reach out to the team and to the individual separately. Document the outreach, both for the student’s benefit (early signal) and for defensibility if escalation is later needed.
- If the issue persists, review Week 8 peer evaluations alongside Git/Canvas activity.
- Before final grades post, notify the affected student in writing: nature of the under-contribution, evidence cited, adjustment proposed, opportunity to respond in writing within ~3 business days.
- Record the final adjustment in Canvas with a brief private note. For adjustments above -25%, consulting the Academic Director before posting is recommended (not required) — this protects faculty from grade-grievance escalation and ensures cross-course consistency on high-stakes calls.
What this is not.
- Not a tool for resolving intra-team conflict or personality clashes
- Not triggered by a single low rating
- Not a substitute for early instructor intervention — the Week 4 pulse exists so most issues are caught and resolved before escalation is needed
Suggested student-facing language (faculty may adapt for syllabus):
“Team projects in this course assume each team member contributes their fair share. If documented evidence (peer evaluations across two checkpoints + independent instructor observation) indicates a teammate substantially under-contributed, the instructor may adjust that individual’s grade on the team project. The Week 4 peer pulse exists to surface issues early — when in doubt, raise concerns through the pulse or directly with the instructor before Week 6.”
Sources (Academic)
- Perkins, M., Furze, L., Roe, J., & MacVaugh, J. (2024). The Artificial Intelligence Assessment Scale (AIAS). Journal of University Teaching and Learning Practice, 21(6). doi:10.53761/q3azde36
- Vendrell, M. & Johnston, S.-K. (2026). Scaffolding Critical Thinking with Generative AI: Design Principles for Integrating Large Language Models in Higher Education. Computers and Education: Artificial Intelligence. doi:10.1016/j.caeai.2026.100572
- Furze, L. (2026). What Curriculum Leaders Need to Know About AI in 2026. Blog post.
- Kosmyna, N. et al. (2025). Students who engaged independently before consulting an LLM produced significantly stronger outputs. (cited in Vendrell & Johnston, 2026)
- British Journal of Educational Technology: GenAI impact on authentic assessment
- PMC: Student perspectives on competency-based portfolios
- Schultz, W., Dayan, P., & Montague, P.R. (1997). A Neural Substrate of Prediction and Reward. Science, 275(5306), 1593–1599. doi:10.1126/science.275.5306.1593
- Fiorillo, C.D., Tobler, P.N., & Schultz, W. (2003). Discrete Coding of Reward Probability and Uncertainty by Dopamine Neurons. Science, 299(5614), 1898–1902. doi:10.1126/science.1077349
- Salimpoor, V.N. et al. (2011). Anatomically Distinct Dopamine Release During Anticipation and Experience of Peak Emotion to Music. Nature Neuroscience, 14, 257–262. doi:10.1038/nn.2726
- Norton, M.I., Mochon, D., & Ariely, D. (2012). The IKEA Effect: When Labor Leads to Love. Journal of Consumer Psychology, 22(3), 453–460.
- Machulla, P. (2026). The Dopamine Gap. Medium.
- METR (2025). Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity. Blog post.
- Vander, A. (2026). Rethinking AI in education through design thinking: Start with the problem. Design for learning. Use AI with purpose. Personal position paper. (Source: shared via Vishal Sachdev, April 2026)
For full source list including university best practices and program examples, see reference/ASSESSMENT_RESEARCH.md.
Document created for MSBAi Program Development - Gies College of Business