MSBAi / K-ai / NanoClaw — Process Architecture Audit

Created: 2026-05-30 · Status: living document · Scope: the processes that move information from stakeholders and Box into the knowledge base and back out as answers. Grounded in the nanoclaw-msbai source, the msba-online repo, and observed incidents. Companion: architecture-email-pipeline.md describes the message plumbing; this document audits the processes and governance on top of it (and flags where that doc is stale).

A. Topology (one sentence)

6 channels → one HTTP server (:3003, path-routed) + Telegram long-poll → NanoClaw orchestration (allowlist, queue, container) → per-group Docker agent (Claude Code) operating on a single shared msba-online git working tree → reply out + audit log + behaviour-sensor. Box feeds the tree via a launchd autosync (Vishal’s Mac); GitHub pushes feed the VPS checkout via a webhook.

Channels actually in use (as of 2026-05-30): Email (most stakeholders) · Telegram (Vishal; sometimes Ron) · Web Chat (a few). The three Teams/Copilot channels are built but dormant — this audit and its governance model are scoped to the three active channels; the dormant ones are not assumed live.

B. Process inventory

| # | Process | Trigger | Mechanism | Enforcement tier | Key gap |
| --- | --- | --- | --- | --- | --- |
| 1 | Ingestion + allowlist | inbound msg | `webhook.ts` path-routes; email/telegram allowlist checked before agent spawn; Box-notification short-circuit (`da98a7b`) ahead of the allowlist | Code (hard) | Allowlist read off the live working tree → race (patched `285d67d`, not rooted) |
| 2 | Orchestration / container | post-allowlist | per-group container; spawn → reuse 30 min via IPC `--resume` → fresh | Code | All 5 channel groups mount the same repo rw → cross-group concurrent writers |
| 3 | Agent processing | container msg | Modes: Question (read-only) / Status-update (commit) / Outreach | CLAUDE.md (soft) | Mode classification + which file to cite is model judgment |
| 4 | KB write governance | DECISIONS write | `kb-conflict-check` skill (skeptical subagent → commit or route to `kb-triage/`); provenance line; `lore:intentional` override | Skill gate (semi-hard) for DECISIONS; conventions for ACTION_ITEMS / OPEN_QUESTIONS | Only DECISIONS gated; other files free-form |
| 5 | Source-of-truth + propagation | SoT edit | `program/curriculum.md` = authority; propagate Type-A downstream per `reference/DEPENDENCY_GRAPH.md`; guardrail: agent “NEVER edits curriculum.md → flag in reply” | CLAUDE.md (soft) | Guardrail leaks (F1); propagation is flag-and-pray (F4) |
| 6 | Box → KB sync | launchd 4×/day (Mac) | `scripts/box-autosync.py`: rclone read → openpyxl → regenerate `courses/<code>/sync/*.md` → commit-if-md-changed → push | Deterministic script (hard) | Mac-only; Course Map flag-only; rclone token expiry |
| 7 | github-push webhook | GitHub push | `/api/github-push` HMAC → `git pull --ff-only` + chown → synthesize audit entry for non-bot commits | Code (hard) | Pulls the live checkout (race) |
| 8 | Audit logging | after container exits | host-side append to `discussions/audit-log/YYYY-MM-DD.md`, 5-min commit batch | Code (hard) | Writes into the shared tree (race) |
| 9 | Behaviour sensor | after each outbound | Haiku checks reply vs `curriculum.md` + Confirmed `DECISIONS.md` + citation existence → `[sensor: pass/flag/fail]` in the audit log; FAIL → `kb-triage/` | Code (hard) | Blind to `sync/` rosters (F3) |
| 10 | Concurrency control | none | per-group serialization (assumed, unverified; Phase 0 of the concurrency plan) | unknown | No mutation gate across writers |
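
The HMAC step in process 7 can be illustrated with a minimal sketch. It assumes the endpoint follows GitHub's standard `X-Hub-Signature-256` scheme (`sha256=` plus the hex HMAC-SHA256 of the raw request body); the function name is hypothetical and this is not the code in `webhook.ts`, which is TypeScript.

```python
import hashlib
import hmac


def verify_github_signature(secret: bytes, body: bytes, header: str) -> bool:
    """Return True when `header` is the valid signature for this exact body.

    Assumes GitHub's documented format: "sha256=<hex HMAC-SHA256 of body>".
    """
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, so a mismatch does not leak
    # how many leading characters were correct.
    return hmac.compare_digest(expected.encode("utf-8"), header.encode("utf-8"))
```

The signature must be computed over the raw bytes as received; re-serializing parsed JSON before hashing would change the digest and reject valid deliveries.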

C. Cross-cutting findings

F1 — Soft guardrails leak; the source of truth has no hard gate

Every channel prompt says “NEVER modify program/CURRICULUM.md.” The git log says otherwise — msbai-bot@illinihunt.org (the container agent, “K-ai”) has edited it repeatedly (e.g. 7173ff0 STEM designation, 2026-05-30; 401ffb7 quantum name, 177cd2e catalog name). The guardrail lives only in CLAUDE.md — the weakest enforcement tier (deterministic hooks > Claude hooks > CLAUDE.md). There is no hard mechanism (pre-commit hook, read-only mount, author gate) stopping the container from writing the source of truth. Consequence: “who is allowed to change the source of truth, and how” has no reliable answer — both the intended path (flag → human applies) and the forbidden path (agent self-edits) are simultaneously live.

F2 — Source-of-truth fragmentation → wrong answers (the “8 recorded” incident, 2026-05-30)

The recording count for BADM 554 existed in ACTION_ITEMS.md, curriculum.md, the narrative course file, and the new sync/ roster — with no precedence. K-ai cited the stale ACTION_ITEMS.md “8 (as of 5/27)” over the fresh roster’s “13”. Patched (de-staled the item to defer to the roster; added a sync breadcrumb to all 5 channel prompts), but the pattern recurs anywhere a fact is duplicated. The autosync did not cause this — it exposed it by adding a fourth, fresher authority the others didn’t defer to.

F3 — The behaviour-sensor is blind to the freshest source of truth

behaviour-sensor.ts grounds on curriculum.md (primary) + Confirmed DECISIONS.md (recency override) + citation existence. It does not read courses/<code>/sync/*.md. So a correct answer sourced from today’s autosync’d roster gets a FLAG (“cannot verify”) at best and a FAIL (“contradicts stale curriculum”) at worst; this was observed on the BADM 554 question. The QA layer cannot see the data layer the autosync just made authoritative. Highest-value fix: add sync/ rosters to the sensor’s ground truth, with roster-match downgrading a curriculum contradiction to propagation-lag (same treatment DECISIONS already gets).
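
The proposed downgrade can be sketched as a small decision function for one production fact such as a recording count. This is a sketch under assumptions, not the logic in behaviour-sensor.ts (which is TypeScript): the function name, the string-equality comparison, and the `propagation-lag` verdict label are illustrative.

```python
from typing import Optional


def sensor_verdict(claim: str, curriculum: Optional[str], roster: Optional[str]) -> str:
    """Classify one factual claim taken from a reply.

    `curriculum` and `roster` hold each source's value for the same fact,
    or None when that source says nothing about it.
    """
    if roster is not None:
        if claim != roster:
            return "fail"             # contradicts the synced roster
        if curriculum is not None and curriculum != claim:
            return "propagation-lag"  # roster agrees; curriculum merely trails
        return "pass"
    if curriculum is None:
        return "flag"                 # nothing to verify against
    return "pass" if claim == curriculum else "fail"
```

With illustrative values, a reply claiming "13" against a roster value of "13" and a lagging curriculum value of "8" yields `propagation-lag` instead of a FAIL, while a reply repeating the stale "8" still fails.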

F4 — Propagation is flag-and-pray

No mechanism guarantees a flagged curriculum.md change is ever applied. Curriculum drifts → sensor (F3) false-flags correct replies → audit noise. Self-reinforcing with F1.

F5 — Shared mutable working tree (concurrency)

/root/repos/msba-online is read by the host, mounted read-write by N per-channel containers, and updated in place by the webhook’s git pull --ff-only, with no gate between them. The allowlist bug (285d67d) was one symptom; the race class is intact. The Codex design (nanoclaw-msbai/.claude/plans/repo-concurrency-decoupling.md) is sound, but Phase 0 (verify group-queue.ts serialization) is not done, so current write-race exposure is unknown. A live instance recurred during this audit (push rejected, 12 commits behind, rebase required).

Observed live and root-caused, 2026-05-30: the VPS msba-online checkout was found diverged from origin, with 2 local audit-log commits that the host had written but never pushed, and 3 commits behind. Root cause: commitAuditLog() in audit-log.ts did a bare git push with no rebase-on-reject, so whenever origin moved (e.g. a push from the maintainer’s laptop) the audit commit was stranded locally. The github-push webhook’s git pull --ff-only then kept failing silently and the VPS drifted. Access control survived (the allowlist was already present), but the audit trail stopped reaching GitHub and K-ai went briefly stale on docs.

Fixed in audit-log.ts: the push step now runs git push || (git pull --rebase --autostash origin main && git push) || (git rebase --abort), which is rebase-on-reject and conflict-safe, with the commit retained for retry. This is the cheap version of the concurrency plan’s “audit-log batch fails on divergence” failure mode; the durable fix remains Phase 1 (pinned-ref reads) + Phase 2 (out-of-tree audit spool).
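
The control flow of that rebase-on-reject chain can be sketched as follows. This is an illustrative Python rendering of the shell chain quoted above, not the code in audit-log.ts; the function names and the injectable `run` parameter are assumptions added so the flow can be exercised without a real repository.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def run_git(args: Sequence[str]) -> int:
    """Run one git command in the current directory and return its exit code."""
    return subprocess.run(["git", *args]).returncode


def push_with_rebase(run: Runner = run_git) -> bool:
    """Mirror: git push || (git pull --rebase --autostash origin main && git push) || git rebase --abort."""
    if run(["push"]) == 0:
        return True
    # Push rejected: replay the local commits on top of origin, then retry.
    if run(["pull", "--rebase", "--autostash", "origin", "main"]) == 0 and run(["push"]) == 0:
        return True
    # Leave the tree clean; the local commit survives for the next batch.
    # If no rebase is in progress this is a harmless no-op error.
    run(["rebase", "--abort"])
    return False
```

Returning a boolean rather than raising lets the caller log the stranded commit and try again on the next batch, which matches the "commit retained for retry" behaviour described above.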

F6 — The architecture doc is stale

architecture-email-pipeline.md lists the audit log + behaviour-sensor as “planned, not yet implemented” — both are live and central. Onboarding from that doc yields a wrong mental model.

F7 — Operational single points / expiring clocks

Autosync runs only on Vishal’s Mac (launchd catches up after sleep; a powered-off Mac = no sync). rclone OAuth token can expire.
TEAMS_APP_SECRET expires 2026-09-11.
Sensor + agent + autosync all depend on the single ANTHROPIC_API_KEY / Box token; no spend kill-switch or health check.

D. Prioritized recommendations

| Pri | Action | Why | Effort |
| --- | --- | --- | --- |
| P1 | Add `courses/<code>/sync/` rosters to the behaviour-sensor’s ground truth (F3) | QA can’t see the authoritative data; false-flags correct answers | Short |
| P1 | Decide F1: hard-enforce the curriculum guardrail or relax it to match reality | “who edits the SoT” is undefined | Decision + Short |
| P2 | Concurrency plan Phase 0 (verify `group-queue.ts`) → Phase 1 | Unknown write-race exposure; reads still race | Quick → Short |
| P2 | SoT precedence policy: “for X, `sync/` wins; ACTION_ITEMS/curriculum reference, never duplicate” (F2) | Prevent the next 8-vs-13 | Short (partly done) |
| P3 | Refresh `architecture-email-pipeline.md` (audit log + sensor are live) (F6) | Doc integrity | Quick |
| P3 | Calendar `TEAMS_APP_SECRET` 9/11 expiry; add Box-token + autosync health check (F7) | Silent failures | Quick |

E. Synthesis

The system is well-built at the plumbing layer (ingestion, channels, webhook, sync) but has a governance/consistency gap at the knowledge layer: multiple un-prioritized sources of truth, a soft guardrail that doesn’t hold, and a QA sensor that can’t see the freshest data. F1, F2, F3 are one connected problem — there is no authoritative answer to “what is true, and who gets to change it.”

F. First-principles analysis from the system’s own history

(Evidence mined from git history + audit logs + kb-triage, 2026-05-30.)

One root fault, expressed five ways: there is no single, authoritative, concurrency-safe source of truth. program/curriculum.md is simultaneously (a) the sensor’s “primary ground truth,” (b) a file the agent is forbidden to edit, (c) a file the sensor’s own code calls known-stale by design (behaviour-sensor.ts:53: “CURRICULUM.md routinely lags a logged decision”), and (d) a file mutated by git pull on a live working tree with no lock. Every finding traces back to this.

F-hist-1 · The sensor fails more often than it passes. Across discussions/audit-log/*.md: ~113 pass / ~127 flag / ~146 fail (roughly 29% / 33% / 38% of ~386 verdicts). The dominant fail class is stale-source false-positives: the sensor checks against curriculum.md, which is stale by design. Example: Practicum drives 13 fail/flag lines where the agent correctly says “4 cr, confirmed” but curriculum.md still says Draft. “Cannot be verified against ground truth” appears ~24×. Course clusters in flags: Quantum 61, BADM 557 26, Practicum 24, FIN 550 23, BADM 554 22. The two biggest (Quantum, Practicum) are driven by stale/contradictory source docs, not agent error.

F-hist-2 · The guardrail demonstrably does not hold — and is self-contradictory. git log program/curriculum.md = 23 human edits vs 7 bot edits (msbai-bot@illinihunt.org), every bot edit a guardrail violation. The smoking gun: 7173ff0 (2026-05-30, “Document STEM designation”) — commit message says it was made to “prevent future K-ai responses defaulting to ‘unconfirmed’… Source: sensor flag 2026-05-29.” The bot edited the file it’s forbidden to edit, specifically to satisfy the sensor. The system requires curriculum.md to be fresh (or the sensor false-flags) but forbids the only always-on actor from freshening it. The guardrail is aspirational; the bot violates it or the file rots.

F-hist-3 · “Single source of truth” has eroded into a drifting multi-file mirror.

Two physical curriculum files: program/curriculum.md (tracked) and program/CURRICULUM.md (untracked working-tree duplicate, macOS case-insensitive artifact). The sensor’s loadCurriculum() tries both names — only one is versioned.
In-file self-contradiction: courses/practicum.md:14 says “Weeks 9-16”; :87 says “Weeks 5-8” (a contradiction the sensor flagged 2026-05-04).
Recording counts lived in 3 tiers (curriculum.md, ACTION_ITEMS.md, sync roster), two of which the docs themselves now annotate as stale — patched today, but symptomatic.
The “18-month” figure (pre-restructure) still lingers in ~8 files after the 02-09 move to 15 months — partial propagation, drifted.
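
The first item in this list (two curriculum files differing only by case) is the kind of drift a one-off check can catch. A minimal sketch, assuming the caller supplies a combined list of tracked and untracked paths; the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def case_collisions(paths: Iterable[str]) -> List[List[str]]:
    """Group paths that are distinct strings but equal when case is ignored."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for path in paths:
        groups[path.casefold()].add(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

For example, `case_collisions(["program/curriculum.md", "program/CURRICULUM.md", "courses/practicum.md"])` reports the two curriculum paths as a single colliding group.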

F-hist-4 · kb-triage shows un-retired supersession chains. Of 28 triage files, 26 are sensor response-error dumps; the 2 genuine conflict files (2026-05-02-conflicts, 2026-05-19-quantum-names) show the same fact (Quantum names, prep deadlines) re-decided 3–4× with no mechanism to retire the superseded DECISIONS entry → recurring conflicts.

F-hist-5 · The concurrency root cause is fully present, not just patched. 285d67d only made the email allowlist reader tolerate a failed read. webhook.ts:118-135 (pullRepo) still runs git pull --ff-only + chown -R on the live tree every push; grep of src/ for flock|mutex|lockfile|worktree|mirror → zero hits. The sensor, Telegram allowlist, and container reads retain the identical race. The Phase-1 fix is designed but unimplemented.

G. External best practices (researched + mapped, with citations)

G1 · Single source of truth (docs-as-code). Store each fact once; reference/transclude, never copy — copies are what drift (DRY: “one unambiguous authoritative representation”). Plain markdown has no live transclusion, so back it with a CI consistency check that greps duplicated facts and fails when copies disagree — turning drift into a build failure. — DRY; Transclusion/SSOT, Paligo SSOT. Map: make curriculum.md canonical per fact; other docs link, don’t restate; add a grep-test for the handful of duplicated numbers (credits/months/courses).
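
A grep-test of this kind can be sketched for one duplicated number, the program length in months. The regex and function names are assumptions for illustration, and the pattern is deliberately crude: it matches any "N-month" or "N months" phrase, so a real check would need to be scoped to the specific fact.

```python
import re
from typing import Dict, List, Set

MONTHS = re.compile(r"\b(\d+)[- ]months?\b", re.IGNORECASE)


def month_conflicts(docs: Dict[str, str]) -> List[str]:
    """Return one line per distinct month value when the docs disagree.

    `docs` maps a file name to its text. An empty list means the docs are
    consistent (zero or one distinct value found).
    """
    found: Dict[str, Set[str]] = {}
    for name, text in docs.items():
        for value in MONTHS.findall(text):
            found.setdefault(value, set()).add(name)
    if len(found) <= 1:
        return []
    return [f"{value} months: {', '.join(sorted(files))}"
            for value, files in sorted(found.items())]
```

Run in CI with a non-zero exit when the list is non-empty, this turns a lingering "18-month" mention next to the current 15-month figure into a build failure that names the offending files.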

G2 · Grounding the QA validator. Staleness is the silent failure mode — a confident answer off an old snapshot with no signal. Fixes: incremental re-index on change (content-hash change detector); freshness-aware precedence (rank on relevance + recency so the newest authoritative file wins); groundedness judging sampled against a ground-truth set. — RAG freshness rot; Solving Freshness in RAG (arXiv 2509.19376); Databricks groundedness judge. Map: the sensor must read the same committed ref the agent did, and treat sync/ rosters + recent DECISIONS as higher-precedence than curriculum.md — directly fixes F3. At a few msgs/day, a content-hash check + pinned-ref read suffices; a vector RAG stack is over-engineering.

G3 · Generated vs human-authored content. Mark generated files with a DO NOT EDIT — generated by … header; make generators deterministic (no timestamps, stable ordering) so regen doesn’t churn; golden-test committed-vs-generated in CI = “commit only on real change.” — Go “DO NOT EDIT” convention; deterministic codegen + golden tests. Map: box-autosync.py already commits-only-on-change and stamps “auto-generated”; add an explicit DO-NOT-EDIT header to sync/*.md and keep generation deterministic.
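
The three properties named here (marker header, deterministic output, write only on real change) can be sketched together. This is not the code in box-autosync.py: the header wording, the two-column roster shape, and both function names are assumptions.

```python
from pathlib import Path
from typing import Dict

# Hypothetical header wording; an HTML comment stays invisible when rendered.
HEADER = "<!-- DO NOT EDIT: generated by scripts/box-autosync.py -->"


def render_roster(items: Dict[str, str]) -> str:
    """Render item -> status as markdown with stable ordering and no timestamp."""
    lines = [HEADER, "", "| Item | Status |", "| --- | --- |"]
    for name in sorted(items):
        lines.append(f"| {name} | {items[name]} |")
    return "\n".join(lines) + "\n"


def write_if_changed(path: Path, content: str) -> bool:
    """Write only when the bytes differ; the return value says whether to commit."""
    if path.exists() and path.read_text(encoding="utf-8") == content:
        return False
    path.write_text(content, encoding="utf-8")
    return True
```

Because the output depends only on the input data, two runs over an unchanged spreadsheet produce identical files and no commit, which keeps the git history free of regeneration churn.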

G4 · Concurrency on a shared git tree. Git is a single-process tool; concurrent ops on one object DB/index can corrupt it, and a stale index.lock freezes everything. So: serialize writers; read immutable committed blobs via git show <ref>:<path> / cat-file (unaffected by an in-flight pull); keep an append-only event spool out of the contended tree. — git single-process lock contention; git-cat-file, Pro Git: immutable objects. Map: this is exactly the repo-concurrency-decoupling.md plan — validates Phase 0 (serialize) + Phase 1 (pinned-ref reads) + Phase 2 (out-of-tree audit spool).

G5 · Protecting a SoT file from an autonomous agent. Prompt/markdown guardrails are the weakest layer. Defense-in-depth, strongest first: read-only mount / file perms (agent physically can’t write) → deterministic pre-commit/PreToolUse hook (fires every time, can’t be --no-verify-bypassed, agent can’t edit the hook) → CI/branch protection server-side → CODEOWNERS on the protected file and the guardrail files. Governance rule: if the agent can edit the rules that evaluate it, you have a governance problem. — Making it hard to cheat the guardrails; Claude Code hooks. Map: the “flag, don’t edit” policy is the right soft layer; back it with a read-only mount of curriculum.md into the container (highest-leverage single control at this scale). Full CODEOWNERS+CI is enterprise-grade — likely over-engineering here.
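
The deterministic path block can be reduced to a pure check that a hook would call. This is a sketch under assumptions: the function name is hypothetical, the protected set holds only the one file named in this audit, and a real hook would obtain the staged paths from `git diff --cached --name-only`.

```python
from typing import Iterable, List

# Lower-cased so both curriculum.md and CURRICULUM.md are caught.
PROTECTED = {"program/curriculum.md"}
BOT_AUTHOR = "msbai-bot@illinihunt.org"


def blocked_paths(author_email: str, staged: Iterable[str]) -> List[str]:
    """Return the staged paths the bot author is not allowed to commit."""
    if author_email.strip().lower() != BOT_AUTHOR:
        return []
    return sorted(p for p in staged if p.casefold() in PROTECTED)
```

A hook would exit non-zero whenever the returned list is non-empty. Comparing case-insensitively matters here because the repository has been seen with both spellings of the file name.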

H. Path forward

(Sequence below is the Codex-Architect-reviewed revision, 2026-05-30. Codex verdict: REVISE — the directional reframe “fix the sensor before hard-locking curriculum.md” is right, but two claims were corrected; see notes.)

The reframe still holds: the sensor’s stale ground truth (F3) is what makes the guardrail unenforceable (F1) — the bot edits curriculum.md only because leaving it stale false-flags the sensor. But two corrections from the Codex review:

“Freshest” ≠ “authoritative.” Precedence must be field/category-specific, not blanket “newest wins.” sync/ rosters are authoritative for course-production facts (recording counts, video lengths, item status); curriculum.md / DECISIONS.md remain authoritative for program policy (credits, sequence, faculty). The sensor needs a precedence map, not a precedence order.
The pinned-ref change is not a silver bullet. It fixes read-side freshness + the read race — it does not stop concurrent writers (two containers, audit flush, autosync, webhook) or settle authority policy. Those are separate concerns; don’t conflate them.

Sequenced accordingly (separating what is authoritative / how reads are stable / how writes are serialized):

1. Define fact ownership by category (Quick, F2). One short table stating which file owns which fact class: curriculum.md (program policy), DECISIONS.md (confirmed decisions), sync/ rosters (production facts), ACTION_ITEMS.md (open work, no hard facts). This is the cheap policy fix that prevents the next 8-vs-13 and feeds steps 3-4.
2. Pinned committed reads for host-side readers (Short, F5/G4). Serve the sensor + allowlists from git show <ref>:<path> (last-good SHA on fetch failure). Kills the read-race class (incl. the allowlist bug at root); retires the reconcileAllowlistReload band-aid.
3. Sensor reads sync/ rosters with field-specific precedence (Short, P1, F3/G2). Per the ownership map from step 1. Regression-test against the BADM 554 “13 recorded” case (today’s false-flag must turn green).
4. Hard-protect curriculum.md (Quick/Short, P1, F1/G5), but only AFTER step 3’s false positives pass. Read-only mount into the container (or a pre-commit path block). Safe only once the sensor no longer needs the bot to keep curriculum fresh.
5. Clean obvious drift (Quick, F-hist-3/G1). Delete the untracked program/CURRICULUM.md duplicate; fix the in-file practicum Weeks contradiction; finish propagating the 15-month figure; add DO-NOT-EDIT headers to sync/*.md; add a CI grep-test for the few cross-file duplicated numbers.
6. Write gate / audit spool (Medium), only if write collisions actually manifest. Do not build first (concurrency plan Phase 2). Verify group-queue.ts serialization (Phase 0) before assuming you need it.

The leverage is steps 1→3: a category ownership map + the sensor reading the right source per field. That dissolves the F1/F3 conflict and stops the false-flags — without new infrastructure. Steps 4-6 are downstream and partly conditional.
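
The ownership map from step 1 is small enough to express as data. In this minimal sketch the fact-class keys and the source labels are hypothetical names chosen for illustration; only the file-to-category pairing comes from the step above.

```python
from typing import Dict, Optional

# Fact class -> the single source that owns it. ACTION_ITEMS.md is absent
# on purpose: it tracks open work and owns no hard facts.
OWNERSHIP: Dict[str, str] = {
    "program-policy": "curriculum.md",
    "confirmed-decision": "DECISIONS.md",
    "production-fact": "sync roster",
}


def resolve(fact_class: str, values_by_source: Dict[str, str]) -> Optional[str]:
    """Return the owning source's value for a fact, ignoring all other copies."""
    owner = OWNERSHIP.get(fact_class)
    if owner is None:
        return None
    return values_by_source.get(owner)
```

For a recording count, `resolve("production-fact", {"ACTION_ITEMS.md": "8", "sync roster": "13"})` returns `"13"`: the stale copy is never consulted, so recency does not have to be compared at all.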

I. KB / memory options (does NanoClaw offer an alternative substrate?)

(From the 2026-05-30 ecosystem exploration. Verdict: keep-and-augment — no off-the-shelf plugin fits.)

The NanoClaw framework ships no RAG / vector / memory infrastructure — its memory model is git-markdown-per-agent-group, which is exactly what this system runs. Options surveyed:

| Option | What it is | Fit |
| --- | --- | --- |
| `add-karpathy-llm-wiki` (in-repo, uninstalled) | 3-layer wiki: immutable raw sources / LLM-maintained interlinked markdown (`index.md` + append-only `log.md`) / a CLAUDE.md “schema” making the agent a disciplined maintainer. Ops: ingest / query / lint. Optional `qmd` hybrid-search CLI only at scale. | Best fit: partial-adopt the discipline (index + log + lint + synthesis), not a full separate wiki layer. Its lint op directly attacks F2/F3/F4 (contradictions, stale claims, orphan pages); `index.md` is a lightweight retrieval layer; synthesis-into-pages fights fragmentation. |
| In-repo skills `knowledge-builder` / `kb-conflict-check` / `quality-review` / `self-eval` | Homegrown governance pipeline on git-markdown. `kb-conflict-check` is keyword-only over one file (DECISIONS.md). | Already in use; the keyword-only single-file scope is itself part of the F3 root. Strengthen these rather than replace. |
| claude-mem (installed) | Real hybrid search (Chroma + SQLite FTS5), but over Claude Code dev-session transcripts, not the curriculum repo; the container can’t reach it at answer time. | Wrong corpus. Keep for dev recall only (per the memory-architecture rules), never the curriculum KB. |
| `add-mnemon` (upstream) | Graph memory in-container, recall-before/write-after hooks (Claude-Code-only). | Conversational, per-group-local; not a curriculum KB and doesn’t unify source-of-truth. |

Recommendation (Codex + exploration agree): keep git-markdown; partial-adopt the Karpathy index+log+lint+synthesis discipline by strengthening knowledge-builder + kb-conflict-check (let knowledge-builder maintain synthesis pages; add a lightweight lint pass). Reserve qmd local hybrid search as the retrieval/freshness upgrade for the behaviour-sensor (G2) only when keyword overlap visibly stops sufficing. A vector DB / dedicated service is over-engineering at a few messages/day — the scale needs cleaner ownership rules, not new infrastructure.

J. Authority & gatekeeping model (partially shipped — concrete F1 resolution)

(Grounded in mined usage, 2026-05-30: DECISIONS.md provenance + audit-log inbound volume, cross-checked against the stated role split.)

Status — shipped 2026-05-30: the access-tier model’s first phase is live. (1) The allowlist was narrowed to the 13-person pilot group (11 read-write + Ocasio/Love read-only); the other 13 were off-boarded. (2) Blocked @illinois.edu senders now receive a one-time courteous “limited pilot” reply instead of a silent drop (nanoclaw c04ebb1, codex-reviewed; verified end-to-end). (3) Observer read-only (Ocasio/Love) is enforced softly via the email-group prompt. Still future: hard read-only enforcement (the Access-column write-block), and the J.3 two-stage PR gate on the source of truth.

Strategic context. MSBAi’s coordination through K-ai is a living embodiment of the Gies AI governance/orchestration strategy and is under active research observation (W. Ocasio, G. Love are study observers). The gatekeeping model below is therefore not just an ops control — it is itself a study artifact, which raises the bar on getting authority, provenance, and auditability right.

Canonical roster, titles, access tiers, roles, and gate routing now live in program/EMAIL_ALLOWLIST.md (single source of truth — 2026-05-30 consolidation). This section keeps only the rationale, evidence, and mechanism; it does not restate the per-person data.

J.1 Access tiers (definitions)

read-write — query + routine own-lane writes + may propose source-of-truth changes (gated, J.3).
read-only — query only; never trigger a KB write or approve. Used for the research observers (Ocasio, Love); enforced softly today (an email-group prompt instruction), pending the hard Access-column write-block.

Senior staff with no defined K-ai role are simply off the pilot allowlist; if they email from @illinois.edu they get the one-time limited-pilot reply (J status block).

J.2 Authority map — rationale (the canonical per-person mapping is the allowlist `Role / domain` column)

Why the domain owners are who they are, from mined usage:

Content / academic → Vishal (“leads on course content and academic vision”; ~11 decisions).
Structure → Maria (her stated remit; emerging — 22 inbound, 1 decision so far).
Governance → Ravi (“takes decisions on some things”; 5 inbound, routed through Vishal/Amber).
Operations / KB-feeding → Amber (the workhorse — 114 inbound, ~18 decisions).
Own-lane domains → Heather (onboarding), Kacie (admissions), Lindsey (marketing), Emily (residential), course faculty (own course) — write in their lane, read-only on academic/structural SoT (marketing/admissions must reference the academic SoT, not restate it).
Upstream structure source → Learning Design (Cheng Li / Eric French), laddering up through Jason Mock (T&L) — they author the canonical Box Course Maps + rosters the autosync pulls. (Not on the pilot allowlist yet; onboard via Jason.)

J.3 Two-stage gate (maker-checker on the source of truth)

(Routing table is canonical in EMAIL_ALLOWLIST.md → “Gatekeeping routing”; mechanism + rationale below.)

Scope: only program/curriculum.md + discussions/DECISIONS.md. Everything else (ACTION_ITEMS, course-file edits, Box autosync) commits directly — keeps Amber’s high-volume ops firehose frictionless.
Mechanism: the agent opens a PR for a SoT change, tags it content/structure (from the DECISIONS Category it already assigns), and a validation email routes to the domain approver; merge = land (reuses git + the github-push webhook → VPS). Open PRs are the tracked queue (no flag-and-pray). This is the “validation email to someone else,” routed by domain.
Routing (maker ≠ checker): content → Vishal; structure → Maria (Vishal fallback while she ramps); program-governance → Ravi (escalation, not per-write) / Vishal on his behalf. Amber and the bot are the dominant makers → their SoT proposals route to the domain approver above; they never self-approve.
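
The maker ≠ checker rule can be sketched as a lookup with a fallback. The canonical routing lives in EMAIL_ALLOWLIST.md; the table below only restates the content and structure routes from this section, omits program-governance because that route is escalation-based rather than per-write, and uses a hypothetical function name.

```python
from typing import Dict, List, Optional

# Approvers in preference order; later entries are fallbacks.
ROUTING: Dict[str, List[str]] = {
    "content": ["Vishal"],
    "structure": ["Maria", "Vishal"],
}


def pick_approver(category: str, maker: str) -> Optional[str]:
    """Return the first listed approver who is not the person proposing the change."""
    for candidate in ROUTING.get(category, []):
        if candidate != maker:
            return candidate
    return None  # no eligible checker: needs a manual decision
```

A `None` result is the interesting case: a content change proposed by Vishal has no listed checker, so the sketch surfaces that gap instead of letting the maker approve their own change.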

J.4 Why this resolves F1 (supersedes the read-only-mount in H step 4)

A read-only mount merely blocks the bot. The gate is better: the bot (and any human) can still propose good SoT changes — e.g. the 7173ff0 STEM-designation edit — but nothing lands without a domain owner’s merge. It fixes the human accidental-write hole and the bot guardrail-leak with one deterministic, tracked control, while preserving the agent’s usefulness.

J.5 Open items

Maria phase-in: start structural approvals as Maria-with-Vishal-fallback; promote to required as her remit solidifies (today: 1 decision).
✅ Allowlist tiering + roster consolidation (done 2026-05-30): EMAIL_ALLOWLIST.md now carries Access + Role / domain columns + the gatekeeping routing, and is the single source of truth. Tiers/roles/gate are documentation until hard enforcement is built.
Ravi as escalation, not routine checker (he is a dean, 5 inbound) — mirror how the org already routes his decisions through Vishal/Amber.
Lisa Marinelli + Amanda Brantner → read-only (no defined K-ai update or escalation role) — tag in the allowlist.
Nathan Yang (Quantum Cognition faculty, ncyang@illinois.edu) authored a DECISIONS entry but is not on the allowlist — add him (write-capable, own course) if he should use K-ai directly.
⚠️ Allowlist discrepancy: Cheng Li (chengli8) + Eric French (frenchem) are already present in program/EMAIL_ALLOWLIST.md even though they are considered not yet authorized. As-is, K-ai will process their email now. Reconcile: either remove them until the LD team is intentionally onboarded, or confirm authorization (they report to Jason Mock; their write-authority ladders through him).
