Claude Code transcript - page 95/114

Index ← Prev 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 Next →

User2026-06-30T00:30:49.533Z

Correct and this where our adaptive engine and all of that logic will continue working only with llm off And the learnerDO agent and coach agent also will keep recording the data points and stuff (May need to have a look again based on how this will affect the engines working and personalization and dynamic adaptation)

Assistant2026-06-30T00:31:26.186Z

Thinking

Assistant2026-06-30T00:31:29.039Z

Right — and that's the crucial implication: the adaptive engine, personalization, and dynamic adaptation all run on LLM-off, because they work by deterministically selecting the path over the recorded data points (lesson completion, attempts, scores, choices, posture, workflow fingerprint) — the learnerDO and coach agent keep capturing those signals whether the LLM is on or not. The personalization is selector-driven over data, not LLM inference.

Your "look again" flag is well-placed — let me capture the doctrine and dispatch a focused review of exactly how LLM-off affects the engines, rather than assume it's fine:

Assistant2026-06-30T00:31:47.395Z

$ Bash

Log adaptive-LLM-off doctrine + dispatch the review lane

cd /Users/Ali/PKM/scratch/pkm-deepdive
cat > adaptive-llm-off.txt <<'EOF'

---
## ADAPTIVE ENGINE WORKS LLM-OFF (Ali 2026-06-29)
- The ADAPTIVE ENGINE + PERSONALIZATION + DYNAMIC ADAPTATION work with LLM OFF — they personalize + adapt by SELECTING content/path DETERMINISTICALLY over RECORDED learner data points, NOT live LLM inference. This IS the deterministic "selector" tier.
- **learnerDO (Durable Object) + coach agent keep RECORDING data points with LLM off**: lesson completion, attempts, scores, time, choices, posture, mindset slot, workflow fingerprint, learning-events. Data capture + adaptation continue regardless of LLM state.
- Live LLM is NOT required for personalization/adaptation — only the OPTIONAL arena-mission toggle uses it.
- **FLAGGED REVIEW (Ali "may need to look again"):** verify how the LLM-off architecture affects the engine's working + personalization + dynamic adaptation — confirm the engines run deterministically over the recorded signals; flag any logic that wrongly assumes/requires a live LLM. Review lane dispatched (adaptive-engine-llm-off-review.md).
EOF
base64 < adaptive-llm-off.txt | ./vds-exec.sh runl '
  T=$(base64 -d); echo "$T" >> /srv/aina/ops/FACTORY-GUARDRAILS.md; echo CANON_OK'
cat > lane-adaptive-llm-off-review.md <<'EOF'
# Lane: Review adaptive engine / personalization / dynamic adaptation under LLM-OFF (Ali "look again")
## AUTONOMY: ANALYZE + REPORT (do NOT rebuild); honest; gpt-5.5. Repo /srv/repos/aina-academy (+ the personalization engine repos).
## Context (Ali 2026-06-29): the adaptive engine + personalization + dynamic adaptation must work with LLM OFF (deterministic selector over recorded data points). learnerDO (Durable Object) + coach agent keep RECORDING data points with LLM off. Canon: deterministic lessons + simulated arena; live LLM optional/arena-only. Look again at how LLM-off affects the engines.
## Do (analysis)
1. Map the adaptive/personalization stack as built: the personalization substrate/engine (AIN-96/97/98), the **learnerDO** durable object (what data points it records + when), the **coach agent** (what it records + does), the adaptation/selector logic (postures, mindset slots, tool variants, workflow fingerprint, the 3 tiers, the diagnose→teach→practise→evaluate→improve→progress loop), and the learning-event/telemetry pipeline.
2. For EACH, determine: does it work LLM-OFF? Personalization + dynamic adaptation should run by DETERMINISTICALLY selecting content/path from the RECORDED data points (completion, attempts, scores, time, choices, posture, mindset, fingerprint) — NOT live LLM inference. Confirm learnerDO + coach keep recording with LLM off.
3. FLAG any place that wrongly assumes/requires a live LLM for personalization/adaptation (e.g. an evaluator or adaptation step that calls the LLM at runtime) and how to make it deterministic (selector + pre-authored + the recorded signals). Note: the evaluator scoring path is canon selector-only (First-Principles doc); generation only assists building calibration, never live scoring.
4. Confirm the two-tier framing holds: deterministic selector (lessons + adaptation, LLM-off) + gated generation (arena missions, optional LLM toggle, sim-first).
## Deliver: report /srv/aina/ops/adaptive-engine-llm-off-review.md — per-engine LLM-off readiness, the data points learnerDO+coach record, what works deterministically vs what needs rework (with the fix), and whether the engine's personalization/dynamic-adaptation is preserved LLM-off. Honest. gpt-5.5. (Analysis only — route fixes after.)
EOF
base64 < lane-adaptive-llm-off-review.md | ./vds-exec.sh runl '
  base64 -d > /srv/aina/ops/lanes/lane-adaptive-llm-off-review.md
  export PATH=$PATH:/usr/local/bin
  C5="exec --json --dangerously-bypass-approvals-and-sandbox --model gpt-5.5 -c model_reasoning_effort=high --skip-git-repo-check -"
  cd /srv/repos/aina-academy
  nohup bash -c "cat /srv/aina/ops/lanes/lane-adaptive-llm-off-review.md | codex $C5" > /srv/aina/ops/codex-adaptive-review.log 2>&1 &
  echo "adaptive-llm-off-review lane pid=$!"'

Tool reply2026-06-30T00:31:48.357Z

CANON_OK
adaptive-llm-off-review lane pid=2630808

Assistant2026-06-30T00:31:57.216Z

Thinking

Assistant2026-06-30T00:31:57.220Z

Review lane is running (pid 2630808). Recording the doctrine to memory:

Assistant2026-06-30T00:32:03.647Z

✏️ Edit project_aina_factory_build_2026_06_29.md

/Users/Ali/.claude/projects/-Users-Ali-PKM/memory/project_aina_factory_build_2026_06_29.md

−

Lesson-shell port (pid 2571179) still building.

Lesson-shell port (pid 2571179) still building. **ADAPTIVE ENGINE WORKS LLM-OFF (Ali 2026-06-29):** adaptive engine + personalization + dynamic adaptation run with LLM OFF — deterministic SELECT over RECORDED data points (completion/attempts/scores/time/choices/posture/mindset/workflow-fingerprint), NOT live LLM. **learnerDO (DO) + coach agent keep RECORDING data points LLM-off.** Live LLM not required for personalization (only optional arena-mission toggle). Ali flagged "look again" → review lane ali (pid 2630808, gpt-5.5, report adaptive-engine-llm-off-review.md) analyzing how LLM-off affects each engine + flagging any wrongly-LLM-dependent logic (evaluator scoring is canon selector-only). FULL PICTURE NOW: deterministic lessons (chat shell, single interface, inline media) + deterministic adaptive/personalization engine + sim-first arena = whole platform launch-ready with ZERO live LLM; Gemini/live-LLM = optional-later-never-blocker.

Tool reply2026-06-30T00:32:03.791Z

The file /Users/Ali/.claude/projects/-Users-Ali-PKM/memory/project_aina_factory_build_2026_06_29.md has been updated successfully. (file state is current in your context — no need to Read it back)

Assistant2026-06-30T00:32:08.124Z

Thinking

Assistant2026-06-30T00:32:11.875Z

Confirmed and captured. The full architecture now reads clean: deterministic lessons (the chat shell) + a deterministic adaptive/personalization engine + a sim-first arena = a launch-ready platform with zero live-LLM dependency. The learnerDO and coach keep recording the data points that drive personalization regardless of LLM state — adaptation is selection over signals, not inference.

Your "look again" is now a running review (gpt-5.5) that will map the learnerDO, coach, and personalization engine, confirm each works LLM-off by selecting over recorded data, and flag anything that wrongly calls the LLM at runtime (the evaluator scoring path is already canon-bound to selector-only). I'll bring you what it finds — if any engine has a hidden LLM dependency, that's the one thing worth knowing, and I'll surface it. Nothing needs you.

User2026-06-30T00:32:28.858Z

Sorry the live LLM is the inline chat provider - that was the idea - like it was the agent and coach but the actual content and eveything will flow like that, so the live LLM/Coach can be optional - and it can be toggled off

Assistant2026-06-30T00:33:14.017Z

Thinking

Assistant2026-06-30T00:33:15.838Z

Ah, got it — thank you for the correction. So the live LLM is the inline chat provider — the conversational coach/agent inside the shell — and that's the optional, toggle-off layer. The content (lessons, cards, media) still flows deterministically through the shell as the always-on backbone; the live-LLM coach sits on top of it and can be switched off. Let me fix that in canon precisely:

Assistant2026-06-30T00:33:24.100Z

$ Bash

Log inline-chat-coach correction to canon + lanes

cd /Users/Ali/PKM/scratch/pkm-deepdive
cat > inline-coach-correction.txt <<'EOF'

---
## CORRECTION — the live LLM IS the inline chat coach (Ali 2026-06-29)
Refines the flow doctrine (do NOT read the earlier "no coach/agent at all" too literally):
- The **LIVE LLM = the inline chat provider / coach agent** WITHIN the chat shell — the conversational coach/agent layer. It is **OPTIONAL and toggle-able OFF.**
- The **CONTENT (lessons, cards, media) flows DETERMINISTICALLY** through the shell regardless — the always-on backbone (works fully LLM-off).
- The inline-chat coach (live LLM) is an OPTIONAL overlay on that backbone: toggle ON = live-LLM coach converses inline; toggle OFF = platform runs fully on deterministic content + recorded data + the adaptive engine (and the coach can be SIMULATED when off — the flight-simulator approach).
- The COACH has TWO aspects: (1) **data/signal recorder = ALWAYS ON, deterministic** (learnerDO + coach keep recording data points regardless of LLM); (2) **live-LLM inline conversationalist = OPTIONAL toggle.** "Tutor agent not needed" meant the CONTENT doesn't depend on the LLM — the coach exists as the optional live-LLM chat layer.
- So live LLM = the optional inline-chat coach (in the shell AND arena missions), toggle-off everywhere; sim/deterministic is the default. Platform fully functional with the coach toggled off.
EOF
base64 < inline-coach-correction.txt | ./vds-exec.sh runl '
  T=$(base64 -d)
  echo "$T" >> /srv/aina/ops/FACTORY-GUARDRAILS.md
  for f in lane-lesson-shell-port.md lane-adaptive-llm-off-review.md lane-practice-arena-v2.md; do [ -f /srv/aina/ops/lanes/$f ] && echo "$T" >> /srv/aina/ops/lanes/$f; done
  echo COACH_CORRECTION_LOGGED'

Tool reply2026-06-30T00:33:24.666Z

COACH_CORRECTION_LOGGED

User2026-06-30T00:33:32.481Z

[Request interrupted by user]

User2026-06-30T00:33:32.518Z

@"/Users/Ali/ui-lesson-previews-ali-recovered-2026-05-18" Check this it has the documentation of my thoughts

Assistant2026-06-30T00:33:42.363Z

Thinking

Assistant2026-06-30T00:33:43.303Z

These docs are your design source-of-truth — let me read your actual thinking before the build drifts on my interpretation. Reading the core ones:

Assistant2026-06-30T00:33:45.236Z

$ Bash

Read VISION.md + DECISIONS.md from the recovered dir

cd "/Users/Ali/ui-lesson-previews-ali-recovered-2026-05-18"
echo "== sizes =="; wc -l VISION.md DECISIONS.md README.md CURRICULUM-MAP.md CONVERSATION-LOG.md RECOVERY-MANIFEST.md 2>/dev/null
echo; echo "############## VISION.md ##############"; cat VISION.md 2>/dev/null
echo; echo "############## DECISIONS.md ##############"; cat DECISIONS.md 2>/dev/null

Tool reply2026-06-30T00:33:45.368Z

<persisted-output>
Output too large (29.6KB). Full output saved to: /Users/Ali/.claude/projects/-Users-Ali-PKM/ea21dc9e-a684-460b-b5d8-244fa739715e/tool-results/b0uxofqde.txt

Preview (first 2KB):
== sizes ==
     292 VISION.md
     193 DECISIONS.md
     106 README.md
     454 CURRICULUM-MAP.md
     177 CONVERSATION-LOG.md
      67 RECOVERY-MANIFEST.md
    1289 total

############## VISION.md ##############
---
title: "AINA Lesson Vision (universal lesson shape)"
status: approved
canonical: true
source: claude
created: 2026-05-17
last_updated: 2026-05-17
last_reviewed: 2026-05-17
owner: ali
review_cadence: quarterly
authority_level: 2
related_domains: [curriculum, lessons, product, design]
---

# AINA Lesson Vision — May 2026

**DocOS metadata** (frontmatter above): authoritative lesson-shape spec, owner ali, quarterly review. See [DOCOS.md](../DOCOS.md).
**Source:** Founder review session with the curriculum context loaded · **Audience:** Coding agents (Claude Code, Codex) and curriculum authors

This document specifies the *experience* of one AINA lesson, end to end. Architecture, schema, and component naming follow from this shape. If anything in this file conflicts with code currently in `apps/web/src/chat/` or `packages/ui/src/chat/`, this document is the intent; the code is what was reachable in one sprint.

## 1. What a lesson is

A lesson is one **capability** rendered as one **chat conversation** with five steps that map to the AINA architecture's Foundation → Applied → Integrated progression. A lesson is not a series of pages and not a video course. The chat *is* the surface.

A lesson teaches **one durable cognitive move** (e.g. Structured Intent Specification, Output Evaluation, Human-in-the-Loop Design) and ends with the learner having produced **at least one piece of their own real work** that demonstrates the move. Mastery is *demonstrated capability on the learner's own task* — not quiz completion, not a certificate.

...
</persisted-output>

Assistant2026-06-30T00:33:54.598Z

Thinking

Assistant2026-06-30T00:33:55.574Z

The VISION.md is your canonical lesson-shape spec — a lesson is one capability as one chat conversation (5 steps, Foundation→Applied→Integrated), the chat is the surface, ending in the learner's own real work. Let me read the full vision + decisions:

Assistant2026-06-30T00:33:56.074Z

⚙ Read

{
  "file_path": "/Users/Ali/.claude/projects/-Users-Ali-PKM/ea21dc9e-a684-460b-b5d8-244fa739715e/tool-results/b0uxofqde.txt"
}

Tool reply2026-06-30T00:33:56.478Z

1	== sizes ==
2	     292 VISION.md
3	     193 DECISIONS.md
4	     106 README.md
5	     454 CURRICULUM-MAP.md
6	     177 CONVERSATION-LOG.md
7	      67 RECOVERY-MANIFEST.md
8	    1289 total
9	
10	############## VISION.md ##############
11	---
12	title: "AINA Lesson Vision (universal lesson shape)"
13	status: approved
14	canonical: true
15	source: claude
16	created: 2026-05-17
17	last_updated: 2026-05-17
18	last_reviewed: 2026-05-17
19	owner: ali
20	review_cadence: quarterly
21	authority_level: 2
22	related_domains: [curriculum, lessons, product, design]
23	---
24	
25	# AINA Lesson Vision — May 2026
26	
27	**DocOS metadata** (frontmatter above): authoritative lesson-shape spec, owner ali, quarterly review. See [DOCOS.md](../DOCOS.md).
28	**Source:** Founder review session with the curriculum context loaded · **Audience:** Coding agents (Claude Code, Codex) and curriculum authors
29	
30	This document specifies the *experience* of one AINA lesson, end to end. Architecture, schema, and component naming follow from this shape. If anything in this file conflicts with code currently in `apps/web/src/chat/` or `packages/ui/src/chat/`, this document is the intent; the code is what was reachable in one sprint.
31	
32	## 1. What a lesson is
33	
34	A lesson is one **capability** rendered as one **chat conversation** with five steps that map to the AINA architecture's Foundation → Applied → Integrated progression. A lesson is not a series of pages and not a video course. The chat *is* the surface.
35	
36	A lesson teaches **one durable cognitive move** (e.g. Structured Intent Specification, Output Evaluation, Human-in-the-Loop Design) and ends with the learner having produced **at least one piece of their own real work** that demonstrates the move. Mastery is *demonstrated capability on the learner's own task* — not quiz completion, not a certificate.
37	
38	A lesson is universal by default. Personalization is an overlay (Selection, Sequencing, Filling, Framing, Exposure Depth — the five levers from the curriculum architecture), never a precondition. Every lesson must read cleanly for a learner with no role, no industry, no tool stack selected.
39	
40	## 2. The five steps
41	
42	Every lesson is the same five-step chat arc. The shape does not vary by level. The *content* inside each step varies by capability.
43	
44	### Step 1 — Why this matters (Awareness)
45	
46	The opening AINA message. Two paragraphs maximum. Frames the failure mode this capability prevents, in plain language, on the learner's terms.
47	
48	Followed by an **Explainer Card** (see §4). 60–90 seconds. The learner can skip it. The text below the card always stands alone.
49	
50	Followed by:
51	- **Workplace Use Case** callout — one sentence on when to use this in real work
52	- **Common Mistakes** expandable — three anti-patterns
53	
54	### Step 2 — The mechanism (Mechanism)
55	
56	A second AINA message that teaches the *anatomy* of the capability. This is where the **AnnotatedPromptCard** (see §4) appears: a real-world example showing the raw version, the annotated version with color-coded structural tags (ROLE / NEVER / RULE / EXAMPLE / TECHNICAL or capability-appropriate equivalents), and three to five explanation cards below.
57	
58	The point of Step 2 is not "memorize this template." It is "see the parts." A learner who reaches Step 3 should be able to name the parts in their head.
59	
60	### Step 3 — Practice attempt (First application)
61	
62	A third AINA message introducing a vague version of a real task (in a mono callout) and asking the learner to rewrite it using the capability they just saw. Composer becomes the active surface here.
63	
64	Learner submits. AINA streams a response that contains:
65	- A one-line verdict ("Solid structure. Four of five clean.")
66	- A **RubricCard** breaking out the capability's named dimensions (e.g. for Structured Intent: Role / Context / Task / Constraints / Output, each /20). Score = sum. Delta from previous attempt if any.
67	- A **FeedbackSection** with two named lists: what worked, what to sharpen.
68	
69	Hint ladder is visible but locked. Learner can unlock hints without point cost.
70	
71	### Step 4 — Refine & resubmit (Iteration)
72	
73	Same composer, same rubric, second attempt. The hint ladder is now functionally important. Three stages:
74	
75	1. **Nudge** — points at *what's missing* without naming it
76	2. **Scaffold** — gives a structural skeleton, learner fills it
77	3. **Near-solution** — names the missing element directly
78	
79	Each unlock costs zero points by design (architecture decision: hints should reduce shame, not penalize learning). After a passing score, Step 5 unlocks.
80	
81	### Step 5 — Apply at work (Transfer)
82	
83	Final AINA message: "Now bring your own task." Composer accepts free-form input. Rubric still applies but is informational, not gating. This is where the learner produces the artifact that satisfies the capability's Mastery Gate.
84	
85	This step is what makes a lesson real. It's also the step that requires no personalization — the learner is bringing their own context.
86	
87	## 3. The three columns (desktop) / single column (mobile)
88	
89	### Desktop layout
90	
91	```
92	┌──────────┬───────────────────────────┬──────────┐
93	│ Left     │ Center                    │ Right    │
94	│ sidebar  │ chat stream               │ sidebar  │
95	│          │                           │          │
96	│ - lesson │ - top bar (lesson · step  │ - your   │
97	│   title  │   pips · score)           │   trajec │
98	│ - 5      │ - chat messages           │ - attempt│
99	│   steps  │   (Step 1 → 5 unfold      │   history│
100	│ - up     │   as the learner          │ - anti-  │
101	│   next   │   progresses)             │   patterns│
102	│          │ - composer (sticky        │ - hint   │
103	│          │   bottom)                 │   reservoir│
104	└──────────┴───────────────────────────┴──────────┘
105	   260px              flex                300px
106	```
107	
108	### Mobile layout (≤ 720px)
109	
110	- Single column, full width
111	- Left sidebar → hamburger that opens a **drawer** (overlay) from the left
112	- Right sidebar → "i" button in the top bar that opens a **bottom sheet**
113	- Step pips → "Step X of 5" text only
114	- Score → small chip in top right
115	- Composer → docked to bottom, respects `safe-area-inset-bottom`, keyboard pushes the chat stream up (CSS `dvh` units)
116	- Explainer Card → 4:3 instead of 16:9, captions below the video frame, not overlaid
117	- AnnotatedPromptCard → annotations stack as cards below the raw frame, no overlay positioning
118	
119	The same primitives render in both modes. There is no separate mobile component tree.
120	
121	## 4. The named components
122	
123	These are the chat-stream primitives. Names are stable — once a name appears in this list, code should use it.
124	
125	### Explainer Card
126	
127	A 60–90 second visual explainer at the top of Step 1. Production target: HyperFrames-rendered video with Anime.js inline DOM annotations and Gemini TTS voiceover. Captions on by default. Always has a "show me the concept in text instead" toggle so the chat-only path works.
128	
129	**Working prototype substitute:** A CSS/SVG-animated card with a timeline driven by vanilla JS (no Anime.js dependency yet). Same duration, same shape, same affordances (play / pause / scrub / captions / text fallback). The prototype is `<ExplainerCard>` and the video pipeline replaces the body of the card later without touching the chat stream.
130	
131	Anti-patterns: do not autoplay; do not block the lesson behind video completion; do not ship video without captions.
132	
133	### AnnotatedPromptCard
134	
135	A two-frame component teaching the *parts* of a structured artifact. Top frame is the raw mono text. Bottom frame is the same text with color-coded inline underlines mapping to a tag legend. Below the frames: tag chips, then explanation cards (one per tag, ~2 sentences each).
136	
137	Reference visual: the bank-chatbot example screenshot from the May 17 review (ROLE / NEVER / RULE / EXAMPLE / TECHNICAL color coding, light theme, mono frames with traffic-light dots).
138	
139	Annotations are toggleable (Hide / Show button). On mobile, the annotation overlay collapses into stacked cards below the raw frame.
140	
141	### RubricCard
142	
143	Already shipped in `packages/ui/src/chat/RubricCard.tsx`. Renders score/100, verdict line, delta chip, and N named dimensions each with a horizontal bar and score/20.
144	
145	Colors map to score band:
146	- ≥ 16/20 → success (green)
147	- 10–15/20 → warning (amber)
148	- ≤ 9/20 → error (red)
149	
150	### HintStack / HintCard / HintLocked
151	
152	Already shipped. Three-stage ladder. Locked state visible, unlock is one click, no point cost.
153	
154	### Composer
155	
156	Already shipped. Multi-line capable. Submit on enter, newline on shift+enter. Respects `safe-area-inset-bottom`. Shows attempt count.
157	
158	### Message
159	
160	Already shipped. AINA messages stream character-by-character (variable rate, respects `prefers-reduced-motion`). User messages render in gold mono bubble, right-aligned.
161	
162	### UseCaseCallout / InlineCard
163	
164	Already shipped. Used for the "Workplace Use Case" callout and "The Vague Version" mono callout respectively.
165	
166	## 5. The streaming UX
167	
168	Every AINA response that takes more than ~200ms of compute is streamed. Streaming is the brand. Specifics:
169	
170	- Variable rate (2–4 chars per tick, 14–32ms between ticks)
171	- `prefers-reduced-motion: reduce` → instant render, no animation
172	- Streaming dot indicator next to the AINA eyebrow while in progress
173	- RubricCard appears *after* the intro line finishes streaming, not inline with it
174	- If the network drops mid-stream, the partial text remains visible and a retry chip appears
175	
176	This pattern lives in `apps/web/src/chat/SandboxLesson.tsx` (`streamText` helper). New chat surfaces should import or re-implement it identically — *do not* render full responses instantly even when the source is local mock data, because that breaks the learner's mental model of what they're talking to.
177	
178	## 6. The evaluator contract
179	
180	Each capability ships with a deterministic evaluator that scores attempts on N named dimensions (typically 3–5), each /20. The evaluator is a function:
181	
182	```ts
183	evaluate(attemptText: string, attemptNumber: number) -> {
184	  totalScore: number,            // 0–100
185	  outOf: 100,
186	  verdict: string,               // one line
187	  delta?: string,                // e.g. "+8" if previous attempt exists
188	  rows: { label: string, score: number, outOf: 20 }[],
189	  strengths: { lead?: string, body: string }[],
190	  improvements: { lead?: string, body: string }[],
191	  introLine: string              // streamed verdict line
192	}
193	```
194	
195	Reference implementation: `apps/web/src/chat/sandbox-data.ts` (mock heuristic for Structured Intent Specification — scores Role / Context / Task / Constraints / Output).
196	
197	Production evaluators run on Cloudflare Workers with three layers (deterministic → heuristic → LLM semantic) per the curriculum generation algorithm doc. The chat UI does not know which layer scored which dimension — it just renders the rows.
198	
199	Evaluator failure modes that must surface in the UI:
200	- Off-topic submission → "I couldn't find a structured prompt to evaluate. Try again with role / context / task / constraints / output named."
201	- Empty submission → composer rejects locally, no eval call
202	- Inappropriate / unsafe submission → evaluator returns score 0 with a coaching message, no rubric
203	
204	## 7. The universal content authoring contract
205	
206	Every universal lesson ships as a single JSON file conforming to a schema that strictly separates universal content from personalization slots. The full schema lives in `DECISIONS.md` §3. Summary of fields required for a lesson to render in this UI:
207	
208	- `title`, `sidebar_title`, `level_label`, `level_chip`, `pedagogy_phase`
209	- `big_takeaway`, `core_concept`, `mental_model`
210	- `workplace_use_case`, `anti_patterns[]`
211	- `explainer_script` (scene specs for the explainer card)
212	- `anatomy` (data for AnnotatedPromptCard: raw_text, annotated_segments[], tag_legend[], explanation_cards[])
213	- `vague_version`, `composer_placeholder`
214	- `evaluator_id` (which evaluator function to invoke)
215	- `hints[]` (three: nudge / scaffold / near-solution)
216	- `apply_at_work_prompt`
217	- `cross_references[]` (capability codes)
218	- `when_not_to_use`, `teach_back_prompt` (the Module 9 / Module 10 discrimination + transfer moves)
219	
220	The 90 lessons currently in `Personalization Engine/personalization-engine-aina/ontology/curated/module_content/` have most fields but are missing: `explainer_script`, `anatomy`, `hints[]`, `when_not_to_use`, `teach_back_prompt`. Re-authoring against the full schema is the next content-side workstream.
221	
222	## 8. What personalization touches (and what it doesn't)
223	
224	Personalization is **layered on top** of a complete universal lesson. The five levers from the architecture:
225	
226	- **Selection** — which modules to surface
227	- **Sequencing** — what order
228	- **Filling** — placeholder substitution (tool stack, recurring stakeholders, role-specific examples)
229	- **Framing** — mindset variant (Skeptic / Curious / Optimist) and Posture (Guided / Practical / Strategic / Builder / Leadership)
230	- **Exposure Depth** — Foundation only, or include Applied, or include Integrated
231	
232	Personalization **never** changes:
233	- The capability being taught
234	- The rubric dimensions or weights
235	- The mastery gate threshold
236	- The principle being assessed
237	- The five-step shape
238	
239	Personalization **does** change:
240	- The opening hook line (mindset variant)
241	- The worked example (industry / role swap)
242	- The vague version's domain (marketing vs. fintech vs. healthcare)
243	- Tool variants in the explainer and AnnotatedPromptCard ("Claude Project" vs. "Custom GPT" vs. "Gemini Gem")
244	- The order in which capabilities are taught
245	
246	When personalization is disabled or unavailable, the lesson renders with neutral defaults that *still feel complete* — that is the test the universal layer must pass.
247	
248	## 9. The video pipeline (deferred but specified)
249	
250	Explainer Cards in production use the HyperFrames pipeline:
251	
252	1. **Concept script** — author writes a 12-bar scene spec (~90s) in markdown, one beat per line
253	2. **Scene spec → JSON** — structured input to the renderer
254	3. **HyperFrames render** — produces MP4 + SRT
255	4. **Gemini TTS overlay** — voice-over track aligned to SRT
256	5. **Anime.js DOM annotations** — inline highlights that play *over* the video on the chat surface (not baked into the MP4)
257	6. **Review pass** — captions, pacing, factual accuracy
258	7. **Publish** — MP4 to CDN, SRT to lesson JSON, scene spec retained for re-render
259	
260	Until this pipeline is live, the Explainer Card renders an animated CSS/SVG sequence with the same script timing. The slot is identical — only the body of the card changes when video lands.
261	
262	Anti-patterns:
263	- Do not generate videos per user. Personalization layers in DOM, not in video.
264	- Do not animate things that don't need animation. Most lessons have 4–6 beats; that's it.
265	- Do not let LLMs freestyle the visuals. Scene specs are constrained input, validated before render.
266	
267	## 10. What "done" looks like for V1
268	
269	For the V1 commercial cohort, a "done" lesson is:
270	
271	- One universal JSON conforming to the full schema in §7
272	- One Explainer Card (CSS-animated for V1, video for V2)
273	- One AnnotatedPromptCard with at least 3 explanation cards
274	- One evaluator function (deterministic heuristic for V1, LLM-layer added in V2)
275	- Three hints (nudge / scaffold / near-solution)
276	- One Apply-at-Work prompt
277	- A pass-grade rubric that's been tested against ≥ 10 sample attempts and produces no systematic false-pass or false-hold
278	- Renders correctly on a 390px viewport
279	
280	Track 1 is 30 such lessons (one per capability). Track 1 = the commercial product. Track 2 (Levels 4–5) is deferred until Track 1 ships and the first cohort completes.
281	
282	## 11. Reference implementations
283	
284	- **Existing chat shell:** `packages/ui/src/chat/` — use as-is, do not rebuild
285	- **Working sandbox lesson:** `apps/web/src/chat/SandboxLesson.tsx` — closest production analog; the prototypes in this folder must stay compatible
286	- **Mock evaluator:** `apps/web/src/chat/sandbox-data.ts` — the contract for new evaluators
287	- **Visual reference for mockups:** `apps/web/src/components/landing-v6/StickyShellStage.astro` (Panels 04 and 05)
288	- **Universal content corpus:** `Personalization Engine/personalization-engine-aina/ontology/curated/module_content/` — 90 JSON files (30 capabilities × 3 phases). Schema needs extending per §7.
289	- **Quality bar for re-authoring:** `AINA Curriculum/AINA_25_Authored_Lessons_v1.md` — 25 lessons authored against the richer schema, treat these as the model.
290	
291	## 12. Anti-vision (what we are not building)
292	
293	- **Not a video course.** Video is one beat inside a lesson, not the lesson.
294	- **Not a quiz platform.** Multiple-choice exists only as a sanity check inside Step 1; the gating is the Apply-at-Work artifact.
295	- **Not certificate-driven.** "Statement of demonstrated capability tied to the learner's real artifact" — not a Coursera-style completion badge.
296	- **Not personalized-from-scratch.** Generated lessons that don't anchor on the universal layer are out of scope. The LLM never writes the lesson body at runtime.
297	- **Not tool-specific.** Tool variants live in slots, not in the spine. A "Master Custom GPTs" lesson is, by construction, a content-layer drift — it doesn't exist in the curriculum.
298	- **Not bootcamp-paced.** Track 1 is 11–14 weeks at a curious learner's pace. We are honest about this in acquisition copy.
299	
300	---
301	
302	If you read this and disagree with any decision, edit `DECISIONS.md`, log the change, and ping Ali. Do not silently re-shape the lesson in code without updating these documents.
303	
304	############## DECISIONS.md ##############
305	---
306	title: "AINA Lesson Shape — Decisions Log (ADR-lite for the lesson UX)"
307	status: approved
308	canonical: true
309	source: claude
310	created: 2026-05-17
311	last_updated: 2026-05-17
312	last_reviewed: 2026-05-17
313	owner: ali
314	review_cadence: quarterly
315	authority_level: 2
316	related_domains: [curriculum, lessons, design, governance]
317	---
318	
319	# Decisions Log — AINA Lesson Shape
320	
321	**Note on relationship to repo-level decisions:** This file is an ADR-lite tracker for the lesson UX specifically. Repo-wide durable decisions live in [`docs/decisions/`](../docs/decisions/) per [ADR-0001](../docs/decisions/0001-source-of-truth.md). If any decision here graduates to repo-wide, it gets promoted to a numbered ADR.
322	
323	**Owner:** Ali · **Format:** ADR-lite — what was decided, why, when, what changed.
324	
325	Decisions are listed newest first. Each decision either **locks** a piece of the lesson shape, **opens** a question that needs resolution before V1 ships, or **defers** something to a later track.
326	
327	---
328	
329	## 2026-05-17 · L17.4 — V1 evaluator runs Cloudflare Workers, three-layer
330	**Status:** Locked
331	The evaluator pipeline for V1 is the three-layer pattern from the curriculum generation algorithm doc: deterministic → heuristic → LLM semantic. Runs on Cloudflare Workers. The chat UI never sees which layer scored which dimension — the contract is the `MockEvalResult` shape in `apps/web/src/chat/sandbox-data.ts`. Reference implementation for L1.1 lives there.
332	
333	**Why:** Anchors the chat-side rendering contract before evaluator-side production work begins. Locking the shape early means evaluator authors and UI authors do not block each other.
334	
335	---
336	
337	## 2026-05-17 · L17.3 — Lesson schema extends the curated 15 fields
338	**Status:** Locked
339	The universal lesson JSON adds these fields to the curated 15-field schema currently in `ontology/curated/module_content/`:
340	
341	- `sidebar_title` (short italic-display title for the left rail)
342	- `level_chip` (`l1` | `l2` | `l3`)
343	- `explainer_script` — array of scene specs (beat number, duration_sec, headline, body, visual_hint)
344	- `anatomy` — structure for the AnnotatedPromptCard:
345	  - `raw_text` (string, monospace content)
346	  - `annotated_segments` — `[{ start, end, tag, note }]`
347	  - `tag_legend` — `[{ tag, color_role, label }]`
348	  - `explanation_cards` — `[{ tag, title, body }]`
349	- `vague_version` (string, mono, the Step 3 callout)
350	- `composer_placeholder` (string, the Step 3 input placeholder)
351	- `evaluator_id` (string, names the evaluator function)
352	- `hints` — `[{ level, label, title_md, body }]` (three: nudge / scaffold / near-solution)
353	- `apply_at_work_prompt` (string, the Step 5 prompt)
354	- `cross_references` — `[{ capability_id, relation, note }]` (architecture's Module 8 + 10 transfer move)
355	- `when_not_to_use` (string, the Module 9 discrimination move)
356	- `teach_back_prompt` (string, the Module 10 transfer test)
357	- `mindset_variants` — `{ skeptic: string, curious: string, optimist: string }` for the Step 1 opening hook
358	
359	**Why:** The 90 curated files cover the spine but compress 10 modules per capability into 3 phase-lessons and lack the discrimination + transfer moves the architecture requires. The AINA_25 authored lessons set the quality bar; the schema extension formalizes it.
360	
361	**Migration:** Re-author 30 Track 1 capabilities against the new schema as a single sprint. The previous 90 files stay as fallback content; the new 30 supersede them once published.
362	
363	---
364	
365	## 2026-05-17 · L17.2 — Lesson shape is five steps, not ten modules
366	**Status:** Locked (with reservation)
367	A lesson in the UI is one capability rendered as a five-step chat arc: *Why this matters → The mechanism → Practice attempt → Refine & resubmit → Apply at work*. Mapped to the architecture: Foundation (steps 1–2) → Applied (steps 3–4) → Integrated (step 5).
368	
369	**Why:** The architecture promises 10 modules per capability (50 capabilities × 10 = 500 modules). The chat-first UI compresses these into the five-step arc per capability. Authoring at the 500-module unit is wasted effort if learners never experience that granularity. Authoring at the 30-capability unit matches the UX.
370	
371	**Reservation:** This compression must not erase the Module 9 (discrimination — "when not to use") and Module 10 (teach-back) moves. Both are required fields in the lesson schema (see L17.3) and surface inside the Apply-at-Work step.
372	
373	**Implication:** Track 1 commercial content = 30 lessons, not 300 modules. Track 2 (L4–L5) deferred.
374	
375	---
376	
377	## 2026-05-17 · L17.1 — Universal content reads complete without personalization
378	**Status:** Locked
379	Every universal lesson must read cleanly for a learner with no role, no industry, no tool stack selected. Personalization is an overlay (the five levers: Selection, Sequencing, Filling, Framing, Exposure Depth), never a precondition.
380	
381	**Test:** Open a lesson JSON, render it with all personalization slots holding their neutral defaults. If the result feels like a stripped-down placeholder, the universal authoring failed.
382	
383	**Implication:** The free / preview tier of the product is the universal layer with neutral defaults. The paid layer adds personalization. The architecture commits to this; the content layer must follow through.
384	
385	---
386	
387	## 2026-05-17 · L17.5 — Explainer Card slot ships with CSS animation in V1, video in V2
388	**Status:** Locked (phased)
389	The Step 1 Explainer Card slot exists in the chat shell from V1. V1 ships with a CSS/SVG-animated body inside the card. V2 swaps the body for HyperFrames-rendered video + Gemini TTS + Anime.js DOM annotations.
390	
391	**Why:** The video pipeline is a 2-week production track on its own (script → scene spec → render → TTS → caption → review → publish). Shipping the lesson architecture cannot block on it. The slot is the same in both versions; only the body changes.
392	
393	**Anti-pattern:** Authoring a lesson that *requires* video to land. Every lesson must have a text + animated-CSS fallback that teaches the same concept.
394	
395	---
396	
397	## 2026-05-17 · L17.6 — AnnotatedPromptCard is a new chat primitive
398	**Status:** Locked
399	Add `AnnotatedPromptCard` to `packages/ui/src/chat/`. Renders:
400	- Two mono frames (raw + annotated, traffic-light dots, light-theme cream surface)
401	- Color-coded inline tag underlines in the annotated frame
402	- Toggle button (Hide / Show annotations)
403	- Tag legend chips
404	- Explanation card stack (one card per tag, 2 sentences)
405	
406	Mobile collapse: annotation overlays stack into cards below the raw frame instead of overlaying.
407	
408	**Why:** The visual pattern from the bank-chatbot example screenshot teaches anatomy in a way prose cannot. It belongs as a first-class primitive because it will be reused across every capability that has structural parts (Structured Intent, Constraint Setting, Workflow Architecture, Human-in-the-Loop, etc.).
409	
410	**Reference visual:** Bank chatbot example screenshot from the May 17 review (ROLE / NEVER / RULE / EXAMPLE / TECHNICAL color coding).
411	
412	---
413	
414	## 2026-05-17 · L17.7 — Streaming is brand, not optimization
415	**Status:** Locked
416	Every AINA response in the chat surface streams character-by-character with variable rate (2–4 chars per tick, 14–32ms between ticks). This applies even when the source is local mock data — *especially* when the source is local mock data, because the learner's mental model of "I am talking to AINA" must hold.
417	
418	Reference implementation: `streamText()` helper in `apps/web/src/chat/SandboxLesson.tsx`.
419	
420	Exceptions:
421	- `prefers-reduced-motion: reduce` → instant render
422	- Quick-test answer reveals → instant (they are not AINA's voice)
423	- System-level messages (errors, network drops) → instant
424	
425	---
426	
427	## 2026-05-17 · L17.8 — Hints cost zero points
428	**Status:** Locked
429	The three-stage hint ladder (Nudge → Scaffold → Near-solution) unlocks freely. No point penalty. The architecture rationale: hints reduce shame, not learning. A learner who unlocks all three and then nails attempt 4 has learned more than a learner who guessed correctly on attempt 1.
430	
431	**Implication:** Surface this in the UI copy near the hint ladder: "Stuck? Use a hint — they're free. The model gets sharper, you don't lose points." (Already in the existing landing-v6 lesson panel; preserve verbatim.)
432	
433	---
434	
435	## 2026-05-17 · L17.9 — Mobile is not a separate component tree
436	**Status:** Locked
437	The same chat primitives render in both desktop and mobile. Mobile behavior is responsive CSS + a small set of layout swaps:
438	
439	- Left sidebar → drawer (overlay)
440	- Right sidebar → bottom sheet
441	- Step pips → text "Step X of 5"
442	- Composer → docked, respects `safe-area-inset-bottom`, uses `dvh` units
443	- Explainer Card → 4:3, captions below the frame
444	- AnnotatedPromptCard → annotations stack below raw, no overlay
445	
446	We do not ship a separate React Native app for V1. Mobile-web is the entire mobile UX. (Native client is in the V2 roadmap per the curriculum design doc but deferred.)
447	
448	---
449	
450	## 2026-05-17 · L17.10 — Sidebar nav follows "Lesson N of 16" pattern
451	**Status:** Open (needs decision)
452	The current screenshot shows "Lesson 4 of 16" in Level 1. The architecture's spine specifies 10 capabilities × 10 modules per level. The UI's 16 number is unexplained.
453	
454	**Three plausible interpretations:**
455	- (a) 16 = 10 capabilities + 6 cross-cutting lessons (intros, transitions, capstones)
456	- (b) 16 = a curated subset drawn from the 100 modules at L1
457	- (c) 16 = compressed view, e.g. 3 lesson-sized chunks per level × ~5 levels
458	
459	Resolution required before authoring Track 1 at scale. Owner: Ali. Deadline: before the next content sprint.
460	
461	**Working assumption** (not locked): interpretation (a). The previews assume one lesson per Track 1 capability (30 lessons total, 10 per level), with cross-cutting framing lessons optional.
462	
463	---
464	
465	## 2026-05-17 · L17.11 — Quality bar = AINA_25 authored set
466	**Status:** Locked
467	The 25 lessons in `AINA Curriculum/AINA_25_Authored_Lessons_v1.md` are the quality bar for re-authoring. They include `worked_example`, `failure_example`, `when_not_to_use`, `teach_back_prompt`, `cross_references`, mindset variants — the fields the curated 90 lack.
468	
469	**Implication:** The next content workstream is *not* "author 210 more modules to get from 90 to 300 curated lessons." It is "re-author the 30 Track 1 capabilities against the AINA_25 schema." Quality over coverage.
470	
471	---
472	
473	## 2026-05-17 · L17.12 — Tool variants by phase
474	**Status:** Locked (per Content Layer Architecture doc)
475	- **Foundation phase** — default chat (Claude / ChatGPT / Gemini) + Custom Instructions + Memory only. No Projects, GPTs, Gems, MCP, Zapier, automation.
476	- **Applied phase** — adds Projects, Custom GPTs, Gems, scheduled tasks
477	- **Integrated phase** — adds MCP servers, automation platforms, multi-agent patterns
478	
479	**Anti-pattern flagged in evaluation:** Several curated Foundation lessons reference MCP filesystem servers or Zapier. Those must be downgraded to Applied/Integrated phases in re-authoring.
480	
481	---
482	
483	## Open questions (need resolution before V1 ships)
484	
485	1. **L17.10** — what does "Lesson N of 16" mean? See above.
486	2. **Mastery gate threshold calibration** — the architecture lists 6.0 / 6.5 / 7.0 / 7.0 / 7.5 for L1–L5 as "tested and locked" but no test data is attached. First-cohort instrumentation will calibrate. Need: a dashboard that surfaces pass/hold/reinforce rates per Level × Posture × Industry from cohort 1.
487	3. **Tool variant scope at V1 launch** — which 3–5 tools ship with Phase B demonstrations on day one? Recommended: Claude (chat + Projects), ChatGPT (chat + Custom GPTs), Gemini (chat + Gems). Defer MCP variants to Q3.
488	4. **Credential / verifiable certificate at V1** — should V1 ship with a "Statement of Demonstrated Capability" tied to the learner's Mastery Gate artifact, or stay credential-free until cohort 2? Recommendation: ship with a verifiable credential. Loss aversion is real for the working-manager ICP.
489	5. **The "Apply at Work" submission flow** — does the learner submit through the chat (paste their artifact) or upload a file? Current prototypes assume chat-paste; production may need both. Owner: design + product.
490	
491	## Deferred
492	
493	- Track 2 (Levels 4–5) commercial launch — defer until Track 1 cohort 1 graduation data lands
494	- Native mobile app — V2 roadmap
495	- API access to lessons / external embedding — post-V1
496	- Tool variants beyond Claude / ChatGPT / Gemini — phased per L17.12
497	- Custom evaluators per industry — V1 ships with capability-level evaluators; industry-level mastery gates use the curated industry gates as scoring rubric overrides, not as separate evaluators
498

User2026-06-30T00:33:59.044Z

[Request interrupted by user]