Panel: Builder
- 1. Can I Build From This?
- 2. AI Thinking Partner — Can I Write System Prompts?
- 2a. Conversation Architecture
- 2b. Cross-Module Memory — Technical Implementation
- 2c. Token Management
- 2d. Scaffolding Removal — How to Implement
- 3. Data Model Gaps
- 3a. Participant Profile
- 3b. Exercise Artifacts
- 3c. Conversation History
- 3d. ALI Assessment Schema
- 3e. 8-Week System Data
- 4. Session Management
- 4a. Save Points
- 4b. Abandonment
- 4c. Session Duration
- 4d. Multi-Device
- 5. The Exercises in Software
- 5a. Partnership Audit (§8)
- 5b. Partnership Map (§10)
- 5c. Discovery Sprint (§9)
- 5d. Readiness Diagnostic (§11)
- 5e. 90-Day Plan (§12)
- 5f. General Pattern
- 6. Artifact Generation (PDFs)
- 6a. What Needs to Be PDF-able?
- 6b. How Structured Does the Data Need to Be?
- 6c. What Does a Partnership Audit PDF Look Like?
- 6d. Technical Approach
- 7. The 8-Week System
- 7a. Trigger Mechanism
- 7b. Conversation Flow
- 7c. Accountability Pairs
- 7d. Week 8 — ALI Retake
- 7e. Monthly Continuation (Post-Week 8)
- 8. Scope Risks
- 8a. Most Technically Ambitious
- 8b. What I'd Cut for MVP
- 8c. What I'd Push to v1.1
- 8d. What MUST Be in MVP
- 9. Red Flags
- 🔴 Red Flag 1: No TECHNICAL-SPEC.md
- 🔴 Red Flag 2: AI Conversation Quality Is Unvalidated
- 🟡 Red Flag 3: "The Medium IS the Message" Creates a Quality Trap
- 🟡 Red Flag 4: Exercise Complexity Varies Wildly
- 🟡 Red Flag 5: Enterprise Features Are Implied But Not Specified
- 🟡 Red Flag 6: The 8-Week System Is a Second Product
- 🟢 Green Flag: Cost Model Is Excellent
- 🟢 Green Flag: Content Quality Is Exceptional
- Summary: Top 5 Actions Before Engineering Starts
Panel Review: The Builder
Reviewer: Senior Full-Stack Engineer / Technical Product Manager
Perspective: "Drew reading this to figure out what we need to build"
Document Reviewed: COURSE-SPEC-UNIFIED.md (March 2, 2026)
Review Date: March 2, 2026
1. Can I Build From This?
Short answer: I can build the facilitated workshop tooling from this. I cannot build the on-demand product without a separate technical spec — and the doc acknowledges this ("Technical architecture, enterprise features, and engineering specs live in TECHNICAL-SPEC.md").
What's strong:
- The content is airtight. Module sequencing, exercise instructions, AI voice/tone guidance, timing — all clearly specified. A content team could build the deck and video scripts directly from this.
- The Two Modalities table (§1) cleanly separates delivery concerns.
- Save points are explicitly listed per module — that's a buildable feature boundary.
Where my team gets stuck:
- No UI wireframes or interaction patterns. The spec describes what participants do but not how they interact with the interface. "Write down 10 things" — is that a text area? Ten separate input fields? A dynamic list with add/remove? This matters for every exercise.
- AI conversation flow is described narratively, not as state machines. I get the spirit of "AI engages with 2-3 specific activities," but I need to know: Is this a structured sequence (prompt → response → prompt → response × N turns)? Or is it freeform chat with guardrails? The answer radically changes the build.
- No API or integration spec. The doc references email notifications, in-app badges, downloadable PDFs, radar charts, accountability pair matching — each of these is a distinct system. None have technical specifications.
- The assessment (ALI) needs its own spec. 30 items, 6-point Likert, scoring algorithm, dimension aggregation, profile narrative generation, radar chart rendering, cohort comparison logic — this is a standalone product. §5 gives me content direction but zero technical detail.
- "See TECHNICAL-SPEC.md" — Referenced at the bottom and in Stress Test 6. I haven't seen this document. If it exists and covers what I'm flagging here, great. If it doesn't exist yet, that's the most critical gap.
Verdict: This is a course spec, not a product spec. It's excellent at what it is. But my team needs a product spec layered on top of it to start writing code.
2. AI Thinking Partner — Can I Write System Prompts?
What's provided (§6):
- Voice principles table (6 principles with positive/negative examples)
- Tone calibration per module (5 distinct tones)
- Scaffolding removal schedule (4 phases)
- Error recovery protocol (3 scenarios with example responses)
- Dissent protocol (2 scenarios)
- Boundaries list (8 "never" rules)
- Word count caps (200 normal, 400 synthesis)
- Meta-awareness moments (2-3 per course)
- Participant evaluation ternary ("That landed / That missed / Mixed")
Can I write system prompts from this? About 60% of the way there. Here's what's missing:
2a. Conversation Architecture
The spec gives me example exchanges but not the structure of a conversation. For Module 1's Partnership Audit (§8), I need to know:
- How many turns does the AI engage per exercise step?
- What triggers the AI to move from Step 4 discussion to Debrief?
- If a participant goes off-topic, what's the re-engagement strategy beyond the error recovery protocol?
- Is there a maximum conversation length per exercise?
What I'd build: A turn-level conversation flow for each exercise, specifying: trigger → AI behavior → expected participant action → next trigger. The spec gives me the "happy path" dialogue. I need the state machine.
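As a sketch of what that state machine could look like — the phase names, the 6-turn cap, and the 4-step count below are my assumptions, not values from the spec:

```typescript
// Illustrative only: phase names, turn cap, and step count are assumed.
type Phase = "prompting" | "discussing" | "debrief" | "complete";

interface ExerciseState {
  phase: Phase;
  step: number;        // 1-based exercise step
  turnsInStep: number; // exchanges so far within this step
}

const MAX_TURNS_PER_STEP = 6; // hypothetical cap
const TOTAL_STEPS = 4;        // e.g. Partnership Audit Steps 1-4

function onParticipantTurn(s: ExerciseState): ExerciseState {
  if (s.phase === "debrief") return { ...s, phase: "complete" };
  const turns = s.turnsInStep + 1;
  if (turns < MAX_TURNS_PER_STEP) {
    return { phase: "discussing", step: s.step, turnsInStep: turns };
  }
  // Turn budget exhausted: advance a step, or move to Debrief after the last.
  return s.step < TOTAL_STEPS
    ? { phase: "prompting", step: s.step + 1, turnsInStep: 0 }
    : { phase: "debrief", step: s.step, turnsInStep: 0 };
}
```

The value of writing it this way: "what triggers the move to Debrief" becomes an explicit, testable transition instead of prompt-engineering folklore.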
2b. Cross-Module Memory — Technical Implementation
The spec says the AI is "Remembering" (§6 Voice Principles) and "references prior answers across modules." This is the single hardest technical requirement in the entire spec.
Implementation options:
1. Full conversation history in context window. At ~104K tokens for the full course (§17), this is feasible with current models (128K-200K context windows). But it's expensive per-call and gets unwieldy. The 8-week system adds another ~55K.
2. Structured summary objects. After each exercise, extract key data into a structured JSON object (e.g., Partnership Audit results, Identity Statement text, specific workflow names). Feed these into each subsequent prompt as "participant profile." Much cheaper, more reliable.
3. Hybrid. Structured summaries for cross-module reference + recent conversation history for in-module continuity.
My recommendation: Option 3. But the spec doesn't tell me what to extract. I need a "memory schema" — for each exercise, what data points must persist for downstream reference? For example:
- Partnership Audit: The 10 activities + their zone classifications + the participant's "hardest to classify" item
- Identity Statement: The full text
- Priority Discovery: The "most surprising" item
- Partnership Map: Task-level allocations with "why" reasoning
This is a spec gap. Without it, my team will guess what to remember, and the AI will either reference things poorly or bloat the context window.
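A minimal version of that memory schema might look like this — every interface and field name below is my guess pending the real spec:

```typescript
// All names here are guesses pending the real memory spec.
type Zone = "own" | "augment" | "automate";

interface ParticipantMemory {
  partnershipAudit?: {
    activities: { description: string; zone: Zone }[];
    hardestToClassify: string;
  };
  identityStatement?: { text: string };
  priorityDiscovery?: { mostSurprising: string };
}

// Render the persisted summaries as a compact context block for
// downstream prompts, instead of replaying full conversation history.
function memoryToContext(m: ParticipantMemory): string {
  const parts: string[] = [];
  if (m.partnershipAudit) {
    const zones = m.partnershipAudit.activities
      .map((a) => `${a.description} (${a.zone})`)
      .join("; ");
    parts.push(
      `Partnership Audit: ${zones}. Hardest to classify: ${m.partnershipAudit.hardestToClassify}.`,
    );
  }
  if (m.identityStatement) parts.push(`Identity Statement: "${m.identityStatement.text}"`);
  if (m.priorityDiscovery) parts.push(`Most surprising discovery: ${m.priorityDiscovery.mostSurprising}`);
  return parts.join("\n");
}
```

The `memoryToContext` output is what gets prepended to later prompts — a few hundred tokens instead of a full module transcript.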
2c. Token Management
The cost model (§17) estimates ~104K tokens for the full course and ~20K for the heaviest module (Design). These are reasonable. But:
- What model? The cost estimates assume ~$0.014/1K tokens (working backward from the numbers). That's roughly GPT-4-class pricing. Model choice isn't specified.
- What's the per-turn budget? 200-word cap on responses helps, but input tokens matter too. A participant who writes 500-word responses to every prompt will blow the budget.
- Streaming or batch? Affects UX and cost.
- Fallback model? If the primary model is down or rate-limited, what happens?
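Even before those answers, a crude guard can cap runaway inputs. A sketch, assuming a ~4-characters-per-token heuristic (real tokenizers differ) and a hypothetical 1,000-token per-turn input cap:

```typescript
// Heuristic only: ~4 chars/token; the per-turn cap is a placeholder.
const CHARS_PER_TOKEN = 4;
const MAX_INPUT_TOKENS_PER_TURN = 1000;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

function clampParticipantInput(text: string): { text: string; truncated: boolean } {
  const maxChars = MAX_INPUT_TOKENS_PER_TURN * CHARS_PER_TOKEN;
  if (text.length <= maxChars) return { text, truncated: false };
  return { text: text.slice(0, maxChars), truncated: true };
}
```

In practice we'd probably warn the participant rather than silently truncate — the point is that input budgeting needs an explicit policy, not just a response-side word cap.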
2d. Scaffolding Removal — How to Implement
The spec describes scaffolding removal qualitatively (§6): "Full scaffolding → Moderate → Light → Minimal." But how does this translate to prompt engineering?
Options:
- Separate system prompts per module (simplest, most controllable)
- A single system prompt with a "scaffolding level" parameter
- Dynamic prompt modification based on participant behavior
I'd go with separate system prompts per module. But I need to know: does the AI's scaffolding level also adapt within a module based on participant skill? The "Teach it back" moment in Module 4 (§11) implies yes — but that's a much harder problem.
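Under the separate-prompts-per-module approach, prompt selection is trivial. The module-to-level mapping and the prompt fragments below are placeholders, not the spec's wording:

```typescript
// Mapping and prompt text are illustrative placeholders.
type Scaffolding = "full" | "moderate" | "light" | "minimal";

const MODULE_SCAFFOLDING: Record<number, Scaffolding> = {
  1: "full", 2: "moderate", 3: "light", 4: "minimal", 5: "minimal",
};

function systemPromptFor(module: number): string {
  const level: Scaffolding = MODULE_SCAFFOLDING[module] ?? "full";
  return [
    `You are the AI Thinking Partner for Module ${module}.`,
    `Scaffolding level: ${level}.`,
    level === "full" || level === "moderate"
      ? "Offer step-by-step structure and explicit examples."
      : "Let the participant drive; intervene only when they stall.",
  ].join(" ");
}
```

Within-module adaptation (the harder problem) would replace the static mapping with a function of participant behavior — which is exactly why it should be scoped separately.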
3. Data Model Gaps
The spec implies the following data structures without specifying them:
3a. Participant Profile
- user_id
- role, function, team_size, team_type
- org_ai_maturity
- provocation_response (3 sentences)
- ali_scores (pre): { overall, define, discover, design, develop, demonstrate }
- ali_scores (post): { same }
- ali_profile_narrative: text
- current_module, current_step
- created_at, last_active_at
3b. Exercise Artifacts
Each exercise produces output. But what's the schema? The spec doesn't say. My best guess:
Partnership Audit:
- items[]: { description, zone: own|augment|automate, hardest_to_classify: bool }
- own_count, augment_count, automate_count
- ai_observations: text[]
Identity Statement:
- statement_text
- pattern_used: meaning_maker|judgment_owner|trust_builder|custom
Possibility Map:
- workflows[]: {
name,
source: "partnership_audit_item_N",
efficiency: text,
augmentation: text,
transformation: text
}
- priority_discovery: text
Partnership Map:
- workflow_name
- source_workflow: FK to possibility_map
- tasks[]: {
description,
zone: own|augment|automate,
why: text,
ethical_guardrail: text,
handoff_protocol: text,
escalation_trigger: text
}
- is_guided: bool
Readiness Diagnostic:
- gaps[]: {
type: psych_safety|conceptual|technical|identity,
rating: 1-5,
signals: text
}
- biggest_gap: FK
- actions[]: { description, timeline: "30 days" }
Safety Commitment:
- commitment_text
- is_specific: bool (validated by AI)
90-Day Plan:
- target_partnership_map: FK
- metrics_30d: text
- metrics_60d: text
- metrics_90d: text
- audience: text
- success_threshold: text
- scale_trigger: text
- kill_criteria: text
- story: text
Course Commitment:
- full_statement: text
- per_step: { define, discover, design, develop, demonstrate }: text
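To make these guesses concrete: the Partnership Audit tallies (own_count etc.) should probably be derived from items[] rather than stored, so the counts can never drift out of sync with the list. A sketch, with assumed names:

```typescript
// Derive zone tallies from items[] instead of storing them separately.
type Zone = "own" | "augment" | "automate";

interface AuditItem {
  description: string;
  zone: Zone;
  hardestToClassify: boolean;
}

function zoneCounts(items: AuditItem[]): Record<Zone, number> {
  const counts: Record<Zone, number> = { own: 0, augment: 0, automate: 0 };
  for (const item of items) counts[item.zone] += 1;
  return counts;
}
```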
Questions for the team:
- Are these all freeform text, or do we want structured data for analytics/reporting?
- Enterprise buyers will want aggregate reporting. What rollup views do we need?
- Does the AI need to query this data, or just read it from context?
3c. Conversation History
- How long do we retain conversation logs?
- Is the ternary feedback ("That landed / That missed / Mixed") stored per-interaction?
- The "What do you know about me?" transparency feature (§6) implies we need a queryable participant data store.
- Do we need conversation-level analytics (average turns per exercise, drop-off points)?
3d. ALI Assessment Schema
- 30 items, but they're not all listed. We have samples (§5). Need the full item bank.
- Scoring algorithm: simple average per dimension? Weighted? Reverse-scored items?
- "Profile narrative" — is this AI-generated or template-based? The spec says "auto-generated" (§8) but doesn't specify the generation method.
- "Cohort comparison" — implies we need cohort grouping logic. By organization? By enrollment date? By industry?
3e. 8-Week System Data
- Week-by-week completion tracking
- Accountability pair assignments
- ALI retake scores (week 8)
- Monthly continuation opt-in status
4. Session Management
4a. Save Points
The spec lists save points per module (good). But:
- What state is captured at a save point? Just "you're at this step," or the full conversation up to that point?
- Resume flow: When a participant returns, does the AI summarize where they left off? Replay the last exchange? Just continue?
- Cross-session memory: If someone completes Module 1 on Monday and returns Thursday for Module 2, what's in context? Full Module 1 conversation? Just the structured artifacts?
4b. Abandonment
The spec doesn't address:
- Mid-exercise abandonment. Participant closes the browser during the Partnership Audit Step 2. When they return, do they see their partial work? Does the AI acknowledge the interruption?
- Mid-conversation abandonment. The AI asked a question, participant never responded. On resume, does the AI re-ask? Move on?
- Long abandonment. Participant doesn't return for 3 weeks. Does the AI acknowledge the gap? Offer to restart the module?
- Permanent abandonment. At what point do we consider someone "dropped"? Do we send re-engagement emails?
4c. Session Duration
The spec says "4-6 sessions over 1-3 weeks" for on-demand (§1). But there's no enforced session structure. Can someone do all 5 modules in one sitting? The 40-minute pacing nudge (§9) suggests yes but gently discourages it. What's the UX when someone ignores the nudge and keeps going for 3 hours?
4d. Multi-Device
Can a participant start on desktop and resume on mobile? The spec doesn't address responsive design or device continuity.
5. The Exercises in Software
5a. Partnership Audit (§8)
"Write down the 10 things you actually spent time on last week."
UI options:
1. Freeform text area. Simple. But how does the AI parse "10 things" from a paragraph? Unreliable.
2. Numbered list (10 text inputs). Structured. But feels rigid. What if someone has 7 things? Or 13?
3. Dynamic list with add/remove. Best UX. But requires the AI to reference items by content, not position.
My recommendation: Dynamic list. Minimum 5, soft max 10, hard max 15. Each item gets a text input + a zone selector (Own/Augment/Automate) that appears in Step 2.
Critical question: When the AI later says "You marked 'led team standup' as Own" — is it reading from structured data or from conversation history? Structured data is reliable. Conversation history is fragile.
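The list bounds I proposed (minimum 5, soft max 10, hard max 15) reduce to a one-function validator:

```typescript
// "nudge" = accepted, but the AI suggests consolidating items.
function validateAuditList(count: number): "too_few" | "ok" | "nudge" | "too_many" {
  if (count < 5) return "too_few";
  if (count <= 10) return "ok";
  if (count <= 15) return "nudge";
  return "too_many";
}
```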
5b. Partnership Map (§10)
This is the most complex exercise UI-wise. For each workflow:
- Break into 6-10 component tasks
- For each task: zone classification + why + ethical guardrail + handoff protocol + escalation trigger
That's a structured form with nested data. A freeform text chat cannot reliably capture this. Options:
- Structured form UI that the AI references and discusses. The AI coaches; the form captures.
- Conversational capture where the AI extracts structured data from freeform dialogue. Unreliable at this complexity level.
- Hybrid: AI guides the conversation, and a side panel shows the emerging Partnership Map as structured data that the participant can edit directly.
My recommendation: Option 3. It's the most ambitious but the only one that produces clean, referenceable, PDF-ready data while maintaining conversational flow.
5c. Discovery Sprint (§9)
"For each workflow, ask three questions..." — This is 3 workflows × 3 dimensions = 9 text responses. Similar question: structured form or freeform? I'd lean toward a matrix/grid UI with text areas.
5d. Readiness Diagnostic (§11)
4 gaps × (1-5 rating + text signals). This is a clear structured form. The rating is a slider or radio group; signals is a text area.
5e. 90-Day Plan (§12)
8 structured fields (30d metrics, 60d, 90d, audience, threshold, scale trigger, kill criteria, story). This is a form with the AI coaching alongside it.
5f. General Pattern
Almost every exercise needs a dual-pane UI: conversation/coaching on one side, structured artifact on the other. The AI talks to you; the artifact builds beside you. This is a significant frontend architecture decision that should be made early.
6. Artifact Generation (PDFs)
"Downloadable PDF" is mentioned in §1 (on-demand artifacts are "Digital — in-app, downloadable PDF").
6a. What Needs to Be PDF-able?
Every deliverable from §14:
1. Partnership Audit + Identity Statement
2. Possibility Map + Priority Discovery
3. Partnership Map (guided) + Partnership Map (unscaffolded)
4. Readiness Diagnostic + Safety Commitment
5. 90-Day Demonstration Plan + Course Commitment
6. The 5D Card (digital version)
7. ALI results (radar chart + profile narrative)
6b. How Structured Does the Data Need to Be?
Very. You can't generate a clean PDF from freeform conversation logs. You need structured data for every artifact. This reinforces my recommendation for the dual-pane UI: the structured side IS the PDF data source.
6c. What Does a Partnership Audit PDF Look Like?
The spec doesn't say. I'd propose:
- Header: Participant name, date, course branding
- Section 1: The 10 activities in a table (Activity | Zone | Notes)
- Section 2: Zone distribution pie chart or bar chart (X Own, Y Augment, Z Automate)
- Section 3: Identity Statement (full text, highlighted)
- Section 4: AI observations (2-3 bullet points from the coaching conversation)
- Footer: "Generated from Leading Through AI™ — LeaderFactor"
We need PDF templates designed for each artifact. This is a design task that should happen in parallel with engineering.
6d. Technical Approach
- Server-side PDF generation (Puppeteer/Playwright rendering HTML templates, or a library like react-pdf)
- Store structured artifact data in the database
- Render on-demand when participant clicks "Download"
- Include the radar chart for ALI — this means SVG/Canvas chart rendering in the PDF pipeline
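The template stage of that pipeline is just structured data in, HTML out; a renderer like Puppeteer then prints the HTML to PDF. A sketch with assumed field names (a real version would escape HTML and apply branding styles):

```typescript
// Assumed artifact shape; a real template would escape user text.
interface AuditArtifact {
  participant: string;
  items: { description: string; zone: string }[];
  identityStatement: string;
}

function auditHtml(a: AuditArtifact): string {
  const rows = a.items
    .map((i) => `<tr><td>${i.description}</td><td>${i.zone}</td></tr>`)
    .join("");
  return `<html><body>
    <h1>Partnership Audit — ${a.participant}</h1>
    <table><tr><th>Activity</th><th>Zone</th></tr>${rows}</table>
    <blockquote>${a.identityStatement}</blockquote>
  </body></html>`;
}
```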
7. The 8-Week System
7a. Trigger Mechanism
§13 says:
- Email: "Weekly, Tuesday morning (participant's timezone). Subject line is the specific action."
- In-app: "Badge on the course platform. Dismissible but persistent."
- No push notifications.
What I need to build:
- Email scheduling system with timezone awareness
- 8 email templates (one per week) with personalized content
- In-app notification/badge system
- Logic to determine which week a participant is in (based on course completion date)
- Opt-out/pause mechanism (not specified but legally required in most jurisdictions)
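The week-determination logic is simple if we ignore the Tuesday send anchor and timezone handling, which the scheduler layer would own. A sketch, assuming Week 1 starts at course completion:

```typescript
// Simplified: ignores the Tuesday anchor and participant timezones.
function weekNumber(completedAt: Date, now: Date): number {
  const msPerDay = 24 * 60 * 60 * 1000;
  const days = Math.floor((now.getTime() - completedAt.getTime()) / msPerDay);
  return Math.min(8, Math.max(1, Math.floor(days / 7) + 1));
}
```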
7b. Conversation Flow
When a participant clicks the email link and lands in Week 3's interaction, what happens?
- Does the AI have the full course history in context?
- Does it reference the specific artifact for that week? (Week 3 = "Present your Partnership Map")
- How long is the interaction? The spec says 20-45 min per week.
- Is this the same conversation interface as the course, or a lighter-weight one?
My recommendation: Load the structured artifact summaries (not full conversation history) + the current week's micro-action prompt into context. The AI should reference specific artifacts by name and content. This keeps token costs at the estimated ~5K/interaction.
7c. Accountability Pairs
"Matched post-enrollment (on-demand)" — How? By industry? By organization? By ALI profile? Random? Manual?
This is a non-trivial matching system if done well, or a simple random assignment if done minimally. The spec doesn't specify.
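If we go minimal, random assignment is a few lines — shuffle, pair adjacent participants, and fold an odd participant out into a trio. A sketch (the trio rule is my assumption):

```typescript
// Minimal MVP matching: Fisher-Yates shuffle, pair adjacent entries,
// fold an odd leftover into the last pair as a group of three.
function assignPairs<T>(ids: T[], random: () => number = Math.random): T[][] {
  const pool = [...ids];
  for (let i = pool.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  const groups: T[][] = [];
  for (let i = 0; i + 1 < pool.length; i += 2) groups.push([pool[i], pool[i + 1]]);
  if (pool.length % 2 === 1) {
    if (groups.length) groups[groups.length - 1].push(pool[pool.length - 1]);
    else groups.push([pool[pool.length - 1]]);
  }
  return groups;
}
```

Anything smarter (by industry, by ALI profile) is a real matching system and belongs in v1.1.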
7d. Week 8 — ALI Retake
- Same 30 items as pre-course?
- Side-by-side comparison view?
- Does the AI conduct the "Reckoning" conversation with both scores visible?
- Is this a separate assessment flow or embedded in the Week 8 conversation?
7e. Monthly Continuation (Post-Week 8)
"Participants opt into a monthly check-in." This is barely specified. What's the content? What's the AI's role? What are the token estimates? Is this free or paid?
8. Scope Risks
8a. Most Technically Ambitious
The AI Thinking Partner as primary coaching mechanism for on-demand. This isn't a chatbot answering questions. It's a multi-session, cross-module, artifact-aware, tone-shifting, scaffolding-removing, error-recovering conversational coach. The quality bar is explicitly called "the existential risk for on-demand" in Stress Test 6 (§18).
Close second: The dual-pane exercise UI (if we build it). Keeping structured artifact data in sync with a conversational AI flow is complex frontend/backend coordination.
8b. What I'd Cut for MVP
- Accountability pair matching — Replace with "share your commitment with someone you trust." Manual, no system needed.
- Monthly continuation post-Week 8 — Ship the 8 weeks. Evaluate continuation later.
- Cohort comparison in ALI — Show individual scores only. Cohort comparison needs a critical mass of data.
- "What do you know about me?" transparency feature — Important ethically but can ship as a simple data export initially, not a conversational feature.
- The participant ternary evaluation ("That landed / That missed / Mixed") — Log it but don't build feedback loops from it in v1.
- PDF generation — v1 could show artifacts in-app only. PDFs in v1.1.
8c. What I'd Push to v1.1
- PDF generation for all artifacts
- Cohort/team analytics dashboard (enterprise)
- Accountability pair matching system
- Monthly continuation program
- Advanced ALI analytics (pattern detection, profile narratives beyond basic templates)
- The "unscaffolded second Partnership Map" as a distinct tracked artifact (vs. just another conversation in M3)
8d. What MUST Be in MVP
- ALI assessment (pre-course) with individual radar chart
- Full 5-module AI coaching experience
- Structured artifact capture for all exercises
- Cross-module memory (structured summaries)
- Save/resume at specified save points
- 8-week email sequence with AI interactions
- ALI retake (week 8) with comparison
9. Red Flags
🔴 Red Flag 1: No TECHNICAL-SPEC.md
The doc references it twice. If it doesn't exist yet, we're building from a course spec and calling it a product spec. That's how you get scope creep, misaligned expectations, and a 6-month project that takes 14 months.
Action needed: Write the technical spec before engineering starts. This review is essentially a table of contents for that document.
🔴 Red Flag 2: AI Conversation Quality Is Unvalidated
Stress Test 6 acknowledges this: "validate Module 1 conversation with real participants before building remaining modules." This is the right instinct. But if validation fails, the fallback options are "iterate on prompts, reduce AI scope, or pivot to facilitator-supported hybrid" — each of which is a fundamentally different product.
Action needed: Build and validate the Module 1 AI experience as a standalone prototype BEFORE committing to the full 5-module on-demand build. This is the single highest-risk item.
🟡 Red Flag 3: "The Medium IS the Message" Creates a Quality Trap
The spec positions the AI Thinking Partner as proof that human-AI partnership works. If the AI experience is mediocre, it undermines the course's core thesis. This means the bar for "good enough" is higher than a typical ed-tech chatbot. We can't ship a C+ AI experience for a course that teaches A+ human-AI partnership.
🟡 Red Flag 4: Exercise Complexity Varies Wildly
The Partnership Audit (list 10 things, classify them) is simple. The Partnership Map (decompose workflows into tasks, allocate each with reasoning, ethical guardrails, handoff protocols, escalation triggers) is an order of magnitude more complex. Building a UI that handles both elegantly — and that the AI can coach through — is a significant design challenge.
🟡 Red Flag 5: Enterprise Features Are Implied But Not Specified
Enterprise is the "real revenue driver" (§17: 100 seats at $249 = $24,900). But enterprise features aren't specified: team rollup reporting, admin dashboards, bulk enrollment, SSO, SCORM/LTI integration, manager visibility into team progress, aggregate ALI scores... This will be the first thing enterprise buyers ask about.
🟡 Red Flag 6: The 8-Week System Is a Second Product
It has its own content, its own AI interactions, its own notification system, its own pacing logic, its own scaffolding removal schedule, and its own accountability mechanism. Calling it a "reinforcement system" makes it sound like an email drip. It's actually a lightweight, 8-week AI coaching program. Scope it accordingly.
🟢 Green Flag: Cost Model Is Excellent
At ~$2.22/participant all-in and $499/seat pricing, the AI costs are negligible. Even at 5x overrun, it's fine. This means we can be generous with AI interaction depth without worrying about margin. That's rare and valuable.
🟢 Green Flag: Content Quality Is Exceptional
Whatever we build, the content is ready. The exercises are well-designed, the framework is coherent, the progression is logical, and the voice guidance is specific enough to write good prompts. The course spec is one of the best I've read. The gap is purely in product/engineering specification.
Summary: Top 5 Actions Before Engineering Starts
1. Write TECHNICAL-SPEC.md. Data models, API contracts, UI wireframes, conversation state machines, infrastructure choices. This review is the outline.
2. Build and validate Module 1 as a prototype. If the AI coaching experience doesn't work for Module 1 (the simplest module), it won't work for Module 3 (the most complex). Validate before committing to the full build.
3. Define the memory schema. For each exercise, specify exactly what structured data is extracted and persisted for cross-module reference. This is the backbone of the AI experience.
4. Design the dual-pane UI pattern. Conversation + structured artifact is the core interaction pattern. Get this right and every module follows the same pattern. Get it wrong and every module is a custom build.
5. Scope the enterprise feature set. Enterprise is the revenue engine. Don't build the individual product and then try to bolt on enterprise features. Design for enterprise from day one, even if enterprise-specific features ship later.
Review complete. Happy to go deeper on any section. This is a strong spec that needs a strong technical counterpart. Let's build the right thing.
— The Builder