
📄 Panel Reviews


Panel Review: The Builder

Reviewer: Senior Full-Stack Engineer / Technical Product Manager
Perspective: "Drew reading this to figure out what we need to build"
Document Reviewed: COURSE-SPEC-UNIFIED.md (March 2, 2026)
Review Date: March 2, 2026


1. Can I Build From This?

Short answer: I can build the facilitated workshop tooling from this. I cannot build the on-demand product without a separate technical spec — and the doc acknowledges this ("Technical architecture, enterprise features, and engineering specs live in TECHNICAL-SPEC.md").

What's strong:
- The content is airtight. Module sequencing, exercise instructions, AI voice/tone guidance, timing — all clearly specified. A content team could build the deck and video scripts directly from this.
- The Two Modalities table (§1) cleanly separates delivery concerns.
- Save points are explicitly listed per module — that's a buildable feature boundary.

Where my team gets stuck:

  1. No UI wireframes or interaction patterns. The spec describes what participants do but not how they interact with the interface. "Write down 10 things" — is that a text area? Ten separate input fields? A dynamic list with add/remove? This matters for every exercise.

  2. AI conversation flow is described narratively, not as state machines. I get the spirit of "AI engages with 2-3 specific activities," but I need to know: Is this a structured sequence (prompt → response → prompt → response × N turns)? Or is it freeform chat with guardrails? The answer radically changes the build.

  3. No API or integration spec. The doc references email notifications, in-app badges, downloadable PDFs, radar charts, accountability pair matching — each of these is a distinct system. None have technical specifications.

  4. The assessment (ALI) needs its own spec. 30 items, 6-point Likert, scoring algorithm, dimension aggregation, profile narrative generation, radar chart rendering, cohort comparison logic — this is a standalone product. §5 gives me content direction but zero technical detail.

  5. "See TECHNICAL-SPEC.md" — Referenced at the bottom and in Stress Test 6. I haven't seen this document. If it exists and covers what I'm flagging here, great. If it doesn't exist yet, that's the most critical gap.

Verdict: This is a course spec, not a product spec. It's excellent at what it is. But my team needs a product spec layered on top of it to start writing code.


2. AI Thinking Partner — Can I Write System Prompts?

What's provided (§6):
- Voice principles table (6 principles with positive/negative examples)
- Tone calibration per module (5 distinct tones)
- Scaffolding removal schedule (4 phases)
- Error recovery protocol (3 scenarios with example responses)
- Dissent protocol (2 scenarios)
- Boundaries list (8 "never" rules)
- Word count caps (200 normal, 400 synthesis)
- Meta-awareness moments (2-3 per course)
- Participant evaluation ternary ("That landed / That missed / Mixed")

Can I write system prompts from this? About 60% of the way there. Here's what's missing:

2a. Conversation Architecture

The spec gives me example exchanges but not the structure of a conversation. For Module 1's Partnership Audit (§8), the turn-by-turn flow is unspecified.

What I'd build: A turn-level conversation flow for each exercise, specifying: trigger → AI behavior → expected participant action → next trigger. The spec gives me the "happy path" dialogue. I need the state machine.

2b. Cross-Module Memory — Technical Implementation

The spec says the AI is "Remembering" (§6 Voice Principles) and "references prior answers across modules." This is the single hardest technical requirement in the entire spec.

Implementation options:

  1. Full conversation history in context window. At ~104K tokens for the full course (§17), this is feasible with current models (128K-200K context windows). But it's expensive per-call and gets unwieldy. The 8-week system adds another ~55K.

  2. Structured summary objects. After each exercise, extract key data into a structured JSON object (e.g., Partnership Audit results, Identity Statement text, specific workflow names). Feed these into each subsequent prompt as "participant profile." Much cheaper, more reliable.

  3. Hybrid. Structured summaries for cross-module reference + recent conversation history for in-module continuity.

My recommendation: Option 3. But the spec doesn't tell me what to extract. I need a "memory schema" — for each exercise, what data points must persist for downstream reference? For example:
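One possible shape for that schema — the exercise keys and field names are illustrative guesses, not from the spec:

```python
# Hypothetical memory schema: for each exercise, the fields that must
# persist for downstream AI reference. Values describe the expected content.
MEMORY_SCHEMA = {
    "partnership_audit": {
        "items": "list of {description, zone}",
        "hardest_to_classify": "the one item the participant struggled with",
        "zone_counts": "{own, augment, automate}",
    },
    "identity_statement": {
        "statement_text": "the final statement, verbatim",
        "pattern_used": "meaning_maker | judgment_owner | trust_builder | custom",
    },
}

def build_profile_context(memory: dict) -> str:
    """Flatten persisted exercise summaries into a compact 'participant
    profile' block for the system prompt (option 3's structured half)."""
    lines = []
    for exercise, fields in memory.items():
        lines.append(f"## {exercise}")
        lines.extend(f"- {key}: {value}" for key, value in fields.items())
    return "\n".join(lines)
```

The point is that something like build_profile_context, not raw transcripts, becomes the cross-module memory; conversation history stays in-module only.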

This is a spec gap. Without it, my team will guess what to remember, and the AI will either reference things poorly or bloat the context window.

2c. Token Management

The cost model (§17) estimates ~104K tokens for the full course and ~20K for the heaviest module (Design). These are reasonable estimates. But they are estimates, not enforced budgets: the spec says nothing about per-conversation caps, summarization triggers, or what happens when a participant runs long.

2d. Scaffolding Removal — How to Implement

The spec describes scaffolding removal qualitatively (§6): "Full scaffolding → Moderate → Light → Minimal." But how does this translate to prompt engineering?

Options:
- Separate system prompts per module (simplest, most controllable)
- A single system prompt with a "scaffolding level" parameter
- Dynamic prompt modification based on participant behavior

I'd go with separate system prompts per module. But I need to know: does the AI's scaffolding level also adapt within a module based on participant skill? The "Teach it back" moment in Module 4 (§11) implies yes — but that's a much harder problem.
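A sketch of the separate-prompts-per-module option. The prompt text, the level wording, and the assumption that Module 5 keeps Module 4's minimal scaffolding are all mine, not the spec's:

```python
# Hypothetical scaffolding preambles, one per removal phase (§6 names the
# four phases; the wording here is illustrative).
SCAFFOLDING = {
    1: "Full scaffolding: explain each step, offer examples before asking.",
    2: "Moderate: explain on request; ask before offering examples.",
    3: "Light: ask first, explain only if the participant is stuck.",
    4: "Minimal: coach by questions only; no unprompted explanations.",
}

BASE_PROMPT = "You are the course's AI Thinking Partner. Stay under 200 words."

def system_prompt(module: int, tone: str) -> str:
    """Assemble a per-module system prompt from a shared base, the module's
    tone, and its scaffolding phase. Assumes module N uses phase min(N, 4)."""
    level = min(module, 4)
    return f"{BASE_PROMPT}\nTone: {tone}\nScaffolding: {SCAFFOLDING[level]}"
```

The advantage of this structure is that scaffolding becomes a parameter we can tune per module without rewriting five whole prompts.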


3. Data Model Gaps

The spec implies the following data structures without specifying them:

3a. Participant Profile

- user_id
- role, function, team_size, team_type
- org_ai_maturity
- provocation_response (3 sentences)
- ali_scores_pre: { overall, define, discover, design, develop, demonstrate }
- ali_scores_post: { same dimensions }
- ali_profile_narrative: text
- current_module, current_step
- created_at, last_active_at

3b. Exercise Artifacts

Each exercise produces output. But what's the schema? The spec doesn't say. My best guess:

Partnership Audit:

- items[]: { description, zone: own|augment|automate, hardest_to_classify: bool }
- own_count, augment_count, automate_count
- ai_observations: text[]

Identity Statement:

- statement_text
- pattern_used: meaning_maker|judgment_owner|trust_builder|custom

Possibility Map:

- workflows[]: {
    name,
    source: "partnership_audit_item_N",
    efficiency: text,
    augmentation: text,
    transformation: text
  }
- priority_discovery: text

Partnership Map:

- workflow_name
- source_workflow: FK to possibility_map
- tasks[]: {
    description,
    zone: own|augment|automate,
    why: text,
    ethical_guardrail: text,
    handoff_protocol: text,
    escalation_trigger: text
  }
- is_guided: bool

Readiness Diagnostic:

- gaps[]: {
    type: psych_safety|conceptual|technical|identity,
    rating: 1-5,
    signals: text
  }
- biggest_gap: FK
- actions[]: { description, timeline: "30 days" }

Safety Commitment:

- commitment_text
- is_specific: bool (validated by AI)

90-Day Plan:

- target_partnership_map: FK
- metrics_30d: text
- metrics_60d: text
- metrics_90d: text
- audience: text
- success_threshold: text
- scale_trigger: text
- kill_criteria: text
- story: text

Course Commitment:

- full_statement: text
- per_step: { define, discover, design, develop, demonstrate }: text

Questions for the team:
- Are these all freeform text, or do we want structured data for analytics/reporting?
- Enterprise buyers will want aggregate reporting. What rollup views do we need?
- Does the AI need to query this data, or just read it from context?
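If we go structured (which the analytics and reporting questions above argue for), the Partnership Audit schema from 3b might look like this in code — a sketch whose field names mirror my guesses above, not a confirmed model:

```python
from dataclasses import dataclass, field
from enum import Enum

class Zone(str, Enum):
    OWN = "own"
    AUGMENT = "augment"
    AUTOMATE = "automate"

@dataclass
class AuditItem:
    description: str
    zone: Zone
    hardest_to_classify: bool = False

@dataclass
class PartnershipAudit:
    items: list[AuditItem] = field(default_factory=list)
    ai_observations: list[str] = field(default_factory=list)

    def zone_counts(self) -> dict[str, int]:
        """Derive own/augment/automate counts rather than storing them."""
        counts = {zone.value: 0 for zone in Zone}
        for item in self.items:
            counts[item.zone.value] += 1
        return counts
```

Note the design choice: counts are derived, not stored, so the artifact can't drift out of sync when a participant edits an item.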

3c. Conversation History

3d. ALI Assessment Schema
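The spec gives no schema here either. As an illustration of the scoring half only — assuming the 30 items map evenly onto the five dimensions, which is my assumption; the real item-to-dimension mapping would come from the assessment spec:

```python
from statistics import mean

# Hypothetical: 30 items, 6 per dimension, 6-point Likert (1-6, no neutral).
DIMENSIONS = ["define", "discover", "design", "develop", "demonstrate"]

def score_ali(responses: dict[int, int]) -> dict[str, float]:
    """Aggregate 30 Likert responses into 5 dimension scores plus overall.

    Assumes items 1-6 map to 'define', 7-12 to 'discover', and so on --
    purely illustrative; the actual mapping is unspecified.
    """
    if len(responses) != 30 or not all(1 <= v <= 6 for v in responses.values()):
        raise ValueError("expected 30 responses on a 1-6 scale")
    scores = {}
    for i, dim in enumerate(DIMENSIONS):
        items = [responses[q] for q in range(i * 6 + 1, i * 6 + 7)]
        scores[dim] = round(mean(items), 2)
    scores["overall"] = round(mean(scores.values()), 2)
    return scores
```

Even this trivial version raises spec questions: are dimensions equally weighted, are any items reverse-scored, and is "overall" a mean of dimensions or of all 30 items?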

3e. 8-Week System Data


4. Session Management

4a. Save Points

The spec lists save points per module (good). But it doesn't define what state gets captured at each one, or what resuming mid-exercise looks like.

4b. Abandonment

The spec doesn't address abandonment: what happens when a participant stalls mid-module, how long we keep a stale session, or whether and how we re-engage them.

4c. Session Duration

The spec says "4-6 sessions over 1-3 weeks" for on-demand (§1). But there's no enforced session structure. Can someone do all 5 modules in one sitting? The 40-minute pacing nudge (§9) suggests yes but gently discourages it. What's the UX when someone ignores the nudge and keeps going for 3 hours?

4d. Multi-Device

Can a participant start on desktop and resume on mobile? The spec doesn't address responsive design or device continuity.


5. The Exercises in Software

5a. Partnership Audit (§8)

"Write down the 10 things you actually spent time on last week."

UI options:
1. Freeform text area. Simple. But how does the AI parse "10 things" from a paragraph? Unreliable.
2. Numbered list (10 text inputs). Structured. But feels rigid. What if someone has 7 things? Or 13?
3. Dynamic list with add/remove. Best UX. But requires the AI to reference items by content, not position.

My recommendation: Dynamic list. Minimum 5, soft max 10, hard max 15. Each item gets a text input + a zone selector (Own/Augment/Automate) that appears in Step 2.

Critical question: When the AI later says "You marked 'led team standup' as Own" — is it reading from structured data or from conversation history? Structured data is reliable. Conversation history is fragile.
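For what it's worth, the min-5 / soft-10 / hard-15 bounds I'm recommending are trivial to enforce; a sketch (thresholds and nudge copy are mine):

```python
def validate_items(items: list[str]) -> tuple[bool, str]:
    """Enforce the dynamic-list bounds: minimum 5 items, soft max 10
    (nudge but allow), hard max 15 (block). Returns (can_continue, message)."""
    count = len([item for item in items if item.strip()])
    if count < 5:
        return False, f"Add at least {5 - count} more item(s) to continue."
    if count > 15:
        return False, "That's plenty. Trim to 15 or fewer."
    if count > 10:
        return True, "You can continue, but consider focusing on your top 10."
    return True, ""
```

The hard question isn't the validation; it's the one above about whether the AI references items from structured data or transcript.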

5b. Partnership Map (§10)

This is the most complex exercise UI-wise. For each workflow:
- Break into 6-10 component tasks
- For each task: zone classification + why + ethical guardrail + handoff protocol + escalation trigger

That's a structured form with nested data. A freeform text chat cannot reliably capture this. Options:

  1. Structured form UI that the AI references and discusses. The AI coaches; the form captures.
  2. Conversational capture where the AI extracts structured data from freeform dialogue. Unreliable at this complexity level.
  3. Hybrid: AI guides the conversation, and a side panel shows the emerging Partnership Map as structured data that the participant can edit directly.

My recommendation: Option 3. It's the most ambitious but the only one that produces clean, referenceable, PDF-ready data while maintaining conversational flow.

5c. Discovery Sprint (§9)

"For each workflow, ask three questions..." — This is 3 workflows × 3 dimensions = 9 text responses. Similar question: structured form or freeform? I'd lean toward a matrix/grid UI with text areas.

5d. Readiness Diagnostic (§11)

4 gaps × (1-5 rating + text signals). This is a clear structured form. The rating is a slider or radio group; signals is a text area.

5e. 90-Day Plan (§12)

8 structured fields (30d metrics, 60d, 90d, audience, threshold, scale trigger, kill criteria, story). This is a form with the AI coaching alongside it.

5f. General Pattern

Almost every exercise needs a dual-pane UI: conversation/coaching on one side, structured artifact on the other. The AI talks to you; the artifact builds beside you. This is a significant frontend architecture decision that should be made early.


6. Artifact Generation (PDFs)

"Downloadable PDF" is mentioned in §1 (on-demand artifacts are "Digital — in-app, downloadable PDF").

6a. What Needs to Be PDF-able?

Every deliverable from §14:
1. Partnership Audit + Identity Statement
2. Possibility Map + Priority Discovery
3. Partnership Map (guided) + Partnership Map (unscaffolded)
4. Readiness Diagnostic + Safety Commitment
5. 90-Day Demonstration Plan + Course Commitment
6. The 5D Card (digital version)
7. ALI results (radar chart + profile narrative)

6b. How Structured Does the Data Need to Be?

Very. You can't generate a clean PDF from freeform conversation logs. You need structured data for every artifact. This reinforces my recommendation for the dual-pane UI: the structured side IS the PDF data source.

6c. What Does a Partnership Audit PDF Look Like?

The spec doesn't say. We need PDF templates designed for each artifact; that's a design task that should happen in parallel with engineering.

6d. Technical Approach
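The spec leaves this open. One plausible pipeline — an assumption, not a decision: render each artifact from its structured data into an HTML template, then hand the HTML to a headless HTML-to-PDF renderer. The structured-to-HTML half is straightforward once the data model exists; a sketch with hypothetical field names:

```python
from html import escape

def audit_to_html(items: list[dict]) -> str:
    """Render Partnership Audit data to HTML -- the input to a headless
    HTML-to-PDF step. 'description' and 'zone' fields are hypothetical."""
    rows = "".join(
        f"<tr><td>{escape(item['description'])}</td>"
        f"<td>{escape(item['zone'])}</td></tr>"
        for item in items
    )
    return (
        "<html><body><h1>Partnership Audit</h1>"
        f"<table><tr><th>Activity</th><th>Zone</th></tr>{rows}</table>"
        "</body></html>"
    )
```

This reinforces the dual-pane argument: if the structured artifact already exists, the PDF is a template problem, not an extraction problem.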


7. The 8-Week System

7a. Trigger Mechanism

§13 says:
- Email: "Weekly, Tuesday morning (participant's timezone). Subject line is the specific action."
- In-app: "Badge on the course platform. Dismissible but persistent."
- No push notifications.

What I need to build:
- Email scheduling system with timezone awareness
- 8 email templates (one per week) with personalized content
- In-app notification/badge system
- Logic to determine which week a participant is in (based on course completion date)
- Opt-out/pause mechanism (not specified but legally required in most jurisdictions)
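As an example of the timezone-awareness piece, a sketch that computes the next Tuesday-morning send for a participant. The 9:00 send hour is my assumption; the spec says only "Tuesday morning":

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

def next_tuesday_9am(now_utc: datetime, tz_name: str) -> datetime:
    """Next Tuesday 9:00 in the participant's timezone.

    now_utc must be timezone-aware; tz_name is an IANA name like
    'America/New_York'.
    """
    local = now_utc.astimezone(ZoneInfo(tz_name))
    days_ahead = (1 - local.weekday()) % 7  # Monday=0, so Tuesday=1
    candidate = (local + timedelta(days=days_ahead)).replace(
        hour=9, minute=0, second=0, microsecond=0
    )
    if candidate <= local:  # already past 9am this Tuesday
        candidate += timedelta(days=7)
    return candidate
```

A real implementation also needs to handle DST transitions and participants who never set a timezone, which is exactly the kind of detail the technical spec should pin down.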

7b. Conversation Flow

When a participant clicks the email link and lands in Week 3's interaction, what happens?

My recommendation: Load the structured artifact summaries (not full conversation history) + the current week's micro-action prompt into context. The AI should reference specific artifacts by name and content. This keeps token costs at the estimated ~5K/interaction.
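Concretely, that context assembly could be as simple as the following. The message shape follows the common chat-completion convention, and the wording is illustrative:

```python
def week_context(profile: str, week: int, micro_action: str) -> list[dict]:
    """Assemble the prompt for an 8-week interaction: structured artifact
    summaries plus the current week's micro-action, no full history."""
    system = (
        "You are the course's AI Thinking Partner, in week "
        f"{week} of the 8-week reinforcement system.\n"
        f"Participant profile:\n{profile}\n"
        "Reference artifacts by name and content. Keep replies under 200 words."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": micro_action},
    ]
```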

7c. Accountability Pairs

"Matched post-enrollment (on-demand)" — How? By industry? By organization? By ALI profile? Random? Manual?

This is a non-trivial matching system if done well, or a simple random assignment if done minimally. The spec doesn't specify.

7d. Week 8 — ALI Retake

7e. Monthly Continuation (Post-Week 8)

"Participants opt into a monthly check-in." This is barely specified. What's the content? What's the AI's role? What are the token estimates? Is this free or paid?


8. Scope Risks

8a. Most Technically Ambitious

The AI Thinking Partner as primary coaching mechanism for on-demand. This isn't a chatbot answering questions. It's a multi-session, cross-module, artifact-aware, tone-shifting, scaffolding-removing, error-recovering conversational coach. The quality bar is explicitly called "the existential risk for on-demand" in Stress Test 6 (§18).

Close second: The dual-pane exercise UI (if we build it). Keeping structured artifact data in sync with a conversational AI flow is complex frontend/backend coordination.

8b. What I'd Cut for MVP

  1. Accountability pair matching — Replace with "share your commitment with someone you trust." Manual, no system needed.
  2. Monthly continuation post-Week 8 — Ship the 8 weeks. Evaluate continuation later.
  3. Cohort comparison in ALI — Show individual scores only. Cohort comparison needs a critical mass of data.
  4. "What do you know about me?" transparency feature — Important ethically but can ship as a simple data export initially, not a conversational feature.
  5. The participant ternary evaluation ("That landed / That missed / Mixed") — Log it but don't build feedback loops from it in v1.
  6. PDF generation — v1 could show artifacts in-app only. PDFs in v1.1.

8c. What I'd Push to v1.1

  1. PDF generation for all artifacts
  2. Cohort/team analytics dashboard (enterprise)
  3. Accountability pair matching system
  4. Monthly continuation program
  5. Advanced ALI analytics (pattern detection, profile narratives beyond basic templates)
  6. The "unscaffolded second Partnership Map" as a distinct tracked artifact (vs. just another conversation in M3)

8d. What MUST Be in MVP

  1. ALI assessment (pre-course) with individual radar chart
  2. Full 5-module AI coaching experience
  3. Structured artifact capture for all exercises
  4. Cross-module memory (structured summaries)
  5. Save/resume at specified save points
  6. 8-week email sequence with AI interactions
  7. ALI retake (week 8) with comparison

9. Red Flags

🔴 Red Flag 1: No TECHNICAL-SPEC.md

The doc references it twice. If it doesn't exist yet, we're building from a course spec and calling it a product spec. That's how you get scope creep, misaligned expectations, and a 6-month project that takes 14 months.

Action needed: Write the technical spec before engineering starts. This review is essentially a table of contents for that document.

🔴 Red Flag 2: AI Conversation Quality Is Unvalidated

Stress Test 6 acknowledges this: "validate Module 1 conversation with real participants before building remaining modules." This is the right instinct. But if validation fails, the fallback options are "iterate on prompts, reduce AI scope, or pivot to facilitator-supported hybrid" — each of which is a fundamentally different product.

Action needed: Build and validate the Module 1 AI experience as a standalone prototype BEFORE committing to the full 5-module on-demand build. This is the single highest-risk item.

🟡 Red Flag 3: "The Medium IS the Message" Creates a Quality Trap

The spec positions the AI Thinking Partner as proof that human-AI partnership works. If the AI experience is mediocre, it undermines the course's core thesis. This means the bar for "good enough" is higher than a typical ed-tech chatbot. We can't ship a C+ AI experience for a course that teaches A+ human-AI partnership.

🟡 Red Flag 4: Exercise Complexity Varies Wildly

The Partnership Audit (list 10 things, classify them) is simple. The Partnership Map (decompose workflows into tasks, allocate each with reasoning, ethical guardrails, handoff protocols, escalation triggers) is an order of magnitude more complex. Building a UI that handles both elegantly — and that the AI can coach through — is a significant design challenge.

🟡 Red Flag 5: Enterprise Features Are Implied But Not Specified

Enterprise is the "real revenue driver" (§17: 100 seats at $249 = $24,900). But enterprise features aren't specified: team rollup reporting, admin dashboards, bulk enrollment, SSO, SCORM/LTI integration, manager visibility into team progress, aggregate ALI scores... This will be the first thing enterprise buyers ask about.

🟡 Red Flag 6: The 8-Week System Is a Second Product

It has its own content, its own AI interactions, its own notification system, its own pacing logic, its own scaffolding removal schedule, and its own accountability mechanism. Calling it a "reinforcement system" makes it sound like an email drip. It's actually a lightweight, 8-week AI coaching program. Scope it accordingly.

🟢 Green Flag: Cost Model Is Excellent

At ~$2.22/participant all-in and $499/seat pricing, the AI costs are negligible. Even at 5x overrun, it's fine. This means we can be generous with AI interaction depth without worrying about margin. That's rare and valuable.

🟢 Green Flag: Content Quality Is Exceptional

Whatever we build, the content is ready. The exercises are well-designed, the framework is coherent, the progression is logical, and the voice guidance is specific enough to write good prompts. The course spec is one of the best I've read. The gap is purely in product/engineering specification.


Summary: Top 5 Actions Before Engineering Starts

  1. Write TECHNICAL-SPEC.md. Data models, API contracts, UI wireframes, conversation state machines, infrastructure choices. This review is the outline.

  2. Build and validate Module 1 as a prototype. If the AI coaching experience doesn't work for Module 1 (the simplest module), it won't work for Module 3 (the most complex). Validate before committing to the full build.

  3. Define the memory schema. For each exercise, specify exactly what structured data is extracted and persisted for cross-module reference. This is the backbone of the AI experience.

  4. Design the dual-pane UI pattern. Conversation + structured artifact is the core interaction pattern. Get this right and every module follows the same pattern. Get it wrong and every module is a custom build.

  5. Scope the enterprise feature set. Enterprise is the revenue engine. Don't build the individual product and then try to bolt on enterprise features. Design for enterprise from day one, even if enterprise-specific features ship later.


Review complete. Happy to go deeper on any section. This is a strong spec that needs a strong technical counterpart. Let's build the right thing.

— The Builder

Panel Review: The Buyer

Reviewer Profile: VP of Talent Development, Fortune 500 (12,000 employees, 800 people managers)
Budget: $2M annual L&D
Prior vendors: DDI, FranklinCovey, BetterUp
Context: CEO mandate to "figure out AI for our leaders"
Document reviewed: COURSE-SPEC-UNIFIED.md (March 2, 2026)


1. Would I Buy This?

Short answer: I'm very interested. I'd pilot it. I wouldn't do a full enterprise rollout this quarter.

What's compelling:

The positioning is the strongest thing in this spec. "The bottleneck is not technology. It is leadership." That sentence would land in my executive committee. My CEO didn't say "buy an AI tool." He said "figure out AI for our leaders." This is the only product I've seen that directly answers that brief. DDI would sell me a leadership development program that mentions AI. FranklinCovey would add an AI module to an existing course. CCL would offer a custom executive program at $15K/head. None of them have a framework built from the ground up for AI leadership specifically.

The 5D Model is clean and memorable. I could draw it on a whiteboard. My CLO could draw it on a whiteboard. That matters more than people realize — if my leaders can't explain the framework to their teams, the investment dies in the room where it was taught.

The "front door and foundation" strategy (Section 1) is smart. I'm buying because my AI initiatives are failing. My leaders are receiving a leadership transformation. Both things are true. That's how you sell to me and get outcomes.

What makes me hesitate:

  1. LeaderFactor is not a household name in my world. I know DDI, CCL, FranklinCovey, Korn Ferry. If I bring "LeaderFactor" to my CHRO, she'll ask "who?" The 4 Stages of Psychological Safety has traction, but it's not at the recognition level of Crucial Conversations or DiSC. That means I'm taking a reputational risk by choosing them over a safe pick.

  2. No case studies. This is a brand new course. Zero deployments. Zero enterprise references. When I buy from DDI, I get 15 case studies from companies like mine. Here I'd be an early adopter. That's either exciting or terrifying depending on my risk tolerance. Right now, with my CEO breathing down my neck, it's terrifying.

  3. The on-demand product is unproven. The spec itself flags this (Stress Test 6): "This is the existential risk for on-demand." The fact that the authors are honest about it is refreshing, but it doesn't reduce my risk. If I buy 200 seats and the AI conversations feel like a chatbot, I've wasted $50K and my credibility.

  4. No competitive moat against the big players. DDI or FranklinCovey could build a similar course in 6-12 months. The framework is good, but frameworks can be replicated. LeaderFactor's window is probably 12-18 months before the big houses catch up. That's not a reason not to buy — it's a reason to negotiate hard on pricing.

Comparison to alternatives:

Dimension | LeaderFactor | DDI | FranklinCovey | CCL
AI-specific framework | Yes (5D Model) | No (bolt-on module) | No (bolt-on module) | Custom design possible
Proprietary assessment | ALI (new, unvalidated) | Validated assessments | Validated assessments | Validated assessments
Enterprise readiness | v2 (not yet) | Full | Full | Full
Price per seat (enterprise) | $249-$349 | $300-$500 | $200-$400 | $1,000+
Track record | New course | Decades | Decades | Decades
AI-integrated delivery | Yes | No | No | No

LeaderFactor wins on specificity and innovation. Loses on track record and enterprise readiness. It's a higher-upside, higher-risk bet.


2. Pricing Reaction

$499 on-demand individual: Too high for individual buyers but irrelevant to me. I'm not buying individual seats. This is a marketing number that anchors enterprise pricing.

$249-$349/seat enterprise: This is the sweet spot. At $249/seat for 200 leaders, that's ~$50K. Well within my discretionary budget β€” I don't even need CFO approval at that level. At $349, I'd push back and ask for $279. The range feels like there's negotiation room, which is fine.

$1,495 workshop: Competitive with CCL and Crucial Learning. Not cheap, but not shocking. If I'm buying workshops, I'm buying 3-5 sessions for my top 100-150 leaders. That's $150K-$225K. Now I need CFO approval. The ROI story from the ALI assessment would help here.

$2,499 certification: In line with DiSC and CCL. Fair. (More on this in Section 7.)

My likely entry point: I'd start with a pilot workshop — one LF-led session ($6,500 + $249/seat × 25 = $12,725) for a cohort of 25 high-potential leaders. Low risk, real data. If it lands, I'd move to enterprise on-demand seats for the broader population and certify 3-5 internal facilitators for workshops.

What I'd actually spend in Year 1:
- Pilot workshop: ~$13K
- If pilot succeeds β†’ 200 on-demand seats: ~$50K
- 3-5 facilitator certifications: ~$7.5K-$12.5K
- Total: $70K-$75K

That's under 4% of my L&D budget. Very doable. The question is whether the pilot succeeds.


3. The Two-Modality Pitch

Does "same course, two formats" make sense? Yes, actually. Here's why.

My organization has three populations:
1. Top 100 leaders (VPs+) — they get workshops. High-touch, facilitator-led, in-person or virtual.
2. Next 300 leaders (directors, senior managers) — they could go either way. Workshops if budget allows, on-demand if not.
3. Remaining 400 people managers — on-demand is the only realistic option at scale.

Having a single framework across all three populations is genuinely valuable. When my VP of Engineering and a frontline manager in operations are using the same vocabulary (Partnership Map, Three Zones, Four Readiness Gaps), that's organizational alignment. DDI gives me that with their suite, but it takes 3-4 courses to build that kind of shared language.

My concern: The on-demand experience needs to be genuinely good, not a watered-down version of the workshop. The spec says the AI Thinking Partner carries the coaching load in on-demand. If it works, this is differentiated. If it doesn't, I've given 400 managers a glorified e-learning module. Stress Test 6 in the spec confirms the authors know this risk.

The dilution concern: I don't think the workshop is diluted. 3h 45m with a facilitator, pair work, and the AI as a supporting tool — that's a full experience. The on-demand might feel overbuilt for people used to 30-minute LinkedIn Learning modules, but that's a framing problem, not a product problem. Position it as a "cohort experience" or "guided program," not "on-demand course."


4. The AI Thinking Partner

Is it a selling point or a risk? Both. In that order.

The selling point: "The medium IS the message" is clever and true. You're teaching leaders to work with AI by having them work with AI. That's experiential learning at its best. No other leadership course does this. When I pitch this to my CEO, I can say: "They don't just learn about AI leadership — they practice it, in the course itself." That's a soundbite that lands.

The risk:

  1. Legal/Compliance: My legal team will ask three questions: (a) Where does participant data go? (b) Is it stored? For how long? (c) Can it be subpoenaed? The spec mentions "data transparency" (Section 6) but says enterprise features like SOC 2 and data privacy are "v2." That's a problem. I can't deploy an AI system that processes my leaders' reflections about their teams without answering these questions. More on this in Section 6.

  2. Participant trust: The spec handles this well. The voice principles (Section 6) are thoughtful — "substantive, challenging without threatening, remembering, honest about itself." The error recovery protocol is excellent. The "respect for dissent" protocol is genuinely impressive — most AI products don't account for participants who disagree with the premises. If the implementation matches the spec, participants will trust it. But "if" is doing a lot of work in that sentence.

  3. Quality consistency: A facilitator can read the room. An AI can't — not really. The spec's conversation quality stress test acknowledges this. I'd want to see a demo before committing. Not a scripted demo — a real conversation with the AI where I play a skeptical participant. If it handles me well, I'm in. If it gives me chatbot energy, I'm out.

"The medium is the message" — too clever? No. It's actually the most honest positioning possible. But don't lead with it in the sales conversation. Lead with the framework and the business problem. Let the AI Thinking Partner be a discovery during the pitch, not the headline.


5. The Assessment (ALI)

Is a proprietary assessment valuable to me? Extremely β€” if it's validated.

Here's what I'd use it for:
1. Pre-training diagnostic: "Here's where your leaders are today." That data alone is worth the conversation.
2. Post-training measurement: Pre/post ALI scores give me a story for my CFO. "We moved from 2.1 to 3.4 across 200 leaders in 90 days." That's a defensible ROI narrative.
3. Organizational heat map: If I can aggregate ALI data by function, level, and business unit, I can see where AI leadership maturity is strongest and weakest. That's strategic intelligence.
4. Ongoing pulse: The 8-week retake gives me a second data point. If I add annual retakes, I have a longitudinal story.

The CFO justification story: This is where the ALI is powerful. My CFO doesn't care about leadership development in the abstract. She cares about measurable outcomes. "We assessed 200 leaders. Average ALI score was 1.8 out of 6. After the program, it was 3.2. Leaders in the top quartile of ALI improvement showed 23% faster AI adoption in their teams." That's a story she'll fund. The ALI makes that story possible.

My concerns:
1. Validation. The spec doesn't mention psychometric validation — reliability, construct validity, predictive validity. DDI's assessments have decades of validation data. The ALI has zero. For a pilot, this is fine. For an enterprise diagnostic, I need at least internal consistency data and some evidence that ALI scores predict actual AI adoption outcomes. I'd ask LeaderFactor: "What's your validation timeline?"

2. Normative data. The sample items are good (Section 5), and I like the 6-point Likert with no neutral option. But without normative data, my leaders' scores are meaningless in isolation. "You scored 2.4" means nothing without "and the average for leaders in your industry/level is 2.1." LeaderFactor will build this over time, but early adopters don't get it.

3. Admin reporting. Can I see aggregate data by business unit? Can I export it? Can my HRBP pull a report for her VP? The spec doesn't address this, and it's essential for enterprise use.


6. Enterprise Concerns

This is the section that could kill the deal.

The spec says enterprise features are "v2" (referenced in Section 21 as part of TECHNICAL-SPEC.md). Here's what I need and when:

Requirement | Need level | Can I wait?
SSO/SAML | Must-have | Not for 200+ seats. My IT team won't provision individual accounts.
SCORM/LTI | Nice-to-have | Yes, if there's a standalone platform with admin controls.
SOC 2 | Must-have for regulated data | Depends. If participant responses are anonymized and not stored long-term, maybe. If the AI stores personal reflections about team members, absolutely not.
Data privacy (GDPR, etc.) | Must-have | No. I have employees in the EU.
Admin reporting | Must-have | Not for anything beyond a pilot.
Data residency | Important | Depends on what data is processed.

The minimum viable enterprise package for me to say yes to a pilot (25-50 seats):
- Basic admin dashboard (enrollment, completion, aggregate ALI scores)
- Clear data retention policy (what's stored, how long, who can access)
- Ability to delete participant data on request
- No SSO required for pilot, but committed roadmap for 200+ deployment

The minimum for a 200+ seat deployment:
- SSO/SAML
- Admin reporting with business unit segmentation
- Data processing agreement (DPA)
- SOC 2 Type I at minimum (Type II preferred)
- Clear AI data handling policy (what goes to the LLM, what's retained)

Is "v2" a dealbreaker? For the pilot, no. For the enterprise rollout, yes. I'd need a committed timeline. "Q3 2026" is fine. "Sometime next year" is not. I'd want it in the contract: if enterprise features aren't delivered by [date], I get a refund or credit on my enterprise seats.


7. Facilitator Certification

$2,499/facilitator — reasonable? Yes. It's market rate. DiSC is $2,495. CCL is $2,500. No pushback on price.

Would I certify all 15? No. I'd certify 3-5 initially.

Why not all 15:
1. I don't know if this course works yet. Certifying 15 facilitators at $2,499 each = $37,485 before I've run a single workshop. That's a bet I won't make on an unproven course.
2. I want to see my best 3-5 facilitators deliver it first. Get participant feedback. Iterate. Then scale.
3. My 15 facilitators are currently delivering 4 Stages. They're busy. Adding a new course means either reducing 4 Stages delivery or increasing facilitator workload. I'd phase it.

My plan:
- Certify 3 facilitators in Q2 2026
- Run 4-6 workshops in Q2-Q3 (pilot + first wave)
- If NPS > 60 and ALI pre/post shows improvement, certify 5 more in Q4
- Full 15 by mid-2027 if the course proves itself
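
The NPS > 60 gate in that plan is easy to compute from post-workshop survey data. A minimal sketch, assuming the standard 0-10 "would you recommend" item (promoters score 9-10, detractors 0-6); the sample cohort is invented:

```python
def nps(scores: list[int]) -> float:
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    if not scores:
        raise ValueError("no survey responses")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)

# Hypothetical pilot cohort of 10 responses
pilot = [10, 10, 9, 9, 9, 8, 8, 7, 6, 10]
print(f"Pilot NPS: {nps(pilot):.0f}")  # promoters=6, detractors=1 -> NPS 50
```

A cohort like this one lands at 50 and would not trigger the Q4 certification wave under the stated 60 threshold, which is what makes the gate meaningful.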

What I'd want in the certification:
- At least one practice delivery with real participants (not just facilitator peers)
- Access to the companion resource (Section 15) β€” the quarterly-updated case studies and scenarios
- A facilitator community or Slack channel for sharing what works
- Recertification: annual or biennial, not perpetual. But make it light β€” a refresh, not a full re-cert.

The cross-sell to 4 Stages is genuine. If my facilitators are already certified in 4 Stages, the Module 4 (Develop) connection is seamless. They already know psychological safety deeply. That makes Leading Through AI easier to deliver and more credible. Smart architecture.


8. What Would Close Me?

What would make me sign a PO this quarter:

  1. A live demo with the AI Thinking Partner where I play a skeptical participant and it handles me well. Not a scripted walkthrough — a real, unscripted conversation. If the AI is as good as the spec promises, that demo closes me.

  2. A pilot commitment at low risk: 25 seats, one LF-led workshop, aggregate ALI data, satisfaction scores. $13K total. I can do that without CFO approval.

  3. A written enterprise roadmap with committed dates for SSO, admin reporting, and data privacy features. Doesn't need to be built. Needs to be committed.

  4. One reference. Just one. Another VP of Talent who has seen it, even in beta. "We ran it with 30 leaders and here's what happened." That would dramatically reduce my perceived risk.

  5. Tim Clark Jr. on a call with my CHRO. The positioning in Section 1 — "the bottleneck is not technology, it is leadership" — needs to come from the thought leader, not a sales rep. A 30-minute executive briefing would be the tipping point.

What would make me wait:

  1. No enterprise features until 2027. If I can't deploy at scale within 6 months of piloting, I'll wait for the product to mature.

  2. The AI Thinking Partner demo falls flat. If it feels like a chatbot, I'd rather buy the facilitated workshop only and skip on-demand entirely. But then I'm paying $1,495/head for my broader population, which doesn't scale.

  3. DDI or FranklinCovey launches something comparable. If DDI ships an "AI Leadership" course in Q3 2026, I'd evaluate both. The brand safety of DDI might win even if the product is inferior.

  4. My CEO's urgency fades. Right now I have executive air cover. If the AI hype cycle cools or a different priority takes over, this drops to "next year."


9. Red Flags

Flag 1: No validation data on the ALI.
The assessment is central to the value proposition — it's the diagnostic, the measurement, and the CFO story. But it's brand new with zero psychometric validation. For a pilot this is acceptable. For an enterprise diagnostic across 800 managers, I need at minimum internal consistency reliability (Cronbach's alpha > 0.7) and some evidence of construct validity. The sample items (Section 5) look well-written, but well-written ≠ validated.
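
The internal consistency bar named here can be checked as soon as pilot response data exists. A minimal sketch of Cronbach's alpha over a respondents-by-items matrix; the Likert responses below are invented for illustration, not real ALI data:

```python
from statistics import pvariance

def cronbach_alpha(responses: list[list[int]]) -> float:
    """Cronbach's alpha for a matrix of rows=respondents, cols=items."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # transpose to per-item columns
    item_vars = sum(pvariance(col) for col in items)
    totals = [sum(row) for row in responses]   # each respondent's total score
    total_var = pvariance(totals)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Invented 6-point Likert responses: 5 respondents x 4 ALI-style items
data = [
    [5, 4, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [1, 2, 1, 1],
    [3, 3, 4, 3],
]
alpha = cronbach_alpha(data)
print(f"alpha = {alpha:.2f}")  # > 0.7 would meet the minimum bar above
```

The invented items are strongly correlated, so the alpha comes out high; real pilot data is what would tell LeaderFactor whether the 0.7 bar is met.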

Flag 2: Revenue projections feel aspirational.
Section 17 projects $5M in Year 2 and $8.8M in Year 3. 25,000 seats in Year 2 from a company that has never sold this course. The 99% margin is accurate (AI costs are trivial), but margin doesn't matter if you don't hit volume. These projections suggest a company that might be building for scale before proving product-market fit. That's not my problem directly, but it signals potential prioritization of growth over product quality.

Flag 3: The 50%+ completion rate target is ambitious.
The spec acknowledges (Stress Test 7) that industry average for on-demand L&D is 15-25% and this product needs 50%+. The mitigations listed are reasonable but unproven. If completion rates are 20%, the 8-week reinforcement system is irrelevant because most people never get there. I'd want a money-back guarantee on completion rates for on-demand — or at least a committed threshold.

Flag 4: "Open Questions" suggest the product isn't fully baked.
Section 21 has open questions about the 8-week system, physical vs. digital artifacts, and the relationship to a book/field guide. These are reasonable for a product in development, but they tell me I'm buying before the kitchen is finished. For a pilot, fine. For a $75K commitment, I want these resolved.

Flag 5: No mention of accessibility.
WCAG compliance, screen reader compatibility, closed captioning for videos, alternative formats for assessments — none mentioned. My organization has accessibility requirements for all learning platforms. This needs to be addressed before I can deploy.

Flag 6: Facilitator readiness is dependent on LeaderFactor.
Section 15 describes a "companion resource" updated quarterly with current case studies and capability snapshots. That's great in theory. But it means my facilitators are dependent on LeaderFactor's content team to keep the course current. If LeaderFactor gets distracted by growth or pivots, my facilitators are delivering a course with stale examples. I'd want a contractual commitment to quarterly companion updates for at least 2 years.


Summary Verdict

| Dimension | Rating | Notes |
| Framework quality | ⭐⭐⭐⭐⭐ | Best AI leadership framework I've seen. Clean, memorable, actionable. |
| Positioning | ⭐⭐⭐⭐⭐ | Nails the buyer problem. "The bottleneck is leadership, not technology." |
| Assessment (ALI) | ⭐⭐⭐⭐ | High potential, needs validation data. |
| AI Thinking Partner | ⭐⭐⭐⭐ | Differentiated and smart. Need to see it work. |
| Enterprise readiness | ⭐⭐ | Not there yet. v2 roadmap needed. |
| Track record / proof | ⭐ | Zero deployments. Zero references. Biggest risk. |
| Pricing | ⭐⭐⭐⭐ | Competitive and reasonable at every tier. |
| Facilitator design | ⭐⭐⭐⭐⭐ | "Framework carries intellectual weight, facilitator carries process weight." Smart. |
| Cross-sell value | ⭐⭐⭐⭐ | Strong if I'm already in the LeaderFactor ecosystem. |

Bottom line: This is the most thoughtfully designed AI leadership course I've evaluated. The framework is excellent, the positioning is sharp, and the AI Thinking Partner is a genuine differentiator. But it's a v1 product from a mid-sized company with no enterprise deployments. I'd pilot it this quarter — 25 seats, one facilitator-led workshop, full data collection. If the pilot proves out, I'd move to 200+ on-demand seats and 3-5 facilitator certifications in Q3-Q4. I would not do a full enterprise rollout until enterprise features (SSO, admin reporting, DPA) are in place.

The thing that would tip me from "pilot" to "committed buyer": One live demo of the AI Thinking Partner that impresses my CHRO, plus a written enterprise roadmap with dates.

The thing that would lose me entirely: If DDI ships something comparable before LeaderFactor gets enterprise-ready.


Review completed March 2, 2026
Reviewer: The Buyer (simulated VP of Talent Development)

Panel Review: The Facilitator

Reviewer Profile: Senior Certified LeaderFactor Facilitator | 3+ years | 100+ workshops delivered (4 Stages, EQ Index, Coaching & Accountability)
Document Reviewed: COURSE-SPEC-UNIFIED.md (March 2, 2026)
Date: March 2, 2026


1. Can I Deliver This?

Short answer: Yes — with caveats.

Given this spec plus a properly built deck and facilitator guide, I could walk into a room of 25 senior leaders and deliver a credible half-day workshop. The framework is clean, the progression is logical, and — critically — the spec explicitly states I don't need to be an AI expert (Section 16). That's the right call, and it's the single most important facilitator design decision in this document.

Where I'd get stuck:

Overall confidence level: 7.5/10. The framework is strong enough that even a mediocre delivery would land. A good facilitator will make it sing.


2. Are the Facilitator Notes Sufficient?

They're good for content. They're thin on process.

The spec is outstanding as a "what to teach" document. Every module has clear teaching points, exercise steps, timing, and even specific language for bridges and transitions. The redirect framework in Section 16 is particularly well done — I could print that on a card and use it in every session.

What's missing:

Where I'd improvise (and whether that's good or bad):


3. Own / Augment / Automate — Is It Intuitive to Teach?

The Three Zones Framework will land immediately. It's this course's "4 Stages" — simple, sticky, and useful on day one.

Own / Augment / Automate is the kind of framework senior leaders love: it gives them a decision tool and a shared vocabulary in one move. When I teach 4 Stages, the "inclusion safety → learner safety → contributor safety → challenger safety" progression clicks in minutes. Own / Augment / Automate has the same quality. It's visual, memorable, and immediately applicable.

Comparison to other LF frameworks:

| Framework | Intuitive? | Sticky? | Actionable? |
| 4 Stages of Psych Safety | Very high | Very high | Moderate (cultural, slower to apply) |
| EQ Index competencies | Moderate | Moderate | High (behavioral) |
| Own / Augment / Automate | Very high | Very high | Very high (can apply in the room) |

The Three Zones may be the most immediately actionable framework LeaderFactor has ever produced. A leader can literally use it in their next team meeting.

Where it gets harder:

Will I need to explain things multiple times? The Three Zones — once. The 5D Model — I'll reference it at every bridge, but participants won't fully internalize the five steps until the Course Commitment at the end when they write all five in one sentence. The Four Readiness Gaps — once, but I'll need a concrete example for each or they blur together. The Demonstration Architecture — once.


4. The AI Thinking Partner in the Room

This is my biggest concern with the spec as written.

Section 6 is richly detailed for on-demand delivery. Every module has specific AI dialogue examples, tone calibrations, scaffolding removal progressions, error recovery protocols, and meta-awareness moments. The on-demand AI experience is meticulously designed.

For the facilitated workshop, the spec says (Section 1): "In facilitated workshops, [the AI Thinking Partner] supports exercises — participants interact with it during structured activities while the facilitator manages the room."

That's essentially one sentence of guidance for the most novel and risky element of the workshop.

My specific questions:

  1. Logistics. Do all 25 participants need devices? Phones? Laptops? Tablets provided? Do they need wifi? Is there a web app? A URL? Do they log in? With what credentials? If even 3 of 25 can't connect, I've lost 5 minutes of a 45-minute module.

  2. When exactly do they interact with AI? During which exercises? The Partnership Audit? The Design Sprint? All of them? I see specific on-demand AI dialogue for every exercise, but the facilitated notes just say "pair discussion." Are they supposed to chat with the AI instead of pair discussion? Before pair discussion? In addition to?

  3. What's my role while they're chatting with AI? Am I circulating? Sitting? Watching a dashboard? In 4 Stages, every minute of facilitator time is accounted for. Here, if 25 people are silently typing to an AI for 10 minutes, what am I doing?

  4. What happens when the AI goes down? No fallback is specified. In a facilitated room, I AM the fallback — but only if I know the AI coaching questions well enough to deliver them verbally. The spec should include "facilitator fallback questions" for each exercise, mirroring the AI's coaching prompts.

  5. Does the AI experience add enough value in a facilitated room to justify the complexity? In on-demand, the AI is essential — it IS the facilitator. In a workshop, I'm the facilitator. The AI risks being a distraction, a technical hiccup waiting to happen, or a novelty that adds 5 minutes of friction to each module. The meta-learning argument ("the medium IS the message") is powerful but only if the experience is seamless.

My recommendation: For v1 facilitated workshops, make the AI Thinking Partner optional — an enrichment for tech-ready rooms, not a requirement. Design every exercise to work without it. Then, once the on-demand product has validated the AI interaction quality (Stress Test 6, Section 18), bring it into the facilitated room with tested, specific integration points. Right now, the spec is asking facilitators to manage a room AND troubleshoot an AI experience simultaneously, with almost no guidance for either.


5. Energy and Pacing

The timing summary (Section 12) says 3h 55m, not 3h 45m. That includes the break. Actual content time is ~3h 45m. That's a long half-day.

The energy arc:

Module 1 (DEFINE)      — Reflective → Revelatory    [SETTLE IN]
Module 2 (DISCOVER)    — Expansive → Surprising     [ENERGY PEAK #1]
          BREAK
Module 3 (DESIGN)      — Analytical → Committed     [POST-BREAK FOCUS]
Module 4 (DEVELOP)     — Empathetic → Resolute      [ENERGY SAG]  ⚠️
Module 5 (DEMONSTRATE) — Strategic → Grounded       [FINISH LINE ENERGY]

Where I'll lose the room:

My fix: The Adoption Paradox opening (6 min) needs to be punchy and slightly provocative. The spec's language is good ("the more you push, the more resistance you create") but the delivery needs energy. I'd add a brief show-of-hands or polling moment: "How many of you have experienced resistance to a change initiative? Keep your hand up if you think you handled it well." That wakes people up.

My fix: Shorten the individual plan-building to 8 minutes. Move 4 minutes to the pair pressure-test, which is higher energy. Or make the plan-building a pair exercise from the start: "Build this plan together, then challenge each other."

The natural sag is Module 4, 2h 30m into the day. Every long workshop has one. The spec doesn't provide tools to counter it beyond the content itself.

What's missing: A movement moment. Somewhere between Modules 3 and 5, I need people on their feet. A gallery walk of Partnership Maps (post-Module 3) would be perfect: people post their maps, walk the room, put sticky dots on the most interesting allocations. Physical movement, visual engagement, social proof. 5 minutes well spent.


6. The Exercises

Overall: strong. These are exercises that respect senior leaders' time and intelligence. They're built on real work, not hypotheticals. That's the right call.

Exercise-by-exercise assessment:

Partnership Audit (Module 1, 20 min)

Will VPs actually list 10 things they did last week? Yes — because it's framed as "what you actually spent time on," not "list your responsibilities." The specificity of "last week" makes it concrete and slightly uncomfortable. That's the point. The 3-minute time cap helps: it prevents overthinking.

Risk: Some leaders will list only 5-6 items. The spec doesn't address what to do if someone can't reach 10. My instinct: "If you're stuck at 7, that itself is data. What did you do that you've already forgotten?"

Rating: 8/10 — Will land. Senior leaders actually enjoy this kind of honest self-audit when the room feels safe.

Discovery Sprint (Module 2, 22 min)

The strongest exercise in the course. Pushing three workflows through three dimensions (Efficiency → Augmentation → Transformation) is structured enough to prevent flailing but open enough to generate genuine insight. The pair expansion step ("partner's job: add to it") leverages the room's diversity.

Risk: The Transformation dimension is where people will stall. "What if AI enabled something you couldn't do at all before?" is a huge question. I'd want 2-3 domain-specific prompts ready: "In finance, that might mean... In operations, that might mean..."

Rating: 9/10 — This is the exercise participants will talk about at dinner.

Design Sprint (Module 3, 23 min)

The most ambitious exercise. Breaking workflows into component tasks, mapping each to a zone, writing rationale, adding ethical guardrails β€” in 10 minutes of individual work. This is where the spec overestimates what participants can produce in the time allotted.

Risk: Participants will do a surface-level map in 10 minutes, then get genuinely challenged in the pair step. That's actually fine — the challenge is where the learning happens. But the spec should acknowledge that the initial maps will be rough drafts, not finished products.

The unscaffolded second map (5-8 min) is brilliant. This is the "I can actually do this" test. The confidence transfer from guided to independent is real pedagogy. Well designed.

Rating: 7/10 — Will work, but needs time management discipline and a worked example on deck.

Readiness Diagnostic (Module 4, 22 min)

Solid but long. The 1-5 rating of four gaps is quick. Writing "what specific signals are you seeing" takes time and requires a level of team awareness that varies enormously. Some leaders will write paragraphs. Others will stare at the page.

Risk: The "build the plan" step (10 min) asks for "three specific actions in the next 30 days." This is essentially coaching, and in a facilitated room, the pair partner is the coach. Pair quality matters enormously here. A weak pair partner means a weak plan.

Rating: 7/10 — How well it lands will vary by room. 4 Stages alumni will crush this. Everyone else will need more scaffolding.

90-Day Demonstration Plan (Module 5, 22 min)

Too many planning questions for the time. Eight specific questions in 12 minutes of individual work. I've facilitated enough planning exercises to know: leaders will spend 5 minutes on the first two questions and rush through the rest. Kill criteria and scale triggers are sophisticated concepts that deserve more than 90 seconds each.

Risk: The plans will be half-baked. That's not fatal — the 8-week reinforcement system can catch it — but the spec frames this as a "concrete plan," and what participants will produce in 12 minutes is a sketch.

My fix: Reduce to 5 core questions (30-day metrics, 60-day metrics, 90-day metrics, success threshold, one story to tell). Move kill criteria and scale triggers to Week 5 of the reinforcement system.

Rating: 6/10 — Needs simplification for the facilitated room. The ambition exceeds the time.


7. Cross-Sell Opportunities

Module 4 (DEVELOP) is the natural bridge to 4 Stages, and it's genuine — not forced.

Section 11's opening explicitly says: "83% of business leaders say psychological safety directly impacts the success of AI initiatives" and "The Psychological Safety gap maps directly to the 4 Stages of Psychological Safety — LeaderFactor's foundational IP. This isn't bolted on. It's structural."

From a facilitator who delivers both: this is true. The Four Readiness Gaps are a legitimate extension of the 4 Stages into the AI context. Psychological Safety as the first gap isn't performative — it's the actual bottleneck I've seen in every organization trying to adopt anything new.

Natural bridge moments:

  1. Module 4, Adoption Paradox. After "the more you push, the more resistance you create," I'd naturally say: "If that resonates, there's an entire framework for building psychological safety — the 4 Stages. What we're doing today is the AI-specific application."

  2. Module 4, Safety Commitment. When participants write their behavioral commitment, the language directly maps to 4 Stages inclusion and learner safety behaviors. For 4 Stages alumni, this is an "oh, I already know how to do this" moment. For new participants, it's a hook.

  3. Module 1, Identity Statement. EQ Index connection: self-awareness is the foundation. "Knowing who you are as a leader when AI strips away the scaffolding — that's an emotional intelligence challenge."

The cross-sell sequence the spec proposes (Section 19) is smart:
- New customer: 5D → 4 Stages → Coaching → EPIC Change
- Existing customer: 4 Stages → 5D

Does Module 4 feel like a genuine extension of psych safety or a forced connection?

Genuine. The reasoning is structurally sound: you can't adopt AI without safety, and safety requires the behaviors the 4 Stages teach. It doesn't feel like a sales pitch inside the course. It feels like intellectual integrity.

One caution: Don't over-sell 4 Stages inside the 5D workshop. The moment participants smell cross-sell, trust breaks. One mention in the Adoption Paradox is enough. Let the connection be obvious without being pushy.


8. Red Flags

Red Flag 1: The AI Thinking Partner in Facilitated Workshops (HIGH)

As detailed in #4 above, this is under-specified to the point of being risky. If a facilitator shows up expecting to integrate AI and it doesn't work — or works inconsistently — the meta-message of the course ("AI is a reliable thinking partner") is undermined by the course's own delivery. The medium IS the message, which means the medium failing IS the counter-message.

Mitigation: Specify exact integration points, provide offline fallbacks, or make AI optional in v1 facilitated rooms.

Red Flag 2: The 3h 55m Runtime (MEDIUM)

The timing summary adds up to 3h 55m with the break, not the marketed 3h 45m (Section 1 says "3h 45m"). More importantly, every experienced facilitator knows that published timing is aspirational. Q&A, late starts, slow exercises, and organic discussion add 10-15%. Realistically, this is a 4h 15m workshop trying to fit in a 3h 45m window.

Mitigation: Build 5-minute buffers into Modules 2 and 4. The "pressure valve" note (compress the Bridge) is acknowledged but insufficient — bridges are 2-3 minutes. You can't compress them much further.
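
The runtime arithmetic behind this flag can be sanity-checked in a few lines. A sketch using the spec's 3h 45m content figure, the 10-minute break implied by the 3h 55m total, and the 10-15% overrun estimate from above:

```python
def fmt(minutes: float) -> str:
    """Render a minute count as 'Xh YYm'."""
    m = round(minutes)
    return f"{m // 60}h {m % 60:02d}m"

content = 3 * 60 + 45   # 225 min of content per the spec
break_min = 10          # 3h 55m published total minus 3h 45m content
print(fmt(content + break_min))             # published total: 3h 55m
low, high = content * 1.10, content * 1.15  # the 10-15% overrun estimate
print(f"realistic content time: {fmt(low)} to {fmt(high)}")
```

Add the break back and the realistic day lands in line with the roughly 4h 15m estimate above, which is why the 5-minute buffers matter.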

Red Flag 3: Module 5 Plan Complexity (MEDIUM)

As noted in #6, eight planning questions in 12 minutes is too ambitious. Participants will leave with an incomplete 90-Day Plan and potentially feel the course ended weakly. The close (Course Commitment + Full Circle) is strong, but only if participants feel their plan is solid enough to commit to.

Mitigation: Reduce planning questions. Or explicitly frame it: "You won't finish this in the room. You'll finish it in Week 1 of the reinforcement system." That's honest and reduces pressure.

Red Flag 4: No Worked Examples on the Deck (LOW-MEDIUM)

The spec describes exercises in detail but doesn't mention worked examples, case studies, or model outputs on the deck. The Partnership Map, the 90-Day Plan, the Readiness Diagnostic — all need a "here's what good looks like" example. Senior leaders are high-performing people who hate ambiguity in instructions. Show them a completed Partnership Map before asking them to build one.

Mitigation: This is a deck/facilitator guide issue, not a spec issue. But the spec should call it out.

Red Flag 5: Pair Work Fatigue (LOW)

Every module uses pair discussion as the primary social learning mechanism. By Module 4, "turn to the person next to you" will feel repetitive. In 4 Stages, we vary the modality: pairs, triads, table groups, gallery walks, full-room polls.

Mitigation: Vary the social structures. Table group discussion for one module. Gallery walk for another. Full-room debrief instead of pairs for at least one exercise.

Red Flag 6: The "Mostly AI-Ready" Participant Reframe (LOW)

Section 8's reframe for someone whose audit shows 70%+ in Augment/Automate: "You've been doing $50,000 work when you're capable of $500,000 work." This is powerful but risks landing as insulting to a senior leader. "You've been doing $50,000 work" — a VP making $300K might hear that as "you've been wasting your time."

Mitigation: The spec already softens this with "If that reframe lands for you... If it doesn't — tell me what feels wrong about it." Good. But facilitators should be coached to deliver this with genuine respect, not as a gotcha.


Summary Assessment

| Dimension | Rating | Notes |
| Framework quality | 9/10 | Own/Augment/Automate is best-in-class. 5D Model is strong. |
| Exercise design | 7.5/10 | Real work, senior-appropriate. Module 5 needs simplification. |
| Facilitated delivery readiness | 6.5/10 | Content is there. Process guidance is thin. AI integration is under-specified. |
| Energy/pacing design | 7/10 | Arc is sound. Module 4 sag is predictable. Needs movement. |
| Cross-sell integrity | 9/10 | Module 4 → 4 Stages is genuine and structurally sound. |
| Facilitator confidence | 7.5/10 | I can deliver this. I'd want more guidance on logistics and AI integration. |

Bottom line: This is a credible, well-designed course with a framework that will outlast most of what's in the market. The intellectual architecture is sophisticated without being academic. The exercises are grounded in real work. The 5D Model will stick.

The facilitated delivery needs a proper facilitator guide that addresses room logistics, AI integration specifics, worked examples, exercise timing within modules, and social learning variety. The spec is a brilliant content document. It's not yet a facilitation blueprint.

Would I deliver this tomorrow? With the deck, a pre-built facilitator guide, worked examples, and the AI integration either specified or removed — yes. Without those — I'd want another week of prep and a dry run.


Review submitted by: The Facilitator (Panel Simulation)
Course: Leading Through AI™ — Unified Course Specification
Date: March 2, 2026

Panel Review: The Instructional Designer

Reviewer: The Instructional Designer (Learning Science & Assessment Design)
Document Reviewed: COURSE-SPEC-UNIFIED.md — "Leading Through AI™"
Date: March 2, 2026


1. Learning Science Assessment

Where the Design Aligns with Evidence

Elaborative interrogation. The spec's consistent use of "why" prompts — particularly the "why" column in the Partnership Map (Section 10) and the requirement to articulate reasons for every Own/Augment/Automate classification — is one of the strongest evidence-based moves in the entire design. Elaborative interrogation (asking learners to explain why something is true) produces substantially better retention than passive review (Dunlosky et al., 2013). This is woven throughout, not bolted on.

Concrete, personal exemplars. Every exercise begins with the participant's own work — their actual week, their real workflows, their specific team. This is textbook Situated Learning (Lave & Wenger). Adult learners retain and transfer dramatically better when learning is anchored in their authentic context rather than hypothetical cases. The Partnership Audit (Section 8) asking "Write down the 10 things you actually spent time on last week" is exactly right.

Desirable difficulty. The spec builds in productive struggle at appropriate moments: the unscaffolded Partnership Map in Module 3, the "teach it back" moment in Module 4, and the largely silent AI during the 90-Day Plan in Module 5. These are well-placed instances of what Bjork (1994) calls "desirable difficulties" — challenges that slow initial performance but enhance long-term retention and transfer.

Generation effect. Participants generate their own frameworks (Identity Statement, Partnership Maps, 90-Day Plans) rather than receiving pre-built templates to fill in. The generation effect is one of the most robust findings in memory research.

Interleaving of conceptual and procedural knowledge. Each module teaches a concept, then immediately applies it. This interleaving is superior to blocked practice for transfer.

Where the Design Violates or Risks Violation

The worked-example effect is underused. Before each exercise, participants would benefit from seeing a complete, annotated worked example — not just the "curated real examples" in Module 2's opening (Section 9), but a fully worked Partnership Audit or Partnership Map with expert annotations explaining the reasoning at each step. The spec provides scaffolding and prompts, but a single concrete worked example would reduce extraneous cognitive load during the first attempt.

Feedback timing is inconsistent. In Module 1 (Partnership Audit), the AI engages with "2-3 specific activities" after the participant's classification — good immediate feedback. But in Module 5 (90-Day Plan), the AI is "largely silent during plan-building" and reviews after completion. For a complex, multi-part exercise being attempted for the first time, delayed feedback risks participants building an entire plan on a flawed foundation. The scaffolding removal logic is sound in principle but may be premature for this specific exercise, which is structurally the most complex in the course.

The spacing effect is present but could be stronger. The 8-week system spaces practice well, but within the course itself, concepts are taught and practiced once, then only retrieved briefly in the next module's bridge. The retrieval bridges are a good start (see Section 2 below), but key concepts like the Three Zones would benefit from being applied in multiple exercises across modules, not just retrieved verbally.

Dual coding is underspecified. The spec references "key visuals" on decks (Three Zones diagram, Discovery Framework layers, Partnership Map template) but doesn't describe how visual and verbal channels will be deliberately coordinated during teaching. Mayer's multimedia principles suggest this coordination is critical for learning, not just nice-to-have for aesthetics.


2. Retrieval Practice

What's There

The spec includes four retrieval bridges:

Assessment

The good: These exist at all. Most corporate L&D skips retrieval practice entirely. The bridges force participants to reconstruct information from memory before new content is introduced, which is the core mechanism of retrieval-enhanced learning (Roediger & Butler, 2011).

The problems:

  1. They're too easy. "What were the Three Zones?" is a recognition-level question — nearly free recall of three labels. The research shows retrieval practice works because it's effortful. These bridges need to demand more:
  - Instead of "What were the Three Zones?", try: "Pick one task from your Audit that you classified as Augment. What would have to change about that task for you to reclassify it as Automate? What would have to change for it to be Own?"
  - Instead of "What were the three dimensions?", try: "Take an activity you classified as Augment in Module 1. Push it through the three Discovery dimensions right now — what's the efficiency play, what's the augmentation play, what's the transformation play?"

  2. They retrieve labels, not application. The bridges mostly ask participants to recall vocabulary (Three Zones, three dimensions). But the learning that matters is applying the framework to their own context. Retrieval practice should target the application level, not the recognition level.

  3. The M4→M5 bridge is the best one — asking for the Identity Statement (a personally generated, meaningful output) and the readiness gap (a judgment call, not a label). More bridges should follow this pattern.

  4. There's no cumulative retrieval. The M4→M5 bridge reaches back to M1, which is good. But there's no moment where the participant must reconstruct the entire 5D sequence from memory with their own content mapped to each step. The Course Commitment in Module 5 partially serves this function, but it happens after new teaching rather than as a retrieval exercise.

What I'd Add


3. Scaffolding Removal

The Design

The spec describes a clear scaffolding removal arc (Section 6):

| Phase | AI Behavior |
| Onboarding + M1 | Full scaffolding |
| M2–M3 | Moderate — more questions, fewer interpretations |
| M4 | Light — participant drives |
| M5 | Minimal — participant builds independently |

And across the 8-week system (Section 13):

| Weeks | AI Behavior |
| Weeks 1-2 | Full prompts and context |
| Weeks 3-4 | Questions over answers |
| Weeks 5-6 | Brief check-ins, participant self-evaluates |
| Weeks 7-8 | "What would you tell yourself?" |

Assessment

The in-course arc is well-designed in principle but has one timing problem. The unscaffolded Partnership Map in Module 3 (Section 10) is the right exercise for scaffolding removal, and Module 3 is the right conceptual moment — it's the culmination of Act I (The Work) before the pivot to The People. But there's a sequencing concern:

The participant has completed exactly ONE guided Partnership Map at this point. In learning science terms, they've had one practice trial with feedback. Asking for independent performance after a single practice trial is aggressive. The research on skill acquisition (Anderson's ACT-R theory, Fitts & Posner's stages) suggests learners need multiple varied practice attempts before independent performance is reliable.

My recommendation: Keep the unscaffolded map in Module 3, but add a brief "mini-map" exercise in Module 2 as a stepping stone. After the Discovery Sprint, ask participants to take their single highest-priority Transformation opportunity and rough-map it into Own/Augment/Automate — just the zones, no "why" column, no guardrails. This introduces the mapping skill in a low-stakes way before the full unscaffolded attempt in Module 3.

The 8-week scaffolding removal is excellent. The progression from AI-prompted to AI-questioning to participant-self-evaluating to "What would you tell yourself?" mirrors the internalization process described in Vygotsky's zone of proximal development. By Week 7-8, the AI is functioning as what Vygotsky would call the internalized "more knowledgeable other" — the participant has absorbed the coaching voice. This is genuinely well-designed.

One risk: The jump from Week 2 (full scaffolding) to Week 3 (questions over answers) is the sharpest transition in the 8-week system. Participants who are still uncertain at Week 2 may disengage at Week 3 when the support drops. Consider a half-step: in Week 3, provide full scaffolding on request, but stop offering it proactively.


4. Exercise Quality

Partnership Audit (Module 1) — Strong

This is the best exercise in the course. It's concrete (list 10 real things), categorization forces analysis, and the reveal ("look at the ratio") produces an emotional moment that anchors the entire course. The instruction to list what you actually did, not what you aspire to, is critical — it's a commitment device against self-flattery.

Minor weakness: The categorization step (3 min) is tight for 10 items. Some leaders will need more time to wrestle with borderline cases — and the wrestling IS the learning. Consider 4-5 minutes, or explicitly telling participants that borderline cases are the most valuable ones to spend time on.

Possibility Map / Discovery Sprint (Module 2) — Moderate-Strong

The three-dimensional push (Efficiency → Augmentation → Transformation) is a solid divergent thinking scaffold. The "write fast, don't filter" instruction is appropriate for the divergent phase.

Weakness: The exercise asks participants to push three workflows through three dimensions in 12 minutes. That's 9 cells to fill, each requiring genuine creative thinking. At ~80 seconds per cell, participants will likely produce shallow responses for most and thoughtful responses for 2-3. The depth-vs-breadth tradeoff isn't explicitly managed. I'd recommend: 2 workflows × 3 dimensions = 6 cells in 12 minutes (2 minutes each), with explicit permission to go deep on the Transformation row. The third workflow can be the expansion partner's contribution.

Partnership Map (Module 3) — Strong

The guided version with the "why" column and ethical guardrails forces the kind of deliberate reasoning that produces deep processing. The challenge step (pairs or AI) creates the beneficial testing effect — defending your reasoning strengthens it.

The unscaffolded version is the right capstone for Act I, with the timing caveat noted in Section 3 above.

Weakness: The exercise asks participants to "break each into its component tasks — the 6-10 discrete steps" for 2-3 workflows. That's potentially 30 discrete tasks to then classify, justify, and add guardrails for, in 10 minutes. This is the most overloaded exercise in the course (see Section 7 on cognitive load).

Readiness Diagnostic (Module 4) — Moderate

The four-gap framework is clean and memorable. Rating the team 1-5 on each gap is quick and diagnostic. The "identify the biggest gap" step forces prioritization.

Weakness: This exercise is the most at risk of producing activity without learning. Rating your team on a 1-5 scale is easy and can be done superficially. The learning happens in "what specific signals are you seeing?" — but that question comes bundled with the rating, and participants will likely anchor on the number. I'd flip the sequence: first list the signals you're seeing on your team (behavioral evidence), then use the four gaps to categorize them, then rate. This makes the diagnostic evidence-driven rather than impression-driven.

The "teach it back" moment (on-demand) partially rescues this exercise. Having participants explain the Four Readiness Gaps as if teaching a direct report is a powerful retrieval and elaboration move. I'd make this available in the facilitated version too — brief pair exercise: "Explain the Four Gaps to your partner as if they're a skeptical direct report."

90-Day Demonstration Plan (Module 5) — Moderate-Weak

The structure is ambitious and thorough — 30/60/90 metrics, success thresholds, scale triggers, kill criteria, story narrative. This would be an excellent planning tool in a strategy session.

But as a learning exercise, it's the weakest in the course. Here's why:

  1. It's planning, not practicing. The other four exercises ask participants to do something with their actual work — classify tasks, generate possibilities, design workflows, diagnose readiness. This one asks them to plan to do something later. Planning feels productive but doesn't produce the same learning as doing.

  2. Twelve minutes to complete 8 complex fields is insufficient for quality work. Participants will rush, producing aspirational plans rather than genuinely pressure-tested ones.

  3. The AI scaffolding removal is most aggressive here ("largely silent"), precisely when the exercise is most complex and novel. The participant has never built a demonstration plan before. Withholding support for a first attempt at the hardest exercise contradicts the logic of scaffolding removal, which should track skill development, not just position in the course.

Recommendation: Either (a) simplify the plan to 4-5 fields (30-day leading indicator, 90-day success threshold, one story, kill criteria), or (b) restore moderate AI scaffolding for the initial draft and reserve the "silent AI" for a revision pass.


5. The 8-Week Reinforcement System

What the Research Says

The 8-week system is the most important part of the entire course design, and the spec seems to know this: "The course is ignition. The 8-week system is the transformation." The citation stating that intentions predict behavior only 13% of the time without follow-up (Sheeran & Webb, 2016) is the right anchor.

Duration

8 weeks is defensible. The habit formation literature (Lally et al., 2010) shows the average time to automaticity is 66 days (~9.5 weeks), with high individual variance (18-254 days). 8 weeks puts most participants past the inflection point. 12 weeks (the v2 design) would have captured more stragglers but at the cost of higher dropout — a tradeoff the spec appears to have made deliberately.

The optional monthly continuation (Section 13) is a smart escape hatch for the long-tail participants who need more time.

Structure

Weeks 1-5 (one step per week) is elegant — it creates a second pass through the 5D model with increasing real-world application. Each micro-action is specific, time-bounded, and behaviorally concrete. The Week 1 action ("Share your Partnership Audit with one trusted colleague") is particularly well-designed — it creates social commitment, external accountability, and reality-testing in a single 20-minute action.

Weeks 6-8 (deepen + integrate) accelerate well. Week 6 asks for a second application of the Discovery → Design sequence — this is the spacing + interleaving + generation effect triple play. Week 7's direct report survey is a behavioral commitment device. Week 8's ALI retake provides closure and measurement.

Concerns

  1. Week 3 (45 minutes) is a sharp spike. Presenting the Partnership Map to your team and revising together is the right action, but 45 minutes is more than double the other weeks. Participants who struggled with Weeks 1-2 (20 minutes each) may balk at the escalation. Consider framing it as: "Present the map (20 min) + revise based on feedback (25 min, can be done later in the week)."

  2. The accountability pair structure is underspecified. Two check-ins across 8 weeks (Week 5 and Week 8) is thin. The research on accountability partners shows the mechanism works through frequency of contact and social pressure, not just existence. I'd add a brief weekly text/message prompt between pairs — even something as simple as "Done ✓ / Not yet / Need help."

  3. There's no mechanism for recovery from a missed week. If a participant misses Week 3, do they do it in Week 4 along with Week 4's action? Do they skip it? The spec doesn't address this, but in practice, one missed week often becomes two, then three, then dropout. A simple "If you missed last week: do this 5-minute version instead" would reduce attrition.
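To make the recovery rule concrete, here's a minimal sketch of the "do this 5-minute version instead" logic. Everything here is hypothetical — the function name, the lite-version wording, and the data shapes are illustrations, not anything specified in the spec; only the Week 3 action text is drawn from the spec.

```python
# Hypothetical sketch of a missed-week recovery rule: if last week's
# micro-action was skipped, prepend a 5-minute catch-up version rather than
# silently stacking two full actions (which drives dropout).

# Full actions per week (only Week 3 shown; text is from the spec).
FULL = {3: "Present your Partnership Map to your team and revise it together."}
# Hypothetical 5-minute "lite" fallbacks for missed weeks.
LITE = {3: "Share one row of your Partnership Map with one team member (5 min)."}

def this_weeks_prompt(week, completed_weeks):
    """Return the prompt(s) to send, given which weeks were completed."""
    prompts = []
    if week > 1 and (week - 1) not in completed_weeks:
        # Missed last week: offer the short catch-up, not the full redo.
        prompts.append(LITE.get(week - 1, "5-minute catch-up version of last week's action."))
    prompts.append(FULL.get(week, f"Week {week} micro-action."))
    return prompts
```

The design point is the asymmetry: a missed week costs 5 minutes, not 45, so one slip doesn't compound into attrition.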


6. Assessment Design (ALI)

Structure

30 items, 6 per dimension, 6-point Likert (no neutral), behavioral items, ~6 minutes.

What's Sound

What Needs Work

Sample item construction has some issues:

Pre/post design with 8 weeks between:

This will show change, but interpreting it requires caution:

  1. Response shift bias is the primary threat. After the course, participants understand the constructs differently. A "4" on "I have redesigned at least one workflow" means something different before the course (when "redesign" is vague) vs. after (when "redesign" means the Partnership Map process). This can actually suppress apparent gains — participants may rate themselves lower post-course because they now understand what good looks like. Consider adding a retrospective pre-test ("Thinking back to before the course, how would you now rate yourself on...") at Week 8 to capture response shift.

  2. 8 weeks is sufficient for behavioral change on leading indicators (sharing the audit, experimenting with AI tools, presenting the Partnership Map). It's tight for lagging indicators (team adoption rates, measurable outcome improvements). The ALI should primarily capture leading behavioral changes, which it does.

  3. Social desirability is a risk with any self-report assessment. The items are behavioral enough to partially mitigate this, but adding a few reverse-scored items would help detect acquiescence bias.
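To show what the reverse-scoring recommendation implies mechanically, here's a small sketch of ALI scoring. The item counts (30 items, 6 per dimension, 1-6 Likert) are from the spec; the dimension names, the assumption that the five dimensions are the 5D steps, and the choice of which items reverse are all hypothetical.

```python
# Hypothetical ALI scoring sketch with reverse-scored items.
# On a 1-6 scale, reverse-scoring maps r -> 7 - r, so a "strongly disagree"
# on a negatively-worded item counts the same as "strongly agree" on a
# positively-worded one. Straight-lining all 4s then shows up as depressed
# scores on dimensions containing reversed items (acquiescence detection).

REVERSED = {3, 11, 19}  # hypothetical reverse-scored item indices

DIMENSIONS = {          # assumed mapping: 5 dimensions x 6 items = 30 items
    "Define":      range(0, 6),
    "Discover":    range(6, 12),
    "Design":      range(12, 18),
    "Develop":     range(18, 24),
    "Demonstrate": range(24, 30),
}

def score_ali(responses):
    """responses: list of 30 ints on a 1-6 Likert scale (no neutral point)."""
    assert len(responses) == 30 and all(1 <= r <= 6 for r in responses)
    adjusted = [7 - r if i in REVERSED else r for i, r in enumerate(responses)]
    return {dim: sum(adjusted[i] for i in idx) / len(idx)
            for dim, idx in DIMENSIONS.items()}
```

An all-4s response sheet scores exactly 4.0 only on dimensions with no reversed item, which is precisely the signal you want for flagging acquiescence.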


7. Cognitive Load

The Core Question

Each module asks participants to: (1) learn a new framework, (2) complete a substantial exercise with that framework, (3) reflect on the exercise output, (4) bridge to the next module — all in 45 minutes (facilitated).

Assessment

Module 1 (DEFINE) — Feasible. The Three Zones Framework is simple (three categories). The Partnership Audit is concrete (list 10 things, sort them). The Identity Statement is brief. The ALI reveal creates intrinsic motivation that reduces perceived cognitive load. This module is well-paced.

Module 2 (DISCOVER) — Feasible but tight. The Discovery Framework (Efficiency/Augmentation/Transformation) is simple enough, but the Discovery Sprint asks for creative output under time pressure. The 12-minute individual sprint for 3 workflows × 3 dimensions is the tightest moment. Expect some participants to stall on the Transformation row — by definition, it asks them to imagine what they can't yet imagine.

Module 3 (DESIGN) — Overloaded. This is the highest cognitive load module:

All in 45 minutes. The teaching is 10 minutes, leaving 35 for exercises + reflection + bridge. The Design Sprint alone is allocated 23 minutes for what is essentially two exercises (guided map + unscaffolded map). I estimate participants need 30-35 minutes for the exercises to produce quality work.

Recommendation: Module 3 should be 55-60 minutes, with the extra time given to the guided Partnership Map (which is where the deep learning happens). Alternatively, reduce the scope: map ONE workflow with full depth rather than 2-3 with surface coverage. The unscaffolded map can use a simpler workflow.

Module 4 (DEVELOP) — Feasible. The Four Readiness Gaps framework is intuitive. The diagnostic is quick (rating + signals). The plan-building is 10 minutes for 3 actions. Well-paced.

Module 5 (DEMONSTRATE) — Tight but feasible. The Demonstration Architecture (three layers) is simple. The 90-Day Plan is complex but participants are operating on their own workflow at this point — they've built the context across prior modules. The Full Circle exercise at the end is low cognitive load (emotional, not analytical).

Extraneous Load Concerns

The facilitated version has a hidden extraneous load: context-switching between individual work, pair discussion, and full-group debrief within each exercise. Each switch costs 1-2 minutes of transition + reorientation. In a 45-minute module with 3 mode switches, that's 3-6 minutes of transitions — significant when time is already tight.

The on-demand version manages load better through the AI's conversational pacing, which can adapt to the participant's speed. This is a genuine advantage of the on-demand modality that the spec doesn't explicitly leverage — consider having the AI monitor response quality/length as a proxy for cognitive overload and adjust accordingly.
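The overload proxy could start as something this simple. The function name, window size, and word-count floor are all hypothetical placeholders for whatever signal the product team validates:

```python
# Hypothetical overload heuristic: treat a run of unusually short participant
# replies as a signal to slow down and restore scaffolding. Thresholds are
# illustrative, not validated.

def overload_signal(reply_lengths, window=3, floor=40):
    """reply_lengths: word counts of the participant's replies, oldest first.
    Returns True if the last `window` replies were all under `floor` words."""
    recent = reply_lengths[-window:]
    return len(recent) == window and all(n < floor for n in recent)
```

The point is not this particular threshold but that the on-demand modality makes the signal observable at all, which the facilitated room cannot do per participant.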


8. Own / Augment / Automate Taxonomy

Robustness

The Three Zones Framework is a clean, memorable taxonomy. Three categories is the right number — it's within working memory limits and forces meaningful differentiation without excessive granularity.

Edge Cases That Break It

  1. Tasks that oscillate between zones. "Reviewing a junior employee's work" might be Own when the employee is new (judgment-heavy, trust-building), Augment when they're experienced (AI flags anomalies, human evaluates), and Automate for routine quality checks. The classification depends on when in the relationship cycle you're evaluating. The spec doesn't address temporal dynamism within a single task.

  2. The Augment zone is too broad. As the spec notes, this is "the contested middle where the hardest leadership decisions live." But it contains multitudes: AI-drafts-human-edits is very different from human-drafts-AI-checks, which is different from human-decides-AI-informs. Participants will struggle to classify within Augment because the zone conflates different partnership structures. The Partnership Map partially solves this by asking "what does AI own, what does the human own, where's the handoff" — but that's Module 3 territory. In Module 1, when participants first encounter the framework, the Augment zone will produce the most confusion and inconsistency.

  3. Collaborative/emergent tasks. "Brainstorming product strategy with my leadership team" doesn't fit neatly. The task involves human creativity, group dynamics, and could benefit from AI input — but classifying it as Augment undersells the human elements, and classifying it as Own ignores AI's potential contribution. Tasks that are inherently collaborative among humans don't map cleanly to a human-AI dyad framework.

  4. Oversight itself. The spec defines Automate as "AI handles it end-to-end with human oversight." But oversight is itself a task. If I automate report generation but spend 20 minutes reviewing each report, I've Augmented, not Automated. The boundary between Augment and Automate depends on the ratio of human involvement, which the three-zone model doesn't quantify.

Will Participants Struggle?

Yes, with the Augment zone specifically. In my experience, taxonomies with a large middle category produce classification disagreement. The teaching should explicitly name this: "If you're wrestling with whether something is Own or Augment, or Augment or Automate — that wrestling IS the exercise. The boundary cases are where your leadership judgment lives." The spec's AI dialogue hints at this but doesn't make it a teaching point.


9. Transfer to Real Work

Transfer Probability Assessment: Moderate-High (with caveats)

The design makes several strong transfer-promoting choices:

  1. Identical elements. Every exercise uses the participant's actual work context — real workflows, real team members, real organizational constraints. This maximizes identical elements between learning and application contexts (Thorndike's identical elements theory).

  2. Behavioral commitments. The Safety Commitment (Module 4), Design Commitment (Module 3), and Course Commitment (Module 5) are specific, time-bound implementation intentions. Implementation intentions approximately double the likelihood of follow-through compared to goal intentions alone (Gollwitzer & Sheeran, 2006).

  3. The 8-week system bridges the intention-action gap. This is the single biggest transfer mechanism in the design. Without it, transfer probability drops by roughly half.

  4. Social accountability through pairs and the Week 3 action (presenting the Partnership Map to the team) creates social commitment that sustains behavior change.

What Would Increase Transfer

  1. Manager involvement. The single strongest predictor of training transfer is supervisor support (Baldwin & Ford, 1988; Blume et al., 2010). The spec doesn't address what happens if the participant's manager doesn't understand or support the 5D approach. A single-page "Manager Brief" — explaining what the participant learned and how their manager can support application — would meaningfully increase transfer. This could be auto-generated from the participant's course outputs.

  2. Organizational barrier anticipation. The course builds individual capability but doesn't explicitly prepare participants for organizational resistance. Module 5 (Demonstrate) partially addresses this through the "Who's the biggest skeptic?" pressure test, but there's no systematic treatment of organizational barriers to implementation. Consider adding a "Barrier Anticipation" step to the 90-Day Plan: "What organizational barriers will you encounter? Who needs to say yes? What if they say no?"

  3. Near-transfer practice before far-transfer planning. The course moves quickly from "learn the framework" to "plan a 90-day organizational transformation." The gap between these is large. Adding a near-transfer exercise — applying the framework to a small, low-stakes workflow first — before the far-transfer commitment would build confidence and skill. The unscaffolded Partnership Map partially serves this function, but it happens within the same module as the guided version, not after a practice interval.

  4. Peer learning networks post-course. The accountability pairs are good but limited to dyads. Cohort-based learning communities (even simple Slack channels or monthly calls) show strong effects on sustained behavior change in leadership development (Day et al., 2014). The spec's "Optional Monthly Continuation" references individual AI check-ins but not peer interaction.


10. Red Flags

Red Flag 1: The 45-Minute Module Myth (Module 3)

Module 3 is trying to do too much. It asks participants to learn a new concept, decompose workflows into tasks, classify and justify each task, engage in a challenge round, then do it again independently, then make a commitment — all in 45 minutes. In facilitated workshops, the transitions between individual/pair/group modes will eat 6-8 minutes, leaving ~37 minutes of productive time for 33+ minutes of specified activity. This module will consistently run over, and facilitators will cut the unscaffolded map — the most important exercise in the course. Allocate 55-60 minutes or reduce scope.

Red Flag 2: The "Mostly Scaffolding" Premise May Alienate

The course's emotional hinge depends on participants discovering that "50-70% of their week is in the Augment or Automate zones." If this doesn't happen — if a participant legitimately has a week that's 60% Own — the entire emotional arc of Module 1 falls flat. The spec includes a "Handling the 'Mostly AI-Ready' participant" protocol (Section 8), which reframes high Augment/Automate as "biggest opportunity." But it doesn't address the inverse: the participant whose honest audit shows mostly Own work. These participants (likely in roles involving high-judgment, high-relationship work — therapists, crisis negotiators, senior diplomats) may feel the course is not for them. The reframe needs to work in both directions.

Red Flag 3: Exercise Artifacts May Not Survive Contact with Reality

The Partnership Map is the centerpiece artifact, but it's built on a single exercise session's thinking. Real workflow redesign requires stakeholder input, technical feasibility assessment, cost analysis, and iteration. There's a risk that participants leave with a map they feel proud of but that doesn't survive the first conversation with their IT department, their team, or their boss. The Week 3 micro-action (present to team, revise) partially addresses this, but I'd make the fragility explicit: "This map is a first draft. Its value is not in being right — it's in giving you a structured starting point for conversations you weren't having before."

Red Flag 4: The AI Thinking Partner Is Carrying Too Much Weight

In the on-demand modality, the AI Thinking Partner is responsible for: personalized coaching, exercise scaffolding, retrieval practice, challenge/pushback, emotional calibration, meta-awareness moments, feedback on artifacts, scaffolding removal, error recovery, and dissent handling. This is an enormous surface area. If the AI underperforms on any of these (and it will — LLM quality is inconsistent across such varied interaction types), the entire on-demand experience degrades. The spec acknowledges this risk in Stress Test 6 (Section 18) but the mitigation ("validate Module 1 conversation with real participants") is insufficient. I'd recommend defining a minimum viable AI role (coaching + challenge + retrieval) and treating the rest as enhancements that can be added if quality permits.

Red Flag 5: No Prerequisite Assessment of AI Literacy

The course assumes participants have basic AI familiarity ("not a user, a leader") but doesn't assess this. A participant who has never used any AI tool will experience Module 1's Partnership Audit very differently from one who uses AI daily. The Augment/Automate classification requires some mental model of what AI can do — which varies enormously across participants. Consider adding 2-3 AI literacy screening questions to the onboarding intake and adjusting the AI Thinking Partner's scaffolding accordingly (providing more concrete examples for low-literacy participants).
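Mapping the screening answers to a scaffolding setting could look like the sketch below. The question wording, the 0-3 scale, the thresholds, and the tier names are all hypothetical illustrations, not anything the spec defines:

```python
# Hypothetical sketch: fold 2-3 onboarding screening answers into a
# scaffolding tier for the AI Thinking Partner. Questions and cutoffs are
# placeholders for whatever the intake team validates.

SCREENING = [
    "How often do you use an AI tool in a typical week? (0 = never .. 3 = daily)",
    "Have you ever handed a complete task to an AI tool? (0 = never .. 3 = routinely)",
    "Could you explain to a colleague what AI is good and bad at? (0 = no .. 3 = confidently)",
]

def scaffolding_tier(answers):
    """answers: one 0-3 self-rating per screening question."""
    total = sum(answers)          # 0-9 composite literacy score
    if total <= 2:
        return "heavy"            # concrete examples before every exercise
    if total <= 5:
        return "standard"         # the default scaffolding arc
    return "light"                # skip basics, lead with challenge questions
```

Three questions and three tiers is deliberately crude; the value is that the AI's Module 1 behavior stops being one-size-fits-all.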

Red Flag 6: The "Teach It Back" Moment Is On-Demand Only

The "teach it back" exercise in Module 4 (Section 11) — where participants explain the Four Readiness Gaps as if teaching a direct report — is one of the strongest learning moves in the entire spec. It leverages the protégé effect (learning by teaching), forces deep retrieval and reorganization, and practices the exact skill participants will need for transfer (explaining the framework to their team). But it only appears in the on-demand version. The facilitated version should absolutely include this. It could replace 3 minutes of debrief time with dramatically more learning.


Summary Verdict

This is a well-designed course that demonstrates genuine learning science literacy — particularly in its use of personal context, elaborative interrogation, scaffolding removal, and the 8-week reinforcement system. The 5D Model is clean, memorable, and structurally sound as a learning progression.

The three changes that would most improve learning outcomes:

  1. Fix Module 3's cognitive overload — extend to 55-60 min or reduce scope to one workflow with full depth.
  2. Upgrade retrieval bridges from label-recall to application-level — make participants use prior frameworks, not just name them.
  3. Add a manager brief and barrier anticipation to the 90-Day Plan — organizational context is the #1 threat to transfer.

The single highest-risk element is the AI Thinking Partner's breadth of responsibility in the on-demand modality. Define the minimum viable role and validate it ruthlessly before building the full surface area.

Overall confidence in transfer to real work: 65-70% for participants who complete the 8-week system. 25-35% for those who complete only the course. The 8-week system isn't a nice-to-have — it's the difference between inspiration and transformation.


Review submitted by The Instructional Designer
March 2, 2026

Panel Review: The Skeptic

Reviewer Profile: Senior VP of Operations, mid-market company ($500M revenue, 2,000 employees). 20 years in leadership. Veteran of FranklinCovey, Crucial Conversations, DDI, executive coaching, and every other program HR has thrown at me. Daily AI user. Highly allergic to consultant frameworks.

Document Reviewed: COURSE-SPEC-UNIFIED.md — Leading Through AI™, LeaderFactor's 7th signature course.


1. First Reaction

I'll give it this: the first two minutes didn't lose me. That's rare.

"AI is the great distiller" — that sentence made me stop scrolling. Most AI leadership content opens with fear or hype. This opened with an identity claim that felt genuinely provocative. The idea that AI doesn't threaten leadership but reveals it — that's a move. I leaned forward.

I also leaned forward at the "front door vs. foundation" framing. Selling the CLO on adoption pain while actually delivering a leadership transformation? That's honest about how enterprise buying works. I've been in enough rooms where the pitch and the product are two different things. At least these guys know they're doing it and designed around it.

Where I rolled my eyes: the "Participant Transformation" table. Before/After tables are the consulting equivalent of stock photography. "Before: I'm confused. After: I'm enlightened." Every course has this table. It always looks the same. It always promises the same thing. Cut it or earn it.

I also caught a whiff of overselling in the revenue projections — 99% margins, $8.8M by Year 3, AI costs under 0.5% of revenue. That's not a course spec, that's a pitch deck for investors. Doesn't affect whether the course is good, but it tells me someone's excited about the business model, which always makes me wonder if the product got the same level of rigor.

Overall gut: This is better than 80% of what I've sat through. It's not another 2x2 with a TED Talk stapled to it. There's actual architecture here. Whether it survives contact with real leaders is a different question.


2. The Great Distillation

Here's my honest reaction as someone who's been leading for two decades: the insight is right, but it's not as new as the spec thinks it is.

"Focus on the work that only you can do" is a message I've heard from Peter Drucker, from Stephen Covey, from every executive coach I've ever paid. The Great Distillation is a better version of it — the AI framing is genuinely sharper than "delegate and elevate" — but let's not pretend this is a revelation that will shake senior leaders to their core.

What IS different: the mechanism. Previous versions of this insight said "you should focus on higher-order work." This one says "a machine is about to do the lower-order work whether you focus on higher-order work or not." That's not aspiration. That's physics. The urgency is real in a way that Covey's "important vs. urgent" matrix never was.

The "cognitive automation" framing is where this earns its keep. The distinction between automating tasks and automating thinking — that lands. I've automated plenty of tasks. I haven't grappled with what it means when a machine can do the thinking I was hired to do. That's a different conversation and I'd pay attention to it.

Verdict: The insight is an evolutionary upgrade, not a revolution. But it's a meaningful upgrade. I'd respect the facilitator who delivered it well. I'd also respect them more if they acknowledged that experienced leaders have heard versions of this before and explained what makes this one structurally different — in the room, not just in the spec.


3. Own / Augment / Automate

Would I use this language in a real meeting?

Actually... maybe. And that surprises me.

Here's why: "Own, Augment, Automate" passes the meeting test because it's verb-based and immediately actionable. When someone on my team asks "what should we do with AI for this process?" — I could actually say "let's map it: what do we Own, what do we Augment, what do we Automate?" and people would get it without a training course.

Compare that to something like "The Four Demands of AI Leadership" (which I gather was the previous version). I would never say "I need to Clarify, Calibrate, Cultivate, and Configure" in a meeting without feeling like I'm reading from a consultant's slide.

The "Three Zones" language also works because it mirrors how operational leaders already think. We already categorize work into keep/change/eliminate. Own/Augment/Automate is just a smarter version of that for the AI context.

The "Augment" zone as the "contested middle" — that's the real insight. The spec is right that the interesting decisions aren't in Own or Automate. They're in the gray area where you're deciding how much AI involvement is appropriate. That's where I spend my actual time.

What wouldn't I use: "The Great Distillation" as a phrase. "Partnership Audit." "Capability Fog." These are course vocabulary — useful inside a training room, weird in a Tuesday ops meeting. "Architecture Debt" I might actually steal, though. That one maps cleanly to how I already talk about organizational problems.


4. The Exercises

Let me go through each one honestly.

The Partnership Audit (Module 1) — I'd engage. Reluctantly.

"List 10 things you spent time on last week and categorize them." Would I do this? Yes, but only because I'm slightly compulsive about tracking my time anyway. The exercise itself is pedestrian — it's a time audit with a twist. What makes it work is Step 3: "Look at the ratio." The reveal that most of my week is in the Augment/Automate zone would be genuinely uncomfortable for me. I'm not sure I'd like what I see, which means it's doing its job.

The risk: If a facilitator lets this become a checkbox exercise ("just quickly categorize your week"), it dies. The power is in the discomfort of the reveal. That requires a facilitator who can sit in silence while people process.

The Discovery Sprint (Module 2) — I'd phone this one in.

"Push three workflows through Efficiency → Augmentation → Transformation." This feels like a structured brainstorm, and I've done a thousand structured brainstorms. The three dimensions are sensible but the exercise feels like it was designed for leaders who haven't thought much about AI. For someone who uses AI daily, being asked to brainstorm "what if AI could..." for 12 minutes feels like being told to color inside the lines.

The pair expansion saves it somewhat. Having someone push me past my first ideas is where the real value lives. But in on-demand mode with just the AI? I'd speed through it.

The Design Sprint / Partnership Map (Module 3) — This is the one.

This is where I'd actually work. Deconstructing a workflow into component tasks and making explicit allocation decisions with a "why" column — that's real operational design. I do versions of this when we redesign processes. The AI framing forces a rigor I usually skip.

The unscaffolded second map is the cleverest design choice in the whole course. Making me do it alone after doing it with help — that's how you build a skill, not just deliver an experience. I'd respect the course for this.

The Readiness Diagnostic (Module 4) — Depends entirely on my mood.

Rating my team on four readiness gaps could be powerful or it could be navel-gazing. If I'm in a reflective state, I'd find value here — especially the Identity Integration gap, which I've never seen named before. That's the team member who can use the tools but doesn't know who they are in the new model. I have people like that. Naming it matters.

If I'm in an impatient state (which, let's be honest, is most states), I'd rate my team quickly, write three obvious actions, and move on.

The 90-Day Demonstration Plan (Module 5) — High potential, high risk.

See section 6 below. This is either the best exercise in the course or the most ignored. No middle ground.

Summary: I'd genuinely work on Modules 1, 3, and possibly 4. I'd coast through Module 2. Module 5 depends on whether I believe the plan will actually get used.


5. The AI Thinking Partner

Let me be direct: the meta-awareness is either genius or insufferable, and the line between those two is razor-thin.

Having an AI coach me through a course about AI leadership while occasionally saying "notice what just happened — you're doing the thing" — I get why they designed it. The medium IS the message. Conceptually, it's elegant.

In practice? First time the AI says "Notice that you just partnered with an AI to make a better decision — that's cognitive partnership, that's what this course is about," I will either have a genuine insight or I will close the laptop. It depends entirely on whether the AI has earned that moment by actually challenging my thinking, or whether it's congratulating itself for existing.

What I'd actually do: I'd test it. Not maliciously — I'd give it real answers about my real work and see if its challenges are substantive or generic. If it says "Walk me through your reasoning on that" and then follows up with something specific to what I actually wrote, I'm in. If it gives me the same follow-up it would give anyone, I'm out.

The spec's error recovery protocols are smart. "Fair enough — you know your work better than I do. Tell me what I'm missing." That's the right posture. An AI that admits its limitations is infinitely more trustworthy than one that pretends to be omniscient.

The "Respect for Dissent" protocol is the thing that would actually earn my trust. If I push back on the Great Distillation and the AI says "you may be right — let's work with your actual numbers and see what they tell us" instead of trying to argue me into the framework — that's when I'd start taking it seriously.

The scaffolding removal (AI does less as the course progresses) is the best structural choice in the entire spec. By Module 5, I'm building my own plan with minimal AI input. That's how you avoid creating AI dependency in a course about AI leadership. Someone was thinking clearly when they designed that.

Verdict: I'd approach it as a skeptic, but if it's built to the spec, it would probably win me over by Module 3. The critical test is whether the AI challenges me with specifics from my own context or with generic frameworks.


6. The 90-Day Plan

This is the section that will determine whether this course is different from every other one.

Every leadership course ends with some version of "now make a plan." They all sound great in the room. They all go in a drawer. I have a drawer. I know what's in it.

What's different here — potentially:

  1. Kill criteria. I have never seen a leadership course ask "what would cause you to stop?" That's an operational discipline, not a training exercise. It signals that the course designers understand how real decisions work. You don't just measure success — you define failure and design for it.

  2. Scale triggers. "What must be true before I expand" is the right question. Most AI initiatives fail not because the pilot fails but because someone scales a pilot that worked in one context into a context where it doesn't. Explicitly designing the scale decision is genuinely valuable.

  3. Three layers of evidence. Leading indicators, lagging indicators, and story indicators — this is how I actually report to my CEO. Numbers and narratives. If the 90-day plan teaches leaders to build both, they'll be more effective communicators, AI or not.

What's NOT different:

It's still a plan made in a training room, which means it's made with training-room energy and training-room optimism. Monday morning, my inbox has 47 unread emails, three fires are burning, and the 90-day plan is already competing with everything else for my attention.

What would make it stick: The 8-week reinforcement system. If someone actually emails me on Tuesday morning with a specific prompt tied to a specific action from my plan — not "how's your journey going?" but "you committed to checking your 30-day leading indicators this week; here are the ones you defined" — that changes the math. The reinforcement system is the structural answer to the drawer problem.

The accountability pair is either the best idea or the most ignored feature. In my experience, peer accountability works when both people are engaged and fails when either one isn't. I'd want to choose my own accountability partner, not be assigned one.

Verdict: The 90-day plan architecture is genuinely better than anything I've seen in a leadership course. Kill criteria alone puts it ahead. Whether it survives Monday morning depends on the reinforcement system, which I won't know until I experience it.


7. What Would Actually Change My Behavior?

Forget the spec's promises. Here's what would stick, six months later:

  1. The Three Zones language. I'd actually use Own/Augment/Automate when thinking about process design. It's a better lens than what I currently use. Six months from now, when someone proposes an AI initiative, I'd instinctively ask "is this an Augment play or an Automate play?" and the answer would shape the design.

  2. The "why" column in the Partnership Map. Making the reasoning explicit for every allocation decision — I'd carry that forward. Not as a formal exercise, but as a discipline. "Why is a human doing this? Because ___." If I can't answer that, it tells me something.

  3. The Four Readiness Gaps as a diagnostic. Specifically, the Identity Integration gap. I'd look at my team differently. The person who's resisting AI adoption — maybe their problem isn't skill. Maybe it's identity. I'd ask different questions.

  4. Kill criteria for AI initiatives. I'd build this into every business case going forward, AI or not.

What would NOT stick:

The course vocabulary. As I said up front, "The Great Distillation" and "Capability Fog" are useful inside a training room and gone from my speech within a week. "Architecture Debt" is the one exception.

Net assessment: Three durable behavior changes and one diagnostic framework. That's more than most courses deliver. Most give me zero things I'm still doing at six months.


8. What I'd Tell My CEO

"Better than I expected. Not life-changing, but genuinely useful. The framework is practical — I'd actually use the language back here. The best part is it doesn't try to teach AI tools; it teaches how to think about designing work around AI, which is the part we're actually bad at. I'd recommend it for our director-level and above. The people leading AI initiatives need this more than another tool demo. Don't roll it out to the whole company — start with the leaders who are actually making allocation decisions about AI."

30 seconds. That's my honest answer.


9. Red Flags

Things that would make me check out:

  1. The "you're doing $50,000 work when you're capable of $500,000 work" line. I get what they're going for — reframe the ratio as opportunity, not threat. But telling a senior leader that most of their week is $50,000 work is one wrong inflection away from condescending. A good facilitator saves this. A mediocre one makes me fold my arms for the rest of the afternoon.

  2. The Provocation Essay as pre-work. Asking me to read a 1,200-word essay and write three sentences about my feelings before I've even shown up? That's homework. I'm a senior VP. I'll do it, but I'll resent it, and the course starts with a small deficit of goodwill.

  3. "The medium IS the message" self-congratulation. The spec mentions this idea multiple times — the AI coaching you about AI is itself the lesson. Yes, I understand the concept. If the actual course beats this drum more than twice, it will feel like the course is more impressed with itself than I am.

  4. The pair sharing in every single module. "Turn to the person next to you." Five times in four hours. I know pair sharing is pedagogically sound. I also know that by Module 4, people are performing their pair shares, not engaging with them. Vary the format or reduce the frequency.

  5. Nothing about the political reality of AI adoption. The course assumes the leader's challenge is design and team readiness. In my actual life, the biggest challenge is navigating organizational politics — getting budget, managing up, dealing with the executive who thinks AI is a fad and the one who thinks it's magic. Module 5 touches on this (proving value to stakeholders) but it's underweight. The course would land harder if it acknowledged that the leader's biggest obstacle might not be their team — it might be their boss's boss.

Things that are NOT red flags but could become them:


Summary Assessment

| Dimension | Rating | Notes |
|-----------|--------|-------|
| Intellectual rigor | 8/10 | Framework is tight. The 5D progression is logical. Not groundbreaking but solid. |
| Practical utility | 7/10 | Three Zones and Partnership Map are genuinely useful tools. Discovery Sprint is weak. |
| Respect for the audience | 7/10 | Mostly treats leaders as adults. A few moments of over-explaining. |
| Durability | 8/10 | Tool-agnostic design is smart. Framework should age well. |
| Differentiation | 7/10 | Module 5 (Demonstrate) is genuinely different. Modules 1-4 feel familiar in places. |
| Likelihood I'd recommend it | 7/10 | I'd send my directors. I'd tell my peers it's worth the time. I wouldn't call it transformational. |

Bottom line: This is a well-built course that respects leaders' intelligence more than most. It's not going to change my life, but it would change how I think about three or four things, and that's more than I can say for the last five programs I attended. The test will be execution — the spec is strong, but specs don't deliver courses. Facilitators and AI interactions do. If the AI Thinking Partner is as good as designed, and the facilitator can resist reading slides, this could be genuinely good.

I walked in with my arms crossed. I'm leaving with them uncrossed. That's not nothing.


Review completed March 2, 2026
The Skeptic — Senior VP of Operations