Human-walkthrough test of every process and UX enhancement built across the Wingman-parity, mobile-UX, and orchestrator-flow projects. Format: DO → EXPECT. Test desktop 1280×720 and mobile 375×812. MC: http://localhost:3001 (Luci) / http://100.118.207.3:3001 (Tailscale).
Tester records for each step: a screenshot, PASS/FAIL, and a one-line note. Flag each issue found as Blocker, Major, Minor, or Polish.
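To keep the per-step records consistent across testers, a small typed structure can help. The following is a minimal sketch only: `StepRecord`, `Severity`, `Viewport`, and `summarize` are hypothetical names introduced for illustration and are not part of MC.

```typescript
// Hypothetical record shape for one walkthrough step; not part of MC itself.
type Severity = "Blocker" | "Major" | "Minor" | "Polish";
type Viewport = "desktop-1280x720" | "mobile-375x812";

interface StepRecord {
  segment: number;          // e.g. 1 for "Dashboard (console)"
  step: string;             // short description of the DO action
  viewport: Viewport;
  screenshot: string;       // filename of the screenshot taken for this step
  result: "PASS" | "FAIL";
  note: string;             // the one-line note
  severity?: Severity;      // set only when an issue is flagged
}

// Count failed steps per severity so Blockers surface first in the report.
function summarize(records: StepRecord[]): Record<Severity, number> {
  const counts: Record<Severity, number> = { Blocker: 0, Major: 0, Minor: 0, Polish: 0 };
  for (const r of records) {
    if (r.result === "FAIL" && r.severity) {
      counts[r.severity] += 1;
    }
  }
  return counts;
}
```

Passing steps and failures without a severity are ignored by `summarize`, so an unflagged FAIL is worth double-checking before the report is filed.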
Segment 1 — Dashboard (console)
- DO: Open / (dashboard). EXPECT: single-viewport operator console with no long scroll on desktop; KPI cards across the top.
- DO: Read each KPI card. EXPECT: each headline number has a sub-state line (e.g. "3 running · 2 scheduled · 0 failed"), not just a bare number.
- DO: Look for connectivity gauges. EXPECT: any service/integration health shows as an N/M fraction.
- DO: Check the header. EXPECT: personalised greeting + live clock/date.
- DO: Find an empty panel (if any). EXPECT: instructive empty state ("Register a task to begin…"), not a blank "No data".
- DO (mobile 375×812): reload /. EXPECT: no horizontal scroll; quick-start chips clear the FAB (not clipped); nothing hidden under the FAB at full scroll; stat labels readable (≥13px).
Segment 2 — Bottom nav
- DO: Look at the bottom nav on any page, mobile 375×812. EXPECT: 4–5 primary tabs + a "More" tab — NOT 10 tabs overflowing off-screen.
- DO: Tap "More". EXPECT: a bottom sheet opens listing the remaining sections (Reports, Briefings, Alerts, Console, etc.) — all reachable.
- DO: Tap each nav tab. EXPECT: every tab ≥44px tap target; navigation works; no dead tabs.
Segment 3 — Board
- DO: Open /board. EXPECT: ticket cards in a clean single column; titles wrap to 2 lines.
- DO: Look at filter tabs (Focus / Active / Needs Input / etc.). EXPECT: each tab shows its count inline in the label; switching is instant; empty filters show an instructive empty state.
- DO (mobile): look at a ticket card's action icons (edit / complete). EXPECT: each ≥44px with a clear gap — no risk of mis-tapping "complete" (destructive).
- DO: Find the FAB. EXPECT: anchored bottom-right, within thumb reach, clear of the bottom nav.
- DO: Use the composer "Quick Ticket" mode — type a title, submit. EXPECT: ticket created, appears on the board.
Segment 4 — Ticket detail + glass-box worker log
- DO: Open a ticket that has worker activity (e.g. a recently-done ticket). EXPECT: worker timeline renders.
- DO: Find a "Used N tools" group in the worker log. EXPECT: collapsible panel; expand it → see per-tool rows.
- DO: Expand a tool row. EXPECT: an Input section with the call's JSON/args.
- DO: Look for file-action chips. EXPECT: Created / Edited / Viewed /path chips with verb icons.
- DO: Look for command chips. EXPECT: ✓ $ command chips with a success tick.
- DO: Find inline action buttons in the timeline. EXPECT: real buttons (Deploy / Run Code Review / approve) rendered at the decision point.
- DO: Find an "Agent asked a question" panel (on a needs-input ticket if available). EXPECT: collapsible Q&A block, badged "Answered" once resolved.
- DO (mobile): open the same ticket. EXPECT: Status/Priority/Assigned/Runtime controls are collapsed into ONE accordion (collapsed by default) — title + description + conversation visible on load, not buried under 2.5 screens of dropdowns.
Segment 5 — Tasks page
- DO: Open /tasks. EXPECT: recurring-task cards, each with next-run + run-count chips, a status chip (disabled / no-runs / completed), and an on/off toggle.
- DO: Look at the Recent Runs filter tabs. EXPECT: each tab carries an inline count (All N / Failed N / …).
- DO: Search a nonsense term. EXPECT: an instructive empty state appears (not a blank table).
- DO (mobile): check the toolbar. EXPECT: search full-width on its own row; Sort + direction not cramming/truncating; filter chips wrap (not clipped off the right edge).
Segment 6 — Insights
- DO: Open /insights on mobile 375×812. EXPECT: NO horizontal scroll (page stays 375px wide).
- DO: Find the log blocks (Recent Failures / Command Log). EXPECT: long strings wrap, don't force sideways scroll.
- DO: Look at the data tables (Agents / Scheduler Tasks). EXPECT: on mobile they render as stacked label:value cards, readable — not squished desktop columns.
Segment 7 — Reports & Briefings
- DO: Open /reports. EXPECT: paginated (≈15–20 per page or "Load more"), NOT 120 cards in one giant scroll.
- DO: On a report card, find the actions. EXPECT: one "Actions" button (≥44px) opening a bottom sheet (Share / PDF / Convert / Delete) — Delete asks for confirm.
- DO: Open /briefings. EXPECT: paginated; page loads fast (not a 50,000px monster); the CEO Curation accordion is collapsed by default on mobile.
- DO: On a briefing card, find audio. EXPECT: a single play button (≥48px), not a raw native <audio> bar; tapping it loads and plays the audio.
- DO (mobile both pages): check every button. EXPECT: all ≥44px tap targets.
Segment 8 — Apps
- DO: Open /apps on mobile. EXPECT: compact cards (≈64px) or a 2-column grid, showing 6+ apps per screen rather than 3 oversized cards.
- DO: Scroll to Service Controls. EXPECT: a clear divider before the section; Start/Restart/Stop buttons ≥44px.
Segment 9 — Orchestrator: composer "Chat with Luci" (P1)
- DO: In the composer, pick "Chat with Luci" mode. Type a plain question, submit. EXPECT: the reply comes back inline in the composer (with a thinking/heartbeat indicator while waiting) — NOT a redirect to a separate workbench page.
- DO: Send a second message. EXPECT: continuity — the session remembers the first message (it's the persistent orchestrator, not a throwaway).
- DO: Look for a "view in console" link. EXPECT: present, links to the live console session.
Segment 10 — Orchestrator: intent decomposition (P4)
- DO: In "Chat with Luci", state a high-level multi-part intent (e.g. "we need to ship a new briefings layout and fix the mobile nav"). EXPECT: the reply renders proposal cards — one per suggested ticket — each with title, meta (project/assignee/priority), and a "Create ticket" button.
- DO: Inspect the visible text of the reply. EXPECT: no raw mc-proposal code block leaks into the visible chat; only clean prose + cards.
- DO (mobile): check the proposal-card buttons. EXPECT: "Create ticket" / "Create all" buttons ≥44px.
- DO: Click "Create ticket" on one card. EXPECT: the ticket is created on the board.
Segment 11 — Global checks (apply throughout)
- Every interactive control (button, link, icon, tab) ≥44×44px on mobile.
- No element <13px for body text; secondary grey text meets contrast.
- No horizontal overflow on any page at 375px.
- No regressions — existing nav, ticket detail, workbench, composer Quick-Ticket / "Luci do it now" modes all still work.
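The numeric thresholds in the global checks can be expressed as small pure functions, which makes it easier to apply them uniformly to measurements taken during the walkthrough. This is a sketch under stated assumptions: `MeasuredControl`, `undersizedControls`, `hasHorizontalOverflow`, and `isTextTooSmall` are hypothetical helpers, not existing MC code; only the 44px, 13px, and 375px values come from the list above.

```typescript
// Hypothetical helpers for the global checks; only the thresholds come from the checklist.
interface MeasuredControl {
  label: string;
  width: number;   // rendered width in CSS px
  height: number;  // rendered height in CSS px
}

const MIN_TAP_PX = 44;
const MIN_BODY_FONT_PX = 13;
const MOBILE_WIDTH_PX = 375;

// Controls whose rendered box is smaller than the minimum in either dimension.
function undersizedControls(controls: MeasuredControl[], min: number = MIN_TAP_PX): MeasuredControl[] {
  return controls.filter((c) => c.width < min || c.height < min);
}

// True when page content is wider than the viewport, i.e. the page scrolls sideways.
function hasHorizontalOverflow(scrollWidth: number, viewportWidth: number = MOBILE_WIDTH_PX): boolean {
  return scrollWidth > viewportWidth;
}

// True when a body-text font size falls below the 13px floor.
function isTextTooSmall(fontSizePx: number, min: number = MIN_BODY_FONT_PX): boolean {
  return fontSizePx < min;
}
```

In a browser devtools console, the inputs could be taken from `element.getBoundingClientRect()` for control sizes and `document.documentElement.scrollWidth` for page width. The contrast check is left to a dedicated tool, since it needs colour values rather than sizes.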
Backend-verified separately (not Tessa — note only)
- P2/P3 orchestrator inbox: scheduler writes to orchestrator_inbox; mc_pickup drains it and the persistent session digests it ("All N routine — acknowledged"). Verified via DB + the live persistent session, not the browser.
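For readers unfamiliar with the write, drain, digest flow described above, the following in-memory sketch shows the general shape. It is an illustration only and makes several assumptions: the `Inbox` class, the `InboxItem` fields (including the `routine` flag), and the `digest` function are hypothetical, and the real orchestrator_inbox schema and mc_pickup behaviour are not shown in this plan.

```typescript
// In-memory stand-in for the inbox; the real schema is not described in this plan.
interface InboxItem {
  id: number;
  routine: boolean;  // assumed flag; real column names are unknown
  body: string;
}

class Inbox {
  private items: InboxItem[] = [];

  // Scheduler side: append an item to the inbox.
  write(item: InboxItem): void {
    this.items.push(item);
  }

  // Pickup side: return everything queued and leave the inbox empty.
  drain(): InboxItem[] {
    const drained = this.items;
    this.items = [];
    return drained;
  }

  get size(): number {
    return this.items.length;
  }
}

// Digest step: summarise how many of the drained items were routine.
function digest(items: InboxItem[]): { total: number; routine: number; allRoutine: boolean } {
  const routine = items.filter((i) => i.routine).length;
  return { total: items.length, routine, allRoutine: items.length > 0 && routine === items.length };
}
```

The property worth verifying in the backend check is the same one the sketch makes explicit: after a drain the inbox is empty, and the digest accounts for every drained item.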