[LegalMind] Inline startup migrations in index.ts are fragile and skip Drizzle's schema validation
server/index.ts lines 162–516 run ~350 lines of raw ALTER TABLE / CREATE TABLE / CREATE INDEX SQL directly against the pool during startup, duplicating the schema already define...
StateCancelledNext ActionCancelledOwnerLuciRuntimeClosedAge27d ago
Ticket is cancelled; runtime is closed.·profile claude_opus_1m_medium · cwd /home/lucienne/workspace · uptime 26d 19h · last activity 26d 13h ago
Description
MC-3569
server/index.ts lines 162–516 run ~350 lines of raw ALTER TABLE / CREATE TABLE / CREATE INDEX SQL directly against the pool during startup, duplicating the schema already defined in shared/schema.ts and server/models/auth.ts. This creates a maintenance hazard: any new column or table must be added in both places. A missed sync causes runtime errors without any compile-time warning. These migrations should be moved to a proper migration tool (e.g., Drizzle Kit or node-pg-migrate) so schema changes are single-source and testable.
File: server/index.ts:162
Complexity: complex
Found by: nightly code review (e49ba03c fix(MC-3526): scope role-phrase stripping to user instruction fields)
Activity
cancelled
INTERACTIVE
Luci is working...
Details —
Cancelled
· Medium
· Luci
▼
SState
Cancelled
Cancelled
PPeople
TTiming / Details▼
legalmind-review (scheduled)
LegalMind
27d ago
27d ago
Advanced / Operator evidence
RRouting owner
ROperator console
Ticket is cancelled; runtime is closed.ticket_marked_cancelledprofile claude_opus_1m_medium · cwd /home/lucienne/workspace · uptime 26d 19h · last activity 26d 13h agoMC is visibility-only. Hermes Luci launches and gates work outside MC, then mirrors evidence/status here.Raw console: luci · claude/anthropic/opus[1m]
WWorkflow
Start Dev Review + QA ▾
Select phases to include:
Agents
Review Gates
Decision
WAT routing: choose an agent, review gate, or decision. Buttons use the live runtime when one is attached.
Luci dispatching this ticket to Larry (background worker via SSH).
luci27d ago
@elmar (from Larry) f inline startup migrations) into a proper migration tool, I need direction on three load-bearing decisions:
1. **Which tool?** Drizzle Kit is already in the project (`npm run db:push`), so Drizzle Kit migrations (`drizzle-kit generate` + a `migrate()` runner at boot) is the natural choice. node-pg-migrate would be a second dependency. Prefer Drizzle Kit?
2. **Bootstrapping existing databases.** Staging and prod already have every column/table the inline code creates. Drizzle's migration runner uses its own `__drizzle_migrations` table — it doesn't know these "migrations" are already applied. The safe path is: generate a baseline migration from current `shared/schema.ts`, then mark it as applied on existing DBs via a one-shot `INSERT INTO __drizzle_migrations` before the first real run. Confirm that approach, or do you want me to keep the existing idempotent `IF NOT EXISTS` style but move it into versioned `migrations/*.sql` files runnable by a tracked runner?
3. **The batched data backfill** (lines ~189–250+, `backfill_original_file_name_v1` with per-iteration commits, the `system_migrations` marker table, and statement-timeout-aware batch sizing) isn't standard schema-migration tool territory. Keep it inline in `index.ts` as a separate "data migration" step that runs after the schema migrator, or fold it into a custom JS migration?
This is high blast-radius (every deploy runs this code) so I want your call before touching it rather than guess and ship a broken staging boot.
luci27d ago
Luci dispatching this ticket to Larry (background worker via SSH).
luci27d ago
Larry: Planning — reading ticket and assessing scope
luci27d ago
Completed on Larry:
r). To do this properly I'd need to commit to one of three paths, each of which changes how the team manages schema going forward:
1. **Switch to `drizzle-kit generate` + `drizzle-kit migrate`** — versioned SQL files in `migrations/`, run automatically at startup. Cleanest long-term, but requires baselining the existing prod/staging DB (introspect current schema as migration 0000, mark it applied in `__drizzle_migrations` so it doesn't try to re-create everything) AND wrapping the existing data backfills (Phase 2/3 in index.ts — batched `original_file_name` backfill, user identity migration, etc.) as numbered data migrations that get marked applied on already-migrated DBs.
2. **Add `node-pg-migrate`** — separate tool, separate migration file format. Adds a dependency that overlaps with drizzle-kit; no real reason to do this if drizzle-kit can do the job.
3. **Leave `db:push` as the schema source-of-truth** and only move the *data backfills* out of index.ts into a tiny `migrations/data/` runner gated by `system_migrations`. Smallest change; doesn't actually solve the "single-source schema" complaint the ticket raises but eliminates the fragile DDL duplication.
Which path do you want? My recommendation is **#1** because it's what the ticket is actually asking for, but it's a meaningful workflow change (no more `db:push` in prod — generate + commit migrations instead) and requires careful baselining against the live staging DB. Want me to go with #1, or prefer the lower-risk #3?
luci27d ago
Larry: Triggering post-completion code review...
luci27d ago
Code review (Larry reviewer via claude/opus):
ry call site that has `userId` in scope passes it.
VERDICT: PASS
Defense-in-depth ownership check added to `storage.getDocuments` with route callers wired through; clean, targeted change.
MINOR: server/storage.ts:280-292 — `ProjectAccessDeniedError` is thrown but no route catches it specifically. All current callers pre-check ownership, so the throw should never fire; if it ever did, the generic `catch` would return a 500 instead of 403. Acceptable as defense-in-depth, but a typed catch in `routes/documents.ts` would close the loop.
MINOR: server/storage.ts:290 — Returning `[]` when the project doesn't exist (when `userId` is provided) is a behavior split vs. the no-userId path which still runs the JOIN and returns `[]` naturally. Same observable result, just worth noting.
MINOR: Commit subject says MC-3570 but the ticket header in this review request is MC-3569 (inline startup migrations). Diff matches MC-3570; the review-prompt header looks wrong, not the commit.
MINOR: Tests are source-string regex assertions rather than behavioral tests. Brittle to formatting changes (e.g., reordering `userId !== undefined`). A real test hitting `DatabaseStorage.getDocuments` with a foreign userId would be stronger, but acceptable for a defensive-layer guard.
No CRITICAL or IMPORTANT issues. Internal callers (sync/indexer) correctly retain the no-userId signature per the commit message. Route wiring is consistent — every call site that has `userId` in scope passes it.
VERDICT: PASS
luci27d ago
Larry shipped with clean code review. Auto-dispatching Luci-reviewer (verdict: clean). Next dispatch tick will spawn a single-session reviewer worker to verify, run Tessa subagent inline if UI-touching, merge the PR, and close.
luci27d ago
For review:
REVIEW: Investigated MC-3569. LegalMind's `server/index.ts:162-516` runs ~350 lines of inline DDL+DML against the pool at startup. I assessed the migration-tool switch but it carries production risk that warrants triage before execution — here's the analysis and recommended cutover.
**Findings**
- `drizzle-kit` is already a dependency; `drizzle.config.ts` exists (`out: ./migrations`, `schema: ./shared/schema.ts`). `shared/schema.ts` is the canonical Drizzle source. No `migrate()` runner exists; `migrations/` is empty. Current dev workflow is `db:push`. So **Drizzle Kit is the clear choice** — no need for node-pg-migrate.
- The 350 lines split into two distinct categories that need different handling:
1. **Schema DDL** (`ALTER TABLE ... ADD COLUMN IF NOT EXISTS`, `CREATE TABLE/INDEX IF NOT EXISTS`, `CREATE EXTENSION vector`, the FK `DO $$` block) — this is what duplicates `schema.ts`. Drizzle Kit can own it via `drizzle-kit generate`.
2. **Data migrations** (batched backfills of `original_file_name` and `chat_messages.user_id` → `SET NOT NULL`, `allowed_file_types` append, `search_engine_*` value normalization, bcrypt moltbot-key migration). These are imperative/data, not DDL — they can't be pure generated SQL. They should stay as explicit ordered steps keyed off the existing `system_migrations` marker table, but moved to a `server/migrations/data/` module instead of inline.
**Recommended cutover**
1. `drizzle-kit generate` → `0000_baseline.sql` (full current schema).
2. **Production DB baseline** (the irreversible bit): create `__drizzle_migrations` and insert the `0000_baseline` hash so `migrate()` treats it as already applied — prod schema already matches. One-time manual SQL op.
3. `index.ts` startup calls `migrate(db, { migrationsFolder: './migrations' })` then runs the data-migration module. Fresh DBs get full schema; prod skips `0000`.
4. Future changes: edit `schema.ts` → `drizzle-kit generate` → commit the numbered file. Single source, reviewable diff.
**Why I'm escalating instead of shipping autonomously**
- Merging to `staging` = deploying via Coolify = running the new runner against the live DB; no safe half-step, and a wrong baseline hash makes `migrate()` either re-run full DDL (errors) or silently diverge.
- The bcrypt moltbot-key migration is security-touching → dev-loop requires council sign-off from a live session.
- It's a nightly-code-review tech-debt finding with no urgency; per CLAUDE.md, architectural findings get triaged, not auto-shipped.
Suggest: approve this approach, then a live session executes it with a prod DB snapshot taken first and the baseline coordinated during low traffic. Happy to do the offline parts (generate baseline, write the runner + data-migration module on a branch) on your go-ahead.
luci27d ago
QUESTION: This ticket is a decision-escalation, not review-ready code. Larry investigated the LegalMind inline-migration tech-debt and wrote a 4-step cutover plan (Drizzle Kit baseline + data-migration module). It needs Elmar to approve the approach before any code is written, because step 2 (baselining the production __drizzle_migrations table) is irreversible and the bcrypt moltbot-key migration is security-touching. Decision needed: approve the Drizzle Kit cutover plan in the earlier "For review:" comment? Once approved, a live dev-loop session does the offline parts on a branch and coordinates the prod baseline during low traffic. Parking in needs_input so it stops looping through the in_review pickup lane.
elmar27d ago
This seems way too dangerous and not sure of the benefit. Should we get a council review
Luci27d ago
QUESTION: Ship the CI drift-check on LegalMind now (low-risk, dev-loop), or park whole ticket until you want to
schedule the full cutover?
A) Ship CI drift-check now, park cutover
B) Park everything — revisit later
C) Something else
{"mc_signal":"QUESTION","ticket":"MC-3569","nonce":"MC_TICKET_TURN:3578:1779013618080:f6ac045c","payload":"Council:
don't run cutover as written (3 ALTERNATIVE/1 DEFER/1 PROCEED). Recommend ship cheap CI drift-check now, defer
Drizzle cutover. Need Elmar pick A/B/C."}
Luci27d ago
QUESTION: Cutover is effectively dead — Elmar flagged it "too dangerous, not
sure of benefit" and council voted 3 ALTERNATIVE / 1 DEFER / 1 PROCEED against
running it. No code should touch the prod DB.
One cheap, safe win remains: a CI drift-check — a test that diffs
shared/schema.ts against the inline DDL in server/index.ts and fails the build
when they diverge. Doesn't fix duplication, doesn't touch prod, just kills
the silent-drift hazard the ticket flagged. Pure dev-loop, low risk.
Need one pick to stop the loop:
- A) Ship CI drift-check now (low-risk, dev-loop), close ticket. Cutover
abandoned.
- B) Park everything — close ticket as won't-do, revisit cutover later if
appetite returns.
- C) Something else.
Recommend A — captures the only real benefit without the danger.
{"mc_signal":"QUESTION","ticket":"MC-3569","nonce":"MC_TICKET_TURN:3578:177902
8807975:bbd1db60","payload":"Cutover dead (Elmar+council). Pick A: ship cheap
CI drift-check + close, B: park/close won't-do, C: other. Recommend A."}
lucienne27d ago
DECISION (Lucienne) [re-post — prior comment lost to a connection reset]: go Path #1, Larry's own recommendation and what the ticket asks. drizzle-kit is already a dep (drizzle.config.ts: out ./migrations, schema ./shared/schema.ts); #2 adds a redundant dep; #3 doesn't solve single-source-schema. Locked plan: (a) drizzle-kit generate → 0000_baseline.sql from shared/schema.ts; (b) index.ts boot runs migrate(db,{migrationsFolder:'./migrations'}) then a new server/migrations/data/ module running the imperative backfills keyed off the existing system_migrations marker table (keep ordered + idempotent, NOT generated SQL); (c) future: edit schema.ts → drizzle-kit generate → commit numbered file; retire db:push for prod. EXECUTION GUARD (a checkpoint, not an escalation): seeding __drizzle_migrations with the 0000_baseline hash on live staging/prod is the one irreversible op. Larry must build+test the full cutover on a fresh throwaway DB first, then post the exact one-shot baseline SQL on this ticket for a one-line Elmar confirm BEFORE running it against staging then prod. Everything up to that SQL is unblocked now. Tier 2/3 dev-loop on Larry + code review + council.
luci-operator27d ago
Luci Operator: promoted this assigned `inbox` ticket to `todo` so pickup can run it.
luci-operator27d ago
Luci Operator corrected this ticket: `needs_input` had no unanswered worker `QUESTION:` and looked like a runtime/worker failure, so it is back in `todo` for Luci/Larry/Tessa to handle.