12-data-flows/email-pipeline
2026-06-13 07:27:25 SAST
Email Pipeline

The email pipeline fetches Outlook emails via the MS Graph API, classifies them by project, generates summaries, and stores the results in email.db. See 09-integrations/email for the full integration reference.

Pipeline Overview

MS Graph API  -->  Fetch last 2 days of emails
      |
      v
inbox-rules.md  -->  Filter newsletters / noise
      |
      v
Gemini Flash  -->  Classify to project + summarize
      |
      v
email.db  -->  Insert metadata + summary
      |
      v
email_attachment_sync.py  -->  Download attachments (best-effort)
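
The stages above can be sketched as a single function, for illustration only. This is not the pipeline's actual code: `run_pipeline`, the injected `fetch`, `is_noise`, and `classify` callables, and the `emails` table schema are all hypothetical stand-ins for the MS Graph call, the inbox-rules.md filter, the Gemini Flash call, and the real email.db layout. Attachment sync is left out because it runs as a separate best-effort step.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def run_pipeline(fetch, is_noise, classify, conn, now=None, days=2):
    """Run fetch -> filter -> classify -> insert; return the number of rows stored.

    fetch(since)    -> iterable of dicts with "id", "sender", "subject" (assumed shape)
    is_noise(email) -> True if the email should be dropped
    classify(email) -> (project, summary) tuple
    """
    now = now or datetime.now(timezone.utc)
    since = now - timedelta(days=days)  # "last 2 days" window from the diagram

    # Hypothetical schema; the real email.db layout is not described here.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS emails "
        "(id TEXT PRIMARY KEY, sender TEXT, subject TEXT, project TEXT, summary TEXT)"
    )

    stored = 0
    for email in fetch(since):
        if is_noise(email):
            continue  # newsletters and other noise never reach the classifier
        project, summary = classify(email)
        cur = conn.execute(
            "INSERT OR IGNORE INTO emails VALUES (?, ?, ?, ?, ?)",
            (email["id"], email["sender"], email["subject"], project, summary),
        )
        stored += cur.rowcount  # counts only rows that were actually inserted
    conn.commit()
    return stored
```

Passing the stages in as callables keeps the sketch runnable without network access; using the message id as the primary key is one way to make overlapping two-day windows safe to re-run.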

Trigger

Step-by-Step

1. Fetch Emails

2. Filter Noise

3. Classify and Summarize

4. Insert into email.db

5. Attachment Sync (Best-Effort)

Backfill Pipeline

A separate script handles enriching existing records:

Search (Downstream)

The email-index skill provides the query interface:

python ~/.claude/skills/email-index/query.py project "Project Heron"
python ~/.claude/skills/email-index/query.py search "MOI harith"
python ~/.claude/skills/email-index/query.py sender "stephan.spamer"
python ~/.claude/skills/email-index/query.py recent --limit 10
python ~/.claude/skills/email-index/query.py stats

All queries run on Luci, reached via SSH over Tailscale. From other machines, use the email-index skill, which handles the SSH connection.
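
How the skill opens its connection is not shown on this page. As a rough sketch, assuming plain `ssh` and a Tailscale hostname of `luci` (both assumptions), a remote query could be assembled like this. `build_remote_query` is a hypothetical helper, not part of the skill.

```python
import shlex

# Script path as given in the commands above; "~" is expanded by the remote shell.
QUERY_SCRIPT = "~/.claude/skills/email-index/query.py"


def build_remote_query(host, subcommand, *args):
    """Return an argv list that runs query.py on `host` over ssh."""
    # Quote each argument so values with spaces survive the remote shell.
    quoted = [shlex.quote(a) for a in args]
    remote_cmd = " ".join(["python", QUERY_SCRIPT, subcommand, *quoted])
    return ["ssh", host, remote_cmd]
```

The returned list can be handed to `subprocess.run(..., capture_output=True, text=True)` to execute the query and collect its output.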

State Management

File                       Purpose
.email-sync-state.json     Tracks last sync date
.email-sender-map.json     Cached sender-to-project mapping
~/.graph-api-token.json    MS Graph API OAuth token
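
A minimal sketch of reading and writing a JSON state file such as .email-sync-state.json. The file's real schema is not documented on this page, so the `last_sync` key in the usage note is an assumption, and `load_state` and `save_state` are hypothetical helpers. The write goes through a temporary file and an atomic rename so an interrupted run does not leave a half-written state file.

```python
import json
import os
import tempfile
from pathlib import Path


def load_state(path):
    """Return the saved state dict, or an empty dict if the file does not exist."""
    p = Path(path)
    if not p.exists():
        return {}
    with p.open(encoding="utf-8") as f:
        return json.load(f)


def save_state(path, state):
    """Write `state` as JSON, replacing the old file atomically."""
    p = Path(path)
    # Create the temp file in the same directory so os.replace stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=p.parent, prefix=p.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(state, f, indent=2)
        os.replace(tmp, p)
    except BaseException:
        os.unlink(tmp)  # clean up the temp file if the write or rename failed
        raise
```

For example, a sync run might call `state = load_state(".email-sync-state.json")`, then after a successful fetch set `state["last_sync"]` to the current date and call `save_state(".email-sync-state.json", state)`.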

Error Handling

Key Takeaways
