{"definition_raw":"---\nid: tailscale-watch\ntitle: Tailscale UDP/DERP health watch\nschedule: \"*/30 * * * *\"\ntimeout: 60\nretry: false\nenabled: true\nnotify_on: failure\nnotify_to: dm\nrun_as: shell\ncommand: \"python3 /home/lucienne/workspace/scripts/tailscale_watch.py\"\ntags: [infrastructure, tailscale, monitoring]\nruntime_profile: direct_python\n---\n\n**OVERRIDES runtime profile:** uses `direct_python` because this network health\nwatch runs deterministic Tailscale checks and does not invoke Claude/Codex or\nany LLM API.\n\nHalf-hourly health check on Tailscale's NAT-traversal layer.\n\nBackground: 2026-05-07 SAST \u2014 Tailscale on Luci has broken UDP NAT discovery\n(`netcheck` reports `UDP: false`, `IPv4: no addr found`, no DERP latency).\nPhone-to-server traffic relays via DERP-jnb with 241ms-2.1s RTT, making the\nmobile PWA feel unusable. Workaround in place: phone uses public IP\n`http://204.168.188.33:3001` direct. SSH and admin paths still work fine via\nDERP since lower-frequency traffic tolerates the latency.\n\nFull diagnosis:\n~/workspace/reports/tailscale-degraded-2026-05-07.md\n\nBehaviour of this watcher:\n- Run `tailscale netcheck` and `tailscale ping elmars-s26-ultra`\n- Classify state: healthy | degraded | broken\n- Persist a rolling history to `~/workspace/state/tailscale_health.json`\n- Alert (force=True, bypasses quiet hours):\n  - Recovery: when state flips from degraded \u2192 healthy\n  - Warn: degraded continuously for \u226524h\n  - Escalate: degraded continuously for \u22657d, recommend Cloudflare Tunnel\n- Each alert rate-limited to once per 24h per key\n\nParent MC ticket: MC-2963.\n","id":"tailscale-watch","last_run":{"duration_s":6.448524,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/414021.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=879.66\n","started_at":"2026-06-13T07:00:35.072990+02:00","status":"completed"},"next_run":"2026-06-13 07:30","next_run_iso":"2026-06-13T07:30:00+02:00","runs":[{"duration_s":6.448524,"finished_at":"2026-06-13T07:00:41.524444+02:00","id":414021,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/414021.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=879.66\n","started_at":"2026-06-13T07:00:35.072990+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.406119,"finished_at":"2026-06-13T06:30:34.040577+02:00","id":413931,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/413931.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=879.16\n","started_at":"2026-06-13T06:30:27.630495+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.384426,"finished_at":"2026-06-13T06:01:14.607523+02:00","id":413845,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/413845.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=878.67\n","started_at":"2026-06-13T06:01:08.220300+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.44849,"finished_at":"2026-06-13T05:30:35.154295+02:00","id":413753,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/413753.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=878.16\n","started_at":"2026-06-13T05:30:28.703481+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.451982,"finished_at":"2026-06-13T05:04:36.928241+02:00","id":413677,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/413677.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=877.73\n","started_at":"2026-06-13T05:04:30.473486+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.421556,"finished_at":"2026-06-13T04:30:33.496359+02:00","id":413581,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/413581.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=877.16\n","started_at":"2026-06-13T04:30:27.072893+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.459441,"finished_at":"2026-06-13T04:04:47.970222+02:00","id":413501,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/413501.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=876.73\n","started_at":"2026-06-13T04:04:41.507848+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.515058,"finished_at":"2026-06-13T03:31:31.331327+02:00","id":413406,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/413406.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=876.18\n","started_at":"2026-06-13T03:31:24.813505+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.485091,"finished_at":"2026-06-13T03:00:45.037366+02:00","id":413317,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/413317.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=875.67\n","started_at":"2026-06-13T03:00:38.549097+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.365435,"finished_at":"2026-06-13T02:30:34.275125+02:00","id":413227,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/413227.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=875.16\n","started_at":"2026-06-13T02:30:27.907118+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.530986,"finished_at":"2026-06-13T02:02:31.307844+02:00","id":413143,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/413143.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=874.7\n","started_at":"2026-06-13T02:02:24.774152+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.400672,"finished_at":"2026-06-13T01:31:04.358380+02:00","id":413052,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/413052.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=874.17\n","started_at":"2026-06-13T01:30:57.955015+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.49215,"finished_at":"2026-06-13T01:00:48.381024+02:00","id":412967,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/412967.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=873.67\n","started_at":"2026-06-13T01:00:41.886161+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.399351,"finished_at":"2026-06-13T00:30:34.231425+02:00","id":412878,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/412878.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=873.16\n","started_at":"2026-06-13T00:30:27.829280+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.349532,"finished_at":"2026-06-13T00:00:57.172026+02:00","id":412791,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/412791.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=872.67\n","started_at":"2026-06-13T00:00:50.820490+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.486646,"finished_at":"2026-06-12T23:30:32.680530+02:00","id":412700,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/412700.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=872.16\n","started_at":"2026-06-12T23:30:26.191468+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.424959,"finished_at":"2026-06-12T23:01:57.047102+02:00","id":412616,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/412616.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=871.69\n","started_at":"2026-06-12T23:01:50.619673+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.455118,"finished_at":"2026-06-12T22:30:33.273815+02:00","id":412526,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/412526.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=871.16\n","started_at":"2026-06-12T22:30:26.816330+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.444798,"finished_at":"2026-06-12T22:00:42.681220+02:00","id":412441,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/412441.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=870.67\n","started_at":"2026-06-12T22:00:36.233639+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"},{"duration_s":6.464029,"finished_at":"2026-06-12T21:30:33.672209+02:00","id":412352,"log_path":"/home/lucienne/workspace/logs/task-runs/tailscale-watch/412352.log","output":"tailscale-watch: state=degraded udp=False ipv4=None ping_via=derp hours_degraded=870.16\n","started_at":"2026-06-12T21:30:27.205392+02:00","status":"completed","task_id":"tailscale-watch","task_name":"Tailscale UDP/DERP health watch"}],"runs_limit":20,"schedule":"*/30 * * * *","schedule_label":{"description":"Every 30 minutes","is_custom":false,"label":"Every 30 min","sort":1,"sort_time":""},"stats":{"avg_duration":6.782328051282051,"completed":351,"failed":0,"timeout":0,"total":351},"task":{"_description":"**OVERRIDES runtime profile:** uses `direct_python` because this network health\nwatch runs deterministic Tailscale checks and does not invoke Claude/Codex or\nany LLM API.\n\nHalf-hourly health check on Tailscale's NAT-traversal layer.\n\nBackground: 2026-05-07 SAST \u2014 Tailscale on Luci has broken UDP NAT discovery\n(`netcheck` reports `UDP: false`, `IPv4: no addr found`, no DERP latency).\nPhone-to-server traffic relays via DERP-jnb with 241ms-2.1s RTT, making the\nmobile PWA feel unusable. Workaround in place: phone uses public IP\n`http://204.168.188.33:3001` direct. SSH and admin paths still work fine via\nDERP since lower-frequency traffic tolerates the latency.\n\nFull diagnosis:\n~/workspace/reports/tailscale-degraded-2026-05-07.md\n\nBehaviour of this watcher:\n- Run `tailscale netcheck` and `tailscale ping elmars-s26-ultra`\n- Classify state: healthy | degraded | broken\n- Persist a rolling history to `~/workspace/state/tailscale_health.json`\n- Alert (force=True, bypasses quiet hours):\n  - Recovery: when state flips from degraded \u2192 healthy\n  - Warn: degraded continuously for \u226524h\n  - Escalate: degraded continuously for \u22657d, recommend Cloudflare Tunnel\n- Each alert rate-limited to once per 24h per key\n\nParent MC ticket: MC-2963.","_file":"tailscale-watch.md","_path":"/home/lucienne/workspace/tasks/tailscale-watch.md","command":"python3 /home/lucienne/workspace/scripts/tailscale_watch.py","enabled":true,"id":"tailscale-watch","notify_on":"failure","notify_to":"dm","retry":false,"run_as":"shell","runtime_profile":"direct_python","schedule":"*/30 * * * *","tags":["infrastructure","tailscale","monitoring"],"timeout":60,"title":"Tailscale UDP/DERP health watch"}}
