Fetched 93 unique posts from RSS feeds: hot, new, top_day, top_week, top_month, top_year, top_all
I have been running a personal AI assistant setup for a little over a month now. I started with Apple Notes but found that AppleScript was limited and would not let Hermes add and delete notes without a fuss. This project originally started as a second-brain setup, but then I kept seeing complaints about token usage. At the time I was on OpenRouter; even though I run locally now, I still wanted to eliminate the bloat. I also wanted a space where Hermes could log every time we fixed something, so I didn't have to start from scratch whenever an update wrote over our patch. My answer has been Obsidian: not as a fancy AI notebook, but as a structured knowledge base the assistant can read from and write to autonomously. Here's how it actually works under the hood. (Yes, Hermes helped me write this because I committed to writing this post a few hours ago, buckle up.)

The Three-Tier Memory System

The core idea is simple: not all information needs the same treatment. Some facts change every day. Some are stable for months. Some are just event logs. I split memory into three tiers:

Tier 1 — Hot Memory (per-session, ~9K characters)
This is injected into every conversation turn automatically. It covers things like communication preferences, active projects, recent corrections, and procedural quirks. Think of it as the assistant's working memory: fast to access, but limited in size. When it gets full, something has to move.

Tier 2 — Vault Living Files (stable reference, on-demand)
When hot memory hits about 67% capacity, I scan for entries that are stable enough to promote. Things like environment configs, operational context, and known failure patterns get written to markdown files in the Obsidian vault. The assistant reads these on demand when deeper context is needed. This keeps the per-turn context window lean while still giving access to months of accumulated knowledge.
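To make the Tier 1 to Tier 2 promotion rule concrete, here is a minimal sketch. It assumes the roughly 9K-character budget and 67% threshold from the post; the `Entry` class, its `stable` flag, and the function name are hypothetical and not part of Hermes or Obsidian.

```python
from dataclasses import dataclass

HOT_MEMORY_LIMIT = 9_000   # approximate character budget mentioned in the post
PROMOTE_THRESHOLD = 0.67   # promotion starts at about 67% capacity


@dataclass
class Entry:
    text: str
    stable: bool  # hypothetical flag: True if the fact rarely changes


def promote_stable_entries(hot: list[Entry]) -> tuple[list[Entry], list[Entry]]:
    """Return (kept, promoted) once hot-memory usage crosses the threshold."""
    used = sum(len(e.text) for e in hot)
    if used < HOT_MEMORY_LIMIT * PROMOTE_THRESHOLD:
        return hot, []  # plenty of room left, nothing moves
    kept = [e for e in hot if not e.stable]
    promoted = [e for e in hot if e.stable]  # candidates for vault markdown files
    return kept, promoted
```

In a real setup the promoted entries would then be written to files in the vault; that step is left out here because the post does not describe its format.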
Tier 3 — Daily Notes (searchable timeline) Every day, a dated note is created with tasks, schedule, and a log section. Events get recorded throughout the day. This creates a searchable history of decisions and actions. The assistant can search past daily notes to recall what happened on a specific date or track how a project evolved over time. The Morning Briefing Pipeline This is the workflow that runs every weekday morning and it's the one that shows the system best in action. 6:50 AM — Data Collection A cron job fires and does three things: 1. Fetches tasks from Todoist via API, categorizing them by project (work, personal, side business) 2. Fetches calendar events from Google Calendar, filtering out noise like sleep tracking entries 3. Creates or updates the daily note at Daily/YYYY-MM-DD.md in the Obsidian vault The daily note follows a strict format: - Tasks section with checkboxes, priority tags, and overdue tracking - Schedule section with time-blocked events - Empty log section for recording things during the day - Wins section for end-of-day wrap-up - Context section with backlinks to people, decisions, and files 7:00 AM — Delivery A second cron job reads the cached data and delivers a formatted briefing directly to my Telegram. It's clean — no checkboxes, just bullets organized by category with an overdue count. I wake up, check my phone, and know exactly what the day looks like. The Finance Briefing Every weekday at 9 AM, another cron job runs: Navigates to Yahoo Finance for each tracked ticker Extracts current price, daily change, and recent headlines Classifies each he…
The team at Nous Research just dropped Hermes Desktop in public preview bringing their powerful open-source Hermes Agent into a polished native GUI app that runs locally on your machine. Download link https://hermes-agent.nousresearch.com/desktop
I just set 3 /goals, not even $1 used.
credits to the x post https://x.com/i/status/2043233348085227734
I actually like the pro model and hit $14 vibe coding last night using OpenRouter, which is still around $3.50 ish. It looks like you have to sign up directly with API billing. It’s China, so data retention is a thing.
Install Hermes. Run Hermes. Cool, you have an AI agent. Now stop. Before you do anything, before you ask it to write code, research something, or anything else, you open two things, in this exact order: hermes skills config, then hermes tools. Skills first. Tools second. Why in that order? Because skills define what the agent knows how to do, and tools define what it can actually touch. You decide what the agent is capable of, then you give it the hands to execute. Not the other way around. “But Hermes ships with a million things preloaded, it’s bloated.” Yeah. It does. That’s not a bug, it’s a feature. You just haven’t used it yet. Every skill in Hermes has an on/off switch. Polymarket? Off. That Pixel art skill? Off. Whatever you don’t need: off. hermes skills config is literally a curses UI where you scroll through everything that’s installed and toggle it per-platform or globally. There’s no excuse to leave stuff you don’t use burning tokens and misleading your agent. Same with toolsets. hermes tools gives you a grid of every toolset: web, browser, terminal, file, code execution, vision, delegation, and a bunch more. Most of them are on by default. If you’re only using Hermes from CLI and you don’t want it touching your browser or spawning subagents, you turn those off. You use Hermes only from Telegram? Turn the CLI toolset off, and the other way around. It’s a checkbox interface. It takes 90 seconds. The “bloat” complaints spreading on Twitter come from people who installed a Swiss Army knife and got mad that it came with every blade. If you use profiles, this is non-negotiable. hermes profile create lets you spin up isolated Hermes instances: different configs, different skills, different memory. This is the right way to use Hermes for multiple use cases, and the separation helps me keep everything focused. But every profile is a blank slate that inherits the full skill/tool stack by default (or the one you used to create the clone). 
If you have a “work” profile and a “research” profile and you never run hermes skills config and hermes tools in both, you’re running the same bloated setup under different names. One profile for coding. One for research. One for home automation. Prune each one independently. Different profiles, different loadouts. This is agentic engineering. The complaints about “Hermes is too heavy” come from people who want to hand an Opus 4.8 a pile of tools and watch it figure it out. That’s not how this works. You are the engineer. You specify what the agent has access to, what it’s allowed to do, and what context it carries between sessions. And don’t think this is overhead, cause this is the actual work. Otherwise, ChatGPT is there, and I still use it when I don’t want to deal with stuff like this. hermes skills config → hermes tools. Then you work. If you skipped that step, you didn’t install a "heavy" or broken agent. You skipped the configuration phase.
Clean, fast, and surprisingly polished for a tool this powerful. The skills ecosystem, toolsets, and desktop experience all fit together really well. Download link https://hermes-agent.nousresearch.com/desktop
What is r/hermesagent? The unofficial community for Hermes Agent by Nous Research - an open-source AI assistant that runs code, manages files, browses the web, chats across platforms (Telegram, Discord, Signal, WhatsApp, email), and remembers past conversations. This subreddit is for people who actually use Hermes - not just hype, not just questions, but real setups, real workflows, real problems, and real builds.
Before you post Search first. Chances are someone already asked it: - Search r/hermesagent - Subreddit wiki (in progress) If your question is about setup, models, cost, Docker, VPS, or integrations, it's very likely been covered already.
Most popular threads (worth reading) These are the highest-signal posts from the community's first months: Models & Cost - DeepSeek v4 Pro — unlimited and almost free (612 votes, 363 comments) - DeepSeek v4 pricing change (522 votes, 81 comments) - Best FREE model for Hermes ATM (409 votes, 79 comments) - Best models after testing with 6 billion tokens (260 votes, 146 comments) - Battle of the $20 providers (165 votes, 127 comments) - Best Models for Hermes Agents — May 2026 Benchmarks (109 votes) - What model are you running your agent on? (77 votes, 145 comments) Local Models (Qwen, GLM, etc.) - Yes, Hermes and Qwen3.5:4b is all I need (214 votes, 100% upvoted) - Qwen3.6-35B-A3B Community Variants — Definitive Guide (119 votes, 97% upvoted) - Qwen3.6-27B Q8 perfect for Hermes Agent (77 votes, 98% upvoted) - Qwen3.6-27B Community Variants — Definitive Guide (56 votes, 99% upvoted) - Model Tier List & Performance Guide (April 2026) (56 points) - Masterthread — Models Feedback (Last 2 Weeks) (25 points) Megathreads - Models Megathread — May 2026 (129 points, 32 threads analyzed) - MEGATHREAD: Use Cases — May 2026 (239 votes, 35 comments) - Skills Hub & Custom Skill Development (Master Thread) Setup & First Steps - The first thing you MUST do with Hermes (301 votes, 70 comments) - The cron job every serious user should have (171 votes, 41 comments) Use Cases & Workflows - Genuinely blown away (277 votes, 71 comments) - Claude Code + Hermes = Massive Unlock (214 votes, 117 comments) - MEGATHREAD: Use Cases — May 2026 (239 votes, 35 comments) Memory & Context - Memory Providers: I tested them all (266 votes, 148 comments) Hermes Agent #1 on OpenRouter - Hermes Agent is now #1 on OpenRouter token rankings (459 votes, 49 comments) Major Releases & News - Nous Research Launches Hermes Desktop (343 votes, 105 comments) - Hermes Agent v0.15.0 — The Velocity Release (264 votes, 103 comments) Kanban - WHAT IS THE NEW KANBAN FEATURE? 
(IT'S GAME CHANGING) (291 votes, 80 comments) Discussion & Community (1/2) - Anthropic just proved the point — platforms will always claw back (363 votes, 75 comments) - Am I missing the point of AI agents? (214 votes, 227 comments) - Stop asking "what can Hermes do?" (155 votes, 91 comments)
Commonly asked questions These topics come up nearly every day. Search before posting: Setup - Installing Hermes: Docker vs local vs VPS - Quick vs Full install — what's the difference? - Hermes Desktop App — connecting to a remote gateway - WSL, Docker, Proxmox setup issues - WebUI confusion ("why does Hermes run in a container and the webUI also run Hermes?") Models & Providers - What's the cheapest/best model for ___? - DeepSeek v4 / Minimax M3 / GPT / Claude — which one? - Local vs cloud model …
Hermes Agent is now #1 on the Global OpenRouter token rankings. While our journey together has just begun, we'd like to take this opportunity to thank our contributors, supporters, and users for all they have done to get us this far. NousResearch on X Love to see this. Hermes has become an important part of my own home-agent journey, both locally (on a Mac Mini) and on my VPS setup (on LightNode). Even though it may not replace OpenClaw for me any time soon, it has already earned a pretty solid place in my stack. I’ve been using it alongside my other tools to experiment with agent workflows, model routing, self-hosted-ish setups, and just generally figuring out what a practical personal AI environment can look like. It’s cool to see an open-source agent project climb this fast and become a real viable competitor.
I wanted to give my agent access to my iPhone, so I spent about 3 months building out this hardware device. It essentially lets an AI control your phone entirely. It can download iOS apps, iMessage people, even make FaceTime audio calls. I originally built it for TikTok automation stuff but realized it was more broadly useful. You can read more on connect ai. Also, this works on a normal device. You don't need to jailbreak it. What do you all think?
I’ve been using Hermes Agent for about a month, and the biggest surprise is not how hard it is to start. It isn’t. Hermes works impressively well out of the box. The work from u/NousResearch, u/teknium-official, and the contributors is already visible from the first run: skills, memory, session search, multiple providers, CLI and gateway support, and a system that is clearly designed to improve through use. The real challenge starts after that first “wow” moment. Because Hermes is powerful enough to make you overestimate how ready you are to use it properly. That was my first lesson: don’t try to build the whole machine on day one. It is very tempting to look at Hermes and immediately think: “I can make this manage my inbox.” “I can make this run my research workflow.” “I can make this handle my TikTok profile.” “I can make this organize half of my life.” Maybe you can. But not all at once. Start with one small workflow. Make it boringly reliable. Then add the next piece. Then connect things. Hermes gives you enough power to scale quickly, but you still need to earn that complexity step by step. Understand what works, and let the system grow with you. And don’t get too frustrated when it breaks. It will. But that’s often where the useful part starts. When you fix a broken workflow, you also make the original idea clearer: what the agent should do, what it should not do, which assumptions were too vague, and which steps needed structure. That is usually when Hermes starts becoming closer to the agent you had in mind. The second thing I wish I understood earlier: profiles are not just a convenience feature. They are one of the cleanest ways to keep Hermes useful. Don’t turn the default profile into a giant backpack full of every skill, every tool, every instruction, and every half-baked idea you had at 2am. Segment things. A research profile should not necessarily behave like a coding profile. 
A personal automation profile should not necessarily carry the same context as a writing profile. A focused Hermes is usually better than a “universal” Hermes that has to guess which version of itself you need. This becomes even more important with skills/config/tools management. Choosing what stays in context is not just optimization. It is part of designing the agent. The third lesson: config is not admin work. It is the product. A lot of “Hermes is acting weird” moments are actually configuration moments. Wrong assumptions. Missing settings. Too many things active at once. A parameter changed without really understanding the tradeoff. A profile that grew messy over time. Before blaming the agent, inspect the environment you built around it. And the funny part is that Hermes is very good at helping you understand Hermes. Ask it to explain your config. Ask what each setting affects. And when something feels off, don’t just tweak things blindly: ask what might be causing that behavior. That alone will save you a lot of random trial and error. The fourth thing: don’t treat the skill system as an accessory. It is closer to the core idea. Hermes is not just “chat + tools.” The interesting part is that it can learn from usage, create skills, improve them, reuse them, and gradually turn repeated behavior into something more durable. That changes the way you interact with it. You are not only prompting an agent. You are shaping an operating environment. And every task you teach Hermes tends to expose something about your own workflow: where your instructi…
A lot of people are getting into Hermes Agent lately, and thankfully the community around it is way more grounded than the crustacean one, where everyone and their mother tells newcomers to just use the latest version of Claude and spend hundreds of dollars a day doing simple stuff like web searches. I wanted to share that there’s a new FREE model on OpenRouter that is an absolute BEAST. It’s easily the best model I’ve ever used outside of the ultra-expensive SOTA models. It’s called Ring 2.6, and it’s currently free on OpenRouter: https://openrouter.ai/inclusionai/ring-2.6-1t:free The tool-calling and troubleshooting capabilities of this model are absolutely insane. I’ve been using it A LOT, and my Hermes experience has been an absolute blast. I usually rely almost exclusively on local/free OpenRouter models (or very cheap ones) for my Hermes setup, and honestly, it works fine like 95% of the time. But that remaining 5% can be REALLY annoying when things break or the model gets stuck. Normally, I only use SOTA models to fix something extremely complex or when I absolutely need to get something right on the first try. But this model? XD This thing THINKS A LOT, so it burns through tokens (I started using it yesterday) like crazy. As you can see in the screenshot, I honestly don’t know if its pricing will still be viable once it officially launches. But man... I really hope it is, because I’m in love with this thing.
UPDATE: Superseded by Hotfix release 0.15.1-2 — use hermes update again or get the latest here: https://github.com/NousResearch/hermes-agent/releases Dateline: May 28, 2026 https://github.com/NousResearch/hermes-agent/releases/tag/v2026.5.28 The Velocity Release. Hermes gets dramatically faster — to start, to run, to ship work, and to grow. Kanban grew into a real multi-agent platform across 104 PRs — orchestrator auto-decomposition, swarm topology, scheduled tasks, worktree-per-task, per-task model overrides. Since v0.14.0: 1,302 commits · 747 merged PRs · 1,746 files changed · 282,712 insertions · 36,699 deletions · 560+ issues closed (15 P0, 65 P1, 19 security-tagged) · 321 community contributors (including co-authors) ✨ Highlights Selected highlights from the release notes. See the official release page for complete info with Issue/PR references! Run_Agent.py Refactor — The agent conversation loop at the heart of Hermes has been reduced (-76%) and extracted into 14 cohesive modules under agent/. Behavior is unchanged: every extraction keeps a thin forwarder on AIAgent, every test patch path still works, every external caller is compatible. The reason you care: future Hermes development moves faster, authors can finally grep the codebase, and the code opens in a blink. Kanban grew into a real multi-agent platform — 104 PRs end to end in this Multi-Agent Maturation Wave. Triage auto-decomposes one task into a tree of sub-tasks. hermes kanban swarm creates a full Swarm v1 graph in one command — root, parallel workers, gated verifier, gated synthesizer, shared blackboard. Tasks support per-task model overrides (cheap models for boilerplate, expensive ones for hard sub-tasks), board-level default workdirs, per-task worktree paths and branches, scheduled start times, configurable claim TTL, retry fingerprinting, stale-task detection, respawn guards, and a drag-to-delete trash zone. Workers report through /workers/active, /runs/{id}, and /inspect endpoints. 
Cold-start perf wave keeps going — Another second saved with 47% fewer per-turn function calls — Three new optimization rounds completed. session_search rebuilt — no LLM, no cost, 4,500× faster — The old session_search was an aux-LLM-powered tool that cost ~$0.30/call and took ~30 seconds to summarize three sessions, sometimes confabulating when the right session wasn't even in the FTS5 hit list. The new shape is one tool with three modes (discovery, scroll, browse) inferred from which args are set — Searching your past sessions for context is now free and instant. Promptware defense — Brainworm-class attacks blocked at three chokepoints — Hermes now defends the context window against prompt-injection attacks that try to hijack the agent via tool output, recalled memory, or stored skills (from Brainworm / Promptware Kill Chain research - Origin HQ, arxiv 2601.09625). Single source of truth (tools/threat_patterns.py) with ~15 new Brainworm/C2 patterns; recalled memory is scanned at load time; tool results get delimiter markers so a malicious file or remote service can't impersonate Hermes' own system content. Paired with a new security-guidance plugin that pattern-matches dangerous code writes. Bitwarden Secrets Manager — one bootstrap token replaces every per-provider API key — Stop keeping plaintext API keys in ~/.hermes/.env. Install Bitwarden Secrets Manager (bws auto-installs lazily on first use), point Hermes at it with one bootstrap token (BWS_ACCESS_TOKEN), and every credential you …
A week ago I posted my Hermes + Codex + Claude Code setup here (https://www.reddit.com/r/hermesagent/s/et4WUIbPbH) and it got more traction than I expected. People built it, hit walls, asked good questions, and made it better. Then the news from this week hit, and it’s worth zooming out. What changed: - Anthropic announced that starting June 15, claude -p and Agent SDK calls get unbundled from the subscription pool. Programmatic usage now meters against a separate monthly credit ($20/$100/$200 by tier) at API rates. No rollover. - There’s a documented case of someone getting billed $200 in API charges because the string “HERMES.md” appeared in a commit message and Anthropic’s backend flagged the account as third-party harness usage. The detection mechanism for that is still live. - The claude -p headless flag has a known bug where it silently routes to API billing even with no ANTHROPIC_API_KEY set and an active Max sub. People have reported four-figure surprise charges. - In April, Anthropic briefly removed Claude Code from the $20 Pro tier entirely. They walked it back, but the signal was sent. This is a pattern, not a series of accidents. Every one of these moves does the same thing: it punishes the people using Claude as a component in their own system and rewards the people using Claude as the system. The honest read on what these companies want: They know the tools are more powerful when wired into our own workflows. They also know that’s the model where they make the least money and have the least control. So the pricing structure, the detection classifiers, the silent billing routes, all of it nudges us back onto their platform. Use Claude inside Claude. Use Claude Code inside the Claude Code session they shipped. Don’t pipe it. Don’t orchestrate it. Don’t build on it. Same playbook as every previous platform shift. Open enough to attract builders, then close enough to extract from them. 
Why this matters for the Hermes crowd specifically: The whole appeal of Hermes is that it’s the orchestrator we own, sitting on hardware we own, talking to whatever tools serve the moment. Claude Code as a coding specialist worked beautifully for exactly that reason. It was a powerful tool serving our system. That arrangement is exactly the thing Anthropic just metered. After June 15, every coding task Hermes hands off through -p is a small tax. Every project file with the wrong string in it is a billing risk. The setup still functions, but the relationship has flipped. We’re not extending their tool anymore. We’re paying tariffs to cross a border. Local LLMs are the answer, but they have to actually work first. Most local models are unusable for real orchestration work for the average person. CPU inference with a full Hermes prompt and tool context was painfully slow when I tried it. I scrapped it. That’s the gap that has to close. Not “can a local model technically reply to a prompt.” That’s solved. The actual question is: can a local model run as the brain of an always-on agent with full tool context, multi-turn memory, and reasonable latency, on hardware a normal person can afford to leave running 24/7. Right now the answer is no. Hardware is too expensive or too slow, the models that fit on consumer GPUs aren’t strong enough at tool use, and the ones strong enough at tool use need data center silicon. The middle ground doesn’t exist yet. What we should be doing: Treating cloud models as rented muscle, not permanent infrastructure. Use them …
If you updated Hermes recently and noticed the new Kanban stuff, go back and actually look at it. I dug into the upstream code and docs because I started integrating it into Hermes Desktop. The screenshot is from that. Kanban feels like Hermes’ first real durable collaboration layer. delegate_task is still the right call for short-lived subagent work: spawn, do work, come back. Kanban is for work that needs to stay visible, survive retries, move between roles, or carried forward by an agent to the point where a human must intervene. That’s a fundamentally different thing. Tasks are stored durably, move through explicit states, can depend on each other, and keep per-attempt run history. So when something gets blocked, retried, handed off, or resumed later, that context isn’t lost. You can actually inspect what happened. The task states are straightforward: triage, todo, ready, running, blocked, done, archived (and they matter a lot) But things go deeper: A child task can wait on a parent task, which means work doesn’t start until the prerequisite work is actually finished. Your Hermes Agent can now become a production line. Period. Another thing I really like is that the dashboard, CLI, and worker tools all share the same board state. It’s one thing (a db), not three disconnected views. And because runs can carry summaries and metadata, handoffs between profiles are much more structured than the usual “hope the next agent figures it out” approach. That context doesn’t just evaporate. And you can see the attempts, what got stuck, and why. A completely different feeling from “the agent kind of did some stuff and now the context is gone.” Also, no fake magic: workers are real OS processes, and the board is local SQLite. It’s not pretending to be distributed orchestration when it isn’t. I actually tried pushing in that direction myself from a very different, much more fragile angle, and I talked about that here in the community before. 
I still think that path is worth exploring, but this Kanban approach is probably much closer to what Hermes actually needed at this stage of growth. One real caveat: it’s single-host by design right now. Don’t oversell it to yourself as some multi-machine orchestration fabric. But as a local, durable, inspectable, human-interruptible coordination layer, this feels like a big step. It makes Hermes feel less like one smart agent doing tricks and more like a system for ongoing work. Research pipelines, review loops, coding tasks, long-running ops stuff: it all makes a lot more sense now, and it no longer feels like something that requires a PhD to set up mentally. I repeat, what I'm saying comes from a deep dive into this new feature, but it's quite new and subject to my own opinion. Some may see things differently, and I'm posting here to discuss what I might be getting wrong. Curious what people here are going to build with it first. *A note to Hermes Desktop users: I tried to make the dashboard as intuitive as possible, but Kanban has a lot going on behind the scenes. If you encounter any difficulties, any feedback is welcome (even insults, lol).
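For readers who want a mental model of the task lifecycle described in this post, here is a small sketch of a task state machine with parent gating and per-task history. The seven state names come from the post; the allowed transitions, the `Task` class, and its fields are assumptions made for illustration and are not taken from the Hermes codebase.

```python
from dataclasses import dataclass, field

# State names are from the post; which moves are allowed is an assumption.
TRANSITIONS = {
    "triage": {"todo", "archived"},
    "todo": {"ready", "archived"},
    "ready": {"running", "archived"},
    "running": {"blocked", "done", "ready"},
    "blocked": {"ready", "archived"},
    "done": {"archived"},
    "archived": set(),
}


@dataclass
class Task:
    name: str
    state: str = "triage"
    parents: list["Task"] = field(default_factory=list)
    history: list[str] = field(default_factory=list)  # simple stand-in for run history

    def move(self, new_state: str) -> None:
        """Apply one state change, refusing moves the table does not allow."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"{self.state} -> {new_state} is not allowed")
        # A child may only start running once every parent task is done.
        if new_state == "running" and any(p.state != "done" for p in self.parents):
            raise ValueError(f"{self.name} is waiting on unfinished parent tasks")
        self.history.append(f"{self.state} -> {new_state}")
        self.state = new_state
```

Keeping the history on the task itself is what makes a blocked or retried task inspectable later, which is the property the post highlights. A durable version would persist these records (the post says the real board uses local SQLite).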
I just hit rate limits after 3-4 prompts in a pretty normal session with 5.5. Codex web UI also suddenly looks different (no longer shows 5 hr rolling window). Did something just change? https://preview.redd.it/rlh462vc805h1.png?width=2088&format=png&auto=webp&s=9bdc59bcd83d8de46dce0b5dda1cc62ffa8a24ea
At what Hermes Agent can do! I’ve had a fairly basic setup for about 3 weeks now. A couple of cron jobs for news and reminders and doing some basic dev tasks on side projects. Using mostly free openrouter models after I spent just $10 in credits. I have a cron job every morning that fetches the current free models, does a quick API call on them to see if they’re active and then ranks them by context window size and saves the top two as my config default and fallback. So every morning I just start a new session in telegram and I have the latest top free working model. This weekend I’ve gotten a load of work done on a side project which is a telegram bot for creating content, invoices and other stuff for a small business. All using Hermes/supabase/vercel. All from my phone. My projects aren’t going to make me a millionaire but it’s spectacular to be able to send telegram voice messages and screenshots and have it just go and plan, implement, test and deploy. All for about $1.50 in tokens.
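The model-ranking step of that morning cron job could look roughly like this. It is only a sketch of the selection logic described above: the `Model` record, its field names, and the sample model IDs are hypothetical, and the actual fetching and health-check calls are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Model:
    model_id: str
    context_window: int
    is_free: bool
    responds: bool  # outcome of a quick test call, performed elsewhere


def pick_default_and_fallback(models: list[Model]) -> tuple[Optional[str], Optional[str]]:
    """Rank working free models by context window and return the top two IDs."""
    candidates = [m for m in models if m.is_free and m.responds]
    ranked = sorted(candidates, key=lambda m: m.context_window, reverse=True)
    default = ranked[0].model_id if ranked else None
    fallback = ranked[1].model_id if len(ranked) > 1 else None
    return default, fallback


if __name__ == "__main__":
    # Made-up sample data, not real model listings.
    sample = [
        Model("example/alpha:free", 128_000, True, True),
        Model("example/beta:free", 1_000_000, True, False),  # free but not responding
        Model("example/gamma:free", 262_000, True, True),
        Model("example/paid", 2_000_000, False, True),
    ]
    print(pick_default_and_fallback(sample))
```

The two returned IDs would then be written to the agent's config as default and fallback; how that is stored is not described in the post, so it is not shown.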
I have an Ollama cloud sub, and I've been switching between the different models to vibe-check their performance. I'm very impressed with GLM 5.1, and particularly with Minimax M3. Their ability to self-solve, dig, and do what they need without much prompting has made Hermes particularly useful for me! I recommend giving them a go if you can: the limits are better with Ollama than with Claude / ChatGPT. I'm also keen to know what other capable models you're getting good usage out of.
Long story short: all the available memory providers kind of suck, for different reasons, except one. Cloud providers suck because they are cloud: vendor lock-in and data retention are just not for me. Hindsight is technically the best in terms of memory, but it's too heavy to run: too many API calls, costly even with cheap models, hidden configuration settings, and too many bugs to deal with. OpenViking is a pain to set up; I dropped out halfway through the process. Holographic: I liked the speed, but the quality was not there, and I'm still unsure whether it was doing anything. Hancho is another one that was a pain to set up. It's pretty good at profiling, but it suffers from the same issues as Hindsight. Then I discovered Mnemosyne. It's not built in by default, but it should be! It's the easiest to set up, lightweight, fully local, and the best balance between quality and speed. I'm essentially making this post because I think Mnemosyne is not getting the attention it deserves. It uses a SQLite database with a fast embedding and a tiny local LLM to consolidate memories, and it's good enough. I swapped the default model for qwen 0.8b and it's even better; using bigger models is possible if you need maximum quality. Try it, I'm curious to know what you think. edit: link: https://github.com/AxDSan/mnemosyne
I considered cost effectiveness as my main motive here. I tried various tasks (web scraping, advanced research analytics, software development, LLM inference enhancements, etc.), and the best were as follows:
1. GPT 5.5 (by far)
2. Kimi k2.6
3. GLM 5.1
4. Minimax M2.7
5. Qwen 3.6 Max
6. Any Gemini model
(For local models, Qwen 3.6 35B A3B is the top option. Qwen 3.6 27B dense is good but too slow for my workflow.)
GPT 5.5 is a real advancement over 5.4. It is the most expensive, but when a statistical research analysis takes 18 hours with GLM 5.1 and less than an hour with GPT, that's a clear choice. I am not wasting 18 hours just to save $10. I have tried Sonnet 4.6. It is awesome, but the cost is really high, so I excluded it.
The subscriptions that I find best (cost effectiveness as my main motive, again):
1. OpenAI $20
2. Opencode Go $10
3. Minimax $10
4. Kimi's $20 plan
5. GLM $18 (if you have the old $3 annual plan, it would go in 2nd place)
Chinese models are awesome. GLM kept getting stuck in loops all the time. Kimi will start getting good, then the 5-hour quota kicks in. Minimax is... fine? It needs excellent prompting to work as desired. GPT 5.5 was the beast in software development, scraping, analysis, and multi-step cron jobs.
I’ve spent the last few weeks obsessing over one goal: having a personal, self-maintaining AI assistant that costs $0 and can be controlled from my phone. It wasn't easy. I started with an AWS EC2 t3.micro instance with 50GB of storage, a minimal setup (using the free credits), and created an Oracle Cloud instance ($300 in free credits, but just for a month, so I used it for experimenting with local models). I was using Termius to SSH into everything from my phone. At first I used OpenClaw. It was cool, but I spent more time fixing it than actually using it. I almost gave up until I saw a video about Hermes Agent. I actually found Hermes while looking on YouTube for how to fix an OpenClaw error (thanks NetworkChuck 🙌🏽). He mentioned the exact same frustrations I was having, and that Hermes had been stable for a month. I didn't even finish the video before I pulled the repo. The best part? It had a "migrate from OpenClaw" feature. I was up and running in minutes. The hardest part is the rate limits. If you use cloud models, especially for code, you hit a wall fast. My solution? The Fallback Chain. Initially I was using openrouter/owl-alpha (stealth models are usually flagships in testing, like big-pickle is deepseek v4), which has a 1M context window and was on multiple rankings. Over time, after I transitioned to Hermes, I wanted a bit more customization. While Owl Alpha was good at tasks, it’s nothing to talk about for roleplay; it just scratches the surface of the character I set in the SOUL md file. On my Oracle instance I had been experimenting with local models (keep in mind, if you go local, you’ll be sacrificing speed for privacy; since the VMs don’t have a GPU, it's slower, about 3-5 minutes for a simple response). The one I was most impressed with is Google’s Gemma-4-31b-it. It played the role perfectly. But if you know Google, you’re familiar with their aggressive rate limiting. So I set up my agent to rotate through providers. 
I start with Gemma 4 for that perfect personality and roleplay via openrouter (add an ai studio api key in BYOK for longer usage). If that hits a limit, I’ve also set the same model via ollama cloud and using Google OAuth directly (basically Gemma 4 3 times lol) And if those all hit limits, it jumps to Qwen3-coder-next (Alibaba, 1M free tokens per model. There’s like 80), then Nova (AWS bedrock), DeepSeek v4 (Azure and Opencode Zen), and Claude Haiku (GitHub). If everything fails, I have Owl Alpha; which is an absolute beast, took almost 70M tokens before I got rate limited once, that too for a few hours. It lives in my Telegram and Discord. It manages my Spotify, handles my emails, and when I need real research done, I have it spawn three separate agents to work in parallel. It’s been 8 days and it hasn't broken once. If you're looking to get AI without spending a fortune, I highly recommend looking into this
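The Fallback Chain described above can be pictured roughly like this. It is a minimal sketch, not how Hermes implements failover: the `Provider` class, the `RateLimited` exception and the stub functions are all hypothetical names made up for illustration, and the provider labels are just strings taken from the post.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class RateLimited(Exception):
    """Hypothetical error a provider call raises when its quota is used up."""


@dataclass
class Provider:
    name: str                   # label only, e.g. "gemma-4 (openrouter)"
    call: Callable[[str], str]  # hypothetical function that sends a prompt


def ask_with_fallback(prompt: str, chain: Sequence[Provider]) -> tuple[str, str]:
    """Try providers in order; return (name, reply) from the first that answers."""
    for provider in chain:
        try:
            return provider.name, provider.call(prompt)
        except RateLimited:
            continue  # this one is tapped out, move down the chain
    raise RuntimeError("every provider in the chain is rate limited")


# Stub providers standing in for real API calls (assumptions, not real clients).
def _limited(prompt: str) -> str:
    raise RateLimited()


def _echo(prompt: str) -> str:
    return f"reply to: {prompt}"


chain = [
    Provider("gemma-4 (openrouter)", _limited),
    Provider("gemma-4 (ollama cloud)", _limited),
    Provider("qwen3-coder-next", _echo),
    Provider("owl-alpha", _echo),
]

print(ask_with_fallback("hi", chain))
```

The order of the list is the whole design: the model you like best goes first, and the one that almost never rate limits goes last.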
So yeah, anyone tried hermes as a full on virtual assistant that you check only once in the morning or evening and let it run wild? For things like delivery or storage management or customer engagement or social media posting etc? Curious to hear from you guys
Hey everyone, I wanted to hear from you all: what cool skills have you made with Hermes? 😄 I also wanted to share mine.
electrical-schematics: generates professional circuit schematics as SVG/PNG from a text description, using a maze autorouter with collision detection and diagonal routing.
datasheet-finder: hunts down electronic component datasheets across four sources, handles obsolete parts with replacement suggestions, stores everything in a persistent PDF vault.
video-use: edits any video by conversation, transcribes with local GPU whisper, burns subtitles, color grades, overlays animations, all through a single prompt.
Part 1 - Part two in a sticky post below.
A community-sourced compilation of real-world Hermes Agent use cases, pulled from Reddit, X/Twitter, GitHub (issues, PRs, community repos), YouTube, Hacker News, blogs, podcasts, LinkedIn, Product Hunt, and GitHub Gists. 276 use cases across 16 categories from 12 sources.
Dev Workflow (61 use cases)
People using Hermes for software development, codebase analysis, CI/CD, multi-agent build pipelines, and developer tooling.
Use Case | Source
12 Hermes instances every day, in parallel - backend team monitors stack, post-training team creates RL environments and benchmarks | @Teknium on X
Multi-agent auto-build workflow (plan to code to QA to ship) - GPT-5.4 orchestrates, MiniMax M2.7 codes, local Qwen 35B tests | @gkisokay on X
Hermes as a watchdog for OpenClaw - saved hours every day | @gkisokay on X
Day 10: agent knows my codebase better than I do | X
Built my own stack, then converged on Hermes | X
Agent sees a file change and auto-acts on it | X
Codex watches Hermes agent-to-agent workflows live | @gkisokay on X
GLADIATOR: 9 Hermes agents, two rival AI companies competing via GitHub stars | YouTube
Telegram to Modal serverless - 40% faster on research tasks | Blog
5 apps built and launched in a single day | LinkedIn
8h/day on Opus: email pipeline with DBOS + Postgres + S3 | GitHub
Audited 129 of my own sessions across 23 days via external RCA script | GitHub
Skill Factory: silently watches workflows and writes SKILL.md + plugin.py | GitHub
CCD multi-agent pod on an M2 Ultra with Mem0 + Qdrant | GitHub
73% of every API call is fixed overhead - built monitoring dashboard to profile token usage | GitHub
Accumulates knowledge about my codebase over time | Blog
Kanban feature built into Hermes - multi-agent kanban workflows for task orchestration | u/itsdodobitch on Reddit
Agent Studio - Zero Human Company - orchestration layer for team of engineer agents | u/labeebk on Reddit
Zero-token watchdog plugin - monitors GitHub repos, RSS feeds, websites. Only notifies on changes | u/JealousPlastic on Reddit
Self-hosted multi-agent system with 2 Hermes instances coordinating via Syncthing - JARVIS orchestrator delegates to specialist workers | u/SpecialistPowerful23 on Reddit
Auto-generating agent skills from Apify Actors - skill packages generated automatically from web scrapers | u/Hayder_Germany on Reddit
Hermes Agent Self-Evolution (early alpha) - DSPy + GEPA to automatically evolve Hermes skills | u/piggystarter on Reddit
OpenCode integration plugin - dispatch coding tasks to multi-agent harness (17 stars) | zaycruz on GitHub
Hermes autonomous server - systemd + native cron, production-ready headless setup (7 stars) | JackTheGit on GitHub
Zo-oroboros swarm executors - Claude Code and Hermes integration (6 stars) | marlandoj on GitHub
DeerMes - Deerflow execution layer + Hermes learning layer (2 stars) | Hompeaz on GitHub
4 Anthropic-inspired agent features: Dreaming, Outcomes, Orchestration, Webhooks (PR #21212) | dashitongzhi on GitHub
Hermes Agent Self-Evolution (2872 stars) - optimize skills, prompts, and code using DSPy + GEPA | NousResearch on GitHub
Maestro - cross-agent coding conductor (151 stars) - structured memory, handoffs, plan-approve-execute coordination | ReinaMacCredy on GitHub
Lossless Context Management plugin (424 stars) - DAG-based context engine that never loses a message | stephenschoettler on GitHub
Hermes Labyrinth observability plugin (257 stars) - journeys, crossings, guideposts, and reports | stainlu on GitHub
Icarus plugin - self-memory…
Been using it for 2 months now. The thing I dislike most about Hermes: "Memory is full, let me clean it up." Lol 😂
Hi everyone,
I’m quite new to Hermes Agent and I’m experimenting with practical personal-assistant use cases.
One thing I’m trying to understand is the safest way to give my agent access to low-risk personal accounts, for example a club/membership account where I want the agent to check events, discounts, benefits, availability, etc.
I’m not talking about banking, work systems, email, or anything highly sensitive.
My current idea is:
- Store the credentials locally on my homelab/mini-server
- Keep them out of the Hermes/Jarvis profile itself
- Do not paste passwords into Telegram/chat
- Let Jarvis access them only through SSH to the local machine
- Add profile rules saying it must never print, copy, or log the credentials
- Allow read-only actions like checking discounts/events
- Block actions like registrations, cancellations, purchases, or profile changes unless I explicitly approve the exact action
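To make the last two rules concrete, here is a minimal sketch of a small gate script that sits between the agent and the account, so the agent only ever names an action and never sees the password. It is an illustration under assumptions, not a vetted security design: every action name and both helper functions are hypothetical.

```python
# Hypothetical action names for a club/membership site.
READ_ONLY_ACTIONS = {"list_events", "list_discounts", "check_availability"}
APPROVAL_REQUIRED = {"register", "cancel", "purchase", "update_profile"}


def authorize(action: str, approved: frozenset = frozenset()) -> bool:
    """Allow read-only actions; allow risky ones only if explicitly approved."""
    if action in READ_ONLY_ACTIONS:
        return True
    if action in APPROVAL_REQUIRED:
        return action in approved  # human approved this exact action
    return False  # anything unknown is denied by default


def redact(text: str, secrets: list) -> str:
    """Strip known secret values from output before it reaches chat or logs."""
    for secret in secrets:
        if secret:
            text = text.replace(secret, "[REDACTED]")
    return text


print(authorize("list_events"))                     # read-only, allowed
print(authorize("purchase"))                        # blocked without approval
print(authorize("purchase", frozenset({"purchase"})))
print(redact("login ok with hunter2", ["hunter2"]))
```

The point of the deny-by-default branch is that a rule written in a profile can be ignored by the model, while a check in code cannot.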
I’m trying to be as security-conscious as possible, but I’m also a beginner and want to keep the setup simple enough that I can actually use it.
How are others handling this?
Do you store credentials in local .env files, password managers, browser profiles, system keyrings, Docker secrets, or something else?
Do you give the agent direct access, or do you create small tools/scripts so the agent never touches the password directly?
Any recommended patterns or things I should absolutely avoid?
Thanks!
I’m gonna keep this short but I had to come and share this knowledge with y’all because it’s a massive unlock but I see hardly anyone talking about this setup. Install Claude Code on your VPS or Mac mini that you have Hermes running on, and have it be your personal Hermes builder, doctor, advisor, whatever (genuinely endless possibilities). Today after almost giving up on Hermes for the 10th time I just thought “fuck it”, installed Claude on the VPS, pointed it to the .hermes root directory and asked it to fix all the bugs, and it pretty much one shotted it. The bonus is that you can use your Claude subscription and not burn through tokens. Now I consider my Hermes to be my autonomous executor and the place that all my agents, scripts, skills etc live, it has my API keys, my second brain, and is of course my gateway via Telegram, but Claude Code is now where I do much of the building. I highly recommend this setup.
I built a fully autonomous local OS layer (CEM) powered by Hermes. To prove its system-level control, I had it take over my Mac, puppet the mouse and keyboard, and physically hijack Claude's web UI in real time. This isn't a browser extension. It's native OS automation driven by a 5-layer memory routing tree. We don't drag 200k tokens through every basic thought. The tree routes context surgically. We ran this architecture non-stop for days—including beating a $5M funded startup's memory benchmark with an 80.4% strict score—and it cost exactly $30 in Gemini API credits via smart routing. Here is the raw screen recording of the takeover: https://www.youtube.com/watch?v=tybwUiGCwu4
The subreddit has grown fast, which is great but it also means we need a few more people to help keep it organized, useful, and actually enjoyable to read. Right now the biggest needs are not “power mods.” They’re people who can help steward specific lanes of the community. We’re especially looking for people who naturally gravitate toward one of these areas: - Help / troubleshooting setup issues, installs, errors, debugging, common fixes - Models / routing / local LLMs model comparisons, pricing/cost tradeoffs, VRAM, local vs cloud - Infra / hosting VPS, Docker, Coolify, Proxmox, remote setups, deployment - Discussion / workflows / use cases practical usage, how people are actually using Hermes, best practices - Showcase / guides / wiki useful builds, repos, tutorials, documentation, organizing repeated knowledge What we need is simple: - help keep posts correctly flaired - help answer or redirect repeated questions - help identify posts worth saving into a wiki/index - help encourage better discussion, not just one-off support threads You do not need to do everything. The idea is that each person helps cover a lane they actually understand. If you’re active here, know the ecosystem well, and want to help shape the sub into something more useful than a pile of repeated questions, leave a comment or message me with: - which lane(s) you’d want to help with - what experience you have with Hermes / local models / hosting / workflows - whether you want to help casually, or take on a more active mod role We’ll probably start small and practical: - a few lane owners - a cleaner flair system - a pinned index / wiki - better routing for repeat questions If that sounds like your kind of thing, step forward and shoot me a message.
I feel like I’m missing something with AI agents and I’m genuinely trying to understand the hype. I’ve installed Hermes Agent and played around with it a fair bit. Apart from content creation, vibe coding side projects, email cleanup, summaries, Telegram bots sending daily briefs/weather/news etc… I’m struggling to find truly transformative personal use cases. Whenever I search Reddit or YouTube for “how people actually use agents”, it mostly seems to be: - summaries - notifications - inbox management - content generation - automation for the sake of automation Maybe I’m just not creative enough, but none of that feels life-changing to me. For work-related tasks I absolutely understand the value. AI as a copilot makes total sense. But I’m more curious about personal life use cases. People talk about agents like they fundamentally changed their daily life, and I’m trying to understand what those use cases actually are beyond basic convenience. So I’m asking honestly: What are some genuinely high-value personal use cases you’ve found for AI agents that go beyond “daily briefs” and “vibe coding”? Not looking for hype — just real examples from normal people.
I’m usually more of a reader than a poster, but I’ve been getting a lot of useful information from this subreddit, so I wanted to contribute back with what I’ve learned so far. I’m still early in building my Hermes setup, but the biggest lesson for me is that Hermes should not be treated like a random chatbot with tools. It works better when you treat it like a managed operating system that needs structure, permissions, memory control, and a clear chain of command. Here are the main things I’ve learned so far. 1. Don’t try to build the whole machine on day one A few posts/comments here helped me realize that the best way to grow Hermes is not to dump every workflow, tool, skill, and project into it at once. The phrase I keep coming back to is: Boring reliability before expanded authority. Start with one real workflow. Make it stable. Make it repeatable. Make it boring. Then expand. For me, that means using one structured project as the training ground before giving the agent broader authority over other areas. 2. One main operator, then specialized agents only when justified I like the idea of having one main operator profile that acts as the manager/coordinator. But I’m trying not to create a bunch of profiles just because I can. A profile or subagent should exist only when there is a real reason, such as: different domain expertise different model different tools different memory boundary different permission level different user/access boundary The way I’m thinking about it now: One command system. One main operator. Specialized profiles only when they earn their place. 3. Checklist Manifest instead of giant repeated chat checklists The “Nemawashi” / checklist idea from this subreddit was one of the most useful things I found. The problem is that pasting a giant checklist into every response can bloat the chat and waste context. The better version I’m working toward is an external Checklist Manifest file. 
The idea: Agent creates or uses a checklist file for the active project. Agent reads that file before continuing work. Agent updates the file after each checkpoint. Chat only shows a compact progress summary. Example chat summary: Last completed: source review Active: drafting current section Next: package audit Blockers: none Manifest updated: yes That keeps the full checklist durable without turning every response into a wall of text. 4. Proposal is free. Authority is controlled. Execution is logged. This has become one of my main rules. I want the agent to propose improvements freely. If it sees a repeated task, a possible new skill, a better folder structure, a memory gap, or a workflow issue, I want it to say so. But I don’t want it silently modifying its own governance, prompts, memory structure, tools, or skills. The safer pattern is: Agent identifies the issue. Agent proposes the change. Human reviews and approves the exact wording or patch. Agent applies only the approved change. Agent logs what changed. That seems like the right balance between self-improvement and control. 5. SOUL.md, AGENTS.md, memory, skills, and project files should stay separate One thing that helped me think more clearly was separating the purpose of each layer. The way I’m thinking about it: SOUL.md = identity, behavior, communication style, mission AGENTS.md = project/system rules, file boundaries, operating instructions memory = durable facts, decisions, preferences, and lessons learned skills = narrow reusable procedures project files = actual wo…
I am stuck on the connecting screen.
been running hermes agent on a hostinger vps for a few days to start with as a beginner. want to know if i'm the only one going through this. just setting it up was a maze. docker container, ssh tunnel from my mac to even open the dashboard, nginx inside the container so the kanban board would load properly. every time something restarts the tunnel breaks and i have to redo it. half the time the terminal interface throws errors because some module didn't install or the config has a blank field nobody told me about. then this week the whole thing went into a loop. three tasks, ~288 worker runs in 2.5 hours. every run loads a bunch of context before it even starts doing anything useful. the workers were running out of api credits but exiting like everything was fine, so the system kept spawning new ones every 60 seconds thinking the last one just finished. i had a failure limit set to 2 and it did nothing because this specific kind of crash apparently doesn't count. burned through $5 in api credits while i was away from my laptop. could have been way worse if i hadn't checked in time. few things i want to ask: is anyone else running hermes on a vps instead of just on their own computer? i went the vps route so i could leave tasks running overnight, but the maintenance is wearing me down. those of you running it locally, do you just leave your laptop on all night? i'm using openrouter so i can switch between models, currently using kimi. would it be smarter to just pay openai or anthropic directly instead? or stay on openrouter and just pick cheaper models? what's everyone else doing to keep costs down? is this loop thing a new bug from a recent update or has it always been there? wondering if going back to an older version would help or if i need to fix the code myself. any way to set a hard spending limit per task? like "stop everything if this one task burns more than X tokens, no matter what." couldn't find anything obvious in the settings. 
hermes does some cool stuff when it works. but right now i don't trust it enough to leave it running for more than an hour without checking on it. would love to hear how others are handling this.
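For the "stop everything if this one task burns more than X tokens" idea, here is a rough sketch of what a hard per-task budget could look like if you wrapped the worker spawner yourself. This is not a Hermes setting and none of these names come from Hermes; `TaskBudget` and `BudgetExceeded` are hypothetical. The key detail is that every run counts against the cap, including runs that exit "successfully", since that was the failure mode in the loop above.

```python
class BudgetExceeded(Exception):
    """Raised when a task goes over its token or run cap."""


class TaskBudget:
    def __init__(self, max_tokens: int, max_runs: int):
        self.max_tokens = max_tokens
        self.max_runs = max_runs
        self.tokens_used = 0
        self.runs = 0

    def record_run(self, tokens: int) -> None:
        """Count every run, clean exit or not, then enforce both caps."""
        self.runs += 1
        self.tokens_used += tokens
        if self.tokens_used > self.max_tokens:
            raise BudgetExceeded(f"{self.tokens_used} tokens > {self.max_tokens}")
        if self.runs > self.max_runs:
            raise BudgetExceeded(f"{self.runs} runs > {self.max_runs}")


budget = TaskBudget(max_tokens=1000, max_runs=5)
try:
    for _ in range(10):         # pretend the scheduler keeps respawning workers
        budget.record_run(400)  # each run loads ~400 tokens of context
except BudgetExceeded as err:
    print("stopping task:", err)
```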
Love using GPT 5.5 Pro for bigger picture researching/planning but wondering if anyone found a method to allow Hermes to use it? Even a workaround to have the GPT 5.5 Pro in OpenAI's web UI would be great
Most "run your agent for free" advice is just "use this one free provider" and then you hit its rate limit in 20 minutes and you're stuck. The thing that clicked for me: Hermes isn't one model, it's a router. You don't need one unlimited provider. You need several limited ones and automatic failover between them. When one taps out, Hermes rolls to the next and your agent just keeps running. Here's the setup I've landed on.
The free / near-free providers, ranked by how I actually use them:
- OpenRouter free tier - fastest to set up, best starting point. Free models rotate so availability shifts, but for a default it's great.
- Gemini API free tier - easy onboarding, solid default brain. Has free-tier limits, which is fine because of the failover below.
- NVIDIA NIM (Nemotron + open models) - excellent secondary. Smaller ecosystem but genuinely free.
- Local via Ollama - your privacy + offline layer, and the perfect last-resort fallback since it has no rate limit at all. Costs you hardware, nothing else. I also use this for any sensitive data.
- Subscriptions you already pay for (ChatGPT/Codex, Grok, Copilot) - not "free," but effectively free if you're already paying. Most people have one of these sitting idle. Plug it in instead of buying anything new.
The actual trick: failover so you never get stopped (just tell your agent to do this for you when you have the keys in). You can swap the default model for a free model, I just prefer it this way since I have multiple Chat GPT accounts sitting around.
    hermes auth add openai-codex
    hermes auth add google
    hermes auth add nvidia
    # local Ollama needs no key

    model:
      default: openai-codex
      provider: openai-codex
    fallback_providers:
      - provider: google
        model: google/gemini-pro-1.5
      - provider: nvidia
        model: nemotron-3-8b-instruct
      - provider: ollama
        model: gemma4:e4b
Codex → Gemini → NVIDIA → Ollama. When the top one rate-limits or times out, it falls through automatically.
Order matters here - put your most capable/consistent provider first so quality degrades as little as possible when it switches.
Stacking multiple keys from the same provider:
    hermes config set credential_pool_strategies.google round_robin
    hermes config set credential_pool_strategies.openai-codex fill_first
    # strategies: fill_first · round_robin · least_used · random
If you've got more than one key for a provider, Hermes will cycle them...
Separate profiles for separate work - I run different profiles (researcher, writer, coder, local) each pinned to a different provider/model, so my cheap/free models handle the grunt work and I'm not burning my good provider on everything.
Honest tradeoffs:
- Free models aren't frontier-grade. You'll feel it on hard tasks. They're not Opus or GPT-5.5.
- Switching mid-session changes the reasoning style, so context can degrade a bit on a handoff. The fallback order is how you manage that.
So my actual recommendation isn't "go 100% free" but use frontier models for the critical stuff (planning, PRDs, reviewing), cheaper models for task execution and free/local for "brainless" tasks. Did I miss anything here? Curious to hear your thoughts in the comments.
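For anyone wondering what the four credential pool strategy names mentioned above could mean in practice, here is a small sketch. It is my reading of the names only, not Hermes's actual implementation: the `KeyPool` class, its methods, and the key strings are all hypothetical.

```python
import random


class KeyPool:
    """Hypothetical pool of API keys for one provider."""

    def __init__(self, keys, strategy="fill_first", seed=None):
        if not keys:
            raise ValueError("need at least one key")
        self.keys = list(keys)
        self.strategy = strategy
        self.exhausted = set()
        self.uses = {k: 0 for k in self.keys}
        self._next = 0
        self._rng = random.Random(seed)

    def mark_exhausted(self, key):
        """Call this when a key hits its rate limit."""
        self.exhausted.add(key)

    def pick(self):
        live = [k for k in self.keys if k not in self.exhausted]
        if not live:
            raise RuntimeError("all keys exhausted")
        if self.strategy == "fill_first":     # drain key 1 before touching key 2
            key = live[0]
        elif self.strategy == "round_robin":  # rotate on every request
            key = live[self._next % len(live)]
            self._next += 1
        elif self.strategy == "least_used":   # spread load evenly
            key = min(live, key=lambda k: self.uses[k])
        elif self.strategy == "random":
            key = self._rng.choice(live)
        else:
            raise ValueError(f"unknown strategy: {self.strategy}")
        self.uses[key] += 1
        return key


pool = KeyPool(["key-a", "key-b", "key-c"], strategy="round_robin")
print([pool.pick() for _ in range(4)])
```

Under this reading, `fill_first` suits a paid subscription you want to use up before touching a backup, while `round_robin` suits several free-tier keys with identical limits.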
https://github.com/google/skills
Hi everyone! We just found out about this sub and revived our old Reddit account. It has been crazy lately but we will try to stay more active here going forward.
Condensed from the full release notes here, with issues and GitHub user tags: https://github.com/NousResearch/hermes-agent/releases/tag/v2026.4.30
Release Date: April 30, 2026
Since v0.11.0: 1,096 commits · 550 merged PRs · 1,270 files changed · 217,776 insertions · 213 community contributors (including co-authors)
The Curator release: Hermes Agent now maintains itself. An autonomous background Curator grades, prunes, and consolidates your skill library on its own schedule. The self-improvement loop that reviews what to save got a substantial upgrade. Four new inference providers, an 18th messaging platform, a 19th via Teams plugin, native Spotify + Google Meet integrations, ComfyUI and TouchDesigner-MCP moved from optional to bundled-by-default, and a ~57% cut to visible TUI cold start.
- Autonomous Curator: hermes curator runs as a background agent on the gateway's cron ticker (7-day cycle default). It grades your skill library, consolidates related skills, prunes dead ones, and writes per-run reports.
- Self-improvement loop, substantially upgraded: the background review fork (Hermes core self-improvement that decides what memories/skills to save or update) is now rubric-based rather than free-form, biased to present (prefers most recently loaded skill), handles references/templates, and properly inherits the parent's live runtime.
- Skill integrations, major expansion: ComfyUI v5 with official CLI + REST moved from optional to built-in by default. TouchDesigner-MCP bundled by default, expanded with GLSL, and more! Humanizer skill ports a text-cleaner that strips AI-isms. claude-design HTML artifact skill + design-md (Google DESIGN.md spec) + airtable salvage + skill_manage edits in external_dirs + direct-URL skill install + /reload-skills slash command.
- LM Studio, first-class provider: upgraded from a custom-endpoint alias.
- Pluggable gateway platforms + Microsoft Teams + Tencent (Yuanbao) messaging: the gateway is now a plugin host.
- Google Meet plugin: join calls, transcribe, speak, follow up.
- Spotify, native tools + bundled skill + wizard: 7 tools (play, search, queue, playlists, devices) behind PKCE OAuth.
- Four more new inference providers.
- Models dashboard tab + in-browser model config.
- Remote model catalog manifest.
- Native multimodal image routing: images now route based on the model's actual vision capability.
- Gateway media parity: native multi-image sending across Telegram, Discord, Slack, Mattermost, Email, and Signal; centralized audio routing with FLAC.
- TUI catches up to (and past) the classic CLI: LaTeX rendering, /reload .env hot-reload, pluggable busy-indicator styles, opt-in auto-resume of last session, expanded light-terminal auto-detection, session delete from /resume picker, modified mouse-wheel line scroll, and a /mouse toggle that kills ConPTY's phantom mouse injection.
- Observability + achievements plugins: bundled Langfuse observability plugin + bundled hermes-achievements plugin that scans full session history.
- TTS provider registry + Piper local TTS: pluggable tts.providers.registry; Piper ships as a native local TTS provider. (Closes #8508.) (#17843, #17885)
- Vercel Sandbox backend: Vercel sandboxes as an execute_code/terminal backend.
- Secret redaction off by default: default flipped to off. Prevents the long-standing patch-corruption incidents where fake secret-shaped substrings mangled tool outputs. Opt in when you need it.
- Cold-start performance: visible TUI cold start cut ~57% via lazy agent init.
I don't have the money for a Mac Mini and I don't use my agent for writing code, so my experience is on a Windows11 PC. If your setup is different or your use cases are different then mileage might vary, but I'm happy to share my journey and how this little buddy has definitely grown on me over the weeks. I started out wanting to test Hermes as a Claw alternative, and saw a lot of posts recommending it, and of course being attached to an established name like Nous added more credibility than a random Github repo. I have a 5060ti 16gb VRAM and 64gb DDR5 System Ram, VRAM as the bottleneck of course. I've used both LM Studio and Ollama on various setups, Ollama worked best in this scenario, again feel free to test other options yourself. Week 1, running from scratch, a standard install but using Qwen3.5:9b. Assuming larger was better, assuming I needed something even better eventually. For safety and security its running in WSL Ubuntu, not directly on the main windows environment, also better compatibility I think, working entirely from the CLI for the first few days while I got to see what it could do. The only extra add on, I added Hermes Hud: https://github.com/joeynyc/hermes-hud Its pretty, and a good system monitor thats low on resource impact. Its a nice visual representation of how Hermes is doing, skills added, that sort of thing, I don't need a multi agent mission control dashboard overkill, this is fine. After a few days I set up Telegram with the botfather, and I haven't gone back to CLI apart from checking updates or restarting the gateway if I was restarting the PC for any reason. Hermes is now almost entirely a personal assistant on my Telegram App. Where I found the 9b with 128k context chugged along and got things done, the 4B model is snappy, responsive, alive and chatty. 
I had used the 9b version in OpenClaw as well and found it wanted to do everything itself, rarely spawning sub agents unless bullied into it, my Hermes experience has been very different. The 4b model purrs along, seamlessly running a websearch, spawning sub agents to use skills we've built, I am honestly super impressed so far. My concerns about a loss of intelligence have been quickly washed aside by accurate tool calling, and rapid response times. So I'll say caution is a driver, and ok having it exposed to the world via telegram kind of flies in the face of that logic, but beyond that I feel its as locked down as I can. It does not have email access and I'm not installing it into the main Windows System, again thats what I want, it might not suit you, but its as safe as I can afford while still being accessible to me. Use Cases? Daily briefing and schedule keeper, so good, the longer I run it, the more it gets to know me, the more "personality it has" and not the annoying chatgpt/gemini type of thing either, honestly I feel its a better experience without a doubt. Research Assistant, obvious, a perfect always on researcher, it knows my project structure, it knows a new project starts with gathering data, it runs research on topics I mention, it adds the reports to folders I can access, URLs to every story so I can cross check, thats by design, not out of the box, and send reports on whats where and what I need to see. We use a Tavily api key here. Google could be fine, we hit limits on Duckduckgo Professional Content Creator, who isn't creating first drafts with AI? Hermes learned a skill for me that it pulls in the research we've gathered, the puts f…
I'm running Hermes on a Linux VM and accessing either via the non-TUI TUI (just "hermes") or via hermes-webui (https://github.com/nesquena/hermes-webui). It's been pretty good for a while, great feature set, but over the past couple weeks it's gotten really unstable and buggy. The whole reason I'm using Hermes is for stability. I get it, early versions of apps will be unstable. I could revert to an early version, but that's a bad habit to get into. Anyone using other web UI's they like better? I used Open WebUI (https://docs.openwebui.com/) when I was directly connected to LMStudio without Hermes, but it doesn't seem to have Hermes specific features. (profiles, logs, insights, skills, etc.). For stability I'd be willing to switch back. Looking for recommendations. Asking Hermes gave me a bunch of options, but no real user reviews, just marketing material.
Hi all, I’ve finally started adding more hermes agents to my setup (all on the same device). The issue I have is that there’s no easy way to have my agents working together and talking to each other. Because I have my orchestrator agent which is also my coding agent, but I want to soon add an agent just for orchestrating so that I can still interact with my primary agent while another agent is coding. So how do y’all get your agents to talk to each other/ work together? Thanks so much and sorry if this post is a little confusing, let me know if you have any questions.
I feel like someone has probably answered this earlier. When I connect to my VPS it throws an inference error and I can't connect. I'm trying to find answers on this.
Hi everyone, I'm currently looking for the best dashboards or UI tools to manage and monitor a multi-agent system. Specifically, I'm looking for tools that excel at: Agent Tracking & Observability: Visually tracking agent-to-agent interactions, intermediate thought steps, and final outputs. Performance & Cost Monitoring: Tracking token usage, costs, and execution latency for complex agent workflows. User Interface: Clean, user-friendly dashboards for session management or low-code orchestration. Are there any open-source or production-ready tools you highly recommend? (e.g., Chainlit, Langfuse, AutoGen Studio, Dify, etc.) I'd love to hear what you are currently using in your stack and why. Thanks in advance!
TL;DR: Be careful about what you're stuffing in your context window, and use free tiers.
If you've been running Hermes more than a couple of weeks you've probably started looking at cost savings. The token math on a single paid provider with a default setup is brutal. I had a straightforward setup through OpenRouter. No smart routing. No delegation. Just direct calls. Two weeks, ~$150. Cool as shit, but not sustainable.
I've spent about a week digging in to how it works (or rather getting it to dig into itself) and wanted to share some findings.
The fix isn't one magic thing. It's a few layers that sit together to shave token usage and minimise token cost.
Layer 1: Manifest (the router)
Manifest is an open source model router that sits between Hermes and your providers. It classifies every request by complexity and routes it to the cheapest model that can handle it. Sub 2ms per decision, no external calls, no added latency.
I self host it, Hermes points at it as a custom provider.
The model field in your config is not the primary routing key. Manifest looks at the message content itself. You can tell it "use this model" and it'll override with something cheaper if the complexity doesn't justify the expensive one.
My routing table:
Simple / Standard (routine tasks, quick lookups, simple edits) - Primary: DeepSeek V4 Flash Free Tier via OpenCode API token - Fallback: DeepSeek V4 Flash (OpenCode Go sub) > DeepSeek V4 Flash (DeepSeek PAYG)
Complex (multi-step coding, architecture decisions, writing with nuance) - Primary: MiMo-V2.5 via OpenCode Go - Fallback: DeepSeek V4 Pro (DeepSeek PAYG) > MiMo-V2.5 (OpenRouter PAYG)
Reasoning (deep analysis, debugging, planning, anything needing chain of thought) - Primary: MiMo-V2.5-Pro via OpenCode Go - Fallback: DeepSeek V4 Pro (DeepSeek PAYG) > MiMo-V2.5-Pro (OpenRouter PAYG)
Basically cascading through free tiers, into subscriptions, then onto PAYG models.
The simple/standard tier chains through three different free or near-free paths for the same underlying model (DeepSeek V4 Flash): OpenCode Free Tier using an API token, OpenRouter free tier, and then the OpenCode Go sub. Different providers, different rate limits. Combined, I'm at 410M tokens this week, with a total cost of $0.18 on top of the OpenCode Go sub.
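To show the shape of the routing table above in code, here is a toy sketch. The tiers and model chains are copied from my table, but the `classify` function is a deliberately crude keyword heuristic I made up for illustration. It is not how Manifest classifies requests, and the function names are hypothetical.

```python
import re

# Ordered candidates per tier: primary first, then fallbacks.
ROUTES = {
    "simple": [
        "deepseek-v4-flash free tier (opencode)",
        "deepseek-v4-flash (opencode go)",
        "deepseek-v4-flash (deepseek payg)",
    ],
    "complex": [
        "mimo-v2.5 (opencode go)",
        "deepseek-v4-pro (deepseek payg)",
        "mimo-v2.5 (openrouter payg)",
    ],
    "reasoning": [
        "mimo-v2.5-pro (opencode go)",
        "deepseek-v4-pro (deepseek payg)",
        "mimo-v2.5-pro (openrouter payg)",
    ],
}


def classify(message: str) -> str:
    """Toy stand-in for a complexity classifier (assumption, not Manifest's logic)."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    if words & {"debug", "analyze", "plan", "diagnose"}:
        return "reasoning"
    if words & {"refactor", "architecture", "implement"} or len(words) > 40:
        return "complex"
    return "simple"


def route(message: str, unavailable: frozenset = frozenset()):
    """Pick the first available model in the chain for the message's tier."""
    tier = classify(message)
    for model in ROUTES[tier]:
        if model not in unavailable:
            return tier, model
    raise RuntimeError(f"no model available for tier {tier!r}")


print(route("what time is it in Tokyo"))
print(route("debug this stack trace"))
```

Note that the requested model never appears as an input: the message content alone picks the tier, which is the behaviour described above.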
Layer 2: Delegation discipline
This is the biggest hidden cost. The orchestrator (the Hermes instance you talk to) processes everything you ask it to do. If it's writing code, running research chains, and doing implementation directly, every line of code, every tool output, every API response fills the context window.
The hard rule: if a task would consume more than 50 lines of code or output in the orchestrator's context, it gets delegated to a subagent.
Subagents get their own fresh context window and their own terminal session (viewable in [[hermes-webui]]). The orchestrator writes a spec, the subagent implements it, the orchestrator verifies the output. The orchestrator never sees the implementation details. Only the summary.
Orchestrator: "Write a Python script that does X. Constraints: Y. Output to Z."
delegate_task() > subagent writes 200 lines in isolation
returns summary: "Script written at Z, tested, passes"
Orchestrator: runs the script, confirms output, done.
Total orchestrator context consumed: about 1KB for the spec plus 500 bytes for the summary. Without delegation, that's 200 lines of code plus multiple edit cycles plus test output, which is 15-20KB burned directly.
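As a rough illustration of the 50-line rule and the context arithmetic above, here is a minimal sketch. The function names are made up for this example, and it assumes 1KB = 1000 bytes; only the 50-line threshold, the ~1KB spec, the ~500-byte summary, and the 15-20KB inline estimate come from the post.

```python
MAX_INLINE_LINES = 50  # threshold from the hard rule above


def should_delegate(estimated_lines: int) -> bool:
    """Delegate when the task would put more than 50 lines in the orchestrator."""
    return estimated_lines > MAX_INLINE_LINES


def delegated_context_bytes(spec_bytes: int = 1000, summary_bytes: int = 500) -> int:
    """Orchestrator context used when delegating: spec out, summary back."""
    return spec_bytes + summary_bytes


def savings_factor(inline_bytes: int) -> float:
    """How many times smaller the delegated footprint is than doing it inline."""
    return inline_bytes / delegated_context_bytes()


if __name__ == "__main__":
    print(should_delegate(200))       # a 200-line script crosses the threshold
    print(delegated_context_bytes())  # spec + summary, in bytes
    print(savings_factor(15_000), savings_factor(20_000))
```

Under these assumptions the delegated path costs the orchestrator about 1.5KB, roughly 10x to 13x less than the 15-20KB inline estimate.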
There's a dai…
I started by running Hermes inside a Docker container; now only the backend runs in Docker. But I am tired of mounting and unmounting volumes as I use it, and I want to run it locally, but something inside me screams NOOOOOOOOO!!!!!! Would love to hear what you guys are running it on.
Can Hermes do this task? I've been trying to automate this, but Hermes is not able to do it. I've attached the prompt below:
PROMPT: "Act as an autonomous automation engineer. Your goal is to execute a complete Reddit-to-Video social media pipeline across multiple web applications. Execute these phases sequentially and verify success at each milestone:
Phase 1: Extraction & Population
Navigate to Reddit and scrape the URLs of the top 10-15 trending stories from r/ShortStories. Store these in a local memory array. Open a tab to http://localhost:5000. For each stored item, find the URL input field, paste the link, and click the button labeled 'fetch story from reddit'.
Phase 2: App Automation & Rendering
Once fetched, click the button labeled 'Next: voiceover'. Progress through the remaining pipeline of the localhost application until you see the render command. Click the button to render the video. Wait continuously until the rendering progress bar completes. Click the 'Download' button to save the video file locally. Verify the download path.
Phase 3: Captions & Distribution
Open a browser tab to the Adobe Express Captions tool. Upload the downloaded video file, initiate the caption generation process, and download the updated video once processed. Navigate to YouTube Studio and TikTok Web. Authenticate if required, navigate to the upload dashboards, upload the captioned video file, generate a title based on the original story headline, and publish it.
If any element or button fails to load, do not stop. Re-inspect the page tree, find the closest active element, or retry the click. Keep a live log of your progress."
Can someone please tell me how I can configure/set up Hermes to carry out this task perfectly without errors? Also, if this can't work with Hermes, is there any other alternative that is free and can carry out this function?
There are a good few Hermes features most of you are probably not using but should be.
- Bitwarden Secrets Manager: my API keys used to be in a plaintext .env. One bootstrap token replaced all of them, and rotating keys works. This is the #1 security upgrade.
- MCP Catalog: hermes mcp gives you an interactive picker of Nous-vetted MCP servers. You could add database tools, Figma context, Google Drive, etc. in one click.
- ntfy: push notifications to your phone with no signup and no API key. Your cron jobs will ping your phone directly when tasks finish. Self-hostable too.
- x_search: X/Twitter search with OAuth for anyone who uses X.
- Context Engine: this is the smarter context management that went open-source.
- A2A Protocol v1.0.1: Google's Agent-to-Agent protocol just went stable. Agents discovering and talking to other agents. Hermes has an open issue (#514, 24 comments) for it.
Curious as to how everyone is using these agents? Whether it’s DIY projects, professionally required, etc.
Hi everyone! I want to share my personal assistant setup. My goal was to build a 24/7 autonomous agent with multi-tool capabilities on my home server—without spending a lot on APIs. (I use a server, but this runs on almost any desktop with enough RAM: Windows 24GB+, Linux/Mac 16GB+). By combining self-hosted tools with budget cloud APIs, my monthly token bill is about $3. Self-Hosted Stack & Local Tools All my data stays local and private (until it goes to the AI provider, of course). Hermes interacts with these via custom skills and MCP: Radicale (CalDAV): Handles my calendar (vevents), tasks (vtodo), and journaling (vjournal). Flatnotes: A lightweight markdown wiki where the agent stores and structures notes. LinkedIn MCP: A custom bridge to read notifications, organize requests, and monitor job replies via a 30-min cron job. Still working on it. SearXNG + Firecrawl + Camofox: For clean web scraping (getting clean markdown instead of messy HTML). In progress (setting up a propose/approval flow): Himalaya: A lightweight CLI email client used to fetch and process my work emails (it supports OAuth2). Gmail Gateway: Set up to securely manage and parse my personal emails. File Management: Enabled via Hermes' default tools. Honestly, I'm still a bit scared of letting an autonomous agent manage my local files directly right now, so verifying until I feel it's 100% safe! :o Multi-LLM Routing Strategy Instead of using one expensive model, I split the tasks across cheap API endpoints: 1. Core & Sub-Agents Model: deepseek-v4-flash via official DeepSeek Provider ($0.14 / 1M input tokens). Setup: Native thinking is disabled by default (reasoning_effort empty) Tip: I just use /reasoning high or /reasoning base on the fly when I need the flash model to switch into deep planning or coding mode. 2. Fallback Router openrouter/deepseek-v4 as a backup if the official API hits rate limits (I don't think so, but....). 3. 
Auxiliaries (via OpenRouter)
Vision: qwen/qwen3.5-9b (~$0.06/1M). Used only when Camofox needs to parse UI screenshots.
Context Compression & Session Search: google/gemini-2.0-flash-001 (~$0.075/1M). Perfect for compressing long contexts and processing long search results.
4. Voice Processing (100% Free & Fast)
Speech-to-Text (STT): whisper-large-v3-turbo via Groq API. Completely free under their daily cap.
Text-to-Speech (TTS): edge-tts Python library. Voice generation with zero API keys or costs.
User Interface: Telegram (mainly, sometimes terminal)
I can control Hermes via the Telegram Gateway and a Bot.
Text & Voice: I can chat with Hermes using standard text or send voice notes.
The Loop: Telegram Audio -> Groq Whisper -> Hermes Agent Core -> Tool Execution (Radicale/Flatnotes) -> Edge-TTS -> Voice note back to my phone.
What's Next?
The first thing I wanted to accomplish was getting this personal assistant loop up and running. Now that I have it, my plan is to use Hermes to help me manage my projects and maybe some coding workflows. I'm also looking into setting up a personalized tutor for specific subjects, maybe using Hermes to track my lessons and progress dynamically inside local JSON or YAML files. To power it all, the next step is launching a Qdrant vector store in Docker, using a fast embedding model that can run on CPU. Since Hermes handles web search and extraction so well, I plan to use Firecrawl to scrape documentation and study materials, convert them into clean markdown, and automatically chunk/index them into Qdrant. This…
Automation is moving really fast, and agents are getting better as model rates get cheaper every day. To get the best out of these agents, give them the specific integrations that make them more human, so they can complete tasks on the internet just like a person would. The top three would be:
Agent Mail: gives your AI agent an email address of its own so it can sign up, log in, and send and receive email. It can be used for a lot of different purposes.
AgentLine cloud: gives your AI agent a phone number of its own to make and receive calls and SMS. You can use your agent as a support agent, use it to gather feedback from users, or use it to log in to sites and apps that require a phone number.
Payments: agents still can't pay on their own, so you can use Prava payments to give your agent access to payments so it can pay online, buy services on your behalf, and more.
Together, these three tools make your Hermes or OpenClaw more alive and more human, so it can complete more tasks on the internet without being stopped by these restrictions.
As shown in the demo, Hermes Agent can automatically join a Google Meet, perform real-time speech-to-text transcription, and save the conversation as Markdown files. After the meeting ends, it pushes everything to a private GitHub repository and generates two files: a complete transcript and a summary with key discussion points, action items, decisions, and insights. This turns every meeting into a searchable, version-controlled knowledge base without any manual note-taking. Still a work in progress. It occasionally fails to join meetings or breaks during longer sessions, so for reliability I'd currently recommend the Google Meet MCP approach.
Has anyone made any $$$ with Hermes agent, I mean with at least 90% of the work automated by Hermes? What is the business model? How did you configure Hermes to do it?
OK, I have been reading about how cheap DeepSeek V4 Pro is and how it’s 98% cheaper than Claude, so I decided to give it a try by putting $10 in an account on openrouter.ai. An hour later my $10 had been used up. Hermes burned 19.8 million tokens and my $10. I’ve done much longer sessions in Claude on the $20 plan and never once hit any limit. Are my expectations off? FWIW the $10 is not a big deal, but extrapolating the cost here, this could easily become hundreds of dollars per month, which really makes me question the idea that it is cheaper than Claude.
I cannot emphasize enough how content I am with this model + Hermes combo. It does everything well for now (I have been using it for a week or two only, but still), and it has an amazing context window. I thought it would be a problem that it doesn't have picture input, since I couldn't just copy-paste a screenshot of the lecture I am listening to, but I had it update its memory / skills so that each time I ask it to explain the picture, it goes straight to the last picture saved in the ~/Pictures folder. It does the job fairly fast and brilliantly. It helped me a lot in learning my way through Linux in general, using Docker, k8s, and some coding projects I've been working on, and just playing around with it has been quite a journey. I highly recommend you guys give it a try and save a few bucks.
We all know ~/.hermes/ accumulates a lot over time: memories, skills it writes for itself, profiles, SOUL.md. So I moved the parts worth keeping into a private git repo, symlinked them back, and set up a cron job that lets Hermes commit and push its own changes.
The core idea
Move the parts of ~/.hermes/ worth keeping into a git repo, symlink them back:
~/hermes-config/hermes/memories/   <- real files, tracked in git
~/.hermes/memories                 -> symlink into the repo
What I track vs. ignore
About half of ~/.hermes/ is the brain, half is disposable runtime. Only the brain goes in git.
Track: config.yaml, SOUL.md, memories/, the custom skills/ it writes, profiles/, hooks/, cron/.
Ignore: sessions/ and state.db (huge, churny, and memories already distill the signal), auth.json, logs/, _cache/, .lock, .pid, gateway_state.json, and the bundled skills (skills/apple/, skills/research/).
Two things that bit me:
Use ** globs: hermes/**/sessions/, not hermes/sessions/. The moment you add a profile, its sessions/ is at a deeper path and a shallow rule misses it. Same for every other ignore.
Bundled skills sit right next to your own in skills/, so ignore them by name and let everything else through. git check-ignore to verify before you trust it.
Secrets — encrypt and commit (worth it once you have >1 machine)
The .env (each profile has its own) holds tokens and keys. I wanted them in the repo so a fresh box bootstraps itself, but unreadable to anyone with repo access. SOPS + age: commit an encrypted copy, decrypt only on machines whose age key you've authorized.
age-keygen -o ~/.config/sops/age/keys.txt   # one key per machine
sops -e --input-type binary --output-type json \
  --filename-override secrets/env.enc \
  ~/.hermes/.env > secrets/env.enc
The gotcha that cost me an afternoon: --input-type binary, not dotenv mode.
Single machine only? Skip this and just keep .env out of git.
Give Hermes its own GitHub identity
The part most worth doing. The agent box runs nothing but Hermes, and every commit from it is the bot, never me. I edit config from my laptop as myself; the agent box pulls my changes and pushes its own. That's what makes git log --author=bot mean something: it's everything Hermes did, none of mine.
Second GitHub account for the bot, its own SSH key, added to the repos as collaborator with push, never owner. Two trust levels:
Project repos: Hermes opens PRs I review, branch protection stops it merging itself.
Its own config repo: direct push is fine — forcing a PR for every memory edit is pure noise.
(Owner rights would let it rewrite history or merge unreviewed. Collaborator + branch protection means worst case is a rejected push, not a wrecked repo.)
bootstrap.sh
One script wires any machine: installs Hermes if missing, symlinks the tracked paths into ~/.hermes/, decrypts the secrets (mode 600), sets the bot git identity.
git clone <your-repo-url> ~/hermes-config && cd ~/hermes-config && ./bootstrap.sh
What makes it safe to re-run blindly: it's idempotent and non-destructive. Before each symlink it checks: link already correct (leave it), real file in the way (move it to .pre-bootstrap, never overwrite), otherwise create fresh. So "new machine" and "sync after a git pull" are the same command, and recovering from accidentally nuking ~/.hermes/ is just a re-run away.
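The three-way check described above could look something like this in Python. This is a sketch of the idea, not the author's bootstrap.sh: the function name `ensure_symlink` is made up, ".pre-bootstrap" is assumed to be a filename suffix, and replacing a symlink that points somewhere else is an extra case the post does not spell out.

```python
import os
from pathlib import Path


def ensure_symlink(link: Path, target: Path) -> str:
    """Idempotently make `link` point at `target`; return what was done."""
    # Check is_symlink() first: exists() follows links and hides broken ones.
    if link.is_symlink():
        if os.readlink(link) == str(target):
            return "unchanged"  # link already correct: leave it
        link.unlink()  # assumption: a link pointing elsewhere gets replaced
        action = "relinked"
    elif link.exists():
        # Real file or directory in the way: move it aside, never overwrite.
        backup = link.with_name(link.name + ".pre-bootstrap")
        if backup.exists():
            raise FileExistsError(f"{backup} already exists; resolve it by hand")
        link.rename(backup)
        action = "backed-up"
    else:
        action = "created"
    link.parent.mkdir(parents=True, exist_ok=True)
    link.symlink_to(target)
    return action
```

Calling it twice with the same arguments returns "unchanged" the second time, which is what makes a blind re-run safe.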
The cron jobs (all backed by skills in the repo, so they're versioned too)
autocommit-self — git add -A && commit && push of self-edits, every 6h
sync-self — git pull + bootstrap.sh + restart gateway only…
Has anyone integrated Hermes into their business, or is planning to? What integrations are you thinking about? Here's mine:
In Progress
My bookkeeping software does not have an external API. I used my website mapping Skill to map the site and absorb all the training material. It's able to log in, handle MFA by grabbing the text from iMessage on Mac, and navigate the site.
- It will be categorizing purchases weekly.
- It is able to generate new invoices.
Next on the list
- Mapping the process to rename invoices and load them into a draft in Gmail.
- Logging in to the bank account and pulling CSV transactions to compare against the ledger.
Mapping does take a while. It's like training a person. You have to show it where to click and how to do things, but it maps it out internally for later. Curious to hear yours.
Hey everyone, wanted to share something I've been working on. The wiki is now live at hermesguide.xyz/wiki and it's basically everything useful from this sub organized into actual articles.
What it actually is: This isn't the official Hermes docs. Those tell you what the features do. This wiki tells you how people actually use it, pulled straight from threads, comments, and setup guides posted here. Real configs, real hosting setups, real mistakes people made and how they fixed them.
What's live right now:
- First-Time Setup: model picks, hosting options, what to configure first so you don't waste time
- Cloud Hosting: Hetzner, Hugging Face Spaces, Docker, Synology, Tailscale setups
- Memory Systems: Hindsight vs Honcho vs MemPalace vs OpenViking, actual token overhead numbers
- Profiles & Multi-Agent: how profiles actually work (not presets, separate agents), Kanban boards, orchestrator patterns
- Model Comparison: Qwen 3.6, DeepSeek V4, MiniMax M2.7, what the community actually runs and why
- Coding Agent: GitHub integration, coder profiles, when to use Hermes vs OpenCode
- Troubleshooting: the actual errors people hit and how they fixed them
- Best Practices: workflow patterns that work, SOUL.md tips, cost management
- Community Tools: HuggingMes, llm-keypool, Browser UI, Desktop App, Talaria
Each article credits the users whose threads it came from. Tables work on mobile. There's a TOC so you don't have to scroll forever.
Still building: The Ecosystem directory (tools, skills, MCP servers) is coming next. If you spot something wrong or want your tool/project added, drop a comment.
Link: https://hermesguide.xyz/wiki
When setting up, it offers:
Quick Setup (Nous Portal): free OAuth login, no API keys, model + tools (recommended)
Full setup: configure every provider, tool & option yourself (bring your own keys)
If you want to use your own endpoint, do you:
A) Choose Full Setup right away
B) Choose Quick Setup, then input the endpoint later
Please guide. Does Quick Setup make things any easier if you don't want the subscription plans? I could not find this specific part in the Wiki. Thank you
For $1/month (+ processing fees), it includes:
- $10 in credits
- ~15K requests
- Access to the taste-1 model
- ~$40 worth of DeepSeek V4 Pro usage
- ~$20 worth of Qwen 3.7 Max usage
- Up to 99% off MiMo V2.5
- Basic analytics and Discord support
On paper, it seems almost too good to be true for the price, especially if you're experimenting with coding agents or AI workflows. A few questions:
- How does the actual usage compare to the advertised limits?
- Are the models fast and reliable?
- Any hidden restrictions or rate limits?
- Would you recommend it over alternatives like OpenRouter or other low-cost AI platforms?
Would love to hear real-world experiences before giving it a try.
Improvements
Based on feedback:
- Better rendering of HTML controls (buttons, tables, etc.)
- Mobile use is optional (doesn't force Tailscale)
- Chat labelling, sorting, and session management improvements
Next Updates
- Auth for security
- Name change from Pebble (avoid confusion with the watch)
Feedback
- Anything else you would like to see?
- Suggest a new name for Pebble if you are feeling creative
Links
Home Page - Try Pebble
Repo (MIT) ⭐
I made Hermes watch Hermes. This ended up being one of the most useful automations I've set up so far. I have a daily cron job running at 8:30 AM that checks the latest commits from NousResearch/hermes-agent (last 24h, paginated through all pages), compares my local install against origin/main, and sends me a short Telegram summary. Basic commit fetch (paginated — follows Link headers until exhausted): curl --max-time 15 "https://api.github.com/repos/NousResearch/hermes-agent/commits?since=$(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%SZ)&per_page=100" If the response has a Link header with rel="next", fetch &page=2, &page=3, and so on until there is no "next" — then combine everything before summarizing. (GitHub caps at 100 per page regardless of what you ask for, so pagination is not optional.) Local version check on my Pi 4: cd /home/edoardo/.hermes/hermes-agent git fetch origin echo "BEHIND: $(git rev-list --count HEAD..origin/main)" Hermes then formats the result like this: 📦 Hermes Agent — Daily Update 📅 DD MMM YYYY Summary: 3-4 lines about what landed in the last 24h Status: Hermes is N commits behind main. *Small clarification on the fetch part: this is the portable version of the automation, the one I can point at any public repo without assuming I already have a local clone ready. For my setup, since I do have hermes-agent cloned on the Pi, the local Git path is the more compact one: git fetch, then git log / git diff --stat / git rev-list against origin/main. I originally made this because I'm working on Hermes Desktop and wanted to stay up to date on the latest releases without refreshing GitHub ten times a day. So I told the agent the general areas that might break my app, and to warn me when to worry. But it ended up being useful for a second reason too: it tells me whether the hard work I'm putting into tweaking Hermes to do something is actually worth doing right now. 
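For anyone who prefers a script to chained curl calls, the "follow Link headers until there is no next" loop described above can be sketched in Python with only the standard library. The helper names `next_link` and `fetch_all_commits` are made up for this example, and the header parsing is simplified: it assumes the URLs themselves contain no commas.

```python
import json
import re
import urllib.request


def next_link(link_header):
    """Return the URL tagged rel="next" in a Link header, or None."""
    if not link_header:
        return None
    for part in link_header.split(","):
        match = re.match(r'\s*<([^>]+)>\s*;\s*rel="([^"]+)"', part)
        if match and match.group(2) == "next":
            return match.group(1)
    return None


def fetch_all_commits(url, timeout=15):
    """Fetch every page of a GitHub commits listing and combine the results."""
    commits = []
    while url:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            commits.extend(json.load(resp))
            url = next_link(resp.headers.get("Link"))  # None ends the loop
    return commits
```

You would pass `fetch_all_commits` the same commits URL as the curl example (with `since=` and `per_page=100`), then summarize the combined list.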
I already said this: sometimes you are deep in the weeds trying to patch around a behavior, tune a pipeline, debug a provider, or bend the agent into a workflow you want. Then a commit lands upstream that makes it obvious: this is already being fixed, refactored, or replaced. Sometimes the best move is “wait two days.” After 45 runs, I would recommend this to anyone serious about using Hermes. Commit history is surprisingly revealing. You start seeing integrations before they are announced, fixes while they are still being shaped, refactors that explain strange behavior, and small pieces of bigger features landing one commit at a time. A few times, the daily summary made me realize where the project was going before it was obvious from the outside. Nothing secret, obviously. Just things assembling in plain view. Kanban is the biggest example of that, which thanks to this job I was able to smell days before the official announcement. But that is my use case. Where I think this can shine for everyone is debugging. When something feels off locally, the first question becomes much easier to answer: did anything relevant change upstream? You can make the job track specific things too. If you're dealing with a bug, a provider issue, a skill problem, or some weird gateway behavior, add that to the cron prompt: Pay special attention to commits mentioning: - gateway - skills config - Telegram - provider errors - memory - my current bug / issue And now your daily update becomes a focused watchlist for whatever you care about. It also helps with prioritization. Before …
Over the past month, on a whim, I spent quite a bit of time tinkering with Hermes. My original ambition was to turn it into a more advanced AI agent. Looking back, roughly 80% of that time went into wrestling with configurations, and maybe only 20% was spent actually getting Hermes to do useful work. This article is a record of the traps I fell into, the patches I made, and the few places where Hermes did become genuinely useful. Perhaps my experience can save you a few detours. Configuration Getting It to Launch Is the First Barrier Hermes does not yet have a stable native Windows version, so it has to run inside WSL, the Windows Subsystem for Linux. That means setting up WSL yourself, installing dependencies, and configuring environment variables. If you already have another AI coding agent installed, such as Claude Code, the process is somewhat easier. If you do not, it can be painful. Once I finally got it installed, my instinct as a long-time Windows user was simple: I wanted to double-click something to launch it. Hermes does not give you that. So I had to build it myself. In theory, this is not hard. You write a BAT script and wrap the startup command inside it. In practice, the real trouble comes from the boundary between Windows and WSL. Cross-environment calls break in all sorts of strange ways. Once you have a start button, you naturally want a stop button too, which brings the same class of problems. Then I tried to make a restart button that would stop and relaunch Hermes in one go, and the number of problems multiplied. Every time something broke, I had to drag Claude Code in to help debug it. Sometimes a script that worked perfectly today would simply stop working the next day. The main reason I needed restart support was that Hermes often has to be restarted before configuration changes take effect. So I wondered: could Hermes restart itself? Unfortunately, not natively. I tried letting Hermes call an existing restart script. 
But every time it did, the Hermes process would be killed. From Hermes’ own perspective, it had no idea what it had just run, and no idea that it had come back to life afterward. So I built another restart script specifically for Hermes: one that lets it know it is restarting, and sends it a message after restart so it can continue the previous task. The idea is simple. Making it reliable is the hard part. I ran into every flavor of weird behavior: the restart succeeded but the conversation was lost; the interface said the restart failed when it had actually worked; it said the restart succeeded when Hermes had not restarted at all. None of these problems was fatal, because Claude Code could usually rescue me, but each one took time to diagnose. And restart scripts are only one piece of the setup. There are also environment variables, API key management, MCP tool registration, and proxy configuration. Each issue is small on its own. Pile them together, and they become a real barrier. At certain moments, the urge to delete everything and walk away was very real. The Dream and Reality of the Memory System Hermes markets itself around the idea of “self-evolution.” It has a multi-layer memory framework: Memory, User, Soul, Skill, Agents. The idea is that the AI can gradually learn, accumulate context, and improve over time. The concept is genuinely attractive. The reality, at least in my experience, is that the AI often has no idea what belongs where. It wants to dump everything into Memory. It writes a…
Hey, hello Hermes community. Have you already tried MaxHermes? It is a zero-setup cloud version of Hermes Agent, like they did with MaxClaw before. What do you think about that kind of solution? Hermes is already quite easy to set up, and they release more stuff to make it even easier every day. I'm curious about your thoughts on it, and also about your Hermes use cases.
What model are you running as primary, and what fallback order? Looking for the setup that runs smoothest for agentic work (tool calling, longer context, no dropped calls). Which primary has given you the fewest issues — rate limits, timeouts, failed tool calls? And which fallbacks actually kick in cleanly when the primary dies? Currently on GPT-5.5 primary with Claude Sonnet / Gemini 2.5 Flash / Kimi K2.6 as fallbacks. Curious what’s held up best for others. Thanks 🙏
Router-first tool loading + memory tiering — how it works and what it actually saves Been running this for a few days and the results were significant enough to share. This post covers the architecture, the numbers from my setup, and a quality regression to show it doesn't break anything. Public reference repo at the end. The problem Before any of this, every first-turn prompt in Hermes carried the full tool schema payload every toolset, every definition, regardless of what you asked. Typing "hi" cost the same as running a multi-step agentic task. On my setup that floor was 14,200 tokens, and 65% of it was tool schemas. https://preview.redd.it/efgrx5qkrt3h1.png?width=1635&format=png&auto=webp&s=bc93baee37c1d9ea5611603a0ef2a57f91a75d13 The fix is two things working together: a router that loads only the schemas you need, and a memory tier system that controls how much context gets injected based on what the message actually calls for. How it works https://preview.redd.it/ttd4obtnrt3h1.png?width=1935&format=png&auto=webp&s=4bd61d27c78b61db208248cf8388477ffb69e2ad Before the main model sees anything, two lightweight decisions happen: ① Router classifier — a small cheap model (~1.5s) reads the user message and predicts which toolsets the turn needs. Hermes assembles only those schemas. If the classifier is unsure or the message is long and complex, it declines reduction and loads the full surface — worst case is no savings, not a broken response. ② Memory tier — the same message analysis sets injection depth for the session: none, compact, or full. This controls whether USER.md profile and memory stores get loaded and at what depth. ⑤ Recall-on-miss — if the router mispredicts and the model tries to call a tool that wasn't loaded, Hermes detects the invalid tool name, injects the missing toolset, and retries. A wrong prediction costs one extra API call, not a failed turn.
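To make the router plus recall-on-miss flow concrete, here is a toy sketch. None of these names come from Hermes: `TOOLSETS`, `predict_toolsets`, `loaded_tools`, and `run_turn` are hypothetical, the keyword matcher stands in for the small classifier model, and `model` is any callable that returns the name of the tool it wants to call.

```python
TOOLSETS = {  # hypothetical toolsets and tool names
    "core": ["read_file", "run_command"],
    "web": ["web_search"],
    "calendar": ["create_event"],
}


def predict_toolsets(message, max_len=400):
    """Stand-in for the classifier; None means 'decline reduction'."""
    if len(message) > max_len:
        return None  # long or complex message: load the full surface
    picked = {"core"}
    text = message.lower()
    if "search" in text or "price" in text:
        picked.add("web")
    if "meeting" in text or "calendar" in text:
        picked.add("calendar")
    return picked


def loaded_tools(toolsets):
    """Expand toolset names into tool names; None loads everything."""
    names = TOOLSETS if toolsets is None else toolsets
    return {tool for name in names for tool in TOOLSETS[name]}


def run_turn(message, model):
    """Return (tool the model called, number of API calls made)."""
    tools = loaded_tools(predict_toolsets(message))
    call = model(message, tools)
    api_calls = 1
    if call is not None and call not in tools:
        # Recall-on-miss: load the toolset that owns the tool, retry once.
        owner = next((n for n, t in TOOLSETS.items() if call in t), None)
        if owner is None:
            raise ValueError(f"unknown tool: {call}")
        tools |= set(TOOLSETS[owner])
        call = model(message, tools)
        api_calls += 1
    return call, api_calls
```

A misprediction shows up as `api_calls == 2` rather than an error, which matches the "one extra API call, not a failed turn" behaviour described above.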
What makes up the tokens https://preview.redd.it/vfe1bsmsrt3h1.png?width=1935&format=png&auto=webp&s=588648cf366cca6794c3126fda861ed3666bf5a4 Before: tool schemas dominated at 65% of 14,200 tokens. After routing: - Trivial opener ("hi", "."): 4,136 tokens. Six core schemas, system prompt, minimal overhead. Tool schemas still 68% of a much smaller total. - Normal task with full profile: 9,595 tokens. Predicted tool schemas (40%) + USER.md profile (44%) + system prompt (14%). Profile becomes the dominant cost once tools are trimmed. This is where the configurable floor matters — if you have a large USER.md, that's where your remaining tokens are going after the tool schemas are reduced.
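The percentages above are easier to compare as absolute token counts. This small sketch just redoes that arithmetic from the figures quoted in the post; the helper names are made up, and the results are rounded, so treat them as approximate.

```python
BASELINE = 14_200  # first-turn floor before routing, from the post
TRIVIAL = 4_136    # trivial opener after routing
NORMAL = 9_595     # normal task with full profile after routing


def share(total_tokens, percent):
    """Tokens represented by a percentage share of a prompt."""
    return round(total_tokens * percent / 100)


def saving_pct(before, after):
    """Percentage reduction going from `before` to `after` tokens."""
    return round(100 * (before - after) / before, 1)


if __name__ == "__main__":
    print("schemas before:", share(BASELINE, 65))
    print("schemas, trivial opener:", share(TRIVIAL, 68))
    print("schemas, normal task:", share(NORMAL, 40))
    print("USER.md profile, normal task:", share(NORMAL, 44))
    print("saving, trivial opener %:", saving_pct(BASELINE, TRIVIAL))
    print("saving, normal task %:", saving_pct(BASELINE, NORMAL))
```

So although tool schemas are still about two thirds of a trivial opener, the absolute schema cost drops from roughly 9.2k tokens to roughly 2.8k.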
Memory tiers https://preview.redd.it/fp89xi0wrt3h1.png?width=1785&format=png&auto=webp&s=43ad4f3b7fe87df2129c526429be729dd0871356 The tier is set once at session init based on the first message. It doesn't change mid-session. - None (~4,100 tokens): trivial openers — ".", "hi", "ok". No memory injection, minimal prompt. - Compact (~5k–9.6k): normal requests — "read config.yaml", "run tests", "bitcoin price". Injects USER.md profile and a lightweight web search hint. - Full (~11,100): history/preference recall — "last session", "what were we working on", "remember", "my preferences". Full memory store + profile. The range in compact tier is profile size. A thin USER.md lands near 5k; a populated one (~18k chars) adds ~4.5k tokens.
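A minimal sketch of the tier decision, using the example phrases above as keywords. The real system is described as using message analysis, so this keyword matcher is only a stand-in, and `pick_tier` and `Session` are hypothetical names.

```python
TRIVIAL_OPENERS = {".", "hi", "ok"}
RECALL_HINTS = ("last session", "what were we working on",
                "remember", "my preferences")


def pick_tier(first_message):
    """Map the first message of a session to 'none', 'compact', or 'full'."""
    text = first_message.strip().lower()
    if text in TRIVIAL_OPENERS:
        return "none"     # no memory injection, minimal prompt
    if any(hint in text for hint in RECALL_HINTS):
        return "full"     # full memory store + profile
    return "compact"      # USER.md profile + lightweight hints


class Session:
    """The tier is decided once at init and not changed mid-session."""

    def __init__(self, first_message):
        self.tier = pick_tier(first_message)
```

For example, `Session("read config.yaml").tier` is "compact" and stays that way even if a later message in the same session asks about preferences.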
Numbers across scenarios https://preview.redd.it/viaf9h3dst3h1.png?width=1714&format=png&auto=webp&s=cacdad18bc704b6572f7baeaf9a1fadc20b427a0 These are l…
I have this MiMo plan for free. It had a 700 million token limit, and suddenly they made changes, increasing and resetting it to 38 billion. The plan itself will expire in a few days and I don't wanna waste it. What should I use it for 😭
We should start a subreddit wiki (or something like it)
Every post that hits the top of this sub is either "I'm spending way too much" or "here's how I'm spending less." The same advice in different fonts every time. Free tier routing. Manifest vs direct provider config. Oracle free VPS. Ollama for local fallbacks. Context window discipline. These come up in comments, get scattered across threads, and then someone asks again next week. I think we should start pulling that together. A wiki page here, or a community starter guide, or even just a pinned thread with best-practice patterns. Happy to put it wherever; it could live on Reddit or feed back into the Hermes docs itself. From my side I can contribute stuff like:
- Zero-cost routing configs (cascade free tiers → subscriptions → PAYG, Manifest makes this trivial)
- Setting up on Oracle/Google free VPS tiers (ARM instances, Oracle's always-free, that kind of thing)
- Profile separation: an executor profile pinned to a cheap model vs an orchestrator that gets the expensive one
- The things nobody says until they've burned a few hundred dollars
If it's something people are interested in, I'm happy to start scaffolding?
Please correct me if I'm wrong, but doesn't the AI model behind the agent (Hermes in our case here) make all the difference? What models have you guys tried, and which is your favorite or current model? Does a provider model (frontier or otherwise) give a better experience than a smaller local model? I would like to gather feedback so I can deploy my own Hermes. Many thanks!
I'm using the Hermes desktop app on Mac. There's a "manage profiles" option that lets you add or delete profiles, but for the life of me I can't find any way to have a chat session in a non-default profile. (Neither can Hermes.)
Hi all. I've been testing out different models and providers to see what the best bang for the buck is at around $20 if you are not running local models. I have a Hermes agent running on a VM with 6GB RAM, which I got for an absolute steal of $45 per year (check out the LowEndTalk forum for cheap VPS deals). I use it mainly to maintain a dashboard that does the following:
- Gather news on specific topics from various sources. It then curates them to see if they align with my interests (e.g. no sensationalist crap), and summarizes and deduplicates articles.
- Check the latest benchmarks on different models.
- Scrape my favourite webcomics from Instagram, RSS feeds, Bluesky, whatever, so they are all in one place.
It also maintains the VPS, so I have it install Docker containers for stuff I want, like Mealie or whatever. Lastly, I synced my Obsidian vault, where I keep a list of people with birthdays, notes, etc., so it can remind me whose birthday it is and what I can buy for them, or other stuff like that. My Obsidian is also where it keeps track of my health stuff: diet, gym log, etc.
So, I've been playing around with the following providers. In all cases except Codex and OpenRouter, I used Kimi K2.6 as my main model, and usually tried Gemma 4 for some of the tools and auxiliary models:
- Ollama Cloud - $20 per month
- OpenCode Go - $10 per month
- NanoGPT - $12 per month (I think you can get $8 if you find a ref link)
- OpenAI Codex - $20
- OpenRouter - Free models only
Here are my findings.
Ollama Cloud
Very stable. Charges per GPU hour instead of per token, so as models get more efficient, you actually gain more usage. Some people say it's a bit slow, but in my experience it was never slow enough to be problematic. I actually had a hard time hitting my usage limits. I had to run my Hermes Agent, as well as 2 pretty big coding tasks simultaneously before I hit my 5 hour window limit, and this only happened once. The rest of the time, I barely cracked 25%.
For Hermes alone, you will likely never hit that limit. The con is that you are limited to 3 concurrent connections, meaning my example of 2 coding tasks plus Hermes was pushing it. If I chatted with Hermes and a cron job that used a model fired at the same time, it errored out because I went over the limit of 3 connections. This is something to keep in mind for people running multiple agents or lots of cron jobs and such.
OpenCode Go
I felt like this was ever so slightly less stable than Ollama, but not enough to be a problem or to stay away from it. Speed was fine; I honestly didn't feel much of a difference between OpenCode and Ollama. You pay $10 per month and essentially get $60 worth of credits. One might think $60 in credits is not much, but whether it is an efficiency thing or just the fact that we aren't paying Anthropic pricing, it stretched very far. I never hit my limits. Just like with Ollama, on average usage I barely got to 25-30% weekly. Unlike Ollama, you don't have concurrency limits. The con for me is that it didn't have the model I wanted for tool calls, Gemma 4. They have DeepSeek, which is cheap and fast, but Gemma 4 is cheap, fast AND multimodal. Useful for curating news articles or webcomics.
NanoGPT
This one seemed sketchy AF at first. It's clearly meant for a specific crowd. It has a ton of uncensored text models included in the sub, as well as uncensored image models (Qwen Image and Z Image Turbo) with 100 free image generations per day. They allow you to load up with cry…
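On the 3 concurrent connection cap mentioned for Ollama Cloud above: one possible way to stay under a cap like that is to funnel every model call through a shared semaphore. The sketch below uses only Python's standard `asyncio`. `run_all`, `limited_call`, and the `work` callable are hypothetical names, and this only helps when all the calls are made from one process. A chat session and a separately launched cron job would need some shared queue or proxy instead.

```python
import asyncio

MAX_CONNECTIONS = 3  # the concurrency cap described in the post


async def limited_call(sem, job_id, work):
    """Run one job, waiting for a free slot if the cap is already reached."""
    async with sem:
        return await work(job_id)


async def run_all(jobs, work, limit=MAX_CONNECTIONS):
    """Run every job, never allowing more than `limit` to be in flight at once."""
    sem = asyncio.Semaphore(limit)
    return await asyncio.gather(*(limited_call(sem, job, work) for job in jobs))
```

Jobs beyond the cap wait their turn instead of erroring out, at the price of extra latency for whatever is queued.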
I never understood the mechanics of containerization until this week, and now I’m planning to package up my agent and move it into a Docker container. For those of you who clearly understand containerization better than I do, is there a reason you would still intentionally choose to install Hermes (or another agent, like Claude for example) directly on the host machine? Of course, multiple agents can be installed in different environments for different purposes, so I’m wondering whether you use a containerized or host-level agent as your MAIN agent.
Every week someone asks for real Hermes use cases. And every week, most of us end up giving the same answers: calendar summaries, daily briefs, file cleanup, cron reminders, Telegram alerts. Those examples aren’t wrong. “My agent reads my calendar every morning and texts me a summary” is simple to explain. It fits in a screenshot. It doesn’t require a wall of text about MCP servers, sandbox backends, permission boundaries, browser automation, or whatever cursed Python pipeline you built at 2 a.m. and now quietly depend on. So the conversation keeps collapsing back into summaries and reminders. But that’s not the ceiling. If anything, that’s the hello world.
Hermes is a computer you can talk to. Not in the marketing sense. In the most practical sense. It runs somewhere. It has tools. It can work with files, use a shell, run code, browse, schedule jobs, remember context, send messages, use skills, and connect through MCP to whatever systems you decide to expose. And I'm probably leaving something out. That might be your calendar. It might be your repo. It might be your server, your dashboard, your Home Assistant setup, your crypto wallet, your garage door, your fridge, or your mattress, if you are committed enough to the bit.
So “what can it do?” is the wrong question. The question you should ask (yourself, at this point): where does it make sense to put an agent inside something you already do or want to do? You don’t need to wake up and invent a random job for Hermes just because Hermes is cool (it really is). That’s backwards, and you’ll end up forcing automation into places where there was no real problem, actually creating problems. Just KEEP LIVING YOUR LIFE. Keep working. Keep experimenting; you are that kind of person if you are even thinking about those questions. Keep noticing where things get annoying. At some point, especially if you’re already past the chatbot phase, something will stand out.
A task that is not hard enough to justify building a full app, but annoying enough that you keep wishing it would handle itself. Or maybe a task where you don’t want full automation, but you do want a strong partner working back to back with you. Hermes belongs here. Maybe that’s a folder it watches. Maybe it’s a repo. Maybe it’s something specific about your business. Maybe it’s infrastructure. Maybe it’s some personal setup so specific that it would make zero sense to anyone else. All of those are valid. The reason the calendar example keeps coming up is not because Hermes is only good for calendar summaries. It’s because “my agent texts me my morning schedule” takes one sentence to explain. The real stuff usually doesn’t, because it’s deeper and messier, and people, me included, are not ready to share it with the rest of the world, where most people won’t understand how cool a weird edge-case task feels when it finally works. They’re personal. They depend on your tools, your permissions, your habits, your stack, and how much you trust automation in that part of your life. Hermes doesn’t become powerful because someone on Reddit hands you the perfect use case. And if you think, “Cool, he did that with Hermes, so I’m going to ask my Hermes to do the same incredible thing,” be prepared: are you ready to burn dozens of tokens (and real money) to let your Hermes try to set up something that you have absolutely no clue about? Also, when someone who actually knows a field uses Hermes and gets crazy results, it’s usually because Hermes is bo…
Showcase Thursday — post your Hermes Agent builds, tools, workflows, and integrations. New Post every week 6am EST. Tell us: • What it does • How it's helpful Include a link to code or demo if you have one. Rules: • No marketplace or affiliate links • Any contribution level welcome (Might be flagged by automod but will be approved, be patient) • Thread pinned for 72 hours, then archived
On 1st May I set up the Hermes Agent. Today marks 30 days of using it, and here are my observations as a non-technical person:
- Solid work needs a solid model, preferably a top-tier model right now. If you think you can work with free models, you can, but not for long.
- Memory is the biggest problem. If you have multiple sessions you will most certainly struggle with memory. The built-in memory is great, but it forgets things quite easily.
- Reliability of the tool connector is essential. Recently Composio went down, and so did my agent and all the cron jobs running with it. Basically my business went down.
Current status: if I ask it to run Composio it gives an error. I tried debugging, but to no avail. So I believe I need a fresh restart, which I will do ASAP, but I will make sure to have a clear soul.md and goals, along with a proper memory setup and workers in place.
My setup:
- Railway for hosting (use code SOLO for free credits)
- Alltoken.ai as model provider (zero markup fees, unlike OpenRouter)
- Telegram for chat interface
- Composio for tool connector
- Built-in memory
Feel free to ask me anything!
The API's locked down and the .json trick 403s for me now. I got a browser-agent setup working, but it only succeeds occasionally and usually 403s otherwise. Has anyone gotten this to work? I just want to get summaries for certain subreddits a couple of times a day. Thanks
Hey r/hermesagent, I've been in the agent vault / credentials industry for quite some time and I am a huge fan of awesome-x lists. I have authored around 5-6 myself in the past, and I don't care if anyone uses them or not; I personally use them to see new PRs and what's going on in a particular industry and its surrounding ecosystem. I did the same with agent vaults: https://github.com/zriyansh/awesome-agent-vault It's basically a category map for agent credential management and products, integrations, recipes, patterns, threat models. That's mostly it. Let me know how I could improve it further or which tools it could serve, since this ecosystem is growing rapidly with new launches left, right, and center (and AI slop products as well, ngl).
Just got my calendar providers (Google + iCloud) wired up and I'm looking for inspiration. Here's what I'm currently running, sharing first because give-to-get, haha.
My current crons:
- 09:00 - Daily briefing: weather + agenda fusion (Google Calendar + iCloud Calendar + iCloud Reminders as all-day events). Pure Python script, 0 token cost (no_agent: true).
- 12:00 - Tech news digest: RSS from HN, TechCrunch, The Verge, Ars Technica. Summarized by Hermes, delivered to Telegram.
- One-shot - Sivers Semiconductors earnings alert: monitors for earnings publication on May 29, polling every 3 min. Voice alert via TTS when results drop.
What I'm building next:
- Disk usage alert on the VPS (threshold at 80%)
- SSL certificate expiry checker
- Train strike alerts
- GitHub PRs waiting for my review
What are YOUR most useful crons? Looking for ideas beyond the obvious: creative stuff, home automation triggers, system health patterns, anything goes!
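The disk usage alert on the "building next" list is a good fit for the same zero-token approach as the daily briefing, since it needs no model at all. A minimal sketch using only the Python standard library follows. The function names are made up for illustration, and how the message gets delivered (Telegram, TTS, etc.) is left out because the post doesn't describe that part.

```python
import shutil
from typing import Optional


def usage_percent(total: int, used: int) -> float:
    """Percentage of the disk that is in use."""
    return 100.0 * used / total


def disk_alert(path: str = "/", threshold: float = 80.0) -> Optional[str]:
    """Return an alert message if usage at `path` is at or above `threshold`, else None."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free (bytes)
    pct = usage_percent(usage.total, usage.used)
    if pct >= threshold:
        return f"Disk usage on {path} is {pct:.1f}% (threshold {threshold:.0f}%)"
    return None


if __name__ == "__main__":
    message = disk_alert("/")
    if message:
        print(message)  # a cron wrapper could forward this instead of printing
```

Run from cron, this stays silent below the threshold and prints one line above it, so the wrapper only has to forward non-empty output.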
Hi all. Came across this memory OS, which brings a 7-layer memory and is local + open source. Any early reviews? https://github.com/ClaudioDrews/memory-os
I have ChatGPT Pro already, so I wonder if it is even worth considering using Hermes agent as a tutor when I could use ChatGPT? It’s already downloaded on my local machine (CLI). Down the line I do want to consider using local models for heavy token usage, but I only have an RTX 5080 (16GB VRAM) and the models that I can fit do seem to be dumb. Maybe it’s because I’m too used to frontier models. This month I plan on buying a 32GB AMD workstation GPU for larger models. Long story short, I had planned to use Hermes agent for memory offloading of the stuff I need to study, or to have it identify my weaknesses. The subjects I plan to study are math, chemistry, and self-taught coding.
Wondering if someone might know what I am missing. https://hermes-agent.nousresearch.com/docs/user-guide/messaging/signal
- Got signal-cli installed and running, and added the account as a linked device.
- When I run this, it shows my message when I type one: signal-cli --account +1234567890 daemon --http 127.0.0.1:8080
- Ran the hermes gateway setup, added Signal with http://127.0.0.1:8080 + phone numbers.
- Added the 5x lines to ~/.hermes/.env with my actual phone number:
SIGNAL_HTTP_URL=http://127.0.0.1:8080
SIGNAL_ACCOUNT=+1234567890
SIGNAL_ALLOWED_USERS=+1234567890
SIGNAL_HOME_CHANNEL=+1234567890
- I run hermes gateway, and it does not throw any error message.
- I can type a message inside Signal and it shows up in the terminal.
- However, Hermes never responds.
Any idea why?
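Not an answer to why Hermes stays silent, but a quick way to rule out a typo in the .env file. The sketch below parses .env-style text and checks the four SIGNAL_* keys shown in the post. Whether Hermes actually requires all four is an assumption, as is the idea that the account must be in international (E.164) format. `parse_env` and `check_signal_env` are hypothetical helper names, not part of Hermes or signal-cli.

```python
import re

# The four keys quoted in the post; treating all of them as required is an assumption.
REQUIRED_KEYS = (
    "SIGNAL_HTTP_URL",
    "SIGNAL_ACCOUNT",
    "SIGNAL_ALLOWED_USERS",
    "SIGNAL_HOME_CHANNEL",
)
E164 = re.compile(r"^\+[1-9]\d{6,14}$")  # "+" followed by 7 to 15 digits


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_signal_env(values: dict) -> list:
    """Return a list of human-readable problems; an empty list means nothing obvious."""
    problems = [f"missing {key}" for key in REQUIRED_KEYS if not values.get(key)]
    url = values.get("SIGNAL_HTTP_URL", "")
    if url and not url.startswith(("http://", "https://")):
        problems.append("SIGNAL_HTTP_URL should start with http:// or https://")
    account = values.get("SIGNAL_ACCOUNT", "")
    if account and not E164.match(account):
        problems.append("SIGNAL_ACCOUNT is not in E.164 format (e.g. +1234567890)")
    return problems
```

Usage would be something like `check_signal_env(parse_env(open(path).read()))` with `path` pointing at the .env file. An empty result only means the file looks sane, not that the gateway is wired up correctly.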
I've been trying to find a way to have Hermes gate specific CLI command patterns with an approval/deny flow like the built-in dangerous commands approval (allow once, allow session, allow always, deny). My Hermes agent first steered me in the direction of adding shell hooks to block the commands, which didn't really do what I wanted, so after some more exploration I (well mostly Hermes) created a plugin that hooks into the regular approval flow. https://github.com/scross01/hermes-custom-dangerous-patterns-plugin This is early, and barely tested, but wanted to share for feedback. I feel like adding custom approve/deny patterns should be part of hermes-agent core capabilities, maybe it is already and I just didn't find it.
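To make the idea concrete for anyone skimming: the sketch below shows the general shape of a pattern-gated approval flow with the four outcomes listed above. It is NOT the plugin's code and does not use any Hermes hook. `ApprovalGate`, `Decision`, and the `ask` callback are hypothetical names, and "allow always" is only remembered in memory here, where a real implementation would need to persist it.

```python
import re
from enum import Enum
from typing import Callable, Iterable


class Decision(Enum):
    ALLOW_ONCE = "allow once"
    ALLOW_SESSION = "allow session"
    ALLOW_ALWAYS = "allow always"
    DENY = "deny"


class ApprovalGate:
    """Ask for approval when a command matches one of the gated regex patterns."""

    def __init__(self, patterns: Iterable[str], ask: Callable[[str, str], Decision]):
        self._patterns = [re.compile(p) for p in patterns]
        self._ask = ask            # hypothetical callback that prompts the user
        self._remembered = set()   # patterns approved for the session or "always"

    def check(self, command: str) -> bool:
        """Return True if the command may run, False if it was denied."""
        matched = next((p.pattern for p in self._patterns if p.search(command)), None)
        if matched is None or matched in self._remembered:
            return True
        decision = self._ask(command, matched)
        if decision in (Decision.ALLOW_SESSION, Decision.ALLOW_ALWAYS):
            self._remembered.add(matched)  # persistence for ALLOW_ALWAYS is omitted
        return decision is not Decision.DENY
```

Approvals are remembered per pattern rather than per exact command, so approving `git push origin main` for the session also covers a later `git push --force`. Whether that is the right granularity is a design choice worth debating.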
This is an honest question for everyone. What would you be willing to pay to get hooked up with a running Hermes agent on a VPS? I ask because I tend to work with brands but am going to get into the consumer market, and was wondering what someone would be willing to pay to skip the hassle of the setup, etc., and get up and running quickly and stably so that you can use Hermes machines efficiently. Would love to hear everyone’s thoughts, thanks!
I’m trying to set up a cron job with Hermes so it’ll apply to new jobs posted by specific companies I’ve listed. It has my resume and all the info it needs to fill out the application, but it keeps running into self-imposed issues where it won’t complete the application fully. Some examples:
- One company sent a verification code to my email. I provided it to Hermes, but it said it can’t enter security codes or passwords by itself.
- It refuses to sign up for or log in to existing job accounts for companies that require it (like Workday) because it says it can’t “type, store, or manage passwords for me”, so it refuses to fill out any applications that require an account login.
- Captcha: refuses to proceed.
Has anyone found a workaround for this, or are there comparable models that aren’t so anal about it? I’m using GPT 5.5 through a ChatGPT Pro sub atm.
Hermeskill watches every tool call + LLM turn and terminates the agent the moment it goes wrong. Tool-call loops, cost/token runaway, wall-clock overruns, out-of-scope tool calls, heartbeat loss: when one (or more) of these symptoms fires, Hermeskill pulls the plug (apoptosis protocol) and files a "death certificate" stating exactly which symptom tripped, the shutdown sequence, and what it spent. Hermeskill is a drop-in plugin without glue code, using Hermes' canonical pre_tool_call block directive so the kill is clean: the instant a check fires, no more tools run, no more spend. The SDK never ships your args or transcripts; only metadata (cost, tokens, model id) leaves the box, to a control plane you self-host. Feedback is welcome and encouraged! Repo: https://github.com/theopitori/hermeskill
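For readers who want a feel for what three of those symptom checks could look like, here is a minimal standalone sketch: a repeated-call loop detector, a cost budget, and a wall-clock limit, evaluated before each tool call. This is not Hermeskill's code and does not show how a plugin registers with Hermes' pre_tool_call hook. The `Watchdog` class, its thresholds, and the symptom names are invented for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Watchdog:
    max_cost_usd: float
    max_seconds: float
    max_repeats: int = 3  # identical consecutive calls that count as a loop
    started_at: float = field(default_factory=time.monotonic)
    spent_usd: float = 0.0
    _last_call: tuple = ()
    _repeat_count: int = 0

    def before_tool_call(self, tool: str, args_repr: str, cost_usd: float = 0.0) -> Optional[str]:
        """Return the name of the symptom that tripped, or None to let the call run."""
        self.spent_usd += cost_usd
        call = (tool, args_repr)
        # Count how many times in a row the exact same call has been requested.
        self._repeat_count = self._repeat_count + 1 if call == self._last_call else 1
        self._last_call = call
        if self._repeat_count >= self.max_repeats:
            return "tool_call_loop"
        if self.spent_usd > self.max_cost_usd:
            return "cost_runaway"
        if time.monotonic() - self.started_at > self.max_seconds:
            return "wall_clock_overrun"
        return None
```

A caller would block the tool call and stop the agent whenever the return value is not None. The returned symptom name is the kind of detail a "death certificate" could record.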
Guys, get Hermes up and running and start building in it non-stop! What you can build with it is insane. You can get app MVPs up and going within hours instead of weeks/months. I can’t get enough of Hermes and am building my entire infrastructure around it! It is insanely powerful. Build the foundation, boys and girls; it is essential and key for the agent to evolve with you and/or your business! Stay tuned! A lot more coming!