Wes Roth

Hermes Agent is INSANE...

3 min read · 27 Apr 2026 · Worth watching

TL;DR

Wes Roth demonstrates Hermes Agent, an open-source AI agent, by using it to build a gravity well spaceship simulation benchmark that automatically tests and scores LLMs such as Claude Opus 4.7 and GPT 5.5 on their ability to iteratively write better code. The video doubles as a full installation tutorial for Hermes Agent on a Hostinger VPS running Ubuntu, using the Nous Portal subscription.

Key points

Wes built a gravity well simulation benchmark called Grav entirely with AI agents, where LLMs write and iteratively improve ship-piloting code over 20 rounds to achieve the highest score possible

Claude Opus 4.7 achieved an 88.3% win rate in PVP arena tests and a high score of 276 over iterations, while smaller models like Claude Sonnet 4.6 topped out around 78

Hermes Agent can orchestrate multiple AI sub-agents simultaneously, opening real instances of Codex and Claude Code and feeding them tasks, diagnostic results, and iteration instructions autonomously

The full Hermes Agent installation on a Hostinger KVM2 VPS requires SSH access, running a single installer command on Ubuntu 24.04 LTS, and connecting to either OpenRouter or the new Nous Portal subscription

Nous Portal bundles web search, image generation, browser automation, and model access under one API key, eliminating the need to set up separate keys for tools like Firecrawl or Browser Use
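The 20-round improvement loop described above can be sketched roughly as follows. This is a minimal illustration, not the actual Grav harness: the summary does not say how results are fed back to each model, so the sketch assumes the best-scoring attempt so far is handed back each round. The names run_rounds, toy_propose, and toy_score are hypothetical stand-ins for the LLM call and the simulation scorer.

```python
from typing import Callable, List, Tuple


def run_rounds(
    propose: Callable[[str, float], str],
    score: Callable[[str], float],
    rounds: int = 20,
) -> Tuple[str, float, List[float]]:
    """Ask for a new candidate each round and keep the best-scoring one.

    `propose` stands in for an LLM call that sees the current best code
    and its score; `score` stands in for running the simulation.
    Both are assumptions made for this sketch.
    """
    best_code, best_score = "", float("-inf")
    history: List[float] = []
    for _ in range(rounds):
        candidate = propose(best_code, best_score)
        candidate_score = score(candidate)
        history.append(candidate_score)
        # Only replace the current best when the new attempt scores higher.
        if candidate_score > best_score:
            best_code, best_score = candidate, candidate_score
    return best_code, best_score, history


# Toy stand-ins so the loop can run without a model or a simulator.
def toy_propose(previous_code: str, previous_score: float) -> str:
    return previous_code + "x"


def toy_score(code: str) -> float:
    return float(len(code))


if __name__ == "__main__":
    code, high_score, scores = run_rounds(toy_propose, toy_score, rounds=20)
    print(high_score, len(scores))
```

In a real harness, the propose step would be a model call and the score step would be a run of the simulation; the loop structure is the only part this sketch is meant to show.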

Actionable insights

→ Use Hostinger KVM2 ($8.99/month, 2 vCPU, 8GB RAM, 100GB NVMe) to run AI agents on a VPS; it avoids the freezing and resource issues that plague lower-tier plans

→ Always install Hermes Agent with the local terminal backend first on a fresh Ubuntu install; switching to Docker sandbox mode requires Docker to be installed beforehand, or the setup will crash

→ Set Hermes Agent max iterations to 90 for most tasks and 150 for deep research; leave the context compression threshold at 0.5 and the daily reset at 4 AM to avoid mid-task resets

→ To use GPT 5.5 in Hermes Agent, run hermes update and authenticate via the OpenAI Codex OAuth flow rather than through OpenRouter or Nous Portal, at least as of the current release

→ Run AI agents in full auto mode only on isolated machines (VPS, old laptop, mini PC) with sandboxing like Docker where possible, and use a password manager like 1Password to quarantine exposed credentials quickly
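The recommended settings and the Docker precondition above can be captured in a small pre-flight helper. This is a sketch under stated assumptions: the field names in AgentSettings and the functions settings_for and choose_backend are hypothetical and do not reflect Hermes Agent's actual configuration format or CLI. Only the values (90, 150, 0.5, 4 AM) come from the video summary. The Docker check uses Python's standard shutil.which.

```python
import shutil
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class AgentSettings:
    # Field names are illustrative, not Hermes Agent's real config keys.
    max_iterations: int = 90            # 90 for most tasks
    compression_threshold: float = 0.5  # context compression threshold
    daily_reset_hour: int = 4           # 4 AM, to avoid mid-task resets


def settings_for(deep_research: bool = False) -> AgentSettings:
    """Return the summary's suggested values; 150 iterations for deep research."""
    return AgentSettings(max_iterations=150 if deep_research else 90)


def choose_backend(
    want_docker: bool,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Pick 'docker' only if requested and a docker binary is on PATH.

    Falls back to 'local', mirroring the advice to start with the local
    terminal backend when Docker is not installed yet.
    """
    if want_docker and which("docker") is not None:
        return "docker"
    return "local"


if __name__ == "__main__":
    print(settings_for(deep_research=True))
    print(choose_backend(want_docker=True))
```

Passing the lookup function as a parameter keeps the check testable on machines with or without Docker installed.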

Notable quotes

“The day is for collaborative AI work. The night is for automated AI agents just grinding away and doing all this stuff to get ready for the next day.”

“I would not install one on my main computer and run it like this with this bypassing the confirmation prompts. So if you are getting into this just think about it like how many layers of safety can you have between you and the agent if stuff goes wrong.”

“I know some of the companies that train it directly on the benchmarks so that they can benchmax, basically get the highest possible score and look like they are doing really, really well.”

Worth watching?

✅

Worth watching the full video?

Watch if you want to see the live benchmark simulation and PVP replays in action — the installation tutorial and multi-agent orchestration demo are genuinely useful, but the key steps and results are all captured here.

Topics

AI & Tech, Hermes Agent
