AI Agent Daily: How I Run My Web Businesses on Autopilot

· 14 min read

233 skills. 7 MCP connectors. 4 specialized profiles. 2 active cronjobs, ~10 historical. That’s what my AI agent runs every single day on my server. Not a chatbot. A real operational system that drives my web projects while I focus on strategy.

I named this agent Hermes. Here’s exactly how it works, with real numbers pulled from a live audit of the install, not estimates.

Why I run an AI agent on my own server

I run several web projects: content sites, lead-generation pipelines, affiliate projects, niche research. Each one needs technical monitoring, repetitive tasks, content production, and reporting.

Before, I was on every terminal, every API, every spreadsheet myself. Today Hermes handles 80% of operations. I keep control of strategy and high-impact decisions.

Concrete result: over 2,000 work sessions on record, memory that persists across conversations, and automations that run 24/7 without my intervention.

The server: a $10/month VPS

Hermes runs on a standard VPS (think OVH, Hetzner, Oracle Cloud free tier, or any provider you like). No GPU, no dedicated hardware. Real specs at the moment I’m writing:

Resource | Value
vCPU | 6 cores
RAM | 11 GB (8.3 GB used)
Disk | 96 GB (69 GB used, 72%)
OS | Ubuntu 24.04.4 LTS, kernel 6.8.0-110
Virtualization | KVM

Nothing exotic. Any VPS in the $8-$15/month range works.

The architecture: polyglot, not “just Python”

Hermes is not a Python script with network calls. It’s a stack of services running side by side. Actual ps output:

hermes-gateway              <- main orchestrator (Node.js)
hindsight-api               <- vector memory server (Python)
redis                       <- caching layer
nginx                       <- reverse proxy
discord-scrapper            <- Discord data extraction (Node.js)
firecrawl                   <- web crawling service

Node.js for the agent harness, Python for memory and search, Docker to isolate external services, Nginx as reverse proxy. Polyglot by design: each piece is the right tool for its job.

The 7 MCP connectors

MCP (Model Context Protocol) is the standard that lets the agent connect to external tools. I have 7 configured:

Connector | Function
Atomic | Self-hosted RDF knowledge base (documents, procedures, specs)
Hindsight | Long-term vector memory for volatile, contextual facts
Linear | Project management, issue tracking, sprint planning
Discord | Primary channel, notifications, in-thread command execution
Chrome DevTools | Automated web browsing, visual audits via CDP
Context7 | Real-time technical documentation lookup for any library
GlitchTip | Self-hosted application error monitoring and alerting

Each connector is an independent server. The agent can query them in parallel. While I’m chatting on Discord it can check a GlitchTip issue and update a Linear task, all in the same conversation turn.
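To make the parallel part concrete, here is a minimal sketch of one turn fanning out to two connectors at once. It is not Hermes code: `query_connector` and `run_turn` are hypothetical names, and the sleep stands in for a real MCP round trip.

```python
import asyncio


async def query_connector(name: str, request: str) -> str:
    """Hypothetical stand-in for one MCP call; a real client would do network I/O here."""
    await asyncio.sleep(0.01)  # simulated latency
    return f"{name}: handled {request!r}"


async def run_turn(requests: dict[str, str]) -> dict[str, str]:
    """Start every connector call at once and collect the replies by connector name."""
    names = list(requests)
    replies = await asyncio.gather(
        *(query_connector(name, requests[name]) for name in names)
    )
    return dict(zip(names, replies))
```

Calling `asyncio.run(run_turn({"GlitchTip": "check issue", "Linear": "update task"}))` waits roughly as long as the slowest connector, not the sum of both.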

Model routing: optimized for cost

Running a frontier model 24/7 gets expensive fast. The trick is routing the right model to the right task.

Main model: mimo-v2.5-pro (Xiaomi). It handles conversation turns by default. 1M-token context window, solid reasoning, available via API.

Auxiliary models handle side tasks like image analysis, web page summarization, session titles, context compression, and skill curation. Before optimization, these all ran on the main model, wasting premium tokens on grunt work. Now:

Task | Model | Why
Vision (images, screenshots) | Gemini 3.5 Flash (OpenRouter) | Best cheap multimodal model
All other auxiliary tasks (12) | DeepSeek (deepseek-chat) | Excellent reasoning at $0.14/M input, $0.28/M output

The 12 auxiliary tasks routed to DeepSeek: web extraction, context compression, skills hub, approval flow, MCP routing, title generation, triage, kanban decomposition, profile description, curator, session search, and memory flush.

Smart model routing adds another layer: messages under 160 characters or 28 words get routed to DeepSeek automatically instead of mimo-v2.5-pro. Simple questions get cheap answers. Complex work stays on the premium model.
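The routing rule is small enough to sketch. This is an illustration, not the actual router: `pick_model` is a hypothetical name, and since "160 characters or 28 words" can be read two ways, the sketch assumes a message must be under both limits to count as simple.

```python
MAX_CHARS = 160  # character limit quoted above
MAX_WORDS = 28   # word limit quoted above
CHEAP_MODEL = "deepseek-chat"
MAIN_MODEL = "mimo-v2.5-pro"


def pick_model(message: str) -> str:
    """Return the cheap model for short messages, the main model otherwise.

    Assumption: a message counts as short only if it is under BOTH limits.
    """
    is_short = len(message) < MAX_CHARS and len(message.split()) < MAX_WORDS
    return CHEAP_MODEL if is_short else MAIN_MODEL
```

If the intended rule is "either limit", swapping `and` for `or` is the only change needed.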

Fallback chain: if the main provider goes down, Hermes falls back to Hermes 3 405B (Nous Research) on OpenRouter’s free tier. Last resort: a tiny local Granite 3B model that can still answer and reconfigure things even when all API providers fail.
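A fallback chain like this boils down to "try each provider in order, return the first answer". The sketch below assumes each provider is wrapped in a plain callable; `complete_with_fallback` and the provider labels in the usage note are hypothetical, not Hermes identifiers.

```python
from typing import Callable, Sequence

Provider = tuple[str, Callable[[str], str]]  # (label, function that calls the model)


def complete_with_fallback(prompt: str, chain: Sequence[Provider]) -> tuple[str, str]:
    """Try each provider in order; return (label, reply) from the first that succeeds."""
    errors = []
    for name, call in chain:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider failure moves us down the chain
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

With a chain ordered as main model, Hermes 3 405B, then the local Granite model, the local model is only reached when both API calls raise.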

Result: significant cost reduction without quality loss on the conversations that matter.

Profiles: specialized agents, one channel

The biggest optimization isn’t technical, it’s organizational. Instead of one agent doing everything, I run 4 specialized profiles that share a single Discord channel:

Profile | Tools | MCP servers | Memory | Purpose
default | 14 tools | all 7 servers | Yes | Hub, receives all Discord messages
coding | 7 tools | Context7, Chrome DevTools | No | Pure dev work, no distractions
business | 10 tools | Atomic, Hindsight, Linear | Yes | Client work, ERP, project management
data | 7 tools | Atomic, Hindsight, Discord scraper | Yes | Scraping, lead enrichment, data pipelines

Each profile has its own config.yaml, .env, SOUL.md, skills directory, and memory store. The coding profile loads only terminal, file, web, browser, delegation, vision, and session search tools. No memory overhead, no irrelevant MCP connections.

How it works from Discord:

  • Delegation (synchronous, under 5 min): I ask something code-related, the default agent delegates to a subagent with only the coding toolset. Result comes back immediately.
  • Kanban orchestration (asynchronous, long-running): complex tasks get decomposed and routed to the right worker profile automatically, based on each profile’s description. The kanban dispatcher runs in the gateway, checking every 60 seconds for ready tasks.

No manual switching. No separate channels. One conversation, intelligent routing.

The 4-layer memory system

This is the most underrated part of an AI agent. Without memory, an LLM forgets everything after each message. I run 4 memory layers in parallel, each with a distinct role, plus an encrypted vault for credentials:

Layer | Type | What it stores
MEMORY.md | Text file (~4K chars) | Non-critical conventions, public URLs, technical rules
AWS Secrets Manager | Encrypted vault | Production credentials, API tokens, signing keys
Honcho | Remote API | Behavioral patterns, user preferences, communication style
Hindsight | Self-hosted vector DB (MCP) | Volatile facts, session context, 91%+ accuracy
Atomic | Self-hosted RDF store (MCP) | Stable documents, specs, procedures

Golden rule: each information type has its own layer. Non-critical conventions stay in MEMORY.md, injected every turn. Production credentials never leave AWS Secrets Manager: the agent fetches them on demand via IAM scoped to the instance. If MEMORY.md ever leaks into a trace or a dump, nothing critical leaves with it.

A memory router skill automatically decides where to store each new piece of information. No duplicates, no drift.
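The golden rule can be pictured as a lookup from information type to exactly one layer. The real router is a skill the model follows, not a Python function, so treat this as a sketch: the layer names come from the table above, while the category labels and `route_memory` are hypothetical.

```python
# Layer names come from the article; the category labels are hypothetical.
LAYER_FOR_KIND = {
    "convention": "MEMORY.md",
    "credential": "AWS Secrets Manager",
    "preference": "Honcho",
    "session_fact": "Hindsight",
    "document": "Atomic",
}


def route_memory(kind: str) -> str:
    """Return the single layer responsible for this kind of information."""
    try:
        return LAYER_FOR_KIND[kind]
    except KeyError:
        raise ValueError(f"unknown memory kind: {kind!r}") from None
```

Because each kind maps to one layer, a fact has only one place to land, which is what prevents duplicates.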

Compression is tuned for the 1M-token context window: threshold at 0.65 (compresses earlier), target ratio 0.3 (preserves 300K tokens). Prompt caching TTL set to 15 minutes for better cost efficiency on long sessions.
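Worked through in numbers, and assuming both settings are fractions of the full context window (the 300K figure above suggests so for the target ratio), the arithmetic looks like this. `compression_plan` is a hypothetical helper, not a Hermes function.

```python
CONTEXT_WINDOW = 1_000_000  # tokens, per the article
THRESHOLD = 0.65            # assumed: compress once usage reaches this fraction
TARGET_RATIO = 0.3          # fraction of the window kept after compression


def compression_plan(used_tokens: int) -> dict:
    """Compute the trigger point, the post-compression size, and whether to compress now."""
    trigger_at = round(CONTEXT_WINDOW * THRESHOLD)
    keep = round(CONTEXT_WINDOW * TARGET_RATIO)
    return {
        "trigger_at": trigger_at,
        "keep": keep,
        "should_compress": used_tokens >= trigger_at,
    }
```

Under these assumptions, compression kicks in at 650,000 tokens and shrinks the session to about 300,000.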

Tools and credentials: all behind AWS Secrets Manager

The agent has access to a full tooling ecosystem. Real numbers:

Category | Count | Examples
APIs and keys | 27+ API keys | DataForSEO, Cloudflare, Backblaze B2, FAL, ElevenLabs
Model providers | 6 credential pools | Xiaomi (MiMo), Nous Research, OpenRouter, DeepSeek, Google
Google services | Full OAuth | Search Console, Gmail, Calendar (17+ GSC properties)
Cloud storage | Backblaze B2 | S3-compatible object storage at $0.006/GB
Infrastructure | Cloudflare | Workers, D1 edge database, Pages
TTS (voice) | 2 providers | Google TTS, ElevenLabs
Monitoring | GlitchTip | Self-hosted error tracking

Total: 80+ credential items managed via AWS Secrets Manager. One vault, IAM scoped to the instance, automatic rotation on sensitive secrets. That’s what lets me sleep when a disk dies or a repo leaks.
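Fetching on demand is a single API call. The sketch below assumes secrets are stored as JSON strings and takes the client as a parameter; `fetch_secret` is a hypothetical helper, while `get_secret_value(SecretId=...)` and the `SecretString` field are the real boto3 Secrets Manager interface.

```python
import json
from typing import Any


def fetch_secret(client: Any, secret_id: str) -> dict:
    """Read one JSON secret on demand; nothing is cached or written to disk.

    `client` is expected to behave like boto3.client("secretsmanager").
    Assumption: the secret value is a JSON object stored as a string.
    """
    response = client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])
```

On the server, the client would come from `boto3.client("secretsmanager")` with a region configured. With an instance-scoped IAM role, boto3 picks up credentials on its own, so no access key sits in a config file.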

The 233 skills

The agent does not improvise. It follows precise procedures stored in 233 skill files across 51 categories:

Domain | Example skills
SEO and content | Article writing, on-page optimization, niche ideation, GEO
DevOps | Deployment, monitoring, Docker, kanban-worker, Cloudflare Workers
Research | ML literature review, data extraction, competitive analysis
GitHub | Auth, code-review, PR-workflow, repo-management
Data | Scraping, enrichment, ETL pipelines, OSINT
Creative | Image generation, video editing, infographic creation

Each skill is a markdown file with precise rules: tone, structure, banned patterns, official sources, output schemas. The agent loads the right skill based on the request.

Concrete example: when I ask for a blog post, the agent loads the blog-writer skill which carries anti-AI rules (no em-dashes, banned vocabulary, sentence-length variation), research methodology (official sources only), and the output template (MDX frontmatter, collapsible FAQ, optimized meta description).

Cronjobs: running while I sleep

Two cronjobs are running right now, untouched:

Active cronjob | Frequency | What it does
Affiliate KPI Dashboard | Daily at 7:00 PM | Scrapes the affiliate network dashboard, posts KPIs (payout, clicks, conversion, EPC, rank) to Discord in 3 lines
DMCA Blocklist Updater | Every 6 hours | Reads DMCA notices from Gmail, updates the blocklist TypeScript file, commits and pushes to main
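In standard five-field cron syntax, those two frequencies would read as below. The job keys and the `fires_at` helper are hypothetical, and the sketch assumes "every 6 hours" means on the hour at 0, 6, 12, and 18 in the server's timezone.

```python
# Standard five-field cron: minute hour day-of-month month day-of-week.
# Job keys are hypothetical; the schedules mirror the two active cronjobs.
SCHEDULES = {
    "affiliate_kpi_dashboard": "0 19 * * *",  # daily at 7:00 PM
    "dmca_blocklist_updater": "0 */6 * * *",  # every 6 hours, on the hour
}


def fires_at(expr: str, hour: int, minute: int) -> bool:
    """Check only the minute and hour fields, which is enough for these two schedules."""
    minute_field, hour_field = expr.split()[:2]

    def matches(field: str, value: int) -> bool:
        if field == "*":
            return True
        if field.startswith("*/"):
            return value % int(field[2:]) == 0
        return value == int(field)

    return matches(minute_field, minute) and matches(hour_field, hour)
```

The helper is deliberately partial: it ignores the day and month fields and does not handle ranges or lists, which a real cron parser would.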

Beyond these, Hermes has built up about ten historical cronjobs around niche research and business operations. Real list:

  • SEO automation loops on niche content sites (wellness, education, affiliate verticals)
  • Daily indexing and crawl monitoring via Google Search Console API
  • Full SEO cycles on affiliate content sites (keyword research, content generation, internal linking)
  • Domain monitoring via RDAP (drops, transfers, expirations, useful for niche domain hunting)
  • Application error scanning via GlitchTip API
  • Memory hygiene cleanup, twice daily (Hindsight + Atomic deduplication)
  • Version changelog digests to the team Discord channel

Each historical cronjob left its output in the cron archive. When a niche pivots or a project ends, I disable the cron but keep the artifacts. That archive is how I decide what’s worth restarting.

Session management

Session idle timeout: 7 days. This prevents context loss on projects that span multiple days while still cleaning up stale sessions. Checkpoint snapshots preserve file state across resets.

When a session does reset, the resume mechanism injects the last 10 exchanges into context so the agent picks up where it left off. Combined with persistent memories (Honcho, Hindsight, Atomic), the agent maintains continuity even across resets.
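The resume step amounts to keeping the tail of the conversation. A minimal sketch, assuming an exchange is one user message paired with one agent reply; `resume_context` is a hypothetical name.

```python
from typing import List, Tuple

Exchange = Tuple[str, str]  # (user message, agent reply)


def resume_context(history: List[Exchange], keep: int = 10) -> List[Exchange]:
    """Return the most recent `keep` exchanges, oldest first."""
    return history[-keep:] if keep > 0 else []
```

Sessions shorter than 10 exchanges come back whole, so a reset early in a project loses nothing.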

The platforms I actually use

Hermes can speak on 8 platforms (Discord, Matrix, Telegram, WhatsApp, Slack, Mattermost, Signal, SMS). In practice, I use two:

Platform | Real usage
Discord | Primary channel, per-project threads, cronjob delivery, ad-hoc command execution
Matrix | E2EE backup when Discord is down or when I need end-to-end encryption

The other six are configured and functional but don’t fit my workflow. Multi-platform helps if a team grows or if you want a separate on-call channel. Today, Discord is enough.

Autonomous decisions: the /goal command

For a long time, Hermes did one turn, returned a response, and waited for me to re-prompt. Since /goal shipped (our take on the Ralph loop, directly inspired by Codex CLI’s goal mode), I can set an objective and the agent iterates on its own until the goal validates:

/goal Fix every ruff error in src/ and verify scripts/run_tests.sh passes

What happens under the hood:

  1. Goal accepted: Goal set (20-turn budget): <your goal>
  2. Turn 1: Hermes starts as if the goal were a normal message
  3. Judge: after the turn, a small auxiliary model (DeepSeek) replies with strict JSON {"done": bool, "reason": "..."}
  4. Loop: if not done, Hermes auto-runs the next turn
  5. Termination: Goal achieved or Goal paused - N/20 turns used
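The five steps above can be sketched as a short loop. This is an illustration under assumptions, not the shipped implementation: `run_goal`, `run_turn`, and `judge` are hypothetical names, the judge is assumed to return the JSON string described in step 3, and the confirmation gate for high-impact actions is left out.

```python
import json
from typing import Callable


def run_goal(
    goal: str,
    run_turn: Callable[[str], str],
    judge: Callable[[str, str], str],
    budget: int = 20,
) -> str:
    """Iterate until the judge says done or the turn budget is spent."""
    for turn in range(1, budget + 1):
        output = run_turn(goal)
        # The judge is expected to reply with {"done": bool, "reason": str}.
        verdict = json.loads(judge(goal, output))
        if verdict["done"] is True:
            return f"Goal achieved after {turn} turns: {verdict['reason']}"
    return f"Goal paused - {budget}/{budget} turns used"
```

Keeping the judge on a separate, cheaper model means the main model never grades its own work, and the turn budget guarantees the loop ends even if the judge never says done.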

For high-impact actions (deploy, deletion, modification of critical data), /goal doesn’t short-circuit anything: the agent still asks for confirmation. The loop stays closed. I decide what gets trust, the agent executes.

The raw numbers

Full recap, no filter, pulled right now:

Metric | Value
MCP servers | 7 (Atomic, Hindsight, Linear, Discord, Chrome DevTools, Context7, GlitchTip)
Loaded skills | 233 across 51 categories
Skill files | 844 total, 624 .md files
Active profiles | 4 (default, coding, business, data)
Active cronjobs | 2 (Affiliate KPI, DMCA Updater)
Historical cronjobs with artifacts | ~10 (SEO niches, affiliate, monitoring)
Connected platforms | 8 (Discord + Matrix used)
Managed credentials | 80+ via AWS Secrets Manager
Recorded sessions | 2,066
Primary model | mimo-v2.5-pro (Xiaomi), 1M context
Auxiliary models | DeepSeek (12 tasks), Gemini 3.5 Flash (vision)
Smart routing | Enabled, messages under 160 chars route to DeepSeek
Compression threshold | 0.65 (1M context window)
Prompt caching TTL | 15 min
Session idle timeout | 7 days
Fallback model | Hermes 3 405B (Nous Research, OpenRouter free tier)
Max turns per session | 90
RAM used | 8.3 GB / 11 GB
Disk used | 69 GB / 96 GB (72%)
LLM providers | 6 credential pools

What this changes for a web entrepreneur

Before Hermes, I spent 3 to 4 hours daily on operational tasks: error checking, data updates, content writing, deploy tracking, KPI reporting.

Today, those tasks are either automated (cronjobs) or delegated to the agent (on-demand, or via /goal for long-running loops). My time goes to strategy, identifying new niches, and direction calls.

The agent doesn’t replace thinking. It frees up time to think.


Updated May 29, 2026. All numbers come from a live audit of the actual install, not estimates.

FAQ

What is an AI agent?
An AI agent is software powered by a large language model (LLM) that runs tasks autonomously: reads files, calls APIs, executes scripts, browses the web, and keeps persistent memory across sessions. Unlike a chatbot, it operates on your infrastructure, not inside a vendor-hosted chat window.

How much does it cost to run?
The server is a standard VPS at around $10/month. API costs depend on the model and usage volume. With smart routing, auxiliary tasks run on DeepSeek (cheap) while complex work stays on the main model. With typical daily usage, API spend stays within a few dozen dollars per month.

Can the agent work autonomously?
Yes, since the /goal command shipped. I set an objective and the agent iterates turn after turn, judged at each step by an auxiliary model, until the goal is achieved or the turn budget runs out. For high-impact actions (deploy, deletion, critical data modification) it still asks for confirmation.

How is this different from ChatGPT or Claude?
ChatGPT and Claude are hosted chat interfaces. Hermes runs on my server, connected to my tools (Discord, Linear, GitHub, Google Search Console, Cloudflare), with persistent memory, scheduled tasks, and direct access to my infrastructure. The data stays mine.