Hermes Agent x_search on basic X Premium (M4 Mac mini): uvx, OAuth, query traps
I tested Hermes Agent’s x_search_tool on a basic X Premium plan from an M4 Mac mini (16GB). The short version: the docs say x_search requires SuperGrok / X Premium+ tier, but the basic plan got through both OAuth and the actual API call. The tier gate is looser than the documentation makes it sound.
“Worked” isn’t the whole story, though. Depending on how you write the query, Grok just returns a general-knowledge answer and X search never fires at all. I hit that in practice, so it’s worth recording — together with the OAuth setup, the timing gap, and a missing parameter that surprised me.
The minimal-setup recipe comes from a post by @MtkN1XBt, which I’ll reference throughout.
What I wanted to verify
The thread by @MtkN1XBt makes three points:
- If all you want is x_search, you don’t need to install or configure Hermes Agent itself
- You only need uv (uvx) and xAI OAuth
- Calling the internal Python API (tools.x_search_tool) directly skips one interpretation layer compared to asking Hermes’s own model to “use x_search…”
What I wanted to verify is the tier gate. Hermes Agent’s docs say:
The x_search tool requires either SuperGrok / X Premium+ tier authentication or a paid xAI API key
A Zenn-side hands-on report, on the other hand, says “basic X Premium worked for both grok-4.3 and x_search.”
My account is basic X Premium (not Premium+), so I tested which side is right.
Setup: uvx + xAI OAuth
uvx means no Python environment setup needed. Just have uv installed.
xAI OAuth is a one-shot command.
uvx --from hermes-agent hermes auth add xai-oauth
Running it opens a browser to xAI’s login page. The callback returns to http://127.0.0.1:56121/callback and the token is stored under uvx.
Open this URL to authorize Hermes with xAI:
https://auth.x.ai/oauth2/authorize?response_type=code&client_id=...
Waiting for callback on http://127.0.0.1:56121/callback
Browser opened for xAI authorization.
Added xai-oauth OAuth credential #1: "xai-oauth-oauth-1"
The basic X Premium account was not rejected here. At least the auth phase doesn’t enforce a tier check.
For doing this over SSH on a remote host, you need a tunnel for the callback port.
ssh -L 56121:127.0.0.1:56121 user@remote-host \
'uvx --from hermes-agent hermes auth add xai-oauth --no-browser'
Open the URL in your local browser and the token reaches the remote Hermes via the SSH tunnel.
Verification 1: abstract query doesn’t invoke X search
OAuth is done, so call x_search_tool directly as a one-liner.
uvx --from hermes-agent python -c \
'from tools.x_search_tool import x_search_tool; print(x_search_tool("test query about AI"))'
Measured with time, the whole thing returned in 8.4 seconds. The returned JSON (excerpt):
{
"success": true,
"provider": "xai",
"credential_source": "xai-oauth",
"tool": "x_search",
"model": "grok-4.20-reasoning",
"query": "test query about AI",
"answer": "**AI (Artificial Intelligence)** is the simulation of human intelligence processes by machines...",
"citations": [],
"inline_citations": []
}
Two points to note.
- success: true means basic X Premium got through. credential_source is xai-oauth, so it’s definitely going through OAuth credentials
- citations and inline_citations are both empty arrays
Reading the answer body, it’s a generic explanation of “What is AI” and “Narrow AI / Generative AI / Multimodal AI / AGI,” with no trace of having actually searched X. Grok just answered from its own knowledge base.
So “worked” and “X search fired” are different things. If the query is too abstract, Grok doesn’t search X and instead answers as a general LLM.
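The distinction between "worked" and "X search fired" can be checked mechanically. The following is a minimal sketch, assuming the tool's return value is either a JSON string or an already-parsed dict with the fields shown in the excerpt above; the helper name x_search_fired is hypothetical and not part of Hermes Agent.

```python
import json
from typing import Any, Union


def x_search_fired(response: Union[str, dict]) -> bool:
    """Return True only if the call succeeded AND carries at least one citation."""
    # Assumption: the tool returns a JSON string or a dict; handle both.
    data: Any = json.loads(response) if isinstance(response, str) else response
    if not data.get("success"):
        return False
    # Empty arrays in both fields mean Grok answered from its own knowledge.
    return bool(data.get("citations") or data.get("inline_citations"))


abstract = {"success": True, "citations": [], "inline_citations": []}
concrete = {
    "success": True,
    "inline_citations": [
        {"url": "https://x.com/i/status/2056022182560665602", "title": "2"}
    ],
}
print(x_search_fired(abstract))
print(x_search_fired(concrete))
```

Checking citations rather than success is the point: success only says the credential was accepted.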
Verification 2: a concrete query brings back citations
Re-run with a query scoped by time, target, and count.
uvx --from hermes-agent python -c \
'from tools.x_search_tool import x_search_tool; print(x_search_tool("過去1週間でX上で話題になったAIコーディングエージェントやLLMの新リリースは何?投稿者ハンドルと該当ポストを5件挙げて"))'
This one clocked 58.4 seconds under the same time measurement, about seven times the abstract case. That in itself can be read as a sign that X search is actually running. Tail end of the response (excerpt):
{
"success": true,
"model": "grok-4.20-reasoning",
"answer": "...エンゲージメントが高く、具体的な新リリース/ツールを扱った投稿を5件挙げます...",
"inline_citations": [
{"url": "https://x.com/futurepedia_io/status/2056208900979151314", "title": "1", ...},
{"url": "https://x.com/i/status/2056022182560665602", "title": "2", ...},
{"url": "https://x.com/i/status/2055117211774181690", "title": "3", ...},
{"url": "https://x.com/higgsfield_ai/status/2054989169446023181", "title": "4", ...},
{"url": "https://x.com/i/status/2056292686458708178", "title": "5", ...},
{"url": "https://x.com/ApollonVisual/status/2056647214110274050", "title": "6", ...}
]
}
This time inline_citations has 6 entries. The answer body uses footnote markers like [[1]] [[2]] that tie claims back to source post URLs. The posts that came back include:
- @Suryanshti777’s post introducing Anthropic’s claude-code-setup plugin
- @higgsfield_ai’s Supercomputer release announcement
- @ApollonVisual on Cursor Composer 2.5 (based on Kimi K2.5)
Concrete poster handles and source URLs are returned, so you can use Grok’s summary as an entry point and descend to the individual posts.
What’s the difference between Verification 1 and Verification 2
Same x_search_tool call, but one has empty citations and the other doesn’t. The only difference is query specificity.
To reliably make X search fire, the query should include at least one of:
- An explicit time range (“past week”, “May 2026”)
- A concrete scope (specific technical area, handle, topic)
- A count specification (“give me 5 posts”, forcing list output)
- Words that name X objects directly (“poster handle”, “matching posts”)
If you throw an abstract “tell me about X,” Grok answers from internal knowledge. This is a habit of web-search agents in general, and Hermes-routed x_search behaves the same way.
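As an illustration of the checklist above, here is a small sketch of a query builder that always includes a time range, a scope, a count, and words naming X objects. The function build_concrete_query and its wording are hypothetical, and nothing guarantees that a query built this way makes X search fire; it only encodes the pattern that worked in this test.

```python
def build_concrete_query(topic: str, time_range: str, count: int) -> str:
    """Compose a query naming a scope, a time range, a count, and X objects."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return (
        f"What posts about {topic} drew attention on X in the {time_range}? "
        f"List {count} posts with the poster handle and the matching post for each."
    )


print(build_concrete_query("AI coding agents", "past week", 5))
```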
Response time is also a tell. In this run, X-search-firing queries took ~58s while knowledge-only ones took ~8s — about a 7x gap. Anything returning in under 10 seconds probably didn’t fire X search.
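The timing heuristic can be wrapped around any call. This is a rough sketch: the 10-second threshold is a rule of thumb from this single run, not a documented limit, and the names timed_call and probably_knowledge_only are made up for illustration.

```python
import time
from typing import Any, Callable, Tuple

FAST_THRESHOLD_SECONDS = 10.0  # assumption: derived from one run (~8s vs ~58s)


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def probably_knowledge_only(elapsed_seconds: float) -> bool:
    """Heuristic: a fast response suggests X search did not fire."""
    return elapsed_seconds < FAST_THRESHOLD_SECONDS


# Stand-in callable; in practice fn would be the x_search_tool function.
result, elapsed = timed_call(sum, [1, 2, 3])
print(result, probably_knowledge_only(elapsed))
```

Timing is only a hint. An empty inline_citations array is the more direct signal.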
Response structure recap
x_search_tool’s response has the following fields.
- success: success / failure
- provider: always xai
- credential_source: auth method (xai-oauth or API key)
- tool: always x_search
- model: the Grok model that actually ran (grok-4.20-reasoning in this test)
- query: the query as passed
- answer: the response body in Markdown
- citations: top-level citation array (always empty in my tests)
- inline_citations: mapping of footnote numbers to source post URLs in the body
External URLs (GitHub, papers, official docs) don’t go into inline_citations — they’re written inline in the answer body. If the source post contains an external URL, Grok picks it up and writes it into the answer.
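To descend from inline_citations to individual posts, the handle and post ID can be pulled out of each URL. The sketch below assumes citation URLs always follow the two shapes seen in the excerpt (x.com/<handle>/status/<id> and the handle-less x.com/i/status/<id>); parse_citation is a hypothetical helper.

```python
import re
from typing import Optional

# Matches the two URL shapes observed in this test's inline_citations.
_STATUS_URL = re.compile(r"^https://x\.com/([^/]+)/status/(\d+)$")


def parse_citation(url: str) -> Optional[dict]:
    """Split a post URL into handle and post ID; None if the shape is unexpected."""
    match = _STATUS_URL.match(url)
    if match is None:
        return None
    handle, post_id = match.groups()
    # "/i/status/<id>" URLs do not reveal the poster's handle.
    return {"handle": None if handle == "i" else handle, "post_id": post_id}


for url in (
    "https://x.com/higgsfield_ai/status/2054989169446023181",
    "https://x.com/i/status/2056022182560665602",
):
    print(parse_citation(url))
```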
Parameters, and the missing language / region filter
According to Hermes Agent’s docs, x_search_tool accepts the following parameters.
| Parameter | Type | Description |
|---|---|---|
| query | string (required) | Search query |
| allowed_x_handles | string array | Handles to include (max 10, mutually exclusive with excluded_x_handles) |
| excluded_x_handles | string array | Handles to exclude (max 10) |
| from_date | string | Start date (YYYY-MM-DD) |
| to_date | string | End date (YYYY-MM-DD) |
| enable_image_understanding | boolean | Whether xAI should analyze attached images on matched posts |
| enable_video_understanding | boolean | Whether xAI should analyze attached videos on matched posts |
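The constraints in the table (10-handle cap, mutual exclusion, YYYY-MM-DD dates) are easy to check before spending a minute on a call. This is a sketch only: build_search_params is a hypothetical helper, and whether the Python function x_search_tool accepts these names as keyword arguments is an assumption not verified here.

```python
import re
from datetime import date
from typing import Optional, Sequence

MAX_HANDLES = 10  # limit stated in the parameter table
_DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def _check_date(name: str, value: str) -> date:
    """Accept only a real calendar date written as YYYY-MM-DD."""
    if not _DATE_RE.fullmatch(value):
        raise ValueError(f"{name} must be YYYY-MM-DD, got {value!r}")
    return date.fromisoformat(value)  # also rejects dates like 2026-02-30


def build_search_params(
    query: str,
    allowed_x_handles: Optional[Sequence[str]] = None,
    excluded_x_handles: Optional[Sequence[str]] = None,
    from_date: Optional[str] = None,
    to_date: Optional[str] = None,
) -> dict:
    """Validate arguments against the documented limits and return a dict."""
    if not query:
        raise ValueError("query is required")
    if allowed_x_handles and excluded_x_handles:
        raise ValueError("allowed_x_handles and excluded_x_handles are mutually exclusive")
    params: dict = {"query": query}
    for name, handles in (
        ("allowed_x_handles", allowed_x_handles),
        ("excluded_x_handles", excluded_x_handles),
    ):
        if handles:
            if len(handles) > MAX_HANDLES:
                raise ValueError(f"{name} accepts at most {MAX_HANDLES} handles")
            params[name] = list(handles)
    start = _check_date("from_date", from_date) if from_date else None
    end = _check_date("to_date", to_date) if to_date else None
    if start and end and start > end:
        raise ValueError("from_date must not be later than to_date")
    if from_date:
        params["from_date"] = from_date
    if to_date:
        params["to_date"] = to_date
    return params


print(build_search_params("AI coding agents", allowed_x_handles=["higgsfield_ai"],
                          from_date="2026-05-01", to_date="2026-05-08"))
```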
The thing that caught my eye is that there is no language or country equivalent parameter. Neither a query-language filter nor a regional filter exists in the official spec.
The runtime behavior is consistent with that. Throwing a Japanese query (“Past week’s trending AI coding agents on X…”), I got:
- The answer body came back in Japanese
- The six cited posts (@Suryanshti777, @higgsfield_ai, @ApollonVisual and others) were all English posts
So “translate the summary to match the query language” applies, but on the search-source side there is no language filter. This looks like the same family of behavior as the X timeline, where Grok will quietly translate English posts into Japanese.
Translation behavior can be controlled via the query body
The parameter list doesn’t expose it, but if the query body explicitly says “no translation or summarization, keep the original text,” Grok returns the original text as-is. The query I tried:
List 3 recent posts about AI coding agents that are getting attention.
Output only: poster handle, original post body (no translation, no summarization),
and any external URLs included. The whole answer should keep English posts in English
and Japanese posts in Japanese. No commentary.
The returned answer (excerpt):
**@noohelhadedy**
A Design.md file helps AI coding agents understand your design system instead of guessing your UI.
Colors, typography, spacing, components, visual rules, and brand style in one markdown file.
...
https://github.com/VoltAgent/awesome-design-md
**@tom_doerr**
Orchestrates AI coding agents with persistent memory
https://github.com/RedPlanetHQ/core
**@supabase**
Make sure you install the Supabase plugin for AI coding agents when building apps with AI and Supabase
https://supabase.com/docs/guides/ai-tools/plugins
All three are English posts, and the body, GitHub URL, and t.co shortlinks are preserved as-is. With the summarization phase skipped, the response time also dropped to ~39s (vs ~58s for the previous translated-query run).
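When the answer comes back in this handle / body / URL layout, it can be split into records with a few lines of parsing. The sketch assumes each post starts with a bolded handle line like `**@name**` and that URLs sit on their own lines, as in the excerpt above. Grok's output format depends on the prompt and may vary, and parse_posts is a hypothetical helper.

```python
import re

# A post starts with a bold handle line such as "**@supabase**".
_HANDLE_LINE = re.compile(r"^\*\*@(\w+)\*\*$")


def parse_posts(answer: str) -> list:
    """Split an 'original text' answer into {handle, body, urls} records."""
    posts = []
    for raw in answer.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = _HANDLE_LINE.match(line)
        if match:
            posts.append({"handle": match.group(1), "body": [], "urls": []})
        elif posts:  # ignore any text before the first handle line
            key = "urls" if line.startswith(("http://", "https://")) else "body"
            posts[-1][key].append(line)
    return posts


sample = """**@tom_doerr**
Orchestrates AI coding agents with persistent memory
https://github.com/RedPlanetHQ/core
**@supabase**
Make sure you install the Supabase plugin for AI coding agents when building apps with AI and Supabase
https://supabase.com/docs/guides/ai-tools/plugins
"""
for post in parse_posts(sample):
    print(post["handle"], post["urls"])
```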
So if you want to pull Japanese-only sources or preserve original text, the options come down to:
- Say “Japanese posts only” / “original text” explicitly in the query body (works through the prompt even though no parameter exists)
- List target accounts in allowed_x_handles (max 10)
- Narrow the time range with from_date / to_date
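For the time-range option, a "past N days" window can be computed instead of typed by hand. This is a minimal sketch; date_window is a hypothetical helper, and whether to_date is treated as inclusive by the tool is not something this test established.

```python
from datetime import date, timedelta
from typing import Optional


def date_window(days: int, today: Optional[date] = None) -> dict:
    """Return from_date/to_date strings (YYYY-MM-DD) covering the last `days` days."""
    if days < 0:
        raise ValueError("days must not be negative")
    end = today or date.today()
    start = end - timedelta(days=days)
    return {"from_date": start.isoformat(), "to_date": end.isoformat()}


# Fixed "today" so the example is reproducible.
print(date_window(7, today=date(2026, 5, 20)))
```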
Conversely, for “pull English sources and summarize in Japanese” use cases, the default translation behavior just works. For broad news clipping, no language filter means better coverage.
Config-file (~/.hermes/config.yaml) values are a separate axis. The main ones:
- x_search.model: which Grok model to use (default grok-4.20-reasoning)
- x_search.timeout_seconds: timeout in seconds (default 180, minimum 30)
- x_search.retries: auto-retry count on 5xx / ReadTimeout etc. (default 2)
The 180-second default timeout is itself a hint that complex queries are expected to take a minute or two.
Caveats and limits
A few practical points after handling it on real hardware.
- It’s not fast. Measured times: ~58s for an X-search-firing query, ~8s even for a knowledge-only one. For complex queries, design your skill assuming over a minute
- What comes back is “Grok’s summary of X search results + footnote URLs,” not the X search API. If you want raw post bodies, fetch the URLs separately
- As @MtkN1XBt’s post warns, “the X API is now free” is not true
- Grok’s interpretation bias rides on the answer. Even in my sample testing, you can’t rule out that “xAI / Grok / Claude”-adjacent topics were overweighted
- Basic X Premium working today can be revoked if xAI’s policy changes tomorrow. For production use, factor in the cost of upgrading to Premium+