Hermes Agent x_search on basic X Premium (M4 Mac mini): uvx, OAuth, query traps

I tested Hermes Agent’s x_search_tool on a basic X Premium plan from an M4 Mac mini (16GB). The short version: the docs say x_search requires SuperGrok / X Premium+ tier, but the basic plan got through both OAuth and the actual API call. The tier gate is looser than the documentation makes it sound.

“Worked” isn’t the whole story, though. Depending on how you write the query, Grok just returns a general-knowledge answer and X search never fires at all. I hit that in practice, so it’s worth recording — together with the OAuth setup, the timing gap, and a missing parameter that surprised me.

The minimal-setup recipe comes from a post by @MtkN1XBt, which I’ll reference throughout.

What I wanted to verify

The thread by @MtkN1XBt makes three points:

If all you want is x_search, you don’t need to install or configure Hermes Agent itself
You only need uv (uvx) and xAI OAuth
Calling the internal Python API (tools.x_search_tool) directly skips one interpretation layer compared to asking Hermes’s own model to “use x_search…”

What I wanted to verify is the tier gate. Hermes Agent’s docs say:

The x_search tool requires either SuperGrok / X Premium+ tier authentication or a paid xAI API key

A Zenn-side hands-on report, on the other hand, says “basic X Premium worked for both grok-4.3 and x_search.”

My account is basic X Premium (not Premium+), so I tested which side is right.

Setup: uvx + xAI OAuth

With uvx there is no Python environment to set up. You only need uv installed.

xAI OAuth is a one-shot command.

uvx --from hermes-agent hermes auth add xai-oauth

Running it opens a browser to xAI’s login page. The callback returns to http://127.0.0.1:56121/callback and the token is stored under uvx.

Open this URL to authorize Hermes with xAI:
https://auth.x.ai/oauth2/authorize?response_type=code&client_id=...
Waiting for callback on http://127.0.0.1:56121/callback
Browser opened for xAI authorization.
Added xai-oauth OAuth credential #1: "xai-oauth-oauth-1"

The basic X Premium account was not rejected here. At least the auth phase doesn’t enforce a tier check.

For doing this over SSH on a remote host, you need a tunnel for the callback port.

ssh -L 56121:127.0.0.1:56121 user@remote-host \
  'uvx --from hermes-agent hermes auth add xai-oauth --no-browser'

Open the URL in your local browser and the token reaches the remote Hermes via the SSH tunnel.

Verification 1: abstract query doesn’t invoke X search

OAuth is done, so call x_search_tool directly as a one-liner.

uvx --from hermes-agent python -c \
  'from tools.x_search_tool import x_search_tool; print(x_search_tool("test query about AI"))'

Measured with time, the whole thing returned in 8.4 seconds. The returned JSON (excerpt):

{
  "success": true,
  "provider": "xai",
  "credential_source": "xai-oauth",
  "tool": "x_search",
  "model": "grok-4.20-reasoning",
  "query": "test query about AI",
  "answer": "**AI (Artificial Intelligence)** is the simulation of human intelligence processes by machines...",
  "citations": [],
  "inline_citations": []
}

Two points to note.

success: true means basic X Premium got through. credential_source is xai-oauth, so it’s definitely going through OAuth credentials
citations and inline_citations are both empty arrays

Reading the answer body, it’s a generic explanation of “What is AI” and “Narrow AI / Generative AI / Multimodal AI / AGI,” with no trace of having actually searched X. Grok just answered from its own knowledge base.

So “worked” and “X search fired” are different things. If the query is too abstract, Grok doesn’t search X and instead answers as a general LLM.
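The distinction above can be checked mechanically. The following is a minimal sketch, assuming the response has already been parsed into a dict (if the tool hands back a JSON string, run it through json.loads first). The helper name x_search_fired is my own and not part of Hermes Agent; the field names come from the response shown above.

```python
from typing import Any, Mapping


def x_search_fired(response: Mapping[str, Any]) -> bool:
    """Return True if a successful response carries at least one citation.

    Hypothetical helper: treats a non-empty citations or inline_citations
    array as evidence that X search actually ran.
    """
    if not response.get("success"):
        return False
    return bool(response.get("citations")) or bool(response.get("inline_citations"))


# Shapes taken from the two excerpts in this article (fields trimmed).
knowledge_only = {"success": True, "citations": [], "inline_citations": []}
with_sources = {
    "success": True,
    "inline_citations": [
        {"url": "https://x.com/higgsfield_ai/status/2054989169446023181", "title": "4"}
    ],
}

print(x_search_fired(knowledge_only))  # False
print(x_search_fired(with_sources))    # True
```

Checking citations rather than success is the point: success only says the call went through, not that a search happened.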

Verification 2: a concrete query brings back citations

Re-run with a query scoped by time, target, and count.

uvx --from hermes-agent python -c \
  'from tools.x_search_tool import x_search_tool; print(x_search_tool("過去1週間でX上で話題になったAIコーディングエージェントやLLMの新リリースは何？投稿者ハンドルと該当ポストを5件挙げて"))'

This one clocked 58.4 seconds under the same time measurement, about seven times the abstract case (8.4 seconds). The gap itself can be read as a sign that X search is actually running. Tail end of the response (excerpt):

{
  "success": true,
  "model": "grok-4.20-reasoning",
  "answer": "...エンゲージメントが高く、具体的な新リリース／ツールを扱った投稿を5件挙げます...",
  "inline_citations": [
    {"url": "https://x.com/futurepedia_io/status/2056208900979151314", "title": "1", ...},
    {"url": "https://x.com/i/status/2056022182560665602", "title": "2", ...},
    {"url": "https://x.com/i/status/2055117211774181690", "title": "3", ...},
    {"url": "https://x.com/higgsfield_ai/status/2054989169446023181", "title": "4", ...},
    {"url": "https://x.com/i/status/2056292686458708178", "title": "5", ...},
    {"url": "https://x.com/ApollonVisual/status/2056647214110274050", "title": "6", ...}
  ]
}

This time inline_citations has 6 entries. The answer body uses footnote markers like [[1]] [[2]] that tie claims back to source post URLs. The posts that came back include:

@Suryanshti777’s post introducing Anthropic’s claude-code-setup plugin
@higgsfield_ai’s Supercomputer release announcement
@ApollonVisual on Cursor Composer 2.5 (based on Kimi K2.5)

Concrete poster handles and source URLs are returned, so you can use Grok’s summary as an entry point and descend to the individual posts.
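Descending from the summary to individual posts can be sketched as follows. This assumes, based only on the excerpt above, that each inline_citations entry has a "title" matching the number inside a [[n]] marker and a "url" pointing at the post. The helper names and the sample answer text are illustrative, not part of Hermes Agent or a real response.

```python
import re
from typing import Any, Mapping, Optional
from urllib.parse import urlparse

FOOTNOTE = re.compile(r"\[\[(\d+)\]\]")


def handle_from_url(url: str) -> Optional[str]:
    """Return the poster handle from an x.com status URL.

    Links of the form /i/status/<id> carry no handle, so they yield None.
    """
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) >= 3 and parts[1] == "status" and parts[0] != "i":
        return parts[0]
    return None


def resolve_footnotes(response: Mapping[str, Any]) -> dict[str, str]:
    """Map each [[n]] marker used in the answer to its cited post URL."""
    by_title = {c["title"]: c["url"] for c in response.get("inline_citations", [])}
    used = FOOTNOTE.findall(response.get("answer", ""))
    return {n: by_title[n] for n in used if n in by_title}


# Illustrative response: URLs are from the excerpt, the answer text is made up.
sample = {
    "answer": "First claim [[4]]. Second claim [[5]].",
    "inline_citations": [
        {"url": "https://x.com/higgsfield_ai/status/2054989169446023181", "title": "4"},
        {"url": "https://x.com/i/status/2056292686458708178", "title": "5"},
    ],
}

for number, url in resolve_footnotes(sample).items():
    print(number, handle_from_url(url), url)
```

Several of the cited URLs use the /i/status/ form, so the handle is not always recoverable from the URL alone.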

What’s the difference between V1 and V2

Same x_search_tool call, but one has empty citations and the other doesn’t. The only difference is query specificity.

To reliably make X search fire, the query should include at least one of:

An explicit time range (“past week”, “May 2026”)
A concrete scope (specific technical area, handle, topic)
A count specification (“give me 5 posts”, forcing list output)
Words that name X objects directly (“poster handle”, “matching posts”)

If you throw an abstract “tell me about <topic>” at it, Grok answers from internal knowledge. This is a habit of web-search agents in general, and Hermes-routed x_search behaves the same way.

Response time is also a tell. In this run, X-search-firing queries took ~58s while knowledge-only ones took ~8s — about a 7x gap. Anything returning in under 10 seconds probably didn’t fire X search.
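The timing tell can be wrapped around any call. This is a rough sketch: the 10-second threshold is only the rule of thumb from my measurements, and timed_call, probably_knowledge_only, and the stub fake_search are hypothetical names. In real use you would pass x_search_tool in place of the stub.

```python
import time
from typing import Any, Callable, Tuple

# Heuristic from this article's measurements, not a documented limit.
FAST_THRESHOLD_SECONDS = 10.0


def timed_call(fn: Callable[[str], Any], query: str) -> Tuple[Any, float]:
    """Call fn(query) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(query)
    return result, time.perf_counter() - start


def probably_knowledge_only(elapsed_seconds: float) -> bool:
    """Flag responses that came back too quickly to have searched X."""
    return elapsed_seconds < FAST_THRESHOLD_SECONDS


def fake_search(query: str) -> dict:
    """Stand-in for x_search_tool so the sketch runs offline."""
    return {"success": True, "query": query, "inline_citations": []}


result, elapsed = timed_call(fake_search, "test query about AI")
print(probably_knowledge_only(elapsed))  # True, because the stub returns instantly
print(probably_knowledge_only(58.4))     # False
```

Timing is a weaker signal than the citation arrays, so it is better used as a cross-check than as the only test.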

Response structure recap

x_search_tool’s response has the following fields.

success: success / failure
provider: always xai
credential_source: auth method (xai-oauth or API key)
tool: always x_search
model: the Grok model that actually ran (grok-4.20-reasoning in this test)
query: the query as passed
answer: the response body in Markdown
citations: top-level citation array (always empty in my tests)
inline_citations: mapping of footnote numbers to source post URLs in the body

External URLs (GitHub, papers, official docs) don’t go into inline_citations — they’re written inline in the answer body. If the source post contains an external URL, Grok picks it up and writes it into the answer.

Parameters, and the missing language / region filter

According to Hermes Agent’s docs, x_search_tool accepts the following parameters.

Parameter	Type	Description
`query`	string (required)	Search query
`allowed_x_handles`	string array	Handles to include (max 10, mutually exclusive with `excluded_x_handles`)
`excluded_x_handles`	string array	Handles to exclude (max 10)
`from_date`	string	Start date (`YYYY-MM-DD`)
`to_date`	string	End date (`YYYY-MM-DD`)
`enable_image_understanding`	boolean	Whether xAI should analyze attached images on matched posts
`enable_video_understanding`	boolean	Whether xAI should analyze attached videos on matched posts
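The constraints in the table can be checked before a call is made. The sketch below covers the query, handle, and date parameters only. It assumes the parameters are passed as keyword arguments, which I have not confirmed against the function signature. The builder name and the from_date/to_date ordering check are my own additions.

```python
from datetime import datetime
from typing import Any, Optional, Sequence

MAX_HANDLES = 10  # limit stated in the parameter table


def _parse_day(name: str, value: str) -> datetime:
    """Parse a year-month-day string, raising a readable error otherwise."""
    try:
        return datetime.strptime(value, "%Y-%m-%d")
    except ValueError as exc:
        raise ValueError(f"{name} must be YYYY-MM-DD, got {value!r}") from exc


def build_x_search_kwargs(
    query: str,
    allowed_x_handles: Optional[Sequence[str]] = None,
    excluded_x_handles: Optional[Sequence[str]] = None,
    from_date: Optional[str] = None,
    to_date: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: validate arguments and drop the unset ones."""
    if not query.strip():
        raise ValueError("query is required")
    if allowed_x_handles and excluded_x_handles:
        raise ValueError("allowed_x_handles and excluded_x_handles are mutually exclusive")
    for name, handles in (
        ("allowed_x_handles", allowed_x_handles),
        ("excluded_x_handles", excluded_x_handles),
    ):
        if handles and len(handles) > MAX_HANDLES:
            raise ValueError(f"{name} accepts at most {MAX_HANDLES} handles")
    start = _parse_day("from_date", from_date) if from_date else None
    end = _parse_day("to_date", to_date) if to_date else None
    if start and end and start > end:
        # Sanity check added for this sketch, not taken from the docs.
        raise ValueError("from_date must not be after to_date")

    kwargs: dict[str, Any] = {"query": query}
    if allowed_x_handles:
        kwargs["allowed_x_handles"] = list(allowed_x_handles)
    if excluded_x_handles:
        kwargs["excluded_x_handles"] = list(excluded_x_handles)
    if from_date:
        kwargs["from_date"] = from_date
    if to_date:
        kwargs["to_date"] = to_date
    return kwargs


example_kwargs = build_x_search_kwargs(
    "new AI coding agent releases",
    allowed_x_handles=["higgsfield_ai", "supabase"],
    from_date="2026-05-01",
    to_date="2026-05-07",
)
print(example_kwargs)
```

If the keyword-argument assumption holds, the result could be passed as x_search_tool(**example_kwargs).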

What caught my eye is that there is no parameter for language or country. The official spec has neither a query-language filter nor a regional filter.

The runtime behavior is consistent with that. Throwing a Japanese query (“Past week’s trending AI coding agents on X…”), I got:

The answer body came back in Japanese
The six cited posts (@Suryanshti777, @higgsfield_ai, @ApollonVisual and others) were all English posts

So the summary is translated to match the query language, but the search itself applies no language filter. This looks like the same family of behavior as the X timeline, where Grok will quietly translate English posts into Japanese.

Translation behavior can be controlled via the query body

The parameter list doesn’t expose it, but if the query body explicitly says “no translation or summarization, keep the original text,” Grok returns the original text as-is. The query I tried:

List 3 recent posts about AI coding agents that are getting attention.
Output only: poster handle, original post body (no translation, no summarization),
and any external URLs included. The whole answer should keep English posts in English
and Japanese posts in Japanese. No commentary.

The returned answer (excerpt):

**@noohelhadedy**
A Design.md file helps AI coding agents understand your design system instead of guessing your UI.
Colors, typography, spacing, components, visual rules, and brand style in one markdown file.
...
https://github.com/VoltAgent/awesome-design-md

**@tom_doerr**
Orchestrates AI coding agents with persistent memory
https://github.com/RedPlanetHQ/core

**@supabase**
Make sure you install the Supabase plugin for AI coding agents when building apps with AI and Supabase
https://supabase.com/docs/guides/ai-tools/plugins

All three are English posts, and the body, GitHub URL, and t.co shortlinks are preserved as-is. With the summarization phase skipped, the response time also dropped to ~39s (vs ~58s for the previous translated-query run).

So if you want to pull Japanese-only sources or preserve original text, the options come down to:

Say “Japanese posts only” / “original text” explicitly in the query body (works through the prompt even though no parameter exists)
List target accounts in allowed_x_handles (max 10)
Narrow the time range with from_date / to_date

Conversely, for “pull English sources and summarize in Japanese” use cases, the default translation behavior just works. For broad news clipping, no language filter means better coverage.

Config-file (~/.hermes/config.yaml) values are a separate axis. The main ones:

x_search.model: which Grok model to use (default grok-4.20-reasoning)
x_search.timeout_seconds: timeout in seconds (default 180, minimum 30)
x_search.retries: auto-retry count on 5xx / ReadTimeout etc. (default 2)

The 180-second default timeout is itself a signal that complex queries are expected to take a minute or more.
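For budgeting a skill around these defaults, a back-of-the-envelope worst case helps. This sketch assumes each retry gets the full timeout and that there is no backoff between attempts. Neither assumption is confirmed by the docs, so treat the number as a rough upper bound.

```python
def worst_case_seconds(timeout_seconds: int = 180, retries: int = 2) -> int:
    """Upper bound on wall-clock time: the first attempt plus every retry,
    each assumed to run until the timeout (assumption, see lead-in)."""
    attempts = 1 + retries
    return timeout_seconds * attempts


print(worst_case_seconds())       # 540
print(worst_case_seconds(30, 0))  # 30
```

With the defaults, a call that keeps timing out could block for about nine minutes, so lowering x_search.retries may make sense for interactive use.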

Caveats and limits

A few practical points after handling it on real hardware.

It’s not fast. Measured times: ~58s for an X-search-firing query, ~8s even for a knowledge-only one. For complex queries, design your skill assuming over a minute
What comes back is “Grok’s summary of X search results + footnote URLs,” not the X search API. If you want raw post bodies, fetch the URLs separately
As @MtkN1XBt’s post warns, “the X API is now free” is not true
Grok’s interpretation bias rides on the answer. Even in my sample testing, you can’t rule out that “xAI / Grok / Claude”-adjacent topics were overweighted
Basic X Premium works today, but access could be revoked if xAI’s policy changes tomorrow. For production use, factor in the cost of upgrading to Premium+