OpenClaw agent billing security using NemoClaw and Stripe MPP
Suppose you have isolated an agent in a sandbox: the file system is restricted and network access is limited to a whitelist. What happens when that agent calls an external API service that bills you?
Place NVIDIA’s OpenClaw sandbox plugin “NemoClaw” and Stripe’s payment standard for AI agents, the “Machine Payments Protocol (MPP)”, side by side, and the outline of an answer to this question appears.
Attack history of the OpenClaw ecosystem
Before getting into NemoClaw’s technology, let’s review why we need a sandbox. I have repeatedly discussed the security issues with OpenClaw on this blog.
- AMOS distribution via SKILL.md: Malicious skills on ClawHub use the AI as a “trusted intermediary” to trick users into installing macOS information-stealing malware, which collects credentials from 150 crypto wallets and 19 browsers.
- Clinejection: A prompt injection in a GitHub issue title reached an AI triage bot, led to npm token theft, and ended with OpenClaw being installed automatically on 4000 machines. It was the first supply chain attack to use an AI agent as the attack vector.
- Karpathy’s “400,000 line vibe coded monster” comment: Immediately after OpenClaw reached 100,000 stars in two days, reports of RCE vulnerabilities, supply chain contamination, and malicious skills kept emerging. Karpathy has made it clear that he has no intention of handing over his private key.
- Memory injection attacks: The MINJA, InjecMEM, and ToxicSkills campaigns contaminate SOUL.md/MEMORY.md. A “Ship of Theseus” pattern has also been confirmed, in which the effects remain even after the skill is uninstalled.
- Predecessors in local sandboxing: Agent Safehouse (macOS sandbox-exec) and Codex (Windows sandbox). These appeared in response to the incidents in which Amazon’s Kiro deleted a production DB and Replit’s AI deleted a customer DB.
flowchart TD
A["ClawHub / SkillsMP / skills.sh<br/>Skill marketplaces"] -->|Malicious SKILL.md| B["OpenClaw agent"]
B -->|"AI acts as a trusted intermediary<br/>and presents fake install instructions"| C["AMOS infection<br/>(macOS information-stealing malware)"]
D["GitHub issue title"] -->|Prompt injection| E["AI triage bot"]
E -->|"Cache poisoning →<br/>npm token theft"| F["cline@2.3.0 published<br/>postinstall installs<br/>OpenClaw globally"]
G["CVE-2026-25253"] -->|"Malicious gatewayUrl"| H["Auth token theft<br/>RCE"]
I["Malicious skill"] -->|"Rewrites SOUL.md/MEMORY.md"| J["Memory poisoning<br/>Persistent contamination"]
style C fill:#991b1b,color:#fff
style F fill:#991b1b,color:#fff
style H fill:#991b1b,color:#fff
style J fill:#991b1b,color:#fff
If you look at the numbers, you can see how serious the situation is.
| Attack/Investigation | Method | Scale |
|---|---|---|
| ClawHavoc | 341 malicious skills on ClawHub; 91% are the prompt injection type | More than doubled to 824 as of the February 16 update |
| AMOS Distribution | SKILL.md instructs the AI to install a fake CLI | Trend Micro identified 39 cases; Bitdefender estimates that about 20% of the entire ecosystem is malicious |
| Clinejection | Issue title → AI triage bot → cache pollution → npm token theft | OpenClaw automatic installation on 4000 machines |
| CVE-2026-25253 | RCE via gatewayUrl parameter (CVSS 8.8) | 30,000+ public instances in 52 countries |
| ToxicSkills | Snyk audited 3,984 skills, 36.82% had security issues | 76 were determined to be malicious |
| MINJA/InjecMEM | Contaminates memory banks with just regular queries to the agent | Contamination remains even after uninstallation |
As Karpathy warned, it is impractical for an individual to audit the security of a 400,000-line codebase, and lightweight alternative implementations such as ZeroClaw (Rust, 3.4MB), PicoClaw (Go, runs on a $10 board), and NullClaw (Zig, 678KB) emerged at the same time. There are also security-focused implementations such as IronClaw, which runs all tools inside a Wasm sandbox. But shrinking the codebase does not solve the supply chain problems of the skills ecosystem. NemoClaw attempts to address that problem at the runtime level.
NemoClaw architecture
NemoClaw is a plugin for running OpenClaw agents securely within the NVIDIA OpenShell sandbox. It is published under the Apache 2.0 license on GitHub (NVIDIA/NemoClaw) and has collected approximately 10,000 stars in just 4 days since its release.
graph TD
A[OpenClaw<br/>Autonomous agent] --> B[OpenShell<br/>Secure runtime]
C[NemoClaw<br/>NVIDIA plugin] --> B
B --> D[K3s inside Docker<br/>Kubernetes cluster]
D --> E[Sandbox environment]
E --> F[NVIDIA cloud<br/>Inference endpoint]
| Component | Role |
|---|---|
| OpenClaw | Always-on autonomous agent provided by openclaw.ai |
| OpenShell (NVIDIA/OpenShell) | Secure runtime for agent execution. Runs K3s inside a Docker container |
| NemoClaw | OpenShell plugin exclusively for OpenClaw. Glue for integration with NVIDIA inference |
| ClawHub | OpenClaw Skills Marketplace |
OpenShell also supports Claude Code, Codex, Ollama, etc., and NemoClaw is an implementation specialized for OpenClaw and NVIDIA inference.
Compared with local isolated execution using macOS sandbox-exec or the Windows sandbox, the approach differs: Agent Safehouse and similar tools protect agents on the desktop, whereas NemoClaw provides server-side protection built on a K3s cluster. A developer running OpenClaw locally and a team operating OpenClaw as an always-on service need different security models.
Blueprint Architecture
NemoClaw is internally divided into two layers.
| Layer | Language | Role |
|---|---|---|
| Plugin | TypeScript | UI/CLI layer that provides the nemoclaw CLI and the openclaw nemoclaw subcommands |
| Blueprint | Python | Versioned artifact. The layer that actually orchestrates sandbox creation, policy application, and inference settings |
This separation allows the Blueprint side to be released independently, without updating the TypeScript plugin. A Blueprint goes through a five-stage lifecycle driven by OpenShell CLI commands.
graph LR
A[Resolve<br/>Version compatibility check] --> B[Verify<br/>Digest verification]
B --> C[Plan<br/>Resource planning]
C --> D[Apply<br/>Resource creation]
D --> E[Status<br/>Status report]
The Verify stage is the key step for security: it checks the artifact digest to detect tampering in the supply chain. Given that ClawHub is currently distributing a large number of malicious skills, verifying the integrity of what you install is essential.
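As a rough illustration of what a digest check involves, here is a minimal sketch. It assumes SHA-256 and a digest pinned at release time; the source does not say which hash NemoClaw uses, and the payload and function names are hypothetical.

```python
import hashlib
import hmac


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, expected_digest: str) -> bool:
    """Compare the computed digest with the pinned one in constant time."""
    return hmac.compare_digest(sha256_digest(data), expected_digest.lower())


# Hypothetical blueprint payload and the digest pinned when it was released.
blueprint = b"sandbox: {fs_write: [/sandbox, /tmp]}"
pinned = sha256_digest(blueprint)

print(verify_artifact(blueprint, pinned))              # True: untouched artifact
print(verify_artifact(blueprint + b"#evil", pinned))   # False: tampered artifact
```

A check like this only proves that the artifact matches what the publisher pinned. It says nothing about whether the publisher's content is benign.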
Four-layer security
The protection NemoClaw applies consists of four layers.
| Layer | Protection details | Lock method |
|---|---|---|
| Network | Block unauthorized outbound connections | Hot-reloadable at runtime |
| Filesystem | Prohibit writing to anything other than /sandbox and /tmp | Lock when creating sandbox |
| Process | Block privilege escalation and dangerous system calls | Lock when creating sandbox |
| Inference | Reroute model API calls to controlled backends | Hot-reloadable at runtime |
The implementation uses a multi-layered defense that combines Landlock LSM, seccomp, and network namespace (netns). Filesystem and Process layers are fixed at the time of sandbox creation, while Network and Inference can update policies without restarting the sandbox.
Binary-based network control
Network control goes beyond IP address or host name filtering. Its distinguishing feature is per-binary control: the policy specifies which binary may reach which endpoint with which HTTP methods.
The main endpoints allowed by the baseline policy are:
| Endpoint | Restriction |
|---|---|
| api.anthropic.com:443 | Only from the claude binary |
| integrate.api.nvidia.com:443 | For inference |
| github.com:443 / api.github.com:443 | gh/git binaries, with HTTP method control |
| openclaw.ai:443 / clawhub.com:443 | No restrictions |
| registry.npmjs.org:443 | GET only |
If an agent attempts to access a host that is not included in the baseline, the operator will be notified via TUI (openshell term). Operators can choose to approve or deny in real time, but approval is only valid during the session and is not permanently saved in the baseline policy.
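To make the decision logic concrete, here is a minimal sketch of a per-binary policy check. The `Rule` and `Policy` classes are hypothetical and are not OpenShell's actual policy format. Denying a known host when the binary or method does not match is also an assumption, since the source only describes the operator prompt for unknown hosts.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    host: str
    port: int
    binaries: frozenset = frozenset()   # empty set means any binary
    methods: frozenset = frozenset()    # empty set means any HTTP method


# Hypothetical encoding of part of the baseline table above.
BASELINE = [
    Rule("api.anthropic.com", 443, binaries=frozenset({"claude"})),
    Rule("registry.npmjs.org", 443, methods=frozenset({"GET"})),
    Rule("clawhub.com", 443),
]


@dataclass
class Policy:
    rules: list
    # Operator approvals live only in this object, so they vanish with the session.
    session_approvals: set = field(default_factory=set)

    def decide(self, binary: str, host: str, port: int, method: str) -> str:
        host_known = False
        for rule in self.rules:
            if (rule.host, rule.port) != (host, port):
                continue
            host_known = True
            binary_ok = not rule.binaries or binary in rule.binaries
            method_ok = not rule.methods or method in rule.methods
            if binary_ok and method_ok:
                return "allow"
        if host_known:
            return "deny"            # assumption: known host, wrong binary or method
        if (host, port) in self.session_approvals:
            return "allow"
        return "ask-operator"        # unknown host: surface it in the TUI


policy = Policy(BASELINE)
print(policy.decide("claude", "api.anthropic.com", 443, "POST"))  # allow
print(policy.decide("curl", "api.anthropic.com", 443, "POST"))    # deny
print(policy.decide("curl", "example.com", 443, "GET"))           # ask-operator
```

Keeping approvals in a per-session set rather than writing them back into `BASELINE` mirrors the behavior described above, where an approval never becomes part of the permanent policy.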
One question arises here. clawhub.com:443 is allowed with no restrictions, but is that acceptable given that ClawHub itself is a distribution source for malicious skills? File system protection stops a malicious skill from rewriting system files, but prompt injection attacks (91% of ClawHavoc are of this type) cannot be stopped by file system restrictions. Once a malicious SKILL.md is injected into the agent’s context, malicious commands can run as normal operations inside the sandbox. Model-level defenses such as Instruction Hierarchy are required separately.
Inference Routing and Cost Control
Inference requests do not leave the sandbox directly. OpenShell intercepts and forwards to the configured NVIDIA cloud endpoint.
sequenceDiagram
participant A as Agent (inside sandbox)
participant G as OpenShell Gateway
participant N as NVIDIA cloud<br/>(build.nvidia.com)
A->>G: Inference request
G->>N: Controlled routing
N->>G: Response
G->>A: Forwarded response
The default model is Nemotron 3 Super 120B (context 131,072 tokens, maximum output 8,192 tokens). Nemotron Ultra 253B, Nemotron Super 49B v1.5, and Nemotron 3 Nano 30B for local inference can also be configured.
An important side effect of inference routing is that the gateway can monitor and control how many tokens the agent sends to which model. This makes it possible to visualize inference costs and cap them, which connects directly to the agent payment problem discussed next.
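The kind of metering a gateway could do is sketched below. This is not OpenShell's implementation: the `InferenceMeter` class, the model identifier string, and the price are all hypothetical. Amounts are kept as integer micro-USD to avoid floating-point drift.

```python
from collections import defaultdict


class BudgetExceeded(Exception):
    """Raised when a request would push spend past the configured cap."""


class InferenceMeter:
    """Tracks per-model token counts and total spend in micro-USD."""

    def __init__(self, price_per_1k: dict, cap_micro_usd: int):
        self.price_per_1k = price_per_1k      # micro-USD per 1,000 tokens
        self.cap_micro_usd = cap_micro_usd
        self.tokens = defaultdict(int)
        self.spent_micro_usd = 0

    def record(self, model: str, tokens: int) -> int:
        # Ceiling division, so even a tiny request has a non-zero cost.
        cost = -(-self.price_per_1k[model] * tokens // 1000)
        if self.spent_micro_usd + cost > self.cap_micro_usd:
            raise BudgetExceeded(f"{model}: {tokens} tokens would exceed the cap")
        self.tokens[model] += tokens
        self.spent_micro_usd += cost
        return self.spent_micro_usd


# Hypothetical price: 500 micro-USD per 1,000 tokens, capped at 2,000 micro-USD.
meter = InferenceMeter({"nemotron-3-super-120b": 500}, cap_micro_usd=2000)
print(meter.record("nemotron-3-super-120b", 3000))   # 1500
```

Because the check runs before the counters are updated, a rejected request leaves the recorded totals unchanged.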
Inconsistency in charging from within the sandbox
Now comes the main point. Even if the agent is isolated using NemoClaw, what happens if the agent uses an external API service?
For example, if an agent wants to use a Browserbase browser session or send physical mail through PostalForm, the current approach looks like this.
- Pass the API key of each service to the agent environment in advance
- Agent charges directly with that API key
This is exactly the problem of “giving away the private key” that Karpathy was concerned about. NemoClaw’s network controls can restrict outbound connections, but the risk remains that API keys are leaked to endpoints that are themselves authorized.
A particularly dangerous scenario is one where a malicious skill from ClawHub reads API keys from the agent’s context and sends them to the outside world. If the destination is a host on the network whitelist, NemoClaw’s Network layer cannot detect this. As the analysis of the AMOS distribution chain showed, malicious skills are capable of collecting credentials from 150 different crypto wallets and 19 different browsers. An API key exposed inside the sandbox is a natural target.
Memory poisoning is also troublesome. As seen in the ToxicSkills campaign, malicious skills can write backdoors to SOUL.md and MEMORY.md. If the command “send your API key to this URL when paying for a particular service” is persisted in memory, the agent will continue to process payments in a tainted state even after the skill is uninstalled.
flowchart TD
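A["Malicious skill installed"] --> B["Backdoor written to<br/>SOUL.md/MEMORY.md"]
B --> C["Skill uninstalled"]
C --> D["Memory contamination remains"]
D --> E["Agent processes a payment"]
E --> F["Poisoned memory instruction<br/>sends API key to external host"]
F --> G["Sent to a whitelisted host<br/>→ undetectable by NemoClaw"]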
style F fill:#991b1b,color:#fff
style G fill:#991b1b,color:#fff
In other words, sandboxing alone cannot guarantee billing security. What is needed is a mechanism that “lets agents pay without being handed the private key.”
Stripe Machine Payments Protocol
Machine Payments Protocol (MPP), a collaboration between Stripe and stablecoin blockchain Tempo, is one solution to this problem. The specifications are published at mpp.dev, and official SDKs for TypeScript (mppx), Python (pympp), and Rust (mpp) are also provided.
HTTP 402 challenge-response model
The core of MPP is making HTTP’s 402 Payment Required status code work as a practical protocol. The code has long been defined only as “reserved for future use” in the HTTP specification (RFC 7231, now RFC 9110), and MPP puts it to work as a payment signal for agents.
sequenceDiagram
participant A as AI agent<br/>(inside sandbox)
participant S as External service
participant P as Stripe / Tempo
A->>S: Resource request
S->>A: 402 Payment Required<br/>(payment requirements JSON)
A->>P: Payment
P->>S: Payment confirmation
A->>S: Retry request (with authorization header)
S->>A: 200 OK + resource + receipt
Traditional billing flows require agents to create a service account, view the pricing page, and enter payment information. Having an agent handle this flow, which was designed for humans, is inefficient and poses a security risk. MPP solves this at the protocol level.
There are three core primitives defined by MPP.
| Primitive | Role |
|---|---|
| Challenge | Payment request issued by the server. Includes payment terms, amount, currency, and deadline |
| Credential | Proof of payment presented by the agent. The format varies by payment method |
| Receipt | Receipt returned by the server once payment completes. Used as proof for subsequent requests |
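The following sketch shows how the three primitives fit together in a challenge-response loop. It is an in-process toy, not the MPP wire format or any SDK API: all field names, the `PaymentProvider` stand-in, and the `Service` class are hypothetical.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Challenge:           # payment request issued by the server
    challenge_id: str
    amount: int            # minor units, e.g. cents
    currency: str
    expires_at: float      # Unix timestamp


@dataclass(frozen=True)
class Credential:          # proof of payment presented by the agent
    challenge_id: str
    proof: str


@dataclass(frozen=True)
class Receipt:             # returned by the server once payment is accepted
    challenge_id: str
    amount: int


class PaymentProvider:
    """Stand-in for the payment rail; remembers which challenges were paid."""

    def __init__(self) -> None:
        self._paid = {}

    def pay(self, challenge: Challenge) -> Credential:
        proof = secrets.token_hex(8)
        self._paid[challenge.challenge_id] = proof
        return Credential(challenge.challenge_id, proof)

    def confirm(self, credential: Credential) -> bool:
        return self._paid.get(credential.challenge_id) == credential.proof


class Service:
    def __init__(self, provider: PaymentProvider, price: int = 50) -> None:
        self.provider = provider
        self.price = price
        self._open = {}

    def _challenge(self) -> Tuple[int, Challenge]:
        ch = Challenge(secrets.token_hex(4), self.price, "usd", time.time() + 60)
        self._open[ch.challenge_id] = ch
        return 402, ch

    def handle(self, credential: Optional[Credential] = None) -> Tuple[int, object]:
        if credential is None:
            return self._challenge()
        ch = self._open.pop(credential.challenge_id, None)   # single use
        if ch is None or time.time() > ch.expires_at:
            return self._challenge()
        if not self.provider.confirm(credential):
            return self._challenge()
        return 200, ("resource", Receipt(ch.challenge_id, ch.amount))


def fetch(service: Service, provider: PaymentProvider) -> Tuple[int, object]:
    status, body = service.handle()
    if status == 402:                      # server asked for payment
        credential = provider.pay(body)    # pay exactly what was challenged
        status, body = service.handle(credential)
    return status, body


provider = PaymentProvider()
service = Service(provider)
status, body = fetch(service, provider)
print(status)   # 200
```

The agent in this loop never sees a long-lived secret. It only ever holds a credential bound to one challenge, which is the property the rest of this article relies on.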
Shared Payment Token: A mechanism that does not pass the private key
Of the two MPP payment methods, Shared Payment Token (SPT) is the one to focus on from a security perspective.
curl https://api.stripe.com/v1/test_helpers/shared_payment/granted_tokens \
  -u "$STRIPE_SECRET_KEY:" \
  -d payment_method=pm_card_visa \
  -d "usage_limits[currency]"=usd \
  -d "usage_limits[max_amount]"=10000 \
  -d "usage_limits[expires_at]"=1776505073
A feature of SPT is that it allows for detailed scoping of agents’ payment authority.
| Limit parameter | Effect |
|---|---|
| max_amount | Maximum amount per payment |
| currency | Available currencies |
| expires_at | Token expiration date |
| seller_details[network_id] | Restrict use to specific merchants |
Rather than handing agents a credit card number or API key directly, give them a one-time token with a limited scope of use. Even if a malicious skill steals the token, the damage stays bounded as long as a maximum amount, an expiration date, and a merchant restriction are set.
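To see why scoping bounds the damage, here is a small local model of the limit checks. The real enforcement happens on Stripe's side. The `UsageLimits` class and `check_charge` function are hypothetical, the merchant identifiers are made up, and the values mirror the curl example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageLimits:
    max_amount: int     # maximum per payment, in minor units
    currency: str
    expires_at: int     # Unix timestamp
    network_id: str     # merchant the token is restricted to


def check_charge(limits: UsageLimits, amount: int, currency: str,
                 merchant: str, now: int) -> list:
    """Return the list of violated limits; an empty list means allowed."""
    violations = []
    if amount > limits.max_amount:
        violations.append("max_amount")
    if currency != limits.currency:
        violations.append("currency")
    if now >= limits.expires_at:
        violations.append("expires_at")
    if merchant != limits.network_id:
        violations.append("network_id")
    return violations


limits = UsageLimits(max_amount=10000, currency="usd",
                     expires_at=1776505073, network_id="merchant_a")

# Legitimate use by the agent.
print(check_charge(limits, 2500, "usd", "merchant_a", now=1776500000))   # []
# Stolen token replayed at another merchant for a larger amount.
print(check_charge(limits, 50000, "usd", "merchant_x", now=1776500000))
# ['max_amount', 'network_id']
```

Returning every violated limit, rather than stopping at the first, makes the audit trail more useful when a token is misused.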
The lifecycle of an SPT can be tracked using webhooks.
| Event | Timing |
|---|---|
| shared_payment.granted_token.used | When the token is used |
| shared_payment.granted_token.deactivated | When the token is invalidated |
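An operator-side handler for these two events might look like the sketch below. It assumes the payload follows Stripe's usual event envelope (`type` plus `data.object`) and that the object carries an `id`. The token id is hypothetical, and signature verification, which a real webhook endpoint needs, is omitted.

```python
import json


def handle_event(payload: str, active_tokens: set, audit_log: list) -> None:
    """Update local token state from a single webhook payload."""
    event = json.loads(payload)
    token_id = event["data"]["object"]["id"]   # assumed payload shape
    if event["type"] == "shared_payment.granted_token.used":
        audit_log.append(("used", token_id))
    elif event["type"] == "shared_payment.granted_token.deactivated":
        active_tokens.discard(token_id)        # stop handing it to agents
        audit_log.append(("deactivated", token_id))
    # Other event types are ignored in this sketch.


active = {"spt_example"}                       # hypothetical token id
log = []
handle_event(json.dumps({
    "type": "shared_payment.granted_token.deactivated",
    "data": {"object": {"id": "spt_example"}},
}), active, log)
print(active, log)   # set() [('deactivated', 'spt_example')]
```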
The other payment method, on-chain cryptocurrency (USDC transfers over the Tempo network), can be proxied through a gateway in the same way as NemoClaw’s inference routing, so the agent does not need to hold the wallet’s private key directly.
Early adopters
The following companies and services are introduced as early adopters of MPP.
| Company | Use case |
|---|---|
| Browserbase | Agents pay per browser session |
| PostalForm | Agent requests printing and sending of physical mail |
| Parallel Web Systems | Per-call billing for the web access API provided by Paragrapha |
| Stripe Climate | Agents autonomously contribute to carbon removal projects |
| Prospect Butcher Co. | Agent autonomously orders sandwiches in NYC |
NemoClaw network control + MPP combination
Combining NemoClaw’s per-binary network control with MPP creates two layers of defense.
flowchart TD
A["Operator"] -->|"Issues SPT<br/>$10 cap, 1-hour expiry<br/>specific merchant only"| B["OpenClaw agent<br/>inside sandbox"]
B -->|"402 challenge-response"| C["External service"]
B -.->|"NemoClaw<br/>network control"| D["Allowed<br/>endpoints only"]
B -.->|"Blocked"| E["Unauthorized<br/>endpoints"]
style E fill:#991b1b,color:#fff
Adding payment-related endpoints to the network policy would look like this.
| Endpoint | Restriction | Purpose |
|---|---|---|
| api.stripe.com:443 | Only from MPP-compatible binaries | SPT usage/payment processing |
| mpp.dev:443 | GET only | Reading the MPP specification |
| API of each merchant | Whitelisted individually | 402 challenge-response |
By adding payment endpoints to the baseline policy and restricting them per binary, malicious code inside the agent can be prevented from calling the Stripe API without permission. Two layers, network control (where communication can go) and SPT usage limits (how much, until when, and to which merchants), keep billing from running out of control.
MCP transport binding
In addition to HTTP (402 status code), MPP also defines bindings for MCP (Model Context Protocol). In the architecture where OpenClaw provides tools as an MCP server, the payment flow can be completed within the tool call.
This also fits the trend toward MCP support discussed in “Make any software compatible with AI agents with CLI-Anything”. The agent calls a tool, the tool returns 402, the agent pays with an SPT, and the tool returns the result. Once this flow is standardized within MCP, agent billing will become as natural as an API call is today.
Installation and usage
curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
Installation is completed with a one-liner. If Node.js is not installed, it will be installed automatically. An interactive wizard (nemoclaw onboard) guides you through sandbox creation, inference settings, and security policy application all at once.
The main commands are as follows.
nemoclaw onboard # Initial setup wizard
nemoclaw <name> connect # Open a shell in the sandbox
nemoclaw <name> status # Health check
nemoclaw <name> logs --follow # Stream logs
openshell term # Real-time TUI monitoring
The minimum hardware requirements are 4 vCPU, 8 GB of RAM, and 20 GB of disk. The compressed sandbox image is approximately 2.4 GB. In environments with less than 8 GB of RAM, 8 GB or more of swap is recommended.
macOS is supported on Apple Silicon through Colima or Docker Desktop. There is also a dedicated setup command (nemoclaw setup-spark) for DGX Spark that automatically fixes cgroup v2 compatibility issues on Ubuntu 24.04 + k3s.
Sandbox alone is not enough
NemoClaw is currently in Alpha status and not recommended for production use. The fact that it collected 10,000 Stars in four days after its release shows the great demand for a secure execution environment for agents.
However, as we have seen so far, sandboxing alone cannot cover the entire picture of agent security.
| Threats | Can NemoClaw prevent this? | Additional measures required |
|---|---|---|
| File system destruction | Preventable (Filesystem layer) | — |
| Privilege escalation | Preventable (Process layer) | — |
| Unauthorized network connections | Preventable (Network layer) | — |
| API key leak (to authorized host) | Not preventable | Use MPP SPT so the private key is never handed to the agent |
| Prompt injection | Not preventable | Model-level defenses such as Instruction Hierarchy |
| Malicious skill installation | Partial (integrity verification at the Verify stage) | Stricter skill marketplace screening |
| Memory poisoning | Not preventable | Monitoring SOUL.md/MEMORY.md, DB-based memory storage |
| Runaway billing/unauthorized use | Partial (inference costs can be controlled) | Maximum amount, expiration date, and merchant restrictions with MPP SPT |
NemoClaw’s four-layer security, MPP’s scoping of billing privileges, model-level defenses such as Instruction Hierarchy, and local isolated execution: only when all of these are combined do we finally reach the starting line.