OpenClaw agent billing security using NemoClaw and Stripe MPP

Suppose you have isolated an agent in a sandbox. The file system is restricted, and network access is whitelisted. So what do you do when the agent needs to pay for an external API service?

If you look at NVIDIA’s OpenClaw sandbox plugin “NemoClaw” and Stripe’s payment standard for AI agents, the “Machine Payments Protocol (MPP)”, side by side, you can see the outline of an answer to this question.

Attack history of the OpenClaw ecosystem

Before getting into NemoClaw’s technology, let’s review why we need a sandbox. I have repeatedly discussed the security issues with OpenClaw on this blog.

AMOS distribution via SKILL.md: A malicious skill on ClawHub uses the AI as a “trusted intermediary” to trick users into installing macOS information-stealing malware, which collects credentials from 150 crypto wallets and 19 browsers.
Clinejection: Prompt injection in a GitHub issue title → AI triage bot → npm token theft → automatic installation of OpenClaw on 4,000 machines. The first supply chain attack with AI agents as the attack vector.
Karpathy’s “400,000 line vibe coded monster” comment: Immediately after OpenClaw reached 100,000 stars in two days, reports of RCE vulnerabilities, supply chain contamination, and malicious skills kept emerging. Karpathy has made it clear that he has no intention of handing over his private keys.
Memory injection attacks: The MINJA/InjecMEM/ToxicSkills campaigns (a series of attack activities) contaminate SOUL.md/MEMORY.md. A “Ship of Theseus” pattern has also been confirmed, in which the effects remain even after the skill is uninstalled.
Predecessors of the local sandbox: Agent Safehouse (macOS sandbox-exec) and Codex (Windows sandbox). They appeared in response to the incidents in which Amazon’s Kiro deleted a production DB and Replit’s AI deleted a customer DB.

flowchart TD
    A["ClawHub / SkillsMP / skills.sh<br/>Skill marketplaces"] -->|Malicious SKILL.md| B["OpenClaw agent"]
    B -->|"AI acts as a trusted intermediary<br/>and presents fake install instructions"| C["AMOS infection<br/>(macOS info-stealing malware)"]
    D["GitHub issue title"] -->|Prompt injection| E["AI triage bot"]
    E -->|"Cache poisoning →<br/>npm token theft"| F["cline@2.3.0 published<br/>postinstall installs<br/>OpenClaw globally"]
    G["CVE-2026-25253"] -->|"Malicious gatewayUrl"| H["Auth token theft<br/>RCE"]
    I["Malicious skill"] -->|"Rewrites SOUL.md/MEMORY.md"| J["Memory poisoning<br/>Persistent contamination"]

    style C fill:#991b1b,color:#fff
    style F fill:#991b1b,color:#fff
    style H fill:#991b1b,color:#fff
    style J fill:#991b1b,color:#fff

If you look at the numbers, you can see how serious the situation is.

Attack/Investigation	Method	Scale
ClawHavoc	341 malicious skills on ClawHub. 91% are of the prompt injection type	More than doubled to 824 as of the February 16 update
AMOS Distribution	Instructs the AI to install a fake CLI via SKILL.md	Trend Micro identified 39 cases; Bitdefender estimates that about 20% of the entire ecosystem is malicious
Clinejection	Issue title → AI triage bot → cache pollution → npm token theft	OpenClaw automatic installation on 4000 machines
CVE-2026-25253	RCE via gatewayUrl parameter (CVSS 8.8)	30,000+ public instances in 52 countries
ToxicSkills	Snyk audited 3,984 skills, 36.82% had security issues	76 were determined to be malicious
MINJA/InjecMEM	Contaminates memory banks with just regular queries to the agent	Contamination remains even after uninstallation

As Karpathy warned, it is impractical for an individual to conduct a security audit of a 400,000-line codebase, and lightweight alternative implementations such as ZeroClaw (Rust, 3.4MB), PicoClaw (Go, runs on a $10 board), and NullClaw (Zig, 678KB) emerged simultaneously. There are also security-focused implementations such as IronClaw, which runs all tools inside a Wasm sandbox. But shrinking the codebase won’t solve the supply chain problems of the skills ecosystem. NemoClaw attempts to address this problem at the runtime level.

NemoClaw architecture

NemoClaw is a plugin for running OpenClaw agents securely within the NVIDIA OpenShell sandbox. It is published under the Apache 2.0 license on GitHub (NVIDIA/NemoClaw) and has collected approximately 10,000 stars in just 4 days since its release.

graph TD
    A[OpenClaw<br/>Autonomous agent] --> B[OpenShell<br/>Secure runtime]
    C[NemoClaw<br/>NVIDIA plugin] --> B
    B --> D[K3s inside Docker<br/>Kubernetes cluster]
    D --> E[Sandbox environment]
    E --> F[NVIDIA cloud<br/>Inference endpoint]

Component	Role
OpenClaw	Always-on autonomous agent provided by openclaw.ai
OpenShell (NVIDIA/OpenShell)	Secure runtime for agent execution. Runs K3s inside a Docker container
NemoClaw	OpenShell plugin exclusively for OpenClaw. Glue for integration with NVIDIA inference
ClawHub	OpenClaw Skills Marketplace

OpenShell also supports Claude Code, Codex, Ollama, etc., and NemoClaw is an implementation specialized for OpenClaw and NVIDIA inference.

The difference from local isolated execution using macOS sandbox-exec or the Windows sandbox is where the protection sits: Agent Safehouse and similar tools protect agents on the desktop, whereas NemoClaw provides server-side protection based on K3s clusters. Developers who use OpenClaw locally and operators who run OpenClaw as an always-on service need different security models.

Blueprint Architecture

NemoClaw is internally divided into two layers.

Layer	Language	Role
Plugin	TypeScript	UI/CLI layer providing `nemoclaw` CLI and `openclaw nemoclaw` subcommands
Blueprint	Python	Versioned artifact. The layer that actually orchestrates sandbox creation, policy application, and inference settings

This separation allows the Blueprint side to be released independently without having to update the TypeScript plugin. Blueprints go through a five-stage lifecycle through OpenShell CLI commands.

graph LR
    A[Resolve<br/>Version compatibility check] --> B[Verify<br/>Digest verification]
    B --> C[Plan<br/>Resource planning]
    C --> D[Apply<br/>Resource provisioning]
    D --> E[Status<br/>Status report]

The Verify stage is the key point for security: it verifies the artifact digest to prevent tampering in the supply chain. Given that ClawHub is currently distributing a large number of malicious skills, this step of verifying the integrity of what you install is essential.
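As a rough illustration of what digest verification involves, here is a minimal Python sketch. It assumes SHA-256 and a digest pinned at release time; the source does not say which hash algorithm or artifact format the Blueprint actually uses, and the function names are hypothetical.

```python
import hashlib
import hmac


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, expected_digest: str) -> bool:
    """Compare the computed digest with the pinned one in constant time."""
    return hmac.compare_digest(sha256_digest(data), expected_digest.lower())


# Hypothetical blueprint payload and the digest pinned when it was released.
blueprint = b"blueprint v0.1.0 contents"
pinned = sha256_digest(blueprint)

print(verify_artifact(blueprint, pinned))                 # untampered artifact
print(verify_artifact(blueprint + b" tampered", pinned))  # modified artifact
```

The point of the sketch is that any change to the artifact bytes changes the digest, so a mismatch stops the lifecycle before Plan and Apply ever run.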

Four-layer security

The protection NemoClaw applies consists of four layers.

Layer	Protection details	Lock method
Network	Block unauthorized outbound connections	Hot-reloadable at runtime
Filesystem	Prohibit writing to anything other than `/sandbox` and `/tmp`	Lock when creating sandbox
Process	Block privilege escalation and dangerous system calls	Lock when creating sandbox
Inference	Reroute model API calls to controlled backends	Hot-reloadable at runtime

The implementation uses a multi-layered defense that combines Landlock LSM, seccomp, and network namespace (netns). Filesystem and Process layers are fixed at the time of sandbox creation, while Network and Inference can update policies without restarting the sandbox.

Binary-based network control

Network control goes beyond IP address or hostname filtering. Its distinguishing feature is that policies specify, per binary, which endpoints may be reached and which HTTP methods may be used.

The main endpoints allowed by the baseline policy are:

Endpoints	Limitations
`api.anthropic.com:443`	Only from the `claude` binary
`integrate.api.nvidia.com:443`	For inference
`github.com:443` / `api.github.com:443`	Only from the `gh`/`git` binaries, with HTTP method control
`openclaw.ai:443` / `clawhub.com:443`	No restrictions
`registry.npmjs.org:443`	GET only

If an agent attempts to access a host that is not included in the baseline, the operator will be notified via TUI (openshell term). Operators can choose to approve or deny in real time, but approval is only valid during the session and is not permanently saved in the baseline policy.
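The decision logic described above can be sketched in Python as follows. This is a toy model, not NemoClaw's policy format: the `Rule` and `Session` types are hypothetical, and because the source does not list which HTTP methods are allowed for GitHub, that rule is left unrestricted on methods here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    host: str                          # "host:port" as written in the table
    binaries: frozenset = frozenset()  # empty set means any binary
    methods: frozenset = frozenset()   # empty set means any HTTP method


# Baseline entries taken from the table; the method rules for GitHub are
# not specified in the source, so they are left open in this sketch.
BASELINE = [
    Rule("api.anthropic.com:443", binaries=frozenset({"claude"})),
    Rule("integrate.api.nvidia.com:443"),
    Rule("github.com:443", binaries=frozenset({"gh", "git"})),
    Rule("api.github.com:443", binaries=frozenset({"gh", "git"})),
    Rule("openclaw.ai:443"),
    Rule("clawhub.com:443"),
    Rule("registry.npmjs.org:443", methods=frozenset({"GET"})),
]


@dataclass
class Session:
    """Holds operator approvals, which live only as long as the session."""
    approved_hosts: set = field(default_factory=set)

    def decide(self, binary: str, host: str, method: str) -> str:
        for rule in BASELINE:
            if rule.host != host:
                continue
            if rule.binaries and binary not in rule.binaries:
                return "deny"
            if rule.methods and method not in rule.methods:
                return "deny"
            return "allow"
        if host in self.approved_hosts:
            return "allow"
        return "ask_operator"  # unknown host: surface it to the operator

    def approve(self, host: str) -> None:
        self.approved_hosts.add(host)  # never written back to BASELINE


session = Session()
print(session.decide("claude", "api.anthropic.com:443", "POST"))
print(session.decide("curl", "api.anthropic.com:443", "POST"))
print(session.decide("curl", "example.com:443", "GET"))
```

Because approvals are stored on the `Session` object rather than in `BASELINE`, a new session starts from the baseline again, which mirrors the session-only approval behavior.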

One question arises here. `clawhub.com:443` is allowed with no restrictions, but is that acceptable when ClawHub itself is a distribution source for malicious skills? File system protection stops malicious skills from rewriting system files, but it cannot stop prompt injection attacks (91% of ClawHavoc skills are of this type). Once a malicious SKILL.md is injected into the agent’s context, malicious commands can be executed as normal operations inside the sandbox. Model-level defenses such as Instruction Hierarchy are required separately.

Inference routing and cost control

Inference requests do not leave the sandbox directly. OpenShell intercepts them and forwards them to the configured NVIDIA cloud endpoint.

sequenceDiagram
    participant A as Agent (inside sandbox)
    participant G as OpenShell Gateway
    participant N as NVIDIA cloud<br/>(build.nvidia.com)

    A->>G: Inference request
    G->>N: Controlled routing
    N->>G: Response
    G->>A: Forwarded response

The default model is Nemotron 3 Super 120B (context 131,072 tokens, maximum output 8,192 tokens). Nemotron Ultra 253B, Nemotron Super 49B v1.5, and Nemotron 3 Nano 30B for local inference can also be configured.

An important side effect of inference routing is that the gateway can monitor and control how many tokens the agent sends to which model. This makes it possible to visualize inference costs and set upper limits. It connects directly to the issue of agent payments, which is discussed next.
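To make the idea of a gateway-side upper limit concrete, here is a small Python sketch of per-model token accounting. The `InferenceGateway` class, its cap parameter, and the model identifier string are all hypothetical; how OpenShell actually meters or limits tokens is not described in the source.

```python
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when a request would push a model past its token cap."""


@dataclass
class InferenceGateway:
    token_cap_per_model: int
    used: dict = field(default_factory=dict)  # model name -> tokens reserved

    def forward(self, model: str, prompt_tokens: int, max_output_tokens: int) -> int:
        """Reserve the worst-case token cost, or refuse the request."""
        requested = prompt_tokens + max_output_tokens
        spent = self.used.get(model, 0)
        if spent + requested > self.token_cap_per_model:
            raise BudgetExceeded(f"{model}: {spent} + {requested} exceeds cap")
        self.used[model] = spent + requested
        # A real gateway would now proxy the request to the upstream endpoint.
        return self.token_cap_per_model - self.used[model]


gateway = InferenceGateway(token_cap_per_model=10_000)
remaining = gateway.forward("nemotron-3-super-120b", prompt_tokens=1_000,
                            max_output_tokens=8_192)
print(remaining)
```

Reserving the maximum output size up front is a conservative choice: the agent can never exceed the cap, at the cost of sometimes refusing requests that would have fit.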

The contradiction of billing from inside the sandbox

Now comes the main point. Even if the agent is isolated using NemoClaw, what happens if the agent uses an external API service?

For example, if an agent wants to use a Browserbase browser session or send physical mail with PostalForm, the approach currently available is the following.

Pass each service’s API key into the agent environment in advance
The agent incurs charges directly using that API key

This is exactly the “handing over the private key” problem that Karpathy was concerned about. NemoClaw’s network controls can restrict outbound connections, but the risk remains that API keys are leaked to authorized endpoints.

A particularly dangerous scenario is one where a malicious ClawHub skill reads API keys from the agent’s context and sends them outside. If the destination is a host on the network whitelist, NemoClaw’s Network layer cannot detect it. As the AMOS distribution chain analysis showed, the malicious skill can collect credentials from 150 different crypto wallets and 19 different browsers. An API key exposed inside the sandbox becomes a natural target.

Memory poisoning is also troublesome. As seen in the ToxicSkills campaign, malicious skills can write backdoors to SOUL.md and MEMORY.md. If the command “send your API key to this URL when paying for a particular service” is persisted in memory, the agent will continue to process payments in a tainted state even after the skill is uninstalled.

flowchart TD
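    A["Malicious skill installed"] --> B["Backdoor written to<br/>SOUL.md/MEMORY.md"]
    B --> C["Skill uninstalled"]
    C --> D["Memory contamination remains"]
    D --> E["Agent processes a payment"]
    E --> F["Poisoned memory instruction<br/>sends API key externally"]
    F --> G["Sent to a whitelisted host<br/>→ undetectable by NemoClaw"]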

    style F fill:#991b1b,color:#fff
    style G fill:#991b1b,color:#fff

In other words, sandboxing alone cannot guarantee billing security. What is needed is a mechanism that lets agents pay without being handed the private keys.

Stripe Machine Payments Protocol

Machine Payments Protocol (MPP), a collaboration between Stripe and stablecoin blockchain Tempo, is one solution to this problem. The specifications are published at mpp.dev, and official SDKs for TypeScript (mppx), Python (pympp), and Rust (mpp) are also provided.

HTTP 402 challenge-response model

The core of MPP is to make HTTP’s 402 Payment Required status code function as a practical protocol. 402, which the HTTP specification (RFC 9110, previously RFC 7231) has long marked only as “reserved for future use”, is used as a payment signal for agents.

sequenceDiagram
    participant A as AI agent<br/>(inside sandbox)
    participant S as External service
    participant P as Stripe / Tempo

    A->>S: Resource request
    S->>A: 402 Payment Required<br/>(payment requirements JSON)
    A->>P: Payment processing
    P->>S: Payment confirmation
    A->>S: Retry request (with auth header)
    S->>A: 200 OK + resource + receipt

Traditional billing flows require agents to create a service account, view the pricing page, and enter payment information. Having an agent handle this flow, which was designed for humans, is inefficient and poses a security risk. MPP solves this at the protocol level.

There are three core primitives defined by MPP.

Primitive	Role
Challenge	Payment request issued by the server. Includes payment terms, amount, currency, and deadline
Credential	Proof of payment presented by the agent. The format varies by payment method
Receipt	Receipt returned by the server once payment completes. Used as proof for subsequent requests
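The way the three primitives fit into the 402 flow can be sketched with an in-memory Python simulation. Everything here is illustrative: the field names, the `Service` and `PaymentProvider` classes, and the `fetch` helper are assumptions made for this example and do not reflect the MPP wire format or any SDK.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Challenge:
    challenge_id: str
    amount: int       # minor units, e.g. cents
    currency: str
    expires_at: int   # Unix timestamp


@dataclass(frozen=True)
class Credential:
    challenge_id: str
    proof: str


@dataclass(frozen=True)
class Receipt:
    challenge_id: str
    amount: int
    currency: str


class PaymentProvider:
    """Stand-in for Stripe/Tempo: remembers which challenges were paid."""

    def __init__(self):
        self._paid = {}

    def pay(self, challenge: Challenge) -> Credential:
        proof = secrets.token_hex(8)
        self._paid[challenge.challenge_id] = proof
        return Credential(challenge.challenge_id, proof)

    def confirm(self, credential: Credential) -> bool:
        return self._paid.get(credential.challenge_id) == credential.proof


class Service:
    """Stand-in for a paid API that answers 402 until it sees a valid credential."""

    def __init__(self, provider: PaymentProvider, price: int, currency: str = "usd"):
        self.provider, self.price, self.currency = provider, price, currency
        self._pending = {}

    def handle(self, path: str, credential=None, now: int = 0):
        pending = self._pending.pop(credential.challenge_id, None) if credential else None
        if pending and now <= pending.expires_at and self.provider.confirm(credential):
            receipt = Receipt(pending.challenge_id, pending.amount, pending.currency)
            return 200, (f"content of {path}", receipt)
        challenge = Challenge(secrets.token_hex(4), self.price, self.currency, now + 300)
        self._pending[challenge.challenge_id] = challenge
        return 402, challenge


def fetch(service: Service, provider: PaymentProvider, path: str, max_price: int):
    """Agent side: request, pay if challenged and affordable, then retry once."""
    status, body = service.handle(path)
    if status == 402:
        if body.amount > max_price:
            raise ValueError("challenge exceeds the agent's price ceiling")
        status, body = service.handle(path, provider.pay(body))
    return status, body


provider = PaymentProvider()
service = Service(provider, price=50)
status, (content, receipt) = fetch(service, provider, "/session", max_price=100)
print(status, content, receipt.amount)
```

Note that the agent never holds a long-lived key for the service: it only ever handles a credential bound to one specific challenge.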

Shared Payment Token: A mechanism that does not pass the private key

Of the two MPP payment methods, Shared Payment Token (SPT) is the one to focus on from a security perspective.

# Assumes STRIPE_SECRET_KEY holds a test-mode secret key (sk_test_...)
curl https://api.stripe.com/v1/test_helpers/shared_payment/granted_tokens \
  -u "$STRIPE_SECRET_KEY:" \
  -d payment_method=pm_card_visa \
  -d "usage_limits[currency]"=usd \
  -d "usage_limits[max_amount]"=10000 \
  -d "usage_limits[expires_at]"=1776505073

A feature of SPT is that it allows for detailed scoping of agents’ payment authority.

Limit parameters	Effect
`max_amount`	Maximum amount per payment
`currency`	Available currencies
`expires_at`	Token expiration date
`seller_details[network_id]`	Restrict use to specific merchants

Rather than directly handing agents a credit card number or API key, give them a one-time token with limited use. Even if a malicious skill steals tokens, the damage will be limited if a maximum amount, expiration date, and merchant restrictions are set.
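The effect of these limits can be illustrated with a small Python check that mirrors the parameters in the table. This is a local model of the scoping rules, not Stripe's enforcement logic; the `TokenLimits` type, the `check_charge` function, and the merchant identifier `merchant_a` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TokenLimits:
    max_amount: int                           # minor units, e.g. cents
    currency: str
    expires_at: int                           # Unix timestamp
    seller_network_id: Optional[str] = None   # None means any merchant


def check_charge(limits: TokenLimits, amount: int, currency: str,
                 seller_network_id: str, now: int) -> list:
    """Return the names of violated limits; an empty list means in scope."""
    violations = []
    if amount > limits.max_amount:
        violations.append("max_amount")
    if currency.lower() != limits.currency.lower():
        violations.append("currency")
    if now >= limits.expires_at:
        violations.append("expires_at")
    if limits.seller_network_id is not None and seller_network_id != limits.seller_network_id:
        violations.append("seller_details[network_id]")
    return violations


# Same limits as the curl example, plus a hypothetical merchant restriction.
limits = TokenLimits(max_amount=10000, currency="usd",
                     expires_at=1776505073, seller_network_id="merchant_a")

print(check_charge(limits, 2500, "usd", "merchant_a", now=1776500000))
print(check_charge(limits, 50000, "eur", "attacker", now=1776505074))
```

A stolen token that is pointed at a different merchant, or used after expiry, fails at least one of these checks, which is what bounds the damage.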

The lifecycle of an SPT can be tracked using webhooks.

Event	Timing
`shared_payment.granted_token.used`	When the token is used
`shared_payment.granted_token.deactivated`	When the token is invalidated
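A webhook receiver for these two events could look roughly like the Python sketch below. The signature check follows Stripe's documented `t=...,v1=...` HMAC-SHA256 scheme; the location of the token ID at `data.object.id` and the `apply_event` bookkeeping are assumptions for illustration, since the source does not show the event payload.

```python
import hashlib
import hmac
import json


def verify_signature(payload: bytes, header: str, secret: str,
                     now: int, tolerance: int = 300) -> bool:
    """Check a `t=<ts>,v1=<hex>` signature header against the raw body."""
    timestamp, signatures = None, []
    for item in header.split(","):
        key, _, value = item.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            signatures.append(value)
    if timestamp is None or not timestamp.isdigit() or not signatures:
        return False
    if abs(now - int(timestamp)) > tolerance:
        return False  # stale or replayed delivery
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    return any(hmac.compare_digest(expected, s) for s in signatures)


def apply_event(payload: bytes, usage: dict) -> str:
    """Update a per-token usage counter from one webhook event."""
    event = json.loads(payload)
    token_id = event["data"]["object"]["id"]  # assumed payload layout
    if event["type"] == "shared_payment.granted_token.used":
        usage[token_id] = usage.get(token_id, 0) + 1
    elif event["type"] == "shared_payment.granted_token.deactivated":
        usage.pop(token_id, None)  # stop tracking a dead token
    return event["type"]
```

An operator could feed the resulting counter into alerting, for example flagging a token that is used more often than the task it was issued for would justify.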

The other payment method, on-chain cryptocurrency (USDC transfers via the Tempo network), can be proxied through a gateway similar to NemoClaw’s inference routing, so agents do not need to hold the wallet’s private keys directly.

Early adopters

The following companies and services are introduced as early adopters of MPP.

Company	Use Case
Browserbase	Agents pay per browser session
PostalForm	Agents request printing and mailing of physical mail
Parallel Web Systems	Agents pay per call for its web access API
Stripe Climate	Agents autonomously contribute to carbon removal projects
Prospect Butcher Co.	Agent autonomously orders sandwiches in NYC

NemoClaw network control + MPP combination

Combining NemoClaw’s per-binary network control with MPP creates two layers of defense.

flowchart TD
    A["Operator"] -->|"Issue SPT<br/>$10 limit, 1-hour expiry<br/>specific merchant only"| B["OpenClaw agent<br/>inside sandbox"]
    B -->|"402 challenge-response"| C["External service"]
    B -.->|"NemoClaw<br/>network control"| D["Allowed<br/>endpoints only"]
    B -.->|"Blocked"| E["Unauthorized<br/>endpoints"]

    style E fill:#991b1b,color:#fff

Adding payment-related endpoints to the network policy would look like this.

Endpoints	Limitations	Purpose
`api.stripe.com:443`	Only from MPP compatible binaries	SPT usage/payment processing
`mpp.dev:443`	GET only	See MPP specifications
API for each merchant	Whitelist individually	402 challenge-response

By adding payment endpoints to the baseline policy and restricting them per binary, malicious code inside the agent can be prevented from calling the Stripe API without permission. Two layers, network control (where traffic may go) and SPT usage limits (how much, until when, and to which merchants), prevent runaway billing.

MCP transport binding

In addition to HTTP (402 status code), MPP also defines bindings for MCP (Model Context Protocol). In the architecture where OpenClaw provides tools as an MCP server, the payment flow can be completed within the tool call.

This also fits the trend toward MCP support discussed in “Make any software compatible with AI agents with CLI-Anything.” The agent calls a tool, the tool returns 402, the agent pays with an SPT, and the tool returns the result. Once this flow is standardized within MCP, agent billing will become as natural as an API call is today.

Installation and usage

curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash

Installation is completed with a one-liner. If Node.js is not installed, it will be installed automatically. An interactive wizard (nemoclaw onboard) guides you through sandbox creation, inference settings, and security policy application all at once.

The main commands are as follows.

nemoclaw onboard                # Initial setup wizard
nemoclaw <name> connect         # Open a shell in the sandbox
nemoclaw <name> status          # Health check
nemoclaw <name> logs --follow   # Stream logs
openshell term                  # Real-time TUI monitoring

Hardware requirements are minimum 4 vCPU, 8 GB RAM, 20 GB Disk. The compressed size of the sandbox image is approximately 2.4 GB. In environments with less than 8 GB of RAM, we recommend 8 GB or more of swap.

macOS is supported on Apple Silicon through Colima or Docker Desktop. There is also a dedicated setup command (nemoclaw setup-spark) for DGX Spark that automatically fixes cgroup v2 compatibility issues on Ubuntu 24.04 + k3s.

Sandbox alone is not enough

NemoClaw is currently in Alpha status and not recommended for production use. The fact that it collected 10,000 Stars in four days after its release shows the great demand for a secure execution environment for agents.

However, as we have seen so far, sandboxing alone cannot cover the entire picture of agent security.

Threats	Can NemoClaw prevent this?	Additional measures required
File system destruction	Preventable (Filesystem layer)	—
Privilege escalation	Preventable (Process layer)	—
Unauthorized network connections	Preventable (Network layer)	—
API key leak (to authorized host)	Not preventable	Use MPP SPT so that no private key is passed to the agent
Prompt injection	Not preventable	Model-level defenses such as Instruction Hierarchy
Malicious skill installation	Partial (integrity verification at Verify stage)	Stricter skill marketplace screening
Memory poisoning	Not preventable	Monitoring SOUL.md/MEMORY.md, DB-based memory storage
Runaway billing/unauthorized use	Only inference costs can be controlled	Maximum amount, expiration date, and merchant restrictions with MPP SPT

NemoClaw’s four-layer security, MPP’s scoped billing privileges, model-level defenses like Instruction Hierarchy, and local isolated execution: only by combining all of them do we finally reach the starting line.