Kana Chat v3 and Leaning Into Blog-Specific Use
It's been about 1.5 months since I posted the v2 article. This time, instead of feature additions, I want to write about how the emphasis of the project shifted toward blog operations.
Easing Off the DIY OpenClaw Direction
Kana Chat originally started from the idea of “let’s build something OpenClaw-ish ourselves using official CLIs and tmux.” Browser-automation tools come with ban risk and legal grey areas, and API-key-stealing wrappers are out of the question. So the motivation in v1 and v2 was to drive CLIs through tmux and run them safely on my own subscriptions.
Lately though, both Claude Code and Codex have been shipping official APIs for “controlling the CLI from outside.” Headless modes, permission modes, SDKs — you can hit them with structured I/O from the start, instead of parsing tmux pane ASCII yourself. The reason to work hard on a homegrown tmux wrapper has thinned out a fair bit compared to when I started.
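To make "structured I/O from the start" concrete, here's a minimal sketch of driving Claude Code headlessly from Python instead of scraping a tmux pane. The `-p` (print/non-interactive) and `--output-format json` flags match the Claude Code CLI as I've used it, but treat the exact flag set as an assumption and check `claude --help` on your install:

```python
import json
import subprocess

def build_headless_cmd(prompt: str) -> list[str]:
    """One non-interactive Claude Code turn with a structured reply.

    -p runs a single prompt and exits; --output-format json returns
    machine-parseable output instead of TUI screen text.
    """
    return ["claude", "-p", prompt, "--output-format", "json"]

def run_headless(prompt: str) -> dict:
    # Raises CalledProcessError on a non-zero exit, so failures are visible
    out = subprocess.run(build_headless_cmd(prompt),
                         capture_output=True, text=True, check=True)
    return json.loads(out.stdout)
```

The point is that the caller gets JSON back directly; there's no ASCII-art parsing layer to maintain.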
That said, it’s still too early to fully replace it. The tmux layer is convenient for absorbing protocol differences between CLIs, and pulling the TUI view straight into the browser turns out to be the most robust approach in practice. So instead of pushing the DIY OpenClaw direction further to “make it do everything,” I decided to focus on the use case I actually touch every day. That use case is blog operations.
The fact that “I don’t really have anything I want to fully delegate” also weighs in. I’m doing the exact opposite of OpenClaw’s “delegate everything and walk away” design, but I’m not really troubled by that.
Why I Leaned Into Blog Use
As written on this site’s About page, this place is run as a record of “what I learn day to day in development, and the trial and error along the way.” That’s not just a tagline — practically speaking, that’s how I use it, and the daily inflow in my case is just too much.
Every day, an endless stream of AI papers, releases, news, security advisories, and other people’s blog posts pours in through Feedly and the timeline. If I read it all, the day is over. What I actually want is to glance at titles and summaries and triage in seconds: “I want to dig into this one,” “this one I just need to be aware of,” “this one I want to run on my machine today.” For the ones that catch my interest, I want to rewrite them in my own words so I can re-read them later. If I don’t write them down, in three days I won’t even remember what I saw.
So the priorities for what I want to add to Kana Chat naturally shifted in this direction:
- Make the path from “incoming topic” to “draft article on disk” as short as possible
- Keep the on-ramp to running it on my machine when I have free time
- Bundle topics that come up multiple times into a single planned article
- When something overlaps with an existing article, consult on whether to append, compare, or write fresh
In other words, the unit isn’t “read and done” but “read it, drop it into a draft, and leave it in a form I can experiment with later.” That’s the area v3 changed the most.
Rough Overall Picture
The number of layers has grown, and writing it all out would get long, so here are just the parts that matter in v3.
```mermaid
flowchart TB
A["iPhone PWA<br/>Chat / Plan / Jobs / Blog"]
B["FastAPI"]
PL["Planner<br/>tmux-attached"]
BL["Blog pipeline<br/>Idea → Consult → Draft → Experiment"]
RSS["RSS reader"]
W["Worker pool<br/>Claude / Codex / Gemini"]
A --> B
B --> PL
B --> BL
BL --> RSS
PL --> W
BL --> W
```
Planner handles design discussions, Blog is the article pipeline, and the workers handle implementation and verification — that’s the rough division of labor.
Exposing the Planner Terminal Directly
The v2 Planner was a CLI started in a read-only sandbox. It ran, but from the browser you could only watch the updated plan.md file, and replies to follow-up questions lagged noticeably. The end result: it was faster to open another window and attach to the tmux session directly, a self-defeating state.
In v3, the Planner’s tmux pane capture goes straight to the browser, and both text input and key input (↑↓ Enter Esc Tab) can be sent directly from the browser. All it really does is “the browser becomes a tmux client,” but with that, the Planner is effectively just a normal terminal you can touch.
The human hashes out requirements in conversation, the CLI drops them into a plan file, and when the user thinks “this is fine,” one button hands it off to a job. If you start to doubt during the Plan phase, you can stop it with Esc, which is much cheaper than letting a job run wild and then scrambling to kill it.
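The "browser becomes a tmux client" layer boils down to two tmux subcommands: `capture-pane` for the screen and `send-keys` for input. A minimal sketch of the server-side helpers, with the pane name being a hypothetical placeholder:

```python
import subprocess

PANE = "kana-planner:0.0"  # hypothetical session:window.pane target

def capture_cmd(pane: str = PANE) -> list[str]:
    # -p prints the pane contents to stdout instead of into a tmux buffer
    return ["tmux", "capture-pane", "-p", "-t", pane]

def send_text_cmd(text: str, pane: str = PANE) -> list[str]:
    # -l sends the string literally, so tmux doesn't interpret it as key names
    return ["tmux", "send-keys", "-t", pane, "-l", text]

def send_key_cmd(key: str, pane: str = PANE) -> list[str]:
    # key is a tmux key name: Up, Down, Enter, Escape, Tab, ...
    return ["tmux", "send-keys", "-t", pane, key]

def run(cmd: list[str]) -> str:
    """Execute one tmux command and return its stdout."""
    return subprocess.run(cmd, capture_output=True, text=True,
                          check=True).stdout
```

Poll `run(capture_cmd())` to refresh the browser view, and map the browser's arrow/Esc/Tab buttons onto `send_key_cmd`. That's the whole trick.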
Blog Became a Pipeline From Idea to Experiment
The v2 Blog tab honestly only had image upload and a list of existing drafts. v3 added sub-tabs and burned my actual workflow straight into the UI. The screens just line up along the flow Idea → Consult → New article → Draft → Publish, but having that match the order in my head turned out to be what mattered.
The key parts are the Idea tab and what comes after it.
Bundling From the Idea Tab
I have an RSS reader running in another project that accumulates articles, and the Idea tab reads from it directly. When three or more articles on the same axis pile up on the RSS side, they surface as "this is bundleable into one article." At the same time, it computes an interest score from the tags and titles of my past articles, and surfaces "lines I can dig into without waiting for RSS."
In short, it pulls candidates from both sides — “topics other people have written about that are piling up” and “topics I’ve been wanting to write about for a while.”
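A minimal sketch of both sides of that candidate pull. The tag-based grouping and the threshold of three are from the text above; the data shapes and function names are my own illustration:

```python
from collections import Counter

def bundle_candidates(articles: list[dict], min_count: int = 3) -> dict:
    """Group incoming RSS articles by tag; any tag carried by min_count
    or more articles becomes a 'bundleable into one post' candidate."""
    by_tag: dict[str, list[str]] = {}
    for art in articles:
        for tag in art["tags"]:
            by_tag.setdefault(tag, []).append(art["title"])
    return {tag: titles for tag, titles in by_tag.items()
            if len(titles) >= min_count}

def interest_score(candidate_tags: list[str], past_articles: list[dict]) -> int:
    """Score a candidate by how often its tags show up in my own past posts,
    i.e. topics I've already been circling without waiting for RSS."""
    freq = Counter(tag for art in past_articles for tag in art["tags"])
    return sum(freq[tag] for tag in candidate_tags)
```

One list comes from the volume of other people's writing, the other from the weight of my own history; the Idea tab just merges the two.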
Settling the Direction in the Consult Tab
When you look at planning candidates and existing articles, you always run into wobble like “should this be appended? Or a separate article? Or rebuilt as a comparison piece?” Throwing this into chat every time pollutes context, and it’s not big enough to spin up the Planner.
The Consult tab is a lightweight session that calls Codex for one turn and gets back a structured consultation thread. Hitting “set this direction as a new article” dumps the consult result straight into the new article tab’s form.
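A sketch of what that one-turn call looks like. `codex exec` runs a single non-interactive Codex turn as far as I know, but treat the subcommand and the prompt framing as assumptions; the helper names are hypothetical:

```python
import subprocess

def consult_prompt(candidate: str, existing: list[str]) -> str:
    """Frame the one-turn question: append, separate post, or comparison."""
    lines = [f"New topic candidate: {candidate}",
             "Existing related articles:"]
    lines += [f"- {title}" for title in existing]
    lines.append("Should I append to an existing post, write a separate "
                 "article, or rebuild this as a comparison piece? "
                 "Answer with one of: append / new / compare, plus a reason.")
    return "\n".join(lines)

def consult(candidate: str, existing: list[str]) -> str:
    # One non-interactive turn; stdout is the consultation result
    return subprocess.run(["codex", "exec",
                           consult_prompt(candidate, existing)],
                          capture_output=True, text=True, check=True).stdout
```

Because the answer is constrained to append / new / compare, the result can be dropped into the new-article form without a parsing step worth the name.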
New Article → Draft → Rewrite It Myself → Experiment
The new article job spits out a draft in unpublished state and stops. It does not publish.
This is the biggest point I focused on in v3. I lay the AI-written draft side by side with the original news article or paper it came from, and rewrite it while adding the questions I had, the things I thought would be better done differently, and the points I want to dig deeper into.
If I let AI take a draft all the way to publish, the result is shallow, and more importantly, it stops being a meaningful article to me. When I re-read it later, I can’t trace “why was I interested in this topic” or “how did I think about it from there” — it’s just a summary collection. Half the reason I write articles is to leave my own thinking in a form I can trace later, so if I skip touching it here, the point of writing nearly disappears.
For hands-on, empirical articles, going through Kana Chat is too slow. I usually write those on a test machine while running the code directly, and they don’t go through Kana Chat at all. The drafts I let Kana Chat produce are mostly news, explanations, and concept-organization pieces.
The publish flow is separate: I check and add to the draft myself, then fire off the edit and publish jobs. They run frontmatter and diary template validation before pushing.
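The frontmatter check before the push is conceptually tiny. A minimal sketch, with the required-key set being a hypothetical schema and the parsing deliberately simplified to flat `key: value` lines (a real validator would use a YAML parser):

```python
import re

REQUIRED_KEYS = {"title", "date", "tags", "draft"}  # hypothetical schema

def parse_frontmatter(text: str) -> dict:
    """Pull the block between the leading '---' fences as a flat dict."""
    m = re.match(r"---\n(.*?)\n---\n", text, re.DOTALL)
    if not m:
        raise ValueError("no frontmatter block")
    fm = {}
    for line in m.group(1).splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            fm[key.strip()] = value.strip()
    return fm

def validate(text: str) -> list[str]:
    """Return a list of problems; empty means safe to publish."""
    fm = parse_frontmatter(text)
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - fm.keys())]
    if fm.get("draft") == "true":
        problems.append("still marked draft: true")
    return problems
```

The publish job refuses to push unless `validate` comes back empty, which is exactly the kind of dumb gate you want between an AI-assisted edit and the live site.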
```mermaid
flowchart TD
NEWS["AI news<br/>papers / advisories"]
RSS["RSS reader"]
IDEA["Idea tab"]
CONS["Consult tab"]
DRAFT["Draft"]
EXP["Experiment locally<br/>when I have time"]
PUB["Publish job<br/>edit + push"]
NEWS --> RSS --> IDEA
IDEA --> CONS --> DRAFT
IDEA --> DRAFT
DRAFT --> EXP --> PUB
DRAFT --> PUB
```
“Read → leave it as a draft → later rewrite and finish it myself” is now a single line. It’s become a device that lets me catch up with what the About page calls “a record of what I learn and the trial and error along the way,” at my own processing speed.
A side benefit: the Idea tab lines up the news summaries that drafts come from, so just glancing at them while commuting lets me sort through “this one is whatever, drop it” or “this one I want to look at from a different angle.” A bit of upstream play for choosing what to dig into. Commute time got slightly more fun.
Bilingual Output Switch
This site posts everything except diary entries as a Japanese/English pair. The new article job now takes “Japanese only / English only / both” so I can choose which language(s) to draft in per job. For the English-only case, validations specific to Japanese templates (mandatory meal/exercise sections, etc.) would misfire, so the validation set is switched per language.
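The language switch is essentially a lookup from the job's language mode to a validation set. A minimal sketch, with the check names being hypothetical labels for the validations described above:

```python
# Hypothetical per-language validation sets: Japanese diary templates
# require sections that must not be checked on English-only drafts.
CHECKS = {
    "ja": ["frontmatter", "diary_template", "meal_section", "exercise_section"],
    "en": ["frontmatter"],
}

def plan_validations(mode: str) -> dict[str, list[str]]:
    """mode is 'ja', 'en', or 'both'. Each drafted file gets only the
    validations for its own language, so Japanese-template checks
    don't misfire on an English-only draft."""
    langs = ["ja", "en"] if mode == "both" else [mode]
    return {lang: CHECKS[lang] for lang in langs}
```

For "both", each language's file is validated independently rather than running one merged check over the pair.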
Splitting Models Across Blog Stages
In v2, I lumped it as “the job uses Claude Opus.” Once I actually started running it, the strengths and weaknesses came out clearly per stage, so v3 splits them. Roughly, the allocation feels like this.
| Stage | Model | Rough reason |
|---|---|---|
| Router / classification | GPT-5.3 codex-spark + GPT-5.4 mini fallback | Speed and stability. Doesn’t need deep reasoning |
| Blog consult | GPT-5.3 codex-spark | Want to run short exchanges lightly |
| New article writer | GPT-5.5 | Compositional skill when expanding from raw material |
| New article reviewer | Gemini Pro | Reading a long document at once and pointing out structural breakage is a strength |
| Edit / publish | Claude Opus 4.6 | A head above the rest as a tool user, for diff editing existing files |
Model selection turns over on a short cycle, so it’s all swappable in config rather than hardcoded.
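The swappable allocation amounts to a small stage-to-model table loaded from config. A sketch under the assumption that the config is JSON and the model identifiers are spelled roughly as in the table above:

```python
import json

# Hypothetical config file content; model names turn over quickly,
# so nothing is hardcoded in the pipeline itself.
CONFIG = json.loads("""
{
  "router":   {"model": "gpt-5.3-codex-spark", "fallback": "gpt-5.4-mini"},
  "consult":  {"model": "gpt-5.3-codex-spark"},
  "writer":   {"model": "gpt-5.5"},
  "reviewer": {"model": "gemini-pro"},
  "edit":     {"model": "claude-opus-4.6"},
  "publish":  {"model": "claude-opus-4.6"}
}
""")

def model_for(stage: str, primary_ok: bool = True) -> str:
    """Resolve a stage to its model, dropping to the fallback (or the
    primary again, if no fallback is configured) when the primary
    is unavailable or out of quota."""
    entry = CONFIG[stage]
    if primary_ok:
        return entry["model"]
    return entry.get("fallback", entry["model"])
```

Swapping a model is then a config edit, not a code change, which matters when quota exhaustion forces a reshuffle mid-week.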
The other reason I split stages is that Claude Code’s Max quota burns out subjectively fast. It’s hit zero in 1.5 hours before, so building everything around Claude carries the risk of “can’t write when I want to write.” Codex lasts much longer, so I shifted the heavy generation work — writer and reviewer — to Codex/Gemini, and left Claude for the tool-using parts (edit and publish). That’s the allocation.
The reason I deliberately mix Gemini in for the reviewer is that I wanted an AI of a different breed in there on purpose. Claude and Codex are both lineages forged in coding-heavy use, so their structural feedback tends to look similar. Mixing in Gemini, which comes from a different lineage, brings comments from a different axis. Strictly speaking this isn’t orchestration, but in the sense of “designing what to make a reviewer do,” it feels close to the multi-agent PR review writeup I wrote before.
Naturally that raises the question “isn’t the writing style different from when Claude was doing it?” If you let Codex write straight, the word choices and paragraph rhythm shift, and re-reading it doesn’t feel like my own articles. So I’ve been beating up AGENTS.md (Codex side) and CLAUDE.md (Claude side) quite a bit, tuning them so the output styles converge.
Smaller But Effective Changes
A few changes outside the blog area that turned out to matter in daily use.
| Change | Detail |
|---|---|
| Lower interruption cost in Planner | Being able to stop things before they run wild as a job made it easier to say “actually, no” during the Plan phase |
| Skills tab and pinned panel | I forget what I can do, so the feature list became searchable from the UI. The top of chat now permanently shows running jobs and context notes |
| Failure-cause report for jobs | Added a mechanism that summarizes “what happened” for failed jobs. Not having to dig through terminal logs every time is surprisingly effective |
Thinking about it, even before X existed, people were endlessly scrolling SNS feeds and RSS readers, doing the same job of “extracting only what I need from a pile of miscellaneous information.” Skimming the timeline — that was a universal human task that’s been there forever. Now we can just ask AI to “pull out only the parts I need.” Information volume keeps growing exponentially while human processing capacity stays the same, so the feeling that AI fills that gap is pretty good. I think this is a fairly nice thing.
Meanwhile, even with the system having grown this big, “okay do this for me” on Kana Chat is wrapped in such a tight harness that what it can actually do is thin. The approval gates and sandbox constraints are by design, but the result is that more often than not I just hit Claude Code from the web directly, or remote-desktop into the PC and use Codex hands-on. Kana Chat is no longer “the thing that does everything for me automatically.” You could also say it never quite became that.
I haven’t been chatting with it casually lately either. Or rather, if I’m going to chat with it, I’d rather just move my hands, so there’s no time to chat. I’m not Kani-san so I guess that’s fine, but there’s also a slight “what was I building this for again? Was it really meant to be an AI RSS reader?” doubt that I can’t fully shake.