Japan's Digital Agency open-sources its government AI "Gennai" with RAG, self-hosted LLM, and legal-AI templates under commercial-friendly licenses
On April 24, 2026, Japan’s Digital Agency open-sourced parts of “Gennai” (源内), the generative AI platform it currently operates inside central-government ministries.
Alongside the web interface, the release includes AI development templates tailored to AWS, Azure, and Google Cloud, all under commercially usable licenses.
“A government-built AI system landing on GitHub” is still a surprising framing in Japan, but the release is clearly designed for reuse — local governments and private-sector operators can rebuild Gennai inside their own environments.
What Gennai is
Gennai is a generative AI platform for government staff.
Led by the Digital Agency, it’s designed as a shared baseline that central-government ministries can actually use for day-to-day work.
The plan is to give roughly 180,000 government employees access to generative AI during fiscal year 2026.
The goal is not just another chat UI. The stated aim is to “let teams build and operate business-specific generative AI applications, quickly, safely, and easily, for real-world work.”
Rather than a single LLM wrapper, Gennai is built more as a platform for registering, managing, and operating team-owned AI apps.
The policy backdrop is Japan’s AI Act (as it’s informally called), enacted in May 2025, and the AI Basic Plan approved by cabinet in December 2025.
The thinking is “lead by example”: the government itself should be the first to actually use AI, and Gennai is the vehicle for that.
What was released
Two repositories were published under the digital-go-jp organization on GitHub.
| Repository | Contents |
|---|---|
| digital-go-jp/genai-web | The Gennai web application itself |
| digital-go-jp/genai-ai-api | AI-side microservices (per-cloud templates) |
genai-web (the web application)
This is the screen-facing side that staff actually see.
It’s built on top of an AWS-published OSS project for generative-AI use cases, with Digital Agency-specific features added on top.
The tech stack isn’t particularly exotic.
- Language: TypeScript (99% of the repo)
- Frontend: React
- Styling: Tailwind CSS
- Infrastructure: AWS + AWS CDK
Main features include:
- Team management: group AI apps by organization unit
- AI app registration and management: register task-specific AI as apps
- External microservice wiring: combine with `genai-ai-api` templates
- Digital Agency Design System applied
- SAML authentication, so it slots into staff-authentication setups
- Logging, CI/CD, custom-domain settings
It’s closer to “an app store and launcher for task-specific AI apps” than a single chatbot.
genai-ai-api (the AI-side microservices)
This is the actual AI processing that the web app calls into.
It’s a monorepo with separate templates for each of the three major clouds.
| Folder | Cloud | Contents |
|---|---|---|
| aws/ | AWS | RAG template for government operational work |
| azure/ | Azure | Self-hosted LLM deployment template |
| google-cloud/ | Google Cloud | Legal / regulatory AI template |
You don’t need to install everything — pick the template that matches the cloud you already use.
The language breakdown is about 40% Python, mixed with Bicep (Azure IaC), TypeScript, HCL (Terraform), Shell, and PowerShell.
How AWS, Azure, and Google Cloud templates differ
The three templates cover clearly different problem areas.
```mermaid
graph TD
    A[Gennai genai-web<br/>staff-facing UI] --> B{Use case}
    B -->|Knowledge search / document QA| C[AWS template<br/>Gov RAG]
    B -->|Run your own LLM| D[Azure template<br/>Self-hosted LLM]
    B -->|Law / regulation AI| E[GCP template<br/>Legal AI]
    C --> F[RAG over ministry docs<br/>using S3 / OpenSearch, etc.]
    D --> G[Deploy your own LLM<br/>on Azure]
    E --> H[AI apps linked to<br/>legal APIs and current law]
```
- AWS: assumes you want to query internal documents and manuals via RAG. Operationally focused.
- Azure: assumes you want to host your own LLM on the cloud. High model-choice flexibility.
- Google Cloud: dedicated to AI apps dealing with law and regulation, with current-law data integration in mind.
What’s interesting is that the release doesn’t lock into a single cloud.
For cross-ministry use, different ministries have already standardized on different clouds, so covering all three is actually a natural design choice.
Licensing and commercial use
Two license families are used.
- Source code: MIT License (with a bit of Amazon Software License mixed in)
- Documentation: Creative Commons Attribution 4.0 International (CC BY 4.0)
MIT means commercial use, modification, and redistribution are fine.
“Standing up the whole of Gennai internally” or “pulling part of a template into our existing AI platform” — both are legally unproblematic.
That said, the Digital Agency also spells out the operational stance:
- Feature requests and pull requests are not accepted
- Vulnerability reports are accepted only through a dedicated channel
- Positioned as a public resource contributed to the OSS community
The stance is “we distribute, but this isn’t a joint development project.”
The expected pattern is for others to fork and evolve their own copies.
What they’re explicitly not releasing
They’re also clear about what stays internal.
- Internal manuals
- The large language models themselves
- Production logs
Since LLMs are decoupled, anyone standing up Gennai in their own environment has to bring their own — either an API key or a self-hosted model.
The Azure template already goes in the direction of self-hosted LLM, so there’s a clear route from the released code into a fully self-hosted setup.
Why release it
The Digital Agency gives three reasons.
- Reference implementation: contributes to shared rules for safely operating task-specific AI across central-government ministries
- Avoiding duplicated development: local governments and other agencies don’t have to rebuild a similar platform from scratch
- Reducing vendor lock-in: distributing something modifiable and reusable curbs dependence on any single vendor
The second point is quite concrete in practice. If every local government commissions its own generative-AI platform from a system integrator, dozens of similar systems end up being built in parallel nationwide.
Using Gennai as the base can cut both procurement cost and operational risk at the same time.
The third point, “avoiding vendor lock-in,” is visible in the multi-cloud template lineup even though the base OSS was AWS-centric.
There’s intentional room to step off any single cloud vendor.
What’s next
The Digital Agency plans to progressively publish usage results and technical know-how.
In particular, tech articles titled along the lines of “building a modern web application on Japan’s Government Cloud” have been teased, which could surface real recipes for operating AI apps on top of the government cloud.
For government-published AI platforms, the “how they actually run it” information tends to be more valuable than the features themselves.
Worth following the updates on this front.
Related moves
Japanese sovereign AI and government AI have been moving quickly since the start of 2026.
A little earlier, SoftBank, NEC, Honda, and Sony launched a new domestic AI company “Japan AI Foundation Model Development” aiming at a 1-trillion-parameter physical AI model.
That effort is closer to “the model the government uses,” while Gennai is closer to “the platform the government uses” — different layers of the same stack.
On the model side, NII has released LLM-jp-4-32B-A3B-thinking, and the Japanese-capable LLM lineup is expanding quickly overall (see Japanese LLM options compared).
Combined with Gennai’s Azure template, you could realistically self-host a Japanese-specialized model and point it at government-style use cases.
What you can actually do with it
If you pull the repos locally, what can you actually build?
Here’s a read of the likely paths after skimming the READMEs.
Prerequisites
genai-web alone looks like “just React + TypeScript + Tailwind,” but real-world operation requires the following.
| Prerequisite | Notes |
|---|---|
| SAML authentication | Staff-auth integration is assumed, so you need an internal IdP or a dummy replacement |
| Cloud account | At minimum one of AWS / Azure / Google Cloud |
| LLM endpoint | Bedrock, Azure OpenAI Service, Vertex AI, or a self-hosted LLM |
| AWS CDK | genai-web deployment is AWS-based |
You can get the frontend up with `npm install && npm run dev`, but extracting real value as an AI app still requires cloud-side resources.
Can it be used like NotebookLM
Yes, but the scope is a bit different.
NotebookLM is a tool where you upload PDFs or Google Docs and get summaries, Q&A, and even audio podcast generation.
Gennai, on the other hand, is a “platform for managing and operating task-specific AI apps at an organization level,” so it doesn’t ship NotebookLM’s feature set directly.
However, since the AWS template is “government operational RAG,” these use cases fall out naturally:
- Ingest internal manuals and let staff query them in natural language
- Make past circulars and meeting minutes searchable for Q&A-style lookup
- Summarize information that spans multiple documents
The flashier parts like audio-podcast generation aren’t in the template, but since the design supports adding external-API microservices, bolting on a separate TTS service is straightforward.
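As a sketch of what such a bolt-on could look like: a tiny HTTP side-service with the synthesis step stubbed out where a real TTS engine or cloud TTS API would go. The endpoint shape, payload format, and port here are assumptions for illustration, not anything defined in the released templates.

```python
# Minimal sketch of a TTS side-service that could be wired in as an
# external microservice. The POST payload shape ({"text": ...}) and the
# port are assumptions; synthesize() is a placeholder for a real engine.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def synthesize(text: str) -> bytes:
    # Placeholder: a real implementation would call a TTS engine here.
    return f"AUDIO<{len(text)} chars>".encode("utf-8")

class TTSHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        text = json.loads(body).get("text", "")
        audio = synthesize(text)
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.end_headers()
        self.wfile.write(audio)

# To actually serve:
#   HTTPServer(("127.0.0.1", 8080), TTSHandler).serve_forever()
```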
What LLMs you need
The most important point: LLMs are not included in the repositories.
What’s open-sourced is “the AI app container” and “cloud-specific templates” — you’re expected to bring your own model.
Three broad options:
| Route | Usable models | Best-fit template |
|---|---|---|
| Managed LLM | Claude (Bedrock) / GPT-4o (Azure OpenAI) / Gemini (Vertex AI) | The standard template for each cloud |
| Self-hosted | Llama / Qwen / LLM-jp / PLaMo, etc. | Azure template (self-hosted LLM design) |
| Hybrid | Sensitive data on self-hosted, general tasks via managed | All templates |
For government use, sending data to an external vendor API isn’t always allowed.
That’s why the Azure template is explicitly oriented toward self-hosted LLMs.
For Japanese-specialized use, self-hosting an open model like LLM-jp-4-32B-A3B-thinking is a realistic choice.
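Many self-hosting servers (vLLM and similar) expose an OpenAI-compatible chat endpoint, so pointing an app at your own model can look roughly like the sketch below. The URL and model id are placeholders for your own deployment, not values taken from the Gennai repos.

```python
# Sketch of calling a self-hosted, OpenAI-compatible chat endpoint.
# ENDPOINT and MODEL are placeholders for your own deployment.
import json
import urllib.request

ENDPOINT = "http://localhost:8000/v1/chat/completions"  # your server
MODEL = "llm-jp-4-32b-a3b-thinking"                     # example model id

def build_request(prompt: str) -> urllib.request.Request:
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(prompt: str) -> str:
    # Blocks until the self-hosted server responds.
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```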
Can you build document embedding search
Yes, and that’s essentially the main feature of the AWS template.
Calling it “government operational RAG” implies the following setup is included:
- Documents uploaded → vectorized by an embedding model
- Stored in a vector database (OpenSearch / Bedrock Knowledge Base / pgvector, etc.)
- At query time, similar documents are retrieved and passed to the LLM as context
Embedding models themselves are assumed to be the managed ones on each cloud:
- AWS: Titan Embeddings, Cohere Embed
- Azure: text-embedding-3-small/large, ada-002
- Google Cloud: text-embedding-004, gecko
Whether to add hybrid search (vector + BM25) or reranking is up to the implementation.
Using the templates as-is already gives you working embedding search, but getting production-grade accuracy requires domain-specific chunking and metadata design.
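A minimal illustration of that chunking-plus-metadata step: overlapping character-window chunks with fields that retrieval can later filter on. Sizes and field names are illustrative assumptions, not values from the templates.

```python
# Sketch of domain-aware chunking: split a document into overlapping
# chunks and attach metadata for retrieval-time filtering.
# size/overlap and the metadata fields are illustrative choices.
def chunk(text: str, source: str, size: int = 200, overlap: int = 40) -> list[dict]:
    chunks = []
    step = size - overlap
    for i, start in enumerate(range(0, max(len(text), 1), step)):
        piece = text[start:start + size]
        if not piece:
            break
        chunks.append({
            "text": piece,
            "source": source,     # lets retrieval filter by document
            "chunk_index": i,     # preserves ordering for reassembly
        })
    return chunks
```

In practice the useful knobs are the overlap (so answers spanning a chunk boundary stay recoverable) and the metadata (so a query can be scoped to, say, one ministry's manuals).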
A government releasing an entire system on GitHub is still a rare sight in Japan.
If you actually want to understand Gennai, cloning digital-go-jp/genai-web and running it locally is faster than reading more about it.