msgvault - a tool for backing up Gmail locally and searching it fast

Ikesan

What msgvault is

msgvault is a tool for backing up Gmail locally and searching it quickly. It was created by Wes McKinney, the author of pandas, the Python data analysis library.

The project description calls it “a local-first storage and retrieval engine for slicing, dicing, and querying a lifetime of email and messaging data in milliseconds.” In other words, everything lives on your own machine, and searches are meant to come back in milliseconds.

Why this matters

I have been using Gmail for close to 20 years. I rarely go back to old messages, but unlike paper letters, digital data does not come back once it is gone.

There are real risks in depending entirely on Google:

  • Mail can stop arriving because of storage limits
  • The service can disappear or change policy
  • An account lockout can cut off access to all data

There are also stories of accounts being locked for policy violations after Gemini (Nano Banana) generated an image on its own, even though the user never asked for image generation. Keeping a lifetime of data in a single Google service is not risk-free.

Tech stack

  • Single Go binary
  • SQLite as the main database
  • DuckDB + Parquet for fast analytical queries
  • FTS5 for full-text search
  • Gmail API with OAuth authentication

As you might expect from a tool built by the author of pandas, it uses DuckDB + Parquet as the data-processing backbone.

Main features

Backup

msgvault init-db
msgvault add-account you@gmail.com  # OAuth in the browser
msgvault sync-full you@gmail.com    # full sync
msgvault sync you@gmail.com         # incremental sync

The initial full sync can take a while because of Gmail API rate limits, but incremental syncs after that finish in seconds.
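Since incremental syncs are cheap, they are a natural fit for a scheduled job. Here is a minimal sketch in Python, assuming msgvault is on your PATH and using only the sync subcommand shown above. The sync_command and sync_all helpers and the account list are my own illustration, not part of msgvault.

```python
import subprocess


def sync_command(account):
    """Build the incremental sync command for one account."""
    return ["msgvault", "sync", account]


def sync_all(accounts, runner=subprocess.run):
    """Run an incremental sync for each account, stopping on the first failure."""
    for account in accounts:
        # check=True makes subprocess.run raise if msgvault exits non-zero.
        runner(sync_command(account), check=True)


# Example (hypothetical account list), e.g. called from cron or a systemd timer:
# sync_all(["you@gmail.com"])
```

The runner parameter exists only so the loop can be exercised without actually calling msgvault.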

Search

msgvault search "query term"
msgvault search "from:example@gmail.com has:attachment"
msgvault search "\"exact match phrase\""

It supports Gmail-style search operators such as from:, to:, has:attachment, before:, and after:.
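To give a feel for what a Gmail-style query carries, here is a rough sketch that splits one into operator filters and free-text terms. This is not how msgvault parses queries (I have not read that code). The parse_query function and the KNOWN_OPERATORS set are hypothetical and only cover the operators listed above.

```python
import shlex

# Operators named in this article; the real tool may support more.
KNOWN_OPERATORS = {"from", "to", "has", "before", "after"}


def parse_query(query):
    """Split a Gmail-style query into (operator filters, free-text terms)."""
    filters = {}
    terms = []
    # shlex keeps a double-quoted phrase together as one token (quotes removed).
    for token in shlex.split(query):
        key, sep, value = token.partition(":")
        if sep and key in KNOWN_OPERATORS and value:
            filters.setdefault(key, []).append(value)
        else:
            terms.append(token)
    return filters, terms


filters, terms = parse_query(
    'from:example@gmail.com has:attachment "exact match phrase" after:2024/01/01'
)
print(filters)
print(terms)
```

In this sketch the operators end up as structured filters, while everything else is left over as text for the full-text index.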

Gmail search has felt a bit unreliable lately. When you search for an exact string, it may split the string apart or match similar words instead. Because msgvault is based on SQLite FTS5, wrapping a phrase in double quotes matches those words only when they appear together in that order.
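The difference between bare terms and a quoted phrase is easy to see in FTS5 itself. The sketch below uses Python's built-in sqlite3 module (assuming your SQLite build includes FTS5, which most do) with a made-up two-column table. It shows FTS5 behavior in general, not msgvault's actual schema.

```python
import sqlite3

# In-memory database with a hypothetical mail index (not msgvault's schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE mail USING fts5(subject, body)")
conn.executemany(
    "INSERT INTO mail (subject, body) VALUES (?, ?)",
    [
        ("Invoice", "Your quarterly report is attached."),
        ("Notes", "The report covers quarterly and annual figures."),
        ("Lunch", "Are you free on Friday?"),
    ],
)


def search(query):
    """Return the subjects of matching rows in insertion order."""
    rows = conn.execute(
        "SELECT subject FROM mail WHERE mail MATCH ? ORDER BY rowid", (query,)
    )
    return [subject for (subject,) in rows]


# Bare terms are ANDed together and may appear anywhere in the row.
any_order = search("quarterly report")
# A double-quoted string is a phrase: the tokens must be adjacent and in order.
exact_phrase = search('"quarterly report"')

print(any_order)     # ['Invoice', 'Notes']
print(exact_phrase)  # ['Invoice']
```

Note that the match is still token-based: with the default tokenizer, case and punctuation are ignored, so it is an exact phrase match rather than a byte-for-byte string match.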

TUI

msgvault tui

It includes a terminal UI for browsing and searching mail interactively.

MCP server

msgvault mcp

This lets Claude Desktop and other MCP-capable AI agents search and analyze your mail archive. You can then ask something like “find mail from so-and-so last year” in natural language.
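For Claude Desktop, local MCP servers are registered under the mcpServers key of claude_desktop_config.json. The sketch below prints an entry you could merge into that file. It assumes msgvault is on your PATH, that `msgvault mcp` talks over stdio, and that no extra arguments are needed. None of that is confirmed here, so check the project's own documentation. The server name "msgvault" is arbitrary.

```python
import json

# Hypothetical Claude Desktop entry; only the command itself comes from above.
config = {
    "mcpServers": {
        "msgvault": {
            "command": "msgvault",  # assumes the binary is on PATH
            "args": ["mcp"],
        }
    }
}

print(json.dumps(config, indent=2))
```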

Can it be used with Google Workspace?

The documentation does not say so explicitly, but since msgvault only uses the Gmail API and OAuth, it will probably work with Workspace accounts too. The one caveat is that a Workspace administrator may restrict API access.

Backup for the backup

Because msgvault is local-first, if your local data disappears, that is the end of it. Ironically, the “backup of the Gmail backup” also needs a backup.

Practical options:

  1. Do not delete anything from Gmail - as long as you do not use msgvault’s staged deletion feature, Gmail itself remains the backup
  2. rclone + cloud storage - sync ~/.msgvault/ to S3 or Backblaze B2 regularly
  3. NAS / separate disk - use rsync and a 3-2-1 backup rule

As long as you are not deleting mail from Gmail permanently, it ends up being effectively a triple backup, so it may not be worth worrying too much.
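If you do want an extra copy, options 2 and 3 both boil down to taking a snapshot of the data directory and putting it somewhere else. Here is a minimal sketch using only the Python standard library. The snapshot function and the destination path are hypothetical, and it assumes msgvault is not running while the archive is written, so the SQLite files are not copied mid-write.

```python
import tarfile
import time
from pathlib import Path


def snapshot(data_dir, dest_dir):
    """Write a timestamped .tar.gz of data_dir into dest_dir and return its path."""
    data_dir = Path(data_dir).expanduser()
    dest_dir = Path(dest_dir).expanduser()
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest_dir / f"msgvault-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # Store paths relative to the directory name, e.g. ".msgvault/...".
        tar.add(str(data_dir), arcname=data_dir.name)
    return archive


# Example (hypothetical destination on a NAS or second disk):
# snapshot("~/.msgvault", "/mnt/backup/msgvault")
```

The resulting archive can then be pushed to S3 or Backblaze B2 with rclone, or left on the second disk as the local half of a 3-2-1 setup.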

Current caveats

  • Pre-alpha software, so APIs and storage formats may change
  • Gmail only for now, with WhatsApp and iMessage planned later
  • You need to create your own OAuth credentials

Take

If you care about owning your own data, this is the kind of tool that makes sense. It is worth trying if you dislike Gmail search or want a path away from Gmail in the future.

That said, because it is still pre-alpha, it is probably too early to trust it as your only backup. For now, using it alongside Gmail without deleting anything is the safer approach.