@paddlejs-models/ocr Doesn't Work in Browsers (as of 2025)

Conclusion

@paddlejs-models/ocr and the related library I tried (@gutenye/ocr-browser) did not work in browser environments as of December 2025. @paddlejs-models/ocr relies on a global that is never defined in the browser, so it throws at load time whether it is bundled with Vite or loaded via esm.sh.

If you need OCR in the browser, use Tesseract.js.

What I Tried

1. @paddlejs-models/ocr (npm)

pnpm add @paddlejs-models/ocr

const ocr = await import('@paddlejs-models/ocr');
await ocr.init();
const result = await ocr.recognize(imageElement);

Error: ReferenceError: Module is not defined

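The error comes from the Emscripten-compiled WASM glue code, which references a global Module variable. Module is an Emscripten convention, not a Node.js or browser built-in, so nothing defines it when the package is loaded in the browser.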
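
To make the failure mode concrete, here is a minimal sketch of how a bare reference to an undeclared global behaves compared with a typeof guard. The name HypotheticalWasmModule is a stand-in invented for this example; it is not taken from the package's source.

```javascript
// HypotheticalWasmModule is a made-up name standing in for the global
// that the library expects; nothing in this file declares it.

// A bare reference throws when no binding with that name exists.
function bareReference() {
  return HypotheticalWasmModule;
}

// A typeof guard never throws, so it can fall back to an empty object.
function guardedReference() {
  return typeof HypotheticalWasmModule !== 'undefined'
    ? HypotheticalWasmModule
    : {};
}

// Run a function and report either 'ok' or the name of the error it threw.
function outcomeOf(fn) {
  try {
    fn();
    return 'ok';
  } catch (err) {
    return err.name;
  }
}

console.log(outcomeOf(bareReference));    // ReferenceError
console.log(outcomeOf(guardedReference)); // ok
```

Code that uses the guarded form keeps working when the global is missing; the error above suggests the library uses the bare form somewhere.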

2. Working Around It with Vite Config

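// astro.config.mjs
import { defineConfig } from 'astro/config';

export default defineConfig({
  vite: {
    define: {
      // Rejected: define values cannot be expressions
      'Module': 'globalThis.Module || {}',
    },
    optimizeDeps: {
      include: ['@paddlejs-models/ocr'],
    },
  },
});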

Error: Invalid define value (must be an entity name or JS literal)

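Vite hands define values to esbuild, which accepts only an entity name (such as an identifier) or a JSON-style literal, so an expression like globalThis.Module || {} is rejected.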
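
As a rough illustration of that rule, the sketch below checks whether a string looks like an acceptable define value. The function looksLikeValidDefineValue is a hypothetical helper written for this post; it approximates the wording of the error message and is not esbuild's actual implementation.

```javascript
// Rough approximation of the rule in the error message: a define value must
// be a JSON-style literal or an entity name (identifier, optionally dotted).
// This is an illustrative check, not esbuild's actual implementation.
function looksLikeValidDefineValue(value) {
  const entityName = /^[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*$/;
  if (entityName.test(value)) return true;
  try {
    JSON.parse(value);
    return true;
  } catch {
    return false;
  }
}

console.log(looksLikeValidDefineValue('globalThis.Module || {}')); // false
console.log(looksLikeValidDefineValue('globalThis.Module'));       // true
console.log(looksLikeValidDefineValue('{}'));                      // true
console.log(looksLikeValidDefineValue(JSON.stringify('1.0.0')));   // true
```

Passing this check only means the config would likely be accepted. It says nothing about whether the replacement would make the library work.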

3. Excluding from optimizeDeps

vite: {
  optimizeDeps: {
    exclude: ['@paddlejs-models/ocr'],
  },
}

Result: Same Module is not defined error.

4. Polyfilling window.Module

<script is:inline>
  window.Module = window.Module || {};
</script>

Result: Module is referenced at module load time, so the polyfill arrives too late.
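
The ordering problem can be sketched in isolation. This is a simulation under assumptions, not the library's code: evaluateSimulatedLibrary stands in for top-level code that reads a global while it is being evaluated, and HypotheticalLateGlobal is a made-up name.

```javascript
const events = [];

// Stands in for the top-level code of a library that reads a global
// while it is being evaluated. HypotheticalLateGlobal is a made-up name.
function evaluateSimulatedLibrary() {
  try {
    void HypotheticalLateGlobal;
    events.push('global found');
  } catch (err) {
    events.push(err.name);
  }
}

evaluateSimulatedLibrary();              // library code runs first
globalThis.HypotheticalLateGlobal = {};  // polyfill is installed afterwards
evaluateSimulatedLibrary();              // only a later evaluation sees it

console.log(events.join(' -> ')); // ReferenceError -> global found
```

A polyfill helps only if it is installed before the first evaluation, and a module that has already thrown is not evaluated a second time.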

5. esm.sh CDN

const ocr = await import('https://esm.sh/@paddlejs-models/ocr');

esm.sh is a CDN that converts npm packages to browser-compatible ESM.

Error: Same ReferenceError: Module is not defined

esm.sh converts the module format, but that doesn't supply the missing Module global, so the code fails the same way.

6. @gutenye/ocr-browser

I tried an alternative wrapper library.

pnpm add @gutenye/ocr-browser onnxruntime-web

Problem: The open() function’s input type checking doesn’t work in the browser. Every input I passed ended up stringified as [object Object] or [object HTMLImageElement].

Input formats I tried:

  • { data, width, height } object
  • HTMLImageElement
  • HTMLCanvasElement
  • ImageData
  • Blob URL
  • Data URL (via FileReader)
  • Data URL (via canvas.toDataURL)

All failed.
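
The [object Object] and [object HTMLImageElement] strings are what JavaScript produces when an object is converted to a string. The sketch below shows one plausible way that could happen, assuming a loader that treats every input as a URL or path. Both hypotheticalResolveSource and FakeImageElement are invented for this example, and I haven't confirmed this against the library's source.

```javascript
// Hypothetical loader that assumes every input is a URL or path string.
// It is invented for this example and is not @gutenye/ocr-browser's code.
function hypotheticalResolveSource(input) {
  return String(input);
}

// Minimal stand-in for a DOM element so the sketch also runs outside a browser.
class FakeImageElement {
  get [Symbol.toStringTag]() {
    return 'HTMLImageElement';
  }
}

console.log(hypotheticalResolveSource({ data: [], width: 1, height: 1 })); // [object Object]
console.log(hypotheticalResolveSource(new FakeImageElement()));           // [object HTMLImageElement]
console.log(hypotheticalResolveSource('blob:example'));                   // blob:example
```

This would account for the object inputs only. It does not explain why the Blob URL and Data URL strings failed as well.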

Alternative

pnpm add tesseract.js

import { createWorker } from 'tesseract.js';

const worker = await createWorker('jpn');
const result = await worker.recognize(imageFile);
console.log(result.data.text);
await worker.terminate();

  • Japanese support
  • Works in the browser without issues
  • Language data downloads from CDN only on first use (~14MB)
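
If recognition can throw, it may be worth making sure the worker is always terminated. Here is a minimal sketch of that pattern. The helper name recognizeText is my own, and it takes createWorker as a parameter (assumed to be the function exported by tesseract.js) so the helper itself has no imports.

```javascript
// createWorker is expected to be the function exported by tesseract.js,
// passed in by the caller so this helper has no import of its own.
async function recognizeText(createWorker, image, lang = 'jpn') {
  const worker = await createWorker(lang);
  try {
    const result = await worker.recognize(image);
    return result.data.text;
  } finally {
    // Release the worker even when recognition throws.
    await worker.terminate();
  }
}
```

In an app this would be called as recognizeText(createWorker, imageFile) after importing createWorker from 'tesseract.js'. Creating a worker per call keeps the sketch simple; reusing one worker across several images would avoid repeated startup cost.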

PaddleOCR is excellent in Python environments, but its JavaScript/browser implementation is not mature. For OCR in the browser, just use Tesseract.js.