julien-blanchon/Montelimar

Montélimar OCR
Textify and Latexify every part of your screen

An OCR toolbox integrated into your Mac

Apache 2 License Tauri Svelte Python Rust

Preview

[Screenshots: dark theme and light theme]

Features

  • 🔍 Advanced OCR Capabilities

    • Text extraction from any part of your screen
    • LaTeX equation recognition and conversion with the Nougat model
    • Support for multiple OCRS and Nougat models
  • Quick Access

    • Menubar integration for easy access
    • Global keyboard shortcuts for instant capture
    • Custom shortcuts for specific OCR configurations
  • 🎨 User Interface

    • Clean and modern interface in the menubar
    • Dark and light theme support
    • Overlay panel (Spotlight style) for choosing which model to use for OCR
  • 🛠️ Customization

    • Configurable OCR settings
    • Customizable keyboard shortcuts
    • Auto-start option
    • Sound feedback toggle
  • 📋 Clipboard Integration

    • Automatic clipboard copying
    • Quick copy buttons for results
  • 📝 History Management

    • Screenshot history tracking
    • Search through past OCR results
    • Option to disable history for privacy
  • 🔒 Privacy and Security

    • Local processing of screenshots
    • Optional history disable feature

Installation

Getting Started

Prerequisites

  • Bun - Fast all-in-one JavaScript runtime & package manager
  • uv - Fast Python package installer
  • Rust - For Tauri's backend

Development Setup

  1. Install the JavaScript project dependencies:
bun install
  2. Python Sidecar Setup:
# Navigate to the src-python directory
cd src-python
# Create a virtual environment
uv venv
# Activate the virtual environment
source .venv/bin/activate
# Sync the dependencies
uv sync

The Python sidecar (ocr_mlx) is packaged using box-packager, which:

  • Takes the entry point defined in pyproject.toml (ocr_mlx.endpoint:main)
  • Bundles all dependencies into a single executable
  • Places the executable in src-tauri/binaries with platform-specific naming
  • Uses PyApp to bootstrap the Python environment at runtime
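
Tauri looks up a sidecar binary by its base name followed by the Rust target triple of the platform, which is what "platform-specific naming" refers to here. A minimal sketch of how such a file name could be derived, assuming the triple is taken from the `host:` line of `rustc -Vv`. The helper names `host_triple` and `sidecar_filename` are hypothetical and are not taken from this repository's scripts:

```python
def host_triple(rustc_verbose: str) -> str:
    """Extract the host target triple from the text printed by `rustc -Vv`."""
    for line in rustc_verbose.splitlines():
        if line.startswith("host:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("no 'host:' line found in rustc output")


def sidecar_filename(name: str, triple: str) -> str:
    """Build the sidecar file name: <name>-<triple>, plus .exe on Windows targets."""
    suffix = ".exe" if "windows" in triple else ""
    return f"{name}-{triple}{suffix}"
```

To use it, pass the captured output of `rustc -Vv` to `host_triple` and the result, together with the sidecar name (`ocr_mlx` here), to `sidecar_filename`. The packaging script in this project already does the naming for you, so this is only meant to illustrate the convention.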
  3. Build the Python Sidecar:
# This command will:
# 1. Package the Python app using box-packager
# 2. Copy the executable to src-tauri/binaries
# 3. Name it appropriately for your platform
bun run python:package:build

# Initialize the packaged environment
bun run python:package:reset
  4. Start Development Server:
# This will:
# 1. Start the Svelte dev server
# 2. Launch the Tauri window
# 3. Initialize the Python sidecar for OCR
bun run tauri dev

Building for Production

# Build the complete application:
# 1. Checks if Python sidecar needs rebuilding
# 2. Rebuilds sidecar if needed
# 3. Builds the Tauri application
bun run tauri build

The built application will be available in src-tauri/target/release.

Development Workflow

Frontend Development (SvelteKit + Tauri)

  • src/ - SvelteKit frontend code
  • src-tauri/ - Tauri backend code
  • Hot reload enabled for both frontend and Rust changes

Python Sidecar Development

  • src-python/ - Python OCR service code
  • Development mode: bun run python:package:dev
  • API development:
    # Generate OpenAPI specs from Python service
    bun run python:package:generate-openapi
    
    # Generate TypeScript client from OpenAPI specs
    bun run python:package:generate-client

Asset Generation

  • bun run icon:generate - Generate app icons from SVG
  • bun run icon:generate-tray - Generate animated tray icons

Development Tools

  • SvelteKit with Svelte 5 Runes for a reactive frontend
  • Tauri v2 for native capabilities
  • Tailwind CSS + DaisyUI for styling
  • MLX-Nougat for efficient OCR processing
  • SQLite with Drizzle ORM for data persistence
  • TypeScript for type safety
  • Python sidecar with FastAPI for OCR service

Note: The Python sidecar is automatically rebuilt when changes are detected in the src-python directory. The build process is managed by the pretauri script.
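
One common way to detect changes in a source directory is to hash its contents and compare the result with a digest stored after the last successful build. The sketch below illustrates that idea only; it is an assumption about how such a check could work, not a description of what the pretauri script actually does. The function names and the stamp file are hypothetical:

```python
import hashlib
from pathlib import Path

# Directories assumed to be irrelevant to the packaged sidecar.
IGNORED_DIRS = {".venv", "__pycache__"}


def tree_digest(root: Path) -> str:
    """Return a SHA-256 digest over the relative paths and contents of files under root."""
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root)
        if IGNORED_DIRS.intersection(rel.parts):
            continue
        digest.update(rel.as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()


def needs_rebuild(root: Path, stamp: Path) -> bool:
    """True when the current digest differs from the one recorded in the stamp file."""
    previous = stamp.read_text().strip() if stamp.exists() else ""
    return tree_digest(root) != previous


def record_build(root: Path, stamp: Path) -> None:
    """Record the current digest; the stamp file must live outside root."""
    stamp.write_text(tree_digest(root))
```

A build script following this pattern would call `needs_rebuild` before packaging and `record_build` after a successful build. Hashing file contents rather than comparing modification times avoids spurious rebuilds after a fresh checkout.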

Contributing

Contributions are always welcome!

See contributing.md for ways to get started.

Roadmap

  • Enhanced OCR Capabilities

    • Support for handwritten text recognition
    • Table structure recognition and export to Excel/CSV
    • Chemical formula recognition
    • Multi-language support with language auto-detection
  • Unified processing backend

  • User Experience

    • Interactive tutorial for new users
    • Customizable OCR region presets
  • Platform Support

    • Windows support
    • Linux support

Authors

Related

This project is heavily inspired by the following commercial projects:

Acknowledgements

  • Tauri v2: A framework for building web-based desktop apps using Rust.
  • tauri-toolkit: A Tauri plugin toolkit for menubar apps.
  • JacobBolda: A YouTube channel with a lot of Tauri content.
  • MrJakob: A YouTube channel with a lot of Tauri content.
  • OCRS: A lightweight OCR library project.
  • mlx-nougat: Nougat implementation for MLX.
  • nougat: Facebook Nougat OCR.
  • Fluent Icons: A library of icons for Windows (used for the tray icon).