A Chrome extension that captures screen content, performs OCR, and generates intelligent insights using LLMs.
- Screen capture with adjustable quality settings
- OCR text extraction using Tesseract.js
- LLM-powered analysis using Hugging Face Inference API
- Modern, responsive UI with progress indicators
- Efficient image processing and caching
- Error handling and validation
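
The caching feature is not described in more detail here. As a rough illustration only, one way to avoid re-running OCR on an identical capture is a small least-recently-used cache keyed by a hash of the image data URL. The names `OcrResultCache` and `fnv1aHash` are hypothetical and are not taken from this repository.

```javascript
// Hypothetical sketch: nothing below comes from the extension's source code.

// Non-cryptographic 32-bit FNV-1a hash, used only to derive a short cache key.
function fnv1aHash(text) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits
  }
  return hash.toString(16);
}

// Tiny least-recently-used cache for OCR results.
class OcrResultCache {
  constructor(maxEntries = 20) {
    this.maxEntries = maxEntries;
    this.entries = new Map(); // a Map iterates in insertion order
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert so this entry becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
  }
}
```

A caller would compute `fnv1aHash(imageDataUrl)`, check `cache.get(key)` before running OCR, and store the extracted text with `cache.set(key, text)` afterwards. Because the hash is only 32 bits, collisions are possible; a stronger hash would be safer if a stale result matters.
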
1. Clone the repository: `git clone https://github.com/KPrathamesh-27/Guide_Extension`
2. Install the backend dependencies: from the repository root, run `cd backend`, then `npm install`.
3. Configure environment variables by creating a `.env` file in the `backend` directory:
   ```env
   PORT=3000
   HUGGING_FACE_API_KEY=your_api_key_here
   ```
4. Load the extension in Chrome:
   - Open Chrome and navigate to `chrome://extensions`
   - Enable "Developer mode"
   - Click "Load unpacked"
   - Select the `extension` directory
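
The two `.env` values can be checked once when the backend starts, so that a missing key fails fast rather than surfacing later as a failed API call. The following is a minimal sketch under assumptions: `loadConfig` is a hypothetical helper, and how the backend actually reads `.env` (for example through a loader package) is not stated in this README.

```javascript
// Hypothetical sketch: validates the PORT and HUGGING_FACE_API_KEY settings.
// `env` is any object shaped like process.env.
function loadConfig(env) {
  // Fall back to 3000, the value used in the sample .env file.
  const port = Number(env.PORT ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  const apiKey = (env.HUGGING_FACE_API_KEY ?? "").trim();
  if (apiKey === "" || apiKey === "your_api_key_here") {
    throw new Error("HUGGING_FACE_API_KEY is missing or still set to the placeholder");
  }

  return { port, apiKey };
}
```

In a Node.js entry point this would typically be called as `const config = loadConfig(process.env);` after the `.env` file has been loaded.
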
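- Start the backend in development mode: run `cd backend`, then `npm run dev`
- Make changes to the extension code
- Reload the extension in Chrome (from `chrome://extensions`) to pick up the changes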
- Click the extension icon in Chrome
- Click "Capture Screen" to capture the current tab
- Optional: Add specific instructions in the text input
- Wait for processing and view results
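
The steps above can be sketched roughly as follows. `chrome.tabs.captureVisibleTab` is a real Chrome extension API, but everything else is an assumption made for illustration: the function names, the `/analyze` endpoint, the JSON field names, and the use of the promise form of the API (available to Manifest V3 extensions) are not confirmed by this README. Capturing a tab also requires a suitable permission such as `activeTab`.

```javascript
// Hypothetical sketch of the capture flow; not the extension's actual code.

// Clamp a user-facing quality setting to the 0-100 range accepted for JPEG output.
function buildCaptureOptions(quality = 80) {
  const numeric = Number.isFinite(quality) ? quality : 80;
  const clamped = Math.min(100, Math.max(0, Math.round(numeric)));
  return { format: "jpeg", quality: clamped };
}

// Capture the visible part of the current tab and send it to the backend.
// `backendUrl` and the request body shape are assumptions.
async function captureAndAnalyze(instructions = "", quality = 80,
                                 backendUrl = "http://localhost:3000/analyze") {
  // Resolves to a data URL containing the encoded screenshot.
  const imageDataUrl = await chrome.tabs.captureVisibleTab(
    null, // null means the current window
    buildCaptureOptions(quality)
  );

  const response = await fetch(backendUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ image: imageDataUrl, instructions }),
  });
  if (!response.ok) {
    throw new Error(`Backend returned HTTP ${response.status}`);
  }
  return response.json();
}
```

Keeping the quality clamp in its own function means the adjustable quality setting can be tested without a browser, while `captureAndAnalyze` only runs inside the extension.
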
- Frontend:
  - HTML/CSS/JavaScript
  - Chrome Extension APIs
  - Modern UI components
- Backend:
  - Node.js/Express
  - Tesseract.js for OCR
  - Hugging Face Inference API
  - File type validation
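
How the backend validates file types is not specified here. One common approach, shown below purely as an assumed sketch, is to inspect the first bytes of an upload against known image signatures instead of trusting the file extension or the declared MIME type. `IMAGE_SIGNATURES` and `detectImageType` are hypothetical names.

```javascript
// Hypothetical sketch: detect PNG or JPEG data from its leading "magic" bytes.
const IMAGE_SIGNATURES = {
  "image/png": [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a],
  "image/jpeg": [0xff, 0xd8, 0xff],
};

// `bytes` can be a Uint8Array, a Node.js Buffer, or a plain array of numbers.
// Returns the matching MIME type, or null if no signature matches.
function detectImageType(bytes) {
  for (const [mimeType, signature] of Object.entries(IMAGE_SIGNATURES)) {
    const matches =
      bytes.length >= signature.length &&
      signature.every((expected, index) => bytes[index] === expected);
    if (matches) return mimeType;
  }
  return null;
}
```

An upload handler could reject the request when `detectImageType` returns `null`, before any OCR work starts.
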
- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
License: MIT License
Author: Prathamesh Kusalkar
Acknowledgments:
- Tesseract.js team
- Hugging Face team
- Chrome Extensions documentation