Consider hybrid text+OCR strategy #6

deepdive101 · 2025-01-30T15:54:49Z

Consider also giving the LLM the coarsely extracted text from the page, obtained with a traditional non-OCR PDF-to-markdown package such as pymupdf4llm or pdf2markdown4llm. The idea is that the rough text could help the LLM ground its OCR response. The prompt would then have to be updated to ask the LLM to consider both inputs. This could be exposed as an optional enum keyword argument, disabled by default.

You wouldn't have to bundle the traditional package itself as a hard requirement of aipdf. Users who want it can add it to their own package requirements. aipdf would just be the integrator.
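
To make the proposal concrete, here is a rough sketch of how the optional hint could be wired in. It is not aipdf's actual API: `TextHint`, `extract_rough_text`, `build_prompt`, and the prompt wording are all hypothetical names made up for illustration. The only real library call assumed is `pymupdf4llm.to_markdown(path, pages=[...])`, and it is imported lazily so the package stays an optional dependency.

```python
import importlib
from enum import Enum
from typing import Optional


class TextHint(Enum):
    """Hypothetical option naming the non-OCR extractor; NONE keeps today's behavior."""
    NONE = "none"
    PYMUPDF4LLM = "pymupdf4llm"


def extract_rough_text(pdf_path: str, page_index: int, hint: TextHint) -> Optional[str]:
    """Return coarse markdown for one page, or None when the hint is disabled."""
    if hint is TextHint.NONE:
        return None
    try:
        # Lazy import: the extractor is only needed if the user opts in.
        module = importlib.import_module(hint.value)
    except ImportError as exc:
        raise ImportError(
            f"text_hint={hint.value!r} needs the optional package {hint.value!r}"
        ) from exc
    # Assumption: the backend exposes to_markdown(path, pages=[0-based index]),
    # as pymupdf4llm does.
    return module.to_markdown(pdf_path, pages=[page_index])


def build_prompt(base_prompt: str, rough_text: Optional[str]) -> str:
    """Append the rough text to the OCR prompt so the LLM can consider both inputs."""
    if not rough_text:
        return base_prompt
    return (
        f"{base_prompt}\n\n"
        "A rough, non-OCR text extraction of the same page follows. "
        "Use it to check spelling, numbers, and reading order, "
        "but trust the image where the two disagree.\n\n"
        f"<rough_text>\n{rough_text.strip()}\n</rough_text>"
    )
```

With `TextHint.NONE` as the default, existing callers would see no change, and the extra package is only imported when someone explicitly asks for it.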

torrmal · 2025-01-31T19:50:45Z

This is a great idea. Something like a plugin that does the job before the LLM call?
Do you want to collaborate on this?
