Camelot does not delete temporary folders created during processing till the process exits - changed behavior since 0.11.0 #537

Open
imbellis opened this issue Jan 28, 2025 · 0 comments
Labels
bug Something isn't working

Version 1.0.0 appears to have changed the cleanup behavior: the temporary folders created during processing are now deleted only when the process exits.
This causes problems for long-running processes, because the number of folders keeps growing and eventually fills up the temp folder.

Looking through the code, this appears to be caused by a change to the TemporaryDirectory class in utils.py.
The reason for the change is not documented. A better solution might be an optional flag that controls this behavior.
Reverting to the older version of TemporaryDirectory breaks the code, perhaps because the created folder is referenced elsewhere in the handler class. Adding code that deletes the created folders immediately after parse() and _parse_page() return their output seems to work without breaking anything.
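
To make the optional-flag suggestion concrete, here is a minimal sketch of a context manager whose cleanup timing is switchable. `ScratchDirectory` and `defer_cleanup` are hypothetical names chosen for illustration; this is not camelot's actual TemporaryDirectory, and it does not address the handler references mentioned above.

```python
import atexit
import shutil
import tempfile
from pathlib import Path


class ScratchDirectory:
    """Hypothetical stand-in, not camelot's actual TemporaryDirectory."""

    def __init__(self, defer_cleanup: bool = False):
        # defer_cleanup=True keeps the folder until the interpreter exits;
        # defer_cleanup=False removes it as soon as the with-block ends.
        self.defer_cleanup = defer_cleanup
        self.name = None

    def __enter__(self) -> str:
        self.name = tempfile.mkdtemp()
        if self.defer_cleanup:
            # ignore_errors avoids a traceback at exit if the folder is already gone.
            atexit.register(shutil.rmtree, self.name, ignore_errors=True)
        return self.name

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        if not self.defer_cleanup:
            shutil.rmtree(self.name, ignore_errors=True)


# Default mode: the folder is gone once the block is left.
with ScratchDirectory() as workdir:
    Path(workdir, "page-1.pdf").write_bytes(b"placeholder")

print(Path(workdir).exists())  # False
```

With a flag like this, a long-running service could opt into immediate cleanup, while code that still needs the folder after the block could keep the deferred behavior.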

**Steps to reproduce the bug**

A standard call to read_pdf, for example:

    import camelot

    # filepath, page_num, input_region and flavor come from the calling code
    tables = camelot.read_pdf(
        filepath=filepath,
        pages=str(page_num),
        backend="pdfium",
        table_regions=[input_region],
        flavor=flavor,
    )
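
To confirm the growth independently of any particular PDF, one could snapshot the temp folder around repeated calls. The following is a minimal sketch: `temp_subdirs` and `leftover_dirs` are hypothetical helpers, the `mkdtemp` lambda is only a stand-in for the `read_pdf` call above, and it assumes the folders are created directly under `tempfile.gettempdir()`.

```python
import os
import shutil
import tempfile


def temp_subdirs() -> set[str]:
    """Return the names of the directories currently in the system temp folder."""
    with os.scandir(tempfile.gettempdir()) as entries:
        return {entry.name for entry in entries if entry.is_dir()}


def leftover_dirs(action, repeats: int = 3) -> list[str]:
    """Run `action` repeatedly and list temp folders that outlive the calls."""
    before = temp_subdirs()
    for _ in range(repeats):
        action()
    root = tempfile.gettempdir()
    return sorted(os.path.join(root, name) for name in temp_subdirs() - before)


# Stand-in for the real call: creates a folder and never removes it.
leaked = leftover_dirs(lambda: tempfile.mkdtemp(prefix="demo-leak-"))
print(len(leaked), "folder(s) left behind")

# Remove only the folders this demo created.
for path in leaked:
    if os.path.basename(path).startswith("demo-leak-"):
        shutil.rmtree(path, ignore_errors=True)
```

Other processes writing to the same temp folder can inflate the count, so the result is best read as an upper bound.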

**PDF**

The behavior is not specific to any particular PDF or data.

**Environment**

I would expect the same behaviour on all operating systems.
- OS: tested on Ubuntu, RHEL and Windows WSL
- Python version: 3.12.3
- Numpy version: 2.1.2
- OpenCV version: 4.10.0
- Ghostscript version: 0.7
- camelot version: 1.0.0
