Memory leak in Writer? #4
Comments
I managed to trim down the script a good bit -- it turns out writing data is unnecessary, the leaks happen just creating writers: https://gist.github.com/JohnEmhoff/55f562c2de701dfb426643a3e7751ef8
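For anyone who wants to check this on their own machine, here is a minimal sketch of how such a measurement could look. It is not the script from the gist. The helper `peak_rss_per_iteration` and the factory `make_writer` are hypothetical names introduced for illustration, and the schema string is an arbitrary example. It assumes a Unix-like system (the `resource` module is not available on Windows) and, for `make_writer` only, an installed pyorc.

```python
import gc
import io
import resource


def peak_rss_per_iteration(make_object, iterations=5):
    """Call make_object() repeatedly and record the process's peak RSS
    after each call. ru_maxrss is a high-water mark, so the samples can
    only stay flat or grow; steady growth across iterations suggests
    that memory is not being released.
    """
    samples = []
    for _ in range(iterations):
        obj = make_object()
        del obj          # drop the only reference
        gc.collect()     # rule out uncollected reference cycles
        # Units differ by platform: KiB on Linux, bytes on macOS.
        samples.append(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
    return samples


def make_writer():
    # Assumption: pyorc is installed. The schema is an arbitrary example,
    # not the spec used in the original report.
    import pyorc
    return pyorc.Writer(io.BytesIO(), "struct<a:int,b:string>")
```

Calling `peak_rss_per_iteration(make_writer, iterations=1000)` and comparing the first and last samples gives a rough picture of growth. Peak RSS is a coarse signal, so a flat result does not prove the absence of a leak.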
Thank you for reporting it. I think I successfully pinpointed the problem when Writer's constructor build an I'm still looking for the concrete source of the leak.
Thanks for looking into it. I think you're right -- I noticed that when my spec in the script above is just a column or two, it leaks much, much more slowly.
After bypassing the
I have the same problem. I tried to dig a bit, and it seems the source of the leak is the creation of multiple ColumnWriter objects (of any type: string, float or int).
Also, I noticed the stripe size is not being honored.
I found this recommendation: https://www.mail-archive.com/[email protected]/msg00225.html https://orc.apache.org/api/orc-core/org/apache/orc/impl/WriterImpl.html#writeIntermediateFooter--
I found a similar problem: I cannot flush content to the file manually, and the batch_size parameter of Writer seems to have no effect. Any solutions?
Hello! Thanks for pyorc; using it has been a pleasure so far, with the exception that we seem to be running into memory issues. I think Writer is leaking memory? Our workload is roughly:
Memory usage will grow without bound between iterations. This, coupled with the fact that lowering the stripe size all the way down to 1M has no effect, makes me suspect a memory leak. Below is a script that will reproduce it -- around iteration 10 it gets to 20G and is then killed by the OOM killer on my machine. Let me know if there's anything I can do to help track it down!
https://gist.github.com/JohnEmhoff/274f6e05cba3f17a16683eb394bfe6b5