Huge max_heap_size compared to the actual data size #1245
Comments
Happy to look into it. I suspect that using persistent term will help. Overall, though, for bulk data like time series data the fact remains that there is going to be a fair bit of overhead, both in processing and memory, compared to raw JSON. Absinthe is going to type annotate and type check every single value, and if you want to move thousands of values, that adds up.
Overall I would suggest that you treat bulk data transfers a bit like you'd treat images or binary data in GraphQL: link to them. We have a bulk data controller that takes a set of signed parameters and returns JSON. Then in our GraphQL we return links to those endpoints with the signed params embedded.
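A minimal sketch of the signed-link idea, assuming a hypothetical `SignedLink` module, a hard-coded secret, and an illustrative `/bulk/timeseries` path (none of these come from the thread, and this is not necessarily how the controller mentioned above is built). It signs the query string with an HMAC so the bulk endpoint can check that the parameters were issued by the GraphQL layer:

```elixir
defmodule SignedLink do
  # Hypothetical helper for illustration. The secret is hard-coded only to
  # keep the sketch self-contained; real code would load it from config.
  @secret "replace-with-a-real-secret"

  # Builds a relative URL whose query string carries an HMAC signature.
  def build(path, params) do
    query = URI.encode_query(params)
    path <> "?" <> query <> "&sig=" <> sign(query)
  end

  # Recomputes the signature over everything before "&sig=" and compares.
  # Note: `==` is not a constant-time comparison; production code should use one.
  def valid?(url) do
    %URI{query: query} = URI.parse(url)

    case String.split(query || "", "&sig=") do
      [signed_part, sig] -> sign(signed_part) == sig
      _ -> false
    end
  end

  defp sign(data) do
    :crypto.mac(:hmac, :sha256, @secret, data)
    |> Base.url_encode64(padding: false)
  end
end

# A resolver would return this URL instead of the bulk data itself.
url = SignedLink.build("/bulk/timeseries", series_id: 42, from: "2021-01-01")
```

The bulk controller would call `SignedLink.valid?/1` on the incoming request and return plain JSON, which keeps the large payload out of the GraphQL execution path entirely.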
Just wanted to voice that we are also running into this issue. We would love to be able to return large numbers of records through GraphQL, but these massive memory spikes make that impossible and we are having to move to other solutions. I don't understand the problem well enough to know if type annotation and checking is something that requires using as much memory as we are seeing. The sample app linked above using 50x the memory of the actual response seems excessive, but perhaps that's just what it is? Feels tempting to look for some memory wins here...
Hey @crayment. There is certainly no low-hanging fruit here. For something as simple as a JSON response of
Are you using the persistent term backend? That'd certainly be the first place to start from a practical standpoint.
Thanks for the response @benwilson512. Would you mind expanding a bit on how you expect using persistent term would reduce memory usage? The docs seem to talk about wins at compile time. Just trying to wrap my head around the trade-offs here a bit. If you have any recommended resources, I would really appreciate them!
Sure! To start with some stuff you probably already know, each BEAM process (hereafter, just 'process') has its own heap. If you send a value from process A to process B, it is copied into the heap of process B. This can pose some challenges for tools like Absinthe, where you have potentially very large data structures necessary to describe complex schemas, and you need those structures available in every HTTP request so that you can verify the incoming document. In a naive schema storage mechanism like
The exception to this sort of thing is data that is found in what is called the "constant pool". So if you have:
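A minimal sketch of that difference, assuming two hypothetical modules `ConstDemo` and `HeapProbe` (illustrative names, not part of Absinthe and not the snippet originally posted here). A list that arrives in a message ends up on the receiver's heap, while a compile-time literal returned from a function is read in place:

```elixir
defmodule ConstDemo do
  # Hypothetical module for illustration; not part of Absinthe.
  # The attribute is evaluated at compile time, so `big/0` returns a literal
  # stored with the module's code rather than a freshly built term.
  @big Enum.to_list(1..100_000)
  def big, do: @big

  # Builds an equivalent list at runtime, on the calling process's heap.
  def build, do: Enum.to_list(1..100_000)
end

defmodule HeapProbe do
  # Spawns a fresh process, lets it obtain a list via `obtain`, and returns
  # how many words its total heap grew by while holding that list.
  def growth(obtain) do
    parent = self()

    pid =
      spawn(fn ->
        {:total_heap_size, before_words} = Process.info(self(), :total_heap_size)
        value = obtain.()
        {:total_heap_size, after_words} = Process.info(self(), :total_heap_size)
        # `length(value)` keeps the list alive until after the second measurement.
        send(parent, {:growth, self(), after_words - before_words, length(value)})
      end)

    receive do
      {:growth, ^pid, words, _length} -> words
    end
  end

  # Obtains the list from another process, which forces a copy.
  def from_message do
    me = self()
    spawn(fn -> send(me, {:list, ConstDemo.build()}) end)

    receive do
      {:list, list} -> list
    end
  end
end

literal_growth = HeapProbe.growth(&ConstDemo.big/0)
copied_growth = HeapProbe.growth(&HeapProbe.from_message/0)
IO.puts("literal: #{literal_growth} words, copied: #{copied_growth} words")
```

The exact numbers depend on the runtime, but the copied case should grow by at least the size of the list (two words per element), while the literal case should stay close to zero.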
And you call
Naturally, then, the "old school" way of solving this problem was to transform an Absinthe schema into gigantic constants via macros like you see above. The problem is that this has a lot of practical limitations. It is incredibly easy to have some sort of dynamic function call, or an anonymous function, or any number of other things that basically prevent the structure from actually living in the constant pool and instead require copying. E.g.:
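As a hedged illustration of that limitation (a hypothetical `TypeDemo` module, not Absinthe code and not the snippet originally posted here), compare a map made only of compile-time values with one that has a single field filled in by a function call. `:erts_debug.same/2` is an internal debugging helper, used here only to check whether two results are the very same object in memory:

```elixir
defmodule TypeDemo do
  # Hypothetical module for illustration; not Absinthe code.

  # Every part of this map is known at compile time, so the whole map
  # is expected to be stored as a literal.
  def static do
    %{name: "user", fields: [:id, :email]}
  end

  # One field comes from a function call, so the map is built again on
  # every call, even though the call always returns the same value.
  def dynamic do
    %{name: "user", fields: __MODULE__.default_fields()}
  end

  def default_fields, do: [:id, :email]
end

# The two maps are equal as values...
equal = TypeDemo.static() == TypeDemo.dynamic()

# ...but only the static one is handed out as the same object each time.
static_shared = :erts_debug.same(TypeDemo.static(), TypeDemo.static())
dynamic_shared = :erts_debug.same(TypeDemo.dynamic(), TypeDemo.dynamic())
```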
This is no longer a constant because its contents depend on the results of a function call. It doesn't matter if
The magic of
So to put it all together, the
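For reference, a minimal sketch of how Erlang's `:persistent_term` module behaves, assuming a hypothetical `SchemaStore` module and a made-up map standing in for a schema (this is not Absinthe's actual backend). A term built at runtime is stored once, and other processes can then read it without it being copied onto their heaps:

```elixir
defmodule SchemaStore do
  # Hypothetical stand-in for a schema backend; not Absinthe's implementation.
  @key {__MODULE__, :schema}

  # Copies the term into the global persistent term storage once.
  def put(schema), do: :persistent_term.put(@key, schema)

  # Returns the stored term without copying it to the caller's heap.
  def get, do: :persistent_term.get(@key)

  # Reads the schema in a fresh process and returns
  # {heap growth in words, number of entries seen}.
  def read_in_new_process do
    parent = self()

    pid =
      spawn(fn ->
        {:total_heap_size, before_words} = Process.info(self(), :total_heap_size)
        schema = get()
        {:total_heap_size, after_words} = Process.info(self(), :total_heap_size)
        send(parent, {:read, self(), after_words - before_words, map_size(schema)})
      end)

    receive do
      {:read, ^pid, words, size} -> {words, size}
    end
  end
end

# Built at runtime, so it could never live in a module's constant pool.
schema = Map.new(1..50_000, fn i -> {i, "type_#{i}"} end)
SchemaStore.put(schema)

{growth_words, size} = SchemaStore.read_in_new_process()
IO.puts("read #{size} entries, heap grew by #{growth_words} words")
```

The trade-off is on the write side: replacing or erasing a persistent term is expensive, so this fits data that is written rarely and read on every request.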
Environment
Actual behavior
A few days ago our application started getting killed with an `Out of memory` error.
What we found is that for some queries, even if the size of the response is small, the memory consumed by the process that handles the GraphQL query is huge. When multiple such queries happen concurrently, the RAM usage jumps from 1 GB to over 6 GB. As a temporary solution we set the `:max_heap_size` flag in the resolver before processing the query, so these queries can be killed instead of killing the whole application.
Here is a project that reproduces the issue: https://github.com/IvanIvanoff/absinthe_memory
In this project, we observe that the HTTP response is only 828 kB, yet Absinthe peaks at over 50 MB of RAM usage when processing it.
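The `:max_heap_size` workaround mentioned above can be sketched roughly as follows. The `HeapLimit` module and the limit value are hypothetical, and the sketch runs the work in a separate monitored process so the effect is easy to observe, whereas in our application the flag is set directly in the process that runs the resolver. The size is given in words, not bytes:

```elixir
defmodule HeapLimit do
  # Hypothetical helper for illustration. Runs `fun` in a separate process
  # whose heap is capped at `max_words`; the process is killed if a garbage
  # collection finds the heap above that limit.
  def run(fun, max_words) do
    parent = self()

    {pid, ref} =
      spawn_monitor(fn ->
        Process.flag(:max_heap_size, %{size: max_words, kill: true, error_logger: false})
        send(parent, {:result, self(), fun.()})
      end)

    receive do
      {:result, ^pid, value} ->
        Process.demonitor(ref, [:flush])
        {:ok, value}

      {:DOWN, ^ref, :process, ^pid, reason} ->
        {:error, reason}
    end
  end
end

# Small work fits under the limit.
small = HeapLimit.run(fn -> Enum.sum(1..10) end, 100_000)

# Building a list of a million elements needs far more than 100_000 words.
large = HeapLimit.run(fn -> Enum.map(1..1_000_000, &(&1 * 2)) end, 100_000)
```

Here `small` is `{:ok, 55}` and `large` is `{:error, :killed}`: only the offending process dies, which is the behavior we rely on to protect the rest of the application.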
Expected behavior
The peak memory usage should not be so big.