Storage capacity estimates #1653
Comments
I did more analysis. I used the following data:

I assumed that the estimated usage corresponds only to the usage of the table itself (i.e., the estimated disk usage of indexes is zero), and came up with the following table showing how much we are underestimating per content item. A few notes:

Note: I ignored storage used by the beacon network as it seems significantly smaller in comparison.
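For reference, one way to check the table-vs-index split directly is SQLite's `dbstat` virtual table, which reports page usage per table and per index. A minimal sketch, assuming `rusqlite`, a SQLite build compiled with `SQLITE_ENABLE_DBSTAT_VTAB`, and an illustrative database path (not trin's actual location):

```rust
use rusqlite::Connection;

fn main() -> rusqlite::Result<()> {
    // Illustrative path; not trin's actual database location.
    let conn = Connection::open("trin.db")?;

    // `dbstat` reports page usage per table and per index, which shows how
    // much of the file is consumed by indexes vs the tables themselves.
    let mut stmt = conn.prepare(
        "SELECT name, SUM(pgsize) AS bytes FROM dbstat GROUP BY name ORDER BY bytes DESC",
    )?;
    let rows = stmt.query_map([], |row| {
        Ok((row.get::<_, String>(0)?, row.get::<_, i64>(1)?))
    })?;
    for row in rows {
        let (name, bytes) = row?;
        println!("{name}: {bytes} bytes");
    }
    Ok(())
}
```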
Based on the information from the previous message, I think we can draw the following conclusions:
Based on everything posted so far, I'm working on an estimate of the extra disk space needed per content item, so that we don't overestimate usage for long-running clients.
Hm, I'm not so sure we want autovacuum on by default. I get that there are situations where being able to run vacuum would be important, like when restarting trin with a lower db size limit. But maybe we should do a manual vacuum at that point, rather than have autovacuum all the time and significantly increase IOPS & DB wear.
We are wrongly estimating our storage capacity usage. The goal of this issue is to enumerate all known issues on this topic, so we can start a discussion on how to address them.
We underestimate how much space is used
Currently, we estimate how much disk space is used by adding the lengths of `content_id`, `content_key`, and `content_value`. However, there is extra overhead. I strongly believe that this overhead depends on the number of stored items, not on their size. Because the average size of content on the state network is a fraction of the average size of content on the history network (see data below), our estimate is significantly worse for state network data.
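As a rough way to measure this overhead, the sum-of-lengths estimate can be compared against the actual file size (pages × page size). A minimal sketch, assuming `rusqlite` and a hypothetical table `content_data` with the three columns above (not necessarily trin's actual schema or database path):

```rust
use rusqlite::Connection;

fn main() -> rusqlite::Result<()> {
    // Illustrative path and table/column names; not necessarily trin's schema.
    let conn = Connection::open("trin.db")?;

    // Actual size of the database file: pages * page size.
    let page_count: i64 = conn.query_row("PRAGMA page_count", [], |r| r.get(0))?;
    let page_size: i64 = conn.query_row("PRAGMA page_size", [], |r| r.get(0))?;
    let actual = page_count * page_size;

    // The naive estimate: sum of content_id + content_key + content_value lengths.
    let estimated: i64 = conn.query_row(
        "SELECT COALESCE(SUM(LENGTH(content_id) + LENGTH(content_key) + LENGTH(content_value)), 0) \
         FROM content_data",
        [],
        |r| r.get(0),
    )?;
    let items: i64 = conn.query_row("SELECT COUNT(*) FROM content_data", [], |r| r.get(0))?;

    println!("actual: {actual} bytes, estimated: {estimated} bytes");
    if items > 0 {
        println!("overhead per content item: {} bytes", (actual - estimated) / items);
    }
    Ok(())
}
```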
I extracted the data from 3 different types of prod nodes:

- History nodes (4444 nodes) - actual disk usage: 36,183 MB
- State nodes - actual disk usage: 54,198 MB
- Big storage nodes (21m state) - actual disk usage: 639,759 MB

We can observe that actual usage in nodes that have both history and state content is significantly worse (up to 60% extra) than when we only have the history network (only ~3% extra).
Beacon network declares that it doesn't need any space
While beacon clearly uses some disk space, it's unclear how much. Also, it never reserves space from the capacity provided via the flag (meaning we just assume it uses zero and distribute the specified capacity between history and state).

I would also highlight that I didn't investigate whether beacon correctly reports its usage (I know there are multiple tables, and I don't know whether we are actually reporting stats from all of them).
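To illustrate the current behavior, here is a hypothetical sketch of the capacity split described above; the function and the 50/50 ratio are purely illustrative, not trin's actual code:

```rust
// Hypothetical illustration (not trin's actual code): the capacity provided
// via the storage flag is split between history and state only, while beacon
// is assumed to reserve zero bytes even though it does write to disk.
fn distribute_capacity(total_mb: u64) -> (u64, u64, u64) {
    let beacon_mb = 0; // beacon never claims any of the configured capacity
    let history_mb = total_mb / 2; // illustrative 50/50 split
    let state_mb = total_mb - history_mb - beacon_mb;
    (history_mb, state_mb, beacon_mb)
}

fn main() {
    let (history, state, beacon) = distribute_capacity(1000);
    // Whatever beacon actually writes is on top of this budget, so real disk
    // usage can exceed the configured limit.
    println!("history: {history} MB, state: {state} MB, beacon: {beacon} MB");
}
```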
Disk usage never goes down
Something to be tested, but I'm pretty sure that disk usage never goes down, even if we remove a lot of content at once.

My understanding is that SQLite keeps free database pages around so it can reuse them later.
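A quick way to verify this is to compare the total page count with the free-list page count; a minimal sketch using `rusqlite` and an illustrative database path:

```rust
use rusqlite::Connection;

fn main() -> rusqlite::Result<()> {
    // Illustrative path; not trin's actual database location.
    let conn = Connection::open("trin.db")?;

    // Pages in the file vs pages sitting on the freelist: freed pages are kept
    // for reuse and are not returned to the filesystem without a VACUUM.
    let page_count: i64 = conn.query_row("PRAGMA page_count", [], |r| r.get(0))?;
    let freelist: i64 = conn.query_row("PRAGMA freelist_count", [], |r| r.get(0))?;
    let page_size: i64 = conn.query_row("PRAGMA page_size", [], |r| r.get(0))?;

    println!(
        "file size: {} bytes ({} bytes of free pages)",
        page_count * page_size,
        freelist * page_size
    );
    Ok(())
}
```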
Forcing the database to free space can be done either by running the `VACUUM` command (which I believe requires free disk space of up to twice the database size), or by setting `auto_vacuum`, which cleans up database usage as it goes (but which, I believe, only becomes active on a new DB or after being followed by a `VACUUM` command).
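A minimal sketch of both options using `rusqlite` (illustrative database path); note that on an existing database the `auto_vacuum` setting only takes effect after a subsequent `VACUUM`:

```rust
use rusqlite::Connection;

fn main() -> rusqlite::Result<()> {
    // Illustrative path; not trin's actual database location.
    let conn = Connection::open("trin.db")?;

    // Option A: a one-off VACUUM rebuilds the file and returns free pages to
    // the OS. It needs enough free disk space for a temporary copy of the
    // database while it runs.
    conn.execute("VACUUM", [])?;

    // Option B: auto_vacuum reclaims pages as they are freed. On an existing
    // database the setting only takes effect after the next VACUUM; on a
    // freshly created database it applies immediately.
    conn.pragma_update(None, "auto_vacuum", "FULL")?;
    conn.execute("VACUUM", [])?;

    Ok(())
}
```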