lvcreate: --quiet doesn't suppress overallocation warning. #57
Comments
My own use case is that I make a large number of snapshots, which eventually trigger all of these warnings even though the space actually used is nowhere near a size that would endanger the pool. When I am done with a particular task, all the snapshots are deleted, leaving the base volume. I don't need any of these warnings and would like them all suppressed. I know of other projects and users who would like to suppress this warning as well, along with the other three that are already suppressible via lvm2/lib/metadata/thin_manip.c Lines 413 to 428 in f70d97b
As you can see, the first warning is logged using a different method than the other three.

I have read what I think are all of the historical mailing list threads and forum threads on this particular topic. I have found one proposed fix that appears to be optimal for my use case: adding an envvar. As stated in that proposal, there are two precedents for this type of suppression that are already implemented in LVM2.

Here are two historical threads about this feature request from the mailing list:
Here is an issue in the bugtracker at Red Hat (unfortunately marked WONTFIX):
Here is a forum thread over at Proxmox tracking this issue downstream:

I posted about all of this to the lvm-linux mailing list, and there is interest in this change from the Qubes OS project and @DemiMarie:

I am now going to suggest that all users and projects that want this change add a 👍 to this issue so that there is a measurement of interest. |
I've done a bit more digging. These are the locations where messages/warnings are suppressed elsewhere:
LVM_SUPPRESS_LOCKING_FAILURE_MESSAGES — Lines 128 to 157 in 8dccc23
LVM_SUPPRESS_FD_WARNINGS — Lines 3550 to 3586 in 8dccc23
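For reference, this is how the two existing suppression variables are used from a shell, together with the kind of variable proposed here (the proposed name is purely hypothetical and does not exist in LVM2 today):

```sh
# Existing suppression variables in lvm2:
LVM_SUPPRESS_FD_WARNINGS=1 lvs                         # hides "File descriptor ... leaked" warnings
LVM_SUPPRESS_LOCKING_FAILURE_MESSAGES=1 vgchange -ay   # hides locking-failure warnings

# Proposal only (hypothetical name, not implemented): an analogous variable
# that would silence the "Sum of all thin volume sizes ... exceeds the size
# of thin pool ..." warning for users who have deliberately opted out.
#LVM_SUPPRESS_POOL_OVERPROVISIONING_WARNINGS=1 lvcreate -s vg/thinvol -n snap1
```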
|
Would this change do the trick?
Is an envvar available in this way to |
Interesting. This issue has 41 upvotes on the issue submission comment, so it looks like it has fairly wide appeal. Anyone interested in taking a crack at getting this done? |
Hello. As a first step I would change the warning to give better advice or be less dramatic. The entire point of thin pools is to allow larger volumes to coexist in a smaller space than the total allocated size. All the warning is really saying is "you're going to run out of space in your thin pool if your volumes exceed the space of the thin pool". Here is one discussion that demonstrates the confusion

There is also the warning "WARNING: You have not turned on protection against thin pools running out of space.", which suggests that you have to do that, and it is also not clear how the system will handle running out of space. Gracefully, I hope? Or will all the VMs crash horribly all at once when they try to write into empty space that doesn't exist? In most cases I think thin pools are going to take all the space on the disk, so autoextend won't help anyway?

I would prefer more instructive language: "If you use more space than there is in total in your thin pool, bad things (specify) will happen. If you have extra space in your volume group, you can allow the thin pool to autoextend."

Also, maybe autoextend should be enabled by default for new thin pools? |
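For readers looking for the relevant knob: thin-pool autoextension is an lvm.conf policy rather than a per-volume flag. A minimal sketch, with illustrative threshold/percent values:

```sh
# Show the current autoextend policy (a threshold of 100 means "never autoextend"):
lvmconfig activation/thin_pool_autoextend_threshold
lvmconfig activation/thin_pool_autoextend_percent

# To enable autoextension, set something like this in /etc/lvm/lvm.conf
# (values are only examples) and make sure dmeventd monitoring is active:
#   activation {
#       thin_pool_autoextend_threshold = 80   # extend once the pool is 80% full
#       thin_pool_autoextend_percent   = 20   # grow the pool by 20% each time
#   }
```

Note that autoextension only helps while the volume group still has free space, which is exactly the caveat raised above.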
It would first be good to understand that 'running out of space' in a thin pool is nowhere near the same as running out of space in your filesystem - it can get considerably worse for the user of a thin pool, who can lose a large amount of data (hopefully users do make their backups...). So yes - the message should 'scare' the user, so they do not assume this is a state to run into on a daily basis with an 'easy-to-fix' outcome, because it simply is not. So we try to kindly push users towards some 'monitored' usage of the thin pool, so that there is at least log evidence that a possible disaster is coming. It is usually not complex to 'recover' the thin pool itself so that it starts again, but it is much harder to fix the consistency of all the data stored within the thin pool.

So if there were a trivial knob to disable all these warnings - we are pretty sure many distros would pick the easiest path and set it as 'a new default' - and then we would just get requests for recovery help and be blamed for 'unexpected' data loss...

One idea we could think about here is to actually detect the 'real' state of the thin pool and only emit this warning once we are past a certain threshold (70%, maybe 80%) - although for small volumes this may simply not be early enough... But at least there would be no messages for nearly empty pools...

But the main point is: a thin pool is NOT supposed to be operated at 'full pool state' - this should always be prevented! |
Is that still true with |
Does LVM "do the right thing" when a thin pool runs out of space ? I would imagine, like previous generations, that running out of space would be treated as a failing block device when the last free extent is filled up. I think it should send a signal similar to a failing hard drive and remount all associated file systems to read only. I notice the warning suggest to enable "auto extend" the thin pool. In my case, there is never any space left to extend the thin pool into, my basic systems only have had one thin pool that uses all space left in the volume group. But is there a reason not to have auto extend enabled by default if there were free space available in the volume group ? Do most users prefer to crash their system rather than step over the empty space left in their volume group if any ? And if you have auto extend enabled, but you also use up all the space left in your volume group, then what happens ? Does it break worse than your thin pool running out of space internally ? These are all the question that this warning make me ask. That I imagine any responsible system admin will have to find the answer to on their own when confronted with that warning. Frankly, that's a lot of extra research that you don't expect to have to do when you were just trying to setup partitions, it could I would like the reassurance that "LVM does the right thing in all cases" and I think that means "every filesystem flips to read only as it tried to write into new extents that you don't have, your write call fails but existing data is not lost". Also I would expect LVM to have a kind extent reserve system, a kind of landing strip that starts to have serious alarm bells when you start using the reserve. Something similar to "tune2fs -m " but for extents. And imagine the following case that could happen to anyone, one of your VM has a bug and it starts to write bug data at 1GB/s until something breaks. Any overprovisionned system will fail (that's everyone using thinpool), auto extend will not help. EDIT: I ran this whole discussion through chatgpt, which you can review here Its responses are
So I take it the warning can be dealt with in two ways: "ignore it, or build your own LVM2 monitoring infrastructure; either way, running out of space will fail the last write and go read-only like any broken hard drive". Can anyone actually knowledgeable on the topic back these answers up, for the future admins who end up on this page while searching for the warning? |
This option only changes the behavior to 'instant' erroring - there is no delay and you get 'write' errors as soon as they happen - otherwise there is by default a 60-second window, so lvm2 has plenty of time to extend the data or metadata volume. Now let me describe a couple of issues here - where people do not realize how problematic an out-of-space thin pool is. |
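For context, the 'instant' erroring mentioned here is the --errorwhenfull switch; a sketch of setting and inspecting it (the vg/thinpool name is illustrative):

```sh
# Return I/O errors immediately when the pool has no free space, instead of
# queueing writes for up to 60 seconds while lvm2 tries to extend the pool:
lvchange --errorwhenfull y vg/thinpool

# Check the current behaviour ("error" vs "queue"):
lvs -o lv_name,lv_when_full vg/thinpool
```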
Thank you! It finally makes sense why it's bad to let it run out of space.

Any chance of changing it so that all writes fail when it's out of space, instead of only the writes that need the space? (That would solve the consistency and data loss issues, right?) Or would checking for free space before all writes introduce too much of a performance hit?

Would it work to have an atomic int somewhere in memory that's initialized to zero, checked on all writes, and flipped to non-zero by any thread that detects an out-of-space condition? That might allow some blocks to get written that shouldn't be, because of the race condition, but hopefully that can be addressed by using more expensive locks to handle sync calls? Or maybe there could be an atomic int that represents when the thin pool usage is over ~95%, and when it's set, more expensive locks are always used. That way pools <95% full pay a minimal performance penalty, and pools >=95% full pay the performance penalty but get the data consistency benefit.

I'm not familiar with Linux's block device interface, but I'd be happy to take a look and try to come up with something if you want. If not, any chance of updating the lvmthin man page to explain what you described about some writes erroring while others go through "successfully"? |
I'm glad there is some movement on this issue. I also want to point out that I am not asking for warnings to be removed or changed, nor do I want to argue about the validity of the warnings. What I see in the source code is that some of the warnings are logged via a function that allows them to be silenced, while the first warning cannot be silenced. My main goal is simply to have an end-user option to silence this warning. I am 1000% OK with this option being off by default so that the current behavior is retained, while I am still able to control it myself. Additionally, I proposed using an environment variable, but I'm not a project maintainer; I defer to whatever solution the maintainers feel is the correct way to add a knob for silencing this warning as a user option. |
I think we should remove the warnings about overprovisioning and just replace them with improved warnings about thin pools running low on space, at percent-full levels that are configurable by the user, e.g. WARNING: thin pool "foo" has reached 90% full, and will be autoextended at 95% full. I suggested something like this back in 2017 in https://bugzilla.redhat.com/show_bug.cgi?id=1465974 |
I love any solution that you think is appropriate as long as the warning that says "WARNING: Sum of all thin volume sizes (%s) exceeds the size of thin pool%s%s%s (%s)." can be suppressed by a configuration change or other appropriate user option. |
No
This is not going to work for our needs - we need to be sure we provide a warning for a user who uses the 'defaults', activates an empty pool, and eventually fills the pool completely on the first activation. However - after looking at our option list - this could possibly work when the --monitor option is explicitly listed on the cmdline: lvcreate --monitor n -T.... This might be a way - the lvm2 code can skip the message when the user intentionally created an unmonitored volume - and at the same time we have metadata logged with content recording the user's 'opt-out'. |
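Spelled out a bit more fully, assuming a VG named vg and illustrative sizes, the opt-out command would look like this:

```sh
# Create a thin pool with monitoring explicitly disabled; under the idea above,
# this explicit opt-out is what would let lvm2 skip the over-provisioning warning.
lvcreate --monitor n -L 100G -T vg/thinpool

# Thin volumes carved out of it can then be over-provisioned as usual:
lvcreate -V 1T -T vg/thinpool -n thinvol
```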
I agree that warnings should absolutely not be removed. My original concern, way up top in my first comment, is that for some reason this one single warning is not wrapped in |
You're talking about warning the user when a thin pool is created or activated without monitoring (and --monitor n can silence the warning). That's reasonable, and I included something similar in my list of suggestions ("will need manual extension" is about the same as "not monitored"). So, I think that a rough plan would be:
|
WARNING: thin pool "foo" is not monitored and will require manual lvextend (currently N% full.) This message could be printed from lvcreate, lvchange -ay, and lvs. Maybe --ignoremonitoring could silence the warning for all of these? WARNING: VG "vg" has insufficient space for the next autoextend of thin pool "foo" (currently N% full.) This message about insufficient VG space could be printed from the same commands. |
Hi, this is tangentially related and for the people who end up on this thread from Google. The Webmin dashboard doesn't see it; df -h doesn't see it. The way to find out is this command:
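A command that reports thin-pool usage (the one the poster meant is presumably something along these lines):

```sh
# Data% and Meta% show how full the thin pool (and each thin volume) really is:
lvs -a -o lv_name,lv_attr,lv_size,data_percent,metadata_percent
```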
And for something as important as this appears to be, it's a surprisingly obscure command to find! I'm saying that because if there MUST be such a warning about thin pools getting overfilled, maybe it should include the command to check at that point? And also, why isn't there a simple command to check that? (
) Or better yet, since LVM is a core Linux component, why doesn't df tell you about the space left in the thin pool underneath mounted filesystems? That was the first place I checked, and probably the first place new LVM users would check! Lastly, now I am kind of curious to see what's going to happen to this system if I bust the thin pool (it's a test system anyway). |
Clearly, both tools are unrelated to 'lvm2' - so if you want to see this info in 'webmin', you would need to ask the authors of that project. And the 'df' tool is 'filesystem' specific - so it's not its job to e.g. report disk errors, an out-of-space thin pool, or a broken raid1 leg - these are essentially the same type of errors.
Yep - this is the correct command for working with 'LVs' - in a similar way, when you work with md raid you use 'mdadm', and for encrypted DMs we have the cryptsetup tool.
Well, we were thinking about possibly adding 'colors' in colorized terminals - or maybe a '!' prefix on the LVs that need some attention - but these were just ideas; nothing has materialized yet...
There is surely a warning in journal/syslog - but I admit there are not many admins checking it...
lvs has tons of options. So e.g. you could have it select and show all thin pools with a usage percentage higher than X...
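For example, a selection of that kind (the 80% threshold is just an illustration):

```sh
# Show only LVs whose data usage is above 80% (thin pools, thin volumes, snapshots, ...):
lvs -S 'data_percent > 80' -o lv_name,vg_name,data_percent
```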
lvdisplay is the 'older' version of the more universal 'lvs' tool - lvs is fully configurable and can be used in any scripting without complicated parsing - you can just select columns & format and even get json output...
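For example (the VG name is illustrative; JSON output needs a reasonably recent lvm2 build):

```sh
# Pick exactly the columns you want, with no headings - easy to consume from a script:
lvs --noheadings -o lv_name,data_percent vg

# Or ask for structured output instead of parsing text:
lvs --reportformat json -o lv_name,data_percent vg
```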
lvm2 manages the block device layer; df reports on the 'filesystem layer' (and 'btrfs' can tell stories...). So you are mixing apples and oranges - 'df' has no way to interpret 'free space' in a thin pool. And strictly speaking, 'df' is lying about free space in the filesystem all the time anyway ;) - as you can have holes in files..... |
For reference, here is the kind of damage that running a thin pool out of space causes
|
Is there any way to suppress the warning below? I only want to see errors.
In case it helps: