Sometimes misses drive detection with lots of drives #1133
Comments
I am trying to reproduce this outside of Cockpit with a repr.sh script (it has to end in .txt so that I can attach it here -- please rename). But so far no luck -- it does fail after a few iterations, but on a different issue. So this needs something more, perhaps because Cockpit is connected to udisks and has signal listeners, etc.? I'll try to refine this and report back once I have a working shell reproducer. Update: I ran
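For context, "connected to udisks with signal listeners" means roughly the following: Cockpit keeps a system-bus connection open and reacts to udisks' ObjectManager signals. A minimal stand-alone listener in that spirit (just a sketch, assuming PyGObject is available -- this is not the code Cockpit actually runs) could look like:

```python
# Sketch of a udisks signal listener, similar in spirit to what Cockpit keeps open.
# Assumes PyGObject (python3-gobject); counts drive objects as udisks announces them.
from gi.repository import Gio, GLib

added = 0

def on_interfaces_added(conn, sender, path, iface, signal, params, *user_data):
    global added
    object_path = params.unpack()[0]
    if object_path.startswith("/org/freedesktop/UDisks2/drives/"):
        added += 1
        print(f"drive {added}: {object_path}")

bus = Gio.bus_get_sync(Gio.BusType.SYSTEM, None)
bus.signal_subscribe("org.freedesktop.UDisks2",              # sender
                     "org.freedesktop.DBus.ObjectManager",   # interface
                     "InterfacesAdded",                      # signal
                     "/org/freedesktop/UDisks2",             # object path
                     None, Gio.DBusSignalFlags.NONE,
                     on_interfaces_added)
GLib.MainLoop().run()
```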
udisks bug report: storaged-project/udisks#1133 Known issue cockpit-project#4942
One potential issue at first sight: it takes a while until udisksd processes all uevents. There's a side probing thread that may potentially do additional I/O (e.g. CD-ROM drives). Even after [...]. What happens if you put [...]?
I forgot to mention -- it's not a race condition. I've kept the VM up in that state for an hour, and it didn't catch up. I can reload the Cockpit page or do udisksctl dump all I want, it's a stable state. Only restarting udisks helped.
Thanks, I don't like race conditions :-) Any chance to get ssh access to that machine? I'd be interested in: [...]
We've seen some [...] And perhaps the obvious: [...]
udisks and udevadm dumps are already linked from the description (the "good" and "broken" links). I reproduced the failure again and grabbed all four things. Unfortunately, uploading to this GitHub issue is broken right now, so I put them onto my server: https://piware.de/tmp/udisks-info/ I'm happy to give you ssh access to that machine. It's a VM running on my laptop, so it requires some port forwarding, but that can be arranged. Please send me your public SSH key, and let's coordinate on Slack or so? (My laptop isn't always on, and I'll be working on a train tomorrow afternoon.) Alternatively, I can talk you through running the test on your machine, in a container (it's quite a lot of stuff to download, but no danger to your OS or home dir).
I tried to run the daemon with [...]
With [...]
but that's not very useful, and unrelated to this bug (I suppose). I played around further with repr.sh, but no luck so far. I updated the comment above with the results. So I am trying my luck with catching it from the other side, and modifying our integration test. That does the following steps: [...]
I tried to vary this. "stable" means it survived 3 parallel runs of 10 iterations each. "sleep" means adding [...]
Current theory: it happens when connecting to udisks while events are going on. I tried to simulate that with an updated [...] There must be something specific about Cockpit's udisks monitoring. The Storage page makes a single D-Bus connection to udisks (it also connects to Stratis). Right after connecting, it enables the LVM2 and iscsi modules, which also sounds significant. I'll play around with that; see the sketch after the diff below.
[1] diff, for posterity:
```diff
--- test/verify/check-storage-scaling
+++ test/verify/check-storage-scaling
@@ -17,6 +17,8 @@
 # You should have received a copy of the GNU Lesser General Public License
 # along with Cockpit; If not, see <http://www.gnu.org/licenses/>.
+import time
+
 import storagelib
 import testlib
@@ -27,9 +29,14 @@ class TestStorageScaling(storagelib.StorageCase):
         m = self.machine
         b = self.browser
+        self.login_and_go()
+        time.sleep(5)
         m.execute("modprobe scsi_debug num_tgts=200")
-        self.login_and_go("/storage")
+        b.go("/storage")
+        b.enter_page("/storage")
+        b.wait_visible("#devices")
+
         with b.wait_timeout(60):
             b.click("#drives button:contains(Show all 202 drives)")
             b.wait_not_in_text("#drives", "Show all")
```
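For the record, the "connect and immediately enable modules" step mentioned above boils down to something like this (a sketch only, not Cockpit's actual code; it assumes PyGObject and uses the Manager.EnableModules D-Bus method, which loads all available modules rather than just LVM2/iscsi):

```python
# Sketch: open a D-Bus connection to udisks and ask it to load its modules,
# roughly mimicking what the Cockpit Storage page does right after connecting.
from gi.repository import Gio, GLib

bus = Gio.bus_get_sync(Gio.BusType.SYSTEM, None)
bus.call_sync("org.freedesktop.UDisks2",
              "/org/freedesktop/UDisks2/Manager",
              "org.freedesktop.UDisks2.Manager",
              "EnableModules",
              GLib.Variant("(b)", (True,)),
              None,                     # don't check the reply type
              Gio.DBusCallFlags.NONE,
              -1,                       # default timeout
              None)                     # no cancellable
print("modules enabled")
```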
I reworked repr.sh a bit to be closer to what cockpit does: initially udisksd is not running, then it gets started and modules get enabled. But still no luck with reproducing it that way 😢 I need to stop here. Is there some other debugging that I could turn on during the test? You can also try to run it yourself. To make sure you don't have to change your running system, it's safest and easiest to do that in a toolbox with our devel/CI container: [...]
Inside, check out cockpit and build an arch image: [...]
Then open 2 or 3 terminals (for more parallelism, and failing faster), and run [...]
At some point this should fail and "sit": [...]
Then you can log into the VM using the provided command (the port number will vary depending on which parallel test failed). You can log in as user "admin" or "root"; both have the password "foobar".
udisks bug report: storaged-project/udisks#1133 Known issue #4942
(rather busy here at the moment, sorry, and PTO until Jul 17) Just another thing that I've noticed: [...]
For some reason, multipathd is kicking in.
@tbzatek: Very well spotted! There is no multipathd process running. However, there is /usr/lib/udev/rules.d/56-multipath.rules, which triggers a lot of processes like "/sbin/multipath -u sdbv". I tried:
```diff
--- test/verify/check-storage-scaling
+++ test/verify/check-storage-scaling
@@ -27,6 +27,7 @@ class TestStorageScaling(storagelib.StorageCase):
         m = self.machine
         b = self.browser
+        m.execute("rm /usr/lib/udev/rules.d/56-multipath.rules; udevadm control --reload")
         m.execute("modprobe scsi_debug num_tgts=200")
         self.login_and_go("/storage")
```
I have run it 3x parallel for 15 minutes now, and it eventually failed. It may have taken longer than usual.
However, that was already the case in my original lsscsi output, with multipath. What command did you run to produce the "sdc-mpathc" output? I don't see that anywhere in any of the dumps. If it was [...]
This now also affects Fedora 38.
Testing this on a F39 cloud image: [...]
Not good.
Btw. same thing as last time - if I start [...]
Weird... I suppose the next thing to look at is udev, which sits in the middle? I checked that the udev dump is identical in the good/bad situations, but it would be good to find out whether udev fails to send some uevents, or whether udisks fails to receive/handle some of them.
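One way to narrow that down from the outside (a sketch only -- it assumes the pyudev package, which is not part of the test setup): listen on both the kernel and the udev netlink groups while the modprobe burst happens, and compare how many block "add" events each side delivers.

```python
# Sketch: count block-device "add" uevents on the raw kernel netlink group vs.
# the post-processing udev group, to see on which side events go missing.
# Assumes pyudev is installed; start it right before "modprobe scsi_debug ...".
import threading
import pyudev

context = pyudev.Context()
counts = {"kernel": 0, "udev": 0}

def watch(source):
    monitor = pyudev.Monitor.from_netlink(context, source=source)
    monitor.filter_by("block")
    monitor.start()
    # Stop after 30 seconds without any further event.
    for device in iter(lambda: monitor.poll(timeout=30), None):
        if device.action == "add":
            counts[source] += 1

threads = [threading.Thread(target=watch, args=(s,)) for s in counts]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Equal counts would point at udisks losing events; a mismatch would point at udev.
print(counts)
```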
It's a performance issue mixed with what looks like a ring buffer on the kernel side, read through netlink. Reading the [...] Just putting a 100 ms sleep at the beginning of the udisks uevent processing chain results in about 90% uevent loss after a burst of about 250-300 uevents is read; the rest are likely being overwritten by other records in the ring buffer. Looking at the uevent sequence numbers after the buffer is exhausted, there are significant gaps. I also noticed a random order of incoming records (but that's generally not guaranteed anyway, and we're kind of ready for that). So we'll need a dedicated high-priority side thread that only reads and queues all incoming uevents. These are then put into a probing queue processed in another thread, and the results are finally sent to the main thread, where the appropriate D-Bus objects are created.
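For illustration only, here is a minimal Python sketch of that split (udisks itself is C/GLib, and all names below are made up): one thread does nothing but drain incoming uevents into an unbounded queue so the kernel-side buffer cannot overflow, a second thread does the slow probing, and the "main" side only consumes the results.

```python
# Toy model of the proposed design (hypothetical names, not udisks code):
# reader thread -> uevent queue -> prober thread -> result queue -> main side.
import queue
import threading
import time

uevent_queue = queue.Queue()   # unbounded: the reader never waits for the prober
result_queue = queue.Queue()

def reader():
    # Fast path: in udisks this would only read from the netlink socket and
    # enqueue, so the kernel ring buffer cannot fill up during a burst.
    for seqnum in range(300):                  # simulate a burst of 300 uevents
        uevent_queue.put({"seqnum": seqnum, "action": "add"})
    uevent_queue.put(None)                     # sentinel: end of burst

def prober():
    # Slow path: per-device probing (extra I/O) happens off the fast path.
    while (uevent := uevent_queue.get()) is not None:
        time.sleep(0.001)                      # stand-in for probing work
        result_queue.put(uevent)
    result_queue.put(None)

threading.Thread(target=reader, daemon=True).start()
threading.Thread(target=prober, daemon=True).start()

# "Main thread": this is where the D-Bus objects would be created; here we only count.
handled = 0
while result_queue.get() is not None:
    handled += 1
print(f"handled {handled} uevents without dropping any")
```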
Nice debugging! I'm not entirely sure if we aren't asking too much here. I added that test case back then because we got a bug report from users with a gazillion drives -- but those usually aren't being hotplugged all at the same time. We could certainly also rewrite the test to create the 200 drives first and then restart udisks, so that it treats them as "coldplug".
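That coldplug variant could look roughly like this inside the existing test method (a sketch only, not a tested change; it assumes that restarting udisks2.service makes it enumerate the already-present scsi_debug drives at startup):

```python
# Hypothetical "coldplug" variant: create the drives first, then restart udisks
# so it picks them up during startup enumeration instead of via a uevent burst.
m.execute("modprobe scsi_debug num_tgts=200")
m.execute("udevadm settle")                 # wait until udev has processed everything
m.execute("systemctl restart udisks2")      # udisks now coldplugs the drives
self.login_and_go("/storage")
with b.wait_timeout(60):
    b.click("#drives button:contains(Show all 202 drives)")
    b.wait_not_in_text("#drives", "Show all")
```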
Oh, no worries, the change will be reasonably small, I think. Sooner or later we'd hit the limit anyway, and it might actually have been the source of some random failures we've been seeing. The problem is that there's a lot going on in the main thread, and things would only get worse once new features are added.
Another good catch, addressed in #1224.
Great work @tbzatek! 💯
In cockpit we have a storage scaling integration test. It creates 200 SCSI devices/drives with modprobe scsi_debug num_tgts=200, and then checks that they all propagate correctly through the kernel → udev → udisks → D-Bus → cockpit chain. This test now fails very often on Arch Linux: it often shows not the expected 200 SCSI drives (plus the virtio root disk and the cloud-init ISO, thus 202 in total), but fewer, e.g. 188 or 193.
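The check this boils down to on the udisks side can be approximated with a few lines (a sketch, not the actual test code; it just counts drive objects in the udisksctl dump output):

```python
# Rough stand-alone check (not the real test): count the drive objects that
# udisks currently exports and compare against the expected 202.
import subprocess

dump = subprocess.run(["udisksctl", "dump"], capture_output=True, text=True).stdout
drives = [line for line in dump.splitlines()
          if line.startswith("/org/freedesktop/UDisks2/drives/")]
print(f"udisks knows about {len(drives)} drives (expected 202)")
```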
This seems to be a udisks bug. In the "good" case, udisks should pick up all 200 devices and drives: [...]
But when the test fails, it has fewer drives: [...]
The udisksctl dump diff between the good and the broken state shows that the device isn't properly connected to the drive, and the corresponding drive object is missing: [...]
However, it is all good in udev. The udevadm info --export-db diff between the good and broken situation looks identical aside from time stamps and minor reordering: [...]
So far we only see this on Arch Linux, not (yet?) on Fedora/Debian/RHEL.
Arch's udisks package has no patches. Fedora 38 has kernel 6.3.9 and systemd 253.3 (the same versions as Arch), so Arch is not really ahead here. So I'm not yet sure what the significant difference is; it might just be sheer timing luck.