-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CFE-4335: Watchdog to ignore stale pidfile #2841
CFE-4335: Watchdog to ignore stale pidfile #2841
Conversation
I don't have an easy way to test it, but in isolation the code seems to work. /proc exists at least on all Linux + Solaris, though I'm unable to find confirmation that it's required by POSIX. For the basic point of, "how old is this process?" (checking mtime with ls -t) it should work. Options to head and ls only include those specified by POSIX. Hat tip to https://stackoverflow.com/q/40205638/5419599 where I got the idea for this code. |
Does not work on MacOS, not sure how broadly watchdog is intended to work. |
Also, shouldn't the pidfile be in /var/run/ ? That exists on all modern unixes, including macos, and gets wiped out on boot which is the most common reason for stale pidfiles generally (existing past boot). Reference: https://www.pathname.com/fhs/2.2/fhs-5.13.html |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
@flynn1973 Can you confirm if this change improves the situation for you? |
a quick run of the changed script shows a missing "then" ;-)
error: root@aixtest01: /root # /var/cfengine/bin/watchdog_changed after adding it the missing statement the script runs without errors....all in all this change will definitely improve the situation..good idea. |
@flynn1973 Patched. |
@mikeweilgart do you want to squash these commits together and amend the commit body to include something like
|
a36d631
to
73d5463
Compare
@mikeweilgart thanks, one more thing, can you re-word the commit subject to be past tense? Something like "Made AIX watchdog ignore stale pid files". |
73d5463
to
0e50036
Compare
@mikeweilgart I think this watchdog is specific to AIX. It might work on Linux as well, but the Linux one is much more simple. The aix watchdog uses this template here: https://github.com/cfengine/masterfiles/blob/master/cfe_internal/core/watchdog/watchdog.cf#L174-L179 and the linux watchdog, well the watchdog used when /etc/cron.d is present uses https://github.com/cfengine/masterfiles/blob/master/cfe_internal/core/watchdog/watchdog.cf#L225-L227 Would be good to combine them, but that would be separate work. |
Ticket: CFE-4335 Changelog: AIX watchdog now handles stale pids Code uses /proc opportunistically only if it exists; this improves accuracy for systems with /proc and doesn't affect other systems. (Turns out this doesn't matter as the template is only for AIX which does have /proc) This leaves in place a race condition if many watchdog processes are started in very short succession, since the pidfile is not atomically checked and updated. For purposes of the watchdog this is probably good enough. (See https://stackoverflow.com/a/688365/5419599 for more on this.)
0e50036
to
f53436b
Compare
Thanks @mikeweilgart ! |
Fixes CFE-4335.