NRI plugins penalized with death if they take 2 seconds #114
Labels
Discussion Needed
Description
NRI plugins running in `containerd` have, by default, 2 seconds per event to provide a response. This is fine. But if a plugin misses a single response in that timeframe, its connection is closed and it is cut off from future events. For plugins built on github.com/containerd/nri, that results in the process exiting.

There is a specific set of errors that induce this close-the-connection behavior:
nri/pkg/adaptation/plugin.go
Lines 520 to 533 in 7b3bcee
The other ones in that list look very reasonable. But I'd like to suggest that a plugin taking more than (by default) 2 seconds to respond to one event doesn't indicate that the plugin has entirely failed; it can probably still be used for future events. A better behavior would be to simply time out that one event and continue.