Conversation
e4326b8 to
ef87436
Compare
There was a problem hiding this comment.
I'm wondering if retval should still be 0 if the run is skipped. It's not a run failure, but the run didn't happen, either, which could be construed as a policy failure. As we have CHEFCLIENT_FAILURE, we may want to have a constant for CHEFCLIENT_SKIPPED so that, if the host gets in a state where it's regularly skipping, it can be easily identified by monitoring. I'll suggest CHEFCLIENT_SKIPPED be an exit code of 67 to show I'm not totally a fossil 😜
|
Well that has some risks because you don't necessarily want cron to think it failed. I have two possible compromises that I'm cool with, but I much prefer the first. option 1The hook returns option 2We add a config option I think the first one is more flexible for all sorts of use-cases (you could have multiple return codes for WHY it failed, including some of them being 0)... but if there's a reason I'm missing to choose the second, I'm open to hearing it. |
|
@dafyddcrosby - ping? which would you prefer |
|
I think the first option should be fine. I think that as long as we're not in a position where no Chef run has actually happened for $period_of_time and we have no way of determining it was skipped, should be good. |
|
Awesome, thanks, I'll modify accordingly. |
8893990 to
cdbf21c
Compare
This adds a new hook point (and sample plugin usage) that allows the
Chef run to be skipped based on some local criteria.
Example usage might be:
- Device is on battery
- Device is not connected to VPN/backhaul/etc.
- Some global service meant to disable runs during an emergency
Previously I did this in pre_run or pre_start, but the problem with that
is that the only way is to force `exit`, which causes the logs to get
messed up because we never update the links. This provides a clean way
to skip the run but still update the chef.{cur,last} links so that it's
clear what has happened.
Sample output:
```
$ sudo chefctl -iv
[2025-11-13 18:27:44 +1000] DEBUG chefctl: Loading plugin at /etc/cinc/chefctl_hooks.rb.
[2025-11-13 18:27:44 +1000] DEBUG chefctl: Including registered plugin KrHook
[2025-11-13 18:27:44 +1000] DEBUG chefctl: Trying lock /var/lock/subsys/chefctl
[2025-11-13 18:27:44 +1000] DEBUG chefctl: Lock acquired: /var/lock/subsys/chefctl
[2025-11-13 18:27:44 +1000] INFO chefctl: taste-tester mode ends in < 1 hour, extending back to 1 hour
[2025-11-13 18:27:44 +1000] DEBUG chefctl: Skippinbg battery check due to --immediate flag
```
and
```
$ sudo chefctl
[2025-11-13 18:27:22 +1000] INFO chefctl: taste-tester mode ends in < 1 hour, extending back to 1 hour
[2025-11-13 18:27:22 +1000] WARN chefctl: Running on battery power, skipping Chef run
[2025-11-13 18:27:22 +1000] INFO chefctl: Plugin requested skipping chef run.
```
Signed-off-by: Phil Dibowitz <phil@ipom.com>
|
OK, option 1 implemented. Plus a few typos in comments fixed. |
|
@dafyddcrosby has imported this pull request. If you are a Meta employee, you can view this in D89968754. |
|
@joshuamiller01 merged this pull request in 51217bd. |
This adds a new hook point (and sample plugin usage) that allows the
Chef run to be skipped based on some local criteria.
Example usage might be:
Previously I did this in pre_run or pre_start, but the problem with that
is that the only way is to force
exit, which causes the logs to getmessed up because we never update the links. This provides a clean way
to skip the run but still update the chef.{cur,last} links so that it's
clear what has happened.
Sample output:
and
Signed-off-by: Phil Dibowitz phil@ipom.com