Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
> Date: Wed, 16 Jul 2014 17:58:56 +0200
> From: lars.ellenberg at linbit.com
> To: drbd-user at lists.linbit.com
> Subject: Re: [DRBD-user] DRBD resource monitoring exits with code 8 on Pacemaker cluster
>
> On Wed, Jul 16, 2014 at 12:44:18AM +0200, Giuseppe Ragusa wrote:
> > Hi all,
> > I have a CMAN+Pacemaker cluster on CentOS 6.5 with DRBD primary/secondary
> > resources backing some KVM VirtualDomain resources.
> >
> > Recently I manually disabled all KVM resources (while leaving the DRBD
> > ones untouched), then put both nodes in standby to apply some updates
> > (nothing that could actually touch KVM/DRBD, but I took this course of
> > action for safety) and finally rebooted both nodes too.
> >
> > After bootup I reverted with "pcs cluster unstandby --all" and re-enabled
> > each KVM resource.
> > Everything came up correctly, but I now have these recurring lines in my
> > corosync.log (basically rc=8 for each and every DRBD resource):
>
> And what is wrong with that?
> You requested debug level messages, you get debug level messages.
>
> exit code 8 is the expected exit code
> for a monitor of a resource in "master" mode.

Ah, sorry: my fault for not reading through the RA script before posting!
I was trying to diagnose the exact sequence of events of an (unrelated)
previous error and found those lines while searching for an invalid exit
status...

> > drbd(DatabaseVMDisk)[19424]: 2014/07/16_00:24:04 DEBUG: database_vm: Exit code 0
> > drbd(ShareVMDisk)[19426]: 2014/07/16_00:24:04 DEBUG: share_vm: Exit code 0
> > drbd(ShareDataDisk)[19425]: 2014/07/16_00:24:04 DEBUG: share_data: Exit code 0
> > drbd(DatabaseVMDisk)[19424]: 2014/07/16_00:24:04 DEBUG: database_vm: Command output:
> > drbd(ShareVMDisk)[19426]: 2014/07/16_00:24:04 DEBUG: share_vm: Command output:
> > Jul 16 00:24:04 [6428] cluster1.verolengo.privatelan lrmd: debug: operation_finished: DatabaseVMDisk_monitor_31000:19424 - exited with rc=8
> > Jul 16 00:24:04 [6428] cluster1.verolengo.privatelan lrmd: debug: operation_finished: DatabaseVMDisk_monitor_31000:19424:stderr [ -- empty -- ]
> > Jul 16 00:24:04 [6428] cluster1.verolengo.privatelan lrmd: debug: operation_finished: DatabaseVMDisk_monitor_31000:19424:stdout [ ]
> > Jul 16 00:24:04 [6428] cluster1.verolengo.privatelan lrmd: debug: log_finished: finished - rsc:DatabaseVMDisk action:monitor call_id:254 pid:19424 exit-code:8 exec-time:0ms queue-time:0ms
> > Jul 16 00:24:04 [6428] cluster1.verolengo.privatelan lrmd: debug: operation_finished: ShareVMDisk_monitor_31000:19426 - exited with rc=8
> > Jul 16 00:24:04 [6428] cluster1.verolengo.privatelan lrmd: debug: operation_finished: ShareVMDisk_monitor_31000:19426:stderr [ -- empty -- ]
> > Jul 16 00:24:04 [6428] cluster1.verolengo.privatelan lrmd: debug: operation_finished: ShareVMDisk_monitor_31000:19426:stdout [ ]
> > Jul 16 00:24:04 [6428] cluster1.verolengo.privatelan lrmd: debug: log_finished: finished - rsc:ShareVMDisk action:monitor call_id:252 pid:19426 exit-code:8 exec-time:0ms queue-time:0ms
> > drbd(ShareDataDisk)[19425]: 2014/07/16_00:24:04 DEBUG: share_data: Command output:
> > Jul 16 00:24:04 [6428] cluster1.verolengo.privatelan lrmd: debug: operation_finished: ShareDataDisk_monitor_31000:19425 - exited with rc=8
> > Jul 16 00:24:04 [6428] cluster1.verolengo.privatelan lrmd: debug: operation_finished: ShareDataDisk_monitor_31000:19425:stderr [ -- empty -- ]
> > Jul 16 00:24:04 [6428] cluster1.verolengo.privatelan lrmd: debug: operation_finished: ShareDataDisk_monitor_31000:19425:stdout [ ]
> > Jul 16 00:24:04 [6428] cluster1.verolengo.privatelan lrmd: debug: log_finished: finished - rsc:ShareDataDisk action:monitor call_id:257 pid:19425 exit-code:8 exec-time:0ms queue-time:0ms
> > Jul 16 00:24:19 [6428] cluster1.verolengo.privatelan lrmd: debug: recurring_action_timer: Scheduling another invokation of FirewallVMDisk_monitor_29000
> > Jul 16 00:24:19 [6428] cluster1.verolengo.privatelan lrmd: debug: recurring_action_timer: Scheduling another invokation of DCVMDisk_monitor_29000
> > Jul 16 00:24:19 [6428] cluster1.verolengo.privatelan lrmd: debug: recurring_action_timer: Scheduling another invokation of ApplicationVMDisk_monitor_29000
> > drbd(DCVMDisk)[19783]: 2014/07/16_00:24:19 DEBUG: dc_vm: Calling /usr/sbin/crm_master -Q -l reboot -v 10000
> > drbd(ApplicationVMDisk)[19784]: 2014/07/16_00:24:19 DEBUG: application_vm: Calling /usr/sbin/crm_master -Q -l reboot -v 10000
> > drbd(FirewallVMDisk)[19782]: 2014/07/16_00:24:19 DEBUG: firewall_vm: Calling /usr/sbin/crm_master -Q -l reboot -v 10000
> >
> > From some searching it seems to imply that DRBD came up outside of RA
> > control, so I wonder how I could avoid this in the future
>
> If you want DRBD to be controlled by pacemaker only,
> tell your init system (chkconfig drbd off or similar;
> chmod -x /etc/init.d/drbd is quite effective as well).
> And your operators ;-)

Sure, in fact it is already configured like that, but I was in "error
discovery frenzy" mode and jumped to erroneous conclusions while searching
for exit codes on the net... :(

Sorry again for the noise, and many thanks for your wonderful work (and
help on the list)!

Regards,
Giuseppe

> > and how I could clean it up now.
>
> I don't see anything to be cleaned up?
>
> Lars
>
> --
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
>
> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
> __
> please don't Cc me, but send to list -- I'm subscribed
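Note: the rc=8 in the logs above is the standard OCF return code
OCF_RUNNING_MASTER, which a promotable (master/slave) resource agent
returns from its monitor action on the node where the resource is
promoted; Pacemaker treats it as success for a master instance. As a
rough illustration of the convention only (a simplified, hypothetical
monitor function; the real drbd RA shipped with DRBD does far more):

    #!/bin/sh
    # Sketch of how a promotable OCF agent's monitor action maps resource
    # state to exit codes. "res" stands in for the agent's drbd_resource
    # parameter and is an assumption of this sketch.
    OCF_SUCCESS=0          # running as secondary/slave
    OCF_NOT_RUNNING=7      # cleanly stopped
    OCF_RUNNING_MASTER=8   # running as primary/master -- the rc=8 above

    drbd_monitor() {
        res=$1
        # DRBD 8.x prints e.g. "Primary/Secondary"; keep the local role.
        role=$(drbdadm role "$res" 2>/dev/null | cut -d/ -f1)
        case "$role" in
            Primary)   return $OCF_RUNNING_MASTER ;;
            Secondary) return $OCF_SUCCESS ;;
            *)         return $OCF_NOT_RUNNING ;;
        esac
    }

The lrmd simply records whatever the agent returns, which is why a
perfectly healthy master shows up at debug level as "exited with rc=8".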
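Note: the crm_master calls in the log are how the RA tells Pacemaker where
promotion should happen: they store a transient master-preference score as
a node attribute. Roughly (illustrative only; crm_master expects to run
inside a resource agent, where it derives the attribute name from the OCF
environment, so these lines are not meant to be typed at a shell):

    # Record a master-preference score of 10000 for this node: -Q is
    # quiet, -l reboot gives the attribute a lifetime of one node boot.
    crm_master -Q -l reboot -v 10000

    # The matching cleanup an RA typically performs on demote/stop:
    # delete the preference so this node is no longer favoured.
    crm_master -Q -l reboot -D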
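Note: Lars's prescription for keeping DRBD under Pacemaker's exclusive
control, spelled out for the CentOS 6 (SysV init) setup discussed above;
on a systemd distribution the equivalent would be "systemctl disable drbd":

    # Keep the init system from starting DRBD at boot; only Pacemaker
    # should bring it up and down.
    chkconfig drbd off
    chkconfig --list drbd      # verify: "off" in every runlevel

    # Belt and braces, as suggested above: make the init script itself
    # non-executable so nobody starts it out of habit.
    chmod -x /etc/init.d/drbd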