Lars Ellenberg wrote:
> / 2006-12-12 21:49:36 +0100
> \ Uwe Melzer:
>
>>Hi,
>>I use DRBD 0.7.22 with heartbeat 1.2.3 .
>>I have defined 10 drbd devices, each with it's own sync group (0-9).
>>wfc_timeout is set to 0, degr-wfc-timeout to 120 for all devices.
>>
>>During boot process I observe the following on returned primary node.
>>When a sync is running heartbeat starts didn't wait sync is finshed
>>for all devices.
>>'auto_failback on' is setting in the ha.cf file.
>>
>>Why? The last command in the drbd start script is 'drbdadm wait_con_int' .
>
>
> Wait for Connection Interactively.
>
> there is no mention of wait for sync here.
>
>
>>But there is not description in the man pages (drbdadm) for wait_con_int.
>>On the drbd devices run a database installation, so I must wait for the
>>end of the sync.
>>Where can be the mistake, was is going wrong.
>
>
> set auto_failback off.
>
> if you really cannot live without it,
> there is a wait_sync command for drbdsetup (not for drbdadm).
>

Let me describe the situation:

- the secondary node is Primary/Unknown
- the primary node is booting
- drbd starts on the primary, the sync per group starts, and the start
  script exits
- heartbeat starts on the primary; with auto_failback on, the takeover
  of resources is initiated
- the secondary node tries to set the devices to Secondary state via
  drbddisk

Today I looked in the ha-log files on both nodes.
During the sync phase I found this information on the secondary node:

heartbeat: 2006/12/12_19:44:37 info: Running /etc/ha.d/resource.d/drbddisk logs stop
heartbeat: 2006/12/12_19:44:42 ERROR: Return code 20 from /etc/ha.d/resource.d/drbddisk
heartbeat: 2006/12/12_19:44:43 info: Retrying failed stop operation [drbddisk::logs]
heartbeat: 2006/12/12_19:44:43 info: Running /etc/ha.d/resource.d/drbddisk logs stop
heartbeat: 2006/12/12_19:44:49 ERROR: Return code 20 from /etc/ha.d/resource.d/drbddisk
heartbeat: 2006/12/12_19:44:50 info: Retrying failed stop operation [drbddisk::logs]
heartbeat: 2006/12/12_19:44:50 info: Running /etc/ha.d/resource.d/drbddisk logs stop
heartbeat: 2006/12/12_19:44:55 ERROR: Return code 20 from /etc/ha.d/resource.d/drbddisk
heartbeat: 2006/12/12_19:44:56 info: Retrying failed stop operation [drbddisk::logs]
heartbeat: 2006/12/12_19:44:56 info: Running /etc/ha.d/resource.d/drbddisk logs stop
heartbeat: 2006/12/12_19:45:01 ERROR: Return code 20 from /etc/ha.d/resource.d/drbddisk
heartbeat: 2006/12/12_19:45:02 info: Retrying failed stop operation [drbddisk::logs]
heartbeat: 2006/12/12_19:45:02 info: Running /etc/ha.d/resource.d/drbddisk logs stop
heartbeat: 2006/12/12_19:45:07 ERROR: Return code 20 from /etc/ha.d/resource.d/drbddisk
heartbeat: 2006/12/12_19:45:08 info: Retrying failed stop operation [drbddisk::logs]
heartbeat: 2006/12/12_19:45:08 info: Running /etc/ha.d/resource.d/drbddisk logs stop
heartbeat: 2006/12/12_19:45:13 ERROR: Return code 20 from /etc/ha.d/resource.d/drbddisk
heartbeat: 2006/12/12_19:45:14 info: Retrying failed stop operation [drbddisk::logs]
heartbeat: 2006/12/12_19:45:14 info: Running /etc/ha.d/resource.d/drbddisk logs stop
heartbeat: 2006/12/12_19:45:19 ERROR: Return code 20 from /etc/ha.d/resource.d/drbddisk
heartbeat: 2006/12/12_19:45:20 info: Retrying failed stop operation [drbddisk::logs]
heartbeat: 2006/12/12_19:45:20 info: Running /etc/ha.d/resource.d/drbddisk logs stop
heartbeat: 2006/12/12_19:45:25 ERROR: Return code 20 from /etc/ha.d/resource.d/drbddisk
heartbeat: 2006/12/12_19:45:26 info: Retrying failed stop operation [drbddisk::logs]
heartbeat: 2006/12/12_19:45:26 info: Running /etc/ha.d/resource.d/drbddisk logs stop
heartbeat: 2006/12/12_19:45:31 ERROR: Return code 20 from /etc/ha.d/resource.d/drbddisk
heartbeat: 2006/12/12_19:45:32 info: Retrying failed stop operation [drbddisk::logs]
heartbeat: 2006/12/12_19:45:32 info: Running /etc/ha.d/resource.d/drbddisk logs stop
heartbeat: 2006/12/12_19:45:37 ERROR: Return code 20 from /etc/ha.d/resource.d/drbddisk
heartbeat: 2006/12/12_19:45:38 info: Retrying failed stop operation [drbddisk::logs]
heartbeat: 2006/12/12_19:45:38 info: Running /etc/ha.d/resource.d/drbddisk logs stop
heartbeat: 2006/12/12_19:45:43 ERROR: Return code 20 from /etc/ha.d/resource.d/drbddisk
heartbeat: 2006/12/12_19:46:11 ERROR: Resource script for drbddisk::logs probably not LSB-compliant.
heartbeat: 2006/12/12_19:46:11 WARN: it (drbddisk::logs) MUST succeed on a stop when already stopped
heartbeat: 2006/12/12_19:46:11 WARN: Machine reboot narrowly avoided!

It seems to me that drbddisk does not check whether the device is
still syncing.

Please send me your comments.

Thanks and Regards
--
Uwe
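
For anyone who wants to hold heartbeat back until the resync is over, here is a minimal sketch of a polling check. It is not part of DRBD or heartbeat. It assumes that /proc/drbd prints one line per device of the form " 0: cs:SyncTarget st:Secondary/Primary ...", and the state names in SYNC_STATES are an assumption about DRBD 0.7 that should be checked against the installed version. The function names are made up for illustration.

```python
import re
import time

# ASSUMPTION: connection-state names that mean "resync pending or running".
# Verify these against the /proc/drbd output of the installed DRBD version.
SYNC_STATES = {
    "SyncSource", "SyncTarget",
    "PausedSyncS", "PausedSyncT",
    "WFBitMapS", "WFBitMapT",
}

# ASSUMPTION: device lines look like " 0: cs:SyncTarget st:Secondary/Primary ..."
DEVICE_LINE = re.compile(r"^\s*(\d+):\s+cs:(\S+)")


def syncing_devices(proc_drbd_text):
    """Return {minor: state} for every device whose cs: field is a sync state."""
    pending = {}
    for line in proc_drbd_text.splitlines():
        match = DEVICE_LINE.match(line)
        if match and match.group(2) in SYNC_STATES:
            pending[int(match.group(1))] = match.group(2)
    return pending


def read_proc_drbd():
    """Read the current DRBD status text."""
    with open("/proc/drbd") as status_file:
        return status_file.read()


def wait_until_synced(read_status, poll_seconds=5.0, sleep=time.sleep):
    """Poll read_status() until no device reports a sync state."""
    while True:
        if not syncing_devices(read_status()):
            return
        sleep(poll_seconds)
```

Calling wait_until_synced(read_proc_drbd) at the end of the drbd start script would block until all ten devices have left the sync states. Note that this only delays heartbeat on the booting node; it has no timeout, and it does not change how drbddisk behaves on the peer.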