Hi,

Thanks for your explanation about the usermode_helper.

To avoid introducing any side effects, I will now install DRBD in the default way. I have reinstalled DRBD 8.3.0 from scratch, _without_ PREFIX=...

So now:
- Sources are in /usr/local/drbd-8.3.0/source/drbd-8.3.0
- Symbolic link /usr/local/drbd -> /usr/local/drbd-8.3.0
- Built with "make all && make install".
- Configuration file is /usr/local/drbd-8.3.0/etc/drbd.conf
- DRBD module is now in /lib/modules/`uname -r`/kernel/drivers/block/drbd.ko
- Userland tools are now in /sbin

From my point of view, there is no trace left of previous installations, except that my DRBD disks were created with DRBD 8.2.5.

Everything seems to run well. Then I invalidate a resource, and during resynchronization I still get the same problem.

usermode_helper is the default one:
-8<---------------------------------------------------------------------------
# cat /sys/module/drbd/usermode_helper
/sbin/drbdadm
# ls -la /sbin/drbdadm
-rwxr-xr-x 1 root root 227310 Feb 25 10:52 /sbin/drbdadm*
# ls -la /sbin/drbd*
-rwxr-xr-x 1 root root     227310 Feb 25 10:52 /sbin/drbdadm*
-rwxr-xr-x 1 root root     172921 Mar 14 2008 /sbin/drbdadm-8.2.5*
-rwsr-x--- 1 root haclient 123867 Feb 25 10:52 /sbin/drbdmeta*
-rwsr-x--- 1 root haclient 124229 Mar 14 2008 /sbin/drbdmeta-8.2.5*
-rwsr-x--- 1 root haclient 113385 Feb 25 10:52 /sbin/drbdsetup*
-rwsr-x--- 1 root haclient  97359 Mar 14 2008 /sbin/drbdsetup-8.2.5
# drbdadm | grep Version
Version: 8.3.0 (api:88)
-8<---------------------------------------------------------------------------

Here is the /var/log/messages output (the usermode_helper is correct now):
-8<---------------------------------------------------------------------------
Feb 25 11:16:41 rh4-2_1 kernel: drbd: initialised. Version: 8.3.0 (api:88/proto:86-89)
Feb 25 11:16:41 rh4-2_1 kernel: drbd: GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by root@rh4-2, 2009-02-25 10:51:49
Feb 25 11:16:41 rh4-2_1 kernel: drbd: registered as block device major 147
Feb 25 11:16:41 rh4-2_1 kernel: drbd: minor_table @ 0xf7e97b80
[...]
Feb 25 11:16:41 rh4-2_1 kernel: drbd0: disk( Diskless -> Attaching )
Feb 25 11:16:41 rh4-2_1 kernel: drbd0: Starting worker thread (from cqueue/0 [14428])
Feb 25 11:16:41 rh4-2_1 kernel: drbd0: Found 4 transactions (192 active extents) in activity log.
Feb 25 11:16:41 rh4-2_1 kernel: drbd0: Method to ensure write ordering: barrier
Feb 25 11:16:41 rh4-2_1 kernel: drbd0: max_segment_size ( = BIO size ) = 32768
Feb 25 11:16:41 rh4-2_1 kernel: drbd0: Adjusting my ra_pages to backing device's (32 -> 1024)
Feb 25 11:16:41 rh4-2_1 kernel: drbd0: drbd_bm_resize called with capacity == 142248864
Feb 25 11:16:41 rh4-2_1 kernel: drbd0: resync bitmap: bits=17781108 words=555660
Feb 25 11:16:41 rh4-2_1 kernel: drbd0: size = 68 GB (71124432 KB)
Feb 25 11:16:41 rh4-2_1 kernel: drbd0: recounting of set bits took additional 4 jiffies
Feb 25 11:16:41 rh4-2_1 kernel: drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Feb 25 11:16:41 rh4-2_1 kernel: drbd0: disk( Attaching -> UpToDate )
Feb 25 11:16:41 rh4-2_1 kernel: drbd0: Barriers not supported on meta data device - disabling
Feb 25 11:16:41 rh4-2_1 kernel: drbd0: conn( StandAlone -> Unconnected )
Feb 25 11:16:41 rh4-2_1 kernel: drbd0: Starting receiver thread (from drbd0_worker [14448])
[...]
Feb 25 11:16:41 rh4-2_1 kernel: drbd0: receiver (re)started
Feb 25 11:16:41 rh4-2_1 kernel: drbd0: conn( Unconnected -> WFConnection )
Feb 25 11:16:42 rh4-2_1 kernel: drbd0: Handshake successful: Agreed network protocol version 89
Feb 25 11:16:42 rh4-2_1 kernel: drbd0: conn( WFConnection -> WFReportParams )
Feb 25 11:16:42 rh4-2_1 kernel: drbd0: Starting asender thread (from drbd0_receiver [14464])
Feb 25 11:16:42 rh4-2_1 kernel: drbd0: data-integrity-alg: <not-used>
Feb 25 11:16:42 rh4-2_1 kernel: drbd0: drbd_sync_handshake:
Feb 25 11:16:42 rh4-2_1 kernel: drbd0: self FA64FF57A0D8F9BC:0000000000000000:7E1FD1A10363D65A:FF850D8CF344B2F2
Feb 25 11:16:42 rh4-2_1 kernel: drbd0: peer FA64FF57A0D8F9BC:0000000000000000:7E1FD1A10363D65B:FF850D8CF344B2F2
Feb 25 11:16:42 rh4-2_1 kernel: drbd0: uuid_compare()=0 by rule 4
Feb 25 11:16:42 rh4-2_1 kernel: drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate )
[...]
Feb 25 11:17:35 rh4-2_1 kernel: drbd0: conn( Connected -> StartingSyncT )
Feb 25 11:17:35 rh4-2_1 kernel: drbd0: 68 GB (17781108 bits) marked out-of-sync by on disk bit-map.
Feb 25 11:17:35 rh4-2_1 kernel: drbd0: conn( StartingSyncT -> WFSyncUUID )
Feb 25 11:17:35 rh4-2_1 kernel: drbd0: helper command: /sbin/drbdadm before-resync-target minor-0
Feb 25 11:17:35 rh4-2_1 kernel: drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
Feb 25 11:17:35 rh4-2_1 kernel: drbd0: conn( WFSyncUUID -> SyncTarget ) disk( UpToDate -> Inconsistent )
Feb 25 11:17:35 rh4-2_1 kernel: drbd0: Began resync as SyncTarget (will sync 71124432 KB [17781108 bits set]).
-8<---------------------------------------------------------------------------

Here is the output of "drbdadm role drbd_res0", which shows my problem:
-8<---------------------------------------------------------------------------
[root@rh4-2 libsh]# pwd
/usr/local/drbd/libsh
[root@rh4-2 libsh]# type drbdadm
drbdadm is hashed (/sbin/drbdadm)
[root@rh4-2 libsh]# hash -r
[root@rh4-2 libsh]# type drbdadm
drbdadm is /sbin/drbdadm
[root@rh4-2 libsh]# drbdadm -c ../etc/drbd.conf role drbd_res0
Secondary/Secondary
# (43) sync_progress = (integer) 833 [len: 4]
[root@rh4-2 libsh]# type drbdadm
drbdadm is hashed (/sbin/drbdadm)
-8<---------------------------------------------------------------------------

And here is the output of the strace command (which seems correct):
-8<---------------------------------------------------------------------------
[root@rh4-2 libsh]# strace -e execve -f drbdadm -c ../etc/drbd.conf role drbd_res0
execve("/sbin/drbdadm", ["drbdadm", "-c", "../etc/drbd.conf", "role", "drbd_res0"], [/* 31 vars */]) = 0
Process 25402 attached
[pid 25402] execve("/usr/local/drbd/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "role"], [/* 32 vars */]) = -1 ENOENT (No such file or directory)
[pid 25402] execve("/usr/kerberos/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "role"], [/* 32 vars */]) = -1 ENOENT (No such file or directory)
[pid 25402] execve("/usr/kerberos/bin/drbdsetup", ["drbdsetup", "/dev/drbd0", "role"], [/* 32 vars */]) = -1 ENOENT (No such file or directory)
[pid 25402] execve("/usr/local/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "role"], [/* 32 vars */]) = -1 ENOENT (No such file or directory)
[pid 25402] execve("/usr/local/bin/drbdsetup", ["drbdsetup", "/dev/drbd0", "role"], [/* 32 vars */]) = -1 ENOENT (No such file or directory)
[pid 25402] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "role"], [/* 32 vars */]) = 0
Process 25402 detached
--- SIGCHLD (Child exited) @ 0 (0) ---
Process 25405 attached
Process 25399 suspended
[pid 25405] execve("/usr/local/drbd/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "role"], [/* 32 vars */]) = -1 ENOENT (No such file or directory)
[pid 25405] execve("/usr/kerberos/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "role"], [/* 32 vars */]) = -1 ENOENT (No such file or directory)
[pid 25405] execve("/usr/kerberos/bin/drbdsetup", ["drbdsetup", "/dev/drbd0", "role"], [/* 32 vars */]) = -1 ENOENT (No such file or directory)
[pid 25405] execve("/usr/local/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "role"], [/* 32 vars */]) = -1 ENOENT (No such file or directory)
[pid 25405] execve("/usr/local/bin/drbdsetup", ["drbdsetup", "/dev/drbd0", "role"], [/* 32 vars */]) = -1 ENOENT (No such file or directory)
[pid 25405] execve("/sbin/drbdsetup", ["drbdsetup", "/dev/drbd0", "role"], [/* 32 vars */]) = 0
Secondary/Secondary
# (43) sync_progress = (integer) 878 [len: 4]
Process 25399 resumed
Process 25405 detached
--- SIGCHLD (Child exited) @ 0 (0) ---
-8<---------------------------------------------------------------------------

End of resynchronization:
-8<---------------------------------------------------------------------------
Feb 25 11:54:55 rh4-2_1 kernel: drbd0: Resync done (total 2239 sec; paused 0 sec; 31764 K/sec)
Feb 25 11:54:55 rh4-2_1 kernel: drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
Feb 25 11:54:55 rh4-2_1 kernel: drbd0: helper command: /sbin/drbdadm after-resync-target minor-0
Feb 25 11:54:55 rh4-2_1 kernel: drbd0: helper command: /sbin/drbdadm after-resync-target minor-0 exit code 0 (0x0)
-8<---------------------------------------------------------------------------

After that, the strange message no longer appears:
-8<---------------------------------------------------------------------------
# drbdadm -c ../etc/drbd.conf role drbd_res0
Secondary/Secondary
-8<---------------------------------------------------------------------------

Help! I really don't understand what is happening.
Any idea other than "using the older drbdadm/drbdsetup against the newer module"?

BTW, I have seen a new file in /var/lib/drbd:
-8<---------------------------------------------------------------------------
[root@rh4-2 libsh]# ll /var/lib/drbd/
total 16
drwxr-xr-x  2 root root 4096 Feb 25 11:18 ./
drwxr-xr-x 37 root root 4096 Feb 25 10:52 ../
lrwxrwxrwx  1 root root   38 Feb 25 11:16 drbd-minor-0.conf -> /usr/local/drbd-8.3.0/etc/drbd.conf
lrwxrwxrwx  1 root root   38 Feb 25 11:16 drbd-minor-1.conf -> /usr/local/drbd-8.3.0/etc/drbd.conf
lrwxrwxrwx  1 root root   38 Feb 25 11:16 drbd-minor-2.conf -> /usr/local/drbd-8.3.0/etc/drbd.conf
-rw-------  1 root root   36 Feb 25 11:18 node_id
-8<---------------------------------------------------------------------------

What is this node_id file?

Thanks in advance for any hint.

Best regards.

-- 
Hervé GAUTIER
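
PS: To rule out the "older tools against the newer module" theory mechanically, the two version strings quoted in this message can be compared. What follows is only a minimal sketch in plain sh: the helper names (drbd_version, drbd_api, same_drbd_build) are hypothetical and not part of DRBD, and the two input strings are copied by hand from the "drbdadm | grep Version" output and from the kernel log in this message.

```shell
#!/bin/sh
# Hypothetical helpers (not part of DRBD): compare the version reported by
# the userland tools with the version reported by the kernel module.

# Print the dotted version, e.g. "8.3.0", found after "Version: ".
drbd_version() {
    printf '%s\n' "$1" | sed -n 's/.*Version: \([0-9][0-9.]*\).*/\1/p'
}

# Print the API number, e.g. "88", from "(api:88)" or "(api:88/proto:86-89)".
drbd_api() {
    printf '%s\n' "$1" | sed -n 's/.*api:\([0-9][0-9]*\).*/\1/p'
}

# Succeed only if both strings carry the same version and the same API number.
same_drbd_build() {
    [ "$(drbd_version "$1")" = "$(drbd_version "$2")" ] &&
    [ "$(drbd_api "$1")" = "$(drbd_api "$2")" ]
}

# Strings copied from this message, not read from the running system.
tools='Version: 8.3.0 (api:88)'               # drbdadm | grep Version
module='Version: 8.3.0 (api:88/proto:86-89)'  # "drbd: initialised." log line

if same_drbd_build "$tools" "$module"; then
    echo "userland and module agree: $(drbd_version "$tools") (api:$(drbd_api "$tools"))"
else
    echo "MISMATCH: tools=[$tools] module=[$module]"
fi
```

With the strings above, both sides report 8.3.0 with api 88, so at least the drbdadm found first in PATH matches the loaded module. This says nothing about which drbdsetup binary is actually executed; the strace output is the evidence for that.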