Hi Lars,

> ------------------------------
> I am sure drbd is fine with any library and bash compatible shell.
> I rather suspect you doing something "stupid" (sorry for directness).

1. First off, my environment does not have "/bin/bash" (and I do not
   plan to use "bash"). All it has is /bin/sh -> /bin/busybox, so I
   changed every call to /bin/bash to use /bin/sh instead.

2. My ip -> /bin/busybox doesn't support route_module, so it emits
   "No route from me (192.168.2.2) to peer (192.168.2.1)" messages when
   I run /etc/rc.d/init.d/drbd start (taken from scripts/drbd and also
   tweaked to work in my environment). I therefore made some tweaks to
   prefer ifconfig -> /bin/busybox and make up for the missing
   `ip route`.

Note: the drbd.o module loads and /proc/drbd is available even if I do
not do (2). I made that change only to be sure, and to get rid of these
"No route..." errors.

> if you talk "bash" to me, that is,
> starting from no drbd module loaded,
> step by step the commands that you issue,
> indicating on which node, and your expectations,
> including /proc/drbd and/or dmesg | tail | grep -i drbd
> (if it changed from the step before),
> then probably the list or I can point out where the actual thinko is.
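
A step-by-step capture of the kind requested above could be scripted roughly as follows. This is only a sketch that sticks to plain POSIX sh, so it should also run under busybox. The function name drbd_snapshot and its two arguments are made up for illustration and are not part of DRBD.

```shell
#!/bin/sh
# Hypothetical helper (not part of DRBD): print one labelled snapshot
# of the DRBD state so consecutive steps can be compared.
#   $1 = free-form label for the step just performed
#   $2 = status file to read (defaults to /proc/drbd)
drbd_snapshot() {
    label=$1
    status_file=${2:-/proc/drbd}
    echo "=== step: $label (node: $(uname -n)) ==="
    cat "$status_file"
    # dmesg may be unreadable for non-root users, and grep returns
    # non-zero when nothing matches; neither should abort the capture.
    dmesg 2>/dev/null | tail | grep -i drbd || true
    echo
}
```

Calling, for example, `drbd_snapshot "after drbd start" >> /tmp/copying.res` on each node after every command gives one labelled block per step.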

After doing (1) and (2) above, drbd starts cleanly (without errors) in
my environment, and I now get the following `cat /proc/drbd` results on
the Secondary node:

Results after a previous (successful) full-sync:
------------------------------------------------
version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:2048256 dw:2048256 dr:0 al:0 bm:252 lo:0 pe:0 ua:0 ap:0

On-going results during a full-sync (invoked by `drbdadm invalidate all`)
------------------------------------------------
version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:SyncTarget st:Secondary/Primary ld:Inconsistent
    ns:0 nr:2350500 dw:2350500 dr:0 al:0 bm:396 lo:64 pe:768 ua:64 ap:0
    [==>.................] sync'ed: 15.0% (1746268/2048256)K
    finish: 0:13:13 speed: 2,148 (1,908) K/sec

...and still on-going...
------------------------------------------------
version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:SyncTarget st:Secondary/Primary ld:Inconsistent
    ns:0 nr:3036368 dw:3036368 dr:0 al:0 bm:438 lo:93 pe:768 ua:93 ap:0
    [=========>..........] sync'ed: 48.6% (1060516/2048256)K
    finish: 0:07:57 speed: 2,176 (1,916) K/sec

...almost...
------------------------------------------------
version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:SyncTarget st:Secondary/Primary ld:Inconsistent
    ns:0 nr:3928032 dw:3928032 dr:0 al:0 bm:492 lo:84 pe:768 ua:84 ap:0
    [==================>.] sync'ed: 91.9% (168816/2048256)K
    finish: 0:01:06 speed: 2,412 (1,924) K/sec

...and finally:
------------------------------------------------
version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:4096512 dw:4096512 dr:0 al:0 bm:504 lo:0 pe:0 ua:0 ap:0
------------------------------------------------

From the above, and after mounting and inspecting the target device on
the Secondary, I am sure that the full-sync did its job.

Next, I unmounted the target device on the Secondary and re-enabled DRBD
on it, again as Secondary, while the other node remained Primary. Then I
dumped some data onto the Primary's source device (simple `cp` + `sync`
commands, just to add data and make sure it is not merely sitting in
buffers).

While running these `cp` + `sync` commands, I constantly checked the
hardware LEDs of both nodes. Only one node, the Primary, shows activity.
Lots of data is continuously being created on the Primary, yet there is
still no activity on the Secondary's HDD LED.

At the same time, I ran a series of
"cat /proc/drbd >>/tmp/copying.res;" and
"dmesg|tail|grep -i drbd>>/tmp/copying.res" on the Secondary. After the
HDD LED on the Primary stopped blinking, I opened the file, and here are
the results:

version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
drbd0: Handshake successful: DRBD Network Protocol version 74
drbd0: Connection established.
drbd0: I am(S): 1:00000002:00000004:00000019:00000002:01
drbd0: Peer(P): 1:00000002:00000004:00000019:00000003:10
drbd0: drbd0_receiver [5708]: cstate WFReportParams --> WFBitMapT
drbd0: Secondary/Unknown --> Secondary/Primary
drbd0: drbd0_receiver [5708]: cstate WFBitMapT --> SyncTarget
drbd0: Resync started as SyncTarget (need to sync 0 KB [0 bits set]).
drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
drbd0: drbd0_receiver [5708]: cstate SyncTarget --> Connected

version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
drbd0: Handshake successful: DRBD Network Protocol version 74
drbd0: Connection established.
drbd0: I am(S): 1:00000002:00000004:00000019:00000002:01
drbd0: Peer(P): 1:00000002:00000004:00000019:00000003:10
drbd0: drbd0_receiver [5708]: cstate WFReportParams --> WFBitMapT
drbd0: Secondary/Unknown --> Secondary/Primary
drbd0: drbd0_receiver [5708]: cstate WFBitMapT --> SyncTarget
drbd0: Resync started as SyncTarget (need to sync 0 KB [0 bits set]).
drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
drbd0: drbd0_receiver [5708]: cstate SyncTarget --> Connected

version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
drbd0: Handshake successful: DRBD Network Protocol version 74
drbd0: Connection established.
drbd0: I am(S): 1:00000002:00000004:00000019:00000002:01
drbd0: Peer(P): 1:00000002:00000004:00000019:00000003:10
drbd0: drbd0_receiver [5708]: cstate WFReportParams --> WFBitMapT
drbd0: Secondary/Unknown --> Secondary/Primary
drbd0: drbd0_receiver [5708]: cstate WFBitMapT --> SyncTarget
drbd0: Resync started as SyncTarget (need to sync 0 KB [0 bits set]).
drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
drbd0: drbd0_receiver [5708]: cstate SyncTarget --> Connected

version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
drbd0: Handshake successful: DRBD Network Protocol version 74
drbd0: Connection established.
drbd0: I am(S): 1:00000002:00000004:00000019:00000002:01
drbd0: Peer(P): 1:00000002:00000004:00000019:00000003:10
drbd0: drbd0_receiver [5708]: cstate WFReportParams --> WFBitMapT
drbd0: Secondary/Unknown --> Secondary/Primary
drbd0: drbd0_receiver [5708]: cstate WFBitMapT --> SyncTarget
drbd0: Resync started as SyncTarget (need to sync 0 KB [0 bits set]).
drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
drbd0: drbd0_receiver [5708]: cstate SyncTarget --> Connected

version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
drbd0: Handshake successful: DRBD Network Protocol version 74
drbd0: Connection established.
drbd0: I am(S): 1:00000002:00000004:00000019:00000002:01
drbd0: Peer(P): 1:00000002:00000004:00000019:00000003:10
drbd0: drbd0_receiver [5708]: cstate WFReportParams --> WFBitMapT
drbd0: Secondary/Unknown --> Secondary/Primary
drbd0: drbd0_receiver [5708]: cstate WFBitMapT --> SyncTarget
drbd0: Resync started as SyncTarget (need to sync 0 KB [0 bits set]).
drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
drbd0: drbd0_receiver [5708]: cstate SyncTarget --> Connected

version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
drbd0: Handshake successful: DRBD Network Protocol version 74
drbd0: Connection established.
drbd0: I am(S): 1:00000002:00000004:00000019:00000002:01
drbd0: Peer(P): 1:00000002:00000004:00000019:00000003:10
drbd0: drbd0_receiver [5708]: cstate WFReportParams --> WFBitMapT
drbd0: Secondary/Unknown --> Secondary/Primary
drbd0: drbd0_receiver [5708]: cstate WFBitMapT --> SyncTarget
drbd0: Resync started as SyncTarget (need to sync 0 KB [0 bits set]).
drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
drbd0: drbd0_receiver [5708]: cstate SyncTarget --> Connected

version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
drbd0: I am(S): 1:00000002:00000004:00000019:00000002:01
drbd0: Peer(P): 1:00000002:00000004:00000019:00000003:10
drbd0: drbd0_receiver [5708]: cstate WFReportParams --> WFBitMapT
drbd0: Secondary/Unknown --> Secondary/Primary
drbd0: drbd0_receiver [5708]: cstate WFBitMapT --> SyncTarget
drbd0: Resync started as SyncTarget (need to sync 0 KB [0 bits set]).
drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
drbd0: drbd0_receiver [5708]: cstate SyncTarget --> Connected

version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:WFConnection st:Secondary/Unknown ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
drbd0: PingAck did not arrive in time.
drbd0: drbd0_asender [5715]: cstate Connected --> NetworkFailure
drbd0: asender terminated
drbd0: drbd0_receiver [5708]: cstate NetworkFailure --> BrokenPipe
drbd0: short read expecting header on sock: r=-512
drbd0: worker terminated
drbd0: drbd0_receiver [5708]: cstate BrokenPipe --> Unconnected
drbd0: Connection lost.
drbd0: drbd0_receiver [5708]: cstate Unconnected --> WFConnection

version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
drbd0: Handshake successful: DRBD Network Protocol version 74
drbd0: Connection established.
drbd0: I am(S): 1:00000002:00000004:00000019:00000003:01
drbd0: Peer(P): 1:00000002:00000004:0000001a:00000003:10
drbd0: drbd0_receiver [5708]: cstate WFReportParams --> WFBitMapT
drbd0: Secondary/Unknown --> Secondary/Primary
drbd0: drbd0_receiver [5708]: cstate WFBitMapT --> SyncTarget
drbd0: Resync started as SyncTarget (need to sync 0 KB [0 bits set]).
drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
drbd0: drbd0_receiver [5708]: cstate SyncTarget --> Connected

version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
drbd0: Handshake successful: DRBD Network Protocol version 74
drbd0: Connection established.
drbd0: I am(S): 1:00000002:00000004:00000019:00000003:01
drbd0: Peer(P): 1:00000002:00000004:0000001a:00000003:10
drbd0: drbd0_receiver [5708]: cstate WFReportParams --> WFBitMapT
drbd0: Secondary/Unknown --> Secondary/Primary
drbd0: drbd0_receiver [5708]: cstate WFBitMapT --> SyncTarget
drbd0: Resync started as SyncTarget (need to sync 0 KB [0 bits set]).
drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
drbd0: drbd0_receiver [5708]: cstate SyncTarget --> Connected

version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Secondary ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
drbd0: drbd0_receiver [5708]: cstate Unconnected --> WFConnection
drbd0: drbd0_receiver [5708]: cstate WFConnection --> WFReportParams
drbd0: Handshake successful: DRBD Network Protocol version 74
drbd0: Connection established.
drbd0: I am(S): 1:00000002:00000004:0000001a:00000003:01
drbd0: Peer(S): 1:00000002:00000004:0000001a:00000003:01
drbd0: drbd0_receiver [5708]: cstate WFReportParams --> Connected
drbd0: Secondary/Unknown --> Secondary/Secondary

Note that the results stay the same throughout the "copying_on_primary"
process, until it finishes.

If I can already perform a full-sync whenever I want to, what stops DRBD
from performing a partial-sync while in the "Connected" state?

Again, I saw how DRBD-0.6.13 works (with auto-partial-sync) a long time
ago, which is why I expect 0.7.11 to work the same way.

> or, to start with,
> just point out were the "drbd quickstart guide" does not work for you.

Your guides/how-tos are fine. DRBD is fine, I am sure. That is why I am
trying to figure out whether my thin environment is the culprit for this
failure.

By the way, if it matters, I'm implementing DRBD on the Nodes' internal
device. That is, I do not have the actual hostnames (`hostname`) as part
of my /etc/hosts.

Best regards,
Vic
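
One way to tell from saved /proc/drbd samples whether the Secondary received anything is to compare the nr counter between two samples. The sketch below assumes that nr counts data received over the network and that each sample file holds a single device. The function names drbd_field and nr_delta are hypothetical helpers, not DRBD tools, and only POSIX sh features are used.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/drbd-style text on stdin and print
# the value of the first "NAME:value" token (e.g. NAME=nr).
drbd_field() {
    # One token per line, then strip the "NAME:" prefix from the match.
    tr -s ' ' '\n' | sed -n "s/^$1://p" | head -n 1
}

# Hypothetical helper: print how much the nr counter grew between two
# saved samples ($1 = earlier file, $2 = later file).
nr_delta() {
    before=$(drbd_field nr < "$1")
    after=$(drbd_field nr < "$2")
    # A missing counter is treated as 0 instead of failing.
    echo $(( ${after:-0} - ${before:-0} ))
}
```

With the two "Connected" samples around the full-sync above (nr:2048256 and nr:4096512) the delta is 2048256, while the samples taken during the `cp` + `sync` test all show nr:0, so their delta is 0.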