[DRBD-user] initial setup of drbd (kernel 2.6.7, drbd 0.7.0)

Lars Ellenberg Lars.Ellenberg at linbit.com
Fri Jul 23 13:16:35 CEST 2004



/ 2004-07-23 09:04:15 +0200
\ Martin Bene:
> Hi, 
> 
> I just ran into a couple of strange effects when setting up drbd 0.7 on
> our test cluster;
> 
> 4 drbd devices, 2 - 18 GB each.
> 
> Setup was slightly non-standard: I configured just one side to
> primary/consistent before adding the 2nd side (needed to copy data onto
> the new drbd devices).
> 
> drbd setup of node1: all went as expected, no trouble. created
> filesystems, mounted, copied data onto drbd.
> 
> drbd setup on node2: 
> # /etc/init.d/drbd start
> * Starting DRBD...
> Child process does not terminate!
> Exiting.
> 
> "drbdadm up all" in the startup script finished configuration for just
> the first drbd device. Setup for the 2nd one seems to have hung for some
> time:
> 
> 31008 ?        SW     0:00 [drbd0_worker]
> 29849 ?        RW     0:11 [drbd0_receiver]
>  2845 pts/0    D      0:00 /sbin/drbdsetup /dev/nbd/1 disk /dev/md5 internal -1 --on-io-error=panic
>  5207 ?        RW     0:00 [drbd0_asender]
> 
> finally resulting in
> 
> test-neu1 user # cat /proc/drbd
> version: 0.7.0 svn $Rev: 1438 $ (api:74/proto:74)
> 
>  0: cs:SyncTarget st:Secondary/Primary ld:Inconsistent
>     ns:0 nr:27311780 dw:27311780 dr:0 al:0 bm:3447 lo:134 pe:493 ua:134 ap:0
>         [==================>.] sync'ed: 93.7% (1818/28487)M
>         finish: 0:01:07 speed: 27,482 (25,008) K/sec
>  1: cs:StandAlone st:Secondary/Unknown ld:Inconsistent
>     ns:0 nr:0 dw:0 dr:0 al:0 bm:1781 lo:0 pe:0 ua:0 ap:0
>  2: cs:Unconfigured
>  3: cs:Unconfigured
> 
> on the 2nd node.
> 
> Could this be a timeout on initialising the internal metadata while a
> resync is already running on the 1st device (and thus slowing things down)?

yes, this may be the cause.
do the drbd devices happen to share the same underlying physical devices?
is this a 2.4 kernel?
 
> Here's the relevant syslog entries:
> 
> 08:30:40 drbd: initialised. Version: 0.7.0 svn $Rev: 1438 $ (api:74/proto:74)
> 08:30:40 drbd: registered as block device major 43
> 08:30:40 drbd0: Creating state block
> 08:30:40 drbd0: resync bitmap: bits=7292848 words=227902
> 08:30:40 drbd0: size = 29171392 KB
> 08:30:40 drbd0: Assuming that all blocks are out of sync (aka FullSync)
> 08:30:45 drbd0: 29171392 KB now marked out-of-sync by on disk bit-map.
> 08:30:46 drbd1: Creating state block
> 08:30:46 drbd1: resync bitmap: bits=7292848 words=227902
> 08:30:46 drbd1: size = 29171392 KB
> 08:30:46 drbd1: Assuming that all blocks are out of sync (aka FullSync)
> 08:30:46 drbd0: Handshake successful: DRBD Protocol version 74
> 08:30:46 drbd0: Connection established.
> 08:30:46 drbd0: Secondary/Unknown --> Secondary/Primary
> 08:30:46 drbd0: Resync started as SyncTarget (need to sync 29171392 KB [7292848 bits set]).
> 08:31:41 drbd1: 29171392 KB now marked out-of-sync by on disk bit-map.
> 
> Another effect: The progress bar for resync displayed on node1 seems to
> be inconsistent:
> 
> test-neu2 drbd # cat /proc/drbd
> version: 0.7.0 svn $Rev: 1438 $ (api:74/proto:74)
> 
>  0: cs:SyncSource st:Primary/Secondary ld:Consistent
>     ns:8745864 nr:0 dw:87300 dr:9494623 al:109 bm:1412 lo:700 pe:1350 ua:700 ap:0
>         [=================>..] sync'ed: 88.9% (19952/28487)M
>         finish: 0:13:34 speed: 25,076 (25,187) K/sec
> 
> test-neu2 drbd # cat /proc/drbd
> version: 0.7.0 svn $Rev: 1438 $ (api:74/proto:74)
> 
>  0: cs:SyncSource st:Primary/Secondary ld:Consistent
>     ns:12940824 nr:0 dw:87492 dr:13690591 al:109 bm:1668 lo:1000 pe:1876 ua:1000 ap:0
>         [========>...........] sync'ed: 44.4% (15858/28487)M
>         finish: 0:09:20 speed: 28,933 (25,160) K/sec
> 
> Time to finish and synced/size info seem to be OK, but the progress bar
> definitely isn't: it started out at ~50%, went to 100% and then jumped
> back to ~40%.

we patched the code there with some 64-bit long compatibility changes.
it seems an integer overflow issue sneaked in there...

	Lars Ellenberg
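
The suspected overflow can be checked against the /proc/drbd figures quoted
above. The sketch below is a reconstruction, not the DRBD source: it assumes
the percentage is computed in per-mille as 1000 - left * 1000 / total, with
left and total counted in 4 KB bitmap bits and the product truncated to
32 bits. The function names are hypothetical, and the remaining-bit counts
are derived from the rounded MB values in the output, so they are
approximate.

```python
MASK32 = 0xFFFFFFFF   # emulate a 32-bit unsigned long (assumption)

TOTAL_BITS = 7292848  # from the syslog: "resync bitmap: bits=7292848"
BITS_PER_MB = 256     # one bit per 4 KB: 29171392 KB / 7292848 bits = 4 KB


def permille_32bit(left_bits, total_bits=TOTAL_BITS):
    """Assumed faulty formula: left * 1000 wraps around at 2**32."""
    return 1000 - ((left_bits * 1000) & MASK32) // total_bits


def permille_exact(left_bits, total_bits=TOTAL_BITS):
    """Same formula without truncation of the product."""
    return 1000 - (left_bits * 1000) // total_bits


if __name__ == "__main__":
    # Remaining MB as shown in the three quoted /proc/drbd samples.
    for left_mb in (19952, 15858, 1818):
        left = left_mb * BITS_PER_MB
        print(f"{left_mb:>6} MB left: "
              f"32-bit {permille_32bit(left) / 10:.1f}%, "
              f"exact {permille_exact(left) / 10:.1f}%")
```

Under these assumptions the truncated version gives 88.9%, 44.4% and 93.7%
for the three samples, which matches the quoted output, while the exact
value for the first sample is 30.0%. The product only fits into 32 bits
once fewer than about 16777 MB remain, which would explain why the bar
climbs to 100% and then drops back to roughly 41%.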

-- 
please use the "List-Reply" function of your email client.


