[DRBD-user] Recommended procedure to fsck a drbd device

Jose Manuel Torralba jt at mercuryinternet.com
Tue Apr 13 13:51:20 CEST 2010


Hi, this is my first post to this list. My question concerns an inherited system that I'm not completely familiar with, so if I'm missing any necessary details, please ask.

We have a cluster running drbd 8.2.6, and both of its nodes were affected by a power cut. The OS is CentOS 5.1 64-bit, kernel 2.6.18-53.1.19.el5.028stab053.14.

The problem is that the cluster does not restart, because neither node can mount the drbd device. The mount fails with the following error:

Apr 13 08:34:25 nodo02 Filesystem[29553]: INFO: Running start for /dev/drbd0 on /vz
Apr 13 08:34:25 nodo02 kernel: EXT3-fs error (device drbd0): ext3_check_descriptors: Block bitmap for group 5376 not in group (block 662340004)!
Apr 13 08:34:25 nodo02 kernel: EXT3-fs: group descriptors corrupted!
Apr 13 08:34:25 nodo02 Filesystem[29553]: ERROR: Couldn't mount filesystem /dev/drbd0 on /vz

We have stopped the services on both nodes and, to isolate the problem, ran the following on one of them (both behave the same):

service drbd start
drbdadm primary r0
mount /dev/drbd0 /vz

Last lines of dmesg:

drbd: initialised. Version: 8.0.7 (api:86/proto:86)
drbd: GIT-hash: cf14288833afe95db396075f8530a5960d29e498 build by phil at mescal, 2007-11-02 13:15:41
drbd: registered as block device major 147
drbd: minor_table @ 0xffff8102245d2ec0
drbd0: disk( Diskless -> Attaching )
drbd0: Found 6 transactions (324 active extents) in activity log.
drbd0: max_segment_size ( = BIO size ) = 32768
drbd0: drbd_bm_resize called with capacity == 2929554040
drbd0: resync bitmap: bits=366194255 words=5721786
drbd0: size = 1396 GB (1464777020 KB)
drbd0: reading of bitmap took 619 jiffies
drbd0: recounting of set bits took additional 45 jiffies
drbd0: 4 KB marked out-of-sync by on disk bit-map.
drbd0: disk( Attaching -> UpToDate )
drbd0: Writing meta data super block now.
drbd0: conn( StandAlone -> Unconnected )
drbd0: receiver (re)started
drbd0: conn( Unconnected -> WFConnection )
drbd0: role( Secondary -> Primary )
drbd0: Writing meta data super block now.
EXT3-fs error (device drbd0): ext3_check_descriptors: Block bitmap for group 5376 not in group (block 662340004)!
EXT3-fs: group descriptors corrupted!

We think the next step is to run fsck on one node and then force the other node to synchronize.

A read-only run of fsck.ext3 on /dev/drbd0 reports an awful lot of errors.
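
To make the plan above concrete, this is roughly what we have in mind. It is only a sketch and has not been run: the resource name r0, the device /dev/drbd0 and the mount point /vz are from our setup, while the DRYRUN variable and the run, repair_plan and resync_plan helpers are made up for illustration. We are also assuming that "drbdadm invalidate" is an acceptable way to force the full resync on the second node, which is part of what we are asking. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the planned repair. Prints each command; executes it only if DRYRUN=0.
RES=r0            # drbd resource name (from our configuration)
DEV=/dev/drbd0    # drbd device that fails to mount
MNT=/vz           # mount point used by the cluster
DRYRUN=${DRYRUN:-1}

run() {
    # Show the command, and run it only when dry-run mode is switched off.
    echo "+ $*"
    if [ "$DRYRUN" = "0" ]; then
        "$@"
    fi
}

repair_plan() {
    # On the node whose data we keep; the filesystem must not be mounted.
    run drbdadm primary "$RES"
    run fsck.ext3 -n "$DEV"       # read-only pass, changes nothing
    run fsck.ext3 -f -y "$DEV"    # forced repair, answers yes to every prompt
    run mount "$DEV" "$MNT"
}

resync_plan() {
    # On the other node: give up the local copy so it is resynced from the peer.
    # Assumption: invalidate is the right command for this on our drbd version.
    run drbdadm secondary "$RES"
    run drbdadm invalidate "$RES"
}

# Pick the half of the plan that matches the node the script is run on.
case "${1:-}" in
    repair) repair_plan ;;
    resync) resync_plan ;;
    *) echo "usage: $0 repair|resync (set DRYRUN=0 to really execute)" >&2 ;;
esac
```

The idea would be to call it with "repair" on the node that keeps its data and with "resync" on the other one, and to set DRYRUN=0 only after the printed commands have been reviewed.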

The question is whether this is the recommended procedure to fsck a drbd device.

Thanks


José Torralba
Mercury Internet