[DRBD-user] What does "block drbd0: BAD! sector= .... cstate=SyncSource" want to tell me?

Lars Ellenberg lars.ellenberg at linbit.com
Thu Jul 5 19:00:10 CEST 2012


On Wed, Jul 04, 2012 at 08:14:05PM +0200, Lutz Vieweg wrote:
> Hi,
> 
> I just had to reboot a system that is configured as the "secondary" for 3 DRBD devices.
> After the reboot, connection to the primary system was established and re-synchronisation started.
> 
> Some scary messages were emitted during that process - on the primary:
> 
> >block drbd0: uuid_compare()=1 by rule 70
> >block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
> >block drbd0: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 68591(17), total 68591; compression: 99.9%
> >block drbd0: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 68591(17), total 68591; compression: 99.9%
> >block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0
> >block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0 exit code 0 (0x0)
> >block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent )
> >block drbd0: updated sync UUID 0AE...
> >block drbd0: Began resync as SyncSource (will sync 3242376 KB [810594 bits set]).
> >block drbd0: BAD! sector=2513600376s enr=76708 rs_left=-15 rs_failed=0 count=32 cstate=SyncSource
> >block drbd0: BAD! sector=2342485912s enr=71486 rs_left=-19 rs_failed=0 count=32 cstate=SyncSource
> >block drbd0: BAD! sector=3393683440s enr=103566 rs_left=-30 rs_failed=0 count=32 cstate=SyncSource
> >block drbd0: BAD! sector=3062103824s enr=93447 rs_left=-2 rs_failed=0 count=32 cstate=SyncSource
> >block drbd0: Resync done (total 337 sec; paused 0 sec; 9620 K/sec)
> >block drbd0: updated UUIDs 0AE...
> >block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
> >block drbd0: bitmap WRITE of 0 pages took 1 jiffies
> >block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
> 
> And on the (rebooted) secondary:
> 
> >block drbd0: disk( Diskless -> Attaching )
> >block drbd0: max BIO size = 131072
> >block drbd0: drbd_bm_resize called with capacity == 3550894184
> >block drbd0: resync bitmap: bits=443861773 words=6935341 pages=13546
> >block drbd0: size = 1693 GB (1775447092 KB)
> >block drbd0: bitmap READ of 13546 pages took 1443 jiffies
> >block drbd0: recounting of set bits took additional 34 jiffies
> >block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
> >block drbd0: disk( Attaching -> UpToDate )
> >block drbd0: attached to UUIDs 9DE...
> >block drbd0: drbd_sync_handshake:
> >block drbd0: self 9DE... bits:0 flags:0
> >block drbd0: peer 0AE... bits:810473 flags:0
> >block drbd0: uuid_compare()=-1 by rule 50
> >block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated ) pdsk( DUnknown -> UpToDate )
> >block drbd0: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 68591(17), total 68591; compression: 99.9%
> >block drbd0: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 68591(17), total 68591; compression: 99.9%
> >block drbd0: conn( WFBitMapT -> WFSyncUUID )
> >block drbd0: updated sync uuid 9DE...
> >block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0
> >block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
> >block drbd0: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent )
> >block drbd0: Began resync as SyncTarget (will sync 3242376 KB [810594 bits set]).
> >block drbd0: BAD! sector=2513600376s enr=76708 rs_left=-15 rs_failed=0 count=32 cstate=SyncTarget
> >block drbd0: BAD! sector=2342485912s enr=71486 rs_left=-19 rs_failed=0 count=32 cstate=SyncTarget
> >block drbd0: BAD! sector=3393683440s enr=103566 rs_left=-30 rs_failed=0 count=32 cstate=SyncTarget
> >block drbd0: BAD! sector=3062103824s enr=93447 rs_left=-2 rs_failed=0 count=32 cstate=SyncTarget
> >block drbd0: Resync done (total 337 sec; paused 0 sec; 9620 K/sec)
> >block drbd0: updated UUIDs 0AE...
> >block drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
> >block drbd0: helper command: /sbin/drbdadm after-resync-target minor-0
> >block drbd0: helper command: /sbin/drbdadm after-resync-target minor-0 exit code 0 (0x0)
> >block drbd0: bitmap WRITE of 0 pages took 1 jiffies
> >block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
> 
> Now I wonder: What does drbd0 want to tell me with those "BAD! ..." messages?

It's just a reference counter (the rs_left value in those messages) that
should not have gone negative, but did, because we forgot to
update/reinitialize it at some stage.

Depending on your exact DRBD version, I could tell you various things
about this.  But if you run 8.3 git it is supposed to be fixed, finally...
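
To spot these messages in a larger kernel log, a small parser can help.
The following is only a sketch: the field layout is copied from the "BAD!"
lines quoted above, and the extent size (32768 sectors, so that
enr == sector >> 15) is an assumption inferred from the quoted numbers.
parse_bad_line and the other names are hypothetical helpers, not part of
DRBD.

```python
import re

# Field layout copied from the "BAD!" lines quoted in this thread.
BAD_RE = re.compile(
    r"block (?P<dev>\S+): BAD! sector=(?P<sector>\d+)s enr=(?P<enr>\d+) "
    r"rs_left=(?P<rs_left>-?\d+) rs_failed=(?P<rs_failed>\d+) "
    r"count=(?P<count>\d+) cstate=(?P<cstate>\w+)"
)

# Assumption inferred from the quoted numbers: one extent spans
# 32768 sectors (16 MiB), so enr == sector >> 15.
EXTENT_SECTOR_SHIFT = 15


def parse_bad_line(line):
    """Return a dict for a "BAD!" line, or None for any other line."""
    m = BAD_RE.search(line)
    if m is None:
        return None
    rec = {"dev": m.group("dev"), "cstate": m.group("cstate")}
    for key in ("sector", "enr", "rs_left", "rs_failed", "count"):
        rec[key] = int(m.group(key))
    # Cross-check the extent number against the sector number.
    rec["enr_matches_sector"] = (rec["sector"] >> EXTENT_SECTOR_SHIFT) == rec["enr"]
    # The condition discussed in this thread: the counter went negative.
    rec["negative_rs_left"] = rec["rs_left"] < 0
    return rec


SAMPLE = [
    "block drbd0: BAD! sector=2513600376s enr=76708 rs_left=-15 rs_failed=0 count=32 cstate=SyncSource",
    "block drbd0: BAD! sector=2342485912s enr=71486 rs_left=-19 rs_failed=0 count=32 cstate=SyncSource",
    "block drbd0: BAD! sector=3393683440s enr=103566 rs_left=-30 rs_failed=0 count=32 cstate=SyncSource",
    "block drbd0: BAD! sector=3062103824s enr=93447 rs_left=-2 rs_failed=0 count=32 cstate=SyncSource",
]

records = [parse_bad_line(line) for line in SAMPLE]

if __name__ == "__main__":
    for rec in records:
        print(rec["dev"], rec["enr"], rec["rs_left"], rec["enr_matches_sector"])
```

Lines that are not "BAD!" messages return None, so the function can be
mapped over a whole log and the results filtered.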

> It seems to have completed the synchronization successfully. Also, no "read errors" were
> reported on either host.
> 
> Should I be concerned about the data integrity, now?

Nope. All good.
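
As a sanity check, the figures in the quoted logs can be recomputed from
each other. The sketch below assumes one bitmap bit covers 4 KB (which
matches "3242376 KB [810594 bits set]"), 64-bit bitmap words and 4096-byte
pages. The rounding order in resync_rate_kb (integer-divide bits by
seconds, then convert to KB) is an inference that happens to reproduce the
logged 9620 K/sec; it is not taken from the DRBD source. All function names
are hypothetical.

```python
SECTOR_BYTES = 512
KB_PER_BIT = 4        # assumption: one bitmap bit covers 4 KB
BITS_PER_WORD = 64    # assumption: 64-bit bitmap words
PAGE_BYTES = 4096     # assumption: 4096-byte pages


def ceil_div(a, b):
    """Integer division rounding up."""
    return -(-a // b)


def bitmap_geometry(capacity_sectors):
    """Derive size in KB, bitmap bits, words and pages from a capacity."""
    size_kb = capacity_sectors * SECTOR_BYTES // 1024
    bits = ceil_div(size_kb, KB_PER_BIT)
    words = ceil_div(bits, BITS_PER_WORD)
    pages = ceil_div(words * (BITS_PER_WORD // 8), PAGE_BYTES)
    return size_kb, bits, words, pages


def resync_kb(bits_set):
    """Amount of data covered by the set bits, in KB."""
    return bits_set * KB_PER_BIT


def resync_rate_kb(bits_set, seconds):
    """Average rate; rounding order is an inference from the log values."""
    return (bits_set // seconds) * KB_PER_BIT


if __name__ == "__main__":
    # Values taken from the quoted logs.
    print(bitmap_geometry(3550894184))
    print(resync_kb(810594))
    print(resync_rate_kb(810594, 337))
```

With the values from the logs this gives 1775447092 KB, 443861773 bits,
6935341 words and 13546 pages, a resync volume of 3242376 KB, and a rate of
9620 K/sec, all matching the logged lines.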

Cheers,

	Lars

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com


