[DRBD-user] Extent XXX beyond end of bitmap!
Oleksiy Evin
o.evin at onefc.com
Tue Oct 2 05:50:25 CEST 2018
The above 'Extent XXX beyond end of bitmap!' error is constantly reproduced in our environment. It's not clear what exactly triggered it, but it happened when Pacemaker was unable to properly fail over to the other node due to a DRBD timeout issue, followed by a server reset.
# drbdadm status
sg-master-drbd role:Secondary
  disk:Diskless
  peer role:Primary
    replication:Established peer-disk:UpToDate
# drbdadm up all
extent 19136507 beyond end of bitmap!
extent 21495810 beyond end of bitmap!
extent 21785161 beyond end of bitmap!
... another 50+ entries similar to above ...
../shared/drbdmeta.c:2279:apply_al: ASSERT(bm_pos - bm_on_disk_pos <= chunk - extents_size) failed.
sg-master-drbd: Failure: (102) Local address(port) already in use.
Command 'drbdsetup-84 connect sg-master-drbd ipv4:172.16.2.10:7801 ipv4:172.16.2.20:7801 --protocol=C --max-buffers=64K --sndbuf-size=1024K --after-sb-0pri=discard-younger-primary --after-sb-1pri=discard-secondary --after-sb-2pri=call-pri-lost-after-sb' terminated with exit code 10
# drbdadm attach all
extent 19136507 beyond end of bitmap!
extent 21495810 beyond end of bitmap!
extent 21785161 beyond end of bitmap!
...
../shared/drbdmeta.c:2279:apply_al: ASSERT(bm_pos - bm_on_disk_pos <= chunk - extents_size) failed.
Previously we fixed that by recreating the DRBD meta data and fully resynchronizing the nodes, which is obviously not the right way to handle it.
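In case it's relevant, the recovery we've been using so far is roughly the following sequence on the node with the broken meta data (a sketch of the usual full-resync procedure, resource name taken from the status output above):

# drbdadm down sg-master-drbd
# drbdadm create-md sg-master-drbd     <- wipes and recreates the internal meta data
# drbdadm up sg-master-drbd
# drbdadm invalidate sg-master-drbd    <- marks the local data inconsistent, forcing a full resync from the peer

With ~200 TiB behind the resource, that full resync is obviously painful.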
The configuration is pretty much standard, with Internal meta data and defaults for AL and max-peers.
resource master-drbd {
    net {
        protocol C;
        max-buffers 64K;
        sndbuf-size 1024K;
        after-sb-0pri discard-younger-primary;
        after-sb-1pri discard-secondary;
        after-sb-2pri call-pri-lost-after-sb;
    }
    disk {
        resync-rate 4000M;
        disk-barrier no;
        disk-flushes no;
        c-plan-ahead 0;
        read-balancing 1M-striping;
    }
    volume 0 {
        disk /dev/drbdpool/data;
        device /dev/drbd0;
        meta-disk internal;
    }
    on hcluster01 {
        address 172.16.2.10:7801;
    }
    on hcluster02 {
        address 172.16.2.20:7801;
    }
}
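For completeness, this is how I'd verify that no non-default AL settings are in effect (only explicitly configured options show up in the dump, so the output should match the resource file above):

# cat /proc/drbd        <- module version and current resource state
# drbdadm dump all      <- configuration exactly as drbdadm parses it; al-extents would appear here only if explicitly set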
I'm not able to get a 'drbdadm dump-md' dump; it fails with the following error:
# drbdadm dump-md all
Found meta data is "unclean", please apply-al first
Command 'drbdmeta 0 v08 /dev/drbdpool/data internal dump-md' terminated with exit code 255
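Normally I'd expect the standard sequence below to get past the "unclean" complaint, but here apply-al is exactly what hits the assert, so this is just for completeness rather than something that works in this state:

# drbdadm apply-al sg-master-drbd    <- replays the activity log so the meta data becomes clean
# drbdadm dump-md sg-master-drbd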
The backend device 'dm-3' for DRBD is the logical volume 'data', which combines two hardware RAID0 arrays (sda, sdb) in the volume group 'drbdpool'.
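If the LV layout is of interest, it can be cross-checked with standard LVM/device-mapper tooling, e.g. (generic commands, nothing DRBD-specific):

# lvs -a -o +devices,seg_size drbdpool        <- which PVs back each of the two segments of 'data'
# dmsetup table /dev/mapper/drbdpool-data     <- the linear mapping behind dm-3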
Reported sizes on the failed node:
# blockdev --report
RO    RA   SSZ   BSZ   StartSec             Size   Device
rw   256   512  4096          0   120009573531648   /dev/sda
rw   256   512  4096          0   100007977943040   /dev/sdc
rw   256   512  4096          0   220017543086080   /dev/dm-3
# blockdev --getsize /dev/drbd0
blockdev: cannot open /dev/drbd0: Wrong medium type
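(Presumably the 'Wrong medium type' is simply because drbd0 is Diskless on this node, per the status at the top.) The backend size in sectors can still be read from the LV itself, e.g.:

# blockdev --getsz /dev/drbdpool/data        <- size in 512-byte sectors
# blockdev --getsize64 /dev/drbdpool/data    <- size in bytes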
Reported sizes on the operational node:
# blockdev --report
RO    RA   SSZ   BSZ   StartSec             Size   Device
rw   256   512  4096          0   120009573531648   /dev/sda
rw   256   512  4096          0   100007977943040   /dev/sdc
rw   256   512  4096          0   220017543086080   /dev/dm-3
rw   256   512  4096          0   220010828644352   /dev/drbd0
# blockdev --getsize /dev/drbd0
429708649696
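As a quick sanity check on those numbers (plain shell arithmetic on the values above and the dmesg output further below):

# echo $(( 429708649696 * 512 ))
220010828644352     <- matches the /dev/drbd0 size from 'blockdev --report'
# echo $(( 429708649696 / 2 ))
214854324848        <- matches "size = 200 TB (214854324848 KB)" in dmesg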
# vgdisplay
  --- Volume group ---
  VG Name               drbdpool
  System ID
  Format                lvm2
  Metadata Areas        2
  Metadata Sequence No  2
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               1
  Max PV                0
  Cur PV                2
  Act PV                2
  VG Size               200.10 TiB
  PE Size               4.00 MiB
  Total PE              52456270
  Alloc PE / Size       52456270 / 200.10 TiB
  Free  PE / Size       0 / 0
# lvdisplay
  --- Logical volume ---
  LV Path                /dev/drbdpool/data
  LV Name                data
  VG Name                drbdpool
  LV Write Access        read/write
  LV Status              available
  # open                 2
  LV Size                200.10 TiB
  Current LE             52456270
  Segments               2
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:3
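The LV size lines up with the device-mapper and DRBD numbers as well (assuming the 4.00 MiB PE size shown above):

# echo $(( 52456270 * 4 * 1024 * 1024 ))
220017543086080     <- matches the /dev/dm-3 size from 'blockdev --report'

The difference to the DRBD device size, 220017543086080 - 220010828644352 = 6714441728 bytes (~6.25 GiB), would be the internal meta data at the end of the LV, which looks plausible for a ~200 TiB volume.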
# dmesg | grep drbd
[ 1.863088] drbd: loading out-of-tree module taints kernel.
[ 1.865879] drbd: module verification failed: signature and/or required key missing - tainting kernel
[ 1.894498] drbd: initialized. Version: 8.4.11-1 (api:1/proto:86-101)
[ 1.894501] drbd: GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@, 2018-04-26 12:10:42
[ 1.894502] drbd: registered as block device major 147
[ 88.950747] drbd sg-master-drbd: Starting worker thread (from drbdsetup-84 [3242])
[ 88.951999] drbd sg-master-drbd: conn( StandAlone -> Unconnected )
[ 88.952532] drbd sg-master-drbd: Starting receiver thread (from drbd_w_sg-maste [3244])
[ 88.952592] drbd sg-master-drbd: receiver (re)started
[ 88.952656] drbd sg-master-drbd: conn( Unconnected -> WFConnection )
[ 89.453261] drbd sg-master-drbd: Handshake successful: Agreed network protocol version 101
[ 89.453271] drbd sg-master-drbd: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
[ 89.453358] drbd sg-master-drbd: conn( WFConnection -> WFReportParams )
[ 89.453373] drbd sg-master-drbd: Starting ack_recv thread (from drbd_r_sg-maste [3245])
[ 89.469010] block drbd0: max BIO size = 4096
[ 89.469023] block drbd0: size = 200 TB (214854324848 KB)
[ 89.469043] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate )
[49807.178096] drbd sg-master-drbd: peer( Primary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
[49807.178116] drbd sg-master-drbd: ack_receiver terminated
[49807.178124] drbd sg-master-drbd: Terminating drbd_a_sg-maste
[49807.192386] drbd sg-master-drbd: Connection closed
[49807.192452] drbd sg-master-drbd: conn( Disconnecting -> StandAlone )
[49807.192463] drbd sg-master-drbd: receiver terminated
[49807.192470] drbd sg-master-drbd: Terminating drbd_r_sg-maste
[49807.229346] drbd sg-master-drbd: Terminating drbd_w_sg-maste
[49847.525209] drbd sg-master-drbd: Starting worker thread (from drbdsetup-84 [23082])
[49847.525490] drbd sg-master-drbd: conn( StandAlone -> Unconnected )
[49847.525542] drbd sg-master-drbd: Starting receiver thread (from drbd_w_sg-maste [23084])
[49847.525624] drbd sg-master-drbd: receiver (re)started
[49847.525687] drbd sg-master-drbd: conn( Unconnected -> WFConnection )
[49848.025725] drbd sg-master-drbd: Handshake successful: Agreed network protocol version 101
[49848.025735] drbd sg-master-drbd: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
[49848.025964] drbd sg-master-drbd: conn( WFConnection -> WFReportParams )
[49848.025979] drbd sg-master-drbd: Starting ack_recv thread (from drbd_r_sg-maste [23085])
[49848.036394] block drbd0: max BIO size = 4096
[49848.036407] block drbd0: size = 200 TB (214854324848 KB)
[49848.036427] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate )
//OE
-----Original Message-----
From: Robert Altnoeder <robert.altnoeder at linbit.com>
To: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] Extent XXX beyond end of bitmap!
Date: Tue, 14 Aug 2018 13:03:40 +0200
The following information would be useful for debugging:
- Internal or external meta data?
- Any special activity log configuration, like a striped AL, different AL stripe size, etc.?
- Any manually configured number of AL extents?
- Value of max-peers
- Reported size of the DRBD device in sectors
- Reported size of the backend device for DRBD in sectors
- Ideally, a 'drbdadm dump-md' of the meta data of the affected devices
br,
Robert
On 08/14/2018 10:02 AM, Yannis Milios wrote:
Does this happen on both nodes? What’s the status of the backing device (lvm)? Can you post the exact versions for both kernel module and utils? Any clue in the logs?
On Tue, 14 Aug 2018 at 06:57, Oleksiy Evin <o.evin at onefc.com> wrote:
# drbdadm attach all
extent 19136522 beyond end of bitmap!
extent 19143798 beyond end of bitmap!
extent 19151565 beyond end of bitmap!
../shared/drbdmeta.c:2279:apply_al: ASSERT(bm_pos - bm_on_disk_pos <= chunk - extents_size) failed.