<html dir="ltr"><head></head><body style="text-align:left; direction:ltr;"><div>The 'Extent XXX beyond end of bitmap!' error above is consistently reproducible in our environment. It is not clear what exactly triggered it, but it first appeared when Pacemaker was unable to fail over properly to another node because of a DRBD timeout, which was followed by a server reset. </div><pre><br></pre><pre># drbdadm status</pre><pre>sg-master-drbd role:Secondary</pre><pre> disk:Diskless</pre><pre> peer role:Primary</pre><pre> replication:Established peer-disk:UpToDate</pre><div><br></div><pre># drbdadm up all</pre><pre>extent 19136507 beyond end of bitmap!</pre><pre>extent 21495810 beyond end of bitmap!</pre><pre>extent 21785161 beyond end of bitmap!</pre><pre>... </pre><pre>another 50+ entries similar to above</pre><pre>...</pre><pre>../shared/drbdmeta.c:2279:apply_al: ASSERT(bm_pos - bm_on_disk_pos <= chunk - extents_size) failed.</pre><pre>sg-master-drbd: Failure: (102) Local address(port) already in use.</pre><pre>Command 'drbdsetup-84 connect sg-master-drbd ipv4:172.16.2.10:7801 ipv4:172.16.2.20:7801 --protocol=C --max-buffers=64K --sndbuf-size=1024K --after-sb-0pri=discard-younger-primary --after-sb-1pri=discard-secondary --after-sb-2pri=call-pri-lost-after-sb' terminated with exit code 10</pre><pre><br></pre><pre># drbdadm attach all</pre><pre>extent 19136507 beyond end of bitmap!</pre><pre>extent 21495810 beyond end of bitmap!</pre><pre>extent 21785161 beyond end of bitmap!</pre><pre>...</pre><pre>../shared/drbdmeta.c:2279:apply_al: ASSERT(bm_pos - bm_on_disk_pos <= chunk - extents_size) failed.</pre><pre><br></pre><div>Previously we fixed this by recreating the DRBD meta data and fully resynchronizing the nodes, which is obviously not the right way to handle it.</div><div><br></div><div>The configuration is pretty much standard, with internal meta data and the defaults for AL and max-peers.</div><div><br></div><pre>resource master-drbd {</pre><pre> net {</pre><pre> protocol C;</pre><pre> 
max-buffers 64K;</pre><pre> sndbuf-size 1024K;</pre><pre> after-sb-0pri discard-younger-primary;</pre><pre> after-sb-1pri discard-secondary;</pre><pre> after-sb-2pri call-pri-lost-after-sb;</pre><pre> }</pre><pre> disk {</pre><pre> resync-rate 4000M;</pre><pre> disk-barrier no;</pre><pre> disk-flushes no;</pre><pre> c-plan-ahead 0;</pre><pre> read-balancing 1M-striping;</pre><pre> }</pre><pre> volume 0 {</pre><pre> disk /dev/drbdpool/data;</pre><pre> device /dev/drbd0;</pre><pre> meta-disk internal;</pre><pre> }</pre><pre> on hcluster01 {</pre><pre> address 172.16.2.10:7801;</pre><pre> }</pre><pre> on hcluster02 {</pre><pre> address 172.16.2.20:7801;</pre><pre> }</pre><pre>}</pre><div></div><div><br></div><div>I'm not able to get a 'drbdadm dump-md'; it fails with the following error:</div><div><br></div><pre># drbdadm dump-md all</pre><pre>Found meta data is "unclean", please apply-al first</pre><pre>Command 'drbdmeta 0 v08 /dev/drbdpool/data internal dump-md' terminated with exit code 255</pre><div><br></div><div>The backend device for DRBD, 'dm-3', is the logical volume 'data', which combines two hardware RAID0 arrays (sda, sdb) in the volume group 'drbdpool'. 
</div><div><br></div><div>Reported sizes on the failed node:</div><div><br></div><pre># blockdev --report</pre><pre>RO RA SSZ BSZ StartSec Size Device</pre><pre>rw 256 512 4096 0 120009573531648 /dev/sda</pre><pre>rw 256 512 4096 0 100007977943040 /dev/sdc</pre><pre>rw 256 512 4096 0 220017543086080 /dev/dm-3</pre><div></div><div><br></div><pre># blockdev --getsize /dev/drbd0</pre><pre>blockdev: cannot open /dev/drbd0: Wrong medium type</pre><div><br></div><div>Reported sizes on the operational node:</div><div><br></div><pre># blockdev --report</pre><pre>RO RA SSZ BSZ StartSec Size Device</pre><pre>rw 256 512 4096 0 120009573531648 /dev/sda</pre><pre>rw 256 512 4096 0 100007977943040 /dev/sdc</pre><pre>rw 256 512 4096 0 220017543086080 /dev/dm-3</pre><pre>rw 256 512 4096 0 220010828644352 /dev/drbd0</pre><pre><br></pre><pre># blockdev --getsize /dev/drbd0</pre><pre>429708649696</pre><pre><br></pre><pre># vgdisplay</pre><pre><br></pre><pre> --- Volume group ---</pre><pre></pre><pre> VG Name drbdpool</pre><pre> System ID </pre><pre> Format lvm2</pre><pre> Metadata Areas 2</pre><pre> Metadata Sequence No 2</pre><pre> VG Access read/write</pre><pre> VG Status resizable</pre><pre> MAX LV 0</pre><pre> Cur LV 1</pre><pre> Open LV 1</pre><pre> Max PV 0</pre><pre> Cur PV 2</pre><pre> Act PV 2</pre><pre> VG Size 200.10 TiB</pre><pre> PE Size 4.00 MiB</pre><pre> Total PE 52456270</pre><pre> Alloc PE / Size 52456270 / 200.10 TiB</pre><pre> Free PE / Size 0 / 0 </pre><div> </div><pre># lvdisplay</pre><pre> </pre><pre> --- Logical volume ---</pre><pre> LV Path /dev/drbdpool/data</pre><pre> LV Name data</pre><pre> VG Name drbdpool</pre><pre> LV Write Access read/write</pre><pre> LV Status available</pre><pre> # open 2</pre><pre> LV Size 200.10 TiB</pre><pre> Current LE 52456270</pre><pre> Segments 2</pre><pre> Allocation inherit</pre><pre> Read ahead sectors auto</pre><pre> - currently set to 256</pre><pre> Block device 253:3 </pre><div></div><div><pre><br></pre><pre># dmesg | grep 
drbd</pre><pre>[ 1.863088] drbd: loading out-of-tree module taints kernel.</pre><pre>[ 1.865879] drbd: module verification failed: signature and/or required key missing - tainting kernel</pre><pre>[ 1.894498] drbd: initialized. Version: 8.4.11-1 (api:1/proto:86-101)</pre><pre>[ 1.894501] drbd: GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@, 2018-04-26 12:10:42</pre><pre>[ 1.894502] drbd: registered as block device major 147</pre><pre>[ 88.950747] drbd sg-master-drbd: Starting worker thread (from drbdsetup-84 [3242])</pre><pre>[ 88.951999] drbd sg-master-drbd: conn( StandAlone -> Unconnected ) </pre><pre>[ 88.952532] drbd sg-master-drbd: Starting receiver thread (from drbd_w_sg-maste [3244])</pre><pre>[ 88.952592] drbd sg-master-drbd: receiver (re)started</pre><pre>[ 88.952656] drbd sg-master-drbd: conn( Unconnected -> WFConnection ) </pre><pre>[ 89.453261] drbd sg-master-drbd: Handshake successful: Agreed network protocol version 101</pre><pre>[ 89.453271] drbd sg-master-drbd: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.</pre><pre>[ 89.453358] drbd sg-master-drbd: conn( WFConnection -> WFReportParams ) </pre><pre>[ 89.453373] drbd sg-master-drbd: Starting ack_recv thread (from drbd_r_sg-maste [3245])</pre><pre>[ 89.469010] block drbd0: max BIO size = 4096</pre><pre>[ 89.469023] block drbd0: size = 200 TB (214854324848 KB)</pre><pre>[ 89.469043] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate ) </pre><pre>[49807.178096] drbd sg-master-drbd: peer( Primary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown ) </pre><pre>[49807.178116] drbd sg-master-drbd: ack_receiver terminated</pre><pre>[49807.178124] drbd sg-master-drbd: Terminating drbd_a_sg-maste</pre><pre>[49807.192386] drbd sg-master-drbd: Connection closed</pre><pre>[49807.192452] drbd sg-master-drbd: conn( Disconnecting -> StandAlone ) </pre><pre>[49807.192463] drbd 
sg-master-drbd: receiver terminated</pre><pre>[49807.192470] drbd sg-master-drbd: Terminating drbd_r_sg-maste</pre><pre>[49807.229346] drbd sg-master-drbd: Terminating drbd_w_sg-maste</pre><pre>[49847.525209] drbd sg-master-drbd: Starting worker thread (from drbdsetup-84 [23082])</pre><pre>[49847.525490] drbd sg-master-drbd: conn( StandAlone -> Unconnected ) </pre><pre>[49847.525542] drbd sg-master-drbd: Starting receiver thread (from drbd_w_sg-maste [23084])</pre><pre>[49847.525624] drbd sg-master-drbd: receiver (re)started</pre><pre>[49847.525687] drbd sg-master-drbd: conn( Unconnected -> WFConnection ) </pre><pre>[49848.025725] drbd sg-master-drbd: Handshake successful: Agreed network protocol version 101</pre><pre>[49848.025735] drbd sg-master-drbd: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.</pre><pre>[49848.025964] drbd sg-master-drbd: conn( WFConnection -> WFReportParams ) </pre><pre>[49848.025979] drbd sg-master-drbd: Starting ack_recv thread (from drbd_r_sg-maste [23085])</pre><pre>[49848.036394] block drbd0: max BIO size = 4096</pre><pre>[49848.036407] block drbd0: size = 200 TB (214854324848 KB)</pre><pre>[49848.036427] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate ) </pre><pre></pre><pre><br></pre></div><div>//OE</div><div><br></div><div>-----Original Message-----</div><div><b>From</b>: Robert Altnoeder <<a href="mailto:Robert%20Altnoeder%20%3crobert.altnoeder@linbit.com%3e">robert.altnoeder@linbit.com</a>></div><div><b>To</b>: <a href="mailto:drbd-user@lists.linbit.com">drbd-user@lists.linbit.com</a></div><div><b>Subject</b>: Re: [DRBD-user] Extent XXX beyond end of bitmap!</div><div><b>Date</b>: Tue, 14 Aug 2018 13:03:40 +0200</div><div><br></div><pre>The following information would be useful for debugging:</pre><pre>- Internal or external meta data?</pre><pre>- Any special activity log configuration, like a striped AL, different</pre><pre>AL stripe 
size, etc.?</pre><pre>- Any manually configured number of AL extents?</pre><pre>- Value of max-peers</pre><pre>- Reported size of the DRBD device in sectors</pre><pre>- Reported size of the backend device for DRBD in sectors</pre><pre>- Ideally, a 'drbdadm dump-md' of the meta data of the affected devices</pre><pre><br></pre><pre>br,</pre><pre>Robert</pre><pre><br></pre><pre>On 08/14/2018 10:02 AM, Yannis Milios wrote:</pre><pre></pre><pre>Does this happen on both nodes? What’s the status of the backing</pre><pre>device (lvm) ? Can you post the exact versions for both kernel module</pre><pre>and utils? Any clue in the logs?</pre><pre><br></pre><pre>On Tue, 14 Aug 2018 at 06:57, Oleksiy Evin <<a href="mailto:o.evin@onefc.com">o.evin@onefc.com</a>> wrote:</pre><pre><br></pre><pre><br></pre><pre> # drbdadm attach all</pre><pre> extent 19136522 beyond end of bitmap!</pre><pre> extent 19143798 beyond end of bitmap!</pre><pre> extent 19151565 beyond end of bitmap!</pre><pre><br></pre><pre> ../shared/drbdmeta.c:2279:apply_al: ASSERT(bm_pos - bm_on_disk_pos</pre><pre> <= chunk - extents_size) failed.</pre><pre><br></pre></body></html>