Usynced blocks if replication is interrupted during initial sync

Wed Mar 20 00:55:16 CET 2024

Thank you for your response! 

We are still seeing the issue as described but perhaps I am not putting the invalidate
at the right spot

Note - I've added it at step 6 below, but I'm wondering if it should be after
the additional node is configured and adjusted (in which case I would need to
unmount as apparently you can't invalidate a disk in use)

So do I need to invalidate after every node is added?

Also Note, the node-id in the logs from the kernel is 0 but peers are configured with 1 and 2 , 
is this an issue or they separate ids?

Here is the reproduction strategy and steps I tried, along with config and logs below

target disk is /dev/vg/persist mounted at /persist

1. /dev/vg/persist mounted directly without drbd
2. Enable DRBD by creating a single node configuration file [1]
3. Reboot
4. Create metadata on separate disk (--max-peers=5)
5. drdbadm up persist
6. drbdadm invalidate persist <-- just added
7. drbdadm primary --force persist
8. mount /dev/drbd0 to /persist
9. start using that mount point
10. some time later
11. Modify configuration to add new target backup node [2]
12. Copy config to remote node and reboot, it will restart in secondary
13. drbdadm adjust persist (on primary) <-- should I put invalidate near here
14. secondary comes up and initial sync starts
15. stop at 50% by disabling network interface
16. re-enable network interface
17. sync completes right away - primary drbd log up to this point [3]
18. drbdadm verify persist - fails many blocks

[1]
```
resource persist {
  device  /dev/drbd0;
  disk    /dev/vg/persist;

  disk {
    resync-rate 400M;
  }

  net {
    after-sb-1pri discard-secondary;
    verify-alg md5;
  }

  meta-disk /dev/vg/metapersist;

  on node-1 {
    node-id 1;
    address 172.20.0.1:7789;
  }

  connection-mesh {
    hosts node-1;
  }
}
```

[2]
```
resource persist {
  device  /dev/drbd0;
  disk    /dev/vg/persist;

  disk {
    resync-rate 400M;
  }

  net {
    after-sb-1pri discard-secondary;
    verify-alg md5;
  }

  meta-disk /dev/vg/metapersist;

  on node-2 {
    node-id 2;
    address 172.20.0.2:7789;
  }

  on node-1 {
    node-id 1;
    address 172.20.0.1:7789;
  }

  connection-mesh {
    hosts node-2 node-1;
  }
}
```

[3]
```
[   14.723777] drbd: loading out-of-tree module taints kernel.
[   14.744889] drbd: initialized. Version: 9.2.8 (api:2/proto:86-122)
[   14.744894] drbd: GIT-hash: e163b05a76254c0f51f999970e861d72bb16409a build by bamboo at 3861-update-edaptos-to-contain-new-drbd9-kernel-version-2-dcwd8, 2024-03-11 19:24:04
[   14.744898] drbd: registered as block device major 147
[   14.753757] drbd: registered transport class 'tcp' (version:9.2.8)
[   14.833147] drbd persist: Starting worker thread (from drbdsetup [1274])
[   14.858421] drbd persist/0 drbd0: meta-data IO uses: blk-bio
[   14.858707] drbd persist/0 drbd0: disk( Diskless -> Attaching ) [attach]
[   14.858727] drbd persist/0 drbd0: Maximum number of peer devices = 5
[   14.858997] drbd persist: Method to ensure write ordering: flush
[   14.859007] drbd persist/0 drbd0: drbd_bm_resize called with capacity == 53198848
[   14.865734] drbd persist/0 drbd0: resync bitmap: bits=6649856 words=519520 pages=1015
[   14.865742] drbd0: detected capacity change from 0 to 53198848
[   14.865748] drbd persist/0 drbd0: size = 25 GB (26599424 KB)
[   14.903824] drbd persist/0 drbd0: recounting of set bits took additional 7ms
[   14.903841] drbd persist/0 drbd0: disk( Attaching -> Inconsistent ) [attach]
[   14.903847] drbd persist/0 drbd0: attached to current UUID: 0000000000000004
[   14.912275] drbd persist: Preparing cluster-wide state change 2815770690 (1->-1 7683/4609)
[   14.912282] drbd persist: Committing cluster-wide state change 2815770690 (0ms)
[   14.912290] drbd persist: role( Secondary -> Primary ) [primary]
[   14.912295] drbd persist/0 drbd0: disk( Inconsistent -> UpToDate ) [primary]
[   14.912330] drbd persist/0 drbd0: persisting effective size = 25 GB (26599424 KB)
[   14.912808] drbd persist: Forced to consider local data as UpToDate!
[   14.913133] drbd persist/0 drbd0: new current UUID: E86819AF4D474727 weak: FFFFFFFFFFFFFFFD
[   15.059936] EXT4-fs (drbd0): mounted filesystem with ordered data mode. Opts: (null). Quota mode: disabled.
[  134.763079] drbd persist node-2: Starting sender thread (from drbdsetup [8670])
[  134.766987] drbd persist node-2: conn( StandAlone -> Unconnected ) [connect]
[  134.767526] drbd persist node-2: Starting receiver thread (from drbd_w_persist [1275])
[  134.767605] drbd persist node-2: conn( Unconnected -> Connecting ) [connecting]
[  166.481244] drbd persist node-2: Handshake to peer 2 successful: Agreed network protocol version 122
[  166.481252] drbd persist node-2: Feature flags enabled on protocol level: 0x7f TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES RESYNC_DAGTAG
[  166.482918] drbd persist: Preparing cluster-wide state change 3780268308 (1->2 499/145)
[  166.489976] drbd persist/0 drbd0 node-2: drbd_sync_handshake:
[  166.489983] drbd persist/0 drbd0 node-2: self E86819AF4D474727:83E0478EA8FD2BB2:0000000000000000:0000000000000000 bits:32096 flags:20
[  166.489993] drbd persist/0 drbd0 node-2: peer 0000000000000004:0000000000000000:0000000000000000:0000000000000000 bits:0 flags:25
[  166.490001] drbd persist/0 drbd0 node-2: uuid_compare()=source-set-bitmap by rule=just-created-peer
[  166.490007] drbd persist/0 drbd0 node-2: Setting and writing one bitmap slot, after drbd_sync_handshake
[  166.499865] drbd persist/0 drbd0: bitmap WRITE of 1015 pages took 7 ms
[  166.500321] drbd persist/0 drbd0 node-2: updated bitmap UUID E86819AF4D474727:0000000000000004:0000000000000000:0000000000000000
[  166.500748] drbd persist: State change 3780268308: primary_nodes=2, weak_nodes=FFFFFFFFFFFFFFF9
[  166.500757] drbd persist: Committing cluster-wide state change 3780268308 (17ms)
[  166.500816] drbd persist node-2: conn( Connecting -> Connected ) peer( Unknown -> Secondary ) [connected]
[  166.500825] drbd persist/0 drbd0 node-2: pdsk( DUnknown -> Inconsistent ) repl( Off -> WFBitMapS ) [connected]
[  166.510600] drbd persist/0 drbd0 node-2: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
[  166.519007] drbd persist/0 drbd0 node-2: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
[  166.519035] drbd persist/0 drbd0 node-2: helper command: /usr/sbin/drbdadm before-resync-source
[  166.523064] drbd persist/0 drbd0 node-2: helper command: /usr/sbin/drbdadm before-resync-source exit code 0
[  166.523135] drbd persist/0 drbd0 node-2: repl( WFBitMapS -> SyncSource ) [receive-bitmap]
[  166.523297] drbd persist/0 drbd0 node-2: Began resync as SyncSource (will sync 26599424 KB [6649856 bits set]).
[  266.229892] drbd persist node-2: PingAck did not arrive in time.
[  266.229925] drbd persist node-2: conn( Connected -> NetworkFailure ) peer( Secondary -> Unknown )
[  266.229932] drbd persist/0 drbd0 node-2: repl( SyncSource -> Off )
[  266.230252] drbd persist node-2: Terminating sender thread
[  266.230280] drbd persist node-2: Starting sender thread (from drbd_r_persist [8675])
[  266.245998] drbd persist node-2: Connection closed
[  266.246016] drbd persist node-2: helper command: /usr/sbin/drbdadm disconnected
[  266.248999] drbd persist node-2: helper command: /usr/sbin/drbdadm disconnected exit code 0
[  266.249042] drbd persist node-2: conn( NetworkFailure -> Unconnected ) [disconnected]
[  266.249055] drbd persist node-2: Restarting receiver thread
[  266.249064] drbd persist node-2: conn( Unconnected -> Connecting ) [connecting]
[  266.249085] drbd persist node-2: Configured local address not found, retrying every 10 sec, err=-99
[  543.224049] drbd persist node-2: Handshake to peer 2 successful: Agreed network protocol version 122
[  543.224057] drbd persist node-2: Feature flags enabled on protocol level: 0x7f TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES RESYNC_DAGTAG
[  543.225910] drbd persist: Preparing cluster-wide state change 3373764551 (1->2 499/145)
[  543.235989] drbd persist/0 drbd0 node-2: drbd_sync_handshake:
[  543.235995] drbd persist/0 drbd0 node-2: self E86819AF4D474727:0000000000000004:0000000000000000:0000000000000000 bits:4958062 flags:120
[  543.236003] drbd persist/0 drbd0 node-2: peer 83E0478EA8FD2BB2:0000000000000000:83E0478EA8FD2BB2:0000000000000000 bits:4956507 flags:1124
[  543.236012] drbd persist/0 drbd0 node-2: uuid_compare()=source-copy-other-bitmap by rule=bitmap-self-other
[  543.236019] drbd persist/0 drbd0 node-2: Copying bitmap of peer node_id=0 (bitmap_index=1)
[  543.245444] drbd persist: State change 3373764551: primary_nodes=2, weak_nodes=FFFFFFFFFFFFFFF9
[  543.245453] drbd persist: Committing cluster-wide state change 3373764551 (19ms)
[  543.245491] drbd persist node-2: conn( Connecting -> Connected ) peer( Unknown -> Secondary ) [connected]
[  543.245497] drbd persist/0 drbd0 node-2: repl( Off -> WFBitMapS ) [connected]
[  543.252591] drbd persist/0 drbd0 node-2: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 935(1), total 935; compression: 99.9%
[  543.260772] drbd persist/0 drbd0 node-2: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 935(1), total 935; compression: 99.9%
[  543.260795] drbd persist/0 drbd0 node-2: helper command: /usr/sbin/drbdadm before-resync-source
[  543.265185] drbd persist/0 drbd0 node-2: helper command: /usr/sbin/drbdadm before-resync-source exit code 0
[  543.265237] drbd persist/0 drbd0 node-2: repl( WFBitMapS -> SyncSource ) [receive-bitmap]
[  543.265349] drbd persist/0 drbd0 node-2: Began resync as SyncSource (will sync 147708 KB [36927 bits set]).
[  545.059544] drbd persist/0 drbd0 node-2: updated UUIDs E86819AF4D474727:0000000000000000:0000000000000000:0000000000000000
[  545.063831] drbd persist/0 drbd0 node-2: Resync done (total 1 sec; paused 0 sec; 147708 K/sec)
[  545.063847] drbd persist/0 drbd0 node-2: pdsk( Inconsistent -> UpToDate ) repl( SyncSource -> Established ) [resync-finished]
```

From: Joel Colledge <joel.colledge at linbit.com>
Sent: Tuesday, March 19, 2024 1:56 AM
To: Tim Westbrook <Tim_Westbrook at selinc.com>
Cc: drbd-user at lists.linbit.com <drbd-user at lists.linbit.com>
Subject: Re: Usynced blocks if replication is interrupted during initial sync

[Caution - External]

Hi Tim,

Thanks for the report and your previous one with the subject line
"Blocks fail to sync if connection lost during initial sync". This
does look like a bug.

I can reproduce the issue. It appears to be a regression introduced in
drbd-9.1.16 / drbd-9.2.5. Specifically, with the following commit:
f647d1f7fff6 drbd: set bitmap UUID when setting bitmap slot

The issue only occurs when there is a spare bitmap slot. That is, when
create-md was run with --max-peers=2 or a higher value.

Based on a brief look at the logs, the root cause seems to be that the
day0 bitmap is unexpectedly copied after reconnecting during the
initial sync.

A workaround is to bring the node with the data that is to be
preserved up as follows:
$ drbdadm up <res>
$ drbdadm invalidate <res>
$ drbdadm primary --force <res>
This causes all bits in the day0 bitmap to be set, so that when it is
unexpectedly used, a full sync is performed. This only needs to be
done at the very start. No additional steps should be required when
bringing up additional nodes later.

An alternative workaround would be to reduce the max-peers value.
However, it is good practice to keep some spare peer slots for
potential future usage, so this is not ideal.

Best regards,
Joel