[DRBD-user] drbd syncing again from the start

Digimer lists at alteeve.ca
Thu Oct 17 17:21:00 CEST 2019


8.4.4 is old. Can you upgrade to the latest 8.4.11? I believe 8.4.4 is
older than the versions other reporters of a similar issue were
running, so this may already be fixed. Upgrading to .11 should not
cause any issues.
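
To confirm what each node is actually running (a quick sketch; package
names and upgrade channels vary by distro):

  cat /proc/drbd                      # loaded module prints its version
  modinfo drbd | grep -iw version     # version of the installed module
  rpm -qa | grep -i drbd              # installed drbd packages (RPM systems)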

PS - Please keep replies on the list. These discussions help others by
being in the archives.

digimer

On 2019-10-17 9:54 a.m., Paras pradhan wrote:
> The drbd version is drbd-8.4.4-0.27.4.2, and yes, we are upgrading to
> version 9 in the near future.
> 
> No, it is not a live snapshot. Both drbd nodes were shut down, and a
> Clonezilla bootable image was used to take the backup and to restore.
> 
> Thanks
> Paras.
> 
> On Wed, Oct 16, 2019 at 7:26 PM Digimer <lists at alteeve.ca> wrote:
> 
>     I don't see the version, but looking in the mailing list archives, the
>     common recommendation is to upgrade. What version of DRBD 8 are you
>     using, exactly?
> 
>     Does the resync happen only after recovery? Were the backups of the
>     nodes done via live snapshot? If so, then if there is _any_ time
>     between the two nodes being snapped, the UUIDs will differ, and that
>     could be causing this.
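> 
>     If you want to check that, compare the generation identifiers (the
>     UUIDs shown in the handshake log) on both nodes. A minimal sketch,
>     assuming a resource named "r0" (substitute your actual resource
>     name):
> 
>         drbdadm show-gi r0    # annotated generation identifiers
>         drbdadm get-gi r0     # same data, compact form
> 
>     If the two nodes' current UUIDs no longer share a common history,
>     DRBD treats the data as unrelated and falls back to a full sync.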
> 
>     digimer
> 
>     On 2019-10-16 10:04 a.m., Paras pradhan wrote:
>     > Hi
>     >
>     > Here is the log for one of the drbd resources (which is 300 GB).
>     >
>     > --
>     > [  194.780377] block drbd1: disk( Diskless -> Attaching )
>     > [  194.780536] block drbd1: max BIO size = 1048576
>     > [  194.780548] block drbd1: drbd_bm_resize called with capacity == 629126328
>     > [  194.783069] block drbd1: resync bitmap: bits=78640791 words=1228763 pages=2400
>     > [  194.783077] block drbd1: size = 300 GB (314563164 KB)
>     > [  194.793958] block drbd1: bitmap READ of 2400 pages took 3 jiffies
>     > [  194.796342] block drbd1: recounting of set bits took additional 1 jiffies
>     > [  194.796348] block drbd1: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
>     > [  194.796359] block drbd1: disk( Attaching -> Outdated )
>     > [  194.796366] block drbd1: attached to UUIDs 56E4CF14A115440C:0000000000000000:02DCAB23D758DA48:02DBAB23D758DA49
>     > [  475.740272] block drbd1: drbd_sync_handshake:
>     > [  475.740280] block drbd1: self 56E4CF14A115440C:0000000000000000:02DCAB23D758DA48:02DBAB23D758DA49 bits:0 flags:0
>     > [  475.740288] block drbd1: peer F5A226CE3F2DA2F2:0000000000000000:56E5CF14A115440D:56E4CF14A115440D bits:0 flags:0
>     > [  475.740295] block drbd1: uuid_compare()=-2 by rule 60
>     > [  475.740299] block drbd1: Writing the whole bitmap, full sync required after drbd_sync_handshake.
>     > [  475.757877] block drbd1: bitmap WRITE of 2400 pages took 4 jiffies
>     > [  475.757888] block drbd1: 300 GB (78640791 bits) marked out-of-sync by on disk bit-map.
>     > [  475.758018] block drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
>     > [  475.800134] block drbd1: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
>     > [  475.802697] block drbd1: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
>     > [  475.802717] block drbd1: conn( WFBitMapT -> WFSyncUUID )
>     > [  475.815155] block drbd1: updated sync uuid CEF5B26573C154CC:0000000000000000:02DCAB23D758DA48:02DBAB23D758DA49
>     > [  475.815377] block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1
>     > [  475.820270] block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1 exit code 0 (0x0)
>     > [  475.820293] block drbd1: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent )
>     > [  475.820306] block drbd1: Began resync as SyncTarget (will sync 314563164 KB [78640791 bits set]).
>     > [  538.518371] block drbd1: peer( Secondary -> Primary )
>     > [  538.548954] block drbd1: role( Secondary -> Primary )
>     > [ 2201.521232] block drbd1: conn( SyncTarget -> PausedSyncT ) user_isp( 0 -> 1 )
>     > [ 2201.521237] block drbd1: Resync suspended
>     > [ 2301.930484] block drbd1: conn( PausedSyncT -> SyncTarget ) user_isp( 1 -> 0 )
>     > [ 2301.930490] block drbd1: Syncer continues.
>     > [ 5216.750314] block drbd1: Resync done (total 4740 sec; paused 100 sec; 67792 K/sec)
>     > [ 5216.750323] block drbd1: 98 % had equal checksums, eliminated: 311395164K; transferred 3168000K total 314563164K
>     > [ 5216.750333] block drbd1: updated UUIDs F5A226CE3F2DA2F3:0000000000000000:CEF5B26573C154CD:56E5CF14A115440D
>     > [ 5216.750343] block drbd1: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
>     > [ 5216.750518] block drbd1: helper command: /sbin/drbdadm after-resync-target minor-1
>     > [ 5216.845211] block drbd1: helper command: /sbin/drbdadm after-resync-target minor-1 exit code 0 (0x0)
>     > ---
>     >
>     >
>     > Thanks!
>     >
>     > On Wed, Oct 16, 2019 at 12:46 AM Digimer <lists at alteeve.ca> wrote:
>     >
>     >     On 2019-10-15 4:58 p.m., Paras pradhan wrote:
>     >     > Hi
>     >     >
>     >     > I have a two-node drbd 8 cluster. We are doing some tests,
>     >     > and while the drbd resources were consistent/synced on both
>     >     > nodes, I powered off both nodes and took a bare-metal backup
>     >     > using Clonezilla.
>     >     >
>     >     > Then I restored both nodes from the backup and started drbd
>     >     > on both nodes. On one of the nodes it started to sync all
>     >     > over again.
>     >     >
>     >     > My question is: the drbd resources were synced when I took
>     >     > the backup, so why is the sync starting all over again? I
>     >     > hope I explained that clearly.
>     >     >
>     >     > Thanks in advance!
>     >     > Paras.
>     >
>     >     Do you have the system logs from when you started DRBD on the
>     >     nodes post-recovery? There should be DRBD log entries on both
>     >     nodes as DRBD started. The reason/trigger of the resync will
>     >     likely be explained in there. If it's not clear, please share
>     >     the logs.
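>     >
>     >     For example, something like this on each node should pull the
>     >     relevant lines (a rough sketch; log locations vary by distro):
>     >
>     >         grep -i drbd /var/log/messages
>     >         # or, on systemd-based systems:
>     >         journalctl -k | grep -i drbd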
>     >
>     >     --
>     >     Digimer
>     >     Papers and Projects: https://alteeve.com/w/
>     >     "I am, somehow, less interested in the weight and convolutions of
>     >     Einstein’s brain than in the near certainty that people of
>     equal talent
>     >     have lived and died in cotton fields and sweatshops." -
>     Stephen Jay
>     >     Gould
>     >
> 
> 
>     -- 
>     Digimer
>     Papers and Projects: https://alteeve.com/w/
>     "I am, somehow, less interested in the weight and convolutions of
>     Einstein’s brain than in the near certainty that people of equal talent
>     have lived and died in cotton fields and sweatshops." - Stephen Jay
>     Gould
> 


-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould

