[DRBD-user] Data corruption after a failover-resync-failover test.

Michael Stern mstern1991 at gmail.com
Sat May 23 13:50:32 CEST 2015

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


I had a very peculiar situation with DRBD.
I performed a fail-over test, which worked, but after adding the failed
node back as secondary (discarding its data) and performing another
fail-over test, all data on the secondary node turned out to be completely
useless (files that contained garbage after the first block, files containing
only garbage, etc.).

Here is my setup:

I run two DRBD nodes in primary/secondary mode. Normally only the primary
server is used, but in an emergency the secondary, less powerful server can
be activated manually to provide at least some service (with at least a
crash-consistent state of the data).

The primary server uses an LVM volume of 40 GiB as the backing device for
DRBD; the resulting DRBD device is handed to the domU, which creates its
partitions on it. The domU does not know that DRBD is backing it, so it
partitions the DRBD block device as if it were a plain disk.
My mistake, I assume, is that I checked the disk size from within the domU
and used that number to size the partition backing DRBD on the secondary
server (which is a mere virtual server). That backing partition ended up
slightly too small, since I should have used the size of the LVM volume
backing DRBD on the primary instead.
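
For what it's worth, this is roughly the size check that would have caught
the mismatch; the volume and device names here are only placeholders for my
actual setup:

# On the primary: size of the LVM volume backing DRBD -- this is what the
# secondary's backing device has to match (or exceed)
blockdev --getsize64 /dev/vg0/drbd_backing

# Size of the DRBD device itself, i.e. what the domU sees; this is smaller,
# because DRBD keeps its internal metadata at the end of the backing device
blockdev --getsize64 /dev/drbd2

# On the secondary: size of the partition I created as the backing device
blockdev --getsize64 /dev/xvdb1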

Synchronization worked, and so did fail-over (a rough sketch of the commands
follows the list):
- I disconnected the secondary node.
- I promoted it to primary.
- I mounted the DRBD volume (mount -o rw /dev/mapper/"$drbdres"1 /data).
- I ran tests, wrote to the disk, etc.
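
In terms of commands, the steps above were roughly the following; the
resource name in $drbdres is mine, and the kpartx call is my reconstruction
of how the partition mappings were created:

drbdadm disconnect "$drbdres"             # stop talking to the peer
drbdadm primary "$drbdres"                # promote the local node to Primary
kpartx -a /dev/drbd/by-res/"$drbdres"     # map the partitions inside the DRBD device
mount -o rw /dev/mapper/"$drbdres"1 /data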

May 22 14:27:08 kernel: [935443.236935] block drbd2: peer( Primary ->
Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
May 22 14:27:08 kernel: [935443.238259] block drbd2: asender terminated
May 22 14:27:08 kernel: [935443.238263] block drbd2: Terminating
drbd2_asender
May 22 14:27:08 kernel: [935443.238327] block drbd2: Connection closed
May 22 14:27:08 kernel: [935443.238331] block drbd2: conn( Disconnecting ->
StandAlone )
May 22 14:27:08 kernel: [935443.238340] block drbd2: receiver terminated
May 22 14:27:08 kernel: [935443.238342] block drbd2: Terminating
drbd2_receiver
May 22 14:27:26 kernel: [935460.609680] block drbd2: role( Secondary ->
Primary )
May 22 14:28:24 kernel: [935519.197240] device-mapper: multipath: version
1.3.2 loaded
May 22 14:28:41 kernel: [935535.837046] device-mapper: table: 252:2: drbd2
too small for target: start=81793024, len=2093056, dev_size=83883448
May 22 14:38:02 kernel: [936096.382990] EXT4-fs (dm-0):
ext4_orphan_cleanup: deleting unreferenced inode 666094
May 22 14:38:02 kernel: [936096.384190] EXT4-fs (dm-0):
ext4_orphan_cleanup: deleting unreferenced inode 683278
May 22 14:38:02 kernel: [936096.384199] EXT4-fs (dm-0):
ext4_orphan_cleanup: deleting unreferenced inode 683273
May 22 14:38:02 kernel: [936096.384206] EXT4-fs (dm-0):
ext4_orphan_cleanup: deleting unreferenced inode 683272
May 22 14:38:02 kernel: [936096.384211] EXT4-fs (dm-0):
ext4_orphan_cleanup: deleting unreferenced inode 683271
May 22 14:38:02 kernel: [936096.384216] EXT4-fs (dm-0):
ext4_orphan_cleanup: deleting unreferenced inode 683270
May 22 14:38:02 kernel: [936096.384220] EXT4-fs (dm-0): 6 orphan inodes
deleted
May 22 14:38:02 kernel: [936096.384222] EXT4-fs (dm-0): recovery complete
May 22 14:38:02 kernel: [936096.385250] EXT4-fs (dm-0): mounted filesystem
with ordered data mode. Opts: (null)

Note in particular the message from device-mapper: the partition starting at
sector 81793024 with a length of 2093056 sectors would need a device of
83886080 sectors, but drbd2 on this node is only 83883448 sectors, i.e.
about 1.3 MiB short. The partition at that location is the swap partition,
so I could still work on the data partition, but I wonder whether something
had already gone wrong at this point. I did not try to activate the swap
partition.

Since the now-promoted server had proven it could take over, I wanted to
restore the previous state (a rough sketch of the commands follows the list):
- I demoted it and reconnected it as secondary, with --discard-my-data.
- About 1.2 GB were synced, and everything was "UpToDate" again.
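
The demotion and reconnect were roughly the following; names as above, and
on older 8.3-style tooling the option goes before the subcommand
(drbdadm -- --discard-my-data connect "$drbdres"):

umount /data
kpartx -d /dev/drbd/by-res/"$drbdres"          # remove the partition mappings
drbdadm secondary "$drbdres"                   # demote back to Secondary
drbdadm connect --discard-my-data "$drbdres"   # reconnect, discard local changes, resync from the peer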

May 22 15:29:40 kernel: [939194.560779] block drbd2: Split-Brain detected,
manually solved. Sync from peer node
May 22 15:29:40 kernel: [939194.560784] block drbd2: peer( Unknown ->
Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated )
pdsk( DUnknown -> UpToDate )
May 22 15:29:40 kernel: [939194.579725] block drbd2: receive bitmap stats
[Bytes(packets)]: plain 0(0), RLE 15(1), total 15; compression: 100.0%
May 22 15:29:40 kernel: [939194.584810] block drbd2: conn( WFBitMapT ->
WFSyncUUID )
May 22 15:29:40 kernel: [939195.171600] block drbd2: updated sync uuid
41AB15EA146152E0:0000000000000000:22955A9C3A791A4A:22945A9C3A791A4B
May 22 15:29:40 kernel: [939195.171882] block drbd2: helper command:
/sbin/drbdadm before-resync-target minor-2
May 22 15:29:40 kernel: [939195.173982] block drbd2: helper command:
/sbin/drbdadm before-resync-target minor-2 exit code 0 (0x0)
May 22 15:29:40 kernel: [939195.173988] block drbd2: conn( WFSyncUUID ->
SyncTarget ) disk( Outdated -> Inconsistent )
May 22 15:29:40 kernel: [939195.173993] block drbd2: Began resync as
SyncTarget (will sync 1226284 KB [306571 bits set]).
May 22 15:33:12 kernel: [939406.611609] block drbd2: Resync done (total 211
sec; paused 0 sec; 5808 K/sec)
May 22 15:33:12 kernel: [939406.611615] block drbd2: updated UUIDs
1A549A81F3FE347E:0000000000000000:41AB15EA146152E0:41AA15EA146152E1
May 22 15:33:12 kernel: [939406.611620] block drbd2: conn( SyncTarget ->
Connected ) disk( Inconsistent -> UpToDate )


I ran the same test again:
- I disconnected the secondary node.
- I promoted it to primary.
- I mounted the DRBD volume.
- I noticed that none of the services would start anymore.
- I checked the configuration files, which contained garbage data.
- According to the syslog, the filesystem had to correct a few entries during
the mount; the files were there, but contained no useful data. (A read-only
check that shows this without touching anything is sketched after this list.)
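
In hindsight, a read-only filesystem check on the mapped data partition
(device name as above) would have shown the damage without changing anything;
just a sketch:

e2fsck -n /dev/mapper/"$drbdres"1    # -n: open read-only and answer "no" to all repair prompts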

May 22 16:07:36 kernel: [941470.358138] block drbd2: peer( Primary ->
Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
May 22 16:07:36 kernel: [941470.358461] block drbd2: asender terminated
May 22 16:07:36 kernel: [941470.358464] block drbd2: Terminating
drbd2_asender
May 22 16:07:36 kernel: [941470.358525] block drbd2: Connection closed
May 22 16:07:36 kernel: [941470.358529] block drbd2: conn( Disconnecting ->
StandAlone )
May 22 16:07:36 kernel: [941470.358580] block drbd2: receiver terminated
May 22 16:07:36 kernel: [941470.358583] block drbd2: Terminating
drbd2_receiver
May 22 16:07:36 kernel: [941470.361063] block drbd2: role( Secondary ->
Primary )
May 22 16:07:36 kernel: [941470.361220] block drbd2: new current UUID
C6E989B8069E233D:1A549A81F3FE347E:41AB15EA146152E0:41AA15EA146152E1
May 22 16:07:36 kernel: [941470.366794] device-mapper: table: 252:2: drbd2
too small for target: start=81793024, len=2093056, dev_size=83883448
May 22 16:07:36 kernel: [941470.443423] EXT4-fs (dm-0):
ext4_orphan_cleanup: deleting unreferenced inode 665239
May 22 16:07:36 kernel: [941470.443796] EXT4-fs error (device dm-0):
ext4_mb_generate_buddy:739: group 158, 1528 clusters in bitmap, 5723 in gd
May 22 16:07:36 kernel: [941470.449068] EXT4-fs (dm-0):
ext4_orphan_cleanup: truncating inode 662235 to 757 bytes
May 22 16:07:36 kernel: [941470.449426] EXT4-fs (dm-0): 1 orphan inode
deleted
May 22 16:07:36 kernel: [941470.449429] EXT4-fs (dm-0): 1 truncate cleaned
up
May 22 16:07:36 kernel: [941470.449431] EXT4-fs (dm-0): recovery complete
May 22 16:07:36 kernel: [941470.450756] EXT4-fs (dm-0): mounted filesystem
with ordered data mode. Opts: (null)
May 22 16:07:40 kernel: [941474.730794] EXT4-fs error (device dm-0):
ext4_lookup:1047: inode #1179664: comm tail: deleted inode referenced:
1247435
May 22 16:07:40 kernel: [941474.734670] EXT4-fs error (device dm-0):
ext4_lookup:1047: inode #1179664: comm tail: deleted inode referenced:
1247435
May 22 16:09:33 kernel: [941587.487254] EXT4-fs error (device dm-0):
ext4_lookup:1047: inode #664780: comm bash: deleted inode referenced: 683837
May 22 16:09:33 kernel: [941587.727142] EXT4-fs error (device dm-0):
ext4_lookup:1047: inode #664780: comm vi: deleted inode referenced: 683837
[...]
May 22 16:09:33 kernel: [941587.730099] EXT4-fs error (device dm-0):
ext4_lookup:1047: inode #664780: comm vi: deleted inode referenced: 683837

Running online verification found tons of blocks that were out of sync.
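
For reference, the verification was started roughly like this (resource name
as in the sketches above; this requires a verify-alg to be configured for the
resource):

drbdadm verify "$drbdres"    # out-of-sync blocks show up in the kernel log, progress in /proc/drbd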

May 22 16:18:30 kernel: [942125.119936] block drbd2: Starting Online Verify
from sector 0
May 22 16:18:35 kernel: [942129.459450] block drbd2: Out of sync:
start=2056, size=16 (sectors)
May 22 16:18:52 kernel: [942146.550253] block drbd2: Out of sync:
start=10272, size=8 (sectors)
May 22 16:21:19 kernel: [942293.893023] block drbd2: Out of sync:
start=1374144, size=64 (sectors)
May 22 16:21:20 kernel: [942295.189719] block drbd2: Out of sync:
start=1387992, size=8 (sectors)
May 22 16:21:20 kernel: [942295.193997] block drbd2: Out of sync:
start=1388032, size=432 (sectors)
May 22 16:21:21 kernel: [942295.292691] block drbd2: Out of sync:
start=1388768, size=288 (sectors)
May 22 16:21:21 kernel: [942295.377464] block drbd2: Out of sync:
start=1389616, size=16 (sectors)
May 22 16:21:21 kernel: [942295.380208] block drbd2: Out of sync:
start=1389656, size=48 (sectors)
May 22 16:21:23 kernel: [942297.476909] block drbd2: Out of sync:
start=1410272, size=800 (sectors)
May 22 16:21:34 kernel: [942308.667068] block drbd2: Out of sync:
start=1525888, size=8 (sectors)
May 22 16:21:34 kernel: [942308.669117] block drbd2: Out of sync:
start=1525896, size=8 (sectors)
May 22 16:21:34 kernel: [942309.102272] block drbd2: Out of sync:
start=1530080, size=24 (sectors)
May 22 16:21:34 kernel: [942309.239587] block drbd2: Out of sync:
start=1531104, size=656 (sectors)
May 22 16:21:35 kernel: [942309.681967] block drbd2: Out of sync:
start=1536088, size=328 (sectors)
May 22 16:21:39 kernel: [942313.464188] block drbd2: Out of sync:
start=1574936, size=16 (sectors)

No errors were logged by DRBD that would indicate to me that the data was
trashed.

My question is:
- Was the possible mistake that I mentioned above an actual mistake? (I am
guessing that, because of DRBD's internal metadata, the disk visible inside
the domU was actually smaller than the backing space DRBD needs. So I
accidentally made the backing partition on the secondary server slightly too
small, and when the secondary node was promoted, it somehow shredded some of
DRBD's metadata?) Could this have been the cause of the corruption? A
back-of-the-envelope check follows.
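
If I read the estimate for internal metadata in the DRBD user guide correctly
(metadata sectors = ceil(data sectors / 2^18) * 8 + 72), the numbers do line
up with the device-mapper message above; this is only a rough check:

Cs=$((40 * 1024 * 1024 * 1024 / 512))      # 40 GiB in 512-byte sectors = 83886080
Ms=$(( (Cs + 262143) / 262144 * 8 + 72 ))  # internal metadata estimate = 2632 sectors
echo $((Cs - Ms))                          # 83883448 -- exactly the dev_size reported above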

I have no knowledge of how DRBD works internally, so I'm at a loss, but
perhaps someone can shed some light on this?

Thanks,
Michael