Hello

On Tue, 7 Jan 2014 16:47:04 +0100
Stefan Bauer <stefan.bauer at cubewerk.de> wrote:

> -----Original Message-----
> From: Christian Hammers <chammers at netcologne.de>
> Sent: Tue 07.01.2014 15:48
> Subject: Re: [DRBD-user] proto c - corrupt files - directories missing
> To: Stefan Bauer <stefan.bauer at cubewerk.de>;
> CC: drbd-user at lists.linbit.com;
> > Hello
> >
> > Have you tried "drbdadm verify clusterdb_res" to check if the secondary is
> > really identical to the primary?
> >
> > I would assume that DRBD only detects corrupted data using checksums when
> > reading and out-of-date data when comparing those checksums on write requests
> > but it cannot detect that the data on your secondary has accidentally become
> > out-of-date.
>
> Hi Christian,
>
> Thank you for your time.
>
> Now it gets strange! I just started a resync after the second node was offline.
>
> [438614.558716] block drbd0: updated sync UUID A712D7A357B968B7:5410F28F1CEC98E8:540FF28F1CEC98E8:736AAB121F6173C0
> [439240.761231] block drbd0: Resync done (total 626 sec; paused 0 sec; 111204 K/sec)
> [439240.761244] block drbd0: updated UUIDs A712D7A357B968B7:0000000000000000:5410F28F1CEC98E8:540FF28F1CEC98E8
> [439240.761255] block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
> [439240.854011] block drbd0: bitmap WRITE of 8933 pages took 23 jiffies
> [439240.854023] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
>
> After this I ran a verify and a bunch of out-of-sync blocks were detected:

If your secondary was only offline for a short time, it only catches up on the
changes that were made during that time. It can therefore resync quite fast,
but it won't detect out-of-sync blocks that have already existed for a long time.
The following messages explain why the filesystem on your secondary node looks
strange :)

> [439694.710861] block drbd0: Out of sync: start=73992, size=8 (sectors)
> [439695.086765] block drbd0: Out of sync: start=270448, size=8 (sectors)
> [439695.087157] block drbd0: Out of sync: start=270768, size=8 (sectors)
> [439695.087293] block drbd0: Out of sync: start=270824, size=8 (sectors)
...
> and so on. Am I right that, after the whole verify process is
> finished, my data should be in "real" sync? :)

No, according to the manpage "drbdadm verify" only marks blocks as out of sync
but does not repair them. I found that unexpected, too.

Try "drbdadm invalidate clusterdb_res" on your *secondary* node. This will
start a complete resync from the primary node and copy every block whose
checksum mismatches. It can take some hours, though.

bye,

-christian-
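
To get a feeling for how much data the verify run flagged, the "Out of sync"
kernel messages quoted above could be tallied with a small script. This is only
a minimal sketch, not part of DRBD: the function names are made up for
illustration, the log text is assumed to be pasted in or read from dmesg
output, and it assumes the sectors in these messages are 512 bytes each.

```python
import re

# Matches kernel log lines like the ones quoted above, e.g.
# "block drbd0: Out of sync: start=73992, size=8 (sectors)"
OOS_RE = re.compile(
    r"block (\S+): Out of sync: start=(\d+), size=(\d+) \(sectors\)"
)

# Assumption: the sectors reported in these messages are 512 bytes each.
SECTOR_BYTES = 512


def parse_out_of_sync(log_text):
    """Return a list of (device, start_sector, size_sectors) tuples."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in OOS_RE.finditer(log_text)
    ]


def total_out_of_sync_bytes(entries):
    """Sum the sizes of all reported ranges and convert sectors to bytes."""
    return sum(size for _dev, _start, size in entries) * SECTOR_BYTES


# The four messages quoted in this thread, used as sample input.
SAMPLE = """\
[439694.710861] block drbd0: Out of sync: start=73992, size=8 (sectors)
[439695.086765] block drbd0: Out of sync: start=270448, size=8 (sectors)
[439695.087157] block drbd0: Out of sync: start=270768, size=8 (sectors)
[439695.087293] block drbd0: Out of sync: start=270824, size=8 (sectors)
"""

if __name__ == "__main__":
    entries = parse_out_of_sync(SAMPLE)
    print(len(entries), "ranges,", total_out_of_sync_bytes(entries), "bytes")
```

For the four lines quoted here that comes to 32 sectors, i.e. 16384 bytes under
the 512-byte assumption. The real total would of course depend on the full log.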