Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On Thu, Apr 17, 2008 at 4:56 PM, Lars Ellenberg <lars.ellenberg at linbit.com> wrote:
>
> On Thu, Apr 17, 2008 at 01:15:21PM +0200, Szeróvay Gergely wrote:
> > Hello All,
> >
> > I have a 3-node system with 25 DRBD-mirrored partitions; their total
> > size is about 250 GB. The 3 nodes are:
> > - immortal: Intel 82573L Gigabit Ethernet NIC (kernel 2.6.21.6,
> >   driver: e1000, version: 7.3.20-k2-NAPI, firmware-version: 0.5-7)
> > - endless: Intel 82566DM-2 Gigabit Ethernet NIC (kernel 2.6.22.18,
> >   driver: e1000, version: 7.6.15.4, firmware-version: 1.3-0)
> > - infinity: Intel 82573E Gigabit Ethernet NIC (kernel 2.6.22.18,
> >   driver: e1000, version: 7.6.15.4, firmware-version: 3.1-7)
> >
> > One month ago I switched from DRBD 7.x to 8.2.5. Before that I used the
> > 7.x series without problems, and I had no problems during the update:
> > the parts of the mirrors connected and synced cleanly.
> >
> > After updating I started to verify the DRBD volumes:
> > - most of them usually have no out-of-sync blocks
> > - one gets 2-3 new oos blocks almost every day
> > - a few of them get a new oos block about every week
> >
> > I am trying to track down the source of the oos blocks. I read through
> > the drbd-user archives; in the "Tracking down sources of corruption
> > (possibly) detected by drbdadm verify" thread I found very useful hints.
> >
> > I checked the network connections between every node, in every
> > direction, with this test:
> >
> > host1:~ # md5sum /tmp/file_with_1GB_random_data
> > host2:~ # netcat -l -p 4999 | md5sum
> > host1:~ # netcat -q0 192.168.x.x 4999 < /tmp/file_with_1GB_random_data
> >
> > The test always gives the same md5sums on the two tested nodes, and the
> > transfer speed is about 100 MB/sec when the file is cached.
> >
> > I repeated this test between every node pair many times and found no
> > md5 mismatch.
> >
> > I saved the oos blocks from the underlying device with commands like
> > this:
> >
> > host:~ # dd iflag=direct bs=512 skip=11993992 count=8 \
> >             if=/dev/immortal0/65data2 | xxd -a > ./primary_4k_dump
> >
> > when the syslog message was
> >
> > "Apr 17 11:14:09 immortal kernel: drbd6: Out of sync: start=11993992,
> > size=8 (sectors)"
> >
> > and the primary underlying device was /dev/immortal0/65data2.
> >
> > I compared the problematic blocks from the two nodes with diff:
> >
> > host:~ # diff ./primary_4k_dump ./secondary_4k_dump
> >
> > I usually found a 1-2 byte difference between the blocks on the two
> > nodes, but one time I found that the last 1336 bytes of the block were
> > zeroed out (on the other node it held "random" data). Two examples:
> >
> > one 4k block oos:
> > 2c2
> > < 0000010: 0000 0000 1500 0000 0000 01ff 0000 0000  ................
> > ---
> > > 0000010: 0000 0000 1500 0000 0001 01ff 0000 0000  ................
> >
> > another 4k block oos:
> > 22c22
> > < 00001f0: 0b85 0000 0000 0000 1800 0000 0000 0000  ................
> > ---
> > > 00001f0: 2d79 0000 0000 0000 1800 0000 0000 0000  -y..............
> >
> > Any idea would help.
>
> what file systems?
> what kernel version?
> what drbd protocol?
>
> it is possible (I got this suspicion earlier, but could not prove it
> during local testing) that something submits a buffer to the block
> device stack, but then modifies this buffer while it is still in flight.
>
> these snippets you show look suspiciously like block maps.
> if the block offset also confirms that this is within some filesystem
> block map, then this is my working theory of what happens:
>
>   ext3 submits block to drbd
>   drbd writes to local storage
>   ext3 modifies the page, even though the bio is not yet completed
>   drbd sends the (now modified) page over the network
>   drbd is notified of local completion
>   drbd receives acknowledgement of remote completion
>   original request completed.
>
> I ran into these things while testing the "data integrity" feature,
> i.e. "data-integrity-alg md5sum", where every now and then an ext3 on
> top of drbd would produce "wrong checksums", and the hexdump of the
> corresponding data payload always looked like a block map, and was
> different in just one 64-bit "pointer".
>
> --
> : Lars Ellenberg                            http://www.linbit.com :
> : DRBD/HA support and consulting            sales at linbit.com   :
> : LINBIT Information Technologies GmbH      Tel +43-1-8178292-0  :
> : Vivenotgasse 48, A-1120 Vienna/Europe     Fax +43-1-8178292-82 :
> __
> please don't Cc me, but send to list -- I'm subscribed
>
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
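
For reference, a minimal sketch (plain /bin/sh; the log line and device path
are just the examples quoted above) of how the "Out of sync" kernel message
maps onto the dd/xxd dump commands used in this thread: start= and size= are
given in 512-byte sectors, so size=8 is one 4 KiB block.

#!/bin/sh
# Turn a DRBD "Out of sync" kernel log line into a hexdump of the affected
# block on the local backing device. Adjust LOGLINE and DEV for the resource
# in question; run the same on the peer node and diff the two dumps.
LOGLINE='drbd6: Out of sync: start=11993992, size=8 (sectors)'
DEV=/dev/immortal0/65data2          # backing device of drbd6 on this node

START=$(echo "$LOGLINE" | sed -n 's/.*start=\([0-9]*\),.*/\1/p')  # 512-byte sectors
SIZE=$(echo "$LOGLINE"  | sed -n 's/.*size=\([0-9]*\) .*/\1/p')   # 8 sectors = 4 KiB

# O_DIRECT read so we see what is on disk, not what is in the page cache
dd iflag=direct bs=512 skip="$START" count="$SIZE" if="$DEV" 2>/dev/null \
    | xxd -a > ./local_4k_dump

# then, with the dump copied over from the other node:
#   diff ./local_4k_dump ./peer_4k_dump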
DRBD 8.2.5 with protocol "C".

Kernel versions (kernels from kernel.org with the Vserver patch):
node "immortal": 2.6.21.6-vs2.2.0.3, 32-bit SMP
node "endless":  2.6.22.18-vs2.2.0.6, 32-bit SMP (with the new e1000 driver)
node "infinity": 2.6.22.18-vs2.2.0.6, 32-bit SMP (with the new e1000 driver)

I use Reiserfs, usually with group quotas enabled. The DRBD devices sit on
top of LVM2 (and on software RAID1 in some cases).

My system often has heavy load, but I cannot find any connection between the
oos blocks and the load. My most problematic volume contains a MySQL 5
database. I tried to stress it by moving big files onto the volume, but the
oos blocks were not generated more frequently.

I tried the crc32 data-integrity-alg on the most problematic volume; it
detected a few errors per day, but I do not think they are network errors,
because the network passes the tests cleanly and the full resyncs produced
no corruption.

Thank you:
Gergely
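
A minimal sketch of how the data-integrity check and online verify discussed
above might be driven on DRBD 8.2.x. The resource name "r6" and the log path
are placeholders, and the digest name must be one the kernel crypto API
actually provides (Lars refers to md5sum, the crc32 variant was tried here):

# in the resource's net section of /etc/drbd.conf, e.g.:
#   net {
#     data-integrity-alg md5;   # checksums each data packet on the wire
#   }
drbdadm adjust r6                        # apply the config change to the running resource
drbdadm verify r6                        # start an online verify (needs a verify algorithm configured,
                                         # which this setup evidently already has)
grep 'Out of sync' /var/log/messages     # blocks the verify flagged as oos (log path may differ)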