Hi,

We have been seeing a strange replication problem since we started using protocol B. The scenario is the following:

Some binary files are saved to the replicated IO pair (kernel 3.0.13, drbd-8.3.12, protocol B, ext3). Later they are copied to another (also replicated) directory. They are still consistent, and there is no problem until io1 (the current Primary) is rebooted. Strangely, it takes a reboot: a forced role change does not trigger the symptom. io2 takes over the Primary role, and when the cluster starts using the binary files, they show checksum errors.

We turned off the write cache on the SAS disks ( sdparam --set WCE=0 /dev/sda ) and the symptom seemed to disappear, but later it surfaced again. The corrupted binary files contain a hole of some 40 kbytes filled with zeros.

Yes, it could be a HW issue, but we did not see it with protocol C (which is unfortunately deadly slow on our system).

Has anyone seen something similar?

Thanks,
Akos
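
For anyone trying to reproduce this: the zero-filled holes described above can be located with a small script along the following lines. This is only a rough sketch and not part of DRBD or any of its tools. The function names (find_zero_runs, scan_file) and the 4096-byte minimum run length are assumptions chosen for illustration; legitimate binary data can also contain long runs of zeros, so a hit is a hint, not proof of corruption.

```python
def find_zero_runs(data, min_len=4096):
    """Return a list of (offset, length) for every run of zero bytes
    that is at least min_len bytes long.

    min_len=4096 is an arbitrary illustrative threshold (an assumption),
    well below the ~40 kbyte holes mentioned in the report.
    """
    runs = []
    start = None  # offset where the current zero run began, if any
    for i, b in enumerate(data):
        if b == 0:
            if start is None:
                start = i
        elif start is not None:
            # A non-zero byte ends the current run.
            if i - start >= min_len:
                runs.append((start, i - start))
            start = None
    # Handle a run that extends to the end of the data.
    if start is not None and len(data) - start >= min_len:
        runs.append((start, len(data) - start))
    return runs


def scan_file(path, min_len=4096):
    """Read the whole file into memory and report its long zero runs.

    Reading everything at once is fine for modest files; very large
    files would need chunked reading, which this sketch does not do.
    """
    with open(path, "rb") as f:
        return find_zero_runs(f.read(), min_len)
```

Running scan_file() on the same file on both nodes (after io2 has become Primary) and comparing the reported offsets against a known-good copy should show where the hole starts and how long it is.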