On 18 July 2011 12:08, Łukasz Oleś <lukaszoles at gmail.com> wrote:
> On 18 July 2011 10:38, Felix Frank <ff at mpexnet.de> wrote:
>> Hi,
>>
>> you should keep stuff on-list :-)
>
> yeah, wrong button :)
>
>> On 07/18/2011 10:29 AM, Łukasz Oleś wrote:
>>> On 11 July 2011 10:45, Felix Frank <ff at mpexnet.de> wrote:
>>>> On 07/11/2011 10:32 AM, Łukasz Oleś wrote:
>>>>> Online verify found 88 4k blocks out of sync!
>>>>
>>>> 350k out of 3TB?
>>>>
>>>> I'd venture to say this is what can result from unfortunate bit-flips
>>>> somewhere along the way.
>>>>
>>>> Yes, these are bad to have. Yet, as far as I know, few systems can even
>>>> protect you from data corruption happening on the way from RAM to HDD,
>>>> so I doubt you're facing a huge problem.
>>>
>>> During the last week we ran more tests. Firstly, we repeated the
>>> test and again there were mismatches. What is more interesting,
>>> the corrupted files were on the source volume.
>>
>> Like I wrote - data can be corrupted between your CPU/Mem and HDD. It's
>> really hard to guard against this.
>> I've read that ZFS does have defences, but using it atop DRBD will prove
>> difficult ;-)
>>
>>> Then we repeated the test again, but the files were copied locally (not via
>>> iSCSI) and DRBD was disconnected. The files were ok.
>>>
>>> Any ideas?
>>
>> Not really. Your iSCSI target is unlikely to be at fault, since all it
>> does is feed data to DRBD. If your Secondary received sound data,
>> that means that the iSCSI target fed sound data to DRBD.
>>
>> DRBD cannot be blamed either, because it does little more than hand down
>> data to your HDD controller.
>>
>> I have to admit that it's strange, however. My "misfortune theory"
>> doesn't really hold if only one machine is affected. How did you make
>> sure you were writing sound data during your final test?
>
> I'm copying files with known md5 sums

After another week of testing it looks like the problem is solved :)
Between the initiator and the target I have Chelsio 10 Gb Ethernet adapters,
and I probably hit the error described here:
http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5086853&brandind=5000008

After updating the drivers everything seems to work.

Regards,
--
Łukasz Oleś
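
For anyone who wants to repeat the "copy files with known md5 sums" test from this thread, here is a minimal Python sketch of the checking step. It is not the script used in the tests above. The function names, and the dictionary mapping file names to expected checksums, are hypothetical and only illustrate the idea: hash every copied file and report those whose MD5 differs from the known value.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupted(directory, expected_sums):
    """Return the sorted names of files whose MD5 differs from the known one.

    expected_sums maps a file name (relative to directory) to its known
    hex MD5 digest. A file that is missing is reported as well.
    """
    bad = []
    for name, expected in expected_sums.items():
        path = Path(directory) / name
        if not path.is_file() or md5_of_file(path) != expected.lower():
            bad.append(name)
    return sorted(bad)
```

Calling `find_corrupted` on the directory where the volume under test is mounted, with the checksums recorded before the copy, returns an empty list when every file arrived intact. Note that the page cache can hide on-disk corruption, so it may be worth dropping caches or remounting the volume before re-reading the files.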