Hi all,

I have a very strange problem with my Linux cluster. I have two Dell servers with a shared DRBD partition for the data. The connection state on both systems seems to be fine, if you can trust /proc/drbd:

default:~ # cat /proc/drbd
version: 0.7.14 (api:77/proto:74)
SVN Revision: 1990 build by root at girgendwas, 2006-05-18 03:03:57
 0: cs:Connected st:Primary/Secondary ld:Consistent
    ns:928491536 nr:148 dw:907947060 dr:35596009 al:42515 bm:2803 lo:0 pe:0 ua:0 ap:0
default:~ #

and on the other side:

backup:~ # cat /proc/drbd
version: 0.7.14 (api:77/proto:74)
SVN Revision: 1990 build by root at girgendwas, 2006-05-18 03:03:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:148 nr:928498956 dw:939440840 dr:80299 al:29 bm:7584 lo:0 pe:0 ua:0 ap:0
backup:~ #

When I force the systems to change roles and the backup system mounts the datadisk partition, at first everything seems to be okay and there are no error messages. But when I look at the files on that partition, I see very strange effects: filenames are not correct, and I get I/O errors when I try to access directories.

backup:~ # ls /datadisk/
. .. a2chive i.coming quarantine spama3sassin

Can you see the dot in "incoming"? I get errors while trying to access that directory:

backup:~ # l /datadisk/
/bin/ls: /datadisk/MailScanner/i.coming: Input/output error
/bin/ls: /datadisk/MailScanner/quarantine: Input/output error
total 16
drwxr-xr-x 6 root root 4096 Dec 6 2005 ./
drwxr-xr-x 22 root root 4096 Sep 25 18:49 ../
?rwxrwx--- 2 postfix www 4096 Jan 27 2006 a2chive
drwxr-xr-x 2 postfix postfix 4096 Oct 24 2005 spama3sassin/
backup:~ #

Has anybody seen such an effect before? I have several other clusters with the same hardware, and this has never happened on them.

With kind regards,
Volker Dose