Hello,

I'm currently using DRBD for test purposes on two Dell PE2950's with a megasas disk system. My system is a SuSE 10.1 installation - kernel 2.6.16.13-4-smp, drbd:

--
drbd: initialised. Version: 0.7.17 (api:77/proto:74)
drbd: SVN Revision: 2125 build by lmb at chip, 2006-03-27 17:40:22
--

I have created a simple DRBD volume using the config below (no linux-ha, heartbeat etc. yet):

--
resource drbd0 {
  protocol C;
  incon-degr-cmd "halt -f";

  syncer {
    rate 110M;
    group 1;
    al-extents 257;
  }

  on dkvm1 {
    device /dev/drbd1;
    disk /dev/sda6;
    address 10.100.10.101:7789;
    meta-disk internal;
  }

  on dkvm2 {
    device /dev/drbd1;
    disk /dev/sda6;
    address 10.100.10.102:7789;
    meta-disk internal;
  }
}
--

A manual test unplugging my primary node was successful: the state change was detected, and an fsck and remount gave me filesystem access on my secondary node.

Today I hit this bug on my secondary node:
http://www.ussg.iu.edu/hypermail/linux/kernel/0610.1/2599.html

getting log entries like:

--
sd 0:2:0:0: megasas: RESET -2189309 cmd=2a
megasas: [ 0]waiting for 16 commands to complete
megasas: [ 5]waiting for 16 commands to complete
megasas: [10]waiting for 16 commands to complete
--

I would expect my secondary node to fail and my primary to stay online. Instead, /dev/drbd1 seemed to be in a "hanging" state in which my primary node also hung. When I issued an 'ifconfig eth0 down' (eth0 = DRBD interface) on my secondary node, the primary woke up and continued.

Is there something I forgot in my config to handle the situation where SCSI bus errors leave the secondary node in a semi-working condition? The kernel was up and networking was up, but SCSI was dead.

I placed the last part of dmesg from both nodes here:
http://spa.ceman.info/drbd

If more info is needed, please state what is needed.

regards,
Lars