Note: "permalinks" may not be as permanent as we would like;
direct links to old messages may well be a few messages off.
/ 2006-07-25 14:49:40 -0400 \ Brent A Nelson:

> Some additional info: the mkfs is still hung and a subsequent attempt
> also hung. A short dd to the device did not hang, but it completed far
> too quickly and showed no activity on the secondary. A longer dd did
> hang.
>
> The machine has three stuck processes and top shows that the machine
> is in 100% wait.
>
> All 6 drbd devices have LVM logical volumes for their backing store
> (I used logical volumes so that the block devices wouldn't get
> reordered by the system if a disk disappeared; perhaps there's a
> better way). 3 disks are secondary for the other machine, and 3 disks
> are primary.
>
> Could this be an issue with drbd on LVM? Or maybe something that's
> fixed by a newer drbd version? A bug when compiled with gcc-3.4,
> maybe? Is there anything I should try to help diagnose the situation
> before I attempt to recover (these machines are not yet in production,
> so I can wait a bit, if needed)?

What does "cat /proc/drbd" say (on both nodes)?

It may be that the requests are all there on the secondary, but that
for some reason (which would possibly be a drbd bug) the lower-level
block IO layer on the secondary sees no reason to actually process them
(possibly because drbd fails to communicate the need for this
properly).

Which io-schedulers do you use?
 # grep . /sys/block/*/queue/scheduler

Does it help to do, on the Secondary,
 # sync
or
 # echo s > /proc/sysrq-trigger
or
 # perl -e '$x = "X" x (1024*1024*500)'    # (just consume memory)
or
 # find / -ls > /dev/null                  # trigger some io
or something like that?

Does it help to set the io-scheduler to deadline?
 # echo deadline > /sys/block/_whatever_/queue/scheduler

--
: Lars Ellenberg                               Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH         Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe http://www.linbit.com :

__
please use the "List-Reply" function of your email client.
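[Editorial note: the safe, read-only checks from the reply above can be
collected into a small helper script. This is a hypothetical sketch,
not part of the original mail; `drbd_diag` is an invented name, the
/proc and /sys paths are the standard Linux locations, and the
sysrq write is only attempted when the trigger file is writable (i.e.
when running as root).]

```shell
#!/bin/sh
# Hypothetical helper collecting the diagnostics suggested above.
# Intended for the Secondary node; harmless if drbd is not present.

drbd_diag() {
    # drbd connection/request state on this node (if drbd is loaded)
    [ -r /proc/drbd ] && { echo "== /proc/drbd"; cat /proc/drbd; }

    # which io-scheduler each block device uses (active one in brackets)
    grep . /sys/block/*/queue/scheduler 2>/dev/null

    # flush dirty buffers; sometimes enough to kick queued requests
    sync

    # emergency sync via sysrq; needs root, so only try when writable
    [ -w /proc/sysrq-trigger ] && echo s > /proc/sysrq-trigger

    echo "diagnostics collected"
}

drbd_diag
```

Changing the scheduler itself (the `echo deadline > ...` step) is left
out deliberately, since it alters device behavior rather than merely
observing it.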