Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
/ 2004-10-21 20:03:58 +0100
\ Matthew Hodgson:
> Lars Ellenberg wrote:
>
> >>>/ 2004-10-07 13:46:15 +0200
> >>>\ Alex Ongena:
> >>>
> >>>>Hi,
> >>>>
> >>>>My master stays in WFReportParams forever due to a network
> >>>>failure on my slave.
> >>>>
> >>>>Scenario: Master is running and is Primary, Slave is booting
> >>>>
> >>>>This is the relevant log:
> >>>
> >>>interesting.
> >>>I miss the "handshake successful" message, though.
> >>>anyways, this "should not happen".
> >>>
> >>>we'll have a look.
> >>>
> >>>what kernel is this, in case it matters?
> >>
> >>There is a new release 0.7.5 ... perhaps it fixes this?
> >
> >unlikely.
> >those were unrelated fixes, I think.
>
> I currently have a fileserver stuck with the same problem (I think)
> running 0.7.5 on 2.4.27. The cluster is a pair of identical machines:
>
> # uname -a
> Linux 2.4.27 #14 SMP Tue Oct 12 16:31:10 BST 2004 i686 unknown unknown
> GNU/Linux
>
> vendor_id  : GenuineIntel
> cpu family : 15
> model      : 2
> model name : Intel(R) Xeon(TM) CPU 2.80GHz
> stepping   : 5
> cpu MHz    : 2793.076
> cache size : 512 KB
>
> MemTotal: 2068944 kB
>
> Intel(R) PRO/1000 Network Driver - version 5.4.11
> e1000: eth0, eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
> (using latest e1000-5.4.11 from Intel)
>
> Intel(R) PRO/100 Network Driver - version 2.3.43-k1
> e100: eth2: Intel(R) PRO/100 Network Connection
>
> ICH5: chipset revision 2
>
> 3ware 9500S-8 SCSI-style IDE RAID Controller:
> 3w-9xxx: scsi0: Found a 3ware 9000 Storage Controller at 0xfe8ffc00,
> 3w-9xxx: scsi0: Firmware FE9X 2.02.00.012, BIOS BE9X 2.02.01.037, Ports: 8.
>
> 7x250G + 1 hot spare disks, RAID 5, so ~1.4T logical disk space per node.
>
> drbd: initialised. Version: 0.7.5 (api:76/proto:74)
> drbd: SVN Revision: 1578 build by root at mxtelecom.com, 2004-10-10 18:54:22
>
> eth0 and eth2 are bonded as bond0 and access the main LAN - eth1,
> however, is dedicated to drbd as a gigabit crossover segment direct to
> the other node, on a 10.0.0.0/24 network.
>
> The nodes have been running using protocol C, brought up by:
>
> drbdsetup /dev/drbd0 disk /dev/sda3 internal -1
> drbdsetup /dev/drbd0 primary
> drbdsetup /dev/drbd0 net 10.0.0.2:7788 10.0.0.1:7788 C
> drbdsetup /dev/drbd0 syncer -r 512000
>
> The shared device is a single 1.4T XFS partition.
>
> The e1000 driver has been very flaky, with:
>
> NETDEV WATCHDOG: eth1: transmit timed out
> drbd0: PingAck did not arrive in time.
> drbd0: drbd0_asender [164]: cstate Connected --> NetworkFailure
> drbd0: asender terminated
> drbd0: drbd0_receiver [163]: cstate NetworkFailure --> BrokenPipe
> drbd0: short read expecting header on sock: r=-512
> drbd0: worker terminated
> drbd0: drbd0_receiver [163]: cstate BrokenPipe --> Unconnected
> drbd0: Connection lost.
>
> appearing quite often on the master, associated with:
>
> e1000: eth1: e1000_watchdog: NIC Link is Down
> e1000: eth1: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex
> drbd0: meta connection shut down by peer.
> drbd0: drbd0_asender [181]: cstate Connected --> NetworkFailure
> drbd0: asender terminated
> drbd0: drbd0_receiver [180]: cstate NetworkFailure --> BrokenPipe
> drbd0: short read receiving data block: read 1632 expected 4096
> drbd0: error receiving Data, l: 4112!
> drbd0: worker terminated
> drbd0: drbd0_receiver [180]: cstate BrokenPipe --> Unconnected
> drbd0: Connection lost.
>
> appearing on the slave.

now, thinking about this, I again miss the handshake message in the log.
sorry for the extra wide lines here, but I think it makes it more
readable.
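(An aside on the raw numbers in the quoted logs: in-kernel socket calls
return negative errno values, so "r=-11" and "sock_recvmsg returned -11"
below are -EAGAIN, a receive that timed out with nothing to read, while
the "r=-512" above is -ERESTARTSYS, a receive knocked off the socket by a
signal, presumably drbd kicking its own thread off a dead connection. A
small self-contained userspace demo of the timeout case; this is
illustrative only and not drbd code.)

    /* Demo: a socket receive bounded by a 2-second timeout, the
     * userspace analogue of the in-kernel 2*HZ socket timers Lars
     * discusses below.  The peer stays silent, so recv() fails with
     * EAGAIN -- in-kernel, sock_recvmsg() would return -11. */
    #include <errno.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <sys/time.h>
    #include <unistd.h>

    int main(void)
    {
        int sv[2];
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
            return 1;

        struct timeval tv = { 2, 0 };            /* ~2*HZ */
        setsockopt(sv[0], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

        char hdr[16];            /* the peer, sv[1], never sends */
        ssize_t r = recv(sv[0], hdr, sizeof(hdr), 0);
        if (r < 0 && errno == EAGAIN)
            printf("recv timed out after ~2s, errno=%d (EAGAIN)\n", errno);

        close(sv[0]);
        close(sv[1]);
        return 0;
    }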
both do this:

> drbd0: Connection lost.

here, they should both be completely reset, network-wise...

again, both continue:

> drbd0: drbd0_receiver [328]: cstate Unconnected --> WFConnection
> drbd0: drbd0_receiver [328]: cstate WFConnection --> WFReportParams

now, here both are basically executing the same code. they send off a
handshake packet, then try to receive the one sent by the peer. the next
expected log message would be

  drbd0: Handshake successful: DRBD Network Protocol version ##

but it does not appear.

and now, on $right, it says:

> drbd0: sock_recvmsg returned -11

so, it appears that $left has not managed to get its initial message out,
and $right times out waiting for it. but $left does not time out, it gets
stuck.

now, either it got stuck before it even tried to send, or it got stuck
trying to send, or it got stuck trying to receive.

there may be something in the network stack (it seems to be flaky in your
setup anyway), and it really hangs somewhere in the tcp/ip stack. this
seems rather unlikely, as the socket timers are explicitly set to 2*HZ,
so both operations should time out soonish, and log an error message, as
on $right.

so most likely the whole thing gets stuck in the

  down(&mdev->data.mutex);

of drbd_send_handshake(mdev) ... but I don't see why this could happen,
since asender is dead now, worker has just been restarted and should be
doing nothing, and drbdsetup should not be running anyway. so the only
thing that would try to down it there is the receiver, and it has to
succeed immediately; there cannot be any contention.

it might have something to do with how the disk unplugging on 2.4
works... but then again, you'd reported a completely hung io-subsystem,
not only a hung drbd.

> drbd0: drbd0_receiver [236]: cstate WFReportParams --> BrokenPipe
> drbd0: short read expecting header on sock: r=-11
> drbd0: Discarding network configuration.
>
> DRBD then hangs hard in WFReportParams mode. The underlying device is
> still accessible, but the drbdsetup userland utils hang solid, and of
> course replication is dead.

hm...

> the slave's DRBD comes back up okay, but without the master and being
> desynced, it's obviously useless.
>
> I assume my only option is to reboot the master to get out of this mess
> - if there is any way to stop DRBD from sometimes doing this on network
> failure it would be very much appreciated ;)
>
> Also, any thoughts on what might cause such horrible network/DRBD
> flakiness in the first place would be very gratefully received - are
> there known clashes between the e100 & e1000 drivers? Or with bonding?
> Or the 3ware RAID card or even using XFS?

	Lars Ellenberg

--
please use the "List-Reply" function of your email client.
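(To make the suspected failure mode above concrete, here is a minimal
userspace sketch of the symmetric send-then-receive handshake Lars
describes, with the send path serialized by a per-node mutex as in
drbd_send_handshake()'s down(&mdev->data.mutex). All names here --
do_handshake, data_mutex, struct node -- are invented for the sketch;
this is not the actual drbd 0.7 code. A side that wedges inside the
mutex-guarded send path stays silent and logs nothing, while its peer's
receive times out with EAGAIN: exactly the $left/$right asymmetry in the
logs.)

    #include <errno.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <sys/time.h>
    #include <unistd.h>

    struct node {
        const char *name;
        int sock;
        pthread_mutex_t data_mutex;   /* per-node, cf. mdev->data.mutex */
    };

    static int do_handshake(struct node *n)
    {
        const char pkt[] = "HandShake, proto 74";   /* stand-in packet */
        char buf[64] = "";
        ssize_t r;

        /* send path is serialized; a thread wedged here (or a stuck
         * mutex holder) produces no packet and no log line -- silence,
         * as observed on $left */
        pthread_mutex_lock(&n->data_mutex);
        r = send(n->sock, pkt, sizeof(pkt), 0);
        pthread_mutex_unlock(&n->data_mutex);
        if (r < 0)
            return -1;

        /* then wait for the peer's handshake, bounded by SO_RCVTIMEO */
        r = recv(n->sock, buf, sizeof(buf) - 1, 0);
        if (r < 0) {
            if (errno == EAGAIN)   /* timeout expired: peer silent */
                fprintf(stderr, "%s: recv timed out (EAGAIN)\n", n->name);
            return -1;
        }
        buf[r] = '\0';
        printf("%s: Handshake successful: \"%s\"\n", n->name, buf);
        return 0;
    }

    static void *run(void *arg)
    {
        do_handshake(arg);
        return NULL;
    }

    int main(void)
    {
        int sv[2];
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
            return 1;

        /* 2-second receive timeouts, analogue of the 2*HZ socket timers */
        struct timeval tv = { 2, 0 };
        setsockopt(sv[0], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
        setsockopt(sv[1], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

        struct node left  = { "$left",  sv[0], PTHREAD_MUTEX_INITIALIZER };
        struct node right = { "$right", sv[1], PTHREAD_MUTEX_INITIALIZER };

        pthread_t tl, tr;
        pthread_create(&tl, NULL, run, &left);
        pthread_create(&tr, NULL, run, &right);
        pthread_join(tl, NULL);
        pthread_join(tr, NULL);
        close(sv[0]);
        close(sv[1]);
        return 0;
    }

(Under this model Lars's argument holds: the receiver is the only thread
left that takes the mutex on this path, so the send should complete
without contention, and a hang there points below the socket send path,
e.g. at the 2.4 unplug interaction he mentions.)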