Hey list,

I'm currently trying to set up a drbd/iscsi/heartbeat environment. This seems to work out quite well:

- two boxes with drbd storage
- heartbeat to monitor links and to start iscsi-target etc.

Upon failure of the primary node, heartbeat starts iscsi-target on the secondary and makes it the primary node. This works great even while clients are writing to the filesystem via iscsi - the iscsi initiator simply reconnects to the cluster IP and continues writing.

But as soon as node1 comes back, both nodes complain about a "split brain" situation and refuse to resync - although nothing has been written to the device on node1 since its disconnect! Shouldn't heartbeat handle this situation? On top of that, I'm not able to resync the devices without playing around with various drbdadm commands (including taking both sides down completely, which would not be an acceptable solution in a production environment).

Both sides are identically configured Debian etch systems, using heartbeat v2 packages and drbd8 packages from backports.org:

    version: 8.0.4 (api:86/proto:86)
    SVN Revision: 2947 build by root at nas02, 2007-10-16 13:43:43

Please note that the two systems do NOT have equal hardware components - it's just a test environment with different storage capacities (~230GB vs. ~60GB).

Below are a log extract and my heartbeat configuration.

dmesg output from nas02 (after nas01 has been started again):

    r8169: eth1: link up
    drbd0: conn( WFConnection -> WFReportParams )
    drbd0: Handshake successful: DRBD Network Protocol version 86
    drbd0: Considerable difference in lower level device sizes: 121720712s vs. 455185536s
    drbd0: Split-Brain detected, dropping connection!
    drbd0: self C765DF24A2676E31:DDFCF96A854616A7:B34DB3CFA71F57C2:15641BAC6660D448
    drbd0: peer E785C3CFDDDC5C05:DDFCF96A854616A7:B34DB3CFA71F57C2:15641BAC6660D448
    drbd0: conn( WFReportParams -> Disconnecting )
    drbd0: error receiving ReportState, l: 4!
    drbd0: meta connection shut down by peer.
    drbd0: asender terminated
    drbd0: tl_clear()
    drbd0: Connection closed
    drbd0: conn( Disconnecting -> StandAlone )
    drbd0: receiver terminated

ha.cf (taken from nas02; apart from the ucast settings, the files on both nodes are identical):

    logfacility daemon
    keepalive 1
    deadtime 10
    initdead 60
    ucast eth1 172.16.15.1
    ucast eth0 192.168.0.30
    auto_failback off
    node nas01
    node nas02

haresources:

    nas01 192.168.0.34 drbddisk::storage iscsi-target MailTo::technik at megabit.net

And here is a quick "drawing" of the whole layout:

                     iSCSI initiators
                            |
          /--------------Switch------------\
          |                                |
     192.168.0.30                    192.168.0.31
        nas01    ClusterIP 192.168.0.34    nas02
     172.16.15.1                     172.16.15.2
          |                                |
          \-----------CrossOver Link--------/

With kind regards
Rudolph Bott

--
Megabit Informationstechnik GmbH
Karstr.25
41068 Moenchengladbach
Tel:02161/30898-0 Fax:-18
AG MG HRB 10141, GF: Dipl.-Ing. Thomas Tillig, Michael Benten
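
For anyone reading the "self" and "peer" lines in the dmesg output, here is a minimal Python sketch that compares the two identifier strings field by field. The strings are copied verbatim from the log. The field names (current, bitmap, two history entries) are an assumption about how DRBD 8 orders these values and are not stated in the log itself; the helper names are made up for illustration.

```python
# Identifier strings copied verbatim from the "self" and "peer" dmesg lines.
SELF = "C765DF24A2676E31:DDFCF96A854616A7:B34DB3CFA71F57C2:15641BAC6660D448"
PEER = "E785C3CFDDDC5C05:DDFCF96A854616A7:B34DB3CFA71F57C2:15641BAC6660D448"

# Assumed field order (not confirmed by the log): current, bitmap, history x2.
FIELDS = ("current", "bitmap", "history1", "history2")


def parse(line):
    """Split one colon-separated identifier string into named fields."""
    return dict(zip(FIELDS, line.split(":")))


def differing_fields(a, b):
    """Return the names of the fields whose values differ between a and b."""
    pa, pb = parse(a), parse(b)
    return [name for name in FIELDS if pa[name] != pb[name]]


if __name__ == "__main__":
    print(differing_fields(SELF, PEER))
```

Only the first field differs between the two nodes, while the remaining three are identical. Under the assumed layout, that would mean each node ended up with its own "current" identifier on top of a shared history, which seems consistent with the split-brain message, but this is an interpretation rather than something the log proves.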