[DRBD-user] meta connection shut down by peer

Horvath Szabolcs hsz at leier.hu
Thu Aug 12 12:19:58 CEST 2004


Hello!

Here is a new problem: _sometimes_ 'drbdadm adjust' fails.
When it does, the master node shows 'primary/unknown' and the slave node
shows 'secondary/unknown', as if their network connection were broken.
(The network itself works fine: drbd1 and drbd2 are working properly.)

Messages at boot time:

Aug 12 11:20:06 localhost kernel: drbd: initialised. Version: 0.7.1
(api:75/proto:74)
Aug 12 11:20:06 localhost kernel: drbd: SVN Revision: 1481M build by
root@castor, 2004-08-03 18:14:36
Aug 12 11:20:06 localhost kernel: drbd: registered as block device major 147

Aug 12 11:20:07 localhost kernel: drbd0: resync bitmap: bits=25602947
words=800094
Aug 12 11:20:07 localhost kernel: drbd0: size = 102411788 KB
Aug 12 11:20:07 localhost kernel: drbd0: 0 KB marked out-of-sync by on
disk bit-map.
Aug 12 11:20:07 localhost kernel: drbd0: Found 6 transactions (324 active
extents) in activity log.
Aug 12 11:20:07 localhost kernel: drbd0: Marked additional 131584 KB as
out-of-sync based on AL.
Aug 12 11:20:08 localhost kernel: drbd0: drbdsetup [1877]: cstate
Unconfigured --> StandAlone
Aug 12 11:20:08 localhost kernel: drbd0: drbdsetup [1879]: cstate
StandAlone --> Unconnected
Aug 12 11:20:08 localhost kernel: drbd0: drbd0_receiver [1880]: cstate
Unconnected --> WFConnection
Aug 12 11:20:08 localhost kernel: drbd1: resync bitmap: bits=25602947
words=800094
Aug 12 11:20:08 localhost kernel: drbd1: size = 102411788 KB
Aug 12 11:20:08 localhost kernel: drbd0: drbd0_receiver [1880]: cstate
WFConnection --> WFReportParams
Aug 12 11:20:08 localhost kernel: drbd0: Handshake successful: DRBD
Network Protocol version 74
Aug 12 11:20:08 localhost kernel: drbd0: Connection established.
Aug 12 11:20:08 localhost kernel: drbd0: I am(S):
1:00000005:0000000b:00000075:0000000b:10
Aug 12 11:20:08 localhost kernel: drbd0: Peer(P):
1:00000005:0000000b:00000074:0000000c:10
Aug 12 11:20:08 localhost kernel: drbd0: drbd0_receiver [1880]: cstate
WFReportParams --> WFBitMapS
Aug 12 11:20:08 localhost kernel: drbd0: sock_sendmsg returned -104
Aug 12 11:20:08 localhost kernel: drbd0: drbd0_receiver [1880]: cstate
WFBitMapS --> BrokenPipe
Aug 12 11:20:08 localhost kernel: drbd0: short sent ReportBitMap size=4096
sent=4064
Aug 12 11:20:08 localhost kernel: drbd0: Secondary/Unknown -->
Secondary/Primary
Aug 12 11:20:08 localhost kernel: drbd0: meta connection shut down by peer.
Aug 12 11:20:08 localhost kernel: drbd0: asender terminated
Aug 12 11:20:08 localhost kernel: drbd0: sock was shut down by peer
Aug 12 11:20:08 localhost kernel: drbd0: drbd0_receiver [1880]: cstate
BrokenPipe --> BrokenPipe
Aug 12 11:20:08 localhost kernel: drbd0: short read expecting header on
sock: r=0
Aug 12 11:20:08 localhost kernel: drbd0: worker terminated
Aug 12 11:20:08 localhost kernel: drbd0: drbd0_receiver [1880]: cstate
BrokenPipe --> Unconnected
Aug 12 11:20:08 localhost kernel: drbd0: Connection lost.
Aug 12 11:20:08 localhost kernel: drbd0: drbd0_receiver [1880]: cstate
Unconnected --> WFConnection
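
For reference, the connection-state walk in the log above can be pulled out with a short script. This is only a sketch and not part of DRBD: the function names are made up for illustration, and the regular expressions assume the syslog layout shown here, including the mail client's line wraps. It also maps the value from "sock_sendmsg returned -104" to its symbolic errno name; on Linux, errno 104 is ECONNRESET (connection reset by peer).

```python
import errno
import re

# Excerpt of the boot-time log above, with the mail line wraps kept as-is.
LOG = """\
Aug 12 11:20:08 localhost kernel: drbd0: drbd0_receiver [1880]: cstate
WFReportParams --> WFBitMapS
Aug 12 11:20:08 localhost kernel: drbd0: sock_sendmsg returned -104
Aug 12 11:20:08 localhost kernel: drbd0: drbd0_receiver [1880]: cstate
WFBitMapS --> BrokenPipe
"""

# Assumed layout: every real syslog line starts with "Mon DD HH:MM:SS ".
SYSLOG_START = re.compile(r"^[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d ")
CSTATE = re.compile(r"(drbd\d+): .*cstate (\w+) --> (\w+)")
SENDMSG = re.compile(r"sock_sendmsg returned -(\d+)")


def unwrap(text):
    """Join wrapped continuation lines back onto their syslog line."""
    lines = []
    for raw in text.splitlines():
        if SYSLOG_START.match(raw) or not lines:
            lines.append(raw.strip())
        else:
            lines[-1] += " " + raw.strip()
    return lines


def cstate_transitions(text):
    """Return (device, old_state, new_state) for each cstate change."""
    result = []
    for line in unwrap(text):
        match = CSTATE.search(line)
        if match:
            result.append(match.groups())
    return result


def sendmsg_errors(text):
    """Return the symbolic errno name for each failed sock_sendmsg."""
    names = []
    for line in unwrap(text):
        match = SENDMSG.search(line)
        if match:
            names.append(errno.errorcode.get(int(match.group(1)), "unknown"))
    return names


for device, old, new in cstate_transitions(LOG):
    print(device, old, "->", new)
print(sendmsg_errors(LOG))
```

Fed the whole boot-time log instead of the excerpt, the same functions would list every state change from Unconfigured through to the final WFConnection.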


When I force a resync with 'drbdadm adjust r0':

Aug 12 11:22:15 localhost kernel: drbd0: drbd0_receiver [1880]: cstate
WFConnection --> WFReportParams
Aug 12 11:22:15 localhost kernel: drbd0: Handshake successful: DRBD
Network Protocol version 74
Aug 12 11:22:15 localhost kernel: drbd0: Connection established.
Aug 12 11:22:15 localhost kernel: drbd0: I am(P):
1:00000005:0000000c:00000075:0000000b:10
Aug 12 11:22:15 localhost kernel: drbd0: Peer(S):
1:00000005:0000000b:00000075:0000000c:00
Aug 12 11:22:15 localhost kernel: drbd0: drbd0_receiver [1880]: cstate
WFReportParams --> WFBitMapS
Aug 12 11:22:15 localhost kernel: drbd0: Primary/Unknown -->
Primary/Secondary
Aug 12 11:22:15 localhost kernel: drbd0: drbd0_receiver [1880]: cstate
WFBitMapS --> SyncSource
Aug 12 11:22:15 localhost kernel: drbd0: Resync started as SyncSource
(need to sync 1052948 KB [263237 bits set])
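
As a sanity check on the figures in these logs: the numbers are consistent with each bit of the resync bitmap covering 4 KB. That granularity is an inference from the logged values, not something the log states, so the constant below is an assumption.

```python
# Assumption inferred from the log figures: one bitmap bit covers 4 KB.
KB_PER_BIT = 4


def bits_to_kb(bits):
    """Convert a count of bitmap bits to the amount of data in KB."""
    return bits * KB_PER_BIT


# "need to sync 1052948 KB [263237 bits set]"
print(bits_to_kb(263237))

# "resync bitmap: bits=25602947" against "size = 102411788 KB"
print(bits_to_kb(25602947))
```

Both products match the KB values printed by the kernel, which is what suggests the 4 KB granularity.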


/etc/drbd.conf:

resource r0 {
  protocol B;
  incon-degr-cmd "halt -f";
  startup {
    degr-wfc-timeout 60;
    wfc-timeout 30;
  }
  disk { on-io-error detach; }
  net { }
  syncer {
    rate 100M;
    group 1;
    al-extents 257;
  }
  on castor {
    device     /dev/drbd0;
    disk       /dev/hda5;
    address    192.168.5.1:7788;
    meta-disk  internal;
  }
  on pollux {
    device    /dev/drbd0;
    disk      /dev/hda5;
    address   192.168.5.2:7788;
    meta-disk internal;
  }
}
[...]

Network configuration: (nothing unusual)

iface eth0 inet static
        address 192.168.5.X
        netmask 255.255.255.0
        mtu 5000



What else can I do besides forcing 'drbdadm adjust r0'?


Szabolcs Horvath



