[DRBD-user] PingAck did not arrive in time

Christian Gebler geblerchristian at googlemail.com
Mon Sep 23 15:46:25 CEST 2013


Hi all,

The following error occurs several times an hour. I'm using DRBD-Utils
8.3.11 with a DRBD 8.3.13 kernel module (yes, these are the versions in the
current Ubuntu repository) on Ubuntu Server 12.04 LTS
(3.5.0-40-generic). The two-node cluster is connected directly, NIC to NIC.
I have never had any network problems with DRBD before, so I hope somebody
can help me.

I even tried a normal network connection instead of the direct NIC-to-NIC
link, but the problem stayed the same.


Here are my configs and the "PingAck did not arrive in time." error from
syslog.


Syslog:

Sep 23 14:29:30 server2 kernel: [433433.205076] block drbd0: PingAck did not arrive in time.
Sep 23 14:29:30 server2 kernel: [433433.224170] block drbd0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Sep 23 14:29:30 server2 kernel: [433433.224271] block drbd0: new current UUID 18159322F4E2DA03:64D5FAA4F99072B3:86AA2CBE74BC037D:86A92CBE74BC037D
Sep 23 14:29:30 server2 kernel: [433433.224382] block drbd0: asender terminated
Sep 23 14:29:30 server2 kernel: [433433.224390] block drbd0: Terminating drbd0_asender
Sep 23 14:29:30 server2 kernel: [433433.224500] block drbd0: Connection closed
Sep 23 14:29:30 server2 kernel: [433433.224512] block drbd0: conn( NetworkFailure -> Unconnected )
Sep 23 14:29:30 server2 kernel: [433433.224520] block drbd0: receiver terminated
Sep 23 14:29:30 server2 kernel: [433433.224524] block drbd0: Restarting drbd0_receiver
Sep 23 14:29:30 server2 kernel: [433433.224528] block drbd0: receiver (re)started
Sep 23 14:29:30 server2 kernel: [433433.224534] block drbd0: conn( Unconnected -> WFConnection )
Sep 23 14:29:31 server2 kernel: [433434.407261] block drbd0: Handshake successful: Agreed network protocol version 96
Sep 23 14:29:31 server2 kernel: [433434.407592] block drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
Sep 23 14:29:31 server2 kernel: [433434.407628] block drbd0: conn( WFConnection -> WFReportParams )
Sep 23 14:29:31 server2 kernel: [433434.407637] block drbd0: Starting asender thread (from drbd0_receiver [16816])
Sep 23 14:29:31 server2 kernel: [433434.407837] block drbd0: data-integrity-alg: <not-used>
Sep 23 14:29:31 server2 kernel: [433434.407868] block drbd0: drbd_sync_handshake:
Sep 23 14:29:31 server2 kernel: [433434.407875] block drbd0: self 18159322F4E2DA03:64D5FAA4F99072B3:86AA2CBE74BC037D:86A92CBE74BC037D bits:16 flags:0
Sep 23 14:29:31 server2 kernel: [433434.407880] block drbd0: peer 64D5FAA4F99072B2:0000000000000000:86AA2CBE74BC037C:86A92CBE74BC037D bits:0 flags:0
Sep 23 14:29:31 server2 kernel: [433434.407884] block drbd0: uuid_compare()=1 by rule 70
Sep 23 14:29:31 server2 kernel: [433434.407894] block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
Sep 23 14:29:31 server2 kernel: [433434.973757] block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0
Sep 23 14:29:31 server2 kernel: [433434.976878] block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0 exit code 0 (0x0)
Sep 23 14:29:31 server2 kernel: [433434.976891] block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent )
Sep 23 14:29:31 server2 kernel: [433434.976902] block drbd0: Began resync as SyncSource (will sync 64 KB [16 bits set]).
Sep 23 14:29:31 server2 kernel: [433434.976917] block drbd0: updated sync UUID 18159322F4E2DA03:64D6FAA4F99072B3:64D5FAA4F99072B3:86AA2CBE74BC037D
Sep 23 14:29:31 server2 kernel: [433434.991168] block drbd0: Resync done (total 1 sec; paused 0 sec; 64 K/sec)
Sep 23 14:29:31 server2 kernel: [433434.991178] block drbd0: updated UUIDs 18159322F4E2DA03:0000000000000000:64D6FAA4F99072B3:64D5FAA4F99072B3
Sep 23 14:29:31 server2 kernel: [433434.991187] block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
Sep 23 14:29:31 server2 kernel: [433435.072506] block drbd0: bitmap WRITE of 8049 pages took 20 jiffies
Sep 23 14:29:31 server2 kernel: [433435.072540] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.






DRBD - global_common.conf:

global {
        usage-count no;
        # minor-count dialog-refresh disable-ip-verification
}

common {
        protocol C;

        handlers {
                pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
                # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
                # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
                # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
                # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
                # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
        }

        startup {
                # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
        }

        disk {
                # on-io-error fencing use-bmbv no-disk-barrier no-disk-flushes
                # no-disk-drain no-md-flushes max-bio-bvecs
        }

        net {
                # sndbuf-size rcvbuf-size timeout connect-int ping-int ping-timeout max-buffers
                # max-epoch-size ko-count allow-two-primaries cram-hmac-alg shared-secret
                # after-sb-0pri after-sb-1pri after-sb-2pri data-integrity-alg no-tcp-cork
        }

        syncer {
                # rate after al-extents use-rle cpu-mask verify-alg csums-alg
                rate 110M;
        }
}




Resource Config:

resource r0_opt {
      protocol C;
      startup {
              wfc-timeout  15;
              degr-wfc-timeout 60;
      }
      net {
              cram-hmac-alg sha1;
              shared-secret "xxxxxxxxxxx";
      }
      on server1 {
              device /dev/drbd0;
              disk /dev/vg01/opt-lv;
              address 172.22.122.1:7789;
              meta-disk internal;
      }
      on server2 {
              device /dev/drbd0;
              disk /dev/vg01/opt-lv;
              address 172.22.122.2:7789;
              meta-disk internal;
      }
}
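
For reference, neither config above sets any of the keep-alive options in
the net section, so as far as I understand the built-in defaults apply. A
sketch of what that would look like written out explicitly, assuming the
default values given in the drbd.conf man page for 8.3 (I have not verified
them against the running module):

```
net {
        ping-int      10;   # seconds between keep-alive pings on an idle link (assumed default)
        ping-timeout   5;   # tenths of a second to wait for the PingAck, i.e. 500 ms (assumed default)
        timeout       60;   # tenths of a second, i.e. 6 s, general reply timeout (assumed default)
        connect-int   10;   # seconds between reconnect attempts (assumed default)
}
```

Raising ping-timeout would be the obvious experiment, though it might only
hide the real cause.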





Thx,

-Chris