[DRBD-user] Concurrent local write detected!

Digimer lists at alteeve.ca
Thu Mar 2 09:07:52 CET 2017



Hi all,

  We had an event last night on a system running DRBD 8.3.16 that has
been in production for a couple of years. At almost exactly midnight,
both nodes logged these errors:

=====
Feb 28 03:42:01 aae-a01n01 rsyslogd: [origin software="rsyslogd"
swVersion="5.8.10" x-pid="1729" x-info="http://www.rsyslog.com"]
rsyslogd was HUPed
Mar  2 00:00:07 aae-a01n01 kernel: block drbd0: drbd0_receiver[4763]
Concurrent local write detected!   new: 622797696s +4096; pending:
622797696s +4096
Mar  2 00:00:07 aae-a01n01 kernel: block drbd0: Concurrent write! [W
AFTERWARDS] sec=622797696s
Mar  2 00:00:07 aae-a01n01 kernel: block drbd0: Got DiscardAck packet
622797696s +4096! DRBD is not a random data generator!
Mar  2 00:00:17 aae-a01n01 kernel: block drbd0: qemu-kvm[20305]
Concurrent remote write detected! [DISCARD L] new: 673151680s +32768;
pending: 673151712s +16384
Mar  2 00:00:17 aae-a01n01 kernel: block drbd0: qemu-kvm[20305]
Concurrent remote write detected! [DISCARD L] new: 673151712s +16384;
pending: 673151712s +16384
Mar  2 00:00:17 aae-a01n01 kernel: block drbd0: qemu-kvm[20305]
Concurrent remote write detected! [DISCARD L] new: 673151712s +16384;
pending: 673151712s +16384
Mar  2 00:00:17 aae-a01n01 kernel: block drbd0: qemu-kvm[20305]
Concurrent remote write detected! [DISCARD L] new: 673151712s +16384;
pending: 673151712s +16384
Mar  2 00:00:17 aae-a01n01 kernel: block drbd0: qemu-kvm[20305]
Concurrent remote write detected! [DISCARD L] new: 673151712s +16384;
pending: 673151712s +16384
=====

=====
Feb 28 03:23:01 aae-a01n02 rsyslogd: [origin software="rsyslogd"
swVersion="5.8.10" x-pid="1729" x-info="http://www.rsyslog.com"]
rsyslogd was HUPed
Mar  2 00:00:07 aae-a01n02 kernel: block drbd0: drbd0_receiver[4758]
Concurrent local write detected!   new: 622797696s +4096; pending:
622797696s +4096
Mar  2 00:00:07 aae-a01n02 kernel: block drbd0: Concurrent write!
[DISCARD BY FLAG] sec=622797696s
Mar  2 00:00:11 aae-a01n02 kernel: block drbd0: qemu-kvm[15639]
Concurrent remote write detected! [DISCARD L] new: 622797696s +4096;
pending: 622797696s +4096
Mar  2 00:00:18 aae-a01n02 kernel: block drbd0: qemu-kvm[15639]
Concurrent remote write detected! [DISCARD L] new: 673151712s +16384;
pending: 673151712s +16384
Mar  2 00:00:18 aae-a01n02 kernel: block drbd0: qemu-kvm[15639]
Concurrent remote write detected! [DISCARD L] new: 673151712s +16384;
pending: 673151712s +16384
Mar  2 00:00:18 aae-a01n02 kernel: block drbd0: drbd0_receiver[4758]
Concurrent local write detected!   new: 673151712s +16384; pending:
673151712s +16384
Mar  2 00:00:18 aae-a01n02 kernel: block drbd0: Concurrent write!
[DISCARD BY FLAG] sec=673151712s
Mar  2 00:00:18 aae-a01n02 kernel: block drbd0: qemu-kvm[15639]
Concurrent remote write detected! [DISCARD L] new: 673151712s +16384;
pending: 673151712s +16384
Mar  2 00:00:18 aae-a01n02 kernel: block drbd0: drbd0_receiver[4758]
Concurrent local write detected!   new: 673151744s +16384; pending:
673151744s +16384
Mar  2 00:00:18 aae-a01n02 kernel: block drbd0: Concurrent write! [W
AFTERWARDS] sec=673151744s
Mar  2 00:00:18 aae-a01n02 kernel: block drbd0: qemu-kvm[15639]
Concurrent remote write detected! [DISCARD L] new: 673151744s +16384;
pending: 673151744s +16384
Mar  2 00:00:18 aae-a01n02 kernel: block drbd0: qemu-kvm[15639]
Concurrent remote write detected! [DISCARD L] new: 673151744s +16384;
pending: 673151744s +16384
Mar  2 00:00:18 aae-a01n02 kernel: block drbd0: qemu-kvm[15639]
Concurrent remote write detected! [DISCARD L] new: 673151744s +16384;
pending: 673151744s +16384
Mar  2 00:00:18 aae-a01n02 kernel: block drbd0: qemu-kvm[15639]
Concurrent remote write detected! [DISCARD L] new: 673151744s +16384;
pending: 673151744s +16384
Mar  2 00:00:18 aae-a01n02 kernel: block drbd0: qemu-kvm[15639]
Concurrent remote write detected! [DISCARD L] new: 673151744s +16384;
pending: 673151744s +16384
Mar  2 00:00:18 aae-a01n02 kernel: block drbd0: qemu-kvm[15639]
Concurrent remote write detected! [DISCARD L] new: 673151744s +16384;
pending: 673151744s +16384
=====
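
As a rough sanity check on the ranges in the log lines above, here is a minimal Python sketch that computes how many bytes of a "new" request overlap the "pending" one. It assumes that the "s" suffix means 512-byte sectors and that the "+N" value is a length in bytes; both are assumptions about the message format, not something confirmed here. The names `PAIR`, `byte_range` and `overlap_bytes` are made up for illustration, and wrapped log lines need to be joined before parsing.

```python
import re

SECTOR_BYTES = 512  # assumption: the "s" suffix denotes 512-byte sectors

# Matches "new: <sector>s +<bytes>; pending: <sector>s +<bytes>"
PAIR = re.compile(r"new: (\d+)s \+(\d+); pending: (\d+)s \+(\d+)")


def byte_range(sector, size):
    """Return the half-open byte range [start, end) of a request."""
    start = sector * SECTOR_BYTES
    return start, start + size


def overlap_bytes(line):
    """Return the overlap in bytes between the new and pending request,
    or None if the line does not contain such a pair."""
    m = PAIR.search(line)
    if m is None:
        return None
    new_sector, new_size, pend_sector, pend_size = map(int, m.groups())
    n0, n1 = byte_range(new_sector, new_size)
    p0, p1 = byte_range(pend_sector, pend_size)
    return max(0, min(n1, p1) - max(n0, p0))


# One of the node 1 entries, with the wrapped line joined back together.
line = ("Concurrent remote write detected! [DISCARD L] "
        "new: 673151680s +32768; pending: 673151712s +16384")
print(overlap_bytes(line))  # prints 16384
```

Under those assumptions, the 32 KiB request starting at sector 673151680 covers the whole 16 KiB pending request at sector 673151712, while the other entries are requests for exactly the same range.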

=====
[root@aae-a01n02 ~]# drbdadm dump-xml
<config file="/etc/drbd.conf">
    <common protocol="C">
        <section name="net">
            <option name="allow-two-primaries"/>
            <option name="after-sb-0pri" value="discard-zero-changes"/>
            <option name="after-sb-1pri" value="discard-secondary"/>
            <option name="after-sb-2pri" value="disconnect"/>
        </section>
        <section name="disk">
            <option name="fencing" value="resource-and-stonith"/>
        </section>
        <section name="syncer">
            <option name="rate" value="30M"/>
        </section>
        <section name="startup">
            <option name="wfc-timeout" value="300"/>
            <option name="degr-wfc-timeout" value="120"/>
            <option name="outdated-wfc-timeout" value="120"/>
            <option name="become-primary-on" value="both"/>
        </section>
        <section name="handlers">
            <option name="pri-on-incon-degr"
value="/usr/lib/drbd/notify-pri-on-incon-degr.sh;
/usr/lib/drbd/notify-emergency-reboot.sh; echo b &gt;
/proc/sysrq-trigger ; reboot -f"/>
            <option name="pri-lost-after-sb"
value="/usr/lib/drbd/notify-pri-lost-after-sb.sh;
/usr/lib/drbd/notify-emergency-reboot.sh; echo b &gt;
/proc/sysrq-trigger ; reboot -f"/>
            <option name="local-io-error"
value="/usr/lib/drbd/notify-io-error.sh;
/usr/lib/drbd/notify-emergency-shutdown.sh; echo o &gt;
/proc/sysrq-trigger ; halt -f"/>
            <option name="fence-peer" value="/usr/lib/drbd/rhcs_fence"/>
        </section>
    </common>
    <resource name="r0">
        <host name="aae-a01n01.hwholdings.com">
            <device minor="0">/dev/drbd0</device>
            <disk>/dev/sda5</disk>
            <address family="ipv4" port="7788">10.10.10.1</address>
            <meta-disk>internal</meta-disk>
        </host>
        <host name="aae-a01n02.hwholdings.com">
            <device minor="0">/dev/drbd0</device>
            <disk>/dev/sda5</disk>
            <address family="ipv4" port="7788">10.10.10.2</address>
            <meta-disk>internal</meta-disk>
        </host>
    </resource>
    <resource name="r1">
        <host name="aae-a01n01.hwholdings.com">
            <device minor="1">/dev/drbd1</device>
            <disk>/dev/sda6</disk>
            <address family="ipv4" port="7789">10.10.10.1</address>
            <meta-disk>internal</meta-disk>
        </host>
        <host name="aae-a01n02.hwholdings.com">
            <device minor="1">/dev/drbd1</device>
            <disk>/dev/sda6</disk>
            <address family="ipv4" port="7789">10.10.10.2</address>
            <meta-disk>internal</meta-disk>
        </host>
    </resource>
</config>
=====
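
For anyone who wants to compare the settings on both nodes, the dump above can be read with a few lines of Python. This is only a sketch using the standard library; the helper name `section_options` and the trimmed `SAMPLE` string are made up for illustration, and options without a value (such as allow-two-primaries) are reported as True.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the "net" section from the dump above.
SAMPLE = """<config file="/etc/drbd.conf">
    <common protocol="C">
        <section name="net">
            <option name="allow-two-primaries"/>
            <option name="after-sb-0pri" value="discard-zero-changes"/>
            <option name="after-sb-1pri" value="discard-secondary"/>
            <option name="after-sb-2pri" value="disconnect"/>
        </section>
    </common>
</config>"""


def section_options(xml_text, section):
    """Return {option name: value} for one section under <common>.
    Options that carry no value attribute are mapped to True."""
    root = ET.fromstring(xml_text)
    path = f"./common/section[@name='{section}']/option"
    return {opt.get("name"): opt.get("value", True)
            for opt in root.findall(path)}


net = section_options(SAMPLE, "net")
for name, value in net.items():
    print(name, value)
```

Running the same helper against the output of `drbdadm dump-xml` from each node and comparing the resulting dictionaries is one way to confirm the two configurations match.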

=====
[root@aae-a01n02 ~]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by
root@rhel6-builder-production.alteeve.ca, 2015-04-05 19:59:27
 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:408 nr:2068182 dw:2068586 dr:48408 al:8 bm:115 lo:0 pe:0 ua:0
ap:0 ep:1 wo:f oos:0
 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:750365 nr:770052 dw:1520413 dr:1062911 al:15463 bm:145 lo:0 pe:0
ua:0 ap:0 ep:1 wo:f oos:0
=====
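
The status lines above show both resources connected, with both nodes Primary and UpToDate. A small Python sketch like the following could pull that out of /proc/drbd when collecting state from several nodes. The regular expression is based only on the lines shown here, and the names `STATUS` and `parse_proc_drbd` are hypothetical.

```python
import re

# Matches lines such as:
#  0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
STATUS = re.compile(r"^\s*(\d+): cs:(\S+) ro:(\S+) ds:(\S+)")


def parse_proc_drbd(text):
    """Return {minor: {"cs": ..., "roles": (local, peer),
    "disks": (local, peer)}} for each device status line."""
    devices = {}
    for line in text.splitlines():
        m = STATUS.match(line)
        if not m:
            continue  # version, GIT-hash and counter lines are skipped
        minor, cs, ro, ds = m.groups()
        devices[int(minor)] = {
            "cs": cs,
            "roles": tuple(ro.split("/")),
            "disks": tuple(ds.split("/")),
        }
    return devices


SAMPLE = """version: 8.3.16 (api:88/proto:86-97)
 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:408 nr:2068182 dw:2068586 dr:48408 al:8 bm:115 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:750365 nr:770052 dw:1520413 dr:1062911 al:15463 bm:145 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
"""

devices = parse_proc_drbd(SAMPLE)
# Minors where both sides are Primary, i.e. running dual-primary.
dual_primary = [minor for minor, dev in sorted(devices.items())
                if dev["roles"] == ("Primary", "Primary")]
print(dual_primary)  # prints [0, 1]
```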

At this point, storage hung (I assume on purpose). Recovery required a
full restart of the cluster.

Googling doesn't return much on this. Can someone provide insight into
what might have happened? This was a pretty scary event, and it's the
first time I've seen it happen in all the years I've been using DRBD.

Let me know if there are any other logs or info that would help.

Thanks!

digimer

-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould


