[DRBD-user] bad performance after sync

Corey Edwards tensai at zmonkey.org
Thu May 12 00:07:22 CEST 2005


I'm having a problem with a DRBD 0.7.10 volume. After the initial sync
completes, the primary seems to block and the load average goes through
the roof. Since this is a production mail server, I've had to turn off
the secondary and run single-legged, which works just fine: the box
performs just like it did before I added DRBD.

I have another server running similar hardware with a similar setup,
and it has no such problems. Nothing like this popped up during testing
either, although that was on different servers.

Here is the dmesg output from the primary, from the handshake through
the sync to the point where the peer shut the connection down:

    drbd0: Handshake successful: DRBD Network Protocol version 74
    drbd0: Connection established.
    drbd0: I am(P): 1:00000002:00000001:00000007:00000001:10
    drbd0: Peer(S): 0:00000002:00000001:00000006:00000001:01
    drbd0: drbd0_receiver [412]: cstate WFReportParams --> WFBitMapS
    drbd0: Primary/Unknown --> Primary/Secondary
    drbd0: drbd0_receiver [412]: cstate WFBitMapS --> SyncSource
    drbd0: Resync started as SyncSource (need to sync 4453844 KB [1113461 bits set]).
    drbd0: Resync done (total 1296 sec; paused 0 sec; 3436 K/sec)
    drbd0: drbd0_worker [22324]: cstate SyncSource --> Connected
    drbd0: sock was shut down by peer
    drbd0: meta connection shut down by peer.
    drbd0: drbd0_asender [15301]: cstate Connected --> NetworkFailure
    drbd0: asender terminated
    drbd0: drbd0_receiver [412]: cstate NetworkFailure --> BrokenPipe
    drbd0: short read expecting header on sock: r=0
    drbd0: worker terminated
    drbd0: drbd0_receiver [412]: cstate BrokenPipe --> Unconnected
    drbd0: Connection lost.
    drbd0: drbd0_receiver [412]: cstate Unconnected --> WFConnection
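
As a sanity check on the resync figures in the log above, the arithmetic
can be reproduced in a few lines of Python. The numbers are copied from
the "Resync started" and "Resync done" lines; the variable names are
only illustrative, and the truncating division is an assumption about
how the printed rate is rounded, not something the log confirms.

```python
# Figures copied from the dmesg excerpt above.
resync_kb = 4453844   # "need to sync 4453844 KB"
bits_set = 1113461    # "[1113461 bits set]"
total_sec = 1296      # "total 1296 sec"

# Each set bit in the bitmap accounts for this many KB of out-of-sync data.
kb_per_bit = resync_kb / bits_set

# Average resync rate; truncated (assumption) to compare with the log.
rate_kb_per_sec = resync_kb // total_sec

print(f"{kb_per_bit:.1f} KB per bit, {rate_kb_per_sec} K/sec")
```

Under those assumptions the result agrees with the "3436 K/sec" the
kernel reported, so the sync itself ran at a steady, if modest, rate.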

Nothing in the dmesg output seems unusual. I also sampled the counters
in /proc/drbd once a second during that time. Here is the trace:

    # while true; do cat /proc/drbd |grep bm; sleep 1; done
    ns:84416914 nr:0 dw:76543740 dr:166308862 al:2136886 bm:2136347 lo:2 pe:869 ua:1631 ap:28
    ns:84419154 nr:0 dw:76543984 dr:166313806 al:2136892 bm:2136347 lo:3 pe:319 ua:1132 ap:67
    ns:84425174 nr:0 dw:76545548 dr:166313806 al:2136902 bm:2136349 lo:2 pe:585 ua:18 ap:13
    ns:84425690 nr:0 dw:76545992 dr:166313866 al:2136939 bm:2136353 lo:8 pe:0 ua:0 ap:5
    ns:84427234 nr:0 dw:76547536 dr:166315054 al:2137039 bm:2136353 lo:1 pe:16 ua:0 ap:17
    ns:84427234 nr:0 dw:76547536 dr:166316054 al:2137039 bm:2136353 lo:1 pe:16 ua:0 ap:17
    ns:84427234 nr:0 dw:76547536 dr:166319686 al:2137039 bm:2136353 lo:32 pe:16 ua:0 ap:48

-- This is about when the sync finished

    ns:84427234 nr:0 dw:76547536 dr:166324046 al:2137039 bm:2136353 lo:3 pe:16 ua:0 ap:19
    ns:84427234 nr:0 dw:76547536 dr:166325930 al:2137039 bm:2136353 lo:0 pe:16 ua:0 ap:16
    ns:84427234 nr:0 dw:76547536 dr:166327798 al:2137039 bm:2136353 lo:0 pe:16 ua:0 ap:16
    ns:84427234 nr:0 dw:76547536 dr:166330402 al:2137039 bm:2136353 lo:1 pe:16 ua:0 ap:17
    ns:84427234 nr:0 dw:76547536 dr:166336218 al:2137039 bm:2136353 lo:1 pe:16 ua:0 ap:17
    ns:84427234 nr:0 dw:76547536 dr:166338454 al:2137039 bm:2136353 lo:2 pe:16 ua:0 ap:18
    ns:84427234 nr:0 dw:76547536 dr:166340118 al:2137039 bm:2136353 lo:0 pe:16 ua:0 ap:16
    ns:84429598 nr:0 dw:76549900 dr:166340218 al:2137069 bm:2136353 lo:0 pe:15 ua:0 ap:15
    ns:84429598 nr:0 dw:76549900 dr:166340218 al:2137069 bm:2136353 lo:0 pe:15 ua:0 ap:15
    ns:84429598 nr:0 dw:76549900 dr:166340218 al:2137069 bm:2136353 lo:0 pe:15 ua:0 ap:15
    ns:84429598 nr:0 dw:76549900 dr:166340218 al:2137069 bm:2136353 lo:0 pe:15 ua:0 ap:15
    ns:84429598 nr:0 dw:76549900 dr:166340218 al:2137069 bm:2136353 lo:0 pe:15 ua:0 ap:15
    ns:84429598 nr:0 dw:76549900 dr:166340234 al:2137069 bm:2136353 lo:0 pe:15 ua:0 ap:15
    ns:84429598 nr:0 dw:76549900 dr:166340234 al:2137069 bm:2136353 lo:0 pe:15 ua:0 ap:15
    ns:84429598 nr:0 dw:76549900 dr:166340234 al:2137069 bm:2136353 lo:0 pe:15 ua:0 ap:15
    ns:84429598 nr:0 dw:76549900 dr:166340234 al:2137069 bm:2136353 lo:0 pe:15 ua:0 ap:15
    ns:84429598 nr:0 dw:76549900 dr:166340238 al:2137069 bm:2136353 lo:0 pe:15 ua:0 ap:15
    ns:84429598 nr:0 dw:76549900 dr:166340238 al:2137069 bm:2136353 lo:0 pe:15 ua:0 ap:15
    ns:84429598 nr:0 dw:76549900 dr:166340238 al:2137069 bm:2136353 lo:0 pe:15 ua:0 ap:15
    ns:84429598 nr:0 dw:76549900 dr:166340250 al:2137069 bm:2136353 lo:0 pe:15 ua:0 ap:15
    ns:84429598 nr:0 dw:76549900 dr:166340250 al:2137069 bm:2136353 lo:0 pe:15 ua:0 ap:15
    ns:84429606 nr:0 dw:76549908 dr:166340274 al:2137069 bm:2136353 lo:0 pe:1 ua:0 ap:1
    ns:84429606 nr:0 dw:76549908 dr:166340282 al:2137069 bm:2136353 lo:0 pe:1 ua:0 ap:1
    ns:84429606 nr:0 dw:76549908 dr:166340282 al:2137069 bm:2136353 lo:0 pe:1 ua:0 ap:1
    ns:84429606 nr:0 dw:76549908 dr:166340282 al:2137069 bm:2136353 lo:0 pe:1 ua:0 ap:1
    ns:84429606 nr:0 dw:76549908 dr:166340282 al:2137069 bm:2136353 lo:0 pe:1 ua:0 ap:1
    ns:84429606 nr:0 dw:76549908 dr:166340282 al:2137069 bm:2136353 lo:0 pe:1 ua:0 ap:1
    ns:84429606 nr:0 dw:76549908 dr:166340282 al:2137069 bm:2136353 lo:0 pe:1 ua:0 ap:1
    ns:84429606 nr:0 dw:76549908 dr:166340282 al:2137069 bm:2136353 lo:0 pe:1 ua:0 ap:1
    ns:84429606 nr:0 dw:76549908 dr:166340322 al:2137069 bm:2136353 lo:0 pe:1 ua:0 ap:1
    ns:84429606 nr:0 dw:76549908 dr:166340322 al:2137069 bm:2136353 lo:0 pe:1 ua:0 ap:1
    ns:84429606 nr:0 dw:76549908 dr:166340334 al:2137069 bm:2136353 lo:0 pe:1 ua:0 ap:1
    ns:84429606 nr:0 dw:76549908 dr:166340334 al:2137069 bm:2136353 lo:0 pe:1 ua:0 ap:1
    ns:84429606 nr:0 dw:76549908 dr:166340338 al:2137069 bm:2136353 lo:0 pe:1 ua:0 ap:1
    ns:84429606 nr:0 dw:76549908 dr:166340338 al:2137069 bm:2136353 lo:0 pe:1 ua:0 ap:1
    ns:84429606 nr:0 dw:76549908 dr:166340346 al:2137069 bm:2136353 lo:0 pe:1 ua:0 ap:1
    ns:84429606 nr:0 dw:76549908 dr:166340346 al:2137069 bm:2136353 lo:0 pe:1 ua:0 ap:1
    ns:84429606 nr:0 dw:76549908 dr:166340346 al:2137069 bm:2136353 lo:0 pe:1 ua:0 ap:1
    ns:84429606 nr:0 dw:76551308 dr:166341006 al:2137073 bm:2136357 lo:0 pe:0 ua:0 ap:0

-- This is about when I stopped the secondary

    ns:84429606 nr:0 dw:76555764 dr:166343742 al:2137093 bm:2136377 lo:47 pe:0 ua:0 ap:46
    ns:84429606 nr:0 dw:76557484 dr:166346994 al:2137107 bm:2136391 lo:6 pe:0 ua:0 ap:6
    ns:84429606 nr:0 dw:76559868 dr:166347738 al:2137136 bm:2136420 lo:1 pe:0 ua:0 ap:1
    ns:84429606 nr:0 dw:76562492 dr:166348186 al:2137158 bm:2136442 lo:5 pe:0 ua:0 ap:5
    ns:84429606 nr:0 dw:76563768 dr:166348634 al:2137172 bm:2136456 lo:3 pe:0 ua:0 ap:0
    ns:84429606 nr:0 dw:76566732 dr:166349098 al:2137363 bm:2136648 lo:6 pe:0 ua:0 ap:3
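
To make the stall in a trace like the one above easier to spot, the
counter lines could be parsed and compared sample to sample. This is a
rough sketch, not a DRBD tool: the function names are made up, and the
"stalled" test (pe above zero while ns and dw stay unchanged) is my own
heuristic, based on ns being the network-send counter, dw the disk-write
counter, and pe the count of requests still pending on the peer.

```python
import re

# Matches one "name:value" counter such as "ns:84427234".
FIELD = re.compile(r"(\w+):(\d+)")

def parse_counters(line):
    """Turn 'ns:1 nr:0 ...' into {'ns': 1, 'nr': 0, ...}."""
    return {name: int(value) for name, value in FIELD.findall(line)}

def stalled_samples(lines):
    """Count samples where requests are still pending on the peer (pe > 0)
    but ns and dw have not moved since the previous sample."""
    count = 0
    prev = None
    for line in lines:
        cur = parse_counters(line)
        if (prev is not None and cur["pe"] > 0
                and cur["ns"] == prev["ns"] and cur["dw"] == prev["dw"]):
            count += 1
        prev = cur
    return count

# Three consecutive samples copied from the trace above.
sample = [
    "ns:84427234 nr:0 dw:76547536 dr:166315054 al:2137039 bm:2136353 lo:1 pe:16 ua:0 ap:17",
    "ns:84427234 nr:0 dw:76547536 dr:166316054 al:2137039 bm:2136353 lo:1 pe:16 ua:0 ap:17",
    "ns:84427234 nr:0 dw:76547536 dr:166319686 al:2137039 bm:2136353 lo:32 pe:16 ua:0 ap:48",
]

print(stalled_samples(sample))
```

Feeding it the whole trace, one line per list element, would show how
many one-second samples passed with writes outstanding and no progress.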

It almost seems like the secondary stops writing changes to disk once
it is consistent. Everything works great while a sync is running.

Corey
