<!DOCTYPE html><html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head><body><div data-html-editor-font-wrapper="true" style="font-family: arial, sans-serif; font-size: 13px;"> <p>Hi,<br><br>I'm running several Debian servers with Ganeti and DRBD.<br><br>Since I upgraded them to Debian 9 and DRBD 8.9.10-2 (from the Debian repo), I've had a lot of issues with my DRBD resources: dmesg shows resources disconnecting at random.<br><br>Today, for example:<br><br>[Tue Aug 28 14:32:38 2018] drbd resource10: peer( Primary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown ) <br>[Tue Aug 28 14:32:38 2018] drbd resource10: ack_receiver terminated<br>[Tue Aug 28 14:32:38 2018] drbd resource10: Terminating drbd_a_resource<br>[Tue Aug 28 14:32:38 2018] drbd resource10: Connection closed<br>[Tue Aug 28 14:32:38 2018] drbd resource10: conn( Disconnecting -> StandAlone ) <br>[Tue Aug 28 14:32:38 2018] drbd resource10: receiver terminated<br>[Tue Aug 28 14:32:38 2018] drbd resource10: Terminating drbd_r_resource<br>[Tue Aug 28 14:32:38 2018] block drbd10: disk( UpToDate -> Failed ) <br>[Tue Aug 28 14:32:38 2018] block drbd10: 0 KB (0 bits) marked out-of-sync by on disk bit-map.<br>[Tue Aug 28 14:32:38 2018] block drbd10: disk( Failed -> Diskless ) <br>[Tue Aug 28 14:32:38 2018] drbd resource10: Terminating drbd_w_resource<br>[Tue Aug 28 14:32:40 2018] drbd resource10: Starting worker thread (from drbdsetup-84 [10222])<br>[Tue Aug 28 14:32:40 2018] block drbd10: disk( Diskless -> Attaching ) <br>[Tue Aug 28 14:32:40 2018] drbd resource10: Method to ensure write ordering: flush<br>[Tue Aug 28 14:32:40 2018] block drbd10: max BIO size = 262144<br>[Tue Aug 28 14:32:40 2018] block drbd10: Adjusting my ra_pages to backing device's (32 -> 256)<br>[Tue Aug 28 14:32:40 2018] block drbd10: drbd_bm_resize called with capacity == 314572800<br>[Tue Aug 28 14:32:40 2018] block drbd10: resync bitmap: bits=39321600 words=614400 
pages=1200<br>[Tue Aug 28 14:32:40 2018] block drbd10: size = 150 GB (157286400 KB)<br>[Tue Aug 28 14:32:40 2018] block drbd10: recounting of set bits took additional 0 jiffies<br>[Tue Aug 28 14:32:40 2018] block drbd10: 0 KB (0 bits) marked out-of-sync by on disk bit-map.<br>[Tue Aug 28 14:32:40 2018] block drbd10: disk( Attaching -> UpToDate ) <br>[Tue Aug 28 14:32:40 2018] block drbd10: attached to UUIDs 0748EE11C429D3B4:0000000000000000:FDAEFCD2E8D9890A:FDADFCD2E8D9890B<br>[Tue Aug 28 14:32:40 2018] drbd resource10: conn( StandAlone -> Unconnected ) <br>[Tue Aug 28 14:32:40 2018] drbd resource10: Starting receiver thread (from drbd_w_resource [10225])<br>[Tue Aug 28 14:32:40 2018] drbd resource10: receiver (re)started<br>[Tue Aug 28 14:32:40 2018] drbd resource10: conn( Unconnected -> WFConnection ) <br>[Tue Aug 28 14:32:41 2018] drbd resource10: Handshake successful: Agreed network protocol version 101<br>[Tue Aug 28 14:32:41 2018] drbd resource10: Feature flags enabled on protocol level: 0x7 TRIM THIN_RESYNC WRITE_SAME.<br>[Tue Aug 28 14:32:41 2018] drbd resource10: Peer authenticated using 16 bytes HMAC<br>[Tue Aug 28 14:32:41 2018] drbd resource10: conn( WFConnection -> WFReportParams ) <br>[Tue Aug 28 14:32:41 2018] drbd resource10: Starting ack_recv thread (from drbd_r_resource [10246])<br>[Tue Aug 28 14:32:41 2018] block drbd10: drbd_sync_handshake:<br>[Tue Aug 28 14:32:41 2018] block drbd10: self 0748EE11C429D3B4:0000000000000000:FDAEFCD2E8D9890A:FDADFCD2E8D9890B bits:0 flags:0<br>[Tue Aug 28 14:32:41 2018] block drbd10: peer 629F1036CD6CA2AF:0748EE11C429D3B5:FDAEFCD2E8D9890B:FDADFCD2E8D9890B bits:0 flags:0<br>[Tue Aug 28 14:32:41 2018] block drbd10: uuid_compare()=-1 by rule 50<br>[Tue Aug 28 14:32:41 2018] block drbd10: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated ) pdsk( DUnknown -> UpToDate ) <br>[Tue Aug 28 14:32:41 2018] block drbd10: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 
23; compression: 100.0%<br>[Tue Aug 28 14:32:41 2018] block drbd10: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%<br>[Tue Aug 28 14:32:41 2018] block drbd10: conn( WFBitMapT -> WFSyncUUID ) <br>[Tue Aug 28 14:32:41 2018] block drbd10: updated sync uuid 0749EE11C429D3B4:0000000000000000:FDAEFCD2E8D9890A:FDADFCD2E8D9890B<br>[Tue Aug 28 14:32:41 2018] block drbd10: helper command: /bin/true before-resync-target minor-10<br>[Tue Aug 28 14:32:41 2018] block drbd10: helper command: /bin/true before-resync-target minor-10 exit code 0 (0x0)<br>[Tue Aug 28 14:32:41 2018] block drbd10: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent ) <br>[Tue Aug 28 14:32:41 2018] block drbd10: Began resync as SyncTarget (will sync 0 KB [0 bits set]).<br>[Tue Aug 28 14:32:41 2018] block drbd10: Resync done (total 1 sec; paused 0 sec; 0 K/sec)<br>[Tue Aug 28 14:32:41 2018] block drbd10: updated UUIDs 629F1036CD6CA2AE:0000000000000000:0749EE11C429D3B4:0748EE11C429D3B5<br>[Tue Aug 28 14:32:41 2018] block drbd10: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate ) <br>[Tue Aug 28 14:32:41 2018] block drbd10: helper command: /bin/true after-resync-target minor-10<br>[Tue Aug 28 14:32:41 2018] block drbd10: helper command: /bin/true after-resync-target minor-10 exit code 0 (0x0)<br><br>I have 4 different clusters on these same versions, and I see the same problem on all of them.<br><br>It's not always the same resource that is affected.<br> </p> <p>Any idea what I can check?<br><br>Thanks a lot,<br><br><br><signature>Nicolas</signature></p> </div></body></html>