Hello everyone,

*Urgent, please: my servers are in production*

I have a serious problem and need help.

*My scenario*

- I have two ASUS P8H77-M PRO workstations with Intel Core i7, Proxmox VE 2.3, DRBD 8.3.10, and LVM on top of DRBD
- 2 Realtek RTL8111/8168 PCI-E NICs at 1 Gb/s, bonded in round-robin mode and used only for DRBD

After a while, it shows me this:

shell# cat /proc/drbd
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by root at sighted, 2012-10-09 12:47:51
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r-----
    ns:237256 nr:307093 dw:307093 dr:690264 al:0 bm:321 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:467984 dw:467984 dr:537932 al:0 bm:13 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

*This is my configuration:*

*File global_common.conf:*

global {
    usage-count no;
}
common {
    protocol C;
    handlers {
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
        split-brain "/usr/lib/drbd/notify-split-brain.sh root";
        out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
    }
    startup {
    }
    disk {
        on-io-error detach;
    }
    net {
        sndbuf-size 0;
        no-tcp-cork;
        unplug-watermark 16;
        max-buffers 8000;
        max-epoch-size 8000;
        data-integrity-alg sha1;
    }
    syncer {
        rate 75M;
        al-extents 3389;
        cpu-mask 0;
        verify-alg "sha1";
    }
}

*File r0.res:*

resource r0 {
    protocol C;
    startup {
        wfc-timeout 15;
        degr-wfc-timeout 60;
        become-primary-on both;
    }
    net {
        allow-two-primaries;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    on kvm5 {
        device /dev/drbd0;
        disk /dev/sda3;
        address 10.2.2.50:7788;
        meta-disk internal;
    }
    on kvm6 {
        device /dev/drbd0;
        disk /dev/sda3;
        address 10.2.2.51:7788;
        meta-disk internal;
    }
}

*File r1.res:*

resource r1 {
    protocol C;
    startup {
        wfc-timeout 15;
        degr-wfc-timeout 60;
        become-primary-on both;
    }
    net {
        allow-two-primaries;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    on kvm5 {
        device /dev/drbd1;
        disk /dev/sdb3;
        address 10.2.2.50:7789;
        meta-disk internal;
    }
    on kvm6 {
        device /dev/drbd1;
        disk /dev/sdb3;
        address 10.2.2.51:7789;
        meta-disk internal;
    }
}

*Note:* I use "data-integrity-alg sha1;" in the net section because data integrity is very important to me.

*These are my logs:*

*Log on node A:*

Jun 14 08:07:28 kvm5 kernel: dlm: connecting to 4
Jun 14 08:50:12 kvm5 kernel: block drbd0: Digest mismatch, buffer modified by upper layers during write: 21158352s +4096
Jun 14 08:50:12 kvm5 kernel: block drbd0: sock was reset by peer
Jun 14 08:50:12 kvm5 kernel: block drbd0: peer( Primary -> Unknown ) conn( Connected -> BrokenPipe ) pdsk( UpToDate -> DUnknown )
Jun 14 08:50:12 kvm5 kernel: block drbd0: short read expecting header on sock: r=-104
Jun 14 08:50:12 kvm5 kernel: block drbd0: meta connection shut down by peer.
Jun 14 08:50:12 kvm5 kernel: block drbd0: new current UUID 76A887AA443E0DBB:15B9E4140BB5F41B:48B8F43E491AA38D:48B7F43E491AA38D
Jun 14 08:50:12 kvm5 kernel: block drbd0: asender terminated
Jun 14 08:50:12 kvm5 kernel: block drbd0: Terminating asender thread
Jun 14 08:50:12 kvm5 kernel: block drbd0: Connection closed
Jun 14 08:50:12 kvm5 kernel: block drbd0: conn( BrokenPipe -> Unconnected )
Jun 14 08:50:12 kvm5 kernel: block drbd0: receiver terminated
Jun 14 08:50:12 kvm5 kernel: block drbd0: Restarting receiver thread
Jun 14 08:50:12 kvm5 kernel: block drbd0: receiver (re)started
Jun 14 08:50:12 kvm5 kernel: block drbd0: conn( Unconnected -> WFConnection )
Jun 14 08:50:13 kvm5 kernel: block drbd0: Handshake successful: Agreed network protocol version 96
Jun 14 08:50:13 kvm5 kernel: block drbd0: conn( WFConnection -> WFReportParams )
Jun 14 08:50:13 kvm5 kernel: block drbd0: Starting asender thread (from drbd0_receiver [1847])
Jun 14 08:50:13 kvm5 kernel: block drbd0: data-integrity-alg: sha1
Jun 14 08:50:13 kvm5 kernel: block drbd0: drbd_sync_handshake:
Jun 14 08:50:13 kvm5 kernel: block drbd0: self 76A887AA443E0DBB:15B9E4140BB5F41B:48B8F43E491AA38D:48B7F43E491AA38D bits:99 flags:0
Jun 14 08:50:13 kvm5 kernel: block drbd0: peer CF68F4906E4001C5:15B9E4140BB5F41B:48B8F43E491AA38D:48B7F43E491AA38D bits:0 flags:0
Jun 14 08:50:13 kvm5 kernel: block drbd0: uuid_compare()=100 by rule 90
Jun 14 08:50:13 kvm5 kernel: block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
Jun 14 08:50:13 kvm5 kernel: block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
Jun 14 08:50:13 kvm5 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
Jun 14 08:50:13 kvm5 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0
Jun 14 08:50:13 kvm5 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
Jun 14 08:50:13 kvm5 kernel: block drbd0: conn( WFReportParams -> Disconnecting )
Jun 14 08:50:13 kvm5 kernel: block drbd0: error receiving ReportState, l: 4!
Jun 14 08:50:13 kvm5 kernel: block drbd0: asender terminated
Jun 14 08:50:13 kvm5 kernel: block drbd0: Terminating asender thread
Jun 14 08:50:13 kvm5 kernel: block drbd0: Connection closed
Jun 14 08:50:13 kvm5 kernel: block drbd0: conn( Disconnecting -> StandAlone )
Jun 14 08:50:13 kvm5 kernel: block drbd0: receiver terminated
Jun 14 08:50:13 kvm5 kernel: block drbd0: Terminating receiver thread

*Log on node B:*

Jun 14 08:07:28 kvm6 kernel: dlm: Using TCP for communications
Jun 14 08:07:28 kvm6 kernel: dlm: got connection from 3
Jun 14 08:50:12 kvm6 kernel: block drbd0: Digest integrity check FAILED: 21158352s +4096
Jun 14 08:50:12 kvm6 kernel: block drbd0: error receiving Data, l: 4140!
Jun 14 08:50:12 kvm6 kernel: block drbd0: peer( Primary -> Unknown ) conn( Connected -> ProtocolError ) pdsk( UpToDate -> DUnknown )
Jun 14 08:50:12 kvm6 kernel: block drbd0: new current UUID CF68F4906E4001C5:15B9E4140BB5F41B:48B8F43E491AA38D:48B7F43E491AA38D
Jun 14 08:50:12 kvm6 kernel: block drbd0: asender terminated
Jun 14 08:50:12 kvm6 kernel: block drbd0: Terminating asender thread
Jun 14 08:50:12 kvm6 kernel: block drbd0: Connection closed
Jun 14 08:50:12 kvm6 kernel: block drbd0: conn( ProtocolError -> Unconnected )
Jun 14 08:50:12 kvm6 kernel: block drbd0: receiver terminated
Jun 14 08:50:12 kvm6 kernel: block drbd0: Restarting receiver thread
Jun 14 08:50:12 kvm6 kernel: block drbd0: receiver (re)started
Jun 14 08:50:12 kvm6 kernel: block drbd0: conn( Unconnected -> WFConnection )
Jun 14 08:50:13 kvm6 kernel: block drbd0: Handshake successful: Agreed network protocol version 96
Jun 14 08:50:13 kvm6 kernel: block drbd0: conn( WFConnection -> WFReportParams )
Jun 14 08:50:13 kvm6 kernel: block drbd0: Starting asender thread (from drbd0_receiver [1857])
Jun 14 08:50:13 kvm6 kernel: block drbd0: data-integrity-alg: sha1
Jun 14 08:50:13 kvm6 kernel: block drbd0: drbd_sync_handshake:
Jun 14 08:50:13 kvm6 kernel: block drbd0: self CF68F4906E4001C5:15B9E4140BB5F41B:48B8F43E491AA38D:48B7F43E491AA38D bits:0 flags:0
Jun 14 08:50:13 kvm6 kernel: block drbd0: peer 76A887AA443E0DBB:15B9E4140BB5F41B:48B8F43E491AA38D:48B7F43E491AA38D bits:99 flags:0
Jun 14 08:50:13 kvm6 kernel: block drbd0: uuid_compare()=100 by rule 90
Jun 14 08:50:13 kvm6 kernel: block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
Jun 14 08:50:13 kvm6 kernel: block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
Jun 14 08:50:13 kvm6 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
Jun 14 08:50:13 kvm6 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0
Jun 14 08:50:13 kvm6 kernel: block drbd0: meta connection shut down by peer.
Jun 14 08:50:13 kvm6 kernel: block drbd0: conn( WFReportParams -> NetworkFailure )
Jun 14 08:50:13 kvm6 kernel: block drbd0: asender terminated
Jun 14 08:50:13 kvm6 kernel: block drbd0: Terminating asender thread
Jun 14 08:50:13 kvm6 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
Jun 14 08:50:13 kvm6 kernel: block drbd0: conn( NetworkFailure -> Disconnecting )
Jun 14 08:50:13 kvm6 kernel: block drbd0: error receiving ReportState, l: 4!
Jun 14 08:50:13 kvm6 kernel: block drbd0: Connection closed
Jun 14 08:50:13 kvm6 kernel: block drbd0: conn( Disconnecting -> StandAlone )
Jun 14 08:50:13 kvm6 kernel: block drbd0: receiver terminated
Jun 14 08:50:13 kvm6 kernel: block drbd0: Terminating receiver thread

I will be extremely grateful to anyone who can help me.

Best regards,
Cesar

--
View this message in context: http://drbd.10923.n7.nabble.com/Replication-problems-constants-with-DRBD-8-3-10-tp17896.html
Sent from the DRBD - User mailing list archive at Nabble.com.
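
P.S. The per-device state in the /proc/drbd output quoted earlier in this message could also be checked by a small script, so that a resource dropping to StandAlone is noticed sooner. What follows is only a rough sketch, assuming the 8.3-style /proc/drbd layout shown in this message; the names parse_drbd_status and disconnected_minors are made up for illustration and are not part of DRBD. Treating every state other than "Connected" as a problem is a simplification (resync states, for example, would be flagged too).

```python
import re

# Sample copied from the /proc/drbd output quoted in this message.
SAMPLE = """\
version: 8.3.13 (api:88/proto:86-96)
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r-----
    ns:237256 nr:307093 dw:307093 dr:690264 al:0 bm:321 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:467984 dw:467984 dr:537932 al:0 bm:13 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
"""

# Matches device lines such as " 0: cs:StandAlone ro:Primary/Unknown ds:..."
STATUS_LINE = re.compile(r"^\s*(\d+):\s+cs:(\S+)\s+ro:(\S+)\s+ds:(\S+)")


def parse_drbd_status(text):
    """Return {minor: {"cs": ..., "ro": ..., "ds": ...}} for each device line."""
    devices = {}
    for line in text.splitlines():
        match = STATUS_LINE.match(line)
        if match:
            minor, cs, ro, ds = match.groups()
            devices[int(minor)] = {"cs": cs, "ro": ro, "ds": ds}
    return devices


def disconnected_minors(devices):
    """Return the sorted minors whose connection state is not 'Connected'."""
    return sorted(m for m, state in devices.items() if state["cs"] != "Connected")


if __name__ == "__main__":
    # On a real node, the text would be read from /proc/drbd instead of SAMPLE.
    status = parse_drbd_status(SAMPLE)
    for minor in disconnected_minors(status):
        state = status[minor]
        print(f"drbd{minor}: cs={state['cs']} ro={state['ro']} ds={state['ds']}")
```

With the sample above, only drbd0 is reported, which matches the StandAlone state described in this message.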