[DRBD-user] Xenserver 6.1 - network problem when i promote node to primary

Ballard, Justin Justin.Ballard at utoledo.edu
Mon Apr 22 14:51:20 CEST 2013



Greetings

Having just been through the same problem, I would recommend switching the XenServer network backend from Open vSwitch (OVS) back to Linux bridging. OVS and DRBD don't seem to play nicely with each other for some reason.
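
For anyone following along, here is a minimal sketch of how the backend could be checked before switching. It assumes a stock XenServer 6.x host, where the active backend is recorded in /etc/xensource/network.conf and the xe-switch-network-backend tool is available; NETWORK_CONF and current_backend are names made up for this example. The script only prints the suggested command and changes nothing. The quoted traces look consistent with a lock conflict (drbdsetup is waiting in drbd_req_state while inside genl_rcv, and ovs-vswitchd is blocked on a mutex in genl_rcv), but that is a reading of the trace, not a confirmed diagnosis.

```shell
#!/bin/sh
# Assumption: the active backend is stored in /etc/xensource/network.conf
# as "openvswitch" or "bridge". NETWORK_CONF is a hypothetical override
# so the check can be pointed at another file.
NETWORK_CONF="${NETWORK_CONF:-/etc/xensource/network.conf}"

current_backend() {
    # Print the backend name, or "unknown" if the file cannot be read.
    if [ -r "$NETWORK_CONF" ]; then
        tr -d '[:space:]' < "$NETWORK_CONF"
    else
        echo unknown
    fi
}

if [ "$(current_backend)" = "openvswitch" ]; then
    # Only print the suggestion; switching is left to the administrator.
    echo "Backend is Open vSwitch. To switch, run on each host:"
    echo "  xe-switch-network-backend bridge"
    echo "and then reboot the host."
else
    echo "Backend is: $(current_backend)"
fi
```

After the switch and a reboot, "drbdadm primary all" can be retried to see whether the hang is gone.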

Good luck



On Apr 22, 2013, at 8:00 AM, "gerrykernan" <gerry.kernan at infinityit.ie> wrote:

> Hi,
> 
> I am using DRBD 8.4.3-2 on Citrix XenServer 6.1, but when I run "drbdadm
> primary all", the command hangs and eventually returns "command timed out
> after 120 sec".
> In dmesg I get the errors below. Then both NICs get disconnected and come
> back online after a few minutes. When the system eventually comes back, the
> node is Primary/Secondary and connected, but it takes a long time.
> Resource 1 is 220 GB and resource 2 is 900 GB.
> 
> 
> drbd version
> drbd-utils-8.4.3-2
> drbd-xen-8.4.3-2
> drbd-km-2.6.32.43_0.4.1.xs1.6.10.734.170748xen-8.4.3-2
> 
> [  600.973858] INFO: task ovs-vswitchd:5615 blocked for more than 120 seconds.
> [  600.973868] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  600.973874] ovs-vswitchd  D eeaabc9c     0  5615   5614 0x00000004
> [  600.973879]  eeaabcb0 00000282 00000002 eeaabc9c c16db324 00000000 00000000 00000000
> [  600.973886]  00000000 00000000 0000000f e871f4a4 e871f394 e871f300 e871f4a4 c16dd200
> [  600.973892]  00000002 b948bfd4 0000005a ecff8580 00000019 00002306 00000000 c01d4790
> [  600.973898] Call Trace:
> [  600.973908]  [<c01d4790>] ? pollwake+0x0/0x70
> [  600.973911]  [<c01d4790>] ? pollwake+0x0/0x70
> [  600.973917]  [<c03d418c>] __mutex_lock_slowpath+0x10c/0x160
> [  600.973921]  [<c03d3fe5>] mutex_lock+0x25/0x40
> [  600.973926]  [<c036d875>] genl_rcv+0x15/0x30
> [  600.973929]  [<c036ba81>] netlink_unicast+0x241/0x250
> [  600.973934]  [<c0349acc>] ? memcpy_fromiovec+0x4c/0x70
> [  600.973938]  [<c036c771>] netlink_sendmsg+0x1c1/0x280
> [  600.973941]  [<c033ffd7>] sock_sendmsg+0xd7/0x100
> [  600.973945]  [<c014e6b0>] ? autoremove_wake_function+0x0/0x50
> [  600.973949]  [<c014e6b0>] ? autoremove_wake_function+0x0/0x50
> [  600.973952]  [<c0151cd1>] ? __hrtimer_start_range_ns+0xe1/0x190
> [  600.973957]  [<c0261871>] ? copy_from_user+0x41/0x70
> [  600.973960]  [<c0349df6>] ? verify_iovec+0x36/0xa0
> [  600.973963]  [<c0340116>] sys_sendmsg+0x116/0x230
> [  600.973967]  [<c0340c07>] ? sys_recvmsg+0xf7/0x1c0
> [  600.973971]  [<c0146b23>] ? get_signal_to_deliver+0xa3/0x4e0
> [  600.973976]  [<c01c43d9>] ? do_sync_read+0xd9/0x110
> [  600.973979]  [<c033f0d4>] ? sock_poll+0x14/0x20
> [  600.973983]  [<c01f540a>] ? ep_send_events_proc+0x5a/0x100
> [  600.973987]  [<c01f58ac>] ? ep_scan_ready_list+0xfc/0x150
> [  600.973991]  [<c03413a7>] sys_socketcall+0x247/0x270
> [  600.973995]  [<c012c420>] ? default_wake_function+0x0/0x20
> [  600.973999]  [<c0104571>] syscall_call+0x7/0xb
> [  600.974018] INFO: task drbdsetup:8432 blocked for more than 120 seconds.
> [  600.974023] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  600.974029] drbdsetup     D 00000001     0  8432      1 0x00000004
> [  600.974033]  ea92dc54 00000286 ea92dbd8 00000001 00000003 eeaf4d08 eeaf4d04 00000000
> [  600.974039]  00000000 b94886b7 0000005a ee36d754 ee36d644 ee36d5b0 ee36d754 c16bd200
> [  600.974045]  00000000 b9481226 0000005a edf373c0 00000000 00000008 edb9c198 eeaf4c00
> [  600.974050] Call Trace:
> [  600.974065]  [<f0f2524d>] ? _req_st_cond+0xed/0x130 [drbd]
> [  600.974075]  [<f0f2834b>] drbd_req_state+0x14b/0x310 [drbd]
> [  600.974079]  [<c01444a9>] ? complete_signal+0xd9/0x1b0
> [  600.974082]  [<c014e6b0>] ? autoremove_wake_function+0x0/0x50
> [  600.974092]  [<f0f28533>] _drbd_request_state+0x23/0xb0 [drbd]
> [  600.974096]  [<c0145435>] ? force_sig_info+0xa5/0xc0
> [  600.974107]  [<f0f1e968>] drbd_set_role+0x58/0x780 [drbd]
> [  600.974118]  [<f0f1f4c6>] drbd_adm_set_role+0xa6/0xc0 [drbd]
> [  600.974122]  [<c036ebc3>] genl_rcv_msg+0x183/0x1c0
> [  600.974126]  [<c036ea40>] ? genl_rcv_msg+0x0/0x1c0
> [  600.974129]  [<c036bced>] netlink_rcv_skb+0x7d/0xa0
> [  600.974132]  [<c036d881>] genl_rcv+0x21/0x30
> [  600.974136]  [<c036ba81>] netlink_unicast+0x241/0x250
> [  600.974139]  [<c0349acc>] ? memcpy_fromiovec+0x4c/0x70
> [  600.974143]  [<c036c771>] netlink_sendmsg+0x1c1/0x280
> [  600.974146]  [<c033f63b>] sock_aio_write+0xeb/0x100
> [  600.974150]  [<c01c42c9>] do_sync_write+0xd9/0x110
> [  600.974154]  [<c014e6b0>] ? autoremove_wake_function+0x0/0x50
> [  600.974158]  [<c01c4b68>] vfs_write+0x178/0x180
> [  600.974161]  [<c01c5192>] sys_write+0x42/0x70
> [  600.974165]  [<c0104571>] syscall_call+0x7/0xb
> 
> 
> config
> /etc/drbd.d/global_common.conf
> global { usage-count yes; }
> common {
> protocol C;
> net {
> shared-secret "#####";
> after-sb-0pri discard-zero-changes;
> after-sb-1pri consensus;
> after-sb-2pri disconnect;
> }
> disk { max-bio-bvecs 1; }
> handlers { split-brain "/usr/lib/drbd/notify-split-brain.sh"; }
> syncer { rate 40M; }
> }
> 
> 
> /etc/drbd.d/drbd-sr1.res
> resource drbd-sr1 {
> device /dev/drbd1;
> disk /dev/cciss/c0d1p1;
> meta-disk internal;
> on xenoctagon-1 { address 10.100.100.1:7788; }
> on xen-octagon2 { address 10.100.100.2:7788; }
> }
> 
> /etc/drbd.d/drbd-sr2.res
> resource drbd-sr2 {
> device /dev/drbd2;
> disk /dev/cciss/c0d2p1;
> meta-disk internal;
> on xenoctagon-1 { address 10.100.100.1:7789; }
> on xen-octagon2 { address 10.100.100.2:7789; }
> }
> 
> Regards
> 
> Gerry Kernan
> 
> 
> 


