Hi,

I am using DRBD 8.4.3-2 on Citrix XenServer 6.1. When I run "drbdadm primary all", the command hangs and eventually returns "command timed out after 120 sec". In dmesg I get the errors below. Both NICs then get disconnected and come back online after a few minutes. When the system eventually comes back, the node is Primary/Secondary and connected, but it takes a long time.

Resource 1 is 220 GB and resource 2 is 900 GB.

DRBD version:

drbd-utils-8.4.3-2
drbd-xen-8.4.3-2
drbd-km-2.6.32.43_0.4.1.xs1.6.10.734.170748xen-8.4.3-2

dmesg output:

[ 600.973858] INFO: task ovs-vswitchd:5615 blocked for more than 120 seconds.
[ 600.973868] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 600.973874] ovs-vswitchd D eeaabc9c 0 5615 5614 0x00000004
[ 600.973879] eeaabcb0 00000282 00000002 eeaabc9c c16db324 00000000 00000000 00000000
[ 600.973886] 00000000 00000000 0000000f e871f4a4 e871f394 e871f300 e871f4a4 c16dd200
[ 600.973892] 00000002 b948bfd4 0000005a ecff8580 00000019 00002306 00000000 c01d4790
[ 600.973898] Call Trace:
[ 600.973908] [<c01d4790>] ? pollwake+0x0/0x70
[ 600.973911] [<c01d4790>] ? pollwake+0x0/0x70
[ 600.973917] [<c03d418c>] __mutex_lock_slowpath+0x10c/0x160
[ 600.973921] [<c03d3fe5>] mutex_lock+0x25/0x40
[ 600.973926] [<c036d875>] genl_rcv+0x15/0x30
[ 600.973929] [<c036ba81>] netlink_unicast+0x241/0x250
[ 600.973934] [<c0349acc>] ? memcpy_fromiovec+0x4c/0x70
[ 600.973938] [<c036c771>] netlink_sendmsg+0x1c1/0x280
[ 600.973941] [<c033ffd7>] sock_sendmsg+0xd7/0x100
[ 600.973945] [<c014e6b0>] ? autoremove_wake_function+0x0/0x50
[ 600.973949] [<c014e6b0>] ? autoremove_wake_function+0x0/0x50
[ 600.973952] [<c0151cd1>] ? __hrtimer_start_range_ns+0xe1/0x190
[ 600.973957] [<c0261871>] ? copy_from_user+0x41/0x70
[ 600.973960] [<c0349df6>] ? verify_iovec+0x36/0xa0
[ 600.973963] [<c0340116>] sys_sendmsg+0x116/0x230
[ 600.973967] [<c0340c07>] ? sys_recvmsg+0xf7/0x1c0
[ 600.973971] [<c0146b23>] ? get_signal_to_deliver+0xa3/0x4e0
[ 600.973976] [<c01c43d9>] ? do_sync_read+0xd9/0x110
[ 600.973979] [<c033f0d4>] ? sock_poll+0x14/0x20
[ 600.973983] [<c01f540a>] ? ep_send_events_proc+0x5a/0x100
[ 600.973987] [<c01f58ac>] ? ep_scan_ready_list+0xfc/0x150
[ 600.973991] [<c03413a7>] sys_socketcall+0x247/0x270
[ 600.973995] [<c012c420>] ? default_wake_function+0x0/0x20
[ 600.973999] [<c0104571>] syscall_call+0x7/0xb
[ 600.974018] INFO: task drbdsetup:8432 blocked for more than 120 seconds.
[ 600.974023] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 600.974029] drbdsetup D 00000001 0 8432 1 0x00000004
[ 600.974033] ea92dc54 00000286 ea92dbd8 00000001 00000003 eeaf4d08 eeaf4d04 00000000
[ 600.974039] 00000000 b94886b7 0000005a ee36d754 ee36d644 ee36d5b0 ee36d754 c16bd200
[ 600.974045] 00000000 b9481226 0000005a edf373c0 00000000 00000008 edb9c198 eeaf4c00
[ 600.974050] Call Trace:
[ 600.974065] [<f0f2524d>] ? _req_st_cond+0xed/0x130 [drbd]
[ 600.974075] [<f0f2834b>] drbd_req_state+0x14b/0x310 [drbd]
[ 600.974079] [<c01444a9>] ? complete_signal+0xd9/0x1b0
[ 600.974082] [<c014e6b0>] ? autoremove_wake_function+0x0/0x50
[ 600.974092] [<f0f28533>] _drbd_request_state+0x23/0xb0 [drbd]
[ 600.974096] [<c0145435>] ? force_sig_info+0xa5/0xc0
[ 600.974107] [<f0f1e968>] drbd_set_role+0x58/0x780 [drbd]
[ 600.974118] [<f0f1f4c6>] drbd_adm_set_role+0xa6/0xc0 [drbd]
[ 600.974122] [<c036ebc3>] genl_rcv_msg+0x183/0x1c0
[ 600.974126] [<c036ea40>] ? genl_rcv_msg+0x0/0x1c0
[ 600.974129] [<c036bced>] netlink_rcv_skb+0x7d/0xa0
[ 600.974132] [<c036d881>] genl_rcv+0x21/0x30
[ 600.974136] [<c036ba81>] netlink_unicast+0x241/0x250
[ 600.974139] [<c0349acc>] ? memcpy_fromiovec+0x4c/0x70
[ 600.974143] [<c036c771>] netlink_sendmsg+0x1c1/0x280
[ 600.974146] [<c033f63b>] sock_aio_write+0xeb/0x100
[ 600.974150] [<c01c42c9>] do_sync_write+0xd9/0x110
[ 600.974154] [<c014e6b0>] ? autoremove_wake_function+0x0/0x50
[ 600.974158] [<c01c4b68>] vfs_write+0x178/0x180
[ 600.974161] [<c01c5192>] sys_write+0x42/0x70
[ 600.974165] [<c0104571>] syscall_call+0x7/0xb

Config:

/etc/drbd.d/global_common.conf

global {
    usage-count yes;
}
common {
    protocol C;
    net {
        shared-secret "#####";
        after-sb-0pri discard-zero-changes;
        after-sb-1pri consensus;
        after-sb-2pri disconnect;
    }
    disk {
        max-bio-bvecs 1;
    }
    handlers {
        split-brain "/usr/lib/drbd/notify-split-brain.sh";
    }
    syncer {
        rate 40M;
    }
}

/etc/drbd.d/drbd-sr1.res

resource drbd-sr1 {
    device /dev/drbd1;
    disk /dev/cciss/c0d1p1;
    meta-disk internal;
    on xenoctagon-1 {
        address 10.100.100.1:7788;
    }
    on xen-octagon2 {
        address 10.100.100.2:7788;
    }
}

/etc/drbd.d/drbd-sr2.res

resource drbd-sr2 {
    device /dev/drbd2;
    disk /dev/cciss/c0d2p1;
    meta-disk internal;
    on xenoctagon-1 {
        address 10.100.100.1:7789;
    }
    on xen-octagon2 {
        address 10.100.100.2:7789;
    }
}

Regards,
Gerry Kernan

--
View this message in context: http://drbd.10923.n7.nabble.com/Xenserver-6-1-network-problem-when-i-promote-node-to-primary-tp17727.html
Sent from the DRBD - User mailing list archive at Nabble.com.