Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
There we go -- just hadn't noticed before, but CentOS5 (this is a XenServer actually) was writing out the same stuff to /var/log/kern.log with timestamps. I reproduced this again just to make sure I have my story straight. I have vh03 and vh04, with vh04 being the primary right now. *First I make vh04 secondary:* [root at vh03 ~]# date Mon Oct 8 00:47:30 EST 2012 [root at vh04 ~]# drbdadm secondary drbd-sr1 [root at vh04 ~]# cat /proc/drbd version: 8.4.2 (api:1/proto:86-101) GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by root at xenddk.eightmile.locatrix.net, 2012-09-20 14:40:39 1: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r----- ns:551471038 nr:102855 dw:339447098 dr:893008359 al:52675 bm:35090 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0 *Kernel log on vh04 shows:* Oct 8 00:47:37 vh04 kernel: block drbd1: role( Primary -> Secondary ) Oct 8 00:47:37 vh04 kernel: block drbd1: bitmap WRITE of 0 pages took 0 jiffies Oct 8 00:47:37 vh04 kernel: block drbd1: 0 KB (0 bits) marked out-of-sync by on disk bit-map. *No issues, that worked fine, returns immediately as expected.* *Now I make vh03 the primary:* [root at vh03 ~]# date Mon Oct 8 00:48:00 EST 2012 [root at vh03 ~]# cat /proc/drbd version: 8.4.2 (api:1/proto:86-101) GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by root at xenddk.eightmile.locatrix.net, 2012-09-20 14:40:39 1: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r----- ns:102855 nr:530600089 dw:530702944 dr:288517 al:196 bm:25600 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0 [root at vh03 ~]# drbdadm primary drbd-sr1 *Right away vh03 freezes up. So I'm no longer getting any console output...but 15 minutes later it comes alive again...this time SSH didn't even drop, it just started working again after 15 minutes, and this is what I see:* Command 'drbdsetup primary 1' did not terminate within 121 seconds [root at vh03 ~]# date Mon Oct 8 01:04:19 EST 2012 *Now that it's alive again, here's what the kernel log had to say during that time:* Oct 8 00:51:39 vh03 kernel: INFO: task ovs-vswitchd:5724 blocked for more than 120 seconds. Oct 8 00:51:39 vh03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Oct 8 00:51:39 vh03 kernel: ovs-vswitchd D ea841ce8 0 5724 5723 0x00000004 Oct 8 00:51:39 vh03 kernel: ea841cfc 00200286 00000002 ea841ce8 c16eb224 00000000 00200200 ee404050 Oct 8 00:51:39 vh03 kernel: ea8a2140 00000019 eddd99d4 eddd98c4 eddd9830 eddd99d4 c16ec100 00000003 Oct 8 00:51:39 vh03 kernel: eb7e1900 00d8bdf8 00000000 c022a07e ea841f08 00000001 00000001 ea841cec Oct 8 00:51:39 vh03 kernel: Call Trace: Oct 8 00:51:39 vh03 kernel: [<c022a07e>] ? copy_to_user+0x3e/0x60 Oct 8 00:51:39 vh03 kernel: [<c0383c2b>] __mutex_lock_slowpath+0x10b/0x160 Oct 8 00:51:39 vh03 kernel: [<c0383a80>] mutex_lock+0x20/0x40 Oct 8 00:51:39 vh03 kernel: [<c03229a0>] genl_rcv+0x10/0x30 Oct 8 00:51:39 vh03 kernel: [<c0320d5a>] netlink_unicast+0x24a/0x270 Oct 8 00:51:39 vh03 kernel: [<c0301e21>] ? memcpy_fromiovec+0x41/0x60 Oct 8 00:51:39 vh03 kernel: [<c03219de>] netlink_sendmsg+0x1ce/0x290 Oct 8 00:51:39 vh03 kernel: [<c02f88ea>] sock_sendmsg+0xfa/0x110 Oct 8 00:51:39 vh03 kernel: [<c01458a0>] ? enqueue_hrtimer+0x60/0x80 Oct 8 00:51:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:51:39 vh03 kernel: [<c02f7ac0>] ? sock_destroy_inode+0x10/0x20 Oct 8 00:51:39 vh03 kernel: [<c0229eac>] ? copy_from_user+0x3c/0x70 Oct 8 00:51:39 vh03 kernel: [<c02f9493>] ? sys_recvmsg+0xf3/0x1c0 Oct 8 00:51:39 vh03 kernel: [<c02f80a0>] ? sockfd_lookup_light+0x30/0x60 Oct 8 00:51:39 vh03 kernel: [<c02f8bc5>] sys_sendto+0xa5/0xd0 Oct 8 00:51:39 vh03 kernel: [<c0194625>] ? put_unused_fd+0x35/0x70 Oct 8 00:51:39 vh03 kernel: [<c02f9973>] ? sys_accept4+0x1b3/0x1e0 Oct 8 00:51:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:51:39 vh03 kernel: [<c02f8c26>] sys_send+0x36/0x40 Oct 8 00:51:39 vh03 kernel: [<c02f9ae6>] sys_socketcall+0x146/0x270 Oct 8 00:51:39 vh03 kernel: [<c0104501>] syscall_call+0x7/0xb Oct 8 00:51:39 vh03 kernel: INFO: task drbdsetup:18163 blocked for more than 120 seconds. Oct 8 00:51:39 vh03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Oct 8 00:51:39 vh03 kernel: drbdsetup D 00000002 0 18163 1 0x00000000 Oct 8 00:51:39 vh03 kernel: e9ecdc68 00000286 00000001 00000002 ed014410 e9ecdc04 e9ecdc14 f0c9ac78 Oct 8 00:51:39 vh03 kernel: 8c8c6ebd 00008176 ed817214 ed817104 ed817070 ed817214 c16ec100 00000003 Oct 8 00:51:39 vh03 kernel: edc08200 f0c8b32b 00000000 00003104 00000000 00000005 002689a5 ed014400 Oct 8 00:51:39 vh03 kernel: Call Trace: Oct 8 00:51:39 vh03 kernel: [<f0c9ac78>] ? conn_highest_peer+0x48/0x60 [drbd] Oct 8 00:51:39 vh03 kernel: [<f0c8b32b>] ? drbd_send_command+0x3b/0x50 [drbd] Oct 8 00:51:39 vh03 kernel: [<f0c9bfe8>] ? _req_st_cond+0xe8/0x130 [drbd] Oct 8 00:51:39 vh03 kernel: [<f0c9e846>] drbd_req_state+0x146/0x300 [drbd] Oct 8 00:51:39 vh03 kernel: [<c0139d23>] ? signal_wake_up+0x23/0x40 Oct 8 00:51:39 vh03 kernel: [<c0139e19>] ? complete_signal+0xd9/0x1b0 Oct 8 00:51:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:51:39 vh03 kernel: [<f0c9ea1e>] _drbd_request_state+0x1e/0xa0 [drbd] Oct 8 00:51:39 vh03 kernel: [<c013ac4b>] ? force_sig_info+0x9b/0xb0 Oct 8 00:51:39 vh03 kernel: [<f0c96303>] drbd_set_role+0x53/0x550 [drbd] Oct 8 00:51:39 vh03 kernel: [<f0c93f45>] ? drbd_adm_prepare+0x255/0x430 [drbd] Oct 8 00:51:39 vh03 kernel: [<f0c96bed>] drbd_adm_set_role+0x9d/0xc0 [drbd] Oct 8 00:51:39 vh03 kernel: [<c0323c68>] genl_rcv_msg+0x158/0x190 Oct 8 00:51:39 vh03 kernel: [<c0323b10>] ? genl_rcv_msg+0x0/0x190 Oct 8 00:51:39 vh03 kernel: [<c0320fcd>] netlink_rcv_skb+0x7d/0xa0 Oct 8 00:51:39 vh03 kernel: [<c03229ac>] genl_rcv+0x1c/0x30 Oct 8 00:51:39 vh03 kernel: [<c0320d5a>] netlink_unicast+0x24a/0x270 Oct 8 00:51:39 vh03 kernel: [<c0301e21>] ? memcpy_fromiovec+0x41/0x60 Oct 8 00:51:39 vh03 kernel: [<c03219de>] netlink_sendmsg+0x1ce/0x290 Oct 8 00:51:39 vh03 kernel: [<c02f7f3a>] sock_aio_write+0x10a/0x120 Oct 8 00:51:39 vh03 kernel: [<c019631d>] do_sync_write+0xcd/0x110 Oct 8 00:51:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:51:39 vh03 kernel: [<c0201aaf>] ? security_file_permission+0xf/0x20 Oct 8 00:51:39 vh03 kernel: [<c01964d0>] ? rw_verify_area+0x60/0xc0 Oct 8 00:51:39 vh03 kernel: [<c0196b56>] vfs_write+0x146/0x150 Oct 8 00:51:39 vh03 kernel: [<c01970fd>] sys_write+0x3d/0x70 Oct 8 00:51:39 vh03 kernel: [<c0104501>] syscall_call+0x7/0xb Oct 8 00:53:39 vh03 kernel: INFO: task ovs-vswitchd:5724 blocked for more than 120 seconds. Oct 8 00:53:39 vh03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Oct 8 00:53:39 vh03 kernel: ovs-vswitchd D ea841ce8 0 5724 5723 0x00000004 Oct 8 00:53:39 vh03 kernel: ea841cfc 00200286 00000002 ea841ce8 c16eb224 00000000 00200200 ee404050 Oct 8 00:53:39 vh03 kernel: ea8a2140 00000019 eddd99d4 eddd98c4 eddd9830 eddd99d4 c16ec100 00000003 Oct 8 00:53:39 vh03 kernel: eb7e1900 00d8bdf8 00000000 c022a07e ea841f08 00000001 00000001 ea841cec Oct 8 00:53:39 vh03 kernel: Call Trace: Oct 8 00:53:39 vh03 kernel: [<c022a07e>] ? copy_to_user+0x3e/0x60 Oct 8 00:53:39 vh03 kernel: [<c0383c2b>] __mutex_lock_slowpath+0x10b/0x160 Oct 8 00:53:39 vh03 kernel: [<c0383a80>] mutex_lock+0x20/0x40 Oct 8 00:53:39 vh03 kernel: [<c03229a0>] genl_rcv+0x10/0x30 Oct 8 00:53:39 vh03 kernel: [<c0320d5a>] netlink_unicast+0x24a/0x270 Oct 8 00:53:39 vh03 kernel: [<c0301e21>] ? memcpy_fromiovec+0x41/0x60 Oct 8 00:53:39 vh03 kernel: [<c03219de>] netlink_sendmsg+0x1ce/0x290 Oct 8 00:53:39 vh03 kernel: [<c02f88ea>] sock_sendmsg+0xfa/0x110 Oct 8 00:53:39 vh03 kernel: [<c01458a0>] ? enqueue_hrtimer+0x60/0x80 Oct 8 00:53:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:53:39 vh03 kernel: [<c02f7ac0>] ? sock_destroy_inode+0x10/0x20 Oct 8 00:53:39 vh03 kernel: [<c0229eac>] ? copy_from_user+0x3c/0x70 Oct 8 00:53:39 vh03 kernel: [<c02f9493>] ? sys_recvmsg+0xf3/0x1c0 Oct 8 00:53:39 vh03 kernel: [<c02f80a0>] ? sockfd_lookup_light+0x30/0x60 Oct 8 00:53:39 vh03 kernel: [<c02f8bc5>] sys_sendto+0xa5/0xd0 Oct 8 00:53:39 vh03 kernel: [<c0194625>] ? put_unused_fd+0x35/0x70 Oct 8 00:53:39 vh03 kernel: [<c02f9973>] ? sys_accept4+0x1b3/0x1e0 Oct 8 00:53:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:53:39 vh03 kernel: [<c02f8c26>] sys_send+0x36/0x40 Oct 8 00:53:39 vh03 kernel: [<c02f9ae6>] sys_socketcall+0x146/0x270 Oct 8 00:53:39 vh03 kernel: [<c0104501>] syscall_call+0x7/0xb Oct 8 00:53:39 vh03 kernel: INFO: task drbdsetup:18163 blocked for more than 120 seconds. Oct 8 00:53:39 vh03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Oct 8 00:53:39 vh03 kernel: drbdsetup D 00000002 0 18163 1 0x00000000 Oct 8 00:53:39 vh03 kernel: e9ecdc68 00000286 00000001 00000002 ed014410 e9ecdc04 e9ecdc14 f0c9ac78 Oct 8 00:53:39 vh03 kernel: 8c8c6ebd 00008176 ed817214 ed817104 ed817070 ed817214 c16ec100 00000003 Oct 8 00:53:39 vh03 kernel: edc08200 f0c8b32b 00000000 00003104 00000000 00000005 002689a5 ed014400 Oct 8 00:53:39 vh03 kernel: Call Trace: Oct 8 00:53:39 vh03 kernel: [<f0c9ac78>] ? conn_highest_peer+0x48/0x60 [drbd] Oct 8 00:53:39 vh03 kernel: [<f0c8b32b>] ? drbd_send_command+0x3b/0x50 [drbd] Oct 8 00:53:39 vh03 kernel: [<f0c9bfe8>] ? _req_st_cond+0xe8/0x130 [drbd] Oct 8 00:53:39 vh03 kernel: [<f0c9e846>] drbd_req_state+0x146/0x300 [drbd] Oct 8 00:53:39 vh03 kernel: [<c0139d23>] ? signal_wake_up+0x23/0x40 Oct 8 00:53:39 vh03 kernel: [<c0139e19>] ? complete_signal+0xd9/0x1b0 Oct 8 00:53:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:53:39 vh03 kernel: [<f0c9ea1e>] _drbd_request_state+0x1e/0xa0 [drbd] Oct 8 00:53:39 vh03 kernel: [<c013ac4b>] ? force_sig_info+0x9b/0xb0 Oct 8 00:53:39 vh03 kernel: [<f0c96303>] drbd_set_role+0x53/0x550 [drbd] Oct 8 00:53:39 vh03 kernel: [<f0c93f45>] ? drbd_adm_prepare+0x255/0x430 [drbd] Oct 8 00:53:39 vh03 kernel: [<f0c96bed>] drbd_adm_set_role+0x9d/0xc0 [drbd] Oct 8 00:53:39 vh03 kernel: [<c0323c68>] genl_rcv_msg+0x158/0x190 Oct 8 00:53:39 vh03 kernel: [<c0323b10>] ? genl_rcv_msg+0x0/0x190 Oct 8 00:53:39 vh03 kernel: [<c0320fcd>] netlink_rcv_skb+0x7d/0xa0 Oct 8 00:53:39 vh03 kernel: [<c03229ac>] genl_rcv+0x1c/0x30 Oct 8 00:53:39 vh03 kernel: [<c0320d5a>] netlink_unicast+0x24a/0x270 Oct 8 00:53:39 vh03 kernel: [<c0301e21>] ? memcpy_fromiovec+0x41/0x60 Oct 8 00:53:39 vh03 kernel: [<c03219de>] netlink_sendmsg+0x1ce/0x290 Oct 8 00:53:39 vh03 kernel: [<c02f7f3a>] sock_aio_write+0x10a/0x120 Oct 8 00:53:39 vh03 kernel: [<c019631d>] do_sync_write+0xcd/0x110 Oct 8 00:53:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:53:39 vh03 kernel: [<c0201aaf>] ? security_file_permission+0xf/0x20 Oct 8 00:53:39 vh03 kernel: [<c01964d0>] ? rw_verify_area+0x60/0xc0 Oct 8 00:53:39 vh03 kernel: [<c0196b56>] vfs_write+0x146/0x150 Oct 8 00:53:39 vh03 kernel: [<c01970fd>] sys_write+0x3d/0x70 Oct 8 00:53:39 vh03 kernel: [<c0104501>] syscall_call+0x7/0xb Oct 8 00:55:39 vh03 kernel: INFO: task ovs-vswitchd:5724 blocked for more than 120 seconds. Oct 8 00:55:39 vh03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Oct 8 00:55:39 vh03 kernel: ovs-vswitchd D ea841ce8 0 5724 5723 0x00000004 Oct 8 00:55:39 vh03 kernel: ea841cfc 00200286 00000002 ea841ce8 c16eb224 00000000 00200200 ee404050 Oct 8 00:55:39 vh03 kernel: ea8a2140 00000019 eddd99d4 eddd98c4 eddd9830 eddd99d4 c16ec100 00000003 Oct 8 00:55:39 vh03 kernel: eb7e1900 00d8bdf8 00000000 c022a07e ea841f08 00000001 00000001 ea841cec Oct 8 00:55:39 vh03 kernel: Call Trace: Oct 8 00:55:39 vh03 kernel: [<c022a07e>] ? copy_to_user+0x3e/0x60 Oct 8 00:55:39 vh03 kernel: [<c0383c2b>] __mutex_lock_slowpath+0x10b/0x160 Oct 8 00:55:39 vh03 kernel: [<c0383a80>] mutex_lock+0x20/0x40 Oct 8 00:55:39 vh03 kernel: [<c03229a0>] genl_rcv+0x10/0x30 Oct 8 00:55:39 vh03 kernel: [<c0320d5a>] netlink_unicast+0x24a/0x270 Oct 8 00:55:39 vh03 kernel: [<c0301e21>] ? memcpy_fromiovec+0x41/0x60 Oct 8 00:55:39 vh03 kernel: [<c03219de>] netlink_sendmsg+0x1ce/0x290 Oct 8 00:55:39 vh03 kernel: [<c02f88ea>] sock_sendmsg+0xfa/0x110 Oct 8 00:55:39 vh03 kernel: [<c01458a0>] ? enqueue_hrtimer+0x60/0x80 Oct 8 00:55:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:55:39 vh03 kernel: [<c02f7ac0>] ? sock_destroy_inode+0x10/0x20 Oct 8 00:55:39 vh03 kernel: [<c0229eac>] ? copy_from_user+0x3c/0x70 Oct 8 00:55:39 vh03 kernel: [<c02f9493>] ? sys_recvmsg+0xf3/0x1c0 Oct 8 00:55:39 vh03 kernel: [<c02f80a0>] ? sockfd_lookup_light+0x30/0x60 Oct 8 00:55:39 vh03 kernel: [<c02f8bc5>] sys_sendto+0xa5/0xd0 Oct 8 00:55:39 vh03 kernel: [<c0194625>] ? put_unused_fd+0x35/0x70 Oct 8 00:55:39 vh03 kernel: [<c02f9973>] ? sys_accept4+0x1b3/0x1e0 Oct 8 00:55:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:55:39 vh03 kernel: [<c02f8c26>] sys_send+0x36/0x40 Oct 8 00:55:39 vh03 kernel: [<c02f9ae6>] sys_socketcall+0x146/0x270 Oct 8 00:55:39 vh03 kernel: [<c0104501>] syscall_call+0x7/0xb Oct 8 00:55:39 vh03 kernel: INFO: task drbdsetup:18163 blocked for more than 120 seconds. Oct 8 00:55:39 vh03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Oct 8 00:55:39 vh03 kernel: drbdsetup D 00000002 0 18163 1 0x00000000 Oct 8 00:55:39 vh03 kernel: e9ecdc68 00000286 00000001 00000002 ed014410 e9ecdc04 e9ecdc14 f0c9ac78 Oct 8 00:55:39 vh03 kernel: 8c8c6ebd 00008176 ed817214 ed817104 ed817070 ed817214 c16ec100 00000003 Oct 8 00:55:39 vh03 kernel: edc08200 f0c8b32b 00000000 00003104 00000000 00000005 002689a5 ed014400 Oct 8 00:55:39 vh03 kernel: Call Trace: Oct 8 00:55:39 vh03 kernel: [<f0c9ac78>] ? conn_highest_peer+0x48/0x60 [drbd] Oct 8 00:55:39 vh03 kernel: [<f0c8b32b>] ? drbd_send_command+0x3b/0x50 [drbd] Oct 8 00:55:39 vh03 kernel: [<f0c9bfe8>] ? _req_st_cond+0xe8/0x130 [drbd] Oct 8 00:55:39 vh03 kernel: [<f0c9e846>] drbd_req_state+0x146/0x300 [drbd] Oct 8 00:55:39 vh03 kernel: [<c0139d23>] ? signal_wake_up+0x23/0x40 Oct 8 00:55:39 vh03 kernel: [<c0139e19>] ? complete_signal+0xd9/0x1b0 Oct 8 00:55:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:55:39 vh03 kernel: [<f0c9ea1e>] _drbd_request_state+0x1e/0xa0 [drbd] Oct 8 00:55:39 vh03 kernel: [<c013ac4b>] ? force_sig_info+0x9b/0xb0 Oct 8 00:55:39 vh03 kernel: [<f0c96303>] drbd_set_role+0x53/0x550 [drbd] Oct 8 00:55:39 vh03 kernel: [<f0c93f45>] ? drbd_adm_prepare+0x255/0x430 [drbd] Oct 8 00:55:39 vh03 kernel: [<f0c96bed>] drbd_adm_set_role+0x9d/0xc0 [drbd] Oct 8 00:55:39 vh03 kernel: [<c0323c68>] genl_rcv_msg+0x158/0x190 Oct 8 00:55:39 vh03 kernel: [<c0323b10>] ? genl_rcv_msg+0x0/0x190 Oct 8 00:55:39 vh03 kernel: [<c0320fcd>] netlink_rcv_skb+0x7d/0xa0 Oct 8 00:55:39 vh03 kernel: [<c03229ac>] genl_rcv+0x1c/0x30 Oct 8 00:55:39 vh03 kernel: [<c0320d5a>] netlink_unicast+0x24a/0x270 Oct 8 00:55:39 vh03 kernel: [<c0301e21>] ? memcpy_fromiovec+0x41/0x60 Oct 8 00:55:39 vh03 kernel: [<c03219de>] netlink_sendmsg+0x1ce/0x290 Oct 8 00:55:39 vh03 kernel: [<c02f7f3a>] sock_aio_write+0x10a/0x120 Oct 8 00:55:39 vh03 kernel: [<c019631d>] do_sync_write+0xcd/0x110 Oct 8 00:55:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:55:39 vh03 kernel: [<c0201aaf>] ? security_file_permission+0xf/0x20 Oct 8 00:55:39 vh03 kernel: [<c01964d0>] ? rw_verify_area+0x60/0xc0 Oct 8 00:55:39 vh03 kernel: [<c0196b56>] vfs_write+0x146/0x150 Oct 8 00:55:39 vh03 kernel: [<c01970fd>] sys_write+0x3d/0x70 Oct 8 00:55:39 vh03 kernel: [<c0104501>] syscall_call+0x7/0xb Oct 8 00:57:39 vh03 kernel: INFO: task ovs-vswitchd:5724 blocked for more than 120 seconds. Oct 8 00:57:39 vh03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Oct 8 00:57:39 vh03 kernel: ovs-vswitchd D ea841ce8 0 5724 5723 0x00000004 Oct 8 00:57:39 vh03 kernel: ea841cfc 00200286 00000002 ea841ce8 c16eb224 00000000 00200200 ee404050 Oct 8 00:57:39 vh03 kernel: ea8a2140 00000019 eddd99d4 eddd98c4 eddd9830 eddd99d4 c16ec100 00000003 Oct 8 00:57:39 vh03 kernel: eb7e1900 00d8bdf8 00000000 c022a07e ea841f08 00000001 00000001 ea841cec Oct 8 00:57:39 vh03 kernel: Call Trace: Oct 8 00:57:39 vh03 kernel: [<c022a07e>] ? copy_to_user+0x3e/0x60 Oct 8 00:57:39 vh03 kernel: [<c0383c2b>] __mutex_lock_slowpath+0x10b/0x160 Oct 8 00:57:39 vh03 kernel: [<c0383a80>] mutex_lock+0x20/0x40 Oct 8 00:57:39 vh03 kernel: [<c03229a0>] genl_rcv+0x10/0x30 Oct 8 00:57:39 vh03 kernel: [<c0320d5a>] netlink_unicast+0x24a/0x270 Oct 8 00:57:39 vh03 kernel: [<c0301e21>] ? memcpy_fromiovec+0x41/0x60 Oct 8 00:57:39 vh03 kernel: [<c03219de>] netlink_sendmsg+0x1ce/0x290 Oct 8 00:57:39 vh03 kernel: [<c02f88ea>] sock_sendmsg+0xfa/0x110 Oct 8 00:57:39 vh03 kernel: [<c01458a0>] ? enqueue_hrtimer+0x60/0x80 Oct 8 00:57:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:57:39 vh03 kernel: [<c02f7ac0>] ? sock_destroy_inode+0x10/0x20 Oct 8 00:57:39 vh03 kernel: [<c0229eac>] ? copy_from_user+0x3c/0x70 Oct 8 00:57:39 vh03 kernel: [<c02f9493>] ? sys_recvmsg+0xf3/0x1c0 Oct 8 00:57:39 vh03 kernel: [<c02f80a0>] ? sockfd_lookup_light+0x30/0x60 Oct 8 00:57:39 vh03 kernel: [<c02f8bc5>] sys_sendto+0xa5/0xd0 Oct 8 00:57:39 vh03 kernel: [<c0194625>] ? put_unused_fd+0x35/0x70 Oct 8 00:57:39 vh03 kernel: [<c02f9973>] ? sys_accept4+0x1b3/0x1e0 Oct 8 00:57:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:57:39 vh03 kernel: [<c02f8c26>] sys_send+0x36/0x40 Oct 8 00:57:39 vh03 kernel: [<c02f9ae6>] sys_socketcall+0x146/0x270 Oct 8 00:57:39 vh03 kernel: [<c0104501>] syscall_call+0x7/0xb Oct 8 00:57:39 vh03 kernel: INFO: task drbdsetup:18163 blocked for more than 120 seconds. Oct 8 00:57:39 vh03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Oct 8 00:57:39 vh03 kernel: drbdsetup D 00000002 0 18163 1 0x00000000 Oct 8 00:57:39 vh03 kernel: e9ecdc68 00000286 00000001 00000002 ed014410 e9ecdc04 e9ecdc14 f0c9ac78 Oct 8 00:57:39 vh03 kernel: 8c8c6ebd 00008176 ed817214 ed817104 ed817070 ed817214 c16ec100 00000003 Oct 8 00:57:39 vh03 kernel: edc08200 f0c8b32b 00000000 00003104 00000000 00000005 002689a5 ed014400 Oct 8 00:57:39 vh03 kernel: Call Trace: Oct 8 00:57:39 vh03 kernel: [<f0c9ac78>] ? conn_highest_peer+0x48/0x60 [drbd] Oct 8 00:57:39 vh03 kernel: [<f0c8b32b>] ? drbd_send_command+0x3b/0x50 [drbd] Oct 8 00:57:39 vh03 kernel: [<f0c9bfe8>] ? _req_st_cond+0xe8/0x130 [drbd] Oct 8 00:57:39 vh03 kernel: [<f0c9e846>] drbd_req_state+0x146/0x300 [drbd] Oct 8 00:57:39 vh03 kernel: [<c0139d23>] ? signal_wake_up+0x23/0x40 Oct 8 00:57:39 vh03 kernel: [<c0139e19>] ? complete_signal+0xd9/0x1b0 Oct 8 00:57:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:57:39 vh03 kernel: [<f0c9ea1e>] _drbd_request_state+0x1e/0xa0 [drbd] Oct 8 00:57:39 vh03 kernel: [<c013ac4b>] ? force_sig_info+0x9b/0xb0 Oct 8 00:57:39 vh03 kernel: [<f0c96303>] drbd_set_role+0x53/0x550 [drbd] Oct 8 00:57:39 vh03 kernel: [<f0c93f45>] ? drbd_adm_prepare+0x255/0x430 [drbd] Oct 8 00:57:39 vh03 kernel: [<f0c96bed>] drbd_adm_set_role+0x9d/0xc0 [drbd] Oct 8 00:57:39 vh03 kernel: [<c0323c68>] genl_rcv_msg+0x158/0x190 Oct 8 00:57:39 vh03 kernel: [<c0323b10>] ? genl_rcv_msg+0x0/0x190 Oct 8 00:57:39 vh03 kernel: [<c0320fcd>] netlink_rcv_skb+0x7d/0xa0 Oct 8 00:57:39 vh03 kernel: [<c03229ac>] genl_rcv+0x1c/0x30 Oct 8 00:57:39 vh03 kernel: [<c0320d5a>] netlink_unicast+0x24a/0x270 Oct 8 00:57:39 vh03 kernel: [<c0301e21>] ? memcpy_fromiovec+0x41/0x60 Oct 8 00:57:39 vh03 kernel: [<c03219de>] netlink_sendmsg+0x1ce/0x290 Oct 8 00:57:39 vh03 kernel: [<c02f7f3a>] sock_aio_write+0x10a/0x120 Oct 8 00:57:39 vh03 kernel: [<c019631d>] do_sync_write+0xcd/0x110 Oct 8 00:57:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:57:39 vh03 kernel: [<c0201aaf>] ? security_file_permission+0xf/0x20 Oct 8 00:57:39 vh03 kernel: [<c01964d0>] ? rw_verify_area+0x60/0xc0 Oct 8 00:57:39 vh03 kernel: [<c0196b56>] vfs_write+0x146/0x150 Oct 8 00:57:39 vh03 kernel: [<c01970fd>] sys_write+0x3d/0x70 Oct 8 00:57:39 vh03 kernel: [<c0104501>] syscall_call+0x7/0xb Oct 8 00:59:39 vh03 kernel: INFO: task ovs-vswitchd:5724 blocked for more than 120 seconds. Oct 8 00:59:39 vh03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Oct 8 00:59:39 vh03 kernel: ovs-vswitchd D ea841ce8 0 5724 5723 0x00000004 Oct 8 00:59:39 vh03 kernel: ea841cfc 00200286 00000002 ea841ce8 c16eb224 00000000 00200200 ee404050 Oct 8 00:59:39 vh03 kernel: ea8a2140 00000019 eddd99d4 eddd98c4 eddd9830 eddd99d4 c16ec100 00000003 Oct 8 00:59:39 vh03 kernel: eb7e1900 00d8bdf8 00000000 c022a07e ea841f08 00000001 00000001 ea841cec Oct 8 00:59:39 vh03 kernel: Call Trace: Oct 8 00:59:39 vh03 kernel: [<c022a07e>] ? copy_to_user+0x3e/0x60 Oct 8 00:59:39 vh03 kernel: [<c0383c2b>] __mutex_lock_slowpath+0x10b/0x160 Oct 8 00:59:39 vh03 kernel: [<c0383a80>] mutex_lock+0x20/0x40 Oct 8 00:59:39 vh03 kernel: [<c03229a0>] genl_rcv+0x10/0x30 Oct 8 00:59:39 vh03 kernel: [<c0320d5a>] netlink_unicast+0x24a/0x270 Oct 8 00:59:39 vh03 kernel: [<c0301e21>] ? memcpy_fromiovec+0x41/0x60 Oct 8 00:59:39 vh03 kernel: [<c03219de>] netlink_sendmsg+0x1ce/0x290 Oct 8 00:59:39 vh03 kernel: [<c02f88ea>] sock_sendmsg+0xfa/0x110 Oct 8 00:59:39 vh03 kernel: [<c01458a0>] ? enqueue_hrtimer+0x60/0x80 Oct 8 00:59:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:59:39 vh03 kernel: [<c02f7ac0>] ? sock_destroy_inode+0x10/0x20 Oct 8 00:59:39 vh03 kernel: [<c0229eac>] ? copy_from_user+0x3c/0x70 Oct 8 00:59:39 vh03 kernel: [<c02f9493>] ? sys_recvmsg+0xf3/0x1c0 Oct 8 00:59:39 vh03 kernel: [<c02f80a0>] ? sockfd_lookup_light+0x30/0x60 Oct 8 00:59:39 vh03 kernel: [<c02f8bc5>] sys_sendto+0xa5/0xd0 Oct 8 00:59:39 vh03 kernel: [<c0194625>] ? put_unused_fd+0x35/0x70 Oct 8 00:59:39 vh03 kernel: [<c02f9973>] ? sys_accept4+0x1b3/0x1e0 Oct 8 00:59:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:59:39 vh03 kernel: [<c02f8c26>] sys_send+0x36/0x40 Oct 8 00:59:39 vh03 kernel: [<c02f9ae6>] sys_socketcall+0x146/0x270 Oct 8 00:59:39 vh03 kernel: [<c0104501>] syscall_call+0x7/0xb Oct 8 00:59:39 vh03 kernel: INFO: task drbdsetup:18163 blocked for more than 120 seconds. Oct 8 00:59:39 vh03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Oct 8 00:59:39 vh03 kernel: drbdsetup D 00000002 0 18163 1 0x00000000 Oct 8 00:59:39 vh03 kernel: e9ecdc68 00000286 00000001 00000002 ed014410 e9ecdc04 e9ecdc14 f0c9ac78 Oct 8 00:59:39 vh03 kernel: 8c8c6ebd 00008176 ed817214 ed817104 ed817070 ed817214 c16ec100 00000003 Oct 8 00:59:39 vh03 kernel: edc08200 f0c8b32b 00000000 00003104 00000000 00000005 002689a5 ed014400 Oct 8 00:59:39 vh03 kernel: Call Trace: Oct 8 00:59:39 vh03 kernel: [<f0c9ac78>] ? conn_highest_peer+0x48/0x60 [drbd] Oct 8 00:59:39 vh03 kernel: [<f0c8b32b>] ? drbd_send_command+0x3b/0x50 [drbd] Oct 8 00:59:39 vh03 kernel: [<f0c9bfe8>] ? _req_st_cond+0xe8/0x130 [drbd] Oct 8 00:59:39 vh03 kernel: [<f0c9e846>] drbd_req_state+0x146/0x300 [drbd] Oct 8 00:59:39 vh03 kernel: [<c0139d23>] ? signal_wake_up+0x23/0x40 Oct 8 00:59:39 vh03 kernel: [<c0139e19>] ? complete_signal+0xd9/0x1b0 Oct 8 00:59:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:59:39 vh03 kernel: [<f0c9ea1e>] _drbd_request_state+0x1e/0xa0 [drbd] Oct 8 00:59:39 vh03 kernel: [<c013ac4b>] ? force_sig_info+0x9b/0xb0 Oct 8 00:59:39 vh03 kernel: [<f0c96303>] drbd_set_role+0x53/0x550 [drbd] Oct 8 00:59:39 vh03 kernel: [<f0c93f45>] ? drbd_adm_prepare+0x255/0x430 [drbd] Oct 8 00:59:39 vh03 kernel: [<f0c96bed>] drbd_adm_set_role+0x9d/0xc0 [drbd] Oct 8 00:59:39 vh03 kernel: [<c0323c68>] genl_rcv_msg+0x158/0x190 Oct 8 00:59:39 vh03 kernel: [<c0323b10>] ? genl_rcv_msg+0x0/0x190 Oct 8 00:59:39 vh03 kernel: [<c0320fcd>] netlink_rcv_skb+0x7d/0xa0 Oct 8 00:59:39 vh03 kernel: [<c03229ac>] genl_rcv+0x1c/0x30 Oct 8 00:59:39 vh03 kernel: [<c0320d5a>] netlink_unicast+0x24a/0x270 Oct 8 00:59:39 vh03 kernel: [<c0301e21>] ? memcpy_fromiovec+0x41/0x60 Oct 8 00:59:39 vh03 kernel: [<c03219de>] netlink_sendmsg+0x1ce/0x290 Oct 8 00:59:39 vh03 kernel: [<c02f7f3a>] sock_aio_write+0x10a/0x120 Oct 8 00:59:39 vh03 kernel: [<c019631d>] do_sync_write+0xcd/0x110 Oct 8 00:59:39 vh03 kernel: [<c0142b10>] ? autoremove_wake_function+0x0/0x50 Oct 8 00:59:39 vh03 kernel: [<c0201aaf>] ? security_file_permission+0xf/0x20 Oct 8 00:59:39 vh03 kernel: [<c01964d0>] ? rw_verify_area+0x60/0xc0 Oct 8 00:59:39 vh03 kernel: [<c0196b56>] vfs_write+0x146/0x150 Oct 8 00:59:39 vh03 kernel: [<c01970fd>] sys_write+0x3d/0x70 Oct 8 00:59:39 vh03 kernel: [<c0104501>] syscall_call+0x7/0xb Oct 8 01:03:46 vh03 kernel: d-con drbd-sr1: sock_recvmsg returned -110 Oct 8 01:03:46 vh03 kernel: d-con drbd-sr1: peer( Secondary -> Unknown ) conn( Connected -> BrokenPipe ) pdsk( UpToDate -> DUnknown ) Oct 8 01:03:46 vh03 kernel: d-con drbd-sr1: short read (expected size 16) Oct 8 01:03:46 vh03 kernel: d-con drbd-sr1: asender terminated Oct 8 01:03:46 vh03 kernel: block drbd1: role( Secondary -> Primary ) Oct 8 01:03:46 vh03 kernel: d-con drbd-sr1: Terminating asender thread Oct 8 01:03:46 vh03 kernel: d-con drbd-sr1: Connection closed Oct 8 01:03:46 vh03 kernel: block drbd1: new current UUID 16D41637EA6A4C4B:5A99C51D68CDB446:188F44BA42FFFCF4:188E44BA42FFFCF4 Oct 8 01:03:46 vh03 kernel: d-con drbd-sr1: conn( BrokenPipe -> Unconnected ) Oct 8 01:03:46 vh03 kernel: d-con drbd-sr1: receiver terminated Oct 8 01:03:46 vh03 kernel: d-con drbd-sr1: Restarting receiver thread Oct 8 01:03:46 vh03 kernel: d-con drbd-sr1: receiver (re)started Oct 8 01:03:46 vh03 kernel: d-con drbd-sr1: conn( Unconnected -> WFConnection ) Oct 8 01:03:46 vh03 kernel: d-con drbd-sr1: initial packet S crossed Oct 8 01:03:46 vh03 kernel: d-con drbd-sr1: Handshake successful: Agreed network protocol version 101 Oct 8 01:03:46 vh03 kernel: d-con drbd-sr1: conn( WFConnection -> WFReportParams ) Oct 8 01:03:46 vh03 kernel: d-con drbd-sr1: Starting asender thread (from drbd_r_drbd-sr1 [6827]) Oct 8 01:03:47 vh03 kernel: block drbd1: drbd_sync_handshake: Oct 8 01:03:47 vh03 kernel: block drbd1: self 16D41637EA6A4C4B:5A99C51D68CDB446:188F44BA42FFFCF4:188E44BA42FFFCF4 bits:0 flags:0 Oct 8 01:03:47 vh03 kernel: block drbd1: peer 5A99C51D68CDB446:0000000000000000:188F44BA42FFFCF4:188E44BA42FFFCF4 bits:0 flags:0 Oct 8 01:03:47 vh03 kernel: block drbd1: uuid_compare()=1 by rule 70 Oct 8 01:03:47 vh03 kernel: block drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent ) Oct 8 01:03:47 vh03 kernel: block drbd1: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0% Oct 8 01:03:47 vh03 kernel: block drbd1: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0% Oct 8 01:03:47 vh03 kernel: block drbd1: helper command: /sbin/drbdadm before-resync-source minor-1 Oct 8 01:03:47 vh03 kernel: block drbd1: helper command: /sbin/drbdadm before-resync-source minor-1 exit code 0 (0x0) Oct 8 01:03:47 vh03 kernel: block drbd1: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent ) Oct 8 01:03:47 vh03 kernel: block drbd1: Began resync as SyncSource (will sync 0 KB [0 bits set]). Oct 8 01:03:47 vh03 kernel: block drbd1: updated sync UUID 16D41637EA6A4C4B:5A9AC51D68CDB446:5A99C51D68CDB446:188F44BA42FFFCF4 Oct 8 01:03:47 vh03 kernel: block drbd1: Resync done (total 1 sec; paused 0 sec; 0 K/sec) Oct 8 01:03:47 vh03 kernel: block drbd1: updated UUIDs 16D41637EA6A4C4B:0000000000000000:5A9AC51D68CDB446:5A99C51D68CDB446 Oct 8 01:03:47 vh03 kernel: block drbd1: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate ) *Looks like there was some whacky stuff going on with ovs-switchd (Open vSwitch; part of XenServer) and drbd that locked the machine up.. then at 01:03:46 it decides to start working again, and vh03 is successfully made the primary:* [root at vh03 ~]# cat /proc/drbd version: 8.4.2 (api:1/proto:86-101) GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by root at xenddk.eightmile.locatrix.net, 2012-09-20 14:40:39 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----- ns:102855 nr:530600089 dw:530702944 dr:288517 al:196 bm:25600 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0 Very strange! Any ideas, guys? Thanks very much for any help. Cheers, Andrew On Sun, Oct 7, 2012 at 10:50 AM, Felix Frank <ff at mpexnet.de> wrote: > On 10/07/2012 03:22 PM, Dan Barker wrote: > >> My system logs them with timestamps. They just happen to be relative to >> boot >> time. I dmtime say "dmtime 12345" and see the real time stamp. >> > > Yes, most kernels do that. > > In this particular use case, those relative times would even suffice, > because one could tell how long it took from promotion to finish. > > > Cheers, > Felix > ______________________________**_________________ > drbd-user mailing list > drbd-user at lists.linbit.com > http://lists.linbit.com/**mailman/listinfo/drbd-user<http://lists.linbit.com/mailman/listinfo/drbd-user> > -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20121007/406c498c/attachment.htm>