Dear all,

I sent this problem earlier, but perhaps without enough detail, so here I try to describe it more fully. I hope somebody can help me pinpoint the problem.

First, the virtualization setup: I use Ubuntu 12.04 x64 for both domain0 and domainU, modified to run under the Xen hypervisor and work with Remus. I configured Remus following these notes:

http://wiki.xen.org/wiki/Install_Xen_4.1.4_with_Remus_and_DRBD_on_Ubuntu_12.10

but I use Xen 4.2.2 as my hypervisor, together with DRBD 8.3.11 with Remus support from this link:

http://remusha.wikidot.com/local--files/configuring-and-installing-remus/drbd-8.3.11-remus.tar.gz

If DRBD runs in primary/secondary mode, there is no problem. However, Remus runs DRBD in dual-primary mode. When I try to run Remus, DRBD freezes, which in turn causes my domainU to freeze. The dmesg output is below:

[242525.600067] block drbd1: Local backing block device frozen?
[242537.632070] block drbd1: Local backing block device frozen?
[242549.664075] block drbd1: Local backing block device frozen?
[242561.696083] block drbd1: Local backing block device frozen?
[242573.728079] block drbd1: Local backing block device frozen?
[242585.760069] block drbd1: Local backing block device frozen?
[242597.792079] block drbd1: Local backing block device frozen?
[242609.824069] block drbd1: Local backing block device frozen?
[242621.856083] block drbd1: Local backing block device frozen?
[242633.888068] block drbd1: Local backing block device frozen?
[242640.332124] INFO: task blkback.2.xvda:5779 blocked for more than 120 seconds.
[242640.332130] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[242640.332134] blkback.2.xvda D ffff88003fc13780 0 5779 2 0x00000000
[242640.332142] ffff880026743940 0000000000000246 000000000000000b ffff8800267402d0
[242640.332150] ffff880026743fd8 ffff880026743fd8 ffff880026743fd8 0000000000013780
[242640.332157] ffff880032944500 ffff88003368c500 ffff8800357d6000 ffff8800357d69d8
[242640.332164] Call Trace:
[242640.332178] [<ffffffff816579cf>] schedule+0x3f/0x60
[242640.332200] [<ffffffffa00e68d5>] drbd_al_begin_io+0x205/0x270 [drbd]
[242640.332207] [<ffffffff811adde8>] ? bvec_alloc_bs+0x68/0x100
[242640.332212] [<ffffffff811adf32>] ? bio_alloc_bioset+0xb2/0xf0
[242640.332219] [<ffffffff8108aa50>] ? add_wait_queue+0x60/0x60
[242640.332231] [<ffffffffa00e41bd>] drbd_make_request_common+0xc4d/0x1430 [drbd]
[242640.332239] [<ffffffffa01b83ce>] ? xen_blkbk_map+0x24e/0x2f0 [xen_blkback]
[242640.332245] [<ffffffff81301006>] ? throtl_find_tg+0x46/0x60
[242640.332257] [<ffffffffa00e4e04>] drbd_make_request+0x464/0x7e0 [drbd]
[242640.332264] [<ffffffff812f03bb>] ? generic_make_request_checks+0x1eb/0x370
[242640.332269] [<ffffffff812f0194>] generic_make_request.part.50+0x74/0xb0
[242640.332274] [<ffffffff812f05a8>] generic_make_request+0x68/0x70
[242640.332278] [<ffffffff812f0635>] submit_bio+0x85/0x110
[242640.332284] [<ffffffffa01b8f0f>] dispatch_rw_block_io+0x44f/0x700 [xen_blkback]
[242640.332292] [<ffffffff8100330e>] ? xen_end_context_switch+0x1e/0x30
[242640.332298] [<ffffffffa01b93df>] __do_block_io_op+0x21f/0x360 [xen_blkback]
[242640.332304] [<ffffffffa01b9608>] xen_blkif_schedule+0xb8/0x320 [xen_blkback]
[242640.332309] [<ffffffff8108aa50>] ? add_wait_queue+0x60/0x60
[242640.332314] [<ffffffffa01b9550>] ? xen_blkif_be_int+0x30/0x30 [xen_blkback]
[242640.332319] [<ffffffff81089fbc>] kthread+0x8c/0xa0
[242640.332326] [<ffffffff81664034>] kernel_thread_helper+0x4/0x10
[242640.332330] [<ffffffff816620e3>] ? int_ret_from_sys_call+0x7/0x1b
[242640.332336] [<ffffffff81659dbc>] ? retint_restore_args+0x5/0x6
[242640.332340] [<ffffffff81664030>] ? gs_change+0x13/0x13
[242645.920070] block drbd1: Local backing block device frozen?
[242657.952074] block drbd1: Local backing block device frozen?
[242669.984072] block drbd1: Local backing block device frozen?
[242682.016071] block drbd1: Local backing block device frozen?
[242694.048071] block drbd1: Local backing block device frozen?
[242706.080071] block drbd1: Local backing block device frozen?
[242718.112077] block drbd1: Local backing block device frozen?

sb-voip2@sbvoip2:~$ sudo cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@sbvoip2, 2013-02-19 08:30:51
 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate D r-----
    ns:14732 nr:1784712 dw:1799444 dr:579340 al:31 bm:44 lo:1 pe:0 ua:0 ap:1 ep:1 wo:b def:0 chkpt:662 oos:0

As the log shows, once the DRBD block device is reported frozen, blkback stops working as well:

[242640.332124] INFO: task blkback.2.xvda:5779 blocked for more than 120 seconds.

Someone told me this is caused by high I/O load, but I always monitor my server with xm top and the server load is always under 50%.

I hope somebody can help me; if you need more logs, I will post them.

I also found this patch:

http://permalink.gmane.org/gmane.linux.kernel.commits.head/358143

but I am not sure it can be applied to my DRBD version, since I can't find drivers/block/drbd/drbd_state.c in my installation.

Many thanks,
Agya
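
P.S. For anyone who wants to compare the /proc/drbd counters across several freezes, here is a minimal Python sketch that splits the status text into key:value fields. The names parse_drbd_status and SAMPLE are only illustrative and not part of DRBD, and the sample text is copied from the output above. As far as I understand the DRBD 8.3 documentation, lo counts open requests to the local I/O subsystem and ap counts application requests not yet answered, so those two may be worth watching; please treat that reading as an assumption.

```python
import re

# Status text copied from the /proc/drbd output quoted in this message.
SAMPLE = """\
 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate D r-----
    ns:14732 nr:1784712 dw:1799444 dr:579340 al:31 bm:44 lo:1 pe:0 ua:0 ap:1 ep:1 wo:b def:0 chkpt:662 oos:0
"""


def parse_drbd_status(text):
    """Return a dict of the key:value tokens found in one resource's status text.

    Purely numeric values are converted to int; everything else stays a string.
    Tokens without a value (such as the protocol letter) are ignored.
    """
    fields = {}
    for key, value in re.findall(r"(\w+):(\S+)", text):
        fields[key] = int(value) if value.isdigit() else value
    return fields


status = parse_drbd_status(SAMPLE)
# Show the roles plus the two counters that are non-zero while I/O is stuck.
print(status["ro"], "lo =", status["lo"], "ap =", status["ap"])
```

On a live system the same function could be fed the contents of /proc/drbd read at intervals, to see whether lo and ap ever return to 0 while the "frozen?" messages repeat.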