Hi guys. I have some updates and more info for you.<br>While trying to tackle the problem from different angles, I installed a plain HDD in one of the servers and put XS 5.6 FP1 on it, and also<br>gave DRBD /dev/sda3 on the same drive. Previously the server booted from a flash card (perhaps not fast enough to capture crash info) and the data was on RAID volumes.<br>
This time around I was able to see what happened behind the "sunshine screen" (and also got crash and other logs generated).<br> <1>BUG: unable to handle kernel NULL pointer dereference at 00000004<br> <1>IP: [<c01b9abc>] bio_free+0x2c/0x50<br>
<4>*pdpt = 00000004fe3ed027 *pde = 0000000000000000 <br> <0>Oops: 0000 [#1] SMP <br> <0>last sysfs file: /sys/class/net/lo/carrier<br><br>I'm attaching a snapshot of the "OOPS" and a few log files, hoping this will shed some light on what's really causing this<br>
and how to fix it. As I understand it, this problem was confirmed by Jodok (or maybe that's a different issue). Any thoughts?<br><br><br><br><div class="gmail_quote">On Sun, Dec 19, 2010 at 12:42 PM, Rom Zhe <span dir="ltr"><<a href="mailto:zherom@gmail.com">zherom@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"> Hi all,<br>I spent almost 2 days trying to get DRBD working with XenServer 5.6 FP1.<br>I compiled DRBD 8.3.81 and 8.3.9 on a new XS DDK VM (kernel 2.6.32.12-0.7.1) and at first thought everything was good.<br>
But after setting up 2 nodes, syncing both, and getting ready to call it a day, I decided to check the speed.<br>dd if=/dev/zero of=/dev/drbd1 bs=256M count=1 oflag=direct<br>and... immediately got the "sunshine" screen (the one we see when XS boots up). Complete server lockup. No network, no keyboard or mouse.<br>After a hard reset there are no signs of any trouble. The server boots up just fine, and the logs don't show anything special.<br>Later I found out it's not just "dd" that brings up the sunshine; fdisk -l /dev/drbd1 and pvs also cause the same sudden freeze.<br>I tried giving DRBD the whole disk (/dev/sdb) or a partition (/dev/sdb1): no difference. If I do "service drbd stop", all is good.<br>I can fdisk or dd the backing device with no problem. These servers were working fine with XS 5.5, 5.6 and DRBD 8.3.81; I ran multiple throughput/latency tests before (against /dev/drbd1) on both, and all was good.<br>I tried the drbd device in Primary/Secondary and in primary standalone, and got tired of hard-resetting after each 'sunshine'.<br>I exhausted all the other "what if I try this" options, and online searches didn't help much either.<br>
Everything points to the drbd-km module, and I'd like to ask if anybody has seen this before or has any suggestions.<br>Perhaps there is something about kernel 2.6.32.12, or some settings I need to adjust for ./configure when compiling drbd?<br>
While the nodes are syncing, all seems good: no I/O errors, and speed is about 90% of direct disk access.<br>Any thoughts/ideas are greatly appreciated. Thanks!<br><br>
</blockquote></div><br>