Hello!

I'm using DRBD 8.0.14 on a Xen 3.3.1 x86_64 cluster for disk replication. Over the last few weeks the systems have been crashing, particularly under load. I have captured the following using netconsole:

Unable to handle kernel paging request at 0000180017006702 RIP:
 [<ffffffff80262ceb>] put_page+0x0/0x2e
PGD 0
Oops: 0000 [1] SMP
CPU 3
Modules linked in: netconsole ip_vs_wrr ip_vs xt_physdev iptable_filter ip_tables x_tables drbd bridge button ac battery ipmi_devintf ipmi_si ipmi_msghandler e1000e serio_raw pcsp$
Pid: 4044, comm: drbd1_receiver Tainted: GF 2.6.18.8-xen #1
RIP: e030:[<ffffffff80262ceb>] [<ffffffff80262ceb>] put_page+0x0/0x2e
RSP: e02b:ffff88026548bba8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880192b24880 RCX: 0000000000000027
RDX: ffff88026e63d680 RSI: ffff8801b1fe8380 RDI: 0000180017006702
RBP: 0000000000000001 R08: 010100004600f501 R09: 0000000000000018
R10: ffffffff8049dd80 R11: ffff8802672ba8f8 R12: 0000000000000018
R13: 0000000000000018 R14: 0000000000000000 R15: 0000000000000000
FS: 00002ac6a4f7e010(0000) GS:ffffffff804d8180(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000
Process drbd1_receiver (pid: 4044, threadinfo ffff88026548a000, task ffff880265a017a0)
Stack: ffffffff80395b3c ffff880192b24880 ffff880192b24880 ffff880192b24880
 ffffffff80395911 ffff88016fe34d80 ffffffff803c35e0 0000410000000008
 ffff88026548be40 0000001800000000 ffff88016fe35280 0000001800000000
Call Trace:
 [<ffffffff80395b3c>] skb_release_data+0x61/0x9c
 [<ffffffff80395911>] kfree_skbmem+0x9/0x75
 [<ffffffff803c35e0>] tcp_recvmsg+0x72e/0xb05
 [<ffffffff80392091>] sock_common_recvmsg+0x2d/0x42
 [<ffffffff80392091>] sock_common_recvmsg+0x2d/0x42
 [<ffffffff8038fee9>] sock_recvmsg+0x101/0x120
 [<ffffffff8038fee9>] sock_recvmsg+0x101/0x120
 [<ffffffff8024327a>] autoremove_wake_function+0x0/0x2e
 [<ffffffff802dd2d5>] generic_make_request+0x15f/0x174
 [<ffffffff880c52bf>] :dm_mod:__map_bio+0x47/0x9b
 [<ffffffff8817cb26>] :drbd:drbd_recv+0x7b/0x109
 [<ffffffff88180cd0>] :drbd:receive_DataRequest+0x72/0x575
 [<ffffffff8817d4da>] :drbd:drbdd+0x77/0x151
 [<ffffffff881801f2>] :drbd:drbdd_init+0xbe/0x1ab
 [<ffffffff88190440>] :drbd:drbd_thread_setup+0x11c/0x1c6
 [<ffffffff8020af54>] child_rip+0xa/0x12
 [<ffffffff88190324>] :drbd:drbd_thread_setup+0x0/0x1c6
 [<ffffffff8020af4a>] child_rip+0x0/0x12

Code: 8b 07 f6 c4 40 74 05 e9 62 f9 ff ff 8b 47 08 85 c0 75 0a 0f
RIP [<ffffffff80262ceb>] put_page+0x0/0x2e
 RSP <ffff88026548bba8>
CR2: 0000180017006702

Since a lot of drbd-related functions are listed, I suspect the problem originates there. The logs show nothing interesting around the time of the crash, just a spontaneous crash/reboot. The 'drbd1_receiver' thread belongs to a 'zabbix' DomU, which runs a MySQL database that causes most of the I/O load.

The exact DRBD version I use is:

version: 8.0.14 (api:86/proto:86)
GIT-hash: bb447522fc9a87d0069b7e14f0234911ebdab0f7 build by beheer at xen03, 2008-11-20 11:25:39

The drbd.conf file is attached.

As you can see, the kernel tainted flags say 'GF'. I have searched for where this comes from, but can't find anything. We only run open-source stuff on these machines, so no scary binary-only modules. Also, a dmesg|grep -i 'tainted' returns nothing, and cat /proc/sys/kernel/tainted returns '0'. Go figure... My guess is that some startup script uses 'insmod -f' when it isn't needed.

The kernel I'm using is here: http://xenbits.xensource.com/linux-2.6.18-xen.hg. It's quite old, and I'd love to upgrade to something recent, but this is the only kernel I can find that can run as Dom0 with a recent version of Xen.

My questions are:
- Is this related to DRBD or should I go and bug the xen guys?
- Would upgrading to 8.2.x or even 8.3.x help?

Thanks in advance for your help.

Kind Regards,
Ronald Moesbergen

-------------- next part --------------
A non-text attachment was scrubbed...
Name: drbd.conf
Type: application/octet-stream
Size: 4228 bytes
Desc: not available
URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20090209/b3bd8c2d/attachment.obj>
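
On the 'GF' question in the message above: as far as I understand 2.6-era kernels, the letters after "Tainted:" are decoded from the same bitmask that /proc/sys/kernel/tainted exposes. 'G' means no proprietary module is loaded and 'F' means a module was force-loaded, which would fit the 'insmod -f' guess. Below is a minimal sketch of that decoding. The helper name taint_string and the table TAINT_FLAGS are made up for illustration, and the bit layout is an assumption that should be checked against kernel/panic.c of the kernel actually in use.

```python
# Hypothetical helper: turn a taint bitmask into the string an oops prints.
# Assumed bit layout (2.6-era kernels; verify against your kernel/panic.c):
TAINT_FLAGS = [
    (1 << 0, "P", "G"),  # proprietary module loaded; 'G' is shown when clear
    (1 << 1, "F", " "),  # a module was force-loaded (e.g. insmod -f)
    (1 << 2, "S", " "),  # SMP used on hardware not rated for it
    (1 << 3, "R", " "),  # a module was force-unloaded
    (1 << 4, "M", " "),  # a machine check exception occurred
    (1 << 5, "B", " "),  # a bad page was encountered
]


def taint_string(mask):
    """Return 'Not tainted' for 0, else 'Tainted: ' plus the flag letters."""
    if mask == 0:
        return "Not tainted"
    letters = "".join(on if mask & bit else off for bit, on, off in TAINT_FLAGS)
    # Drop the padding after the last flag that is actually set.
    return "Tainted: " + letters.rstrip()
```

Under these assumptions a mask of 2 decodes to "Tainted: GF", so a freshly booted machine reporting 0 in /proc/sys/kernel/tainted would mean the forced load has not happened yet in that boot. Feeding the helper int(open("/proc/sys/kernel/tainted").read()) just before heavy load might narrow down which script sets the flag.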