On Fri, Sep 03, 2010 at 04:11:03PM +0200, Roland Friedwagner wrote:
> Hello Michael,
>
> I can confirm this issue.
> Our secondary crashed last night after started an online verify via cron.
> I had to push The Button...
>
> Found these last messages in syslog:
> Sep 2 00:18:01 bach-s52 kernel: block drbd0: Online Verify start sector: 0
> Sep 2 00:18:01 bach-s52 kernel: block drbd1: conn( Connected -> VerifyT )
> Sep 2 00:18:01 bach-s52 kernel: block drbd1: Online Verify start sector: 0
> Sep 2 00:18:04 bach-s52 kernel: Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
> Sep 2 00:18:04 bach-s52 kernel: [<ffffffff88440fbf>] :drbd:w_e_end_ov_req+0x29/0x136
> Sep 2 00:18:04 bach-s52 kernel: PGD 0
> Sep 2 00:18:04 bach-s52 kernel: Oops: 0000  SMP
> Sep 2 00:18:04 bach-s52 kernel: last sysfs file: /devices/pci0000:00/0000:00:1c.2/0000:03:00.1/irq
>
> DRBD Version: 18.104.22.168
> HW: HP DL380G6 (1 x Xeon X5570)
> OS: RHEL 5.5 x86_64
> Kernel: 2.6.18-194.11.3.el5 #1 SMP Mon Aug 23 15:51:38 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
>
> Does not reproduce until now.

Too bad. It would be much easier if it were reproducible.

Anyways, please do

    gdb drbd.ko -ex 'l *(w_e_end_ov_req+0x29)' -ex q

(you may have to rebuild the module with EXTRA_CFLAGS=-g or something).

Maybe Michael can do that as well, if it is still possible?
Of course against his drbd.ko build, with his address
(w_e_end_ov_req+0x36, if I read the archives right).

That should tell you which C-code line the RIP corresponds to.
We can then start guessing which pointer may have been NULL, and why.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
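
For anyone who wants to apply the same lookup to their own oops, here is a minimal Python sketch that pulls the module, symbol and offset out of a frame line like the one quoted above and prints the matching gdb invocation. The helper names `parse_oops_frame` and `gdb_command` are made up for illustration, and the regular expression only assumes the `[<addr>] :module:symbol+0xOFF/0xSIZE` layout seen in this RHEL 5 log; other kernels may format the frame differently.

```python
import re
import shlex

# Matches frames in the layout seen in the quoted syslog, e.g.
#   [<ffffffff88440fbf>] :drbd:w_e_end_ov_req+0x29/0x136
# (assumption: other kernel versions may print frames differently).
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+:(?P<module>\w+):(?P<symbol>\w+)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_oops_frame(line):
    """Return module, symbol, offset and function size, or None if no frame is found."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return {
        "module": m.group("module"),
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
    }


def gdb_command(frame):
    """Build the gdb call that lists the source line at symbol+offset."""
    expr = "l *(%s+0x%x)" % (frame["symbol"], frame["offset"])
    # Assumes the module file is named <module>.ko in the current directory.
    return "gdb %s.ko -ex %s -ex q" % (frame["module"], shlex.quote(expr))


if __name__ == "__main__":
    line = ("Sep 2 00:18:04 bach-s52 kernel: "
            "[<ffffffff88440fbf>] :drbd:w_e_end_ov_req+0x29/0x136")
    frame = parse_oops_frame(line)
    if frame is not None:
        print(gdb_command(frame))
```

The printed command has to be run against the exact drbd.ko build that crashed (with debug info), otherwise the offset may point at a different source line.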