[DRBD-user] Kernel Oops, RH9.0, 2.4.20-19.9

Tim Hasson tim at aidasystems.com
Mon Jun 14 06:39:38 CEST 2004


Here are the results after I upgraded to kernel 2.4.26 from kernel.org (and
recompiled drbd 0.6.12):



Jun 12 18:53:22 drbd2 drbd: ===> drbd start <===
Jun 12 18:53:22 drbd2 drbd: modprobe -s drbd minor_count=2
Jun 12 18:53:22 drbd2 kernel: drbd: initialised. Version: 0.6.12
(api:64/proto:62)
Jun 12 18:53:22 drbd2 drbd: drbdsetup /dev/nb0 disk /dev/sdb1 --do-panic
--disk-size=35277516k
Jun 12 18:53:22 drbd2 kernel: drbd0: Creating state file
Jun 12 18:53:22 drbd2 kernel: "/var/lib/drbd/drbd0"
Jun 12 18:53:22 drbd2 kernel: klogd 1.4.1, ---------- state change ----------
Jun 12 18:53:22 drbd2 drbd: drbdsetup /dev/nb0 net 192.168.0.87:7788
192.168.0.86:7788 C --sync-min=10M --sync-max=25M --sync-nice=-10
--tl-size=5000 --timeout=60 --connect-int=10 --ping-int=10
Jun 12 18:53:22 drbd2 drbd: drbdsetup /dev/nb1 disk /dev/sdc1 --do-panic
--disk-size=35277516k
Jun 12 18:53:22 drbd2 kernel: drbd1: Creating state file
Jun 12 18:53:22 drbd2 kernel: "/var/lib/drbd/drbd1"
Jun 12 18:53:22 drbd2 kernel: drbd0: Connection established. size=35277516 KB /
blksize=4096 B
Jun 12 18:53:23 drbd2 drbd: drbdsetup /dev/nb1 net 192.168.0.87:7789
192.168.0.86:7789 C --sync-min=10M --sync-max=25M --sync-nice=-10
--tl-size=5000 --timeout=60 --connect-int=10 --ping-int=10
Jun 12 18:53:23 drbd2 drbd: drbdsetup /dev/nb0 wait_connect -t 0
Jun 12 18:53:23 drbd2 kernel: drbd1: Connection established. size=35277516 KB /
blksize=4096 B
Jun 12 18:53:23 drbd2 drbd: drbdsetup /dev/nb1 wait_connect -t 0
Jun 12 18:53:23 drbd2 drbd: 'drbd0' SyncingAll, waiting for this to finish
Jun 12 18:53:23 drbd2 drbd: 'drbd1' SyncingAll, waiting for this to finish
Jun 12 18:53:23 drbd2 drbd: drbdsetup /dev/nb0 wait_sync
Jun 12 18:53:23 drbd2 drbd: drbdsetup /dev/nb1 wait_sync
Jun 12 18:54:28 drbd2 kernel: KERNEL: assertion (tp->copied_seq == tp->rcv_nxt
|| (flags&(MSG_PEEK|MSG_TRUNC))) failed at tcp.c(1603)
Jun 12 18:54:28 drbd2 last message repeated 11 times
Jun 12 18:54:28 drbd2 kernel: KERNEL: assertion (flags&MSG_PEEK) failed at
tcp.c(1540)
Jun 12 18:54:28 drbd2 kernel: KERNEL: assertion (skb==NULL ||
before(tp->copied_seq, TCP_SKB_CB(skb)->end_seq)) failed at tcp.c(1290)
Jun 12 18:54:28 drbd2 kernel: KERNEL: assertion (flags&MSG_PEEK) failed at
tcp.c(1540)
Jun 12 18:54:28 drbd2 kernel: KERNEL: assertion (skb==NULL ||
before(tp->copied_seq, TCP_SKB_CB(skb)->end_seq)) failed at tcp.c(1290)
Jun 12 18:54:28 drbd2 kernel: KERNEL: assertion (flags&MSG_PEEK) failed at
tcp.c(1540)
Jun 12 18:54:28 drbd2 kernel: KERNEL: assertion (skb==NULL ||
before(tp->copied_seq, TCP_SKB_CB(skb)->end_seq)) failed at tcp.c(1290)
Jun 12 18:54:28 drbd2 kernel: KERNEL: assertion (flags&MSG_PEEK) failed at
tcp.c(1540)
Jun 12 18:54:28 drbd2 last message repeated 3 times


This is when the server became inaccessible, and it stayed that way for over a
day. The odd thing is that syslog was flooded with these messages right up until
the moment the server was rebooted today.
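
To put a number on the flood, the assertion messages can be tallied by source
location. What follows is only a rough sketch: it assumes the syslog lines have
been unwrapped so that each message sits on one line, and the helper name
tally_assertions is made up for illustration, not part of drbd or syslogd.

```python
import re
from collections import Counter

# Matches e.g. "KERNEL: assertion (...) failed at tcp.c(1603)"
ASSERT_RE = re.compile(r"KERNEL: assertion .* failed at (\S+\(\d+\))")
# Matches syslogd's "last message repeated N times" summary lines
REPEAT_RE = re.compile(r"last message repeated (\d+) times")


def tally_assertions(lines):
    """Count assertion failures per source location (hypothetical helper)."""
    counts = Counter()
    last = None  # location of the most recent assertion message, if any
    for line in lines:
        m = ASSERT_RE.search(line)
        if m:
            last = m.group(1)
            counts[last] += 1
            continue
        r = REPEAT_RE.search(line)
        if r and last is not None:
            # The summary line refers to the message logged just before it.
            counts[last] += int(r.group(1))
        else:
            # Any other message breaks the "last message" chain.
            last = None
    return counts


if __name__ == "__main__":
    import sys
    # Usage (assumed): python tally.py < /var/log/messages
    for location, n in tally_assertions(sys.stdin).most_common():
        print(n, location)
```

The "last message repeated" lines are added to whichever assertion came directly
before them, so the totals reflect what syslogd collapsed rather than just the
number of lines in the file.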


Here's another trial:


Jun 13 20:49:16 drbd2 drbd: ===> drbd start <===
Jun 13 20:49:16 drbd2 drbd: modprobe -s drbd minor_count=2
Jun 13 20:49:16 drbd2 kernel: drbd: initialised. Version: 0.6.12
(api:64/proto:62)
Jun 13 20:49:16 drbd2 drbd: drbdsetup /dev/nb0 disk /dev/sdb1 --do-panic
--disk-size=35277516k
Jun 13 20:49:16 drbd2 kernel: drbd0: Creating state file
Jun 13 20:49:16 drbd2 kernel: "/var/lib/drbd/drbd0"
Jun 13 20:49:16 drbd2 kernel: klogd 1.4.1, ---------- state change ----------
Jun 13 20:49:17 drbd2 drbd: drbdsetup /dev/nb0 net 192.168.0.87:7788
192.168.0.86:7788 C --sync-min=10M --sync-max=25M --sync-nice=-10
--tl-size=5000 --timeout=60 --connect-int=10 --ping-int=10
Jun 13 20:49:17 drbd2 drbd: drbdsetup /dev/nb1 disk /dev/sdc1 --do-panic
--disk-size=35277516k
Jun 13 20:49:17 drbd2 kernel: drbd1: Creating state file
Jun 13 20:49:17 drbd2 kernel: "/var/lib/drbd/drbd1"
Jun 13 20:49:17 drbd2 kernel: drbd0: Connection established. size=35277516 KB /
blksize=4096 B
Jun 13 20:49:17 drbd2 drbd: drbdsetup /dev/nb1 net 192.168.0.87:7789
192.168.0.86:7789 C --sync-min=10M --sync-max=25M --sync-nice=-10
--tl-size=5000 --timeout=60 --connect-int=10 --ping-int=10
Jun 13 20:49:17 drbd2 drbd: drbdsetup /dev/nb0 wait_connect -t 0
Jun 13 20:49:17 drbd2 kernel: drbd1: Connection established. size=35277516 KB /
blksize=4096 B
Jun 13 20:49:17 drbd2 drbd: drbdsetup /dev/nb1 wait_connect -t 0
Jun 13 20:49:17 drbd2 drbd: 'drbd0' SyncingAll, waiting for this to finish
Jun 13 20:49:17 drbd2 drbd: drbdsetup /dev/nb0 wait_sync
Jun 13 20:49:17 drbd2 drbd: 'drbd1' SyncingAll, waiting for this to finish
Jun 13 20:49:17 drbd2 drbd: drbdsetup /dev/nb1 wait_sync
Jun 13 20:59:18 drbd2 syslogd 1.4.1: restart.



Any ideas?



Quoting Lars Ellenberg <Lars.Ellenberg at linbit.com>,
on Sun, 13 Jun 2004 11:30:31 +0200:

> / 2004-06-12 17:05:24 -0700
> \ Tim Hasson:
> > Do I have to upgrade the kernel on both drbd1 and drbd2 or can I just upgrade
> > the one that is giving kernel oops's (drbd2)?
> > 
> 
> you don't have to.
> as long as it is the same drbd version,
> it does not care about kernel version or architecture.
> 
> note that changing the kernel may or may not help.
> its only that nobody else has reported an oops related to 0.6.x
> for a long time, many people use it, probably even with your exact
> hardware and kernel. so whatever does cause the oops at your side,
> I seriously doubt that its drbd fault. I even seriously doubt that its
> the RH kernels fault.
> 
> but you said it is easy to reproduce for you (happens every time...).
> so go ahead, and try to change one thing at a time, and see when you can
> no longer reproduce it...  rule out the possible causes one by one.
> 
> > They are both running the same kernel right now (rh 2.4.20-19.9)
> > and also both running drbd 0.6.12
> > 
> > I was hoping I can get away with upgrading the kernel on drbd2 and recompiling
> > drbd, because I'd hate to bring down the primary file server for a kernel
> > reboot, and even worse, if the kernel doesn't boot.
> > 
> > So as of now I am upgrading drbd2 (the server in question) until I get an
> > answer from you..
> > 
> > Thank you for all your help..
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
> 



