[DRBD-user] [Fwd: Re: [Linux-HA] heartbeat 2.0.8: lockups] kernel oops

Gerry Reno greno at verizon.net
Wed Feb 21 17:25:23 CET 2007



Gerry Reno wrote:
>
> Forwarding this from linux-ha list:
>
> -------- Original Message --------
> Subject:     Re: [Linux-HA] heartbeat 2.0.8: lockups
> Date:     Mon, 19 Feb 2007 09:57:16 -0500
> From:     Gerry Reno <greno at verizon.net>
> Reply-To:     General Linux-HA mailing list <linux-ha at lists.linux-ha.org>
> To:     General Linux-HA mailing list <linux-ha at lists.linux-ha.org>
> References: 
> <12392854.6367231171759462886.JavaMail.root at vms074.mailsrvcs.net> 
> <26ef5e70702190352p4d6d24cajb31b28edbe0d1885 at mail.gmail.com> 
> <45D9B8CD.70907 at verizon.net>
>
>
>
> Gerry Reno wrote:
>> Andrew Beekhof wrote:
>>> so what are we looking at here?  what time did the lockup occur?
>>>
>>> On 2/18/07, greno at verizon.net <greno at verizon.net> wrote:
>>>> I've been running heartbeat on my two nodes for almost two weeks 
>>>> and everything is functioning as it should, except that I am 
>>>> getting frequent lockups on the primary server.  It doesn't matter 
>>>> which server I make the primary; it will eventually lock up.  The 
>>>> lockups are very hard: there is no response of any kind from the 
>>>> locked-up machine.  Sometimes the drive light is on and sometimes 
>>>> not.  The lockups occur at times of disk access, such as during 
>>>> backups or right after I ftp a file or tar file from the drbd 
>>>> array to another machine.  There is very little in the logs: just 
>>>> a big gap and then a syslog restart from when I cold booted the 
>>>> server to bring it back up.  I'm going to attach dmesg output and 
>>>> /var/log/messages output for both servers.  What should I do to 
>>>> track down the source of this problem?
>>>>
>>>> heartbeat-2.0.8-1.fc6
>>>> drbd-0.7.23-15.fc6.at
>>>>
>>>> Other info:
>>>> drbd is running on a logical volume, which sits on a RAID-1 md 
>>>> array on each server.
>>>>
>>>> Both servers were rock stable prior to installing HA.
>>>>
>>>>
>> Andrew,
>>  The lockup occurred at about 18:07.  If you search for 'restart' in 
>> the log you should find it.  The gap was about 7 minutes.  This one 
>> occurred when I ftp'd a file.
>>
>> Gerry
> Andrew,
> I just went and checked on the primary and this was showing in a 
> terminal window:
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel: Oops: 0000 [#1]
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel: SMP
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel: CPU:    0
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel: EIP:    0061:[<c042e17d>]    Tainted: GF     VLI
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel: EFLAGS: 00010202   (2.6.19-1.2895.fc6xen #1)
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel: EIP is at put_pid+0x6/0x20
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel: eax: 000000ab   ebx: 00000008   ecx: c1b84140   edx: 000000ab
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel: esi: c1b840c0   edi: c1b840c0   ebp: e3957be0   esp: e9ea0f0c
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel: ds: 007b   es: 007b   ss: 0069
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel: Process IPaddr (pid: 29397, ti=e9ea0000 task=e70464d0 task.ti=e9ea0000)
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel: Stack: c046c707 00000000 00000000 d0e76988 ed7e8520 e3957be0 d76b4b40 00000000
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel:        e1f53d20 c046a03a c0467578 d76b4b40 000001ff 00000004 c041fdbd 00000000
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel:        00000000 e7046978 d76b4b40 e70464d0 00000001 c0420fa0 00000000 c04442e3
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel: Call Trace:
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel:  [<c046c707>] __fput+0x12f/0x190
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel:  [<c046a03a>] filp_close+0x52/0x59
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel:  [<c041fdbd>] put_files_struct+0x65/0xa7
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel:  [<c0420fa0>] do_exit+0x246/0x787
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel:  [<c042156e>] sys_exit_group+0x0/0xd
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:30 2007 ...
> grp-01-30-02 kernel:  [<c0404efb>] syscall_call+0x7/0xb
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:31 2007 ...
> grp-01-30-02 kernel:  [<00c5d402>] 0xc5d402
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:31 2007 ...
> grp-01-30-02 kernel:  =======================
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:31 2007 ...
> grp-01-30-02 kernel: Code: 00 77 09 8d 1c 06 85 db 7e 15 eb a8 83 c3 08 81 c6 00 80 00 00 31 c9 81 fb 74 53 68 c0 72 cc 89 f8 5b 5e 5f c3 85 c0 89 c2 74 19 <8b> 00 48 74 0a 90 ff 0a 0f 94 c0 84 c0 74 0a a1 28 2d 83 c0 e9
>
> Message from syslogd at grp-01-30-02 at Sun Feb 18 23:09:31 2007 ...
> grp-01-30-02 kernel: EIP: [<c042e17d>] put_pid+0x6/0x20 SS:ESP 0069:e9ea0f0c
>
>
Got another oops today on the drbd device:

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:34 2007 ...
grp-01-30-02 kernel: Oops: 0000 [#2]

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:34 2007 ...
grp-01-30-02 kernel: SMP

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:34 2007 ...
grp-01-30-02 kernel: CPU:    0

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:34 2007 ...
grp-01-30-02 kernel: EIP:    0061:[<000000c0>]    Tainted: GF     VLI

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:34 2007 ...
grp-01-30-02 kernel: EFLAGS: 00210206   (2.6.19-1.2895.fc6xen #1)

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:34 2007 ...
grp-01-30-02 kernel: EIP is at 0xc0

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:34 2007 ...
grp-01-30-02 kernel: eax: e3957820   ebx: e3957820   ecx: 000000c0   edx: e87ae220

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:34 2007 ...
grp-01-30-02 kernel: esi: e87ae220   edi: e87ae2a0   ebp: e3957820   esp: deb1cf90

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:34 2007 ...
grp-01-30-02 kernel: ds: 007b   es: 007b   ss: 0069

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:34 2007 ...
grp-01-30-02 kernel: Process mysqld (pid: 7042, ti=deb1c000 task=d40a8cb0 task.ti=deb1c000)

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:34 2007 ...
grp-01-30-02 kernel: Stack: c046a01b deb1cfbc e87ae220 00000052 e87ae2a0 c046afd3 00000052 086cdb78

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:34 2007 ...
grp-01-30-02 kernel:        00000052 deb1c000 c0404efb 00000052 00000000 00000001 086cdb78 00000052

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:35 2007 ...
grp-01-30-02 kernel:        00db00d8 ffffffda 0000007b 0000007b 00000006 00338402 00000073 00200293

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:35 2007 ...
grp-01-30-02 kernel: Call Trace:

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:35 2007 ...
grp-01-30-02 kernel: Inexact backtrace:

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:36 2007 ...
grp-01-30-02 kernel:  [<c046a01b>] filp_close+0x33/0x59

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:36 2007 ...
grp-01-30-02 kernel:  [<c046afd3>] sys_close+0x73/0xaa

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:36 2007 ...
grp-01-30-02 kernel:  [<c0404efb>] syscall_call+0x7/0xb

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:36 2007 ...
grp-01-30-02 kernel:  =======================

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:36 2007 ...
grp-01-30-02 kernel: Code:  Bad EIP value.

Message from syslogd at grp-01-30-02 at Wed Feb 21 04:34:36 2007 ...
grp-01-30-02 kernel: EIP: [<000000c0>] 0xc0 SS:ESP 0069:deb1cf90
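
For anyone who wants to read or repost these traces without the syslogd
broadcast noise, here is a minimal Python sketch. It assumes the capture
looks like the ones in this thread: a "Message from syslogd" header before
each line, a "<hostname> kernel:" prefix, optional "> " quote marks, and
possibly lines that a mail client has wrapped. The name clean_oops is made
up for illustration; it is not part of any existing tool.

```python
import re

# Header that syslogd broadcasts to terminals before each emergency line.
WALL_HEADER = re.compile(r"^Message from syslogd")
# "<hostname> kernel: " prefix carried by every kernel line.
KERNEL_PREFIX = re.compile(r"^\S+ kernel: ?")
# Leading email quote marks such as "> " or ">> ".
QUOTE_MARKS = re.compile(r"^(?:>\s?)+")


def clean_oops(text):
    """Return the kernel lines of a terminal-captured oops as a list."""
    lines = []
    for raw in text.splitlines():
        line = QUOTE_MARKS.sub("", raw).rstrip()
        if not line.strip() or WALL_HEADER.match(line):
            continue  # skip blank lines and syslogd broadcast headers
        if KERNEL_PREFIX.match(line):
            lines.append(KERNEL_PREFIX.sub("", line, count=1))
        elif lines:
            # No kernel prefix: treat it as the wrapped tail of the previous line.
            lines[-1] = lines[-1].rstrip() + " " + line.strip()
    return lines
```

Run over the Feb 21 trace above, the first entry would be
"Oops: 0000 [#2]". Printing the list with "\n".join(...) gives something
close to what dmesg would have shown, which is easier to compare against
the earlier put_pid oops.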





