[DRBD-user] Crash during synchronisation when RATE >= 1024 ???

Benoit.Ropartz at alcatel.fr Benoit.Ropartz at alcatel.fr
Thu Jun 17 11:18:08 CEST 2004


Hello,

OK, I've used ksymoops:

Jun 16 17:45:17 SML_B kernel: Oops: kernel access of bad area, sig: 11
Jun 16 17:45:17 SML_B kernel: NIP: D104CE50 XER: 00000000 LR: D104CE38 SP: CAA19EF0 REGS: caa19e40 TRAP: 0300    Not tainted
Using defaults from ksymoops -t elf32-powerpc -a powerpc:common
Jun 16 17:45:17 SML_B kernel: MSR: 00001032 EE: 0 PR: 0 FP: 0 ME: 1 IR/DR: 11
Jun 16 17:45:17 SML_B kernel: TASK = caa18000[464] 'drbd0_receiver' Last syscall: 120
Jun 16 17:45:17 SML_B kernel: last math cf9d4000 last altivec 00000000
Jun 16 17:45:17 SML_B kernel: GPR00: 00001032 CAA19EF0 CAA18000 00000000 00001032 000004BA CAA1B154 CAA1B040
Jun 16 17:45:17 SML_B kernel: GPR08: BA2E8BA3 00000000 00000000 CAA19F20 84000042 00000000 00000000 00000000
Jun 16 17:45:17 SML_B kernel: GPR16: 00000000 00000000 00000000 00000000 00009032 0EA2FF40 00000000 C0005ED4
Jun 16 17:45:17 SML_B kernel: GPR24: C0005BC0 10003F3C 00000000 000000A0 C01F0000 CAA19F18 CFCFA420 CFCFA000
Jun 16 17:45:17 SML_B kernel: Call backtrace:
Jun 16 17:45:17 SML_B kernel: D104CB50 D104D034 D104E0F8 D104EC5C D10500A8 D105077C D1045C6C
Jun 16 17:45:17 SML_B kernel: C00088D8
Warning (Oops_read): Code line not seen, dumping what data is available


>>NIP; d104ce50 <[drbd]finish_wait+40/90>   <=====

>>GPR1; caa19ef0 <_end+a783308/10dac418>
>>GPR2; caa18000 <_end+a781418/10dac418>
>>GPR6; caa1b154 <_end+a78456c/10dac418>
>>GPR7; caa1b040 <_end+a784458/10dac418>
>>GPR11; caa19f20 <_end+a783338/10dac418>
>>GPR23; c0005ed4 <ret_from_except+0/34>
>>GPR24; c0005bc0 <DoSyscall+0/5c>
>>GPR28; c01f0000 <Symbios_trailer.1+0/8>
>>GPR29; caa19f18 <_end+a783330/10dac418>
>>GPR30; cfcfa420 <_end+fa63838/10dac418>
>>GPR31; cfcfa000 <_end+fa63418/10dac418>

Trace; d104cb50 <[drbd]drbd_alloc_ee+40/78>
Trace; d104d034 <[drbd]drbd_get_ee+194/20c>
Trace; d104e0f8 <[drbd]read_in_block+30/110>
Trace; d104ec5c <[drbd]receive_Data+b8/360>
Trace; d10500a8 <[drbd]drbdd+70/114>
Trace; d105077c <[drbd]drbdd_init+68/178>
Trace; d1045c6c <[drbd]drbd_thread_setup+a4/124>
Trace; c00088d8 <arch_kernel_thread+2c/38>


6 warnings issued.  Results may not be reliable.

In the end, DRBD crashes every time it receives a block. Unfortunately, I
cannot interpret the ksymoops results.
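
As a reading aid for the output above: each annotated line has the form
`address <[module]symbol+offset/size>`, with offset and size in hex and the
`[module]` part present only for symbols that live in a loadable module. The
following is a minimal Python sketch that splits such lines into fields. The
function names (`parse_ksymoops_line`, `summarize`) are made up for
illustration and are not part of ksymoops or DRBD; the sample lines are copied
from the output above.

```python
import re

# Matches ksymoops annotations such as
#   Trace; d104cb50 <[drbd]drbd_alloc_ee+40/78>
#   >>NIP; d104ce50 <[drbd]finish_wait+40/90>   <=====
LINE_RE = re.compile(
    r"^(?P<kind>Trace|>>\w+);\s+(?P<addr>[0-9a-fA-F]+)\s+"
    r"<(?:\[(?P<module>[^\]]+)\])?(?P<symbol>[^+>]+)"
    r"\+(?P<off>[0-9a-fA-F]+)/(?P<size>[0-9a-fA-F]+)>"
)


def parse_ksymoops_line(line):
    """Return the fields of one annotated line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "kind": m.group("kind").lstrip(">"),   # "Trace", "NIP", "GPR1", ...
        "address": int(m.group("addr"), 16),
        "module": m.group("module"),           # None for core-kernel symbols
        "symbol": m.group("symbol"),
        "offset": int(m.group("off"), 16),     # bytes into the function
        "size": int(m.group("size"), 16),      # function length in bytes
    }


def summarize(text):
    """Keep only the faulting instruction (NIP) and the call trace entries."""
    rows = []
    for line in text.splitlines():
        entry = parse_ksymoops_line(line)
        if entry and entry["kind"] in ("Trace", "NIP"):
            rows.append(entry)
    return rows


SAMPLE = """\
>>NIP; d104ce50 <[drbd]finish_wait+40/90>   <=====
Trace; d104cb50 <[drbd]drbd_alloc_ee+40/78>
Trace; d104d034 <[drbd]drbd_get_ee+194/20c>
Trace; d104e0f8 <[drbd]read_in_block+30/110>
Trace; d104ec5c <[drbd]receive_Data+b8/360>
"""

if __name__ == "__main__":
    for e in summarize(SAMPLE):
        where = "%s+0x%x (of 0x%x bytes)" % (e["symbol"], e["offset"], e["size"])
        print("%-5s %08x  %-6s %s" % (e["kind"], e["address"], e["module"] or "kernel", where))
```

Read this way, the faulting instruction (NIP) is 0x40 bytes into the symbol
that ksymoops resolved as `finish_wait` in the drbd module, reached through
`receive_Data`, `read_in_block`, `drbd_get_ee` and `drbd_alloc_ee`. Because
ksymoops issued warnings, the resolved names may not be exact.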





Lars Ellenberg <Lars.Ellenberg at linbit.com>@linbit.com on 10/06/2004
11:40:04

Please respond to drbd-user <drbd-user at linbit.com>

Sent by:    drbd-user-admin at linbit.com


To:    drbd-user at linbit.com
cc:
Subject:    Re: [DRBD-user] Crash during synchronisation when RATE >= 1024
       ???


/ 2004-06-10 09:29:42 +0200
\ Benoit.Ropartz at alcatel.fr:
> Hi,
>
> My Platform is Linux SML_B 2.4.21-rc5-20031126 #1 Tue Feb 17 18:41:17 MET
> 2004 ppc unknown (RS6000).
>
> When the synchronisation uses a rate < 1024, there is no problem.
>
> But when the rate is >= 1024 bytes, DRBD crashes.
>
> Syslog file :
>
> drbd0: size = 847040 KB
> drbd0: 16384 KB marked out-of-sync by on disk bit-map.
> drbd0: Found 4 transactions (4 active extents) in activity log.
> drbd0: Marked additional 0 KB as out-of-sync based on AL.

> drbd0: Connection established.
> drbd0: Resync started as source (need to sync 16384 KB).
> Oops: kernel access of bad area, sig: 11
> NIP: D104CE50 XER: 00000000 LR: D104CE38 SP: CF65BF10 REGS: cf65be60 TRAP: 0300    Not tainted
> MSR: 00001032 EE: 0 PR: 0 FP: 0 ME: 1 IR/DR: 11
> DAR: 00000000, DSISR: 42000000
> TASK = cf65a000[233] 'drbd0_receiver' Last syscall: 120
> last math cf38a000 last altivec 00000000
> GPR00: 00001032 CF65BF10 CF65A000 00000000 00001032 000004BA CE3DD154 CE3DD040
> GPR08: BA2E8BA3 00000000 00000000 CF65BF40 84000042 00000000 00000000 00000000
> GPR16: 00000000 00000000 00000000 00000000 00009032 0E5C7F40 00000000 C0005ED4
> GPR24: C0005BC0 10003F3C 00001000 00000100 C01F0000 CF65BF38 CFCFA420 CFCFA000
> Call backtrace:
> D104CB50 D104D034 D104EFBC D10500A8 D105077C D1045C6C C00088D8
> note: drbd0_receiver[233] exited with preempt_count 2

would you please pipe that through ksymoops?
otherwise it is just garbage.
--> see man ksymoops.
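
One possible way to do that from a script, as a sketch only: save the raw Oops
text and feed it to ksymoops on standard input. The helper names
(`build_ksymoops_cmd`, `decode_oops`) and the System.map path used below are
assumptions for illustration. The `-m`, `-t` and `-a` options are the ones
documented in the ksymoops man page, and ksymoops is assumed to read the Oops
from standard input when no file is named. It should be run on the crashed
machine, or at least against the matching kernel and modules.

```python
import shutil
import subprocess


def build_ksymoops_cmd(system_map=None, target=None, arch=None):
    """Assemble a ksymoops command line from optional settings."""
    cmd = ["ksymoops"]
    if system_map:
        cmd += ["-m", system_map]   # System.map of the kernel that crashed
    if target:
        cmd += ["-t", target]       # e.g. "elf32-powerpc"
    if arch:
        cmd += ["-a", arch]         # e.g. "powerpc:common"
    return cmd


def decode_oops(oops_text, **options):
    """Pipe raw Oops text through ksymoops and return its standard output."""
    cmd = build_ksymoops_cmd(**options)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("ksymoops is not installed on this machine")
    result = subprocess.run(cmd, input=oops_text, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical map path; adjust to the kernel that produced the Oops.
    print(build_ksymoops_cmd(system_map="/boot/System.map",
                             target="elf32-powerpc",
                             arch="powerpc:common"))
```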

and you could:
 * try with latest cvs
   [but I doubt that there have been changes relevant to this problem]
 * try with preempt disabled

 lge
_______________________________________________
drbd-user mailing list
drbd-user at lists.linbit.com
 http://lists.linbit.com/mailman/listinfo/drbd-user






