[DRBD-user] potential bug if reconstruct drbd with degraded size.

Lars Ellenberg Lars.Ellenberg at linbit.com
Thu Oct 20 13:37:39 CEST 2005


/ 2005-10-20 16:02:34 +0800
\ Francis I. Malolot:
> Hi All,
> 
> A potential bug when reconstructing drbd with a reduced size.
> I had created drbd with a size of 2.0 TB on both sides, on top of
> LVM and XFS with an external meta disk. While on syn_ed, I stopped
> drbd on both sides, then removed and recreated the volumes with a reduced
> size (1.5 TB) on both sides, and a potential bug occurred.
> 
> BTW drbd is 0.7.13, kernel 2.6.13

> tagged command queuing enabled, command queue depth 16.

Interesting...
What driver is this?
I'd like to test some improvements we have in drbd 0.8
that use TCQ...

>  target1:0:0: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 62)
> Attached scsi disk sda at scsi1, channel 0, id 0, lun 0
> Attached scsi generic sg0 at scsi1, channel 0, id 0, lun 0,  type 0
>   Vendor: SN-3143P  Model:                   Rev: 0001
>   Type:   Direct-Access                      ANSI SCSI revision: 03

...

> drbd0: Resync started as SyncSource (need to sync 2048000000 KB [512000000 bits set]).
> XFS mounting filesystem drbd0
> Ending clean XFS mount for filesystem: drbd0
> drbd0: Primary/Secondary --> Secondary/Secondary
> drbd0: drbdsetup [3276]: cstate SyncSource --> Unconnected
> drbd0: drbd0_receiver [3223]: cstate Unconnected --> BrokenPipe
> drbd0: short read expecting header on sock: r=-512
> drbd0: worker terminated
> drbd0: asender terminated
> drbd0: drbd0_receiver [3223]: cstate BrokenPipe --> StandAlone
> drbd0: Connection lost.
> drbd0: receiver terminated
> drbd0: drbdsetup [3276]: cstate StandAlone --> StandAlone
> drbd0: drbdsetup [3276]: cstate StandAlone --> Unconfigured
> drbd0: worker terminated

And here you reconfigure with smaller lower-level storage, right?
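
As a cross-check of the sizes in the log, here is a small sketch of the bitmap arithmetic. It assumes one bitmap bit per 4 KB of storage (which is what "2048000000 KB [512000000 bits set]" implies) and 32-bit bitmap words (an assumption based on the i386 kernel in the oops). The function name bitmap_geometry is made up for this illustration and is not part of drbd.

```python
# Assumption: one bitmap bit covers 4 KB, as implied by the
# "need to sync 2048000000 KB [512000000 bits set]" log line.
KB_PER_BIT = 4
# Assumption: bitmap words are 32 bits wide (i386 kernel).
BITS_PER_WORD = 32


def bitmap_geometry(size_kb):
    """Return (bits, words, whole_gb) for a device of size_kb kilobytes."""
    bits = size_kb // KB_PER_BIT
    words = -(-bits // BITS_PER_WORD)      # round up to whole words
    whole_gb = size_kb // (1024 * 1024)    # truncated, as in the log
    return bits, words, whole_gb


if __name__ == "__main__":
    print("old device:", bitmap_geometry(2048000000))
    print("new device:", bitmap_geometry(1536032768))
```

Under these assumptions the smaller device gives bits=384008192, words=12000256 and 1464 GB, which agrees with what drbd reports after the reconfigure, so the bitmap itself appears to have been sized for the new, smaller device.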

> drbd0: resync bitmap: bits=384008192 words=12000256
> drbd0: size = 1464 GB (1536032768 KB)
> drbd0: 1536032768 KB now marked out-of-sync by on disk bit-map.
> drbd0: 1464 GB marked out-of-sync by on disk bit-map.
> Unable to handle kernel paging request at virtual address 00200200
>  printing eip:
> f09ff4d7
> *pde = 00000000
> Oops: 0002 [#1]
> SMP
> Modules linked in: drbd bonding i2c_i801 i2c_dev i2c_core mptspi mptscsih mptbase aic7xxx e1000 e100 sym53c8xx
> CPU:    0
> EIP:    0060:[<f09ff4d7>]    Not tainted VLI
> EFLAGS: 00010006   (2.6.13)
> EIP is at lc_set+0x47/0xc0 [drbd]
> eax: f0a2bc04   ebx: 0003d090   ecx: 00200200   edx: 00100100
                                       ^^^^^^^^        ^^^^^^^^
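This is list poison: 00100100 and 00200200 are the kernel's LIST_POISON1
and LIST_POISON2 values. So something tries to manipulate a list entry
that was already removed from its list and poisoned.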
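
To illustrate what poisoning means, here is a simplified user-space model, not kernel or drbd code. The two constants are the kernel's list poison values. The Node class, deref() and the PoisonedAccess exception are stand-ins invented for this sketch. In the real kernel, which address faults depends on the list type and on which pointer is written through first.

```python
LIST_POISON1 = 0x00100100  # stored in ->next when an entry is deleted
LIST_POISON2 = 0x00200200  # stored in ->prev (or ->pprev) when deleted


class PoisonedAccess(Exception):
    """Stand-in for the page fault a real kernel would take."""


class Node:
    def __init__(self, name):
        self.name = name
        # Like INIT_LIST_HEAD: an empty circular list points at itself.
        self.next = self
        self.prev = self


def deref(ptr):
    """Model a pointer dereference; poison values are not valid memory."""
    if ptr in (LIST_POISON1, LIST_POISON2):
        raise PoisonedAccess(f"paging request at {ptr:08x}")
    return ptr


def list_add(new, head):
    """Insert new right after head."""
    nxt = head.next
    new.prev, new.next = head, nxt
    nxt.prev = new
    head.next = new


def list_del(entry):
    """Unlink entry from its neighbours, then poison its pointers."""
    prev = deref(entry.prev)
    nxt = deref(entry.next)
    nxt.prev = prev
    prev.next = nxt
    entry.next = LIST_POISON1
    entry.prev = LIST_POISON2


if __name__ == "__main__":
    head, a = Node("head"), Node("a")
    list_add(a, head)
    list_del(a)          # fine: a is unlinked and poisoned
    try:
        list_del(a)      # a's pointers are poison now
    except PoisonedAccess as exc:
        print("second delete faulted:", exc)
```

In this model the second list_del() on the same entry is what hits the poison, which is the same class of error as a stale element being touched again in the oops here.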

> esi: f0a2bc34   edi: f0a2a000   ebp: 00000000   esp: e60afd94
> ds: 007b   es: 007b   ss: 0068
> Process drbdsetup (pid: 3293, threadinfo=e60ae000 task=e9909530)
> Stack: c02ecdd8 00000000 00000000 ea3b4000 e9c40000 f09fd636 f0a2a000 0003d090
>        00000100 f0a04820 e60afdc8 e9c40508 00000005 00000000 00000000 00000001
>        00000000 00000001 00000000 00000001 f0a04b00 e9c40000 000001e7 f09efc55
> Call Trace:
>  [<c02ecdd8>] sprintf+0x28/0x30
>  [<f09fd636>] drbd_al_read_log+0x246/0x290 [drbd]
>  [<f09efc55>] drbd_ioctl_set_disk+0x485/0x800 [drbd]
>  [<c0178155>] dput+0x175/0x1f0
>  [<f09f16a9>] drbd_ioctl+0x879/0xcb3 [drbd]
>  [<c02e993a>] kobject_get+0x1a/0x30
>  [<c0348884>] get_disk+0x44/0xa0
>  [<c03479af>] exact_lock+0xf/0x20
>  [<c02edcc0>] __copy_to_user_ll+0x70/0x80
>  [<c02edd92>] copy_to_user+0x42/0x60
>  [<c0169c98>] cp_new_stat64+0xf8/0x110
>  [<c0347342>] blkdev_driver_ioctl+0x52/0x90
>  [<c0347424>] blkdev_ioctl+0xa4/0x1b0
>  [<c016861b>] block_ioctl+0x2b/0x30
>  [<c0172e1e>] do_ioctl+0x8e/0xa0
>  [<c0173005>] vfs_ioctl+0x65/0x1f0
>  [<c01731d5>] sys_ioctl+0x45/0x70
>  [<c0102f9f>] sysenter_past_esp+0x54/0x75
> Code: 0c 0f 88 83 00 00 00 8b 47 1c 39 c2 73 7c 8b 4f 18 8d 04 87 0f af d1 01 d0 8d 70 30 8b 4e 04 89 5e 14 85 c9 74 1a 8b 50 30 85 d2 <89> 11 74 03 89 4a 04 c7 40 30 00 00 00 00 c7 46 04 00 00 00 00


If you can reproduce this easily, that would help us investigate the issue.

-- 
: Lars Ellenberg                                  Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH            Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe   http://www.linbit.com :
__
please use the "List-Reply" function of your email client.


