[DRBD-user] DRBD attempts to access beyond end of device

Thu Feb 28 19:09:46 CET 2008

On Thu, 28 Feb 2008, Todd Denniston wrote:

> Nate Seif wrote, On 02/28/2008 11:12 AM:
>>
>>
>>  On Wed, 27 Feb 2008, Lars Ellenberg wrote:
>> 
>> >  On Wed, Feb 27, 2008 at 01:07:04PM -0500, Nate Seif wrote:
>> > > 
>> > > 
>> > >  On Wed, 27 Feb 2008, Lars Ellenberg wrote:
>> > > 
>> > > >  On Tue, Feb 26, 2008 at 04:14:36PM -0500, Nate Seif wrote:
>> > > > >  Hello all:
>> > > > >  I intermittently experience the errors below while running DRBD 
>> > > > >  and would
>> > > > >  like to correct whatever condition is causing DRBD to randomly 
>> > > > >  lose pages.
>> > > > >  My hard disks and partitions are identical and have never given me
>> > > > >  problems previously. I don't see any other disk I/O errors in my 
>> > > > >  logs. And
>> > > > >  it appears that occasionally (not always) these errors are 
>> > > > >  preceded by a
>> > > > >  resync of the two disks.
>> > > > > 
>> > > > >  Why would DRBD "attempt to access beyond end of device"?
>> > > > > 
>> > > > >  I am running DRBD 8.06 on Gentoo Linux as I could not get my 
>> > > > >  latest
>> > > > >  Gentoo kernel to load the DRBD module where version > 8.06. 
>> > > > >  Metadata is
>> > > > >  "internal" and I'm running Protocol C. I'd be happy to post my 
>> > > > >  drbd.conf
>> > > > >  page if necessary.
>> > > > > 
>> > > > > 
>> > > > >  Feb 26 08:21:46 <hostname> attempt to access beyond end of device
>> > > > >  Feb 26 08:21:46 <hostname> drbd0: rw=1, want=211992584, limit=211986944
>> > > > >  Feb 26 08:21:46 <hostname> Buffer I/O error on device drbd0, logical block 26499072
>> > > > >  Feb 26 08:21:46 <hostname> lost page write due to I/O error on drbd0
>> > > > >  Feb 26 08:21:46 <hostname> attempt to access beyond end of device
>> > > > >  Feb 26 08:21:46 <hostname> drbd0: rw=1, want=211992592, limit=211986944
>> > > > >  Feb 26 08:21:46 <hostname> Buffer I/O error on device drbd0, logical block 26499073
>> > > > >  Feb 26 08:21:46 <hostname> lost page write due to I/O error on drbd0
>> > > > > 
>> > > > > 
>> > > > >  Any ideas, tips, help, etc. are much appreciated. Thank you -
>> > > > 
>> > > >  let me guess:
>> > > >  you did mkfs /dev/sda1, not mkfs /dev/drbd0?
>> > > >  well, you screwed up.
>> > > 
>> > >  I did NOT mkfs on /dev/hda4. (I have DRBD running on a pair of 
>> > >  IDE/PATA
>> > >  disks and no SATA drives in either system.)
>> > > 
>> > >  I partitioned my disks with fdisk. I have identical drives with
>> > >  identically sized partitions. I compiled the DRBD module, started 
>> > >  DRBD,
>> > >  mounted /dev/drbd0 (not /dev/hda4), and formatted drbd0 with an ext3
>> > >  file system on the primary only after I got DRBD up and running months
>> > >  ago.
>> > 
>> >  please do
>> > 
>> >      tune2fs -l /dev/mapper/vg00--bk1-root |
>> >      grep -e ^Block.count: -e ^Block.size:
>>
>>  I do not have RAID on either system and /dev/mapper does not exist on
>>  either machine. I have a single, identical hard drive in each system where
>>  /dev/hda4 is the partition DRBD uses. Can I change the tune2fs command you
>>  suggested above to get the bytes my ext3 FS thinks it's occupying?
>> 
>
> You should be able to...
>
> please run
> tune2fs -l /dev/hda4 |
>      grep -e ^Block.count: -e ^Block.size:

Thanks for the suggestion, Todd.

# tune2fs -l /dev/hda4 | grep -e ^Block.count: -e ^Block.size:
Block count:              26499186
Block size:               4096

>
> or better
> tune2fs -l /dev/drbd0 |
>      grep -e ^Block.count: -e ^Block.size:

# tune2fs -l /dev/drbd0 | grep -e ^Block.count: -e ^Block.size:
Block count:              26499186
Block size:               4096

I see that 26499186 * 4096 = 108540665856 bytes.
108540665856 bytes * (1 kilobyte / 1024 bytes) = 105996744 kilobytes.

This matches the size my kernel reports for hda4, not the size it reports
for drbd0:

# grep -e hda4 -e drbd0 /proc/partitions
     3     4  105996744 hda4
   147     0  105993472 drbd0
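
The same comparison can be reproduced with plain shell arithmetic. This is
only a sketch: the variable names are made up for illustration, and the
numbers are copied from the tune2fs output and the kernel log quoted in this
thread, assuming (as Lars states) that limit= is counted in 512-byte sectors.

```shell
#!/bin/sh
# Numbers copied from the tune2fs output and the kernel log in this thread.
fs_blocks=26499186       # "Block count" reported by tune2fs
fs_block_size=4096       # "Block size" in bytes
drbd_sectors=211986944   # "limit=" from the kernel log, in 512-byte sectors

# Size the filesystem believes it occupies, in kB.
fs_kb=$(( fs_blocks * fs_block_size / 1024 ))
# Size of drbd0 in kB: two 512-byte sectors per kB.
drbd_kb=$(( drbd_sectors / 2 ))
# How far the filesystem extends past the end of drbd0, in kB.
overhang_kb=$(( fs_kb - drbd_kb ))

echo "filesystem: ${fs_kb} kB"
echo "drbd0:      ${drbd_kb} kB"
echo "overhang:   ${overhang_kb} kB"
```

The first two values agree with the hda4 and drbd0 rows from /proc/partitions
above; the third is the difference between them, 3272 kB.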

>> >  you again get two numbers, this time unit is kilo byte.
>> >  that is the size of the partitions as the kernel sees them now.
>> >  according to the logs above (the limit= is unit sectors),
>> >  drbd0 will be 105993472 kB.
>> >  I dare say hda4 will be somewhat larger, my best guess, given the
>> >  information I have, is that hda4 will be 105996740 kB.
>> >  and that this also matches what the tune2fs reports.

hda4 is slightly larger than drbd0 according to /proc/partitions (by 3272 kB),
and the hda4 size is exactly what tune2fs reports. This seems to me to be what
we'd expect if I had created the filesystem directly on hda4. (Is this how
everyone else interprets it?) But I swear that when I set up my DRBD cluster I
created the file system on /dev/drbd0:

# mke2fs -j /dev/drbd0

I am and was very much aware of the fact that I have to deal only with 
/dev/drbd0 and not the underlying partition /dev/hda4. Is it possible that 
mke2fs created the ext3 file system on /dev/hda4 instead of /dev/drbd0 as 
I told it to?

I imagine I need to shrink the ext3 filesystem to the size of /dev/drbd0, so
that the size tune2fs reports matches the size the kernel sees for
/dev/drbd0. Is this correct?

Thanks, Lars and Todd, for your pointers -

Nate