[DRBD-user] drbd freeze its filesystem

Enrico Morelli enrico.morelli at gmail.com
Fri Oct 13 17:00:19 CEST 2006

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Thanks a lot for your suggestions. I tried on two other PCs with software
RAID and reiserfs on DRBD, and obtained a transfer rate of 11.09 MB/sec
without any filesystem freeze.

So, after trying many combinations of parameters, I found the following
settings, which give me a transfer rate of 10 MB/sec without freezes
(a drbd.conf sketch with these values follows the list).

Lower device: 104:09   (cciss/c0d0p9)
Meta device: internal
Disk options:
Local address: 192.168.0.2:7788
Remote address: 192.168.0.1:7788
Wire protocol: C
Net options:
 timeout = 4.0 sec
 connect-int = 10 sec (default)
 ping-int = 10 sec (default)
 max-epoch-size = 16384
 max-buffers = 20480
 unplug-watermark = 20480
 sndbuf-size = 131070  (default)
 ko-count = 0  (default)
Syncer options:
 rate = 40960 KB/sec
 group = 0  (default)
 al-extents = 257
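
For reference, the same settings would look roughly like the drbd.conf
fragment below (DRBD 0.7-style syntax; the resource name, hostnames and
drbd device are made up, only the option values come from the list above):

 resource r0 {
   protocol C;
   net {
     timeout           40;      # 4.0 sec, in units of 0.1 sec
     connect-int       10;      # default
     ping-int          10;      # default
     max-epoch-size    16384;
     max-buffers       20480;
     unplug-watermark  20480;
     sndbuf-size       131070;  # default
     ko-count          0;       # default
   }
   syncer {
     rate        40960K;        # resync rate, ~40 MB/sec
     group       0;             # default
     al-extents  257;
   }
   on node1 {
     device     /dev/drbd0;
     disk       /dev/cciss/c0d0p9;
     address    192.168.0.2:7788;
     meta-disk  internal;
   }
   on node2 {
     device     /dev/drbd0;
     disk       /dev/cciss/c0d0p9;
     address    192.168.0.1:7788;
     meta-disk  internal;
   }
 }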


On 10/12/06, Lars Ellenberg <Lars.Ellenberg at linbit.com> wrote:
> / 2006-10-12 15:15:36 +0200
> \ Enrico Morelli:
> > >which file system?
> > >if reiserfs, please try something else.
> > >we had a report about "apparent freezes" with reiserfs on top of drbd,
> > >but the "freeze" (it recovers after minutes to hours "all by itself")
> > >was in reiserfs, not in drbd. we just change the timing behaviour of the
> > >io stack...
> > >
> > ArghHHhh!!! Yes, I have reiserfs on top of drbd and the machines
> > are servers in production, so I cannot change the filesystem.
>
> to "migrate" the file system, you could
> degrade the cluster, mkfs.xfs on the not-active node,
> rsync the data over. the first rsync will take ages,
> and you will need to repeat it, but the less has changed
> since the previous run, the less it needs to transfer.
> then you have one short downtime window where you decide that
> now you go down with server one, remount ro, do a final rsync,
> and go active with the other node.
> then you let drbd sync back the new xfs.
>
> no guarantee for anything here, your problem may still be something else
> completely, and this may make it even worse...
>
> > Is there some workaround to avoid this problem?
> > I didn't find any warning about the reiserfs/drbd problem; maybe it would
> > be useful to write about this on the site or on the wiki.
>
> it is not "reiserfs and drbd leads to trouble".
> there are clusters with reiserfs on top of drbd out there,
> performing just fine.
>
> it is "we had _one_ report", where the original poster figured out that
> some processes on reiserfs on top of drbd got stuck in getdents64 for
> ages, and that using a different file system made the problem go away
> for him.
>
> if you google for hang and reiserfs and getdents64 you get a few hits,
> but most of them are pretty old.
>
> in 2.6.16-rc.something, there was a
> "reiserfs-hang-and-performance-fix-for-data=journal-mode.patch"
> included, maybe they just got it "half-right" ?
>
> in some (very old) posts about reiserfs hang in getdents64, I found
> the recommendation to "just touch some random directory" within that
> file system, to "wake it up".
>
> so as a workaround for your situation, maybe do some
>  # sync
> on the drbd primary and/or secondary, or do some
>  # mkdir /mnt/reiserfs-mount-point/dummy$$
>  # rmdir /mnt/reiserfs-mount-point/dummy$$
>
> may or may not help...
>
> --
> : Lars Ellenberg                                  Tel +43-1-8178292-0  :
> : LINBIT Information Technologies GmbH            Fax +43-1-8178292-82 :
> : Schoenbrunner Str. 244, A-1120 Vienna/Europe   http://www.linbit.com :
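
Just for my own notes, the migration procedure sketched above would look
roughly like the following shell steps (the "primary" hostname and the
/mnt/xfs-new mount point are placeholders, this glosses over where the
internal drbd meta-data lives, and I have not tested it):

 # on the standby node, with drbd stopped there, working directly
 # on the backing device:
 mkfs.xfs /dev/cciss/c0d0p9
 mount /dev/cciss/c0d0p9 /mnt/xfs-new

 # initial copy from the live primary (over ssh); this run will take ages
 rsync -aHx --numeric-ids primary:/mnt/reiserfs-mount-point/ /mnt/xfs-new/

 # repeat until very little changes between runs
 rsync -aHx --numeric-ids --delete primary:/mnt/reiserfs-mount-point/ /mnt/xfs-new/

 # short downtime window: remount read-only on the primary, do a final
 # rsync as above, then switch services to this node and let drbd sync back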


