[drbd-mc] Lots of pvdisplay commands -> RA timeouts

Rasto Levrinc rasto.levrinc at gmail.com
Tue Dec 18 15:16:20 CET 2012


On Tue, Dec 18, 2012 at 11:00 AM, Lars Ellenberg
<lars.ellenberg at linbit.com> wrote:
> On Mon, Dec 17, 2012 at 04:09:25PM +0100, Rasto Levrinc wrote:
>> On Mon, Dec 17, 2012 at 3:39 PM, Caspar Smit <c.smit at truebit.nl> wrote:
>> > 2012/12/17 Rasto Levrinc <rasto.levrinc at gmail.com>
>> >>
>> >> On Mon, Dec 17, 2012 at 3:09 PM, Caspar Smit <c.smit at truebit.nl> wrote:
>> >> > Hi Rasto,
>> >> >
>> >> > I noticed this in one of my clusters:
>> >> >
>> >> ...
>> >>
>> >> > /usr/local/bin/lcmc-gui-helper-1.4.2 hw-info-daemon
>> >> > root      9869  0.0  0.0  39856  1516 pts/7    S+   14:49   0:00
>> >> > \_ sudo -E -p DRBD MC sudo pwd:  /usr/local/bin/lcmc-gui-helper-1.4.2
>> >> > hw-info-daemon
>> >> > root      9870  0.3  0.0  27676  5000 pts/7    S+   14:49   0:00
>> >> > \_ /usr/bin/perl /usr/local/bin/lcmc-gui-helper-1.4.2 hw-info-daemon
>> >> > root     18176  0.0  0.0   9060  1180 pts/7    S+   14:50   0:00
>> >> > \_ sh -c /sbin/pvdisplay -C --noheadings -o pv_name,vg_name 2>/dev/null
>> >> > root     18177  0.0  0.0  17872  1604 pts/7    D+   14:50   0:00
>> >> > \_ /sbin/pvdisplay -C --noheadings -o pv_name,vg_name
>> >> >
>> >> > Why is LCMC running so many pvdisplay commands at once?
>> >>
>> >> Hi Caspar,
>> >>
>> >> it runs it once every 10 seconds, to see if something has changed.
>> >> Can you check what it does on your nodes?
>> >>
>> >>  /sbin/pvdisplay -C --noheadings -o pv_name,vg_name
>> >>
>> >> Rasto
>> >>
>> >
>> > # /sbin/pvdisplay -C --noheadings -o pv_name,vg_name
>> >   /dev/sdb   single_array3
>> >   /dev/sdc   single_array3
>> >   /dev/sdd   single_array3
>> >   /dev/sdh   replicated_array1and2
>> >   /dev/sdi   replicated_array1and2
>> >   /dev/sdj   replicated_array1and2
>> >   /dev/sdk   replicated_array1and2
>> >
>> > I know that LCMC does monitor changes with the lcmc-gui-helper script, but I
>> > presume the "hw-info-daemon" part has to run only once and not 5(+) times
>> > concurrently?
>> >
>> > Running 5x pvdisplay concurrently can really slow things down.
>>
>> It shouldn't run this 5x concurrently. What probably happens here is that
>> the hw daemon takes too long, is assumed dead, and is restarted.
>
> Which does not really improve things in this case ;-)

That's actually a regression: the old daemon must be killed when anything
hangs, so that's the first bug.
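
To illustrate the "kill it when it hangs" idea, here is a minimal sketch in
Python. It is not the lcmc-gui-helper code (the helper is a Perl script); the
function name run_with_timeout and the 30-second limit in the usage line are
made up for the example.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd (a list of arguments); return its stdout, or None on timeout.

    Hypothetical helper for illustration, not part of LCMC.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,  # same effect as 2>/dev/null
            timeout=timeout_s,
            text=True,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed the child and waited for it
        # before raising, so no stale process is left for the next round.
        return None
    return result.stdout
```

Usage would be something like
run_with_timeout(["/sbin/pvdisplay", "-C", "--noheadings", "-o",
"pv_name,vg_name"], 30). Note that a process in uninterruptible sleep (state
D in the ps listing above) only goes away once its I/O completes, so a kill
alone may not be a complete cure.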

>
>> Can it be that /sbin/pvdisplay -C --noheadings -o pv_name,vg_name
>> hangs or takes very long on your system, at least sometimes?
>
> We have seen lvm commands that scan meta data take several *minutes* to
> complete on a moderately busy server.
> [0.1 seconds when the system is idle,
>  virtually "forever" when it is really busy :-)]
>
> In part because of too many devices to be scanned, badly chosen filter
> settings, badly chosen bio flags for O_DIRECT (that has been fixed since
> in kernel), too long device queues (too large nr_requests),
> and evil io scheduler interactions.
> All tuneable, or possible to work around.
> Still that brought it down to ~ 20 seconds only.
>
>> Anyway I can/should fix LCMC to deal with this situation.
>
> You should probably not initiate a full device scan every ten seconds,
> but preferably on demand only,
> or maybe every once in a while if loadavg is low.
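
A rough sketch of that policy, in Python for illustration only: scan on
demand, otherwise only once in a while and only when the load is low. The
function name should_scan and the thresholds (300 seconds, load 1.0) are
assumptions, not anything LCMC implements.

```python
import os


def should_scan(requested, seconds_since_last,
                max_age=300.0, max_load=1.0, load=None):
    """Decide whether to start a full device scan now.

    requested          -- True if a scan was explicitly asked for
    seconds_since_last -- time since the previous scan finished
    load               -- 1-minute load average; read from the OS if None
    All names and default values are hypothetical.
    """
    if requested:
        return True  # on demand: always scan
    if load is None:
        load = os.getloadavg()[0]  # 1-minute load average (Unix only)
    # Otherwise scan only occasionally, and only on a quiet system.
    return seconds_since_last >= max_age and load <= max_load
```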

The LCMC needs to have reasonably up-to-date knowledge about LVM, if possible.
It could run the scan only if any pv, vg or lv has changed. I've figured out
that I could check the timestamp of the lvm .cache file, which is written
every time any of the lvm commands is run. That wouldn't help if the lvm cache
is disabled, but I guess that somebody who disables it has also set up some
device filters. An update every 10 seconds plus the time the scan takes, or
so, would be acceptable. Or maybe something clever like
(time the scan runs) * 2 + 10.
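
Both ideas could be sketched roughly like this (Python for illustration, not
the actual helper code). The function names are made up, and the cache file
location is left as a parameter because it depends on the lvm configuration.

```python
import os


def next_delay(scan_seconds, base=10.0, factor=2.0):
    """Seconds to wait before the next scan: (time the scan runs) * 2 + 10."""
    return scan_seconds * factor + base


def cache_mtime(path):
    """Modification time of the lvm .cache file, or None if it is missing."""
    try:
        return os.stat(path).st_mtime
    except OSError:
        return None


def needs_rescan(path, last_seen):
    """True if the cache file changed since last_seen, or cannot be read.

    With no cache file (cache disabled) we cannot tell, so fall back to
    scanning. last_seen should be taken AFTER our own scan has finished,
    because the scan itself may rewrite the file.
    """
    current = cache_mtime(path)
    return current is None or current != last_seen
```

With a scan that takes 20 seconds, next_delay(20) gives 50 seconds between
scans, so a slow system is polled less often.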

Rasto

