[Drbd-dev] verbose log output from drbd kernel module

Lars Ellenberg lars.ellenberg at linbit.com
Thu Nov 2 14:19:44 CET 2017


On Tue, Oct 31, 2017 at 02:58:18PM +1100, Tobin C. Harding wrote:
> Currently the drbd kernel module is a little bit verbose. While there
> are numerous repeated log lines, a number of them are repeated in
> different sequences, and it is not immediately clear which lines
> are _really_ necessary to understand what is going on. A good
> understanding of drbd is probably required to reduce the duplicates
> without loss of information. There are, however, a number of duplicate
> log entries that appear consecutively; these could be reduced to a
> single instance without loss of information (assuming the number of
> times the error message appears consecutively is not useful
> information).
> 
> So far I have found the following duplicate consecutive log entries
> 
> #
> 
> [ 1274.952627] F2FS-fs (drbd0): Magic Mismatch, valid(0xf2f52010) - read(0x29363930)
> [ 1274.952646] F2FS-fs (drbd0): Can't find valid F2FS filesystem in 1th superblock
> [ 1274.953695] F2FS-fs (drbd0): Magic Mismatch, valid(0xf2f52010) - read(0x0)
> [ 1274.953710] F2FS-fs (drbd0): Can't find valid F2FS filesystem in 2th superblock
> [ 1274.953760] F2FS-fs (drbd0): Magic Mismatch, valid(0xf2f52010) - read(0x29363930)
> [ 1274.953770] F2FS-fs (drbd0): Can't find valid F2FS filesystem in 1th superblock
> [ 1274.953779] F2FS-fs (drbd0): Magic Mismatch, valid(0xf2f52010) - read(0x0)
> [ 1274.953786] F2FS-fs (drbd0): Can't find valid F2FS filesystem in 2th superblock

Those are not DRBD messages, but generated by f2fs.

> # After adding/deploying a resource 
> 
> [1368149.470125] drbd backups: Preparing cluster-wide state change 3280213538 (1->-1 3/1)
> [1368149.470127] drbd backups: Committing cluster-wide state change 3280213538 (0ms)
> [1368149.470136] drbd backups: role( Secondary -> Primary )
> [1368149.474569] drbd backups: role( Primary -> Secondary )
> [1368149.479674] drbd backups: Preparing cluster-wide state change 1520483517 (1->-1 3/1)
> [1368149.479676] drbd backups: Committing cluster-wide state change 1520483517 (0ms)
> [1368149.479685] drbd backups: role( Secondary -> Primary )
> [1368149.484642] drbd backups: role( Primary -> Secondary )
> [1368164.080148] drbd backups: Preparing cluster-wide state change 3014496965 (1->-1 3/1)
> [1368164.080150] drbd backups: Committing cluster-wide state change 3014496965 (0ms)
> [1368164.080161] drbd backups: role( Secondary -> Primary )
> ... [repeats many times]

That is auto-promote: every time someone opens DRBD for write, this
does Secondary -> Primary; when the last opener closes it again, it
does an "auto-demotion" back, Primary -> Secondary.

If you do stuff like this: "while : > /dev/drbd0 ; do :; done",
you get it logged, over and over again.

Usually, you want to use DRBD for longer than a few milliseconds,
so these should only be logged once per mount and once per umount.
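
If one did want to cap that noise, the kernel's generic ratelimit
helpers would be one option. A rough sketch (the helper below and its
limits are illustrative only, not actual DRBD code):

#include <linux/ratelimit.h>
#include <linux/printk.h>

/* Allow at most 3 role-change messages per 5 seconds. */
static DEFINE_RATELIMIT_STATE(role_change_rs, 5 * HZ, 3);

/* Illustrative helper, not a real DRBD function. */
static void log_role_change(const char *res, const char *from, const char *to)
{
        if (__ratelimit(&role_change_rs))
                pr_info("drbd %s: role( %s -> %s )\n", res, from, to);
        /* When messages are dropped, __ratelimit() itself emits a
         * "callbacks suppressed" summary once the interval expires. */
}

The trade-off: a genuinely interesting role change can be swallowed
together with the noise, which is why rate limiting is not obviously
the right answer here.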


> # If block device is busy
> 
> [1369106.294749] drbd backups: Failed: role( Primary -> Secondary )
> [1369106.297290] drbd backups: State change failed: Device is held open by someone
> ... [repeats 6 times]
> 
> 
> Would it be useful to reduce these log lines to a single instance?

Useful, maybe.
Not so easy, though, if it is retried from userland.
And maybe not that easy even for in-kernel retries,
because we want to log that failure at least once,
and we may not always know beforehand which retry will be
our last try in that particular sequence of retries.
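
The classic syslogd approach, collapsing consecutive duplicates into a
"last message repeated N times" line, runs into exactly that problem.
A rough sketch (names illustrative, no locking, so nothing to copy
verbatim):

#include <linux/printk.h>
#include <linux/string.h>

static char last_msg[128];
static unsigned int repeat_count;

/* Illustrative only: collapse consecutive identical messages. */
static void drbd_log_dedup(const char *msg)
{
        if (!strncmp(msg, last_msg, sizeof(last_msg) - 1)) {
                repeat_count++;         /* swallow the duplicate */
                return;
        }
        if (repeat_count)
                pr_info("previous message repeated %u times\n", repeat_count);
        repeat_count = 0;
        strscpy(last_msg, msg, sizeof(last_msg));
        pr_info("%s", msg);
}

Note that the "repeated N times" summary is only flushed once a
*different* message arrives: if the failing retry was the last thing
logged, its count never appears, which is the same "we do not know
which try is the last one" problem again.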

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
: R&D, Integration, Ops, Consulting, Support

DRBD® and LINBIT® are registered trademarks of LINBIT

