On Mon, 15 Feb 2016 15:29:39 +0100, Lars Ellenberg wrote:
> On Fri, Feb 12, 2016 at 04:10:14PM +0000, freebird wrote:
> > Hi, I have a memory usage problem with the resync sender on a single
> > resource. If the total size of the out-of-sync blocks exceeds the
> > memory size when drbd connects, the sync progresses until it exhausts
> > the memory, leading to the OOM killer kicking in (too slowly to make
> > any difference) followed by a kernel panic. Everything else, including
> > replication of very large files that exceed the memory size, works
> > fine.
> >
> > I've tested with out-of-sync sizes up to just short of the total
> > memory, where the resync succeeds, and observed that the memory used
> > increases in line with sync'ed blocks, i.e. the resync process seems to
> > be allocating memory per block and doesn't release or reuse it ...
> > only when the resync completes does the memory get released. The
> > strange thing is that I can't find which specific process is retaining
> > the memory ... nothing shows up in top, slabtop or the process table.
>
> We regularly resync terabytes of dummy data in test VMs that have
> only a few hundred megs of RAM allocated. So I seriously doubt
> that this was a generic DRBD issue, but would suggest that
> something is off in your deployment.
>

Hi Lars,

that's what I suspected myself, as I know the DRBD user base is large and
googling didn't find anyone else with the issue, but I was just hoping for
some suggestions.

Anyway, it turns out to be an XFS issue ... if I unmount the filesystem
before the resync, the problem doesn't occur. The XFS setup hasn't been
tweaked from the defaults that mkfs.xfs created, so I'll investigate tuning
or even using a different filesystem.

Thanks for the reply,
Nige.

--
Twenty years from now you will be more disappointed by the things that
you didn't do than by the ones you did do. [Mark Twain]