<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css" style="display:none;"><!-- P {margin-top:0;margin-bottom:0;} --></style>
</head>
<body dir="ltr">
<div id="divtagdefaultwrapper" style="font-size: 12pt; color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-serif, EmojiFont, &quot;Apple Color Emoji&quot;, &quot;Segoe UI Emoji&quot;, NotoColorEmoji, &quot;Segoe UI Symbol&quot;, &quot;Android Emoji&quot;, EmojiSymbols;" dir="ltr">
<div id="divtagdefaultwrapper" dir="ltr" style="font-size: 12pt; color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-serif, EmojiFont, &quot;Apple Color Emoji&quot;, &quot;Segoe UI Emoji&quot;, NotoColorEmoji, &quot;Segoe UI Symbol&quot;, &quot;Android Emoji&quot;, EmojiSymbols;">
<p style="margin-top:0; margin-bottom:0">Hi,</p>
<p style="margin-top:0; margin-bottom:0"><br>
</p>
<p style="margin-top:0; margin-bottom:0">Yep, I did go smaller, but 10MB seemed too basic, so I did a 500GB volume. </p>
<p style="margin-top:0; margin-bottom:0"><br>
</p>
<p style="margin-top:0; margin-bottom:0"><span style="font-family:monospace">[root@ae-fs01 /]# time mkfs.xfs /dev/drbd100
<br>
meta-data=/dev/drbd100 isize=512 agcount=4, agsize=30517578 blks <br>
= sectsz=4096 attr=2, projid32bit=1 <br>
= crc=1 finobt=0, sparse=0 <br>
data = bsize=4096 blocks=122070312, imaxpct=25 <br>
= sunit=0 swidth=0 blks <br>
naming =version 2 bsize=4096 ascii-ci=0 ftype=1 <br>
log =internal log bsize=4096 blocks=59604, version=2 <br>
= sectsz=4096 sunit=1 blks, lazy-count=1 <br>
realtime =none extsz=4096 blocks=0, rtextents=0 <br>
<br>
real 3m23.066s <br>
user 0m0.001s <br>
sys 0m0.097s<br>
</span><span style="font-size: 12pt;"><br>
</span></p>
<p style="margin-top:0; margin-bottom:0"><span style="font-size: 12pt;">I spoke to the guys in the XFS IRC channel, who were able to suggest the solution almost immediately: use "-K" so that mkfs.xfs does not attempt to discard blocks at mkfs time. With that option, formatting the ~12TB device went
from almost 4 hours to just about a minute. I assume that since drbd supports discard/TRIM for SSD backends, mkfs.xfs saw that the device advertised "discard" support and tried
to discard every block before creating the file system. I am not currently using SSDs on the system; the drives are all 10K rpm 2.5" SAS drives.</span></p>
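<p style="margin-top:0; margin-bottom:0"><span style="font-size: 12pt;">For anyone who wants to check up front whether a device advertises discard, here is a minimal sketch. It assumes the kernel exposes the usual queue attribute /sys/block/&lt;dev&gt;/queue/discard_max_bytes, where 0 means no discard support. The function names and the optional second argument (an alternate sysfs root, handy for testing) are made up for illustration; "lsblk --discard" shows the same information interactively.</span></p>

```shell
# Print the discard_max_bytes value for a block device, or 0 if it is not readable.
# $1 = device name without /dev/ (e.g. drbd100)
# $2 = sysfs block root (hypothetical override, defaults to /sys/block)
discard_max_bytes() {
    dev=$1
    sysroot=${2:-/sys/block}
    f="$sysroot/$dev/queue/discard_max_bytes"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo 0
    fi
}

# Succeed (exit status 0) only if the device advertises discard support.
supports_discard() {
    [ "$(discard_max_bytes "$@")" -gt 0 ]
}

# Example use (not executed here):
#   supports_discard drbd100 && echo "mkfs.xfs will try to discard; consider -K"
```

<p style="margin-top:0; margin-bottom:0"><span style="font-size: 12pt;">A non-zero value only says that discards are accepted by the device, not that they are fast, which matches what I saw on spinning disks behind drbd.</span></p>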
<p style="margin-top:0; margin-bottom:0"><span style="font-size: 12pt;"><br>
</span></p>
<p style="margin-top:0; margin-bottom:0"><span style="font-size: 12pt;">Without using -K</span></p>
<p style="margin-top:0; margin-bottom:0"><span style="font-size: 12pt;"><br>
</span></p>
<p style="margin-top:0; margin-bottom:0"><span style="font-size: 12pt;"><span style="font-family:monospace"># time mkfs.xfs /dev/drbd100
<br>
meta-data=/dev/drbd100 isize=512 agcount=12, agsize=268435455 blks <br>
= sectsz=4096 attr=2, projid32bit=1 <br>
= crc=1 finobt=0, sparse=0 <br>
data = bsize=4096 blocks=3093299200, imaxpct=5 <br>
= sunit=0 swidth=0 blks <br>
naming =version 2 bsize=4096 ascii-ci=0 ftype=1 <br>
log =internal log bsize=4096 blocks=521728, version=2 <br>
= sectsz=4096 sunit=1 blks, lazy-count=1 <br>
realtime =none extsz=4096 blocks=0, rtextents=0 <br>
<br>
<br>
real 230m43.596s <br>
user 0m0.020s <br>
sys 0m13.172s<br>
<br>
</span>Using -K</span></p>
<p style="margin-top:0; margin-bottom:0"><span style="font-size: 12pt;"><br>
</span></p>
<p style="margin-top:0; margin-bottom:0"><span style="font-size: 12pt;"><span style="font-family:monospace"># time mkfs.xfs -K /dev/drbd100
<br>
meta-data=/dev/drbd100 isize=512 agcount=12, agsize=268435455 blks <br>
= sectsz=4096 attr=2, projid32bit=1 <br>
= crc=1 finobt=0, sparse=0 <br>
data = bsize=4096 blocks=3093299200, imaxpct=5 <br>
= sunit=0 swidth=0 blks <br>
naming =version 2 bsize=4096 ascii-ci=0 ftype=1 <br>
log =internal log bsize=4096 blocks=521728, version=2 <br>
= sectsz=4096 sunit=1 blks, lazy-count=1 <br>
realtime =none extsz=4096 blocks=0, rtextents=0 <br>
<br>
<br>
real 1m3.523s <br>
user 0m0.009s <br>
sys 0m0.463s<br>
</span></span></p>
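<p style="margin-top:0; margin-bottom:0"><span style="font-size: 12pt;">To put a number on the difference between the two runs above, a small sketch that converts the "real" values printed by time into seconds and divides them. The helper names are hypothetical, and the parsing assumes the NmS.SSSs format shown in the output above.</span></p>

```shell
# Convert a duration such as 230m43.596s (the "real" field of `time`) to seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times longer the first duration is than the second,
# rounded to one decimal place.
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

# Example use (not executed here):
#   speedup 230m43.596s 1m3.523s
```

<p style="margin-top:0; margin-bottom:0"><span style="font-size: 12pt;">With the timings from this mail, that works out to roughly a 218-fold difference on the ~12TB device.</span></p>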
<p style="margin-top:0; margin-bottom:0"><span style="font-size: 12pt;"><span style="font-family:monospace"><br>
</span></span></p>
<p style="margin-top:0; margin-bottom:0"><span style="font-size: 12pt;"><span style="font-family:monospace">My drbd device is working well now and I have the large single XFS partition on it.</span></span></p>
<p style="margin-top:0; margin-bottom:0"><span style="font-size: 12pt;"><span style="font-family:monospace"><br>
</span></span></p>
<p style="margin-top:0; margin-bottom:0"><span style="font-size: 12pt;">Thanks,</span><span style="font-family:monospace"><br>
</span></p>
<p style="margin-top:0; margin-bottom:0"><br>
</p>
<p style="margin-top:0; margin-bottom:0">Diego</p>
</div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> drbd-user-bounces@lists.linbit.com &lt;drbd-user-bounces@lists.linbit.com&gt; on behalf of Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;<br>
<b>Sent:</b> Thursday, May 3, 2018 10:13:07 AM<br>
<b>To:</b> drbd-user@lists.linbit.com<br>
<b>Subject:</b> Re: [DRBD-user] New 3-way drbd setup does not seem to take i/o</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt">
<div class="PlainText">On Thu, May 03, 2018 at 11:45:14AM +0000, Remolina, Diego J wrote:<br>
> A bit of progress, but still call traces being dumped in the logs. I<br>
> waited for the full initial sync to finish, then I created the file<br>
> system from a different node, ae-fs02, instead of ae-fs01. Initially,<br>
> the command hung for a while, but it eventually succeeded. However the<br>
> following call traces were dumped:<br>
> <br>
> <br>
> [ 5687.457691] drbd test: role( Secondary -> Primary )<br>
> [ 5882.661739] INFO: task mkfs.xfs:80231 blocked for more than 120 seconds.<br>
> [ 5882.661770] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.<br>
> [ 5882.661796] mkfs.xfs D ffff9df559b1cf10 0 80231 8839 0x00000080<br>
> [ 5882.661800] Call Trace:<br>
> [ 5882.661807] [&lt;ffffffffb7d12f49&gt;] schedule+0x29/0x70<br>
> [ 5882.661809] [&lt;ffffffffb7d108b9&gt;] schedule_timeout+0x239/0x2c0<br>
> [ 5882.661819] [&lt;ffffffffc08a1f1b&gt;] ? drbd_make_request+0x23b/0x360 [drbd]<br>
> [ 5882.661824] [&lt;ffffffffb76f76e2&gt;] ? ktime_get_ts64+0x52/0xf0<br>
> [ 5882.661826] [&lt;ffffffffb7d1245d&gt;] io_schedule_timeout+0xad/0x130<br>
> [ 5882.661828] [&lt;ffffffffb7d1357d&gt;] wait_for_completion_io+0xfd/0x140<br>
> [ 5882.661833] [&lt;ffffffffb76cee80&gt;] ? wake_up_state+0x20/0x20<br>
> [ 5882.661837] [&lt;ffffffffb792308c&gt;] blkdev_issue_discard+0x2ac/0x2d0<br>
> [ 5882.661843] [&lt;ffffffffb792c141&gt;] blk_ioctl_discard+0xd1/0x120<br>
<br>
<br>
> I would think this is not normal. Do you think this is a RHEL 7.5 specific issue?<br>
> <br>
> # cat /etc/redhat-release<br>
> Red Hat Enterprise Linux Server release 7.5 (Maipo)<br>
> # uname -a<br>
> Linux ae-fs02 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux<br>
<br>
The suggestion was to:<br>
> start with something easier then:<br>
> create a small (like 10M) resource with DM and then try to create the<br>
> XFS on it (without the additional zfs steps).<br>
<br>
Notice the "small" and "easy" in there?<br>
Once that works, go bigger, and more complex.<br>
<br>
You have a ~ 11 TiB volume, which currently is being completely<br>
discarded by mkfs. This may take some time. Yes, for larger devices,<br>
this may take "more than 120 seconds".<br>
<br>
As a side note, most of the time when you<br>
think you want large DRBD volumes to then<br>
"carve out smaller pieces on top of that",<br>
you are mistaken. For reasons.<br>
<br>
Anyways, as long as the "stats" still make progress,<br>
that's just "situation normal, all fucked up".<br>
And, as you said, "it eventually succeeded".<br>
So there :-)<br>
<br>
There was an upstream kernel patch for that in 2014, adding a<br>
"cond_resched()" in the submit path of blkdev_issue_discard(),<br>
<a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/?id=c8123f8c9cb5&context=6&ignorews=0&dt=0" id="LPlnk891132" class="OWAAutoLink" previewremoved="true">https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/?id=c8123f8c9cb5&context=6&ignorews=0&dt=0</a><br>
<br>
But it's not only the submission that can take a long time,<br>
it is also (and especially) the wait_for_completion_io().<br>
<br>
We could "make the warnings" go away by accepting only (arbitrary small<br>
number) of discard requests at a time, and then blocking in<br>
submit_bio(), until at least one of the pending ones completes.<br>
But that'd be only cosmetic I think,<br>
and potentially make things take even longer.<br>
<br>
<br>
-- <br>
: Lars Ellenberg<br>
: LINBIT | Keeping the Digital World Running<br>
: DRBD -- Heartbeat -- Corosync -- Pacemaker<br>
<br>
DRBD® and LINBIT® are registered trademarks of LINBIT<br>
__<br>
please don't Cc me, but send to list -- I'm subscribed<br>
</div>
</span></font></div>
</div>
</body>
</html>