> (no need to stop services, though)
>
>
Thanks for the detailed steps.
>>> drbd hot area) Currently we use 521 al-extents for a 1.4 TB device. How
>>> much difference does it make when we increase this number? (nr of writes
>>> vs. size of al-extents) And also can this be changed on-the-fly?
>>>
>>
>
>
> while the system is under load, try:
> watch -n1 cat /proc/drbd
> and keep looking at this "al:###" number
> if it changes too frequently, it may help to increase the al size.
> if it does not change at all for several seconds/minutes while the dw
> and ns numbers are still moving, then it is big enough, and you may even
> consider to reduce it (less resync time in case of a Primary crash).
>
Under moderate load the al: counter grows by 1/sec,
but under high load it grows by 10/sec. Is that okay?
ns:839765704 nr:146624628 dw:984353832 dr:946917693 al:874245 bm:2765
lo:109 pe:0 ua:0 ap:109
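For anyone who wants to measure the al: rate rather than eyeball it in
watch, here is a minimal sketch. It assumes the counter format shown
above and a single device block in the sampled text. The function names
are made up for illustration, and the second sample in the demo is
hypothetical. The 4 MiB per al-extent figure is taken from the DRBD
documentation for al-extents, so treat the hot-area number as an
estimate under that assumption.

```python
import re

# Assumption: each activity-log extent covers 4 MiB of backing storage,
# as described in the DRBD documentation for al-extents.
AL_EXTENT_MIB = 4


def parse_counters(text):
    """Return the numeric 'xx:123' counters of one /proc/drbd block as a dict."""
    pairs = re.findall(r"\b([a-z]{2}):(\d+)", text)
    return {name: int(value) for name, value in pairs}


def al_rate(before, after, seconds):
    """Activity-log updates per second between two counter samples."""
    return (after["al"] - before["al"]) / seconds


def hot_area_mib(al_extents):
    """Approximate size of the area covered by the activity log, in MiB."""
    return al_extents * AL_EXTENT_MIB


if __name__ == "__main__":
    sample = (
        "ns:839765704 nr:146624628 dw:984353832 dr:946917693 al:874245 bm:2765\n"
        "lo:109 pe:0 ua:0 ap:109"
    )
    first = parse_counters(sample)
    # Hypothetical second sample taken 10 seconds later, for illustration only.
    second = dict(first, al=first["al"] + 100)
    print(al_rate(first, second, 10))  # 10.0
    print(hot_area_mib(521))           # 2084
```

Under that assumption, 521 al-extents cover roughly 2 GiB of the 1.4 TB
device. In practice the two samples would come from reading /proc/drbd
twice with a known delay in between.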
Sar output of the last few hours:
             CPU   %user   %nice  %system  %iowait   %idle
02:50:01 PM  all    0.77    0.00     4.23    17.76   77.23
03:00:04 PM  all    0.87    0.00     4.61    17.17   77.35
03:10:01 PM  all    0.91    0.00     5.78    26.25   67.06
03:20:01 PM  all    0.79    0.00     5.56    38.75   54.89
03:30:01 PM  all    0.79    0.00     5.48    29.48   64.25
03:40:02 PM  all    0.77    0.00     5.65    41.55   52.03
03:50:01 PM  all    0.80    0.00     8.16    78.03   13.01
04:00:02 PM  all    1.14    0.00     8.51    86.01    4.34
04:10:02 PM  all    1.80    0.00     9.80    84.04    4.36
04:20:02 PM  all    2.67    0.00    12.40    81.18    3.75
04:30:01 PM  all    2.54    0.00    14.15    79.68    3.63
04:40:01 PM  all    2.34    0.00    10.72    84.21    2.73
04:50:01 PM  all    2.36    0.00    10.96    81.10    5.59
05:00:01 PM  all    2.36    0.00    10.84    82.69    4.10
Starting at 03:45 the loadavg went from 15 to 60.