Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
I have several successful DRBD clusters in production, including two RHEL 5.3 servers running drbd 8.3.2. They have been running fine for more than a year.

Today we saw very high iowait (99%) on the primary node (possibly on the secondary too, but I neglected to look) and users could not work. We tried to find the source of the iowait but could not, and ended up rebooting the primary. The standby took over fine and began a resync, but it was running very slowly, at about 8K per second. Resyncing the 2TB volume was going to take 11+ hours. So I did...

    drbdsetup /dev/drbd0 -r 300M

The command took more than a minute to return to the shell prompt. Now when I cat /proc/drbd, I see the speed start at ~90K, then drop to ~80, ~70, ~60, and so on, until it reaches 0 and reports "stalled." After 10-30 seconds in the stalled state, it kicks off again at about 90K, then slowly drops back down and stalls again.

Both servers are on the same GigE switch. Running iperf shows that we're getting almost the full gigabit per second on the replication link.

Any ideas how I can troubleshoot this before users come in tomorrow morning?

--
Eric Robinson
February 16, 2011
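One way to put numbers on the ramp-down-and-stall pattern described above is to sample /proc/drbd once a second and log the reported resync speed with a timestamp. The following is only a rough sketch, not an official DRBD tool. It assumes the status line looks like "finish: H:MM:SS speed: 90 (84) K/sec" and that a stall is shown with the word "stalled"; the exact layout differs between DRBD versions, so adjust the pattern to whatever your /proc/drbd actually prints. The names parse_sync_speed and watch are hypothetical helpers made up for this illustration.

```python
import re
import time

# ASSUMPTION: the resync status line in /proc/drbd resembles
#   "finish: 11:02:13 speed: 90 (84) K/sec"
# where the first number is the current speed and the second the average.
# The layout varies between DRBD versions; adjust the pattern as needed.
SPEED_RE = re.compile(r"speed:\s*([\d,]+)\s*\(([\d,]+)\)\s*K/sec")


def parse_sync_speed(proc_text):
    """Return (current, average) speed in K/sec, (0, 0) if the text
    reports a stall, or None if no resync speed is present."""
    if "stalled" in proc_text:
        return (0, 0)
    match = SPEED_RE.search(proc_text)
    if match is None:
        return None
    current, average = (int(g.replace(",", "")) for g in match.groups())
    return (current, average)


def watch(path="/proc/drbd", interval=1.0, samples=60):
    """Print one timestamped speed sample per interval (hypothetical helper).

    Only the first resource reporting a speed is picked up.
    """
    for _ in range(samples):
        with open(path) as f:
            speed = parse_sync_speed(f.read())
        print(time.strftime("%H:%M:%S"), speed)
        time.sleep(interval)
```

Running watch() on the node that is resyncing, alongside something like iostat on both nodes, would show whether each stall lines up with a burst of disk iowait or with the replication link going quiet.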