From: "Balreddy Medipally" <balreddy.m at imimobile.com>

> I couldn't found anything like "sda: Uncorrectable error" in
> dmesg and I am sure that there is no faulty HDD in our servers.

No filesystem or disk errors anywhere in the dmesg output from either
box? OK. That means other stuff is going on, but solving that will be
more annoying than replacing disks.

> Actually we are using DRBD for MySQL high availability, so when
> dropping some big table like 50GB or bigger [then some] applications
> are not able to access [the] MySQL DB [while the table's being
> dropped].

Dropping a table that big entails a lot of disk I/O. There's not a
whole lot you can do about that. Do you see the same problems on a
non-DRBD system? That's something I'd check if I could.

> resource drbd0 {
>   protocol C;
>   startup {
>     become-primary-on both;
>   }

Dual-primary is important; you should've mentioned that in the first
message. Which filesystem are you using? What are the mount options?
Because GFS/OCFS2 have to maintain consistency and make sure all writes
are replicated to both nodes and manage locking, they're going to
perform worse than ext3 in a primary/secondary setup. If performance is
that critical, you'd probably be better off using primary/secondary
than dual-primary. If you're unable to do that, switching to protocol A
would probably make stuff a bit faster.

We've had a primary/secondary setup using protocol A for over a year.
This setup's had two unplanned failovers due to equipment failures, and
we haven't lost any data.

> net {
>   timeout 60;
>   max-epoch-size 8000;
>   max-buffers 8000;
>   unplug-watermark 128;
>   connect-int 10;
>   ping-int 10;
>   sndbuf-size 1024k;
>   ko-count 180;
>   ping-timeout 5;
>   allow-two-primaries;

Probably OK.

> syncer {
>   rate 200M;
>   al-extents 3389;

Can the link between the two nodes actually handle 200M/sec? If not,
set this lower. Setting the sync rate too high can actually slow things
down, though it shouldn't cause the problems you reported.
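A rough way to sanity-check that "rate" value is sketched below in
Python. It is only back-of-the-envelope arithmetic, not anything DRBD
ships: the function names are made up for illustration, the link and
disk figures in the example are placeholders rather than measurements
from your boxes, and the 0.3 fraction is the commonly quoted rule of
thumb (roughly 30% of the available replication bandwidth), so treat it
as an assumption too. Keep in mind that the syncer rate is given in
bytes per second, not bits.

```python
def link_capacity_mbytes(link_mbit_per_s: float) -> float:
    """Raw link capacity in megabytes/s (8 bits per byte, overhead ignored)."""
    return link_mbit_per_s / 8.0


def suggested_sync_rate(link_mbit_per_s: float,
                        disk_mbytes_per_s: float,
                        fraction: float = 0.3) -> float:
    """Suggest a syncer rate in megabytes/s.

    The slower of the network link and the disks is the bottleneck;
    resync gets only `fraction` of it so normal replication traffic
    still has room. The default fraction is an assumed rule of thumb.
    """
    bottleneck = min(link_capacity_mbytes(link_mbit_per_s), disk_mbytes_per_s)
    return bottleneck * fraction


if __name__ == "__main__":
    # Placeholder numbers: a single gigabit link and disks that can
    # sustain about 200 megabytes/s.
    link = 1000.0
    disk = 200.0
    print("link capacity: %.1f MB/s" % link_capacity_mbytes(link))
    print("suggested rate: %.1f MB/s" % suggested_sync_rate(link, disk))
```

With numbers like those, a single gigabit link tops out well below
200M, so whether the configured rate is sensible depends on what the
replication link really is.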
>   no-disk-barrier;
>   no-disk-flushes;
>   no-md-flushes;
>   on-io-error detach;

This should be OK too, provided the disks have battery-backed cache and
so forth.

-- 
Matt G / Dances With Crows
The Crow202 Blog: http://crow202.org/wordpress/
There is no Darkness in Eternity/But only Light too dim for us to see