On Thu, May 26, 2011 at 06:51:45PM +0200, Thilo Uttendorfer wrote:
> Hi all,
>
> we have on one of our systems a strange DRBD problem. One DRBD resource has
> lots of "bio_add_page failed" errors, which led to constant resyncs for about one
> hour yesterday. Today the "bio_add_page failed" errors occurred again. All other (10)
> resources (same configuration) don't have this problem. On top of the DRBD device
> runs a KVM guest, the backing device is an LVM partition.
>
> host-z1 is primary, host-z2 is secondary. kern.log on host-z2 looks like this:
>
> May 25 18:00:16 host-z2 kernel: [77255.104705] block drbd8: alloc_ee: bio_add_page(s=13979095,data_size=8192,ds=4096) failed

You are using the "use-bmbv" setting?
Either upgrade your DRBD (newer than 8.3.7 allows us to fall back to
multiple bios in that case), or disable that setting.

If you do not have use-bmbv enabled, then this would be unexpected,
and I'd request some more details about your setup.

> May 25 18:00:16 host-z2 kernel: [77255.104822] block drbd8: merge_bvec_fn() = 0
> May 25 18:00:16 host-z2 kernel: [77255.104931] block drbd8: bio->bi_max_vecs = 4
> May 25 18:00:16 host-z2 kernel: [77255.105011] block drbd8: bio->bi_vcnt = 1
> May 25 18:00:16 host-z2 kernel: [77255.105097] block drbd8: bio->bi_size = 4096
> May 25 18:00:16 host-z2 kernel: [77255.105175] block drbd8: bio->bi_phys_segments = 1
> May 25 18:00:16 host-z2 kernel: [77255.105261] block drbd8: error receiving Data, l: 8216!
> May 25 18:00:16 host-z2 kernel: [77255.105350] block drbd8: peer( Primary -> Unknown ) conn( Connected -> ProtocolError ) pdsk( UpToDate )
> May 25 18:00:16 host-z2 kernel: [77255.105382] block drbd8: asender terminated
> May 25 18:00:16 host-z2 kernel: [77255.105392] block drbd8: Terminating drbd8_asender
> May 25 18:00:16 host-z2 kernel: [77255.106038] block drbd8: Connection closed
> May 25 18:00:16 host-z2 kernel: [77255.106096] block drbd8: conn( ProtocolError -> Unconnected )
> May 25 18:00:16 host-z2 kernel: [77255.106110] block drbd8: receiver terminated
> May 25 18:00:16 host-z2 kernel: [77255.106113] block drbd8: Restarting drbd8_receiver
> May 25 18:00:16 host-z2 kernel: [77255.106117] block drbd8: receiver (re)started
> May 25 18:00:16 host-z2 kernel: [77255.106124] block drbd8: conn( Unconnected -> WFConnection )
> ... etc...
>
> DRBD version: 8.3.7
> Kernel 2.6.34.6 (64bit)
>
>
> Any help would be much appreciated!
> Thank you,
>
> Thilo
>
>
> --
> Thilo Uttendorfer
> Linux Information Systems AG
> Putzbrunner Str. 71, 81739 München
>
> Fon: +49 89 993412-11, Fax: +49 89 993412-99
> t.uttendorfer at linux-ag.com, http://www.linux-ag.com

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
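
For anyone following up on the two checks suggested above (is use-bmbv set, and which resources are affected), here is a minimal Python sketch. It is not part of DRBD or its tools. The function names are hypothetical, the regular expression only assumes the alloc_ee line format quoted in the kern.log excerpt, and the configuration check is a naive text search (with "#" comments stripped), not a real drbd.conf parser.

```python
import re
from collections import Counter

# Matches the alloc_ee failure line as it appears in the quoted kern.log
# excerpt. Assumption: other DRBD versions may word this message differently.
ALLOC_EE_FAILED = re.compile(
    r"block (?P<dev>drbd\d+): alloc_ee: bio_add_page\("
    r"s=(?P<sector>\d+),data_size=(?P<data_size>\d+),ds=(?P<ds>\d+)\) failed"
)


def count_bio_add_page_failures(lines):
    """Count bio_add_page failures per DRBD device (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = ALLOC_EE_FAILED.search(line)
        if match:
            counts[match.group("dev")] += 1
    return counts


def config_enables_use_bmbv(config_text):
    """Naive check: does 'use-bmbv' appear outside of '#' comments?

    Assumption: comments run from '#' to the end of the line. This does not
    check which resource or section the option belongs to.
    """
    for raw_line in config_text.splitlines():
        line = raw_line.split("#", 1)[0]  # drop the comment part, if any
        if re.search(r"\buse-bmbv\b", line):
            return True
    return False
```

Feeding the lines of kern.log to count_bio_add_page_failures() shows whether the failures are limited to one device (drbd8 in the log above), and passing the text of the DRBD configuration to config_enables_use_bmbv() gives a first hint whether the setting is enabled anywhere. A positive hit still has to be confirmed by reading the resource's disk section by hand.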