Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi,

I was doing a lot of performance testing on my drbd setup with various configurations and ran into some trouble in the form of these:

  Oct 20 13:25:19 eros kernel: bio too big device drbd36 (48 > 8)
  Oct 20 13:25:19 eros kernel: bio too big device drbd36 (56 > 8)
  Oct 20 13:25:19 eros kernel: bio too big device drbd36 (48 > 8)
  Oct 20 13:25:19 eros kernel: bio too big device drbd36 (24 > 8)
  Oct 20 13:25:19 eros kernel: bio too big device drbd36 (56 > 8)

The basic setup is Xen VMs on top of drbd over LVM over iSCSI (Xen hosts: CentOS 5.2, drbd-8.0.13, 2.6.18-92.1.10.el5xen; iSCSI servers: CentOS 5.2, iSCSI Enterprise Target).

I was testing inserting an md RAID0 layer between LVM and the iSCSI devices, and the first error came when doing a pvmove of a drbd backing device from an iSCSI device to the md stripe. Fine, I thought, and instead detached the device and moved it while detached. Resynced, restarted the Xen VM, and all was fine (and I noted max_segment_size got set to 4096).

Then I tested pvmoving the (again detached) device back to the direct iSCSI device and reattached. Now the "bio too big" errors would come on any access to the device (while max_segment_size got set to 32768). Detaching, removing and recreating the backing LV didn't solve the issue. Bringing the drbd device down and up (on both nodes) didn't solve it either. Eventually I had to shut down drbd completely on the node in question and restart it from scratch, which made the device accessible from the Xen VM again.

I'm not sure exactly what was happening, but it appears as if some layer was sticking to the 4096 max_segment_size while something above it no longer thought that was the case. It's a bit hard to trace, as there seems to be no simple way to get the max_segment_size out of the various devices.

Obviously, having read up on the whole merge_bvec business, I shouldn't exactly be surprised that a pvmove between two devices may result in, er, odd issues, but a detach/reattach to devices with different configurations should work without having to restart drbd, right?

Any suggestions? Should I just expect it to behave this way? Should stacking on top of md be avoided? Should I try the --use-bmbv option? I'm a bit reluctant to retrigger the situation too many times (restarting the whole of drbd on the node in question stops a few things) before I can get some input on whether this can potentially corrupt data, whether it is a bug, or whether it was just an anomaly of my configuration or a side effect of the pvmove that got something stuck.

Best regards,
David
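P.S. For reference, the detach/move/reattach sequence was roughly the following (the resource, LV and device names here are illustrative, not the exact ones from my setup; /dev/md0 had already been added to the volume group with pvcreate/vgextend beforehand):

  # detach the backing device from the drbd resource
  drbdadm detach r36

  # move only that LV's extents from the iSCSI PV onto the md stripe
  pvmove -n backing_lv /dev/sdb /dev/md0

  # reattach so drbd re-reads the backing device's limits
  drbdadm attach r36

  # and later, the move back that triggered the persistent errors
  drbdadm detach r36
  pvmove -n backing_lv /dev/md0 /dev/sdb
  drbdadm attach r36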
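P.P.S. Regarding --use-bmbv: if I were to try it, my understanding of drbd.conf is that it goes in the disk section, something like the sketch below (untested on my side, names again illustrative); as far as I understand, it is only considered safe when the backing devices on both nodes have identical IO limits.

  resource r36 {
    disk {
      use-bmbv;  # let drbd pass bigger bios via the backing device's
                 # merge_bvec_fn instead of capping at one 4 KiB page
    }
    # ... rest of the resource configuration unchanged ...
  }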