<html><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div><font class="Apple-style-span" face="'Lucida Grande'"><div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; ">Thanks for your response Lars.</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; ">With some outside assistance we were able to single out the issue we encountered. </div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; ">The problem stemmed from the "secondary" machine not being properly provisioned. Looking at the two partition tables one can clearly see that there was a slight size difference between the partitions on the primary and the partitions on the secondary. This left the partition for /dev/drbd2 on the secondary machine (nfs2 in the config) *smaller* than the /dev/drbd2 partition on the primary machine. 
Upon noticing problematic behavior with our application, we stopped the sync by shutting down DRBD on the secondary machine.</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; ">What caused so much heartache afterwards was that, for some reason, DRBD resized the partition for /dev/drbd2: </div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal 'Lucida Grande'; ">Mar 5 14:37:19 nfs2 kernel: drbd2: drbd_bm_resize called with capacity == 3421310910</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal 'Lucida Grande'; min-height: 15px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal 'Lucida Grande'; ">This led to EXT3 trying to access data that no longer existed on that device:</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal 'Lucida Grande'; min-height: 15px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal 'Lucida Grande'; ">Mar 5 14:37:34 nfs2 kernel: attempt to access beyond end of device</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal 'Lucida Grande'; min-height: 15px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; 
margin-left: 0px; font: normal normal normal 12px/normal Helvetica; ">If I understand correctly, because the DRBD device beneath LVM was resized, LVM freaked out. LVM was attempting to map the VG over the DRBD devices, but could not, because the underlying devices were smaller than expected. Hence this message:</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; ">Mar 5 18:56:26 nfs2 kernel: device-mapper: table: device 147:2 too small for target</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; ">At this point the gentleman called in to assist us with the repair was able to dump the MD (metadata) of the device and adjust the "la-size-sect" value by hand. </div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; ">Once the DRBD device was made to match what LVM expected, we were able to bring up the VG. 
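<div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; ">For anyone reading along, the arithmetic behind the "attempt to access beyond end of device" messages can be sketched roughly as follows. This is only an illustration in Python, assuming the kernel reports "want" and "limit" in 512-byte sectors; the function names are made up for the example and are not part of DRBD or LVM. The sector counts are the ones from our logs.</div>
<pre>
```python
# Sketch only: sector counts are copied from the kernel log lines in this thread.
# Assumption: "want" and "limit" are counted in 512-byte sectors.
SECTOR_BYTES = 512


def is_beyond_end(want: int, limit: int) -> bool:
    """True when a request ending at sector `want` does not fit on a device of `limit` sectors."""
    return want > limit


def shortfall_sectors(want: int, limit: int) -> int:
    """Number of sectors by which the request overshoots the device (0 if it fits)."""
    return max(0, want - limit)


def sectors_to_gib(sectors: int) -> float:
    """Convert a sector count to GiB under the 512-byte sector assumption."""
    return sectors * SECTOR_BYTES / 2**30


if __name__ == "__main__":
    limit = 3421310910  # capacity reported by drbd_bm_resize on nfs2
    for want in (3434534208, 3434534216, 3434534224):
        missing = shortfall_sectors(want, limit)
        print(
            f"want={want} limit={limit} beyond_end={is_beyond_end(want, limit)} "
            f"short by {missing} sectors (~{sectors_to_gib(missing):.2f} GiB)"
        )
```
</pre>
<div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; ">With the values from our logs, the highest request ends 13223314 sectors (roughly 6.3 GiB) past the end of the shrunken device, so the data EXT3 and LVM expected extended at least that far beyond what DRBD was presenting.</div>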
</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; ">While it is clear the root cause was non-identical partitions, I am amazed that DRBD made the decision to resize the partition instead of throwing an error message and stopping the sync process. In our investigation we did find code that seems to be designed to prevent this, though we are not sure it is in the correct code path:</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 11px/normal 'Lucida Grande'; background-color: rgb(226, 219, 203); ">// Never shrink a device with usable data.</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 11px/normal 'Lucida Grande'; background-color: rgb(226, 219, 203); "> if(drbd_new_dev_size(mdev,mdev->bc) &lt;</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 11px/normal 'Lucida Grande'; background-color: rgb(226, 219, 203); "> drbd_get_capacity(mdev->this_bdev) &amp;&amp;</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 11px/normal 'Lucida Grande'; background-color: rgb(226, 219, 203); "> mdev->state.disk >= Outdated ) {</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 11px/normal 'Lucida Grande'; background-color: rgb(226, 219, 203); "> dec_local(mdev);</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; 
font: normal normal normal 11px/normal 'Lucida Grande'; background-color: rgb(226, 219, 203); "> ERR("The peer's disk size is too small!\n");</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 11px/normal 'Lucida Grande'; background-color: rgb(226, 219, 203); "> drbd_force_state(mdev,NS(conn,Disconnecting));</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 11px/normal 'Lucida Grande'; background-color: rgb(226, 219, 203); "> mdev->bc->dc.disk_size = my_usize;</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 11px/normal 'Lucida Grande'; background-color: rgb(226, 219, 203); "> return FALSE;</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 11px/normal 'Lucida Grande'; background-color: rgb(226, 219, 203); "> }</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 11px/normal 'Lucida Grande'; background-color: rgb(226, 219, 203); "> dec_local(mdev);</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal 'Lucida Grande'; min-height: 15px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; ">At any rate, we dodged a bullet this time, and while we did have quite a scare, I still believe DRBD has a place in our infrastructure. 
</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; ">Any additional comments or insights are welcome.</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><br></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; ">Tyler</div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; "><br class="webkit-block-placeholder"></div></div></font></div></div><div>On Mar 6, 2008, at 1:51 AM, Lars Ellenberg wrote:</div><div><div><div><br class="Apple-interchange-newline"><blockquote type="cite">On Wed, Mar 05, 2008 at 05:52:31PM -0800, Tyler Seaton wrote:<br><blockquote type="cite">Hey Guys,<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">I have a pretty bad situation on my hands.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">We had a node configured running DRBD 8.0.6. The goal was to keep this <br></blockquote><blockquote type="cite">running in standalone mode until we provisioned a matching machine. We <br></blockquote><blockquote type="cite">purchased the matching machine and finally had it fully configured today. I <br></blockquote><blockquote type="cite">kicked off the initial sync, and had hoped that we would have both machines <br></blockquote><blockquote type="cite">in sync within a day or two.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">This was unfortunately not the case. 
When I kicked off the sync all seemed <br></blockquote><blockquote type="cite">well; however, our application quickly began throwing errors as the primary <br></blockquote><blockquote type="cite">node became read-only. I quickly shut off drbd on the secondary node and <br></blockquote><blockquote type="cite">attempted to return the original configuration to the primary server. Sadly <br></blockquote><blockquote type="cite">no amount of backpedaling has helped us. We are currently dead in the <br></blockquote><blockquote type="cite">water.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">DRBD was configured on the primary node with LVM. We have/had 3 resources <br></blockquote><blockquote type="cite">configured, the first 2 being 2TB in size and the 3rd being 1.4-5TB in size. <br></blockquote><blockquote type="cite">Since stopping the initial sync I have not been able to mount the LVM Volume <br></blockquote><blockquote type="cite">Group that sits above the three resources. 
NOTE: the SDB devices on nfs2 <br></blockquote><blockquote type="cite">are numbered differently.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">/var/log/messages was giving the following messages:<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Mar 5 14:38:35 nfs2 kernel: drbd2: rw=0, want=3434534208, limit=3421310910<br></blockquote><blockquote type="cite">Mar 5 14:38:35 nfs2 kernel: attempt to access beyond end of device<br></blockquote><blockquote type="cite">Mar 5 14:38:35 nfs2 kernel: drbd2: rw=0, want=3434534216, limit=3421310910<br></blockquote><blockquote type="cite">Mar 5 14:38:35 nfs2 kernel: attempt to access beyond end of device<br></blockquote><blockquote type="cite">Mar 5 14:38:35 nfs2 kernel: drbd2: rw=0, want=3434534224, limit=3421310910<br></blockquote><blockquote type="cite">Mar 5 14:38:35 nfs2 kernel: attempt to access beyond end of device<br></blockquote><br>please provide FROM BOTH NODES output of<br># drbdadm -d attach all<br># sfdisk -d /dev/sdb<br># grep -e drbd -e sdb /proc/partitions<br><br>-- <br>: Lars Ellenberg <a href="http://www.linbit.com">http://www.linbit.com</a> :<br>: DRBD/HA support and consulting sales at linbit.com :<br>: LINBIT Information Technologies GmbH Tel +43-1-8178292-0 :<br>: Vivenotgasse 48, A-1120 Vienna/Europe Fax +43-1-8178292-82 :<br></blockquote></div><br></div></div></body></html>