Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Lars,

Thanks for the information. What will happen to anything trying to write
to the resource when I run suspend-io? Will it simply hang until resume-io
is run?

Thanks!

-----Original Message-----
From: drbd-user-bounces at lists.linbit.com
[mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Lars Ellenberg
Sent: Thursday, May 07, 2009 12:45 AM
To: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] Resizing DRBD/LVM, stuck in WFSyncUUID

On Wed, May 06, 2009 at 04:22:35PM -0700, Mike Sweetser - Adhost wrote:
> Hello:
>
> I'm doing some testing of resizing an LVM-backed DRBD device. I've
> successfully resized the logical volume, and then resized the DRBD
> device via drbdadm (the resource is named part3):
>
>     drbdadm resize part3
>
> I see the following on the Primary server, and everything is OK:
>
> May 6 16:15:19 SERVER1 kernel: drbd4: drbd_bm_resize called with capacity == 31457280
> May 6 16:15:19 SERVER1 kernel: drbd4: resync bitmap: bits=3932160 words=122880
> May 6 16:15:19 SERVER1 kernel: drbd4: size = 15 GB (15728640 KB)
> May 6 16:15:42 SERVER1 kernel: drbd4: Writing the whole bitmap, size changed
>
> However, I see this on the Secondary server, and it's stuck in
> WFSyncUUID:
>
> May 6 16:15:18 SERVER2 kernel: drbd4: drbd_bm_resize called with capacity == 31457280
> May 6 16:15:18 SERVER2 kernel: drbd4: resync bitmap: bits=3932160 words=122880
> May 6 16:15:18 SERVER2 kernel: drbd4: size = 15 GB (15728640 KB)
> May 6 16:15:18 SERVER2 kernel: drbd4: Writing the whole bitmap, size changed
> May 6 16:15:19 SERVER2 kernel: drbd4: writing of bitmap took 1376 jiffies
> May 6 16:15:19 SERVER2 kernel: drbd4: 10 GB (2621440 bits) marked out-of-sync by on disk bit-map.
> May 6 16:15:19 SERVER2 kernel: drbd4: Writing meta data super block now.
> May 6 16:15:19 SERVER2 kernel: drbd4: No resync, but 2621440 bits in bitmap!
> May 6 16:15:19 SERVER2 kernel: drbd4: bm_set was 2621440, corrected to 2621472. /usr/local/src/drbd-8.2.6/drbd/drbd_receiver.c:2144
> May 6 16:15:19 SERVER2 kernel: drbd4: Resync of new storage after online grow
> May 6 16:15:19 SERVER2 kernel: drbd4: conn( Connected -> WFSyncUUID )
>
> Seven minutes later, it's still in WFSyncUUID on the Secondary.
>
> Am I missing a step? Is something possibly configured wrong on my end?
> Help? :)

There has been an unlikely but possible "wait forever" condition in some
versions of DRBD if the connection or resync handshake happens while
there is IO in flight.

To get out of WFSync*: try drbdadm disconnect, then reconnect. If the
disconnect does not work, cut the TCP connection by other means (e.g. an
iptables reject rule, or ifdown).

Workaround to make the race impossible:

    drbdadm suspend-io
    do-the-interesting-stuff-here
    drbdadm resume-io

Fix: upgrade to 8.3 ;)

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD(r) and LINBIT(r) are registered trademarks of LINBIT, Austria.

__
please don't Cc me, but send to list -- I'm subscribed
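In concrete terms, the recovery Lars describes might look like the
following sketch. The resource name part3 comes from the thread; the
iptables rule assumes DRBD's conventional port 7788, which may differ
from what the resource section in your drbd.conf actually configures.

    # Try a clean disconnect/reconnect cycle first:
    drbdadm disconnect part3
    drbdadm connect part3

    # If the disconnect itself hangs, cut the TCP connection underneath
    # DRBD instead (port 7788 is an assumption -- check your drbd.conf):
    iptables -I INPUT -p tcp --dport 7788 -j REJECT
    # ...wait for DRBD to notice the broken connection, then remove the rule:
    iptables -D INPUT -p tcp --dport 7788 -j REJECT
    # DRBD will normally retry the connection on its own once the rule
    # is gone.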
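And the race-free version of the whole online grow, wrapping Lars's
suspend-io/resume-io around the resize, might look roughly like this.
Again, part3 is from the thread; the LV path /dev/vg0/part3 and the +5G
extension are made up for illustration, and the comment on blocking
writers reflects the usual understanding of suspend-io, which is what
the question at the top of this message asks about.

    # On BOTH nodes: grow the backing logical volume first.
    lvextend -L +5G /dev/vg0/part3

    # On the Primary: freeze IO, adopt the new size, unfreeze.
    drbdadm suspend-io part3   # writers block here until resume-io
    drbdadm resize part3       # DRBD picks up the grown backing device
    drbdadm resume-io part3    # IO resumes; resync of the new area follows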