On Thu, May 28, 2009 at 10:16:07PM +0400, Maxim Vladimirsky wrote:
> On Thu, May 28, 2009 at 12:22 PM, Lars Ellenberg wrote:
...
> > I am sure they do _log_ why they are going StandAlone.
> > Maybe you want to let us know?
...
> > My guess is that somehow your init script start/stop order got screwed up.
...
>
> Hi Lars,
>
> Please find message logs from both nodes attached (drbd-8.3.1.zip). As you
> can see in there, node1 (primary) was rebooted (I did that myself to check
> how the cluster reacts) but on boot-up it went into StandAlone mode instead of
> Secondary. Just for comparison I attached logs from a cluster running DRBD
> 0.7.24; as you can see, in the same situation the rebooted primary node comes up
> as secondary - this is the behavior I expect from a DRBD 8.3.1 cluster.

No Sir.
I did not mean you to post thousands of log lines.

Going through thousands of lines of log files is a service we provide
to paying customers. To those we also provide help integrating DRBD
into whatever solution they have.

wanna become a paying customer? ;)

I meant you to look into those logs, and find out why they go StandAlone.

But, again: the hint was init script start/stop order.
you probably stop the network/down the replication link before you
umount and make it secondary.
which results in a split brain situation, and (potentially) diverging datasets.
which is detected upon the next handshake once the communication is restored.

drbd 0.7 did not have any such detection.

check out the User's Guide, read about split brain.

to avoid a home grown split brain and data set divergence on every
orderly reboot, you need to first stop services, then umount, then
make drbd secondary. and _then_ you may take down your communications.

> In our case DRBD is started and to some degree controlled (though
> monitored would probably be the more correct term) by drbd_controller -
> our proprietary application.

neat.
then it is probably not even init script ordering,
but just a broken home grown "controller" application.
I guess you have to fix it.

> Its logs can be seen in the message log. But it has nothing to do
> with DRBD going StandAlone

I disagree.

> for drbd_controller just started DRBD and it came up StandAlone before
> drbd_controller had a chance to do something wise.

yeah sure ;)
but that is only the _reaction_, not the cause.
the cause is your "controller" doing stupid things
(or not enough, or the wrong things, or the right things in the wrong order)
on reboot.

(only my educated guess, of course)

> The intention of setting such short timeouts was to make sure that
> drbdadm/drbdsetup does not block drbd_controller for a long time. It is
> drbd_controller's responsibility to poll DRBD state on a regular basis.
>
> Kind regards, Maxim Vladimirsky

hth.

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
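
For illustration only, the orderly shutdown sequence described in this message (stop services, umount, make DRBD secondary, and only then take down the replication link) could be sketched roughly as below. This is a minimal sketch, not anything from the original thread: the function names, the service name "myapp", the mount point "/mnt/data", the resource name "r0" and the interface "eth1" are all hypothetical, and the use of `service ... stop` and `ip link set ... down` is an assumption about the host's tooling. It defaults to a dry run that only prints the commands.

```python
import subprocess


def orderly_shutdown_plan(services, mountpoint, resource, interface):
    """Build the shutdown commands in the order the message recommends.

    All arguments are hypothetical placeholders for a real setup.
    """
    # 1. Stop every service that uses the DRBD-backed file system.
    plan = [["service", name, "stop"] for name in services]
    # 2. Unmount the file system that sits on the DRBD device.
    plan.append(["umount", mountpoint])
    # 3. Demote the resource while the peer is still reachable.
    plan.append(["drbdadm", "secondary", resource])
    # 4. Only now take down the replication link.
    plan.append(["ip", "link", "set", interface, "down"])
    return plan


def run_plan(plan, dry_run=True):
    """Print the commands (dry run) or execute them, stopping on first failure."""
    for cmd in plan:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run with made-up names; nothing is executed.
    run_plan(orderly_shutdown_plan(["myapp"], "/mnt/data", "r0", "eth1"))
```

The point of keeping the plan as an ordered list is that the network step can never run before the demotion step; with `check=True`, a failed umount or demotion also stops the sequence before the link is taken down.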