At 04:25 2007-09-27, Lars Ellenberg wrote:
>On Wed, Sep 26, 2007 at 10:35:36AM -0300, Luis F. V. Gomes wrote:
> >
> > Ok. I will try to make a script to detect this situation and use
> > drbdadm to force a new primary node. Later I can manually restore the
> > Maildirs using scp.
>
>how do you propose to detect "this" situation?

Well, let's say if the drbd partition is considered outdated for more
than 15 minutes and the "fake" server IP is down, I would consider that
the primary node is down and the outdated node is the only one. As you
said, it is better to have an outdated mail server than to have no mail
server.

>you really want to *automatically* override "outdated" state?
>
>then, what is the point to have it marked as "outdated"
>in the first place?

That's the point! I really do not want to override the outdated state.
I need one of these situations (of course we are talking about a single
point of failure):

a) Never to get an outdated state due to a blackout. So, if one node
does not wake up, the other will start the services. Maybe this
approach solves the problem, but since the nodes would never shut down
at the same time even if they are on the same UPS, is there a state in
drbd that could be set by the shutdown script that means both nodes
were ok and synchronized when turned off?

or

b) Even if outdated, what I really want is the fake server IP to be
active. I would consider using an alternative spool directory while the
primary node is not fixed or declared lost. I would not care if the
drbd partition were not accessible. Or better, I would WANT it
inaccessible until manual intervention.

>thank you for sharing your thoughts and proposals on this.

Ok. I just hope it's not "off-topic". Maybe this can be solved using
some unusual Heartbeat configuration for option "b".

Thanks

> > At 19:08 2007-09-25, Lars Ellenberg wrote:
> >
> > >On Tue, Sep 25, 2007 at 02:56:36PM -0300, Luis F. V. Gomes wrote:
> > >> Hi all
> > >>
> > >> I think this may have happened to one of yours but I could not find
> > >> any hints to solve this problem automatically in the archives:
> > >> Due to a blackout, node 1 shuts down first and becomes outdated (DRBD
> > >> 8); node 2 acquires its resources and becomes primary for a few
> > >> minutes until it shuts down too.
> > >> After power comes back, node 2 does not boot due to a hardware
> > >> transient problem (the disk is intact). Node 1 boots but refuses to
> > >> be primary because it is outdated and my clustered services (http and
> > >> email servers) are a joke.
> > >
> > >to force a non-up2date (local data is outdated or inconsistent) drbd
> > >to become Primary anyways, because you prefer to be online with
> > >potentially stale data than to be completely offline,
> > >        drbdadm -- --overwrite-data-of-peer primary all
> > >
> > >but, yes, then you really forced drbd into UGLY mode,
> > >diverging data sets and all, once you connect both nodes again
> > >you will see some infamous "split brain detected" message,
> > >and you'll have fun to sort out the mess.
> > >
> > >provided that all important data on those boxes is actually
> > >Maildirs, it may even be possible to merge it easily, though.
> > >
> > >depending on exactly how "old" that "outdated" is,
> > >this may be an option unless you prefer to fix the other node.
> > >
> > >> Is there any configuration workaround for this situation?
> > >>
> > >> I just want the SMTP and POP services to wake up in node 1 (probably
> > >> using an alternative temporary spool directory) and later manually
> > >> resynchronize the data (maildir and queues) after fixing node 2.
> > >>
> > >> RedHat EL 5
> > >> Heartbeat 2.0.8.3
> > >> drbd.x86_64 8.0.3-1.el5.centos
> > >> dovecot.x86_64 1.0-1.2.rc15.el5
> > >> exim.x86_64 4.63-3.el5
> > >
> > >good luck!
> > >
> > >--
> > >: Lars Ellenberg                            Tel +43-1-8178292-55 :
> > >: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
> > >: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
> >

                         ! ,,, !
                          (@ @)
+--------------------oOO-/-(*)-\-OOo---+----------------------------------+
| Luis Fernando V. Gomes               | Email: lf at ele.puc-rio.br      |
| Dept. Engenharia Eletrica            |                                  |
| PUC-Rio                              | Voz: +(55) (21) 3527-1220        |
| R. Marques de Sao Vicente 225/401L   | Fax: +(55) (21) 3527-1232        |
| 22451-900 - Rio de Janeiro/RJ, Brasil|                                  |
+--------------------------------------+----------------------------------+
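
For illustration, the detection rule proposed in the message above (partition outdated for more than 15 minutes and the "fake" service IP down) could be sketched as a small decision function. This is only a sketch under stated assumptions: the function name, its parameters, and the way the caller learns the outdated timestamp and the reachability of the service IP are hypothetical and are not part of DRBD or Heartbeat. The drbdadm command is the one quoted from Lars's reply; the sketch only returns the decision and never runs the command.

```python
from datetime import datetime, timedelta
from typing import Optional

# Command quoted from Lars Ellenberg's reply. It is listed here for
# reference only; this sketch never executes it.
FORCE_PRIMARY_CMD = ["drbdadm", "--", "--overwrite-data-of-peer", "primary", "all"]

# Threshold taken from the message: "outdated for more than 15 minutes".
GRACE = timedelta(minutes=15)


def should_force_primary(
    outdated_since: Optional[datetime],
    service_ip_up: bool,
    now: datetime,
    grace: timedelta = GRACE,
) -> bool:
    """Return True when the proposed rule says the peer should be treated as lost.

    outdated_since: when the local resource was first seen as outdated,
                    or None if it is not outdated (hypothetical input; how
                    to obtain it is left to the caller).
    service_ip_up:  whether the shared "fake" service IP still answers
                    (hypothetical input, e.g. from a ping check).
    """
    if outdated_since is None:
        return False  # not outdated, nothing to override
    if service_ip_up:
        return False  # someone still serves the IP, the peer may be alive
    return now - outdated_since > grace  # strictly more than the grace period
```

Even when this returns True, forcing the outdated node to primary still leads to the diverging data sets and "split brain detected" cleanup that Lars describes, so a script built on it would fit option "b" better if it only brought up the service IP with an alternative spool directory.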