<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//EN">
<HTML>
<HEAD>
<META NAME="GENERATOR" CONTENT="GtkHTML/3.12.0">
</HEAD>
<BODY>
Hi Lars,<BR>
It turns out I had to take down drbd on the secondary and completely resync the partitions before I could get the nodes to see each other. I'm not sure how I got into that situation. I now have a drbd master/slave with a filesystem resource following it around via heartbeat.<BR>
<BR>
I have noticed one unusual aspect that I wanted to ask you about. Right now, the filesystem resource mounts fine on one node but not the other, due to an earlier error in the switching. I believe I can correct the situation, but I noticed that the drbd_promote function first checks "are we already primary (local)?" and if so returns success, then checks "are we not secondary (local)?" and if so returns a generic error. It reaches the drbdadm primary command only if we are truly secondary. However, there is no check to make sure the remote is not still primary.<BR>
<BR>
What I'm seeing is a promote call where the secondary node tries to go primary while the primary node is still primary (I'm not sure whether it never got the secondary command or whether it's a timing issue and it just hasn't finished switching from primary to secondary). This triggers an error: "State change failed: (-1) Multiple primaries not allowed by config". The exit code of the drbdadm command is 0, even though drbdadm reports that drbdsetup terminated with exit code 11. This 0 propagates back out to drbd_promote and gets returned to heartbeat as success, so heartbeat assumes it's OK to proceed with the Filesystem mount (which of course fails). See the error snippet below:<BR>
<BR>
drbd[9059]: 2007/04/25_08:44:22 DEBUG: pgsql: Calling /sbin/drbdadm -c /etc/drbd.conf primary pgsql<BR>
drbd[9059]: 2007/04/25_08:44:23 DEBUG: pgsql: <FONT COLOR="#ff0000">Exit code 0</FONT><BR>
drbd[9059]: 2007/04/25_08:44:23 DEBUG: pgsql: Command output: State change failed: (-1) Multiple primaries not allowed by config<BR>
Command '/sbin/drbdsetup /dev/drbd0 primary' terminated with exit code 11<BR>
lrmd[1171]: 2007/04/25_08:44:23 info: RA output: (rsc_drbd_7788:0:promote:stdout) State change failed: (-1) Multiple primaries not allowed by config Command '/sbin/drbdsetup /dev/drbd0 primary' terminated with <FONT COLOR="#ff0000">exit code 11</FONT><BR>
crmd[1176]: 2007/04/25_08:44:23 info: process_lrm_event: LRM operation rsc_drbd_7788:0_promote_0 (call=50, <FONT COLOR="#ff0000">rc=0</FONT>) complete <BR>
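<BR>
One possible guard against the misleading exit code in the log above is to inspect the command output as well as the return value. This is only a sketch: run_checked is a hypothetical helper name, not something in the drbd ocf script, and matching on the "State change failed" text is an assumption based on the message shown above:<BR>
<PRE>
```shell
# run_checked is a hypothetical helper, not part of the shipped drbd ocf script.
# It runs the given command, echoes its combined output, and returns the
# command's exit code, except that an exit code of 0 is turned into 1
# when the output contains "State change failed".
run_checked() {
    rc=0
    out=$("$@" 2>&1) || rc=$?
    echo "$out"
    case "$out" in
        *"State change failed"*)
            # The command claimed success but reported a failed state change.
            if [ "$rc" -eq 0 ]; then
                rc=1   # 1 is the value of OCF_ERR_GENERIC
            fi
            ;;
    esac
    return "$rc"
}
```
</PRE>
The promote path could then wrap the call from the log, i.e. run_checked /sbin/drbdadm -c /etc/drbd.conf primary pgsql, and pass the result back to heartbeat instead of the raw 0.<BR>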
<BR>
What I propose is to put a conditional wait loop in the promote function that checks the remote status: if it's Primary, sleep and loop until the remote is no longer Primary, or give up after a set number of iterations. I think this provides a safer promote, though it doesn't address the issue shown by the log messages above. Thoughts?<BR>
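<BR>
A minimal sketch of that wait loop, in shell like the ocf script. The names remote_state and wait_for_remote_not_primary are hypothetical, and it assumes drbdadm state prints the roles as Local/Remote (e.g. Secondary/Primary), so treat it as an illustration rather than a tested patch:<BR>
<PRE>
```shell
# Hypothetical helper: prints the peer's role for the resource given as $1.
# Assumes "drbdadm state RES" prints "Local/Remote", e.g. "Secondary/Primary".
remote_state() {
    drbdadm state "$1" 2>/dev/null | sed -e 's#.*/##'
}

# Wait until the peer is no longer Primary.
# $1 = resource, $2 = maximum number of checks, $3 = seconds between checks.
# Returns 0 as soon as the peer is not Primary,
# or 1 if it is still Primary after all checks.
wait_for_remote_not_primary() {
    res=$1
    max=${2:-10}
    delay=${3:-1}
    i=0
    while [ "$i" -lt "$max" ]; do
        if [ "$(remote_state "$res")" != "Primary" ]; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```
</PRE>
drbd_promote could call this right before the drbdadm primary command and return a generic error if it times out. Note that an Unknown or empty remote state also counts as "not Primary" here, so drbdadm still has the final say.<BR>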
<BR>
Doug<BR>
<BR>
<BR>
On Tue, 2007-04-24 at 12:32 -0400, Doug Knight wrote:<BR>
<BLOCKQUOTE TYPE=CITE>
<FONT COLOR="#000000">Hi Lars,</FONT><BR>
<FONT COLOR="#000000">I think I'm pretty close on getting the filesystem following drbd, but I've encountered a strange situation. drbd is running on both nodes, one indicating Secondary from the /proc/drbd file and Slave in heartbeat, the other Master and Primary. However, both show the other as Unknown. I'm able to ping successfully over the dedicated CAT6 cable between the nodes. Any idea why the two drbd processes wouldn't see each other?</FONT><BR>
<BR>
<FONT COLOR="#000000">Doug</FONT><BR>
<FONT COLOR="#000000">On Tue, 2007-04-24 at 08:00 -0400, Doug Knight wrote:</FONT><BR>
<BLOCKQUOTE TYPE=CITE>
<FONT COLOR="#000000">On Tue, 2007-04-24 at 11:07 +0200, Lars Ellenberg wrote: </FONT>
<BLOCKQUOTE TYPE=CITE>
<PRE>
<FONT COLOR="#000000">On Mon, Apr 23, 2007 at 02:39:16PM -0400, Doug Knight wrote:</FONT>
<FONT COLOR="#000000">> Hey Lars,</FONT>
<FONT COLOR="#000000">please read the last line of my sig... :)</FONT>
</PRE>
</BLOCKQUOTE>
<FONT COLOR="#000000">Sorry, I was jumping between email lists, and some of the others come in with a reply-to already set to the list, not the individual...</FONT><BR>
<BR>
<BLOCKQUOTE TYPE=CITE>
<PRE>
<FONT COLOR="#000000">> Quick question: Comparing the printf text in your patch to the various</FONT>
<FONT COLOR="#000000">> checks in the drbd ocf script, should the patch be printf'ing "Not</FONT>
<FONT COLOR="#000000">> configured" as the script is checking for (DRBD_STATE_LOCAL checks)? If</FONT>
<FONT COLOR="#000000">> these should match, then I'd actually prefer to change the checks in the</FONT>
<FONT COLOR="#000000">> ocf script to match your patch (I like not having the embedded blank). </FONT>
<FONT COLOR="#000000">> </FONT>
<FONT COLOR="#000000">> Doug</FONT>
<FONT COLOR="#000000">I think the 0.7 drbdsetup reported "Not configured".</FONT>
<FONT COLOR="#000000">So maybe we should stick with that?</FONT>
<FONT COLOR="#000000">but since scripts tailored for 0.7 are likely to be broken for 8.0</FONT>
<FONT COLOR="#000000">anyways, we could also change this to "Unconfigured".</FONT>
<FONT COLOR="#000000">I'm not sure yet, but I'm more for consistency with the /proc/drbd</FONT>
<FONT COLOR="#000000">output, so its probably going to be "Unconfigured".</FONT>
</PRE>
</BLOCKQUOTE>
<FONT COLOR="#000000">Good, I have already changed my script to look for Unconfigured, and now I can switch drbd back and forth between nodes using a Place constraint with no problem. Now I'm working on tying the filesystem resource to it. I've got a couple of questions up on the HA-Linux list about that. There is so much overlap between the HA and the drbd work that sometimes it's hard to tell which list I should send to (or just to keep track of where I've asked what ;)</FONT><BR>
<BR>
<FONT COLOR="#000000">Thanks! </FONT>
<PRE>
<FONT COLOR="#000000">_______________________________________________</FONT>
<FONT COLOR="#000000">drbd-user mailing list</FONT>
<FONT COLOR="#000000"><A HREF="mailto:drbd-user@lists.linbit.com">drbd-user@lists.linbit.com</A></FONT>
<FONT COLOR="#000000"><A HREF="http://lists.linbit.com/mailman/listinfo/drbd-user">http://lists.linbit.com/mailman/listinfo/drbd-user</A></FONT>
</PRE>
</BLOCKQUOTE>
</BLOCKQUOTE>
</BODY>
</HTML>