<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 12 (filtered medium)">
<style>
<!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Verdana;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0cm;
        margin-right:0cm;
        margin-bottom:0cm;
        margin-left:36.0pt;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
span.E-MailFormatvorlage17
        {mso-style-type:personal-compose;
        font-family:"Calibri","sans-serif";
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:70.85pt 70.85pt 2.0cm 70.85pt;}
div.WordSection1
        {page:WordSection1;}
/* List Definitions */
@list l0
        {mso-list-id:1931355597;
        mso-list-type:hybrid;
        mso-list-template-ids:692514762 67567631 67567641 67567643 67567631 67567641 67567643 67567631 67567641 67567643;}
@list l0:level1
        {mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;}
ol
        {margin-bottom:0cm;}
ul
        {margin-bottom:0cm;}
-->
</style>
<!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang=DE link=blue vlink=purple>
<div class=WordSection1>
<p class=MsoNormal><span lang=EN-US>Hi everybody,<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Today I had the problem that after a reboot,
a node wouldn't come back into the Connected state. It always stayed in
WFConnection, Disconnected and so on. The secondary node did not reconnect,
so it wasn't syncing. I thought I needed to recreate the device and do
a manual split-brain recovery, but nothing worked. The DRBD device stayed outdated
or inconsistent.<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>I was able to resolve the issue. Hopefully
the following explanation is correct (I wrote it from memory) and helps
other admins who have struggled with this issue for days.<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Some System Info (Debian Stable with
Backport packages):<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>--- Cluster Config & Status Dump --<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Created: Do 30. Sep 13:25:21 CEST 2010 on
pilot01-node1 by uid=0(root) gid=0(root) Gruppen=0(root)<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Systeminfo: Linux pilot01-node1
2.6.28-1-amd64 #1 SMP Wed Feb 18 17:16:12 UTC 2009 x86_64 GNU/Linux<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>#####################<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>### 1. DRBD State ###<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>#####################<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>drbd driver loaded OK; device status:<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>version: 8.3.7 (api:88/proto:86-91)<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>GIT-hash:
ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by root@prolog01-pilot1,
2010-06-07 17:34:47<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>m:res cs ro ds p mounted fstype<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>0:pilot0 Connected Primary/Secondary UpToDate/UpToDate C /mnt/cluster xfs<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>------------------------------------------------------------------------------------------------------------------------------------------------------------------------------<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>This was the initial DRBD state of the
secondary node:<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>root@pilot01-node2:/home/nwadmin# cat
/proc/drbd<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>version: 8.3.7 (api:88/proto:86-91)<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>GIT-hash:
ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by root@prolog01-pilot1, 2010-06-07
17:34:47<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US> 0: cs:WFConnection
ro:Secondary/Unknown ds:Inconsistent/DUnknown C r----<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US> ns:0 nr:0 dw:0 dr:0 al:0
bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:1951768<o:p></o:p></span></p>
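<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>If you want to check this state from a
script instead of by eye, the device line can be split into its fields. The
following is only a minimal Python sketch, assuming the DRBD 8.3 line layout
shown above. The names parse_device_line and is_healthy are made up for this
example, and the "healthy" rule (Connected with both disks UpToDate) is my
assumption, not something DRBD defines.<o:p></o:p></span></p>
<pre>
```python
import re

# Sample line copied from the /proc/drbd output quoted above.
SAMPLE = " 0: cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown C r----"

# Groups: minor number, connection state, local/peer role, local/peer disk state.
DEVICE_LINE = re.compile(
    r"^\s*(\d+):\s+cs:(\S+)\s+ro:([^/\s]+)/(\S+)\s+ds:([^/\s]+)/(\S+)"
)


def parse_device_line(line):
    """Return a dict describing one device line, or None if it is another kind of line."""
    match = DEVICE_LINE.match(line)
    if match is None:
        return None
    minor, cs, ro_local, ro_peer, ds_local, ds_peer = match.groups()
    return {
        "minor": int(minor),
        "connection": cs,
        "role": (ro_local, ro_peer),
        "disk": (ds_local, ds_peer),
    }


def is_healthy(state):
    """Hypothetical rule: connected and both sides UpToDate."""
    return state["connection"] == "Connected" and state["disk"] == ("UpToDate", "UpToDate")


if __name__ == "__main__":
    state = parse_device_line(SAMPLE)
    print(state)
    print("healthy:", is_healthy(state))
```
</pre>
<p class=MsoNormal><span lang=EN-US>For a real check you would feed it the
lines read from /proc/drbd and skip every line for which it returns None (the
version, GIT-hash and counter lines).<o:p></o:p></span></p>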
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>------------------------------------------------------------------------------------------------------------------------------------------------------------------------------<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Resolution:<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Then I looked at the logs (maybe a little
too late) and saw that there were errors concerning drbd.conf. The
ocf:linbit:drbd resource agent uses /etc/drbd.conf as OCF_RESKEY_drbdconf, while my drbdadm
tool always wanted to use /usr/local/etc/drbd.conf (maybe this is compiled into
the drbd-utils, I wasn't able to figure that out). Therefore Pacemaker
always refused to let the secondary node connect to the DRBD device.<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>What I did to resolve it was:<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoListParagraph style='text-indent:-18.0pt;mso-list:l0 level1 lfo1'><![if !supportLists]><span
lang=EN-US><span style='mso-list:Ignore'>1.<span style='font:7.0pt "Times New Roman"'>
</span></span></span><![endif]><span lang=EN-US>Changed my resource to something
like this:<o:p></o:p></span></p>
<p class=MsoNormal style='margin-left:18.0pt'><span lang=EN-US>primitive
drbd_pilot0 ocf:linbit:drbd \<o:p></o:p></span></p>
<p class=MsoNormal style='margin-left:18.0pt'><span lang=EN-US>
params drbd_resource="pilot0"
drbdconf="/usr/local/etc/drbd.conf" \<o:p></o:p></span></p>
<p class=MsoNormal style='margin-left:18.0pt'><span lang=EN-US>
operations $id="drbd_pilot0-operations" \<o:p></o:p></span></p>
<p class=MsoNormal style='margin-left:18.0pt'><span lang=EN-US>
op monitor interval="15s"<o:p></o:p></span></p>
<p class=MsoListParagraph style='text-indent:-18.0pt;mso-list:l0 level1 lfo1'><![if !supportLists]><span
lang=EN-US><span style='mso-list:Ignore'>2.<span style='font:7.0pt "Times New Roman"'>
</span></span></span><![endif]><span lang=EN-US>Cleaned up all the errors on the
resource:<o:p></o:p></span></p>
<p class=MsoNormal style='margin-left:18.0pt'><span lang=EN-US>crm resource
cleanup ms_drbd_pilot0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>------------------------------------------------------------------------------------------------------------------------------------------------------------------------------<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Here is the state of the sync afterwards:<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>root@pilot01-node1:/home/nwadmin# crm
resource cleanup res_MySQL<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Cleaning up res_MySQL on pilot01-node1<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Cleaning up res_MySQL on pilot01-node2<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>root@pilot01-node1:/home/nwadmin# cat
/proc/drbd<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>version: 8.3.7 (api:88/proto:86-91)<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>GIT-hash:
ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by root@prolog01-pilot1,
2010-06-07 17:34:47<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US> 0: cs:SyncSource ro:Primary/Secondary
ds:UpToDate/Inconsistent C r----<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US> ns:879617 nr:0 dw:2121
dr:890151 al:5 bm:53 lo:1 pe:39 ua:189 ap:0 ep:1 wo:b oos:1073972<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>
[========>...........] sync'ed: 45.1% (1073972/1951768)K<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>
finish: 0:00:22 speed: 47,920 (48,764) K/sec<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>------------------------------------------------------------------------------------------------------------------------------------------------------------------------------<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Here is the relevant log output:<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>crmd: [2326]: info: do_lrm_rsc_op:
Performing key=81:445:0:106b9e8c-1ea2-475f-b2c9-ddb3088ea7aa
op=drbd_pilot0:1_notify_0 )<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Sep 30 11:29:24 s_all@pilot01-node2 lrmd:
[2323]: info: RA output: (drbd_pilot0:1:notify:stderr) Warning: resource pilot0
last used config file: /etc/drbd.conf current config file:
/usr/local/etc/drbd.conf<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Sep 30 11:29:24 s_all@pilot01-node2 lrmd:
[2323]: info: RA output: (drbd_pilot0:1:notify:stderr)
/usr/lib/ocf/resource.d//linbit/drbd: line 762: [: too many arguments<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Sep 30 11:29:24 s_all@pilot01-node2 crmd:
[2326]: info: process_lrm_event: LRM operation drbd_pilot0:1_notify_0 (call=17,
rc=0, cib-update=26, confirmed=true) ok<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Sep 30 11:29:26 s_all@pilot01-node2 lrmd:
[2323]: info: rsc:drbd_pilot0:1:18: notify<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Sep 30 11:29:26 s_all@pilot01-node2 crmd:
[2326]: info: do_lrm_rsc_op: Performing
key=79:448:0:106b9e8c-1ea2-475f-b2c9-ddb3088ea7aa op=drbd_pilot0:1_notify_0 )<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Sep 30 11:29:26 s_all@pilot01-node2 crmd:
[2326]: info: process_lrm_event: LRM operation drbd_pilot0:1_notify_0 (call=18,
rc=0, cib-update=27, confirmed=true) ok<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Sep 30 11:29:28 s_all@pilot01-node2 lrmd:
[2323]: info: rsc:drbd_pilot0:1:19: notify<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Sep 30 11:29:28 s_all@pilot01-node2 crmd:
[2326]: info: do_lrm_rsc_op: Performing
key=79:451:0:106b9e8c-1ea2-475f-b2c9-ddb3088ea7aa op=drbd_pilot0:1_notify_0 )<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Sep 30 11:29:28 s_all@pilot01-node2 crmd:
[2326]: info: process_lrm_event: LRM operation drbd_pilot0:1_notify_0 (call=19,
rc=0, cib-update=28, confirmed=true) ok<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Sep 30 11:29:29 s_all@pilot01-node2 lrmd:
[2323]: info: rsc:drbd_pilot0:1:20: notify<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Sep 30 11:29:29 s_all@pilot01-node2 crmd:
[2326]: info: do_lrm_rsc_op: Performing
key=79:454:0:106b9e8c-1ea2-475f-b2c9-ddb3088ea7aa op=drbd_pilot0:1_notify_0 )<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Sep 30 11:29:29 s_all@pilot01-node2 crmd:
[2326]: info: process_lrm_event: LRM operation drbd_pilot0:1_notify_0 (call=20,
rc=0, cib-update=29, confirmed=true) ok<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US>Sep 30 11:29:36 s_all@pilot01-node2 lrmd:
[2323]: info: RA output: (drbd_pilot0:1:monitor:stderr) Warning: resource
pilot0 last used config file: /usr/local/etc/drbd.conf current
config file: /etc/drbd.conf<o:p></o:p></span></p>
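<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US>The "last used config file" warning is
the line I should have searched for first. As a rough Python sketch, assuming
the warning keeps the wording shown in the log above, you could scan a log for
it and report every resource where the two paths differ. The function name
find_config_mismatches and the sample list are made up for this example, and
the first sample line is the warning from above joined onto one line.<o:p></o:p></span></p>
<pre>
```python
import re

# Lines taken from the log excerpt above (the warning is joined onto one line).
LOG_LINES = [
    "Sep 30 11:29:24 s_all@pilot01-node2 lrmd: [2323]: info: RA output: "
    "(drbd_pilot0:1:notify:stderr) Warning: resource pilot0 last used config file: "
    "/etc/drbd.conf current config file: /usr/local/etc/drbd.conf",
    "Sep 30 11:29:26 s_all@pilot01-node2 lrmd: [2323]: info: rsc:drbd_pilot0:1:18: notify",
]

# Groups: resource name, last used config path, current config path.
WARNING = re.compile(
    r"Warning: resource (\S+)\s+last used config file:\s+(\S+)\s+"
    r"current\s+config file:\s+(\S+)"
)


def find_config_mismatches(lines):
    """Return (resource, last_used, current) for each warning whose two paths differ."""
    mismatches = []
    for line in lines:
        match = WARNING.search(line)
        if match and match.group(2) != match.group(3):
            mismatches.append(match.groups())
    return mismatches


if __name__ == "__main__":
    for resource, last_used, current in find_config_mismatches(LOG_LINES):
        print(resource, "last used", last_used, "but is now called with", current)
```
</pre>
<p class=MsoNormal><span lang=EN-US>In my case a hit like this would have
pointed straight at the two different drbd.conf paths used by the resource
agent and by drbdadm.<o:p></o:p></span></p>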
<p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p>
<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><b><span
style='font-size:10.0pt;font-family:"Verdana","sans-serif";color:black'>Kind
Regards,<o:p></o:p></span></b></p>
<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><b><span
style='font-size:10.0pt;font-family:"Verdana","sans-serif";color:black'>Sebastian</span></b><span
style='font-size:7.5pt;font-family:"Verdana","sans-serif";color:gray'><o:p></o:p></span></p>
<p class=MsoNormal><o:p> </o:p></p>
</div>
</body>
</html>