<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=iso-8859-1"><meta name=Generator content="Microsoft Word 12 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        text-align:justify;
        font-size:10.5pt;
        font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal;
        font-family:"Calibri","sans-serif";
        color:windowtext;}
span.EmailStyle18
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=FR link=blue vlink=purple style='text-justify-trim:punctuation'><div class=WordSection1><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;color:#1F497D'>Hi Simon.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;color:#1F497D'>AFAIK, the PingAck error means your replication network links are either down or suffering enough errors to prevent the nodes from reaching each other in a timely manner. I have seen this behavior caused by bad optical fibers, for instance, which generated a huge number of network errors. Your logs also show “NetworkFailure” states, and the connection ends up in WFConnection (waiting for connection). In your case I’d say the first thing to do is to test this network: can both nodes ping each other’s address on this network? Does ifconfig report errors on the replication interface of each node? And so on. I bet that once your replication network is up again, your cluster will run fine.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;color:#1F497D'>Pascal.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;color:#1F497D'><o:p> </o:p></span></p><div><div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal align=left style='text-align:left'><b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span></b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'> drbd-user-bounces@lists.linbit.com [mailto:drbd-user-bounces@lists.linbit.com] <b>On behalf of</b> simon<br><b>Sent:</b> Saturday, 18 August 2012 03:37<br><b>To:</b> drbd-user@lists.linbit.com<br><b>Subject:</b> [DRBD-user] Drbd : PingAsk timeout, about 10 mins.<o:p></o:p></span></p></div></div><p class=MsoNormal 
align=left style='text-align:left'><o:p> </o:p></p><p class=MsoNormal><span lang=EN-US>Hi all,<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US>I am using DRBD 8.3.7 in an HA setup. When the Master host dies and HA switches from Master to Slave, DRBD cannot switch over in time because it takes 10 minutes to mount its partition. That exceeds the HA timeout (the default timeout in HA is 2 minutes).<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US>Why does DRBD take so long? <o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US>The log is:<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739458] block drbd1: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) <o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739468] block drbd1: asender terminated<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739470] block drbd1: Terminating asender thread<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739526] block drbd1: short read expecting header on sock: r=-512<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739666] block drbd1: Connection closed<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739672] block drbd1: conn( NetworkFailure -> Unconnected ) <o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739678] block drbd1: receiver terminated<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739680] block drbd1: Restarting receiver 
thread<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739683] block drbd1: receiver (re)started<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:06:34 QD-CS-MDC-B kernel: [325560.739687] block drbd1: conn( Unconnected -> WFConnection ) <o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:06:39 QD-CS-MDC-B pengine: [17776]: info: crm_log_init: Changed active directory to /usr/var/lib/heartbeat/cores/root<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:06:47 QD-CS-MDC-B kernel: [325573.727331] NET: Registered protocol family 17<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:06:47 QD-CS-MDC-B kernel: [325573.768912] block drbd0: role( Secondary -> Primary ) <o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:06:47 QD-CS-MDC-B kernel: [325573.772742] block drbd1: role( Secondary -> Primary ) <o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='color:red'>Jul 22 21:06:47 QD-CS-MDC-B kernel: [325573.772997] block drbd1: Creating new current UUID<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:08:47 QD-CS-MDC-B su: (to hitv) root on none<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='color:red'>Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032485] block drbd0: PingAck did not arrive in time.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032493] block drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) <o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032503] block drbd0: asender terminated<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032506] block drbd0: Terminating asender thread<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:16:47 QD-CS-MDC-B kernel: 
[326174.032514] block drbd0: Creating new current UUID<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032567] block drbd0: short read expecting header on sock: r=-512<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032868] block drbd0: Connection closed<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032875] block drbd0: conn( NetworkFailure -> Unconnected ) <o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032879] block drbd0: receiver terminated<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032881] block drbd0: Restarting receiver thread<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032884] block drbd0: receiver (re)started<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Jul 22 21:16:47 QD-CS-MDC-B kernel: [326174.032888] block drbd0: conn( Unconnected -> WFConnection )<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:12.0pt'>Jul 22 21:16:48 QD-CS-MDC-B kernel: [326174.600888] kjournald starting. 
Commit interval 15 seconds<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:12.0pt'>Jul 22 21:16:48 QD-CS-MDC-B kernel: [326174.600956] EXT3-fs warning: maximal mount count reached, running e2fsck is recommended<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:12.0pt'>Jul 22 21:16:48 QD-CS-MDC-B kernel: [326174.601330] EXT3 FS on drbd0, internal journal<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:12.0pt'>Jul 22 21:16:48 QD-CS-MDC-B kernel: [326174.601334] EXT3-fs: recovery complete.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:12.0pt'>Jul 22 21:16:48 QD-CS-MDC-B kernel: [326174.601392] EXT3-fs: mounted filesystem with ordered data mode. <o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:12.0pt'> <o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>According to the log, the time is lost waiting for the PingAck, which only times out after about 10 minutes.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US>Thanks for your help.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US> <o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US> </span><span lang=EN-US style='font-size:12.0pt'>simon<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:12.0pt'><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p></div></body></html>