[DRBD-user] problem about "the failed node comes up again": Primary becomes StandAlone

Selina Sun - 孙晶洁 Selina.Sun at zyxel.cn
Fri Apr 27 07:57:36 CEST 2007



Hi all:

      I encountered a problem when configuring DRBD+HTTP+Heartbeat.

      I set up a cluster of two computers, node1 (primary) and node2 (secondary), sharing a drbd device over the network. Each machine has just one NIC, used for both drbd and heartbeat traffic.

      Step 1: Start drbd and heartbeat on node1 (heartbeat mounts the drbd device in the local directory tree and starts the http service) and on node2. This works: I can access the virtual IP through IE and see the page served by node1.

      Step 2: Reboot node1. node2 then becomes the new primary, and heartbeat starts the http service on node2, so node2 now provides the httpd service.

      Step 3: When node1 comes up again, I start drbd and heartbeat on it.

      But when I run “cat /proc/drbd” on node1, it shows:

      

      At the same time, when I run “cat /proc/drbd” on node2, it shows:

      version: 0.7.18 (api:78/proto:74)
      SVN Revision: 2176 build by root at secondary, 2007-04-23 17:18:34
       0: cs:StandAlone st:Primary/Unknown ld:Consistent
          ns:8 nr:8 dw:68 dr:489 al:0 bm:1 lo:0 pe:0 ua:0 ap:0
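
For anyone who wants to check this state from a script rather than by eye, here is a minimal sketch in Python that pulls the cs/st/ld fields out of the device line. It assumes only the line layout shown in the output above; the function name parse_drbd_status and the SAMPLE constant are hypothetical, and other DRBD versions may print the line differently.

```python
import re

# Matches a device line such as:
#  " 0: cs:StandAlone st:Primary/Unknown ld:Consistent"
# (layout taken from the /proc/drbd output quoted above; an assumption
# for any other DRBD version).
DEVICE_LINE = re.compile(r"^\s*(\d+):\s+cs:(\S+)\s+st:(\S+)\s+ld:(\S+)")


def parse_drbd_status(text):
    """Return {minor: {"cs": ..., "st": ..., "ld": ...}} for each device line."""
    devices = {}
    for line in text.splitlines():
        m = DEVICE_LINE.match(line)
        if m:
            minor, cs, st, ld = m.groups()
            devices[int(minor)] = {"cs": cs, "st": st, "ld": ld}
    return devices


# The node2 output from this message, used as sample input.
SAMPLE = """version: 0.7.18 (api:78/proto:74)
SVN Revision: 2176 build by root at secondary, 2007-04-23 17:18:34
 0: cs:StandAlone st:Primary/Unknown ld:Consistent
    ns:8 nr:8 dw:68 dr:489 al:0 bm:1 lo:0 pe:0 ua:0 ap:0
"""

if __name__ == "__main__":
    for minor, fields in parse_drbd_status(SAMPLE).items():
        print(minor, fields["cs"], fields["st"], fields["ld"])
```

On a live node the same function could be fed the contents of /proc/drbd instead of SAMPLE.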

      

      /var/log/messages on node2:

      Apr 27 13:37:36 mouse kernel: drbd0: drbd0_receiver [6530]: cstate WFConnection --> WFReportParams
      Apr 27 13:37:36 mouse kernel: drbd0: Handshake successful: DRBD Network Protocol version 74
      Apr 27 13:37:36 mouse kernel: drbd0: Connection established.
      Apr 27 13:37:36 mouse kernel: drbd0: I am(P): 1:00000002:00000001:0000004b:00000013:10
      Apr 27 13:37:36 mouse kernel: drbd0: Peer(S): 1:00000002:00000001:0000004d:00000012:00
      Apr 27 13:37:36 mouse kernel: drbd0: Current Primary shall become sync TARGET! Aborting to prevent data corruption.
      Apr 27 13:37:36 mouse kernel: drbd0: drbd0_receiver [6530]: cstate WFReportParams --> StandAlone
      Apr 27 13:37:36 mouse kernel: drbd0: error receiving ReportParams, l: 72!
      Apr 27 13:37:36 mouse kernel: drbd0: worker terminated
      Apr 27 13:37:36 mouse kernel: drbd0: asender terminated
      Apr 27 13:37:36 mouse kernel: drbd0: drbd0_receiver [6530]: cstate StandAlone --> StandAlone
      Apr 27 13:37:36 mouse kernel: drbd0: Connection lost.
      Apr 27 13:37:36 mouse kernel: drbd0: receiver terminated
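
The "I am" and "Peer" counter lines in the log can be compared mechanically. Below is a minimal sketch in Python that assumes only the colon-separated layout visible in those two lines. It makes no claim about what each field means; it just reports which positions differ. The names extract_counters and differing_fields are hypothetical.

```python
import re

# Matches the two counter lines as they appear in the log above, e.g.
#   "... drbd0: I am(P): 1:00000002:00000001:0000004b:00000013:10"
COUNTER_RE = re.compile(r"drbd\d+: (I am|Peer)\(([PS])\): ([0-9a-fA-F:]+)\s*$")


def extract_counters(log_text):
    """Return {"I am": (role, fields), "Peer": (role, fields)} from log text."""
    found = {}
    for line in log_text.splitlines():
        m = COUNTER_RE.search(line)
        if m:
            who, role, raw = m.groups()
            found[who] = (role, raw.split(":"))
    return found


def differing_fields(local, peer):
    """List (index, local_value, peer_value) for every position that differs."""
    return [(i, a, b) for i, (a, b) in enumerate(zip(local, peer)) if a != b]


# The two counter lines from the node2 log in this message.
LOG = """Apr 27 13:37:36 mouse kernel: drbd0: I am(P): 1:00000002:00000001:0000004b:00000013:10
Apr 27 13:37:36 mouse kernel: drbd0: Peer(S): 1:00000002:00000001:0000004d:00000012:00
"""

if __name__ == "__main__":
    counters = extract_counters(LOG)
    local_fields = counters["I am"][1]
    peer_fields = counters["Peer"][1]
    for idx, mine, theirs in differing_fields(local_fields, peer_fields):
        print(f"field {idx}: local={mine} peer={theirs}")
```

For the log above, the fields at positions 3, 4 and 5 (counting from 0) differ between the two nodes.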

      Step 4: After I run “drbdadm adjust all” on node2, everything is OK.

      In my opinion, everything should work without step 4: the failed node should come up and run as secondary in the cluster.

      I wonder if there is something wrong with my configuration. How can I get these two machines to reconnect without step 4?

Thanks...

 

Best regards

Selina Sun

SW2

ZyXEL Communications(Wuxi)Corp.

Tel: +86-510-88080888 ext. 15516

Email: selina.sun at zyxel.cn <mailto:selina.sun at zyxel.cn>  

 

Did you check www.zyxel.cn <http://www.zyxel.cn/>  today?

 
