[DRBD-user] problem about "the failed node comes up again": Primary
becomes StandAlone
Selina Sun - 孙晶洁
Selina.Sun at zyxel.cn
Fri Apr 27 07:57:36 CEST 2007
Hi all:
I encountered a problem when configuring DRBD + HTTP + Heartbeat.
I set up a cluster of two computers, node1 (primary) and node2 (secondary), sharing a DRBD device over the network. Each machine has just one NIC, which carries both DRBD and Heartbeat traffic.
Step 1: Start drbd and heartbeat on node1 and node2 (heartbeat mounts the drbd device in the local directory tree and starts the http service). This works. I can access the virtual IP through IE and see the page served by node1.
Step 2: Reboot node1. node2 then becomes the new primary, and heartbeat starts the http service on node2. node2 now provides the httpd service.
Step 3: When node1 comes up again, start drbd and heartbeat on it.
But running "cat /proc/drbd" on node1 shows:
At the same time, running "cat /proc/drbd" on node2 shows:
version: 0.7.18 (api:78/proto:74)
SVN Revision: 2176 build by root at secondary, 2007-04-23 17:18:34
0: cs:StandAlone st:Primary/Unknown ld:Consistent
ns:8 nr:8 dw:68 dr:489 al:0 bm:1 lo:0 pe:0 ua:0 ap:0
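To check these fields from a script rather than by eye, a minimal sketch along the following lines may help. It assumes the two-line layout shown above; the function name parse_drbd_status is only illustrative and is not part of DRBD.

```python
import re

# Sample copied from the node2 output above.
SAMPLE = """\
 0: cs:StandAlone st:Primary/Unknown ld:Consistent
    ns:8 nr:8 dw:68 dr:489 al:0 bm:1 lo:0 pe:0 ua:0 ap:0
"""

def parse_drbd_status(text):
    """Return {minor: {key: value}} for each device block in the text.

    Hypothetical helper: it only assumes "N: key:value ..." lines
    followed by continuation lines of further key:value pairs.
    """
    devices = {}
    current = None
    for line in text.splitlines():
        m = re.match(r"\s*(\d+):\s+(.*)", line)
        if m:
            # A new device block starts with its minor number.
            current = devices.setdefault(int(m.group(1)), {})
            rest = m.group(2)
        elif current is not None:
            rest = line
        else:
            continue  # header lines before the first device
        for key, value in re.findall(r"(\w+):(\S+)", rest):
            current[key] = value
    return devices

status = parse_drbd_status(SAMPLE)
print(status[0]["cs"], status[0]["st"])
```

On a live node the text would come from reading /proc/drbd instead of SAMPLE.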
/var/log/messages on node2:
Apr 27 13:37:36 mouse kernel: drbd0: drbd0_receiver [6530]: cstate WFConnection --> WFReportParams
Apr 27 13:37:36 mouse kernel: drbd0: Handshake successful: DRBD Network Protocol version 74
Apr 27 13:37:36 mouse kernel: drbd0: Connection established.
Apr 27 13:37:36 mouse kernel: drbd0: I am(P): 1:00000002:00000001:0000004b:00000013:10
Apr 27 13:37:36 mouse kernel: drbd0: Peer(S): 1:00000002:00000001:0000004d:00000012:00
Apr 27 13:37:36 mouse kernel: drbd0: Current Primary shall become sync TARGET! Aborting to prevent data corruption.
Apr 27 13:37:36 mouse kernel: drbd0: drbd0_receiver [6530]: cstate WFReportParams --> StandAlone
Apr 27 13:37:36 mouse kernel: drbd0: error receiving ReportParams, l: 72!
Apr 27 13:37:36 mouse kernel: drbd0: worker terminated
Apr 27 13:37:36 mouse kernel: drbd0: asender terminated
Apr 27 13:37:36 mouse kernel: drbd0: drbd0_receiver [6530]: cstate StandAlone --> StandAlone
Apr 27 13:37:36 mouse kernel: drbd0: Connection lost.
Apr 27 13:37:36 mouse kernel: drbd0: receiver terminated
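The two counter lines in this log ("I am(P)" and "Peer(S)") are easier to compare side by side. The sketch below splits them and lists the positions that differ. It treats the fields as opaque values, since I do not know their exact meaning, and it assumes every field is hexadecimal, which may not hold for the last one. The helper names are my own.

```python
import re

# The two counter lines copied from the log above (timestamp prefix dropped).
LOG = """\
drbd0: I am(P): 1:00000002:00000001:0000004b:00000013:10
drbd0: Peer(S): 1:00000002:00000001:0000004d:00000012:00
"""

def counters(log, label):
    """Return the colon-separated fields logged after `label` as ints.

    Assumption: every field is hexadecimal.
    """
    m = re.search(re.escape(label) + r":\s*([0-9a-fA-F:]+)", log)
    if m is None:
        raise ValueError(f"no {label!r} line found")
    return [int(field, 16) for field in m.group(1).split(":")]

def differing_positions(mine, peer):
    """Return (position, local, peer) for each field that differs (1-based)."""
    return [(i, a, b)
            for i, (a, b) in enumerate(zip(mine, peer), start=1)
            if a != b]

mine = counters(LOG, "I am(P)")
peer = counters(LOG, "Peer(S)")
for pos, a, b in differing_positions(mine, peer):
    print(f"field {pos}: local=0x{a:x} peer=0x{b:x}")
```

For the lines above, only the last three fields differ between the two nodes.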
Step 4: Run "drbdadm adjust all" on node2. After that, everything is OK.
In my opinion, everything should work without step 4: the failed node should come back up and run as secondary in the cluster.
I wonder if there is something wrong with my configuration. How can I get these two machines to reconnect without step 4?
Thanks...
Best regards
Selina Sun
SW2
ZyXEL Communications (Wuxi) Corp.
Tel: +86-510-88080888 ext. 15516
Email: selina.sun at zyxel.cn
Did you check www.zyxel.cn today?