[DRBD-user] Trouble getting node to re-join two node cluster (OCFS2/DRBD Primary/Primary)

Reid, Mike MBReid at thepei.com
Thu Sep 15 22:50:44 CEST 2011

Hello all,

** I have also posted this in the OCFS2/pacemaker list, but one response
indicated it may be more specific to DRBD? **

We have a two-node cluster still in development that has been running fine
for weeks (little to no traffic). I made some updates to our CIB recently,
and everything seemed just fine.

Yesterday I attempted to untar ~1.5GB to the OCFS2/DRBD volume, and once it
was complete, one of the nodes had become completely disconnected. I
haven't been able to reconnect it since.

DRBD is working fine, everything is UpToDate and I can get both nodes in
Primary/Primary, but when it comes down to starting OCFS2 and mounting the
volume, I'm left with:

> resFS:0_start_0 (node=node1, call=21, rc=1, status=complete): unknown error
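
For readers trying to interpret that failure line: a minimal sketch, assuming the line always has the shape shown above, that splits it into its fields and maps the return code to its OCF name. The regular expression and the function name parse_failed_op are illustrative assumptions, not part of Pacemaker. The code table is the standard OCF resource agent exit-code convention, in which rc=1 is the generic error that gets displayed as "unknown error".

```python
import re

# Subset of the OCF resource agent exit codes.
OCF_RC = {
    0: "OCF_SUCCESS",
    1: "OCF_ERR_GENERIC",
    2: "OCF_ERR_ARGS",
    3: "OCF_ERR_UNIMPLEMENTED",
    4: "OCF_ERR_PERM",
    5: "OCF_ERR_INSTALLED",
    6: "OCF_ERR_CONFIGURED",
    7: "OCF_NOT_RUNNING",
}

# Assumed shape: <resource>_<operation>_<interval> (node=..., call=..., rc=..., status=...)
FAILED_OP = re.compile(
    r"(?P<resource>\S+)_(?P<operation>[a-z]+)_(?P<interval>\d+)\s+"
    r"\(node=(?P<node>[^,]+),\s*call=(?P<call>\d+),\s*rc=(?P<rc>\d+),\s*"
    r"status=(?P<status>[^)]+)\)"
)


def parse_failed_op(line):
    """Return the fields of a failed-action line as a dict, or None if it does not match."""
    match = FAILED_OP.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["rc"] = int(info["rc"])
    info["call"] = int(info["call"])
    # Unknown codes are reported as such rather than guessed.
    info["rc_name"] = OCF_RC.get(info["rc"], "unknown")
    return info


line = "resFS:0_start_0 (node=node1, call=21, rc=1, status=complete): unknown error"
info = parse_failed_op(line)
print(info["resource"], info["operation"], info["node"], info["rc_name"])
```

A generic rc=1 only says that the start operation of the Filesystem resource failed on node1; the actual reason has to come from the resource agent's log output or the mount trace.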

I am using "pcmk" as the cluster_stack, and letting Pacemaker control

The last time this happened, the only way I was able to resolve it was to
reformat the device (via mkfs.ocfs2 -F). I don't think I should have to do
this: the underlying blocks seem fine, and one of the nodes is running just
fine. The (currently) unmounted node is staying in sync as far as DRBD is
concerned.

Here's some detail that will hopefully help. Please let me know if there's
anything else I can provide to help know the best way to get this node back

Ubuntu 10.10 / Kernel 2.6.35

Corosync 1.2.1
Cluster Agents 1.0.3 (Heartbeat)
Cluster Glue 1.0.6
OpenAIS 1.1.2

DRBD 8.3.10
OCFS2 1.5.0

cat /sys/fs/ocfs2/cluster_stack = pcmk

node1: mounted.ocfs2 -d

Device                FS     UUID                                  Label
/dev/sda3             ocfs2  fe4273e1-f866-4541-bbcf-66c5dfd496d6

node2: mounted.ocfs2 -d

Device                FS     UUID                                  Label
/dev/sda3             ocfs2  d6f7cc6d-21d1-46d3-9792-bc650736a5ef
/dev/drbd0            ocfs2  d6f7cc6d-21d1-46d3-9792-bc650736a5ef
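
To make the difference between the two listings explicit, here is a minimal sketch that parses the two `mounted.ocfs2 -d` outputs above and compares the UUID reported for each device. The function names parse_mounted_ocfs2 and compare are hypothetical helpers, and the parsing assumes the three-column layout shown in this post. It only reports what differs; whether the difference is related to the mount failure is not something the script can tell.

```python
def parse_mounted_ocfs2(text):
    """Return {device: uuid} for the ocfs2 rows of `mounted.ocfs2 -d` output."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 3 or fields[0] == "Device":
            continue
        device, fs, uuid = fields[0], fields[1], fields[2]
        if fs == "ocfs2":
            result[device] = uuid
    return result


def compare(node1, node2):
    """Label each device as 'match', 'mismatch', or 'missing' (seen on one node only)."""
    report = {}
    for dev in sorted(set(node1) | set(node2)):
        a, b = node1.get(dev), node2.get(dev)
        if a is None or b is None:
            report[dev] = "missing"
        elif a == b:
            report[dev] = "match"
        else:
            report[dev] = "mismatch"
    return report


# Listings copied from this post.
NODE1 = """\
Device                FS     UUID                                  Label
/dev/sda3             ocfs2  fe4273e1-f866-4541-bbcf-66c5dfd496d6
"""
NODE2 = """\
Device                FS     UUID                                  Label
/dev/sda3             ocfs2  d6f7cc6d-21d1-46d3-9792-bc650736a5ef
/dev/drbd0            ocfs2  d6f7cc6d-21d1-46d3-9792-bc650736a5ef
"""

report = compare(parse_mounted_ocfs2(NODE1), parse_mounted_ocfs2(NODE2))
for dev, status in report.items():
    print(dev, status)
```

With the listings above, /dev/drbd0 is reported only on node2, and /dev/sda3 carries a different UUID on each node.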

- Both nodes are identical; in fact, one node is a direct mirror (HDD clone)
- I have attached the CIB (crm configure edit contents) and mount trace

------ End of Forwarded Message

-------------- next part --------------
A non-text attachment was scrubbed...
Name: crm_configure.txt
Type: application/octet-stream
Size: 2852 bytes
Desc: crm_configure.txt
URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20110915/fc9bd475/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mount_trace.txt
Type: application/octet-stream
Size: 4747 bytes
Desc: mount_trace.txt
URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20110915/fc9bd475/attachment-0001.obj>