[DRBD-user] Novice trying to restore drbd system.

Felix Frank ff at mpexnet.de
Wed May 18 09:19:02 CEST 2011



On 05/17/2011 09:24 PM, listslut at outofoptions.net wrote:
> I inherited a broken cluster.  With the help of a national vendor I am
> worse off and 'the good node' is a tad hosed.  I upgraded to the latest
> kernel and got everything back to this point. (Note, the good node was
> shut out of the cluster so the other one is still up and working.  That
> one hangs on an 'ls' command that is why I have my doubts about it). 
> There was data on this node this morning.  I think.  I'd prefer not to
> hose the data on this node in case the other node has a problem.  I'm
> hoping I can just bring it up and it syncs and life is good.
> 
> [root@julius init.d]# df
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/cciss/c0d0p1     36562540  10644068  24031240  31% /
> tmpfs                  6147644         0   6147644   0% /dev/shm
> [root@julius init.d]# mount /srv/vmdata/
> /sbin/mount.gfs2: can't open /dev/drbd0: Wrong medium type
> [root@julius init.d]# service drbd stop
> Stopping all DRBD resources.
> [root@julius init.d]# /sbin/drbdadm create-md drbd0
> md_offset 986671665152
> al_offset 986671632384
> bm_offset 986641518592
> 
> Found some data
>  ==> This might destroy existing data! <==
> 
> Do you want to proceed?
> [need to type 'yes' to confirm] no

Good choice.

Start the DRBD service again, then examine the contents of the device
with "file -sL /dev/drbd0" (I would expect it to recognize the gfs2
file system). I believe that to get the data back, you will want to
run some sort of fsck against drbd0.
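
A rough sketch of that inspection, kept read-only on purpose. It
assumes the resource is named "drbd0" and backs /dev/drbd0, as in the
quoted output; the "run" helper and the RUN variable are my own
invention for illustration. By default the script only prints the
commands, and it executes them when called with RUN=1.

```shell
#!/bin/sh
# Read-only look at a DRBD-backed gfs2 device; nothing here writes to disk.
# Assumption: resource "drbd0" maps to /dev/drbd0 (taken from the quoted session).
RES=${RES:-drbd0}
DEV=${DEV:-/dev/drbd0}

# Hypothetical helper: print each command, execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

inspect() {
    run service drbd start      # bring the resources back up
    run cat /proc/drbd          # connection state, roles, disk states
    run drbdadm role "$RES"     # Primary or Secondary on this node?
    run drbdadm dstate "$RES"   # UpToDate, Inconsistent, ...
    run file -sL "$DEV"         # does the device still look like gfs2?
    run fsck.gfs2 -n "$DEV"     # -n answers "no" to every repair question
}

inspect
```

If the resource turns out to be Secondary on this node, DRBD refuses
to open the device, which would also explain the "Wrong medium type"
message from mount. In that case "file" and "fsck.gfs2" will fail the
same way until the role question is settled.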

Speaking of cluster file systems: is this a dual-primary setup? Is the
"good node" Primary?
If so, you will most likely face split brain either way, so syncing
back up won't be easy. Your best shot (and the simplest solution) may
then be to heal your "good node" (find out what's blocking it) and use
it as the sync source.
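
For reference, manual split-brain recovery usually looks roughly like
the following. This is only a sketch: it assumes DRBD 8.3 command
syntax (newer releases place the option differently) and the resource
name "drbd0", and the function name is made up. It prints the steps
instead of running them, because the first half discards data on
whichever node it is run on, so decide which node is the sync source
before typing anything.

```shell
#!/bin/sh
# Prints (does not run) manual split-brain recovery steps for one resource.
# Assumptions: DRBD 8.3 syntax; resource name defaults to "drbd0".
split_brain_plan() {
    res=$1
    cat <<EOF
# On the node whose changes will be thrown away:
drbdadm secondary $res
drbdadm -- --discard-my-data connect $res

# On the node that keeps its data (the sync source), if it is StandAlone:
drbdadm connect $res
EOF
}

split_brain_plan "${1:-drbd0}"
```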

On the "good node", could it be that the dlm is biting you since the
peer node is in trouble?
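
A few read-only status commands may help answer that. This sketch
assumes a cman-based Red Hat cluster stack; which of these tools exist
depends on the cluster version, so anything not installed is skipped.
The function name is hypothetical.

```shell
#!/bin/sh
# Read-only cluster status checks; tools that are not installed are skipped.
# Assumption: a cman-based cluster stack (cman_tool, group_tool, dlm_tool).
check_cluster() {
    for cmd in "cman_tool status" "cman_tool nodes" "group_tool ls" "dlm_tool ls"; do
        tool=${cmd%% *}                      # first word is the binary name
        if command -v "$tool" >/dev/null 2>&1; then
            echo "== $cmd"
            $cmd                             # unquoted on purpose: split into words
        else
            echo "== $cmd (skipped: $tool not found)"
        fi
    done
}

check_cluster
```

A lost quorum, or a fence or lock group stuck waiting on the peer,
would fit a node that hangs on a plain "ls" of the gfs2 mount.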

HTH,
Felix


