I have a two-node active/passive cluster, with DRBD controlled by
corosync/pacemaker.
All storage is based on LVM.
------------------------------------------------------------------------------------
a) How do I know which node of the cluster is currently active?
How can I check if a node is currently in use by the iSCSI-target daemon?
I can try to deactivate a volume group using:
[root@node1 ~]# vgchange -an data
Can't deactivate volume group "data" with 3 open logical volume(s)
If I get a message like the one above, then I know that node1 is the
active node, but is there a better (non-intrusive) way to check?
A better option seems to be 'pvs -v'. If the node is active, it lists
the physical volumes and their volume groups:
[root@node1 ~]# pvs -v
Scanning for physical volume names
PV         VG      Fmt  Attr PSize   PFree DevSize PV UUID
/dev/drbd1 data    lvm2 a-   109.99g 0     110.00g c40m9K-tNk8-vTVz-tKix-UGyu-gYXa-gnKYoJ
/dev/drbd2 tempdb  lvm2 a-   58.00g  0     58.00g  4CTq7I-yxAy-TZbY-TFxa-3alW-f97X-UDlGNP
/dev/drbd3 distrib lvm2 a-   99.99g  0     100.00g l0DqWG-dR7s-XD2M-3Oek-bAft-d981-UuLReC
whereas on the inactive node it gives errors:
[root@node2 ~]# pvs -v
Scanning for physical volume names
/dev/drbd0: open failed: Wrong medium type
/dev/drbd1: open failed: Wrong medium type
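The 'pvs -v' check above could be wrapped in a small script. The following
is only a minimal sketch: the function name classify_pvs_output is made up
for illustration, and it assumes the output always looks like the two
samples shown above (an "open failed: Wrong medium type" error on the
inactive node, /dev/drbdN lines with format lvm2 on the active node).
Anything else is reported as "unknown".

```shell
#!/bin/sh
# classify_pvs_output (hypothetical helper): reads the combined
# stdout/stderr of 'pvs -v' on stdin and prints one word:
#   inactive - an "open failed: Wrong medium type" error was seen
#   active   - at least one /dev/drbdN physical volume was listed
#   unknown  - neither pattern was found
classify_pvs_output() {
    awk '
        # Error line as printed on the inactive node in the sample above.
        /open failed: Wrong medium type/ { failed = 1 }
        # PV line as printed on the active node: device, VG, format, ...
        $1 ~ /^\/dev\/drbd[0-9]+$/ && $3 == "lvm2" { listed = 1 }
        END {
            if (failed)      print "inactive"
            else if (listed) print "active"
            else             print "unknown"
        }
    '
}

# Intended use on a real node (needs root and the LVM tools):
#   pvs -v 2>&1 | classify_pvs_output
```

The errors are redirected into the pipe with 2>&1 because the "open failed"
messages may be written to stderr. This only reads LVM metadata, so it
should be no more intrusive than running 'pvs -v' by hand.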
Any further ideas/comments/suggestions?
------------------------------------------------------------------------------------
b) How can I gracefully fail over to the other node? Up to now, the only
way I know is to force the active node to reboot (by entering two
consecutive 'reboot' commands). This, however, breaks the DRBD
synchronization, and I need to use a fix-split-brain procedure to bring
DRBD back in sync.
On the other hand, if I try to stop the corosync service on the active
node, the command takes forever! I understand that the suggested
procedure is to disconnect all clients from the active node and then
stop the services. Is it a better approach to shut down the public
network interface before stopping the corosync service (in order to
forcibly close client connections)?
Thanks