[DRBD-user] Fault Tolerant NFS

Dennis Jacobfeuerborn dennisml at conversis.de
Wed Jun 6 13:13:13 CEST 2012

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


On 06/06/2012 08:16 AM, Torsten Rosenberger wrote:
> On 06.06.2012 05:50, Yount, William D wrote:
>> I understand what heartbeat does in the general sense. Actually configuring it correctly and making it work the way it is supposed to is the problem.
>>
>> I have read the official DRBD/Heartbeat documentation (http://www.linbit.com/fileadmin/tech-guides/ha-nfs.pdf). That covers an LVM situation that isn't applicable to me. I use LVM but have just one logical volume, so there is no need to group volumes.
>>
>> I have been able to cobble together a set of steps based off of the official documentation and other guides. Different documentation takes different approaches and they often contain contradictory information.
>>
>> I have two servers with two 2 TB hard drives each. I am using software RAID with logical volumes. I have one 50 GB LV for the OS, one 30 GB LV for swap and one 1.7 TB LV for Storage. All I want is to mirror that 1.7 TB LV across the servers and then have Pacemaker/Heartbeat switch over to the second server.
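>>
>> For that layout, a minimal DRBD resource definition would be roughly the sketch below; the hostnames, node addresses and the volume group/LV name are only assumptions to illustrate:
>>
>>     # /etc/drbd.d/r0.res -- sketch only, identical on both nodes
>>     resource r0 {
>>         protocol C;
>>         on nfs1 {                          # assumed hostname
>>             device    /dev/drbd0;
>>             disk      /dev/vg0/storage;    # the 1.7 TB LV (assumed VG/LV name)
>>             address   10.89.99.31:7789;    # assumed node address
>>             meta-disk internal;
>>         }
>>         on nfs2 {                          # assumed hostname
>>             device    /dev/drbd0;
>>             disk      /dev/vg0/storage;
>>             address   10.89.99.32:7789;    # assumed node address
>>             meta-disk internal;
>>         }
>>     }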
>>
>> I am not sure if I need to define nfs-kernel-server, LVM, exportFS and drbd0 as services. I am using the LCMC application to monitor the configuration. 
>>
>> Using the steps that I attached, if the primary server goes down, the secondary does nothing. It doesn't mount /dev/drbd0 to /Storage and it doesn't start accepting traffic on 10.89.99.30. 
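>>
>> If I read the LINBIT guide correctly, each piece (the DRBD promotion, the mount, the export, the virtual IP and the NFS server itself) would end up as a Pacemaker resource tied together with ordering and colocation constraints. A rough crm sketch of what I think is meant (the resource names, client netmask and ext4 are my assumptions):
>>
>>     # crm configure -- sketch only, adjust names/paths to the real setup
>>     primitive p_drbd_r0 ocf:linbit:drbd params drbd_resource=r0 \
>>         op monitor interval=30s role=Slave op monitor interval=20s role=Master
>>     ms ms_drbd_r0 p_drbd_r0 \
>>         meta master-max=1 clone-max=2 clone-node-max=1 notify=true
>>     primitive p_fs ocf:heartbeat:Filesystem \
>>         params device=/dev/drbd0 directory=/Storage fstype=ext4
>>     primitive p_nfs lsb:nfs-kernel-server
>>     primitive p_export ocf:heartbeat:exportfs \
>>         params directory=/Storage clientspec=10.89.99.0/24 options=rw fsid=1
>>     primitive p_ip ocf:heartbeat:IPaddr2 params ip=10.89.99.30 cidr_netmask=24
>>     group g_nfs p_fs p_nfs p_export p_ip
>>     colocation c_nfs_on_drbd inf: g_nfs ms_drbd_r0:Master
>>     order o_drbd_before_nfs inf: ms_drbd_r0:promote g_nfs:start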
>>
>>
>> -----Original Message-----
>> From: Marcel Kraan [mailto:marcel at kraan.net] 
>> Sent: Tuesday, June 05, 2012 5:19 PM
>> To: Yount, William D
>> Cc: Felix Frank; drbd-user at lists.linbit.com
>> Subject: Re: [DRBD-user] Fault Tolerant NFS
>>
>> This is what heartbeat does.
>> It mounts the DRBD disk and starts all the programs listed in haresources; the virtual IP will be up and running on the second server.
>> So basically your first server's role moves to the second one.
>> When the first server comes back up, it will take over again.
>>
>> I can shut down the first or the second server without the service going down (maybe 5 or 10 seconds for switching).
>>
>> works great...
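>>
>> With classic heartbeat/haresources it is basically one line; a minimal sketch (the resource name r0, mount point, filesystem type and virtual IP are only examples):
>>
>>     # /etc/ha.d/haresources -- same line on both nodes
>>     node1 drbddisk::r0 \
>>         Filesystem::/dev/drbd0::/Storage::ext4 \
>>         IPaddr::10.89.99.30/24 \
>>         nfs-kernel-server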
>>
>> On 5 Jun 2012, at 23:59, Yount, William D wrote:
>>
>>> I am looking for a fault tolerant solution. By this, I mean I want there to be an automatic switch over if one of the two storage servers goes down with no human intervention. 
>>>
>>> Initially, I followed this guide: 
>>> https://help.ubuntu.com/community/HighlyAvailableNFS
>>> That works fine, but there are several steps that require human intervention in case of a server failure (the commands are sketched right after this list):
>>> 	Promote secondary server to primary
>>> 	Mount drbd partition to export path
>>> 	Restart nfs-kernel-server (if necessary)
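>>>
>>> For reference, those manual steps on the secondary come down to roughly this (the resource name r0 and the export path are assumed):
>>>
>>>     drbdadm primary r0                  # promote the secondary to primary
>>>     mount /dev/drbd0 /export            # mount the replicated volume on the export path
>>>     service nfs-kernel-server restart   # only if necessary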
>>>
>>> I was trying to get a dual-primary setup working, thinking that if one node goes out the other will take over automatically. There just seem to be so many moving pieces that don't always work the way they are supposed to. I have been reading all the material I can get my hands on, but a lot of it seems contradictory or only applicable to certain OS versions with certain versions of OCFS2, DRBD and Pacemaker. 
>>>
>>> It doesn't matter to me if it is master/slave or dual primaries. I am just trying to find something that actually works.
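>>>
>>> For what it's worth, the dual-primary attempt amounted to little more than something like this in the DRBD resource's net section (plus OCFS2 on top); sketch only:
>>>
>>>     net {
>>>         allow-two-primaries;
>>>     }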
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Felix Frank [mailto:ff at mpexnet.de]
>>> Sent: Tuesday, June 05, 2012 2:42 AM
>>> To: Yount, William D
>>> Cc: drbd-user at lists.linbit.com
>>> Subject: Re: [DRBD-user] Fault Tolerant NFS
>>>
>>> On 06/05/2012 07:41 AM, Yount, William D wrote:
>>>> Does anyone have a good resource for setting up a fault tolerant NFS
>>>> cluster using DRBD? I am currently using DRBD, Pacemaker, Corosync and
>>>> OCFS2 on Ubuntu 12.04.
>>> Those are all right, but I don't really see how OCFS2 is required.
>>> Dual-primary? Not needed for HA NFS.
>>>
>>> But it should still work.
>>>
>>>> High availability doesn't meet my needs. I have spent quite a while 
>>>> reading and trying out every combination of settings, but nothing 
>>>> seems to work properly.
>>> What are the exact limitations you're facing? Stale mounts after failover?
>>
>>
> Hello
> 
> check the 'No Quorum Policy' in the Pacemaker CRM config; the default is "stop", I
> changed it to "suicide".

I may be wrong here, but setting the policy to "suicide" in a two-node
cluster means that if there is a split then *both* nodes will commit suicide,
no? So what you really want is to set the policy to "ignore"?
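
For a two-node cluster the usual sketch seems to be something like the following
(the cluster loses quorum as soon as one node dies, so "stop" or "suicide" would
take the surviving node's resources down as well):

    crm configure property no-quorum-policy=ignore
    # split-brain then has to be handled by fencing (STONITH) instead of quorum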

Regards,
  Dennis



