[DRBD-user] drbd + 10gig network

Fri Oct 23 23:40:24 CEST 2009

Lars Ellenberg wrote:
> On Fri, Oct 16, 2009 at 02:21:40AM -0600, Mike Lovell wrote:
>   
>> Mike Lovell wrote:
>>     
>>> Johan Verrept wrote:
>>>       
>>>> On Wed, 2009-10-14 at 23:21 -0600, Mike Lovell wrote:
>>>>   
>>>>         
>>>>> first off, hello everybody. i'm somewhat new to drbd and definitely 
>>>>> new to the mailing list.
>>>>>
>>>>> i am trying to set up a cheap alternative to an iscsi san using some  
>>>>> somewhat commodity hardware and drbd. i happen to have some 10 
>>>>> gigabit network interfaces around so i thought it would be a great 
>>>>> interconnect for the drbd replication and probably as the 
>>>>> interconnect to the rest of the network.
>>>>>
>>>>> things were going well in my small proof of concept but when i made 
>>>>> the jump to the 10 gigabit network interfaces, i started running 
>>>>> into troubles with drbd not being able to complete a 
>>>>> synchronization. it will get anywhere between 5 and 15 percent done 
>>>>> (on a 2TB volume) and then stall. the only thing i have been able to 
>>>>> do to get things going again is to take down the network interface, 
>>>>> stop drbd, bring back up the interface, start drbd, and wait for it 
>>>>> to stall again. i have to take down the network interface because 
>>>>> drbd won't respond until then.
>>>>>
>>>>> in dmesg on the node with the UpToDate disk, i see errors like this 
>>>>> in the kernel log.
>>>>>
>>>>> [191401.876167] drbd0: Began resync as SyncSource (will sync 1809012776 KB [452253194 bits set]).
>>>>> [191409.068152] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967295
>>>>> [191416.533556] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967294
>>>>> [191423.531804] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967293
>>>>> [191429.888326] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967292
>>>>> [191437.658299] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967291
>>>>>
>>>>> in my trouble shooting, i tried changing the replication to use the 
>>>>> gigabit network interfaces already in the system and the 
>>>>> synchronization completed. i also tried a newer kernel and a new 
>>>>> version of drbd.
>>>>>
>>>>> i am doing this on debian lenny using the 2.6.26 kernel and drbd 
>>>>> 8.0.14 that are with the distro. the system is a single opteron 
>>>>> 2346 on a supermicro h8dme-2 with an intel 10 gigabit nic. the 
>>>>> underlying device is a software raid10 with linux md. i did try a 
>>>>> 2.6.30 kernel and drbd 8.3 but it didn't help.
>>>>>
>>>>> has anyone seen anything like this or have any recommendations?
>>>>>     
>>>>>           
>>>> <disclaimer> I am not an expert at drbd </disclaimer>
>>>>
>>>> I have seen similar things (stalling drbd) mentioned on the mailing
>>>> list. Mostly the reaction is a finger pointing first to your network
>>>> interface/drivers. Perhaps you should look into that first? From your
>>>> symptoms, I would strongly suspect the problem is there (especially
>>>> since it works fine once you switch interfaces). Perhaps run a few iperf
>>>> tests to see if it runs smoothly?
>>>>
>>>> 	J.
>>>>
>>>>   
>>>>         
>>> i realized right after i sent my request that i hadn't done any load  
>>> or integrity testing on the 10 gigabit interfaces since i moved them  
>>> around and reinstalled the OS. i had previously used these nics for  
>>> stuff other than drbd and so i assumed that things were still  
>>> operating properly. i am going to start some testing on the interfaces  
>>> and see if i see any problems but considering my previous experience  
>>> with these cards, i'm doubting that is the problem. no harm in  
>>> checking though. i'll let the list know the results of my test.
>>>
>>> has anyone else on the list been able to do drbd over 10 gigabit links  
>>> before and been successful with it? if so, what was your hardware and  
>>> software set up to do it?
>>>       
>> i did some performance and load testing on the 10 gig interfaces today.  
>> using a variety of methods, i moved > 10 TiB of data across the link  
>> without dropped packets or connection interruptions. i did things like `cat  
>> /dev/zero | nc` on one box to `nc > /dev/null` on the other, and ran iperf  
>> and NPtcp between the nodes. no kernel errors, no connection drops, no  
>> dropped packets listed in ifconfig for the devices. i even just tried  
>> building the latest drivers for the nic from intel and the problem 
>> remains.
>>
>> any other thoughts?
>>     
>
> try DRBD 8.3.4.
> It handles some settings more gracefully.
>
> On <= 8.3.2, try decreasing sync-rate, and increase "max-buffers".
>
>   
i spent some more time on this problem and still haven't been able to 
resolve it. i tried changing from the opteron platform that i was 
originally using to a xeon (nehalem) platform which has the IOAT and DCA 
optimizations but using the same nics. that didn't fix the problem. it 
did greatly improve the performance while it was sync'ing but also 
exaggerated the problem. when the sync hangs, the drbd module is almost 
completely unresponsive. i tried doing a pause-sync and then resume-sync 
thinking that it would nudge the module into working but the commands 
time out talking to the module. i can still cat /proc/drbd but that is 
about it until i take down the network interface and drbd detects the 
network change. if i then bring the interface back up, drbd detects it 
can talk again but then only syncs a couple of megabytes before stalling 
again. i have tried every way i can think of to check the integrity of 
the network link between the hosts and everything says it is fine, 
except that during a ping flood a few out of a couple hundred thousand 
packets get dropped. but tcp should be able to handle that amount of 
loss without coughing.
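
To put a rough number on the claim about loss, here is a minimal sketch using the simplified Mathis et al. approximation for steady-state TCP throughput, rate <= (MSS / RTT) * C / sqrt(p). Everything fed into it is an assumption for illustration: 3 drops out of 200,000 stands in for "a few out of a couple hundred thousand", the MSS of 1460 bytes and the RTT of 0.1 ms are guesses for a back-to-back 10 gigabit link, and ping loss is only a loose proxy for TCP segment loss.

```python
import math

# Constant from the simplified Mathis model, sqrt(3/2), roughly 1.22.
MATHIS_C = math.sqrt(3.0 / 2.0)


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Approximate upper bound on TCP throughput in bits per second.

    mss_bytes: maximum segment size in bytes
    rtt_s:     round-trip time in seconds
    loss_rate: packet loss probability (0 < p <= 1)
    """
    if rtt_s <= 0 or not (0 < loss_rate <= 1):
        raise ValueError("rtt_s must be > 0 and loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * MATHIS_C / math.sqrt(loss_rate)


if __name__ == "__main__":
    # Hypothetical inputs, not measurements from this thread.
    dropped, sent = 3, 200_000
    loss = dropped / sent
    bps = mathis_throughput_bps(mss_bytes=1460, rtt_s=0.0001, loss_rate=loss)
    print(f"loss rate {loss:.6%}, estimated ceiling {bps / 1e9:.1f} Gbit/s")
```

With these assumed inputs the estimated ceiling comes out above 10 Gbit/s, so loss at that level alone would not be expected to stall a transfer. A longer RTT or a burstier loss pattern would change the picture.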

but, since i don't have any other 10gig equipment to test with, i can't 
say for sure that it is not the driver or network cards. i was able to 
convince my boss to let me buy two new 10gig nics so that i can test on 
a different stack. does anyone on the list have any preferences on 
network cards or chipsets for 10gig ethernet cards? i have been using 
ones with an intel 82598 chipset. i am eyeing ones from myricom and 
chelsio. does anyone have any experience with these or any other 
recommendations?

thanks

mike
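
For reference, the numbers in the kernel log quoted earlier in this message can be sanity-checked with a little arithmetic. The sketch below assumes one bitmap bit covers 4 KiB, and it reads the "ko" values as an unsigned 32-bit counter that was decremented from 0 (which would fit a ko-count of 0). Both are interpretations, not something the log states. The sync rates in the last loop are hypothetical.

```python
# Values copied from the kernel log quoted earlier in this message.
RESYNC_KB = 1809012776
BITS_SET = 452253194
KO_VALUES = [4294967295, 4294967294, 4294967293, 4294967292, 4294967291]

# Assumption: each out-of-sync bitmap bit stands for 4 KiB of data.
KB_PER_BIT = 4


def bits_to_kb(bits, kb_per_bit=KB_PER_BIT):
    """Convert a count of set bitmap bits to kilobytes."""
    return bits * kb_per_bit


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return value - (1 << 32) if value >= (1 << 31) else value


def hours_to_sync(size_kb, rate_mb_per_s):
    """Hours needed to move size_kb at a steady rate (1 MB taken as 1024 KB)."""
    return size_kb / (rate_mb_per_s * 1024) / 3600


if __name__ == "__main__":
    print(bits_to_kb(BITS_SET) == RESYNC_KB)      # True
    print([as_signed32(v) for v in KO_VALUES])    # [-1, -2, -3, -4, -5]
    for rate in (100, 1000):                      # hypothetical rates in MB/s
        print(rate, "MB/s:", round(hours_to_sync(RESYNC_KB, rate), 2), "hours")
```

The bit count times 4 matches the reported size exactly, and the ko values map to -1 through -5, which is what a counter wrapping below zero would print when formatted as unsigned.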