[DRBD-user] drbd + 10gig network

Mike Lovell mike at dev-zero.net
Thu Oct 15 18:36:20 CEST 2009


Johan Verrept wrote:
> On Wed, 2009-10-14 at 23:21 -0600, Mike Lovell wrote:
>   
>> first off, hello everybody. i'm somewhat new to drbd and definitely new 
>> to the mailing list.
>>
>> i am trying to set up a cheap alternative to an iscsi san using some 
>> somewhat commodity hardware and drbd. i happen to have some 10 gigabit 
>> network interfaces around so i thought it would be a great interconnect 
>> for the drbd replication and probably as the interconnect to the rest of 
>> the network.
>>
>> things were going well in my small proof of concept but when i made the 
>> jump to the 10 gigabit network interfaces, i started running into 
>> troubles with drbd not being able to complete a synchronization. it will 
>> get anywhere between 5 and 15 percent done (on a 2TB volume) and then 
>> stall. the only thing i have been able to do to get things going again 
>> is to take down the network interface, stop drbd, bring back up the 
>> interface, start drbd, and wait for it to stall again. i have to take 
>> down the network interface because drbd won't respond until then.
>>
>> in dmesg on the node with the UpToDate disk, i see errors like this in 
>> the kernel log.
>>
>> [191401.876167] drbd0: Began resync as SyncSource (will sync 1809012776 KB [452253194 bits set]).
>> [191409.068152] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967295
>> [191416.533556] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967294
>> [191423.531804] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967293
>> [191429.888326] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967292
>> [191437.658299] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967291
>>
>> in my troubleshooting, i tried changing the replication to use the 
>> gigabit network interfaces already in the system and the synchronization 
>> completed. i also tried a newer kernel and a new version of drbd.
>>
>> i am doing this on debian lenny using the 2.6.26 kernel and drbd 8.0.14 
>> that are with the distro. the system is a single opteron 2346 on a 
>> supermicro h8dme-2 with an intel 10 gigabit nic. the underlying device is 
>> a software raid10 with linux md. i did try a 2.6.30 kernel and drbd 8.3 
>> but it didn't help.
>>
>> has anyone seen anything like this or have any recommendations?
>>     
>
> <disclaimer> I am not an expert at drbd </disclaimer>
>
> I have seen similar things (stalling drbd) mentioned on the mailing
> list. Mostly the reaction is a finger pointing first to your network
> interface/drivers. Perhaps you should look into that first? From your
> symptoms, I would strongly suspect the problem is there (especially
> since it works fine once you switch interfaces). Perhaps run a few iperf
> tests to see if it runs smoothly?
>
> 	J.
>
>   
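for reference, the timeouts in the quoted log are easier to read with the mail wrapping undone. the sketch below pulls the timestamp and the ko value out of each "sock_sendmsg time expired" line, prints the gap between consecutive timeouts, and reinterprets ko as a signed 32-bit number. 4294967295 is 2^32 - 1, so the values read as -1, -2, -3 and so on. my guess, which the log itself does not confirm, is that the counter started at 0 and wrapped when it was decremented. the helper names (parse_timeouts, as_signed32) are made up for this example.

```python
import re

# log lines copied from the quoted dmesg output, with the mail wrapping undone
LOG = """\
[191409.068152] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967295
[191416.533556] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967294
[191423.531804] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967293
[191429.888326] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967292
[191437.658299] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967291
"""

# kernel timestamp in brackets, then the ko value at the end of the message
PATTERN = re.compile(r"\[\s*(\d+\.\d+)\].*sock_sendmsg time expired, ko = (\d+)")


def parse_timeouts(text):
    """Return (timestamp_seconds, ko_value) for every send-timeout line."""
    return [(float(m.group(1)), int(m.group(2))) for m in PATTERN.finditer(text)]


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed one."""
    return value - 2**32 if value >= 2**31 else value


events = parse_timeouts(LOG)
gaps = [round(later[0] - earlier[0], 3) for earlier, later in zip(events, events[1:])]
signed = [as_signed32(ko) for _, ko in events]

print("seconds between timeouts:", gaps)
print("ko as signed 32-bit:", signed)
```

the gaps come out at roughly 6 to 8 seconds each, so the sender is hitting its send timeout over and over rather than failing once.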
i realized right after i sent my request that i hadn't done any load or 
integrity testing on the 10 gigabit interfaces since i moved them around 
and reinstalled the OS. i had previously used these nics for stuff other 
than drbd and so i assumed that things were still operating properly. i 
am going to start some testing on the interfaces to see if any problems 
turn up, but considering my previous experience with these cards, i 
doubt that is the problem. no harm in checking though. i'll let the 
list know the results of my tests.
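as a rough idea of what a combined load and integrity check looks like, here is a minimal sketch in python. it is not iperf and not a replacement for it: it pushes a known byte stream over one tcp connection, hashes it on both ends, and reports the rate. everything here (run_test, receive_all, the 64 KiB chunk size) is made up for illustration, and as written both ends run on the loopback address. to exercise the 10 gigabit link, the receiving half would have to run on the other node.

```python
import hashlib
import socket
import threading
import time

CHUNK = 64 * 1024  # bytes per send call (arbitrary choice for this sketch)


def receive_all(server_sock, result):
    """Accept one connection, then count and hash everything it sends."""
    conn, _ = server_sock.accept()
    digest = hashlib.sha256()
    total = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:  # sender closed the connection
                break
            digest.update(data)
            total += len(data)
    result["bytes"] = total
    result["sha256"] = digest.hexdigest()


def run_test(total_bytes, host="127.0.0.1"):
    """Send total_bytes over tcp; return (bytes_received, intact, mbit_per_s)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    result = {}
    receiver = threading.Thread(target=receive_all, args=(server, result))
    receiver.start()

    payload = bytes(range(256)) * (CHUNK // 256)  # repeating test pattern
    digest = hashlib.sha256()
    sent = 0
    start = time.monotonic()
    with socket.create_connection((host, port)) as client:
        while sent < total_bytes:
            block = payload[: min(CHUNK, total_bytes - sent)]
            client.sendall(block)
            digest.update(block)
            sent += len(block)
    receiver.join()  # wait until the receiver has seen end of stream
    elapsed = max(time.monotonic() - start, 1e-9)
    server.close()

    intact = result["sha256"] == digest.hexdigest()
    mbit_per_s = result["bytes"] * 8 / elapsed / 1e6
    return result["bytes"], intact, mbit_per_s


if __name__ == "__main__":
    received, intact, rate = run_test(16 * 1024 * 1024)
    print(f"received {received} bytes, intact={intact}, {rate:.0f} Mbit/s")
```

a single python tcp stream is unlikely to fill a 10 gigabit link, so the rate it reports is only a lower bound. the parts that matter are whether the transfer finishes and whether the hashes match.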

has anyone else on the list run drbd over 10 gigabit links 
successfully? if so, what hardware and software setup did you 
use?

thx.