[DRBD-user] drbd + 10gig network

Mike Lovell mike at dev-zero.net
Thu Oct 15 07:21:49 CEST 2009

first off, hello everybody. i'm somewhat new to drbd and definitely new 
to the mailing list.

i am trying to set up a cheap alternative to an iscsi san using fairly 
commodity hardware and drbd. i happen to have some 10 gigabit network 
interfaces around, so i thought they would make a great interconnect for 
the drbd replication and probably also for the link to the rest of the 
network.

things were going well in my small proof of concept, but when i made the 
jump to the 10 gigabit network interfaces, i started running into 
trouble with drbd not being able to complete a synchronization. it will 
get anywhere between 5 and 15 percent done (on a 2TB volume) and then 
stall. the only thing i have been able to do to get things going again 
is to take down the network interface, stop drbd, bring the interface 
back up, start drbd, and wait for it to stall again. i have to take 
down the network interface because drbd won't respond until then.

on the node with the UpToDate disk, dmesg shows errors like this in the 
kernel log.

[191401.876167] drbd0: Began resync as SyncSource (will sync 1809012776 KB [452253194 bits set]).
[191409.068152] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967295
[191416.533556] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967294
[191423.531804] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967293
[191429.888326] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967292
[191437.658299] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967291
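
as a rough sanity check, the numbers in the log excerpt above can be 
pulled apart with a short python sketch. it is only an illustration: the 
helper names (parse_log, to_signed32, intervals) are made up here, and 
reading the ko value as an unsigned 32-bit counter that started at 0 and 
wrapped when decremented is an assumption, not something the log itself 
states.

```python
import re

# the log lines quoted above, with the mail line-wrapping undone
LOG = """\
[191401.876167] drbd0: Began resync as SyncSource (will sync 1809012776 KB [452253194 bits set]).
[191409.068152] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967295
[191416.533556] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967294
[191423.531804] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967293
[191429.888326] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967292
[191437.658299] drbd0: [drbd0_worker/24334] sock_sendmsg time expired, ko = 4294967291
"""


def parse_log(log):
    """Return (kb_to_sync, bits_set, [(timestamp, ko), ...])."""
    resync = re.search(r"will sync (\d+) KB \[(\d+) bits set\]", log)
    kb, bits = int(resync.group(1)), int(resync.group(2))
    events = [
        (float(ts), int(ko))
        for ts, ko in re.findall(
            r"\[\s*(\d+\.\d+)\].*?sock_sendmsg time expired, ko = (\d+)", log
        )
    ]
    return kb, bits, events


def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as signed (assumed wraparound)."""
    return value - 2**32 if value >= 2**31 else value


def intervals(events):
    """Seconds between consecutive timeout messages."""
    times = [ts for ts, _ in events]
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    kb, bits, events = parse_log(LOG)
    print("KB per bitmap bit:", kb / bits)
    print("ko as signed 32-bit:", [to_signed32(ko) for _, ko in events])
    print("gaps between timeouts (s):", [round(g, 2) for g in intervals(events)])
```

read this way, each set bit in the resync bitmap covers 4 KB, the send 
timeouts repeat every 6 to 8 seconds, and the ko counter is simply 
counting down from a wrapped value rather than approaching zero.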

while troubleshooting, i tried changing the replication to use the 
gigabit network interfaces already in the system and the synchronization 
completed. i also tried a newer kernel and a newer version of drbd.

i am doing this on debian lenny using the 2.6.26 kernel and drbd 8.0.14 
that ship with the distro. the system is a single opteron 2346 on a 
supermicro h8dme-2 with an intel 10 gigabit nic. the underlying device is 
a software raid10 with linux md. i did try a 2.6.30 kernel and drbd 8.3 
but that didn't help.

has anyone seen anything like this or have any recommendations?

thanks in advance


