[DRBD-user] connect error -22 with SDP/InfiniBand

J. Ryan Earl oss at jryanearl.us
Fri Sep 17 20:04:52 CEST 2010

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Hello Lars =)

Reply inline:

On Fri, Sep 17, 2010 at 4:22 AM, Lars Ellenberg
<lars.ellenberg at linbit.com> wrote:

>
> -EINVAL
>
> iirc, it is a bug inside the in-kernel SDP connect() peer lookup,
> which EINVALs if the target address is not given as AF_INET (!),
> even if the socket itself is AF_INET_SDP.
> Or the other way around.
>

I see, great info.
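
Just so I'm sure I follow, the symptom would look something like the snippet
below, where the family of the sockaddr handed to connect() decides whether
the peer lookup returns -EINVAL, independent of the socket's own family.
This is only my own illustration, not DRBD's code; AF_INET_SDP is not in the
standard headers, and 27 is the value I believe OFED's sdp_inet.h uses, so
treat that as an assumption:

#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <sys/socket.h>

#ifndef AF_INET_SDP
#define AF_INET_SDP 27          /* assumed: value from OFED's sdp_inet.h */
#endif

int main(void)
{
        struct sockaddr_in peer;
        int fd = socket(AF_INET_SDP, SOCK_STREAM, 0);   /* SDP socket */

        if (fd < 0) {
                perror("socket(AF_INET_SDP)");
                return 1;
        }

        memset(&peer, 0, sizeof(peer));
        peer.sin_port = htons(7778);
        inet_pton(AF_INET, "192.168.20.2", &peer.sin_addr);

        /* One of these two families makes the in-kernel peer lookup
         * return -EINVAL, the other connects -- per your description,
         * AF_INET is the one that works. */
        peer.sin_family = AF_INET_SDP;   /* "sdp" style target address */
        /* peer.sin_family = AF_INET; */ /* "ipv4" style target address */

        if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0)
                perror("connect");
        return 0;
}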


>
> If you do "drbdadm -d connect $resource", you get the drbdsetup
> command that would have been issued.
> Replace the second (remote) sdp with ipv4,
> and do them manually, on both nodes.
> If that does not work, replace only the first (local) sdp with ipv4,
> but keep the second (remote) sdp.
>

The first suggestion seems to work:

[root@node01 ~]# drbdsetup 0 net sdp:192.168.20.1:7778 ipv4:192.168.20.2:7778 C \
    --set-defaults --create-device --max-epoch-size=20000 --max-buffers=20000 \
    --after-sb-2pri=disconnect --after-sb-1pri=discard-secondary \
    --after-sb-0pri=discard-zero-changes --allow-two-primaries

[root@node02 ~]# drbdsetup 0 net sdp:192.168.20.2:7778 ipv4:192.168.20.1:7778 C \
    --set-defaults --create-device --max-epoch-size=20000 --max-buffers=20000 \
    --after-sb-2pri=disconnect --after-sb-1pri=discard-secondary \
    --after-sb-0pri=discard-zero-changes --allow-two-primaries



>
> If that gets you connected, then it's that bug.
> I think I even patched it in kernel once,
> but don't find that right now,
> and don't remember the SDP version either.
> I think it was
> drivers/infiniband/ulp/sdp/sdp_main.c:addr_resolve_remote()
> missing an (... || ... == AF_INET_SDP)
>

Hmmm.  It may be an MLNX_OFED-specific bug?  If we can get the code, I have a
support path with Mellanox and can probably get this pushed into their
upstream OFED.  I'll look at it and see if I can figure it out.
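
If it does turn out to be that check, I would guess the change is something
along these lines -- purely a sketch based on the path and hint you gave,
since I don't have the MLNX_OFED SDP source in front of me yet, and the
function name and constants here are my assumptions:

/* Hypothetical sketch of the check in
 * drivers/infiniband/ulp/sdp/sdp_main.c:addr_resolve_remote();
 * not the actual MLNX_OFED source. */
#include <linux/errno.h>
#include <linux/socket.h>

#ifndef AF_INET_SDP
#define AF_INET_SDP 27  /* assumed: OFED's sdp_inet.h value */
#endif

static int sdp_validate_peer_family(const struct sockaddr *uaddr)
{
        /* Suspected bug: only AF_INET passes, so an AF_INET_SDP peer
         * address is rejected with -EINVAL even though the socket
         * itself is SDP.  The fix would be the extra
         * "|| ... == AF_INET_SDP" you mention. */
        if (uaddr->sa_family != AF_INET && uaddr->sa_family != AF_INET_SDP)
                return -EINVAL;
        return 0;
}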


>
> That's all userland, and does not affect DRBD, as DRBD does all
> networking from within the kernel.
>

Ahhh, right.


> Share your findings on DRBD performance IPoIB vs. SDP,
> once you get the thing to work on your platform.


Well, using the drbdsetup method above, SDP is significantly slower than
IPoIB.  Below are some write results; the HAStorage VG uses drbd0 as a PV:

With IPoIB:

# time dd if=/dev/zero of=/dev/HAStorage/test bs=4096M count=5
0+5 records in
0+5 records out
10737397760 bytes (11 GB) copied, 17.8384 seconds, 602 MB/s

real    0m17.864s
user    0m0.000s
sys     0m10.292s

With SDP, using the drbdsetup sdp/ipv4 workaround above:

# time dd if=/dev/zero of=/dev/HAStorage/test bs=4096M count=5
0+5 records in
0+5 records out
10737397760 bytes (11 GB) copied, 26.2015 seconds, 410 MB/s

real    0m26.220s
user    0m0.001s
sys     0m11.891s


The underlying storage is a 16-disk RAID10 with 1.5 GB of flash-backed write
cache.  Local writes to the storage run at around 900 MB/s sustained and burst
at several GB/s into the write cache.  I suspect a proper fix for the
in-kernel SDP connect() would also fix SDP performance here.  At least I
would hope so!

Thanks for the information, very helpful,
-JR

