>>>>> "Lars" == Lars Ellenberg <lars.ellenberg at linbit.com> writes:
Hi Lars,
>> Now to the more serious problem.
>> Do you have any hint on how to start
>> debugging the SDP connect problem?
Lars> Sorry.
Lars> The workaround mentioned before, respectively patching OFED kernel[*],
Lars> did work last time I tried.
Lars> Performance tuning is a different thing altogether.
Lars> [*] I think it was something like this
Lars> diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
Lars> index ce511d8..26ef4c4 100644
Lars> --- a/drivers/infiniband/core/addr.c
Lars> +++ b/drivers/infiniband/core/addr.c
Lars> @@ -306,7 +306,7 @@ static int addr_resolve_remote(struct sockaddr *src_in,
Lars>                                struct sockaddr *dst_in,
Lars>                                struct rdma_dev_addr *addr)
Lars>  {
Lars> -       if (src_in->sa_family == AF_INET) {
Lars> +       if (src_in->sa_family == AF_INET || src_in->sa_family == AF_INET_SDP) {
Lars>                 return addr4_resolve_remote((struct sockaddr_in *) src_in,
Lars>                         (struct sockaddr_in *) dst_in, addr);
Lars>         } else
Thanks for the hint. The patch above didn't cut it yet. In fact, the
function addr_resolve_remote doesn't even exist anymore; there is just a
similar function, addr_resolve, which needed the additional AF_INET_SDP
check above. However, this was still not the solution. Fortunately, the
sdp module has a debug mode that helped me to create a workaround. The
point is that the src address comes in with a different sa_family than
the dst address (src AF_INET, dst AF_INET_SDP). The sdp code would then
stop with an error. My workaround fixed this in the IB code, but I
suspect that drbd might be making a mistake here by leaving the src
(local port) sa_family at AF_INET. To resolve this, it would be helpful
to know exactly where in the drbd code the sa_family of the src and dst
addresses is set before the first connection.
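
To make the mismatch concrete, here is a rough user-space sketch of the
idea, not the actual IB or drbd code. The helper names are hypothetical,
and the value 27 for AF_INET_SDP is an assumption taken from the OFED
headers as I remember them; standard headers do not define it.

```c
#include <stdbool.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Assumption: OFED's SDP uses address family 27; adjust if your headers differ. */
#ifndef AF_INET_SDP
#define AF_INET_SDP 27
#endif

/* Hypothetical helper: treat AF_INET_SDP as plain IPv4 when comparing families. */
static sa_family_t normalize_family(sa_family_t family)
{
    return family == AF_INET_SDP ? AF_INET : family;
}

/* Hypothetical helper: true if src and dst differ at most by the SDP tag. */
static bool families_compatible(const struct sockaddr *src,
                                const struct sockaddr *dst)
{
    return normalize_family(src->sa_family) ==
           normalize_family(dst->sa_family);
}

/*
 * Hypothetical helper: if src and dst are both IPv4-style addresses but
 * carry different family tags (src AF_INET, dst AF_INET_SDP), copy the
 * dst tag onto src so that later code sees a matching pair.
 */
static void align_src_family(struct sockaddr *src,
                             const struct sockaddr *dst)
{
    if (src->sa_family != dst->sa_family && families_compatible(src, dst))
        src->sa_family = dst->sa_family;
}
```

Calling align_src_family() on the pair before the connect attempt would
leave both addresses tagged AF_INET_SDP. Whether this belongs in drbd or
in the IB code is exactly the open question.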
Roland