> >> When node A starts back up, the SCTP protocol notices this (as it's
> >> supposed to), and delivers an SCTP_ASSOC_CHANGE / SCTP_RESTART
> >> notification to the SCTP socket, telling the socket owner (the dlm_recv
> >> thread) that the other node has restarted. DLM responds by telling SCTP
> >> to create a clone of the master socket, for use in communicating with
> >> the newly restarted node. (This is an SCTP_SOCKOPT_PEELOFF request.)
> >> And this is where things go wrong: the SCTP_SOCKOPT_PEELOFF request is
> >> designed to be called from user space, not from a kernel thread, and so
> >> it /does/ allocate a file descriptor for the new socket. Since DLM is
> >> calling it from a kernel thread, the kernel thread now has an open file
> >> descriptor (#0) to that socket. And since kernel threads share the same
> >> file descriptor, every kernel thread on the system has this open
> >> descriptor. So defect #1 is that DLM is calling an SCTP user-space
> >> interface from a kernel thread, which results in pollution of the kernel
> >> thread file descriptor table.

Thanks for that analysis. As you point out, SCTP is only ever really
used or tested from user space, not from the kernel like the dlm does.
So I'm not surprised to hear about problems like this. I don't know how
difficult it might be to fix that. I'd also expect to find other
problems like it with dlm+sctp. Some experienced time and attention is
probably needed to move the dlm's sctp support beyond experimental.

Dave
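
[Illustration, not part of the original message.] The descriptor number
(#0) mentioned in the quoted analysis is what one would expect if the
calling task had no open files: a new descriptor normally gets the lowest
free slot in the table. The sketch below is only a user-space analogy,
not DLM or SCTP code. The helper name lowest_free_fd_demo is hypothetical,
and an AF_UNIX socket stands in for the peeled-off SCTP socket.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: frees slot 0 of the descriptor table, creates a
 * socket, and returns the descriptor number that socket was given.
 * Stdin is saved first and restored afterwards.
 */
int lowest_free_fd_demo(void)
{
    /* Park stdin at a descriptor >= 10; yields -1 if fd 0 was not open. */
    int saved = fcntl(0, F_DUPFD, 10);

    /* Slot 0 is now free, as it would be in a task with no open files. */
    close(0);

    /* A newly allocated descriptor takes the lowest free slot. */
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    int got = fd;

    if (fd >= 0)
        close(fd);
    if (saved >= 0) {
        dup2(saved, 0);   /* put stdin back on fd 0 */
        close(saved);
    }
    return got;
}
```

Calling lowest_free_fd_demo() from an ordinary process should return 0
when socket creation succeeds. In user space that descriptor belongs to
the calling process alone; the problem described above is that the same
allocation happens in a table shared by kernel threads.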