[DRBD-user] Kernel panic from drbdadm on CentOS 6

David Coulson david at davidcoulson.net
Tue Aug 16 12:30:18 CEST 2011

Dominik-

The last thing I got from RedHat when I sent them a vmcore was:

"We've gone through the analysis of the cores that you have provided it 
looks as though the panic in both cases was from trying to dereference a 
NULL 'sk' pointer from the 'sock_net' function."

so it sounds like we're both experiencing the same thing as this person:

http://lists.linbit.com/pipermail/drbd-user/2010-August/014619.html
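
For what it's worth, sock_net() in the kernel is just a one-line
accessor that dereferences 'sk' with no NULL check, so any path that
hands it a stale or already-torn-down socket will oops. Here is a rough
userspace analogue of the pattern - the struct layouts are minimal
stand-ins for illustration, not the real kernel definitions from
include/net/sock.h:

    #include <stdio.h>

    /* Minimal stand-ins: only the pointer relationship matters here.
     * The real kernel types are much larger. */
    struct net  { int ns_id; };
    struct sock { struct net *sk_net; };

    /* Mirrors the shape of the kernel's sock_net(): a plain accessor
     * that dereferences 'sk' unconditionally, with no NULL check. */
    static struct net *sock_net(const struct sock *sk)
    {
        return sk->sk_net;       /* sk == NULL blows up right here */
    }

    int main(void)
    {
        struct sock *sk = NULL;  /* e.g. a connection torn down under us */
        /* In the kernel this NULL dereference panics the box; in
         * userspace it is just a segfault. */
        printf("ns id: %d\n", sock_net(sk)->ns_id);
        return 0;
    }

If drbdadm's ioctl path can race with connection teardown and reach
sock_net() with a socket that has already been freed or cleared, that
would line up with the panic hitting exactly when a node reboots or the
interface bounces - but that last part is my speculation, not something
RedHat confirmed.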

I can reproduce it every time I reboot one of my nodes (I am running 
cman/clvmd/pacemaker/gfs2 on top of DRBD). I am in the process of 
getting DRBD support from Linbit to actually resolve this issue, but 
internal politics where I work are making it take longer than I 
expected.

I do have a pair of older RHEL6 systems (pretty much the initial 6.0 
release plus a couple of patches from before February) running DRBD 
happily. I took my kernel back to an earlier release on the unstable 
boxes, but that didn't seem to do much for me.

David

On 8/15/11 10:38 AM, Dominik Epple wrote:
> Hi list,
>
> we are facing kernel panics on CentOS 6 (RHEL 6 compatible) with kernel version 2.6.32-71.29.1.el6.x86_64 and DRBD version 8.3.11 built from source.
>
> Some days ago David Coulson reported the same problem in a thread called "DRBD trace with RHEL6.1 upgrade". The last mail in that thread (Wed Aug 3 16:31:56 CEST 2011) has a screenshot of the call trace (http://i.imgur.com/cSOzV.png). Since I have no (easy) means of taking a screenshot of the call trace on my machine, I cannot provide one here, but it is the very same problem, with the drbdadm process and vfs_ioctl in the call trace.
>
> I cannot say, unfortunately, how to reproduce the panic; I was unable to find a single event that triggers it reproducibly. But it seems that DRBD needs to be running under a cluster management system (pacemaker/corosync in my case) for the panic to occur.
>
> Actions that can trigger those panics include:
>     * Starting the pacemaker drbd resource via "crm resource start <resourcename>"
>     * Restarting the cluster management system via "service corosync restart"
>     * Doing an "ifdown eth0; ifup eth0" on the interface DRBD runs over
>
> Since the other thread has no answer to this problem, I ask here again: where does this panic come from? Is there a known solution, patch, or workaround?
>
> Thanks and regards
> Dominik Epple
>


