[DRBD-user] drbd kernel BUG: unable to handle kernel NULL pointer dereference at 0000000000000038

France mailinglists at isg.si
Fri Mar 16 14:26:53 CET 2012


On 16/3/12 1:29 PM, Florian Haas wrote:
> On Fri, Mar 16, 2012 at 10:36 AM, France <mailinglists at isg.si> wrote:
>> Hi,
>>
>> I'm hitting a bug in DRBD with the latest CentOS and DRBD 8.3.12, using GFS2 on
>> top with cman and rgmanager.
>>
>> Here is the simplest way to reproduce it:
>> 1. Start drbd on node s2
>> 2. Start drbd on node s3
>> They sync up:
>> [root at s3 ~]# cat /proc/drbd
>> version: 8.3.12 (api:88/proto:86-96)
>> GIT-hash: e2a8ef4656be026bbae540305fcb998a5991090f build by dag at Build64R6, 2011-11-20 10:57:03
>>   0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
>>     ns:0 nr:45060 dw:45056 dr:660 al:0 bm:11 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
>> 3. Start cman on s2 & s3 so I can use GFS2; the cluster comes up OK:
>> [root at s3 ~]# cman_tool status
>> Version: 6.2.0
>> Config Version: 8
>> Cluster Name: stor
>> Cluster Id: 61164
>> Cluster Member: Yes
>> Cluster Generation: 140
>> Membership state: Cluster-Member
>> Nodes: 2
>> Expected votes: 1
>> Total votes: 2
>> Node votes: 1
>> Quorum: 1
>> Active subsystems: 7
>> Flags: 2node
>> Ports Bound: 0
>> Node name: s3alt.c.XX.si
>> Node ID: 3
>> Multicast addresses: 239.192.238.219 239.192.0.2
>> Node addresses: 192.168.168.3 10.31.0.42
>> 4. Start gfs2 on both nodes:
>> Mar 16 10:29:41 s3 kernel: GFS2 (built Mar  7 2012 00:54:51) installed
>> Mar 16 10:29:41 s3 kernel: GFS2: fsid=: Trying to join cluster "lock_dlm", "stor:drbdstor"
>> Mar 16 10:29:41 s3 kernel: dlm: Using SCTP for communications
> That's the root cause of your problem.
>
> This is a known issue, and although it's apparently DRBD that's
> causing this panic, its own code isn't to blame for this.
> DLM-over-SCTP isn't fully supported, and unless you can entice (i.e.
> pay) someone to fix it, you won't get this to work reliably.
>
> Sadly, it's apparently impossible to force DLM-over-TCP for multihomed
> hosts, so the only way to work around this seems to be to just run the
> DLM on a box with a single (possibly bonded) network interface, silly as
> that may sound.
>
> A more detailed discussion of this issue is here:
>
> http://www.mail-archive.com/drbd-user@lists.linbit.com/msg04492.html
>
> Hope this helps.
>
> Cheers,
> Florian
>
Thank you for your reply, Florian.
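
For anyone checking whether a node is in the same situation, the two symptoms quoted in this thread (more than one entry on the "Node addresses:" line of cman_tool status, and the "dlm: Using SCTP for communications" kernel message) can be looked for with a few lines of Python. This is only a minimal sketch: the function names are hypothetical, it assumes the output formats shown above, and it only reads text that you pass in. It does not run any commands or change any configuration.

```python
import re


def node_addresses(status_text):
    """Return the addresses listed on the 'Node addresses:' line.

    Assumes text in the format of the `cman_tool status` output quoted
    in this thread. Leading '>' quote markers are tolerated so that
    text pasted from a mail works too.
    """
    for line in status_text.splitlines():
        match = re.match(r"^[>\s]*Node addresses:\s*(.*)$", line)
        if match:
            return match.group(1).split()
    return []


def is_multihomed(status_text):
    """True if more than one node address is listed."""
    return len(node_addresses(status_text)) > 1


def dlm_uses_sctp(log_text):
    """True if the kernel log text contains the DLM SCTP message."""
    return any(
        "dlm: Using SCTP for communications" in line
        for line in log_text.splitlines()
    )
```

Feeding it the cman_tool status output from s3 above would report two addresses (192.168.168.3 and 10.31.0.42), which matches the multihomed setup Florian describes.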