[DRBD-user] Online Verify and Kernel Panic

Fabrice Charlier fabrice.charlier at uclouvain.be
Mon Oct 11 15:16:19 CEST 2010


Hi all,

We are running a web cluster based on a dual-primary DRBD configuration 
and OCFS2. Every weekend we run an online verify on the DRBD volume by 
executing "/sbin/drbdadm verify all" on one node. Last weekend, one node 
(not the one executing the verify command) crashed completely, and we 
found it this morning with a kernel panic message on the console.

Has anybody else observed this behavior?

OS:  Linux server1.ucl.ac.be 2.6.18-194.3.1.el5 #1 SMP Thu May 13 13:08:30 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

DRBD: # modinfo drbd
filename:       /lib/modules/2.6.18-194.3.1.el5/weak-updates/drbd83/drbd.ko
alias:          block-major-147-*
license:        GPL
version:        8.3.2
description:    drbd - Distributed Replicated Block Device v8.3.2
author:         Philipp Reisner <phil at linbit.com>, Lars Ellenberg <lars at linbit.com>
srcversion:     EB9EAE1FF5D024E96B05208
depends:
vermagic:       2.6.18-128.7.1.el5 SMP mod_unload gcc-4.1
parm:           minor_count:Maximum number of drbd devices (1-255) (uint)
parm:           disable_sendpage:bool
parm:           allow_oos:DONT USE! (bool)
parm:           cn_idx:uint
parm:           proc_details:int
parm:           enable_faults:int
parm:           fault_rate:int
parm:           fault_count:int
parm:           fault_devs:int
parm:           usermode_helper:string


Log on server1:
Oct 10 00:42:01 server1 kernel: block drbd0: conn( Connected -> VerifyS )
Oct 10 00:42:01 server1 kernel: block drbd0: Starting Online Verify from sector 0
Oct 10 00:42:11 server1 kernel: block drbd0: PingAck did not arrive in time.
Oct 10 00:42:11 server1 kernel: block drbd0: peer( Primary -> Unknown ) conn( VerifyS -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Oct 10 00:42:11 server1 kernel: block drbd0: Online Verify reached sector 0
Oct 10 00:42:11 server1 kernel: block drbd0: asender terminated
Oct 10 00:42:11 server1 kernel: block drbd0: Terminating asender thread
Oct 10 00:42:11 server1 kernel: block drbd0: short read expecting header on sock: r=-512
Oct 10 00:42:11 server1 kernel: block drbd0: Creating new current UUID
Oct 10 00:42:11 server1 kernel: block drbd0: Connection closed
Oct 10 00:42:11 server1 kernel: block drbd0: conn( NetworkFailure -> Unconnected )
Oct 10 00:42:11 server1 kernel: block drbd0: receiver terminated
Oct 10 00:42:11 server1 kernel: block drbd0: Restarting receiver thread
Oct 10 00:42:11 server1 kernel: block drbd0: receiver (re)started
Oct 10 00:42:11 server1 kernel: block drbd0: conn( Unconnected -> WFConnection )

Log on server2:
Oct 10 00:42:01 server2 kernel: block drbd0: conn( Connected -> VerifyT )
Oct 10 00:42:01 server2 kernel: block drbd0: Online Verify start sector: 0
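
For anyone comparing the logs from both nodes, the DRBD state transitions 
can be pulled out of kernel log lines like the ones above with a small 
script. This is only a minimal Python sketch, not part of DRBD: it assumes 
each message sits on a single line (wrapped lines rejoined), it only knows 
the conn, peer and pdsk fields that appear in these logs, and the names 
TRANSITION and transitions are made up for the example.

```python
import re

# Matches DRBD state changes such as "conn( Connected -> VerifyS )".
# Only the fields seen in the logs above (conn, peer, pdsk) are handled;
# this pattern is an assumption based on those lines, not a DRBD spec.
TRANSITION = re.compile(r"\b(conn|peer|pdsk)\( (\w+) -> (\w+) \)")


def transitions(line):
    """Return (field, old_state, new_state) tuples found in one log line."""
    return TRANSITION.findall(line)


# Two lines copied from the server1 log, with the mail wrapping removed.
sample = [
    "Oct 10 00:42:01 server1 kernel: block drbd0: "
    "conn( Connected -> VerifyS )",
    "Oct 10 00:42:11 server1 kernel: block drbd0: "
    "peer( Primary -> Unknown ) conn( VerifyS -> NetworkFailure ) "
    "pdsk( UpToDate -> DUnknown )",
]

for line in sample:
    timestamp = line[:15]  # fixed-width syslog prefix, e.g. "Oct 10 00:42:01"
    for field, old, new in transitions(line):
        print(f"{timestamp} {field}: {old} -> {new}")
```

Lines without a state change (for example "asender terminated") simply 
yield an empty list, so the same function can be run over a whole log file.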



-- 
--------------------------------------------------------------------
Fabrice Charlier - UCL/SGSI/SIPR

Office : +32.10.47.32.34
GSM    : +32.474.86.81.23
-------------------------------------------------------------------




