[Drbd-dev] oopses in 2.6.19.1
Ard van Breemen
ard at kwaak.net
Sun Jan 28 12:38:55 CET 2007
Hi,
On Sun, Jan 28, 2007 at 11:59:38AM +0100, Ard van Breemen wrote:
>From that point on that message is repeated over and over...
What makes it even more interesting:
siep:~# drbdadm down all
Child process does not terminate!
Exiting.
siep:~# No response from the DRBD driver! Is the module loaded?
No response from the DRBD driver! Is the module loaded?
State change failed: (0)unknown error.
cat /proc/drbd
version: 8.0.0 (api:86/proto:86)
SVN Revision: 2713 build by ard at siddev, 2007-01-26 15:41:35
0: cs:SyncTarget st:Secondary/Primary ds:Inconsistent/UpToDate C r---
ns:0 nr:1297904392 dw:1297901288 dr:0 al:0 bm:79217 lo:97 pe:3444 ua:97 ap:0
[==>.................] sync'ed: 12.0% (996766/1132123)M
finish: 3:34:10 speed: 79,384 (71,408) K/sec
resync: used:11/31 hits:81043148 misses:79225 starving:0 dirty:0 changed:79225
act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
(siep in this case is the slave)
So: on drbdadm down all, the system starts to sync again...
We've got a dmesg for that:
Jan 28 06:47:02 localhost kernel: drbd0: Retrying drbd_rs_del_all() later. refcnt=1
Jan 28 06:47:33 localhost last message repeated 295 times
<snip>
Jan 28 11:59:54 localhost last message repeated 501 times
Jan 28 11:59:54 localhost kernel: drbd0: peer( Primary -> Unknown ) conn( SyncTarget -> Disconnecting ) pdsk(
Jan 28 11:59:54 localhost kernel: drbd0: short read expecting header on sock: r=-512
Jan 28 11:59:54 localhost kernel: drbd0: asender terminated
Jan 28 11:59:54 localhost kernel: drbd0: Retrying drbd_rs_del_all() later. refcnt=1
Jan 28 11:59:54 localhost kernel: drbd0: Resync done (total 60002 sec; paused 0 sec; 0 K/sec)
Jan 28 11:59:54 localhost kernel: drbd0: conn( Disconnecting -> Connected ) disk( Inconsistent -> UpToDate ) p
Jan 28 11:59:54 localhost kernel: drbd0: Writing meta data super block now.
Jan 28 11:59:54 localhost kernel: drbd0: tl_clear()
Jan 28 11:59:54 localhost kernel: drbd0: Connection closed
Jan 28 11:59:54 localhost kernel: drbd0: conn( Connected -> Unconnected ) pdsk( UpToDate -> DUnknown )
Jan 28 11:59:54 localhost kernel: drbd0: drbdd_init: (mdev->receiver.t_state != Restarting) in /usr/src/shared
Jan 28 11:59:54 localhost kernel: drbd0: receiver terminated
Jan 28 11:59:54 localhost kernel: drbd0: receiver (re)started
Jan 28 11:59:54 localhost kernel: drbd0: conn( Unconnected -> WFConnection )
Jan 28 11:59:54 localhost kernel: drbd0: conn( WFConnection -> WFReportParams )
Jan 28 11:59:54 localhost kernel: drbd0: Handshake successful: DRBD Network Protocol version 86
Jan 28 11:59:54 localhost kernel: drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
Jan 28 11:59:54 localhost kernel: drbd0: Writing meta data super block now.
Jan 28 11:59:55 localhost kernel: drbd0: writing of bitmap took 93 jiffies
Jan 28 11:59:55 localhost kernel: drbd0: 1105 GB marked out-of-sync by on disk bit-map.
Jan 28 11:59:55 localhost kernel: drbd0: 1159294952 KB now marked out-of-sync by on disk bit-map.
Jan 28 11:59:55 localhost kernel: drbd0: Writing meta data super block now.
Jan 28 11:59:55 localhost kernel: drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk(
Jan 28 11:59:55 localhost kernel: drbd0: Writing meta data super block now.
Jan 28 11:59:56 localhost kernel: drbd0: conn( WFBitMapT -> WFSyncUUID )
Jan 28 11:59:56 localhost kernel: drbd0: conn( WFSyncUUID -> SyncTarget ) disk( UpToDate -> Inconsistent )
Jan 28 11:59:56 localhost kernel: drbd0: Began resync as SyncTarget (will sync 1159294952 KB [289823738 bits s
Jan 28 11:59:56 localhost kernel: drbd0: Writing meta data super block now.
Jan 28 12:07:46 localhost kernel: drbd0: drbd_rs_complete_io(,67857280 [=2070]) called, but refcnt is 0!?
Jan 28 12:12:24 localhost kernel: drbd0: drbd_rs_complete_io(,107445120 [=3278]) called, but refcnt is 0!?
So the patch of Lars makes the system not crash :-).
If anyone can hint where I have to infest the code with some printk's and
hopefully some explanation of what to look for, I might be able to track it
down. Unfortunately I only have about 10% resources left in my head to look at
it :-(.
More information about the drbd-dev
mailing list