[DRBD-user] drbd_panic() in drbd_receiver.c

Mon Jul 3 13:10:33 CEST 2006

I found that when the local data are inconsistent and the peer reports
I/O errors, so that synchronization cannot finish, DRBD calls
drbd_panic(), which results in a kernel panic.

This is the code from got_NegDReply():

        drbd_panic("Got NegDReply. WE ARE LOST. We lost our up-to-date disk.\n");

        // THINK do we have other options, but panic?
        //       what about bio_endio, in case we don't panic ??

As a test, I removed the panic() call and the setting of drbd_did_panic
from drbd_panic() to see what would happen. After I disconnected the
disk from the SyncSource peer, I/O errors occurred, the secondary peer
reported them (the message above), and everything kept running well. I
then restarted the primary node with the disk connected, and the
synchronization continued.

I'm doing these tests because I don't like the idea that the whole
machine panics because of I/O errors on one DRBD device; I'd like to
keep the box running with its other DRBD devices available. From the
comment in the code above I can see that just commenting out the panic()
call isn't a good solution :)

Can you explain why panic() is used in this case instead of just
behaving as if the connection had been lost?
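
To make the question more concrete, here is a rough user-space sketch of
the behaviour I have in mind. It is not DRBD code: toy_device,
toy_handle_neg_reply() and the state names are all made up for
illustration, and I am only assuming that failing the pending request
with an error would be an acceptable substitute for the panic. The point
is that only the affected device drops its connection and fails the
request, while other devices are left alone.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical connection states; NOT the real DRBD state names. */
enum conn_state { CONN_CONNECTED, CONN_STANDALONE };

/* Hypothetical per-device bookkeeping for this toy model. */
struct toy_device {
    const char *name;
    enum conn_state conn;
    bool local_disk_consistent; /* false while we are still a sync target */
    int failed_requests;        /* requests completed with an error */
};

/*
 * Toy model of receiving a negative data reply from the peer.
 * Instead of panicking, treat it like a lost connection on this
 * device only. Returns 0 if the local copy is still usable, or
 * -EIO if the pending request has to be failed.
 */
int toy_handle_neg_reply(struct toy_device *dev)
{
    assert(dev != NULL);

    /* Drop the connection for this device; other devices are untouched. */
    dev->conn = CONN_STANDALONE;

    if (dev->local_disk_consistent)
        return 0; /* local data is still good, nothing is lost */

    /* No usable copy on either side: fail the request, keep the box up. */
    dev->failed_requests++;
    fprintf(stderr, "%s: peer reported I/O error and local data is "
                    "inconsistent; failing request\n", dev->name);
    return -EIO;
}
```

With two such devices, calling toy_handle_neg_reply() on an inconsistent
one returns -EIO and leaves it standalone, while the second device stays
connected. Whether the real bio_endio path can safely report the error
this way is exactly what I am unsure about.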

-- 
Damian Pietras