Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
I found that when the data are inconsistent and the peer reports I/O errors, so that synchronization cannot finish, DRBD calls drbd_panic(), which means a kernel panic. This is the code from got_NegDReply():

    drbd_panic("Got NegDReply. WE ARE LOST. We lost our up-to-date disk.\n");
    // THINK do we have other options, but panic?
    // what about bio_endio, in case we don't panic ??

As a test I removed the panic() call and the setting of drbd_did_panic from drbd_panic() to see what would happen. After disconnecting the disk from the SyncSource peer, I/O errors occurred, the secondary peer reported them (the message above), and everything kept running well. I've restarted the primary node with the disk connected, and the synchronization continues.

I'm doing these tests because I don't like the idea that the whole machine panics because of I/O errors on one DRBD device; I'd like to keep the box running with the other DRBD devices available. From the comment in the above code I can see that just commenting out the panic() code isn't a good solution :)

Can you explain why panic() is used in this case instead of just behaving as if the connection had been lost?

--
Damian Pietras