[CASE-29] After re-connect, WFBitMapS-WFBitMapT status has sustained continuously and copy command hangs

Jaeheon Kim jhkim at mantech.co.kr
Thu Mar 3 14:18:06 CET 2016

Dear Phil,

Thank you for your new patch.


1. I think this patch resolves both CASE-14 and CASE-20.

    1) CASE-14:

        [DRBD-user] [CASE-14] primary node hang by VM-net-disconnect during big file copy


         - See my comments "10) worker thread deadlock" and "11) receiver thread deadlock".
         - This deadlock has disappeared.
         - It has not been reproduced so far.

    2) CASE-20:

         [DRBD-user] [CASE-20] What is the tail_recursion operation during


          - The panic caused by "tail_recursion" has also disappeared.

2. However, another problem occurred.

   1) Test-step

      - The test scenario is similar to CASE-14.
      - During a big file copy, if you disconnect and then reconnect the VM
        network, both nodes sometimes remain in the WFBitMapS-WFBitMapT state
        indefinitely, and the file copy command on the primary hangs.

   2) Log (/var/log/messages) analysis:

       (1) primary log
           - Click  http://pastebin.com/3hAWNkHn

       (2) secondary log
           - Click http://pastebin.com/KK1MUbDT

3. Questions

   1) Why did the resync not start after WFBitMapS-WFBitMapT?

   2) What is this ASSERT?
         -  "Mar  3 14:38:28 drbd9-02 kernel: drbd r0/0 drbd1: ASSERT FAILED cstate = SyncTarget, expected: WFSyncUUID|WFBitMapT|Behind"
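
For anyone scanning the full logs for similar messages, here is a minimal Python sketch that pulls the actual and expected states out of such an ASSERT line. The regular expression is modelled only on the single line quoted above, so the pattern and the helper name parse_assert are my own assumptions; other DRBD versions may format this message differently.

```python
import re

# Pattern modelled on the one ASSERT line quoted in this mail (assumption:
# other kernel/DRBD versions may word or punctuate the message differently).
ASSERT_RE = re.compile(
    r"ASSERT\s+FAILED\s+cstate\s*=\s*(?P<actual>\w+),\s*"
    r"expected:\s*(?P<expected>[\w|]+)"
)


def parse_assert(line):
    """Return (actual_state, [expected_states]), or None if the line does not match."""
    m = ASSERT_RE.search(line)
    if m is None:
        return None
    return m.group("actual"), m.group("expected").split("|")


# The log line from question 2), rejoined into a single string.
line = ("Mar  3 14:38:28 drbd9-02 kernel: drbd r0/0 drbd1: ASSERT "
        "FAILED cstate = SyncTarget, expected: WFSyncUUID|WFBitMapT|Behind")

print(parse_assert(line))
# prints ('SyncTarget', ['WFSyncUUID', 'WFBitMapT', 'Behind'])
```

Running parse_assert over every line of /var/log/messages would show how often the state was already SyncTarget when one of the listed states was expected.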

Please check my comments in the attached log files.

