[DRBD-user] DRBD Bug report - I/O hangs on primary node

Miguel Castellanos miguelcastellanos at hotmail.es
Mon May 2 16:39:12 CEST 2016

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.



          
          DRBD Bug report - I/O hangs on primary node
          
          
Hello everyone,

          I use DRBD as a backup solution for many servers, one big storage server backs up several smaller servers.
          

          SMALL SERVERS
          

          My Small servers are KVM Hypervisors. I work with drbd over local storage.
          HW Configuration: Dual AMD Opteron 6272, 128 GB RAM, Dual 1 Gbps Intel 82576 NICs.
          SW Configuration: Centos 7.0 Kernel 3.10.0-123.el7.x86_64, DRBD 8.4.7-2 GIT-hash: 15c1a783b673d9c1a2a9f9de38b3d7e98c275709
          The DRBD volume on the small servers works over a 4TB  LSI Hardware RAID with DRBD external metadata over a SSD partition.
          I use the DRBD block device as Physical Volume for LVM, where I create Logical Volumes to represent RAW Virtual Hard disks for KVM Virtual Machines.
          

          THE BIG BACKUP SERVER
          

          The Big server is also Dual AMD Opteron 6272, 128 GB RAM, Dual 1 Gbps Intel 82576 NICs. 
          SW Configuration: Centos 7.0 Kernel 3.10.0-123.el7.x86_64, DRBD 8.4.7-2 GIT-hash: 15c1a783b673d9c1a2a9f9de38b3d7e98c275709
          I called it "BIG" because it has 60TB of storage over an Adaptec Hardware RAID.
          Its role is only to use DRBD to replicate the data from the small servers.
          I created 15 GPT 4TB partitions over that harware RAID, each one presents the backup volume for one of the small servers.
          I use also a SSD for external metadata, I have a LVM VG group called "metadata" where I created one Logical Volume for the metadata of each DRBD resource.
          

          THE BACKUP PROCESS
          

          All DRBD volumes works with protocol A, the small servers always hold the primary role and the corresponding drbd resource works disconnected most of the time.
          In the backup server I continuously execute a backup script, It iterates over all small servers in order to backup their data, one server at a time, the backup script connects to each small server via SSH, and executes the following commands as root in each small server:
          
          drbdadm connect <resource>
          drbdadm resume-sync <resource>
          <wait until the resource reaches the state UpToDate/UpToDate>
          drbdadm pause-sync <resource>
          drbdadm disconnect <resource>
           
          Then it continues to the next small server. This process runs all the time without stops.
          

          THE PROBLEM
          
          Everything works fine most of the time, but at least once a month i/o stops on at least one small server. (All small servers have experimented the problem). 
          The DRBD resource "hangs" on the primary node and it is not possible to write any data to it. Of course all Virtual Machines running on the Hypervisor stop working and the server's Load Average grows without limit. Operating system boot does not work because it hangs when trying to umount filesystems that were mounted from the DRBD resource. 
          A Hardware Boot is always required to recover control of the system.
          
          A normal backup session (DRBD resync) on the small servers generates the following set of messages in the syslog:
          This is actually the last successful sync session before the failure:
          May  1 10:48:39 node-80-13 kernel: drbd node-80-13: conn( StandAlone -> Unconnected ) 
May  1 10:48:39 node-80-13 kernel: drbd node-80-13: Starting receiver thread (from drbd_w_node-80- [3406])
May  1 10:48:39 node-80-13 kernel: drbd node-80-13: receiver (re)started
May  1 10:48:39 node-80-13 kernel: drbd node-80-13: conn( Unconnected -> WFConnection ) 
May  1 10:48:39 node-80-13 kernel: drbd node-80-13: Handshake successful: Agreed network protocol version 101
May  1 10:48:39 node-80-13 kernel: drbd node-80-13: Feature flags enabled on protocol level: 0x7 TRIM THIN_RESYNC 
WRITE_SAME.
May  1 10:48:39 node-80-13 kernel: drbd node-80-13: conn( WFConnection -> WFReportParams ) 
May  1 10:48:39 node-80-13 kernel: drbd node-80-13: Starting ack_recv thread (from drbd_r_node-80- [2607])
May  1 10:48:39 node-80-13 kernel: block drbd13: drbd_sync_handshake:
May  1 10:48:39 node-80-13 kernel: block drbd13: self C38A5875B169C401:0690DCE7CD1B204B:F1B7EF7B79DF16F3:F1B6EF7B7
9DF16F3 bits:45856 flags:0
May  1 10:48:39 node-80-13 kernel: block drbd13: peer 0690DCE7CD1B204A:0000000000000000:F1B7EF7B79DF16F2:F1B5EF7B7
9DF16F3 bits:0 flags:0
May  1 10:48:39 node-80-13 kernel: block drbd13: uuid_compare()=1 by rule 70
May  1 10:48:39 node-80-13 kernel: block drbd13: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) 
pdsk( DUnknown -> Consistent ) 
May  1 10:48:39 node-80-13 kernel: block drbd13: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 8123(2), tota
l 8123; compression: 100.0%
May  1 10:48:39 node-80-13 kernel: block drbd13: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 8123(2), t
otal 8123; compression: 100.0%
May  1 10:48:39 node-80-13 kernel: block drbd13: helper command: /sbin/drbdadm before-resync-source minor-13
May  1 10:48:39 node-80-13 kernel: block drbd13: helper command: /sbin/drbdadm before-resync-source minor-13 exit 
code 0 (0x0)
May  1 10:48:39 node-80-13 kernel: block drbd13: conn( WFBitMapS -> PausedSyncS ) pdsk( Consistent -> Inconsistent
 ) 
May  1 10:48:39 node-80-13 kernel: block drbd13: Began resync as PausedSyncS (will sync 183436 KB [45859 bits set]
).
May  1 10:48:39 node-80-13 kernel: block drbd13: updated sync UUID C38A5875B169C401:0691DCE7CD1B204B:0690DCE7CD1B2
04B:F1B7EF7B79DF16F3
May  1 10:48:42 node-80-13 kernel: block drbd13: conn( PausedSyncS -> SyncSource ) user_isp( 1 -> 0 ) 
May  1 10:48:42 node-80-13 kernel: block drbd13: Syncer continues.
May  1 10:48:47 node-80-13 kernel: block drbd13: Resync done (total 7 sec; paused 2 sec; 36684 K/sec)
May  1 10:48:47 node-80-13 kernel: block drbd13: updated UUIDs C38A5875B169C401:0000000000000000:0691DCE7CD1B204B:
0690DCE7CD1B204B
May  1 10:48:47 node-80-13 kernel: block drbd13: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate ) 
May  1 10:48:52 node-80-13 kernel: block drbd13: user_isp( 0 -> 1 ) 
May  1 10:48:54 node-80-13 kernel: drbd node-80-13: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting 
) pdsk( UpToDate -> DUnknown ) 
May  1 10:48:54 node-80-13 kernel: drbd node-80-13: ack_receiver terminated
May  1 10:48:54 node-80-13 kernel: drbd node-80-13: Terminating drbd_a_node-80-
May  1 10:48:54 node-80-13 kernel: block drbd13: new current UUID 598C5E0CEB5F8A4D:C38A5875B169C401:0691DCE7CD1B20
4B:0690DCE7CD1B204B
May  1 10:48:54 node-80-13 kernel: drbd node-80-13: Connection closed
May  1 10:48:54 node-80-13 kernel: drbd node-80-13: conn( Disconnecting -> StandAlone ) 
May  1 10:48:54 node-80-13 kernel: drbd node-80-13: receiver terminated
May  1 10:48:54 node-80-13 kernel: drbd node-80-13: Terminating drbd_r_node-80-
          However the next backup session did not finish successfully:
          May  1 10:52:32 node-80-13 kernel: drbd node-80-13: conn( StandAlone -> Unconnected ) 
May  1 10:52:32 node-80-13 kernel: drbd node-80-13: Starting receiver thread (from drbd_w_node-80- [3406])
May  1 10:52:32 node-80-13 kernel: drbd node-80-13: receiver (re)started
May  1 10:52:32 node-80-13 kernel: drbd node-80-13: conn( Unconnected -> WFConnection ) 
May  1 10:52:33 node-80-13 kernel: drbd node-80-13: Handshake successful: Agreed network protocol version 101
May  1 10:52:33 node-80-13 kernel: drbd node-80-13: Feature flags enabled on protocol level: 0x7 TRIM THIN_RESYNC 
WRITE_SAME.
May  1 10:52:33 node-80-13 kernel: drbd node-80-13: conn( WFConnection -> WFReportParams ) 
May  1 10:52:33 node-80-13 kernel: drbd node-80-13: Starting ack_recv thread (from drbd_r_node-80- [5857])

        
        In this moment i/o hangs and the DRBD resource remains
          forever in the WFReportParams state.
        

        
        From the BIG Server's perspective this is a successful
          session:
        

        
        
          May  1 10:00:02 storage-80-101 kernel:
              drbd node-80-13: Handshake successful: Agreed network
              protocol version 101
          May  1 10:00:02 storage-80-101 kernel:
              drbd node-80-13: Feature flags enabled on protocol level:
              0x7 TRIM THIN_RES
          YNC WRITE_SAME.
          May  1 10:00:02 storage-80-101 kernel:
              drbd node-80-13: conn( WFConnection -> WFReportParams
              ) 
          May  1 10:00:02 storage-80-101 kernel:
              drbd node-80-13: Starting ack_recv thread (from
              drbd_r_node-80- [9039])
          May  1 10:00:02 storage-80-101 kernel:
              block drbd13: drbd_sync_handshake:
          May  1 10:00:02 storage-80-101 kernel:
              block drbd13: self
              86CB6F103DB64E5A:0000000000000000:86A850CE30BAFCC2:46121
          5DE40F8C736 bits:0 flags:0
          May  1 10:00:02 storage-80-101 kernel:
              block drbd13: peer
              BAA6F74140F9B5A1:86CB6F103DB64E5B:86A850CE30BAFCC2:46121
          5DE40F8C737 bits:51667 flags:0
          May  1 10:00:02 storage-80-101 kernel:
              block drbd13: uuid_compare()=-1 by rule 50
          May  1 10:00:02 storage-80-101 kernel:
              block drbd13: peer( Unknown -> Primary ) conn(
              WFReportParams -> WFBitMapT 
          ) disk( UpToDate -> Outdated ) pdsk(
              DUnknown -> UpToDate ) peer_isp( 0 -> 1 ) 
        
        
          May  1 10:00:02 storage-80-101 kernel:
              block drbd13: receive bitmap stats [Bytes(packets)]: plain
              0(0), RLE 8985(3
          ), total 8985; compression: 100.0%
          May  1 10:00:02 storage-80-101 kernel:
              block drbd13: send bitmap stats [Bytes(packets)]: plain
              0(0), RLE 8985(3), 
          total 8985; compression: 100.0%
          May  1 10:00:02 storage-80-101 kernel:
              block drbd13: conn( WFBitMapT -> WFSyncUUID ) 
          May  1 10:00:02 storage-80-101 kernel:
              block drbd13: updated sync uuid
              86CC6F103DB64E5A:0000000000000000:86A850CE3
          0BAFCC2:461215DE40F8C736
          May  1 10:00:02 storage-80-101 kernel:
              block drbd13: helper command: /sbin/drbdadm
              before-resync-target minor-13
          May  1 10:00:02 storage-80-101 kernel:
              block drbd13: helper command: /sbin/drbdadm
              before-resync-target minor-13 e
          xit code 0 (0x0)
          May  1 10:00:02 storage-80-101 kernel:
              block drbd13: conn( WFSyncUUID -> PausedSyncT ) disk(
              Outdated -> Inconsist
          ent ) 
          May  1 10:00:02 storage-80-101 kernel:
              block drbd13: Began resync as PausedSyncT (will sync
              209244 KB [52311 bits 
          set]).
          May  1 10:00:04 storage-80-101 kernel:
              block drbd13: conn( PausedSyncT -> SyncTarget )
              peer_isp( 1 -> 0 ) 
          May  1 10:00:04 storage-80-101 kernel:
              block drbd13: Syncer continues.
          May  1 10:00:31 storage-80-101 kernel:
              block drbd13: Resync done (total 29 sec; paused 2 sec;
              7748 K/sec)
          May  1 10:00:31 storage-80-101 kernel:
              block drbd13: updated UUIDs
              BAA6F74140F9B5A0:0000000000000000:86CC6F103DB64E5A:86CB6F103DB64E5B
          May  1 10:00:31 storage-80-101 kernel:
              block drbd13: conn( SyncTarget -> Connected ) disk(
              Inconsistent -> UpToDate ) 
          May  1 10:00:31 storage-80-101 kernel:
              block drbd13: helper command: /sbin/drbdadm
              after-resync-target minor-13
          May  1 10:00:31 storage-80-101 kernel:
              block drbd13: helper command: /sbin/drbdadm
              after-resync-target minor-13 exit code 0 (0x0)
          May  1 10:00:34 storage-80-101 kernel:
              block drbd13: conn( Connected -> Behind ) disk(
              UpToDate -> Outdated ) 
          May  1 10:00:35 storage-80-101 kernel:
              block drbd13: peer_isp( 0 -> 1 ) 
          May  1 10:00:35 storage-80-101 kernel:
              block drbd13: updated sync uuid
              85C59D47AB684CF0:0000000000000000:86CC6F103DB64E5A:86CB6F103DB64E5B
          May  1 10:00:35 storage-80-101 kernel:
              block drbd13: helper command: /sbin/drbdadm
              before-resync-target minor-13
          May  1 10:00:35 storage-80-101 kernel:
              block drbd13: helper command: /sbin/drbdadm
              before-resync-target minor-13 exit code 0 (0x0)
          May  1 10:00:35 storage-80-101 kernel:
              block drbd13: conn( Behind -> PausedSyncT ) disk(
              Outdated -> Inconsistent ) 
          May  1 10:00:35 storage-80-101 kernel:
              block drbd13: Began resync as PausedSyncT (will sync 7860
              KB [1965 bits set]).
          May  1 10:00:36 storage-80-101 kernel:
              block drbd13: conn( PausedSyncT -> Behind ) 
          May  1 10:00:37 storage-80-101 kernel:
              block drbd13: peer_isp( 1 -> 0 ) 
          May  1 10:00:37 storage-80-101 kernel:
              drbd node-80-13: peer( Primary -> Unknown ) conn(
              Behind -> TearDown ) pdsk( UpToDate -> DUnknown ) 
          May  1 10:00:37 storage-80-101 kernel:
              drbd node-80-13: ack_receiver terminated
          May  1 10:00:37 storage-80-101 kernel:
              drbd node-80-13: Terminating drbd_a_node-80-
          May  1 10:00:37 storage-80-101 kernel:
              drbd node-80-13: Connection closed
          May  1 10:00:37 storage-80-101 kernel:
              drbd node-80-13: conn( TearDown -> Unconnected ) 
          May  1 10:00:37 storage-80-101 kernel:
              drbd node-80-13: receiver terminated
          May  1 10:00:37 storage-80-101 kernel:
              drbd node-80-13: Restarting receiver thread
          May  1 10:00:37 storage-80-101 kernel:
              drbd node-80-13: receiver (re)started
          May  1 10:00:37 storage-80-101 kernel:
              drbd node-80-13: conn( Unconnected -> WFConnection ) 
        
        

        
        And this is the failure from the BIG server's perspective:
        

        
        
          May  1 10:52:33 storage-80-101 kernel:
              drbd node-80-13: Handshake successful: Agreed network
              protocol version 101
          May  1 10:52:33 storage-80-101 kernel:
              drbd node-80-13: Feature flags enabled on protocol level:
              0x7 TRIM THIN_RESYNC WRITE_SAME.
          May  1 10:52:33 storage-80-101 kernel:
              drbd node-80-13: conn( WFConnection -> WFReportParams
              ) 
          May  1 10:52:33 storage-80-101 kernel:
              drbd node-80-13: Starting ack_recv thread (from
              drbd_r_node-80- [9039])
          May  1 10:52:33 storage-80-101 kernel:
              block drbd13: drbd_sync_handshake:
          May  1 10:52:33 storage-80-101 kernel:
              block drbd13: self
              C38A5875B169C400:0000000000000000:0691DCE7CD1B204A:0690DCE7CD1B204B
              bits:0 flags:0
          May  1 10:52:33 storage-80-101 kernel:
              block drbd13: peer
              598C5E0CEB5F8A4D:C38A5875B169C401:0691DCE7CD1B204B:0690DCE7CD1B204B
              bits:44218 flags:0
          May  1 10:52:33 storage-80-101 kernel:
              block drbd13: uuid_compare()=-1 by rule 50
          May  1 10:52:33 storage-80-101 kernel:
              block drbd13: peer( Unknown -> Primary ) conn(
              WFReportParams -> WFBitMapT ) disk( UpToDate ->
              Outdated ) pdsk( DUnknown -> UpToDate ) peer_isp( 0
              -> 1 ) 
        
        

        
        CONFIGURATION DETAILS
        

          
        global_common.conf (for both small and big servers)
        

        
        
          global
          {
                  usage-count yes;
          }
          common {
                  disk
                  {
                          resync-rate 200M;
                          no-disk-barrier;
                          al-extents 3389;
                          disk-barrier no;
                  }
                  syncer
                  {
                          rate 200M;
                  }
                  net
                  {
                          protocol A;
                          sndbuf-size 4000k;
                          max-buffers 8000;
                          max-epoch-size 8000;
                          on-congestion pull-ahead;
                          congestion-fill 3600k;
                          congestion-extents 3000;
                  }
          }
        
        

        
        An example resource file:
        

        
        
          resource node-80-13
          {
          on small_server
          {
           device /dev/drbd13;
           disk /dev/sda1;
           address 10.80.1.13:8013;
           meta-disk /dev/sdb2;
          }
          on big_server
          {
           device /dev/drbd13;
           disk /dev/sda13;
           address 10.80.1.101:8013;
           meta-disk /dev/meta/node-80-13;
          }
          }
        
        

        
        LAST COMMENTS
        

        
        We have experimented this problem for about 3 years, we
          have installed all DRBD releases from the 8.4.X branch  during
          that time, but the problem remains.
        

        
        We have reduced the frequency of the problem by only
          running the backup process when Load Average, CPU usage and
          i/o are in low levels in the small servers. So our backup
          script checks all those parameters via SNMP before executing
          the sync process.
        

        
        If you need any additional information, please let me know.
        

        
        Thanks a lot for your time and help.
        

        
        Best regards,
        

        
        Miguel Castellanos
         
    
    

    

     		 	   		  
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20160502/c63485cf/attachment.htm>


More information about the drbd-user mailing list