[DRBD-user] Uncatchable DRBD out-of-sync issue

Stanislav German-Evtushenko ginermail at gmail.com
Mon Apr 8 15:47:36 CEST 2013



Hello all,

I have new information.

*******************************************************************************************************************
1. I found this in the logs on host1:
...
Apr  6 19:14:36 host1 Server Administrator: Storage Service EventID:
2405  Command timeout on physical disk:  Physical Disk 0:0:2
Controller 0, Connector 0
Apr  6 19:14:37 host1 Server Administrator: Storage Service EventID:
2405  Command timeout on physical disk:  Physical Disk 0:0:2
Controller 0, Connector 0
Apr  6 19:14:37 host1 Server Administrator: Storage Service EventID:
2405  Command timeout on physical disk:  Physical Disk 0:0:2
Controller 0, Connector 0
Apr  6 19:14:37 host1 Server Administrator: Storage Service EventID:
2095  Unexpected sense. SCSI sense data: Sense key:  6 Sense code: 29
Sense qualifier:  0:  Physical Disk 0:0:2 Controller 0, Connector 0
Apr  6 19:14:37 host1 Server Administrator: Storage Service EventID:
2095  Unexpected sense. SCSI sense data: Sense key:  6 Sense code: 29
Sense qualifier:  0:  Physical Disk 0:0:2 Controller 0, Connector 0
...

2. Then I checked the physical drives on both hosts with
MegaCli64 -PDList -aALL|egrep -i 'error'
and it reported some errors on host1:

host1

    Media Error Count: 0
    Other Error Count: 4
    Media Error Count: 0
    Other Error Count: 6
    Media Error Count: 0
    Other Error Count: 10
    Media Error Count: 0
    Other Error Count: 4
    Media Error Count: 0
    Other Error Count: 0

host2

    Media Error Count: 0
    Other Error Count: 0
    Media Error Count: 0
    Other Error Count: 0
    Media Error Count: 0
    Other Error Count: 0
    Media Error Count: 0
    Other Error Count: 0
    Media Error Count: 0
    Other Error Count: 0

3. I checked the drive firmware versions with
MegaCli64 -PDList -aALL|egrep -i 'firmware level|inquiry data':

host1

    Device Firmware Level: 0001
    Inquiry Data:       S2B7J90Z909775SAMSUNG HE103SJ
       1AJ30001
    Device Firmware Level: 0001
    Inquiry Data:       S2B7J90Z908036SAMSUNG HE103SJ
       1AJ30001
    Device Firmware Level: 0001
    Inquiry Data:       S2B7J90Z908039SAMSUNG HE103SJ
       1AJ30001
    Device Firmware Level: 0001
    Inquiry Data:       S2B7J90Z910558SAMSUNG HE103SJ
       1AJ30001
    Device Firmware Level: 0001
    Inquiry Data:       S2B7J90Z909773SAMSUNG HE103SJ
       1AJ30001Hotspare Information:

host2

    Device Firmware Level: 1V02
    Inquiry Data:      WD-WCAW32135912WDC WD1003FBYX-18Y7B0
       01.01V02
    Device Firmware Level: 1V02
    Inquiry Data:      WD-WCAW32133592WDC WD1003FBYX-18Y7B0
       01.01V02
    Device Firmware Level: 1V02
    Inquiry Data:      WD-WCAW32105584WDC WD1003FBYX-18Y7B0
       01.01V02
    Device Firmware Level: 1V02
    Inquiry Data:      WD-WCAW32121292WDC WD1003FBYX-18Y7B0
       01.01V02
    Device Firmware Level: 1V02
    Inquiry Data:      WD-WCAW32128662WDC WD1003FBYX-18Y7B0
       01.01V02Hotspare Information:
*******************************************************************************************************************
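
The error counters from step 2 can be compared per drive with a short script. This is only a minimal sketch: it assumes the egrep output always lists one "Media Error Count" line followed by one "Other Error Count" line per drive, as in the host1 listing above. The helper names parse_error_counts and drives_with_errors are made up for illustration, and the index is just the position in the output, not the enclosure/slot number.

```python
import re

# Sample text copied from the host1 listing in step 2 of this message.
HOST1_OUTPUT = """\
    Media Error Count: 0
    Other Error Count: 4
    Media Error Count: 0
    Other Error Count: 6
    Media Error Count: 0
    Other Error Count: 10
    Media Error Count: 0
    Other Error Count: 4
    Media Error Count: 0
    Other Error Count: 0
"""

# Matches the two counter lines that survive the egrep filter.
COUNT_RE = re.compile(r"^\s*(Media|Other) Error Count:\s*(\d+)\s*$")


def parse_error_counts(text):
    """Return one (media_errors, other_errors) tuple per drive.

    Assumption: every drive contributes exactly one Media line and one
    Other line, so a drive is complete once both counters were seen.
    """
    drives = []
    current = {}
    for line in text.splitlines():
        match = COUNT_RE.match(line)
        if match is None:
            continue  # ignore anything that is not a counter line
        current[match.group(1)] = int(match.group(2))
        if len(current) == 2:
            drives.append((current["Media"], current["Other"]))
            current = {}
    return drives


def drives_with_errors(text):
    """Return (position, counts) for every drive with a non-zero counter."""
    return [
        (position, counts)
        for position, counts in enumerate(parse_error_counts(text))
        if any(counts)
    ]


if __name__ == "__main__":
    for position, (media, other) in drives_with_errors(HOST1_OUTPUT):
        print(f"drive #{position}: media={media} other={other}")
```

In practice the text would come from the MegaCli64 command itself rather than from a pasted string.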

According to https://en.wikipedia.org/wiki/Key_Code_Qualifier,
"Sense key:  6 Sense code: 29 Sense qualifier:  0:" means "Unit
Attention - POR or device reset occurred" (POR = power-on reset).
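
To make that lookup repeatable, here is a small sketch that pulls the key/code/qualifier triple out of an EventID 2095 message and decodes it. It assumes the numbers in the log are hexadecimal (as in the usual ASC/ASCQ tables) and only includes the one ASC/ASCQ pair seen in this log. The function name decode_sense and both table names are hypothetical.

```python
import re

# SCSI sense key names (the 4-bit sense key field).
SENSE_KEYS = {
    0x0: "No Sense",
    0x1: "Recovered Error",
    0x2: "Not Ready",
    0x3: "Medium Error",
    0x4: "Hardware Error",
    0x5: "Illegal Request",
    0x6: "Unit Attention",
    0x7: "Data Protect",
    0xB: "Aborted Command",
}

# Deliberately tiny table: only the pair that appears in the host1 log.
ASC_ASCQ = {
    (0x29, 0x00): "Power on, reset, or bus device reset occurred",
}

# Assumption: the values printed by the storage service are hexadecimal.
SENSE_RE = re.compile(
    r"Sense key:\s*([0-9A-Fa-f]+)\s+"
    r"Sense code:\s*([0-9A-Fa-f]+)\s+"
    r"Sense qualifier:\s*([0-9A-Fa-f]+)"
)


def decode_sense(message):
    """Return (sense_key_name, asc_ascq_description), or None if no match."""
    match = SENSE_RE.search(message)
    if match is None:
        return None
    key, asc, ascq = (int(group, 16) for group in match.groups())
    return (
        SENSE_KEYS.get(key, "Unknown sense key"),
        ASC_ASCQ.get((asc, ascq), "Unknown ASC/ASCQ"),
    )


if __name__ == "__main__":
    line = (
        "2095  Unexpected sense. SCSI sense data: Sense key:  6 "
        "Sense code: 29 Sense qualifier:  0:  Physical Disk 0:0:2 "
        "Controller 0, Connector 0"
    )
    print(decode_sense(line))
```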

Then I found a thread on the Dell forum
(http://en.community.dell.com/support-forums/servers/f/906/t/19426714.aspx)
saying that a hard drive reset is okay under some circumstances (for
example, under heavy load). But I wonder whether that also holds for
DRBD: can such resets cause DRBD issues?

Best regards,
Stanislav


