[DRBD-user] data-integrity-alg in dual-Primary setup

Fri Sep 17 11:22:53 CEST 2010

Hi,

Three months ago we deployed a "web cluster" for LAMP hosting. We based
this solution on DRBD in active/active (dual-Primary) mode combined with
OCFS2. The solution meets our needs, but roughly once a month a problem
appears: we have enabled the "data-integrity-alg" option to (try to)
avoid silent data corruption, and on several occasions this feature has
detected that data was altered in transit between the two nodes. Because
both nodes are active and we use the automatic split-brain recovery
policies proposed in the official documentation, the two members of the
mirror end up disconnected, and we have to resync them manually to
resume normal operation. We have already disabled all TCP offloading
capabilities on all NICs, without success.
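
For illustration, here is a minimal sketch of the kind of setup described
above, assuming DRBD 8.3 configuration syntax. The resource name "r0", the
choice of sha1, and the specific after-sb-* policies are assumptions for
the example and not necessarily the exact configuration in use. The
per-host "on" sections are omitted.

```
resource r0 {
  net {
    # Checksum each replicated data block on the wire (sha1 is assumed here).
    data-integrity-alg sha1;

    # Dual-Primary (active/active) mode, needed for a cluster file system
    # such as OCFS2.
    allow-two-primaries;

    # Automatic split-brain recovery policies (illustrative values).
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
  }

  startup {
    # Promote both nodes to Primary at startup.
    become-primary-on both;
  }
}
```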

Is it possible to ask DRBD to retry sending the block until it succeeds
in this kind of situation? If not, are you planning to implement this
feature?

Regards,
-- 
--------------------------------------------------------------------
Fabrice Charlier - UCL/SGSI/SIPR

Office : +32.10.47.32.34
GSM    : +32.474.86.81.23
-------------------------------------------------------------------