[DRBD-user] iscsi target failover, write process goes infinite loop

anugunj anuj singh anujhere at gmail.com
Thu Jun 21 11:36:07 CEST 2007

I have a setup of four machines running RHEL4, on which I am setting up HA.
Two of the four machines are cluster nodes, and the other two I am
using as SAN iscsi-target machines, which will be mirrored using DRBD.
The file system I am using is GFS.
If one storage device fails, the other has to take over the read/write
load, and after the failed storage device comes back up it has to update
itself from the one that stayed up.

For the above:
I installed drbd-8.0.3 on my cluster nodes and configured the iSCSI initiator.
I exported a storage device from each of the two iscsi-target machines;
these are accessible on my cluster nodes.
I created the DRBD metadata and created a GFS file system on the device.
I made both cluster nodes primary/primary with "drbdadm primary all"
on both nodes.
I mounted /dev/drbd1 on both cluster nodes.
Everything works fine as long as both iscsi-target machines are up: I can
read and write on either cluster node's mounted GFS file system, which is
mirrored using DRBD.
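
In case the order matters, here is a rough dry-run sketch of the sequence I described. It only prints the commands and runs nothing. The resource name r0 and /dev/drbd1 are from my drbd.conf; the mount point /mnt/gfs, the cluster name "mycluster" and the file system name "gfs01" are placeholders I made up for this sketch, and the gfs_mkfs options are the usual ones rather than a copy of what I typed. I put gfs_mkfs after the promotion because the device has to be Primary before it can be written.

```shell
#!/bin/sh
# Dry-run sketch of the setup order: every step is only printed, never run.
# Assumed/hypothetical values: MNT, "mycluster", "gfs01".

RES=r0                  # resource name from drbd.conf
DEV=/dev/drbd1          # DRBD device from drbd.conf
MNT=/mnt/gfs            # hypothetical mount point

plan_setup() {
    echo "drbdadm create-md $RES"     # write DRBD metadata (both nodes)
    echo "drbdadm up $RES"            # attach the disk and connect (both nodes)
    echo "drbdadm primary all"        # promote (both nodes)
    echo "gfs_mkfs -p lock_dlm -t mycluster:gfs01 -j 2 $DEV"   # one node only
    echo "mount -t gfs $DEV $MNT"     # both nodes
}

plan_setup
```

The -j 2 asks for two journals, one per cluster node that mounts the file system.
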
node-A, node-B = cluster machines
target-A, target-B = iscsi-target machines, each exporting its hard disk to
cluster node-A and node-B

node-A accesses target-A; node-B accesses target-B.

DRBD runs on node-A and node-B.

If I stop the target-B machine to simulate a storage failure on node-B,
I can still read data on node-A, but if I add any file, the write
process goes into an infinite loop, and after that I cannot read either
as long as the target-B machine is not running.
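
To report what DRBD thinks of the connection and the disks when this happens, a small helper like the one below could pull the fields out of the status line in /proc/drbd. This is only a sketch: the sample line is illustrative and was not captured from my nodes, and it assumes the 8.0 layout with cs: (connection state), st: (roles) and ds: (disk states) fields.

```shell
#!/bin/sh
# Extract one field (cs, st or ds) from a /proc/drbd status line.
# $1 = status line, $2 = field name
drbd_field() {
    # Split the line into one token per line, then keep the token
    # that starts with "<field>:" and strip that prefix.
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n "s/^$2://p"
}

# Illustrative status line only (healthy state), not real output from my setup.
SAMPLE=' 1: cs:Connected st:Primary/Primary ds:UpToDate/UpToDate C r---'

echo "connection: $(drbd_field "$SAMPLE" cs)"
echo "roles:      $(drbd_field "$SAMPLE" st)"
echo "disks:      $(drbd_field "$SAMPLE" ds)"
```

On a live node the line for /dev/drbd1 could be fed in with something like drbd_field "$(grep '^ *1:' /proc/drbd)" ds, run once before and once after stopping target-B.
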

Can I use two mirrored iscsi-target (storage) machines so that, if one
iscsi-target node goes down, the other storage machine takes over reads
and writes, and when the downed machine boots up again it updates itself
from the machine that stayed up?

My drbd.conf file is:

global {
    usage-count yes;
}

common {
  syncer { rate 10M; }
}

resource r0 {
  protocol C;

  handlers {
    pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
  }

  startup {
    degr-wfc-timeout 120;    # 2 minutes.
  }

  disk {
    on-io-error   detach;
  }

  net {
    cram-hmac-alg "sha1";
    shared-secret "prolog";
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }

  syncer {
    rate 10M;
    al-extents 257;
  }

  on pr0021.prolo.com {
    device    /dev/drbd1;
    disk      /dev/sda;
    meta-disk  internal;
  }

  on pr0005.prolo.com {
    device    /dev/drbd1;
    disk      /dev/sda;
    meta-disk  internal;
  }
}

thanks and regards
anugunj "anuj singh"