Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
I'll respond to my own post, in case it helps someone else.

While troubleshooting this mess, I tried running a dd on the drbd device on the secondary node. It told me that the device was unavailable. I then tried to down the resources on the secondary node ("drbdadm down all"), and it gave a message about the primary device refusing the action. Sorry, I did not turn on logging during this to capture the exact errors. Frustrated, I just rebooted the secondary box. When I did this, I lost access to all of my iSCSI and NFS LUNs. A few minutes later the primary drbd server crashed. Heartbeat then failed all of the resources over to the secondary box, and the secondary started trying to sync back to the primary. This is when I panicked, thinking that it was overwriting good data with bad. I immediately stopped the sync and unplugged the sync cable, then brought everything down on both nodes to start doing some forensics.

Before the event, /proc/drbd on the primary said it was Primary and UpToDate, with no oos. On the secondary it said Secondary with an oos count of 40000000. I then used drbdadm to make each side invalidate the other's copy in turn, and just brought its device online to inspect it. I looked at the primary first, since this is where everything was mounting from. It turns out its data was at least 4 days old. I shut it back down and looked at the secondary: it had the latest copy of the data. So I invalidated the primary's data and allowed the secondary to sync back to the primary. Everything is back to normal now, and read speeds from the drbd device are 200+ MB/s, as I expected.

I am not sure if this is possible, but my guess is that the primary node lost access to its local copy of the data and was only updating the remote copy, so all read traffic had to go to the remote server to be serviced. This makes sense in my head, but I am not sure how the drbd code is set up, or whether that is even a possibility. It looks like drbd was acting correctly, and had I let it fail over, it would have done the right thing; I just was not willing to take that chance.

Hope this helps someone.
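For reference, here is roughly the command sequence I ended up using. I'm reconstructing it from memory (no logs, as mentioned above), so treat it as a sketch rather than a transcript; the resource name NAS comes from the config quoted below:

    # On both nodes: check who claims what.
    cat /proc/drbd          # role (Primary/Secondary), disk state, oos count
    drbdadm dstate NAS      # disk state of local/peer backing devices

    # On the node whose copy turned out to be stale (the old primary here):
    drbdadm invalidate NAS  # discard the local copy; forces a full resync from the peer

    # Reconnect, watch it sync back, then promote once it is UpToDate.
    drbdadm connect NAS
    cat /proc/drbd          # watch resync progress
    drbdadm primary NAS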
> Ok, this is my first post to this list, so please be easy on me. I
> have set up two openfiler nodes, each with a 4-drive software SATA
> RAID0 array that mirrors from one node to the other. There are only
> the 4 drives in each host, so we partitioned off a RAID10 slice for
> the boot, then we created a RAID0 slice for the DRBD data. We then
> use that volume to create iSCSI and NFS shares to serve up to other
> hosts.
>
> I have been trying to track down why my performance seems to be so
> bad. I then ran across the following test, and it leaves me
> scratching my head.
>
> On the primary server, I run a dd against the drbd1 device to just
> read it in:
>
> dd if=/dev/drbd1 of=/dev/null bs=1M &
>
> I then run iostat -k 2 to check the performance. I see long periods
> (2-10 seconds) of NO activity, then brief periods of 25-30 MB/s. I
> tried disabling the remote node, and this does not improve
> performance.
>
> If I run the same command against the underlying md2 RAID disk, I get
> a consistent 200-240 MB/s. I expected there to be a write penalty,
> but I am scratching my head over the read penalty. By the time we get
> the iSCSI out to the clients, I am getting maybe 30 MB/s, and
> averaging about 15 MB/s.
>
> Here is my drbd.conf:
>
> global {
>   # minor-count 64;
>   # dialog-refresh 5; # 5 seconds
>   # disable-ip-verification;
>   usage-count ask;
> }
>
> common {
>   syncer {
>     rate 100M;
>     al-extents 257;
>   }
>
>   net {
>     unplug-watermark 128;
>   }
> }
>
> resource meta {
>
>   protocol C;
>
>   handlers {
>     pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
>     pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
>     local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
>     outdate-peer "/usr/lib64/heartbeat/drbd-peer-outdater";
>   }
>
>   startup {
>     # wfc-timeout 0;
>     degr-wfc-timeout 120; # 2 minutes.
>   }
>
>   disk {
>     on-io-error detach;
>     fencing resource-only;
>   }
>
>   net {
>     after-sb-0pri disconnect;
>     after-sb-1pri disconnect;
>     after-sb-2pri disconnect;
>     rr-conflict disconnect;
>   }
>
>   syncer {
>     # rate 10M;
>     # after "r2";
>     al-extents 257;
>   }
>
>   device /dev/drbd0;
>   disk /dev/rootvg/meta;
>   meta-disk internal;
>
>   on stg1 {
>     address 1.2.5.80:7788;
>   }
>
>   on stg2 {
>     address 1.2.5.81:7788;
>   }
> }
>
> resource NAS {
>
>   protocol C;
>
>   handlers {
>     pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
>     pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
>     local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
>     outdate-peer "/usr/lib64/heartbeat/drbd-peer-outdater";
>   }
>
>   startup {
>     wfc-timeout 0; ## Infinite!
>     degr-wfc-timeout 120; ## 2 minutes.
>   }
>
>   disk {
>     on-io-error detach;
>     fencing resource-only;
>   }
>
>   net {
>     # timeout 60;
>     # connect-int 10;
>     # ping-int 10;
>     # max-buffers 2048;
>     # max-epoch-size 2048;
>   }
>
>   syncer {
>     after "meta";
>   }
>
>   device /dev/drbd1;
>   disk /dev/md2;
>   meta-disk internal;
>
>   on stg1 {
>     address 1.2.5.80:7789;
>   }
>
>   on stg2 {
>     address 1.2.5.81:7789;
>   }
> }
>
> Any direction would be appreciated.
>
> Gary
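P.S. One more check that follows from my theory above: with "on-io-error detach" (as in the config), a primary that loses its backing disk keeps running Diskless and serves every read over the network, which would look exactly like this kind of read penalty. I have not confirmed that this is what happened here, but it is cheap to look for:

    # On the primary: Diskless here means no local copy is attached.
    drbdadm dstate NAS
    cat /proc/drbd

    # Repeat the read test and watch which devices are actually busy.
    dd if=/dev/drbd1 of=/dev/null bs=1M &
    iostat -k 2      # no md2 activity while drbd1 is read => data is coming from the peer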