[DRBD-user] Stop master mount access during a slave network failure in C protocol?

Mon Aug 31 18:22:51 CEST 2015

On 31/08/15 11:27 AM, Mayk Eskilla wrote:
> Hi list
> 
> I'm testing drbd C protocol with two ext4 partitions on 2 banana pi's and I noticed that C protocol does not stop a copying process into the master mount when I disconnect the slave network cable. There is a short delay in copying, but the process then continues with /proc/drbd mentioning "Network failure" and "Waiting for connection" plus slave being in inconsistent state. Re-plugging the slave network cable will trigger a successful sync from master to slave and both are up-to-date again.

Do you have two drbd resources, each with ext4 and mounted on one node
at a time? If you're mounting an ext4 partition on both nodes, you will
corrupt the file system very quickly.

Also note that without fencing, it's very possible to get a split-brain.

> My question is simply this: how come C protocol does not block master mount write access, when data can not safely be written to the slave? Is this considered a heartbeat's task, so drbd does not react itself? Or can I modify the drbd.conf, so at least disk writes into the master are stopped when the slave is disconnected?

If the Primary node loses connection to the Secondary, it starts marking
the changed blocks in a "dirty blocks list". Later, when the Secondary
reconnects, the Primary starts sync'ing those dirty blocks over to the
peer at the rate set in 'syncer { rate xM; }'.

Note that replication (replicating new data to both nodes when
connected) always goes as fast as possible. When dirty blocks need to be
sync'ed, the speed given for this is taken away from the replication
rate, causing your writes to feel slow.

> Attached is my minimalistic drbd.conf
> 
> cat /etc/drbd.conf
> global { usage-count no; }
> common { syncer { rate 100M; } }

This is way too high. Try 20M on an rpi.
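
As a rough sketch, assuming you keep the 8.3-style 'syncer' section from
your config, the change would look something like this:

```
common {
        syncer {
                # Resync ceiling only; 20M is a starting guess, tune it
                # for your hardware.
                rate 20M;
        }
}
```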

> resource r0 {
>         protocol C;
>         startup {
>                 wfc-timeout  15;
>                 degr-wfc-timeout 60;
>         }
>         net {
>                 cram-hmac-alg sha256;
>                 shared-secret "secret";
>         }
>         on Pi1 {
>                 device /dev/drbd0;
>                 disk /dev/sda1;
>                 address 192.168.1.11:7789;
>                 meta-disk internal;
>         }
>         on Pi2 {
>                 device /dev/drbd0;
>                 disk /dev/sda1;
>                 address 192.168.1.12:7789;
>                 meta-disk internal;
>         }
> }
> 
> There is no heartbeat service involved as of now, so I'm assigning roles myself with drbdadm.

When the time comes, use corosync + pacemaker. Heartbeat is long
deprecated. When you do set up pacemaker, figure out what device you can
use for fencing and configure/test stonith before anything else. Once
that is working, configure drbd's fencing to 'resource-and-stonith' and
configure the 'crm-{un,}fence-peer.sh' handlers.

This will prevent split-brains and make your cluster much safer and more
stable.
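
A minimal sketch of what that could look like for your r0 resource,
assuming DRBD 8.x (where 'fencing' lives in the 'disk' section) and
assuming the handler scripts are installed under /usr/lib/drbd/; check
both against your own install:

```
resource r0 {
        disk {
                # Freeze I/O on a disconnected Primary and call fence-peer.
                fencing resource-and-stonith;
        }
        handlers {
                # Paths are an assumption; adjust to where your
                # packages put the scripts.
                fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
                after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
        }
        # ... rest of your existing r0 config unchanged ...
}
```

With 'resource-and-stonith', a Primary that loses its peer freezes I/O
until the fence handler returns, which is probably the closest thing to
the blocking behaviour you asked about.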

> Regards
> 
> Mayk

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?