[DRBD-user] High IOwait with rsync on DRBD-OCFS2-NFS node

Dan Barker dbarker at visioncomm.net
Sat Jan 4 18:59:44 CET 2014


> 1/4/2014 9:17 PM, Dan Barker wrote:
> >> -----Original Message-----
> >> From: drbd-user-bounces at lists.linbit.com [mailto:drbd-user-
> >> bounces at lists.linbit.com] On Behalf Of Vasily S. Kostroma
> >> Sent: Friday, January 03, 2014 7:05 PM
> >> To: drbd-user at lists.linbit.com
> >> Subject: [DRBD-user] High IOwait with rsync on DRBD-OCFS2-NFS node
> >>
> >> Happy holidays, friends.
> >> It's my first message in this list.
> >>
> >> I have two identical servers with two logical drives each - one for
> >> the system and a second one for a DRBD primary-primary configuration.
> >> The servers are connected via dedicated Gb NICs with a crossover
> >> cable. The operating system is Debian 7, and the file system on DRBD
> >> is OCFS2. Additionally, I am using an NFS server to share files from
> >> the DRBD drive to other servers. The configuration is very simple.
> >> Just one resource with internal metadata. Nothing special.
> >>
> >> But I have a problem and, unfortunately, I have no idea how to fix it.
> >> I am using Node1 as primary and Node2 as secondary. I mean clients
> >> (web servers) connect to Node1 via NFS, and backups run on Node2.
> >> Example: I start rsync on Node2 to sync a local folder with a folder
> >> on the same server, but on DRBD+OCFS2. It works fine, but during the
> >> backup the NFS server on Node1 stops working for as long as the backup
> >> on Node2 is in progress. The NFS processes (threads) are in the "D"
> >> state in top. After the backup completes, NFS starts working perfectly
> >> again. Just to stress: I run rsync on Node2, but NFS stops working on
> >> Node1.
> >>
> >> I would appreciate any idea of what is wrong. All settings are almost
> >> default, but the problem is there :(
> >>
> >> Sorry for my English.
> >> And thank you in advance.
> >> Happy holidays!
> >>
> > Happy Holidays back at you! It's too cold for my liking, however.
> >
> > I do not believe you can run rsync successfully against an OCFS2 file
> > system in that way. Your experience appears to bear that out. You have
> > several solutions that could allow this sort of hot backup, however.
> >
> > 1) Disconnect the resource before taking the rsync backup. This will
> > effectively "snapshot" the resource at the time of the disconnect.
> > Node1 will continue serving the shares on NFS, but its disk will
> > slowly diverge from the "snapshot" on Node2. Be sure to mount the
> > resource read-only on Node2 or you will have a split brain for sure
> > (noatime is probably not sufficient to protect you). As soon as the
> > backup is complete, unmount and connect the resource, and the disks
> > will rapidly resynchronize. PROS: No interruption to NFS, and you
> > produce a "crash consistent" version of your filesystem for the rsync
> > backup. CONS: No redundancy during the time the backup takes.
> >
> > 2) Run your rsync backup from Node1.
> >
> > 3) There may be a snapshot facility in OCFS2. I do not know - I've
> > used OCFS2 primarily with Xen and use reflink for snapshots. However,
> > reflink snapshots a single file, and I believe you want to snapshot a
> > whole file system. If there is an appropriate facility in OCFS2,
> > produce the snapshot on Node1, and then back up the snapshot from
> > Node2. That shouldn't lock out NFS on Node1 accessing the live disk -
> > the snapshot is static.
> >
> > hth
> >
> > Dan in Atlanta

> -----Original Message-----
> From: Vasily S. Kostroma [mailto:admin at v-sf.info]
> Sent: Saturday, January 04, 2014 12:44 PM
> To: Dan Barker
> Subject: Re: [DRBD-user] High IOwait with rsync on DRBD-OCFS2-NFS node
> 
> Thank you for the fast reply.
> Unfortunately, looks like I did not explain my problem correctly. Sorry
> for that, English is not my native language.
> Please let me explain one more time.
> 
> The DRBD configuration is primary-primary because I want to use one node
> for NFS shares and the second one for backups. I have several types of
> backups:
> - backup of web sites from DRBD to DRBD (just a copy of folders within
> the same drive/server)
> - backup of a lot of files from remote servers to the DRBD drive
> 
> I always use the "secondary" node for backups because I do not want to
> put any additional load on the "primary" node. My problem is with the
> second type of backup. I run a script which syncs data from remote
> servers to the DRBD drive on Node2. Nothing special, just rsync... But
> while the script is running on "secondary" Node2, NFS stops working and
> IOwait quickly increases on "primary" Node1.
> Do you think it's an OCFS2 limitation?
> 
> Sorry for any misunderstanding,
> Best regards,
> Vasily

You explained your problem correctly. You are trying to do something that is not a good idea.

You do not have a "secondary" node, you have two primaries. You are using Node1 as a primary and you are trying to use Node2 as a secondary, but you had to utilize dual-primary to make Node2 mount the resource.

I do not think this is a limitation in OCFS2; I think it's a benefit of OCFS2. With a non-clustered file system, you'd simply have a very corrupted backup, if not a split-brain. With OCFS2, one node waits until the other node is through messing with resources and you are probably protected from split brain and other corruption. You simply have unacceptable performance issues.

If everything worked like you want, you'd have inconsistent backups. As rsync reads the head of a file to back it up, NFS can be writing the tail of the same file, and the other way around too! Not a good situation for a backup.
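
To make the head/tail problem concrete, here is a minimal sketch in Python. It is a toy simulation, not real file I/O: the function name torn_copy and the in-memory "file" are made up for illustration, and the writer is assumed to replace the whole file between the two reads.

```python
def torn_copy(old: bytes, new: bytes) -> bytes:
    """Copy a file in two halves while a writer replaces it in between."""
    assert len(old) == len(new)
    live = bytearray(old)        # stands in for the file on the shared disk
    half = len(live) // 2
    head = bytes(live[:half])    # the backup reads the head first
    live[:] = new                # meanwhile the other node rewrites the file
    tail = bytes(live[half:])    # the backup reads the tail afterwards
    return head + tail


if __name__ == "__main__":
    # The copy matches neither the old nor the new version of the file.
    print(torn_copy(b"AAAAAAAA", b"BBBBBBBB"))  # prints b'AAAABBBB'
```

The resulting copy is a mix of two versions, which is the kind of backup that runs without error but does not restore cleanly.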

Someone else may have some other ideas to accomplish your goals, but I'm about out of ideas. Remember, you do not simply want a backup that runs smoothly, you want something you can restore cleanly - and restoring corrupted files does not meet that goal.
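
For reference, the order of steps in option 1 from the earlier message (disconnect, mount read-only, rsync, unmount, reconnect) could be sketched roughly as below. This is an untested illustration that only builds the command list and does not run anything. The resource name r0, the device /dev/drbd0 and both paths are assumptions, not taken from the configuration discussed here, and it assumes the file system is not already mounted on Node2.

```python
from typing import List


def backup_plan(resource: str, device: str,
                mountpoint: str, dest: str) -> List[List[str]]:
    """Return the commands for option 1, in order, without running them."""
    return [
        # Stop replication so Node2's disk stays frozen at this point in time.
        ["drbdadm", "disconnect", resource],
        # Mount read-only so nothing is written on Node2 while disconnected.
        ["mount", "-o", "ro", device, mountpoint],
        # Copy the frozen, crash-consistent view to the backup location.
        ["rsync", "-a", mountpoint + "/", dest],
        ["umount", mountpoint],
        # Reconnect; DRBD then resynchronizes the blocks changed on Node1.
        ["drbdadm", "connect", resource],
    ]


if __name__ == "__main__":
    # Hypothetical names; print the plan for review instead of executing it.
    for cmd in backup_plan("r0", "/dev/drbd0", "/mnt/drbd", "/backup/drbd/"):
        print(" ".join(cmd))
```

Printing the plan rather than executing it is deliberate: each step should be checked against the real resource and mount configuration before anything is run on a production node.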

Dan in Atlanta


