[DRBD-user] DRBD + ZFS
David Bruzos
david.bruzos at jaxport.com
Mon Aug 30 13:26:06 CEST 2021
Hi Eric,
Sorry about the delay. The article you provided is interesting, but rather specific to a workload that would show rather dramatic results on VDO. In your case, the main objective is making the most out of your NVMe storage while maintaining good performance. The article would be very much applicable if you were doing replication over a slow WAN link or something like that, but I imagine that the network is not going to be a bottleneck for you, so saving throughput at the DRBD layer is probably not a big advantage.
The real space and performance killer (if done wrong) in your case is going to be block alignment for the MySQL workload. Depending on your underlying storage's optimal block size (usually 4KB) and the vdev type you want to use (e.g. raidz, mirror), you will have to make sure that everything is optimized for MySQL's 16KB writes. As I pointed out earlier, mirror will be simplest/fastest; raidz is doable, but will be slower for writes (which may not matter if you have enough IOPS). The key is that with raidz, you will have to take more factors into account to ensure everything is optimal. In my case, for example, my newest setup uses raidz and compression to make the most out of my NVMe, but I use ashift=9 (512-byte blocks) so that I can make 4K zvols for my VMs and still greatly benefit from compression.
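To make the ashift/block size interaction on raidz concrete, here is a rough Python sketch of how much space a raidz vdev allocates for a single block. It follows the commonly described OpenZFS raidz sizing rule (data sectors, plus parity sectors for each stripe row, padded to a multiple of parity+1 sectors). The helper names, the 5-disk raidz1 layout, and the block sizes are assumptions chosen for illustration, not figures from any real pool, and the sketch ignores metadata overhead.

```python
import math


def raidz_allocated_bytes(block_bytes, ashift, disks, parity=1):
    """Estimate the bytes a raidz vdev allocates for one block.

    Hypothetical helper for illustration; block_bytes is the size of the
    block as written (after compression, if any).
    """
    sector = 1 << ashift
    data_sectors = math.ceil(block_bytes / sector)
    # Each stripe row holds up to (disks - parity) data sectors and needs
    # `parity` parity sectors of its own.
    parity_sectors = parity * math.ceil(data_sectors / (disks - parity))
    total = data_sectors + parity_sectors
    # The allocation is padded up to a multiple of (parity + 1) sectors.
    total = math.ceil(total / (parity + 1)) * (parity + 1)
    return total * sector


def efficiency(block_bytes, ashift, disks, parity=1):
    """Fraction of the allocated space that holds actual data."""
    return block_bytes / raidz_allocated_bytes(block_bytes, ashift, disks, parity)


if __name__ == "__main__":
    # Assumed layout: 5-disk raidz1. Compare a 4K zvol block and a 16K block.
    for label, block in (("4K block", 4096), ("16K block", 16384)):
        for ashift in (9, 12):
            alloc = raidz_allocated_bytes(block, ashift, disks=5)
            eff = efficiency(block, ashift, disks=5)
            print(f"{label}, ashift={ashift}: {alloc} bytes allocated ({eff:.0%})")
```

Under these assumptions, a 4K block takes 8192 bytes at ashift=12 (no better than a mirror) but 5120 bytes at ashift=9, and a 16K block takes 24576 versus 20480 bytes. A 4K block that compresses to 2K still takes 8192 bytes at ashift=12, but only 3072 bytes at ashift=9, which is why small zvol blocks benefit from compression only when the sector size is smaller than the block.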
It is important to point out that the raidz details are not unique to ZFS. Most people who use traditional raid5 setups use them in a suboptimal manner, get terrible performance, and either can't tell or eventually move to raid10 because "raid5 sucks". In any case, to answer your question, I would still use ZFS instead of VDO for multiple reasons, and I would still use it only under DRBD in this case. You have a standard workload, so you should be able to optimize it to fit your objectives.
Here is a good article about mysql on ZFS that should get you started:
https://shatteredsilicon.net/blog/2020/06/05/mysql-mariadb-innodb-on-zfs/
David
--
David Bruzos (Systems Administrator)
Jacksonville Port Authority
2831 Talleyrand Ave.
Jacksonville, FL 32206
Cell: (904) 625-0969
Office: (904) 357-3069
Email: david.bruzos at jaxport.com
On Tue, Aug 24, 2021 at 09:26:22PM +0000, Eric Robinson wrote:
> EXTERNAL
> This message is from an external sender.
> Please use caution when opening attachments, clicking links, and responding.
> If in doubt, contact the person or the helpdesk by phone.
> ________________________________
>
>
> Hi David --
>
> Here is a link to a Linbit article about using DRBD with VDO. While the focus of this article is VDO, I assume the compression recommendation would apply to other technologies such as ZFS. As the article states, their goal was to compress data before it gets passed off to DRBD, because then DRBD replication is faster and more efficient. This was echoed in some follow-up conversation I had with a Linbit rep (or someone from Red Hat, I forget which).
>
> https://linbit.com/blog/albireo-virtual-data-optimizer-vdo-on-drbd/
>
> My use case is multi-tenant MySQL servers. I'll have 125+ separate instances of MySQL running on each cluster node, all out of separate directories and listening on separate ports. The instances will be divided into 4 sets of 50, which live on 4 separate filesystems, on 4 separate DRBD disks. I've used this approach before very successfully with up to 60 MySQL instances, and now I'm dramatically increasing the server power and doubling the number of instances. 4 separate DRBD threads will handle the replication. I'll be using corosync+pacemaker for the HA stack. I'd really like to compress the data and make the most of the available NVME media. The servers do not have RAID controllers. I'll be using ZFS, mdraid, or LVM to create 4 separate arrays for my DRBD backing disks.
>
> --Eric
>
> > -----Original Message-----
> > From: David Bruzos <david.bruzos at jaxport.com>
> > Sent: Tuesday, August 24, 2021 2:03 PM
> > To: Eric Robinson <eric.robinson at psmnv.com>
> > Cc: rabin at isoc.org.il; drbd-user at lists.linbit.com
> > Subject: Re: [DRBD-user] DRBD + ZFS
> >
> > Hello Eric:
> >
> > > What degree of performance degradation have you observed with DRBD
> > > over ZFS? Our servers will be using NVME drives with 25Gbit networking:
> >
> > Unfortunately, I have not had the time to properly benchmark and
> > compare a setup like yours with DRBD on top of ZFS. Very superficial tests
> > show that my I/O is more than sufficient for my workload, so I'm more
> > interested in the data integrity, snapshotting, compression, etc. I would not
> > want to create misinformation by sharing I/O stats that are not taking into
> > account the many aspects of a proper ZFS benchmark and that are not being
> > compared against an alternative setup.
> > In the days of spinning rust storage, I always used mirrored vdevs, always
> > added a fast ZIL, lots of RAM for ARC and a couple of caching devices for
> > L2ARC, so the performance was great when compared with the alternatives.
> >
> > > Since you don't recommend having ZFS above DRBD, what filesystem do
> > > you use over DRBD?
> >
> > I've always had good results with XFS on LVM (very thin). That combination
> > usually gives you good flexibility at the VM level and the performance is
> > great. These days, ext4 is a reasonable choice, but I still use XFS most of the
> > time.
> > I would like to see what other folks think about the XFS+LVM combination
> > for VMs vs something like ext4+LVM.
> >
> > > Linbit recommends that compression take place above DRBD rather than
> > > below. What are your thoughts about their recommendation versus your
> > > approach?
> >
> > If you can provide a link to their recommendation, I can be more specific.
> > In any case, I'm sure their recommendation is reasonable depending on what
> > your specific workload is. In my case, I mostly use compression at the
> > backing storage level, because it gives me a predictable and well understood
> > VM environment where I can run a wide variety of guest operating systems,
> > applications, workloads, etc, without having to worry about the specifics for
> > each possible VM scenario.
> > The reason I normally don't use ZFS for VMs is because I believe it best
> > serves its purpose at the backing storage level for many reasons. ZFS is
> > designed to leverage lots of RAM for ARC, to handle the storage directly, to
> > do many things with your hardware that are very much abstracted away at
> > the guest level.
> >
> > What is your specific usage scenario?
> >
> >
> >
> > On Tue, Aug 24, 2021 at 03:21:10PM +0000, Eric Robinson wrote:
> > > Hi David --
> > >
> > > Thanks for your feedback! I do have a couple of follow-up
> > > questions/comments.
> > >
> > > What degree of performance degradation have you observed with DRBD
> > > over ZFS? Our servers will be using NVME drives with 25Gbit networking.
> > > Since you don't recommend having ZFS above DRBD, what filesystem do
> > > you use over DRBD?
> > > Linbit recommends that compression take place above DRBD rather than
> > > below. What are your thoughts about their recommendation versus your
> > > approach?
> > >
> > > --Eric
> > >
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: David Bruzos <david.bruzos at jaxport.com>
> > > > Sent: Saturday, August 21, 2021 8:34 AM
> > > > To: Eric Robinson <eric.robinson at psmnv.com>
> > > > Cc: rabin at isoc.org.il; drbd-user at lists.linbit.com
> > > > Subject: Re: [DRBD-user] DRBD + ZFS
> > > >
> > > > Hello folks,
> > > > I've used DRBD over ZFS for many years and my experience has
> > > > been very positive. My primary use case has been virtual machine
> > > > backing storage for Xen hypervisors, with dom0 running ZFS and DRBD.
> > > > The realtime nature of DRBD replication allows for VM migrations,
> > > > etc, and ZFS makes remote incremental backups awesome. Overall, it
> > > > is a combination that is hard to beat.
> > > >
> > > > * Key things to keep in mind:
> > > >
> > > > . The performance of DRBD on ZFS is not the best in the world,
> > > > but the benefits of a properly configured and used setup far
> > > > outweigh the performance costs.
> > > > . If you are not limited by storage size (typical when using
> > > > rotating disks), I would absolutely recommend mirror vdevs with
> > > > ashift=12 for best results in most circumstances.
> > > > . If space is a limiting factor (typical with SSD/NVME), I use
> > > > raidz, but careful considerations have to be made, so you don't end
> > > > up wasting tons of space, because of ashift/blocksize/striping issues.
> > > > . Compression works great under the DRBD devices, but
> > > > volblocksize/ashift details are extremely important to get the most
> > > > out of it.
> > > > . I would not create additional ZFS file systems on top of the
> > > > DRBD devices for compression or any other intensive feature, just
> > > > not worth it, you want that as close to the physical storage as possible.
> > > >
> > > > I do run a few ZFS file systems on virtual machines that are
> > > > backed by DRBD devices on top of ZFS, but I am after other ZFS
> > > > features in those cases. The VMs running ZFS have compression=off,
> > > > no vdev redundancy, optimized volblocksize for the
> > > > situation/workload in question, etc. My typical go-to filesystem for
> > > > VMs is XFS, because it is lean-and-mean and has the kind of features
> > > > that everyone should want in a general purpose FS.
> > > >
> > > > If you have specific questions, let me know.
> > > >
> > > > David
> > > >
> > > >
> > > > On Fri, Aug 20, 2021 at 11:32:31AM +0000, Eric Robinson wrote:
> > > > > My main motivation is the desire for a compressed filesystem. I
> > > > > have experimented with using VDO for that purpose and it works, but
> > > > > the setup is complex and I don’t know if I trust it to work well when
> > > > > VDO is in a stack of Pacemaker cluster resources. Is there a better
> > > > > way of getting compression to work above DRBD?
> > > > >
> > > > > -Eric
> > > > >
> > > > >
> > > > > From: rabin at isoc.org.il <rabin at isoc.org.il>
> > > > > Sent: Thursday, August 19, 2021 4:43 PM
> > > > > To: Eric Robinson <eric.robinson at psmnv.com>
> > > > > Cc: drbd-user at lists.linbit.com
> > > > > Subject: Re: [DRBD-user] DRBD + ZFS
> > > > >
> > > > > Not sure ZFS is the right choice as an underlying layer for a
> > > > > resource. It is powerful but also complex (as a code base), which
> > > > > will probably make it slow.
> > > > >
> > > > > Unless you are going to expose the ZVOL or the dataset directly to
> > > > > be consumed, stacking ZFS over DRBD over ZFS seems to me like a bad
> > > > > idea.
> > > > >
> > > > >
> > > > >
> > > > > Rabin
> > > > >
> > > > >
> > > > > On Wed, 18 Aug 2021 at 09:37, Eric Robinson
> > > > > <eric.robinson at psmnv.com> wrote:
> > > > > I’m considering deploying DRBD between ZFS layers. The lowest
> > > > > layer RAIDZ will serve as the DRBD backing device. Then I would build
> > > > > another ZFS filesystem on top to benefit from compression. Any
> > > > > thoughts, experiences, opinions, positive or negative?
> > > > >
> > > > > --Eric
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > Disclaimer : This email and any files transmitted with it are
> > > > > confidential and
> > > > intended solely for intended recipients. If you are not the named
> > > > addressee you should not disseminate, distribute, copy or alter this
> > > > email. Any views or opinions presented in this email are solely
> > > > those of the author and might not represent those of Physician
> > > > Select Management. Warning: Although Physician Select Management
> > has
> > > > taken reasonable precautions to ensure no viruses are present in
> > > > this email, the company cannot accept responsibility for any loss or
> > damage arising from the use of this email or attachments.
> > > > > _______________________________________________
> > > > > Star us on GITHUB: https://github.com/LINBIT drbd-user mailing
> > > > > list drbd-user at lists.linbit.com
> > > > > https://lists.linbit.com/mailman/listinfo/drbd-user
> > > >
> > > >
> > > > --
> > > >
> > __________________________________________________________
> > > > ______________________________________
> > > >
> > > > Please note that under Florida's public records law (F.S. 668.6076),
> > > > most written communications to or from the Jacksonville Port
> > > > Authority are public records, available to the public and media upon
> > > > request. Your email communications may therefore be subject to
> > > > public disclosure. If you have received this email in error, please
> > > > notify the sender by return email and delete immediately without
> > forwarding to others.
> >