[DRBD-user] Ideas for splitting data between A and C protocol devices - safe or not?

Thu Aug 14 19:03:05 CEST 2008

Thanks for the great answers - I was just about to respond to myself  
anyways to say "hmm this was a bad idea", but you provide additional  
insights. :)

At this point I'm really only considering Protocol C, since I believe
I depend on synchronous, ordered writes, and the speed I'm seeing in
bonnie++ tests is well beyond what our old standalone server
provided (I still need to run more database benchmarks, but it's
looking great).

On Aug 14, 2008, at 3:47 AM, Lars Ellenberg wrote:
> which is already risky, as DRBD currently has no means to ensure
> consistency (write-ordering) across multiple devices,
> so in case of crash/disconnect/failover,
> your data files may be more recent than your wal-files,
> or your wal-files may be more recent than your data files.

Well, when using Protocol C on both devices, it should be safe, no?
It is a relatively common tuning for PostgreSQL to store the WAL on a
separate partition. Since every write to the db must go to the WAL
first, this increases overall write throughput and reduces delays for
other processes using the data partition, because the I/O impact of
WAL writing is taken off it. I would expect that with Protocol C, and
PostgreSQL calling fdatasync for each write, the order should be
okay. If not, can you explain a little more as to why? I could
replace the 10-disk raid10 and 2-disk raid1 with a single 12-disk
raid10, but I'd like to keep the separation if possible, as this has
worked out very well for me with standalone disks in the past.
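
A minimal sketch of the ordering argument above, in Python rather than
anything PostgreSQL actually runs. The names durable_append and commit
are hypothetical, and syncing the data file on every commit is a
simplification for illustration. The point is only that the data write
is not issued until the sync on the WAL file has returned; whether that
guarantee survives a crash or disconnect across two separate DRBD
devices is exactly the open question.

```python
import os

# Assumption: fall back to fsync on platforms without fdatasync.
_sync = getattr(os, "fdatasync", os.fsync)


def durable_append(path, payload):
    """Append payload and return only after the sync call has returned."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, payload)
        _sync(fd)  # blocks until the lower layers report completion
    finally:
        os.close(fd)


def commit(wal_path, data_path, record):
    """Write-ahead ordering: the WAL record is synced before the data write starts."""
    # Step 1: WAL first, on its own device in the setup discussed here.
    durable_append(wal_path, b"WAL " + record + b"\n")
    # Step 2: the data file is touched only after step 1 has returned.
    durable_append(data_path, b"DATA " + record + b"\n")
```

With wal_path and data_path on different replicated devices, the
application-level order is WAL then data, but nothing in this sketch
tells the replication layer that the two devices belong together.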

> you cannot apply today's wal-files to last week's base file backup,
> right?

You can if you have all of the wal files since that backup was made  
and apply them in order, yes.  This is what originally got me thinking  
of the possibility.  But PostgreSQL requires fsync to be properly  
honored to guarantee consistency in this manner.
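
The "all of the wal files, applied in order" condition can be sketched
as a gap check. This is an illustration only: the integer sequence
numbers and the function name replayable are assumptions, and real
PostgreSQL WAL segments are named differently.

```python
def replayable(backup_end_seq, wal_seqs):
    """Return the segments to replay, in order, or None if one is missing.

    backup_end_seq: hypothetical number of the last segment already
                    contained in the base backup.
    wal_seqs:       hypothetical numbers of the archived segments on hand.
    """
    needed = sorted(s for s in set(wal_seqs) if s > backup_end_seq)
    expected = backup_end_seq + 1
    for seq in needed:
        if seq != expected:
            return None  # a gap: replay cannot continue past this point
        expected += 1
    return needed
```

For example, replayable(3, [5, 4, 6]) gives [4, 5, 6], while
replayable(3, [4, 6]) gives None because segment 5 is missing.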

> as long as one node always remains up...

Yes, I'd be willing to live with that assumption - these two servers  
will exist in the same rack in the same datacenter with all the rest  
of our hardware, so it's not a good disaster-recovery solution anyways  
- if we lose power the entire business is down and at that point  
recovering from backups is an acceptable path to recovery (for the  
time being, anyways).

>  with drbd protocol "A",
>  peer node may not yet know anything about that commit.
>
>  commit was lost.

Losing a few commits, while better to avoid, would not be nearly so  
bad as having a data partition that PostgreSQL couldn't self-recover  
from.  We are facing some serious performance bottlenecks from our  
database now, so the chance of losing a small amount of data was less  
concerning than getting the immediate problem solved.  We'll be  
reducing write I/O requirements through software rearchitecture over  
the coming year, but need to put the fire out first. :)  And  
currently, we have only a standalone server, and a backup that's  
several hours out of date at any point and takes a couple hours to  
recover from.
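
A toy model of the lost-commit case quoted above, not DRBD code. It
assumes the documented behaviour that Protocol A reports a write as
complete once it is on the local disk and queued for sending, while
Protocol C waits for the peer as well. The class Replica and its
methods are hypothetical.

```python
class Replica:
    """Toy primary/peer pair that tracks which acknowledged writes reached the peer."""

    def __init__(self, protocol):
        self.protocol = protocol  # "A" or "C"
        self.local = []           # blocks on the primary's disk
        self.peer = []            # blocks on the peer's disk
        self.in_flight = []       # blocks queued for sending, not yet on the peer
        self.acked = []           # blocks reported complete to the application

    def write(self, block):
        self.local.append(block)
        if self.protocol == "C":
            self.peer.append(block)       # wait for the peer before acknowledging
        else:
            self.in_flight.append(block)  # acknowledge while still queued
        self.acked.append(block)

    def drain(self):
        """The network catches up: queued blocks reach the peer."""
        self.peer.extend(self.in_flight)
        self.in_flight.clear()

    def primary_crash(self):
        """Primary dies; return acknowledged blocks the peer never received."""
        self.in_flight.clear()
        return [b for b in self.acked if b not in self.peer]
```

In this model a Protocol A pair that crashes right after acknowledging
a commit reports that commit as lost, while a Protocol C pair reports
nothing lost. It says nothing about ordering between two devices.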

But certainly I'd like to get the most reliability possible if the  
performance hit doesn't kill us, and it's looking like using protocol  
C for everything will provide ample speed.

I will post some benchmarks shortly.

Cheers,
-- 
Casey Allen Shobe
Database Architect, The Berkeley Electronic Press
cshobe at bepress.com (email/jabber/aim/msn)
http://www.bepress.com | +1 (510) 665-1200 x163