[DRBD-user] Ideas for splitting data between A and C protocol devices - safe or not?

Lars Ellenberg lars.ellenberg at linbit.com
Sat Aug 16 16:05:10 CEST 2008

On Thu, Aug 14, 2008 at 10:03:05AM -0700, Casey Allen Shobe wrote:
> Thanks for the great answers - I was just about to respond to myself  
> anyways to say "hmm this was a bad idea", but you provide additional  
> insights. :)
> At this point, I'm only really thinking about Protocol C as I believe I 
> depend on synchronous ordered writes and the speed I'm seeing with  
> bonnie++ tests is well beyond what our old standalone server was  
> providing (still need to do more database benchmarks, but it's looking  
> great).
> On Aug 14, 2008, at 3:47 AM, Lars Ellenberg wrote:
>> which is already risky, as DRBD currently has no means to ensure
>> consistency (write-ordering) across multiple devices,
>> so in case of crash/disconnect/failover,
>> your data files may be more recent than your wal-files,
>> or your wal-files may be more recent than your data files.
> Well, when using Protocol C on both devices, it should be safe, no?  It 
> is a relatively common tuning for PostgreSQL to store the WAL on  
> separate partition - as all writes to the db must go to the wal first  
> also, this increases overall write throughput and reduces delay to other 
> processes dealing with the data partition as the I/O impact of WAL 
> writing is removed.  I would believe that with Protocol C, and  
> PostgreSQL calling fdatasync for each write, the order should be okay.  
> If not can you explain a little more as to why?  I could replace the 
> 10-disk raid10 and 2-disk raid1 with a single 12-disk raid10, but I'd 
> like to keep the separation if possible, as this has worked out very well 
> for me with standalone disks in the past.

if you have two drbd,
you have this race:
  both are under write load,
  for some reason only one of them loses/restarts its replication link,
  then the primary crashes right there.

after failover, one drbd may be ahead of the other
for some logically connected operations.

point being, write order dependencies across multiple drbd
cannot be ensured in the face of component failures.

it is unlikely to cause any harm in the real world, but it may.
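
to make the race concrete, here is a toy python model. it is not drbd
code: the ReplicatedDevice class and the block names are made up for
illustration, and it assumes a disconnected device simply keeps
accepting writes locally while its peer sees nothing.

```python
# Toy model only, not DRBD code. Each device keeps the writes that hit
# the primary's disk and the writes that reached the peer.

class ReplicatedDevice:
    def __init__(self, name):
        self.name = name
        self.primary = []      # writes on the primary's local disk
        self.peer = []         # writes the peer has received
        self.connected = True  # state of this device's own replication link

    def write(self, block):
        self.primary.append(block)
        if self.connected:
            # link is up: the peer gets the write as well
            self.peer.append(block)


def run_scenario():
    wal = ReplicatedDevice("wal")
    data = ReplicatedDevice("data")

    # transaction 1: WAL record first, then the data page; both links up
    wal.write("wal-1")
    data.write("page-1")

    # only the WAL device loses its replication link
    wal.connected = False

    # transaction 2: same order on the primary, WAL before data
    wal.write("wal-2")
    data.write("page-2")

    # the primary crashes here; the peer takes over with what it has
    return wal.peer, data.peer


wal_on_peer, data_on_peer = run_scenario()
print(wal_on_peer)   # ['wal-1']
print(data_on_peer)  # ['page-1', 'page-2']
```

on the primary the order was always WAL first, yet the peer ends up
with a data page whose WAL record it never saw. swap which link drops
and you get the opposite case, WAL ahead of data.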

>> you cannot apply todays wal-files to last-weeks base file backup,
>> right?
> You can if you have all of the wal files since that backup was made and 
> apply them in order, yes.  This is what originally got me thinking of the 
> possibility.  But PostgreSQL requires fsync to be properly honored to 
> guarantee consistency in this manner.
>> as long as one node always remains up...
> Yes, I'd be willing to live with that assumption - these two servers  
> will exist in the same rack in the same datacenter with all the rest of 
> our hardware, so it's not a good disaster-recovery solution anyways - if 
> we lose power the entire business is down and at that point recovering 
> from backups is an acceptable path to recovery (for the time being, 
> anyways).
>>  with drbd protocol "A",
>>  peer node may not yet know anything about that commit.
>>  commit was lost.
> Losing a few commits, while better to avoid, would not be nearly so bad 
> as having a data partition that PostgreSQL couldn't self-recover from.  

your statement was: 
 "as long as one node always remains up, data is never lost"
I just pointed out that with drbd protocol A, this is not the case.
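
a rough sketch of the difference, again as a toy python model and not
drbd code. the commit() function and its parameter are hypothetical.
it assumes protocol A reports completion once the local write is done
and the packet is queued for sending, while protocol C reports
completion only after the peer has confirmed the write. it only covers
the case where the primary dies before the packet arrives.

```python
def commit(protocol, packet_reaches_peer):
    """Toy model of one commit. Returns (acked_to_application, on_peer).

    packet_reaches_peer: whether the replication packet arrives at the
    peer before the primary dies.
    """
    on_peer = packet_reaches_peer
    if protocol == "A":
        # completion is reported after the local write alone
        acked = True
    elif protocol == "C":
        # completion is reported only once the peer has the write
        acked = on_peer
    else:
        raise ValueError("unknown protocol: %r" % protocol)
    return acked, on_peer


for protocol in ("A", "C"):
    acked, on_peer = commit(protocol, packet_reaches_peer=False)
    lost = acked and not on_peer
    print(protocol, "acked:", acked, "on peer:", on_peer, "lost commit:", lost)
# A acked: True on peer: False lost commit: True
# C acked: False on peer: False lost commit: False
```

with A the application was told "committed" for something the
surviving node never received. with C the application never got the
acknowledgement, so nothing it believes committed is missing.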

> We are facing some serious performance bottlenecks from our database now, 
> so the chance of losing a small amount of data was less concerning than 
> getting the immediate problem solved.

that is exactly the tradeoff with protocol A.

> We'll be reducing write I/O requirements through software
> rearchitecture over the coming year, but need to put the fire out
> first. :)  And currently, we have only a standalone server, and a
> backup that's several hours out of date at any point and takes a
> couple hours to recover from.
> But certainly I'd like to get the most reliability possible if the
> performance hit doesn't kill us, and it's looking like using protocol
> C for everything will provide ample speed.
> I will post some benchmarks shortly.

thanks, that is appreciated.

: Lars Ellenberg                
: LINBIT HA-Solutions GmbH
: DRBD®/HA support and consulting    http://www.linbit.com

DRBD® and LINBIT® are registered trademarks
of LINBIT Information Technologies GmbH
please don't Cc me, but send to list   --   I'm subscribed
