[Drbd-dev] How Locking in GFS works...
Philipp Reisner
philipp.reisner at linbit.com
Mon Oct 4 15:26:15 CEST 2004
On Monday 04 October 2004 15:01, Lars Marowsky-Bree wrote:
> On 2004-10-04T14:56:21, Philipp Reisner <philipp.reisner at linbit.com> wrote:
> > This is intended as food for thought on how we should design our
> > support for shared disk file systems.
>
> I'm still not sure what kind of special support you need. The only
> guarantee you need to provide is that after a barrier all reads on all
> nodes return the same data for those blocks affected by the flush.
>
> The shared disk file system itself will take care of issuing
> appropriate barriers and flushing the OS caches.
>
> Am I missing something? ;-)
>
If everything works (esp. the locking of the shared disk FS), no.
But just consider that the locking of the shared disk FS on
top of us is broken, and that it issues a write request to
the same block number on both nodes.
Then each node would write its own copy first and the peer's
version of the data second to that block number.
=> We would have different data in this block on our
two copies - and we would not even know about it!
What would have happened on a real shared disk?
The real shared disk would have serialized the two writes in some
order, and one of the writes would overwrite the other version.
(This is the basic design idea of proposed solution 1.)
(For proposed solution 2 the lock "granularity" of the
shared disk FS is interesting...)
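
To make that failure scenario concrete, here is a tiny stand-alone C
sketch (an illustration only, not DRBD code): both nodes write to the
same block "simultaneously", each applies its own write first and the
mirrored peer write second, and the two copies silently diverge.

#include <stdio.h>

int main(void)
{
    char node_a_block = '?', node_b_block = '?';

    /* Broken locking above us: node A writes 'A' and node B writes 'B'
     * to the same block number at the same time. */

    /* Node A: local write first, mirrored write from B second. */
    node_a_block = 'A';
    node_a_block = 'B';

    /* Node B: local write first, mirrored write from A second. */
    node_b_block = 'B';
    node_b_block = 'A';

    printf("node A sees '%c', node B sees '%c' -> %s\n",
           node_a_block, node_b_block,
           node_a_block == node_b_block ? "consistent" : "diverged");
    return 0;
}
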
--snip from ROADMAP file--
global write order
As far as I understand the topic so far, we have two options
for establishing a global write order.
Proposed Solution 1, using the order of a coordinator node:
Writes from the coordinator node are carried out just as they
are on the primary node in conventional DRBD (write to disk
and send to peer simultaneously).
Writes from the other node are sent to the coordinator first,
then the coordinator inserts a small "write now" packet into
its stream of write packets.
The other node commits the write to its local IO subsystem as soon
as it gets the "write-now" packet from the coordinator.
Note: With protocol C it does not matter, from a performance
viewpoint, which node is the coordinator.
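
A rough stand-alone sketch of that write path (the packet names and
helper functions are made up for illustration, with the disk and network
calls stubbed out by printf; this is not actual DRBD code):

#include <stdio.h>

enum packet_type {
    P_DATA,        /* write request carrying the data block        */
    P_WRITE_NOW,   /* coordinator: "commit your pending write now" */
};

struct write_req {
    unsigned long block_nr;
};

/* Stub: pretend to submit the block to the local IO subsystem. */
static void submit_to_local_disk(struct write_req *req)
{
    printf("  disk <- block %lu\n", req->block_nr);
}

/* Stub: pretend to send a packet to the peer. */
static void send_to_peer(enum packet_type t, struct write_req *req)
{
    printf("  peer <- %s (block %lu)\n",
           t == P_DATA ? "P_DATA" : "P_WRITE_NOW", req->block_nr);
}

/* Coordinator, local write: write to disk and send to the peer
 * simultaneously, as the primary does in conventional DRBD. */
static void coordinator_local_write(struct write_req *req)
{
    submit_to_local_disk(req);
    send_to_peer(P_DATA, req);
}

/* Coordinator, write received from the other node: carry it out locally
 * (it is a mirrored write) and insert a small "write now" packet into
 * the stream of write packets going back to that node. */
static void coordinator_got_peer_write(struct write_req *req)
{
    submit_to_local_disk(req);
    send_to_peer(P_WRITE_NOW, req);
}

/* Other node: its pending write may only be committed to the local IO
 * subsystem once the coordinator's "write-now" packet has arrived. */
static void other_node_got_write_now(struct write_req *pending)
{
    submit_to_local_disk(pending);
}

int main(void)
{
    struct write_req a = { 7 }, b = { 42 };

    printf("coordinator writes block 7:\n");
    coordinator_local_write(&a);

    printf("other node's write of block 42 arrives at the coordinator:\n");
    coordinator_got_peer_write(&b);

    printf("other node gets write-now and commits block 42:\n");
    other_node_got_write_now(&b);
    return 0;
}
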
Proposed Solution 2, using a dedicated LRU to implement locking:
Each extent in the locking LRU can have one of these states:
requested
locked-by-peer
locked-by-me
locked-by-me-and-requested-by-peer
We allow application writes only to extents which are in
locked-by-me* state.
New Packets:
LockExtent
LockExtentAck
Configuration directives: dl-extents, dl-extent-size
TODO: Need to verify with GFS that this makes sense.
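
To make the states concrete, here is a minimal sketch of the per-extent
state and the write-permission check (the state names follow the list
above; the transition taken when the peer's LockExtent arrives is only
my assumption, not settled design):

#include <stdbool.h>
#include <stdio.h>

enum dl_extent_state {
    DL_REQUESTED,                /* our LockExtent sent, ack pending */
    DL_LOCKED_BY_PEER,
    DL_LOCKED_BY_ME,
    DL_LOCKED_BY_ME_AND_REQUESTED_BY_PEER,
};

/* Application writes are allowed only in the locked-by-me* states. */
static bool may_write(enum dl_extent_state s)
{
    return s == DL_LOCKED_BY_ME ||
           s == DL_LOCKED_BY_ME_AND_REQUESTED_BY_PEER;
}

/* Peer sends LockExtent for an extent we hold: remember the request so
 * the lock can be handed over once our in-flight writes have drained. */
static enum dl_extent_state on_peer_lock_extent(enum dl_extent_state s)
{
    if (s == DL_LOCKED_BY_ME)
        return DL_LOCKED_BY_ME_AND_REQUESTED_BY_PEER;
    return s;
}

int main(void)
{
    enum dl_extent_state s = DL_LOCKED_BY_ME;

    printf("may write: %d\n", may_write(s));   /* 1 */
    s = on_peer_lock_extent(s);
    printf("may write: %d\n", may_write(s));   /* still 1, handover pending */
    return 0;
}
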
--
: Dipl-Ing Philipp Reisner Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria http://www.linbit.com :