From: Alex Zhuravlev <Alex.Zhuravlev@Sun.COM>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] global epochs [an alternative proposal, long and dry].
Date: Tue, 23 Dec 2008 09:44:27 +0300
Message-ID: <495088CB.5070506@sun.com>
In-Reply-To: <18767.58149.550264.505562@gargle.gargle.HOWL>

Nikita Danilov wrote:
> Any message is used as a transport for epochs, including any reply
> from a server. So a typical scenario would be

I agree, but I think there will be cases with no messages at all,
like WBC doing a flush every few minutes and then going idle. Depending
on the workload, this may introduce additional network overhead on any node.

> etc. Note, that nothing prevents server from increasing its local epoch
> before replying to every reintegration (this was mentioned in the
> original document as an "extreme case"). With this policy there is never
> more than one reintegration on a given client in a given epoch, and we
> can indeed implement stability algorithm without clients.

hmm? If only the clients are aware of all the parts of a distributed
transaction, how can we?


> DLM plays no special role in the epochs mechanism. All that it is used
> for is to guarantee that conflicting operations are executed in the
> proper order (i.e., an epoch of dependent operation is never less than
> an epoch of an operation it depends on), but this is what DLM is for,
> and this has to be guaranteed anyway.

Conflict resolution can be delegated to a different mechanism when STL takes place.

> last_committed can be and have to be used. When a client reintegrated
> operation OP = (U(0), ..., U(N)), it counts this operation as `volatile'
> until all N servers reported (through the usual last_committed
> mechanism, as it is used by Lustre currently) that all updates have
> committed.

yup. At some point I thought you were going to use epochs instead of transno
in last_committed, which could be a problem.
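
To make the quoted rule concrete, here is a minimal sketch of a client that keeps an operation volatile until every server involved has reported, via last_committed, a transno at or beyond the one assigned to its update. The names (`VolatileOp`, `Client`, `on_reply`) and the server ids are hypothetical and chosen for illustration only; this is not Lustre code.

```python
class VolatileOp:
    """One reintegrated operation OP = (U(0), ..., U(N))."""

    def __init__(self, updates):
        # updates: server id -> transno that server assigned to its update
        self.updates = dict(updates)

    def is_stable(self, last_committed):
        # Stable only when every server has committed past our transno.
        return all(last_committed.get(server, 0) >= transno
                   for server, transno in self.updates.items())


class Client:
    def __init__(self):
        self.last_committed = {}   # server id -> highest committed transno seen
        self.volatile = []         # operations not yet known to be committed

    def reintegrate(self, updates):
        op = VolatileOp(updates)
        self.volatile.append(op)
        return op

    def on_reply(self, server, last_committed):
        # Any reply carries last_committed; it never moves backwards.
        current = self.last_committed.get(server, 0)
        self.last_committed[server] = max(current, last_committed)
        self.volatile = [op for op in self.volatile
                         if not op.is_stable(self.last_committed)]


client = Client()
client.reintegrate({"mds0": 10, "ost1": 7})
client.on_reply("mds0", 12)    # ost1 has not reported yet: still volatile
still_volatile = len(client.volatile)
client.on_reply("ost1", 7)     # both servers committed: no longer volatile
```

Note that the comparison is on per-server transno, not on epochs, which is the distinction raised above.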


Just to list my observations about global epochs:
  * synchronous operations are a problem to implement
  * network overhead even with local-only changes, depending on the workload
  * disk overhead even with local-only changes
  * SC is a single point of failure with any topology, as it's the only place
    to find the final minimum
  * tree reduction isn't an obvious thing, because a client can't report its
    minimum to just any node; instead the tree is a rather static thing and any
    change should be done very carefully, otherwise it's very easy to lose the
    minimum
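
The last point can be illustrated with a small sketch, assuming a static reporting tree rooted at SC in which each node forwards the minimum of its own epoch and its children's minima. The node names, epoch values and the `reduce_minimum` helper are hypothetical, made up for this example.

```python
def reduce_minimum(tree, local_min, node):
    """Return the minimum epoch over `node` and everything below it."""
    result = local_min[node]
    for child in tree.get(node, []):
        result = min(result, reduce_minimum(tree, local_min, child))
    return result


# Hypothetical static tree: SC at the root, two intermediate nodes, three clients.
tree = {"SC": ["n1", "n2"], "n1": ["c1", "c2"], "n2": ["c3"]}
local_min = {"SC": 9, "n1": 8, "n2": 7, "c1": 6, "c2": 3, "c3": 5}

global_min = reduce_minimum(tree, local_min, "SC")

# The tree changes carelessly: c2 is detached from n1 and not yet attached
# anywhere else, so its minimum silently drops out of the reduction.
broken = {"SC": ["n1", "n2"], "n1": ["c1"], "n2": ["c3"]}
lost_min = reduce_minimum(broken, local_min, "SC")
```

With the full tree SC sees 3; with c2 missing it sees 5, so it would treat epochs 3 and 4 as stable too early.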



thanks, Alex
