From: Andreas Dilger <adilger@sun.com>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] Recovering opens by reconstruction
Date: Tue, 07 Jul 2009 08:38:19 -0600
Message-ID: <20090707143819.GL5073@webber.adilger.int>
In-Reply-To: <4A531BE4.20207@sun.com>

On Jul 07, 2009 13:56 +0400, Alex Zhuravlev wrote:
> Nicolas Williams wrote:
> > Also, as Oleg explained to me, most open state is for files whose opens
> > committed long ago, so most open state is recovered before other
> > transactions. Which means we already have a separate open state
> > recovery phase -- it just isn't explicit. So the only thing that
> > changes in my proposal is that all committed open state will be
> > recovered by anonymous open by FID reconstruction instead of by replay,
> > while all other transactions (including as-yet uncommitted opens) will be
> > recovered by replay.
>
> I think it'd be slightly easier to introduce two notions of replay:
>
> 1) on-disk replay -- we try to recover some on-disk state from client's cache:
>    regular requests like mkdir, unlink, rename, setattr, etc
>
> 2) in-core replay -- we try to recover some in-core state from client's cache:
>    ldlm locks, open files
>
> the thing is that open(2) is quite interesting in this regard because it does
> (1) *and* (2). I believe this is why we used (1) for (2).
>
> my old thought was that instead of introducing a special new open-by-fid RPC
> we should try to implement open in terms of LDLM locks because it's in-core
> state (though with specific tracking of unlinked files). given this we'd
> automatically get a single mechanism for all in-core states and we'd get rid
> of special paths for open replays.

One problem with this is that the ordering needs to be preserved. Opens
that have committed need to be replayed before any other replay operations,
because those replayed operations may depend on the file being open.
However, "normal" lock replay should happen after (or conceivably during)
operation replay so that the objects being locked actually exist and the
server can (hopefully soon) verify the lock version number during recovery.
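
To make the ordering constraint concrete, here is a minimal sketch in Python. It is not Lustre code: the ReplayItem record, its fields, and the phase numbers are hypothetical names chosen for illustration. It also assumes lock replay runs strictly after operation replay, which is only one of the two options mentioned above (the other being lock replay during operation replay).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplayItem:
    """Hypothetical record of one piece of client state to recover."""
    name: str        # human-readable label, e.g. "open f1"
    kind: str        # "open", "op" (mkdir, unlink, setattr, ...) or "lock"
    transno: int     # assumed transaction number; 0 for locks
    committed: bool  # True if the server committed it before the failure


def phase(item: ReplayItem) -> int:
    """Map an item to its recovery phase (lower phases run first)."""
    if item.kind == "open" and item.committed:
        return 0  # committed opens: later replayed ops may need the file open
    if item.kind == "lock":
        return 2  # locks: the locked objects must exist before the lock does
    return 1      # everything else, including as-yet uncommitted opens


def recovery_order(items):
    """Order items by phase, then by transaction number within a phase."""
    return sorted(items, key=lambda it: (phase(it), it.transno))
```

With this ordering, a setattr on an open file is replayed only after the committed open of that file has been recovered, and the lock on the file is replayed last, once every object it refers to exists again.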
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.