From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Allison Henderson <allison.henderson@oracle.com>,
linux-xfs@vger.kernel.org
Subject: Re: [RFC PATCH 02/13] xfs: get directory offset when removing directory name
Date: Tue, 15 Aug 2017 13:23:07 +1000 [thread overview]
Message-ID: <20170815032307.GH21024@dastard> (raw)
In-Reply-To: <20170814235653.GQ4796@magnolia>
On Mon, Aug 14, 2017 at 04:56:53PM -0700, Darrick J. Wong wrote:
> The biggest complaint I have about this patch series is that we end up
> using separate transactions for each phase of the link/unlink
> operations. There's no guarantee that once we start the process of
> expand-dir, add-name-to-dir, log-inode-cores, expand-attr,
> add-pptr-to-attrs that we actually get all the way to the end of that
> process, particularly if we crash in the middle and have to recover the
> log coming back up.
Yes, that's a problem that we had not been solved at the point in
time I wrote that patchset. After making the parent pointer
attributes be added and removed correctly, then next thing to do was
to make them atomic and replayable. And that's the bit I never got
to.
> Obviously, back in 2014 when Dave started on these patches there was no
> defer_ops, so a lot of effort goes into passing parameters up and down
> the stack and fiddling with the transaction reservation type
> declarations in order to guarantee that we can shove everything into a
> single transaction. Now that we have a more general mechanism to split
> complex updates into a chain of smaller updates (and make log recovery
> pick up where we left off) it's time to figure out how to make attr (and
> maybe dir) updates a deferrable operation.
Right. We've got the infrastructure now, but it's nasty to retrofit
to all the attr code.
> This adds complexity to the log because we have to define an "attr add
> KEY:VALUE" and a "attr delete KEY" deferred op that hooks into the
> existing attr code. For parent pointers it's not a big deal to log
> values since they're not big, but for the general case this idea needs
> more thought, or a decision that we'll just keep doing things the same
> way.
The problem with this is that it requires us to log the entire
attribute value, which could be 64k in size. That used to be a big
issue because log bandwidth was at a premium. That isn't so much a
problem now, so logging the attr value and getting rid of the
unlogged sync write for remote attr updates is the way I think we
should head.
> Anyway, assuming the existence of a "attr delete KEY" log intent item we
> no longer have to pass the offsets around everywhere because instead
> _dir_removename attaches the defer item to the defer_ops and
> _defer_finish takes care of rolling through the updates.
>
> Thoughts?
Yup, it will simplify some things.
However, if we're going to start converting complex operations into
smaller, simpler ops chained via intents and deferred ops, lets
let's take a step back here and think about how we'd like them to be
defined, maintained and executed. If we're going to take a big step
to convert the dir/attr code to run from deferred ops, I want to
make sure we can make the most of it in future.
FYI, what I'd really like to do is to be able to statically define
the individual operations needed to make a modification as a set of
deferred ops sorta like the way we build up transaction
reservations. I'm thinking of defer ops being somewhat akin to a
state machine in definition and implementation. i.e. rather than
being opaque things that are built up by dynamic events, they become
mostly static ops with some dynamically driven events.
Hence a directory addition is simply a case of selecting the correct
defer_ops structure and starting it to run on the directory inode
and new dirent name. The high level code would need to check what
directory features are enabled to select the ops that need to run,
but otherwise there's nothing more to do but hand it to the
xfs_deferops_start(). Parent pointers then just becomes a different
set of deferops. If possible, it'd be nice to use the definition of
each op to determine the log/block reservations needed for the op to
run, too.
Expanding this filesystem wide, we'd end up with an engine that runs
defer ops in the most efficient manner possible and a bunch of
smaller, simpler operations linked by intent/done log entries. These
ops could be run sync or async by the engine, which would give us a
path towards async metadata ops and bulk metadata update
optimisations...
Yeah, I'm getting way ahead of myself here, but I think you get the
idea. Does that even remotely sound plausible, or have I just been
hitting the hard stuff too much recently?
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2017-08-15 3:23 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-10 1:41 [RFC PATCH 00/13] Add parent pointer attributes Allison Henderson
2017-08-10 1:41 ` [RFC PATCH 01/13] xfs: get directory offset when adding directory name Allison Henderson
2017-08-10 1:41 ` [RFC PATCH 02/13] xfs: get directory offset when removing " Allison Henderson
2017-08-14 23:56 ` Darrick J. Wong
2017-08-15 3:23 ` Dave Chinner [this message]
2017-08-15 5:47 ` Dave Chinner
2017-08-15 3:54 ` Allison Henderson
2017-08-10 1:41 ` [RFC PATCH 03/13] xfs: get directory offset when replacing a " Allison Henderson
2017-08-10 1:41 ` [RFC PATCH 04/13] xfs: add parent pointer support to attribute code Allison Henderson
2017-08-10 1:41 ` [RFC PATCH 05/13] xfs: define parent pointer xattr format Allison Henderson
2017-08-14 23:59 ` Darrick J. Wong
2017-08-15 4:05 ` Allison Henderson
2017-08-10 1:41 ` [RFC PATCH 06/13] :xfs: extent transaction reservations for parent attributes Allison Henderson
2017-08-10 1:41 ` [RFC PATCH 07/13] Add the extra space requirements for parent pointer attributes when calculating the minimum log size during mkfs Allison Henderson
2017-08-10 1:41 ` [RFC PATCH 08/13] xfs: parent pointer attribute creation Allison Henderson
2017-08-15 0:06 ` Darrick J. Wong
2017-08-15 3:32 ` Dave Chinner
2017-08-15 4:09 ` Allison Henderson
2017-08-10 1:41 ` [RFC PATCH 09/13] xfs: add parent attributes to link Allison Henderson
2017-08-10 1:41 ` [RFC PATCH 10/13] xfs: remove parent pointers in unlink Allison Henderson
2017-08-10 1:41 ` [RFC PATCH 11/13] xfs_bmap_add_attrfork(): re-add error handling from set_attrforkoff() call Allison Henderson
2017-08-10 1:41 ` [RFC PATCH 12/13] Add parent pointers to rename Allison Henderson
2017-08-10 1:41 ` [RFC PATCH 13/13] Add the parent pointer support to the superblock version 5 Allison Henderson
2017-08-15 0:07 ` Darrick J. Wong
2017-08-15 4:11 ` Allison Henderson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170815032307.GH21024@dastard \
--to=david@fromorbit.com \
--cc=allison.henderson@oracle.com \
--cc=darrick.wong@oracle.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox