From: "Darrick J. Wong" <djwong@kernel.org>
To: Alli <allison.henderson@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH RESEND v2 01/18] xfs: Fix multi-transaction larp replay
Date: Thu, 11 Aug 2022 20:05:08 -0700 [thread overview]
Message-ID: <YvXDZI/oN2duH26L@magnolia> (raw)
In-Reply-To: <f85ae9d8425aaff455301b28af32ba0d813f733b.camel@oracle.com>
On Thu, Aug 11, 2022 at 06:55:16PM -0700, Alli wrote:
> On Wed, 2022-08-10 at 16:12 +1000, Dave Chinner wrote:
> > On Tue, Aug 09, 2022 at 10:01:49PM -0700, Alli wrote:
> > > On Wed, 2022-08-10 at 11:58 +1000, Dave Chinner wrote:
> > > > On Tue, Aug 09, 2022 at 09:52:55AM -0700, Darrick J. Wong wrote:
> > > > > On Thu, Aug 04, 2022 at 12:39:56PM -0700, Allison Henderson
> > > > > wrote:
> > > > > > Recent parent pointer testing has exposed a bug in the
> > > > > > underlying
> > > > > > attr replay. A multi transaction replay currently performs a
> > > > > > single step of the replay, then deferrs the rest if there is
> > > > > > more
> > > > > > to do.
> > > >
> > > > Yup.
> > > >
> > > > > > This causes race conditions with other attr replays that
> > > > > > might be recovered before the remaining deferred work has had
> > > > > > a
> > > > > > chance to finish.
> > > >
> > > > What other attr replays are we racing against? There can only be
> > > > one incomplete attr item intent/done chain per inode present in
> > > > log
> > > > recovery, right?
> > > No, a rename queues up a set and remove before committing the
> > > transaction. One for the new parent pointer, and another to remove
> > > the
> > > old one.
> >
> > Ah. That really needs to be described in the commit message -
> > changing from "single intent chain per object" to "multiple
> > concurrent independent and unserialised intent chains per object" is
> > a pretty important design rule change...
> >
> > The whole point of intents is to allow complex, multi-stage
> > operations on a single object to be sequenced in a tightly
> > controlled manner. They weren't intended to be run as concurrent
> > lines of modification on single items; if you need to do two
> > modifications on an object, the intent chain ties the two
> > modifications together into a single whole.
> >
> > One of the reasons I rewrote the attr state machine for LARP was to
> > enable new multiple attr operation chains to be easily build from
> > the entry points the state machien provides. Parent attr rename
> > needs a new intent chain to be built, not run multiple independent
> > intent chains for each modification.
> >
> > > It cant be an attr replace because technically the names are
> > > different.
> >
> > I disagree - we have all the pieces we need in the state machine
> > already, we just need to define separate attr names for the
> > remove and insert steps in the attr intent.
> >
> > That is, the "replace" operation we execute when an attr set
> > overwrites the value is "technically" a "replace value" operation,
> > but we actually implement it as a "replace entire attribute"
> > operation.
> >
> > Without LARP, we do that overwrite in independent steps via an
> > intermediate INCOMPLETE state to allow two xattrs of the same name
> > to exist in the attr tree at the same time. IOWs, the attr value
> > overwrite is effectively a "set-swap-remove" operation on two
> > entirely independent xattrs, ensuring that if we crash we always
> > have either the old or new xattr visible.
> >
> > With LARP, we can remove the original attr first, thereby avoiding
> > the need for two versions of the xattr to exist in the tree in the
> > first place. However, we have to do these two operations as a pair
> > of linked independent operations. The intent chain provides the
> > linking, and requires us to log the name and the value of the attr
> > that we are overwriting in the intent. Hence we can always recover
> > the modification to completion no matter where in the operation we
> > fail.
> >
> > When it comes to a parent attr rename operation, we are effectively
> > doing two linked operations - remove the old attr, set the new attr
> > - on different attributes. Implementation wise, it is exactly the
> > same sequence as a "replace value" operation, except for the fact
> > that the new attr we add has a different name.
> >
> > Hence the only real difference between the existing "attr replace"
> > and the intent chain we need for "parent attr rename" is that we
> > have to log two attr names instead of one.
>
> To be clear, this would imply expanding xfs_attri_log_format to have
> another alfi_new_name_len feild and another iovec for the attr intent
> right? Does that cause issues to change the on disk log layout after
> the original has merged? Or is that ok for things that are still
> experimental? Thanks!
XFS_SB_FEAT_INCOMPAT_PARENT should protect against that, since the
userspace xattr api does not support replacing an attr's name, only the
value.
--D
> Allison
>
> > Basically, we have a new
> > XFS_ATTRI_OP_FLAGS... type for this, and that's what tells us that
> > we are operating on two different attributes instead of just one.
> >
> > The recovery operation becomes slightly different - we have to run a
> > remove on the old, then a replace on the new - so there a little bit
> > of new code needed to manage that in the state machine.
> >
> > These, however, are just small tweaks on the existing replace attr
> > operation, and there should be little difference in performance or
> > overhead between a "replace value" and a "replace entire xattr"
> > operation as they are largely the same runtime operation for LARP.
> >
> > > So the recovered set grows the leaf, and returns the egain, then
> > > rest
> > > gets capture committed. Next up is the recovered remove which
> > > pulls
> > > out the fork, which causes problems when the rest of the set
> > > operation
> > > resumes as a deferred operation.
> >
> > Yup, and all this goes away when we build the right intent chain for
> > replacing a parent attr rename....
> >
> > Cheers,
> >
> > Dave.
>
next prev parent reply other threads:[~2022-08-12 3:05 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-04 19:39 [PATCH RESEND v2 00/18] Parent Pointers Allison Henderson
2022-08-04 19:39 ` [PATCH RESEND v2 01/18] xfs: Fix multi-transaction larp replay Allison Henderson
2022-08-09 16:52 ` Darrick J. Wong
2022-08-10 1:58 ` Dave Chinner
2022-08-10 5:01 ` Alli
2022-08-10 6:12 ` Dave Chinner
2022-08-10 15:52 ` Darrick J. Wong
2022-08-10 19:28 ` Alli
2022-08-12 1:55 ` Alli
2022-08-12 3:05 ` Darrick J. Wong [this message]
2022-08-16 0:54 ` Dave Chinner
2022-08-16 5:07 ` Darrick J. Wong
2022-08-16 20:41 ` Alli
2022-08-19 1:05 ` Alli
2022-08-23 15:07 ` Darrick J. Wong
2022-08-24 18:47 ` Alli
2022-08-10 3:08 ` Alli
2022-08-04 19:39 ` [PATCH RESEND v2 02/18] xfs: Increase XFS_DEFER_OPS_NR_INODES to 5 Allison Henderson
2022-08-09 16:38 ` Darrick J. Wong
2022-08-10 3:07 ` Alli
2022-08-04 19:39 ` [PATCH RESEND v2 03/18] xfs: Hold inode locks in xfs_ialloc Allison Henderson
2022-08-04 19:39 ` [PATCH RESEND v2 04/18] xfs: Hold inode locks in xfs_trans_alloc_dir Allison Henderson
2022-08-04 19:40 ` [PATCH RESEND v2 05/18] xfs: get directory offset when adding directory name Allison Henderson
2022-08-04 19:40 ` [PATCH RESEND v2 06/18] xfs: get directory offset when removing " Allison Henderson
2022-08-04 19:40 ` [PATCH RESEND v2 07/18] xfs: get directory offset when replacing a " Allison Henderson
2022-08-04 19:40 ` [PATCH RESEND v2 08/18] xfs: add parent pointer support to attribute code Allison Henderson
2022-08-09 16:54 ` Darrick J. Wong
2022-08-10 3:08 ` Alli
2022-08-04 19:40 ` [PATCH RESEND v2 09/18] xfs: define parent pointer xattr format Allison Henderson
2022-08-04 19:40 ` [PATCH RESEND v2 10/18] xfs: Add xfs_verify_pptr Allison Henderson
2022-08-09 16:59 ` Darrick J. Wong
2022-08-10 3:08 ` Alli
2022-08-04 19:40 ` [PATCH RESEND v2 11/18] xfs: extend transaction reservations for parent attributes Allison Henderson
2022-08-09 17:48 ` Darrick J. Wong
2022-08-10 3:08 ` Alli
2022-08-04 19:40 ` [PATCH RESEND v2 12/18] xfs: parent pointer attribute creation Allison Henderson
2022-08-09 18:01 ` Darrick J. Wong
2022-08-09 18:13 ` Darrick J. Wong
2022-08-10 3:09 ` Alli
2022-08-10 3:08 ` Alli
2022-08-04 19:40 ` [PATCH RESEND v2 13/18] xfs: add parent attributes to link Allison Henderson
2022-08-09 18:43 ` Darrick J. Wong
2022-08-10 3:09 ` Alli
2022-09-23 20:25 ` Darrick J. Wong
2022-08-04 19:40 ` [PATCH RESEND v2 14/18] xfs: remove parent pointers in unlink Allison Henderson
2022-08-09 18:45 ` Darrick J. Wong
2022-08-10 3:09 ` Alli
2022-08-04 19:40 ` [PATCH RESEND v2 15/18] xfs: Add parent pointers to rename Allison Henderson
2022-08-09 18:49 ` Darrick J. Wong
2022-08-10 3:09 ` Alli
2022-08-04 19:40 ` [PATCH RESEND v2 16/18] xfs: Add the parent pointer support to the superblock version 5 Allison Henderson
2022-08-04 19:40 ` [PATCH RESEND v2 17/18] xfs: Add helper function xfs_attr_list_context_init Allison Henderson
2022-08-04 19:40 ` [PATCH RESEND v2 18/18] xfs: Add parent pointer ioctl Allison Henderson
2022-08-09 19:26 ` Darrick J. Wong
2022-08-10 3:09 ` Alli
2022-09-24 0:01 ` Darrick J. Wong
2022-08-09 22:55 ` [RFC PATCH 19/18] xfs: fix unit conversion error in xfs_log_calc_max_attrsetm_res Darrick J. Wong
2022-08-09 22:56 ` [RFC PATCH 20/18] xfs: drop compatibility minimum log size computations for reflink Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YvXDZI/oN2duH26L@magnolia \
--to=djwong@kernel.org \
--cc=allison.henderson@oracle.com \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox