From: Dave Chinner <david@fromorbit.com>
To: Ben Myers <bpm@sgi.com>
Cc: Brian Foster <bfoster@redhat.com>, xfs@oss.sgi.com
Subject: Re: [PATCH 08/14] xfs: remote attribute allocation may be contiguous
Date: Tue, 21 May 2013 10:25:18 +1000 [thread overview]
Message-ID: <20130521002518.GI24543@dastard> (raw)
In-Reply-To: <20130520220418.GF20028@sgi.com>
On Mon, May 20, 2013 at 05:04:18PM -0500, Ben Myers wrote:
> On Mon, May 20, 2013 at 03:03:49PM -0400, Brian Foster wrote:
> > On 05/19/2013 07:51 PM, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@redhat.com>
> > >
> > > When CRCs are enabled, there may be multiple allocations made if the
> > > headers cause a length overflow. This, however, does not mean that
> > > the number of headers required increases, as the second and
> > > subsequent extents may be contiguous with the previous extent. Hence
> > > when we map the extents to write the attribute data, we may end up
> > > with less extents than allocations made. Hence the assertion that we
> > > consume th enumber of headers we calculated in the allocation loop
> the
>
> > > is incorrect and needs to be removed.
> > >
> > > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > > ---
> > > fs/xfs/xfs_attr_remote.c | 6 +++++-
> > > 1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/xfs/xfs_attr_remote.c b/fs/xfs/xfs_attr_remote.c
> > > index dee8446..aad95b0 100644
> > > --- a/fs/xfs/xfs_attr_remote.c
> > > +++ b/fs/xfs/xfs_attr_remote.c
> > > @@ -359,6 +359,11 @@ xfs_attr_rmtval_set(
> > > * into requiring more blocks. e.g. for 512 byte blocks, we'll
> > > * spill for another block every 9 headers we require in this
> > > * loop.
> > > + *
> > > + * Note that this can result in contiguous allocation of blocks,
> > > + * so we don't use all the space we allocate for headers as we
> > > + * have one less header for each contiguous allocation that
> > > + * occurs in the map/write loop below.
> > > */
> > > if (crcs && blkcnt == 0) {
> > > int total_len;
> > > @@ -439,7 +444,6 @@ xfs_attr_rmtval_set(
> > > lblkno += map.br_blockcount;
> > > }
> > > ASSERT(valuelen == 0);
> > > - ASSERT(hdrcnt == 0);
> >
> > I can't say I grok the context enough atm to send a Reviewed-by, but if
> > we're removing this, why not just remove the hdrcnt-- a few lines up as
> > well? It doesn't appear to be used after the first loop at this point.
Yup, I noticed that later on....
> Could also keep a separate variable for each loop and compare them here. The
> count for the 2nd loop should always be smaller than for the first... The
> assert was worth adding so maybe it is worth fixing too? Either way is fine
> with me.
Well, I'm not convinced the code is correct yet, anyway. My
overnight runs on a 1k block size filesystem (can't use 512 byte
block size with 512 byte inodes) tripped over the same assert that I
was seeing with 4k block size filesystems and these patches fix.
By "not convinced", I mean that I think I've made a mistake in just
putting a header at the start of each extent. Call it preamture
optimisation, if you want, as it seems like a good idea at the time.
The main issue is that we don't know ahead of time how big an
attribute is going to be because it will have a variable number of
headers, and hence we don't know where the attribute data actually
ends from a mapping lookup as there can be trailing contiguous
blocks that hold other attribute data.
IOWs, I've found that this code is incredibly fragile, difficult to
verify and hard to debug. And I'm pretty certain the mapping of
trailing blocks is almost impossible to solve sanely.
What I'd like to do right now is change this format to
have a header per filesystem block. The reason for doing this is
that all the unknowns are removed - we have a direct relationship
between the attribute data length and the number of blocks needed to
store the data (it is always what xfs_attr3_rmt_blocks() returns)
and that greatly simplifies all this code.
It gets rid of the multiple allocation loop, the complexity of
contiguous allocation from multiple loop instances, etc. It does
make the parsing/verification of the attribute data a little more
complex, but that is a constant rather than the mess I've created
right now...
The only issue is that it is a change of the on-disk format. I'm
happy just to go change it as this is marked experimental, it hasn't
reached an initial public release yet and the only people using it
are developers and testers using volatile data...
I'm going to do this today, based on top of this series - this series
allows much more testing to be done, so it should be committed
anyway. Yeah, it's a pain, but this code is going to cause us long
term problems if we don't fix these problems now...
Thoughts?
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2013-05-21 0:25 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-05-19 23:51 [PATCH 00/14] xfs: fixes for 3.10-rc2 (update) Dave Chinner
2013-05-19 23:51 ` [PATCH 01/14] xfs: fix sub-page blocksize data integrity writes Dave Chinner
2013-05-20 18:02 ` Brian Foster
2013-05-20 19:18 ` Ben Myers
2013-05-19 23:51 ` [PATCH 02/14] xfs: fix rounding in xfs_free_file_space Dave Chinner
2013-05-20 18:03 ` Brian Foster
2013-05-19 23:51 ` [PATCH 03/14] xfs: Don't reference the EFI after it is freed Dave Chinner
2013-05-20 18:03 ` Brian Foster
2013-05-19 23:51 ` [PATCH 04/14] xfs: avoid nesting transactions in xfs_qm_scall_setqlim() Dave Chinner
2013-05-20 18:03 ` Brian Foster
2013-05-21 0:06 ` Dave Chinner
2013-05-21 0:36 ` [PATCH 04/14 V2] " Dave Chinner
2013-05-21 10:51 ` Brian Foster
2013-05-19 23:51 ` [PATCH 05/14] xfs: fix missing KM_NOFS tags to keep lockdep happy Dave Chinner
2013-05-20 21:16 ` Ben Myers
2013-05-21 0:08 ` Dave Chinner
2013-05-19 23:51 ` [PATCH 06/14] xfs: xfs_da3_node_read_verify() doesn't handle XFS_ATTR3_LEAF_MAGIC Dave Chinner
2013-05-20 21:32 ` Ben Myers
2013-05-19 23:51 ` [PATCH 07/14] xfs: xfs_attr_shortform_allfit() does not handle attr3 format Dave Chinner
2013-05-20 21:52 ` Ben Myers
2013-05-19 23:51 ` [PATCH 08/14] xfs: remote attribute allocation may be contiguous Dave Chinner
2013-05-20 19:03 ` Brian Foster
2013-05-20 22:04 ` Ben Myers
2013-05-21 0:25 ` Dave Chinner [this message]
2013-05-19 23:51 ` [PATCH 09/14] xfs: remote attribute lookups require the value length Dave Chinner
2013-05-20 22:15 ` Ben Myers
2013-05-19 23:51 ` [PATCH 10/14] xfs: remote attribute read too short Dave Chinner
2013-05-20 23:00 ` Ben Myers
2013-05-19 23:51 ` [PATCH 11/14] xfs: remote attribute tail zeroing does too much Dave Chinner
2013-05-20 23:01 ` Ben Myers
2013-05-19 23:51 ` [PATCH 12/14] xfs: correctly map remote attr buffers during removal Dave Chinner
2013-05-19 23:51 ` [PATCH 13/14] xfs: fully initialise temp leaf in xfs_attr3_leaf_unbalance Dave Chinner
2013-05-19 23:51 ` [PATCH 14/14] xfs: fully initialise temp leaf in xfs_attr3_leaf_compact Dave Chinner
2013-05-20 19:37 ` [PATCH 00/14] xfs: fixes for 3.10-rc2 (update) Ben Myers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130521002518.GI24543@dastard \
--to=david@fromorbit.com \
--cc=bfoster@redhat.com \
--cc=bpm@sgi.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox