From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH RFC] xfs: support magic value in xfs_buf_ops
Date: Fri, 25 Jan 2019 09:43:25 -0500 [thread overview]
Message-ID: <20190125144324.GA58775@bfoster> (raw)
In-Reply-To: <20190124221917.GN4205@dastard>
On Fri, Jan 25, 2019 at 09:19:17AM +1100, Dave Chinner wrote:
> On Thu, Jan 24, 2019 at 11:08:46AM -0800, Darrick J. Wong wrote:
> > On Thu, Jan 24, 2019 at 10:54:40AM -0500, Brian Foster wrote:
> > > Add a field to specify the v4 and v5 magic values in xfs_buf_ops.
> > > This allows otherwise identical verifiers to distinguish between
> > > and verify different magic values (inobt vs. finobt buffers). This
> > > also facilitates verification of the appropriate magic value based
> > > on superblock version.
> > >
> > > The magic field is optional and is free to be used as appropriate
> > > for each particular verifier.
> > >
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > ---
> > >
> > > Hi all,
> > >
> > > What do folks think of something like this as a lightweight (and
> > > untested) means to do proper [f]inobt magic verification? For reference,
> > > the initial version of this put together to help root cause a user
> > > report is here[1]. I was hoping to do the same thing with less code
> > > duplication. A couple things that come to mind:
> > >
> > > 1. I know scrub has at least one place where we invoke the verifier with
> > > ->b_ops == NULL, which will cause this to explode. Could we fix that up
> > > to assign and reset ->b_ops to accommodate something like this, or is
> > > that problematic?
> >
> > IIRC one of the scrub findroot reviewers didn't like the idea of scrub
> > setting b_ops until it was absolutely sure it wanted to. I think it's
> > actually ok to patch it in temporarily while running the read verifier
> > since we have the buffer locked and patch it out afterwards.
>
> How does this interact with xfs_buf_ensure_ops()?
>
> [ side note: the comments about this function are poor - I have no
> idea what problem it is avoiding from reading the code. Yes, I know
> it protects against transactions with buffers and no ops, but the
> comments don't tell me *how or when that occurs* so I do not know
> where to go looking for potential issues here. ]
>
I think the when and how behind this logic is the scrub case (i.e.,
xrep_findroot_block()) called out above: we read the buffer with a NULL
b_ops param because we don't know which buf_ops actually applies. If a
->b_ops is not ultimately attached, the buf sits around in cache without
->b_ops and is never verified (even if read with a non-NULL b_ops) until
it cycles out of cache. So without this logic, the aforementioned case
would have to drop the buffer from the cache if it was ultimately read
with a NULL b_ops.
With regard to verifiers depending on ->b_ops != NULL, I don't think
this would change anything at this level. The higher level scrub code
would just be required to assign ->b_ops in order to run a verifier and
thus would have to make sure to reset ->b_ops in the event of a failure.
> > > 2. We have some other verifiers around that actually use the buffer
> > > magic to set a more specific verifier. See xfs_da3_node_read_verify()
> > > for an example. I'm not sure this is all that useful for such higher
> > > level verifiers, but I think we'd at least be able to use it for the
> > > underlying verifiers. That might provide some extra sb version vs. magic
> > > sanity checking for places that might not already look at the sb version
> > > (otherwise it's just refactoring).
> > >
> > > Thoughts or other ideas before I try to apply this more broadly? Thanks.
> >
> > Hmm... not sure if I like the idea that you have to find the b_ops
> > declaration to figure out which magic number the verifier function is
> > checking, but I don't really have a cogent objection.
>
> Yeah, I don't really like it either (especially the added CPU
> overhead that we avoided by doing compile time byte swapping),
> but I'm struggling to come up with a better option.
>
I suppose we could store the on-disk magics in the xfs_buf_ops
structures (it works on x86_64 at least, I'd have to verify other
arches), but that is pretty ugly. Given all of the other conversions and
checks, I'm not sure it's worth it.
Hmm, I suppose we could also define a separate set of on-disk magic
directives:
#define XFS_FIBT_CRC_MAGIC_DISK cpu_to_be32(XFS_FIBT_CRC_MAGIC)
... and start using those in various places to avoid the ugliness. I
think that's a separate change though (and again, it's not immediately
clear to me the benefit justifies the additional code).
Brian
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
next prev parent reply other threads:[~2019-01-25 14:43 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-24 15:54 [PATCH RFC] xfs: support magic value in xfs_buf_ops Brian Foster
2019-01-24 19:08 ` Darrick J. Wong
2019-01-24 22:19 ` Dave Chinner
2019-01-25 14:43 ` Brian Foster [this message]
2019-01-27 17:49 ` Darrick J. Wong
2019-01-28 21:06 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190125144324.GA58775@bfoster \
--to=bfoster@redhat.com \
--cc=darrick.wong@oracle.com \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox