From: Eric Sandeen <sandeen@sandeen.net>
To: Dave Chinner <david@fromorbit.com>, xfs@oss.sgi.com
Subject: Re: [PATCH 1/2] repair: don't unlock prefetch tree to read discontig buffers
Date: Thu, 08 May 2014 21:14:31 -0500 [thread overview]
Message-ID: <536C3A07.2050100@sandeen.net> (raw)
In-Reply-To: <1399598222-4349-2-git-send-email-david@fromorbit.com>
On 5/8/14, 8:17 PM, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> The way discontiguous buffers are currently handled in prefetch is
> by unlocking the prefetch tree and reading them one at a time in
> pf_read_discontig(), inside the normal loop of searching for buffers
> to read in a more optimized fashion.
>
> But by unlocking the tree, we allow other threads to come in and
> find buffers which we've already stashed locally on our bplist[].
> If 2 threads think they own the same set of buffers, they may both
> try to delete them from the prefetch btree, and the second one to
> arrive will not find it, resulting in:
>
> fatal error -- prefetch corruption
>
> To fix this, simply abort the buffer gathering loop when we come
> across a discontiguous buffer, process the gathered list as per
> normal, and then after running the large optimised read, check to
> see if the last buffer on the list is a discontiguous buffer.
> If is is discontiguous, then issue the discontiguous buffer read
> while the locks are not held. We only ever have one discontiguous
> buffer per read loop, so it is safe just to check the last buffer in
> the list.
>
> The fix is loosely based on a a patch provided by Eric Sandeen, who
> did all the hard work of finding the bug and demonstrating how to
> fix it.
Ok, this makes sense to me. The comment above the discontig read
seems a bit confusing; you say it's safe to read while unlocked,
but I wouldn't have expected it not to be - the lock is just for
btree manipulation, and that's not being done. So I think the
comment adds a little confusion rather than clarification.
But the code looks fine.
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
> Reported-by:Eric Sandeen <sandeen@redhat.com>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
> repair/prefetch.c | 53 +++++++++++++++++++++++++++--------------------------
> 1 file changed, 27 insertions(+), 26 deletions(-)
>
> diff --git a/repair/prefetch.c b/repair/prefetch.c
> index 65fedf5..e269f1f 100644
> --- a/repair/prefetch.c
> +++ b/repair/prefetch.c
> @@ -444,27 +444,6 @@ pf_read_inode_dirs(
> }
>
> /*
> - * Discontiguous buffers require multiple IOs to fill, so we can't use any
> - * linearising, hole filling algorithms on them to avoid seeks. Just remove them
> - * for the prefetch queue and read them straight into the cache and release
> - * them.
> - */
> -static void
> -pf_read_discontig(
> - struct prefetch_args *args,
> - struct xfs_buf *bp)
> -{
> - if (!btree_delete(args->io_queue, XFS_DADDR_TO_FSB(mp, bp->b_bn)))
> - do_error(_("prefetch corruption\n"));
> -
> - pthread_mutex_unlock(&args->lock);
> - libxfs_readbufr_map(mp->m_ddev_targp, bp, 0);
> - bp->b_flags |= LIBXFS_B_UNCHECKED;
> - libxfs_putbuf(bp);
> - pthread_mutex_lock(&args->lock);
> -}
> -
> -/*
> * pf_batch_read must be called with the lock locked.
> */
> static void
> @@ -496,13 +475,19 @@ pf_batch_read(
> }
> while (bplist[num] && num < MAX_BUFS && fsbno < max_fsbno) {
> /*
> - * Handle discontiguous buffers outside the seek
> - * optimised IO loop below.
> + * Discontiguous buffers need special handling, so stop
> + * gathering new buffers and process the list and this
> + * discontigous buffer immediately. This avoids the
> + * complexity of keeping a separate discontigous buffer
> + * list and seeking back over ranges we've already done
> + * optimised reads for.
> */
> if ((bplist[num]->b_flags & LIBXFS_B_DISCONTIG)) {
> - pf_read_discontig(args, bplist[num]);
> - bplist[num] = NULL;
> - } else if (which != PF_META_ONLY ||
> + num++;
> + break;
> + }
> +
> + if (which != PF_META_ONLY ||
> !B_IS_INODE(XFS_BUF_PRIORITY(bplist[num])))
> num++;
> if (num == MAX_BUFS)
> @@ -570,6 +555,22 @@ pf_batch_read(
> * now read the data and put into the xfs_but_t's
> */
> len = pread64(mp_fd, buf, (int)(last_off - first_off), first_off);
> +
> + /*
> + * Check the last buffer on the list to see if we need to
> + * process a discontiguous buffer. The gather loop guarantees
> + * we only ever have a single discontig buffer on the list,
> + * and that it is the last buffer, so it is safe to do this
> + * check and read here like this while we aren't holding any
> + * locks.
> + */
> + if ((bplist[num - 1]->b_flags & LIBXFS_B_DISCONTIG)) {
> + libxfs_readbufr_map(mp->m_ddev_targp, bplist[num - 1], 0);
> + bplist[num - 1]->b_flags |= LIBXFS_B_UNCHECKED;
> + libxfs_putbuf(bplist[num - 1]);
> + num--;
> + }
> +
> if (len > 0) {
> /*
> * go through the xfs_buf_t list copying from the
>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2014-05-09 2:14 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-05-09 1:17 [PATCH 0/2] repair: fixes for 3.2.0-rc2 Dave Chinner
2014-05-09 1:17 ` [PATCH 1/2] repair: don't unlock prefetch tree to read discontig buffers Dave Chinner
2014-05-09 2:14 ` Eric Sandeen [this message]
2014-05-09 3:53 ` Dave Chinner
2014-05-09 1:17 ` [PATCH 2/2] repair: don't grind CPUs with large extent lists Dave Chinner
2014-05-09 3:01 ` Eric Sandeen
2014-05-09 3:56 ` Dave Chinner
2014-05-09 4:18 ` Eric Sandeen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=536C3A07.2050100@sandeen.net \
--to=sandeen@sandeen.net \
--cc=david@fromorbit.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox