From: Eric Sandeen <sandeen@sandeen.net>
To: Dave Chinner <david@fromorbit.com>, xfs@oss.sgi.com
Subject: Re: [PATCH 1/2] repair: don't unlock prefetch tree to read discontig buffers
Date: Thu, 08 May 2014 21:14:31 -0500 [thread overview]
Message-ID: <536C3A07.2050100@sandeen.net> (raw)
In-Reply-To: <1399598222-4349-2-git-send-email-david@fromorbit.com>
On 5/8/14, 8:17 PM, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> The way discontiguous buffers are currently handled in prefetch is
> by unlocking the prefetch tree and reading them one at a time in
> pf_read_discontig(), inside the normal loop of searching for buffers
> to read in a more optimized fashion.
>
> But by unlocking the tree, we allow other threads to come in and
> find buffers which we've already stashed locally on our bplist[].
> If 2 threads think they own the same set of buffers, they may both
> try to delete them from the prefetch btree, and the second one to
> arrive will not find it, resulting in:
>
> fatal error -- prefetch corruption
>
> To fix this, simply abort the buffer gathering loop when we come
> across a discontiguous buffer, process the gathered list as per
> normal, and then after running the large optimised read, check to
> see if the last buffer on the list is a discontiguous buffer.
> If is is discontiguous, then issue the discontiguous buffer read
> while the locks are not held. We only ever have one discontiguous
> buffer per read loop, so it is safe just to check the last buffer in
> the list.
>
> The fix is loosely based on a a patch provided by Eric Sandeen, who
> did all the hard work of finding the bug and demonstrating how to
> fix it.
Ok, this makes sense to me. The comment above the discontig read
seems a bit confusing; you say it's safe to read while unlocked,
but I wouldn't have expected it not to be - the lock is just for
btree manipulation, and that's not being done. So I think the
comment adds a little confusion rather than clarification.
But the code looks fine.
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
> Reported-by:Eric Sandeen <sandeen@redhat.com>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
> repair/prefetch.c | 53 +++++++++++++++++++++++++++--------------------------
> 1 file changed, 27 insertions(+), 26 deletions(-)
>
> diff --git a/repair/prefetch.c b/repair/prefetch.c
> index 65fedf5..e269f1f 100644
> --- a/repair/prefetch.c
> +++ b/repair/prefetch.c
> @@ -444,27 +444,6 @@ pf_read_inode_dirs(
> }
>
> /*
> - * Discontiguous buffers require multiple IOs to fill, so we can't use any
> - * linearising, hole filling algorithms on them to avoid seeks. Just remove them
> - * for the prefetch queue and read them straight into the cache and release
> - * them.
> - */
> -static void
> -pf_read_discontig(
> - struct prefetch_args *args,
> - struct xfs_buf *bp)
> -{
> - if (!btree_delete(args->io_queue, XFS_DADDR_TO_FSB(mp, bp->b_bn)))
> - do_error(_("prefetch corruption\n"));
> -
> - pthread_mutex_unlock(&args->lock);
> - libxfs_readbufr_map(mp->m_ddev_targp, bp, 0);
> - bp->b_flags |= LIBXFS_B_UNCHECKED;
> - libxfs_putbuf(bp);
> - pthread_mutex_lock(&args->lock);
> -}
> -
> -/*
> * pf_batch_read must be called with the lock locked.
> */
> static void
> @@ -496,13 +475,19 @@ pf_batch_read(
> }
> while (bplist[num] && num < MAX_BUFS && fsbno < max_fsbno) {
> /*
> - * Handle discontiguous buffers outside the seek
> - * optimised IO loop below.
> + * Discontiguous buffers need special handling, so stop
> + * gathering new buffers and process the list and this
> + * discontigous buffer immediately. This avoids the
> + * complexity of keeping a separate discontigous buffer
> + * list and seeking back over ranges we've already done
> + * optimised reads for.
> */
> if ((bplist[num]->b_flags & LIBXFS_B_DISCONTIG)) {
> - pf_read_discontig(args, bplist[num]);
> - bplist[num] = NULL;
> - } else if (which != PF_META_ONLY ||
> + num++;
> + break;
> + }
> +
> + if (which != PF_META_ONLY ||
> !B_IS_INODE(XFS_BUF_PRIORITY(bplist[num])))
> num++;
> if (num == MAX_BUFS)
> @@ -570,6 +555,22 @@ pf_batch_read(
> * now read the data and put into the xfs_but_t's
> */
> len = pread64(mp_fd, buf, (int)(last_off - first_off), first_off);
> +
> + /*
> + * Check the last buffer on the list to see if we need to
> + * process a discontiguous buffer. The gather loop guarantees
> + * we only ever have a single discontig buffer on the list,
> + * and that it is the last buffer, so it is safe to do this
> + * check and read here like this while we aren't holding any
> + * locks.
> + */
> + if ((bplist[num - 1]->b_flags & LIBXFS_B_DISCONTIG)) {
> + libxfs_readbufr_map(mp->m_ddev_targp, bplist[num - 1], 0);
> + bplist[num - 1]->b_flags |= LIBXFS_B_UNCHECKED;
> + libxfs_putbuf(bplist[num - 1]);
> + num--;
> + }
> +
> if (len > 0) {
> /*
> * go through the xfs_buf_t list copying from the
>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2014-05-09 2:14 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-05-09 1:17 [PATCH 0/2] repair: fixes for 3.2.0-rc2 Dave Chinner
2014-05-09 1:17 ` [PATCH 1/2] repair: don't unlock prefetch tree to read discontig buffers Dave Chinner
2014-05-09 2:14 ` Eric Sandeen [this message]
2014-05-09 3:53 ` Dave Chinner
2014-05-09 1:17 ` [PATCH 2/2] repair: don't grind CPUs with large extent lists Dave Chinner
2014-05-09 3:01 ` Eric Sandeen
2014-05-09 3:56 ` Dave Chinner
2014-05-09 4:18 ` Eric Sandeen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=536C3A07.2050100@sandeen.net \
--to=sandeen@sandeen.net \
--cc=david@fromorbit.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.