From: Alex Elder <aelder@sgi.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 3/6] xfs: exact busy extent tracking
Date: Fri, 25 Mar 2011 16:04:20 -0500 [thread overview]
Message-ID: <1301087060.2537.689.camel@doink> (raw)
In-Reply-To: <20110322200137.657110761@bombadil.infradead.org>
On Tue, 2011-03-22 at 15:55 -0400, Christoph Hellwig wrote:
> Update the extent tree in case we have to reuse a busy extent, so that it
> always is kept uptodate. This is done by replacing the busy list searches
> with a new xfs_alloc_busy_reuse helper, which updates the busy extent tree
> in case of a reuse. Also replace setting transactions to sync with forcing
> the log out in case we found a busy extent to reuse. This makes the code a
> lot more simple, and is required for discard support later on. While it
> will cause performance regressios with just this patch applied, the impact
> is more than mitigated by the next patch in the series.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
In this one, xfs_alloc_busy_reuse() is broken, or at least it has some
dead code in it now. (But now that I've looked at the later patches
I see that's a temporary artifact. I've left my original comments
in place anyway, in case any insights remain despite that.)
More below.
> Index: xfs/fs/xfs/xfs_alloc.c
> ===================================================================
> --- xfs.orig/fs/xfs/xfs_alloc.c 2011-03-20 19:41:55.835479390 +0100
> +++ xfs/fs/xfs/xfs_alloc.c 2011-03-21 14:49:14.157973188 +0100
. . .
> @@ -2459,100 +2459,6 @@ error0:
> return error;
> }
>
> -
> -/*
> - * AG Busy list management
> - * The busy list contains block ranges that have been freed but whose
> - * transactions have not yet hit disk. If any block listed in a busy
> - * list is reused, the transaction that freed it must be forced to disk
> - * before continuing to use the block.
> - *
> - * xfs_alloc_busy_insert - add to the per-ag busy list
> - * xfs_alloc_busy_clear - remove an item from the per-ag busy list
> - * xfs_alloc_busy_search - search for a busy extent
> - */
> -
> -/*
> - * Insert a new extent into the busy tree.
OK, to be honest I haven't re-read this entire block of
comment text to identify what might be of value. But
is there really *nothing* worth saving? Is the busy
extent tree documented adequately elsewhere?
> - * The busy extent tree is indexed by the start block of the busy extent.
> - * there can be multiple overlapping ranges in the busy extent tree but only
> - * ever one entry at a given start block. The reason for this is that
> - * multi-block extents can be freed, then smaller chunks of that extent
> - * allocated and freed again before the first transaction commit is on disk.
> - * If the exact same start block is freed a second time, we have to wait for
> - * that busy extent to pass out of the tree before the new extent is inserted.
> - * There are two main cases we have to handle here.
. . .
> @@ -2583,66 +2487,30 @@ xfs_alloc_busy_insert(
> new->agno = agno;
> new->bno = bno;
> new->length = len;
> - new->tid = xfs_log_get_trans_ident(tp);
> -
> INIT_LIST_HEAD(&new->list);
>
> /* trace before insert to be able to see failed inserts */
> trace_xfs_alloc_busy(tp, agno, bno, len, 0);
>
> pag = xfs_perag_get(tp->t_mountp, new->agno);
> -restart:
> spin_lock(&pag->pagb_lock);
> rbp = &pag->pagb_tree.rb_node;
> - parent = NULL;
> - busyp = NULL;
> - match = 0;
> - while (*rbp && match >= 0) {
> + while (*rbp) {
> parent = *rbp;
> busyp = rb_entry(parent, struct xfs_busy_extent, rb_node);
>
> if (new->bno < busyp->bno) {
> /* may overlap, but exact start block is lower */
This comment isn't really right any more (BUG_ON that condition).
> rbp = &(*rbp)->rb_left;
> - if (new->bno + new->length > busyp->bno)
> - match = busyp->tid == new->tid ? 1 : -1;
> + BUG_ON(new->bno + new->length > busyp->bno);
> } else if (new->bno > busyp->bno) {
> /* may overlap, but exact start block is higher */
This one too.
> rbp = &(*rbp)->rb_right;
> - if (bno < busyp->bno + busyp->length)
> - match = busyp->tid == new->tid ? 1 : -1;
> + BUG_ON(bno < busyp->bno + busyp->length);
> } else {
> - match = busyp->tid == new->tid ? 1 : -1;
> - break;
> + BUG();
> }
. . .
> @@ -2704,6 +2571,173 @@ xfs_alloc_busy_search(
> return match;
> }
>
/*
* The found free extent [fbno, fend] overlaps part or all
* of the given busy extent. If the overlap covers the
* beginning, the end, or all of the busy extent, the
* overlapping portion can be made unbusy and used for
* the allocation. We can't split a busy extent because
* we can't modify a transaction/CIL context busy list,
* but we can update an entry's block number or length.
*
* The caller will force the log and re-check the busy
* list after returning from this function.
*/
> +STATIC int
> +xfs_alloc_busy_try_reuse(
> + struct xfs_perag *pag,
> + struct xfs_busy_extent *busyp,
> + xfs_agblock_t fbno,
> + xfs_agblock_t fend)
> +{
> + xfs_agblock_t bbno = busyp->bno;
> + xfs_agblock_t bend = bbno + busyp->length;
> +
> + if (bbno < fbno && bend > fend) {
> + /*
> + * Case 1:
> + * bbno bend
> + * +BBBBBBBBBBBBBBBBB+
> + * +---------+
> + * fbno fend
> + */
> +
> + /*
> + * We would have to split the busy extent to be able
> + * to track it correct, which we cannot do because we
> + * would have to modify the list of busy extents
> + * attached to the transaction/CIL context, which
> + * is mutable.
> + *
> + * Force out the log to clear the busy extents
> + * and retry the search.
The caller forces the log. Rely on a comment in this
function's header to say that (not here).
> + */
> + return -1;
> + } else if (bbno >= fbno && bend <= fend) {
> + /*
> + * Case 2:
> + * bbno bend
> + * +BBBBBBBBBBBBBBBBB+
> + * +-----------------+
> + * fbno fend
. . .
> +
> + return 1;
> +}
> +
> +
> +/*
> + * For a given extent [fbno, flen], make sure we can reuse it safely.
> + */
> +void
> +xfs_alloc_busy_reuse(
> + struct xfs_trans *tp,
> + xfs_agnumber_t agno,
> + xfs_agblock_t fbno,
> + xfs_extlen_t flen)
> +{
> + struct xfs_perag *pag;
> + struct rb_node *rbp;
> +
> + ASSERT(flen > 0);
> +
> + pag = xfs_perag_get(tp->t_mountp, agno);
> +restart:
> + spin_lock(&pag->pagb_lock);
> + rbp = pag->pagb_tree.rb_node;
> + while (rbp) {
> + struct xfs_busy_extent *busyp =
> + rb_entry(rbp, struct xfs_busy_extent, rb_node);
> + xfs_agblock_t fend = fbno + flen;
> + xfs_agblock_t bbno = busyp->bno;
> + xfs_agblock_t bend = bbno + busyp->length;
> + int overlap;
> +
> + if (fend <= bbno) {
> + rbp = rbp->rb_left;
> + continue;
> + } else if (fbno >= bend) {
> + rbp = rbp->rb_right;
> + continue;
> + }
> +
I was going to suggest:
/* Extent overlaps busy extent */
here, but I think that may not be the case any more if
busyp->length is zero. Or rather, the extent may
surround the zero-length busy extent (which I suppose
could be considered overlap).
If busyp->length is zero, I think the call to
xfs_alloc_busy_try_reuse() is not needed; in fact,
if it is already 0, that function will call
rb_erase() on the entry again.
> + overlap = xfs_alloc_busy_try_reuse(pag, busyp,
> + fbno, fbno + flen);
Note that xfs_alloc_busy_try_reuse() only returns 1 or -1 now...
> + if (overlap) {
...therefore this branch is always taken, and the code
below this block to the end of the loop is not reached.
Since this is the only place it's used, xfs_alloc_busy_try_reuse()
might as well be defined as a void function.
(Ahh, but now that I've looked at the later patches
I see it gets used again later. I'm leaving my
comments here nevertheless.)
> + spin_unlock(&pag->pagb_lock);
> + xfs_log_force(tp->t_mountp, XFS_LOG_SYNC);
> + goto restart;
> + }
+------ code starting here is never reached
|
> + /*
> + * No more busy extents to search.
> + */
> + if (bbno <= fbno && bend >= fend)
> + break;
> +
> + if (fbno < bbno)
> + rbp = rbp->rb_left;
> + else
> + rbp = rbp->rb_right;
|
+------ end of code not reached
> + }
> + spin_unlock(&pag->pagb_lock);
> + xfs_perag_put(pag);
> +}
> +
> /*
> * For a given extent [fbno, flen], search the busy extent list
> * to find a subset of the extent that is not busy.
* If part or all of the extent is busy, force the log, then
* verify it is no longer busy before returning.
> @@ -2889,14 +2923,12 @@ xfs_alloc_busy_clear(
> trace_xfs_alloc_unbusy(mp, busyp->agno, busyp->bno,
> busyp->length);
>
> - ASSERT(xfs_alloc_busy_search(mp, busyp->agno, busyp->bno,
> - busyp->length) == 1);
> -
> list_del_init(&busyp->list);
>
> pag = xfs_perag_get(mp, busyp->agno);
> spin_lock(&pag->pagb_lock);
> - rb_erase(&busyp->rb_node, &pag->pagb_tree);
> + if (busyp->length)
> + rb_erase(&busyp->rb_node, &pag->pagb_tree);
> spin_unlock(&pag->pagb_lock);
> xfs_perag_put(pag);
>
. . .
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2011-03-25 21:01 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-03-22 19:55 [PATCH 0/6] [PATCH 0/6] more efficient busy extent handling and discard support Christoph Hellwig
2011-03-22 19:55 ` [PATCH 1/6] xfs: optimize AGFL refills Christoph Hellwig
2011-03-22 22:30 ` Alex Elder
2011-03-23 12:16 ` Christoph Hellwig
2011-03-25 21:03 ` Alex Elder
2011-03-28 12:07 ` Christoph Hellwig
2011-03-22 23:30 ` Dave Chinner
2011-03-23 12:16 ` Christoph Hellwig
2011-03-22 19:55 ` [PATCH 2/6] xfs: do not immediately reuse busy extent ranges Christoph Hellwig
2011-03-22 22:30 ` Alex Elder
2011-03-23 12:17 ` Christoph Hellwig
2011-03-25 21:03 ` Alex Elder
2011-03-28 12:07 ` Christoph Hellwig
2011-03-22 19:55 ` [PATCH 3/6] xfs: exact busy extent tracking Christoph Hellwig
2011-03-22 23:47 ` Dave Chinner
2011-03-23 12:24 ` Christoph Hellwig
2011-03-25 21:04 ` Alex Elder [this message]
2011-03-28 12:10 ` Christoph Hellwig
2011-03-22 19:55 ` [PATCH 4/6] xfs: allow reusing busy extents where safe Christoph Hellwig
2011-03-23 0:20 ` Dave Chinner
2011-03-23 12:26 ` Christoph Hellwig
2011-03-25 21:04 ` Alex Elder
2011-03-22 19:55 ` [PATCH 5/6] xfs: add online discard support Christoph Hellwig
2011-03-23 0:30 ` Dave Chinner
2011-03-23 12:31 ` Christoph Hellwig
2011-03-25 21:04 ` Alex Elder
2011-03-28 12:35 ` Christoph Hellwig
2011-03-22 19:55 ` [PATCH 6/6] xfs: make discard operations asynchronous Christoph Hellwig
2011-03-23 0:43 ` Dave Chinner
2011-03-28 12:44 ` Christoph Hellwig
2011-03-25 21:04 ` Alex Elder
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1301087060.2537.689.camel@doink \
--to=aelder@sgi.com \
--cc=hch@infradead.org \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.