From: Alex Elder <aelder@sgi.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 3/6] xfs: exact busy extent tracking
Date: Fri, 25 Mar 2011 16:04:20 -0500 [thread overview]
Message-ID: <1301087060.2537.689.camel@doink> (raw)
In-Reply-To: <20110322200137.657110761@bombadil.infradead.org>
On Tue, 2011-03-22 at 15:55 -0400, Christoph Hellwig wrote:
> Update the extent tree in case we have to reuse a busy extent, so that it
> always is kept uptodate. This is done by replacing the busy list searches
> with a new xfs_alloc_busy_reuse helper, which updates the busy extent tree
> in case of a reuse. Also replace setting transactions to sync with forcing
> the log out in case we found a busy extent to reuse. This makes the code a
> lot more simple, and is required for discard support later on. While it
> will cause performance regressios with just this patch applied, the impact
> is more than mitigated by the next patch in the series.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
In this one, xfs_alloc_busy_reuse() is broken, or at least it has some
dead code in it now. (But now that I've looked at the later patches
I see that's a temporary artifact. I've left my original comments
in place anyway, in case any insights remain despite that.)
More below.
> Index: xfs/fs/xfs/xfs_alloc.c
> ===================================================================
> --- xfs.orig/fs/xfs/xfs_alloc.c 2011-03-20 19:41:55.835479390 +0100
> +++ xfs/fs/xfs/xfs_alloc.c 2011-03-21 14:49:14.157973188 +0100
. . .
> @@ -2459,100 +2459,6 @@ error0:
> return error;
> }
>
> -
> -/*
> - * AG Busy list management
> - * The busy list contains block ranges that have been freed but whose
> - * transactions have not yet hit disk. If any block listed in a busy
> - * list is reused, the transaction that freed it must be forced to disk
> - * before continuing to use the block.
> - *
> - * xfs_alloc_busy_insert - add to the per-ag busy list
> - * xfs_alloc_busy_clear - remove an item from the per-ag busy list
> - * xfs_alloc_busy_search - search for a busy extent
> - */
> -
> -/*
> - * Insert a new extent into the busy tree.
OK, to be honest I haven't re-read this entire block of
comment text to identify what might be of value. But
is there really *nothing* worth saving? Is the busy
extent tree documented adequately elsewhere?
> - * The busy extent tree is indexed by the start block of the busy extent.
> - * there can be multiple overlapping ranges in the busy extent tree but only
> - * ever one entry at a given start block. The reason for this is that
> - * multi-block extents can be freed, then smaller chunks of that extent
> - * allocated and freed again before the first transaction commit is on disk.
> - * If the exact same start block is freed a second time, we have to wait for
> - * that busy extent to pass out of the tree before the new extent is inserted.
> - * There are two main cases we have to handle here.
. . .
> @@ -2583,66 +2487,30 @@ xfs_alloc_busy_insert(
> new->agno = agno;
> new->bno = bno;
> new->length = len;
> - new->tid = xfs_log_get_trans_ident(tp);
> -
> INIT_LIST_HEAD(&new->list);
>
> /* trace before insert to be able to see failed inserts */
> trace_xfs_alloc_busy(tp, agno, bno, len, 0);
>
> pag = xfs_perag_get(tp->t_mountp, new->agno);
> -restart:
> spin_lock(&pag->pagb_lock);
> rbp = &pag->pagb_tree.rb_node;
> - parent = NULL;
> - busyp = NULL;
> - match = 0;
> - while (*rbp && match >= 0) {
> + while (*rbp) {
> parent = *rbp;
> busyp = rb_entry(parent, struct xfs_busy_extent, rb_node);
>
> if (new->bno < busyp->bno) {
> /* may overlap, but exact start block is lower */
This comment isn't really right any more (BUG_ON that condition).
> rbp = &(*rbp)->rb_left;
> - if (new->bno + new->length > busyp->bno)
> - match = busyp->tid == new->tid ? 1 : -1;
> + BUG_ON(new->bno + new->length > busyp->bno);
> } else if (new->bno > busyp->bno) {
> /* may overlap, but exact start block is higher */
This one too.
> rbp = &(*rbp)->rb_right;
> - if (bno < busyp->bno + busyp->length)
> - match = busyp->tid == new->tid ? 1 : -1;
> + BUG_ON(bno < busyp->bno + busyp->length);
> } else {
> - match = busyp->tid == new->tid ? 1 : -1;
> - break;
> + BUG();
> }
. . .
> @@ -2704,6 +2571,173 @@ xfs_alloc_busy_search(
> return match;
> }
>
/*
* The found free extent [fbno, fend] overlaps part or all
* of the given busy extent. If the overlap covers the
* beginning, the end, or all of the busy extent, the
* overlapping portion can be made unbusy and used for
* the allocation. We can't split a busy extent because
* we can't modify a transaction/CIL context busy list,
* but we can update an entry's block number or length.
*
* The caller will force the log and re-check the busy
* list after returning from this function.
*/
> +STATIC int
> +xfs_alloc_busy_try_reuse(
> + struct xfs_perag *pag,
> + struct xfs_busy_extent *busyp,
> + xfs_agblock_t fbno,
> + xfs_agblock_t fend)
> +{
> + xfs_agblock_t bbno = busyp->bno;
> + xfs_agblock_t bend = bbno + busyp->length;
> +
> + if (bbno < fbno && bend > fend) {
> + /*
> + * Case 1:
> + * bbno bend
> + * +BBBBBBBBBBBBBBBBB+
> + * +---------+
> + * fbno fend
> + */
> +
> + /*
> + * We would have to split the busy extent to be able
> + * to track it correct, which we cannot do because we
> + * would have to modify the list of busy extents
> + * attached to the transaction/CIL context, which
> + * is mutable.
> + *
> + * Force out the log to clear the busy extents
> + * and retry the search.
The caller forces the log. Rely on a comment in this
function's header to say that (not here).
> + */
> + return -1;
> + } else if (bbno >= fbno && bend <= fend) {
> + /*
> + * Case 2:
> + * bbno bend
> + * +BBBBBBBBBBBBBBBBB+
> + * +-----------------+
> + * fbno fend
. . .
> +
> + return 1;
> +}
> +
> +
> +/*
> + * For a given extent [fbno, flen], make sure we can reuse it safely.
> + */
> +void
> +xfs_alloc_busy_reuse(
> + struct xfs_trans *tp,
> + xfs_agnumber_t agno,
> + xfs_agblock_t fbno,
> + xfs_extlen_t flen)
> +{
> + struct xfs_perag *pag;
> + struct rb_node *rbp;
> +
> + ASSERT(flen > 0);
> +
> + pag = xfs_perag_get(tp->t_mountp, agno);
> +restart:
> + spin_lock(&pag->pagb_lock);
> + rbp = pag->pagb_tree.rb_node;
> + while (rbp) {
> + struct xfs_busy_extent *busyp =
> + rb_entry(rbp, struct xfs_busy_extent, rb_node);
> + xfs_agblock_t fend = fbno + flen;
> + xfs_agblock_t bbno = busyp->bno;
> + xfs_agblock_t bend = bbno + busyp->length;
> + int overlap;
> +
> + if (fend <= bbno) {
> + rbp = rbp->rb_left;
> + continue;
> + } else if (fbno >= bend) {
> + rbp = rbp->rb_right;
> + continue;
> + }
> +
I was going to suggest:
/* Extent overlaps busy extent */
here, but I think that may not be the case any more if
busyp->length is zero. Or rather, the extent may
surround the zero-length busy extent (which I suppose
could be considered overlap).
If busyp->length is zero, I think the call to
xfs_alloc_busy_try_reuse() is not needed; in fact,
if it is already 0, that function will call
rb_erase() on the entry again.
> + overlap = xfs_alloc_busy_try_reuse(pag, busyp,
> + fbno, fbno + flen);
Note that xfs_alloc_busy_try_reuse() only returns 1 or -1 now...
> + if (overlap) {
...therefore this branch is always taken, and the code
below this block to the end of the loop is not reached.
Since this is the only place it's used, xfs_alloc_busy_try_reuse()
might as well be defined as a void function.
(Ahh, but now that I've looked at the later patches
I see it gets used again later. I'm leaving my
comments here nevertheless.)
> + spin_unlock(&pag->pagb_lock);
> + xfs_log_force(tp->t_mountp, XFS_LOG_SYNC);
> + goto restart;
> + }
+------ code starting here is never reached
|
> + /*
> + * No more busy extents to search.
> + */
> + if (bbno <= fbno && bend >= fend)
> + break;
> +
> + if (fbno < bbno)
> + rbp = rbp->rb_left;
> + else
> + rbp = rbp->rb_right;
|
+------ end of code not reached
> + }
> + spin_unlock(&pag->pagb_lock);
> + xfs_perag_put(pag);
> +}
> +
> /*
> * For a given extent [fbno, flen], search the busy extent list
> * to find a subset of the extent that is not busy.
* If part or all of the extent is busy, force the log, then
* verify it is no longer busy before returning.
> @@ -2889,14 +2923,12 @@ xfs_alloc_busy_clear(
> trace_xfs_alloc_unbusy(mp, busyp->agno, busyp->bno,
> busyp->length);
>
> - ASSERT(xfs_alloc_busy_search(mp, busyp->agno, busyp->bno,
> - busyp->length) == 1);
> -
> list_del_init(&busyp->list);
>
> pag = xfs_perag_get(mp, busyp->agno);
> spin_lock(&pag->pagb_lock);
> - rb_erase(&busyp->rb_node, &pag->pagb_tree);
> + if (busyp->length)
> + rb_erase(&busyp->rb_node, &pag->pagb_tree);
> spin_unlock(&pag->pagb_lock);
> xfs_perag_put(pag);
>
. . .
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2011-03-25 21:01 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-03-22 19:55 [PATCH 0/6] [PATCH 0/6] more efficient busy extent handling and discard support Christoph Hellwig
2011-03-22 19:55 ` [PATCH 1/6] xfs: optimize AGFL refills Christoph Hellwig
2011-03-22 22:30 ` Alex Elder
2011-03-23 12:16 ` Christoph Hellwig
2011-03-25 21:03 ` Alex Elder
2011-03-28 12:07 ` Christoph Hellwig
2011-03-22 23:30 ` Dave Chinner
2011-03-23 12:16 ` Christoph Hellwig
2011-03-22 19:55 ` [PATCH 2/6] xfs: do not immediately reuse busy extent ranges Christoph Hellwig
2011-03-22 22:30 ` Alex Elder
2011-03-23 12:17 ` Christoph Hellwig
2011-03-25 21:03 ` Alex Elder
2011-03-28 12:07 ` Christoph Hellwig
2011-03-22 19:55 ` [PATCH 3/6] xfs: exact busy extent tracking Christoph Hellwig
2011-03-22 23:47 ` Dave Chinner
2011-03-23 12:24 ` Christoph Hellwig
2011-03-25 21:04 ` Alex Elder [this message]
2011-03-28 12:10 ` Christoph Hellwig
2011-03-22 19:55 ` [PATCH 4/6] xfs: allow reusing busy extents where safe Christoph Hellwig
2011-03-23 0:20 ` Dave Chinner
2011-03-23 12:26 ` Christoph Hellwig
2011-03-25 21:04 ` Alex Elder
2011-03-22 19:55 ` [PATCH 5/6] xfs: add online discard support Christoph Hellwig
2011-03-23 0:30 ` Dave Chinner
2011-03-23 12:31 ` Christoph Hellwig
2011-03-25 21:04 ` Alex Elder
2011-03-28 12:35 ` Christoph Hellwig
2011-03-22 19:55 ` [PATCH 6/6] xfs: make discard operations asynchronous Christoph Hellwig
2011-03-23 0:43 ` Dave Chinner
2011-03-28 12:44 ` Christoph Hellwig
2011-03-25 21:04 ` Alex Elder
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1301087060.2537.689.camel@doink \
--to=aelder@sgi.com \
--cc=hch@infradead.org \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox