All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Lachlan McIlroy <lachlan@sgi.com>
Cc: xfs-dev <xfs-dev@sgi.com>, xfs-oss <xfs@oss.sgi.com>
Subject: Re: [PATCH] Prevent extent btree block allocation failures
Date: Mon, 23 Jun 2008 18:05:30 +1000	[thread overview]
Message-ID: <20080623080530.GI29319@disturbed> (raw)
In-Reply-To: <485F455A.9060701@sgi.com>

On Mon, Jun 23, 2008 at 04:40:26PM +1000, Lachlan McIlroy wrote:
> Dave Chinner wrote:
>> On Mon, Jun 23, 2008 at 03:57:22PM +1000, Lachlan McIlroy wrote:
>>> Dave Chinner wrote:
>>>> On Fri, Jun 20, 2008 at 03:21:20PM +1000, Dave Chinner wrote:
>>>>> On Thu, Jun 19, 2008 at 05:28:50PM +1000, Lachlan McIlroy wrote:
>>>>>> There's something else that looks suspicious to me - this code in
>>>>>> xfs_bmap_btalloc() is setting minleft to 0.  Doesn't this go against
>>>>>> what you were saying about setting minleft to be the space we might
>>>>>> need for the btree operations?
>>>>>>
>>>>>> 	if (args.fsbno == NULLFSBLOCK && nullfb) {
>>>>>> 		args.fsbno = 0;
>>>>>> 		args.type = XFS_ALLOCTYPE_FIRST_AG;
>>>>>> 		args.total = ap->minlen;
>>>>>> 		args.minleft = 0;
>>>>>> 		if ((error = xfs_alloc_vextent(&args)))
>>>>>> 			return error;
>>>>>> 		ap->low = 1;
>>>>>> 	}
>>>>> Hmmm - that looks suspicious. In xfs_bmapi(), when we are doing a
>>>>> write and *firstblock == NULLFSBLOCK (which leads to nullfb being
>>>>> set in the above code), we do:
>>>>>
>>>>>         if (wr && *firstblock == NULLFSBLOCK) {
>>>>>                 if (XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_BTREE)
>>>>>                         minleft = be16_to_cpu(ifp->if_broot->bb_level) + 1;
>>>>>                 else
>>>>>                         minleft = 1;
>>>>>         } else
>>>>>                 minleft = 0;
>>>>>
>>>>> If we are in btree format we set the minleft to the number of blocks needed
>>>>> for a split. If we are in extent or local format, change to extent of btree
>>>>> format requires one extra block.
>>>>>
>>>>> The above code you point out definitely breaks this - we haven't done a
>>>>> previous allocation so we can start from the first AG, but we sure as
>>>>> hell still need minleft set to the number of blocks needed for a
>>>>> format change or btree split.
>>>> Just to point out yet another problem in this code (one that's just
>>>> been tripped over @ agami) is unwritten extent conversion.
>>>>
>>>> Basically, we don't do an allocation here, so when we end up in
>>>> xfs_bmap_add_extent_unwritten_real() with a null firstblock. Hence
>>>> the cases where conversion can cause a split - case
>>>> MASK(LEFT_FILLING), MASK(RIGHT_FILLING) and 0 (convert the middle of
>>>> an extent) - we can select an AG that doesn't have enough space for
>>>> the entire split as we've ignored the number of blocks we might
>>>> need to allocate in the split (the minleft parameter) entirely.
>>>>
>>>> I suspect that xfs_bmbt_split() needs to handle the null first block
>>>> case slightly differently - the minleft parameter passed to the
>>>> allocation should not be zero - it should be the number of levels
>>>> above the current level left in the tree. i.e:
>>>>
>>>> 	minleft = be16_to_cpu(ifp->if_broot->bb_level) + 1;
>>>>
>>>> If we've already got a firstblock set, then this should have already
>>>> been taken into account (i.e. we still need to fix the low space
>>>> case where it got ignored as we were discussing).
>>> Funny.  I tested the exact same change last week to try to fix the same
>>> problem.  Seemed to work okay.
>>
>> Cool. Got a patch for review?
>
> I couldn't find the original patch that calculated minleft as above - instead
> here's a variant that addresses the double insert problem by retrieving the
> reservation amount from the transaction.  It could very well be overkill though.

No, that seems valid; all allocations need to pass in a reservation
for a number of blocks needed for the transaction to proceed. I did
a quick check and everything appears to be reserving only what
is necessary for a bmbt split (or two)...

> --- fs/xfs/xfs_bmap_btree.c_1.169	2008-06-16 17:25:10.000000000 +1000
> +++ fs/xfs/xfs_bmap_btree.c	2008-06-16 18:32:45.000000000 +1000
> @@ -1496,9 +1496,12 @@ xfs_bmbt_split(
> 	if (args.fsbno == NULLFSBLOCK) {
> 		args.fsbno = lbno;
> 		args.type = XFS_ALLOCTYPE_START_BNO;
> -	} else
> +		args.minleft = xfs_trans_get_block_res(args.tp);
> +	} else {

Might be worth a comment ;)

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2008-06-23  8:04 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-06-13  7:38 [PATCH] Prevent extent btree block allocation failures Lachlan McIlroy
2008-06-13 13:44 ` Christoph Hellwig
2008-06-16  3:57   ` Lachlan McIlroy
2008-06-13 15:57 ` Dave Chinner
2008-06-16  6:11   ` Lachlan McIlroy
2008-06-16 17:10     ` Dave Chinner
2008-06-17  1:58       ` Lachlan McIlroy
2008-06-17  7:39         ` Dave Chinner
2008-06-19  7:28           ` Lachlan McIlroy
2008-06-20  5:21             ` Dave Chinner
2008-06-23  5:20               ` Dave Chinner
2008-06-23  5:57                 ` Lachlan McIlroy
2008-06-23  6:14                   ` Dave Chinner
2008-06-23  6:40                     ` Lachlan McIlroy
2008-06-23  8:05                       ` Dave Chinner [this message]
2008-06-23  5:24               ` Lachlan McIlroy
2008-06-23  6:21                 ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080623080530.GI29319@disturbed \
    --to=david@fromorbit.com \
    --cc=lachlan@sgi.com \
    --cc=xfs-dev@sgi.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.