linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Curt Wohlgemuth <curtw@google.com>
Cc: ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: Help understanding prealloc space choice?
Date: Wed, 14 Oct 2009 10:53:15 +0530	[thread overview]
Message-ID: <20091014052314.GA29704@skywalker.linux.vnet.ibm.com> (raw)
In-Reply-To: <6601abe90910131106u3a569d51g1322fe6764a2fbb6@mail.gmail.com>

On Tue, Oct 13, 2009 at 11:06:35AM -0700, Curt Wohlgemuth wrote:
> Hi all:
> 
> I'm looking in ext4_mb_use_preallocated() and am seeing something odd.
> 
> First we look through the inode prealloc list, and see if we have a
> preallocation that satisfies the allocation context:
> 
>        /* all fields in this condition don't change,
>         * so we can skip locking for them */
>        if (ac->ac_o_ex.fe_logical < pa->pa_lstart ||
>                ac->ac_o_ex.fe_logical >= pa->pa_lstart + pa->pa_len)
>                continue;
> 
>        /* non-extent files can't have physical blocks past 2^32 */
>        if (!(EXT4_I(ac->ac_inode)->i_flags & EXT4_EXTENTS_FL) &&
>                pa->pa_pstart + pa->pa_len > EXT4_MAX_BLOCK_FILE_PHYS)
>                continue;
> 
>        /* found preallocated blocks, use them */
>        spin_lock(&pa->pa_lock);
>        if (pa->pa_deleted == 0 && pa->pa_free) {
> 
>             => Now we're good, and have an AC that satisfies us.
>             => We call ext4_mb_use_inode_pa(ac, pa);
> 
> 
> But ext4_mb_use_inode_pa() has this:
> 
> 	BUG_ON(pa->pa_free < len);
> 
> Nowhere do we check the 'pa_free' value to decide if this preallocation is
> okay to use.
> 

the 'len' value above is derived out of what we have in prealloc space.
ie, we do this

     start = pa->pa_pstart + (ac->ac_o_ex.fe_logical - pa->pa_lstart);                                                                 
     end = min(pa->pa_pstart + pa->pa_len, start + ac->ac_o_ex.fe_len);                                                                
     len = end - start;     

Now to decide whether we need to use a particular inode prealloc space
we look at the pa->pa_lstart which is the start logical block number
mapping this prealloc space. So if the requested logical block number
falls within a prealloc space (ie within pa->pa_lstart , pa->pa_lstart + pa->pa_len)
we use the prealloc space. Done by the below conditional ext4_mb_use_preallocated

      /* all fields in this condition don't change,                                                                             
       * so we can skip locking for them */                                                                                     
       if (ac->ac_o_ex.fe_logical < pa->pa_lstart ||                                                                             
             ac->ac_o_ex.fe_logical >= pa->pa_lstart + pa->pa_len)                                                             
                        continue;               



> 
> Further down in ext4_mb_use_preallocated() we check the locality group
> prealloc list; for this, we DO check pa_free:
> 
>          spin_lock(&pa->pa_lock);
>          if (pa->pa_deleted == 0 &&
>                          pa->pa_free >= ac->ac_o_ex.fe_len) {
> 
>                  cpa = ext4_mb_check_group_pa(goal_block,
>                                                  pa, cpa);
> 

 
locality group prealloc space is not looked with the logical block number.
We just claim need blocks from the prealloc space. Hence we check for the
available free blocks and the needed free blocks.



> So my question is:  Is it a bug that we don't check that an inode
> preallocation has enough free blocks for the AC before we try to use it?  I
> have hit the BUG_ON above at least once in my testing, but I can't
> characterize what the workload was at the time (nor can I reproduce it...).
> 

You should not hit that. That would mean prealloc space accounting went wrong.
Which is really a BUG

-aneesh

  reply	other threads:[~2009-10-14  5:23 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-10-13 18:06 Help understanding prealloc space choice? Curt Wohlgemuth
2009-10-14  5:23 ` Aneesh Kumar K.V [this message]
2009-10-14 17:26   ` Curt Wohlgemuth

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091014052314.GA29704@skywalker.linux.vnet.ibm.com \
    --to=aneesh.kumar@linux.vnet.ibm.com \
    --cc=curtw@google.com \
    --cc=linux-ext4@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).