linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Sean Anderson <seanga2@gmail.com>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	jack@suse.com, tytso@mit.edu, adilger.kernel@dilger.ca,
	linux-ext4@vger.kernel.org
Subject: Re: [ext2] Mislabeled quadratic probing?
Date: Sat, 29 Jul 2017 19:37:18 -0700	[thread overview]
Message-ID: <20170730023718.GH15980@bombadil.infradead.org> (raw)
In-Reply-To: <07c8955b-0ead-9dd9-978e-767d5dec6712@gmail.com>

On Sat, Jul 29, 2017 at 10:24:29AM -0400, Sean Anderson wrote:
> Hi,
> 
> I was reading through the ext2 inode allocation code, and found the
> following snippet in fs/ext2/ialloc.c:find_group_other
> 
> /*
>  * Use a quadratic hash to find a group with a free inode and some
>  * free blocks.
>  */
> for (i = 1; i < ngroups; i <<= 1) {
>         group += i;
>         if (group >= ngroups)
>                 group -= ngroups;
>         desc = ext2_get_group_desc (sb, group, NULL);
>         if (desc && le16_to_cpu(desc->bg_free_inodes_count) &&
>                         le16_to_cpu(desc->bg_free_blocks_count))
>                 goto found;
> }
> 
> As I understand it, quadratic probing starting at a hash H would try
> positions H+1, H+4, H+9, H+16, H+25, etc. Here, however, the algorithm
> appears to try positions H+1, H+3, H+7, H+15, H+31, etc., which appears
> to be some form of exponential probing. I was unable to find the patch
> which introduced this code, but it appears that it was introduced in
> v2.4.14.3, and before that linear probing was used. Clearly, this code
> works, and I can't really find any compelling arguments to switch to
> quadratic probing proper. I suspect it was done this way to avoid a
> multiply or an extra subtract on every loop. Can anyone shed some light
> on the choice (and apparent mislabel) of this algorithm?

It can't have been to avoid an arithmetic operation.  The quadratic
hash would simply be s/<<= 1/+= 2/ which is going to be equal cost on
basically every CPU.

The biggest danger I see here is that we're only going to test 32
groups before falling back to linear probing (we'll shift the single
'1' bit out of 'i' in 32 steps).  That might be a performance problem,
but it should hit quite rarely.

The danger in changing it is that we'll end up with new files created in
a directory choosing a different block group from files created in that
directory using an old kernel.  And that could be a worse performance
impact.

I think we'd need to see some benchmarks ... Ted, any suggestions for
something which might show a difference between these two approaches
to hashing?

  reply	other threads:[~2017-07-30  2:37 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-29 14:24 [ext2] Mislabeled quadratic probing? Sean Anderson
2017-07-30  2:37 ` Matthew Wilcox [this message]
2017-07-30 22:22   ` Theodore Ts'o
2017-07-31  1:40   ` Sean Anderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170730023718.GH15980@bombadil.infradead.org \
    --to=willy@infradead.org \
    --cc=adilger.kernel@dilger.ca \
    --cc=jack@suse.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=seanga2@gmail.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).