linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Zeev Tarantov <zeev.tarantov@gmail.com>
To: Andreas Dilger <adilger@dilger.ca>
Cc: Eric Sandeen <sandeen@redhat.com>,
	ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH] e2fsprogs: don't set stripe/stride to 1 block in mkfs
Date: Tue, 5 Apr 2011 11:43:01 +0300	[thread overview]
Message-ID: <BANLkTikNXr9R+odfKMnYGWNqDBYtu0ZTvw@mail.gmail.com> (raw)
In-Reply-To: <F8519658-E9FA-4DD5-9980-59006CDACC8C@dilger.ca>

On Tue, Apr 5, 2011 at 11:10, Andreas Dilger <adilger@dilger.ca> wrote:
> On 2011-04-04, at 9:11 AM, Eric Sandeen wrote:
>> Block devices may set minimum or optimal IO hints equal to
>> blocksize; in this case there is really nothing for ext4
>> to do with this information (i.e. search for a block-aligned
>> allocation?) so don't set fs geometry with single-block
>> values.
>>
>> Zeev also reported that with a block-sized stripe, the
>> ext4 allocator spends time spinning in ext4_mb_scan_aligned(),
>> oddly enough.
>>
>> Reported-by: Zeev Tarantov <zeev.tarantov@gmail.com>
>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>> ---
>>
>> diff --git a/misc/mke2fs.c b/misc/mke2fs.c
>> index 9798b88..74b838c 100644
>> --- a/misc/mke2fs.c
>> +++ b/misc/mke2fs.c
>> @@ -1135,8 +1135,11 @@ static int get_device_geometry(const char *file,
>>       if ((opt_io == 0) && (psector_size > blocksize))
>>               opt_io = psector_size;
>>
>> -     fs_param->s_raid_stride = min_io / blocksize;
>> -     fs_param->s_raid_stripe_width = opt_io / blocksize;
>> +     /* setting stripe/stride to blocksize is pointless */
>> +     if (min_io > blocksize)
>> +             fs_param->s_raid_stride = min_io / blocksize;
>> +     if (opt_io > blocksize)
>> +             fs_param->s_raid_stripe_width = opt_io / blocksize;
>
> I don't think it is harmful to specify an mballoc alignment that is an even multiple of the underlying device IO size (e.g. at least 256kB or 512kB).
>
> If the underlying device (e.g. zram) is reporting 16kB or 64kB opt_io size because that is PAGE_SIZE, but blocksize is 4kB, then we will have the same performance problem again.

I think I'm not understanding you. Are you objecting to the patch? If so, why?
If io block is an even multiple of fs blocks, then mballoc really does
need to use the fancier aligning code (though for a ratio that is a
power of two, it shouldn't need to be slow).
In your example, If PAGE_SIZE is 16k and zram reports min and opt io
size as 16k but fs block is 4k, then Eric Sandeen's new code will set
stride and stripe to 4, as it should. Then mballoc will really need to
align blocks in groups of 4 and spending time doing that is not a
performance problem.

This patch is only meant to _not_ set stripe and stride to 1, which
cannot help block allocation and does in fact cause the kernel to
spend more cpu time because it does not detect the degenerate case.
Making the allocator faster in the case of a ratio that is a power of
two is the work of a different patch someone may want to write.
Using larger fs blocks in case the underlying block device prefers
larger io blocks is the work of the admin, I think.

> Cheers, Andreas

-Z.T.
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2011-04-05  8:43 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-04-04 19:11 [PATCH] e2fsprogs: don't set stripe/stride to 1 block in mkfs Eric Sandeen
2011-04-05  8:10 ` Andreas Dilger
2011-04-05  8:43   ` Zeev Tarantov [this message]
2011-04-05 16:39   ` Eric Sandeen
2011-04-05 16:56     ` Eric Sandeen
2011-04-05 22:21       ` Eric Sandeen
2011-04-06  6:51         ` Zeev Tarantov
2011-04-06 17:00           ` Eric Sandeen
2011-04-08  0:13       ` Andreas Dilger
2011-04-08  0:24         ` Eric Sandeen
2011-05-18 17:20 ` Ted Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BANLkTikNXr9R+odfKMnYGWNqDBYtu0ZTvw@mail.gmail.com \
    --to=zeev.tarantov@gmail.com \
    --cc=adilger@dilger.ca \
    --cc=linux-ext4@vger.kernel.org \
    --cc=sandeen@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).