From: Curt Wohlgemuth <curtw@google.com>
To: ext4 development <linux-ext4@vger.kernel.org>
Subject: Question on block group allocation
Date: Thu, 23 Apr 2009 09:41:50 -0700
Message-ID: <6601abe90904230941x5cdd590ck2d51410326df2fc5@mail.gmail.com>
Hi:
I'm seeing a performance problem on ext4 vs ext2, and in trying to
narrow it down, I've got a question about block allocation in ext4
that I'm having trouble figuring out.
The test in question just does random reads of several rather large
files (4.5GB and 10GB) in a single thread. All files are created in
the top-level directory. Looking into the block layout for the
various files, I'm struck by the wide separation of the extents in
some of the files.
As a simple example, I formatted/mounted a new ext4 partition with
default parameters (with the exception of "-O ^has_journal", but this
shouldn't make a difference); the FS has 5585 block groups of 4K
blocks.
Using dd, I created (in this order) two 4GB files and a 10GB file in
the mount directory.
The extent blocks are reasonably close together for the two 4GB files,
but the extents for the 10GB file show a huge gap, which seems to hurt
the random read performance pretty substantially. Here's the output
from debugfs:
BLOCKS:
(IND):8396832,
(0-106495):8282112-8388607,
(106496-399359):11241472-11534335,
(399360-888831):20482048-20971519,
(888832-1116159):23889920-24117247,
(1116160-1277951):71665664-71827455,
(1277952-1767423):78678016-79167487,
(1767424-2125823):102402048-102760447,
(2125824-2148351):102768672-102791199,
(2148352-2621439):102793216-103266303
TOTAL: 2621441
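To make the separation easier to see, the extent list above can be reduced to per-extent block groups and the gaps between neighbouring extents. This is a minimal sketch, assuming 32768 blocks per group (the usual default for 4K blocks) and a first data block of 0; `BLOCKS_PER_GROUP`, `parse_extents`, and `gaps` are names made up for illustration and are not part of debugfs.

```python
import re

BLOCKS_PER_GROUP = 32768  # assumption: default group size for 4K blocks

# The debugfs listing from above, with the wrapped extent rejoined.
DEBUGFS_BLOCKS = """
(IND):8396832, (0-106495):8282112-8388607,
(106496-399359):11241472-11534335, (399360-888831):20482048-20971519,
(888832-1116159):23889920-24117247, (1116160-1277951):71665664-71827455,
(1277952-1767423):78678016-79167487,
(1767424-2125823):102402048-102760447,
(2125824-2148351):102768672-102791199,
(2148352-2621439):102793216-103266303
"""

# Matches "(logical_start-logical_end):phys_start-phys_end"; the single
# (IND) block has no ranges and is deliberately skipped.
EXTENT_RE = re.compile(r"\((\d+)-(\d+)\):(\d+)-(\d+)")


def parse_extents(text):
    """Return (logical_start, logical_end, phys_start, phys_end) tuples."""
    return [tuple(int(x) for x in m.groups()) for m in EXTENT_RE.finditer(text)]


def gaps(extents):
    """Yield (prev_phys_end, next_phys_start, gap_in_blocks) for neighbours."""
    for prev, nxt in zip(extents, extents[1:]):
        yield prev[3], nxt[2], nxt[2] - prev[3] - 1


if __name__ == "__main__":
    extents = parse_extents(DEBUGFS_BLOCKS)
    for lstart, lend, pstart, pend in extents:
        first, last = pstart // BLOCKS_PER_GROUP, pend // BLOCKS_PER_GROUP
        print(f"logical {lstart}-{lend}: groups {first}-{last}")
    for end, start, gap in gaps(extents):
        print(f"{end} -> {start}: {gap} blocks (~{gap // BLOCKS_PER_GROUP} groups)")
```

On this listing the sketch reports the widest gap between 24117247 and 71665664 (47548416 blocks), with the second widest between 79167487 and 102402048 (23234560 blocks).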
Note the gap between blocks 79167487 and 102402048. I was lucky
enough to capture the mb_history from this 10GB create:
29109 14 735/30720/32758@1114112 735/30720/2048@1114112
735/30720/2048@1114112 1 0 0 1568 M 0 0
29109 14 736/0/32758@1116160 736/0/2048@1116160
2187/2048/2048@1116160 1 1 0 1568 0 0
29109 14 2187/4096/32758@1118208 2187/4096/2048@1118208
2187/4096/2048@1118208 1 0 0 1568 M 2048 4096
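For reading these records, it helps to turn each `group/start/len@logical` field into an absolute physical block number. The sketch below assumes that the three slash-separated columns are, in order, the original request, the goal, and the result, and that each group holds 32768 blocks with a first data block of 0; `decode` and `decode_record` are hypothetical helper names, not kernel or e2fsprogs functions.

```python
import re

BLOCKS_PER_GROUP = 32768  # assumption: default group size for 4K blocks

# One extent field: group/start-within-group/length@logical-block
FIELD_RE = re.compile(r"(\d+)/(\d+)/(\d+)@(\d+)")


def decode(field):
    """Split 'group/start/len@logical' and add the absolute physical block."""
    group, start, length, logical = map(int, FIELD_RE.fullmatch(field).groups())
    return {
        "group": group,
        "start": start,
        "len": length,
        "logical": logical,
        "phys": group * BLOCKS_PER_GROUP + start,
    }


def decode_record(line):
    """Decode the first three extent fields of one mb_history record.

    Assumed column order: original request, goal, result.
    """
    names = ("original", "goal", "result")
    fields = [m.group(0) for m in FIELD_RE.finditer(line)]
    return {name: decode(field) for name, field in zip(names, fields)}


# The second record from above, rejoined onto one logical line.
RECORD = (
    "29109 14 736/0/32758@1116160 736/0/2048@1116160 "
    "2187/2048/2048@1116160 1 1 0 1568 0 0"
)

if __name__ == "__main__":
    rec = decode_record(RECORD)
    for name, ext in rec.items():
        print(f"{name}: group {ext['group']}, physical block {ext['phys']}, "
              f"{ext['len']} blocks at logical {ext['logical']}")
```

Under those assumptions the goal of the second record is physical block 24117248, directly after the extent ending at 24117247, while the result is physical block 71665664 in group 2187, which is where the debugfs listing shows the next extent starting.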
I've been staring at ext4_mb_regular_allocator() trying to understand
why an allocation with a goal group of 736 ends up with a best found
extent group of 2187, and I'm stuck -- at least without a lot of
printk messages. It seems to me that we just cycle through the block
groups starting with the goal group until we find a group that fits.
Again, according to dumpe2fs, block groups 737, 738, 739, ... all have
32768 free blocks. So why we end up with a best fit group of 2187 is
a mystery to me.
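The expectation described above amounts to a circular first-fit scan over the block groups. The sketch below models only that expectation; it is not a description of what ext4_mb_regular_allocator() actually does, and the free-block map is hypothetical (groups 0-736 treated as full, groups 737 and up as completely free, which is an assumption loosely based on the dumpe2fs observation). `first_fit_group` is an illustrative name.

```python
def first_fit_group(goal_group, free_blocks, needed):
    """Scan groups circularly from goal_group.

    Returns the first group with at least `needed` free blocks,
    or None if no group qualifies.
    """
    ngroups = len(free_blocks)
    for offset in range(ngroups):
        group = (goal_group + offset) % ngroups
        if free_blocks[group] >= needed:
            return group
    return None


NGROUPS = 5585  # number of block groups on the test filesystem

# Hypothetical free map: everything up to group 736 is full,
# every later group is completely free.
free_blocks = [0] * NGROUPS
for g in range(737, NGROUPS):
    free_blocks[g] = 32768

if __name__ == "__main__":
    print(first_fit_group(736, free_blocks, 2048))
```

With this made-up map the scan stops at group 737, which is the outcome the simple model predicts and the reason a result in group 2187 looks surprising.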
Can anybody give me any insight into what's happening here?
Thanks,
Curt