From: Eric Sandeen <sandeen@redhat.com>
To: Rogier Wolff <R.E.Wolff@BitWizard.nl>
Cc: linux-ext4@vger.kernel.org
Subject: Re: Time for "mkdir" on ext3.
Date: Thu, 10 Mar 2011 09:48:38 -0600
Message-ID: <4D78F2D6.7000208@redhat.com>
In-Reply-To: <20110310071119.GF31710@bitwizard.nl>

On 3/10/11 1:11 AM, Rogier Wolff wrote:
>
> Hi,
>
> I have an ext3 filesystem. When I "cp -lr" a big tree there, it turns
> out that the "mkdir" calls take the bulk of the time. IIRC there are
> 325000 directories (and 4 million files). Each mkdir call takes about
> 50ms (*), so that accounts for about 4.5 hours of the running time.
>
> Would ext4 perform significantly better?
>
> Roger.
>
> (*) I forgot about this Email while it was still in my editor. Now a
> day later the mkdir calls all take around 17ms, and things run about
> 3x faster. On the other hand it's been running for over 5 hours. And
> yesterday I've seen a streak of >100ms mkdir calls... So apparently
> it depends on "something"....
>
There's a pretty pathological case in the directory allocator, where it
scans forward to find a free block group starting at the parent. For
each new subdir, it re-scans starting at the parent, even if it found
those groups full last time. I had experimented with an in-memory
"last free group" on each parent, which sped things up after the initial
scan. That might be what you're seeing...
Here's the patch I had, untested since 2007 - if you're in a testing
mood... of course if it breaks you get to keep the pieces. :)

-Eric

diff --git a/fs/ext3/ialloc.c b/fs/ext3/ialloc.c
index 9724aef..2f7be0c 100644
--- a/fs/ext3/ialloc.c
+++ b/fs/ext3/ialloc.c
@@ -242,6 +242,7 @@ static int find_group_dir(struct super_block *sb, struct inode *parent)
 static int find_group_orlov(struct super_block *sb, struct inode *parent)
 {
 	int parent_group = EXT3_I(parent)->i_block_group;
+	unsigned int child_group = EXT3_I(parent)->i_child_block_group;
 	struct ext3_sb_info *sbi = EXT3_SB(sb);
 	struct ext3_super_block *es = sbi->s_es;
 	int ngroups = sbi->s_groups_count;
@@ -269,7 +270,7 @@ static int find_group_orlov(struct super_block *sb, struct inode *parent)
 		get_random_bytes(&group, sizeof(group));
 		parent_group = (unsigned)group % ngroups;
 		for (i = 0; i < ngroups; i++) {
-			group = (parent_group + i) % ngroups;
+			group = (child_group + i) % ngroups;
 			desc = ext3_get_group_desc (sb, group, NULL);
 			if (!desc || !desc->bg_free_inodes_count)
 				continue;
@@ -312,6 +313,7 @@ static int find_group_orlov(struct super_block *sb, struct inode *parent)
 			continue;
 		if (le16_to_cpu(desc->bg_free_blocks_count) < min_blocks)
 			continue;
+		EXT3_I(parent)->i_child_block_group = group;
 		return group;
 	}
 
@@ -555,6 +557,8 @@ got:
 	ei->i_dtime = 0;
 	ei->i_block_alloc_info = NULL;
 	ei->i_block_group = group;
+	if (S_ISDIR(mode))
+		ei->i_child_block_group = group;
 
 	ext3_set_inode_flags(inode);
 	if (IS_DIRSYNC(inode))
diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index ae94f6d..72b0c92 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -2888,6 +2888,8 @@ struct inode *ext3_iget(struct super_block *sb, unsigned long ino)
 	ei->i_disksize = inode->i_size;
 	inode->i_generation = le32_to_cpu(raw_inode->i_generation);
 	ei->i_block_group = iloc.block_group;
+	if (S_ISDIR(inode->i_mode))
+		ei->i_child_block_group = ei->i_block_group;
 	/*
 	 * NOTE! The in-memory inode i_data array is in little-endian order
 	 * even on big-endian machines: we do NOT byteswap the block numbers!
diff --git a/include/linux/ext3_fs_i.h b/include/linux/ext3_fs_i.h
index f42c098..79f3a72 100644
--- a/include/linux/ext3_fs_i.h
+++ b/include/linux/ext3_fs_i.h
@@ -87,6 +87,7 @@ struct ext3_inode_info {
 	 * near to their parent directory's inode.
 	 */
 	__u32	i_block_group;
+	__u32	i_child_block_group;	/* last bg children allocated to */
 	unsigned long	i_state_flags;	/* Dynamic state flags for ext3 */
 
 	/* block reservation info */