From: Coly Li <colyli@gmail.com>
To: Bernd Schubert <bernd.schubert@fastmail.fm>
Cc: linux-ext4@vger.kernel.org,
Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Subject: Re: [PATCH 2/2] ext4 directory index: read-ahead blocks
Date: Sat, 18 Jun 2011 02:44:11 +0800
Message-ID: <4DFBA07B.6090001@gmail.com>
In-Reply-To: <20110617160100.2062012.50927.stgit@localhost.localdomain>
On 2011-06-18 00:01, Bernd Schubert wrote:
> While creating files in large directories we noticed an endless number
> of 4K reads. And those reads very much reduced file creation numbers
> as shown by bonnie. While we would expect about 2000 creates/s, we
> only got about 25 creates/s. Running the benchmarks for a long time
> improved the numbers, but not above 200 creates/s.
> It turned out that those reads were directory index (dx) block reads,
> and that the bh cache probably never held all dx blocks. Given the
> high number of directories we have (8192) and the number of files required
> to trigger the issue (16 million), cached dx blocks were most likely
> evicted in favour of other, less important blocks.
> The patch below implements a read-ahead for *all* dx blocks of a directory
> if a single dx block is missing in the cache. That also helps the LRU
> to cache important dx blocks.
>
> Unfortunately, it also has a performance trade-off for the first access to
> a directory, although the READA flag is set already.
> Therefore at least for now, this option is disabled by default, but may
> be enabled using 'mount -o dx_read_ahead' or 'mount -odx_read_ahead=1'
>
> Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
> ---
One question: are there any performance numbers for dx directory read-ahead?
My concern is that if the buffer cache replacement behavior is not ideal, so that a dx block gets evicted in favour of
other (possibly hotter) blocks, dx directory read-ahead will introduce more I/O. In that case, we should focus on
finding out why dx blocks are evicted from the buffer cache, rather than on adding dx read-ahead.
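To make the request for numbers concrete, a measurement along the following lines would be enough to compare runs with and without the mount option. This is only a minimal user-space sketch, not the bonnie benchmark used in the patch description; the function name creates_per_second and the f<N> file naming are assumptions made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `count` empty files named f0..f<count-1>
 * under `dir` and return the observed create rate in files per second.
 * Returns -1.0 on any failure (missing directory, existing file, ...).
 */
double creates_per_second(const char *dir, unsigned count)
{
	char path[4096];
	struct timespec t0, t1;
	double secs;
	unsigned i;

	if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
		return -1.0;

	for (i = 0; i < count; i++) {
		int fd;
		int n = snprintf(path, sizeof(path), "%s/f%u", dir, i);

		if (n < 0 || (size_t)n >= sizeof(path))
			return -1.0;
		/* O_EXCL makes sure we time real creates, not re-opens. */
		fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
		if (fd < 0)
			return -1.0;
		close(fd);
	}

	if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
		return -1.0;

	secs = (double)(t1.tv_sec - t0.tv_sec) +
	       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
	return secs > 0.0 ? (double)count / secs : -1.0;
}
```

Running this against the same large directory once with and once without dx_read_ahead, after dropping caches each time, would show whether the extra reads pay off on first access as well as in the steady state.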
[snip]
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 6f32da4..78290f0 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -334,6 +334,35 @@ struct stats dx_show_entries(struct dx_hash_info *hinfo, struct inode *dir,
> #endif /* DX_DEBUG */
>
> /*
> + * Read ahead directory index blocks
> + */
> +static void dx_ra_blocks(struct inode *dir, struct dx_entry * entries)
> +{
> +	int i, err = 0;
> +	unsigned num_entries = dx_get_count(entries);
> +
> +	if (num_entries < 2 || num_entries > dx_get_limit(entries)) {
> +		dxtrace(printk("dx read-ahead: invalid number of entries\n"));
> +		return;
> +	}
> +
> +	dxtrace(printk("dx read-ahead: %d entries in dir-ino %lu \n",
> +			num_entries, dir->i_ino));
> +
> +	i = 1; /* skip first entry, it was already read in by the caller */
> +	do {
> +		struct dx_entry *entry;
> +		ext4_lblk_t block;
> +
> +		entry = entries + i;
> +
> +		block = dx_get_block(entry);
> +		err = ext4_bread_ra(dir, dx_get_block(entry));
> +		i++;
> +	} while (i < num_entries && !err);
> +}
> +
I see synchronous reads here (correct me if I'm wrong), which is a performance killer. An asynchronous, background
read-ahead would be better.
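To illustrate the difference I mean, here is a rough user-space analogy only, not kernel code and not a proposed replacement for the patch. It walks a list of block numbers the same way the loop above does (skipping entry 0), but instead of reading each block and waiting, it only asks the kernel to start the read via posix_fadvise(POSIX_FADV_WILLNEED). The function name ra_hint_blocks and its parameters are assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical user-space stand-in for the dx read-ahead loop.
 * For blocks[1..count-1], hint that the block will be needed soon;
 * the kernel starts the read in the background and the caller does
 * not wait for the data.
 *
 * Returns the number of hints issued, 0 if `count` is out of range
 * (mirroring the silent return in the patch), or a negative error
 * number if posix_fadvise() fails.
 */
int ra_hint_blocks(int fd, const unsigned *blocks, size_t count,
		   size_t limit, size_t block_size)
{
	size_t i;

	if (count < 2 || count > limit)
		return 0;

	/* Skip entry 0: the caller is assumed to have read it already. */
	for (i = 1; i < count; i++) {
		off_t off = (off_t)blocks[i] * (off_t)block_size;
		/* posix_fadvise() returns the error number directly. */
		int err = posix_fadvise(fd, off, (off_t)block_size,
					POSIX_FADV_WILLNEED);

		if (err)
			return -err;
	}
	return (int)(count - 1);
}
```

The point is that the loop itself never blocks on I/O: all requests are queued first, and the caller only waits later, when it actually touches a block that has not arrived yet.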
[snip]
Thanks.
Coly