From: Jan Kara <jack@suse.cz>
To: Zheng Liu <gnehzuil.liu@gmail.com>
Cc: linux-ext4@vger.kernel.org, Zheng Liu <wenqing.lz@taobao.com>,
Theodore Ts'o <tytso@mit.edu>, Jan kara <jack@suse.cz>
Subject: Re: [PATCH 05/10 v5] ext4: lookup block mapping in extent status tree
Date: Tue, 12 Feb 2013 13:31:42 +0100 [thread overview]
Message-ID: <20130212123142.GC19583@quack.suse.cz> (raw)
In-Reply-To: <1360313046-9876-6-git-send-email-wenqing.lz@taobao.com>
On Fri 08-02-13 16:44:01, Zheng Liu wrote:
> From: Zheng Liu <wenqing.lz@taobao.com>
>
> After tracking all extent status, we already have a extent cache in
> memory. Every time we want to lookup a block mapping, we can first
> try to lookup it in extent status tree to avoid a potential disk I/O.
>
> A new function called ext4_es_lookup_extent is defined to finish this
> work. When we try to lookup a block mapping, we always call
> ext4_map_blocks and/or ext4_da_map_blocks. So in these functions we
> first try to lookup a block mapping in extent status tree.
>
> A new flag EXT4_GET_BLOCKS_NO_PUT_HOLE is used in ext4_da_map_blocks
> in order not to put a hole into extent status tree because this hole
> will be converted to delayed extent in the tree immediately.
It looks somewhat inconsistent that you put hole into the extent tree in
ext4_ext_map_blocks() but all other extent types are handled in
ext4_map_blocks() or ext4_da_map_blocks(). Can we put the handling in one
place?
Otherwise the patch looks OK.
Honza
> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
> Cc: "Theodore Ts'o" <tytso@mit.edu>
> Cc: Jan kara <jack@suse.cz>
> ---
> fs/ext4/ext4.h | 2 ++
> fs/ext4/extents.c | 7 ++++-
> fs/ext4/extents_status.c | 59 +++++++++++++++++++++++++++++++++++++++++
> fs/ext4/extents_status.h | 1 +
> fs/ext4/inode.c | 64 +++++++++++++++++++++++++++++++++++++++++++--
> include/trace/events/ext4.h | 56 +++++++++++++++++++++++++++++++++++++++
> 6 files changed, 186 insertions(+), 3 deletions(-)
>
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 8462eb3..ad885b5 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -582,6 +582,8 @@ enum {
> #define EXT4_GET_BLOCKS_KEEP_SIZE 0x0080
> /* Do not take i_data_sem locking in ext4_map_blocks */
> #define EXT4_GET_BLOCKS_NO_LOCK 0x0100
> + /* Do not put hole in extent cache */
> +#define EXT4_GET_BLOCKS_NO_PUT_HOLE 0x0200
>
> /*
> * Flags used by ext4_free_blocks
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 4b065ff..1be8955 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -2154,6 +2154,8 @@ ext4_ext_put_gap_in_cache(struct inode *inode, struct ext4_ext_path *path,
> block,
> le32_to_cpu(ex->ee_block),
> ext4_ext_get_actual_len(ex));
> + ext4_es_insert_extent(inode, lblock, len, ~0,
> + EXTENT_STATUS_HOLE);
> } else if (block >= le32_to_cpu(ex->ee_block)
> + ext4_ext_get_actual_len(ex)) {
> ext4_lblk_t next;
> @@ -2167,6 +2169,8 @@ ext4_ext_put_gap_in_cache(struct inode *inode, struct ext4_ext_path *path,
> block);
> BUG_ON(next == lblock);
> len = next - lblock;
> + ext4_es_insert_extent(inode, lblock, len, ~0,
> + EXTENT_STATUS_HOLE);
> } else {
> lblock = len = 0;
> BUG();
> @@ -4006,7 +4010,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
> * put just found gap into cache to speed up
> * subsequent requests
> */
> - ext4_ext_put_gap_in_cache(inode, path, map->m_lblk);
> + if ((flags & EXT4_GET_BLOCKS_NO_PUT_HOLE) == 0)
> + ext4_ext_put_gap_in_cache(inode, path, map->m_lblk);
> goto out2;
> }
>
> diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
> index 71cb75a..ca7dc9f 100644
> --- a/fs/ext4/extents_status.c
> +++ b/fs/ext4/extents_status.c
> @@ -468,6 +468,65 @@ error:
> return err;
> }
>
> +/*
> + * ext4_es_lookup_extent() looks up an extent in extent status tree.
> + *
> + * ext4_es_lookup_extent is called by ext4_map_blocks/ext4_da_map_blocks.
> + *
> + * Return: 1 on found, 0 on not
> + */
> +int ext4_es_lookup_extent(struct inode *inode, struct extent_status *es)
> +{
> + struct ext4_es_tree *tree;
> + struct extent_status *es1 = NULL;
> + struct rb_node *node;
> + int found = 0;
> +
> + trace_ext4_es_lookup_extent_enter(inode, es->es_lblk);
> + es_debug("lookup extent in block %u\n", es->es_lblk);
> +
> + tree = &EXT4_I(inode)->i_es_tree;
> + read_lock(&EXT4_I(inode)->i_es_lock);
> +
> + /* find extent in cache firstly */
> + es->es_len = es->es_pblk = 0;
> + if (tree->cache_es) {
> + es1 = tree->cache_es;
> + if (in_range(es->es_lblk, es1->es_lblk, es1->es_len)) {
> + es_debug("%u cached by [%u/%u)\n",
> + es->es_lblk, es1->es_lblk, es1->es_len);
> + found = 1;
> + goto out;
> + }
> + }
> +
> + node = tree->root.rb_node;
> + while (node) {
> + es1 = rb_entry(node, struct extent_status, rb_node);
> + if (es->es_lblk < es1->es_lblk)
> + node = node->rb_left;
> + else if (es->es_lblk > ext4_es_end(es1))
> + node = node->rb_right;
> + else {
> + found = 1;
> + break;
> + }
> + }
> +
> +out:
> + if (found) {
> + BUG_ON(!es1);
> + es->es_lblk = es1->es_lblk;
> + es->es_len = es1->es_len;
> + es->es_pblk = es1->es_pblk;
> + }
> +
> + read_unlock(&EXT4_I(inode)->i_es_lock);
> +
> + trace_ext4_es_lookup_extent_exit(inode, es, found);
> + return found;
> +}
> +
> static int __es_remove_extent(struct ext4_es_tree *tree, ext4_lblk_t lblk,
> ext4_lblk_t end)
> {
> diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
> index b5788eb..effe78c 100644
> --- a/fs/ext4/extents_status.h
> +++ b/fs/ext4/extents_status.h
> @@ -53,6 +53,7 @@ extern int ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
> ext4_lblk_t len);
> extern ext4_lblk_t ext4_es_find_delayed_extent(struct inode *inode,
> struct extent_status *es);
> +extern int ext4_es_lookup_extent(struct inode *inode, struct extent_status *es);
>
> static inline int ext4_es_is_written(struct extent_status *es)
> {
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 16454fc..670779a 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -508,12 +508,34 @@ static pgoff_t ext4_num_dirty_pages(struct inode *inode, pgoff_t idx,
> int ext4_map_blocks(handle_t *handle, struct inode *inode,
> struct ext4_map_blocks *map, int flags)
> {
> + struct extent_status es;
> int retval;
>
> map->m_flags = 0;
> ext_debug("ext4_map_blocks(): inode %lu, flag %d, max_blocks %u,"
> "logical block %lu\n", inode->i_ino, flags, map->m_len,
> (unsigned long) map->m_lblk);
> +
> + /* Lookup extent status tree firstly */
> + es.es_lblk = map->m_lblk;
> + if (ext4_es_lookup_extent(inode, &es)) {
> + if (ext4_es_is_written(&es) || ext4_es_is_unwritten(&es)) {
> + map->m_pblk = ext4_es_pblock(&es) +
> + map->m_lblk - es.es_lblk;
> + map->m_flags |= ext4_es_is_written(&es) ?
> + EXT4_MAP_MAPPED : EXT4_MAP_UNWRITTEN;
> + retval = es.es_len - (map->m_lblk - es.es_lblk);
> + if (retval > map->m_len)
> + retval = map->m_len;
> + map->m_len = retval;
> + } else if (ext4_es_is_delayed(&es) || ext4_es_is_hole(&es)) {
> + retval = 0;
> + } else {
> + BUG_ON(1);
> + }
> + goto found;
> + }
> +
> /*
> * Try to see if we can get the block without requesting a new
> * file system block.
> @@ -541,6 +563,7 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
> if (!(flags & EXT4_GET_BLOCKS_NO_LOCK))
> up_read((&EXT4_I(inode)->i_data_sem));
>
> +found:
> if (retval > 0 && map->m_flags & EXT4_MAP_MAPPED) {
> int ret = check_block_validity(inode, map);
> if (ret != 0)
> @@ -1772,6 +1795,7 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
> struct ext4_map_blocks *map,
> struct buffer_head *bh)
> {
> + struct extent_status es;
> int retval;
> sector_t invalid_block = ~((sector_t) 0xffff);
>
> @@ -1782,6 +1806,39 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
> ext_debug("ext4_da_map_blocks(): inode %lu, max_blocks %u,"
> "logical block %lu\n", inode->i_ino, map->m_len,
> (unsigned long) map->m_lblk);
> +
> + /* Lookup extent status tree firstly */
> + es.es_lblk = iblock;
> + if (ext4_es_lookup_extent(inode, &es)) {
> +
> + if (ext4_es_is_hole(&es)) {
> + retval = 0;
> + down_read((&EXT4_I(inode)->i_data_sem));
> + goto add_delayed;
> + }
> +
> + if (ext4_es_is_delayed(&es)) {
> + map_bh(bh, inode->i_sb, invalid_block);
> + set_buffer_new(bh);
> + set_buffer_delay(bh);
> + return 0;
> + }
> +
> + map->m_pblk = ext4_es_pblock(&es) + iblock - es.es_lblk;
> + retval = es.es_len - (iblock - es.es_lblk);
> + if (retval > map->m_len)
> + retval = map->m_len;
> + map->m_len = retval;
> + if (ext4_es_is_written(&es))
> + map->m_flags |= EXT4_MAP_MAPPED;
> + else if (ext4_es_is_unwritten(&es))
> + map->m_flags |= EXT4_MAP_UNWRITTEN;
> + else
> + BUG_ON(1);
> +
> + return retval;
> + }
> +
> /*
> * Try to see if we can get the block without requesting a new
> * file system block.
> @@ -1800,10 +1857,13 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
> map->m_flags |= EXT4_MAP_FROM_CLUSTER;
> retval = 0;
> } else if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
> - retval = ext4_ext_map_blocks(NULL, inode, map, 0);
> + retval = ext4_ext_map_blocks(NULL, inode, map,
> + EXT4_GET_BLOCKS_NO_PUT_HOLE);
> else
> - retval = ext4_ind_map_blocks(NULL, inode, map, 0);
> + retval = ext4_ind_map_blocks(NULL, inode, map,
> + EXT4_GET_BLOCKS_NO_PUT_HOLE);
>
> +add_delayed:
> if (retval == 0) {
> int ret;
> /*
> diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
> index d278ced..822780a 100644
> --- a/include/trace/events/ext4.h
> +++ b/include/trace/events/ext4.h
> @@ -2177,6 +2177,62 @@ TRACE_EVENT(ext4_es_find_delayed_extent_exit,
> __entry->pblk, __entry->status, __entry->ret)
> );
>
> +TRACE_EVENT(ext4_es_lookup_extent_enter,
> + TP_PROTO(struct inode *inode, ext4_lblk_t lblk),
> +
> + TP_ARGS(inode, lblk),
> +
> + TP_STRUCT__entry(
> + __field( dev_t, dev )
> + __field( ino_t, ino )
> + __field( ext4_lblk_t, lblk )
> + ),
> +
> + TP_fast_assign(
> + __entry->dev = inode->i_sb->s_dev;
> + __entry->ino = inode->i_ino;
> + __entry->lblk = lblk;
> + ),
> +
> + TP_printk("dev %d,%d ino %lu lblk %u",
> + MAJOR(__entry->dev), MINOR(__entry->dev),
> + (unsigned long) __entry->ino, __entry->lblk)
> +);
> +
> +TRACE_EVENT(ext4_es_lookup_extent_exit,
> + TP_PROTO(struct inode *inode, struct extent_status *es,
> + int found),
> +
> + TP_ARGS(inode, es, found),
> +
> + TP_STRUCT__entry(
> + __field( dev_t, dev )
> + __field( ino_t, ino )
> + __field( ext4_lblk_t, lblk )
> + __field( ext4_lblk_t, len )
> + __field( ext4_fsblk_t, pblk )
> + __field( unsigned long long, status )
> + __field( int, found )
> + ),
> +
> + TP_fast_assign(
> + __entry->dev = inode->i_sb->s_dev;
> + __entry->ino = inode->i_ino;
> + __entry->lblk = es->es_lblk;
> + __entry->len = es->es_len;
> + __entry->pblk = ext4_es_pblock(es);
> + __entry->status = ext4_es_status(es);
> + __entry->found = found;
> + ),
> +
> + TP_printk("dev %d,%d ino %lu found %d [%u/%u) %llu %llx",
> + MAJOR(__entry->dev), MINOR(__entry->dev),
> + (unsigned long) __entry->ino, __entry->found,
> + __entry->lblk, __entry->len,
> + __entry->found ? __entry->pblk : 0,
> + __entry->found ? __entry->status : 0)
> +);
> +
> #endif /* _TRACE_EXT4_H */
>
> /* This part must be outside protection */
> --
> 1.7.12.rc2.18.g61b472e
>
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
next prev parent reply other threads:[~2013-02-12 12:31 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-02-08 8:43 [PATCH 00/10 v5] ext4: extent status tree (step2) Zheng Liu
2013-02-08 8:43 ` [PATCH 01/10 v5] ext4: refine extent status tree Zheng Liu
2013-02-08 15:35 ` Jan Kara
2013-02-15 6:38 ` Zheng Liu
2013-02-08 8:43 ` [PATCH 02/10 v5] ext4: add physical block and status member into " Zheng Liu
2013-02-08 15:39 ` Jan Kara
2013-02-08 8:43 ` [PATCH 03/10 v5] ext4: let ext4_ext_map_blocks return EXT4_MAP_UNWRITTEN flag Zheng Liu
2013-02-08 15:41 ` Jan Kara
2013-02-08 8:44 ` [PATCH 04/10 v5] ext4: track all extent status in extent status tree Zheng Liu
2013-02-11 12:21 ` Jan Kara
2013-02-15 6:45 ` Zheng Liu
2013-02-13 3:28 ` Theodore Ts'o
2013-02-13 3:46 ` [PATCH 1/2] ext4: rename ext4_es_find_extent() to ext4_es_find_delayed_extent() Theodore Ts'o
2013-02-13 3:46 ` [PATCH 2/2] ext4: track all extent status in extent status tree Theodore Ts'o
2013-02-15 6:53 ` [PATCH 04/10 v5] " Zheng Liu
2013-02-17 16:26 ` Zheng Liu
2013-02-08 8:44 ` [PATCH 05/10 v5] ext4: lookup block mapping " Zheng Liu
2013-02-12 12:31 ` Jan Kara [this message]
2013-02-15 7:06 ` Zheng Liu
2013-02-15 16:47 ` Jan Kara
2013-02-15 17:25 ` Theodore Ts'o
2013-02-16 2:32 ` Zheng Liu
2013-02-16 16:18 ` Possible TODO projects for the map_blocks() code path (was: Re: [PATCH 05/10 v5] ext4: lookup block mapping in extent status tree) Theodore Ts'o
2013-02-17 3:15 ` Zheng Liu
2013-02-08 8:44 ` [PATCH 06/10 v5] ext4: remove single extent cache Zheng Liu
2013-02-08 8:44 ` [PATCH 07/10 v5] ext4: adjust some functions for reclaiming extents from extent status tree Zheng Liu
2013-02-08 8:44 ` [PATCH 08/10 v5] ext4: reclaim " Zheng Liu
2013-02-08 8:44 ` [PATCH 09/10 v5] ext4: convert unwritten extents from extent status tree in end_io Zheng Liu
2013-02-10 8:45 ` Zheng Liu
2013-02-11 1:52 ` Theodore Ts'o
2013-02-12 12:51 ` Jan Kara
2013-02-15 7:12 ` Zheng Liu
2013-02-08 8:44 ` [PATCH 10/10 v5] ext4: remove bogus wait for unwritten extents in ext4_ind_direct_IO Zheng Liu
2013-02-12 12:58 ` Jan Kara
2013-02-15 7:14 ` Zheng Liu
2013-02-10 1:38 ` [PATCH 00/10 v5] ext4: extent status tree (step2) Theodore Ts'o
2013-02-10 8:40 ` Zheng Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130212123142.GC19583@quack.suse.cz \
--to=jack@suse.cz \
--cc=gnehzuil.liu@gmail.com \
--cc=linux-ext4@vger.kernel.org \
--cc=tytso@mit.edu \
--cc=wenqing.lz@taobao.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).