All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Zheng Liu <gnehzuil.liu@gmail.com>
Cc: linux-ext4@vger.kernel.org, Zheng Liu <wenqing.lz@taobao.com>,
	Theodore Ts'o <tytso@mit.edu>, Jan kara <jack@suse.cz>
Subject: Re: [PATCH 05/10 v5] ext4: lookup block mapping in extent status tree
Date: Tue, 12 Feb 2013 13:31:42 +0100	[thread overview]
Message-ID: <20130212123142.GC19583@quack.suse.cz> (raw)
In-Reply-To: <1360313046-9876-6-git-send-email-wenqing.lz@taobao.com>

On Fri 08-02-13 16:44:01, Zheng Liu wrote:
> From: Zheng Liu <wenqing.lz@taobao.com>
> 
> After tracking all extent status, we already have a extent cache in
> memory.  Every time we want to lookup a block mapping, we can first
> try to lookup it in extent status tree to avoid a potential disk I/O.
> 
> A new function called ext4_es_lookup_extent is defined to finish this
> work.  When we try to lookup a block mapping, we always call
> ext4_map_blocks and/or ext4_da_map_blocks.  So in these functions we
> first try to lookup a block mapping in extent status tree.
> 
> A new flag EXT4_GET_BLOCKS_NO_PUT_HOLE is used in ext4_da_map_blocks
> in order not to put a hole into extent status tree because this hole
> will be converted to delayed extent in the tree immediately.
  It looks somewhat inconsistent that you put hole into the extent tree in
ext4_ext_map_blocks() but all other extent types are handled in
ext4_map_blocks() or ext4_da_map_blocks(). Can we put the handling in one
place?

Otherwise the patch looks OK.

								Honza

> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
> Cc: "Theodore Ts'o" <tytso@mit.edu>
> Cc: Jan kara <jack@suse.cz>
> ---
>  fs/ext4/ext4.h              |  2 ++
>  fs/ext4/extents.c           |  7 ++++-
>  fs/ext4/extents_status.c    | 59 +++++++++++++++++++++++++++++++++++++++++
>  fs/ext4/extents_status.h    |  1 +
>  fs/ext4/inode.c             | 64 +++++++++++++++++++++++++++++++++++++++++++--
>  include/trace/events/ext4.h | 56 +++++++++++++++++++++++++++++++++++++++
>  6 files changed, 186 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 8462eb3..ad885b5 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -582,6 +582,8 @@ enum {
>  #define EXT4_GET_BLOCKS_KEEP_SIZE		0x0080
>  	/* Do not take i_data_sem locking in ext4_map_blocks */
>  #define EXT4_GET_BLOCKS_NO_LOCK			0x0100
> +	/* Do not put hole in extent cache */
> +#define EXT4_GET_BLOCKS_NO_PUT_HOLE		0x0200
>  
>  /*
>   * Flags used by ext4_free_blocks
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 4b065ff..1be8955 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -2154,6 +2154,8 @@ ext4_ext_put_gap_in_cache(struct inode *inode, struct ext4_ext_path *path,
>  				block,
>  				le32_to_cpu(ex->ee_block),
>  				 ext4_ext_get_actual_len(ex));
> +		ext4_es_insert_extent(inode, lblock, len, ~0,
> +				      EXTENT_STATUS_HOLE);
>  	} else if (block >= le32_to_cpu(ex->ee_block)
>  			+ ext4_ext_get_actual_len(ex)) {
>  		ext4_lblk_t next;
> @@ -2167,6 +2169,8 @@ ext4_ext_put_gap_in_cache(struct inode *inode, struct ext4_ext_path *path,
>  				block);
>  		BUG_ON(next == lblock);
>  		len = next - lblock;
> +		ext4_es_insert_extent(inode, lblock, len, ~0,
> +				      EXTENT_STATUS_HOLE);
>  	} else {
>  		lblock = len = 0;
>  		BUG();
> @@ -4006,7 +4010,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
>  		 * put just found gap into cache to speed up
>  		 * subsequent requests
>  		 */
> -		ext4_ext_put_gap_in_cache(inode, path, map->m_lblk);
> +		if ((flags & EXT4_GET_BLOCKS_NO_PUT_HOLE) == 0)
> +			ext4_ext_put_gap_in_cache(inode, path, map->m_lblk);
>  		goto out2;
>  	}
>  
> diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
> index 71cb75a..ca7dc9f 100644
> --- a/fs/ext4/extents_status.c
> +++ b/fs/ext4/extents_status.c
> @@ -468,6 +468,65 @@ error:
>  	return err;
>  }
>  
> +/*
> + * ext4_es_lookup_extent() looks up an extent in extent status tree.
> + *
> + * ext4_es_lookup_extent is called by ext4_map_blocks/ext4_da_map_blocks.
> + *
> + * Return: 1 on found, 0 on not
> + */
> +int ext4_es_lookup_extent(struct inode *inode, struct extent_status *es)
> +{
> +	struct ext4_es_tree *tree;
> +	struct extent_status *es1 = NULL;
> +	struct rb_node *node;
> +	int found = 0;
> +
> +	trace_ext4_es_lookup_extent_enter(inode, es->es_lblk);
> +	es_debug("lookup extent in block %u\n", es->es_lblk);
> +
> +	tree = &EXT4_I(inode)->i_es_tree;
> +	read_lock(&EXT4_I(inode)->i_es_lock);
> +
> +	/* find extent in cache firstly */
> +	es->es_len = es->es_pblk = 0;
> +	if (tree->cache_es) {
> +		es1 = tree->cache_es;
> +		if (in_range(es->es_lblk, es1->es_lblk, es1->es_len)) {
> +			es_debug("%u cached by [%u/%u)\n",
> +				 es->es_lblk, es1->es_lblk, es1->es_len);
> +			found = 1;
> +			goto out;
> +		}
> +	}
> +
> +	node = tree->root.rb_node;
> +	while (node) {
> +		es1 = rb_entry(node, struct extent_status, rb_node);
> +		if (es->es_lblk < es1->es_lblk)
> +			node = node->rb_left;
> +		else if (es->es_lblk > ext4_es_end(es1))
> +			node = node->rb_right;
> +		else {
> +			found = 1;
> +			break;
> +		}
> +	}
> +
> +out:
> +	if (found) {
> +		BUG_ON(!es1);
> +		es->es_lblk = es1->es_lblk;
> +		es->es_len = es1->es_len;
> +		es->es_pblk = es1->es_pblk;
> +	}
> +
> +	read_unlock(&EXT4_I(inode)->i_es_lock);
> +
> +	trace_ext4_es_lookup_extent_exit(inode, es, found);
> +	return found;
> +}
> +
>  static int __es_remove_extent(struct ext4_es_tree *tree, ext4_lblk_t lblk,
>  				 ext4_lblk_t end)
>  {
> diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
> index b5788eb..effe78c 100644
> --- a/fs/ext4/extents_status.h
> +++ b/fs/ext4/extents_status.h
> @@ -53,6 +53,7 @@ extern int ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
>  				 ext4_lblk_t len);
>  extern ext4_lblk_t ext4_es_find_delayed_extent(struct inode *inode,
>  					       struct extent_status *es);
> +extern int ext4_es_lookup_extent(struct inode *inode, struct extent_status *es);
>  
>  static inline int ext4_es_is_written(struct extent_status *es)
>  {
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 16454fc..670779a 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -508,12 +508,34 @@ static pgoff_t ext4_num_dirty_pages(struct inode *inode, pgoff_t idx,
>  int ext4_map_blocks(handle_t *handle, struct inode *inode,
>  		    struct ext4_map_blocks *map, int flags)
>  {
> +	struct extent_status es;
>  	int retval;
>  
>  	map->m_flags = 0;
>  	ext_debug("ext4_map_blocks(): inode %lu, flag %d, max_blocks %u,"
>  		  "logical block %lu\n", inode->i_ino, flags, map->m_len,
>  		  (unsigned long) map->m_lblk);
> +
> +	/* Lookup extent status tree firstly */
> +	es.es_lblk = map->m_lblk;
> +	if (ext4_es_lookup_extent(inode, &es)) {
> +		if (ext4_es_is_written(&es) || ext4_es_is_unwritten(&es)) {
> +			map->m_pblk = ext4_es_pblock(&es) +
> +					map->m_lblk - es.es_lblk;
> +			map->m_flags |= ext4_es_is_written(&es) ?
> +					EXT4_MAP_MAPPED : EXT4_MAP_UNWRITTEN;
> +			retval = es.es_len - (map->m_lblk - es.es_lblk);
> +			if (retval > map->m_len)
> +				retval = map->m_len;
> +			map->m_len = retval;
> +		} else if (ext4_es_is_delayed(&es) || ext4_es_is_hole(&es)) {
> +			retval = 0;
> +		} else {
> +			BUG_ON(1);
> +		}
> +		goto found;
> +	}
> +
>  	/*
>  	 * Try to see if we can get the block without requesting a new
>  	 * file system block.
> @@ -541,6 +563,7 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
>  	if (!(flags & EXT4_GET_BLOCKS_NO_LOCK))
>  		up_read((&EXT4_I(inode)->i_data_sem));
>  
> +found:
>  	if (retval > 0 && map->m_flags & EXT4_MAP_MAPPED) {
>  		int ret = check_block_validity(inode, map);
>  		if (ret != 0)
> @@ -1772,6 +1795,7 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
>  			      struct ext4_map_blocks *map,
>  			      struct buffer_head *bh)
>  {
> +	struct extent_status es;
>  	int retval;
>  	sector_t invalid_block = ~((sector_t) 0xffff);
>  
> @@ -1782,6 +1806,39 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
>  	ext_debug("ext4_da_map_blocks(): inode %lu, max_blocks %u,"
>  		  "logical block %lu\n", inode->i_ino, map->m_len,
>  		  (unsigned long) map->m_lblk);
> +
> +	/* Lookup extent status tree firstly */
> +	es.es_lblk = iblock;
> +	if (ext4_es_lookup_extent(inode, &es)) {
> +
> +		if (ext4_es_is_hole(&es)) {
> +			retval = 0;
> +			down_read((&EXT4_I(inode)->i_data_sem));
> +			goto add_delayed;
> +		}
> +
> +		if (ext4_es_is_delayed(&es)) {
> +			map_bh(bh, inode->i_sb, invalid_block);
> +			set_buffer_new(bh);
> +			set_buffer_delay(bh);
> +			return 0;
> +		}
> +
> +		map->m_pblk = ext4_es_pblock(&es) + iblock - es.es_lblk;
> +		retval = es.es_len - (iblock - es.es_lblk);
> +		if (retval > map->m_len)
> +			retval = map->m_len;
> +		map->m_len = retval;
> +		if (ext4_es_is_written(&es))
> +			map->m_flags |= EXT4_MAP_MAPPED;
> +		else if (ext4_es_is_unwritten(&es))
> +			map->m_flags |= EXT4_MAP_UNWRITTEN;
> +		else
> +			BUG_ON(1);
> +
> +		return retval;
> +	}
> +
>  	/*
>  	 * Try to see if we can get the block without requesting a new
>  	 * file system block.
> @@ -1800,10 +1857,13 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
>  			map->m_flags |= EXT4_MAP_FROM_CLUSTER;
>  		retval = 0;
>  	} else if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
> -		retval = ext4_ext_map_blocks(NULL, inode, map, 0);
> +		retval = ext4_ext_map_blocks(NULL, inode, map,
> +					     EXT4_GET_BLOCKS_NO_PUT_HOLE);
>  	else
> -		retval = ext4_ind_map_blocks(NULL, inode, map, 0);
> +		retval = ext4_ind_map_blocks(NULL, inode, map,
> +					     EXT4_GET_BLOCKS_NO_PUT_HOLE);
>  
> +add_delayed:
>  	if (retval == 0) {
>  		int ret;
>  		/*
> diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
> index d278ced..822780a 100644
> --- a/include/trace/events/ext4.h
> +++ b/include/trace/events/ext4.h
> @@ -2177,6 +2177,62 @@ TRACE_EVENT(ext4_es_find_delayed_extent_exit,
>  		  __entry->pblk, __entry->status, __entry->ret)
>  );
>  
> +TRACE_EVENT(ext4_es_lookup_extent_enter,
> +	TP_PROTO(struct inode *inode, ext4_lblk_t lblk),
> +
> +	TP_ARGS(inode, lblk),
> +
> +	TP_STRUCT__entry(
> +		__field(	dev_t,		dev		)
> +		__field(	ino_t,		ino		)
> +		__field(	ext4_lblk_t,	lblk		)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->dev	= inode->i_sb->s_dev;
> +		__entry->ino	= inode->i_ino;
> +		__entry->lblk	= lblk;
> +	),
> +
> +	TP_printk("dev %d,%d ino %lu lblk %u",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev),
> +		  (unsigned long) __entry->ino, __entry->lblk)
> +);
> +
> +TRACE_EVENT(ext4_es_lookup_extent_exit,
> +	TP_PROTO(struct inode *inode, struct extent_status *es,
> +		 int found),
> +
> +	TP_ARGS(inode, es, found),
> +
> +	TP_STRUCT__entry(
> +		__field(	dev_t,		dev		)
> +		__field(	ino_t,		ino		)
> +		__field(	ext4_lblk_t,	lblk		)
> +		__field(	ext4_lblk_t,	len		)
> +		__field(	ext4_fsblk_t,	pblk		)
> +		__field(	unsigned long long,	status	)
> +		__field(	int,		found		)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->dev	= inode->i_sb->s_dev;
> +		__entry->ino	= inode->i_ino;
> +		__entry->lblk	= es->es_lblk;
> +		__entry->len	= es->es_len;
> +		__entry->pblk	= ext4_es_pblock(es);
> +		__entry->status	= ext4_es_status(es);
> +		__entry->found	= found;
> +	),
> +
> +	TP_printk("dev %d,%d ino %lu found %d [%u/%u) %llu %llx",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev),
> +		  (unsigned long) __entry->ino, __entry->found,
> +		  __entry->lblk, __entry->len,
> +		  __entry->found ? __entry->pblk : 0,
> +		  __entry->found ? __entry->status : 0)
> +);
> +
>  #endif /* _TRACE_EXT4_H */
>  
>  /* This part must be outside protection */
> -- 
> 1.7.12.rc2.18.g61b472e
> 
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

  reply	other threads:[~2013-02-12 12:31 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-02-08  8:43 [PATCH 00/10 v5] ext4: extent status tree (step2) Zheng Liu
2013-02-08  8:43 ` [PATCH 01/10 v5] ext4: refine extent status tree Zheng Liu
2013-02-08 15:35   ` Jan Kara
2013-02-15  6:38     ` Zheng Liu
2013-02-08  8:43 ` [PATCH 02/10 v5] ext4: add physical block and status member into " Zheng Liu
2013-02-08 15:39   ` Jan Kara
2013-02-08  8:43 ` [PATCH 03/10 v5] ext4: let ext4_ext_map_blocks return EXT4_MAP_UNWRITTEN flag Zheng Liu
2013-02-08 15:41   ` Jan Kara
2013-02-08  8:44 ` [PATCH 04/10 v5] ext4: track all extent status in extent status tree Zheng Liu
2013-02-11 12:21   ` Jan Kara
2013-02-15  6:45     ` Zheng Liu
2013-02-13  3:28   ` Theodore Ts'o
2013-02-13  3:46     ` [PATCH 1/2] ext4: rename ext4_es_find_extent() to ext4_es_find_delayed_extent() Theodore Ts'o
2013-02-13  3:46       ` [PATCH 2/2] ext4: track all extent status in extent status tree Theodore Ts'o
2013-02-15  6:53     ` [PATCH 04/10 v5] " Zheng Liu
2013-02-17 16:26     ` Zheng Liu
2013-02-08  8:44 ` [PATCH 05/10 v5] ext4: lookup block mapping " Zheng Liu
2013-02-12 12:31   ` Jan Kara [this message]
2013-02-15  7:06     ` Zheng Liu
2013-02-15 16:47       ` Jan Kara
2013-02-15 17:25       ` Theodore Ts'o
2013-02-16  2:32         ` Zheng Liu
2013-02-16 16:18           ` Possible TODO projects for the map_blocks() code path (was: Re: [PATCH 05/10 v5] ext4: lookup block mapping in extent status tree) Theodore Ts'o
2013-02-17  3:15             ` Zheng Liu
2013-02-08  8:44 ` [PATCH 06/10 v5] ext4: remove single extent cache Zheng Liu
2013-02-08  8:44 ` [PATCH 07/10 v5] ext4: adjust some functions for reclaiming extents from extent status tree Zheng Liu
2013-02-08  8:44 ` [PATCH 08/10 v5] ext4: reclaim " Zheng Liu
2013-02-08  8:44 ` [PATCH 09/10 v5] ext4: convert unwritten extents from extent status tree in end_io Zheng Liu
2013-02-10  8:45   ` Zheng Liu
2013-02-11  1:52     ` Theodore Ts'o
2013-02-12 12:51   ` Jan Kara
2013-02-15  7:12     ` Zheng Liu
2013-02-08  8:44 ` [PATCH 10/10 v5] ext4: remove bogus wait for unwritten extents in ext4_ind_direct_IO Zheng Liu
2013-02-12 12:58   ` Jan Kara
2013-02-15  7:14     ` Zheng Liu
2013-02-10  1:38 ` [PATCH 00/10 v5] ext4: extent status tree (step2) Theodore Ts'o
2013-02-10  8:40   ` Zheng Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130212123142.GC19583@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=gnehzuil.liu@gmail.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    --cc=wenqing.lz@taobao.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.