All of lore.kernel.org
 help / color / mirror / Atom feed
From: Liu Bo <bo.li.liu@oracle.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map
Date: Thu, 18 Sep 2014 16:20:24 +0800	[thread overview]
Message-ID: <20140918082023.GC15092@localhost.localdomain> (raw)
In-Reply-To: <541A90A2.5050407@cn.fujitsu.com>

On Thu, Sep 18, 2014 at 03:58:26PM +0800, Qu Wenruo wrote:
> 
[...]
> >
> >original:   (start >= existing->start && start < extent_map_end(existing))
> >this patch: (start < extent_map_end(existing) && start + len > existing->start)
> >
> >(start + len > existing->start) doesn't equal to start >= existing->start,
> >here is a case of (start+len > existing->start) but (start <= existing->start).
> >
> >        |--------| -->(existing)
> >     |--------|    -->[start, start+len)
> >
> >And calling search_extent_mapping() doesn't make sure that
> >(start >= existing->start) is true, either.
> All right, that case is right and I'm wrong.
> I'll change the if() to use start >= existing->start.
> 
> BTW, Any other problem?

Others look good.

thanks,
-liubo

> 
> Thanks,
> Qu
> >
> >>>And one of overlapping cases is (existing->start > start), ie. em->start > start, this is
> >>>against our rule of btrfs_get_extent,
> >>Nope again, this overlapping in fact is quite normal in multithread
> >>random read/write.
> >>The files's [0~16) is a preallocated one,
> >>Thread A:
> >>     write [4K, 8K) into the file, but not committed yet.
> >>     extent map tree contains [0,16K) only
> >>Thread B:
> >>btrfs_get_extent()
> >>     the map_start is 8K, len is 4K as an example
> >>     grab a large em, take [0,16K), since [4K,8K) write is not committed.
> >>     comes to insert: btrfs_release_path(path);
> >>
> >>Thread A:
> >>     [4K, 8K) is not committed
> >>     the extent map is now [0, 4K) [4K, 8K) [8K, 16K).
> >>
> >>Thread B:
> >>     goes to insert: add_extent_mapping()
> >>     the [0,16K) is overlapping, and the returned existing one is [8K, 16K).
> >>     which contains the [map_start, map_start +  len).
> >So this's an example of existing->start == start (both are 8K), not
> >existing->start > start.
> >
> >See __extent_writepage_io(),
> >
> >{
> >		...
> >                 em = epd->get_extent(inode, page, pg_offset, cur,
> >                                      end - cur + 1, 1);
> >                 if (IS_ERR_OR_NULL(em)) {
> >                         SetPageError(page);
> >                         ret = PTR_ERR_OR_ZERO(em);
> >                         break;
> >                 }
> >                 extent_offset = cur - em->start;
> >		^^^^^^^^^^^^^^^^^^^^^^^^^^^ it needs to be (em->start <= cur)
> >
> >		...
> >}
> >
> >thanks,
> >-liubo
> >
> >>>struct extent_map *btrfs_get_extent(...)
> >>>{
> >>>	[...]
> >>>	insert:
> >>>		btrfs_release_path(path);
> >>>		if (em->start > start || extent_map_end(em) <= start) {
> >>>			btrfs_err(root->fs_info, "bad extent! em: [%llu %llu] passed
> >>>	[%llu %llu]",
> >>>				em->start, em->len, start, len);
> >>>			err = -EIO;
> >>>                 goto out;
> >>>         }
> >>>         [...]
> >>>}
> >>>
> >>>thanks,
> >>>-liubo
> >>>
> >>>>+			/*
> >>>>+			 * The existing extent map is the one nearest to
> >>>>+			 * the [start, start + len) range which overlaps
> >>>>+			 */
> >>>>+			err = merge_extent_mapping(em_tree, existing,
> >>>>+						   em, start);
> >>>>  			free_extent_map(existing);
> >>>>-			existing = NULL;
> >>>>-		}
> >>>>-		if (!existing) {
> >>>>-			existing = lookup_extent_mapping(em_tree, em->start,
> >>>>-							 em->len);
> >>>>-			if (existing) {
> >>>>-				err = merge_extent_mapping(em_tree, existing,
> >>>>-							   em, start);
> >>>>-				free_extent_map(existing);
> >>>>-				if (err) {
> >>>>-					free_extent_map(em);
> >>>>-					em = NULL;
> >>>>-				}
> >>>>-			} else {
> >>>>-				err = -EIO;
> >>>>+			if (err) {
> >>>>  				free_extent_map(em);
> >>>>  				em = NULL;
> >>>>  			}
> >>>>-- 
> >>>>2.1.0
> >>>>
> >>>>--
> >>>>To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> >>>>the body of a message to majordomo@vger.kernel.org
> >>>>More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

  reply	other threads:[~2014-09-18  8:20 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-17  3:53 [PATCH] btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map Qu Wenruo
2014-09-18  4:21 ` Liu Bo
2014-09-18  5:36   ` Qu Wenruo
2014-09-18  5:40     ` Qu Wenruo
2014-09-18  7:33     ` Liu Bo
2014-09-18  7:58       ` Qu Wenruo
2014-09-18  8:20         ` Liu Bo [this message]
2014-09-18  8:24           ` Qu Wenruo
2014-09-18  9:01             ` Liu Bo
2014-09-18 13:16 ` Filipe David Manana
2014-09-19  0:31   ` Qu Wenruo
2014-10-08 12:08     ` Filipe David Manana
2014-10-09  0:28       ` Qu Wenruo
2014-10-09 10:27         ` Filipe David Manana
2014-10-10  2:39           ` Qu Wenruo
2014-10-10  8:08             ` Filipe David Manana
2014-10-13  2:47               ` Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140918082023.GC15092@localhost.localdomain \
    --to=bo.li.liu@oracle.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=quwenruo@cn.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.