linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent
@ 2018-01-05 19:51 Liu Bo
  2018-01-05 19:51 ` [PATCH v2 01/10] Btrfs: fix incorrect block_len in merge_extent_mapping Liu Bo
                   ` (10 more replies)
  0 siblings, 11 replies; 26+ messages in thread
From: Liu Bo @ 2018-01-05 19:51 UTC (permalink / raw)
  To: linux-btrfs

Although
commit e6c4efd87ab0 ("btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map")
fixed up the negetive em->len, it has introduced several regressions, several has been fixed by

commit 32be3a1ac6d0 ("btrfs: Fix the wrong condition judgment about subset extent map"),
commit 8dff9c853410 ("Btrfs: deal with duplciates during extent_map insertion in btrfs_get_extent") and
commit 8e2bd3b7fac9 ("Btrfs: deal with existing encompassing extent map in btrfs_get_extent()").

Unfortunately, there is one more regression which is caught recently by a
user's workloads.

While debugging the above issue, I found that all of these bugs are caused
by some racy situations, which can be very tricky to reproduce, so I
created several extent map specific test cases in btrfs's selftest
framework.

Patch 1-2 are fixing two bugs.
Patch 3-4 are some preparatory work.
Patch 3-5 are regression tests about the logic of handling EEXIST from
adding extent map.
Patch 8-10 are debugging wise, one is a direct tracepoint and the other is
to enable kprobe on merge_extent_mapping.

v2:
- Improve commit log to provide more details about the bug.
- Adjust bugfixes to the front so that we can merge them firstly.

Liu Bo (10):
  Btrfs: fix incorrect block_len in merge_extent_mapping
  Btrfs: fix unexpected EEXIST from btrfs_get_extent
  Btrfs: add helper for em merge logic
  Btrfs: move extent map specific code to extent_map.c
  Btrfs: add extent map selftests
  Btrfs: extent map selftest: buffered write vs dio read
  Btrfs: extent map selftest: dio write vs dio read
  Btrfs: add WARN_ONCE to detect unexpected error from
    merge_extent_mapping
  Btrfs: add tracepoint for em's EEXIST case
  Btrfs: noinline merge_extent_mapping

 fs/btrfs/Makefile                 |   2 +-
 fs/btrfs/extent_map.c             | 134 ++++++++++++++
 fs/btrfs/extent_map.h             |   2 +
 fs/btrfs/inode.c                  | 108 +-----------
 fs/btrfs/tests/btrfs-tests.c      |   3 +
 fs/btrfs/tests/btrfs-tests.h      |   1 +
 fs/btrfs/tests/extent-map-tests.c | 363 ++++++++++++++++++++++++++++++++++++++
 include/trace/events/btrfs.h      |  35 ++++
 8 files changed, 540 insertions(+), 108 deletions(-)
 create mode 100644 fs/btrfs/tests/extent-map-tests.c

-- 
2.9.4


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 01/10] Btrfs: fix incorrect block_len in merge_extent_mapping
  2018-01-05 19:51 [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent Liu Bo
@ 2018-01-05 19:51 ` Liu Bo
  2018-01-09 17:24   ` Josef Bacik
  2018-01-05 19:51 ` [PATCH v2 02/10] Btrfs: fix unexpected EEXIST from btrfs_get_extent Liu Bo
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 26+ messages in thread
From: Liu Bo @ 2018-01-05 19:51 UTC (permalink / raw)
  To: linux-btrfs

%block_len could be checked on deciding if two em are mergeable.

merge_extent_mapping() has only added the front pad if the front part
of em gets truncated, but it's possible that the end part gets
truncated.

For both compressed extent and inline extent, em->block_len is not
adjusted accordingly, and for regular extent, em->block_len always
equals to em->len, hence this sets em->block_len with em->len.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index e1a7f3c..2784bb3 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6860,7 +6860,7 @@ static int merge_extent_mapping(struct extent_map_tree *em_tree,
 	if (em->block_start < EXTENT_MAP_LAST_BYTE &&
 	    !test_bit(EXTENT_FLAG_COMPRESSED, &em->flags)) {
 		em->block_start += start_diff;
-		em->block_len -= start_diff;
+		em->block_len = em->len;
 	}
 	return add_extent_mapping(em_tree, em, 0);
 }
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 02/10] Btrfs: fix unexpected EEXIST from btrfs_get_extent
  2018-01-05 19:51 [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent Liu Bo
  2018-01-05 19:51 ` [PATCH v2 01/10] Btrfs: fix incorrect block_len in merge_extent_mapping Liu Bo
@ 2018-01-05 19:51 ` Liu Bo
  2018-01-09 17:27   ` Josef Bacik
  2018-01-05 19:51 ` [PATCH v2 03/10] Btrfs: add helper for em merge logic Liu Bo
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 26+ messages in thread
From: Liu Bo @ 2018-01-05 19:51 UTC (permalink / raw)
  To: linux-btrfs

This fixes a corner case that is caused by a race of dio write vs dio
read/write.

Here is how the race could happen.

Suppose that no extent map has been loaded into memory yet.
There is a file extent [0, 32K), two jobs are running concurrently
against it, t1 is doing dio write to [8K, 32K) and t2 is doing dio
read from [0, 4K) or [4K, 8K).

t1 goes ahead of t2 and splits em [0, 32K) to em [0K, 8K) and [8K 32K).

------------------------------------------------------
             t1                                t2
      btrfs_get_blocks_direct()         btrfs_get_blocks_direct()
       -> btrfs_get_extent()              -> btrfs_get_extent()
           -> lookup_extent_mapping()
           -> add_extent_mapping()            -> lookup_extent_mapping()
              # load [0, 32K)
       -> btrfs_new_extent_direct()
           -> btrfs_drop_extent_cache()
              # split [0, 32K) and
	      # drop [8K, 32K)
           -> add_extent_mapping()
              # add [8K, 32K)
                                              -> add_extent_mapping()
                                                 # handle -EEXIST when adding
                                                 # [0, 32K)
------------------------------------------------------
About how t2(dio read/write) runs into -EEXIST:

a) add_extent_mapping() gets -EEXIST for adding em [0, 32k),

b) search_extent_mapping() then returns [0, 8k) as the existing em,
   even though start == existing->start, em is [0, 32k) so that
   extent_map_end(em) > extent_map_end(existing), i.e. 32k > 8k,

c) then it goes thru merge_extent_mapping() which tries to add a [8k, 8k)
   (with a length 0) and returns -EEXIST as [8k, 32k) is already in tree,

d) so btrfs_get_extent() ends up returning -EEXIST to dio read/write,
   which is confusing applications.

Here I conclude all the possible situations,
1) start < existing->start

            +-----------+em+-----------+
+--prev---+ |     +-------------+      |
|         | |     |             |      |
+---------+ +     +---+existing++      ++
                +
                |
                +
             start

2) start == existing->start

      +------------em------------+
      |     +-------------+      |
      |     |             |      |
      +     +----existing-+      +
            |
            |
            +
         start

3) start > existing->start && start < (existing->start + existing->len)

      +------------em------------+
      |     +-------------+      |
      |     |             |      |
      +     +----existing-+      +
               |
               |
               +
             start

4) start >= (existing->start + existing->len)

+-----------+em+-----------+
|     +-------------+      | +--next---+
|     |             |      | |         |
+     +---+existing++      + +---------+
                      +
                      |
                      +
                   start

As we can see, it turns out that if start is within existing em (front
inclusive), then the existing em should be returned as is, otherwise,
we try our best to merge candidate em with sibling ems to form a
larger em (in order to reduce the total number of em).

Reported-by: David Vallender <david.vallender@landmark.co.uk>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
v2: Improve commit log to provide more details about the bug.

 fs/btrfs/inode.c | 17 +++--------------
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 2784bb3..a270fe2 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7162,19 +7162,12 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 		 * existing will always be non-NULL, since there must be
 		 * extent causing the -EEXIST.
 		 */
-		if (existing->start == em->start &&
-		    extent_map_end(existing) >= extent_map_end(em) &&
-		    em->block_start == existing->block_start) {
-			/*
-			 * The existing extent map already encompasses the
-			 * entire extent map we tried to add.
-			 */
+		if (start >= existing->start &&
+		    start < extent_map_end(existing)) {
 			free_extent_map(em);
 			em = existing;
 			err = 0;
-
-		} else if (start >= extent_map_end(existing) ||
-		    start <= existing->start) {
+		} else {
 			/*
 			 * The existing extent map is the one nearest to
 			 * the [start, start + len) range which overlaps
@@ -7186,10 +7179,6 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 				free_extent_map(em);
 				em = NULL;
 			}
-		} else {
-			free_extent_map(em);
-			em = existing;
-			err = 0;
 		}
 	}
 	write_unlock(&em_tree->lock);
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 03/10] Btrfs: add helper for em merge logic
  2018-01-05 19:51 [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent Liu Bo
  2018-01-05 19:51 ` [PATCH v2 01/10] Btrfs: fix incorrect block_len in merge_extent_mapping Liu Bo
  2018-01-05 19:51 ` [PATCH v2 02/10] Btrfs: fix unexpected EEXIST from btrfs_get_extent Liu Bo
@ 2018-01-05 19:51 ` Liu Bo
  2018-01-09 17:27   ` Josef Bacik
  2018-01-05 19:51 ` [PATCH v2 04/10] Btrfs: move extent map specific code to extent_map.c Liu Bo
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 26+ messages in thread
From: Liu Bo @ 2018-01-05 19:51 UTC (permalink / raw)
  To: linux-btrfs

This is a prepare work for the following extent map selftest, which
runs tests against em merge logic.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/ctree.h |  2 ++
 fs/btrfs/inode.c | 80 ++++++++++++++++++++++++++++++++------------------------
 2 files changed, 48 insertions(+), 34 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index b2e09fe..328f40f 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3148,6 +3148,8 @@ struct btrfs_delalloc_work *btrfs_alloc_delalloc_work(struct inode *inode,
 						    int delay_iput);
 void btrfs_wait_and_free_delalloc_work(struct btrfs_delalloc_work *work);
 
+int btrfs_add_extent_mapping(struct extent_map_tree *em_tree,
+			     struct extent_map **em_in, u64 start, u64 len);
 struct extent_map *btrfs_get_extent_fiemap(struct btrfs_inode *inode,
 		struct page *page, size_t pg_offset, u64 start,
 		u64 len, int create);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index a270fe2..876118c 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6911,6 +6911,51 @@ static noinline int uncompress_inline(struct btrfs_path *path,
 	return ret;
 }
 
+int btrfs_add_extent_mapping(struct extent_map_tree *em_tree,
+			     struct extent_map **em_in, u64 start, u64 len)
+{
+	int ret;
+	struct extent_map *em = *em_in;
+
+	ret = add_extent_mapping(em_tree, em, 0);
+	/* it is possible that someone inserted the extent into the tree
+	 * while we had the lock dropped.  It is also possible that
+	 * an overlapping map exists in the tree
+	 */
+	if (ret == -EEXIST) {
+		struct extent_map *existing;
+
+		ret = 0;
+
+		existing = search_extent_mapping(em_tree, start, len);
+		/*
+		 * existing will always be non-NULL, since there must be
+		 * extent causing the -EEXIST.
+		 */
+		if (start >= existing->start &&
+		    start < extent_map_end(existing)) {
+			free_extent_map(em);
+			*em_in = existing;
+			ret = 0;
+		} else {
+			/*
+			 * The existing extent map is the one nearest to
+			 * the [start, start + len) range which overlaps
+			 */
+			ret = merge_extent_mapping(em_tree, existing,
+						   em, start);
+			free_extent_map(existing);
+			if (ret) {
+				free_extent_map(em);
+				*em_in = NULL;
+			}
+		}
+	}
+
+	ASSERT(ret == 0 || ret == -EEXIST);
+	return ret;
+}
+
 /*
  * a bit scary, this does extent mapping from logical file offset to the disk.
  * the ugly parts come from merging extents from the disk with the in-ram
@@ -7147,40 +7192,7 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 
 	err = 0;
 	write_lock(&em_tree->lock);
-	ret = add_extent_mapping(em_tree, em, 0);
-	/* it is possible that someone inserted the extent into the tree
-	 * while we had the lock dropped.  It is also possible that
-	 * an overlapping map exists in the tree
-	 */
-	if (ret == -EEXIST) {
-		struct extent_map *existing;
-
-		ret = 0;
-
-		existing = search_extent_mapping(em_tree, start, len);
-		/*
-		 * existing will always be non-NULL, since there must be
-		 * extent causing the -EEXIST.
-		 */
-		if (start >= existing->start &&
-		    start < extent_map_end(existing)) {
-			free_extent_map(em);
-			em = existing;
-			err = 0;
-		} else {
-			/*
-			 * The existing extent map is the one nearest to
-			 * the [start, start + len) range which overlaps
-			 */
-			err = merge_extent_mapping(em_tree, existing,
-						   em, start);
-			free_extent_map(existing);
-			if (err) {
-				free_extent_map(em);
-				em = NULL;
-			}
-		}
-	}
+	err = btrfs_add_extent_mapping(em_tree, &em, start, len);
 	write_unlock(&em_tree->lock);
 out:
 
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 04/10] Btrfs: move extent map specific code to extent_map.c
  2018-01-05 19:51 [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent Liu Bo
                   ` (2 preceding siblings ...)
  2018-01-05 19:51 ` [PATCH v2 03/10] Btrfs: add helper for em merge logic Liu Bo
@ 2018-01-05 19:51 ` Liu Bo
  2018-01-09 17:29   ` Josef Bacik
  2018-01-05 19:51 ` [PATCH v2 05/10] Btrfs: add extent map selftests Liu Bo
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 26+ messages in thread
From: Liu Bo @ 2018-01-05 19:51 UTC (permalink / raw)
  To: linux-btrfs

These helpers are extent map specific, move them to extent_map.c.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/ctree.h      |   2 -
 fs/btrfs/extent_map.c | 125 ++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/extent_map.h |   2 +
 fs/btrfs/inode.c      | 107 ------------------------------------------
 4 files changed, 127 insertions(+), 109 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 328f40f..b2e09fe 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3148,8 +3148,6 @@ struct btrfs_delalloc_work *btrfs_alloc_delalloc_work(struct inode *inode,
 						    int delay_iput);
 void btrfs_wait_and_free_delalloc_work(struct btrfs_delalloc_work *work);
 
-int btrfs_add_extent_mapping(struct extent_map_tree *em_tree,
-			     struct extent_map **em_in, u64 start, u64 len);
 struct extent_map *btrfs_get_extent_fiemap(struct btrfs_inode *inode,
 		struct page *page, size_t pg_offset, u64 start,
 		u64 len, int create);
diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index 2e348fb..6fe8b14 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -454,3 +454,128 @@ void replace_extent_mapping(struct extent_map_tree *tree,
 
 	setup_extent_mapping(tree, new, modified);
 }
+
+static struct extent_map *next_extent_map(struct extent_map *em)
+{
+	struct rb_node *next;
+
+	next = rb_next(&em->rb_node);
+	if (!next)
+		return NULL;
+	return container_of(next, struct extent_map, rb_node);
+}
+
+static struct extent_map *prev_extent_map(struct extent_map *em)
+{
+	struct rb_node *prev;
+
+	prev = rb_prev(&em->rb_node);
+	if (!prev)
+		return NULL;
+	return container_of(prev, struct extent_map, rb_node);
+}
+
+/* helper for btfs_get_extent.  Given an existing extent in the tree,
+ * the existing extent is the nearest extent to map_start,
+ * and an extent that you want to insert, deal with overlap and insert
+ * the best fitted new extent into the tree.
+ */
+static int merge_extent_mapping(struct extent_map_tree *em_tree,
+				struct extent_map *existing,
+				struct extent_map *em,
+				u64 map_start)
+{
+	struct extent_map *prev;
+	struct extent_map *next;
+	u64 start;
+	u64 end;
+	u64 start_diff;
+
+	BUG_ON(map_start < em->start || map_start >= extent_map_end(em));
+
+	if (existing->start > map_start) {
+		next = existing;
+		prev = prev_extent_map(next);
+	} else {
+		prev = existing;
+		next = next_extent_map(prev);
+	}
+
+	start = prev ? extent_map_end(prev) : em->start;
+	start = max_t(u64, start, em->start);
+	end = next ? next->start : extent_map_end(em);
+	end = min_t(u64, end, extent_map_end(em));
+	start_diff = start - em->start;
+	em->start = start;
+	em->len = end - start;
+	if (em->block_start < EXTENT_MAP_LAST_BYTE &&
+	    !test_bit(EXTENT_FLAG_COMPRESSED, &em->flags)) {
+		em->block_start += start_diff;
+		em->block_len = em->len;
+	}
+	return add_extent_mapping(em_tree, em, 0);
+}
+
+/**
+ * btrfs_add_extent_mapping - add extent mapping into em_tree
+ * @em_tree - the extent tree into which we want to insert the extent mapping
+ * @em_in   - extent we are inserting
+ * @start   - start of the logical range btrfs_get_extent() is requesting
+ * @len     - length of the logical range btrfs_get_extent() is requesting
+ *
+ * Note that @em_in's range may be different from [start, start+len),
+ * but they must be overlapped.
+ *
+ * Insert @em_in into @em_tree. In case there is an overlapping range, handle
+ * the -EEXIST by either:
+ * a) Returning the existing extent in @em_in if @start is within the
+ *    existing em.
+ * b) Merge the existing extent with @em_in passed in.
+ *
+ * Return 0 on success, otherwise -EEXIST.
+ *
+ */
+int btrfs_add_extent_mapping(struct extent_map_tree *em_tree,
+			     struct extent_map **em_in, u64 start, u64 len)
+{
+	int ret;
+	struct extent_map *em = *em_in;
+
+	ret = add_extent_mapping(em_tree, em, 0);
+	/* it is possible that someone inserted the extent into the tree
+	 * while we had the lock dropped.  It is also possible that
+	 * an overlapping map exists in the tree
+	 */
+	if (ret == -EEXIST) {
+		struct extent_map *existing;
+
+		ret = 0;
+
+		existing = search_extent_mapping(em_tree, start, len);
+		/*
+		 * existing will always be non-NULL, since there must be
+		 * extent causing the -EEXIST.
+		 */
+		if (start >= existing->start &&
+		    start < extent_map_end(existing)) {
+			free_extent_map(em);
+			*em_in = existing;
+			ret = 0;
+		} else {
+			/*
+			 * The existing extent map is the one nearest to
+			 * the [start, start + len) range which overlaps
+			 */
+			ret = merge_extent_mapping(em_tree, existing,
+						   em, start);
+			free_extent_map(existing);
+			if (ret) {
+				free_extent_map(em);
+				*em_in = NULL;
+			}
+		}
+	}
+
+	ASSERT(ret == 0 || ret == -EEXIST);
+	return ret;
+}
diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h
index 64365bb..40f34ad 100644
--- a/fs/btrfs/extent_map.h
+++ b/fs/btrfs/extent_map.h
@@ -92,4 +92,6 @@ int unpin_extent_cache(struct extent_map_tree *tree, u64 start, u64 len, u64 gen
 void clear_em_logging(struct extent_map_tree *tree, struct extent_map *em);
 struct extent_map *search_extent_mapping(struct extent_map_tree *tree,
 					 u64 start, u64 len);
+int btrfs_add_extent_mapping(struct extent_map_tree *em_tree,
+			     struct extent_map **em_in, u64 start, u64 len);
 #endif
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 876118c..b2272f9 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6803,68 +6803,6 @@ static int btrfs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 	goto out_fail;
 }
 
-/* Find next extent map of a given extent map, caller needs to ensure locks */
-static struct extent_map *next_extent_map(struct extent_map *em)
-{
-	struct rb_node *next;
-
-	next = rb_next(&em->rb_node);
-	if (!next)
-		return NULL;
-	return container_of(next, struct extent_map, rb_node);
-}
-
-static struct extent_map *prev_extent_map(struct extent_map *em)
-{
-	struct rb_node *prev;
-
-	prev = rb_prev(&em->rb_node);
-	if (!prev)
-		return NULL;
-	return container_of(prev, struct extent_map, rb_node);
-}
-
-/* helper for btfs_get_extent.  Given an existing extent in the tree,
- * the existing extent is the nearest extent to map_start,
- * and an extent that you want to insert, deal with overlap and insert
- * the best fitted new extent into the tree.
- */
-static int merge_extent_mapping(struct extent_map_tree *em_tree,
-				struct extent_map *existing,
-				struct extent_map *em,
-				u64 map_start)
-{
-	struct extent_map *prev;
-	struct extent_map *next;
-	u64 start;
-	u64 end;
-	u64 start_diff;
-
-	BUG_ON(map_start < em->start || map_start >= extent_map_end(em));
-
-	if (existing->start > map_start) {
-		next = existing;
-		prev = prev_extent_map(next);
-	} else {
-		prev = existing;
-		next = next_extent_map(prev);
-	}
-
-	start = prev ? extent_map_end(prev) : em->start;
-	start = max_t(u64, start, em->start);
-	end = next ? next->start : extent_map_end(em);
-	end = min_t(u64, end, extent_map_end(em));
-	start_diff = start - em->start;
-	em->start = start;
-	em->len = end - start;
-	if (em->block_start < EXTENT_MAP_LAST_BYTE &&
-	    !test_bit(EXTENT_FLAG_COMPRESSED, &em->flags)) {
-		em->block_start += start_diff;
-		em->block_len = em->len;
-	}
-	return add_extent_mapping(em_tree, em, 0);
-}
-
 static noinline int uncompress_inline(struct btrfs_path *path,
 				      struct page *page,
 				      size_t pg_offset, u64 extent_offset,
@@ -6911,51 +6849,6 @@ static noinline int uncompress_inline(struct btrfs_path *path,
 	return ret;
 }
 
-int btrfs_add_extent_mapping(struct extent_map_tree *em_tree,
-			     struct extent_map **em_in, u64 start, u64 len)
-{
-	int ret;
-	struct extent_map *em = *em_in;
-
-	ret = add_extent_mapping(em_tree, em, 0);
-	/* it is possible that someone inserted the extent into the tree
-	 * while we had the lock dropped.  It is also possible that
-	 * an overlapping map exists in the tree
-	 */
-	if (ret == -EEXIST) {
-		struct extent_map *existing;
-
-		ret = 0;
-
-		existing = search_extent_mapping(em_tree, start, len);
-		/*
-		 * existing will always be non-NULL, since there must be
-		 * extent causing the -EEXIST.
-		 */
-		if (start >= existing->start &&
-		    start < extent_map_end(existing)) {
-			free_extent_map(em);
-			*em_in = existing;
-			ret = 0;
-		} else {
-			/*
-			 * The existing extent map is the one nearest to
-			 * the [start, start + len) range which overlaps
-			 */
-			ret = merge_extent_mapping(em_tree, existing,
-						   em, start);
-			free_extent_map(existing);
-			if (ret) {
-				free_extent_map(em);
-				*em_in = NULL;
-			}
-		}
-	}
-
-	ASSERT(ret == 0 || ret == -EEXIST);
-	return ret;
-}
-
 /*
  * a bit scary, this does extent mapping from logical file offset to the disk.
  * the ugly parts come from merging extents from the disk with the in-ram
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 05/10] Btrfs: add extent map selftests
  2018-01-05 19:51 [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent Liu Bo
                   ` (3 preceding siblings ...)
  2018-01-05 19:51 ` [PATCH v2 04/10] Btrfs: move extent map specific code to extent_map.c Liu Bo
@ 2018-01-05 19:51 ` Liu Bo
  2018-01-09 17:31   ` Josef Bacik
  2018-01-05 19:51 ` [PATCH v2 06/10] Btrfs: extent map selftest: buffered write vs dio read Liu Bo
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 26+ messages in thread
From: Liu Bo @ 2018-01-05 19:51 UTC (permalink / raw)
  To: linux-btrfs

We've observed that btrfs_get_extent() and merge_extent_mapping() could
return -EEXIST in several cases, and they are caused by some racy
condition, e.g dio read vs dio write, which makes the problem very tricky
to reproduce.

This adds extent map selftests in order to simulate those racy situations.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/Makefile                 |   2 +-
 fs/btrfs/tests/btrfs-tests.c      |   3 +
 fs/btrfs/tests/btrfs-tests.h      |   1 +
 fs/btrfs/tests/extent-map-tests.c | 202 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 207 insertions(+), 1 deletion(-)
 create mode 100644 fs/btrfs/tests/extent-map-tests.c

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 6fe881d..0c43736 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -19,4 +19,4 @@ btrfs-$(CONFIG_BTRFS_FS_REF_VERIFY) += ref-verify.o
 btrfs-$(CONFIG_BTRFS_FS_RUN_SANITY_TESTS) += tests/free-space-tests.o \
 	tests/extent-buffer-tests.o tests/btrfs-tests.o \
 	tests/extent-io-tests.o tests/inode-tests.o tests/qgroup-tests.o \
-	tests/free-space-tree-tests.o
+	tests/free-space-tree-tests.o tests/extent-map-tests.o
diff --git a/fs/btrfs/tests/btrfs-tests.c b/fs/btrfs/tests/btrfs-tests.c
index d3f2537..9786d8c 100644
--- a/fs/btrfs/tests/btrfs-tests.c
+++ b/fs/btrfs/tests/btrfs-tests.c
@@ -277,6 +277,9 @@ int btrfs_run_sanity_tests(void)
 				goto out;
 		}
 	}
+	ret = btrfs_test_extent_map();
+	if (ret)
+		goto out;
 out:
 	btrfs_destroy_test_fs();
 	return ret;
diff --git a/fs/btrfs/tests/btrfs-tests.h b/fs/btrfs/tests/btrfs-tests.h
index 266f1e3..bc0615b 100644
--- a/fs/btrfs/tests/btrfs-tests.h
+++ b/fs/btrfs/tests/btrfs-tests.h
@@ -33,6 +33,7 @@ int btrfs_test_extent_io(u32 sectorsize, u32 nodesize);
 int btrfs_test_inodes(u32 sectorsize, u32 nodesize);
 int btrfs_test_qgroups(u32 sectorsize, u32 nodesize);
 int btrfs_test_free_space_tree(u32 sectorsize, u32 nodesize);
+int btrfs_test_extent_map(void);
 struct inode *btrfs_new_test_inode(void);
 struct btrfs_fs_info *btrfs_alloc_dummy_fs_info(u32 nodesize, u32 sectorsize);
 void btrfs_free_dummy_fs_info(struct btrfs_fs_info *fs_info);
diff --git a/fs/btrfs/tests/extent-map-tests.c b/fs/btrfs/tests/extent-map-tests.c
new file mode 100644
index 0000000..0407396
--- /dev/null
+++ b/fs/btrfs/tests/extent-map-tests.c
@@ -0,0 +1,202 @@
+/*
+ * Copyright (C) 2017 Oracle.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+#include <linux/types.h>
+#include "btrfs-tests.h"
+#include "../ctree.h"
+
+static void free_extent_map_tree(struct extent_map_tree *em_tree)
+{
+	struct extent_map *em;
+	struct rb_node *node;
+
+	while (!RB_EMPTY_ROOT(&em_tree->map)) {
+		node = rb_first(&em_tree->map);
+		em = rb_entry(node, struct extent_map, rb_node);
+		remove_extent_mapping(em_tree, em);
+
+#ifdef CONFIG_BTRFS_DEBUG
+		if (refcount_read(&em->refs) != 1) {
+			test_msg(
+				 "Oops we have a em leak: em (start 0x%llx len 0x%llx block_start 0x%llx block_len 0x%llx) refs %d\n",
+				 em->start, em->len, em->block_start,
+				 em->block_len, refcount_read(&em->refs));
+
+			refcount_set(&em->refs, 1);
+		}
+#endif
+		free_extent_map(em);
+	}
+}
+
+/*
+ * Test scenario:
+ *
+ * Suppose that no extent map has been loaded into memory yet, there is a file
+ * extent [0, 16K), followed by another file extent [16K, 20K), two dio reads
+ * are entering btrfs_get_extent() concurrently, t1 is reading [8K, 16K), t2 is
+ * reading [0, 8K)
+ *
+ *     t1                            t2
+ *  btrfs_get_extent()              btrfs_get_extent()
+ *    -> lookup_extent_mapping()      ->lookup_extent_mapping()
+ *    -> add_extent_mapping(0, 16K)
+ *    -> return em
+ *                                    ->add_extent_mapping(0, 16K)
+ *                                    -> #handle -EEXIST
+ */
+static void test_case_1(struct extent_map_tree *em_tree)
+{
+	struct extent_map *em;
+	u64 start = 0;
+	u64 len = SZ_8K;
+	int ret;
+
+	em = alloc_extent_map();
+	if (!em)
+		/* Skip the test on error. */
+		return;
+
+	/* Add [0, 16K) */
+	em->start = 0;
+	em->len = SZ_16K;
+	em->block_start = 0;
+	em->block_len = SZ_16K;
+	ret = add_extent_mapping(em_tree, em, 0);
+	ASSERT(ret == 0);
+	free_extent_map(em);
+
+	/* Add [16K, 20K) following [0, 16K)  */
+	em = alloc_extent_map();
+	if (!em)
+		goto out;
+
+	em->start = SZ_16K;
+	em->len = SZ_4K;
+	em->block_start = SZ_32K; /* avoid merging */
+	em->block_len = SZ_4K;
+	ret = add_extent_mapping(em_tree, em, 0);
+	ASSERT(ret == 0);
+	free_extent_map(em);
+
+	em = alloc_extent_map();
+	if (!em)
+		goto out;
+
+	/* Add [0, 8K), should return [0, 16K) instead. */
+	em->start = start;
+	em->len = len;
+	em->block_start = start;
+	em->block_len = len;
+	ret = btrfs_add_extent_mapping(em_tree, &em, em->start, em->len);
+	if (ret)
+		test_msg("case1 [%llu %llu]: ret %d\n",
+			 start, start + len, ret);
+	if (em &&
+	    (em->start != 0 || extent_map_end(em) != SZ_16K ||
+	     em->block_start != 0 || em->block_len != SZ_16K))
+		test_msg("case1 [%llu %llu]: ret %d return a wrong em (start %llu len %llu block_start %llu block_len %llu\n",
+			 start, start + len, ret, em->start, em->len,
+			 em->block_start, em->block_len);
+	free_extent_map(em);
+out:
+	/* free memory */
+	free_extent_map_tree(em_tree);
+}
+
+/*
+ * Test scenario:
+ *
+ * Reading the inline ending up with EEXIST, ie. read an inline
+ * extent and discard page cache and read it again.
+ */
+static void test_case_2(struct extent_map_tree *em_tree)
+{
+	struct extent_map *em;
+	int ret;
+
+	em = alloc_extent_map();
+	if (!em)
+		/* Skip the test on error. */
+		return;
+
+	/* Add [0, 1K) */
+	em->start = 0;
+	em->len = SZ_1K;
+	em->block_start = EXTENT_MAP_INLINE;
+	em->block_len = (u64)-1;
+	ret = add_extent_mapping(em_tree, em, 0);
+	ASSERT(ret == 0);
+	free_extent_map(em);
+
+	/* Add [4K, 4K) following [0, 1K)  */
+	em = alloc_extent_map();
+	if (!em)
+		goto out;
+
+	em->start = SZ_4K;
+	em->len = SZ_4K;
+	em->block_start = SZ_4K;
+	em->block_len = SZ_4K;
+	ret = add_extent_mapping(em_tree, em, 0);
+	ASSERT(ret == 0);
+	free_extent_map(em);
+
+	em = alloc_extent_map();
+	if (!em)
+		goto out;
+
+	/* Add [0, 1K) */
+	em->start = 0;
+	em->len = SZ_1K;
+	em->block_start = EXTENT_MAP_INLINE;
+	em->block_len = (u64)-1;
+	ret = btrfs_add_extent_mapping(em_tree, &em, em->start, em->len);
+	if (ret)
+		test_msg("case2 [0 1K]: ret %d\n", ret);
+	if (em &&
+	    (em->start != 0 || extent_map_end(em) != SZ_1K ||
+	     em->block_start != EXTENT_MAP_INLINE || em->block_len != (u64)-1))
+		test_msg("case2 [0 1K]: ret %d return a wrong em (start %llu len %llu block_start %llu block_len %llu\n",
+			 ret, em->start, em->len, em->block_start,
+			 em->block_len);
+	free_extent_map(em);
+out:
+	/* free memory */
+	free_extent_map_tree(em_tree);
+}
+
+int btrfs_test_extent_map()
+{
+	struct extent_map_tree *em_tree;
+
+	test_msg("Running extent_map tests\n");
+
+	em_tree = kzalloc(sizeof(*em_tree), GFP_KERNEL);
+	if (!em_tree)
+		/* Skip the test on error. */
+		return 0;
+
+	extent_map_tree_init(em_tree);
+
+	test_case_1(em_tree);
+	test_case_2(em_tree);
+
+	kfree(em_tree);
+	return 0;
+}
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 06/10] Btrfs: extent map selftest: buffered write vs dio read
  2018-01-05 19:51 [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent Liu Bo
                   ` (4 preceding siblings ...)
  2018-01-05 19:51 ` [PATCH v2 05/10] Btrfs: add extent map selftests Liu Bo
@ 2018-01-05 19:51 ` Liu Bo
  2018-01-09 17:32   ` Josef Bacik
  2018-01-05 19:51 ` [PATCH v2 07/10] Btrfs: extent map selftest: dio " Liu Bo
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 26+ messages in thread
From: Liu Bo @ 2018-01-05 19:51 UTC (permalink / raw)
  To: linux-btrfs

This test case simulates the racy situation of buffered write vs dio
read, and see if btrfs_get_extent() would return -EEXIST.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/tests/extent-map-tests.c | 73 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/fs/btrfs/tests/extent-map-tests.c b/fs/btrfs/tests/extent-map-tests.c
index 0407396..2adf55f 100644
--- a/fs/btrfs/tests/extent-map-tests.c
+++ b/fs/btrfs/tests/extent-map-tests.c
@@ -181,6 +181,78 @@ static void test_case_2(struct extent_map_tree *em_tree)
 	free_extent_map_tree(em_tree);
 }
 
+static void __test_case_3(struct extent_map_tree *em_tree, u64 start)
+{
+	struct extent_map *em;
+	u64 len = SZ_4K;
+	int ret;
+
+	em = alloc_extent_map();
+	if (!em)
+		/* Skip this test on error. */
+		return;
+
+	/* Add [4K, 8K) */
+	em->start = SZ_4K;
+	em->len = SZ_4K;
+	em->block_start = SZ_4K;
+	em->block_len = SZ_4K;
+	ret = add_extent_mapping(em_tree, em, 0);
+	ASSERT(ret == 0);
+	free_extent_map(em);
+
+	em = alloc_extent_map();
+	if (!em)
+		goto out;
+
+	/* Add [0, 16K) */
+	em->start = 0;
+	em->len = SZ_16K;
+	em->block_start = 0;
+	em->block_len = SZ_16K;
+	ret = btrfs_add_extent_mapping(em_tree, &em, start, len);
+	if (ret)
+		test_msg("case3 [0x%llx 0x%llx): ret %d\n",
+			 start, start + len, ret);
+	/*
+	 * Since bytes within em are contiguous, em->block_start is identical to
+	 * em->start.
+	 */
+	if (em &&
+	    (start < em->start || start + len > extent_map_end(em) ||
+	     em->start != em->block_start || em->len != em->block_len))
+		test_msg("case3 [0x%llx 0x%llx): ret %d em (start 0x%llx len 0x%llx block_start 0x%llx block_len 0x%llx)\n",
+			 start, start + len, ret, em->start, em->len,
+			 em->block_start, em->block_len);
+	free_extent_map(em);
+out:
+	/* free memory */
+	free_extent_map_tree(em_tree);
+}
+
+/*
+ * Test scenario:
+ *
+ * Suppose that no extent map has been loaded into memory yet.
+ * There is a file extent [0, 16K), two jobs are running concurrently
+ * against it, t1 is buffered writing to [4K, 8K) and t2 is doing dio
+ * read from [0, 4K) or [8K, 12K) or [12K, 16K).
+ *
+ * t1 goes ahead of t2 and adds em [4K, 8K) into tree.
+ *
+ *         t1                       t2
+ *  cow_file_range()	     btrfs_get_extent()
+ *                            -> lookup_extent_mapping()
+ *   -> add_extent_mapping()
+ *                            -> add_extent_mapping()
+ */
+static void test_case_3(struct extent_map_tree *em_tree)
+{
+	__test_case_3(em_tree, 0);
+	__test_case_3(em_tree, SZ_8K);
+	__test_case_3(em_tree, (12 * 1024ULL));
+}
+
 int btrfs_test_extent_map()
 {
 	struct extent_map_tree *em_tree;
@@ -196,6 +268,7 @@ int btrfs_test_extent_map()
 
 	test_case_1(em_tree);
 	test_case_2(em_tree);
+	test_case_3(em_tree);
 
 	kfree(em_tree);
 	return 0;
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 07/10] Btrfs: extent map selftest: dio write vs dio read
  2018-01-05 19:51 [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent Liu Bo
                   ` (5 preceding siblings ...)
  2018-01-05 19:51 ` [PATCH v2 06/10] Btrfs: extent map selftest: buffered write vs dio read Liu Bo
@ 2018-01-05 19:51 ` Liu Bo
  2018-01-09 17:32   ` Josef Bacik
  2018-01-05 19:51 ` [PATCH v2 08/10] Btrfs: add WARN_ONCE to detect unexpected error from merge_extent_mapping Liu Bo
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 26+ messages in thread
From: Liu Bo @ 2018-01-05 19:51 UTC (permalink / raw)
  To: linux-btrfs

This test case simulates the racy situation of dio write vs dio read,
and see if btrfs_get_extent() would return -EEXIST.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/tests/extent-map-tests.c | 88 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 88 insertions(+)

diff --git a/fs/btrfs/tests/extent-map-tests.c b/fs/btrfs/tests/extent-map-tests.c
index 2adf55f..66d5523 100644
--- a/fs/btrfs/tests/extent-map-tests.c
+++ b/fs/btrfs/tests/extent-map-tests.c
@@ -253,6 +253,93 @@ static void test_case_3(struct extent_map_tree *em_tree)
 	__test_case_3(em_tree, (12 * 1024ULL));
 }
 
+static void __test_case_4(struct extent_map_tree *em_tree, u64 start)
+{
+	struct extent_map *em;
+	u64 len = SZ_4K;
+	int ret;
+
+	em = alloc_extent_map();
+	if (!em)
+		/* Skip this test on error. */
+		return;
+
+	/* Add [0K, 8K) */
+	em->start = 0;
+	em->len = SZ_8K;
+	em->block_start = 0;
+	em->block_len = SZ_8K;
+	ret = add_extent_mapping(em_tree, em, 0);
+	ASSERT(ret == 0);
+	free_extent_map(em);
+
+	em = alloc_extent_map();
+	if (!em)
+		goto out;
+
+	/* Add [8K, 24K) */
+	em->start = SZ_8K;
+	em->len = 24 * 1024ULL;
+	em->block_start = SZ_16K; /* avoid merging */
+	em->block_len = 24 * 1024ULL;
+	ret = add_extent_mapping(em_tree, em, 0);
+	ASSERT(ret == 0);
+	free_extent_map(em);
+
+	em = alloc_extent_map();
+	if (!em)
+		goto out;
+	/* Add [0K, 32K) */
+	em->start = 0;
+	em->len = SZ_32K;
+	em->block_start = 0;
+	em->block_len = SZ_32K;
+	ret = btrfs_add_extent_mapping(em_tree, &em, start, len);
+	if (ret)
+		test_msg("case4 [0x%llx 0x%llx): ret %d\n",
+			 start, len, ret);
+	if (em &&
+	    (start < em->start || start + len > extent_map_end(em)))
+		test_msg("case4 [0x%llx 0x%llx): ret %d, added wrong em (start 0x%llx len 0x%llx block_start 0x%llx block_len 0x%llx)\n",
+			 start, len, ret, em->start, em->len, em->block_start,
+			 em->block_len);
+	free_extent_map(em);
+out:
+	/* free memory */
+	free_extent_map_tree(em_tree);
+}
+
+/*
+ * Test scenario:
+ *
+ * Suppose that no extent map has been loaded into memory yet.
+ * There is a file extent [0, 32K), two jobs are running concurrently
+ * against it, t1 is doing dio write to [8K, 32K) and t2 is doing dio
+ * read from [0, 4K) or [4K, 8K).
+ *
+ * t1 goes ahead of t2 and splits em [0, 32K) to em [0K, 8K) and [8K 32K).
+ *
+ *         t1                                t2
+ *  btrfs_get_blocks_direct()	       btrfs_get_blocks_direct()
+ *   -> btrfs_get_extent()              -> btrfs_get_extent()
+ *       -> lookup_extent_mapping()
+ *       -> add_extent_mapping()            -> lookup_extent_mapping()
+ *          # load [0, 32K)
+ *   -> btrfs_new_extent_direct()
+ *       -> btrfs_drop_extent_cache()
+ *          # split [0, 32K)
+ *       -> add_extent_mapping()
+ *          # add [8K, 32K)
+ *                                          -> add_extent_mapping()
+ *                                             # handle -EEXIST when adding
+ *                                             # [0, 32K)
+ */
+static void test_case_4(struct extent_map_tree *em_tree)
+{
+	__test_case_4(em_tree, 0);
+	__test_case_4(em_tree, SZ_4K);
+}
+
 int btrfs_test_extent_map()
 {
 	struct extent_map_tree *em_tree;
@@ -269,6 +356,7 @@ int btrfs_test_extent_map()
 	test_case_1(em_tree);
 	test_case_2(em_tree);
 	test_case_3(em_tree);
+	test_case_4(em_tree);
 
 	kfree(em_tree);
 	return 0;
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 08/10] Btrfs: add WARN_ONCE to detect unexpected error from merge_extent_mapping
  2018-01-05 19:51 [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent Liu Bo
                   ` (6 preceding siblings ...)
  2018-01-05 19:51 ` [PATCH v2 07/10] Btrfs: extent map selftest: dio " Liu Bo
@ 2018-01-05 19:51 ` Liu Bo
  2018-01-09 17:33   ` Josef Bacik
  2018-01-05 19:51 ` [PATCH v2 09/10] Btrfs: add tracepoint for em's EEXIST case Liu Bo
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 26+ messages in thread
From: Liu Bo @ 2018-01-05 19:51 UTC (permalink / raw)
  To: linux-btrfs

This is a subtle case, so in order to understand the problem, it'd be good
to know the content of existing and em when any error occurs.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
v2: Remove unnecessary KERN_INFO.

 fs/btrfs/extent_map.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index 6fe8b14..b5d0add 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -562,17 +562,23 @@ int btrfs_add_extent_mapping(struct extent_map_tree *em_tree,
 			*em_in = existing;
 			ret = 0;
 		} else {
+			u64 orig_start = em->start;
+			u64 orig_len = em->len;
+
 			/*
 			 * The existing extent map is the one nearest to
 			 * the [start, start + len) range which overlaps
 			 */
 			ret = merge_extent_mapping(em_tree, existing,
 						   em, start);
-			free_extent_map(existing);
 			if (ret) {
 				free_extent_map(em);
 				*em_in = NULL;
+				WARN_ONCE(ret, "Unexpected error %d: merge existing(start %llu len %llu) with em(start %llu len %llu)\n",
+					  ret, existing->start, existing->len,
+					  orig_start, orig_len);
 			}
+			free_extent_map(existing);
 		}
 	}
 
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 09/10] Btrfs: add tracepoint for em's EEXIST case
  2018-01-05 19:51 [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent Liu Bo
                   ` (7 preceding siblings ...)
  2018-01-05 19:51 ` [PATCH v2 08/10] Btrfs: add WARN_ONCE to detect unexpected error from merge_extent_mapping Liu Bo
@ 2018-01-05 19:51 ` Liu Bo
  2018-01-09 17:35   ` Josef Bacik
  2018-01-19 18:15   ` David Sterba
  2018-01-05 19:51 ` [PATCH v2 10/10] Btrfs: noinline merge_extent_mapping Liu Bo
  2018-01-08 19:57 ` [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent David Sterba
  10 siblings, 2 replies; 26+ messages in thread
From: Liu Bo @ 2018-01-05 19:51 UTC (permalink / raw)
  To: linux-btrfs

This is adding a tracepoint 'btrfs_handle_em_exist' to help debug the
subtle bugs around merge_extent_mapping.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/extent_map.c        |  3 +++
 include/trace/events/btrfs.h | 35 +++++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index b5d0add..a5a1d17 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -552,6 +552,9 @@ int btrfs_add_extent_mapping(struct extent_map_tree *em_tree,
 		ret = 0;
 
 		existing = search_extent_mapping(em_tree, start, len);
+
+		trace_btrfs_handle_em_exist(existing, em, start, len);
+
 		/*
 		 * existing will always be non-NULL, since there must be
 		 * extent causing the -EEXIST.
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index 4342a32..b7ffcf7 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -249,6 +249,41 @@ TRACE_EVENT_CONDITION(btrfs_get_extent,
 		  __entry->refs, __entry->compress_type)
 );
 
+TRACE_EVENT(btrfs_handle_em_exist,
+
+	TP_PROTO(const struct extent_map *existing, const struct extent_map *map, u64 start, u64 len),
+
+	TP_ARGS(existing, map, start, len),
+
+	TP_STRUCT__entry(
+		__field(	u64,  e_start		)
+		__field(	u64,  e_len		)
+		__field(	u64,  map_start		)
+		__field(	u64,  map_len		)
+		__field(	u64,  start		)
+		__field(	u64,  len		)
+	),
+
+	TP_fast_assign(
+		__entry->e_start	= existing->start;
+		__entry->e_len		= existing->len;
+		__entry->map_start	= map->start;
+		__entry->map_len	= map->len;
+		__entry->start		= start;
+		__entry->len		= len;
+	),
+
+	TP_printk("start=%llu len=%llu "
+		  "existing(start=%llu len=%llu) "
+		  "em(start=%llu len=%llu)",
+		  (unsigned long long)__entry->start,
+		  (unsigned long long)__entry->len,
+		  (unsigned long long)__entry->e_start,
+		  (unsigned long long)__entry->e_len,
+		  (unsigned long long)__entry->map_start,
+		  (unsigned long long)__entry->map_len)
+);
+
 /* file extent item */
 DECLARE_EVENT_CLASS(btrfs__file_extent_item_regular,
 
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 10/10] Btrfs: noinline merge_extent_mapping
  2018-01-05 19:51 [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent Liu Bo
                   ` (8 preceding siblings ...)
  2018-01-05 19:51 ` [PATCH v2 09/10] Btrfs: add tracepoint for em's EEXIST case Liu Bo
@ 2018-01-05 19:51 ` Liu Bo
  2018-01-09 17:35   ` Josef Bacik
  2018-01-08 19:57 ` [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent David Sterba
  10 siblings, 1 reply; 26+ messages in thread
From: Liu Bo @ 2018-01-05 19:51 UTC (permalink / raw)
  To: linux-btrfs

In order to debug subtle bugs around merge_extent_mapping(), perf probe
can be used to check the arguments, but sometimes merge_extent_mapping()
got inlined by compiler and couldn't be probed.

This is adding noinline attribute to merge_extent_mapping().

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/extent_map.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index a5a1d17..9e5fdf9 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -480,10 +480,10 @@ static struct extent_map *prev_extent_map(struct extent_map *em)
  * and an extent that you want to insert, deal with overlap and insert
  * the best fitted new extent into the tree.
  */
-static int merge_extent_mapping(struct extent_map_tree *em_tree,
-				struct extent_map *existing,
-				struct extent_map *em,
-				u64 map_start)
+static noinline int merge_extent_mapping(struct extent_map_tree *em_tree,
+					 struct extent_map *existing,
+					 struct extent_map *em,
+					 u64 map_start)
 {
 	struct extent_map *prev;
 	struct extent_map *next;
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent
  2018-01-05 19:51 [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent Liu Bo
                   ` (9 preceding siblings ...)
  2018-01-05 19:51 ` [PATCH v2 10/10] Btrfs: noinline merge_extent_mapping Liu Bo
@ 2018-01-08 19:57 ` David Sterba
  2018-01-18 16:51   ` David Sterba
  10 siblings, 1 reply; 26+ messages in thread
From: David Sterba @ 2018-01-08 19:57 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs

On Fri, Jan 05, 2018 at 12:51:07PM -0700, Liu Bo wrote:
> Although
> commit e6c4efd87ab0 ("btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map")
> fixed up the negetive em->len, it has introduced several regressions, several has been fixed by
> 
> commit 32be3a1ac6d0 ("btrfs: Fix the wrong condition judgment about subset extent map"),
> commit 8dff9c853410 ("Btrfs: deal with duplciates during extent_map insertion in btrfs_get_extent") and
> commit 8e2bd3b7fac9 ("Btrfs: deal with existing encompassing extent map in btrfs_get_extent()").
> 
> Unfortunately, there is one more regression which is caught recently by a
> user's workloads.
> 
> While debugging the above issue, I found that all of these bugs are caused
> by some racy situations, which can be very tricky to reproduce, so I
> created several extent map specific test cases in btrfs's selftest
> framework.
> 
> Patch 1-2 are fixing two bugs.
> Patch 3-4 are some preparatory work.
> Patch 3-5 are regression tests about the logic of handling EEXIST from
> adding extent map.
> Patch 8-10 are debugging wise, one is a direct tracepoint and the other is
> to enable kprobe on merge_extent_mapping.
> 
> v2:
> - Improve commit log to provide more details about the bug.
> - Adjust bugfixes to the front so that we can merge them firstly.

Patchset updated in for-next. Expected merge target is 4.16, review is
still needed (and welcome).

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 01/10] Btrfs: fix incorrect block_len in merge_extent_mapping
  2018-01-05 19:51 ` [PATCH v2 01/10] Btrfs: fix incorrect block_len in merge_extent_mapping Liu Bo
@ 2018-01-09 17:24   ` Josef Bacik
  0 siblings, 0 replies; 26+ messages in thread
From: Josef Bacik @ 2018-01-09 17:24 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs

On Fri, Jan 05, 2018 at 12:51:08PM -0700, Liu Bo wrote:
> %block_len could be checked on deciding if two em are mergeable.
> 
> merge_extent_mapping() has only added the front pad if the front part
> of em gets truncated, but it's possible that the end part gets
> truncated.
> 
> For both compressed extent and inline extent, em->block_len is not
> adjusted accordingly, and for regular extent, em->block_len always
> equals to em->len, hence this sets em->block_len with em->len.
> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Reviewed-by: Josef Bacik <jbacik@fb.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 02/10] Btrfs: fix unexpected EEXIST from btrfs_get_extent
  2018-01-05 19:51 ` [PATCH v2 02/10] Btrfs: fix unexpected EEXIST from btrfs_get_extent Liu Bo
@ 2018-01-09 17:27   ` Josef Bacik
  0 siblings, 0 replies; 26+ messages in thread
From: Josef Bacik @ 2018-01-09 17:27 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs

On Fri, Jan 05, 2018 at 12:51:09PM -0700, Liu Bo wrote:
> This fixes a corner case that is caused by a race of dio write vs dio
> read/write.
> 
> Here is how the race could happen.
> 
> Suppose that no extent map has been loaded into memory yet.
> There is a file extent [0, 32K), two jobs are running concurrently
> against it, t1 is doing dio write to [8K, 32K) and t2 is doing dio
> read from [0, 4K) or [4K, 8K).
> 
> t1 goes ahead of t2 and splits em [0, 32K) to em [0K, 8K) and [8K 32K).
> 
> ------------------------------------------------------
>              t1                                t2
>       btrfs_get_blocks_direct()         btrfs_get_blocks_direct()
>        -> btrfs_get_extent()              -> btrfs_get_extent()
>            -> lookup_extent_mapping()
>            -> add_extent_mapping()            -> lookup_extent_mapping()
>               # load [0, 32K)
>        -> btrfs_new_extent_direct()
>            -> btrfs_drop_extent_cache()
>               # split [0, 32K) and
> 	      # drop [8K, 32K)
>            -> add_extent_mapping()
>               # add [8K, 32K)
>                                               -> add_extent_mapping()
>                                                  # handle -EEXIST when adding
>                                                  # [0, 32K)
> ------------------------------------------------------
> About how t2(dio read/write) runs into -EEXIST:
> 
> a) add_extent_mapping() gets -EEXIST for adding em [0, 32k),
> 
> b) search_extent_mapping() then returns [0, 8k) as the existing em,
>    even though start == existing->start, em is [0, 32k) so that
>    extent_map_end(em) > extent_map_end(existing), i.e. 32k > 8k,
> 
> c) then it goes thru merge_extent_mapping() which tries to add a [8k, 8k)
>    (with a length 0) and returns -EEXIST as [8k, 32k) is already in tree,
> 
> d) so btrfs_get_extent() ends up returning -EEXIST to dio read/write,
>    which is confusing applications.
> 
> Here I conclude all the possible situations,
> 1) start < existing->start
> 
>             +-----------+em+-----------+
> +--prev---+ |     +-------------+      |
> |         | |     |             |      |
> +---------+ +     +---+existing++      ++
>                 +
>                 |
>                 +
>              start
> 
> 2) start == existing->start
> 
>       +------------em------------+
>       |     +-------------+      |
>       |     |             |      |
>       +     +----existing-+      +
>             |
>             |
>             +
>          start
> 
> 3) start > existing->start && start < (existing->start + existing->len)
> 
>       +------------em------------+
>       |     +-------------+      |
>       |     |             |      |
>       +     +----existing-+      +
>                |
>                |
>                +
>              start
> 
> 4) start >= (existing->start + existing->len)
> 
> +-----------+em+-----------+
> |     +-------------+      | +--next---+
> |     |             |      | |         |
> +     +---+existing++      + +---------+
>                       +
>                       |
>                       +
>                    start
> 
> As we can see, it turns out that if start is within existing em (front
> inclusive), then the existing em should be returned as is, otherwise,
> we try our best to merge candidate em with sibling ems to form a
> larger em (in order to reduce the total number of em).
> 
> Reported-by: David Vallender <david.vallender@landmark.co.uk>
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Reviewed-by: Josef Bacik <jbacik@fb.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 03/10] Btrfs: add helper for em merge logic
  2018-01-05 19:51 ` [PATCH v2 03/10] Btrfs: add helper for em merge logic Liu Bo
@ 2018-01-09 17:27   ` Josef Bacik
  0 siblings, 0 replies; 26+ messages in thread
From: Josef Bacik @ 2018-01-09 17:27 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs

On Fri, Jan 05, 2018 at 12:51:10PM -0700, Liu Bo wrote:
> This is a prepare work for the following extent map selftest, which
> runs tests against em merge logic.
> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Reviewed-by: Josef Bacik <jbacik@fb.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 04/10] Btrfs: move extent map specific code to extent_map.c
  2018-01-05 19:51 ` [PATCH v2 04/10] Btrfs: move extent map specific code to extent_map.c Liu Bo
@ 2018-01-09 17:29   ` Josef Bacik
  0 siblings, 0 replies; 26+ messages in thread
From: Josef Bacik @ 2018-01-09 17:29 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs

On Fri, Jan 05, 2018 at 12:51:11PM -0700, Liu Bo wrote:
> These helpers are extent map specific, move them to extent_map.c.
> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Reviewed-by: Josef Bacik <jbacik@fb.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 05/10] Btrfs: add extent map selftests
  2018-01-05 19:51 ` [PATCH v2 05/10] Btrfs: add extent map selftests Liu Bo
@ 2018-01-09 17:31   ` Josef Bacik
  0 siblings, 0 replies; 26+ messages in thread
From: Josef Bacik @ 2018-01-09 17:31 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs

On Fri, Jan 05, 2018 at 12:51:12PM -0700, Liu Bo wrote:
> We've observed that btrfs_get_extent() and merge_extent_mapping() could
> return -EEXIST in several cases, and they are caused by some racy
> condition, e.g dio read vs dio write, which makes the problem very tricky
> to reproduce.
> 
> This adds extent map selftests in order to simulate those racy situations.
> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Reviewed-by: Josef Bacik <jbacik@fb.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 06/10] Btrfs: extent map selftest: buffered write vs dio read
  2018-01-05 19:51 ` [PATCH v2 06/10] Btrfs: extent map selftest: buffered write vs dio read Liu Bo
@ 2018-01-09 17:32   ` Josef Bacik
  0 siblings, 0 replies; 26+ messages in thread
From: Josef Bacik @ 2018-01-09 17:32 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs

On Fri, Jan 05, 2018 at 12:51:13PM -0700, Liu Bo wrote:
> This test case simulates the racy situation of buffered write vs dio
> read, and see if btrfs_get_extent() would return -EEXIST.
> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Reviewed-by: Josef Bacik <jbacik@fb.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 07/10] Btrfs: extent map selftest: dio write vs dio read
  2018-01-05 19:51 ` [PATCH v2 07/10] Btrfs: extent map selftest: dio " Liu Bo
@ 2018-01-09 17:32   ` Josef Bacik
  0 siblings, 0 replies; 26+ messages in thread
From: Josef Bacik @ 2018-01-09 17:32 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs

On Fri, Jan 05, 2018 at 12:51:14PM -0700, Liu Bo wrote:
> This test case simulates the racy situation of dio write vs dio read,
> and see if btrfs_get_extent() would return -EEXIST.
> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Reviewed-by: Josef Bacik <jbacik@fb.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 08/10] Btrfs: add WARN_ONCE to detect unexpected error from merge_extent_mapping
  2018-01-05 19:51 ` [PATCH v2 08/10] Btrfs: add WARN_ONCE to detect unexpected error from merge_extent_mapping Liu Bo
@ 2018-01-09 17:33   ` Josef Bacik
  0 siblings, 0 replies; 26+ messages in thread
From: Josef Bacik @ 2018-01-09 17:33 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs

On Fri, Jan 05, 2018 at 12:51:15PM -0700, Liu Bo wrote:
> This is a subtle case, so in order to understand the problem, it'd be good
> to know the content of existing and em when any error occurs.
> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Reviewed-by: Josef Bacik <jbacik@fb.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 09/10] Btrfs: add tracepoint for em's EEXIST case
  2018-01-05 19:51 ` [PATCH v2 09/10] Btrfs: add tracepoint for em's EEXIST case Liu Bo
@ 2018-01-09 17:35   ` Josef Bacik
  2018-01-19 18:15   ` David Sterba
  1 sibling, 0 replies; 26+ messages in thread
From: Josef Bacik @ 2018-01-09 17:35 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs

On Fri, Jan 05, 2018 at 12:51:16PM -0700, Liu Bo wrote:
> This is adding a tracepoint 'btrfs_handle_em_exist' to help debug the
> subtle bugs around merge_extent_mapping.
> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Reviewed-by: Josef Bacik <jbacik@fb.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 10/10] Btrfs: noinline merge_extent_mapping
  2018-01-05 19:51 ` [PATCH v2 10/10] Btrfs: noinline merge_extent_mapping Liu Bo
@ 2018-01-09 17:35   ` Josef Bacik
  0 siblings, 0 replies; 26+ messages in thread
From: Josef Bacik @ 2018-01-09 17:35 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs

On Fri, Jan 05, 2018 at 12:51:17PM -0700, Liu Bo wrote:
> In order to debug subtle bugs around merge_extent_mapping(), perf probe
> can be used to check the arguments, but sometimes merge_extent_mapping()
> got inlined by compiler and couldn't be probed.
> 
> This is adding noinline attribute to merge_extent_mapping().
> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Reviewed-by: Josef Bacik <jbacik@fb.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent
  2018-01-08 19:57 ` [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent David Sterba
@ 2018-01-18 16:51   ` David Sterba
  0 siblings, 0 replies; 26+ messages in thread
From: David Sterba @ 2018-01-18 16:51 UTC (permalink / raw)
  To: dsterba, Liu Bo, linux-btrfs

On Mon, Jan 08, 2018 at 08:57:34PM +0100, David Sterba wrote:
> On Fri, Jan 05, 2018 at 12:51:07PM -0700, Liu Bo wrote:
> > Although
> > commit e6c4efd87ab0 ("btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map")
> > fixed up the negetive em->len, it has introduced several regressions, several has been fixed by
> > 
> > commit 32be3a1ac6d0 ("btrfs: Fix the wrong condition judgment about subset extent map"),
> > commit 8dff9c853410 ("Btrfs: deal with duplciates during extent_map insertion in btrfs_get_extent") and
> > commit 8e2bd3b7fac9 ("Btrfs: deal with existing encompassing extent map in btrfs_get_extent()").
> > 
> > Unfortunately, there is one more regression which is caught recently by a
> > user's workloads.
> > 
> > While debugging the above issue, I found that all of these bugs are caused
> > by some racy situations, which can be very tricky to reproduce, so I
> > created several extent map specific test cases in btrfs's selftest
> > framework.
> > 
> > Patch 1-2 are fixing two bugs.
> > Patch 3-4 are some preparatory work.
> > Patch 3-5 are regression tests about the logic of handling EEXIST from
> > adding extent map.
> > Patch 8-10 are debugging wise, one is a direct tracepoint and the other is
> > to enable kprobe on merge_extent_mapping.
> > 
> > v2:
> > - Improve commit log to provide more details about the bug.
> > - Adjust bugfixes to the front so that we can merge them firstly.
> 
> Patchset updated in for-next. Expected merge target is 4.16, review is
> still needed (and welcome).

Patchset added to 4.16.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 09/10] Btrfs: add tracepoint for em's EEXIST case
  2018-01-05 19:51 ` [PATCH v2 09/10] Btrfs: add tracepoint for em's EEXIST case Liu Bo
  2018-01-09 17:35   ` Josef Bacik
@ 2018-01-19 18:15   ` David Sterba
  2018-01-19 18:22     ` Nikolay Borisov
  1 sibling, 1 reply; 26+ messages in thread
From: David Sterba @ 2018-01-19 18:15 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs, nborisov

On Fri, Jan 05, 2018 at 12:51:16PM -0700, Liu Bo wrote:
> This is adding a tracepoint 'btrfs_handle_em_exist' to help debug the
> subtle bugs around merge_extent_mapping.
> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Nikolay has some concernas about adding the tracepoint, so I'll leave
this patch out of the series for now as we should decide how to proceed.

Thacepoints are considered an ABI by some and not ABI by others. I think
it's a good addition to the debugging aids that also may turn out to be
useful for evaluating performance later.

At minimum we could add some prefix/suffix to the debugging tracepoint
name, so we can let developers add what they need right away.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 09/10] Btrfs: add tracepoint for em's EEXIST case
  2018-01-19 18:15   ` David Sterba
@ 2018-01-19 18:22     ` Nikolay Borisov
  2018-01-19 23:32       ` David Sterba
  0 siblings, 1 reply; 26+ messages in thread
From: Nikolay Borisov @ 2018-01-19 18:22 UTC (permalink / raw)
  To: dsterba, Liu Bo, linux-btrfs



On 19.01.2018 20:15, David Sterba wrote:
> On Fri, Jan 05, 2018 at 12:51:16PM -0700, Liu Bo wrote:
>> This is adding a tracepoint 'btrfs_handle_em_exist' to help debug the
>> subtle bugs around merge_extent_mapping.
>>
>> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> 
> Nikolay has some concernas about adding the tracepoint, so I'll leave
> this patch out of the series for now as we should decide how to proceed.
> 
> Thacepoints are considered an ABI by some and not ABI by others. I think
> it's a good addition to the debugging aids that also may turn out to be
> useful for evaluating performance later.
> 
> At minimum we could add some prefix/suffix to the debugging tracepoint
> name, so we can let developers add what they need right away.
> 


My concern specifically has to do with the fact that if tracepoints are
considere ABI then whatever decision we make now will be cast in stone
and if we juggle the code around we will still have to retain the format
of the tracepoint. If, OTOH we are able to remove and change the
tracepoint as we see fit - then it's okay. Otherwise we risk polluting
the code

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 09/10] Btrfs: add tracepoint for em's EEXIST case
  2018-01-19 18:22     ` Nikolay Borisov
@ 2018-01-19 23:32       ` David Sterba
  0 siblings, 0 replies; 26+ messages in thread
From: David Sterba @ 2018-01-19 23:32 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: dsterba, Liu Bo, linux-btrfs, jeffm

On Fri, Jan 19, 2018 at 08:22:36PM +0200, Nikolay Borisov wrote:
> 
> 
> On 19.01.2018 20:15, David Sterba wrote:
> > On Fri, Jan 05, 2018 at 12:51:16PM -0700, Liu Bo wrote:
> >> This is adding a tracepoint 'btrfs_handle_em_exist' to help debug the
> >> subtle bugs around merge_extent_mapping.
> >>
> >> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> > 
> > Nikolay has some concernas about adding the tracepoint, so I'll leave
> > this patch out of the series for now as we should decide how to proceed.
> > 
> > Thacepoints are considered an ABI by some and not ABI by others. I think
> > it's a good addition to the debugging aids that also may turn out to be
> > useful for evaluating performance later.
> > 
> > At minimum we could add some prefix/suffix to the debugging tracepoint
> > name, so we can let developers add what they need right away.
> My concern specifically has to do with the fact that if tracepoints are
> considere ABI then whatever decision we make now will be cast in stone

Right, same as with the ioctls, we take a great care there and still do
mistakes that get discovered months after.

> and if we juggle the code around we will still have to retain the format
> of the tracepoint. If, OTOH we are able to remove and change the
> tracepoint as we see fit - then it's okay. Otherwise we risk polluting
> the code

Understood and agreed.

Jeff raised a question if there are namespaces in the tracepoints.
That's an interesting point and would be more systematic than inventing
our own naming scheme.

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2018-01-19 23:34 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-01-05 19:51 [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent Liu Bo
2018-01-05 19:51 ` [PATCH v2 01/10] Btrfs: fix incorrect block_len in merge_extent_mapping Liu Bo
2018-01-09 17:24   ` Josef Bacik
2018-01-05 19:51 ` [PATCH v2 02/10] Btrfs: fix unexpected EEXIST from btrfs_get_extent Liu Bo
2018-01-09 17:27   ` Josef Bacik
2018-01-05 19:51 ` [PATCH v2 03/10] Btrfs: add helper for em merge logic Liu Bo
2018-01-09 17:27   ` Josef Bacik
2018-01-05 19:51 ` [PATCH v2 04/10] Btrfs: move extent map specific code to extent_map.c Liu Bo
2018-01-09 17:29   ` Josef Bacik
2018-01-05 19:51 ` [PATCH v2 05/10] Btrfs: add extent map selftests Liu Bo
2018-01-09 17:31   ` Josef Bacik
2018-01-05 19:51 ` [PATCH v2 06/10] Btrfs: extent map selftest: buffered write vs dio read Liu Bo
2018-01-09 17:32   ` Josef Bacik
2018-01-05 19:51 ` [PATCH v2 07/10] Btrfs: extent map selftest: dio " Liu Bo
2018-01-09 17:32   ` Josef Bacik
2018-01-05 19:51 ` [PATCH v2 08/10] Btrfs: add WARN_ONCE to detect unexpected error from merge_extent_mapping Liu Bo
2018-01-09 17:33   ` Josef Bacik
2018-01-05 19:51 ` [PATCH v2 09/10] Btrfs: add tracepoint for em's EEXIST case Liu Bo
2018-01-09 17:35   ` Josef Bacik
2018-01-19 18:15   ` David Sterba
2018-01-19 18:22     ` Nikolay Borisov
2018-01-19 23:32       ` David Sterba
2018-01-05 19:51 ` [PATCH v2 10/10] Btrfs: noinline merge_extent_mapping Liu Bo
2018-01-09 17:35   ` Josef Bacik
2018-01-08 19:57 ` [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent David Sterba
2018-01-18 16:51   ` David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).