linux-btrfs.vger.kernel.org archive mirror
* [PATCH] Btrfs: deal with existing encompassing extent map in btrfs_get_extent()
@ 2016-11-09 23:26 Omar Sandoval
  2016-11-10 15:06 ` David Sterba
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Omar Sandoval @ 2016-11-09 23:26 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

My QEMU VM was seeing inexplicable I/O errors that I tracked down to
errors coming from the qcow2 virtual drive in the host system. The qcow2
file is a nocow file on my Btrfs drive, which QEMU opens with O_DIRECT.
Every once in a while, pread() or pwrite() would return EEXIST, which
makes no sense. This turned out to be a bug in btrfs_get_extent().

Commit 8dff9c853410 ("Btrfs: deal with duplciates during extent_map
insertion in btrfs_get_extent") fixed a case in btrfs_get_extent() where
two threads race on adding the same extent map to an inode's extent map
tree. However, if the added em is merged with an adjacent em in the
extent tree, then we'll end up with an existing extent that is not
identical to but instead encompasses the extent we tried to add. When we
call merge_extent_mapping() to find the nonoverlapping part of the new
em, the arithmetic overflows because there is no nonoverlapping part. We
then end up trying to add a bogus em to the em_tree, which results in an
EEXIST that can bubble all the way up to userspace.

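Fix it by extending the identical extent map special case to also cover
an existing extent map that starts at the same offset and ends at or
beyond the end of the one we tried to add.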

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
Applies to 4.9-rc4.

Here [1] is a reproducer for this bug that doesn't involve firing up a
QEMU VM. Also, a big shoutout to BCC [2] and BPF for making it possible
to debug this on my laptop without compiling a custom kernel and
rebooting just to add printks [3].

1: https://gist.github.com/osandov/d08aabe5d4dec15517e9fde17012fd3b
2: https://github.com/iovisor/bcc
3: https://gist.github.com/osandov/eb1db868ce10c3af9e00b90f3a65bf9f

 fs/btrfs/inode.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 2b790bd..e5cf589 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7049,11 +7049,11 @@ struct extent_map *btrfs_get_extent(struct inode *inode, struct page *page,
 		 * extent causing the -EEXIST.
 		 */
 		if (existing->start == em->start &&
-		    extent_map_end(existing) == extent_map_end(em) &&
+		    extent_map_end(existing) >= extent_map_end(em) &&
 		    em->block_start == existing->block_start) {
 			/*
-			 * these two extents are the same, it happens
-			 * with inlines especially
+			 * The existing extent map already encompasses the
+			 * entire extent map we tried to add.
 			 */
 			free_extent_map(em);
 			em = existing;
-- 
2.10.2



Thread overview: 16+ messages
2016-11-09 23:26 [PATCH] Btrfs: deal with existing encompassing extent map in btrfs_get_extent() Omar Sandoval
2016-11-10 15:06 ` David Sterba
2016-11-10 15:37   ` Holger Hoffstätte
2016-11-10 15:42   ` Omar Sandoval
2016-11-11  0:36   ` Liu Bo
2016-11-10 15:11 ` Holger Hoffstätte
2016-11-10 15:37   ` Omar Sandoval
2016-11-10 16:01     ` Holger Hoffstätte
2016-11-10 16:20       ` Omar Sandoval
2016-11-10 16:31         ` Holger Hoffstätte
2016-11-10 20:01 ` Liu Bo
2016-11-10 20:09   ` Omar Sandoval
2016-11-10 20:24     ` Omar Sandoval
2016-11-10 22:38       ` Liu Bo
2016-11-10 22:45         ` Omar Sandoval
2016-11-17  0:32           ` Omar Sandoval
