All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Filipe Manana <fdmanana@suse.com>
Subject: [PATCH 3.10 13/35] Btrfs: fix race leading to BUG_ON when running delalloc for nodatacow
Date: Wed, 20 Jan 2016 14:00:43 -0800	[thread overview]
Message-ID: <20160120211952.897550752@linuxfoundation.org> (raw)
In-Reply-To: <20160120211951.234493363@linuxfoundation.org>

3.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Filipe Manana <fdmanana@suse.com>

commit 1d512cb77bdbda80f0dd0620a3b260d697fd581d upstream.

If we are using the NO_HOLES feature, we have a tiny time window when
running delalloc for a nodatacow inode where we can race with a concurrent
link or xattr add operation leading to a BUG_ON.

This happens because at run_delalloc_nocow() we end up casting a leaf item
of type BTRFS_INODE_[REF|EXTREF]_KEY or of type BTRFS_XATTR_ITEM_KEY to a
file extent item (struct btrfs_file_extent_item) and then analyse its
extent type field, which won't match any of the expected extent types
(values BTRFS_FILE_EXTENT_[REG|PREALLOC|INLINE]) and therefore trigger an
explicit BUG_ON(1).

The following sequence diagram shows how the race happens when running a
no-cow dellaloc range [4K, 8K[ for inode 257 and we have the following
neighbour leafs:

             Leaf X (has N items)                    Leaf Y

 [ ... (257 INODE_ITEM 0) (257 INODE_REF 256) ]  [ (257 EXTENT_DATA 8192), ... ]
              slot N - 2         slot N - 1              slot 0

 (Note the implicit hole for inode 257 regarding the [0, 8K[ range)

       CPU 1                                         CPU 2

 run_dealloc_nocow()
   btrfs_lookup_file_extent()
     --> searches for a key with value
         (257 EXTENT_DATA 4096) in the
         fs/subvol tree
     --> returns us a path with
         path->nodes[0] == leaf X and
         path->slots[0] == N

   because path->slots[0] is >=
   btrfs_header_nritems(leaf X), it
   calls btrfs_next_leaf()

   btrfs_next_leaf()
     --> releases the path

                                              hard link added to our inode,
                                              with key (257 INODE_REF 500)
                                              added to the end of leaf X,
                                              so leaf X now has N + 1 keys

     --> searches for the key
         (257 INODE_REF 256), because
         it was the last key in leaf X
         before it released the path,
         with path->keep_locks set to 1

     --> ends up at leaf X again and
         it verifies that the key
         (257 INODE_REF 256) is no longer
         the last key in the leaf, so it
         returns with path->nodes[0] ==
         leaf X and path->slots[0] == N,
         pointing to the new item with
         key (257 INODE_REF 500)

   the loop iteration of run_dealloc_nocow()
   does not break out the loop and continues
   because the key referenced in the path
   at path->nodes[0] and path->slots[0] is
   for inode 257, its type is < BTRFS_EXTENT_DATA_KEY
   and its offset (500) is less then our delalloc
   range's end (8192)

   the item pointed by the path, an inode reference item,
   is (incorrectly) interpreted as a file extent item and
   we get an invalid extent type, leading to the BUG_ON(1):

   if (extent_type == BTRFS_FILE_EXTENT_REG ||
      extent_type == BTRFS_FILE_EXTENT_PREALLOC) {
       (...)
   } else if (extent_type == BTRFS_FILE_EXTENT_INLINE) {
       (...)
   } else {
       BUG_ON(1)
   }

The same can happen if a xattr is added concurrently and ends up having
a key with an offset smaller then the delalloc's range end.

So fix this by skipping keys with a type smaller than
BTRFS_EXTENT_DATA_KEY.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/inode.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1286,8 +1286,14 @@ next_slot:
 		num_bytes = 0;
 		btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
 
-		if (found_key.objectid > ino ||
-		    found_key.type > BTRFS_EXTENT_DATA_KEY ||
+		if (found_key.objectid > ino)
+			break;
+		if (WARN_ON_ONCE(found_key.objectid < ino) ||
+		    found_key.type < BTRFS_EXTENT_DATA_KEY) {
+			path->slots[0]++;
+			goto next_slot;
+		}
+		if (found_key.type > BTRFS_EXTENT_DATA_KEY ||
 		    found_key.offset > end)
 			break;
 

  parent reply	other threads:[~2016-01-20 22:22 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-20 22:00 [PATCH 3.10 00/35] 3.10.95-stable review Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 01/35] unix: avoid use-after-free in ep_remove_wait_queue Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 02/35] sctp: translate host order to network order when setting a hmacid Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 03/35] snmp: Remove duplicate OUTMCAST stat increment Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 05/35] tcp: md5: fix lockdep annotation Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 06/35] tcp: initialize tp->copied_seq in case of cross SYN connection Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 07/35] net, scm: fix PaX detected msg_controllen overflow in scm_detach_fds Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 08/35] net: ipmr: fix static mfc/dev leaks on table destruction Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 09/35] net: ip6mr: " Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 10/35] broadcom: fix PHY_ID_BCM5481 entry in the id table Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 11/35] ipv6: distinguish frag queues by device for multicast and link-local packets Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 12/35] ipv6: sctp: implement sctp_v6_destroy_sock() Greg Kroah-Hartman
2016-01-20 22:00 ` Greg Kroah-Hartman [this message]
2016-01-20 22:00 ` [PATCH 3.10 14/35] ext4, jbd2: ensure entering into panic after recording an error in superblock Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 15/35] firewire: ohci: fix JMicron JMB38x IT context discovery Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 16/35] nfs4: start callback_ident at idr 1 Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 17/35] nfs: if we have no valid attrs, then dont declare the attribute cache valid Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 18/35] USB: cdc_acm: Ignore Infineon Flash Loader utility Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 19/35] USB: cp210x: Remove CP2110 ID from compatibility list Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 20/35] USB: add quirk for devices with broken LPM Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 21/35] USB: whci-hcd: add check for dma mapping error Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 22/35] usb: Use the USB_SS_MULT() macro to decode burst multiplier for log message Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 23/35] gre6: allow to update all parameters via rtnl Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 24/35] atl1c: Improve driver not to do order 4 GFP_ATOMIC allocation Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 25/35] sctp: update the netstamp_needed counter when copying sockets Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 26/35] ipv6: sctp: clone options to avoid use after free Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 28/35] sh_eth: fix kernel oops in skb_put() Greg Kroah-Hartman
2016-01-20 22:00 ` [PATCH 3.10 29/35] pptp: verify sockaddr_len in pptp_bind() and pptp_connect() Greg Kroah-Hartman
2016-01-20 22:01 ` [PATCH 3.10 30/35] bluetooth: Validate socket address length in sco_sock_bind() Greg Kroah-Hartman
2016-01-20 22:01 ` [PATCH 3.10 31/35] af_unix: Revert lock_interruptible in stream receive code Greg Kroah-Hartman
2016-01-20 22:01 ` [PATCH 3.10 32/35] KEYS: Fix race between key destruction and finding a keyring by name Greg Kroah-Hartman
2016-01-20 22:01 ` [PATCH 3.10 33/35] KEYS: Fix crash when attempt to garbage collect an uninstantiated keyring Greg Kroah-Hartman
2016-01-20 22:01 ` [PATCH 3.10 34/35] KEYS: Fix race between read and revoke Greg Kroah-Hartman
2016-01-20 22:01 ` [PATCH 3.10 35/35] KEYS: Fix keyring ref leak in join_session_keyring() Greg Kroah-Hartman
2016-01-20 23:14 ` [PATCH 3.10 00/35] 3.10.95-stable review Shuah Khan
2016-01-21  7:06   ` Willy Tarreau
2016-01-22  7:52     ` Greg Kroah-Hartman
2016-01-22  8:30       ` Willy Tarreau
2016-01-21 12:20 ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160120211952.897550752@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=fdmanana@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.