From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Filipe Manana <fdmanana@suse.com>
Subject: [PATCH 4.1 31/45] Btrfs: fix race leading to BUG_ON when running delalloc for nodatacow
Date: Sat, 12 Dec 2015 11:33:26 -0800
Message-ID: <20151212193325.489318376@linuxfoundation.org>
In-Reply-To: <20151212193323.965395988@linuxfoundation.org>

4.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Filipe Manana <fdmanana@suse.com>

commit 1d512cb77bdbda80f0dd0620a3b260d697fd581d upstream.

If we are using the NO_HOLES feature, we have a tiny time window when
running delalloc for a nodatacow inode where we can race with a concurrent
link or xattr add operation leading to a BUG_ON.

This happens because at run_delalloc_nocow() we end up casting a leaf item
of type BTRFS_INODE_[REF|EXTREF]_KEY or of type BTRFS_XATTR_ITEM_KEY to a
file extent item (struct btrfs_file_extent_item) and then analyse its
extent type field, which won't match any of the expected extent types
(values BTRFS_FILE_EXTENT_[REG|PREALLOC|INLINE]) and therefore trigger an
explicit BUG_ON(1).
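
As a rough illustration of the misinterpretation described above, the
sketch below reads the byte where a file extent item keeps its type field
out of a buffer that actually holds an inode reference. The struct layouts
are simplified stand-ins for the on-disk structures (endianness handling
is omitted), and every sketch_* name, the ref index and the link name are
assumptions made for this example, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Extent type values from the btrfs on-disk format. */
#define BTRFS_FILE_EXTENT_INLINE   0
#define BTRFS_FILE_EXTENT_REG      1
#define BTRFS_FILE_EXTENT_PREALLOC 2

/* Leading fields of a file extent item, simplified (hypothetical name). */
struct sketch_file_extent_item {
	uint64_t generation;
	uint64_t ram_bytes;
	uint8_t compression;
	uint8_t encryption;
	uint16_t other_encoding;
	uint8_t type;
} __attribute__((packed));

/* Fixed header of an inode reference item; the name bytes follow it. */
struct sketch_inode_ref {
	uint64_t index;
	uint16_t name_len;
} __attribute__((packed));

/* Write an inode ref (header plus name) into buf and return its size. */
size_t sketch_build_inode_ref(uint8_t *buf, size_t cap, uint64_t index,
			      const char *name)
{
	struct sketch_inode_ref hdr = { index, (uint16_t)strlen(name) };
	size_t total = sizeof(hdr) + hdr.name_len;

	assert(total <= cap);
	memcpy(buf, &hdr, sizeof(hdr));
	memcpy(buf + sizeof(hdr), name, hdr.name_len);
	return total;
}

/* Read the byte at the offset where a file extent item stores its type. */
uint8_t sketch_extent_type(const uint8_t *item, size_t len)
{
	size_t off = offsetof(struct sketch_file_extent_item, type);

	assert(off < len);
	return item[off];
}

/* True only for the three extent types the nocow path expects. */
int sketch_extent_type_is_known(uint8_t type)
{
	return type == BTRFS_FILE_EXTENT_INLINE ||
	       type == BTRFS_FILE_EXTENT_REG ||
	       type == BTRFS_FILE_EXTENT_PREALLOC;
}
```

With an inode ref built for a 16 byte name such as "link_to_file_257",
the type offset falls inside the name bytes, so sketch_extent_type()
returns a character of the link name and sketch_extent_type_is_known()
rejects it. That is the situation in which the real code reaches its
final else branch.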

The following sequence diagram shows how the race happens when running a
no-cow delalloc range [4K, 8K[ for inode 257 and we have the following
neighbour leafs:

             Leaf X (has N items)                    Leaf Y

 [ ... (257 INODE_ITEM 0) (257 INODE_REF 256) ]  [ (257 EXTENT_DATA 8192), ... ]
              slot N - 2         slot N - 1              slot 0

 (Note the implicit hole for inode 257 regarding the [0, 8K[ range)

       CPU 1                                         CPU 2

 run_delalloc_nocow()
   btrfs_lookup_file_extent()
     --> searches for a key with value
         (257 EXTENT_DATA 4096) in the
         fs/subvol tree
     --> returns us a path with
         path->nodes[0] == leaf X and
         path->slots[0] == N

   because path->slots[0] is >=
   btrfs_header_nritems(leaf X), it
   calls btrfs_next_leaf()

   btrfs_next_leaf()
     --> releases the path

                                              hard link added to our inode,
                                              with key (257 INODE_REF 500)
                                              added to the end of leaf X,
                                              so leaf X now has N + 1 keys

     --> searches for the key
         (257 INODE_REF 256), because
         it was the last key in leaf X
         before it released the path,
         with path->keep_locks set to 1

     --> ends up at leaf X again and
         it verifies that the key
         (257 INODE_REF 256) is no longer
         the last key in the leaf, so it
         returns with path->nodes[0] ==
         leaf X and path->slots[0] == N,
         pointing to the new item with
         key (257 INODE_REF 500)

   the loop iteration of run_delalloc_nocow()
   does not break out of the loop and continues
   because the key referenced in the path
   at path->nodes[0] and path->slots[0] is
   for inode 257, its type is < BTRFS_EXTENT_DATA_KEY
   and its offset (500) is less than our delalloc
   range's end (8192)

   the item pointed to by the path, an inode reference item,
   is (incorrectly) interpreted as a file extent item and
   we get an invalid extent type, leading to the BUG_ON(1):

   if (extent_type == BTRFS_FILE_EXTENT_REG ||
       extent_type == BTRFS_FILE_EXTENT_PREALLOC) {
       (...)
   } else if (extent_type == BTRFS_FILE_EXTENT_INLINE) {
       (...)
   } else {
       BUG_ON(1);
   }

The same can happen if an xattr is added concurrently and ends up having
a key with an offset smaller than the delalloc range's end.

So fix this by skipping keys with a type smaller than
BTRFS_EXTENT_DATA_KEY.
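
The effect of the new check can be sketched outside the kernel as a small
key classifier. The key type values are the ones from the btrfs on-disk
format; the struct, the enum and all sketch_* and classify_* names are
hypothetical and exist only for this illustration. The WARN_ON_ONCE from
the actual patch is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Key type values from the btrfs on-disk format. */
#define BTRFS_INODE_REF_KEY   12
#define BTRFS_XATTR_ITEM_KEY  24
#define BTRFS_EXTENT_DATA_KEY 108

/* Simplified CPU-order key (hypothetical name). */
struct sketch_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

enum sketch_action { SKETCH_BREAK, SKETCH_SKIP, SKETCH_PROCESS };

/* Check before the patch: any key not sorting past the range is processed. */
enum sketch_action classify_old(struct sketch_key k, uint64_t ino, uint64_t end)
{
	if (k.objectid > ino ||
	    k.type > BTRFS_EXTENT_DATA_KEY ||
	    k.offset > end)
		return SKETCH_BREAK;
	return SKETCH_PROCESS;
}

/* Check after the patch: keys sorting before EXTENT_DATA are skipped. */
enum sketch_action classify_new(struct sketch_key k, uint64_t ino, uint64_t end)
{
	if (k.objectid > ino)
		return SKETCH_BREAK;
	if (k.objectid < ino || k.type < BTRFS_EXTENT_DATA_KEY)
		return SKETCH_SKIP;
	if (k.type > BTRFS_EXTENT_DATA_KEY || k.offset > end)
		return SKETCH_BREAK;
	return SKETCH_PROCESS;
}

/* Replays the key (257 INODE_REF 500) from the race described above. */
void sketch_replay_race(void)
{
	struct sketch_key new_ref = { 257, BTRFS_INODE_REF_KEY, 500 };

	assert(classify_old(new_ref, 257, 8192) == SKETCH_PROCESS);
	assert(classify_new(new_ref, 257, 8192) == SKETCH_SKIP);
}
```

Here SKETCH_PROCESS stands for falling through to the rest of the loop
body, where the item is treated as a file extent item, and SKETCH_SKIP
stands for advancing path->slots[0] and jumping back to next_slot. Under
the old check the new inode reference is processed; under the new check
it is skipped, and the same holds for an xattr key of inode 257.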

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/inode.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1294,8 +1294,14 @@ next_slot:
 		num_bytes = 0;
 		btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
 
-		if (found_key.objectid > ino ||
-		    found_key.type > BTRFS_EXTENT_DATA_KEY ||
+		if (found_key.objectid > ino)
+			break;
+		if (WARN_ON_ONCE(found_key.objectid < ino) ||
+		    found_key.type < BTRFS_EXTENT_DATA_KEY) {
+			path->slots[0]++;
+			goto next_slot;
+		}
+		if (found_key.type > BTRFS_EXTENT_DATA_KEY ||
 		    found_key.offset > end)
 			break;
 



