From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Filipe Manana <fdmanana@suse.com>
Subject: [PATCH 3.10 13/35] Btrfs: fix race leading to BUG_ON when running delalloc for nodatacow
Date: Wed, 20 Jan 2016 14:00:43 -0800
Message-ID: <20160120211952.897550752@linuxfoundation.org>
In-Reply-To: <20160120211951.234493363@linuxfoundation.org>
3.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana <fdmanana@suse.com>
commit 1d512cb77bdbda80f0dd0620a3b260d697fd581d upstream.
If we are using the NO_HOLES feature, we have a tiny time window when
running delalloc for a nodatacow inode where we can race with a concurrent
link or xattr add operation leading to a BUG_ON.
This happens because at run_delalloc_nocow() we end up casting a leaf item
of type BTRFS_INODE_[REF|EXTREF]_KEY or of type BTRFS_XATTR_ITEM_KEY to a
file extent item (struct btrfs_file_extent_item) and then analyse its
extent type field, which won't match any of the expected extent types
(values BTRFS_FILE_EXTENT_[REG|PREALLOC|INLINE]) and therefore trigger an
explicit BUG_ON(1).
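To see why exactly these item types can show up where a file extent item is
expected, it helps to recall that keys are ordered by objectid, then type,
then offset, and that the inode ref, extref and xattr key types are all
numerically smaller than BTRFS_EXTENT_DATA_KEY. The following is only a
userspace sketch: struct demo_key and demo_key_cmp() are hypothetical
stand-ins for struct btrfs_key and the kernel's key comparison, and the
constants are copied by hand from the on-disk format definitions.

```c
#include <stdint.h>

/* Key type values from the btrfs on-disk format (copied for illustration). */
#define BTRFS_INODE_ITEM_KEY     1
#define BTRFS_INODE_REF_KEY     12
#define BTRFS_INODE_EXTREF_KEY  13
#define BTRFS_XATTR_ITEM_KEY    24
#define BTRFS_EXTENT_DATA_KEY  108

/* Hypothetical, simplified stand-in for struct btrfs_key. */
struct demo_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Compare by objectid, then type, then offset: the b-tree sort order. */
static int demo_key_cmp(const struct demo_key *a, const struct demo_key *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}
```

With this ordering, any (257 INODE_REF x), (257 INODE_EXTREF x) or
(257 XATTR_ITEM x) key compares lower than every (257 EXTENT_DATA y) key, so
such items always sit between the inode item and the first file extent item
of the same inode.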
The following sequence diagram shows how the race happens when running a
no-cow delalloc range [4K, 8K[ for inode 257 and we have the following
neighbour leafs:

              Leaf X (has N items)                           Leaf Y

 [ ... (257 INODE_ITEM 0) (257 INODE_REF 256) ]  [ (257 EXTENT_DATA 8192), ... ]
           slot N - 2         slot N - 1                   slot 0

 (Note the implicit hole for inode 257 regarding the [0, 8K[ range)

       CPU 1                                  CPU 2

 run_delalloc_nocow()
   btrfs_lookup_file_extent()
     --> searches for a key with value
         (257 EXTENT_DATA 4096) in the
         fs/subvol tree
     --> returns us a path with
         path->nodes[0] == leaf X and
         path->slots[0] == N

   because path->slots[0] is >=
   btrfs_header_nritems(leaf X), it
   calls btrfs_next_leaf()

   btrfs_next_leaf()
     --> releases the path

                                              hard link added to our inode,
                                              with key (257 INODE_REF 500)
                                              added to the end of leaf X,
                                              so leaf X now has N + 1 keys

     --> searches for the key
         (257 INODE_REF 256), because
         it was the last key in leaf X
         before it released the path,
         with path->keep_locks set to 1

     --> ends up at leaf X again and
         it verifies that the key
         (257 INODE_REF 256) is no longer
         the last key in the leaf, so it
         returns with path->nodes[0] ==
         leaf X and path->slots[0] == N,
         pointing to the new item with
         key (257 INODE_REF 500)

   the loop iteration of run_delalloc_nocow()
   does not break out of the loop and continues
   because the key referenced in the path
   at path->nodes[0] and path->slots[0] is
   for inode 257, its type is < BTRFS_EXTENT_DATA_KEY
   and its offset (500) is less than our delalloc
   range's end (8192)

   the item pointed to by the path, an inode reference item,
   is (incorrectly) interpreted as a file extent item and
   we get an invalid extent type, leading to the BUG_ON(1):

   if (extent_type == BTRFS_FILE_EXTENT_REG ||
       extent_type == BTRFS_FILE_EXTENT_PREALLOC) {
        (...)
   } else if (extent_type == BTRFS_FILE_EXTENT_INLINE) {
        (...)
   } else {
        BUG_ON(1);
   }

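The interleaving in the diagram can be replayed in userspace with a toy
sorted array standing in for leaf X. This is only a sketch under loud
assumptions: struct demo_leaf, demo_search_slot(), demo_insert(),
demo_next_after() and demo_replay_race() are hypothetical names, there is no
locking and no second leaf, and the real btrfs_next_leaf() does considerably
more. It only models "remember the last key, re-search for it, step one slot
forward".

```c
#include <stdint.h>
#include <string.h>

#define DEMO_INODE_ITEM_KEY   1   /* value of BTRFS_INODE_ITEM_KEY */
#define DEMO_INODE_REF_KEY   12   /* value of BTRFS_INODE_REF_KEY */
#define DEMO_LEAF_CAPACITY    8

struct demo_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

struct demo_leaf {
	int nritems;
	struct demo_key keys[DEMO_LEAF_CAPACITY];
};

static int demo_key_cmp(const struct demo_key *a, const struct demo_key *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* Return the first slot whose key is >= *key (may equal nritems). */
static int demo_search_slot(const struct demo_leaf *leaf,
			    const struct demo_key *key)
{
	int slot = 0;

	while (slot < leaf->nritems && demo_key_cmp(&leaf->keys[slot], key) < 0)
		slot++;
	return slot;
}

/* Insert a key keeping the leaf sorted; assumes the leaf has room. */
static void demo_insert(struct demo_leaf *leaf, const struct demo_key *key)
{
	int slot = demo_search_slot(leaf, key);

	memmove(&leaf->keys[slot + 1], &leaf->keys[slot],
		(size_t)(leaf->nritems - slot) * sizeof(*key));
	leaf->keys[slot] = *key;
	leaf->nritems++;
}

/*
 * Re-search for the key that used to be last in the leaf and step one slot
 * forward. Returns that slot if it is still inside this leaf, or -1 if the
 * caller would have to move on to the next leaf.
 */
static int demo_next_after(const struct demo_leaf *leaf,
			   const struct demo_key *last)
{
	int slot = demo_search_slot(leaf, last) + 1;

	return slot < leaf->nritems ? slot : -1;
}

/* Replay the race; returns the key type the path ends up on, or -1. */
static int demo_replay_race(int with_concurrent_link)
{
	struct demo_leaf x = { .nritems = 2, .keys = {
		{ 257, DEMO_INODE_ITEM_KEY, 0 },
		{ 257, DEMO_INODE_REF_KEY, 256 },
	} };
	struct demo_key last = x.keys[x.nritems - 1];
	struct demo_key link = { 257, DEMO_INODE_REF_KEY, 500 };
	int slot;

	/* "CPU 2": hard link added while the path is released. */
	if (with_concurrent_link)
		demo_insert(&x, &link);

	/* "CPU 1": re-search starting from the remembered last key. */
	slot = demo_next_after(&x, &last);
	return slot < 0 ? -1 : x.keys[slot].type;
}
```

Without the concurrent insert the model reports that the next leaf is needed;
with it, the returned slot holds the new (257 INODE_REF 500) key, which is the
state the diagram describes for CPU 1.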
The same can happen if an xattr is added concurrently and ends up having
a key with an offset smaller than the delalloc range's end.
So fix this by skipping keys with a type smaller than
BTRFS_EXTENT_DATA_KEY.
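As a rough userspace model of the check before and after this change, the two
functions below classify a found key the way the loop's conditions do. The
names demo_check_old(), demo_check_new(), struct demo_key and enum demo_action
are hypothetical, and the WARN_ON_ONCE() around the objectid < ino case is
left out because it has no userspace equivalent here.

```c
#include <stdint.h>

#define BTRFS_INODE_REF_KEY     12
#define BTRFS_XATTR_ITEM_KEY    24
#define BTRFS_EXTENT_DATA_KEY  108

/* Hypothetical, simplified stand-in for struct btrfs_key. */
struct demo_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

enum demo_action {
	DEMO_BREAK,	/* leave the loop */
	DEMO_SKIP,	/* path->slots[0]++; goto next_slot */
	DEMO_PROCESS,	/* treat the item as a file extent item */
};

/* Condition before the fix: smaller key types fall through. */
static enum demo_action demo_check_old(const struct demo_key *k,
				       uint64_t ino, uint64_t end)
{
	if (k->objectid > ino ||
	    k->type > BTRFS_EXTENT_DATA_KEY ||
	    k->offset > end)
		return DEMO_BREAK;
	return DEMO_PROCESS;
}

/* Condition after the fix: smaller key types are skipped. */
static enum demo_action demo_check_new(const struct demo_key *k,
				       uint64_t ino, uint64_t end)
{
	if (k->objectid > ino)
		return DEMO_BREAK;
	if (k->objectid < ino || k->type < BTRFS_EXTENT_DATA_KEY)
		return DEMO_SKIP;
	if (k->type > BTRFS_EXTENT_DATA_KEY || k->offset > end)
		return DEMO_BREAK;
	return DEMO_PROCESS;
}
```

For the key (257 INODE_REF 500) with ino 257 and end 8192, the old condition
yields DEMO_PROCESS, which is the path that ends in the BUG_ON(1), while the
new one yields DEMO_SKIP. Real (257 EXTENT_DATA x) keys are classified
identically by both.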
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/btrfs/inode.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1286,8 +1286,14 @@ next_slot:
 		num_bytes = 0;
 		btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
 
-		if (found_key.objectid > ino ||
-		    found_key.type > BTRFS_EXTENT_DATA_KEY ||
+		if (found_key.objectid > ino)
+			break;
+		if (WARN_ON_ONCE(found_key.objectid < ino) ||
+		    found_key.type < BTRFS_EXTENT_DATA_KEY) {
+			path->slots[0]++;
+			goto next_slot;
+		}
+		if (found_key.type > BTRFS_EXTENT_DATA_KEY ||
 		    found_key.offset > end)
 			break;