From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Filipe Manana <fdmanana@suse.com>,
David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.19 12/78] btrfs: fix processing of delayed tree block refs during backref walking
Date: Wed, 2 Nov 2022 03:33:57 +0100 [thread overview]
Message-ID: <20221102022053.294815104@linuxfoundation.org> (raw)
In-Reply-To: <20221102022052.895556444@linuxfoundation.org>
From: Filipe Manana <fdmanana@suse.com>
[ Upstream commit 943553ef9b51db303ab2b955c1025261abfdf6fb ]
During backref walking, when processing a delayed reference with a type of
BTRFS_TREE_BLOCK_REF_KEY, we have two bugs there:
1) We are accessing the delayed references extent_op, and its key, without
the protection of the delayed ref head's lock;
2) If there's no extent op for the delayed ref head, we end up with an
uninitialized key in the stack, variable 'tmp_op_key', and then pass
it to add_indirect_ref(), which adds the reference to the indirect
refs rb tree.
This is wrong, because indirect references should have a NULL key
when we don't have access to the key, and in that case they should be
added to the indirect_missing_keys rb tree and not to the indirect rb
tree.
This means that if have BTRFS_TREE_BLOCK_REF_KEY delayed ref resulting
from freeing an extent buffer, therefore with a count of -1, it will
not cancel out the corresponding reference we have in the extent tree
(with a count of 1), since both references end up in different rb
trees.
When using fiemap, where we often need to check if extents are shared
through shared subtrees resulting from snapshots, it means we can
incorrectly report an extent as shared when it's no longer shared.
However this is temporary because after the transaction is committed
the extent is no longer reported as shared, as running the delayed
reference results in deleting the tree block reference from the extent
tree.
Outside the fiemap context, the result is unpredictable, as the key was
not initialized but it's used when navigating the rb trees to insert
and search for references (prelim_ref_compare()), and we expect all
references in the indirect rb tree to have valid keys.
The following reproducer triggers the second bug:
$ cat test.sh
#!/bin/bash
DEV=/dev/sdj
MNT=/mnt/sdj
mkfs.btrfs -f $DEV
mount -o compress $DEV $MNT
# With a compressed 128M file we get a tree height of 2 (level 1 root).
xfs_io -f -c "pwrite -b 1M 0 128M" $MNT/foo
btrfs subvolume snapshot $MNT $MNT/snap
# Fiemap should output 0x2008 in the flags column.
# 0x2000 means shared extent
# 0x8 means encoded extent (because it's compressed)
echo
echo "fiemap after snapshot, range [120M, 120M + 128K):"
xfs_io -c "fiemap -v 120M 128K" $MNT/foo
echo
# Overwrite one extent and fsync to flush delalloc and COW a new path
# in the snapshot's tree.
#
# After this we have a BTRFS_DROP_DELAYED_REF delayed ref of type
# BTRFS_TREE_BLOCK_REF_KEY with a count of -1 for every COWed extent
# buffer in the path.
#
# In the extent tree we have inline references of type
# BTRFS_TREE_BLOCK_REF_KEY, with a count of 1, for the same extent
# buffers, so they should cancel each other, and the extent buffers in
# the fs tree should no longer be considered as shared.
#
echo "Overwriting file range [120M, 120M + 128K)..."
xfs_io -c "pwrite -b 128K 120M 128K" $MNT/snap/foo
xfs_io -c "fsync" $MNT/snap/foo
# Fiemap should output 0x8 in the flags column. The extent in the range
# [120M, 120M + 128K) is no longer shared, it's now exclusive to the fs
# tree.
echo
echo "fiemap after overwrite range [120M, 120M + 128K):"
xfs_io -c "fiemap -v 120M 128K" $MNT/foo
echo
umount $MNT
Running it before this patch:
$ ./test.sh
(...)
wrote 134217728/134217728 bytes at offset 0
128 MiB, 128 ops; 0.1152 sec (1.085 GiB/sec and 1110.5809 ops/sec)
Create a snapshot of '/mnt/sdj' in '/mnt/sdj/snap'
fiemap after snapshot, range [120M, 120M + 128K):
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [245760..246015]: 34304..34559 256 0x2008
Overwriting file range [120M, 120M + 128K)...
wrote 131072/131072 bytes at offset 125829120
128 KiB, 1 ops; 0.0001 sec (683.060 MiB/sec and 5464.4809 ops/sec)
fiemap after overwrite range [120M, 120M + 128K):
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [245760..246015]: 34304..34559 256 0x2008
The extent in the range [120M, 120M + 128K) is still reported as shared
(0x2000 bit set) after overwriting that range and flushing delalloc, which
is not correct - an entire path was COWed in the snapshot's tree and the
extent is now only referenced by the original fs tree.
Running it after this patch:
$ ./test.sh
(...)
wrote 134217728/134217728 bytes at offset 0
128 MiB, 128 ops; 0.1198 sec (1.043 GiB/sec and 1068.2067 ops/sec)
Create a snapshot of '/mnt/sdj' in '/mnt/sdj/snap'
fiemap after snapshot, range [120M, 120M + 128K):
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [245760..246015]: 34304..34559 256 0x2008
Overwriting file range [120M, 120M + 128K)...
wrote 131072/131072 bytes at offset 125829120
128 KiB, 1 ops; 0.0001 sec (694.444 MiB/sec and 5555.5556 ops/sec)
fiemap after overwrite range [120M, 120M + 128K):
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [245760..246015]: 34304..34559 256 0x8
Now the extent is not reported as shared anymore.
So fix this by passing a NULL key pointer to add_indirect_ref() when
processing a delayed reference for a tree block if there's no extent op
for our delayed ref head with a defined key. Also access the extent op
only after locking the delayed ref head's lock.
The reproducer will be converted later to a test case for fstests.
Fixes: 86d5f994425252 ("btrfs: convert prelimary reference tracking to use rbtrees")
Fixes: a6dbceafb915e8 ("btrfs: Remove unused op_key var from add_delayed_refs")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/btrfs/backref.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 5e27e30fd887..781c725e6432 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -761,16 +761,11 @@ static int add_delayed_refs(const struct btrfs_fs_info *fs_info,
struct share_check *sc)
{
struct btrfs_delayed_ref_node *node;
- struct btrfs_delayed_extent_op *extent_op = head->extent_op;
struct btrfs_key key;
- struct btrfs_key tmp_op_key;
struct rb_node *n;
int count;
int ret = 0;
- if (extent_op && extent_op->update_key)
- btrfs_disk_key_to_cpu(&tmp_op_key, &extent_op->key);
-
spin_lock(&head->lock);
for (n = rb_first(&head->ref_tree); n; n = rb_next(n)) {
node = rb_entry(n, struct btrfs_delayed_ref_node,
@@ -797,10 +792,16 @@ static int add_delayed_refs(const struct btrfs_fs_info *fs_info,
case BTRFS_TREE_BLOCK_REF_KEY: {
/* NORMAL INDIRECT METADATA backref */
struct btrfs_delayed_tree_ref *ref;
+ struct btrfs_key *key_ptr = NULL;
+
+ if (head->extent_op && head->extent_op->update_key) {
+ btrfs_disk_key_to_cpu(&key, &head->extent_op->key);
+ key_ptr = &key;
+ }
ref = btrfs_delayed_node_to_tree_ref(node);
ret = add_indirect_ref(fs_info, preftrees, ref->root,
- &tmp_op_key, ref->level + 1,
+ key_ptr, ref->level + 1,
node->bytenr, count, sc,
GFP_ATOMIC);
break;
--
2.35.1
next prev parent reply other threads:[~2022-11-02 3:27 UTC|newest]
Thread overview: 85+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-02 2:33 [PATCH 4.19 00/78] 4.19.264-rc1 review Greg Kroah-Hartman
2022-11-02 2:33 ` [PATCH 4.19 01/78] ocfs2: clear dinode links count in case of error Greg Kroah-Hartman
2022-11-02 2:33 ` [PATCH 4.19 02/78] ocfs2: fix BUG when iput after ocfs2_mknod fails Greg Kroah-Hartman
2022-11-02 2:33 ` [PATCH 4.19 03/78] x86/microcode/AMD: Apply the patch early on every logical thread Greg Kroah-Hartman
2022-11-02 2:33 ` [PATCH 4.19 04/78] hwmon/coretemp: Handle large core ID value Greg Kroah-Hartman
2022-11-02 2:33 ` [PATCH 4.19 05/78] ata: ahci-imx: Fix MODULE_ALIAS Greg Kroah-Hartman
2022-11-02 2:33 ` [PATCH 4.19 06/78] ata: ahci: Match EM_MAX_SLOTS with SATA_PMP_MAX_PORTS Greg Kroah-Hartman
2022-11-02 2:33 ` [PATCH 4.19 07/78] KVM: arm64: vgic: Fix exit condition in scan_its_table() Greg Kroah-Hartman
2022-11-02 2:33 ` [PATCH 4.19 08/78] media: venus: dec: Handle the case where find_format fails Greg Kroah-Hartman
2022-11-02 2:33 ` [PATCH 4.19 09/78] arm64: errata: Remove AES hwcap for COMPAT tasks Greg Kroah-Hartman
2022-11-02 2:33 ` [PATCH 4.19 10/78] r8152: add PID for the Lenovo OneLink+ Dock Greg Kroah-Hartman
2022-11-02 2:33 ` [PATCH 4.19 11/78] btrfs: fix processing of delayed data refs during backref walking Greg Kroah-Hartman
2022-11-02 2:33 ` Greg Kroah-Hartman [this message]
2022-11-02 2:33 ` [PATCH 4.19 13/78] ACPI: extlog: Handle multiple records Greg Kroah-Hartman
2022-11-02 2:33 ` [PATCH 4.19 14/78] tipc: Fix recognition of trial period Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 15/78] tipc: fix an information leak in tipc_topsrv_kern_subscr Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 16/78] HID: magicmouse: Do not set BTN_MOUSE on double report Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 17/78] net/atm: fix proc_mpc_write incorrect return value Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 18/78] net: sched: cake: fix null pointer access issue when cake_init() fails Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 19/78] net: hns: fix possible memory leak in hnae_ae_register() Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 20/78] iommu/vt-d: Clean up si_domain in the init_dmars() error path Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 21/78] media: v4l2-mem2mem: Apply DST_QUEUE_OFF_BASE on MMAP buffers across ioctls Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 22/78] [PATCH v3] ACPI: video: Force backlight native for more TongFang devices Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 23/78] Makefile.debug: re-enable debug info for .S files Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 24/78] hv_netvsc: Fix race between VF offering and VF association message from host Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 25/78] mm: /proc/pid/smaps_rollup: fix no vmas null-deref Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 26/78] can: kvaser_usb: Fix possible completions during init_completion Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 27/78] ALSA: Use del_timer_sync() before freeing timer Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 28/78] ALSA: au88x0: use explicitly signed char Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 29/78] USB: add RESET_RESUME quirk for NVIDIA Jetson devices in RCM Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 30/78] usb: dwc3: gadget: Stop processing more requests on IMI Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 31/78] usb: dwc3: gadget: Dont set IMI for no_interrupt Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 32/78] usb: bdc: change state when port disconnected Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 33/78] usb: xhci: add XHCI_SPURIOUS_SUCCESS to ASM1042 despite being a V0.96 controller Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 34/78] xhci: Remove device endpoints from bandwidth list when freeing the device Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 35/78] tools: iio: iio_utils: fix digit calculation Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 36/78] iio: light: tsl2583: Fix module unloading Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 37/78] fbdev: smscufx: Fix several use-after-free bugs Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 38/78] mac802154: Fix LQI recording Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 39/78] drm/msm/dsi: fix memory corruption with too many bridges Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 40/78] drm/msm/hdmi: " Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 41/78] mmc: core: Fix kernel panic when remove non-standard SDIO card Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 42/78] kernfs: fix use-after-free in __kernfs_remove Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 43/78] perf auxtrace: Fix address filter symbol name match for modules Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 44/78] s390/futex: add missing EX_TABLE entry to __futex_atomic_op() Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 45/78] Xen/gntdev: dont ignore kernel unmapping error Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 46/78] xen/gntdev: Prevent leaking grants Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 47/78] mm,hugetlb: take hugetlb_lock before decrementing h->resv_huge_pages Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 48/78] net: ieee802154: fix error return code in dgram_bind() Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 49/78] drm/msm: Fix return type of mdp4_lvds_connector_mode_valid Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 50/78] arc: iounmap() arg is volatile Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 51/78] ALSA: ac97: fix possible memory leak in snd_ac97_dev_register() Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 52/78] tipc: fix a null-ptr-deref in tipc_topsrv_accept Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 53/78] net: netsec: fix error handling in netsec_register_mdio() Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 54/78] x86/unwind/orc: Fix unreliable stack dump with gcov Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 55/78] amd-xgbe: fix the SFP compliance codes check for DAC cables Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 56/78] amd-xgbe: add the bit rate quirk for Molex cables Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 57/78] kcm: annotate data-races around kcm->rx_psock Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 58/78] kcm: annotate data-races around kcm->rx_wait Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 59/78] net: fix UAF issue in nfqnl_nf_hook_drop() when ops_init() failed Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 60/78] net: lantiq_etop: dont free skb when returning NETDEV_TX_BUSY Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 61/78] tcp: fix indefinite deferral of RTO with SACK reneging Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 62/78] can: mscan: mpc5xxx: mpc5xxx_can_probe(): add missing put_clock() in error path Greg Kroah-Hartman
2022-11-04 17:28 ` Pavel Machek
2022-11-02 2:34 ` [PATCH 4.19 63/78] PM: hibernate: Allow hybrid sleep to work with s2idle Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 64/78] media: vivid: s_fbuf: add more sanity checks Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 65/78] media: vivid: dev->bitmap_cap wasnt freed in all cases Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 66/78] media: v4l2-dv-timings: add sanity checks for blanking values Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 67/78] media: videodev2.h: V4L2_DV_BT_BLANKING_HEIGHT should check interlaced Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 68/78] i40e: Fix ethtool rx-flow-hash setting for X722 Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 69/78] i40e: Fix VF hang when reset is triggered on another VF Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 70/78] i40e: Fix flow-type by setting GL_HASH_INSET registers Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 71/78] net: ksz884x: fix missing pci_disable_device() on error in pcidev_init() Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 72/78] PM: domains: Fix handling of unavailable/disabled idle states Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 73/78] ALSA: aoa: i2sbus: fix possible memory leak in i2sbus_add_dev() Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.19 74/78] ALSA: aoa: Fix I2S device accounting Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.19 75/78] openvswitch: switch from WARN to pr_warn Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.19 76/78] net: ehea: fix possible memory leak in ehea_register_port() Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.19 77/78] net/mlx5e: Do not increment ESN when updating IPsec ESN state Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.19 78/78] can: rcar_canfd: rcar_canfd_handle_global_receive(): fix IRQ storm on global FIFO receive Greg Kroah-Hartman
2022-11-02 17:22 ` [PATCH 4.19 00/78] 4.19.264-rc1 review Pavel Machek
2022-11-02 20:46 ` Guenter Roeck
2022-11-03 10:18 ` Naresh Kamboju
2022-11-03 12:22 ` Sudip Mukherjee
2022-11-04 15:17 ` zhouzhixiu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20221102022053.294815104@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=dsterba@suse.com \
--cc=fdmanana@suse.com \
--cc=patches@lists.linux.dev \
--cc=sashal@kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).