From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Filipe Manana <fdmanana@suse.com>,
David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.14 09/60] btrfs: fix processing of delayed data refs during backref walking
Date: Wed, 2 Nov 2022 03:34:30 +0100 [thread overview]
Message-ID: <20221102022051.373479482@linuxfoundation.org> (raw)
In-Reply-To: <20221102022051.081761052@linuxfoundation.org>
From: Filipe Manana <fdmanana@suse.com>
[ Upstream commit 4fc7b57228243d09c0d878873bf24fa64a90fa01 ]
When processing delayed data references during backref walking and we are
using a share context (we are being called through fiemap), whenever we
find a delayed data reference for an inode different from the one we are
interested in, then we immediately exit and consider the data extent as
shared. This is wrong, because:
1) This might be a DROP reference that will cancel out a reference in the
extent tree;
2) Even if it's an ADD reference, it may be followed by a DROP reference
that cancels it out.
In either case we should not exit immediately.
Fix this by never exiting when we find a delayed data reference for
another inode - instead add the reference and if it does not cancel out
other delayed reference, we will exit early when we call
extent_is_shared() after processing all delayed references. If we find
a drop reference, then signal the code that processes references from
the extent tree (add_inline_refs() and add_keyed_refs()) to not exit
immediately if it finds there a reference for another inode, since we
have delayed drop references that may cancel it out. In this later case
we exit once we don't have references in the rb trees that cancel out
each other and have two references for different inodes.
Example reproducer for case 1):
$ cat test-1.sh
#!/bin/bash
DEV=/dev/sdj
MNT=/mnt/sdj
mkfs.btrfs -f $DEV
mount $DEV $MNT
xfs_io -f -c "pwrite 0 64K" $MNT/foo
cp --reflink=always $MNT/foo $MNT/bar
echo
echo "fiemap after cloning:"
xfs_io -c "fiemap -v" $MNT/foo
rm -f $MNT/bar
echo
echo "fiemap after removing file bar:"
xfs_io -c "fiemap -v" $MNT/foo
umount $MNT
Running it before this patch, the extent is still listed as shared, it has
the flag 0x2000 (FIEMAP_EXTENT_SHARED) set:
$ ./test-1.sh
fiemap after cloning:
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 26624..26751 128 0x2001
fiemap after removing file bar:
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 26624..26751 128 0x2001
Example reproducer for case 2):
$ cat test-2.sh
#!/bin/bash
DEV=/dev/sdj
MNT=/mnt/sdj
mkfs.btrfs -f $DEV
mount $DEV $MNT
xfs_io -f -c "pwrite 0 64K" $MNT/foo
cp --reflink=always $MNT/foo $MNT/bar
# Flush delayed references to the extent tree and commit current
# transaction.
sync
echo
echo "fiemap after cloning:"
xfs_io -c "fiemap -v" $MNT/foo
rm -f $MNT/bar
echo
echo "fiemap after removing file bar:"
xfs_io -c "fiemap -v" $MNT/foo
umount $MNT
Running it before this patch, the extent is still listed as shared, it has
the flag 0x2000 (FIEMAP_EXTENT_SHARED) set:
$ ./test-2.sh
fiemap after cloning:
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 26624..26751 128 0x2001
fiemap after removing file bar:
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 26624..26751 128 0x2001
After this patch, after deleting bar in both tests, the extent is not
reported with the 0x2000 flag anymore, it gets only the flag 0x1
(which is FIEMAP_EXTENT_LAST):
$ ./test-1.sh
fiemap after cloning:
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 26624..26751 128 0x2001
fiemap after removing file bar:
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 26624..26751 128 0x1
$ ./test-2.sh
fiemap after cloning:
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 26624..26751 128 0x2001
fiemap after removing file bar:
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 26624..26751 128 0x1
These tests will later be converted to a test case for fstests.
Fixes: dc046b10c8b7d4 ("Btrfs: make fiemap not blow when you have lots of snapshots")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/btrfs/backref.c | 33 ++++++++++++++++++++++++---------
1 file changed, 24 insertions(+), 9 deletions(-)
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 58dc96d7ecaf..93cfbdada40f 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -146,6 +146,7 @@ struct share_check {
u64 root_objectid;
u64 inum;
int share_count;
+ bool have_delayed_delete_refs;
};
static inline int extent_is_shared(struct share_check *sc)
@@ -832,13 +833,22 @@ static int add_delayed_refs(const struct btrfs_fs_info *fs_info,
key.offset = ref->offset;
/*
- * Found a inum that doesn't match our known inum, we
- * know it's shared.
+ * If we have a share check context and a reference for
+ * another inode, we can't exit immediately. This is
+ * because even if this is a BTRFS_ADD_DELAYED_REF
+ * reference we may find next a BTRFS_DROP_DELAYED_REF
+ * which cancels out this ADD reference.
+ *
+ * If this is a DROP reference and there was no previous
+ * ADD reference, then we need to signal that when we
+ * process references from the extent tree (through
+ * add_inline_refs() and add_keyed_refs()), we should
+ * not exit early if we find a reference for another
+ * inode, because one of the delayed DROP references
+ * may cancel that reference in the extent tree.
*/
- if (sc && sc->inum && ref->objectid != sc->inum) {
- ret = BACKREF_FOUND_SHARED;
- goto out;
- }
+ if (sc && count < 0)
+ sc->have_delayed_delete_refs = true;
ret = add_indirect_ref(fs_info, preftrees, ref->root,
&key, 0, node->bytenr, count, sc,
@@ -868,7 +878,7 @@ static int add_delayed_refs(const struct btrfs_fs_info *fs_info,
}
if (!ret)
ret = extent_is_shared(sc);
-out:
+
spin_unlock(&head->lock);
return ret;
}
@@ -972,7 +982,8 @@ static int add_inline_refs(const struct btrfs_fs_info *fs_info,
key.type = BTRFS_EXTENT_DATA_KEY;
key.offset = btrfs_extent_data_ref_offset(leaf, dref);
- if (sc && sc->inum && key.objectid != sc->inum) {
+ if (sc && sc->inum && key.objectid != sc->inum &&
+ !sc->have_delayed_delete_refs) {
ret = BACKREF_FOUND_SHARED;
break;
}
@@ -982,6 +993,7 @@ static int add_inline_refs(const struct btrfs_fs_info *fs_info,
ret = add_indirect_ref(fs_info, preftrees, root,
&key, 0, bytenr, count,
sc, GFP_NOFS);
+
break;
}
default:
@@ -1071,7 +1083,8 @@ static int add_keyed_refs(struct btrfs_fs_info *fs_info,
key.type = BTRFS_EXTENT_DATA_KEY;
key.offset = btrfs_extent_data_ref_offset(leaf, dref);
- if (sc && sc->inum && key.objectid != sc->inum) {
+ if (sc && sc->inum && key.objectid != sc->inum &&
+ !sc->have_delayed_delete_refs) {
ret = BACKREF_FOUND_SHARED;
break;
}
@@ -1490,6 +1503,7 @@ int btrfs_check_shared(struct btrfs_root *root, u64 inum, u64 bytenr)
.root_objectid = root->objectid,
.inum = inum,
.share_count = 0,
+ .have_delayed_delete_refs = false,
};
tmp = ulist_alloc(GFP_NOFS);
@@ -1528,6 +1542,7 @@ int btrfs_check_shared(struct btrfs_root *root, u64 inum, u64 bytenr)
break;
bytenr = node->val;
shared.share_count = 0;
+ shared.have_delayed_delete_refs = false;
cond_resched();
}
--
2.35.1
next prev parent reply other threads:[~2022-11-02 3:40 UTC|newest]
Thread overview: 63+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-02 2:34 [PATCH 4.14 00/60] 4.14.298-rc1 review Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 01/60] ocfs2: clear dinode links count in case of error Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 02/60] ocfs2: fix BUG when iput after ocfs2_mknod fails Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 03/60] x86/microcode/AMD: Apply the patch early on every logical thread Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 04/60] ata: ahci-imx: Fix MODULE_ALIAS Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 05/60] ata: ahci: Match EM_MAX_SLOTS with SATA_PMP_MAX_PORTS Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 06/60] KVM: arm64: vgic: Fix exit condition in scan_its_table() Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 07/60] arm64: errata: Remove AES hwcap for COMPAT tasks Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 08/60] r8152: add PID for the Lenovo OneLink+ Dock Greg Kroah-Hartman
2022-11-02 2:34 ` Greg Kroah-Hartman [this message]
2022-11-02 2:34 ` [PATCH 4.14 10/60] ACPI: extlog: Handle multiple records Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 11/60] HID: magicmouse: Do not set BTN_MOUSE on double report Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 12/60] net/atm: fix proc_mpc_write incorrect return value Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 13/60] net: hns: fix possible memory leak in hnae_ae_register() Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 14/60] iommu/vt-d: Clean up si_domain in the init_dmars() error path Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 15/60] media: v4l2-mem2mem: Apply DST_QUEUE_OFF_BASE on MMAP buffers across ioctls Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 16/60] [PATCH v3] ACPI: video: Force backlight native for more TongFang devices Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 17/60] ALSA: Use del_timer_sync() before freeing timer Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 18/60] ALSA: au88x0: use explicitly signed char Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 19/60] USB: add RESET_RESUME quirk for NVIDIA Jetson devices in RCM Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 20/60] usb: dwc3: gadget: Dont set IMI for no_interrupt Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 21/60] usb: bdc: change state when port disconnected Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 22/60] usb: xhci: add XHCI_SPURIOUS_SUCCESS to ASM1042 despite being a V0.96 controller Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 23/60] xhci: Remove device endpoints from bandwidth list when freeing the device Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 24/60] tools: iio: iio_utils: fix digit calculation Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 25/60] iio: light: tsl2583: Fix module unloading Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 26/60] fbdev: smscufx: Fix several use-after-free bugs Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 27/60] mac802154: Fix LQI recording Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 28/60] drm/msm/hdmi: fix memory corruption with too many bridges Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 29/60] mmc: core: Fix kernel panic when remove non-standard SDIO card Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 30/60] kernfs: fix use-after-free in __kernfs_remove Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 31/60] s390/futex: add missing EX_TABLE entry to __futex_atomic_op() Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 32/60] Xen/gntdev: dont ignore kernel unmapping error Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 33/60] xen/gntdev: Prevent leaking grants Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 34/60] mm,hugetlb: take hugetlb_lock before decrementing h->resv_huge_pages Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 35/60] net: ieee802154: fix error return code in dgram_bind() Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 36/60] drm/msm: Fix return type of mdp4_lvds_connector_mode_valid Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 37/60] arc: iounmap() arg is volatile Greg Kroah-Hartman
2022-11-02 2:34 ` [PATCH 4.14 38/60] ALSA: ac97: fix possible memory leak in snd_ac97_dev_register() Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 39/60] x86/unwind/orc: Fix unreliable stack dump with gcov Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 40/60] amd-xgbe: fix the SFP compliance codes check for DAC cables Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 41/60] amd-xgbe: add the bit rate quirk for Molex cables Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 42/60] kcm: annotate data-races around kcm->rx_psock Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 43/60] kcm: annotate data-races around kcm->rx_wait Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 44/60] net: lantiq_etop: dont free skb when returning NETDEV_TX_BUSY Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 45/60] tcp: fix indefinite deferral of RTO with SACK reneging Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 46/60] can: mscan: mpc5xxx: mpc5xxx_can_probe(): add missing put_clock() in error path Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 47/60] PM: hibernate: Allow hybrid sleep to work with s2idle Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 48/60] media: vivid: s_fbuf: add more sanity checks Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 49/60] media: vivid: dev->bitmap_cap wasnt freed in all cases Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 50/60] media: v4l2-dv-timings: add sanity checks for blanking values Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 51/60] media: videodev2.h: V4L2_DV_BT_BLANKING_HEIGHT should check interlaced Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 52/60] i40e: Fix ethtool rx-flow-hash setting for X722 Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 53/60] i40e: Fix flow-type by setting GL_HASH_INSET registers Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 54/60] net: ksz884x: fix missing pci_disable_device() on error in pcidev_init() Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 55/60] PM: domains: Fix handling of unavailable/disabled idle states Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 56/60] ALSA: aoa: i2sbus: fix possible memory leak in i2sbus_add_dev() Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 57/60] ALSA: aoa: Fix I2S device accounting Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 58/60] openvswitch: switch from WARN to pr_warn Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 59/60] net: ehea: fix possible memory leak in ehea_register_port() Greg Kroah-Hartman
2022-11-02 2:35 ` [PATCH 4.14 60/60] can: rcar_canfd: rcar_canfd_handle_global_receive(): fix IRQ storm on global FIFO receive Greg Kroah-Hartman
2022-11-02 20:45 ` [PATCH 4.14 00/60] 4.14.298-rc1 review Guenter Roeck
2022-11-03 10:33 ` Naresh Kamboju
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20221102022051.373479482@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=dsterba@suse.com \
--cc=fdmanana@suse.com \
--cc=patches@lists.linux.dev \
--cc=sashal@kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).