From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, "Darrick J. Wong" <djwong@kernel.org>,
Christoph Hellwig <hch@lst.de>,
Chandan Babu R <chandanbabu@kernel.org>,
Leah Rumancik <leah.rumancik@gmail.com>
Subject: [PATCH 6.1 38/97] xfs: restrict when we try to align cow fork delalloc to cowextsz hints
Date: Wed, 7 May 2025 20:39:13 +0200 [thread overview]
Message-ID: <20250507183808.526382076@linuxfoundation.org> (raw)
In-Reply-To: <20250507183806.987408728@linuxfoundation.org>
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Darrick J. Wong" <djwong@kernel.org>
[ Upstream commit 288e1f693f04e66be99f27e7cbe4a45936a66745 ]
xfs/205 produces the following failure when always_cow is enabled:
# --- a/tests/xfs/205.out 2024-02-28 16:20:24.437887970 -0800
# +++ b/tests/xfs/205.out.bad 2024-06-03 21:13:40.584000000 -0700
# @@ -1,4 +1,5 @@
# QA output created by 205
# *** one file
# + !!! disk full (expected)
# *** one file, a few bytes at a time
# *** done
This is the result of overly aggressive attempts to align cow fork
delalloc reservations to the CoW extent size hint. Looking at the trace
data, we're trying to append a single fsblock to the "fred" file.
Trying to create a speculative post-eof reservation fails because
there's not enough space.
We then set @prealloc_blocks to zero and try again, but the cowextsz
alignment code triggers, which expands our request for a 1-fsblock
reservation into a 39-block reservation. There's not enough space for
that, so the whole write fails with ENOSPC even though there's
sufficient space in the filesystem to allocate the single block that we
need to land the write.
There are two things wrong here -- first, we shouldn't be attempting
speculative preallocations beyond what was requested when we're low on
space. Second, if we've already computed a posteof preallocation, we
shouldn't bother trying to align that to the cowextsize hint.
Fix both of these problems by adding a flag that only enables the
expansion of the delalloc reservation to the cowextsize if we're doing a
non-extending write, and only if we're not doing an ENOSPC retry. This
requires us to move the ENOSPC retry logic to xfs_bmapi_reserve_delalloc.
I probably should have caught this six years ago when 6ca30729c206d was
being reviewed, but oh well. Update the comments to reflect what the
code does now.
Fixes: 6ca30729c206d ("xfs: bmap code cleanup")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Acked-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/xfs/libxfs/xfs_bmap.c | 31 +++++++++++++++++++++++++++----
fs/xfs/xfs_iomap.c | 34 ++++++++++++----------------------
2 files changed, 39 insertions(+), 26 deletions(-)
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3959,20 +3959,32 @@ xfs_bmapi_reserve_delalloc(
xfs_extlen_t alen;
xfs_extlen_t indlen;
int error;
- xfs_fileoff_t aoff = off;
+ xfs_fileoff_t aoff;
+ bool use_cowextszhint =
+ whichfork == XFS_COW_FORK && !prealloc;
+retry:
/*
* Cap the alloc length. Keep track of prealloc so we know whether to
* tag the inode before we return.
*/
+ aoff = off;
alen = XFS_FILBLKS_MIN(len + prealloc, XFS_MAX_BMBT_EXTLEN);
if (!eof)
alen = XFS_FILBLKS_MIN(alen, got->br_startoff - aoff);
if (prealloc && alen >= len)
prealloc = alen - len;
- /* Figure out the extent size, adjust alen */
- if (whichfork == XFS_COW_FORK) {
+ /*
+ * If we're targetting the COW fork but aren't creating a speculative
+ * posteof preallocation, try to expand the reservation to align with
+ * the COW extent size hint if there's sufficient free space.
+ *
+ * Unlike the data fork, the CoW cancellation functions will free all
+ * the reservations at inactivation, so we don't require that every
+ * delalloc reservation have a dirty pagecache.
+ */
+ if (use_cowextszhint) {
struct xfs_bmbt_irec prev;
xfs_extlen_t extsz = xfs_get_cowextsz_hint(ip);
@@ -3991,7 +4003,7 @@ xfs_bmapi_reserve_delalloc(
*/
error = xfs_quota_reserve_blkres(ip, alen);
if (error)
- return error;
+ goto out;
/*
* Split changing sb for alen and indlen since they could be coming
@@ -4036,6 +4048,17 @@ out_unreserve_blocks:
out_unreserve_quota:
if (XFS_IS_QUOTA_ON(mp))
xfs_quota_unreserve_blkres(ip, alen);
+out:
+ if (error == -ENOSPC || error == -EDQUOT) {
+ trace_xfs_delalloc_enospc(ip, off, len);
+
+ if (prealloc || use_cowextszhint) {
+ /* retry without any preallocation */
+ use_cowextszhint = false;
+ prealloc = 0;
+ goto retry;
+ }
+ }
return error;
}
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1115,33 +1115,23 @@ xfs_buffered_write_iomap_begin(
}
}
-retry:
- error = xfs_bmapi_reserve_delalloc(ip, allocfork, offset_fsb,
- end_fsb - offset_fsb, prealloc_blocks,
- allocfork == XFS_DATA_FORK ? &imap : &cmap,
- allocfork == XFS_DATA_FORK ? &icur : &ccur,
- allocfork == XFS_DATA_FORK ? eof : cow_eof);
- switch (error) {
- case 0:
- break;
- case -ENOSPC:
- case -EDQUOT:
- /* retry without any preallocation */
- trace_xfs_delalloc_enospc(ip, offset, count);
- if (prealloc_blocks) {
- prealloc_blocks = 0;
- goto retry;
- }
- fallthrough;
- default:
- goto out_unlock;
- }
-
if (allocfork == XFS_COW_FORK) {
+ error = xfs_bmapi_reserve_delalloc(ip, allocfork, offset_fsb,
+ end_fsb - offset_fsb, prealloc_blocks, &cmap,
+ &ccur, cow_eof);
+ if (error)
+ goto out_unlock;
+
trace_xfs_iomap_alloc(ip, offset, count, allocfork, &cmap);
goto found_cow;
}
+ error = xfs_bmapi_reserve_delalloc(ip, allocfork, offset_fsb,
+ end_fsb - offset_fsb, prealloc_blocks, &imap, &icur,
+ eof);
+ if (error)
+ goto out_unlock;
+
/*
* Flag newly allocated delalloc blocks with IOMAP_F_NEW so we punch
* them out if the write happens to fail.
next prev parent reply other threads:[~2025-05-07 18:45 UTC|newest]
Thread overview: 107+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-07 18:38 [PATCH 6.1 00/97] 6.1.138-rc1 review Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 01/97] Revert "rndis_host: Flag RNDIS modems as WWAN devices" Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 02/97] ALSA: usb-audio: Add second USB ID for Jabra Evolve 65 headset Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 03/97] drm/nouveau: Fix WARN_ON in nouveau_fence_context_kill() Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 04/97] EDAC/altera: Test the correct error reg offset Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 05/97] EDAC/altera: Set DDR and SDMMC interrupt mask before registration Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 06/97] i2c: imx-lpi2c: Fix clock count when probe defers Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 07/97] arm64: errata: Add missing sentinels to Spectre-BHB MIDR arrays Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 08/97] parisc: Fix double SIGFPE crash Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 09/97] perf/x86/intel: KVM: Mask PEBS_ENABLE loaded for guest with vCPUs value Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 10/97] amd-xgbe: Fix to ensure dependent features are toggled with RX checksum offload Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 11/97] irqchip/qcom-mpm: Prevent crash when trying to handle non-wake GPIOs Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 12/97] mmc: renesas_sdhi: Fix error handling in renesas_sdhi_probe Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 13/97] wifi: brcm80211: fmac: Add error handling for brcmf_usb_dl_writeimage() Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 14/97] dm-integrity: fix a warning on invalid table line Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 15/97] dm: always update the array size in realloc_argv on success Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 16/97] iommu/amd: Fix potential buffer overflow in parse_ivrs_acpihid Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 17/97] iommu/vt-d: Apply quirk_iommu_igfx for 8086:0044 (QM57/QS57) Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 18/97] platform/x86/intel-uncore-freq: Fix missing uncore sysfs during CPU hotplug Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 19/97] ksmbd: fix use-after-free in kerberos authentication Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 20/97] cpufreq: Avoid using inconsistent policy->min and policy->max Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 21/97] cpufreq: Fix setting policy limits when frequency tables are used Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 22/97] tracing: Fix oob write in trace_seq_to_buffer() Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 23/97] xfs: fix error returns from xfs_bmapi_write Greg Kroah-Hartman
2025-05-07 18:38 ` [PATCH 6.1 24/97] xfs: fix xfs_bmap_add_extent_delay_real for partial conversions Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 25/97] xfs: remove a racy if_bytes check in xfs_reflink_end_cow_extent Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 26/97] xfs: require XFS_SB_FEAT_INCOMPAT_LOG_XATTRS for attr log intent item recovery Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 27/97] xfs: check opcode and iovec count match in xlog_recover_attri_commit_pass2 Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 28/97] xfs: validate recovered name buffers when recovering xattr items Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 29/97] xfs: revert commit 44af6c7e59b12 Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 30/97] xfs: match lock mode in xfs_buffered_write_iomap_begin() Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 31/97] xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 32/97] xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 33/97] xfs: convert delayed extents to unwritten when zeroing post eof blocks Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 34/97] xfs: allow symlinks with short remote targets Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 35/97] xfs: make sure sb_fdblocks is non-negative Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 36/97] xfs: fix freeing speculative preallocations for preallocated files Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 37/97] xfs: allow unlinked symlinks and dirs with zero size Greg Kroah-Hartman
2025-05-07 18:39 ` Greg Kroah-Hartman [this message]
2025-05-07 18:39 ` [PATCH 6.1 39/97] KVM: x86: Load DR6 with guest value only before entering .vcpu_run() loop Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 40/97] dm-bufio: dont schedule in atomic context Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 41/97] ASoC: soc-pcm: Fix hw_params() and DAPM widget sequence Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 42/97] wifi: plfxlc: Remove erroneous assert in plfxlc_mac_release Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 43/97] vxlan: vnifilter: Fix unlocked deletion of default FDB entry Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 44/97] net/mlx5: E-Switch, Initialize MAC Address for Default GID Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 45/97] net/mlx5: E-switch, Fix error handling for enabling roce Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 46/97] net: mscc: ocelot: treat 802.1ad tagged traffic as 802.1Q-untagged Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 47/97] net: mscc: ocelot: delete PVID VLAN when readding it as non-PVID Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 48/97] net: ethernet: mtk-star-emac: fix spinlock recursion issues on rx/tx poll Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 49/97] net: ethernet: mtk-star-emac: rearm interrupts in rx_poll only when advised Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 50/97] net_sched: drr: Fix double list add in class with netem as child qdisc Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 51/97] net_sched: hfsc: Fix a UAF vulnerability " Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 52/97] net_sched: ets: Fix double list add " Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 53/97] net_sched: qfq: " Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 54/97] ice: Check VF VSI Pointer Value in ice_vc_add_fdir_fltr() Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 55/97] net: dlink: Correct endianness handling of led_mode Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 56/97] net: dsa: felix: fix broken taprio gate states after clock jump Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 57/97] net: ipv6: fix UDPv6 GSO segmentation with NAT Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 58/97] bnxt_en: Fix coredump logic to free allocated buffer Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 59/97] bnxt_en: Fix out-of-bound memcpy() during ethtool -w Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 60/97] bnxt_en: Fix ethtool -d byte order for 32-bit values Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 61/97] nvme-tcp: fix premature queue removal and I/O failover Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 62/97] net: lan743x: Fix memleak issue when GSO enabled Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 63/97] net: fec: ERR007885 Workaround for conventional TX Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 64/97] net: hns3: store rx VLAN tag offload state for VF Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 65/97] net: hns3: fix an interrupt residual problem Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 66/97] net: hns3: fixed debugfs tm_qset size Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 67/97] net: hns3: defer calling ptp_clock_register() Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 68/97] net: vertexcom: mse102x: Fix possible stuck of SPI interrupt Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 69/97] net: vertexcom: mse102x: Fix LEN_MASK Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 70/97] net: vertexcom: mse102x: Add range check for CMD_RTS Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 71/97] net: vertexcom: mse102x: Fix RX error handling Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 72/97] md: move initialization and destruction of io_acct_set to md.c Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 73/97] PCI: imx6: Skip controller_id generation logic for i.MX7D Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 74/97] sch_htb: make htb_qlen_notify() idempotent Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 75/97] sch_drr: make drr_qlen_notify() idempotent Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 76/97] sch_hfsc: make hfsc_qlen_notify() idempotent Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 77/97] sch_qfq: make qfq_qlen_notify() idempotent Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 78/97] sch_ets: make est_qlen_notify() idempotent Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 79/97] Revert "x86/kexec: Allocate PGD for x86_64 transition page tables separately" Greg Kroah-Hartman
2025-05-07 18:49 ` David Woodhouse
2025-05-07 18:39 ` [PATCH 6.1 80/97] firmware: arm_scmi: Balance device refcount when destroying devices Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 81/97] firmware: arm_ffa: Skip Rx buffer ownership release if not acquired Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 82/97] ARM: dts: opos6ul: add ksz8081 phy properties Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 83/97] net: phy: microchip: force IRQ polling mode for lan88xx Greg Kroah-Hartman
2025-05-07 18:39 ` [PATCH 6.1 84/97] Revert "drm/meson: vclk: fix calculation of 59.94 fractional rates" Greg Kroah-Hartman
2025-05-07 18:40 ` [PATCH 6.1 85/97] irqchip/gic-v2m: Mark a few functions __init Greg Kroah-Hartman
2025-05-07 18:40 ` [PATCH 6.1 86/97] irqchip/gic-v2m: Prevent use after free of gicv2m_get_fwnode() Greg Kroah-Hartman
2025-05-07 18:40 ` [PATCH 6.1 87/97] memcg: drain obj stock on cpu hotplug teardown Greg Kroah-Hartman
2025-05-07 18:40 ` [PATCH 6.1 88/97] riscv: uprobes: Add missing fence.i after building the XOL buffer Greg Kroah-Hartman
2025-05-07 18:40 ` [PATCH 6.1 89/97] iommu/arm-smmu-v3: Use the new rb tree helpers Greg Kroah-Hartman
2025-05-07 18:40 ` [PATCH 6.1 90/97] iommu/arm-smmu-v3: Fix iommu_device_probe bug due to duplicated stream ids Greg Kroah-Hartman
2025-05-07 18:40 ` [PATCH 6.1 91/97] drm/amd/display: phase2 enable mst hdcp multiple displays Greg Kroah-Hartman
2025-05-07 18:40 ` [PATCH 6.1 92/97] drm/amd/display: Clean up style problems in amdgpu_dm_hdcp.c Greg Kroah-Hartman
2025-05-07 18:40 ` [PATCH 6.1 93/97] drm/amd/display: Change HDCP update sequence for DM Greg Kroah-Hartman
2025-05-07 18:40 ` [PATCH 6.1 94/97] drm/amd/display: Add scoped mutexes for amdgpu_dm_dhcp Greg Kroah-Hartman
2025-05-07 18:40 ` [PATCH 6.1 95/97] drm/amd/display: Fix slab-use-after-free in hdcp Greg Kroah-Hartman
2025-05-07 18:40 ` [PATCH 6.1 96/97] ASoC: Use of_property_read_bool() Greg Kroah-Hartman
2025-05-07 18:40 ` [PATCH 6.1 97/97] ASoC: soc-core: Stop using of_property_read_bool() for non-boolean properties Greg Kroah-Hartman
2025-05-08 7:21 ` [PATCH 6.1 00/97] 6.1.138-rc1 review Pavel Machek
2025-05-08 9:45 ` Jon Hunter
2025-05-08 9:48 ` Jon Hunter
2025-05-08 9:52 ` Jon Hunter
2025-05-08 11:24 ` Greg Kroah-Hartman
2025-05-08 12:21 ` Jon Hunter
2025-05-08 11:28 ` Florian Fainelli
2025-05-08 15:00 ` Shuah Khan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250507183808.526382076@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=chandanbabu@kernel.org \
--cc=djwong@kernel.org \
--cc=hch@lst.de \
--cc=leah.rumancik@gmail.com \
--cc=patches@lists.linux.dev \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox