From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Jesper Dangaard Brouer <hawk@kernel.org>,
Ilias Apalodimas <ilias.apalodimas@linaro.org>,
Paolo Abeni <pabeni@redhat.com>, Sasha Levin <sashal@kernel.org>,
clrkwllms@kernel.org, rostedt@goodmis.org,
netdev@vger.kernel.org, linux-rt-devel@lists.linux.dev
Subject: [PATCH AUTOSEL 6.14 038/108] net: page_pool: Don't recycle into cache on PREEMPT_RT
Date: Tue, 3 Jun 2025 20:54:21 -0400
Message-ID: <20250604005531.4178547-38-sashal@kernel.org>
In-Reply-To: <20250604005531.4178547-1-sashal@kernel.org>
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
[ Upstream commit 32471b2f481dea8624f27669d36ffd131d24b732 ]
With preemptible softirq and no per-CPU locking in local_bh_disable() on
PREEMPT_RT the consumer can be preempted while a skb is returned.
Avoid the race by disabling the recycle into the cache on PREEMPT_RT.
Cc: Jesper Dangaard Brouer <hawk@kernel.org>
Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20250512092736.229935-2-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
**YES** This commit should be backported to stable kernel trees. Here's
my detailed analysis:

## Bug Fix Classification

This is a **race condition fix** for PREEMPT_RT kernels that prevents
potential memory corruption and system instability. The commit addresses
a fundamental concurrency issue in the page_pool subsystem that only
manifests on real-time kernels.

## Technical Analysis of the Race Condition

The issue occurs in the `page_pool_napi_local()` function, which
determines whether pages can be recycled directly into the page pool's
fast cache (`pool->alloc.cache[]`). This lockless cache provides optimal
performance by avoiding spinlock overhead.

**On regular kernels:**
- Softirqs are non-preemptible
- Direct cache access is safe because producer and consumer cannot run
  concurrently

**On PREEMPT_RT kernels:**
- Softirqs can be preempted by higher priority tasks
- A softirq returning pages can be interrupted while another context
  allocates from the same cache
- This creates a classic race condition on the shared cache data
  structure

## Code Change Analysis

The fix is minimal and surgical:

```c
/* On PREEMPT_RT the softirq can be preempted by the consumer */
if (IS_ENABLED(CONFIG_PREEMPT_RT))
	return false;
```

This forces all page recycling on PREEMPT_RT to use the ring buffer path
(`ptr_ring`) which has proper locking, instead of the lockless direct
cache.

## Impact Assessment

**Functional Impact:** Zero - pages are still recycled correctly, just
through a different path

**Performance Impact:** Minimal - ring buffer operations are slightly
slower than direct cache access, but only affects PREEMPT_RT systems

**Risk:** Very low - the change is isolated, well-understood, and
follows established kernel patterns

## Backporting Suitability

1. **Clear bug fix:** Addresses a real race condition that could cause
   memory corruption
2. **Minimal and contained:** Single 4-line change with no dependencies
3. **High importance for affected systems:** Critical for PREEMPT_RT
   system stability
4. **Low regression risk:** Fallback to existing, proven ring buffer
   mechanism
5. **Follows established patterns:** Similar PREEMPT_RT guards exist
   throughout the networking stack

## Historical Context

Based on my analysis of the kernel repository, the page_pool
infrastructure has evolved significantly, with the direct caching
mechanism being added for performance optimization. The
`page_pool_napi_local()` function was introduced in commit 4a96a4e807c3
(Linux 6.9+) as part of the lockless caching optimization. This fix
addresses an oversight in that optimization where PREEMPT_RT preemption
semantics weren't considered.

## Conclusion

This commit represents exactly the type of fix that stable kernels
should include: a focused, low-risk correction of a race condition that
could cause system instability on specific configurations. While it only
affects PREEMPT_RT systems, the potential consequences (memory
corruption, crashes) are severe enough to warrant backporting to any
stable tree that supports PREEMPT_RT and contains the page_pool caching
infrastructure.
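
## Illustrative Replay of the Race

To make the race described in the analysis concrete, the following is a
minimal userspace sketch, not kernel code. It replays one possible
interleaving on a toy lockless array cache: the producer's push is split
into two halves so that a "preemption" can be placed between them. All
names (`toy_cache`, `push_begin()`, `push_finish()`, `pop()`,
`replay_race()`) and the cache size are hypothetical, and the
interleaving is an assumption chosen for illustration. It is not taken
from the commit or from `net/core/page_pool.c`.

```c
#include <stddef.h>

/* Toy stand-in for a lockless array cache; size is arbitrary. */
struct toy_cache {
	void *slot[4];
	size_t count;
};

/* First half of a push: the producer samples the current fill level. */
static size_t push_begin(const struct toy_cache *c)
{
	return c->count;
}

/* Second half of a push: store the page and publish the new count. */
static void push_finish(struct toy_cache *c, size_t idx, void *page)
{
	c->slot[idx] = page;
	c->count = idx + 1;
}

/* Consumer side: take the most recently cached page, if any. */
static void *pop(struct toy_cache *c)
{
	if (c->count == 0)
		return NULL;
	return c->slot[--c->count];
}

/*
 * Replays the interleaving sequentially and returns how many times
 * page A was handed out. Without preemption between the two push
 * halves the answer would be 1.
 */
static int replay_race(void)
{
	int a, b;	/* only their addresses are used, as fake pages */
	struct toy_cache c = { .slot = { &a }, .count = 1 };
	int handed_a = 0;
	size_t idx;

	idx = push_begin(&c);		/* producer sees count == 1 */
	if (pop(&c) == &a)		/* consumer runs, takes A, count -> 0 */
		handed_a++;
	push_finish(&c, idx, &b);	/* producer resumes with stale idx, count -> 2 */
	pop(&c);			/* hands out B */
	if (pop(&c) == &a)		/* stale slot[0] still points at A */
		handed_a++;

	return handed_a;
}
```

In this replay, page A is handed out twice because the producer
publishes a count based on a stale index. Routing the producer through a
locked path, as the patch does for PREEMPT_RT, rules out this kind of
interleaving on the cache.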
net/core/page_pool.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index cca51aa2e876f..68e7962daa08f 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -801,6 +801,10 @@ static bool page_pool_napi_local(const struct page_pool *pool)
 	const struct napi_struct *napi;
 	u32 cpuid;
 
+	/* On PREEMPT_RT the softirq can be preempted by the consumer */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		return false;
+
 	if (unlikely(!in_softirq()))
 		return false;
 
--
2.39.5
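
The effect of the guard in the hunk above on the choice of recycle path
can be sketched in plain userspace C. This is a simplified model under
stated assumptions, not the kernel implementation. The names
`toy_pool`, `toy_napi_local()` and `toy_recycle()` and the sizes are
hypothetical, the `preempt_rt` and `in_softirq` booleans stand in for
`IS_ENABLED(CONFIG_PREEMPT_RT)` and `in_softirq()`, and any further
checks the real `page_pool_napi_local()` performs after the hunk are
left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_CACHE_SIZE 4	/* hypothetical, not the kernel's value */
#define TOY_RING_SIZE  8	/* hypothetical, not the kernel's value */

enum recycle_path { PATH_CACHE, PATH_RING, PATH_DROPPED };

struct toy_pool {
	void *cache[TOY_CACHE_SIZE];	/* models the lockless direct cache */
	size_t cache_count;
	void *ring[TOY_RING_SIZE];	/* models the locked ptr_ring */
	size_t ring_count;
};

/* Same order of checks as the patched hunk: the RT guard comes first. */
static bool toy_napi_local(bool preempt_rt, bool in_softirq)
{
	if (preempt_rt)
		return false;

	if (!in_softirq)
		return false;

	return true;
}

/* Recycle into the cache only when deemed local, else into the ring. */
static enum recycle_path toy_recycle(struct toy_pool *pool, void *page,
				     bool preempt_rt, bool in_softirq)
{
	if (toy_napi_local(preempt_rt, in_softirq) &&
	    pool->cache_count < TOY_CACHE_SIZE) {
		pool->cache[pool->cache_count++] = page;
		return PATH_CACHE;
	}

	if (pool->ring_count < TOY_RING_SIZE) {
		/* A real ring would take its producer lock around this. */
		pool->ring[pool->ring_count++] = page;
		return PATH_RING;
	}

	return PATH_DROPPED;
}
```

With `preempt_rt` set, `toy_recycle()` never touches `cache[]`, whatever
the softirq state. That matches the intent stated in the commit message:
disable recycling into the cache on PREEMPT_RT.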
Thread overview: 108+ messages
2025-06-04 0:53 [PATCH AUTOSEL 6.14 001/108] net: macb: Check return value of dma_set_mask_and_coherent() Sasha Levin
2025-06-04 0:53 ` [PATCH AUTOSEL 6.14 002/108] net: lan743x: Modify the EEPROM and OTP size for PCI1xxxx devices Sasha Levin
2025-06-04 0:53 ` [PATCH AUTOSEL 6.14 003/108] tipc: use kfree_sensitive() for aead cleanup Sasha Levin
2025-06-04 0:53 ` [PATCH AUTOSEL 6.14 004/108] f2fs: use vmalloc instead of kvmalloc in .init_{,de}compress_ctx Sasha Levin
2025-06-04 0:53 ` [PATCH AUTOSEL 6.14 005/108] bpf: Check rcu_read_lock_trace_held() in bpf_map_lookup_percpu_elem() Sasha Levin
2025-06-04 0:53 ` [PATCH AUTOSEL 6.14 006/108] Bluetooth: btusb: Add new VID/PID 13d3/3584 for MT7922 Sasha Levin
2025-06-04 0:53 ` [PATCH AUTOSEL 6.14 007/108] i2c: designware: Invoke runtime suspend on quick slave re-registration Sasha Levin
2025-06-04 0:53 ` [PATCH AUTOSEL 6.14 008/108] wifi: mt76: mt7996: drop fragments with multicast or broadcast RA Sasha Levin
2025-06-04 0:53 ` [PATCH AUTOSEL 6.14 009/108] emulex/benet: correct command version selection in be_cmd_get_stats() Sasha Levin
2025-06-04 0:53 ` [PATCH AUTOSEL 6.14 010/108] Bluetooth: btusb: Add new VID/PID 13d3/3630 for MT7925 Sasha Levin
2025-06-04 0:53 ` [PATCH AUTOSEL 6.14 011/108] Bluetooth: btusb: Add RTL8851BE device 0x0bda:0xb850 Sasha Levin
2025-06-04 0:53 ` [PATCH AUTOSEL 6.14 012/108] Bluetooth: ISO: Fix not using SID from adv report Sasha Levin
2025-06-04 0:53 ` [PATCH AUTOSEL 6.14 013/108] Bluetooth: btmrvl_sdio: Fix wakeup source leaks on device unbind Sasha Levin
2025-06-04 0:53 ` [PATCH AUTOSEL 6.14 014/108] Bluetooth: btmtksdio: " Sasha Levin
2025-06-04 0:53 ` [PATCH AUTOSEL 6.14 015/108] wifi: mt76: mt76x2: Add support for LiteOn WN4516R,WN4519R Sasha Levin
2025-06-04 0:53 ` [PATCH AUTOSEL 6.14 016/108] wifi: mt76: mt7921: add 160 MHz AP for mt7922 device Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 017/108] wifi: mt76: mt7925: introduce thermal protection Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 018/108] wifi: mac80211: validate SCAN_FLAG_AP in scan request during MLO Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 019/108] sctp: Do not wake readers in __sctp_write_space() Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 020/108] libbpf/btf: Fix string handling to support multi-split BTF Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 021/108] cpufreq: scmi: Skip SCMI devices that aren't used by the CPUs Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 022/108] i2c: tegra: check msg length in SMBUS block read Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 023/108] i2c: npcm: Add clock toggle recovery Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 024/108] clk: qcom: gcc-x1e80100: Set FORCE MEM CORE for UFS clocks Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 025/108] clk: qcom: gcc: Set FORCE_MEM_CORE_ON for gcc_ufs_axi_clk for 8650/8750 Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 026/108] net: dlink: add synchronization for stats update Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 027/108] wifi: ath12k: fix macro definition HAL_RX_MSDU_PKT_LENGTH_GET Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 028/108] wifi: ath12k: fix a possible dead lock caused by ab->base_lock Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 029/108] wifi: ath11k: Fix QMI memory reuse logic Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 030/108] iommu/amd: Allow matching ACPI HID devices without matching UIDs Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 031/108] wifi: rtw89: leave idle mode when setting WEP encryption for AP mode Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 032/108] tcp: always seek for minimal rtt in tcp_rcv_rtt_update() Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 033/108] tcp: remove zero TCP TS samples for autotuning Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 034/108] tcp: fix initial tp->rcvq_space.space value for passive TS enabled flows Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 035/108] tcp: add receive queue awareness in tcp_rcv_space_adjust() Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 036/108] x86/sgx: Prevent attempts to reclaim poisoned pages Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 037/108] ipv4/route: Use this_cpu_inc() for stats on PREEMPT_RT Sasha Levin
2025-06-04 0:54 ` Sasha Levin [this message]
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 039/108] xfrm: validate assignment of maximal possible SEQ number Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 040/108] openvswitch: Stricter validation for the userspace action Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 041/108] net: atlantic: generate software timestamp just before the doorbell Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 042/108] pinctrl: armada-37xx: propagate error from armada_37xx_pmx_set_by_name() Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 043/108] pinctrl: armada-37xx: propagate error from armada_37xx_gpio_get_direction() Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 044/108] bpf: Pass the same orig_call value to trampoline functions Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 045/108] net: stmmac: generate software timestamp just before the doorbell Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 046/108] pinctrl: armada-37xx: propagate error from armada_37xx_pmx_gpio_set_direction() Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 047/108] libbpf: Check bpf_map_skeleton link for NULL Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 048/108] pinctrl: armada-37xx: propagate error from armada_37xx_gpio_get() Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 049/108] net/mlx5: HWS, fix counting of rules in the matcher Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 050/108] net: mlx4: add SOF_TIMESTAMPING_TX_SOFTWARE flag when getting ts info Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 051/108] net: vertexcom: mse102x: Return code for mse102x_rx_pkt_spi Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 052/108] wifi: rtw88: rtw8822bu VID/PID for BUFFALO WI-U2-866DM Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 053/108] wireless: purelifi: plfxlc: fix memory leak in plfxlc_usb_wreq_asyn() Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 054/108] wifi: mac80211: do not offer a mesh path if forwarding is disabled Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 055/108] bpftool: Fix cgroup command to only show cgroup bpf programs Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 056/108] clk: rockchip: rk3036: mark ddrphy as critical Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 057/108] hid-asus: check ROG Ally MCU version and warn Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 058/108] rtla: Define __NR_sched_setattr for LoongArch Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 059/108] wifi: iwlwifi: mvm: fix beacon CCK flag Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 060/108] wifi: iwlwifi: dvm: pair transport op-mode enter/leave Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 061/108] bpf: Add bpf_rbtree_{root,left,right} kfunc Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 062/108] f2fs: fix to bail out in get_new_segment() Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 063/108] netfilter: nft_set_pipapo: clamp maximum map bucket size to INT_MAX Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 064/108] libbpf: Add identical pointer detection to btf_dedup_is_equiv() Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 065/108] scsi: lpfc: Fix lpfc_check_sli_ndlp() handling for GEN_REQUEST64 commands Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 066/108] scsi: smartpqi: Add new PCI IDs Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 067/108] iommu/amd: Ensure GA log notifier callbacks finish running before module unload Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 068/108] wifi: iwlwifi: pcie: make sure to lock rxq->read Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 069/108] wifi: rtw89: 8922a: fix TX fail with wrong VCO setting Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 070/108] wifi: mac80211_hwsim: Prevent tsf from setting if beacon is disabled Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 071/108] netdevsim: Mark NAPI ID on skb in nsim_rcv Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 072/108] net/mlx5: HWS, Fix IP version decision Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 073/108] bpf: Use proper type to calculate bpf_raw_tp_null_args.mask index Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 074/108] wifi: mac80211: VLAN traffic in multicast path Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 075/108] Revert "mac80211: Dynamically set CoDel parameters per station" Sasha Levin
2025-06-04 0:54 ` [PATCH AUTOSEL 6.14 076/108] wifi: iwlwifi: Add missing MODULE_FIRMWARE for Qu-c0-jf-b0 Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 077/108] net: bridge: mcast: update multicast contex when vlan state is changed Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 078/108] net: bridge: mcast: re-implement br_multicast_{enable, disable}_port functions Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 079/108] vxlan: Do not treat dst cache initialization errors as fatal Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 080/108] bnxt_en: Remove unused field "ref_count" in struct bnxt_ulp Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 081/108] vxlan: Add RCU read-side critical sections in the Tx path Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 082/108] wifi: ath12k: correctly handle mcast packets for clients Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 083/108] wifi: ath12k: using msdu end descriptor to check for rx multicast packets Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 084/108] net: ethernet: ti: am65-cpsw: handle -EPROBE_DEFER Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 085/108] software node: Correct a OOB check in software_node_get_reference_args() Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 086/108] wifi: ath12k: make assoc link associate first Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 087/108] isofs: fix Y2038 and Y2156 issues in Rock Ridge TF entry Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 088/108] pinctrl: mcp23s08: Reset all pins to input at probe Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 089/108] wifi: ath12k: fix failed to set mhi state error during reboot with hardware grouping Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 090/108] scsi: lpfc: Use memcpy() for BIOS version Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 091/108] sock: Correct error checking condition for (assign|release)_proto_idx() Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 092/108] i40e: fix MMIO write access to an invalid page in i40e_clear_hw Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 093/108] ixgbe: Fix unreachable retry logic in combined and byte I2C write functions Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 094/108] RDMA/hns: initialize db in update_srq_db() Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 095/108] ice: fix check for existing switch rule Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 096/108] usbnet: asix AX88772: leave the carrier control to phylink Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 097/108] f2fs: fix to set atomic write status more clear Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 098/108] bpf, sockmap: Fix data lost during EAGAIN retries Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 099/108] net: ethernet: cortina: Use TOE/TSO on all TCP Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 100/108] octeontx2-pf: Add error log forcn10k_map_unmap_rq_policer() Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 101/108] wifi: ath12k: Fix incorrect rates sent to firmware Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 102/108] wifi: ath12k: Fix the enabling of REO queue lookup table feature Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 103/108] wifi: ath12k: Fix memory leak due to multiple rx_stats allocation Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 104/108] wifi: ath11k: determine PM policy based on machine model Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 105/108] wifi: ath12k: fix link valid field initialization in the monitor Rx Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 106/108] wifi: ath12k: fix incorrect CE addresses Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 107/108] wifi: ath12k: Pass correct values of center freq1 and center freq2 for 160 MHz Sasha Levin
2025-06-04 0:55 ` [PATCH AUTOSEL 6.14 108/108] net/mlx5: HWS, Harden IP version definer checks Sasha Levin