From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev,
Sebastian Brzezinka <sebastian.brzezinka@intel.com>,
Krzysztof Karas <krzysztof.karas@intel.com>,
Andi Shyti <andi.shyti@linux.intel.com>,
Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Subject: [PATCH 6.1 48/55] drm/i915/gt: fix refcount underflow in intel_engine_park_heartbeat
Date: Mon, 13 Apr 2026 18:01:22 +0200 [thread overview]
Message-ID: <20260413155726.620943222@linuxfoundation.org> (raw)
In-Reply-To: <20260413155724.820472494@linuxfoundation.org>
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
commit 4c71fd099513bfa8acab529b626e1f0097b76061 upstream.
A use-after-free / refcount underflow is possible when the heartbeat
worker and intel_engine_park_heartbeat() race to release the same
engine->heartbeat.systole request.
The heartbeat worker reads engine->heartbeat.systole and calls
i915_request_put() on it when the request is complete, but clears
the pointer in a separate, non-atomic step. Concurrently, a request
retirement on another CPU can drop the engine wakeref to zero, triggering
__engine_park() -> intel_engine_park_heartbeat(). If the heartbeat
timer is pending at that point, cancel_delayed_work() returns true and
intel_engine_park_heartbeat() reads the stale non-NULL systole pointer
and calls i915_request_put() on it again, causing a refcount underflow:
```
<4> [487.221889] Workqueue: i915-unordered engine_retire [i915]
<4> [487.222640] RIP: 0010:refcount_warn_saturate+0x68/0xb0
...
<4> [487.222707] Call Trace:
<4> [487.222711] <TASK>
<4> [487.222716] intel_engine_park_heartbeat.part.0+0x6f/0x80 [i915]
<4> [487.223115] intel_engine_park_heartbeat+0x25/0x40 [i915]
<4> [487.223566] __engine_park+0xb9/0x650 [i915]
<4> [487.223973] ____intel_wakeref_put_last+0x2e/0xb0 [i915]
<4> [487.224408] __intel_wakeref_put_last+0x72/0x90 [i915]
<4> [487.224797] intel_context_exit_engine+0x7c/0x80 [i915]
<4> [487.225238] intel_context_exit+0xf1/0x1b0 [i915]
<4> [487.225695] i915_request_retire.part.0+0x1b9/0x530 [i915]
<4> [487.226178] i915_request_retire+0x1c/0x40 [i915]
<4> [487.226625] engine_retire+0x122/0x180 [i915]
<4> [487.227037] process_one_work+0x239/0x760
<4> [487.227060] worker_thread+0x200/0x3f0
<4> [487.227068] ? __pfx_worker_thread+0x10/0x10
<4> [487.227075] kthread+0x10d/0x150
<4> [487.227083] ? __pfx_kthread+0x10/0x10
<4> [487.227092] ret_from_fork+0x3d4/0x480
<4> [487.227099] ? __pfx_kthread+0x10/0x10
<4> [487.227107] ret_from_fork_asm+0x1a/0x30
<4> [487.227141] </TASK>
```
Fix this by replacing the non-atomic pointer read + separate clear with
xchg() in both racing paths. xchg() is a single indivisible hardware
instruction that atomically reads the old pointer and writes NULL. This
guarantees only one of the two concurrent callers obtains the non-NULL
pointer and performs the put, the other gets NULL and skips it.
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/work_items/15880
Fixes: 058179e72e09 ("drm/i915/gt: Replace hangcheck by heartbeats")
Cc: <stable@vger.kernel.org> # v5.5+
Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://lore.kernel.org/r/d4c1c14255688dd07cc8044973c4f032a8d1559e.1775038106.git.sebastian.brzezinka@intel.com
(cherry picked from commit 13238dc0ee4f9ab8dafa2cca7295736191ae2f42)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c | 26 +++++++++++++++--------
1 file changed, 18 insertions(+), 8 deletions(-)
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -116,10 +116,12 @@ static void heartbeat(struct work_struct
/* Just in case everything has gone horribly wrong, give it a kick */
intel_engine_flush_submission(engine);
- rq = engine->heartbeat.systole;
- if (rq && i915_request_completed(rq)) {
- i915_request_put(rq);
- engine->heartbeat.systole = NULL;
+ rq = xchg(&engine->heartbeat.systole, NULL);
+ if (rq) {
+ if (i915_request_completed(rq))
+ i915_request_put(rq);
+ else
+ engine->heartbeat.systole = rq;
}
if (!intel_engine_pm_get_if_awake(engine))
@@ -200,8 +202,11 @@ static void heartbeat(struct work_struct
unlock:
mutex_unlock(&ce->timeline->mutex);
out:
- if (!engine->i915->params.enable_hangcheck || !next_heartbeat(engine))
- i915_request_put(fetch_and_zero(&engine->heartbeat.systole));
+ if (!engine->i915->params.enable_hangcheck || !next_heartbeat(engine)) {
+ rq = xchg(&engine->heartbeat.systole, NULL);
+ if (rq)
+ i915_request_put(rq);
+ }
intel_engine_pm_put(engine);
}
@@ -215,8 +220,13 @@ void intel_engine_unpark_heartbeat(struc
void intel_engine_park_heartbeat(struct intel_engine_cs *engine)
{
- if (cancel_delayed_work(&engine->heartbeat.work))
- i915_request_put(fetch_and_zero(&engine->heartbeat.systole));
+ if (cancel_delayed_work(&engine->heartbeat.work)) {
+ struct i915_request *rq;
+
+ rq = xchg(&engine->heartbeat.systole, NULL);
+ if (rq)
+ i915_request_put(rq);
+ }
}
void intel_gt_unpark_heartbeats(struct intel_gt *gt)
next prev parent reply other threads:[~2026-04-13 16:17 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-13 16:00 [PATCH 6.1 00/55] 6.1.169-rc1 review Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 01/55] lib/crypto: chacha: Zeroize permuted_state before it leaves scope Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 02/55] wifi: rt2x00usb: fix devres lifetime Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 03/55] xfrm_user: fix info leak in build_report() Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 04/55] mptcp: fix slab-use-after-free in __inet_lookup_established Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 05/55] Input: uinput - fix circular locking dependency with ff-core Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 06/55] Input: uinput - take event lock when submitting FF request "event" Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 07/55] MIPS: Always record SEGBITS in cpu_data.vmbits Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 08/55] MIPS: mm: Suppress TLB uniquification on EHINV hardware Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 09/55] MIPS: mm: Rewrite TLB uniquification for the hidden bit feature Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 10/55] media: uvcvideo: Mark invalid entities with id UVC_INVALID_ENTITY_ID Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 11/55] media: uvcvideo: Use heuristic to find stream entity Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 12/55] apparmor: validate DFA start states are in bounds in unpack_pdb Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 13/55] apparmor: fix memory leak in verify_header Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 14/55] apparmor: replace recursive profile removal with iterative approach Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 15/55] apparmor: fix: limit the number of levels of policy namespaces Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 16/55] apparmor: fix side-effect bug in match_char() macro usage Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 17/55] apparmor: fix missing bounds check on DEFAULT table in verify_dfa() Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 18/55] apparmor: Fix double free of ns_name in aa_replace_profiles() Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 19/55] apparmor: fix unprivileged local user can do privileged policy management Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 20/55] apparmor: fix differential encoding verification Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 21/55] apparmor: fix race on rawdata dereference Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 22/55] apparmor: fix race between freeing data and fs accessing it Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 23/55] usb: gadget: u_ether: Fix race between gether_disconnect and eth_stop Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 24/55] Revert "ACPI: EC: Evaluate orphan _REG under EC device" Greg Kroah-Hartman
2026-04-13 16:00 ` [PATCH 6.1 25/55] ACPICA: Add a depth argument to acpi_execute_reg_methods() Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 26/55] ACPI: EC: Evaluate _REG outside the EC scope more carefully Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 27/55] usb: gadget: f_hid: move list and spinlock inits from bind to alloc Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 28/55] rfkill: Use sysfs_emit() to instead of sprintf() Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 29/55] rfkill: sync before userspace visibility/changes Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 30/55] net: rfkill: reduce data->mtx scope in rfkill_fop_open Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 31/55] net: rfkill: prevent unlimited numbers of rfkill events from being created Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 32/55] seg6: separate dst_cache for input and output paths in seg6 lwtunnel Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 33/55] Revert "mptcp: add needs_id for netlink appending addr" Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 34/55] drm/scheduler: signal scheduled fence when kill job Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 35/55] netfilter: nft_set_pipapo: do not rely on ZERO_SIZE_PTR Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 36/55] netfilter: nft_ct: fix use-after-free in timeout object destroy Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 37/55] xfrm: clear trailing padding in build_polexpire() Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 38/55] tipc: fix bc_ackers underflow on duplicate GRP_ACK_MSG Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 39/55] wifi: brcmsmac: Fix dma_free_coherent() size Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 40/55] arm64: dts: hisilicon: poplar: Correct PCIe reset GPIO polarity Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 41/55] arm64: dts: hisilicon: hi3798cv200: Add missing dma-ranges Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 42/55] nfc: pn533: allocate rx skb before consuming bytes Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 43/55] batman-adv: reject oversized global TT response buffers Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 44/55] EDAC/mc: Fix error path ordering in edac_mc_alloc() Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 45/55] net/tls: fix use-after-free in -EBUSY error path of tls_do_encryption Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 46/55] net: altera-tse: fix skb leak on DMA mapping error in tse_start_xmit() Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 47/55] batman-adv: hold claim backbone gateways by reference Greg Kroah-Hartman
2026-04-13 16:01 ` Greg Kroah-Hartman [this message]
2026-04-13 16:01 ` [PATCH 6.1 49/55] net/mlx5: Update the list of the PCI supported devices Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 50/55] mmc: vub300: fix NULL-deref on disconnect Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 51/55] net: qualcomm: qca_uart: report the consumed byte on RX skb allocation failure Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 52/55] net: stmmac: fix integer underflow in chain mode Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 53/55] rxrpc: fix reference count leak in rxrpc_server_keyring() Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 54/55] rxrpc: Fix key/keyring checks in setsockopt(RXRPC_SECURITY_KEY/KEYRING) Greg Kroah-Hartman
2026-04-13 16:01 ` [PATCH 6.1 55/55] Revert "PCI: Enable ACS after configuring IOMMU for OF platforms" Greg Kroah-Hartman
2026-04-13 17:38 ` [PATCH 6.1 00/55] 6.1.169-rc1 review Brett A C Sheffield
2026-04-13 18:51 ` Florian Fainelli
2026-04-14 7:53 ` Jon Hunter
2026-04-14 8:13 ` Pavel Machek
2026-04-14 8:20 ` Pavel Machek
2026-04-14 9:06 ` Peter Schneider
2026-04-14 11:49 ` Ron Economos
2026-04-14 12:30 ` Francesco Dolcini
2026-04-14 17:44 ` Shuah Khan
2026-04-14 17:44 ` Miguel Ojeda
2026-04-14 18:18 ` Mark Brown
2026-04-16 22:09 ` Barry K. Nathan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260413155726.620943222@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=andi.shyti@linux.intel.com \
--cc=joonas.lahtinen@linux.intel.com \
--cc=krzysztof.karas@intel.com \
--cc=patches@lists.linux.dev \
--cc=sebastian.brzezinka@intel.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox