* [PATCH v2] accel/habanalabs: fix kref underflow in hl_cs_signal_sob_wraparound_handler
@ 2026-06-28 11:30 WenTao Liang
2026-06-28 11:48 ` sashiko-bot
0 siblings, 1 reply; 2+ messages in thread
From: WenTao Liang @ 2026-06-28 11:30 UTC (permalink / raw)
To: dri-devel
Cc: ogabbay, koby.elbaz, konstantin.sinyuk, kees, linux-kernel,
stable, WenTao Liang, Greg KH
When other_sob->need_reset is true and encaps_sig is false,
hw_sob_put(other_sob) decrements the kref to 0, but the matching
hw_sob_get(other_sob) is skipped because it is inside the encaps_sig
block. The function returns other_sob with kref=0, causing a subsequent
kref_put to underflow. Fix by adding hw_sob_get(other_sob) in the else
branch.
Suggested-by: Greg KH <gregkh@linuxfoundation.org>
Fixes: dadf17abb724 ("habanalabs: add support for encapsulated signals reservation")
Cc: stable@vger.kernel.org
Signed-off-by: WenTao Liang <vulab@iscas.ac.cn>
---
Changes in v2:
- Correct Fixes hash based on reviewer feedback
---
---
drivers/accel/habanalabs/common/command_submission.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/accel/habanalabs/common/command_submission.c b/drivers/accel/habanalabs/common/command_submission.c
index ba4257bda77b..675301dfc0ef 100644
--- a/drivers/accel/habanalabs/common/command_submission.c
+++ b/drivers/accel/habanalabs/common/command_submission.c
@@ -1860,11 +1860,10 @@ int hl_cs_signal_sob_wraparound_handler(struct hl_device *hdev, u32 q_idx,
if (other_sob->need_reset)
hw_sob_put(other_sob);
- if (encaps_sig) {
+ if (encaps_sig)
/* set reset indication for the sob */
sob->need_reset = true;
- hw_sob_get(other_sob);
- }
+ hw_sob_get(other_sob);
dev_dbg(hdev->dev, "switched to SOB %d, q_idx: %d\n",
prop->curr_sob_offset, q_idx);
--
2.39.5 (Apple Git-154)
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH v2] accel/habanalabs: fix kref underflow in hl_cs_signal_sob_wraparound_handler
2026-06-28 11:30 [PATCH v2] accel/habanalabs: fix kref underflow in hl_cs_signal_sob_wraparound_handler WenTao Liang
@ 2026-06-28 11:48 ` sashiko-bot
0 siblings, 0 replies; 2+ messages in thread
From: sashiko-bot @ 2026-06-28 11:48 UTC (permalink / raw)
To: WenTao Liang; +Cc: dri-devel
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
New issues:
- [High] Unconditionally incrementing the reference count of the new SOB during a wraparound causes a permanent reference leak when `encaps_sig` is false, eventually causing a denial of service.
Pre-existing issues:
- [High] Pre-existing reference underflow on the old SOB during a wraparound in `init_signal_cs()`.
- [High] Pre-existing mismatch between the hardware signal generation and the SOB address returned to userspace, leading to an indefinite userspace hang.
--
commit c04dffdbf27c2cf3b267257e4a97f77648c44492
Author: WenTao Liang <vulab@iscas.ac.cn>
accel/habanalabs: fix kref underflow in hl_cs_signal_sob_wraparound_handler
When other_sob->need_reset is true and encaps_sig is false,
hw_sob_put(other_sob) decrements the kref to 0, but the matching
hw_sob_get(other_sob) is skipped because it is inside the encaps_sig
block. The function returns other_sob with kref=0, causing a subsequent
kref_put to underflow. Fix by adding hw_sob_get(other_sob) in the else
branch.
Suggested-by: Greg KH <gregkh@linuxfoundation.org>
Fixes: dadf17abb724 ("habanalabs: add support for encapsulated signals reservation")
Cc: stable@vger.kernel.org
Signed-off-by: WenTao Liang <vulab@iscas.ac.cn>
> diff --git a/drivers/accel/habanalabs/common/command_submission.c b/drivers/accel/habanalabs/common/command_submission.c
> index d51b4cf332b2a..ffcdf29b7e070 100644
> --- a/drivers/accel/habanalabs/common/command_submission.c
> +++ b/drivers/accel/habanalabs/common/command_submission.c
[ ... ]
> @@ -1869,11 +1869,10 @@ int hl_cs_signal_sob_wraparound_handler(struct hl_device *hdev, u32 q_idx,
> if (other_sob->need_reset)
> hw_sob_put(other_sob);
>
> - if (encaps_sig) {
> + if (encaps_sig)
> /* set reset indication for the sob */
> sob->need_reset = true;
> - hw_sob_get(other_sob);
> - }
> + hw_sob_get(other_sob);
[Severity: High]
Does this unconditional get leak the reference count when encaps_sig is false?
If encaps_sig is false, init_signal_cs() retains the old SOB in
cs_cmpl->hw_sob and ignores the new SOB returned by this wraparound handler.
Since no object takes ownership of this newly acquired reference, does it
permanently leak the other_sob reference?
Could this inflated refcount cause the kref_read(&other_sob->kref) != 1
check to fail during future wraparounds, leading to a denial of service?
[ ... ]
[Severity: High]
This is a pre-existing issue, but can init_signal_cs() cause a reference
underflow on the old SOB during a wraparound?
In drivers/accel/habanalabs/common/hw_queue.c:init_signal_cs(), the
completion object retains a pointer to the old SOB before calling the
wraparound handler:
cs_cmpl->hw_sob = hw_sob;
...
rc = hl_cs_signal_sob_wraparound_handler(hdev, q_idx, &hw_sob, 1,
false);
If a wraparound occurs, the handler calls hw_sob_put_err(sob) on the old
SOB. Later, when the job completes, cs_do_release() calls
cs_release_sob_reset_handler() which unconditionally calls
hw_sob_put(hl_cs_cmpl->hw_sob).
Since the completion object still points to the old SOB, does this decrement
the old SOB's refcount without a corresponding increment, leading to premature
release?
[Severity: High]
This is a pre-existing issue, but is there a mismatch between the
hardware signal generation and the SOB address returned to userspace?
In drivers/accel/habanalabs/common/hw_queue.c:init_signal_cs(), the hardware
command is generated using the old SOB's ID:
hdev->asic_funcs->gen_signal_cb(hdev, job->patched_cb,
cs_cmpl->hw_sob->sob_id, 0, true);
Immediately after, the wraparound handler executes and updates the local hw_sob
pointer to the new SOB:
rc = hl_cs_signal_sob_wraparound_handler(hdev, q_idx, &hw_sob, 1,
false);
The function then assigns the new SOB's address to the job:
job->cs->sob_addr_offset = hw_sob->sob_addr;
If userspace waits on this exported offset, it waits on the new SOB, but
the hardware was programmed to signal the old SOB. Can this cause an
indefinite userspace hang?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260628113020.43942-1-vulab@iscas.ac.cn?part=1
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2026-06-28 11:48 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-28 11:30 [PATCH v2] accel/habanalabs: fix kref underflow in hl_cs_signal_sob_wraparound_handler WenTao Liang
2026-06-28 11:48 ` sashiko-bot
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.