From: sashiko-bot@kernel.org
To: "WenTao Liang" <vulab@iscas.ac.cn>
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v2] accel/habanalabs: fix kref underflow in hl_cs_signal_sob_wraparound_handler
Date: Sun, 28 Jun 2026 11:48:44 +0000 [thread overview]
Message-ID: <20260628114845.0AB481F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260628113020.43942-1-vulab@iscas.ac.cn>
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
New issues:
- [High] Unconditionally incrementing the reference count of the new SOB during a wraparound causes a permanent reference leak when `encaps_sig` is false, eventually causing a denial of service.
Pre-existing issues:
- [High] Pre-existing reference underflow on the old SOB during a wraparound in `init_signal_cs()`.
- [High] Pre-existing mismatch between the hardware signal generation and the SOB address returned to userspace, leading to an indefinite userspace hang.
--
commit c04dffdbf27c2cf3b267257e4a97f77648c44492
Author: WenTao Liang <vulab@iscas.ac.cn>
accel/habanalabs: fix kref underflow in hl_cs_signal_sob_wraparound_handler
When other_sob->need_reset is true and encaps_sig is false,
hw_sob_put(other_sob) decrements the kref to 0, but the matching
hw_sob_get(other_sob) is skipped because it is inside the encaps_sig
block. The function returns other_sob with kref=0, causing a subsequent
kref_put to underflow. Fix by adding hw_sob_get(other_sob) in the else
branch.
Suggested-by: Greg KH <gregkh@linuxfoundation.org>
Fixes: dadf17abb724 ("habanalabs: add support for encapsulated signals reservation")
Cc: stable@vger.kernel.org
Signed-off-by: WenTao Liang <vulab@iscas.ac.cn>
> diff --git a/drivers/accel/habanalabs/common/command_submission.c b/drivers/accel/habanalabs/common/command_submission.c
> index d51b4cf332b2a..ffcdf29b7e070 100644
> --- a/drivers/accel/habanalabs/common/command_submission.c
> +++ b/drivers/accel/habanalabs/common/command_submission.c
[ ... ]
> @@ -1869,11 +1869,10 @@ int hl_cs_signal_sob_wraparound_handler(struct hl_device *hdev, u32 q_idx,
> if (other_sob->need_reset)
> hw_sob_put(other_sob);
>
> - if (encaps_sig) {
> + if (encaps_sig)
> /* set reset indication for the sob */
> sob->need_reset = true;
> - hw_sob_get(other_sob);
> - }
> + hw_sob_get(other_sob);
[Severity: High]
Does this unconditional get leak the reference count when encaps_sig is false?
If encaps_sig is false, init_signal_cs() retains the old SOB in
cs_cmpl->hw_sob and ignores the new SOB returned by this wraparound handler.
Since no object takes ownership of this newly acquired reference, does it
permanently leak the other_sob reference?
Could this inflated refcount cause the kref_read(&other_sob->kref) != 1
check to fail during future wraparounds, leading to a denial of service?
[ ... ]
[Severity: High]
This is a pre-existing issue, but can init_signal_cs() cause a reference
underflow on the old SOB during a wraparound?
In drivers/accel/habanalabs/common/hw_queue.c:init_signal_cs(), the
completion object retains a pointer to the old SOB before calling the
wraparound handler:
cs_cmpl->hw_sob = hw_sob;
...
rc = hl_cs_signal_sob_wraparound_handler(hdev, q_idx, &hw_sob, 1,
false);
If a wraparound occurs, the handler calls hw_sob_put_err(sob) on the old
SOB. Later, when the job completes, cs_do_release() calls
cs_release_sob_reset_handler() which unconditionally calls
hw_sob_put(hl_cs_cmpl->hw_sob).
Since the completion object still points to the old SOB, does this decrement
the old SOB's refcount without a corresponding increment, leading to premature
release?
[Severity: High]
This is a pre-existing issue, but is there a mismatch between the
hardware signal generation and the SOB address returned to userspace?
In drivers/accel/habanalabs/common/hw_queue.c:init_signal_cs(), the hardware
command is generated using the old SOB's ID:
hdev->asic_funcs->gen_signal_cb(hdev, job->patched_cb,
cs_cmpl->hw_sob->sob_id, 0, true);
Immediately after, the wraparound handler executes and updates the local hw_sob
pointer to the new SOB:
rc = hl_cs_signal_sob_wraparound_handler(hdev, q_idx, &hw_sob, 1,
false);
The function then assigns the new SOB's address to the job:
job->cs->sob_addr_offset = hw_sob->sob_addr;
If userspace waits on this exported offset, it waits on the new SOB, but
the hardware was programmed to signal the old SOB. Can this cause an
indefinite userspace hang?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260628113020.43942-1-vulab@iscas.ac.cn?part=1
prev parent reply other threads:[~2026-06-28 11:48 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-28 11:30 [PATCH v2] accel/habanalabs: fix kref underflow in hl_cs_signal_sob_wraparound_handler WenTao Liang
2026-06-28 11:48 ` sashiko-bot [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260628114845.0AB481F000E9@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=dri-devel@lists.freedesktop.org \
--cc=sashiko-reviews@lists.linux.dev \
--cc=vulab@iscas.ac.cn \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.