From: Oded Gabbay <ogabbay@kernel.org>
To: linux-kernel@vger.kernel.org
Subject: [PATCH 28/30] habanalabs: fix race between wait and irq
Date: Sat, 22 Jan 2022 21:57:29 +0200 [thread overview]
Message-ID: <20220122195731.934494-28-ogabbay@kernel.org> (raw)
In-Reply-To: <20220122195731.934494-1-ogabbay@kernel.org>
There is a race in the user interrupts code, where between checking
the target value and adding the new pend to the list, there is a chance
the interrupt happened.
In that case, no one will complete the node, and we will get a timeout
on it.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
drivers/misc/habanalabs/common/command_submission.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/misc/habanalabs/common/command_submission.c b/drivers/misc/habanalabs/common/command_submission.c
index 2b27ff826e3c..3d9020825df5 100644
--- a/drivers/misc/habanalabs/common/command_submission.c
+++ b/drivers/misc/habanalabs/common/command_submission.c
@@ -2901,16 +2901,21 @@ static int _hl_interrupt_wait_ioctl(struct hl_device *hdev, struct hl_ctx *ctx,
pend->cq_kernel_addr = (u64 *) cb->kernel_address + cq_counters_offset;
pend->cq_target_value = target_value;
+ spin_lock_irqsave(&interrupt->wait_list_lock, flags);
+
/* We check for completion value as interrupt could have been received
* before we added the node to the wait list
*/
if (*pend->cq_kernel_addr >= target_value) {
+ spin_unlock_irqrestore(&interrupt->wait_list_lock, flags);
+
*status = HL_WAIT_CS_STATUS_COMPLETED;
/* There was no interrupt, we assume the completion is now. */
pend->fence.timestamp = ktime_get();
goto set_timestamp;
} else if (!timeout_us) {
+ spin_unlock_irqrestore(&interrupt->wait_list_lock, flags);
*status = HL_WAIT_CS_STATUS_BUSY;
pend->fence.timestamp = ktime_get();
goto set_timestamp;
@@ -2919,7 +2924,6 @@ static int _hl_interrupt_wait_ioctl(struct hl_device *hdev, struct hl_ctx *ctx,
/* Add pending user interrupt to relevant list for the interrupt
* handler to monitor
*/
- spin_lock_irqsave(&interrupt->wait_list_lock, flags);
list_add_tail(&pend->wait_list_node, &interrupt->wait_list_head);
spin_unlock_irqrestore(&interrupt->wait_list_lock, flags);
--
2.25.1
next prev parent reply other threads:[~2022-01-22 19:59 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-01-22 19:57 [PATCH 01/30] habanalabs: check the return value of hl_cs_poll_fences() Oded Gabbay
2022-01-22 19:57 ` [PATCH 02/30] habanalabs: fix race when waiting on encaps signal Oded Gabbay
2022-01-22 19:57 ` [PATCH 03/30] habanalabs: fix possible memory leak in MMU DR fini Oded Gabbay
2022-01-22 19:57 ` [PATCH 04/30] habanalabs/gaudi: disable CGM permanently Oded Gabbay
2022-01-22 19:57 ` [PATCH 05/30] habanalabs: remove ASIC functions of clock gating Oded Gabbay
2022-01-22 19:57 ` [PATCH 06/30] habanalabs: make some MMU functions common Oded Gabbay
2022-01-22 19:57 ` [PATCH 07/30] habanalabs: sysfs functions should be in sysfs.c Oded Gabbay
2022-01-22 19:57 ` [PATCH 08/30] habanalabs: get clk is common function Oded Gabbay
2022-01-22 19:57 ` [PATCH 09/30] habanalabs: remove hwmgr.c Oded Gabbay
2022-01-22 19:57 ` [PATCH 10/30] habanalabs: move more f/w functions to firmware_if.c Oded Gabbay
2022-01-22 19:57 ` [PATCH 11/30] habanalabs: remove asic callback set_pll_profile() Oded Gabbay
2022-01-22 19:57 ` [PATCH 12/30] habanalabs: rename dev_attr_grp to dev_clk_attr_grp Oded Gabbay
2022-01-22 19:57 ` [PATCH 13/30] habanalabs: add vrm version to sysfs Oded Gabbay
2022-01-22 19:57 ` [PATCH 14/30] habanalabs: remove power9 workaround for dma support Oded Gabbay
2022-01-22 19:57 ` [PATCH 15/30] habanalabs: use common wrapper for MMU cache invalidation Oded Gabbay
2022-01-22 19:57 ` [PATCH 16/30] habanalabs: sysfs support for fw os version Oded Gabbay
2022-01-22 19:57 ` [PATCH 17/30] habanalabs: there is no kernel TDR in future ASICs Oded Gabbay
2022-01-22 19:57 ` [PATCH 18/30] habanalabs: duplicate HOP table props to MMU props Oded Gabbay
2022-01-22 19:57 ` [PATCH 19/30] habanalabs: don't free phys_pg_pack inside lock Oded Gabbay
2022-01-22 19:57 ` [PATCH 20/30] habanalabs: avoid copying pll data if pll_info_get fails Oded Gabbay
2022-01-22 19:57 ` [PATCH 21/30] habanalabs: add missing error check in sysfs clk_freq_mhz_show Oded Gabbay
2022-01-22 19:57 ` [PATCH 22/30] habanalabs: fix soft reset flow in case of failure Oded Gabbay
2022-01-22 19:57 ` [PATCH 23/30] habanalabs: add missing error check in sysfs max_power_show Oded Gabbay
2022-01-22 19:57 ` [PATCH 24/30] habanalabs: update to latest f/w specs Oded Gabbay
2022-01-22 19:57 ` [PATCH 25/30] habanalabs: expose number of user interrupts Oded Gabbay
2022-01-22 19:57 ` [PATCH 26/30] habanalabs: reject host map with mmu disabled Oded Gabbay
2022-01-22 19:57 ` [PATCH 27/30] habanalabs: fix user interrupt wait when timeout is 0 Oded Gabbay
2022-01-22 19:57 ` Oded Gabbay [this message]
2022-01-22 19:57 ` [PATCH 29/30] habanalabs: prevent false heartbeat failure during soft-reset Oded Gabbay
2022-01-22 19:57 ` [PATCH 30/30] habanalabs: remove duplicate print Oded Gabbay
[not found] ` <20220123002722.3057-1-hdanton@sina.com>
2022-01-24 18:22 ` [PATCH 02/30] habanalabs: fix race when waiting on encaps signal Dani Liberman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220122195731.934494-28-ogabbay@kernel.org \
--to=ogabbay@kernel.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.