* [PATCH 1/3] accel/habanalabs: remove wrong doc for init_phys_pg_pack_from_userptr
@ 2023-09-28 9:19 Oded Gabbay
2023-09-28 9:19 ` [PATCH 2/3] accel/habanalabs: fix bug in decoder wait for cs completion Oded Gabbay
2023-09-28 9:19 ` [PATCH 3/3] accel/habanalabs/gaudi2: perform hard-reset upon PCIe AXI drain event Oded Gabbay
0 siblings, 2 replies; 3+ messages in thread
From: Oded Gabbay @ 2023-09-28 9:19 UTC (permalink / raw)
To: dri-devel, linux-kernel; +Cc: Dafna Hirschfeld
From: Dafna Hirschfeld <dhirschfeld@habana.ai>
init_phys_pg_pack_from_userptr() does not pin the pages, so remove that
step from its inline documentation.
Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
drivers/accel/habanalabs/common/memory.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/accel/habanalabs/common/memory.c b/drivers/accel/habanalabs/common/memory.c
index ba59e921236e..0b8689fe0b64 100644
--- a/drivers/accel/habanalabs/common/memory.c
+++ b/drivers/accel/habanalabs/common/memory.c
@@ -832,7 +832,6 @@ int hl_unreserve_va_block(struct hl_device *hdev, struct hl_ctx *ctx,
* physical pages
*
* This function does the following:
- * - Pin the physical pages related to the given virtual block.
* - Create a physical page pack from the physical pages related to the given
* virtual block.
*/
--
2.34.1
* [PATCH 2/3] accel/habanalabs: fix bug in decoder wait for cs completion
From: Oded Gabbay @ 2023-09-28 9:19 UTC (permalink / raw)
To: dri-devel, linux-kernel; +Cc: farah kassabri
From: farah kassabri <fkassabri@habana.ai>
Decoder interrupts are handled in interrupt context, the same as all
user interrupts.
The wait list must therefore be protected with spin_lock_irqsave() to
avoid a deadlock with the user submission flow.
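
As a rough illustration of why the plain spin_lock() is not enough here, the following is a toy userspace model, not kernel code. It assumes the deadlock arises when an interrupt handler that takes the same lock fires on the CPU that already holds it; every toy_* name is hypothetical and exists only for this sketch.

```c
#include <stdbool.h>

/* Toy userspace model of one CPU and one lock. NOT kernel code:
 * all toy_* names are hypothetical and only illustrate the idea. */
struct toy_cpu {
	bool irqs_enabled;	/* models the local interrupt-enable state */
};

struct toy_lock {
	bool held;		/* true while process context owns the lock */
};

/* Models spin_lock(): takes the lock, leaves interrupts untouched. */
static inline void toy_lock_plain(struct toy_lock *lock)
{
	lock->held = true;
}

static inline void toy_unlock_plain(struct toy_lock *lock)
{
	lock->held = false;
}

/* Models spin_lock_irqsave(): save the interrupt state, disable
 * interrupts, then take the lock. Returns the saved state. */
static inline unsigned long toy_lock_irqsave(struct toy_cpu *cpu,
					     struct toy_lock *lock)
{
	unsigned long flags = cpu->irqs_enabled ? 1UL : 0UL;

	cpu->irqs_enabled = false;
	lock->held = true;
	return flags;
}

/* Models spin_unlock_irqrestore(): release the lock, then restore
 * whatever interrupt state was saved (not unconditionally enable). */
static inline void toy_unlock_irqrestore(struct toy_cpu *cpu,
					 struct toy_lock *lock,
					 unsigned long flags)
{
	lock->held = false;
	cpu->irqs_enabled = (flags != 0);
}

/* An interrupt handler that needs the same lock would spin forever if
 * it can fire on this CPU while the lock is already held. */
static inline bool toy_irq_would_deadlock(const struct toy_cpu *cpu,
					  const struct toy_lock *lock)
{
	return cpu->irqs_enabled && lock->held;
}
```

In this model, holding the lock via toy_lock_plain() with interrupts enabled makes toy_irq_would_deadlock() return true, while the irqsave variant closes that window and restores the previous interrupt state on unlock.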
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
.../accel/habanalabs/common/command_submission.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/accel/habanalabs/common/command_submission.c b/drivers/accel/habanalabs/common/command_submission.c
index 4f7b70d9754c..3aa6eeef443b 100644
--- a/drivers/accel/habanalabs/common/command_submission.c
+++ b/drivers/accel/habanalabs/common/command_submission.c
@@ -3526,7 +3526,7 @@ static int _hl_interrupt_wait_ioctl_user_addr(struct hl_device *hdev, struct hl_
u64 *timestamp)
{
struct hl_user_pending_interrupt *pend;
- unsigned long timeout;
+ unsigned long timeout, flags;
u64 completion_value;
long completion_rc;
int rc = 0;
@@ -3546,9 +3546,9 @@ static int _hl_interrupt_wait_ioctl_user_addr(struct hl_device *hdev, struct hl_
/* Add pending user interrupt to relevant list for the interrupt
* handler to monitor
*/
- spin_lock(&interrupt->wait_list_lock);
+ spin_lock_irqsave(&interrupt->wait_list_lock, flags);
list_add_tail(&pend->list_node, &interrupt->wait_list_head);
- spin_unlock(&interrupt->wait_list_lock);
+ spin_unlock_irqrestore(&interrupt->wait_list_lock, flags);
/* We check for completion value as interrupt could have been received
* before we added the node to the wait list
@@ -3579,14 +3579,14 @@ static int _hl_interrupt_wait_ioctl_user_addr(struct hl_device *hdev, struct hl_
* If comparison fails, keep waiting until timeout expires
*/
if (completion_rc > 0) {
- spin_lock(&interrupt->wait_list_lock);
+ spin_lock_irqsave(&interrupt->wait_list_lock, flags);
/* reinit_completion must be called before we check for user
* completion value, otherwise, if interrupt is received after
* the comparison and before the next wait_for_completion,
* we will reach timeout and fail
*/
reinit_completion(&pend->fence.completion);
- spin_unlock(&interrupt->wait_list_lock);
+ spin_unlock_irqrestore(&interrupt->wait_list_lock, flags);
if (copy_from_user(&completion_value, u64_to_user_ptr(user_address), 8)) {
dev_err(hdev->dev, "Failed to copy completion value from user\n");
@@ -3623,9 +3623,9 @@ static int _hl_interrupt_wait_ioctl_user_addr(struct hl_device *hdev, struct hl_
}
remove_pending_user_interrupt:
- spin_lock(&interrupt->wait_list_lock);
+ spin_lock_irqsave(&interrupt->wait_list_lock, flags);
list_del(&pend->list_node);
- spin_unlock(&interrupt->wait_list_lock);
+ spin_unlock_irqrestore(&interrupt->wait_list_lock, flags);
*timestamp = ktime_to_ns(pend->fence.timestamp);
--
2.34.1
* [PATCH 3/3] accel/habanalabs/gaudi2: perform hard-reset upon PCIe AXI drain event
From: Oded Gabbay @ 2023-09-28 9:19 UTC (permalink / raw)
To: dri-devel, linux-kernel; +Cc: Tomer Tayar
From: Tomer Tayar <ttayar@habana.ai>
Non-completed transactions from PCIe towards the device are handled by
the AXI drain mechanism. This handling takes place at the PCIe level,
but the transactions remain in the device and consume queue entries, so
the device must be reset.
Perform a hard reset upon PCIe AXI drain events.
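
To make the effect of a per-event reset type concrete, here is a minimal standalone sketch of a table-driven lookup. It is not the driver's code: the toy_* types and helpers are hypothetical, and only the event IDs and names are taken from the hunk in this patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver's reset types. */
enum toy_reset_type {
	TOY_RESET_NONE,
	TOY_RESET_HARD,
};

/* Hypothetical, reduced version of an event-map entry. */
struct toy_event {
	int fc_id;
	enum toy_reset_type reset;
	const char *name;
};

/* The three entries visible in the hunk, with the post-patch value
 * for PCIE_DRAIN_COMPLETE. */
static const struct toy_event toy_irq_map[] = {
	{ .fc_id = 631, .reset = TOY_RESET_NONE, .name = "PCIE_P2P_MSIX" },
	{ .fc_id = 632, .reset = TOY_RESET_HARD, .name = "PCIE_DRAIN_COMPLETE" },
	{ .fc_id = 633, .reset = TOY_RESET_NONE, .name = "TPC0_BMON_SPMU" },
};

/* Linear search by event ID; returns NULL for an unknown ID. */
static inline const struct toy_event *toy_find_event(int fc_id)
{
	size_t i;

	for (i = 0; i < sizeof(toy_irq_map) / sizeof(toy_irq_map[0]); i++)
		if (toy_irq_map[i].fc_id == fc_id)
			return &toy_irq_map[i];
	return NULL;
}

/* Returns 1 if the event is known and marked for a hard reset. */
static inline int toy_needs_hard_reset(int fc_id)
{
	const struct toy_event *ev = toy_find_event(fc_id);

	return ev != NULL && ev->reset == TOY_RESET_HARD;
}
```

With a table like this, changing the policy for one event is a one-field edit, which matches the shape of the single-line change in this patch.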
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
.../habanalabs/include/gaudi2/gaudi2_async_ids_map_extended.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/accel/habanalabs/include/gaudi2/gaudi2_async_ids_map_extended.h b/drivers/accel/habanalabs/include/gaudi2/gaudi2_async_ids_map_extended.h
index 57e661771b6c..b2dbe1f64430 100644
--- a/drivers/accel/habanalabs/include/gaudi2/gaudi2_async_ids_map_extended.h
+++ b/drivers/accel/habanalabs/include/gaudi2/gaudi2_async_ids_map_extended.h
@@ -1293,7 +1293,7 @@ static struct gaudi2_async_events_ids_map gaudi2_irq_map_table[] = {
.name = "" },
{ .fc_id = 631, .cpu_id = 128, .valid = 1, .msg = 0, .reset = EVENT_RESET_TYPE_NONE,
.name = "PCIE_P2P_MSIX" },
- { .fc_id = 632, .cpu_id = 129, .valid = 1, .msg = 0, .reset = EVENT_RESET_TYPE_NONE,
+ { .fc_id = 632, .cpu_id = 129, .valid = 1, .msg = 0, .reset = EVENT_RESET_TYPE_HARD,
.name = "PCIE_DRAIN_COMPLETE" },
{ .fc_id = 633, .cpu_id = 130, .valid = 1, .msg = 0, .reset = EVENT_RESET_TYPE_NONE,
.name = "TPC0_BMON_SPMU" },
--
2.34.1