All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] habanalabs: support DEVICE_UNUSABLE error indication from FW
@ 2021-04-04  7:23 Oded Gabbay
  0 siblings, 0 replies; only message in thread
From: Oded Gabbay @ 2021-04-04  7:23 UTC (permalink / raw)
  To: linux-kernel; +Cc: Koby Elbaz

From: Koby Elbaz <kelbaz@habana.ai>

In case of multiple ECC errors, FW will set the DEVICE_UNUSABLE bit.
On boot-up, the driver will therefore fail inserting the device.

Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/firmware_if.c        | 3 +++
 drivers/misc/habanalabs/include/common/hl_boot_if.h | 4 ++++
 2 files changed, 7 insertions(+)

diff --git a/drivers/misc/habanalabs/common/firmware_if.c b/drivers/misc/habanalabs/common/firmware_if.c
index 092691a8917d..d81a8d537373 100644
--- a/drivers/misc/habanalabs/common/firmware_if.c
+++ b/drivers/misc/habanalabs/common/firmware_if.c
@@ -342,6 +342,9 @@ static int fw_read_errors(struct hl_device *hdev, u32 boot_err0_reg,
 		dev_err(hdev->dev, "Device boot error - eFuse failure\n");
 	if (err_val & CPU_BOOT_ERR0_PLL_FAIL)
 		dev_err(hdev->dev, "Device boot error - PLL failure\n");
+	if (err_val & CPU_BOOT_ERR0_DEVICE_UNUSABLE_FAIL)
+		dev_err(hdev->dev,
+			"Device boot error - device unusable failure\n");
 
 	security_val = RREG32(cpu_security_boot_status_reg);
 	if (security_val & CPU_BOOT_DEV_STS0_ENABLED)
diff --git a/drivers/misc/habanalabs/include/common/hl_boot_if.h b/drivers/misc/habanalabs/include/common/hl_boot_if.h
index 1717874ff306..980b432fd76e 100644
--- a/drivers/misc/habanalabs/include/common/hl_boot_if.h
+++ b/drivers/misc/habanalabs/include/common/hl_boot_if.h
@@ -73,6 +73,9 @@
  * CPU_BOOT_ERR0_PLL_FAIL		PLL settings failed, meaning that one
  *					of the PLLs remains in REF_CLK
  *
+ * CPU_BOOT_ERR0_DEVICE_UNUSABLE_FAIL	Device is unusable and customer support
+ *					should be contacted.
+ *
  * CPU_BOOT_ERR0_ENABLED		Error registers enabled.
  *					This is a main indication that the
  *					running FW populates the error
@@ -92,6 +95,7 @@
 #define CPU_BOOT_ERR0_PRI_IMG_VER_FAIL		(1 << 10)
 #define CPU_BOOT_ERR0_SEC_IMG_VER_FAIL		(1 << 11)
 #define CPU_BOOT_ERR0_PLL_FAIL			(1 << 12)
+#define CPU_BOOT_ERR0_DEVICE_UNUSABLE_FAIL	(1 << 13)
 #define CPU_BOOT_ERR0_ENABLED			(1 << 31)
 
 /*
-- 
2.25.1


^ permalink raw reply related	[flat|nested] only message in thread

only message in thread, other threads:[~2021-04-04  7:23 UTC | newest]

Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-04-04  7:23 [PATCH] habanalabs: support DEVICE_UNUSABLE error indication from FW Oded Gabbay

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.