All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oded Gabbay <oded.gabbay@gmail.com>
To: linux-kernel@vger.kernel.org, oshpigelman@habana.ai,
	ttayar@habana.ai, gregkh@linuxfoundation.org
Subject: [PATCH 7/9] habanalabs: protect only pointer dereference in hard-reset
Date: Sun, 28 Jul 2019 14:28:16 +0300	[thread overview]
Message-ID: <20190728112818.30397-8-oded.gabbay@gmail.com> (raw)
In-Reply-To: <20190728112818.30397-1-oded.gabbay@gmail.com>

This patch changes the location of taking a mutex lock and releasing it
during the hard-reset process of the ASIC.

The only place we need to protect is when we dereference pointers that may
go away in case the user process aborts/closes the FD.

That way, we allow the user process to actually close its FD in case we
tell him that an error occurred.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
---
 drivers/misc/habanalabs/device.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c
index 5400e65ba5fa..471506b54217 100644
--- a/drivers/misc/habanalabs/device.c
+++ b/drivers/misc/habanalabs/device.c
@@ -574,20 +574,21 @@ static void device_kill_open_processes(struct hl_device *hdev)
 	else
 		pending_total = HL_PENDING_RESET_PER_SEC;
 
-	pending_cnt = pending_total;
-
 	/* Flush all processes that are inside hl_open */
 	mutex_lock(&hdev->fpriv_list_lock);
+	mutex_unlock(&hdev->fpriv_list_lock);
 
-	while ((!list_empty(&hdev->fpriv_list)) && (pending_cnt)) {
-
-		pending_cnt--;
-
-		dev_info(hdev->dev,
-			"Can't HARD reset, waiting for user to close FD\n");
+	/* Giving time for user to close FD, and for processes that are inside
+	 * hl_device_open to finish
+	 */
+	if (!list_empty(&hdev->fpriv_list))
 		ssleep(1);
-	}
 
+	mutex_lock(&hdev->fpriv_list_lock);
+
+	/* This section must be protected because we are dereferencing
+	 * pointers that are freed if the process exits
+	 */
 	if (!list_empty(&hdev->fpriv_list)) {
 		task = get_pid_task(hdev->compute_ctx->hpriv->taskpid,
 					PIDTYPE_PID);
@@ -600,6 +601,8 @@ static void device_kill_open_processes(struct hl_device *hdev)
 		}
 	}
 
+	mutex_unlock(&hdev->fpriv_list_lock);
+
 	/* We killed the open users, but because the driver cleans up after the
 	 * user contexts are closed (e.g. mmu mappings), we need to wait again
 	 * to make sure the cleaning phase is finished before continuing with
@@ -609,6 +612,8 @@ static void device_kill_open_processes(struct hl_device *hdev)
 	pending_cnt = pending_total;
 
 	while ((!list_empty(&hdev->fpriv_list)) && (pending_cnt)) {
+		dev_info(hdev->dev,
+			"Waiting for all unmap operations to finish before hard reset\n");
 
 		pending_cnt--;
 
@@ -618,9 +623,6 @@ static void device_kill_open_processes(struct hl_device *hdev)
 	if (!list_empty(&hdev->fpriv_list))
 		dev_crit(hdev->dev,
 			"Going to hard reset with open user contexts\n");
-
-	mutex_unlock(&hdev->fpriv_list_lock);
-
 }
 
 static void device_hard_reset_pending(struct work_struct *work)
-- 
2.17.1


  parent reply	other threads:[~2019-07-28 11:28 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-28 11:28 [PATCH 0/9] habanalabs: support open device by multiple processes Oded Gabbay
2019-07-28 11:28 ` [PATCH 1/9] habanalabs: add handle field to context structure Oded Gabbay
2019-07-28 11:28 ` [PATCH 2/9] habanalabs: verify context is valid in IOCTLs Oded Gabbay
2019-07-28 11:28 ` [PATCH 3/9] habanalabs: create context in lazy mode Oded Gabbay
2019-07-28 11:28 ` [PATCH 4/9] habanalabs: don't change frequency if user context is valid Oded Gabbay
2019-07-28 11:28 ` [PATCH 5/9] habanalabs: maintain a list of file private data objects Oded Gabbay
2019-07-28 11:28 ` [PATCH 6/9] habanalabs: define user context as compute context Oded Gabbay
2019-07-28 11:28 ` Oded Gabbay [this message]
2019-07-28 11:28 ` [PATCH 8/9] habanalabs: kill user process after CS rollback Oded Gabbay
2019-07-28 11:28 ` [PATCH 9/9] habanalabs: allow multiple processes to open FD Oded Gabbay
2019-07-28 11:44   ` Greg KH
2019-07-28 11:56     ` Oded Gabbay
2019-07-28 12:04       ` Greg KH
2019-07-28 12:06         ` Oded Gabbay
2019-07-28 12:12           ` Greg KH
2019-07-28 12:18             ` Oded Gabbay

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190728112818.30397-8-oded.gabbay@gmail.com \
    --to=oded.gabbay@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oshpigelman@habana.ai \
    --cc=ttayar@habana.ai \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.