linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Zhu Yanjun <yanjun.zhu@linux.dev>
To: Bob Pearson <rpearsonhpe@gmail.com>,
	Bart Van Assche <bvanassche@acm.org>,
	Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Cc: "linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>
Subject: Re: [bug report] blktests srp/002 hang
Date: Sat, 16 Sep 2023 13:59:36 +0800	[thread overview]
Message-ID: <d3205633-0cd2-f87e-1c40-21b8172b6da3@linux.dev> (raw)
In-Reply-To: <9dd0aa0a-d696-a95b-095b-f54d6d31a6ab@linux.dev>

[-- Attachment #1: Type: text/plain, Size: 3005 bytes --]



在 2023/9/14 7:38, Zhu Yanjun 写道:
> 在 2023/9/14 1:36, Bob Pearson 写道:
>> On 8/25/23 08:52, Bart Van Assche wrote:
>>> On 8/24/23 18:11, Shinichiro Kawasaki wrote:
>>>> If it takes time to resolve the issues, it sounds a good idea to 
>>>> make siw driver
>>>> default, since it will make the hangs less painful for blktests 
>>>> users. Another
>>>> idea to reduce the pain is to improve srp/002 and srp/011 to detect 
>>>> the hangs
>>>> and report them as failures.
>>>
>>> At this moment we don't know whether the hangs can be converted into 
>>> failures.
>>> Answering this question is only possible after we have found the root 
>>> cause of
>>> the hang. If the hang would be caused by commands getting stuck in 
>>> multipathd
>>> then it can be solved by changing the path configuration (see also 
>>> the dmsetup
>>> message commands in blktests). If the hang is caused by a kernel bug 
>>> then it's
>>> very well possible that there is no way to recover other than by 
>>> rebooting the
>>> system on which the tests are run.
>>>
>>> Thanks,
>>>
>>> Bart.
>>
>> Since 6.6.0-rc1 came out I decided to give blktests srp another try 
>> with the current
>> rdma for-next branch on my Ubuntu (debian) system. For the first time 
>> in a very long
>> time all the srp test cases run correctly multiple times. I ran each 
>> one 3X.
>>
>> I had tried to build multipath-tools from source but ran into problems 
>> so I reinstalled
>> the current Ubuntu packages. I have no idea what was the root cause 
>> that finally went
>> away but I don't think it was in rxe as there aren't any recent 
>> patches related to the
>> blktests failures. I did notice that the dmesg traces picked up a 
>> couple of lines after
>> the place where it used to hang. Something about setting an ALUA 
>> timeout to 60 seconds.
>>
>> Thanks to all who worked on this.
> 
> Hi, Bob
> 
> About this problem, IIRC, this problem easily occurred on Debian and 
> Fedora 38 and with the commit 9b4b7c1f9f54 ("RDMA/rxe: Add workqueue 
> support for rxe tasks").
> 
> And on Debian, with the latest multipathd, this problem seems to disappear.
> 
> On Fedora 38, even with the latest multipathd, this problem still can be 
> observed.

On Debian, with the latest multipathd or revert the commit 9b4b7c1f9f54 
("RDMA/rxe: Add workqueue support for rxe tasks"), this problem will 
disappear.

On Fedora 38, if the commit 9b4b7c1f9f54 ("RDMA/rxe: Add workqueue 
support for rxe tasks") is reverted, will this problem still appear?
I do not have such test environment. The commit is in the attachment,
can anyone have a test? Please let us know the test result. Thanks.

Zhu Yanjun

> 
> On Ubuntu, it is difficult to reproduce this problem.
> 
> Perhaps this is why you can not reproduce this problem on Ubuntu.
> 
> It seems that this problem is related with linux distribution and the 
> version of multipathd.
> 
> If I am missing something, please feel free to let me know.
> 
> Zhu Yanjun
> 
>>
>> Bob Pearson
> 

[-- Attachment #2: 0001-Revert-RDMA-rxe-Add-workqueue-support-for-rxe-tasks.patch --]
[-- Type: text/plain, Size: 9149 bytes --]

From fd2360edbc9171298d2e91fd9b74b4c3022db9d4 Mon Sep 17 00:00:00 2001
From: Zhu Yanjun <yanjun.zhu@linux.dev>
Date: Fri, 15 Sep 2023 23:07:17 -0400
Subject: [PATCH 1/1] Revert "RDMA/rxe: Add workqueue support for rxe tasks"

This reverts commit 9b4b7c1f9f54120940e243251e2b1407767b3381.

Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
---
 drivers/infiniband/sw/rxe/rxe.c      |   9 +--
 drivers/infiniband/sw/rxe/rxe_task.c | 110 ++++++++++++---------------
 drivers/infiniband/sw/rxe/rxe_task.h |   6 +-
 3 files changed, 49 insertions(+), 76 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c
index 54c723a6edda..7a7e713de52d 100644
--- a/drivers/infiniband/sw/rxe/rxe.c
+++ b/drivers/infiniband/sw/rxe/rxe.c
@@ -212,15 +212,9 @@ static int __init rxe_module_init(void)
 {
 	int err;
 
-	err = rxe_alloc_wq();
-	if (err)
-		return err;
-
 	err = rxe_net_init();
-	if (err) {
-		rxe_destroy_wq();
+	if (err)
 		return err;
-	}
 
 	rdma_link_register(&rxe_link_ops);
 	pr_info("loaded\n");
@@ -232,7 +226,6 @@ static void __exit rxe_module_exit(void)
 	rdma_link_unregister(&rxe_link_ops);
 	ib_unregister_driver(RDMA_DRIVER_RXE);
 	rxe_net_exit();
-	rxe_destroy_wq();
 
 	pr_info("unloaded\n");
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index 1501120d4f52..fb9a6bc8e620 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -6,24 +6,8 @@
 
 #include "rxe.h"
 
-static struct workqueue_struct *rxe_wq;
-
-int rxe_alloc_wq(void)
-{
-	rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, WQ_MAX_ACTIVE);
-	if (!rxe_wq)
-		return -ENOMEM;
-
-	return 0;
-}
-
-void rxe_destroy_wq(void)
-{
-	destroy_workqueue(rxe_wq);
-}
-
 /* Check if task is idle i.e. not running, not scheduled in
- * work queue and not draining. If so move to busy to
+ * tasklet queue and not draining. If so move to busy to
  * reserve a slot in do_task() by setting to busy and taking
  * a qp reference to cover the gap from now until the task finishes.
  * state will move out of busy if task returns a non zero value
@@ -37,6 +21,9 @@ static bool __reserve_if_idle(struct rxe_task *task)
 {
 	WARN_ON(rxe_read(task->qp) <= 0);
 
+	if (task->tasklet.state & BIT(TASKLET_STATE_SCHED))
+		return false;
+
 	if (task->state == TASK_STATE_IDLE) {
 		rxe_get(task->qp);
 		task->state = TASK_STATE_BUSY;
@@ -51,7 +38,7 @@ static bool __reserve_if_idle(struct rxe_task *task)
 }
 
 /* check if task is idle or drained and not currently
- * scheduled in the work queue. This routine is
+ * scheduled in the tasklet queue. This routine is
  * called by rxe_cleanup_task or rxe_disable_task to
  * see if the queue is empty.
  * Context: caller should hold task->lock.
@@ -59,7 +46,7 @@ static bool __reserve_if_idle(struct rxe_task *task)
  */
 static bool __is_done(struct rxe_task *task)
 {
-	if (work_pending(&task->work))
+	if (task->tasklet.state & BIT(TASKLET_STATE_SCHED))
 		return false;
 
 	if (task->state == TASK_STATE_IDLE ||
@@ -90,23 +77,23 @@ static bool is_done(struct rxe_task *task)
  * schedules the task. They must call __reserve_if_idle to
  * move the task to busy before calling or scheduling.
  * The task can also be moved to drained or invalid
- * by calls to rxe_cleanup_task or rxe_disable_task.
+ * by calls to rxe-cleanup_task or rxe_disable_task.
  * In that case tasks which get here are not executed but
  * just flushed. The tasks are designed to look to see if
- * there is work to do and then do part of it before returning
+ * there is work to do and do part of it before returning
  * here with a return value of zero until all the work
- * has been consumed then it returns a non-zero value.
+ * has been consumed then it retuens a non-zero value.
  * The number of times the task can be run is limited by
  * max iterations so one task cannot hold the cpu forever.
- * If the limit is hit and work remains the task is rescheduled.
  */
-static void do_task(struct rxe_task *task)
+static void do_task(struct tasklet_struct *t)
 {
+	int cont;
+	int ret;
+	struct rxe_task *task = from_tasklet(task, t, tasklet);
 	unsigned int iterations;
 	unsigned long flags;
 	int resched = 0;
-	int cont;
-	int ret;
 
 	WARN_ON(rxe_read(task->qp) <= 0);
 
@@ -128,22 +115,25 @@ static void do_task(struct rxe_task *task)
 		} while (ret == 0 && iterations-- > 0);
 
 		spin_lock_irqsave(&task->lock, flags);
-		/* we're not done yet but we ran out of iterations.
-		 * yield the cpu and reschedule the task
-		 */
-		if (!ret) {
-			task->state = TASK_STATE_IDLE;
-			resched = 1;
-			goto exit;
-		}
-
 		switch (task->state) {
 		case TASK_STATE_BUSY:
-			task->state = TASK_STATE_IDLE;
+			if (ret) {
+				task->state = TASK_STATE_IDLE;
+			} else {
+				/* This can happen if the client
+				 * can add work faster than the
+				 * tasklet can finish it.
+				 * Reschedule the tasklet and exit
+				 * the loop to give up the cpu
+				 */
+				task->state = TASK_STATE_IDLE;
+				resched = 1;
+			}
 			break;
 
-		/* someone tried to schedule the task while we
-		 * were running, keep going
+		/* someone tried to run the task since the last time we called
+		 * func, so we will call one more time regardless of the
+		 * return value
 		 */
 		case TASK_STATE_ARMED:
 			task->state = TASK_STATE_BUSY;
@@ -151,24 +141,22 @@ static void do_task(struct rxe_task *task)
 			break;
 
 		case TASK_STATE_DRAINING:
-			task->state = TASK_STATE_DRAINED;
+			if (ret)
+				task->state = TASK_STATE_DRAINED;
+			else
+				cont = 1;
 			break;
 
 		default:
 			WARN_ON(1);
-			rxe_dbg_qp(task->qp, "unexpected task state = %d",
-				   task->state);
-			task->state = TASK_STATE_IDLE;
+			rxe_info_qp(task->qp, "unexpected task state = %d", task->state);
 		}
 
-exit:
 		if (!cont) {
 			task->num_done++;
 			if (WARN_ON(task->num_done != task->num_sched))
-				rxe_dbg_qp(
-					task->qp,
-					"%ld tasks scheduled, %ld tasks done",
-					task->num_sched, task->num_done);
+				rxe_err_qp(task->qp, "%ld tasks scheduled, %ld tasks done",
+					   task->num_sched, task->num_done);
 		}
 		spin_unlock_irqrestore(&task->lock, flags);
 	} while (cont);
@@ -181,12 +169,6 @@ static void do_task(struct rxe_task *task)
 	rxe_put(task->qp);
 }
 
-/* wrapper around do_task to fix argument for work queue */
-static void do_work(struct work_struct *work)
-{
-	do_task(container_of(work, struct rxe_task, work));
-}
-
 int rxe_init_task(struct rxe_task *task, struct rxe_qp *qp,
 		  int (*func)(struct rxe_qp *))
 {
@@ -194,9 +176,11 @@ int rxe_init_task(struct rxe_task *task, struct rxe_qp *qp,
 
 	task->qp = qp;
 	task->func = func;
+
+	tasklet_setup(&task->tasklet, do_task);
+
 	task->state = TASK_STATE_IDLE;
 	spin_lock_init(&task->lock);
-	INIT_WORK(&task->work, do_work);
 
 	return 0;
 }
@@ -229,6 +213,8 @@ void rxe_cleanup_task(struct rxe_task *task)
 	while (!is_done(task))
 		cond_resched();
 
+	tasklet_kill(&task->tasklet);
+
 	spin_lock_irqsave(&task->lock, flags);
 	task->state = TASK_STATE_INVALID;
 	spin_unlock_irqrestore(&task->lock, flags);
@@ -240,7 +226,7 @@ void rxe_cleanup_task(struct rxe_task *task)
 void rxe_run_task(struct rxe_task *task)
 {
 	unsigned long flags;
-	bool run;
+	int run;
 
 	WARN_ON(rxe_read(task->qp) <= 0);
 
@@ -249,11 +235,11 @@ void rxe_run_task(struct rxe_task *task)
 	spin_unlock_irqrestore(&task->lock, flags);
 
 	if (run)
-		do_task(task);
+		do_task(&task->tasklet);
 }
 
-/* schedule the task to run later as a work queue entry.
- * the queue_work call can be called holding
+/* schedule the task to run later as a tasklet.
+ * the tasklet)schedule call can be called holding
  * the lock.
  */
 void rxe_sched_task(struct rxe_task *task)
@@ -264,7 +250,7 @@ void rxe_sched_task(struct rxe_task *task)
 
 	spin_lock_irqsave(&task->lock, flags);
 	if (__reserve_if_idle(task))
-		queue_work(rxe_wq, &task->work);
+		tasklet_schedule(&task->tasklet);
 	spin_unlock_irqrestore(&task->lock, flags);
 }
 
@@ -291,9 +277,7 @@ void rxe_disable_task(struct rxe_task *task)
 	while (!is_done(task))
 		cond_resched();
 
-	spin_lock_irqsave(&task->lock, flags);
-	task->state = TASK_STATE_DRAINED;
-	spin_unlock_irqrestore(&task->lock, flags);
+	tasklet_disable(&task->tasklet);
 }
 
 void rxe_enable_task(struct rxe_task *task)
@@ -307,7 +291,7 @@ void rxe_enable_task(struct rxe_task *task)
 		spin_unlock_irqrestore(&task->lock, flags);
 		return;
 	}
-
 	task->state = TASK_STATE_IDLE;
+	tasklet_enable(&task->tasklet);
 	spin_unlock_irqrestore(&task->lock, flags);
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
index a63e258b3d66..facb7c8e3729 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.h
+++ b/drivers/infiniband/sw/rxe/rxe_task.h
@@ -22,7 +22,7 @@ enum {
  * called again.
  */
 struct rxe_task {
-	struct work_struct	work;
+	struct tasklet_struct	tasklet;
 	int			state;
 	spinlock_t		lock;
 	struct rxe_qp		*qp;
@@ -32,10 +32,6 @@ struct rxe_task {
 	long			num_done;
 };
 
-int rxe_alloc_wq(void);
-
-void rxe_destroy_wq(void);
-
 /*
  * init rxe_task structure
  *	qp  => parameter to pass to func
-- 
2.40.1


  reply	other threads:[~2023-09-16  6:07 UTC|newest]

Thread overview: 86+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-21  6:46 [bug report] blktests srp/002 hang Shinichiro Kawasaki
2023-08-22  1:46 ` Bob Pearson
2023-08-22 10:18   ` Shinichiro Kawasaki
2023-08-22 15:20     ` Bart Van Assche
2023-08-23 16:19       ` Bob Pearson
2023-08-23 19:46         ` Bart Van Assche
2023-08-24 16:24           ` Bob Pearson
2023-08-24  8:55         ` Bernard Metzler
2023-08-24 15:35         ` Bernard Metzler
2023-08-24 16:05           ` Bart Van Assche
2023-08-24 16:27             ` Bob Pearson
2023-08-25  1:11       ` Shinichiro Kawasaki
2023-08-25  1:36         ` Bob Pearson
2023-08-25 10:16           ` Shinichiro Kawasaki
2023-08-25 13:49           ` Bart Van Assche
2023-08-25 13:52         ` Bart Van Assche
2023-09-13 17:36           ` Bob Pearson
2023-09-13 23:38             ` Zhu Yanjun
2023-09-16  5:59               ` Zhu Yanjun [this message]
2023-09-19  4:14                 ` Shinichiro Kawasaki
2023-09-19  8:07                   ` Zhu Yanjun
2023-09-19 16:30                     ` Pearson, Robert B
2023-09-19 18:11                     ` Bob Pearson
2023-09-20  4:22                       ` Zhu Yanjun
2023-09-20 16:24                         ` Bob Pearson
2023-09-20 16:36                           ` Bart Van Assche
2023-09-20 17:18                             ` Bob Pearson
2023-09-20 17:22                               ` Bart Van Assche
2023-09-20 17:29                                 ` Bob Pearson
2023-09-21  5:46                                   ` Zhu Yanjun
2023-09-21 10:06                                   ` Zhu Yanjun
2023-09-21 14:23                                   ` Rain River
2023-09-21 14:39                                     ` Bob Pearson
2023-09-21 15:08                                       ` Zhu Yanjun
2023-09-21 15:10                                       ` Zhu Yanjun
2023-09-22 18:14                                         ` Bob Pearson
2023-09-22 22:06                                           ` Bart Van Assche
2023-09-24  1:17                                           ` Rain River
2023-09-25  4:47                                             ` Daisuke Matsuda (Fujitsu)
2023-09-25 14:31                                               ` Zhu Yanjun
2023-09-26  1:09                                                 ` Daisuke Matsuda (Fujitsu)
2023-09-26  6:09                                                   ` Zhu Yanjun
2023-09-25 15:00                                               ` Bart Van Assche
2023-09-25 15:25                                                 ` Bob Pearson
2023-09-25 15:52                                                 ` Jason Gunthorpe
2023-09-25 15:54                                                   ` Bob Pearson
2023-09-25 19:57                                                 ` Bob Pearson
2023-09-25 20:33                                                   ` Bart Van Assche
2023-09-25 20:40                                                     ` Bob Pearson
2023-09-26 15:36                                                   ` Rain River
2023-09-26  1:17                                                 ` Daisuke Matsuda (Fujitsu)
2023-10-17 17:09                                                   ` Bob Pearson
2023-10-17 17:13                                                     ` Bart Van Assche
2023-10-17 17:15                                                       ` Bob Pearson
2023-10-17 17:19                                                       ` Bob Pearson
2023-10-17 17:34                                                         ` Bart Van Assche
2023-10-17 17:58                                                     ` Jason Gunthorpe
2023-10-17 18:44                                                       ` Bob Pearson
2023-10-17 18:51                                                         ` Jason Gunthorpe
2023-10-17 19:55                                                           ` Bob Pearson
2023-10-17 20:06                                                             ` Bart Van Assche
2023-10-17 20:13                                                               ` Bob Pearson
2023-10-17 21:14                                                               ` Bob Pearson
2023-10-17 21:18                                                                 ` Bart Van Assche
2023-10-17 21:23                                                                   ` Bob Pearson
2023-10-17 21:30                                                                     ` Bart Van Assche
2023-10-17 21:39                                                                       ` Bob Pearson
2023-10-17 22:42                                                                         ` Bart Van Assche
2023-10-18 18:29                                                                           ` Bob Pearson
2023-10-18 19:17                                                                             ` Jason Gunthorpe
2023-10-18 19:48                                                                               ` Bart Van Assche
2023-10-18 20:03                                                                                 ` Bob Pearson
2023-10-18 20:04                                                                                 ` Bob Pearson
2023-10-18 20:14                                                                                 ` Bob Pearson
     [not found]                                                                                 ` <fb5f6da5-5017-440d-9cb5-38796554366c@gmail.com>
2023-10-18 20:49                                                                                   ` Bart Van Assche
2023-10-18 21:17                                                                                     ` Pearson, Robert B
2023-10-18 21:27                                                                                       ` Bart Van Assche
2023-10-18 21:52                                                                                         ` Bob Pearson
2023-10-19 19:17                                                                                           ` Bart Van Assche
2023-10-20 17:12                                                                                             ` Bob Pearson
2023-10-20 17:41                                                                                               ` Bart Van Assche
2023-10-18 19:38                                                                             ` Bart Van Assche
2023-10-17 19:18                                                       ` Bart Van Assche
2023-10-18  8:16                                                     ` Zhu Yanjun
2023-09-22 11:06 ` Linux regression tracking #adding (Thorsten Leemhuis)
2023-10-13 12:51   ` Linux regression tracking #update (Thorsten Leemhuis)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d3205633-0cd2-f87e-1c40-21b8172b6da3@linux.dev \
    --to=yanjun.zhu@linux.dev \
    --cc=bvanassche@acm.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=rpearsonhpe@gmail.com \
    --cc=shinichiro.kawasaki@wdc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).