* [PATCH] scsi: core: move scsi_host_busy() out of host lock for waking up EH handler
@ 2024-01-12 7:00 Ming Lei
2024-01-12 11:12 ` Hannes Reinecke
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Ming Lei @ 2024-01-12 7:00 UTC (permalink / raw)
To: Martin K . Petersen, linux-scsi; +Cc: Ming Lei, Ewan Milne
Inside scsi_eh_wakeup(), scsi_host_busy() is called & checked with host lock
every time for deciding if error handler kthread needs to be waken up.
This way can be too heavy in case of recovery, such as:
- N hardware queues
- queue depth is M for each hardware queue
- each scsi_host_busy() iterates over (N * M) tag/requests
If recovery is triggered in case that all requests are in-flight, each
scsi_eh_wakeup() is strictly serialized, when scsi_eh_wakeup() is called
for the last in-flight request, scsi_host_busy() has been run for (N * M - 1)
times, and request has been iterated for (N*M - 1) * (N * M) times.
If both N and M are big enough, hard lockup can be triggered on acquiring
host lock, and it is observed on mpi3mr(128 hw queues, queue depth 8169).
Fix the issue by calling scsi_host_busy() outside host lock, and we
don't need host lock for getting busy count because host lock never
covers that.
Cc: Ewan Milne <emilne@redhat.com>
Fixes: 6eb045e092ef ("scsi: core: avoid host-wide host_busy counter for scsi_mq")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
BTW, another way is to wake up EH handler unconditionally, and it should
work too.
drivers/scsi/scsi_error.c | 11 +++++++----
drivers/scsi/scsi_lib.c | 4 +++-
drivers/scsi/scsi_priv.h | 2 +-
3 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index c67cdcdc3ba8..6743eb88e0a5 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -61,11 +61,11 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd);
static enum scsi_disposition scsi_try_to_abort_cmd(const struct scsi_host_template *,
struct scsi_cmnd *);
-void scsi_eh_wakeup(struct Scsi_Host *shost)
+void scsi_eh_wakeup(struct Scsi_Host *shost, unsigned int busy)
{
lockdep_assert_held(shost->host_lock);
- if (scsi_host_busy(shost) == shost->host_failed) {
+ if (busy == shost->host_failed) {
trace_scsi_eh_wakeup(shost);
wake_up_process(shost->ehandler);
SCSI_LOG_ERROR_RECOVERY(5, shost_printk(KERN_INFO, shost,
@@ -87,8 +87,10 @@ void scsi_schedule_eh(struct Scsi_Host *shost)
if (scsi_host_set_state(shost, SHOST_RECOVERY) == 0 ||
scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY) == 0) {
+ unsigned int busy = scsi_host_busy(shost);
+
shost->host_eh_scheduled++;
- scsi_eh_wakeup(shost);
+ scsi_eh_wakeup(shost, busy);
}
spin_unlock_irqrestore(shost->host_lock, flags);
@@ -283,10 +285,11 @@ static void scsi_eh_inc_host_failed(struct rcu_head *head)
struct scsi_cmnd *scmd = container_of(head, typeof(*scmd), rcu);
struct Scsi_Host *shost = scmd->device->host;
unsigned long flags;
+ unsigned int busy = scsi_host_busy(shost);
spin_lock_irqsave(shost->host_lock, flags);
shost->host_failed++;
- scsi_eh_wakeup(shost);
+ scsi_eh_wakeup(shost, busy);
spin_unlock_irqrestore(shost->host_lock, flags);
}
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index cf3864f72093..df5ac03d5d6c 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -278,9 +278,11 @@ static void scsi_dec_host_busy(struct Scsi_Host *shost, struct scsi_cmnd *cmd)
rcu_read_lock();
__clear_bit(SCMD_STATE_INFLIGHT, &cmd->state);
if (unlikely(scsi_host_in_recovery(shost))) {
+ unsigned int busy = scsi_host_busy(shost);
+
spin_lock_irqsave(shost->host_lock, flags);
if (shost->host_failed || shost->host_eh_scheduled)
- scsi_eh_wakeup(shost);
+ scsi_eh_wakeup(shost, busy);
spin_unlock_irqrestore(shost->host_lock, flags);
}
rcu_read_unlock();
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 3f0dfb97db6b..1fbfe1b52c9f 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -92,7 +92,7 @@ extern void scmd_eh_abort_handler(struct work_struct *work);
extern enum blk_eh_timer_return scsi_timeout(struct request *req);
extern int scsi_error_handler(void *host);
extern enum scsi_disposition scsi_decide_disposition(struct scsi_cmnd *cmd);
-extern void scsi_eh_wakeup(struct Scsi_Host *shost);
+extern void scsi_eh_wakeup(struct Scsi_Host *shost, unsigned int busy);
extern void scsi_eh_scmd_add(struct scsi_cmnd *);
void scsi_eh_ready_devs(struct Scsi_Host *shost,
struct list_head *work_q,
--
2.42.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi: core: move scsi_host_busy() out of host lock for waking up EH handler
2024-01-12 7:00 [PATCH] scsi: core: move scsi_host_busy() out of host lock for waking up EH handler Ming Lei
@ 2024-01-12 11:12 ` Hannes Reinecke
2024-01-12 12:42 ` Ming Lei
2024-01-23 15:23 ` Bart Van Assche
2024-01-24 3:00 ` Martin K. Petersen
2 siblings, 1 reply; 9+ messages in thread
From: Hannes Reinecke @ 2024-01-12 11:12 UTC (permalink / raw)
To: Ming Lei, Martin K . Petersen, linux-scsi; +Cc: Ewan Milne
On 1/12/24 08:00, Ming Lei wrote:
> Inside scsi_eh_wakeup(), scsi_host_busy() is called & checked with host lock
> every time for deciding if error handler kthread needs to be waken up.
>
> This way can be too heavy in case of recovery, such as:
>
> - N hardware queues
> - queue depth is M for each hardware queue
> - each scsi_host_busy() iterates over (N * M) tag/requests
>
> If recovery is triggered in case that all requests are in-flight, each
> scsi_eh_wakeup() is strictly serialized, when scsi_eh_wakeup() is called
> for the last in-flight request, scsi_host_busy() has been run for (N * M - 1)
> times, and request has been iterated for (N*M - 1) * (N * M) times.
>
> If both N and M are big enough, hard lockup can be triggered on acquiring
> host lock, and it is observed on mpi3mr(128 hw queues, queue depth 8169).
>
> Fix the issue by calling scsi_host_busy() outside host lock, and we
> don't need host lock for getting busy count because host lock never
> covers that.
>
Can you share details for the hard lockup?
I do agree that scsi_host_busy() is an expensive operation, so it
might not be ideal to call it under a spin lock.
But I wonder where the lockup comes in here.
Care to explain?
And if it leads to a lockup, aren't other instances calling
scsi_host_busy() under a spinlock affected, as well?
Cheers,
Hannes
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi: core: move scsi_host_busy() out of host lock for waking up EH handler
2024-01-12 11:12 ` Hannes Reinecke
@ 2024-01-12 12:42 ` Ming Lei
2024-01-12 19:34 ` Ewan Milne
0 siblings, 1 reply; 9+ messages in thread
From: Ming Lei @ 2024-01-12 12:42 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: Martin K . Petersen, linux-scsi, Ewan Milne, ming.lei
On Fri, Jan 12, 2024 at 12:12:57PM +0100, Hannes Reinecke wrote:
> On 1/12/24 08:00, Ming Lei wrote:
> > Inside scsi_eh_wakeup(), scsi_host_busy() is called & checked with host lock
> > every time for deciding if error handler kthread needs to be waken up.
> >
> > This way can be too heavy in case of recovery, such as:
> >
> > - N hardware queues
> > - queue depth is M for each hardware queue
> > - each scsi_host_busy() iterates over (N * M) tag/requests
> >
> > If recovery is triggered in case that all requests are in-flight, each
> > scsi_eh_wakeup() is strictly serialized, when scsi_eh_wakeup() is called
> > for the last in-flight request, scsi_host_busy() has been run for (N * M - 1)
> > times, and request has been iterated for (N*M - 1) * (N * M) times.
> >
> > If both N and M are big enough, hard lockup can be triggered on acquiring
> > host lock, and it is observed on mpi3mr(128 hw queues, queue depth 8169).
> >
> > Fix the issue by calling scsi_host_busy() outside host lock, and we
> > don't need host lock for getting busy count because host lock never
> > covers that.
> >
> Can you share details for the hard lockup?
> I do agree that scsi_host_busy() is an expensive operation, so it
> might not be ideal to call it under a spin lock.
> But I wonder where the lockup comes in here.
> Care to explain?
Recovery happens when there is N * M inflight requests, then scsi_dec_host_busy()
can be called for each inflight request/scmnd from irq context.
host lock serializes every scsi_eh_wakeup().
Given each hardware queue has its own irq handler, so there could be one
request, scsi_dec_host_busy() is called and the host lock is spinned until
it is released from scsi_dec_host_busy() for all requests from all other
hardware queues.
The spin time can be long enough to trigger the hard lockup if N and M
is big enough, and the total wait time can be:
(N - 1) * M * time_taken_in_scsi_host_busy().
Meantime the same story happens on scsi_eh_inc_host_failed() which is
called from softirq context, so host lock spin can be much more worse.
It is observed on mpi3mr with 128(N) hw queues and 8169(M) queue depth.
>
> And if it leads to a lockup, aren't other instances calling scsi_host_busy()
> under a spinlock affected, as well?
It is only possible when it is called in per-command situation.
Thanks,
Ming
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi: core: move scsi_host_busy() out of host lock for waking up EH handler
2024-01-12 12:42 ` Ming Lei
@ 2024-01-12 19:34 ` Ewan Milne
2024-01-13 1:59 ` Ming Lei
0 siblings, 1 reply; 9+ messages in thread
From: Ewan Milne @ 2024-01-12 19:34 UTC (permalink / raw)
To: Ming Lei; +Cc: Hannes Reinecke, Martin K . Petersen, linux-scsi
On Fri, Jan 12, 2024 at 7:43 AM Ming Lei <ming.lei@redhat.com> wrote:
>
> On Fri, Jan 12, 2024 at 12:12:57PM +0100, Hannes Reinecke wrote:
> > On 1/12/24 08:00, Ming Lei wrote:
> > > Inside scsi_eh_wakeup(), scsi_host_busy() is called & checked with host lock
> > > every time for deciding if error handler kthread needs to be waken up.
> > >
> > > This way can be too heavy in case of recovery, such as:
> > >
> > > - N hardware queues
> > > - queue depth is M for each hardware queue
> > > - each scsi_host_busy() iterates over (N * M) tag/requests
> > >
> > > If recovery is triggered in case that all requests are in-flight, each
> > > scsi_eh_wakeup() is strictly serialized, when scsi_eh_wakeup() is called
> > > for the last in-flight request, scsi_host_busy() has been run for (N * M - 1)
> > > times, and request has been iterated for (N*M - 1) * (N * M) times.
> > >
> > > If both N and M are big enough, hard lockup can be triggered on acquiring
> > > host lock, and it is observed on mpi3mr(128 hw queues, queue depth 8169).
> > >
> > > Fix the issue by calling scsi_host_busy() outside host lock, and we
> > > don't need host lock for getting busy count because host lock never
> > > covers that.
> > >
> > Can you share details for the hard lockup?
> > I do agree that scsi_host_busy() is an expensive operation, so it
> > might not be ideal to call it under a spin lock.
> > But I wonder where the lockup comes in here.
> > Care to explain?
>
> Recovery happens when there is N * M inflight requests, then scsi_dec_host_busy()
> can be called for each inflight request/scmnd from irq context.
>
> host lock serializes every scsi_eh_wakeup().
>
> Given each hardware queue has its own irq handler, so there could be one
> request, scsi_dec_host_busy() is called and the host lock is spinned until
> it is released from scsi_dec_host_busy() for all requests from all other
> hardware queues.
>
> The spin time can be long enough to trigger the hard lockup if N and M
> is big enough, and the total wait time can be:
>
> (N - 1) * M * time_taken_in_scsi_host_busy().
>
> Meantime the same story happens on scsi_eh_inc_host_failed() which is
> called from softirq context, so host lock spin can be much more worse.
>
> It is observed on mpi3mr with 128(N) hw queues and 8169(M) queue depth.
>
> >
> > And if it leads to a lockup, aren't other instances calling scsi_host_busy()
> > under a spinlock affected, as well?
>
> It is only possible when it is called in per-command situation.
>
>
> Thanks,
> Ming
>
I can't see why this wouldn't work, or cause a problem with a lost wakeup,
but the cost of iterating to obtain the host_busy value is still being paid,
just outside the host_lock. If this has triggered a hard lockup, should
we revisit the algorithm, e.g. are we still delaying EH wakeup for a noticeable
amount of time? O(n^2) algorithms in the kernel don't seem like the best idea.
In any case...
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
-Ewan
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi: core: move scsi_host_busy() out of host lock for waking up EH handler
2024-01-12 19:34 ` Ewan Milne
@ 2024-01-13 1:59 ` Ming Lei
2024-01-23 7:04 ` Sathya Prakash Veerichetty
0 siblings, 1 reply; 9+ messages in thread
From: Ming Lei @ 2024-01-13 1:59 UTC (permalink / raw)
To: Ewan Milne; +Cc: Hannes Reinecke, Martin K . Petersen, linux-scsi, ming.lei
On Fri, Jan 12, 2024 at 02:34:52PM -0500, Ewan Milne wrote:
> On Fri, Jan 12, 2024 at 7:43 AM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > On Fri, Jan 12, 2024 at 12:12:57PM +0100, Hannes Reinecke wrote:
> > > On 1/12/24 08:00, Ming Lei wrote:
> > > > Inside scsi_eh_wakeup(), scsi_host_busy() is called & checked with host lock
> > > > every time for deciding if error handler kthread needs to be waken up.
> > > >
> > > > This way can be too heavy in case of recovery, such as:
> > > >
> > > > - N hardware queues
> > > > - queue depth is M for each hardware queue
> > > > - each scsi_host_busy() iterates over (N * M) tag/requests
> > > >
> > > > If recovery is triggered in case that all requests are in-flight, each
> > > > scsi_eh_wakeup() is strictly serialized, when scsi_eh_wakeup() is called
> > > > for the last in-flight request, scsi_host_busy() has been run for (N * M - 1)
> > > > times, and request has been iterated for (N*M - 1) * (N * M) times.
> > > >
> > > > If both N and M are big enough, hard lockup can be triggered on acquiring
> > > > host lock, and it is observed on mpi3mr(128 hw queues, queue depth 8169).
> > > >
> > > > Fix the issue by calling scsi_host_busy() outside host lock, and we
> > > > don't need host lock for getting busy count because host lock never
> > > > covers that.
> > > >
> > > Can you share details for the hard lockup?
> > > I do agree that scsi_host_busy() is an expensive operation, so it
> > > might not be ideal to call it under a spin lock.
> > > But I wonder where the lockup comes in here.
> > > Care to explain?
> >
> > Recovery happens when there is N * M inflight requests, then scsi_dec_host_busy()
> > can be called for each inflight request/scmnd from irq context.
> >
> > host lock serializes every scsi_eh_wakeup().
> >
> > Given each hardware queue has its own irq handler, so there could be one
> > request, scsi_dec_host_busy() is called and the host lock is spinned until
> > it is released from scsi_dec_host_busy() for all requests from all other
> > hardware queues.
> >
> > The spin time can be long enough to trigger the hard lockup if N and M
> > is big enough, and the total wait time can be:
> >
> > (N - 1) * M * time_taken_in_scsi_host_busy().
> >
> > Meantime the same story happens on scsi_eh_inc_host_failed() which is
> > called from softirq context, so host lock spin can be much more worse.
> >
> > It is observed on mpi3mr with 128(N) hw queues and 8169(M) queue depth.
> >
> > >
> > > And if it leads to a lockup, aren't other instances calling scsi_host_busy()
> > > under a spinlock affected, as well?
> >
> > It is only possible when it is called in per-command situation.
> >
> >
> > Thanks,
> > Ming
> >
>
> I can't see why this wouldn't work, or cause a problem with a lost wakeup,
> but the cost of iterating to obtain the host_busy value is still being paid,
> just outside the host_lock. If this has triggered a hard lockup, should
> we revisit the algorithm, e.g. are we still delaying EH wakeup for a noticeable
> amount of time?
SCSI EH is designed to start handling until all in-flight commands are
failed, so it waits until all requests are failed first.
> O(n^2) algorithms in the kernel don't seem like the best idea.
It is actually O(n) because each hardware queue handles request
in parallel.
It is degraded to O(n^2) or O(n * m) just because of shared host lock.
Single or N scsi_host_busy() won't take too long without host lock, what
matters is actually the per-host lock spin time which can be accumulated
as too big.
>
> In any case...
> Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Thanks for the review!
--
Ming
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi: core: move scsi_host_busy() out of host lock for waking up EH handler
2024-01-13 1:59 ` Ming Lei
@ 2024-01-23 7:04 ` Sathya Prakash Veerichetty
0 siblings, 0 replies; 9+ messages in thread
From: Sathya Prakash Veerichetty @ 2024-01-23 7:04 UTC (permalink / raw)
To: Ming Lei; +Cc: Ewan Milne, Hannes Reinecke, Martin K . Petersen, linux-scsi
[-- Attachment #1: Type: text/plain, Size: 4142 bytes --]
On Fri, Jan 12, 2024 at 6:59 PM Ming Lei <ming.lei@redhat.com> wrote:
>
> On Fri, Jan 12, 2024 at 02:34:52PM -0500, Ewan Milne wrote:
> > On Fri, Jan 12, 2024 at 7:43 AM Ming Lei <ming.lei@redhat.com> wrote:
> > >
> > > On Fri, Jan 12, 2024 at 12:12:57PM +0100, Hannes Reinecke wrote:
> > > > On 1/12/24 08:00, Ming Lei wrote:
> > > > > Inside scsi_eh_wakeup(), scsi_host_busy() is called & checked with host lock
> > > > > every time for deciding if error handler kthread needs to be waken up.
> > > > >
> > > > > This way can be too heavy in case of recovery, such as:
> > > > >
> > > > > - N hardware queues
> > > > > - queue depth is M for each hardware queue
> > > > > - each scsi_host_busy() iterates over (N * M) tag/requests
> > > > >
> > > > > If recovery is triggered in case that all requests are in-flight, each
> > > > > scsi_eh_wakeup() is strictly serialized, when scsi_eh_wakeup() is called
> > > > > for the last in-flight request, scsi_host_busy() has been run for (N * M - 1)
> > > > > times, and request has been iterated for (N*M - 1) * (N * M) times.
> > > > >
> > > > > If both N and M are big enough, hard lockup can be triggered on acquiring
> > > > > host lock, and it is observed on mpi3mr(128 hw queues, queue depth 8169).
> > > > >
> > > > > Fix the issue by calling scsi_host_busy() outside host lock, and we
> > > > > don't need host lock for getting busy count because host lock never
> > > > > covers that.
> > > > >
> > > > Can you share details for the hard lockup?
> > > > I do agree that scsi_host_busy() is an expensive operation, so it
> > > > might not be ideal to call it under a spin lock.
> > > > But I wonder where the lockup comes in here.
> > > > Care to explain?
> > >
> > > Recovery happens when there is N * M inflight requests, then scsi_dec_host_busy()
> > > can be called for each inflight request/scmnd from irq context.
> > >
> > > host lock serializes every scsi_eh_wakeup().
> > >
> > > Given each hardware queue has its own irq handler, so there could be one
> > > request, scsi_dec_host_busy() is called and the host lock is spinned until
> > > it is released from scsi_dec_host_busy() for all requests from all other
> > > hardware queues.
> > >
> > > The spin time can be long enough to trigger the hard lockup if N and M
> > > is big enough, and the total wait time can be:
> > >
> > > (N - 1) * M * time_taken_in_scsi_host_busy().
> > >
> > > Meantime the same story happens on scsi_eh_inc_host_failed() which is
> > > called from softirq context, so host lock spin can be much more worse.
> > >
> > > It is observed on mpi3mr with 128(N) hw queues and 8169(M) queue depth.
> > >
> > > >
> > > > And if it leads to a lockup, aren't other instances calling scsi_host_busy()
> > > > under a spinlock affected, as well?
> > >
> > > It is only possible when it is called in per-command situation.
> > >
> > >
> > > Thanks,
> > > Ming
> > >
> >
> > I can't see why this wouldn't work, or cause a problem with a lost wakeup,
> > but the cost of iterating to obtain the host_busy value is still being paid,
> > just outside the host_lock. If this has triggered a hard lockup, should
> > we revisit the algorithm, e.g. are we still delaying EH wakeup for a noticeable
> > amount of time?
>
> SCSI EH is designed to start handling until all in-flight commands are
> failed, so it waits until all requests are failed first.
>
> > O(n^2) algorithms in the kernel don't seem like the best idea.
>
> It is actually O(n) because each hardware queue handles request
> in parallel.
>
> It is degraded to O(n^2) or O(n * m) just because of shared host lock.
>
> Single or N scsi_host_busy() won't take too long without host lock, what
> matters is actually the per-host lock spin time which can be accumulated
> as too big.
>
> >
> > In any case...
> > Reviewed-by: Ewan D. Milne <emilne@redhat.com>
>
> Thanks for the review!
>
>
> --
> Ming
>
>
Reviewed-by: Sathya Prakash Veerichetty <safhya.prakash@broadcom.com>
Tested-by: Sathya Prakash Veerichetty <safhya.prakash@broadcom.com>
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4227 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi: core: move scsi_host_busy() out of host lock for waking up EH handler
2024-01-12 7:00 [PATCH] scsi: core: move scsi_host_busy() out of host lock for waking up EH handler Ming Lei
2024-01-12 11:12 ` Hannes Reinecke
@ 2024-01-23 15:23 ` Bart Van Assche
2024-01-24 3:00 ` Martin K. Petersen
2 siblings, 0 replies; 9+ messages in thread
From: Bart Van Assche @ 2024-01-23 15:23 UTC (permalink / raw)
To: Ming Lei, Martin K . Petersen, linux-scsi; +Cc: Ewan Milne
On 1/11/24 23:00, Ming Lei wrote:
> @@ -87,8 +87,10 @@ void scsi_schedule_eh(struct Scsi_Host *shost)
>
> if (scsi_host_set_state(shost, SHOST_RECOVERY) == 0 ||
> scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY) == 0) {
> + unsigned int busy = scsi_host_busy(shost);
> +
> shost->host_eh_scheduled++;
> - scsi_eh_wakeup(shost);
> + scsi_eh_wakeup(shost, busy);
> }
No new variable is needed here. If this patch is reposted, please change
the above into the following:
+ scsi_eh_wakeup(shost, scsi_host_busy(shost));
With or without that additional change:
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi: core: move scsi_host_busy() out of host lock for waking up EH handler
2024-01-12 7:00 [PATCH] scsi: core: move scsi_host_busy() out of host lock for waking up EH handler Ming Lei
2024-01-12 11:12 ` Hannes Reinecke
2024-01-23 15:23 ` Bart Van Assche
@ 2024-01-24 3:00 ` Martin K. Petersen
2024-02-03 2:31 ` Ming Lei
2 siblings, 1 reply; 9+ messages in thread
From: Martin K. Petersen @ 2024-01-24 3:00 UTC (permalink / raw)
To: linux-scsi, Ming Lei; +Cc: Martin K . Petersen, Ewan Milne
On Fri, 12 Jan 2024 15:00:00 +0800, Ming Lei wrote:
> Inside scsi_eh_wakeup(), scsi_host_busy() is called & checked with host lock
> every time for deciding if error handler kthread needs to be waken up.
>
> This way can be too heavy in case of recovery, such as:
>
> - N hardware queues
> - queue depth is M for each hardware queue
> - each scsi_host_busy() iterates over (N * M) tag/requests
>
> [...]
Applied to 6.8/scsi-fixes, thanks!
[1/1] scsi: core: move scsi_host_busy() out of host lock for waking up EH handler
https://git.kernel.org/mkp/scsi/c/4373534a9850
--
Martin K. Petersen Oracle Linux Engineering
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi: core: move scsi_host_busy() out of host lock for waking up EH handler
2024-01-24 3:00 ` Martin K. Petersen
@ 2024-02-03 2:31 ` Ming Lei
0 siblings, 0 replies; 9+ messages in thread
From: Ming Lei @ 2024-02-03 2:31 UTC (permalink / raw)
To: Martin K. Petersen; +Cc: linux-scsi, Ewan Milne
On Wed, Jan 24, 2024 at 11:01 AM Martin K. Petersen
<martin.petersen@oracle.com> wrote:
>
> On Fri, 12 Jan 2024 15:00:00 +0800, Ming Lei wrote:
>
> > Inside scsi_eh_wakeup(), scsi_host_busy() is called & checked with host lock
> > every time for deciding if error handler kthread needs to be waken up.
> >
> > This way can be too heavy in case of recovery, such as:
> >
> > - N hardware queues
> > - queue depth is M for each hardware queue
> > - each scsi_host_busy() iterates over (N * M) tag/requests
> >
> > [...]
>
> Applied to 6.8/scsi-fixes, thanks!
>
> [1/1] scsi: core: move scsi_host_busy() out of host lock for waking up EH handler
> https://git.kernel.org/mkp/scsi/c/4373534a9850
Hi Martin,
When I started to backport this commit, I found it was merged as wrong,
the point is that scsi_host_busy() needs to be moved out of host lock.
I will send one new patch to fix it.
Thanks,
Ming
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-02-03 2:31 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-01-12 7:00 [PATCH] scsi: core: move scsi_host_busy() out of host lock for waking up EH handler Ming Lei
2024-01-12 11:12 ` Hannes Reinecke
2024-01-12 12:42 ` Ming Lei
2024-01-12 19:34 ` Ewan Milne
2024-01-13 1:59 ` Ming Lei
2024-01-23 7:04 ` Sathya Prakash Veerichetty
2024-01-23 15:23 ` Bart Van Assche
2024-01-24 3:00 ` Martin K. Petersen
2024-02-03 2:31 ` Ming Lei
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox