Linux-NVME Archive on lore.kernel.org
* [PATCH] nvme-mpath: fix I/O failure with EAGAIN when failing over I/O
@ 2023-06-19 14:10 Sagi Grimberg
  2023-06-20  4:28 ` Christoph Hellwig
  2023-06-20  9:15 ` Martin Wilck
  0 siblings, 2 replies; 5+ messages in thread
From: Sagi Grimberg @ 2023-06-19 14:10 UTC (permalink / raw)
  To: linux-nvme; +Cc: Keith Busch, Christoph Hellwig

It is possible that the next available path we fail over to happens to
be frozen (for example, during connection establishment). If the
original I/O was issued with NOWAIT, this causes the I/O to fail
unnecessarily: the request queue cannot be entered, so the I/O fails
with EAGAIN.

The NOWAIT restriction that was originally set for the I/O is no longer
relevant or needed because this is the nvme requeue context. Hence we
clear the REQ_NOWAIT flag when failing over I/O.

This fixes a simple test case of an nvme controller reset during I/O,
where the multipath device has only a single path and I/O fails with
the "Resource temporarily unavailable" errno. Note that this reproduces
with io_uring, which sets IOCB_NOWAIT by default.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/multipath.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 2bc159a318ff..6425e6ec3932 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -102,6 +102,7 @@ void nvme_failover_req(struct request *req)
 	spin_lock_irqsave(&ns->head->requeue_lock, flags);
 	for (bio = req->bio; bio; bio = bio->bi_next) {
 		bio_set_dev(bio, ns->head->disk->part0);
+		bio->bi_opf &= ~REQ_NOWAIT;
 		if (bio->bi_opf & REQ_POLLED) {
 			bio->bi_opf &= ~REQ_POLLED;
 			bio->bi_cookie = BLK_QC_T_NONE;
-- 
2.40.1
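
To illustrate the one-line change above, here is a minimal userspace
sketch of the flag handling in the failover loop. It is not kernel code:
`struct model_bio`, `model_failover_bios()` and the `MODEL_REQ_*` bit
values are hypothetical stand-ins invented for this example (the real
REQ_NOWAIT and REQ_POLLED bits are defined by the kernel), and the
sketch omits the locking, bio_set_dev() and bi_cookie handling that
nvme_failover_req() performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for this model only; the kernel defines the
 * real REQ_NOWAIT and REQ_POLLED values. */
#define MODEL_REQ_NOWAIT (1u << 0)
#define MODEL_REQ_POLLED (1u << 1)

/* Hypothetical, stripped-down stand-in for struct bio. */
struct model_bio {
	struct model_bio *bi_next;
	uint32_t bi_opf;
};

/*
 * Walk a bio chain the way the failover loop does and drop the flags
 * that only made sense in the original issuer's context.
 * Returns the number of bios visited.
 */
static size_t model_failover_bios(struct model_bio *bio)
{
	size_t visited = 0;

	for (; bio; bio = bio->bi_next) {
		/* The requeue context may wait, so NOWAIT no longer applies. */
		bio->bi_opf &= ~MODEL_REQ_NOWAIT;
		/* Polling is likewise tied to the original submission. */
		bio->bi_opf &= ~MODEL_REQ_POLLED;
		visited++;
	}
	return visited;
}
```

All other bits in bi_opf are left untouched, so the resubmitted bio
keeps its operation and remaining flags.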




* Re: [PATCH] nvme-mpath: fix I/O failure with EAGAIN when failing over I/O
  2023-06-19 14:10 [PATCH] nvme-mpath: fix I/O failure with EAGAIN when failing over I/O Sagi Grimberg
@ 2023-06-20  4:28 ` Christoph Hellwig
  2023-06-20 10:00   ` Sagi Grimberg
  2023-06-20  9:15 ` Martin Wilck
  1 sibling, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2023-06-20  4:28 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: linux-nvme, Keith Busch, Christoph Hellwig

> +++ b/drivers/nvme/host/multipath.c
> @@ -102,6 +102,7 @@ void nvme_failover_req(struct request *req)
>  	spin_lock_irqsave(&ns->head->requeue_lock, flags);
>  	for (bio = req->bio; bio; bio = bio->bi_next) {
>  		bio_set_dev(bio, ns->head->disk->part0);
> +		bio->bi_opf &= ~REQ_NOWAIT;

Please add a comment based on your explanation in the commit log here.



* Re: [PATCH] nvme-mpath: fix I/O failure with EAGAIN when failing over I/O
  2023-06-19 14:10 [PATCH] nvme-mpath: fix I/O failure with EAGAIN when failing over I/O Sagi Grimberg
  2023-06-20  4:28 ` Christoph Hellwig
@ 2023-06-20  9:15 ` Martin Wilck
  2023-06-20  9:59   ` Sagi Grimberg
  1 sibling, 1 reply; 5+ messages in thread
From: Martin Wilck @ 2023-06-20  9:15 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme; +Cc: Keith Busch, Christoph Hellwig, Daniel Wagner

Hello Sagi,

On Mon, 2023-06-19 at 17:10 +0300, Sagi Grimberg wrote:
> It is possible that the next available path we fail over to happens to
> be frozen (for example, during connection establishment). If the
> original I/O was issued with NOWAIT, this causes the I/O to fail
> unnecessarily: the request queue cannot be entered, so the I/O fails
> with EAGAIN.
> 
> The NOWAIT restriction that was originally set for the I/O is no longer
> relevant or needed because this is the nvme requeue context. Hence we
> clear the REQ_NOWAIT flag when failing over I/O.

Could you please explain this in more detail? We are on the bio level,
thus IIUC a new request will need to be allocated when the bio is
requeued. This means that if the fail-over queue is frozen e.g. during
a NVMe controller reset, IO may be blocked for a possibly very long
time, which is what the NOWAIT flag was initially supposed to avoid. 

I am asking because we've seen a similar phenomenon with a 3rd party
multipath implementation recently.


Regards
Martin

> This fixes a simple test case of an nvme controller reset during I/O,
> where the multipath device has only a single path and I/O fails with
> the "Resource temporarily unavailable" errno. Note that this reproduces
> with io_uring, which sets IOCB_NOWAIT by default.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>  drivers/nvme/host/multipath.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index 2bc159a318ff..6425e6ec3932 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -102,6 +102,7 @@ void nvme_failover_req(struct request *req)
>         spin_lock_irqsave(&ns->head->requeue_lock, flags);
>         for (bio = req->bio; bio; bio = bio->bi_next) {
>                 bio_set_dev(bio, ns->head->disk->part0);
> +               bio->bi_opf &= ~REQ_NOWAIT;
>                 if (bio->bi_opf & REQ_POLLED) {
>                         bio->bi_opf &= ~REQ_POLLED;
>                         bio->bi_cookie = BLK_QC_T_NONE;




* Re: [PATCH] nvme-mpath: fix I/O failure with EAGAIN when failing over I/O
  2023-06-20  9:15 ` Martin Wilck
@ 2023-06-20  9:59   ` Sagi Grimberg
  0 siblings, 0 replies; 5+ messages in thread
From: Sagi Grimberg @ 2023-06-20  9:59 UTC (permalink / raw)
  To: Martin Wilck, linux-nvme; +Cc: Keith Busch, Christoph Hellwig, Daniel Wagner


> Hello Sagi,
> 
> On Mon, 2023-06-19 at 17:10 +0300, Sagi Grimberg wrote:
>> It is possible that the next available path we fail over to happens to
>> be frozen (for example, during connection establishment). If the
>> original I/O was issued with NOWAIT, this causes the I/O to fail
>> unnecessarily: the request queue cannot be entered, so the I/O fails
>> with EAGAIN.
>>
>> The NOWAIT restriction that was originally set for the I/O is no longer
>> relevant or needed because this is the nvme requeue context. Hence we
>> clear the REQ_NOWAIT flag when failing over I/O.
> 
> Could you please explain this in more detail? We are on the bio level,
> thus IIUC a new request will need to be allocated when the bio is
> requeued.

The issue is not the tag allocation, it's entering the request queue,
which fails immediately if the bio has NOWAIT set on it (and the
queue is frozen).

> This means that if the fail-over queue is frozen e.g. during
> a NVMe controller reset, IO may be blocked for a possibly very long
> time,

That should not be the case, especially with Ming's patch that moves
the freeze/unfreeze after we successfully connect. This should address
any I/O that is held hostage by a queue that stays frozen for a long
period.

> which is what the NOWAIT flag was initially supposed to avoid.

NOWAIT was set by the issuer specifically because its context must not
block on I/O. The failover is a different context, and there is no need
to require this; it's no longer the issuer context.

> I am asking because we've seen a similar phenomenon with a 3rd party
> multipath implementation recently.

I have no idea what this 3rd party multipath implementation is, nor how
it interacts with nvme multipathing.
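
The queue-entry behaviour discussed in this reply can be modelled
roughly as follows. This is a userspace sketch under stated assumptions,
not the block layer's implementation: `model_queue_enter()`,
`MODEL_REQ_NOWAIT` and `MODEL_WOULD_WAIT` are hypothetical names made up
for illustration, and where the real code would sleep until the queue is
unfrozen, the model simply returns a distinct code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit and return code for this model only. */
#define MODEL_REQ_NOWAIT (1u << 0)
#define MODEL_WOULD_WAIT 1

/*
 * Model of entering a request queue with a bio:
 *   - queue not frozen:          enter immediately (0)
 *   - frozen and NOWAIT set:     fail at once with -EAGAIN
 *   - frozen and NOWAIT cleared: the submitter would wait for unfreeze
 */
static int model_queue_enter(bool frozen, uint32_t opf)
{
	if (!frozen)
		return 0;
	if (opf & MODEL_REQ_NOWAIT)
		return -EAGAIN;
	return MODEL_WOULD_WAIT;
}
```

In this model, a failed-over bio that still carries the NOWAIT bit hits
the -EAGAIN branch on a frozen path, which is the failure described in
the commit log; with the bit cleared it takes the waiting branch
instead.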



* Re: [PATCH] nvme-mpath: fix I/O failure with EAGAIN when failing over I/O
  2023-06-20  4:28 ` Christoph Hellwig
@ 2023-06-20 10:00   ` Sagi Grimberg
  0 siblings, 0 replies; 5+ messages in thread
From: Sagi Grimberg @ 2023-06-20 10:00 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-nvme, Keith Busch


>> +++ b/drivers/nvme/host/multipath.c
>> @@ -102,6 +102,7 @@ void nvme_failover_req(struct request *req)
>>   	spin_lock_irqsave(&ns->head->requeue_lock, flags);
>>   	for (bio = req->bio; bio; bio = bio->bi_next) {
>>   		bio_set_dev(bio, ns->head->disk->part0);
>> +		bio->bi_opf &= ~REQ_NOWAIT;
> 
> Please add a comment based on your explanation in the commit log here.

Will do.

