* [PATCH] nvme-mpath: fix I/O failure with EAGAIN when failing over I/O
@ 2023-06-19 14:10 Sagi Grimberg
2023-06-20 4:28 ` Christoph Hellwig
2023-06-20 9:15 ` Martin Wilck
0 siblings, 2 replies; 5+ messages in thread
From: Sagi Grimberg @ 2023-06-19 14:10 UTC
To: linux-nvme; +Cc: Keith Busch, Christoph Hellwig
It is possible that the next available path we fail over to happens to
be frozen (for example, during connection establishment). If the
original I/O was set with NOWAIT, this causes the I/O to fail
unnecessarily with EAGAIN, because the request queue cannot be entered.
The NOWAIT restriction that was originally set for the I/O is no longer
relevant or needed because this is the nvme requeue context. Hence we
clear the REQ_NOWAIT flag when failing over I/O.
This fixes a simple test case of an nvme controller reset during I/O,
where the multipath device has only a single path and I/O fails with the
"Resource temporarily unavailable" errno. Note that this reproduces with
io_uring, which sets IOCB_NOWAIT by default.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
drivers/nvme/host/multipath.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 2bc159a318ff..6425e6ec3932 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -102,6 +102,7 @@ void nvme_failover_req(struct request *req)
 	spin_lock_irqsave(&ns->head->requeue_lock, flags);
 	for (bio = req->bio; bio; bio = bio->bi_next) {
 		bio_set_dev(bio, ns->head->disk->part0);
+		bio->bi_opf &= ~REQ_NOWAIT;
 		if (bio->bi_opf & REQ_POLLED) {
 			bio->bi_opf &= ~REQ_POLLED;
 			bio->bi_cookie = BLK_QC_T_NONE;
--
2.40.1
* Re: [PATCH] nvme-mpath: fix I/O failure with EAGAIN when failing over I/O
2023-06-19 14:10 [PATCH] nvme-mpath: fix I/O failure with EAGAIN when failing over I/O Sagi Grimberg
@ 2023-06-20 4:28 ` Christoph Hellwig
2023-06-20 10:00 ` Sagi Grimberg
2023-06-20 9:15 ` Martin Wilck
1 sibling, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2023-06-20 4:28 UTC
To: Sagi Grimberg; +Cc: linux-nvme, Keith Busch, Christoph Hellwig
> +++ b/drivers/nvme/host/multipath.c
> @@ -102,6 +102,7 @@ void nvme_failover_req(struct request *req)
> spin_lock_irqsave(&ns->head->requeue_lock, flags);
> for (bio = req->bio; bio; bio = bio->bi_next) {
> bio_set_dev(bio, ns->head->disk->part0);
> + bio->bi_opf &= ~REQ_NOWAIT;
Please add a comment based on your explanation in the commit log here.
* Re: [PATCH] nvme-mpath: fix I/O failure with EAGAIN when failing over I/O
2023-06-19 14:10 [PATCH] nvme-mpath: fix I/O failure with EAGAIN when failing over I/O Sagi Grimberg
2023-06-20 4:28 ` Christoph Hellwig
@ 2023-06-20 9:15 ` Martin Wilck
2023-06-20 9:59 ` Sagi Grimberg
1 sibling, 1 reply; 5+ messages in thread
From: Martin Wilck @ 2023-06-20 9:15 UTC
To: Sagi Grimberg, linux-nvme; +Cc: Keith Busch, Christoph Hellwig, Daniel Wagner
Hello Sagi,
On Mon, 2023-06-19 at 17:10 +0300, Sagi Grimberg wrote:
> It is possible that the next available path we failover to, happens
> to
> be frozen (for example if it is during connection establishment). If
> the original I/O was set with NOWAIT, this cause the I/O to
> unnecessarily
> fail because the request queue cannot be entered, hence the I/O fails
> with
> EAGAIN.
>
> The NOWAIT restriction that was originally set for the I/O is no
> longer
> relevant or needed because this is the nvme requeue context. Hence we
> clear the REQ_NOWAIT flag when failing over I/O.
Could you please explain this in more detail? We are on the bio level,
thus IIUC a new request will need to be allocated when the bio is
requeued. This means that if the fail-over queue is frozen, e.g. during
an NVMe controller reset, I/O may be blocked for a possibly very long
time, which is what the NOWAIT flag was initially supposed to avoid.
I am asking because we've seen a similar phenomenon with a 3rd party
multipath implementation recently.
Regards
Martin
> This fix a simple test case of nvme controller reset during I/O when
> the
> multipath device that has only a single path and I/O fails with
> "Resource
> temporarily unavailable" errno. Note that this reproduces with
> io_uring
> which by default sets IOCB_NOWAIT by default.
>
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
> drivers/nvme/host/multipath.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/nvme/host/multipath.c
> b/drivers/nvme/host/multipath.c
> index 2bc159a318ff..6425e6ec3932 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -102,6 +102,7 @@ void nvme_failover_req(struct request *req)
> spin_lock_irqsave(&ns->head->requeue_lock, flags);
> for (bio = req->bio; bio; bio = bio->bi_next) {
> bio_set_dev(bio, ns->head->disk->part0);
> + bio->bi_opf &= ~REQ_NOWAIT;
> if (bio->bi_opf & REQ_POLLED) {
> bio->bi_opf &= ~REQ_POLLED;
> bio->bi_cookie = BLK_QC_T_NONE;
* Re: [PATCH] nvme-mpath: fix I/O failure with EAGAIN when failing over I/O
2023-06-20 9:15 ` Martin Wilck
@ 2023-06-20 9:59 ` Sagi Grimberg
0 siblings, 0 replies; 5+ messages in thread
From: Sagi Grimberg @ 2023-06-20 9:59 UTC
To: Martin Wilck, linux-nvme; +Cc: Keith Busch, Christoph Hellwig, Daniel Wagner
> Hello Sagi,
>
> On Mon, 2023-06-19 at 17:10 +0300, Sagi Grimberg wrote:
>> It is possible that the next available path we failover to, happens
>> to
>> be frozen (for example if it is during connection establishment). If
>> the original I/O was set with NOWAIT, this cause the I/O to
>> unnecessarily
>> fail because the request queue cannot be entered, hence the I/O fails
>> with
>> EAGAIN.
>>
>> The NOWAIT restriction that was originally set for the I/O is no
>> longer
>> relevant or needed because this is the nvme requeue context. Hence we
>> clear the REQ_NOWAIT flag when failing over I/O.
>
> Could you please explain this in more detail? We are on the bio level,
> thus IIUC a new request will need to be allocated when the bio is
> requeued.
The issue is not the tag allocation; it is entering the request queue,
which fails immediately if the bio has NOWAIT set on it (and the
queue is frozen).
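
The check described above can be modelled in userspace roughly as follows.
This is only a sketch under stated assumptions: `model_queue`, `model_bio`,
`MODEL_REQ_NOWAIT`, `MODEL_WOULD_WAIT` and `model_queue_enter()` are
hypothetical stand-ins invented for illustration, not the kernel's real
types or functions. It shows only the decision being discussed: a frozen
queue rejects a NOWAIT bio with EAGAIN, while a bio without the flag would
wait for the queue to be unfrozen.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag and result values for this model only. */
#define MODEL_REQ_NOWAIT  (1u << 0)
#define MODEL_WOULD_WAIT  1

/* Hypothetical stand-in for a request queue: only the frozen state. */
struct model_queue {
	bool frozen;
};

/* Hypothetical stand-in for a bio: only the operation flags. */
struct model_bio {
	unsigned int opf;
};

/*
 * Returns 0 if the bio may enter the queue right away, -EAGAIN if the
 * queue is frozen and the bio is not allowed to wait, and
 * MODEL_WOULD_WAIT if the submitter would sleep until the queue is
 * unfrozen.
 */
int model_queue_enter(const struct model_queue *q,
		      const struct model_bio *bio)
{
	if (!q->frozen)
		return 0;
	if (bio->opf & MODEL_REQ_NOWAIT)
		return -EAGAIN;
	return MODEL_WOULD_WAIT;
}
```

With a frozen queue, a bio carrying the NOWAIT flag gets -EAGAIN from this
model; once the flag is cleared, the same bio would wait instead of failing.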
> This means that if the fail-over queue is frozen e.g. during
> a NVMe controller reset, IO may be blocked for a possibly very long
> time,
That should not be the case, especially with Ming's patch that moves
the freeze/unfreeze after we successfully connect. This should address
any I/O that is held hostage by a queue that stays frozen for a long
period.
> which is what the NOWAIT flag was initially supposed to avoid.
NOWAIT was set by the issuer specifically because its context must not
block on I/O. The failover is a different context, and there is no need
to require this; it is no longer the issuer context.
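
To illustrate the point about contexts, here is a minimal userspace sketch
of what the failover loop does to each bio. The names `fo_bio`,
`FO_REQ_NOWAIT`, `FO_REQ_OTHER` and `fo_failover_bios()` are hypothetical
and exist only for this model; the comment inside the loop is one possible
wording derived from the commit log, not necessarily what a later version
of the patch carries.

```c
#include <stddef.h>

/* Hypothetical flag values for this model only. */
#define FO_REQ_NOWAIT (1u << 0)
#define FO_REQ_OTHER  (1u << 1)

/* Hypothetical stand-in for a chain of bios attached to a request. */
struct fo_bio {
	unsigned int opf;
	struct fo_bio *next;
};

/* Walk the chain as the failover path does and drop the NOWAIT flag. */
void fo_failover_bios(struct fo_bio *head)
{
	struct fo_bio *bio;

	for (bio = head; bio; bio = bio->next) {
		/*
		 * The queue we resubmit to may be temporarily frozen, in
		 * which case NOWAIT would fail the I/O with EAGAIN. We are
		 * no longer in the issuer context, which was the one that
		 * must not block, so clear the flag.
		 */
		bio->opf &= ~FO_REQ_NOWAIT;
	}
}
```

Only the NOWAIT bit is cleared; any other flags on the bio are left as
they were.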
> I am asking because we've seen a similar phenomenon with a 3rd party
> multipath implementation recently.
I have no idea what this 3rd party multipath implementation is, nor how
it interacts with nvme multipathing.
* Re: [PATCH] nvme-mpath: fix I/O failure with EAGAIN when failing over I/O
2023-06-20 4:28 ` Christoph Hellwig
@ 2023-06-20 10:00 ` Sagi Grimberg
0 siblings, 0 replies; 5+ messages in thread
From: Sagi Grimberg @ 2023-06-20 10:00 UTC
To: Christoph Hellwig; +Cc: linux-nvme, Keith Busch
>> +++ b/drivers/nvme/host/multipath.c
>> @@ -102,6 +102,7 @@ void nvme_failover_req(struct request *req)
>> spin_lock_irqsave(&ns->head->requeue_lock, flags);
>> for (bio = req->bio; bio; bio = bio->bi_next) {
>> bio_set_dev(bio, ns->head->disk->part0);
>> + bio->bi_opf &= ~REQ_NOWAIT;
>
> Please add a comment based on your explanation in the commit log here.
Will do.