public inbox for linux-nvme@lists.infradead.org
From: Sagi Grimberg <sagi@grimberg.me>
To: Jirong Feng <jirong.feng@easystack.cn>,
	Keith Busch <kbusch@kernel.org>, Jens Axboe <axboe@fb.com>,
	Christoph Hellwig <hch@lst.de>
Cc: linux-nvme@lists.infradead.org, peng.xiao@easystack.cn
Subject: Re: Should NVME_SC_INVALID_NS be translated to BLK_STS_IOERR instead of BLK_STS_NOTSUPP so that multipath(both native and dm) can failover on the failure?
Date: Mon, 4 Dec 2023 10:47:34 +0200
Message-ID: <b769ab2f-e4dd-47a2-9e20-af9b16c21dff@grimberg.me>
In-Reply-To: <9b1589fb-6f47-40bb-8aa6-22ae61145de4@easystack.cn>


> Hi all,
> 
> I have two storage servers, each of which has an NVMe SSD. Recently I have
> been trying nvmet-tcp with DRBD. The steps are:
> 1. Configure DRBD for the two SSDs in two-primary mode, so that each 
> server can accept IO on DRBD device.
> 2. On each server, add the corresponding DRBD device to nvmet subsystem 
> with same device uuid, so that multipath on the host side can group them 
> into one device(My fabric type is tcp).
> 3. On the client host, run nvme discover & connect to both servers, making
> sure the DM multipath device is generated and both paths are online.
> 4. Execute fio randread on DM device continuously.
> 5. On the server whose multipath status is active, under nvmet namespace 
> configfs directory, execute "echo 0 > enable" to disable the namespace.
> What I expect is that IO is automatically retried and switched to
> the other storage server by multipath, and fio goes on. But actually I see
> an "Operation not supported" error, and fio fails and stops. I've also
> tried an iSCSI target: after I delete the mapped LUN from the ACL, fio
> continues running without any error.
> 
> My kernel version is 4.18.0-147.5.1 (RHEL 8.1). After checking out the
> kernel code, I found that:
> 1. On target side, nvmet returns NVME_SC_INVALID_NS to host due to 
> namespace not found.
> 2. On host side, nvme driver translates this error to BLK_STS_NOTSUPP 
> for block layer.
> 3. Multipath calls the function blk_path_error() to decide whether to
> retry.
> 4. In blk_path_error(), BLK_STS_NOTSUPP is not considered to be a path
> error, so it returns false and multipath does not retry.
> I've also checked out the master branch from origin, it's almost the 
> same. In iSCSI target, the process is similar, the only difference is 
> that TCM_NON_EXISTENT_LUN will be translated to BLK_STS_IOERR, which is 
> considered to be a path error in function blk_path_error().
> 
> So my question is as the subject...Is it reasonable to translate 
> NVME_SC_INVALID_NS to BLK_STS_IOERR just like what iSCSI target does? 
> Should multipath failover on this error?

The host issued IO to a non-existent namespace. Semantically, that is not
an IO error in the sense that it's retryable.

btw, AFAICT TCM_NON_EXISTENT_LUN does return ILLEGAL_REQUEST; however,
the host specifically chooses to ignore that particular additional sense.

While I guess similar behavior could be implemented in nvme, the question
is: why is a non-existent namespace failure a retryable error? The
namespace is gone...

Thoughts?

Perhaps what you are seeking is a soft way to disable a namespace based
on your test case?
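
For reference, the decision path described in the quoted report can be
modelled roughly as below. This is only a userspace sketch, not the kernel
code: every model_* name and enum value is made up for illustration, and the
only behavior taken from this thread is that NVME_SC_INVALID_NS ends up as
BLK_STS_NOTSUPP, which blk_path_error() does not treat as a path error. The
catch-all status mapping to an IO error is an assumption of the sketch.

```c
#include <stdbool.h>

/* Illustrative stand-ins for blk_status_t values (not the kernel's numbers). */
enum model_blk_status {
	MODEL_BLK_STS_OK,
	MODEL_BLK_STS_NOTSUPP,
	MODEL_BLK_STS_IOERR,
};

/* Illustrative stand-ins for NVMe status codes. */
enum model_nvme_status {
	MODEL_NVME_SC_SUCCESS,
	MODEL_NVME_SC_INVALID_NS,
	MODEL_NVME_SC_OTHER,	/* hypothetical catch-all for this sketch */
};

/* Models the host-side translation of an NVMe status to a block status. */
static enum model_blk_status model_nvme_error_status(enum model_nvme_status sc)
{
	switch (sc) {
	case MODEL_NVME_SC_SUCCESS:
		return MODEL_BLK_STS_OK;
	case MODEL_NVME_SC_INVALID_NS:
		/* The translation questioned in the subject line. */
		return MODEL_BLK_STS_NOTSUPP;
	default:
		/* Assumption: everything else is a generic IO error. */
		return MODEL_BLK_STS_IOERR;
	}
}

/* Models blk_path_error(): only meaningful for a non-OK status. */
static bool model_blk_path_error(enum model_blk_status st)
{
	switch (st) {
	case MODEL_BLK_STS_NOTSUPP:
		return false;	/* not a path error, so no retry on another path */
	default:
		return true;	/* could be a path failure, so retry */
	}
}

/* True when multipath would retry the IO on another path. */
static bool model_should_failover(enum model_nvme_status sc)
{
	enum model_blk_status st = model_nvme_error_status(sc);

	return st != MODEL_BLK_STS_OK && model_blk_path_error(st);
}
```

In this model, model_should_failover(MODEL_NVME_SC_INVALID_NS) is false, which
matches the "Operation not supported" failure seen by fio. Changing the
INVALID_NS case to return MODEL_BLK_STS_IOERR would make it true, which is
the behavior change being asked about.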


Thread overview: 28+ messages
2023-12-04  7:58 Should NVME_SC_INVALID_NS be translated to BLK_STS_IOERR instead of BLK_STS_NOTSUPP so that multipath(both native and dm) can failover on the failure? Jirong Feng
2023-12-04  8:47 ` Sagi Grimberg [this message]
2023-12-05  3:54   ` Jirong Feng
2023-12-05  4:37 ` Keith Busch
2023-12-05  4:40   ` Christoph Hellwig
2023-12-05  5:18     ` Keith Busch
2023-12-05  7:06       ` Jirong Feng
2023-12-05  8:50     ` Sagi Grimberg
2023-12-25 11:25       ` Jirong Feng
2023-12-25 11:40         ` Sagi Grimberg
2023-12-25 12:14           ` Jirong Feng
2023-12-26 13:27             ` Jirong Feng
2024-01-01  9:51               ` Sagi Grimberg
2024-01-02 10:33                 ` Jirong Feng
2024-01-02 12:46                   ` Sagi Grimberg
2024-01-03 10:24                     ` Jirong Feng
2024-01-04 11:56                       ` Sagi Grimberg
2024-01-30  9:36                         ` Jirong Feng
2024-01-30 11:29                           ` Sagi Grimberg
2024-01-31  6:25                             ` Christoph Hellwig
2024-03-20  3:17                               ` Jirong Feng
2024-03-20  8:51                                 ` Sagi Grimberg
2024-03-21  3:06                                   ` Jirong Feng
2024-04-07 22:28                                     ` Sagi Grimberg
2024-04-12  7:52                                       ` Jirong Feng
2024-04-12  8:57                                         ` Sagi Grimberg
2024-04-22  9:47                                           ` Sagi Grimberg
2024-04-23  3:15                                             ` Jirong Feng
