From: hare@suse.de (Hannes Reinecke)
Subject: [PATCH 02/10] nvme: ANA transition timeout handling
Date: Tue, 29 May 2018 15:34:41 +0200
Message-ID: <20180529153441.6eeff725@pentland.suse.de>
In-Reply-To: <20180529124729.GC7376@lst.de>
On Tue, 29 May 2018 14:47:29 +0200
Christoph Hellwig <hch@lst.de> wrote:
> > + if (ns->anagrpid != le32_to_cpu(id->anagrpid)) {
> > +     dev_warn(ctrl->device, "nsid %d ANA group id changed\n",
> > +              ns->head->ns_id);
> > +     queue_delayed_work(nvme_wq, &ctrl->ana_work, 0);
> > + }
>
> No need to queue any work if an anagrpid changed. We'll automatically
> index into the right group once it has changed.
>
> > diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> > index 1a8791340862..2fcaf50d84e2 100644
> > --- a/drivers/nvme/host/multipath.c
> > +++ b/drivers/nvme/host/multipath.c
> > @@ -69,6 +69,8 @@ void nvme_failover_req(struct request *req)
> > * entirely trivial..
> > */
> > nvme_update_ana_state(ns, NVME_ANA_CHANGE);
> > + queue_delayed_work(nvme_wq, &ns->ctrl->ana_work,
> > + ns->ctrl->anatt * HZ);
>
> This doesn't make much sense. Once we get the ana transitioning
> status we should either retry the command up to ANATT or try another
> path. There is no point in scheduling a read of the log page after
> ANATT, as we'll already get an AEN when that log page is ready.
>
In an ideal world, yes.
But what happens if we don't get that AEN?

> > @@ -323,7 +325,7 @@ static int nvme_process_ana_log(struct nvme_ctrl *ctrl, bool groups_only)
> >           ctrl->ana_log_buf, ctrl->ana_log_size, 0);
> >   if (error) {
> >       dev_warn(ctrl->device, "Failed to get ANA log: %d\n", error);
> > -     return error;
> > +     return -EIO;
> >   }
> >
> >   for (i = 0; i < le16_to_cpu(ctrl->ana_log_buf->ngrps); i++) {
> > @@ -345,6 +347,8 @@ static int nvme_process_ana_log(struct nvme_ctrl *ctrl, bool groups_only)
> >       dev_info(ctrl->device, "ANA group %d: %s.\n", grpid, nvme_ana_state_names[desc->state]);
> >       WRITE_ONCE(ctrl->ana_state[grpid], desc->state);
> > +     if (desc->state == NVME_ANA_CHANGE)
> > +         error = -EAGAIN;
>
> Huh? Why would we stop processing our log when we see a change state?
> This looks extremely dubious to me, and does not match the changelog
> either.
>
We don't stop processing. We just record the error so that we can
retrigger the ANA log page scan.
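
To make that concrete, here is a minimal user-space sketch of the pattern,
not the patch itself. The types, MAX_GROUPS and process_ana_log() are made
up for illustration; only the ANA state values follow the NVMe spec. Every
descriptor is still applied, and -EAGAIN is merely remembered so the caller
can schedule another log page read:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* ANA state values as defined by the NVMe spec; names are illustrative. */
enum ana_state {
	ANA_OPTIMIZED       = 0x1,
	ANA_NONOPTIMIZED    = 0x2,
	ANA_INACCESSIBLE    = 0x3,
	ANA_PERSISTENT_LOSS = 0x4,
	ANA_CHANGE          = 0xf,
};

#define MAX_GROUPS 8	/* hypothetical limit for this sketch */

/* Hypothetical, simplified ANA group descriptor. */
struct ana_desc {
	unsigned int grpid;
	enum ana_state state;
};

/*
 * Apply every descriptor to cached[]. Returns 0 if all groups are in a
 * settled state, or -EAGAIN if at least one group reported ANA_CHANGE.
 * The loop never exits early: the error is recorded, not acted upon.
 */
static int process_ana_log(const struct ana_desc *desc, size_t ngrps,
			   enum ana_state cached[MAX_GROUPS])
{
	int error = 0;
	size_t i;

	for (i = 0; i < ngrps; i++) {
		assert(desc[i].grpid < MAX_GROUPS);
		cached[desc[i].grpid] = desc[i].state;
		if (desc[i].state == ANA_CHANGE)
			error = -EAGAIN;	/* remember it, keep going */
	}
	return error;
}
```

A caller would treat -EAGAIN as "state applied, but rescan later" rather
than as a failure of the log page read.
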
> > + if (!ctrl->ana_log_buf)
> > + return;
>
> How would the log buf disappear? Even if it does, please do this
> in a separate, documented patch.
>
> > + if (ctrl->state != NVME_CTRL_LIVE)
> > + return;
>
> This looks sensible, but it should probably also check for ADMIN_LIVE
> for completeness, and be a separate, properly documented patch.
>
OK, will do.

> > + /*
> > +  * In case of an I/O error just add a small delay to not hit
> > +  * the target too hard
> > +  */
> > + if (ret == -EIO)
> > +     log_delay = msecs_to_jiffies(NVME_ANA_LOG_DELAY);
> > + queue_delayed_work(nvme_wq, &ctrl->ana_work, log_delay);
>
> What is the rationale for this I/O error handling? In NVMe over
> Fabrics transport errors tear down the association, so I really
> don't see why we should handle errors here.
>
The idea of the patch is to kick off a delayed work item so that we
catch ANA transition timeouts.
The work item is cancelled if we get an AEN; if the transition timeout
expires without one, it has another go at reading the ANA log.
I do agree on the EIO error, though. That can be removed.
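
As a rough illustration of the intended behaviour (a user-space model only,
not kernel code; struct ana_timer and its helpers are hypothetical and stand
in for the delayed work on ctrl->ana_work), the fallback is armed when a
group enters the change state, disarmed by an AEN, and triggers a re-read of
the ANA log only if ANATT passes without one:

```c
#include <stdbool.h>

/* Hypothetical model of the delayed fallback work. */
struct ana_timer {
	bool armed;		/* fallback currently pending? */
	unsigned long expires;	/* tick at which the fallback fires */
	unsigned int rereads;	/* fallback ANA log reads so far */
};

/* A group went into the change state: arm the fallback for ANATT ticks. */
static void ana_timer_arm(struct ana_timer *t, unsigned long now,
			  unsigned long anatt)
{
	t->armed = true;
	t->expires = now + anatt;
}

/* An AEN arrived: the log is read on the normal path, so disarm. */
static void ana_timer_on_aen(struct ana_timer *t)
{
	t->armed = false;
}

/*
 * Called as time advances. Returns true, and counts a re-read, only if
 * the fallback was still armed and the timeout has expired.
 */
static bool ana_timer_tick(struct ana_timer *t, unsigned long now)
{
	if (!t->armed || now < t->expires)
		return false;
	t->armed = false;
	t->rereads++;
	return true;
}
```

So in the normal case, where the AEN shows up in time, the fallback never
runs; it only matters when the AEN is missing.
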
Cheers,
Hannes
Thread overview: 32+ messages
2018-05-29 10:14 [PATCH 00/10] nvme: ANA fixups Hannes Reinecke
2018-05-29 10:14 ` [PATCH 01/10] nvme: add missing kfree() in nvme_configure_ana() Hannes Reinecke
2018-05-29 12:16 ` Johannes Thumshirn
2018-05-29 12:39 ` Christoph Hellwig
2018-05-29 10:14 ` [PATCH 02/10] nvme: ANA transition timeout handling Hannes Reinecke
2018-05-29 12:47 ` Christoph Hellwig
2018-05-29 13:34 ` Hannes Reinecke [this message]
2018-05-29 13:47 ` Christoph Hellwig
2018-05-29 10:14 ` [PATCH 03/10] nvme: Only update capacity for optimized or non-optimized paths Hannes Reinecke
2018-05-29 12:38 ` Christoph Hellwig
2018-05-29 10:14 ` [PATCH 04/10] nvme: clear current path on ANA state change Hannes Reinecke
2018-05-29 12:22 ` Johannes Thumshirn
2018-05-29 12:43 ` Hannes Reinecke
2018-05-29 12:48 ` Christoph Hellwig
2018-05-29 10:14 ` [PATCH 05/10] nvme: retry nvme_get_log_ext() when processing ANA log Hannes Reinecke
2018-05-29 12:48 ` Christoph Hellwig
2018-05-29 10:14 ` [PATCH 06/10] nvme: simplify check for ANA in nvme_ns_id_attrs_are_visible() Hannes Reinecke
2018-05-29 12:49 ` Christoph Hellwig
2018-05-29 10:14 ` [PATCH 07/10] nvmet: make ANATT configurable Hannes Reinecke
2018-05-29 12:50 ` Christoph Hellwig
2018-05-29 10:14 ` [PATCH 08/10] nvmet: Set nanagrpid correctly Hannes Reinecke
2018-05-29 12:51 ` Christoph Hellwig
2018-05-29 13:04 ` Hannes Reinecke
2018-05-29 13:38 ` Christoph Hellwig
2018-05-29 10:14 ` [PATCH 09/10] nvmet: Set mnan correctly Hannes Reinecke
2018-05-29 12:52 ` Christoph Hellwig
2018-05-29 13:06 ` Hannes Reinecke
2018-05-29 13:40 ` Christoph Hellwig
2018-05-29 10:14 ` [PATCH 10/10] nvmet: set 'nuse' and 'nsze' to zero for inaccessible paths Hannes Reinecke
2018-05-29 12:57 ` Christoph Hellwig
2018-05-31 10:30 ` [PATCH 00/10] nvme: ANA fixups Sagi Grimberg
2018-05-31 16:26 ` Christoph Hellwig