public inbox for linux-nvme@lists.infradead.org
 help / color / mirror / Atom feed
From: John Meneghini <jmeneghi@redhat.com>
To: sagi@grimberg.me
Cc: emilne@redhat.com, hare@kernel.org, kbusch@meta.com,
	linux-nvme@lists.infradead.org, mlombard@redhat.com
Subject: Re: [PATCH v2 7/7] nvme: add reserved ioq tags for cancel
Date: Wed, 26 Jun 2024 15:10:21 -0400	[thread overview]
Message-ID: <eeca40fa-5510-4db1-a66b-bc1808b398ed@redhat.com> (raw)
In-Reply-To: <20240626183819.23960-1-jmeneghi@redhat.com>

This is the patch I wrote to solve this problem while at LSF/MM. Is this what you are thinking about Sagi?

Note: more changes are needed to the error handlers to account for this.  The idea is that the eh will need to be modified to 
keep track of outstanding nvme-cancel command for each io queue.  Following the first command timeout the eh will send one 
single command cancel command to abort the slow command in the controller.  If second command timeout occurs before the CQE for 
the first cancel is returned by the controller the error handler sends a  Multiple Command Cancel to the IO queue with NSID set 
to FFFFFFFFh.  This form of the cancel command will cancel/abort all outstanding commands on the IO queue.

The problem is, in most cases when a command times out due to a problem in the controller not just one command times out but all 
outstanding commands timeout in a thundering herd. There are cases where a single IO will hang up, but those aren't usually 
reads or writes - like a reservation command that gets suck, or a dsm command that's going slow. Usually when reads and writes 
start timing out it's because the storage is just swamped and all IOs start to slow down.

Therefore, with only 2 reserved tags on each IO queue, the host should be able to use the cancel command to abort any and all 
outstanding IOs that time out.

/John

On 6/26/24 14:38, John Meneghini wrote:Multiple Command Cancel:
> If the nvme Cancel command is supported, we need to reserve 2 tags for
> each IO queue. Note that one addition tag is reserved to account for
> the case where this is a fabrics controller.
> 
> Signed-off-by: John Meneghini <jmeneghi@redhat.com>
> ---
>   drivers/nvme/host/core.c | 5 +++++
>   1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 691dd6ee6dc3..76554fb373a3 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -4570,6 +4570,7 @@ int nvme_alloc_io_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
>   		unsigned int cmd_size)
>   {
>   	int ret;
> +	u32 effects = le32_to_cpu(ctrl->effects->iocs[nvme_cmd_cancel]);
>   
>   	memset(set, 0, sizeof(*set));
>   	set->ops = ops;
> @@ -4580,9 +4581,13 @@ int nvme_alloc_io_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
>   	 */
>   	if (ctrl->quirks & NVME_QUIRK_SHARED_TAGS)
>   		set->reserved_tags = NVME_AQ_DEPTH;
> +	else if  (effects & NVME_CMD_EFFECTS_CSUPP)
> +		/* Reserve 2 X io_queue count for NVMe Cancel */
> +		set->reserved_tags = (2 * ctrl->queue_count);
>   	else if (ctrl->ops->flags & NVME_F_FABRICS)
>   		/* Reserved for fabric connect */
>   		set->reserved_tags = 1;
> +
>   	set->numa_node = ctrl->numa_node;
>   	set->flags = BLK_MQ_F_SHOULD_MERGE;
>   	if (ctrl->ops->flags & NVME_F_BLOCKING)



  parent reply	other threads:[~2024-06-26 19:10 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-10 16:30 [PATCH 0/5] Support for Cancel commands for host (TCP and RDMA) Maurizio Lombardi
2024-05-10 16:30 ` [PATCH 1/5] nvme: add definitions for cancel command Maurizio Lombardi
2024-05-10 16:30 ` [PATCH 2/5] nvme-core: add a function to submit a " Maurizio Lombardi
2024-05-12 14:06   ` Sagi Grimberg
2024-06-20  8:31     ` Maurizio Lombardi
2024-06-20 13:01       ` Maurizio Lombardi
2024-06-24 10:06         ` Sagi Grimberg
2024-06-26 18:38           ` [PATCH v2 7/7] nvme: add reserved ioq tags for cancel John Meneghini
2024-06-26 19:03             ` Keith Busch
2024-06-26 19:20               ` John Meneghini
2024-06-27  8:12                 ` Maurizio Lombardi
2024-06-26 19:10             ` John Meneghini [this message]
2024-05-10 16:30 ` [PATCH 3/5] nvme-tcp: use the cancel command to perform an abort if target supports it Maurizio Lombardi
2024-05-10 16:30 ` [PATCH 4/5] nvme-rdma: " Maurizio Lombardi
2024-05-10 16:30 ` [donotmerge PATCH 5/5] nvmet: target support for cancel commands Maurizio Lombardi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=eeca40fa-5510-4db1-a66b-bc1808b398ed@redhat.com \
    --to=jmeneghi@redhat.com \
    --cc=emilne@redhat.com \
    --cc=hare@kernel.org \
    --cc=kbusch@meta.com \
    --cc=linux-nvme@lists.infradead.org \
    --cc=mlombard@redhat.com \
    --cc=sagi@grimberg.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox