From: Christoph Hellwig <hch@infradead.org>
To: Keith Busch <kbusch@kernel.org>
Cc: Hugh Dickins <hughd@google.com>,
Christoph Hellwig <hch@infradead.org>,
Jens Axboe <axboe@kernel.dk>, Sagi Grimberg <sagi@grimberg.me>,
Chaitanya Kulkarni <kch@nvidia.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Thorsten Leemhuis <regressions@leemhuis.info>,
linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: 6.2 nvme-pci: something wrong
Date: Sat, 24 Dec 2022 21:30:10 -0800 [thread overview]
Message-ID: <Y6ff4tpk1Su/Q9bP@infradead.org> (raw)
In-Reply-To: <Y6d37vGSCKvfJhzD@kbusch-mbp.dhcp.thefacebook.com>
On Sat, Dec 24, 2022 at 03:06:38PM -0700, Keith Busch wrote:
> Your observation is a queue-wrap condition that makes it impossible for
> the controller to know there are new commands.
>
> Your patch does look like the correct thing to do. The "zero means one"
> thing is a confusing distraction, I think. It makes more sense if you
> consider sqsize as the maximum number of tags we can have outstanding at
> one time and it looks like all the drivers set it that way. We're
> supposed to leave one slot empty for a full NVMe queue, so adding one
> here to report the total number slots isn't right since that would allow
> us to fill all slots.
Yes, and PCIe actually did do the - 1 from q_depth, so we should
drop the + 1 for sqsize and add back the missing BLK_MQ_MAX_DEPTH
clamp. But we still need to keep sqsize updated as well.
> Fabrics drivers have been using this method for a while, though, so
> it's interesting they haven't had a similar problem.
Fabrics doesn't have a real queue and thus no actual wrap, so
I don't think they will be hit as badly by this.
So we'll probably need something like this, split into two patches.
And then for 6.2 clean up the sqsize vs q_depth mess for real.
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 95c488ea91c303..5b723c65fbeab5 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4926,7 +4926,7 @@ int nvme_alloc_io_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
memset(set, 0, sizeof(*set));
set->ops = ops;
- set->queue_depth = ctrl->sqsize + 1;
+ set->queue_depth = min_t(unsigned, ctrl->sqsize, BLK_MQ_MAX_DEPTH - 1);
/*
* Some Apple controllers requires tags to be unique across admin and
* the (only) I/O queue, so reserve the first 32 tags of the I/O queue.
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index f0f8027644bbf8..ec5e1c578a710b 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2332,10 +2332,12 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
if (dev->cmb_use_sqes) {
result = nvme_cmb_qdepth(dev, nr_io_queues,
sizeof(struct nvme_command));
- if (result > 0)
+ if (result > 0) {
dev->q_depth = result;
- else
+ dev->ctrl.sqsize = dev->q_depth - 1;
+ } else {
dev->cmb_use_sqes = false;
+ }
}
do {
@@ -2536,7 +2538,6 @@ static int nvme_pci_enable(struct nvme_dev *dev)
dev->q_depth = min_t(u32, NVME_CAP_MQES(dev->ctrl.cap) + 1,
io_queue_depth);
- dev->ctrl.sqsize = dev->q_depth - 1; /* 0's based queue depth */
dev->db_stride = 1 << NVME_CAP_STRIDE(dev->ctrl.cap);
dev->dbs = dev->bar + 4096;
@@ -2577,7 +2578,7 @@ static int nvme_pci_enable(struct nvme_dev *dev)
dev_warn(dev->ctrl.device, "IO queue depth clamped to %d\n",
dev->q_depth);
}
-
+ dev->ctrl.sqsize = dev->q_depth - 1; /* 0's based queue depth */
nvme_map_cmb(dev);
Thread overview: 9+ messages
2022-12-24 5:24 6.2 nvme-pci: something wrong Hugh Dickins
2022-12-24 7:14 ` Christoph Hellwig
2022-12-24 10:19 ` Hugh Dickins
2022-12-24 16:56 ` Linus Torvalds
2022-12-24 7:52 ` 6.2 nvme-pci: something wrong #forregzbot Thorsten Leemhuis
2023-01-04 14:02 ` Thorsten Leemhuis
2022-12-24 22:06 ` 6.2 nvme-pci: something wrong Keith Busch
2022-12-25 5:30 ` Christoph Hellwig [this message]
2022-12-25 8:33 ` Hugh Dickins