* [PATCH] nvme: simplify stripe quirk
@ 2016-12-15 0:10 Keith Busch
2016-12-15 8:32 ` Christoph Hellwig
0 siblings, 1 reply; 6+ messages in thread
From: Keith Busch @ 2016-12-15 0:10 UTC (permalink / raw)
Some subsystem vendors believe they own the Identify Controller vendor
specific region, and will repurpose it with their own values. Since we
can't rely on the VS byte to tell us what the stripe size is, we need to
do something for the quirk whitelist until this feature is standardized.
The field was originally concepted to allow back end flexibility on
its striping, but it turned out that never materialized; the chunk is
always the same as MDTS in the products subscribing to this quirk, and
it always will be that way. So, this patch removes the stripe_size field,
and sets the chunk to the max hw transfer size for quirked devices.
Signed-off-by: Keith Busch <keith.busch at intel.com>
---
For blk-mq, this addresses some performance anomolies. The more serious
problem, though, is for the pre-blk-mq driver, and I'll need to port
to stable. In some cases, the circular shifting can get a stripe size
that's smaller than a block, and that causes memory corruption.
drivers/nvme/host/core.c | 17 ++---------------
drivers/nvme/host/nvme.h | 1 -
2 files changed, 2 insertions(+), 16 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index b40cfb0..2fc86dc 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1193,8 +1193,8 @@ static void nvme_set_queue_limits(struct nvme_ctrl *ctrl,
blk_queue_max_hw_sectors(q, ctrl->max_hw_sectors);
blk_queue_max_segments(q, min_t(u32, max_segments, USHRT_MAX));
}
- if (ctrl->stripe_size)
- blk_queue_chunk_sectors(q, ctrl->stripe_size >> 9);
+ if (ctrl->quirks & NVME_QUIRK_STRIPE_SIZE)
+ blk_queue_chunk_sectors(q, ctrl->max_hw_sectors);
blk_queue_virt_boundary(q, ctrl->page_size - 1);
if (ctrl->vwc & NVME_CTRL_VWC_PRESENT)
vwc = true;
@@ -1250,19 +1250,6 @@ int nvme_init_identify(struct nvme_ctrl *ctrl)
ctrl->max_hw_sectors =
min_not_zero(ctrl->max_hw_sectors, max_hw_sectors);
- if ((ctrl->quirks & NVME_QUIRK_STRIPE_SIZE) && id->vs[3]) {
- unsigned int max_hw_sectors;
-
- ctrl->stripe_size = 1 << (id->vs[3] + page_shift);
- max_hw_sectors = ctrl->stripe_size >> (page_shift - 9);
- if (ctrl->max_hw_sectors) {
- ctrl->max_hw_sectors = min(max_hw_sectors,
- ctrl->max_hw_sectors);
- } else {
- ctrl->max_hw_sectors = max_hw_sectors;
- }
- }
-
nvme_set_queue_limits(ctrl, ctrl->admin_q);
ctrl->sgls = le32_to_cpu(id->sgls);
ctrl->kas = le16_to_cpu(id->kas);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index bd53214..6377e14 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -135,7 +135,6 @@ struct nvme_ctrl {
u32 page_size;
u32 max_hw_sectors;
- u32 stripe_size;
u16 oncs;
u16 vid;
atomic_t abort_limit;
--
2.5.5
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH] nvme: simplify stripe quirk
2016-12-15 0:10 [PATCH] nvme: simplify stripe quirk Keith Busch
@ 2016-12-15 8:32 ` Christoph Hellwig
2016-12-15 15:59 ` Keith Busch
0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2016-12-15 8:32 UTC (permalink / raw)
On Wed, Dec 14, 2016@07:10:55PM -0500, Keith Busch wrote:
> Some subsystem vendors believe they own the Identify Controller vendor
> specific region, and will repurpose it with their own values.
Well, it's vendor specific, right?
I don't understand this patch at all - you hardcode a value that was
previously vendor specific, and your argument for that is that the
vendor specific value is used by the value - heck, that's exactly why
the value is there.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] nvme: simplify stripe quirk
2016-12-15 8:32 ` Christoph Hellwig
@ 2016-12-15 15:59 ` Keith Busch
2016-12-16 8:07 ` Christoph Hellwig
0 siblings, 1 reply; 6+ messages in thread
From: Keith Busch @ 2016-12-15 15:59 UTC (permalink / raw)
On Thu, Dec 15, 2016@12:32:36AM -0800, Christoph Hellwig wrote:
> On Wed, Dec 14, 2016@07:10:55PM -0500, Keith Busch wrote:
> > Some subsystem vendors believe they own the Identify Controller vendor
> > specific region, and will repurpose it with their own values.
>
> Well, it's vendor specific, right?
If I'm the vendor and you're the subsystem vendor, who owns the vendor
specific region?
> I don't understand this patch at all - you hardcode a value that was
> previously vendor specific, and your argument for that is that the
> vendor specific value is used by the value - heck, that's exactly why
> the value is there.
I'm not sure what you mean. I haven't hard-coded anything. The alignment
requirements on the hardware itself always matches MDTS, so that's what
the patch proposes to use.
The current driver expects ID_CTRL.VS[3] defines the stripe use. However,
a subsystem vendor changes the field to mean something completely
different that has nothing to do with backend striping. We can't trust
how to decode the field just by knowing the VID:DID anymore, and I'm
confident we don't want to add another layer of SVID:SSID checks to know
how to decode quirk.
Instead, I'm simplifying this by exploiting a fact about the hardware. If
you're wondering why we didn't do that in the first place, it's because we
considered making this a changeable property, but that never materialized
and it never will.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] nvme: simplify stripe quirk
2016-12-15 15:59 ` Keith Busch
@ 2016-12-16 8:07 ` Christoph Hellwig
2016-12-16 15:48 ` Keith Busch
0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2016-12-16 8:07 UTC (permalink / raw)
On Thu, Dec 15, 2016@10:59:58AM -0500, Keith Busch wrote:
> On Thu, Dec 15, 2016@12:32:36AM -0800, Christoph Hellwig wrote:
> > On Wed, Dec 14, 2016@07:10:55PM -0500, Keith Busch wrote:
> > > Some subsystem vendors believe they own the Identify Controller vendor
> > > specific region, and will repurpose it with their own values.
> >
> > Well, it's vendor specific, right?
>
> If I'm the vendor and you're the subsystem vendor, who owns the vendor
> specific region?
I was initially even more confused by your sentence - NVMe only knows
about a single vendor in the device, and it's assumed both the
controller and the subsystem come from it.
But then I realized that you probably mean the PCI subsystem vendor,
at which point all this makes a little more sense.
> The current driver expects ID_CTRL.VS[3] defines the stripe use. However,
> a subsystem vendor changes the field to mean something completely
> different that has nothing to do with backend striping. We can't trust
> how to decode the field just by knowing the VID:DID anymore, and I'm
> confident we don't want to add another layer of SVID:SSID checks to know
> how to decode quirk.
So an OEM can take the Intel Cards, and stuff their own values into the
vendor specific fields? This sounds bad as you can't rely on the
existing quirks in any old Linux version (plus FreeBSD or spdk for that
matter any more).
I'm obviously not in a postition to tell Intel not to do this, but
first claiming to Linux that a given PCI ID will always behave this way
and then later changing your opinion is a desaster. Maybe we should
simply never have merged this vendor specific optimization, but now it's
too late.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] nvme: simplify stripe quirk
2016-12-16 8:07 ` Christoph Hellwig
@ 2016-12-16 15:48 ` Keith Busch
2016-12-17 7:13 ` Christoph Hellwig
0 siblings, 1 reply; 6+ messages in thread
From: Keith Busch @ 2016-12-16 15:48 UTC (permalink / raw)
On Fri, Dec 16, 2016@12:07:53AM -0800, Christoph Hellwig wrote:
> So an OEM can take the Intel Cards, and stuff their own values into the
> vendor specific fields? This sounds bad as you can't rely on the
> existing quirks in any old Linux version (plus FreeBSD or spdk for that
> matter any more).
Yes, these are the same concerns I've raised, and a reason we're in the
process of standardizing the feature.
It's fortunately not common for an OEM to break the decoding assumptions
we were promised, but the problem exists. I think this patch is the best
I can do for these devices.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] nvme: simplify stripe quirk
2016-12-16 15:48 ` Keith Busch
@ 2016-12-17 7:13 ` Christoph Hellwig
0 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2016-12-17 7:13 UTC (permalink / raw)
On Fri, Dec 16, 2016@10:48:01AM -0500, Keith Busch wrote:
> It's fortunately not common for an OEM to break the decoding assumptions
> we were promised, but the problem exists. I think this patch is the best
> I can do for these devices.
With a small change to the changelog including the above and clearly
stating that it's an OEM changing it (and not the subsystem vendor
which sounds weird in NVMe context):
Reluctantly-Acked-by: Christoph Hellwig <hch at lst.de>
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2016-12-17 7:13 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-12-15 0:10 [PATCH] nvme: simplify stripe quirk Keith Busch
2016-12-15 8:32 ` Christoph Hellwig
2016-12-15 15:59 ` Keith Busch
2016-12-16 8:07 ` Christoph Hellwig
2016-12-16 15:48 ` Keith Busch
2016-12-17 7:13 ` Christoph Hellwig
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.