* [PATCH] nvme-pci: qdepth 1 quirk
@ 2024-09-11 17:46 Keith Busch
2024-09-12 6:57 ` Christoph Hellwig
0 siblings, 1 reply; 15+ messages in thread
From: Keith Busch @ 2024-09-11 17:46 UTC (permalink / raw)
To: linux-nvme, hch, sagi; +Cc: Keith Busch, Robert Beckett
From: Keith Busch <kbusch@kernel.org>
Another device has been reported to be unreliable if we have more than
one outstanding command. In this new case, data corruption may occur.
Since we have two devices now needing this quirky behavior, make a
generic quirk flag.
The same Apple quirk is clearly not "temporary", so update the comment
while moving it.
Link: https://lore.kernel.org/linux-nvme/191d810a4e3.fcc6066c765804.973611676137075390@collabora.com/
Reported-by: Robert Beckett <bob.beckett@collabora.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/nvme/host/nvme.h | 5 +++++
drivers/nvme/host/pci.c | 18 +++++++++---------
2 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index af22368800e77..69a3fb6de1e9c 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -90,6 +90,11 @@ enum nvme_quirks {
*/
NVME_QUIRK_NO_DEEPEST_PS = (1 << 5),
+ /*
+ * Problems seen with concurrent commands
+ */
+ NVME_QUIRK_QDEPTH_ONE = (1 << 6),
+
/*
* Set MEDIUM priority on SQ creation
*/
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 6cd9395ba9ec3..2c245c3f693a7 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2557,15 +2557,8 @@ static int nvme_pci_enable(struct nvme_dev *dev)
else
dev->io_sqes = NVME_NVM_IOSQES;
- /*
- * Temporary fix for the Apple controller found in the MacBook8,1 and
- * some MacBook7,1 to avoid controller resets and data loss.
- */
- if (pdev->vendor == PCI_VENDOR_ID_APPLE && pdev->device == 0x2001) {
+ if (dev->ctrl.quirks & NVME_QUIRK_QDEPTH_ONE) {
dev->q_depth = 2;
- dev_warn(dev->ctrl.device, "detected Apple NVMe controller, "
- "set queue depth=%u to work around controller resets\n",
- dev->q_depth);
} else if (pdev->vendor == PCI_VENDOR_ID_SAMSUNG &&
(pdev->device == 0xa821 || pdev->device == 0xa822) &&
NVME_CAP_MQES(dev->ctrl.cap) == 0) {
@@ -3425,6 +3418,8 @@ static const struct pci_device_id nvme_id_table[] = {
NVME_QUIRK_BOGUS_NID, },
{ PCI_VDEVICE(REDHAT, 0x0010), /* Qemu emulated controller */
.driver_data = NVME_QUIRK_BOGUS_NID, },
+ { PCI_DEVICE(0x1217, 0x8760),
+ .driver_data = NVME_QUIRK_QDEPTH_ONE },
{ PCI_DEVICE(0x126f, 0x2262), /* Silicon Motion generic */
.driver_data = NVME_QUIRK_NO_DEEPEST_PS |
NVME_QUIRK_BOGUS_NID, },
@@ -3559,7 +3554,12 @@ static const struct pci_device_id nvme_id_table[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_AMAZON, 0xcd02),
.driver_data = NVME_QUIRK_DMA_ADDRESS_BITS_48, },
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001),
- .driver_data = NVME_QUIRK_SINGLE_VECTOR },
+ /*
+ * Fix for the Apple controller found in the MacBook8,1 and
+ * some MacBook7,1 to avoid controller resets and data loss.
+ */
+ .driver_data = NVME_QUIRK_SINGLE_VECTOR |
+ NVME_QUIRK_QDEPTH_ONE },
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2003) },
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2005),
.driver_data = NVME_QUIRK_SINGLE_VECTOR |
--
2.43.5
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] nvme-pci: qdepth 1 quirk
2024-09-11 17:46 [PATCH] nvme-pci: qdepth 1 quirk Keith Busch
@ 2024-09-12 6:57 ` Christoph Hellwig
2024-09-12 16:30 ` Robert Beckett
0 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2024-09-12 6:57 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-nvme, hch, sagi, Keith Busch, Robert Beckett
> + { PCI_DEVICE(0x1217, 0x8760),
> + .driver_data = NVME_QUIRK_QDEPTH_ONE },
Do we have some kind of name for this device? That always feels useful
when googling for it or finding patterns.
Otherwise looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH] nvme-pci: qdepth 1 quirk
2024-09-12 6:57 ` Christoph Hellwig
@ 2024-09-12 16:30 ` Robert Beckett
2024-10-29 2:42 ` [PATCH] nvme-pci: Remove O2 Queue Depth quirk Gwendal Grignou
0 siblings, 1 reply; 15+ messages in thread
From: Robert Beckett @ 2024-09-12 16:30 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Keith Busch, linux-nvme, sagi, Keith Busch
---- On Thu, 12 Sep 2024 07:57:16 +0100 Christoph Hellwig wrote ---
> > + { PCI_DEVICE(0x1217, 0x8760),
> > + .driver_data = NVME_QUIRK_QDEPTH_ONE },
>
> Do we have some kind of name for this device? That always feels useful
> when googling for it or finding patterns.
The device is the 64GB Steam Deck from Valve.
>
> Otherwise looks good:
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
>
>
* [PATCH] nvme-pci: Remove O2 Queue Depth quirk
2024-09-12 16:30 ` Robert Beckett
@ 2024-10-29 2:42 ` Gwendal Grignou
2024-10-29 2:52 ` Greg KH
2024-10-29 7:41 ` Christoph Hellwig
0 siblings, 2 replies; 15+ messages in thread
From: Gwendal Grignou @ 2024-10-29 2:42 UTC (permalink / raw)
To: bob.beckett
Cc: hch, kbusch, kbusch, linux-nvme, sagi, Gwendal Grignou, stable
PCI_DEVICE(0x1217, 0x8760) (O2 Micro, Inc. FORESEE E2M2 NVMe SSD)
is an NVMe to eMMC bridge that can be used with different eMMC
memory devices.
The NVMe device name contains the eMMC device name, for instance:
`BAYHUB SanDisk-DA4128-91904055-128GB`
The bridge is known to work with many eMMC devices; we need to limit
the queue depth only once we know which eMMC device is behind the bridge.
Fixes: commit 83bdfcbdbe5d ("nvme-pci: qdepth 1 quirk")
Cc: stable@vger.kernel.org
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
---
drivers/nvme/host/pci.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 4b9fda0b1d9a3..1c908e129fddf 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3448,8 +3448,6 @@ static const struct pci_device_id nvme_id_table[] = {
NVME_QUIRK_BOGUS_NID, },
{ PCI_VDEVICE(REDHAT, 0x0010), /* Qemu emulated controller */
.driver_data = NVME_QUIRK_BOGUS_NID, },
- { PCI_DEVICE(0x1217, 0x8760), /* O2 Micro 64GB Steam Deck */
- .driver_data = NVME_QUIRK_QDEPTH_ONE },
{ PCI_DEVICE(0x126f, 0x2262), /* Silicon Motion generic */
.driver_data = NVME_QUIRK_NO_DEEPEST_PS |
NVME_QUIRK_BOGUS_NID, },
--
2.47.0.163.g1226f6d8fa-goog
* Re: [PATCH] nvme-pci: Remove O2 Queue Depth quirk
2024-10-29 2:42 ` [PATCH] nvme-pci: Remove O2 Queue Depth quirk Gwendal Grignou
@ 2024-10-29 2:52 ` Greg KH
2024-10-29 7:41 ` Christoph Hellwig
1 sibling, 0 replies; 15+ messages in thread
From: Greg KH @ 2024-10-29 2:52 UTC (permalink / raw)
To: Gwendal Grignou
Cc: bob.beckett, hch, kbusch, kbusch, linux-nvme, sagi, stable
On Mon, Oct 28, 2024 at 07:42:36PM -0700, Gwendal Grignou wrote:
> PCI_DEVICE(0x1217, 0x8760) (O2 Micro, Inc. FORESEE E2M2 NVMe SSD)
> is a NMVe to eMMC bridge, that can be used with different eMMC
> memory devices.
> The NVMe device name contains the eMMC device name, for instance:
> `BAYHUB SanDisk-DA4128-91904055-128GB`
>
> The bridge is known to work with many eMMC devices, we need to limit
> the queue depth once we know which eMMC device is behind the bridge.
>
> Fixes: commit 83bdfcbdbe5d ("nvme-pci: qdepth 1 quirk")
Nit, no need for "commit" here, and no blank line after this one.
thanks,
greg k-h
* Re: [PATCH] nvme-pci: Remove O2 Queue Depth quirk
2024-10-29 2:42 ` [PATCH] nvme-pci: Remove O2 Queue Depth quirk Gwendal Grignou
2024-10-29 2:52 ` Greg KH
@ 2024-10-29 7:41 ` Christoph Hellwig
2024-10-29 18:58 ` Gwendal Grignou
1 sibling, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2024-10-29 7:41 UTC (permalink / raw)
To: Gwendal Grignou
Cc: bob.beckett, hch, kbusch, kbusch, linux-nvme, sagi, stable
On Mon, Oct 28, 2024 at 07:42:36PM -0700, Gwendal Grignou wrote:
> PCI_DEVICE(0x1217, 0x8760) (O2 Micro, Inc. FORESEE E2M2 NVMe SSD)
> is a NMVe to eMMC bridge, that can be used with different eMMC
> memory devices.
Holy f**k, what an awful idea..
> The NVMe device name contains the eMMC device name, for instance:
> `BAYHUB SanDisk-DA4128-91904055-128GB`
>
> The bridge is known to work with many eMMC devices, we need to limit
> the queue depth once we know which eMMC device is behind the bridge.
Please work with Robert to quirk based on the identify data for "his"
device to keep it quirked instead of regressing it.
* Re: [PATCH] nvme-pci: Remove O2 Queue Depth quirk
2024-10-29 7:41 ` Christoph Hellwig
@ 2024-10-29 18:58 ` Gwendal Grignou
2024-10-29 19:16 ` Keith Busch
2024-10-29 19:18 ` [PATCH] nvme-pci: Remove O2 Queue Depth quirk Robert Beckett
0 siblings, 2 replies; 15+ messages in thread
From: Gwendal Grignou @ 2024-10-29 18:58 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: bob.beckett, kbusch, kbusch, linux-nvme, sagi, stable
On Tue, Oct 29, 2024 at 12:41 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, Oct 28, 2024 at 07:42:36PM -0700, Gwendal Grignou wrote:
> > PCI_DEVICE(0x1217, 0x8760) (O2 Micro, Inc. FORESEE E2M2 NVMe SSD)
> > is a NMVe to eMMC bridge, that can be used with different eMMC
> > memory devices.
>
> Holy f**k, what an awful idea..
>
> > The NVMe device name contains the eMMC device name, for instance:
> > `BAYHUB SanDisk-DA4128-91904055-128GB`
> >
> > The bridge is known to work with many eMMC devices, we need to limit
> > the queue depth once we know which eMMC device is behind the bridge.
>
> Please work with Tobert to quirk based on the identify data for "his"
> device to keep it quirked instead of regressing it.
The issue is we would need to base the quirk on the model name
(subsys->model), which is not available in `nvme_id_table`. Besides,
`q_depth` is set in `nvme_pci_enable`, called at probe time before
calling `nvme_init_ctrl_finish`, which will indirectly populate
`subsys`.
Bob, to address the data corruption problem from user space, adding a
udev rule to set `queue/nr_requests` to 1 when `device/model` matches
the device used in the Steam Deck would most likely be too late in the
boot process, wouldn't it?
Gwendal.
* Re: [PATCH] nvme-pci: Remove O2 Queue Depth quirk
2024-10-29 18:58 ` Gwendal Grignou
@ 2024-10-29 19:16 ` Keith Busch
2024-11-07 16:50 ` [PATCH] nvme-pci: 512 byte dma pool segment quirk Bob Beckett
2024-10-29 19:18 ` [PATCH] nvme-pci: Remove O2 Queue Depth quirk Robert Beckett
1 sibling, 1 reply; 15+ messages in thread
From: Keith Busch @ 2024-10-29 19:16 UTC (permalink / raw)
To: Gwendal Grignou
Cc: Christoph Hellwig, bob.beckett, kbusch, linux-nvme, sagi, stable
On Tue, Oct 29, 2024 at 11:58:40AM -0700, Gwendal Grignou wrote:
> On Tue, Oct 29, 2024 at 12:41 AM Christoph Hellwig <hch@lst.de> wrote:
> >
> > On Mon, Oct 28, 2024 at 07:42:36PM -0700, Gwendal Grignou wrote:
> > > PCI_DEVICE(0x1217, 0x8760) (O2 Micro, Inc. FORESEE E2M2 NVMe SSD)
> > > is a NMVe to eMMC bridge, that can be used with different eMMC
> > > memory devices.
> >
> > Holy f**k, what an awful idea..
> >
> > > The NVMe device name contains the eMMC device name, for instance:
> > > `BAYHUB SanDisk-DA4128-91904055-128GB`
> > >
> > > The bridge is known to work with many eMMC devices, we need to limit
> > > the queue depth once we know which eMMC device is behind the bridge.
> >
> > Please work with Tobert to quirk based on the identify data for "his"
> > device to keep it quirked instead of regressing it.
>
> The issue is we would need to base the quirk on the model name
> (subsys->model) that is not available in `nvme_id_table`. Beside,
> `q_depth` is set in `nvme_pci_enable`, called at probe time before
> calling `nvme_init_ctrl_finish` that will indirectly populate
> `subsys`.
>
> Bob, to address the data corruption problem from user space, adding a
> udev rule to set `queue/nr_requests` to 1 when `device/model` matches
> the device used in the Steam Deck would most likely be too late in the
> boot process, wouldn't it?
I think that is too late. There's the module parameter,
'nvme.io_queue_depth=2', that accomplishes the same thing as this quirk,
if adding kernel parameters isn't too inconvenient to use here.
Alternatively, you could put the quirk in the core_quirks, which do take
a model name. The pci driver would have to move this check to after
init_ctrl_finish, as you noticed, but that looks okay to do. We just
need to be careful to update both dev->q_depth and ctrl->sqsize after
the "finish".
* Re: [PATCH] nvme-pci: Remove O2 Queue Depth quirk
2024-10-29 18:58 ` Gwendal Grignou
2024-10-29 19:16 ` Keith Busch
@ 2024-10-29 19:18 ` Robert Beckett
2024-12-05 21:53 ` Gwendal Grignou
1 sibling, 1 reply; 15+ messages in thread
From: Robert Beckett @ 2024-10-29 19:18 UTC (permalink / raw)
To: Gwendal Grignou
Cc: Christoph Hellwig, kbusch, kbusch, linux-nvme, sagi, stable
---- On Tue, 29 Oct 2024 18:58:40 +0000 Gwendal Grignou wrote ---
> On Tue, Oct 29, 2024 at 12:41 AM Christoph Hellwig <hch@lst.de> wrote:
> >
> > On Mon, Oct 28, 2024 at 07:42:36PM -0700, Gwendal Grignou wrote:
> > > PCI_DEVICE(0x1217, 0x8760) (O2 Micro, Inc. FORESEE E2M2 NVMe SSD)
> > > is a NMVe to eMMC bridge, that can be used with different eMMC
> > > memory devices.
> >
> > Holy f**k, what an awful idea..
> >
> > > The NVMe device name contains the eMMC device name, for instance:
> > > `BAYHUB SanDisk-DA4128-91904055-128GB`
> > >
> > > The bridge is known to work with many eMMC devices, we need to limit
> > > the queue depth once we know which eMMC device is behind the bridge.
> >
> > Please work with Tobert to quirk based on the identify data for "his"
> > device to keep it quirked instead of regressing it.
>
> The issue is we would need to base the quirk on the model name
> (subsys->model) that is not available in `nvme_id_table`. Beside,
> `q_depth` is set in `nvme_pci_enable`, called at probe time before
> calling `nvme_init_ctrl_finish` that will indirectly populate
> `subsys`.
>
> Bob, to address the data corruption problem from user space, adding a
> udev rule to set `queue/nr_requests` to 1 when `device/model` matches
> the device used in the Steam Deck would most likely be too late in the
> boot process, wouldn't it?
I've never seen the corruption outside of severe stress testing.
In the field, people reported it during and after OS image updates (more stress testing).
However, as this concerns data integrity, it seems risky to me to rely on a fix
applied after bootup.
>
> Gwendal.
>
* [PATCH] nvme-pci: 512 byte dma pool segment quirk
2024-10-29 19:16 ` Keith Busch
@ 2024-11-07 16:50 ` Bob Beckett
2024-11-07 17:19 ` Keith Busch
0 siblings, 1 reply; 15+ messages in thread
From: Bob Beckett @ 2024-11-07 16:50 UTC (permalink / raw)
To: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg
Cc: kernel, Robert Beckett, linux-nvme, linux-kernel
From: Robert Beckett <bob.beckett@collabora.com>
We initially put in a quick fix of limiting the queue depth to 1,
as experimentation showed that it fixed data corruption on 64GB
Steam Decks.

After further experimentation, it appears that the corruption
is fixed by increasing the small dma pool segment size to
512 bytes. Testing via desync image verification shows that
it now passes thousands of verification loops, where previously
it never managed more than 7.

It is currently not known why this fixes the corruption.
Perhaps the device is doing something nasty like using an mmc page
as a cache for the prp lists (mmc min. page size is 512 bytes)
and not invalidating properly, so that the dma pool change to
treat the segment list as a stack ends up handing back a previous
segment in the same cached page.

This supersedes the previous queue depth limitation: it fixes
the corruption without incurring the 37% performance degradation
measured with queue depth 1.
Fixes: 83bdfcbdbe5d ("nvme-pci: qdepth 1 quirk")
Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
---
drivers/nvme/host/nvme.h | 5 +++++
drivers/nvme/host/pci.c | 17 ++++++++++++-----
2 files changed, 17 insertions(+), 5 deletions(-)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 093cb423f536..bbad25d15360 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -173,6 +173,11 @@ enum nvme_quirks {
* MSI (but not MSI-X) interrupts are broken and never fire.
*/
NVME_QUIRK_BROKEN_MSI = (1 << 21),
+
+ /*
+ * Min. dma pool segment size 512 bytes
+ */
+ NVME_QUIRK_SMALL_DMAPOOL_512 = (1 << 22),
};
/*
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 4b9fda0b1d9a..0782c9b1b4e7 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -141,6 +141,7 @@ struct nvme_dev {
struct nvme_ctrl ctrl;
u32 last_ps;
bool hmb;
+ u32 small_dmapool_seg_size;
mempool_t *iod_mempool;
@@ -611,7 +612,7 @@ static blk_status_t nvme_pci_setup_prps(struct nvme_dev *dev,
}
nprps = DIV_ROUND_UP(length, NVME_CTRL_PAGE_SIZE);
- if (nprps <= (256 / 8)) {
+ if (nprps <= (dev->small_dmapool_seg_size / 8)) {
pool = dev->prp_small_pool;
iod->nr_allocations = 0;
} else {
@@ -701,7 +702,7 @@ static blk_status_t nvme_pci_setup_sgls(struct nvme_dev *dev,
return BLK_STS_OK;
}
- if (entries <= (256 / sizeof(struct nvme_sgl_desc))) {
+ if (entries <= (dev->small_dmapool_seg_size / sizeof(struct nvme_sgl_desc))) {
pool = dev->prp_small_pool;
iod->nr_allocations = 0;
} else {
@@ -2700,8 +2701,9 @@ static int nvme_setup_prp_pools(struct nvme_dev *dev)
return -ENOMEM;
/* Optimisation for I/Os between 4k and 128k */
- dev->prp_small_pool = dma_pool_create("prp list 256", dev->dev,
- 256, 256, 0);
+ dev->prp_small_pool = dma_pool_create("prp list small", dev->dev,
+ dev->small_dmapool_seg_size,
+ dev->small_dmapool_seg_size, 0);
if (!dev->prp_small_pool) {
dma_pool_destroy(dev->prp_page_pool);
return -ENOMEM;
@@ -3063,6 +3065,11 @@ static struct nvme_dev *nvme_pci_alloc_dev(struct pci_dev *pdev,
* a single integrity segment for the separate metadata pointer.
*/
dev->ctrl.max_integrity_segments = 1;
+
+ if (dev->ctrl.quirks & NVME_QUIRK_SMALL_DMAPOOL_512)
+ dev->small_dmapool_seg_size = 512;
+ else
+ dev->small_dmapool_seg_size = 256;
return dev;
out_put_device:
@@ -3449,7 +3456,7 @@ static const struct pci_device_id nvme_id_table[] = {
{ PCI_VDEVICE(REDHAT, 0x0010), /* Qemu emulated controller */
.driver_data = NVME_QUIRK_BOGUS_NID, },
{ PCI_DEVICE(0x1217, 0x8760), /* O2 Micro 64GB Steam Deck */
- .driver_data = NVME_QUIRK_QDEPTH_ONE },
+ .driver_data = NVME_QUIRK_SMALL_DMAPOOL_512 },
{ PCI_DEVICE(0x126f, 0x2262), /* Silicon Motion generic */
.driver_data = NVME_QUIRK_NO_DEEPEST_PS |
NVME_QUIRK_BOGUS_NID, },
--
2.45.2
* Re: [PATCH] nvme-pci: 512 byte dma pool segment quirk
2024-11-07 16:50 ` [PATCH] nvme-pci: 512 byte dma pool segment quirk Bob Beckett
@ 2024-11-07 17:19 ` Keith Busch
2024-11-07 17:35 ` Robert Beckett
2024-11-08 7:08 ` Christoph Hellwig
0 siblings, 2 replies; 15+ messages in thread
From: Keith Busch @ 2024-11-07 17:19 UTC (permalink / raw)
To: Bob Beckett
Cc: Jens Axboe, Christoph Hellwig, Sagi Grimberg, kernel, linux-nvme,
linux-kernel
On Thu, Nov 07, 2024 at 04:50:46PM +0000, Bob Beckett wrote:
> @@ -611,7 +612,7 @@ static blk_status_t nvme_pci_setup_prps(struct nvme_dev *dev,
> }
>
> nprps = DIV_ROUND_UP(length, NVME_CTRL_PAGE_SIZE);
> - if (nprps <= (256 / 8)) {
> + if (nprps <= (dev->small_dmapool_seg_size / 8)) {
> pool = dev->prp_small_pool;
> iod->nr_allocations = 0;
> } else {
We have a constant expression currently, and this changes it to a full
division in the IO path. :(
Could we leave the pool selection check size as-is and just say the cost
of the quirk is additional memory overhead?
> @@ -2700,8 +2701,9 @@ static int nvme_setup_prp_pools(struct nvme_dev *dev)
> return -ENOMEM;
>
> /* Optimisation for I/Os between 4k and 128k */
> - dev->prp_small_pool = dma_pool_create("prp list 256", dev->dev,
> - 256, 256, 0);
> + dev->prp_small_pool = dma_pool_create("prp list small", dev->dev,
> + dev->small_dmapool_seg_size,
> + dev->small_dmapool_seg_size, 0);
I think it should work if we only change the alignment property of the
pool. Something like this:
if (dev->ctrl.quirks & NVME_QUIRK_SMALL_DMAPOOL_512)
dev->prp_small_pool = dma_pool_create("prp list 256", dev->dev,
256, 512, 0);
else
dev->prp_small_pool = dma_pool_create("prp list 256", dev->dev,
256, 256, 0);
* Re: [PATCH] nvme-pci: 512 byte dma pool segment quirk
2024-11-07 17:19 ` Keith Busch
@ 2024-11-07 17:35 ` Robert Beckett
2024-11-08 7:08 ` Christoph Hellwig
1 sibling, 0 replies; 15+ messages in thread
From: Robert Beckett @ 2024-11-07 17:35 UTC (permalink / raw)
To: Keith Busch
Cc: Jens Axboe, Christoph Hellwig, Sagi Grimberg, kernel, linux-nvme,
linux-kernel
---- On Thu, 07 Nov 2024 17:19:30 +0000 Keith Busch wrote ---
> On Thu, Nov 07, 2024 at 04:50:46PM +0000, Bob Beckett wrote:
> > @@ -611,7 +612,7 @@ static blk_status_t nvme_pci_setup_prps(struct nvme_dev *dev,
> > }
> >
> > nprps = DIV_ROUND_UP(length, NVME_CTRL_PAGE_SIZE);
> > - if (nprps <= (256 / 8)) {
> > + if (nprps <= (dev->small_dmapool_seg_size / 8)) {
> > pool = dev->prp_small_pool;
> > iod->nr_allocations = 0;
> > } else {
>
> We have a constant expression currently, and this is changing it a full
> division in the IO path. :(
Yeah, that's fair. Does it get high enough throughput that this is a significant issue here? (I have little intuition for this driver.)
How about pre-computing the nprps threshold during pool creation, where we detect the quirk? It would then be a variable comparison instead of a constant one, but with no divide.
>
> Could we leave the pool selection check size as-is and just say the cost
> of the quirk is additional memory overhead?
>
> > @@ -2700,8 +2701,9 @@ static int nvme_setup_prp_pools(struct nvme_dev *dev)
> > return -ENOMEM;
> >
> > /* Optimisation for I/Os between 4k and 128k */
> > - dev->prp_small_pool = dma_pool_create("prp list 256", dev->dev,
> > - 256, 256, 0);
> > + dev->prp_small_pool = dma_pool_create("prp list small", dev->dev,
> > + dev->small_dmapool_seg_size,
> > + dev->small_dmapool_seg_size, 0);
>
> I think it should work if we only change the alignment property of the
> pool. Something like this:
>
> if (dev->ctrl.quirks & NVME_QUIRK_SMALL_DMAPOOL_512)
> dev->prp_small_pool = dma_pool_create("prp list 256", dev->dev,
> 256, 512, 0);
I actually already tested a change of 512, 512 while keeping the 256 division above during testing (i.e. wasting half of the segment). I'll confirm with a test again against latest and send a v2 assuming it tests fine.
> else
> dev->prp_small_pool = dma_pool_create("prp list 256", dev->dev,
> 256, 256, 0);
>
* Re: [PATCH] nvme-pci: 512 byte dma pool segment quirk
2024-11-07 17:19 ` Keith Busch
2024-11-07 17:35 ` Robert Beckett
@ 2024-11-08 7:08 ` Christoph Hellwig
1 sibling, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2024-11-08 7:08 UTC (permalink / raw)
To: Keith Busch
Cc: Bob Beckett, Jens Axboe, Christoph Hellwig, Sagi Grimberg, kernel,
linux-nvme, linux-kernel
On Thu, Nov 07, 2024 at 10:19:30AM -0700, Keith Busch wrote:
> We have a constant expression currently, and this is changing it a full
> division in the IO path. :(
Yeah. Given that the device is broken I'd just have it pay the price
and never use the small prp pool instead, which just adds a single
extra branch to the fast path.
* Re: [PATCH] nvme-pci: Remove O2 Queue Depth quirk
2024-10-29 19:18 ` [PATCH] nvme-pci: Remove O2 Queue Depth quirk Robert Beckett
@ 2024-12-05 21:53 ` Gwendal Grignou
2024-12-05 22:03 ` Keith Busch
0 siblings, 1 reply; 15+ messages in thread
From: Gwendal Grignou @ 2024-12-05 21:53 UTC (permalink / raw)
To: Robert Beckett
Cc: Christoph Hellwig, kbusch, kbusch, linux-nvme, sagi, stable
On Tue, Oct 29, 2024 at 12:19 PM Robert Beckett
<bob.beckett@collabora.com> wrote:
>
>
>
>
>
>
> ---- On Tue, 29 Oct 2024 18:58:40 +0000 Gwendal Grignou wrote ---
> > On Tue, Oct 29, 2024 at 12:41 AM Christoph Hellwig <hch@lst.de> wrote:
> > >
> > > On Mon, Oct 28, 2024 at 07:42:36PM -0700, Gwendal Grignou wrote:
> > > > PCI_DEVICE(0x1217, 0x8760) (O2 Micro, Inc. FORESEE E2M2 NVMe SSD)
> > > > is a NMVe to eMMC bridge, that can be used with different eMMC
> > > > memory devices.
> > >
> > > Holy f**k, what an awful idea..
> > >
> > > > The NVMe device name contains the eMMC device name, for instance:
> > > > `BAYHUB SanDisk-DA4128-91904055-128GB`
> > > >
> > > > The bridge is known to work with many eMMC devices, we need to limit
> > > > the queue depth once we know which eMMC device is behind the bridge.
> > >
> > > Please work with Tobert to quirk based on the identify data for "his"
> > > device to keep it quirked instead of regressing it.
> >
> > The issue is we would need to base the quirk on the model name
> > (subsys->model) that is not available in `nvme_id_table`. Beside,
> > `q_depth` is set in `nvme_pci_enable`, called at probe time before
> > calling `nvme_init_ctrl_finish` that will indirectly populate
> > `subsys`.
> >
> > Bob, to address the data corruption problem from user space, adding a
> > udev rule to set `queue/nr_requests` to 1 when `device/model` matches
> > the device used in the Steam Deck would most likely be too late in the
> > boot process, wouldn't it?
>
> I've never seen the corruption outside of severe stress testing.
> In the field, people reported it during and after os image updates (more stress testing).
> However, as this concerns data integrity, it seems risky to me to rely on a fix
> after bootup.
Since limiting the queue depth to 2 is only needed for a small subset
of eMMC memory modules that can be connected behind the bridge, would
it make sense to apply this patch, but add the kernel module parameter
mentioned earlier for impacted devices?
Gwendal.
>
> >
> > Gwendal.
> >
>
* Re: [PATCH] nvme-pci: Remove O2 Queue Depth quirk
2024-12-05 21:53 ` Gwendal Grignou
@ 2024-12-05 22:03 ` Keith Busch
0 siblings, 0 replies; 15+ messages in thread
From: Keith Busch @ 2024-12-05 22:03 UTC (permalink / raw)
To: Gwendal Grignou
Cc: Robert Beckett, Christoph Hellwig, kbusch, linux-nvme, sagi,
stable
On Thu, Dec 05, 2024 at 01:53:29PM -0800, Gwendal Grignou wrote:
> Since limiting the queue depth to 2 is only needed for a small subset
> of eMMC memory modules that can be connected behind the bridge, would
> it make sense to apply this patch, but add the kernel module parameter
> mentioned earlier for impacted devices?
I don't think qd2 is needed for anyone, though. The problem sounds like
it was root caused to a boundary condition, which is addressed in a
different patch. But I am not 100% sure, as the thread was left hanging
for some external confirmation:
https://lore.kernel.org/linux-nvme/Z0DdU9K9QMFxBIL8@kbusch-mbp.dhcp.thefacebook.com/