All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] NVMe: Create discard zero quirk white list
@ 2016-03-04 17:45 Keith Busch
  2016-03-04 18:04 ` Christoph Hellwig
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2016-03-04 17:45 UTC (permalink / raw)


The NVMe specification does not require discarded blocks return zeroes on
read, but provides that behavior as a possibility. Some applications more
efficiently use an SSD if reads on discarded blocks were deterministically
zero, based on the "discard_zeroes_data" queue attribute.

There is no specification defined way to determine device behavior on
discarded blocks, so the driver always left the queue setting disabled. We
can only know behavior based on individual device models, so this patch
adds a flag to the NVMe "quirk" list that vendors may set if they know
their controller works that way. The patch also sets the new flag for one
such known device.

Suggested-by: Artur Paszkiewicz <artur.paszkiewicz at intel.com>
Signed-off-by: Keith Busch <keith.busch at intel.com>
---
 drivers/nvme/host/core.c | 9 +++++++--
 drivers/nvme/host/nvme.h | 6 ++++++
 drivers/nvme/host/pci.c  | 3 ++-
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 1100a6d..6efa489 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -587,8 +587,14 @@ static void nvme_init_integrity(struct nvme_ns *ns)
 
 static void nvme_config_discard(struct nvme_ns *ns)
 {
+	struct nvme_ctrl *ctrl = ns->ctrl;
 	u32 logical_block_size = queue_logical_block_size(ns->queue);
-	ns->queue->limits.discard_zeroes_data = 0;
+
+	if (ctrl->quirks & NVME_QUIRK_DISCARD_ZEROES)
+		ns->queue->limits.discard_zeroes_data = 1;
+	else
+		ns->queue->limits.discard_zeroes_data = 0;
+
 	ns->queue->limits.discard_alignment = logical_block_size;
 	ns->queue->limits.discard_granularity = logical_block_size;
 	blk_queue_max_discard_sectors(ns->queue, 0xffffffff);
@@ -1250,7 +1256,6 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 	ns->disk = disk;
 	ns->lba_shift = 9; /* set to a default value for 512 until disk is validated */
 
-
 	blk_queue_logical_block_size(ns->queue, 1 << ns->lba_shift);
 	nvme_set_queue_limits(ctrl, ns->queue);
 
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index b760de1..1c6bcaa 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -59,6 +59,12 @@ enum nvme_quirks {
 	 * correctly.
 	 */
 	NVME_QUIRK_IDENTIFY_CNS			= (1 << 1),
+
+	/*
+	 * The controller deterministically returns O's on reads to discarded
+	 * logical blocks.
+	 */
+	NVME_QUIRK_DISCARD_ZEROES		= (1 << 1),
 };
 
 struct nvme_ctrl {
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 5869f20..86b4c78 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2164,7 +2164,8 @@ static const struct pci_error_handlers nvme_err_handler = {
 
 static const struct pci_device_id nvme_id_table[] = {
 	{ PCI_VDEVICE(INTEL, 0x0953),
-		.driver_data = NVME_QUIRK_STRIPE_SIZE, },
+		.driver_data = NVME_QUIRK_STRIPE_SIZE |
+				NVME_QUIRK_DISCARD_ZEROES, },
 	{ PCI_VDEVICE(INTEL, 0x5845),	/* Qemu emulated controller */
 		.driver_data = NVME_QUIRK_IDENTIFY_CNS, },
 	{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
-- 
2.7.2

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH] NVMe: Create discard zero quirk white list
  2016-03-04 17:45 [PATCH] NVMe: Create discard zero quirk white list Keith Busch
@ 2016-03-04 18:04 ` Christoph Hellwig
  2016-03-04 18:13   ` Keith Busch
  2016-03-05 22:39   ` Matthew Wilcox
  0 siblings, 2 replies; 5+ messages in thread
From: Christoph Hellwig @ 2016-03-04 18:04 UTC (permalink / raw)


On Fri, Mar 04, 2016@10:45:22AM -0700, Keith Busch wrote:
> The NVMe specification does not require discarded blocks return zeroes on
> read, but provides that behavior as a possibility. Some applications more
> efficiently use an SSD if reads on discarded blocks were deterministically
> zero, based on the "discard_zeroes_data" queue attribute.

Meh, indeed:

"If a read occurs to a deallocated range, the controller shall return
all zeros, all ones, or the last data written to the associated LBA."

time to write a TP for a bit on the identify page..

> There is no specification defined way to determine device behavior on
> discarded blocks, so the driver always left the queue setting disabled. We
> can only know behavior based on individual device models, so this patch
> adds a flag to the NVMe "quirk" list that vendors may set if they know
> their controller works that way. The patch also sets the new flag for one
> such known device.
> 
> Suggested-by: Artur Paszkiewicz <artur.paszkiewicz at intel.com>
> Signed-off-by: Keith Busch <keith.busch at intel.com>

Not happy about this, but I guess we'll need something like this in the
long run..

Reviewed-by: Christoph Hellwig <hch at lst.de>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] NVMe: Create discard zero quirk white list
  2016-03-04 18:04 ` Christoph Hellwig
@ 2016-03-04 18:13   ` Keith Busch
  2016-03-05 22:39   ` Matthew Wilcox
  1 sibling, 0 replies; 5+ messages in thread
From: Keith Busch @ 2016-03-04 18:13 UTC (permalink / raw)


On Fri, Mar 04, 2016@10:04:43AM -0800, Christoph Hellwig wrote:
> Not happy about this, but I guess we'll need something like this in the
> long run..

Cool, thanks for the feedback. I mainly wanted to confirm there wasn't
a better way to do this.

I will need to resend as I merged this from an older branch, and flag
value is already claimed for a different quirk.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] NVMe: Create discard zero quirk white list
  2016-03-04 18:04 ` Christoph Hellwig
  2016-03-04 18:13   ` Keith Busch
@ 2016-03-05 22:39   ` Matthew Wilcox
  2016-03-06  6:20     ` Christoph Hellwig
  1 sibling, 1 reply; 5+ messages in thread
From: Matthew Wilcox @ 2016-03-05 22:39 UTC (permalink / raw)


On Fri, Mar 04, 2016@10:04:43AM -0800, Christoph Hellwig wrote:
> On Fri, Mar 04, 2016@10:45:22AM -0700, Keith Busch wrote:
> > The NVMe specification does not require discarded blocks return zeroes on
> > read, but provides that behavior as a possibility. Some applications more
> > efficiently use an SSD if reads on discarded blocks were deterministically
> > zero, based on the "discard_zeroes_data" queue attribute.
> 
> Meh, indeed:
> 
> "If a read occurs to a deallocated range, the controller shall return
> all zeros, all ones, or the last data written to the associated LBA."
> 
> time to write a TP for a bit on the identify page..

Since we have a WRITE ZEROES command, do we need this?

> > There is no specification defined way to determine device behavior on
> > discarded blocks, so the driver always left the queue setting disabled. We
> > can only know behavior based on individual device models, so this patch
> > adds a flag to the NVMe "quirk" list that vendors may set if they know
> > their controller works that way. The patch also sets the new flag for one
> > such known device.
> > 
> > Suggested-by: Artur Paszkiewicz <artur.paszkiewicz at intel.com>
> > Signed-off-by: Keith Busch <keith.busch at intel.com>
> 
> Not happy about this, but I guess we'll need something like this in the
> long run..
> 
> Reviewed-by: Christoph Hellwig <hch at lst.de>
> 
> _______________________________________________
> Linux-nvme mailing list
> Linux-nvme at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] NVMe: Create discard zero quirk white list
  2016-03-05 22:39   ` Matthew Wilcox
@ 2016-03-06  6:20     ` Christoph Hellwig
  0 siblings, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2016-03-06  6:20 UTC (permalink / raw)


On Sat, Mar 05, 2016@05:39:49PM -0500, Matthew Wilcox wrote:
> > Meh, indeed:
> > 
> > "If a read occurs to a deallocated range, the controller shall return
> > all zeros, all ones, or the last data written to the associated LBA."
> > 
> > time to write a TP for a bit on the identify page..
> 
> Since we have a WRITE ZEROES command, do we need this?

WRITE ZEROES has no way to request deallocating blocks.  It's the moral
equivalent of a WRITE SAME without the unmap bit of all zeroes.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-03-06  6:20 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-03-04 17:45 [PATCH] NVMe: Create discard zero quirk white list Keith Busch
2016-03-04 18:04 ` Christoph Hellwig
2016-03-04 18:13   ` Keith Busch
2016-03-05 22:39   ` Matthew Wilcox
2016-03-06  6:20     ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.