public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] nvme: drop bogus nid quirk for multipath devices which passed id test
@ 2025-10-28 12:54 Daniel Wagner
  2025-10-29  7:44 ` Christoph Hellwig
  2025-11-05  7:33 ` Hannes Reinecke
  0 siblings, 2 replies; 7+ messages in thread
From: Daniel Wagner @ 2025-10-28 12:54 UTC (permalink / raw)
  To: Keith Busch, Christoph Hellwig
  Cc: John Meneghini, Bryan Gurney, linux-nvme, linux-kernel,
	Daniel Wagner, Daniel Wagner

From: Daniel Wagner <dwagner@suse.de>

Dell Express Flash NVMe P4610 reuse the Intel device id 0x0a54. The
Intel P4500/46000 device needs NVME_QUIRK_BOBUG_NID whereas the Dell
does not. Thus clear the quirk when the id check passes.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217981
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
Hi,

Obviously, using the correct IDs is really hard. There are at least two devices
which share a the device id and one needs the bogus id workaround the other
doesn't.

We discussed the idea to pass through configuration from the kernel command line
at ALPSS. I am not sure, how far we got with this yet. What about something like
this here?

Thanks,
Daniel

Signed-off-by: Daniel Wagner <wagi@kernel.org>
---
 drivers/nvme/host/core.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 734ad725e6f4..881981c2a3b2 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4005,6 +4005,13 @@ static int nvme_init_ns_head(struct nvme_ns *ns, struct nvme_ns_info *info)
 		memset(&info->ids.uuid, 0, sizeof(info->ids.uuid));
 		memset(&info->ids.eui64, 0, sizeof(info->ids.eui64));
 		ctrl->quirks |= NVME_QUIRK_BOGUS_NID;
+	} else {
+		/*
+		 * If the device is a multipath device and the id check passes,
+		 * the NVME_QUIRK_BOGUS_NID is not needed.
+		 */
+		if (ns->ctrl->subsys->cmic & NVME_CTRL_CMIC_MULTI_CTRL)
+			ctrl->quirks &= ~NVME_QUIRK_BOGUS_NID;
 	}
 
 	mutex_lock(&ctrl->subsys->lock);

---
base-commit: 77a4fe6a06e265bd94d2b3cdc87fb3cde877a05b
change-id: 20251028-bogus-id-check-aa1dec56fff8

Best regards,
-- 
Daniel Wagner <wagi@kernel.org>


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] nvme: drop bogus nid quirk for multipath devices which passed id test
  2025-10-28 12:54 [PATCH] nvme: drop bogus nid quirk for multipath devices which passed id test Daniel Wagner
@ 2025-10-29  7:44 ` Christoph Hellwig
  2025-10-29  9:45   ` Daniel Wagner
  2025-11-05  7:33 ` Hannes Reinecke
  1 sibling, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2025-10-29  7:44 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Keith Busch, Christoph Hellwig, John Meneghini, Bryan Gurney,
	linux-nvme, linux-kernel, Daniel Wagner

On Tue, Oct 28, 2025 at 01:54:56PM +0100, Daniel Wagner wrote:
> Obviously, using the correct IDs is really hard. There are at least two devices
> which share a the device id and one needs the bogus id workaround the other
> doesn't.
> 
> We discussed the idea to pass through configuration from the kernel command line
> at ALPSS. I am not sure, how far we got with this yet. What about something like
> this here?

The Intel devices are dual ported and multipath cabable, too.  Which
isn't surprising at the dell one is just an OEM version of them.

But more imporantly dropping the quirk for multipath devices is
fundamentally the wrong thing to do, as we really need proper IDs for
multipathing.  So just add another entry for the Dell device using
the dell subvendor/subdevice id that does not have the quirk.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] nvme: drop bogus nid quirk for multipath devices which passed id test
  2025-10-29  7:44 ` Christoph Hellwig
@ 2025-10-29  9:45   ` Daniel Wagner
  2025-10-29  9:59     ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Wagner @ 2025-10-29  9:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Daniel Wagner, Keith Busch, John Meneghini, Bryan Gurney,
	linux-nvme, linux-kernel

On Wed, Oct 29, 2025 at 08:44:03AM +0100, Christoph Hellwig wrote:
> But more imporantly dropping the quirk for multipath devices is
> fundamentally the wrong thing to do, as we really need proper IDs for
> multipathing. So just add another entry for the Dell device using
> the dell subvendor/subdevice id that does not have the quirk.

The problem it is not possible to distinguish between the two devices
based on subvendor/subdevice ids. I've tried to find some way to keep
those two devices apart, serial numbers but I don't think there is
something usable. Thus the idea to rely our own sanity checks and
enable the feature if these pass. 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] nvme: drop bogus nid quirk for multipath devices which passed id test
  2025-10-29  9:45   ` Daniel Wagner
@ 2025-10-29  9:59     ` Christoph Hellwig
  2025-10-29 10:14       ` Daniel Wagner
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2025-10-29  9:59 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Christoph Hellwig, Daniel Wagner, Keith Busch, John Meneghini,
	Bryan Gurney, linux-nvme, linux-kernel

On Wed, Oct 29, 2025 at 10:45:57AM +0100, Daniel Wagner wrote:
> On Wed, Oct 29, 2025 at 08:44:03AM +0100, Christoph Hellwig wrote:
> > But more imporantly dropping the quirk for multipath devices is
> > fundamentally the wrong thing to do, as we really need proper IDs for
> > multipathing. So just add another entry for the Dell device using
> > the dell subvendor/subdevice id that does not have the quirk.
> 
> The problem it is not possible to distinguish between the two devices
> based on subvendor/subdevice ids. I've tried to find some way to keep
> those two devices apart, serial numbers but I don't think there is
> something usable. Thus the idea to rely our own sanity checks and
> enable the feature if these pass. 

The problem is that there is no such thing as a sanity check.  Otherwise
life would be easy.  But I really can't believe that Dell is incompetent
enough to not leave a mark of their OEM Firmware version anywhere.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] nvme: drop bogus nid quirk for multipath devices which passed id test
  2025-10-29  9:59     ` Christoph Hellwig
@ 2025-10-29 10:14       ` Daniel Wagner
  2025-10-29 13:38         ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Wagner @ 2025-10-29 10:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Daniel Wagner, Keith Busch, John Meneghini, Bryan Gurney,
	linux-nvme, linux-kernel

On Wed, Oct 29, 2025 at 10:59:07AM +0100, Christoph Hellwig wrote:
> The problem is that there is no such thing as a sanity check.  Otherwise
> life would be easy.  But I really can't believe that Dell is incompetent
> enough to not leave a mark of their OEM Firmware version anywhere.

Got it. Sure, there are version strings though not really 'stable' as
Hannes told me. Maybe there is a way to filter these out. But isn't this
what the command line idea overwrite feature is?

John, did you have some progress with this?

The sad thing is that the Intel device needs the quirk. If the Dell
device just would use it's own IDs...

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] nvme: drop bogus nid quirk for multipath devices which passed id test
  2025-10-29 10:14       ` Daniel Wagner
@ 2025-10-29 13:38         ` Christoph Hellwig
  0 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2025-10-29 13:38 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Christoph Hellwig, Daniel Wagner, Keith Busch, John Meneghini,
	Bryan Gurney, linux-nvme, linux-kernel

On Wed, Oct 29, 2025 at 11:14:27AM +0100, Daniel Wagner wrote:
> On Wed, Oct 29, 2025 at 10:59:07AM +0100, Christoph Hellwig wrote:
> > The problem is that there is no such thing as a sanity check.  Otherwise
> > life would be easy.  But I really can't believe that Dell is incompetent
> > enough to not leave a mark of their OEM Firmware version anywhere.
> 
> Got it. Sure, there are version strings though not really 'stable' as
> Hannes told me. Maybe there is a way to filter these out. But isn't this
> what the command line idea overwrite feature is?

Version isn't optimal, but model numbers often encode model numbers,
and some devices have identifiers in the vendor specific regions of
identify commands.

> John, did you have some progress with this?
> 
> The sad thing is that the Intel device needs the quirk. If the Dell
> device just would use it's own IDs...

I'm pretty sure they are one and the same device, just with slightly
different firmware versions.  It might be that Dell actually reported
the bogus IDs to Intel as part of compliance testing and Intel only
fixed it for them and not the generic firmware build :(

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] nvme: drop bogus nid quirk for multipath devices which passed id test
  2025-10-28 12:54 [PATCH] nvme: drop bogus nid quirk for multipath devices which passed id test Daniel Wagner
  2025-10-29  7:44 ` Christoph Hellwig
@ 2025-11-05  7:33 ` Hannes Reinecke
  1 sibling, 0 replies; 7+ messages in thread
From: Hannes Reinecke @ 2025-11-05  7:33 UTC (permalink / raw)
  To: Daniel Wagner, Keith Busch, Christoph Hellwig
  Cc: John Meneghini, Bryan Gurney, linux-nvme, linux-kernel,
	Daniel Wagner

On 10/28/25 13:54, Daniel Wagner wrote:
> From: Daniel Wagner <dwagner@suse.de>
> 
> Dell Express Flash NVMe P4610 reuse the Intel device id 0x0a54. The
> Intel P4500/46000 device needs NVME_QUIRK_BOBUG_NID whereas the Dell
> does not. Thus clear the quirk when the id check passes.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217981
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
> Hi,
> 
> Obviously, using the correct IDs is really hard. There are at least two devices
> which share a the device id and one needs the bogus id workaround the other
> doesn't.
> 
> We discussed the idea to pass through configuration from the kernel command line
> at ALPSS. I am not sure, how far we got with this yet. What about something like
> this here?
> 
> Thanks,
> Daniel
> 
> Signed-off-by: Daniel Wagner <wagi@kernel.org>
> ---
>   drivers/nvme/host/core.c | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-11-05  7:33 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-28 12:54 [PATCH] nvme: drop bogus nid quirk for multipath devices which passed id test Daniel Wagner
2025-10-29  7:44 ` Christoph Hellwig
2025-10-29  9:45   ` Daniel Wagner
2025-10-29  9:59     ` Christoph Hellwig
2025-10-29 10:14       ` Daniel Wagner
2025-10-29 13:38         ` Christoph Hellwig
2025-11-05  7:33 ` Hannes Reinecke

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox