* [PATCH] nvme: don't reject probe due to duplicate IDs for single-ported PCIe devices
From: Christoph Hellwig @ 2023-07-13 13:30 UTC (permalink / raw)
To: kbusch, sagi, axboe; +Cc: linux-nvme
While duplicate IDs are still very harmful, including the potential to easily
see changing devices in /dev/disk/by-id, it turns out they are extremely
common for cheap end user NVMe devices.
Relax our check for them so that it doesn't reject the probe on
single-ported PCIe devices, but prints a big warning instead. In doubt
we'd still like to see quirk entries to disable the potential for
changing supposedly stable device identifier links, but this will at least
allow users who have two (or more) of these devices to use them without
having to manually add a new PCI ID entry with the quirk through sysfs or
by patching the kernel.
Co-developed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/nvme/host/core.c | 36 +++++++++++++++++++++++++++++++++---
1 file changed, 33 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 47d7ba2827ff29..37b6fa74666204 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3431,10 +3431,40 @@ static int nvme_init_ns_head(struct nvme_ns *ns, struct nvme_ns_info *info)
ret = nvme_global_check_duplicate_ids(ctrl->subsys, &info->ids);
if (ret) {
- dev_err(ctrl->device,
- "globally duplicate IDs for nsid %d\n", info->nsid);
+ /*
+ * We've found two different namespaces on two different
+ * subsystems that report the same ID. This is pretty nasty
+ * for anything that actually requires unique device
+ * identification. In the kernel we need this for multipathing,
+ * and in user space the /dev/disk/by-id/ links rely on it.
+ *
+ * If the device also claims to be multi-path capable back off
+ * here now and refuse to probe the second device as this is a
+ * recipe for data corruption. If not this is probably a
+ * cheap consumer device if on the PCIe bus, so let the user
+ * proceed and use the shiny toy, but warn that with changing
+ * probing order (which due to our async probing could just be
+ * a device taking longer to start up) the other device could show
+ * up at any time.
+ */
nvme_print_device_info(ctrl);
- return ret;
+ if ((ns->ctrl->ops->flags & NVME_F_FABRICS) || /* !PCIe */
+ ((ns->ctrl->subsys->cmic & NVME_CTRL_CMIC_MULTI_CTRL) &&
+ info->is_shared)) {
+ dev_err(ctrl->device,
+ "ignoring nsid %d because of duplicate IDs\n",
+ info->nsid);
+ return ret;
+ }
+
+ dev_err(ctrl->device,
+ "clearing duplicate IDs for nsid %d\n", info->nsid);
+ dev_err(ctrl->device,
+ "use of /dev/disk/by-id/ may cause data corruption\n");
+ memset(&info->ids.nguid, 0, sizeof(info->ids.nguid));
+ memset(&info->ids.uuid, 0, sizeof(info->ids.uuid));
+ memset(&info->ids.eui64, 0, sizeof(info->ids.eui64));
+ ctrl->quirks |= NVME_QUIRK_BOGUS_NID;
}
mutex_lock(&ctrl->subsys->lock);
--
2.39.2
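The branch added by the hunk above can be read as a small decision function. What follows is a minimal userspace sketch in C, not kernel code: the names `ns_ids_model`, `dup_ctx_model`, `dup_action` and `handle_global_duplicate` are hypothetical stand-ins, and the three booleans only model the `NVME_F_FABRICS`, `NVME_CTRL_CMIC_MULTI_CTRL` and `info->is_shared` tests from the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the namespace identifiers; not the kernel struct. */
struct ns_ids_model {
	uint8_t eui64[8];
	uint8_t nguid[16];
	uint8_t uuid[16];
};

/* Hypothetical model of the three conditions the patch tests. */
struct dup_ctx_model {
	bool is_fabrics;      /* models ctrl->ops->flags & NVME_F_FABRICS */
	bool cmic_multi_ctrl; /* models subsys->cmic & NVME_CTRL_CMIC_MULTI_CTRL */
	bool ns_is_shared;    /* models info->is_shared */
};

enum dup_action {
	DUP_REJECT_NS,  /* namespace is ignored, probe of it fails */
	DUP_CLEAR_IDS   /* namespace is kept, but its IDs are zeroed */
};

/*
 * Called only after a globally duplicate ID has been detected.
 * Fabrics controllers and multi-controller subsystems with a shared
 * namespace are rejected; everything else keeps working with cleared
 * IDs and the "bogus NID" flag set.
 */
static enum dup_action handle_global_duplicate(const struct dup_ctx_model *c,
					       struct ns_ids_model *ids,
					       bool *bogus_nid)
{
	if (c->is_fabrics || (c->cmic_multi_ctrl && c->ns_is_shared))
		return DUP_REJECT_NS;

	memset(ids->nguid, 0, sizeof(ids->nguid));
	memset(ids->uuid, 0, sizeof(ids->uuid));
	memset(ids->eui64, 0, sizeof(ids->eui64));
	*bogus_nid = true; /* models ctrl->quirks |= NVME_QUIRK_BOGUS_NID */
	return DUP_CLEAR_IDS;
}
```

Under this model a single-ported PCIe device ends up with `DUP_CLEAR_IDS` and all three identifiers zeroed, while any Fabrics controller, or a multi-controller subsystem reporting a shared namespace, ends up with `DUP_REJECT_NS` and its identifiers untouched.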
* Re: [PATCH] nvme: don't reject probe due to duplicate IDs for single-ported PCIe devices
From: Kanchan Joshi @ 2023-07-13 14:02 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: kbusch, sagi, axboe, linux-nvme
On Thu, Jul 13, 2023 at 03:30:42PM +0200, Christoph Hellwig wrote:
>While duplicate IDs are still very harmful, including the potential to easily
>see changing devices in /dev/disk/by-id, it turns out they are extremely
>common for cheap end user NVMe devices.
>
>Relax our check for them so that it doesn't reject the probe on
>single-ported PCIe devices, but prints a big warning instead. In doubt
>we'd still like to see quirk entries to disable the potential for
>changing supposedly stable device identifier links, but this will at least
>allow users who have two (or more) of these devices to use them without
>having to manually add a new PCI ID entry with the quirk through sysfs or
>by patching the kernel.
Should this go for backport? For the commit 2079f41ec6ffa.
* Re: [PATCH] nvme: don't reject probe due to duplicate IDs for single-ported PCIe devices
From: Keith Busch @ 2023-07-13 14:54 UTC (permalink / raw)
To: Kanchan Joshi; +Cc: Christoph Hellwig, sagi, axboe, linux-nvme
On Thu, Jul 13, 2023 at 07:32:06PM +0530, Kanchan Joshi wrote:
> On Thu, Jul 13, 2023 at 03:30:42PM +0200, Christoph Hellwig wrote:
> > While duplicate IDs are still very harmful, including the potential to easily
> > see changing devices in /dev/disk/by-id, it turns out they are extremely
> > common for cheap end user NVMe devices.
> >
> > Relax our check for them so that it doesn't reject the probe on
> > single-ported PCIe devices, but prints a big warning instead. In doubt
> > we'd still like to see quirk entries to disable the potential for
> > changing supposedly stable device identifier links, but this will at least
> > allow users who have two (or more) of these devices to use them without
> > having to manually add a new PCI ID entry with the quirk through sysfs or
> > by patching the kernel.
>
> Should this go for backport? For the commit 2079f41ec6ffa.
Is it sufficient if I just append "Cc: stable..." to the commit message?
* Re: [PATCH] nvme: don't reject probe due to duplicate IDs for single-ported PCIe devices
From: Jens Axboe @ 2023-07-13 15:01 UTC (permalink / raw)
To: Keith Busch, Kanchan Joshi; +Cc: Christoph Hellwig, sagi, linux-nvme
On 7/13/23 8:54 AM, Keith Busch wrote:
> On Thu, Jul 13, 2023 at 07:32:06PM +0530, Kanchan Joshi wrote:
>> On Thu, Jul 13, 2023 at 03:30:42PM +0200, Christoph Hellwig wrote:
>>> While duplicate IDs are still very harmful, including the potential to easily
>>> see changing devices in /dev/disk/by-id, it turns out they are extremely
>>> common for cheap end user NVMe devices.
>>>
>>> Relax our check for them so that it doesn't reject the probe on
>>> single-ported PCIe devices, but prints a big warning instead. In doubt
>>> we'd still like to see quirk entries to disable the potential for
>>> changing supposedly stable device identifier links, but this will at least
>>> allow users who have two (or more) of these devices to use them without
>>> having to manually add a new PCI ID entry with the quirk through sysfs or
>>> by patching the kernel.
>>
>> Should this go for backport? For the commit 2079f41ec6ffa.
>
> Is it sufficient if I just append "Cc: stable..." to the commit message?
You probably want:
Fixes: 2079f41ec6ff ("nvme: check that EUI/GUID/UUID are globally unique")
Cc: stable@vger.kernel.org
and that should do the trick.
--
Jens Axboe
* Re: [PATCH] nvme: don't reject probe due to duplicate IDs for single-ported PCIe devices
From: Keith Busch @ 2023-07-13 15:20 UTC (permalink / raw)
To: Jens Axboe; +Cc: Kanchan Joshi, Christoph Hellwig, sagi, linux-nvme
On Thu, Jul 13, 2023 at 09:01:18AM -0600, Jens Axboe wrote:
> You probably want:
>
> Fixes: 2079f41ec6ff ("nvme: check that EUI/GUID/UUID are globally unique")
> Cc: stable@vger.kernel.org
>
> and that should do the trick.
Done, pushed to nvme-6.5. I'll send a new pull request before
end-of-day.
* Re: [PATCH] nvme: don't reject probe due to duplicate IDs for single-ported PCIe devices
From: xiaoke @ 2026-01-20 9:54 UTC (permalink / raw)
To: linux-nvme; +Cc: Christoph Hellwig, sagi, axboe, kbusch
On 2023/7/13 21:30, Christoph Hellwig wrote:
> While duplicate IDs are still very harmful, including the potential to easily
> see changing devices in /dev/disk/by-id, it turns out they are extremely
> common for cheap end user NVMe devices.
>
> Relax our check for them so that it doesn't reject the probe on
> single-ported PCIe devices, but prints a big warning instead. In doubt
> we'd still like to see quirk entries to disable the potential for
> changing supposedly stable device identifier links, but this will at least
> allow users who have two (or more) of these devices to use them without
> having to manually add a new PCI ID entry with the quirk through sysfs or
> by patching the kernel.
>
> Co-developed-by: Sagi Grimberg <sagi@grimberg.me>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> drivers/nvme/host/core.c | 36 +++++++++++++++++++++++++++++++++---
> 1 file changed, 33 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 47d7ba2827ff29..37b6fa74666204 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -3431,10 +3431,40 @@ static int nvme_init_ns_head(struct nvme_ns *ns, struct nvme_ns_info *info)
>
> ret = nvme_global_check_duplicate_ids(ctrl->subsys, &info->ids);
> if (ret) {
> - dev_err(ctrl->device,
> - "globally duplicate IDs for nsid %d\n", info->nsid);
> + /*
> + * We've found two different namespaces on two different
> + * subsystems that report the same ID. This is pretty nasty
> + * for anything that actually requires unique device
> + * identification. In the kernel we need this for multipathing,
> + * and in user space the /dev/disk/by-id/ links rely on it.
> + *
> + * If the device also claims to be multi-path capable back off
> + * here now and refuse to probe the second device as this is a
> + * recipe for data corruption. If not this is probably a
> + * cheap consumer device if on the PCIe bus, so let the user
> + * proceed and use the shiny toy, but warn that with changing
> + * probing order (which due to our async probing could just be
> + * a device taking longer to start up) the other device could show
> + * up at any time.
> + */
> nvme_print_device_info(ctrl);
> - return ret;
> + if ((ns->ctrl->ops->flags & NVME_F_FABRICS) || /* !PCIe */
> + ((ns->ctrl->subsys->cmic & NVME_CTRL_CMIC_MULTI_CTRL) &&
> + info->is_shared)) {
> + dev_err(ctrl->device,
> + "ignoring nsid %d because of duplicate IDs\n",
> + info->nsid);
> + return ret;
> + }
> +
> + dev_err(ctrl->device,
> + "clearing duplicate IDs for nsid %d\n", info->nsid);
> + dev_err(ctrl->device,
> + "use of /dev/disk/by-id/ may cause data corruption\n");
> + memset(&info->ids.nguid, 0, sizeof(info->ids.nguid));
> + memset(&info->ids.uuid, 0, sizeof(info->ids.uuid));
> + memset(&info->ids.eui64, 0, sizeof(info->ids.eui64));
> + ctrl->quirks |= NVME_QUIRK_BOGUS_NID;
> }
>
> mutex_lock(&ctrl->subsys->lock);
Hi,
I’d like to discuss whether we should revisit the duplicate-ID check
for NVMe-oF transports, especially in HA dual-active setups.
In such HA configurations, a single LUN is exposed via multiple subsystems
(one per storage controller) to provide redundancy. Because it represents
the same namespace, it usually reports the same UUID/NGUID/EUI64 on all
paths.
With the logic introduced in this patch, Fabrics are still strictly
rejected:
> + if ((ns->ctrl->ops->flags & NVME_F_FABRICS) || /* !PCIe */
> + ((ns->ctrl->subsys->cmic & NVME_CTRL_CMIC_MULTI_CTRL) &&
> + info->is_shared)) {
Concretely, with two subsystems exposing the same LUN in a dual-active
HA configuration:
- Only paths from one subsystem are used;
- When that controller fails, the host cannot fail over to the other
subsystem because its namespace was ignored, effectively breaking HA.
Would it make sense to:
1) relax the duplicate ID check for NVMe-oF HA dual-active use cases, or
2) add a module parameter (e.g., `nvme_core.allow_duplicate_ids`) so admins
can opt in when they know their storage topology and accept the
/dev/disk/by-id risks?
Keeping the default strict is fine, but having an escape hatch would be
very helpful for HA deployments.
Thanks,
Xiaoke
* Re: [PATCH] nvme: don't reject probe due to duplicate IDs for single-ported PCIe devices
From: Christoph Hellwig @ 2026-01-22 6:38 UTC (permalink / raw)
To: xiaoke@sangfor.com.cn; +Cc: linux-nvme, Christoph Hellwig, sagi, axboe, kbusch
On Tue, Jan 20, 2026 at 05:54:25PM +0800, xiaoke@sangfor.com.cn wrote:
>
> I’d like to discuss whether we should revisit the duplicate-ID check
> for NVMe-oF transports, especially in HA dual-active setups.
>
> In such HA configurations, a single LUN is exposed via multiple subsystems
> (one per storage controller) to provide redundancy. Because it represents
> the same namespace, it usually reports the same UUID/NGUID/EUI64 on all
> paths.
Not in any configuration conforming to the NVMe spec (discounting the
completely broken dispersed namespace feature).
So no, this is exactly what both the spec and the code try to prevent.
> > + if ((ns->ctrl->ops->flags & NVME_F_FABRICS) || /* !PCIe */
> > + ((ns->ctrl->subsys->cmic & NVME_CTRL_CMIC_MULTI_CTRL) &&
> > + info->is_shared)) {
>
> Concretely, with two subsystems exposing the same LUN in a dual-active
> HA configuration:
There are no logical units or logical unit numbers in NVMe.
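The distinction drawn in this reply, between paths to one namespace inside a single subsystem and identical IDs reported by different subsystems, can be illustrated with a deliberately simplified userspace model in C. Everything here is an assumption made for illustration: `ns_model`, `subsys_id`, `scan_result` and `classify_ns` are hypothetical names, and the real kernel logic applies further conditions (for example on the namespace ID and on whether the namespace is shared) that this sketch leaves out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the namespace identifiers; not the kernel struct. */
struct ns_ids_model {
	uint8_t eui64[8];
	uint8_t nguid[16];
	uint8_t uuid[16];
};

/* Hypothetical model of a namespace as seen during scanning. */
struct ns_model {
	int subsys_id;            /* which NVM subsystem reported it */
	struct ns_ids_model ids;
};

enum scan_result {
	NS_NEW,               /* no known namespace shares any ID */
	NS_EXTRA_PATH,        /* same IDs within the same subsystem */
	NS_GLOBAL_DUPLICATE   /* same IDs reported by another subsystem */
};

/* An all-zero identifier field counts as "not reported". */
static bool field_set(const uint8_t *p, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (p[i])
			return true;
	return false;
}

/* Two ID sets collide if any reported field of a matches b exactly. */
static bool ids_collide(const struct ns_ids_model *a,
			const struct ns_ids_model *b)
{
	return (field_set(a->uuid, sizeof(a->uuid)) &&
		!memcmp(a->uuid, b->uuid, sizeof(a->uuid))) ||
	       (field_set(a->nguid, sizeof(a->nguid)) &&
		!memcmp(a->nguid, b->nguid, sizeof(a->nguid))) ||
	       (field_set(a->eui64, sizeof(a->eui64)) &&
		!memcmp(a->eui64, b->eui64, sizeof(a->eui64)));
}

/* Classify a candidate namespace against the ones already known. */
static enum scan_result classify_ns(const struct ns_model *known, size_t n,
				    const struct ns_model *cand)
{
	enum scan_result res = NS_NEW;

	for (size_t i = 0; i < n; i++) {
		if (!ids_collide(&known[i].ids, &cand->ids))
			continue;
		if (known[i].subsys_id != cand->subsys_id)
			return NS_GLOBAL_DUPLICATE;
		res = NS_EXTRA_PATH;
	}
	return res;
}
```

In this model, a second controller in the same subsystem that reports the same IDs is classified as `NS_EXTRA_PATH`, whereas the setup described in the question, where two separate subsystems report the same IDs, is classified as `NS_GLOBAL_DUPLICATE`.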