public inbox for linux-nvme@lists.infradead.org
 help / color / mirror / Atom feed
* [PATCH 0/1] Enable/Disable NVMe quirks via module parameter
@ 2026-01-28 10:33 Maurizio Lombardi
  2026-01-28 10:33 ` [PATCH 1/1] nvme: add support for dynamic quirk configuration " Maurizio Lombardi
  0 siblings, 1 reply; 4+ messages in thread
From: Maurizio Lombardi @ 2026-01-28 10:33 UTC (permalink / raw)
  To: kbusch; +Cc: hch, linux-nvme, dwagner, hare, jmeneghi, emilne, mlombard

Recently, some of our customers were very vocal about wanting an
easy way to enable or disable specific nvme quirks.

I recall that during the ALPSS conference this topic has been briefly
discussed, and it was proposed to use the generic PCI sysfs "new_id" entry
to manipulate the quirks.

However, in some cases a quirk may prevent the system from properly booting,
the kernel may hang or fail to mount the root
filesystem before the system reaches a state where new_id can be written,
this makes this solution not exactly user-friendly for end users.

In particular, boot-time hangs have been reported involving the
bogus_nid quirk.

Therefore, I have been asked to re-propose this module parameter approach
with a few modifications suggested by Daniel Wagner.


# dmesg | grep nvme
[    0.000000] Command line: root=/dev/vda1 console=ttyS0,115200n8 nvme.quirks=1b36:0010:^bogus_nxid
[    6.732692] nvme: unrecognized quirk bogus_nxid

# cat /sys/devices/pci0000:00/0000:00:05.0/nvme/nvme1/quirks
bogus_nid
                                                                                                                                                                                                                         
# dmesg | grep nvme
[    0.000000] Command line: root=/dev/vda1 console=ttyS0,115200n8 nvme.quirks=1b36:0010:^bogus_nid,no_apst
                                                                                                                                                                                                                         
# cat /sys/devices/pci0000:00/0000:00:05.0/nvme/nvme0/quirks                                                                                                                                             
no_apst

Maurizio Lombardi (1):
  nvme: add support for dynamic quirk configuration via module parameter

 .../admin-guide/kernel-parameters.txt         |  13 ++
 drivers/nvme/host/pci.c                       | 154 ++++++++++++++++++
 2 files changed, 167 insertions(+)

-- 
2.52.0



^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/1] nvme: add support for dynamic quirk configuration via module parameter
  2026-01-28 10:33 [PATCH 0/1] Enable/Disable NVMe quirks via module parameter Maurizio Lombardi
@ 2026-01-28 10:33 ` Maurizio Lombardi
  2026-02-02  7:17   ` Christoph Hellwig
  0 siblings, 1 reply; 4+ messages in thread
From: Maurizio Lombardi @ 2026-01-28 10:33 UTC (permalink / raw)
  To: kbusch; +Cc: hch, linux-nvme, dwagner, hare, jmeneghi, emilne, mlombard

Introduce support for enabling or disabling specific NVMe quirks at
module load time through the `quirks` module parameter.
This mechanism allows users to apply known quirks dynamically
based on the device's PCI vendor and device IDs,
without requiring to add hardcoded entries in the driver
and recompiling the kernel.

While the generic PCI new_id sysfs interface exists for
dynamic configuration, it is insufficient for scenarios where
the system fails to boot (for example, this has been reported to happen
because of the bogus_nid quirk). The new_id attribute is writable
only after the system has booted and sysfs is mounted.

The `quirks` parameter accepts a list of quirk
specifications separated by a '-' character in the following format:

  <VID>:<DID>:<quirk_names>[-<VID>:<DID>:<quirk_names>-..]

Each quirk is represented by its name and can be prefixed with
`^` to indicate that the quirk should be disabled.
Quirk names are separated by a ',' character.

Example usage (enable BOGUS_NID and BROKEN_MSI, disable DEALLOCATE_ZEROES):

  $ modprobe nvme quirks=7170:2210:bogus_nid,broken_msi,^deallocate_zeroes

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Tested-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 .../admin-guide/kernel-parameters.txt         |  13 ++
 drivers/nvme/host/pci.c                       | 154 ++++++++++++++++++
 2 files changed, 167 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index a8d0afde7f85..e559648f97e6 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -74,6 +74,7 @@
 	TPM	TPM drivers are enabled.
 	UMS	USB Mass Storage support is enabled.
 	USB	USB support is enabled.
+	NVME	NVMe support is enabled
 	USBHID	USB Human Interface Device support is enabled.
 	V4L	Video For Linux support is enabled.
 	VGA	The VGA console has been enabled.
@@ -4671,6 +4672,18 @@ Kernel parameters
 			This can be set from sysctl after boot.
 			See Documentation/admin-guide/sysctl/vm.rst for details.
 
+	nvme.quirks=    [NVME] A list of quirk entries to augment the built-in
+			nvme quirk list. List entries are separated by a
+			'-' character.
+			Each entry has the form VendorID:ProductID:quirk_names.
+			The IDs are 4-digits hex numbers and quirk_names is a
+			list of quirk names separated by commas. A quirk name can
+			be prefixed by '^', meaning that the specified quirk must
+			be disabled.
+
+			Example:
+			nvme.quirks=7710:2267:bogus_nid,^identify_cns-9900:7711:broken_msi
+
 	ohci1394_dma=early	[HW,EARLY] enable debugging via the ohci1394 driver.
 			See Documentation/core-api/debugging-via-ohci1394.rst for more
 			info.
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 9fc4a60280a0..512c5cecc6f0 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -72,6 +72,13 @@
 static_assert(MAX_PRP_RANGE / NVME_CTRL_PAGE_SIZE <=
 	(1 /* prp1 */ + NVME_MAX_NR_DESCRIPTORS * PRPS_PER_PAGE));
 
+struct quirk_entry {
+	u16 vendor_id;
+	u16 dev_id;
+	u32 enabled_quirks;
+	u32 disabled_quirks;
+};
+
 static int use_threaded_interrupts;
 module_param(use_threaded_interrupts, int, 0444);
 
@@ -102,6 +109,132 @@ static unsigned int io_queue_depth = 1024;
 module_param_cb(io_queue_depth, &io_queue_depth_ops, &io_queue_depth, 0644);
 MODULE_PARM_DESC(io_queue_depth, "set io queue depth, should >= 2 and < 4096");
 
+static int quirks_param_set(const char *value, const struct kernel_param *kp);
+static char quirks_param[128];
+static const struct kernel_param_ops quirks_param_ops = {
+	.set = quirks_param_set,
+	.get = param_get_string,
+};
+
+static struct kparam_string quirks_param_string = {
+	.maxlen = sizeof(quirks_param),
+	.string = quirks_param,
+};
+
+static struct quirk_entry *quirk_list;
+static unsigned int quirk_count;
+module_param_cb(quirks, &quirks_param_ops, &quirks_param_string, 0644);
+MODULE_PARM_DESC(quirks, "Enable/disable NVMe quirks by specifying quirks=vendorID:deviceID:quirk_names");
+
+static int quirks_param_set(const char *value, const struct kernel_param *kp)
+{
+	char *val, *outer_ptr, *inner_ptr, *field;
+	char *q_name;
+	u16 vid, did;
+	u32 *flags;
+	size_t i = 0, q_len, field_len;
+	int err, b;
+
+	val = kstrdup(value, GFP_KERNEL);
+	if (!val)
+		return -ENOMEM;
+
+	err = param_set_copystring(val, kp);
+	if (err)
+		goto exit;
+
+	kfree(quirk_list);
+	quirk_list = NULL;
+
+	if (!*val) {
+		quirk_count = 0;
+		goto exit;
+	}
+
+	for (quirk_count = 1, i = 0; val[i]; i++)
+		if (val[i] == '-')
+			quirk_count++;
+
+	quirk_list = kcalloc(quirk_count, sizeof(struct quirk_entry),
+			     GFP_KERNEL);
+	if (!quirk_list) {
+		quirk_count = 0;
+		err = -ENOMEM;
+		goto exit;
+	}
+
+	i = 0;
+	outer_ptr = val;
+	while ((field = strsep(&outer_ptr, "-"))) {
+		inner_ptr = field;
+
+		/* Each entry consists of VID:DID:quirk_names */
+		field = strsep(&inner_ptr, ":");
+		if (!field) {
+			pr_err("nvme: malformed quirks string: %s\n",
+							value);
+			break;
+		}
+
+		if (kstrtou16(field, 16, &vid)) {
+			pr_err("nvme: malformed vendor id: %s\n", field);
+			break;
+		}
+
+		field = strsep(&inner_ptr, ":");
+		if (!field) {
+			pr_err("nvme: malformed quirks string: %s\n",
+							value);
+			break;
+		}
+
+		if (kstrtou16(field, 16, &did)) {
+			pr_err("nvme: malformed device id: %s\n", field);
+			break;
+		}
+
+		while ((field = strsep(&inner_ptr, ",")) && *field) {
+
+			/* Each entry consists of a quirk name */
+
+			if (*field == '^') {
+				flags = &quirk_list[i].disabled_quirks;
+				field++;
+			} else {
+				flags = &quirk_list[i].enabled_quirks;
+			}
+
+			field_len = strlen(field);
+			for (b = 0; ; ++b) {
+				q_name = nvme_quirk_name(BIT(b));
+				q_len = strlen(q_name);
+
+				if (!strcmp(q_name, "unknown")) {
+					pr_err("nvme: unrecognized quirk %s\n",
+									field);
+					break;
+				}
+				if (!strncmp(q_name, field, q_len) &&
+							q_len == field_len) {
+					*flags |= BIT(b);
+					break;
+				}
+			}
+		}
+
+		quirk_list[i].vendor_id = vid;
+		quirk_list[i].dev_id = did;
+		++i;
+	}
+
+	if (i < quirk_count)
+		quirk_count = i;
+
+exit:
+	kfree(val);
+	return err;
+}
+
 static int io_queue_count_set(const char *val, const struct kernel_param *kp)
 {
 	unsigned int n;
@@ -3439,12 +3572,27 @@ static unsigned long check_vendor_combination_bug(struct pci_dev *pdev)
 	return 0;
 }
 
+static struct quirk_entry *detect_dynamic_quirks(struct pci_dev *pdev)
+{
+	int i;
+
+	for (i = 0; i < quirk_count; ++i) {
+		if (pdev->vendor == quirk_list[i].vendor_id &&
+		    pdev->device == quirk_list[i].dev_id) {
+			return &quirk_list[i];
+		}
+	}
+
+	return NULL;
+}
+
 static struct nvme_dev *nvme_pci_alloc_dev(struct pci_dev *pdev,
 		const struct pci_device_id *id)
 {
 	unsigned long quirks = id->driver_data;
 	int node = dev_to_node(&pdev->dev);
 	struct nvme_dev *dev;
+	struct quirk_entry *qentry;
 	int ret = -ENOMEM;
 
 	dev = kzalloc_node(struct_size(dev, descriptor_pools, nr_node_ids),
@@ -3476,6 +3624,11 @@ static struct nvme_dev *nvme_pci_alloc_dev(struct pci_dev *pdev,
 			 "platform quirk: setting simple suspend\n");
 		quirks |= NVME_QUIRK_SIMPLE_SUSPEND;
 	}
+	qentry = detect_dynamic_quirks(pdev);
+	if (qentry) {
+		quirks |= qentry->enabled_quirks;
+		quirks &= ~(qentry->disabled_quirks);
+	}
 	ret = nvme_init_ctrl(&dev->ctrl, &pdev->dev, &nvme_pci_ctrl_ops,
 			     quirks);
 	if (ret)
@@ -4074,6 +4227,7 @@ static int __init nvme_init(void)
 
 static void __exit nvme_exit(void)
 {
+	kfree(quirk_list);
 	pci_unregister_driver(&nvme_driver);
 	flush_workqueue(nvme_wq);
 }
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 1/1] nvme: add support for dynamic quirk configuration via module parameter
  2026-01-28 10:33 ` [PATCH 1/1] nvme: add support for dynamic quirk configuration " Maurizio Lombardi
@ 2026-02-02  7:17   ` Christoph Hellwig
  2026-02-02 17:38     ` Maurizio Lombardi
  0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2026-02-02  7:17 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: kbusch, hch, linux-nvme, dwagner, hare, jmeneghi, emilne,
	mlombard

On Wed, Jan 28, 2026 at 11:33:18AM +0100, Maurizio Lombardi wrote:
> Introduce support for enabling or disabling specific NVMe quirks at
> module load time through the `quirks` module parameter.
> This mechanism allows users to apply known quirks dynamically
> based on the device's PCI vendor and device IDs,
> without requiring to add hardcoded entries in the driver
> and recompiling the kernel.

Nit: this commit message reads a little odd because line lengths are so
inconsistent, please use up all 73/75 chracaters consistently.

> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt

I thought those file only used to document always built-in paramters
using __setup and not module parameters, but grabbing random ones that
look modular disproves that.

> +			list of quirk names separated by commas. A quirk name can
> +			be prefixed by '^', meaning that the specified quirk must

Overly long lines.

> +static int quirks_param_set(const char *value, const struct kernel_param *kp);

Can we get away without a forware declaration here?

> +static struct quirk_entry *quirk_list;
> +static unsigned int quirk_count;

Please use nvme_pci_ prefixes for global variables and functions.

> +static int quirks_param_set(const char *value, const struct kernel_param *kp)
> +{
> +	char *val, *outer_ptr, *inner_ptr, *field;
> +	char *q_name;
> +	u16 vid, did;
> +	u32 *flags;
> +	size_t i = 0, q_len, field_len;
> +	int err, b;
> +
> +	val = kstrdup(value, GFP_KERNEL);
> +	if (!val)
> +		return -ENOMEM;
> +
> +	err = param_set_copystring(val, kp);
> +	if (err)
> +		goto exit;

All other callers of param_set_copystring except for usbhid set
it to the passsed in value and not a copy.  Given that this
function has no documentation I'm not sure what is right, but
this smells a bit fishy.

> +	kfree(quirk_list);
> +	quirk_list = NULL;

How can quirk_list be non-NULL here given that we don't allow
runtime-changes to the paramter?

> +	if (!*val) {
> +		quirk_count = 0;
> +		goto exit;
> +	}

> +	for (quirk_count = 1, i = 0; val[i]; i++)
> +		if (val[i] == '-')
> +			quirk_count++;

initializing the quirk_count in the loop looks odd, compare to:

	quirk_count = 1;
	for (i = 0; val[i]; i++)
		if (val[i] == '-')
			quirk_count++;

> +
> +	quirk_list = kcalloc(quirk_count, sizeof(struct quirk_entry),
> +			     GFP_KERNEL);
> +	if (!quirk_list) {
> +		quirk_count = 0;
> +		err = -ENOMEM;
> +		goto exit;
> +	}

Instead of zeroing the global variable on error, please only set it
from a local one on success to make the code more robust.

> +	i = 0;
> +	outer_ptr = val;
> +	while ((field = strsep(&outer_ptr, "-"))) {
> +		inner_ptr = field;
> +
> +		/* Each entry consists of VID:DID:quirk_names */
> +		field = strsep(&inner_ptr, ":");

This juggling between field and innter_ptr looks odd.  I'd scan directly
into innter_ptr and keep field declarated locally in the loop.  Same
for any other variable not used globally.  Also splitting out the
loop bodies into well named helpers would help readability a lot here.

> +			field_len = strlen(field);
> +			for (b = 0; ; ++b) {
> +				q_name = nvme_quirk_name(BIT(b));

Urrg.  Please avoid use of BIT in general, but especially here where
you're actually doing arithmetics amd just declare constants.

Given that we're unlikely to add tons of quirk lines, simply reallocing
the array vs pre-sizing it might also be simpler.

> +static struct quirk_entry *detect_dynamic_quirks(struct pci_dev *pdev)
> +{
> +	int i;
> +
> +	for (i = 0; i < quirk_count; ++i) {

The nvme driver like most of the kernel uses post-increments in for
loops.  (We already had at least one more instance above).

> +		if (pdev->vendor == quirk_list[i].vendor_id &&
> +		    pdev->device == quirk_list[i].dev_id) {
> +			return &quirk_list[i];
> +		}
> +	}

No need for either braces, and at least the inner ones are counter to
the usual kernel style.

> +	qentry = detect_dynamic_quirks(pdev);
> +	if (qentry) {
> +		quirks |= qentry->enabled_quirks;
> +		quirks &= ~(qentry->disabled_quirks);

No need for the braces after the ~.



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH 1/1] nvme: add support for dynamic quirk configuration via module parameter
  2026-02-02  7:17   ` Christoph Hellwig
@ 2026-02-02 17:38     ` Maurizio Lombardi
  0 siblings, 0 replies; 4+ messages in thread
From: Maurizio Lombardi @ 2026-02-02 17:38 UTC (permalink / raw)
  To: Christoph Hellwig, Maurizio Lombardi
  Cc: kbusch, linux-nvme, dwagner, hare, jmeneghi, emilne, mlombard

Hello Christoph,

On Mon Feb 2, 2026 at 8:17 AM CET, Christoph Hellwig wrote:
>
> Nit: this commit message reads a little odd because line lengths are so
> inconsistent, please use up all 73/75 chracaters consistently.
>
>
>> +			list of quirk names separated by commas. A quirk name can
>> +			be prefixed by '^', meaning that the specified quirk must
>
> Overly long lines.
>
>> +static int quirks_param_set(const char *value, const struct kernel_param *kp);
>
> Can we get away without a forware declaration here?
>
>> +static struct quirk_entry *quirk_list;
>> +static unsigned int quirk_count;
>
> Please use nvme_pci_ prefixes for global variables and functions.

Ok

>
>> +static int quirks_param_set(const char *value, const struct kernel_param *kp)
>> +{
>> +	char *val, *outer_ptr, *inner_ptr, *field;
>> +	char *q_name;
>> +	u16 vid, did;
>> +	u32 *flags;
>> +	size_t i = 0, q_len, field_len;
>> +	int err, b;
>> +
>> +	val = kstrdup(value, GFP_KERNEL);
>> +	if (!val)
>> +		return -ENOMEM;
>> +
>> +	err = param_set_copystring(val, kp);
>> +	if (err)
>> +		goto exit;
>
> All other callers of param_set_copystring except for usbhid set
> it to the passsed in value and not a copy.  Given that this
> function has no documentation I'm not sure what is right, but
> this smells a bit fishy.

Changed it to used the passed value

>
>> +	kfree(quirk_list);
>> +	quirk_list = NULL;
>
> How can quirk_list be non-NULL here given that we don't allow
> runtime-changes to the paramter?

That's right, I cleaned it up

>
>> +	if (!*val) {
>> +		quirk_count = 0;
>> +		goto exit;
>> +	}
>
>> +	for (quirk_count = 1, i = 0; val[i]; i++)
>> +		if (val[i] == '-')
>> +			quirk_count++;
>
> initializing the quirk_count in the loop looks odd, compare to:
>
> 	quirk_count = 1;
> 	for (i = 0; val[i]; i++)
> 		if (val[i] == '-')
> 			quirk_count++;
>
>> +
>> +	quirk_list = kcalloc(quirk_count, sizeof(struct quirk_entry),
>> +			     GFP_KERNEL);
>> +	if (!quirk_list) {
>> +		quirk_count = 0;
>> +		err = -ENOMEM;
>> +		goto exit;
>> +	}
>
> Instead of zeroing the global variable on error, please only set it
> from a local one on success to make the code more robust.

Ok

> Given that we're unlikely to add tons of quirk lines, simply reallocing
> the array vs pre-sizing it might also be simpler.

I tried but IMO doesn't make it much simpler.

>
>> +static struct quirk_entry *detect_dynamic_quirks(struct pci_dev *pdev)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i < quirk_count; ++i) {
>
> The nvme driver like most of the kernel uses post-increments in for
> loops.  (We already had at least one more instance above).
>
>> +		if (pdev->vendor == quirk_list[i].vendor_id &&
>> +		    pdev->device == quirk_list[i].dev_id) {
>> +			return &quirk_list[i];
>> +		}
>> +	}
>
> No need for either braces, and at least the inner ones are counter to
> the usual kernel style.
>
>> +	qentry = detect_dynamic_quirks(pdev);
>> +	if (qentry) {
>> +		quirks |= qentry->enabled_quirks;
>> +		quirks &= ~(qentry->disabled_quirks);
>
> No need for the braces after the ~.

Should all be fixed in V2, please take a look and let me know


Thanks,
Maurizio


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-02-02 17:38 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-28 10:33 [PATCH 0/1] Enable/Disable NVMe quirks via module parameter Maurizio Lombardi
2026-01-28 10:33 ` [PATCH 1/1] nvme: add support for dynamic quirk configuration " Maurizio Lombardi
2026-02-02  7:17   ` Christoph Hellwig
2026-02-02 17:38     ` Maurizio Lombardi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox