public inbox for linux-nvme@lists.infradead.org
* Non-SGL transport mode warnings are set to dev_warn_once will cause confusion
@ 2026-04-09  8:46 AlanCui4080
  2026-04-09 14:36 ` Keith Busch
  0 siblings, 1 reply; 3+ messages in thread
From: AlanCui4080 @ 2026-04-09  8:46 UTC (permalink / raw)
  To: linux-nvme

Hi,

[   49.108397] nvme nvme0: using unchecked data buffer

See 6fad84a (nvme-pci: use sgls for all user requests if possible). In the
kernel, those warnings are printed using `dev_warn_once`. This means that if
multiple devices in the system do not support SGLs (most consumer-grade
devices do not support them), the warning is printed only once, for only one
of those devices.
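
To illustrate the behaviour described above, here is a minimal userspace
sketch, not kernel code. It assumes a simplified model in which a "warn once"
helper keeps a single static flag per call site, shared by every device that
reaches that line. `struct fake_ctrl`, `warn_once_callsite()` and
`warn_once_per_device()` are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a controller; not the kernel's struct nvme_ctrl. */
struct fake_ctrl {
	const char *name;
	bool sgl_supported;
	bool warned;		/* per-device flag, used by the second helper */
};

/*
 * Simplified model of a "warn once" call site: the flag is static, so it is
 * shared by all devices. Returns true if a message was printed.
 */
bool warn_once_callsite(const struct fake_ctrl *ctrl)
{
	static bool already_warned;	/* one flag for the whole call site */

	assert(ctrl != NULL);
	if (ctrl->sgl_supported || already_warned)
		return false;
	already_warned = true;
	printf("%s: using unchecked data buffer\n", ctrl->name);
	return true;
}

/*
 * Alternative: keep the flag in the device, so each device without SGL
 * support reports once. Returns true if a message was printed.
 */
bool warn_once_per_device(struct fake_ctrl *ctrl)
{
	assert(ctrl != NULL);
	if (ctrl->sgl_supported || ctrl->warned)
		return false;
	ctrl->warned = true;
	printf("%s: using unchecked data buffer\n", ctrl->name);
	return true;
}
```

With two controllers that both lack SGL support, the first helper prints for
whichever one is seen first and stays silent for the other, while the second
helper prints once for each.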

This asymmetry can mislead users. If all devices in the system report the
same issue, it might not be a problem, but if only one device reports it, it
might be (especially since I have two identical drives). Is it possible to
move this warning to the device initialization phase so that it is printed for
each device? Or, since we cannot resolve the issue of consumer-grade devices
not supporting SGLs, should it be downgraded to an informational message?

Alan.






* Re: Non-SGL transport mode warnings are set to dev_warn_once will cause confusion
  2026-04-09  8:46 Non-SGL transport mode warnings are set to dev_warn_once will cause confusion AlanCui4080
@ 2026-04-09 14:36 ` Keith Busch
  2026-04-09 15:49   ` AlanCui4080
  0 siblings, 1 reply; 3+ messages in thread
From: Keith Busch @ 2026-04-09 14:36 UTC (permalink / raw)
  To: AlanCui4080; +Cc: linux-nvme

On Thu, Apr 09, 2026 at 04:46:37PM +0800, AlanCui4080 wrote:
> See 6fad84a (nvme-pci: use sgls for all user requests if possible). In the
> kernel, those warnings are printed using `dev_warn_once`. This means that if
> multiple devices in the system do not support SGLs (most consumer-grade
> devices do not support them), the warning is printed only once, for only one
> of those devices.
>
> This asymmetry can mislead users. If all devices in the system report the
> same issue, it might not be a problem, but if only one device reports it, it
> might be (especially since I have two identical drives). Is it possible to
> move this warning to the device initialization phase so that it is printed for
> each device? Or, since we cannot resolve the issue of consumer-grade devices
> not supporting SGLs, should it be downgraded to an informational message?

Fine with me. The warning was added in response to people filing CVEs
against the driver as a sort of acknowledgement that yeah, this
interface can't validate transfer lengths under these conditions, so
we're trusting the user isn't abusing it. A sort of nudge that perhaps
controller vendors might consider supporting the safer option.
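
The point about transfer lengths can be sketched in userspace C. This is only
a simplified model under stated assumptions, not the driver's logic: it
assumes a read/write-style command whose length is implied by a zero-based
block count and a block size. Without SGLs, the page pointers describing the
buffer carry no length, so nothing bounds the transfer to the user's buffer.
An SGL data descriptor carries an explicit length that can be compared with
what the command implies. `struct fake_passthru` and the helper names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a passthrough read/write request. */
struct fake_passthru {
	uint32_t nlb;		/* zero-based number of logical blocks */
	uint32_t lba_size;	/* bytes per logical block */
	size_t buf_len;		/* bytes the user actually supplied */
};

/* Length the device will transfer, derived only from the command fields. */
size_t implied_transfer_len(const struct fake_passthru *p)
{
	assert(p != NULL);
	return ((size_t)p->nlb + 1) * p->lba_size;
}

/*
 * Without SGLs: the buffer description has no length, so the transfer is
 * not bounded by buf_len. Returns how many bytes would land past the buffer.
 */
size_t prp_overrun_bytes(const struct fake_passthru *p)
{
	size_t want = implied_transfer_len(p);

	return want > p->buf_len ? want - p->buf_len : 0;
}

/*
 * With SGLs: the descriptor carries buf_len, so a mismatch can be detected
 * and the command failed instead of running past the buffer.
 */
bool sgl_transfer_allowed(const struct fake_passthru *p)
{
	return implied_transfer_len(p) <= p->buf_len;
}
```

For example, a command that implies 8 blocks of 512 bytes with a 512-byte
buffer overruns by 3584 bytes in the first model and is rejected in the
second.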

Anyway, it's fine with me to move the message and make it less scary.
How about this:

---
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index b42d8768d2979..b6aec0e3fbfb8 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3744,6 +3744,10 @@ int nvme_init_ctrl_finish(struct nvme_ctrl *ctrl, bool was_suspended)
 		ret = nvme_hwmon_init(ctrl);
 		if (ret == -EINTR)
 			return ret;
+
+		if (!nvme_ctrl_sgl_supported(ctrl))
+			dev_info(ctrl->device,
+				"passthrough uses implicit buffer lengths\n");
 	}
 
 	clear_bit(NVME_CTRL_DIRTY_CAPABILITY, &ctrl->flags);
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 8844bbd395159..e9eecdd54d5ed 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -125,16 +125,8 @@ static int nvme_map_user_request(struct request *req, u64 ubuffer,
 	struct bio *bio = NULL;
 	int ret;
 
-	if (!nvme_ctrl_sgl_supported(ctrl))
-		dev_warn_once(ctrl->device, "using unchecked data buffer\n");
-	if (has_metadata) {
-		if (!supports_metadata)
-			return -EINVAL;
-
-		if (!nvme_ctrl_meta_sgl_supported(ctrl))
-			dev_warn_once(ctrl->device,
-				      "using unchecked metadata buffer\n");
-	}
+	if (has_metadata && !supports_metadata)
+		return -EINVAL;
 
 	if (iter)
 		ret = blk_rq_map_user_iov(q, req, NULL, iter, GFP_KERNEL);
--



* Re: Non-SGL transport mode warnings are set to dev_warn_once will cause confusion
  2026-04-09 14:36 ` Keith Busch
@ 2026-04-09 15:49   ` AlanCui4080
  0 siblings, 0 replies; 3+ messages in thread
From: AlanCui4080 @ 2026-04-09 15:49 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-nvme

On Thursday, 9 April 2026 22:36, you wrote:
> Fine with me. The warning was added in response to people filing CVEs
> against the driver as a sort of acknowledgement that yeah, this
> interface can't validate transfer lengths under these conditions, so
> we're trusting the user isn't abusing it. A sort of nudge that perhaps
> controller vendors might consider supporting the safer option.

Yes. In my opinion, the device node can only be accessed with privilege by
default. Changing the permissions of /dev/nvmexxx to 0666 is as dangerous as
changing /etc/shadow to 0666. So there is nothing to really worry about: every
device on PCIe that can DMA is able to corrupt the kernel unless an IOMMU is
used.

And from what I saw, the patch at
https://lore.kernel.org/linux-nvme/20231013051458.39987-1-joshi.k@samsung.com/T/#m2a5f9fe3a53322ab67c1dd40d5a448405308ea4b
fixed this problem and makes it safe even if the user changes the permissions
to 0666.

> Anyway, it's fine with me to move the message and make it less scary.
> How about this:
> 
> ---
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index b42d8768d2979..b6aec0e3fbfb8 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -3744,6 +3744,10 @@ int nvme_init_ctrl_finish(struct nvme_ctrl *ctrl, bool was_suspended)
>  		ret = nvme_hwmon_init(ctrl);
>  		if (ret == -EINTR)
>  			return ret;
> +
> +		if (!nvme_ctrl_sgl_supported(ctrl))
> +			dev_info(ctrl->device,
> +				"passthrough uses implicit buffer lengths\n");
>  	}
> 
>  	clear_bit(NVME_CTRL_DIRTY_CAPABILITY, &ctrl->flags);
> --

Since the warning was a response to a CVE, and if, as I mentioned above, there
are already patches that prevent unprivileged users from corrupting the
kernel, then downgrading it to informational might be reasonable. In fact, the
CVE rating was downgraded once after the vulnerability was submitted, because
it is difficult to exploit.

What about saying "passthrough uses implicit and unchecked buffer lengths for
privileged users", which may be more descriptive, and adding a comment that
refers to the CVE number, like:
```
		/*
		 * See CVE-2023-6238: malformed commands from root users can
		 * overflow the buffer and corrupt the kernel.
		 */
		if (!nvme_ctrl_sgl_supported(ctrl))
			dev_info(ctrl->device,
				"passthrough uses implicit and unchecked buffer lengths for privileged users\n");
```

I recently went back to using a Linux desktop, and this is my first time using
Linux with NVMe drives. I feel that the nvme module is a bit too sensitive,
even becoming a major source of warnings in my dmesg, including not only this
one but also "missing or invalid SUBNQN field." And per
https://lwn.net/Articles/876209/, a warning indicates that the kernel cannot
handle a certain situation and is running in a degraded manner based on
certain assumptions, which may lead to unexpected behavior. I believe that a
good system administrator should review every kernel warning and make sure
that it is either negligible or that action has been taken to eliminate it. :)

Alan.





