* Non-SGL transport mode warnings are set to dev_warn_once will cause confusion
@ 2026-04-09 8:46 AlanCui4080
2026-04-09 14:36 ` Keith Busch
0 siblings, 1 reply; 3+ messages in thread
From: AlanCui4080 @ 2026-04-09 8:46 UTC (permalink / raw)
To: linux-nvme
Hi,
[ 49.108397] nvme nvme0: using unchecked data buffer
See 6fad84a (nvme-pci: use sgls for all user requests if possible). In the
kernel, those warnings are printed using `dev warn once`. This means that if
multiple devices in the system do not support SGLs (most consumer-grade
devices do not support them), only one warning for only one device will be
printed.
This asymmetry can be misleading to users. If all devices in the system report
the same issue, it might not be a problem, but if only one device reports it,
it might (especially since I have two identical drives). Is it possible to
move this warning to the device initialization phase so print it for each
device? Or, since we cannot resolve the issue of consumer-grade devices not
supporting SGL, should it be downgraded to an informational warning?
Alan.
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: Non-SGL transport mode warnings are set to dev_warn_once will cause confusion
2026-04-09 8:46 Non-SGL transport mode warnings are set to dev_warn_once will cause confusion AlanCui4080
@ 2026-04-09 14:36 ` Keith Busch
2026-04-09 15:49 ` AlanCui4080
0 siblings, 1 reply; 3+ messages in thread
From: Keith Busch @ 2026-04-09 14:36 UTC (permalink / raw)
To: AlanCui4080; +Cc: linux-nvme
On Thu, Apr 09, 2026 at 04:46:37PM +0800, AlanCui4080 wrote:
> See 6fad84a (nvme-pci: use sgls for all user requests if possible). In the
> kernel, those warnings are printed using `dev warn once`. This means that if
> multiple devices in the system do not support SGLs (most consumer-grade
> devices do not support them), only one warning for only one device will be
> printed.
>
> This asymmetry can be misleading to users. If all devices in the system report
> the same issue, it might not be a problem, but if only one device reports it,
> it might (especially since I have two identical drives). Is it possible to
> move this warning to the device initialization phase so print it for each
> device? Or, since we cannot resolve the issue of consumer-grade devices not
> supporting SGL, should it be downgraded to an informational warning?
Fine with me. The warning was added in response to people filing CVE's
against the driver as a sort of acknowledgement that yeah, this
interface can't validate transfer lengths under these conditions, so
we're trusting the user isn't abusing it. A sort of nudge that perhaps
controller vendors might consider supporting the safer option.
Anyway, it's fine with me to move the message and make it less scary.
How about this:
---
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index b42d8768d2979..b6aec0e3fbfb8 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3744,6 +3744,10 @@ int nvme_init_ctrl_finish(struct nvme_ctrl *ctrl, bool was_suspended)
ret = nvme_hwmon_init(ctrl);
if (ret == -EINTR)
return ret;
+
+ if (!nvme_ctrl_sgl_supported(ctrl))
+ dev_info(ctrl->device,
+ "passthrough uses implicit buffer lengths\n");
}
clear_bit(NVME_CTRL_DIRTY_CAPABILITY, &ctrl->flags);
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 8844bbd395159..e9eecdd54d5ed 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -125,16 +125,8 @@ static int nvme_map_user_request(struct request *req, u64 ubuffer,
struct bio *bio = NULL;
int ret;
- if (!nvme_ctrl_sgl_supported(ctrl))
- dev_warn_once(ctrl->device, "using unchecked data buffer\n");
- if (has_metadata) {
- if (!supports_metadata)
- return -EINVAL;
-
- if (!nvme_ctrl_meta_sgl_supported(ctrl))
- dev_warn_once(ctrl->device,
- "using unchecked metadata buffer\n");
- }
+ if (has_metadata && !supports_metadata)
+ return -EINVAL;
if (iter)
ret = blk_rq_map_user_iov(q, req, NULL, iter, GFP_KERNEL);
--
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: Non-SGL transport mode warnings are set to dev_warn_once will cause confusion
2026-04-09 14:36 ` Keith Busch
@ 2026-04-09 15:49 ` AlanCui4080
0 siblings, 0 replies; 3+ messages in thread
From: AlanCui4080 @ 2026-04-09 15:49 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-nvme
On Thursday, 9 April 2026 22:36,you wrote:
> Fine with me. The warning was added in response to people filing CVE's
> against the driver as a sort of acknowledgement that yeah, this
> interface can't validate transfer lengths under these conditions, so
> we're trusting the user isn't abusing it. A sort of nudge that perhaps
> controller vendors might consider supporting the safer option.
Yes, in my opnion, the device node can be only access with privilege by
default. Change the premission of /dev/nvmexxx to 0666 is as dangerous as
change /etc/shadow to 0666. So, that's nothing really to worry, every device
on PCI-E that can DMA will able to corrupt the kernel unless IOMMU is used.
And as what i saw, the https://lore.kernel.org/linux-nvme/
20231013051458.39987-1-joshi.k@samsung.com/T/
#m2a5f9fe3a53322ab67c1dd40d5a448405308ea4b fixed this problem and make it's
safe even the user changed the premission to 0666.
> Anyway, it's fine with me to move the message and make it less scary.
> How about this:
>
> ---
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index b42d8768d2979..b6aec0e3fbfb8 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -3744,6 +3744,10 @@ int nvme_init_ctrl_finish(struct nvme_ctrl *ctrl,
> bool was_suspended) ret = nvme_hwmon_init(ctrl);
> if (ret == -EINTR)
> return ret;
> +
> + if (!nvme_ctrl_sgl_supported(ctrl))
> + dev_info(ctrl->device,
> + "passthrough uses implicit buffer
lengths\n");
> }
>
> clear_bit(NVME_CTRL_DIRTY_CAPABILITY, &ctrl->flags);
> --
Since it was a response to a CVE, and if, as I mentioned above, there are
already patches to prevent unprivileged users from corrupting the kernel, then
downgrading it to informational might be reasonable? In fact, the CVE rating
was downgraded once after this vulnerability was submitted, due to the
difficulty of exploiting it.
What about saying "passthrough uses implicit and unchecked buffer lengths for
privilege user" which may be more descriptive, and add comment which refers to
the CVE number like:
```
/* See CVE-2023-6238, malformed commands from root users
can overflow the buffer and corrupt the kernel */
if (!nvme_ctrl_sgl_supported(ctrl))
dev_info(ctrl->device,
"passthrough uses implicit and
unchecked buffer lengths for privilege user");
```
I recently started back using a Linux desktop again, and this is at least my
first time using Linux with NVMe drives. I feel that nvme module is a bit too
sensitive, even becoming a major source of warnings in my dmesg, including not
only this one, but also "missing or invalid SUBNQN field." And as https://
lwn.net/Articles/876209/, A warning indicates that the kernel cannot handle a
certain situation and is running in a degraded manner based on certain
assumptions, which may lead to unexpected situations.I believe that a good
system administrator should review and ensure that each kernel warnings are
either negligible or an action has been taken to eliminate them. :)
Alan.
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2026-04-09 15:50 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-09 8:46 Non-SGL transport mode warnings are set to dev_warn_once will cause confusion AlanCui4080
2026-04-09 14:36 ` Keith Busch
2026-04-09 15:49 ` AlanCui4080
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox