public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] nvme: core: reject invalid LBA data size from Identify Namespace
@ 2026-04-18  4:28 Chao Shi
  2026-04-20 12:22 ` Daniel Wagner
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Chao Shi @ 2026-04-18  4:28 UTC (permalink / raw)
  To: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
	Daniel Wagner, Hannes Reinecke, linux-nvme, linux-kernel
  Cc: Chao Shi, Sungwoo Kim, Dave Tian, Weidong Zhu

nvme_update_ns_info_block() trusts id->lbaf[lbaf].ds from the
controller and assigns it directly to ns->head->lba_shift without
bounds checking.  nvme_lba_to_sect() then does:

    return lba << (head->lba_shift - SECTOR_SHIFT);

When called with lba = le64_to_cpu(id->nsze) to compute the device
capacity, an attacker-controlled controller can choose ds < 9 or a
combination of (ds, nsze) that makes the left shift overflow
sector_t.  The former is a C undefined behaviour that UBSAN reports
as a BUG; the latter silently yields a bogus capacity that the
block layer then trusts for bounds checking.

Validate ds against SECTOR_SHIFT and use check_shl_overflow() to
compute capacity so that any (ds, nsze) combination that would
overflow sector_t is rejected.  The namespace is skipped with -EIO
instead of crashing the kernel.  This is reachable by a malicious
NVMe device, a buggy firmware, or an attacker-controlled NVMe-oF
target.

Stack trace (UBSAN, ds < 9 variant):

  RIP: nvme_lba_to_sect drivers/nvme/host/nvme.h:699 [inline]
  RIP: nvme_update_ns_info_block.cold+0x5/0x7
  Call Trace:
   nvme_update_ns_info+0x175/0xd90 drivers/nvme/host/core.c:2467
   nvme_validate_ns drivers/nvme/host/core.c:4299 [inline]
   nvme_scan_ns drivers/nvme/host/core.c:4350
   nvme_scan_ns_async+0xa5/0xe0 drivers/nvme/host/core.c:4383
   async_run_entry_fn
   process_one_work
   worker_thread
   kthread

Found by Syzkaller.

Fixes: 9419e71b8d67 ("nvme: move ns id info to struct nvme_ns_head")
Acked-by: Sungwoo Kim <iam@sung-woo.kim>
Acked-by: Dave Tian <daveti@purdue.edu>
Acked-by: Weidong Zhu <weizhu@fiu.edu>
Signed-off-by: Chao Shi <coshi036@gmail.com>
---
 drivers/nvme/host/core.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 1e33af94c24..9b3bf3e4075 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2407,9 +2407,19 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 	lim = queue_limits_start_update(ns->disk->queue);
 
 	memflags = blk_mq_freeze_queue(ns->disk->queue);
+	if (id->lbaf[lbaf].ds < SECTOR_SHIFT ||
+	    check_shl_overflow(le64_to_cpu(id->nsze),
+			       id->lbaf[lbaf].ds - SECTOR_SHIFT,
+			       &capacity)) {
+		dev_warn_once(ns->ctrl->device,
+			"invalid LBA data size %u, skipping namespace\n",
+			id->lbaf[lbaf].ds);
+		ret = -EIO;
+		blk_mq_unfreeze_queue(ns->disk->queue, memflags);
+		goto out;
+	}
 	ns->head->lba_shift = id->lbaf[lbaf].ds;
 	ns->head->nuse = le64_to_cpu(id->nuse);
-	capacity = nvme_lba_to_sect(ns->head, le64_to_cpu(id->nsze));
 	nvme_set_ctrl_limits(ns->ctrl, &lim, false);
 	nvme_configure_metadata(ns->ctrl, ns->head, id, nvm, info);
 	nvme_set_chunk_sectors(ns, id, &lim);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] nvme: core: reject invalid LBA data size from Identify Namespace
  2026-04-18  4:28 [PATCH] nvme: core: reject invalid LBA data size from Identify Namespace Chao Shi
@ 2026-04-20 12:22 ` Daniel Wagner
  2026-04-20 14:54   ` Keith Busch
  2026-04-20 12:22 ` Maurizio Lombardi
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Daniel Wagner @ 2026-04-20 12:22 UTC (permalink / raw)
  To: Chao Shi
  Cc: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
	Hannes Reinecke, linux-nvme, linux-kernel, Sungwoo Kim, Dave Tian,
	Weidong Zhu

On Sat, Apr 18, 2026 at 12:28:34AM -0400, Chao Shi wrote:
> Fixes: 9419e71b8d67 ("nvme: move ns id info to struct nvme_ns_head")

This fixes tag is wrong. The above commit doesn't touch the code line
in question.

Probably a5b1cd61820e ("nvme: move a few things out of
nvme_update_disk_info") is the better choice here.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] nvme: core: reject invalid LBA data size from Identify Namespace
  2026-04-18  4:28 [PATCH] nvme: core: reject invalid LBA data size from Identify Namespace Chao Shi
  2026-04-20 12:22 ` Daniel Wagner
@ 2026-04-20 12:22 ` Maurizio Lombardi
  2026-04-20 14:51 ` Keith Busch
  2026-04-20 23:11 ` [PATCH v2 0/1] " Chao Shi
  3 siblings, 0 replies; 8+ messages in thread
From: Maurizio Lombardi @ 2026-04-20 12:22 UTC (permalink / raw)
  To: Chao Shi, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg, Daniel Wagner, Hannes Reinecke, linux-nvme,
	linux-kernel
  Cc: Sungwoo Kim, Dave Tian, Weidong Zhu

On Sat Apr 18, 2026 at 6:28 AM CEST, Chao Shi wrote:
> nvme_update_ns_info_block() trusts id->lbaf[lbaf].ds from the
> controller and assigns it directly to ns->head->lba_shift without
> bounds checking.  nvme_lba_to_sect() then does:
>
>     return lba << (head->lba_shift - SECTOR_SHIFT);
>
> When called with lba = le64_to_cpu(id->nsze) to compute the device
> capacity, an attacker-controlled controller can choose ds < 9 or a
> combination of (ds, nsze) that makes the left shift overflow
> sector_t.  The former is a C undefined behaviour that UBSAN reports
> as a BUG; the latter silently yields a bogus capacity that the
> block layer then trusts for bounds checking.
>
> Validate ds against SECTOR_SHIFT and use check_shl_overflow() to
> compute capacity so that any (ds, nsze) combination that would
> overflow sector_t is rejected.  The namespace is skipped with -EIO
> instead of crashing the kernel.  This is reachable by a malicious
> NVMe device, a buggy firmware, or an attacker-controlled NVMe-oF
> target.
>
> Stack trace (UBSAN, ds < 9 variant):
>
>   RIP: nvme_lba_to_sect drivers/nvme/host/nvme.h:699 [inline]
>   RIP: nvme_update_ns_info_block.cold+0x5/0x7
>   Call Trace:
>    nvme_update_ns_info+0x175/0xd90 drivers/nvme/host/core.c:2467
>    nvme_validate_ns drivers/nvme/host/core.c:4299 [inline]
>    nvme_scan_ns drivers/nvme/host/core.c:4350
>    nvme_scan_ns_async+0xa5/0xe0 drivers/nvme/host/core.c:4383
>    async_run_entry_fn
>    process_one_work
>    worker_thread
>    kthread
>
> Found by Syzkaller.
>
> Fixes: 9419e71b8d67 ("nvme: move ns id info to struct nvme_ns_head")
> Acked-by: Sungwoo Kim <iam@sung-woo.kim>
> Acked-by: Dave Tian <daveti@purdue.edu>
> Acked-by: Weidong Zhu <weizhu@fiu.edu>
> Signed-off-by: Chao Shi <coshi036@gmail.com>
> ---
>  drivers/nvme/host/core.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 1e33af94c24..9b3bf3e4075 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2407,9 +2407,19 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
>  	lim = queue_limits_start_update(ns->disk->queue);
>  
>  	memflags = blk_mq_freeze_queue(ns->disk->queue);
> +	if (id->lbaf[lbaf].ds < SECTOR_SHIFT ||
> +	    check_shl_overflow(le64_to_cpu(id->nsze),
> +			       id->lbaf[lbaf].ds - SECTOR_SHIFT,
> +			       &capacity)) {
> +		dev_warn_once(ns->ctrl->device,
> +			"invalid LBA data size %u, skipping namespace\n",
> +			id->lbaf[lbaf].ds);

Just a nit:

If I'm reading the NVMe spec correctly, ds == 0 has a special meaning:
'LBA format is not currently available.'
maybe we should use a different dev_warn() for ds == 0 ?

Maurizio


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] nvme: core: reject invalid LBA data size from Identify Namespace
  2026-04-18  4:28 [PATCH] nvme: core: reject invalid LBA data size from Identify Namespace Chao Shi
  2026-04-20 12:22 ` Daniel Wagner
  2026-04-20 12:22 ` Maurizio Lombardi
@ 2026-04-20 14:51 ` Keith Busch
  2026-04-20 23:11 ` [PATCH v2 0/1] " Chao Shi
  3 siblings, 0 replies; 8+ messages in thread
From: Keith Busch @ 2026-04-20 14:51 UTC (permalink / raw)
  To: Chao Shi
  Cc: Jens Axboe, Christoph Hellwig, Sagi Grimberg, Daniel Wagner,
	Hannes Reinecke, linux-nvme, linux-kernel, Sungwoo Kim, Dave Tian,
	Weidong Zhu

On Sat, Apr 18, 2026 at 12:28:34AM -0400, Chao Shi wrote:
>  	memflags = blk_mq_freeze_queue(ns->disk->queue);
> +	if (id->lbaf[lbaf].ds < SECTOR_SHIFT ||
> +	    check_shl_overflow(le64_to_cpu(id->nsze),
> +			       id->lbaf[lbaf].ds - SECTOR_SHIFT,
> +			       &capacity)) {
> +		dev_warn_once(ns->ctrl->device,
> +			"invalid LBA data size %u, skipping namespace\n",
> +			id->lbaf[lbaf].ds);
> +		ret = -EIO;

I think ENODEV is more appropriate errno.

> +		blk_mq_unfreeze_queue(ns->disk->queue, memflags);
> +		goto out;

I don't see any particular reason why we shouldn't validate this value
before starting the queue updates and freezing the queue, like we for
the ncap field up higher. Doing that would make the error case much
simpler. Case in point, you're missing the corresponding
queue_limits_cancel_update() for this error case.

> +	}
>  	ns->head->lba_shift = id->lbaf[lbaf].ds;
>  	ns->head->nuse = le64_to_cpu(id->nuse);
> -	capacity = nvme_lba_to_sect(ns->head, le64_to_cpu(id->nsze));
>  	nvme_set_ctrl_limits(ns->ctrl, &lim, false);
>  	nvme_configure_metadata(ns->ctrl, ns->head, id, nvm, info);
>  	nvme_set_chunk_sectors(ns, id, &lim);
> -- 
> 2.43.0
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] nvme: core: reject invalid LBA data size from Identify Namespace
  2026-04-20 12:22 ` Daniel Wagner
@ 2026-04-20 14:54   ` Keith Busch
  0 siblings, 0 replies; 8+ messages in thread
From: Keith Busch @ 2026-04-20 14:54 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Chao Shi, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
	Hannes Reinecke, linux-nvme, linux-kernel, Sungwoo Kim, Dave Tian,
	Weidong Zhu

On Mon, Apr 20, 2026 at 02:22:04PM +0200, Daniel Wagner wrote:
> On Sat, Apr 19, 2026 at 12:28:34AM -0400, Chao Shi wrote:
> > Fixes: 9419e71b8d67 ("nvme: move ns id info to struct nvme_ns_head")
> 
> This fixes tag is wrong. The above commit doesn't touch the code line
> in question.
> 
> Probably a5b1cd61820e ("nvme: move a few things out of
> nvme_update_disk_info") is the better choice here.

We've never validated the 'ds' field though, so not sure anything
introduced a regression here that warrants a fixing tag.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 0/1] nvme: core: reject invalid LBA data size from Identify Namespace
  2026-04-18  4:28 [PATCH] nvme: core: reject invalid LBA data size from Identify Namespace Chao Shi
                   ` (2 preceding siblings ...)
  2026-04-20 14:51 ` Keith Busch
@ 2026-04-20 23:11 ` Chao Shi
  2026-04-20 23:11   ` [PATCH v2 1/1] " Chao Shi
  2026-04-21  0:25   ` [PATCH v2 0/1] " Keith Busch
  3 siblings, 2 replies; 8+ messages in thread
From: Chao Shi @ 2026-04-20 23:11 UTC (permalink / raw)
  To: Keith Busch, Jens Axboe, linux-nvme, linux-kernel
  Cc: Daniel Wagner, Maurizio Lombardi, Chao Shi

Thanks for sharing comments! I did such changes in v2:
- Remove Fixes tag(Keith Busch: ds has never been validated)
- Change -EIO to -ENODEV (Keith Busch)
- Add missing "queue_limits_cancel_update()" in error path (Keith Busch)
- Add warning for ds == 0 (LBA format not available) (Maurizio Lombardi)

---
On Mon, Apr 20, 2026 at 02:51:00PM -0700, Keith Busch wrote:
> I think ENODEV is more appropriate errno.

Fixed in v2.

> you're missing the corresponding queue_limits_cancel_update() for this
> error case.

Added in both error paths in v2.

> This fixes tag is wrong.

Removed.

On Mon, Apr 20, 2026 at 02:22:04PM +0200, Daniel Wagner wrote:
> This fixes tag is wrong. Probably a5b1cd61820e is the better choice here.

Keith pointed out that ds has never been validated so no commit
introduced a regression — removed the Fixes tag entirely.

On Mon, Apr 20, 2026 at 02:22:04PM +0200, Maurizio Lombardi wrote:
> ds == 0 has a special meaning: 'LBA format is not currently available.'

Thanks,added dev_warn_once() for ds==0 in v2.

Chao Shi (1):
  nvme: core: reject invalid LBA data size from Identify Namespace

 drivers/nvme/host/core.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/1] nvme: core: reject invalid LBA data size from Identify Namespace
  2026-04-20 23:11 ` [PATCH v2 0/1] " Chao Shi
@ 2026-04-20 23:11   ` Chao Shi
  2026-04-21  0:25   ` [PATCH v2 0/1] " Keith Busch
  1 sibling, 0 replies; 8+ messages in thread
From: Chao Shi @ 2026-04-20 23:11 UTC (permalink / raw)
  To: Keith Busch, Jens Axboe, linux-nvme, linux-kernel
  Cc: Daniel Wagner, Maurizio Lombardi, Chao Shi, Sungwoo Kim,
	Dave Tian, Weidong Zhu

nvme_update_ns_info_block() trusts id->lbaf[lbaf].ds from the
controller and assigns it directly to ns->head->lba_shift without
bounds checking.  nvme_lba_to_sect() then does:

    return lba << (head->lba_shift - SECTOR_SHIFT);

When called with lba = le64_to_cpu(id->nsze) to compute the device
capacity, an attacker-controlled controller can choose ds < 9 or a
combination of (ds, nsze) that makes the left shift overflow
sector_t.  The former is a C undefined behaviour that UBSAN reports
as a BUG; the latter silently yields a bogus capacity that the
block layer then trusts for bounds checking.

Validate ds against SECTOR_SHIFT and use check_shl_overflow() to
compute capacity so that any (ds, nsze) combination that would
overflow sector_t is rejected.  The namespace is skipped with -ENODEV
instead of crashing the kernel.  This is reachable by a malicious
NVMe device, a buggy firmware, or an attacker-controlled NVMe-oF
target.

Stack trace (UBSAN, ds < 9 variant):

  RIP: nvme_lba_to_sect drivers/nvme/host/nvme.h:699 [inline]
  RIP: nvme_update_ns_info_block.cold+0x5/0x7
  Call Trace:
   nvme_update_ns_info+0x175/0xd90 drivers/nvme/host/core.c:2467
   nvme_validate_ns drivers/nvme/host/core.c:4299 [inline]
   nvme_scan_ns drivers/nvme/host/core.c:4350
   nvme_scan_ns_async+0xa5/0xe0 drivers/nvme/host/core.c:4383
   async_run_entry_fn
   process_one_work
   worker_thread
   kthread

Found by Syzkaller.

Acked-by: Sungwoo Kim <iam@sung-woo.kim>
Acked-by: Dave Tian <daveti@purdue.edu>
Acked-by: Weidong Zhu <weizhu@fiu.edu>
Signed-off-by: Chao Shi <coshi036@gmail.com>
---
 drivers/nvme/host/core.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 1e33af94c24..d1711ef59fb 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2407,9 +2407,28 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 	lim = queue_limits_start_update(ns->disk->queue);
 
 	memflags = blk_mq_freeze_queue(ns->disk->queue);
+	if (id->lbaf[lbaf].ds == 0) {
+		dev_warn_once(ns->ctrl->device,
+			"LBA format not available, skipping namespace\n");
+		ret = -ENODEV;
+		queue_limits_cancel_update(ns->disk->queue);
+		blk_mq_unfreeze_queue(ns->disk->queue, memflags);
+		goto out;
+	}
+	if (id->lbaf[lbaf].ds < SECTOR_SHIFT ||
+	    check_shl_overflow(le64_to_cpu(id->nsze),
+			       id->lbaf[lbaf].ds - SECTOR_SHIFT,
+			       &capacity)) {
+		dev_warn_once(ns->ctrl->device,
+			"invalid LBA data size %u, skipping namespace\n",
+			id->lbaf[lbaf].ds);
+		ret = -ENODEV;
+		queue_limits_cancel_update(ns->disk->queue);
+		blk_mq_unfreeze_queue(ns->disk->queue, memflags);
+		goto out;
+	}
 	ns->head->lba_shift = id->lbaf[lbaf].ds;
 	ns->head->nuse = le64_to_cpu(id->nuse);
-	capacity = nvme_lba_to_sect(ns->head, le64_to_cpu(id->nsze));
 	nvme_set_ctrl_limits(ns->ctrl, &lim, false);
 	nvme_configure_metadata(ns->ctrl, ns->head, id, nvm, info);
 	nvme_set_chunk_sectors(ns, id, &lim);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 0/1] nvme: core: reject invalid LBA data size from Identify Namespace
  2026-04-20 23:11 ` [PATCH v2 0/1] " Chao Shi
  2026-04-20 23:11   ` [PATCH v2 1/1] " Chao Shi
@ 2026-04-21  0:25   ` Keith Busch
  1 sibling, 0 replies; 8+ messages in thread
From: Keith Busch @ 2026-04-21  0:25 UTC (permalink / raw)
  To: Chao Shi
  Cc: Jens Axboe, linux-nvme, linux-kernel, Daniel Wagner,
	Maurizio Lombardi

On Mon, Apr 20, 2026 at 07:11:13PM -0400, Chao Shi wrote:
> On Mon, Apr 20, 2026 at 02:51:00PM -0700, Keith Busch wrote:
> 
> Fixed in v2.
> 
> > you're missing the corresponding queue_limits_cancel_update() for this
> > error case.
> 
> Added in both error paths in v2.

Not quite what I suggested. Instead, if you move up the validation
checks, you don't have any cleanup in the error case to do.

And don't bother splitting the error cases. The message about the "0"
LBADS is in reference to the the broadcast identification. It's never
valid for a controller to return an unusable flbas index value for an
attached namespace.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2026-04-21  0:25 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-18  4:28 [PATCH] nvme: core: reject invalid LBA data size from Identify Namespace Chao Shi
2026-04-20 12:22 ` Daniel Wagner
2026-04-20 14:54   ` Keith Busch
2026-04-20 12:22 ` Maurizio Lombardi
2026-04-20 14:51 ` Keith Busch
2026-04-20 23:11 ` [PATCH v2 0/1] " Chao Shi
2026-04-20 23:11   ` [PATCH v2 1/1] " Chao Shi
2026-04-21  0:25   ` [PATCH v2 0/1] " Keith Busch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox