Linux block layer
* [REPORT] nvmet-rdma: integer overflow in inline-data SGL bounds check -> pre-auth kernel-memory read + remote crash (candidate patch inline)
From: hexlabsecurity @ 2026-05-29  6:52 UTC (permalink / raw)
  To: security@kernel.org
  Cc: hch@lst.de, sagi@grimberg.me, kbusch@kernel.org, kch@nvidia.com,
	linux-nvme@lists.infradead.org, linux-rdma@vger.kernel.org,
	linux-block@vger.kernel.org

Hello,

I would like to report an integer-overflow vulnerability in the NVMe-oF
RDMA target (drivers/nvme/target/rdma.c).  The inline-data SGL bounds
check in nvmet_rdma_map_sgl_inline() is computed in u64 over two
host-controlled values and wraps, which a remote fabric peer can use
both to read kernel memory back over the fabric and to crash the target.

== Affected ==

  drivers/nvme/target/rdma.c, nvmet_rdma_map_sgl_inline()

  Verified present on the current mainline tree (commit 27fa82620cba,
  ~v7.1-rc5), at the bounds check:

    static u16 nvmet_rdma_map_sgl_inline(struct nvmet_rdma_rsp *rsp)
    {
        struct nvme_sgl_desc *sgl = &rsp->req.cmd->common.dptr.sgl;
        u64 off = le64_to_cpu(sgl->addr);     /* host-controlled, 64-bit */
        u32 len = le32_to_cpu(sgl->length);   /* host-controlled, 32-bit */
        ...
        if (off + len > rsp->queue->dev->inline_data_size) {   /* u64 wrap */
            pr_err("invalid inline data offset!\n");
            return NVME_SC_SGL_INVALID_OFFSET | NVME_STATUS_DNR;
        }
        ...
        nvmet_rdma_use_inline_sg(rsp, len, off);
    }

  "off + len" is evaluated in u64 and wraps modulo 2^64.  For example
  addr = 0xfffffffffffffe00, length = 0x1000 makes the sum wrap to
  0xe00, which is <= inline_data_size (default PAGE_SIZE), so the check
  passes.  The current check form (against the per-port inline_data_size)
  and the fixed-size inline_sg[NVMET_RDMA_MAX_INLINE_SGE] array with the
  num_pages(len) loop were introduced together by commit 0d5ee2b2ab4f
  ("nvmet-rdma: support max(16KB, PAGE_SIZE) inline data"), which is the
  Fixes: I used.  Note: the single-page inline path that predates that
  commit may have an analogous u64-overflow read in a different code
  shape; I would appreciate the maintainers' judgement on whether the
  stable backport scope should reach before that commit.
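  The wrap described above can be reproduced in userspace.  This is a
  minimal sketch, not the kernel code: old_check_accepts() and
  wrapped_sum() are hypothetical names, and inline_data_size is modelled
  as a plain uint32_t parameter.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Same shape as the unfixed bounds check. Returns true when the
 * descriptor would be accepted.
 */
bool old_check_accepts(uint64_t off, uint32_t len, uint32_t inline_data_size)
{
	/* len is converted to 64 bits; the unsigned sum wraps modulo 2^64 */
	return !(off + len > inline_data_size);
}

/* The wrapped sum that the comparison actually sees. */
uint64_t wrapped_sum(uint64_t off, uint32_t len)
{
	return off + len;
}
```

  With inline_data_size = 4096, old_check_accepts(0xfffffffffffffe00ULL,
  0x1000, 4096) is true because wrapped_sum() yields 0xe00, while an
  honest out-of-range descriptor such as off = 0x1000, len = 1 is still
  rejected.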

== Two consequences of the bypass ==

  1. Kernel-memory read (information disclosure).
     nvmet_rdma_use_inline_sg() does "sg->offset = off", truncating the
     64-bit offset to scatterlist::offset (unsigned int).  The block
     layer then accesses page_to_phys(inline_page) + (off & 0xffffffff),
     so the target reads up to inline_data_size bytes of kernel memory
     per write command and returns them to the host on read-back, or
     faults the in-kernel copy if the offset lands on unmapped memory.

  2. Kernel-memory corruption -> remote crash (denial of service).
     A large length makes "sg_count = num_pages(len)" in
     nvmet_rdma_use_inline_sg() exceed NVMET_RDMA_MAX_INLINE_SGE (4), so
     the loop writes scatterlist entries past the fixed-size inline_sg[]
     array, corrupting the surrounding command object.
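  Both consequences come down to simple arithmetic, sketched here in
  userspace.  The helper names are hypothetical, demo_num_pages() is a
  ceiling-division stand-in for the driver's num_pages() rather than a
  copy of it, and a 4 KiB page size and 32-bit unsigned int are assumed.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE		4096u	/* assumption: 4 KiB pages */
#define DEMO_MAX_INLINE_SGE	4u	/* NVMET_RDMA_MAX_INLINE_SGE as quoted above */

/* What survives when a 64-bit offset is stored into an unsigned int field. */
unsigned int truncated_sg_offset(uint64_t off)
{
	return (unsigned int)off;	/* keeps only the low 32 bits */
}

/* Stand-in for num_pages(): pages needed to cover len bytes. */
unsigned int demo_num_pages(uint32_t len)
{
	/* widen first so len + DEMO_PAGE_SIZE - 1 cannot wrap in 32 bits */
	return (unsigned int)(((uint64_t)len + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE);
}

/* Scatterlist entries written beyond a DEMO_MAX_INLINE_SGE-entry array. */
unsigned int entries_past_end(uint32_t len)
{
	unsigned int n = demo_num_pages(len);

	return n > DEMO_MAX_INLINE_SGE ? n - DEMO_MAX_INLINE_SGE : 0;
}
```

  For addr = 0xfffffffffffffe00 the truncated offset is 0xfffffe00, and
  for length = 0x10000 the loop would write 16 entries, 12 of them past
  the end of a 4-entry array.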

== Reachability ==

  The path is reached by any write command carrying an inline SGL, i.e.
  after a Fabrics Connect.  On a subsystem configured with
  attr_allow_any_host=1 it is reachable WITHOUT authentication by any
  RDMA peer (RoCE/iWARP/IB) that can reach the target's listener.  With
  DH-CHAP configured, or with attr_allow_any_host=0, the peer must first
  present a valid, known host NQN.

== Empirical reproduction ==

  Reproduced against a stock nvmet-rdma target over a soft-iWARP (siw)
  fabric on a Linux 6.12.90 build with KASAN (KASAN_INLINE):

  - Read: a single write command with addr = 0xfffffffffffffe00,
    length = 0x1000 produced a KASAN out-of-bounds read and returned
    ~4 KiB of kernel memory (including kernel .text) into the
    attacker-readable namespace.

  - Crash: a write command with addr = 0xffffffffffff0500,
    length = 0x10000 (sum wraps to 0x500 <= inline_data_size, but
    num_pages(0x10000) = 16 writes 16 scatterlist entries into the
    4-entry inline_sg[], 12 past its end) deterministically corrupted
    the command object and oopsed the target:

      Oops: general protection fault [...] KASAN: null-ptr-deref
      RIP: nvmet_rdma_post_recv+0x... [nvmet_rdma]
        nvmet_rdma_post_recv <- nvmet_rdma_queue_response
        <- __nvmet_req_complete <- nvmet_check_transfer_len
        <- nvmet_rdma_handle_command <- ib_cq_poll_work

    Every reconnect re-triggers it (persistent remote DoS).  The
    nvmet_rdma_cmd objects are carved from one contiguous kcalloc'd
    array, so the over-long entry write stays within that allocation and
    KASAN flags the downstream dereference of the corrupted command in
    nvmet_rdma_post_recv rather than the store itself.  The out-of-bounds
    content is not attacker-controlled, so this is a crash/corruption
    primitive, not a controlled write; I do not see a path to remote code
    execution from this bug.

== Severity estimate ==

  The two consequences arise from different inline-SGL capsules (small vs
  large length) and are scored as separate single-capsule outcomes, not
  one combined vector:

    OOB read  (info-disclosure):  CVSS 7.5 HIGH
        CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
    OOB write (corruption/DoS):   CVSS 8.2 HIGH
        CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:L/A:H

  Headline 8.2 HIGH (both reachable pre-auth with attr_allow_any_host=1).
  With attr_allow_any_host=0 a valid host NQN is required first (PR:L),
  lowering these to 6.5 and 7.1.
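  The four scores can be recomputed from the CVSS v3.1 base-score
  formula for Scope: Unchanged.  This is a sketch restricted to the
  AV:N/AC:L/UI:N/S:U vectors quoted above; the function and macro names
  are hypothetical, the weights are the published v3.1 metric values,
  and scores are returned in tenths to keep comparisons in integers.

```c
/* CVSS v3.1 metric weights needed for the vectors above (Scope: Unchanged). */
#define AV_NETWORK	0.85
#define AC_LOW		0.77
#define PR_NONE		0.85
#define PR_LOW		0.62
#define UI_NONE		0.85
#define CIA_HIGH	0.56
#define CIA_LOW		0.22
#define CIA_NONE	0.0

/* CVSS v3.1 "Roundup": smallest one-decimal value >= x, returned in tenths. */
int roundup_tenths(double x)
{
	long scaled = (long)(x * 100000.0 + 0.5);	/* round to 5 decimals */

	if (scaled % 10000 == 0)
		return (int)(scaled / 10000);
	return (int)(scaled / 10000) + 1;
}

/* Base score in tenths for AV:N/AC:L/UI:N/S:U with the given PR, C, I, A. */
int base_score_tenths(double pr, double c, double i, double a)
{
	double iss = 1.0 - (1.0 - c) * (1.0 - i) * (1.0 - a);
	double impact = 6.42 * iss;
	double exploitability = 8.22 * AV_NETWORK * AC_LOW * pr * UI_NONE;
	double sum = impact + exploitability;

	if (impact <= 0.0)
		return 0;
	return roundup_tenths(sum < 10.0 ? sum : 10.0);
}
```

  base_score_tenths(PR_NONE, CIA_HIGH, CIA_NONE, CIA_NONE) gives 75 and
  base_score_tenths(PR_NONE, CIA_NONE, CIA_LOW, CIA_HIGH) gives 82;
  substituting PR_LOW gives 65 and 71.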

== Suggested fix ==

  Validate the offset with check_add_overflow() before comparing against
  inline_data_size.  A passing check then guarantees
  off + len <= inline_data_size <= NVMET_RDMA_MAX_INLINE_DATA_SIZE, which
  bounds both the truncated scatterlist::offset and
  num_pages(len) <= NVMET_RDMA_MAX_INLINE_SGE, closing the read and the
  inline_sg[] overflow together.  Candidate patch inline below (applies
  to current mainline).
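  A userspace sketch of the proposed check, assuming a GCC- or
  Clang-compatible compiler: __builtin_add_overflow() stands in for the
  kernel's check_add_overflow(), and fixed_check_rejects() is a
  hypothetical name with inline_data_size modelled as a uint32_t
  parameter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the descriptor must be rejected. */
bool fixed_check_rejects(uint64_t off, uint32_t len, uint32_t inline_data_size)
{
	uint64_t bound;

	/* the builtin returns true when off + len does not fit in bound */
	return __builtin_add_overflow(off, (uint64_t)len, &bound) ||
	       bound > inline_data_size;
}
```

  Both capsules from this report are rejected by this form, while a
  descriptor that exactly fills the inline buffer (off = 0,
  len = inline_data_size) is still accepted.  Any accepted descriptor
  has off <= inline_data_size, so truncating off to 32 bits no longer
  changes its value.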

== Embargo ==

  I am happy to follow the standard process.  Proposing a 7-day embargo;
  the fix is small and I can adjust as the maintainers prefer.  I have
  not notified linux-distros and will hold that until a public patch
  lands, per the usual guidance.

I am an independent security researcher; please credit
"Bryam Vargas <hexlabsecurity@proton.me>" (Reported-by already in the
patch).  Affiliation: HEXLAB SAS (registration pending) -- Cali,
Colombia.  Happy to provide the full reproduction harness on request.

Thank you,
Bryam Vargas

----- candidate patch (inline, plain text) -----

From 448c122c744430c1c2926d635855a3894370ee33 Mon Sep 17 00:00:00 2001
From: Bryam Vargas <hexlabsecurity@proton.me>
Date: Thu, 28 May 2026 21:23:52 -0500
Subject: [PATCH] nvmet-rdma: fix integer overflow in inline data SGL bounds
 check

nvmet_rdma_map_sgl_inline() bounds-checks the inline data descriptor
with both operands host-controlled and the sum evaluated in u64:

	u64 off = le64_to_cpu(sgl->addr);
	u32 len = le32_to_cpu(sgl->length);
	...
	if (off + len > rsp->queue->dev->inline_data_size)
		return NVME_SC_SGL_INVALID_OFFSET | NVME_STATUS_DNR;

"off + len" therefore wraps modulo 2^64.  A descriptor with, for
example, addr = 0xfffffffffffffe00 and length = 0x1000 makes the sum
wrap to 0xe00, which passes the inline_data_size check.  An inline-SGL
write command reaches this path after a Fabrics Connect; on a subsystem
with attr_allow_any_host set it is reachable without authentication by
any peer that can reach the target.

Two distinct out-of-bounds accesses follow from the bypass:

 - nvmet_rdma_use_inline_sg() stores the 64-bit offset into
   scatterlist::offset, which is unsigned int, committing the truncated
   attacker offset to the inline page.  The block layer then accesses
   page_to_phys(inline_page) + (off & 0xffffffff), reading up to
   inline_data_size bytes of kernel memory per command back to the host
   (or faulting the target if the offset lands on unmapped memory).

 - A large len makes sg_count = num_pages(len) in
   nvmet_rdma_use_inline_sg() exceed NVMET_RDMA_MAX_INLINE_SGE, so the
   loop writes scatterlist entries past the fixed-size inline_sg[]
   array, corrupting the surrounding command object and oopsing the
   target on the next use of that command.

Validate the offset with check_add_overflow() before comparing against
inline_data_size.  A passing check then guarantees
off + len <= inline_data_size <= NVMET_RDMA_MAX_INLINE_DATA_SIZE, which
bounds both the truncated scatterlist::offset and
num_pages(len) <= NVMET_RDMA_MAX_INLINE_SGE, closing the out-of-bounds
read and the inline_sg[] overflow together.

Reported-by: Bryam Vargas <hexlabsecurity@proton.me>
Fixes: 0d5ee2b2ab4f ("nvmet-rdma: support max(16KB, PAGE_SIZE) inline data")
Cc: stable@vger.kernel.org
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
Review context (not for the commit log):

Reproducer -- unprivileged remote RDMA peer against a target with
attr_allow_any_host=1, a single inline-SGL WRITE capsule:
  * OOB read:  sgl->addr=0xfffffffffffffe00, sgl->length=0x1000
               (off+len wraps to 0xe00 <= inline_data_size; sg->offset
               truncates to 0xfffffe00) -> ~4 KiB of kernel memory is
               read back from the namespace.
  * OOB write: sgl->addr=0xffffffffffff0500, sgl->length=0x10000
               (num_pages(0x10000)=16 overruns the 4-entry inline_sg[])
               -> target memory corruption / crash.

A/B-tested on a 6.12.90 KASAN lab kernel (same .config, only this hunk
differs): pre-fix the OOB-read capsule trips "KASAN: use-after-free in
copy_page_from_iter_atomic" via nvmet_file_execute_io; post-fix both
capsules are rejected with "invalid inline data offset!"
(NVME_SC_SGL_INVALID_OFFSET), benign inline writes still succeed, and no
KASAN/oops fires. The fix decides identically in 32- and 64-bit builds
(check_add_overflow operates on u64).

 drivers/nvme/target/rdma.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index e6e2c3f9afdf..a5bbf9d41c3b 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -12,6 +12,7 @@
 #include <linux/init.h>
 #include <linux/module.h>
 #include <linux/nvme.h>
+#include <linux/overflow.h>
 #include <linux/slab.h>
 #include <linux/string.h>
 #include <linux/wait.h>
@@ -847,6 +848,7 @@ static u16 nvmet_rdma_map_sgl_inline(struct nvmet_rdma_rsp *rsp)
 	struct nvme_sgl_desc *sgl = &rsp->req.cmd->common.dptr.sgl;
 	u64 off = le64_to_cpu(sgl->addr);
 	u32 len = le32_to_cpu(sgl->length);
+	u64 bound;

 	if (!nvme_is_write(rsp->req.cmd)) {
 		rsp->req.error_loc =
@@ -854,7 +856,8 @@ static u16 nvmet_rdma_map_sgl_inline(struct nvmet_rdma_rsp *rsp)
 		return NVME_SC_INVALID_FIELD | NVME_STATUS_DNR;
 	}

-	if (off + len > rsp->queue->dev->inline_data_size) {
+	if (check_add_overflow(off, (u64)len, &bound) ||
+	    bound > rsp->queue->dev->inline_data_size) {
 		pr_err("invalid inline data offset!\n");
 		return NVME_SC_SGL_INVALID_OFFSET | NVME_STATUS_DNR;
 	}
-- 
2.43.0


* Re: [REPORT] nvmet-rdma: integer overflow in inline-data SGL bounds check -> pre-auth kernel-memory read + remote crash (candidate patch inline)
From: Keith Busch @ 2026-05-29 16:09 UTC (permalink / raw)
  To: hexlabsecurity
  Cc: security@kernel.org, hch@lst.de, sagi@grimberg.me, kch@nvidia.com,
	linux-nvme@lists.infradead.org, linux-rdma@vger.kernel.org,
	linux-block@vger.kernel.org

On Fri, May 29, 2026 at 06:52:13AM +0000, hexlabsecurity@proton.me wrote:
> @@ -847,6 +848,7 @@ static u16 nvmet_rdma_map_sgl_inline(struct nvmet_rdma_rsp *rsp)
>  	struct nvme_sgl_desc *sgl = &rsp->req.cmd->common.dptr.sgl;
>  	u64 off = le64_to_cpu(sgl->addr);
>  	u32 len = le32_to_cpu(sgl->length);
> +	u64 bound;
> 
>  	if (!nvme_is_write(rsp->req.cmd)) {
>  		rsp->req.error_loc =
> @@ -854,7 +856,8 @@ static u16 nvmet_rdma_map_sgl_inline(struct nvmet_rdma_rsp *rsp)
>  		return NVME_SC_INVALID_FIELD | NVME_STATUS_DNR;
>  	}
> 
> -	if (off + len > rsp->queue->dev->inline_data_size) {
> +	if (check_add_overflow(off, (u64)len, &bound) ||
> +	    bound > rsp->queue->dev->inline_data_size) {

Since you don't use "bound" for anything other than the final check, I
think we can make this simpler without it:

	if (off > rsp->queue->dev->inline_data_size ||
	    len > rsp->queue->dev->inline_data_size - off) {

Thanks for the report.
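
The two forms of the check discussed in this thread can be compared in
userspace.  This is a sketch under assumptions: the function names are
hypothetical, __builtin_add_overflow() (GCC/Clang) stands in for
check_add_overflow(), and inline_data_size is modelled as a uint32_t
called size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Form from the candidate patch: explicit overflow check, then compare. */
bool rejects_overflow_form(uint64_t off, uint32_t len, uint32_t size)
{
	uint64_t bound;

	return __builtin_add_overflow(off, (uint64_t)len, &bound) || bound > size;
}

/* Form suggested in the reply: no addition, so nothing can wrap. */
bool rejects_subtract_form(uint64_t off, uint32_t len, uint32_t size)
{
	/* size - off is only evaluated once off <= size, so it cannot underflow */
	return off > size || len > size - off;
}

/* Compares both forms over a few edge-case offsets and lengths (size > 0). */
bool forms_agree(uint32_t size)
{
	const uint64_t offs[] = { 0, 1, size - 1u, size, (uint64_t)size + 1,
				  0xffffffffffff0500ULL, 0xfffffffffffffe00ULL,
				  UINT64_MAX };
	const uint32_t lens[] = { 0, 1, size, size + 1u, 0x10000, UINT32_MAX };

	for (size_t i = 0; i < sizeof(offs) / sizeof(offs[0]); i++)
		for (size_t j = 0; j < sizeof(lens) / sizeof(lens[0]); j++)
			if (rejects_overflow_form(offs[i], lens[j], size) !=
			    rejects_subtract_form(offs[i], lens[j], size))
				return false;
	return true;
}
```

The forms agree because once off <= size, both off and len fit in 32
bits, so off + len cannot overflow a u64 and "off + len > size" is the
same condition as "len > size - off".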
