Linux block layer
From: Bryam Vargas <hexlabsecurity@proton.me>
To: Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>,
	Keith Busch <kbusch@kernel.org>,
	Chaitanya Kulkarni <kch@nvidia.com>
Cc: linux-nvme@lists.infradead.org, linux-rdma@vger.kernel.org,
	linux-block@vger.kernel.org
Subject: [PATCH v2] nvmet-rdma: handle inline data with a nonzero offset
Date: Thu, 04 Jun 2026 19:36:54 +0000
Message-ID: <20260604193645.178350-1-hexlabsecurity@proton.me>
In-Reply-To: <aiFR81yE9_BIsNbM@kbusch-mbp>

nvmet_rdma_use_inline_sg() maps the host-controlled inline data offset
into the per-command inline scatterlist.  The bounds check admits any
offset with off + len <= inline_data_size, but the mapping still assumes
the data begins in the first inline page:

	sg->offset = off;
	sg->length = min_t(int, len, PAGE_SIZE - off);

When a port is configured with inline_data_size > PAGE_SIZE (settable up
to max(SZ_16K, PAGE_SIZE)), an offset in (PAGE_SIZE, inline_data_size]
makes "PAGE_SIZE - off" underflow.  min_t(int, ...) then yields a
negative value, which is stored in the unsigned sg->length as ~4 GiB, so
the block backend reads far past the first inline page.  num_pages(len)
also ignores the offset, so an in-bounds offset whose [off, off+len)
span crosses a page boundary under-counts the scatterlist.

Map the offset properly: split it into a page index and an in-page
offset, start the scatterlist at that page, and size the page count from
page_off + len.  Because the request scatterlist may now start at
inline_sg[page_idx] rather than inline_sg[0], generalize the inline-SGL
identity test in nvmet_rdma_release_rsp() to a range test; otherwise the
persistent inline scatterlist is mistaken for an allocated one and
nvmet_req_free_sgls() frees an inline page (and warns in
free_large_kmalloc()).

Fixes: 0d5ee2b2ab4f ("nvmet-rdma: support max(16KB, PAGE_SIZE) inline data")
Cc: stable@vger.kernel.org
Suggested-by: Keith Busch <kbusch@kernel.org>
Reported-by: Bryam Vargas <hexlabsecurity@proton.me>
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
v1 rejected a nonzero offset; per Keith's note a nonzero in-capsule SGL
offset is legitimate (it is the per-command SGL Offset field, distinct
from the controller ICDOFF attribute that nvme_rdma_setup_ctrl() refuses
when nonzero), so v2 handles it instead, using Keith's suggested
page_idx/page_off form for nvmet_rdma_use_inline_sg().

Review context (not for the commit log):

Bound safety: with off + len <= inline_data_size the highest inline_sg[]
index touched is page_idx + sg_count - 1 = floor((off + len - 1) /
PAGE_SIZE) <= num_pages(inline_data_size) - 1 = inline_page_count - 1
(<= NVMET_RDMA_MAX_INLINE_SGE - 1), and page_off < PAGE_SIZE so
PAGE_SIZE - page_off cannot underflow.  The release_rsp range test is a
strict generalization of the old "!= inline_sg" test: inline_sg[0] is in
range (unchanged: not freed), allocated/keyed SGLs are outside it (still
freed), and only the new inline_sg[1..] starts are additionally treated
as inline.
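
The arithmetic above can be checked with a small userspace model.  This
is a sketch under stated assumptions, not the driver code: it assumes
4 KiB pages and the 16 KiB inline_data_size from the test below, and the
names model_num_pages(), old_first_len(), new_mapping() and bound_holds()
are hypothetical helpers that only mirror the expressions in the patch.

```c
#include <stdint.h>

/* Model assumptions (not taken from the patch): 4 KiB pages and a
 * 16 KiB inline_data_size. */
#define MODEL_PAGE_SIZE        4096u
#define MODEL_INLINE_DATA_SIZE 16384u

/* Ceiling division standing in for the driver's num_pages(). */
static int model_num_pages(uint64_t len)
{
	return (int)((len + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE);
}

/* First-entry length as the pre-patch code computes it:
 * min_t(int, len, PAGE_SIZE - off) stored into an unsigned int.
 * The u64-to-int conversion truncates, as it does with gcc and clang. */
static unsigned int old_first_len(uint32_t len, uint64_t off)
{
	int a = (int)len;
	int b = (int)(MODEL_PAGE_SIZE - off);	/* negative once off > PAGE_SIZE */

	return (unsigned int)(a < b ? a : b);
}

struct model_mapping {
	uint64_t page_idx;	/* first inline_sg[] entry used */
	uint64_t page_off;	/* offset within that page */
	int sg_count;		/* number of entries used */
	uint64_t first_len;	/* length of the first entry */
};

/* Patched arithmetic: split off into a page index and an in-page offset. */
static struct model_mapping new_mapping(uint32_t len, uint64_t off)
{
	struct model_mapping m;
	uint64_t room;

	m.page_off = off % MODEL_PAGE_SIZE;
	m.page_idx = off / MODEL_PAGE_SIZE;
	m.sg_count = model_num_pages(m.page_off + len);
	room = MODEL_PAGE_SIZE - m.page_off;	/* always in 1..PAGE_SIZE */
	m.first_len = len < room ? len : room;
	return m;
}

/* Returns 1 if, for every offset, the largest admissible len keeps the
 * highest touched index below the inline page count.  The highest index
 * grows with len, so the largest len is the worst case for each off. */
static int bound_holds(void)
{
	uint64_t pages = MODEL_INLINE_DATA_SIZE / MODEL_PAGE_SIZE;
	uint64_t off;

	for (off = 0; off < MODEL_INLINE_DATA_SIZE; off++) {
		struct model_mapping m =
			new_mapping((uint32_t)(MODEL_INLINE_DATA_SIZE - off), off);

		if (m.page_idx + (uint64_t)m.sg_count - 1 >= pages)
			return 0;
	}
	return 1;
}
```

In this model, off = 8192 with len = 4096 makes old_first_len() return
0xfffff000, while new_mapping() selects page_idx 2 with a 4096-byte
first entry; off = 2048 with the same len needs two entries, where
num_pages(len) alone would count one.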

Behaves identically on 32- and 64-bit builds: off is u64, so the offset
arithmetic and PAGE_SIZE - page_off are evaluated in 64 bits on both
ABIs; num_pages() sees page_off + len <= inline_data_size <=
max(SZ_16K, PAGE_SIZE) (positive, int-safe on both); the release_rsp
comparison is a pointer comparison with identical semantics on ILP32 and
LP64.  (-m32/-m64 model output identical.)

A/B on a KASAN build (inline_data_size = 16384) over an rdma_rxe
loopback nvmet-rdma target with a block backend, inline write:
  - offset 0: succeeds, clean (control + no regression).
  - offset 8192: before this patch the block backend reads out of bounds
      BUG: KASAN: slab-out-of-bounds in copy_folio_from_iter_atomic
      (sg->length = 0xfffff000); with this patch it is served from the
      correct inline page, in bounds, no KASAN and no free_large_kmalloc
      warning.
  - the use_inline_sg() rework alone (without the release_rsp change)
      trips on offset 8192:
      WARNING: ... free_large_kmalloc ... Not a kmalloc allocation
        nvmet_req_free_sgls <- nvmet_rdma_release_rsp <- nvmet_rdma_send_done
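
The last case follows from the identity test alone.  A minimal sketch of
the two classifications, assuming a stand-in struct model_sg in place of
struct scatterlist; is_inline_old() and is_inline_range() are
hypothetical names that mirror the old and new conditions in
nvmet_rdma_release_rsp(), inverted to answer "is this the inline SGL?":

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct scatterlist; only its address matters here. */
struct model_sg {
	unsigned int offset;
	unsigned int length;
};

/* Old test: only a request starting at inline_sg[0] counts as inline. */
static bool is_inline_old(const struct model_sg *sg,
			  const struct model_sg *inline_sg)
{
	return sg == inline_sg;
}

/* Range test: any start within inline_sg[0..count-1] counts as inline. */
static bool is_inline_range(const struct model_sg *sg,
			    const struct model_sg *inline_sg,
			    size_t count)
{
	return sg >= inline_sg && sg < inline_sg + count;
}
```

With count = 4, a request starting at inline_sg[2] (offset 8192 with
4 KiB pages) is not inline under is_inline_old(), so it would be handed
to the free path, but is inline under is_inline_range().  Both agree on
inline_sg[0] and on pointers outside the array.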

 drivers/nvme/target/rdma.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 565183a20007..eb975fbd74a1 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -666,7 +666,8 @@ static void nvmet_rdma_release_rsp(struct nvmet_rdma_rsp *rsp)
 	if (rsp->n_rdma)
 		nvmet_rdma_rw_ctx_destroy(rsp);

-	if (rsp->req.sg != rsp->cmd->inline_sg)
+	if (rsp->req.sg < rsp->cmd->inline_sg ||
+	    rsp->req.sg >= rsp->cmd->inline_sg + queue->dev->inline_page_count)
 		nvmet_req_free_sgls(&rsp->req);

 	if (unlikely(!list_empty_careful(&queue->rsp_wr_wait_list)))
@@ -821,24 +822,25 @@ static void nvmet_rdma_write_data_done(struct ib_cq *cq, struct ib_wc *wc)
 static void nvmet_rdma_use_inline_sg(struct nvmet_rdma_rsp *rsp, u32 len,
 		u64 off)
 {
-	int sg_count = num_pages(len);
+	u64 page_off = off % PAGE_SIZE;
+	u64 page_idx = off / PAGE_SIZE;
+	int sg_count = num_pages(page_off + len);
 	struct scatterlist *sg;
 	int i;

-	sg = rsp->cmd->inline_sg;
+	sg = &rsp->cmd->inline_sg[page_idx];
 	for (i = 0; i < sg_count; i++, sg++) {
 		if (i < sg_count - 1)
 			sg_unmark_end(sg);
 		else
 			sg_mark_end(sg);
-		sg->offset = off;
-		sg->length = min_t(int, len, PAGE_SIZE - off);
+		sg->offset = page_off;
+		sg->length = min_t(u64, len, PAGE_SIZE - page_off);
 		len -= sg->length;
-		if (!i)
-			off = 0;
+		page_off = 0;
 	}

-	rsp->req.sg = rsp->cmd->inline_sg;
+	rsp->req.sg = &rsp->cmd->inline_sg[page_idx];
 	rsp->req.sg_cnt = sg_count;
 }

--
2.43.0

