All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nilay Shroff <nilay@linux.ibm.com>
To: linux-nvme@lists.infradead.org
Cc: kbusch@kernel.org, hch@lst.de, axboe@fb.com, sagi@grimberg.me,
	gjoyce@linux.ibm.com, mkchauras@linux.ibm.com,
	Nilay Shroff <nilay@linux.ibm.com>
Subject: [PATCH] nvme-multipath: fix flex array size in struct nvme_ns_head
Date: Wed, 27 May 2026 11:50:00 +0530	[thread overview]
Message-ID: <20260527062010.4036702-1-nilay@linux.ibm.com> (raw)

struct nvme_ns_head contains a flexible array member, current_path[],
which is indexed using the NUMA node ID:
head->current_path[numa_node_id()]

The structure is currently allocated as:
size = sizeof(struct nvme_ns_head) +
       (num_possible_nodes() * sizeof(struct nvme_ns *));
head = kzalloc(size, GFP_KERNEL);

This allocation assumes that NUMA node IDs are sequential and densely
packed from 0 .. num_possible_nodes() - 1. While this assumption holds
on many systems, it is not always true on some architectures such as
powerpc.

On some powerpc systems, NUMA node IDs can be sparse. For example:
NUMA:
  NUMA node(s):              6
  NUMA node0 CPU(s):         80-159
  NUMA node8 CPU(s):         0-79
  NUMA node252 CPU(s):
  NUMA node253 CPU(s):
  NUMA node254 CPU(s):
  NUMA node255 CPU(s):

That is, the possible/online NUMA node IDs are: 0, 8, 252, 253, 254, 255
In this case: num_possible_nodes() = 6

So memory is allocated for only 6 entries in current_path[]. However,
the array is later indexed using the actual NUMA node ID. As a result,
accesses such as:
head->current_path[8] or
head->current_path[252]
goes out of bounds, leading to the following KASAN splat:

==================================================================
BUG: KASAN: slab-out-of-bounds in nvme_mpath_revalidate_paths+0x22c/0x290 [nvme_core]
Write of size 8 at addr c00020003bda35b8 by task kworker/u641:2/1997

CPU: 1 UID: 0 PID: 1997 Comm: kworker/u641:2 Not tainted 7.1.0-rc5-dirty #14 PREEMPT(lazy)
Hardware name: 8335-GTH POWER9 0x4e1202 opal:skiboot-v6.5.3-35-g1851b2a06 PowerNV
Workqueue: async async_run_entry_fn
Call Trace:
[c000200037fa7510] [c0000000021c23d4] dump_stack_lvl+0x88/0xdc (unreliable)
[c000200037fa7540] [c0000000009fda90] print_report+0x22c/0x67c
[c000200037fa7630] [c0000000009fd508] kasan_report+0x108/0x220
[c000200037fa7740] [c0000000009fff48] __asan_store8+0xe8/0x120
[c000200037fa7760] [c008000018e76474] nvme_mpath_revalidate_paths+0x22c/0x290 [nvme_core]
[c000200037fa7800] [c008000018e6556c] nvme_update_ns_info+0x4a4/0x5e0 [nvme_core]
[c000200037fa7a50] [c008000018e66270] nvme_alloc_ns+0x6d8/0x1a70 [nvme_core]
[c000200037fa7c20] [c008000018e679fc] nvme_scan_ns+0x3f4/0x630 [nvme_core]
[c000200037fa7d10] [c00000000031f22c] async_run_entry_fn+0x9c/0x3a0
[c000200037fa7db0] [c0000000002fa544] process_one_work+0x414/0xa10
[c000200037fa7ec0] [c0000000002fbf00] worker_thread+0x320/0x640
[c000200037fa7f80] [c00000000030d0f8] kthread+0x278/0x290
[c000200037fa7fe0] [c00000000000ded8] start_kernel_thread+0x14/0x18

Allocated by task 1997 on cpu 1 at 35.928317s:

The buggy address belongs to the object at c00020003bda3000
 which belongs to the cache kmalloc-rnd-15-2k of size 2048
The buggy address is located 16 bytes to the right of
 allocated 1448-byte region [c00020003bda3000, c00020003bda35a8)

The buggy address belongs to the physical page:

Memory state around the buggy address:
 c00020003bda3480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 c00020003bda3500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>c00020003bda3580: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
                                        ^
 c00020003bda3600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 c00020003bda3680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================

Fix this by allocating the flexible array using nr_node_ids instead
of num_possible_nodes(). Since nr_node_ids represents the maximum
possible NUMA node IDs, indexing current_path[] using numa_node_id()
becomes safe even on systems with sparse node IDs.

Fixes: f333444708f8 ("nvme: take node locality into account when selecting a path")
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
---
 drivers/nvme/host/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index c3032d6ad6b1..96809227a0e2 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3926,7 +3926,7 @@ static struct nvme_ns_head *nvme_alloc_ns_head(struct nvme_ctrl *ctrl,
 	int ret = -ENOMEM;
 
 #ifdef CONFIG_NVME_MULTIPATH
-	size += num_possible_nodes() * sizeof(struct nvme_ns *);
+	size += nr_node_ids * sizeof(struct nvme_ns *);
 #endif
 
 	head = kzalloc(size, GFP_KERNEL);
-- 
2.53.0



             reply	other threads:[~2026-05-27  6:20 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-27  6:20 Nilay Shroff [this message]
2026-05-27  6:40 ` [PATCH] nvme-multipath: fix flex array size in struct nvme_ns_head Christoph Hellwig
2026-05-27  7:30 ` John Garry
2026-05-27  8:04 ` Mukesh Kumar Chaurasiya
2026-05-27 13:36 ` Hannes Reinecke
2026-05-27 13:49 ` Keith Busch
2026-05-27 15:06   ` Nilay Shroff
2026-05-27 15:15     ` Keith Busch

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260527062010.4036702-1-nilay@linux.ibm.com \
    --to=nilay@linux.ibm.com \
    --cc=axboe@fb.com \
    --cc=gjoyce@linux.ibm.com \
    --cc=hch@lst.de \
    --cc=kbusch@kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=mkchauras@linux.ibm.com \
    --cc=sagi@grimberg.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.