* [PATCH net, v3] net: mana: Fix crash from unvalidated SHM offset read from BAR0 during FLR
@ 2026-05-01 2:47 Dipayaan Roy
2026-05-05 13:42 ` Paolo Abeni
2026-05-05 13:50 ` patchwork-bot+netdevbpf
0 siblings, 2 replies; 4+ messages in thread
From: Dipayaan Roy @ 2026-05-01 2:47 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, pabeni, leon, longli, kotaranov, horms, shradhagupta,
ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov
During Function Level Reset recovery, the MANA driver reads
hardware BAR0 registers that may temporarily contain garbage values.
The SHM (Shared Memory) offset read from GDMA_REG_SHM_OFFSET is used
to compute gc->shm_base, which is later dereferenced via readl() in
mana_smc_poll_register(). If the hardware returns an unaligned or
out-of-range value, the driver must not blindly use it, as this would
propagate the hardware error into a kernel crash.
The following crash was observed on an arm64 Hyper-V guest running
kernel 6.17.0-3013-azure during VF reset recovery triggered by HWC
timeout.
[13291.785274] Unable to handle kernel paging request at virtual address ffff8000a200001b
[13291.785311] Mem abort info:
[13291.785332] ESR = 0x0000000096000021
[13291.785343] EC = 0x25: DABT (current EL), IL = 32 bits
[13291.785355] SET = 0, FnV = 0
[13291.785363] EA = 0, S1PTW = 0
[13291.785372] FSC = 0x21: alignment fault
[13291.785382] Data abort info:
[13291.785391] ISV = 0, ISS = 0x00000021, ISS2 = 0x00000000
[13291.785404] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[13291.785412] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[13291.785421] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000014df3a1000
[13291.785432] [ffff8000a200001b] pgd=1000000100438403, p4d=1000000100438403, pud=1000000100439403, pmd=0068000fc2000711
[13291.785703] Internal error: Oops: 0000000096000021 [#1] SMP
[13291.830975] Modules linked in: tls qrtr mana_ib ib_uverbs ib_core xt_owner xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables cfg80211 8021q garp mrp stp llc binfmt_misc joydev serio_raw nls_iso8859_1 hid_generic aes_ce_blk aes_ce_cipher polyval_ce ghash_ce sm4_ce_gcm sm4_ce_ccm sm4_ce sm4_ce_cipher hid_hyperv sm4 sm3_ce sha3_ce hv_netvsc hid vmgenid hyperv_keyboard hyperv_drm sch_fq_codel nvme_fabrics efi_pstore dm_multipath nfnetlink vsock_loopback vmw_vsock_virtio_transport_common hv_sock vmw_vsock_vmci_transport vmw_vmci vsock dmi_sysfs ip_tables x_tables autofs4
[13291.862630] CPU: 122 UID: 0 PID: 61796 Comm: kworker/122:2 Tainted: G W 6.17.0-3013-azure #13-Ubuntu VOLUNTARY
[13291.869902] Tainted: [W]=WARN
[13291.871901] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 01/08/2026
[13291.878086] Workqueue: events mana_serv_func
[13291.880718] pstate: 62400005 (nZCv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
[13291.884835] pc : mana_smc_poll_register+0x48/0xb0
[13291.887902] lr : mana_smc_setup_hwc+0x70/0x1c0
[13291.890493] sp : ffff8000ab79bbb0
[13291.892364] x29: ffff8000ab79bbb0 x28: ffff00410c8b5900 x27: ffff00410d630680
[13291.896252] x26: ffff004171f9fd80 x25: 000000016ed55000 x24: 000000017f37e000
[13291.899990] x23: 0000000000000000 x22: 000000016ed55000 x21: 0000000000000000
[13291.904497] x20: ffff8000a200001b x19: 0000000000004e20 x18: ffff8000a6183050
[13291.908308] x17: 0000000000000000 x16: 0000000000000000 x15: 000000000000000a
[13291.912542] x14: 0000000000000004 x13: 0000000000000000 x12: 0000000000000000
[13291.916298] x11: 0000000000000000 x10: 0000000000000001 x9 : ffffc45006af1bd8
[13291.920945] x8 : ffff000151129000 x7 : 0000000000000000 x6 : 0000000000000000
[13291.925293] x5 : 000000015f214000 x4 : 000000017217a000 x3 : 000000016ed50000
[13291.930436] x2 : 000000016ed55000 x1 : 0000000000000000 x0 : ffff8000a1ffffff
[13291.934342] Call trace:
[13291.935736] mana_smc_poll_register+0x48/0xb0 (P)
[13291.938611] mana_smc_setup_hwc+0x70/0x1c0
[13291.941113] mana_hwc_create_channel+0x1a0/0x3a0
[13291.944283] mana_gd_setup+0x16c/0x398
[13291.946584] mana_gd_resume+0x24/0x70
[13291.948917] mana_do_service+0x13c/0x1d0
[13291.951583] mana_serv_func+0x34/0x68
[13291.953732] process_one_work+0x168/0x3d0
[13291.956745] worker_thread+0x2ac/0x480
[13291.959104] kthread+0xf8/0x110
[13291.961026] ret_from_fork+0x10/0x20
[13291.963560] Code: d2807d00 9417c551 71000673 54000220 (b9400281)
[13291.967299] ---[ end trace 0000000000000000 ]---
Disassembly of mana_smc_poll_register() around the crash site:
Disassembly of section .text:
00000000000047c8 <mana_smc_poll_register>:
47c8: d503201f nop
47cc: d503201f nop
47d0: d503233f paciasp
47d4: f800865e str x30, [x18], #8
47d8: a9bd7bfd stp x29, x30, [sp, #-48]!
47dc: 910003fd mov x29, sp
47e0: a90153f3 stp x19, x20, [sp, #16]
47e4: 91007014 add x20, x0, #0x1c
47e8: 5289c413 mov w19, #0x4e20
47ec: f90013f5 str x21, [sp, #32]
47f0: 12001c35 and w21, w1, #0xff
47f4: 14000008 b 4814 <mana_smc_poll_register+0x4c>
47f8: 36f801e1 tbz w1, #31, 4834 <mana_smc_poll_register+0x6c>
47fc: 52800042 mov w2, #0x2
4800: d280fa01 mov x1, #0x7d0
4804: d2807d00 mov x0, #0x3e8
4808: 94000000 bl 0 <usleep_range_state>
480c: 71000673 subs w19, w19, #0x1
4810: 54000200 b.eq 4850 <mana_smc_poll_register+0x88>
4814: b9400281 ldr w1, [x20] <-- **** CRASHED HERE *****
4818: d50331bf dmb oshld
481c: 2a0103e2 mov w2, w1
...
From the crash signature x20 = ffff8000a200001b, this address
ends in 0x1b which is not 4-byte aligned, so the 'ldr w1, [x20]'
instruction (readl) triggers the arm64 alignment fault (FSC = 0x21).
The root cause is in mana_gd_init_vf_regs(), which computes:
gc->shm_base = gc->bar0_va + mana_gd_r64(gc, GDMA_REG_SHM_OFFSET);
The offset is used without any validation. The same problem exists
in mana_gd_init_pf_regs() for sriov_base_off and sriov_shm_off.
Fix this by validating all offsets before use:
- VF: check shm_off is within BAR0, properly aligned to 4 bytes
(readl requirement), and leaves room for the full 256-bit
(32-byte) SMC aperture.
- PF: check sriov_base_off is within BAR0, aligned to 8 bytes
(readq requirement), and leaves room to safely read the
sriov_shm_off register at sriov_base_off + GDMA_PF_REG_SHM_OFF.
Then check sriov_shm_off leaves room for the full SMC aperture.
All arithmetic uses subtraction rather than addition to avoid
integer overflow on garbage values.
Define SMC_APERTURE_SIZE (32 bytes, derived from the 256-bit aperture
width)
Return -EPROTO on invalid values. The existing recovery path in
mana_serv_reset() already handles -EPROTO by falling through to PCI
device rescan, giving the hardware another chance to present valid
register values after reset.
Fixes: 9bf66036d686 ("net: mana: Handle hardware recovery events when probing the device")
Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
---
Changes in v3:
- Fixed commit message.
- Removed macro duplicates.
Changes in v2:
- Fix sriov_base_off alignment check: sizeof(u32) to sizeof(u64), since
mana_gd_r64() (readq) requires 8-byte alignment on arm64.
- Fix sriov_base_off bounds: also verify enough space remains in BAR0
to safely read sriov_shm_off at offset GDMA_PF_REG_SHM_OFF + 8 bytes.
- Fix integer overflow: rewrite bounds checks using subtraction
(remaining = bar0_size - base) instead of addition.
- Fix SMC aperture size: add gc->bar0_size - shm_off < SMC_APERTURE_SIZE
checks in both VF and PF paths; previously only the start address was
validated, but mana_smc_poll_register() accesses up to shm_base + 0x1c
(28 bytes from base, 32 bytes total).
- Export SMC_APERTURE_SIZE to shm_channel.h.
---
.../net/ethernet/microsoft/mana/gdma_main.c | 40 ++++++++++++++++---
.../net/ethernet/microsoft/mana/shm_channel.c | 5 ---
include/net/mana/shm_channel.h | 6 +++
3 files changed, 41 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index 098fbda0d128..d8e816882f02 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -43,8 +43,9 @@ static u64 mana_gd_r64(struct gdma_context *g, u64 offset)
static int mana_gd_init_pf_regs(struct pci_dev *pdev)
{
struct gdma_context *gc = pci_get_drvdata(pdev);
- void __iomem *sriov_base_va;
+ u64 remaining_barsize;
u64 sriov_base_off;
+ u64 sriov_shm_off;
gc->db_page_size = mana_gd_r32(gc, GDMA_PF_REG_DB_PAGE_SIZE) & 0xFFFF;
@@ -73,10 +74,28 @@ static int mana_gd_init_pf_regs(struct pci_dev *pdev)
gc->phys_db_page_base = gc->bar0_pa + gc->db_page_off;
sriov_base_off = mana_gd_r64(gc, GDMA_SRIOV_REG_CFG_BASE_OFF);
+ if (sriov_base_off >= gc->bar0_size ||
+ gc->bar0_size - sriov_base_off <
+ GDMA_PF_REG_SHM_OFF + sizeof(u64) ||
+ !IS_ALIGNED(sriov_base_off, sizeof(u64))) {
+ dev_err(gc->dev,
+ "SRIOV base offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n",
+ sriov_base_off, (u64)gc->bar0_size);
+ return -EPROTO;
+ }
- sriov_base_va = gc->bar0_va + sriov_base_off;
- gc->shm_base = sriov_base_va +
- mana_gd_r64(gc, sriov_base_off + GDMA_PF_REG_SHM_OFF);
+ remaining_barsize = gc->bar0_size - sriov_base_off;
+ sriov_shm_off = mana_gd_r64(gc, sriov_base_off + GDMA_PF_REG_SHM_OFF);
+ if (sriov_shm_off >= remaining_barsize ||
+ remaining_barsize - sriov_shm_off < SMC_APERTURE_SIZE ||
+ !IS_ALIGNED(sriov_shm_off, sizeof(u32))) {
+ dev_err(gc->dev,
+ "SRIOV SHM offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n",
+ sriov_shm_off, (u64)gc->bar0_size);
+ return -EPROTO;
+ }
+
+ gc->shm_base = gc->bar0_va + sriov_base_off + sriov_shm_off;
return 0;
}
@@ -84,6 +103,7 @@ static int mana_gd_init_pf_regs(struct pci_dev *pdev)
static int mana_gd_init_vf_regs(struct pci_dev *pdev)
{
struct gdma_context *gc = pci_get_drvdata(pdev);
+ u64 shm_off;
gc->db_page_size = mana_gd_r32(gc, GDMA_REG_DB_PAGE_SIZE) & 0xFFFF;
@@ -111,7 +131,17 @@ static int mana_gd_init_vf_regs(struct pci_dev *pdev)
gc->db_page_base = gc->bar0_va + gc->db_page_off;
gc->phys_db_page_base = gc->bar0_pa + gc->db_page_off;
- gc->shm_base = gc->bar0_va + mana_gd_r64(gc, GDMA_REG_SHM_OFFSET);
+ shm_off = mana_gd_r64(gc, GDMA_REG_SHM_OFFSET);
+ if (shm_off >= gc->bar0_size ||
+ gc->bar0_size - shm_off < SMC_APERTURE_SIZE ||
+ !IS_ALIGNED(shm_off, sizeof(u32))) {
+ dev_err(gc->dev,
+ "SHM offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n",
+ shm_off, (u64)gc->bar0_size);
+ return -EPROTO;
+ }
+
+ gc->shm_base = gc->bar0_va + shm_off;
return 0;
}
diff --git a/drivers/net/ethernet/microsoft/mana/shm_channel.c b/drivers/net/ethernet/microsoft/mana/shm_channel.c
index 0f1679ebad96..d21b5db06e50 100644
--- a/drivers/net/ethernet/microsoft/mana/shm_channel.c
+++ b/drivers/net/ethernet/microsoft/mana/shm_channel.c
@@ -61,11 +61,6 @@ union smc_proto_hdr {
};
}; /* HW DATA */
-#define SMC_APERTURE_BITS 256
-#define SMC_BASIC_UNIT (sizeof(u32))
-#define SMC_APERTURE_DWORDS (SMC_APERTURE_BITS / (SMC_BASIC_UNIT * 8))
-#define SMC_LAST_DWORD (SMC_APERTURE_DWORDS - 1)
-
static int mana_smc_poll_register(void __iomem *base, bool reset)
{
void __iomem *ptr = base + SMC_LAST_DWORD * SMC_BASIC_UNIT;
diff --git a/include/net/mana/shm_channel.h b/include/net/mana/shm_channel.h
index 5199b41497ff..dbabcfb95daf 100644
--- a/include/net/mana/shm_channel.h
+++ b/include/net/mana/shm_channel.h
@@ -4,6 +4,12 @@
#ifndef _SHM_CHANNEL_H
#define _SHM_CHANNEL_H
+#define SMC_APERTURE_BITS 256
+#define SMC_BASIC_UNIT (sizeof(u32))
+#define SMC_APERTURE_DWORDS (SMC_APERTURE_BITS / (SMC_BASIC_UNIT * 8))
+#define SMC_LAST_DWORD (SMC_APERTURE_DWORDS - 1)
+#define SMC_APERTURE_SIZE (SMC_APERTURE_BITS / 8)
+
struct shm_channel {
struct device *dev;
void __iomem *base;
--
2.43.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH net, v3] net: mana: Fix crash from unvalidated SHM offset read from BAR0 during FLR
2026-05-01 2:47 [PATCH net, v3] net: mana: Fix crash from unvalidated SHM offset read from BAR0 during FLR Dipayaan Roy
@ 2026-05-05 13:42 ` Paolo Abeni
2026-05-05 16:28 ` Dipayaan Roy
2026-05-05 13:50 ` patchwork-bot+netdevbpf
1 sibling, 1 reply; 4+ messages in thread
From: Paolo Abeni @ 2026-05-05 13:42 UTC (permalink / raw)
To: Dipayaan Roy, kys, haiyangz, wei.liu, decui, andrew+netdev, davem,
edumazet, kuba, leon, longli, kotaranov, horms, shradhagupta,
ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov
On 5/1/26 4:47 AM, Dipayaan Roy wrote:
> @@ -73,10 +74,28 @@ static int mana_gd_init_pf_regs(struct pci_dev *pdev)
> gc->phys_db_page_base = gc->bar0_pa + gc->db_page_off;
>
> sriov_base_off = mana_gd_r64(gc, GDMA_SRIOV_REG_CFG_BASE_OFF);
> + if (sriov_base_off >= gc->bar0_size ||
> + gc->bar0_size - sriov_base_off <
> + GDMA_PF_REG_SHM_OFF + sizeof(u64) ||
> + !IS_ALIGNED(sriov_base_off, sizeof(u64))) {
> + dev_err(gc->dev,
> + "SRIOV base offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n",
> + sriov_base_off, (u64)gc->bar0_size);
> + return -EPROTO;
> + }
I think that the additional fix suggested by sashiko is really worthy,
but should go in a separate patch. @Dipayaan: please follow-up on that
one, thanks!
Paolo
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH net, v3] net: mana: Fix crash from unvalidated SHM offset read from BAR0 during FLR
2026-05-01 2:47 [PATCH net, v3] net: mana: Fix crash from unvalidated SHM offset read from BAR0 during FLR Dipayaan Roy
2026-05-05 13:42 ` Paolo Abeni
@ 2026-05-05 13:50 ` patchwork-bot+netdevbpf
1 sibling, 0 replies; 4+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-05-05 13:50 UTC (permalink / raw)
To: Dipayaan Roy
Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, pabeni, leon, longli, kotaranov, horms, shradhagupta,
ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov
Hello:
This patch was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Thu, 30 Apr 2026 19:47:12 -0700 you wrote:
> During Function Level Reset recovery, the MANA driver reads
> hardware BAR0 registers that may temporarily contain garbage values.
> The SHM (Shared Memory) offset read from GDMA_REG_SHM_OFFSET is used
> to compute gc->shm_base, which is later dereferenced via readl() in
> mana_smc_poll_register(). If the hardware returns an unaligned or
> out-of-range value, the driver must not blindly use it, as this would
> propagate the hardware error into a kernel crash.
>
> [...]
Here is the summary with links:
- [net,v3] net: mana: Fix crash from unvalidated SHM offset read from BAR0 during FLR
https://git.kernel.org/netdev/net/c/95084f1883a7
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH net, v3] net: mana: Fix crash from unvalidated SHM offset read from BAR0 during FLR
2026-05-05 13:42 ` Paolo Abeni
@ 2026-05-05 16:28 ` Dipayaan Roy
0 siblings, 0 replies; 4+ messages in thread
From: Dipayaan Roy @ 2026-05-05 16:28 UTC (permalink / raw)
To: Paolo Abeni
Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, leon, longli, kotaranov, horms, shradhagupta, ssengar,
ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov
On Tue, May 05, 2026 at 03:42:46PM +0200, Paolo Abeni wrote:
> On 5/1/26 4:47 AM, Dipayaan Roy wrote:
> > @@ -73,10 +74,28 @@ static int mana_gd_init_pf_regs(struct pci_dev *pdev)
> > gc->phys_db_page_base = gc->bar0_pa + gc->db_page_off;
> >
> > sriov_base_off = mana_gd_r64(gc, GDMA_SRIOV_REG_CFG_BASE_OFF);
> > + if (sriov_base_off >= gc->bar0_size ||
> > + gc->bar0_size - sriov_base_off <
> > + GDMA_PF_REG_SHM_OFF + sizeof(u64) ||
> > + !IS_ALIGNED(sriov_base_off, sizeof(u64))) {
> > + dev_err(gc->dev,
> > + "SRIOV base offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n",
> > + sriov_base_off, (u64)gc->bar0_size);
> > + return -EPROTO;
> > + }
>
> I think that the additional fix suggested by sashiko is really worthy,
> but should go in a separate patch. @Dipayaan: please follow-up on that
> one, thanks!
>
> Paolo
>
Hi Paolo,
Thanks for reviewing, and I will cross check and send out a separate patch for
issue pointed out by Sashiko(un-related to the current issue).
Regards
Dipayaan Roy
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-05-05 16:28 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-01 2:47 [PATCH net, v3] net: mana: Fix crash from unvalidated SHM offset read from BAR0 during FLR Dipayaan Roy
2026-05-05 13:42 ` Paolo Abeni
2026-05-05 16:28 ` Dipayaan Roy
2026-05-05 13:50 ` patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox