* [PATCH net, v2] net: mana: Fix crash from unvalidated SHM offset read from BAR0 during FLR
@ 2026-04-29 18:57 Dipayaan Roy
2026-05-02 1:53 ` Jakub Kicinski
0 siblings, 1 reply; 3+ messages in thread
From: Dipayaan Roy @ 2026-04-29 18:57 UTC
To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, pabeni, leon, longli, kotaranov, horms, shradhagupta,
ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov
During Function Level Reset recovery, the MANA driver reads
hardware BAR0 registers that may temporarily contain garbage values.
The SHM (Shared Memory) offset read from GDMA_REG_SHM_OFFSET is used
to compute gc->shm_base, which is later dereferenced via readl() in
mana_smc_poll_register(). If the hardware returns an unaligned or
out-of-range value, the driver must not blindly use it, as this would
propagate the hardware error into a kernel crash.
The following crash was observed on an arm64 Hyper-V guest running
kernel 6.17.0-3013-azure during VF reset recovery triggered by HWC
timeout.
[13291.785274] Unable to handle kernel paging request at virtual address ffff8000a200001b
[13291.785311] Mem abort info:
[13291.785332] ESR = 0x0000000096000021
[13291.785343] EC = 0x25: DABT (current EL), IL = 32 bits
[13291.785355] SET = 0, FnV = 0
[13291.785363] EA = 0, S1PTW = 0
[13291.785372] FSC = 0x21: alignment fault
[13291.785382] Data abort info:
[13291.785391] ISV = 0, ISS = 0x00000021, ISS2 = 0x00000000
[13291.785404] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[13291.785412] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[13291.785421] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000014df3a1000
[13291.785432] [ffff8000a200001b] pgd=1000000100438403, p4d=1000000100438403, pud=1000000100439403, pmd=0068000fc2000711
[13291.785703] Internal error: Oops: 0000000096000021 [#1] SMP
[13291.830975] Modules linked in: tls qrtr mana_ib ib_uverbs ib_core xt_owner xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables cfg80211 8021q garp mrp stp llc binfmt_misc joydev serio_raw nls_iso8859_1 hid_generic aes_ce_blk aes_ce_cipher polyval_ce ghash_ce sm4_ce_gcm sm4_ce_ccm sm4_ce sm4_ce_cipher hid_hyperv sm4 sm3_ce sha3_ce hv_netvsc hid vmgenid hyperv_keyboard hyperv_drm sch_fq_codel nvme_fabrics efi_pstore dm_multipath nfnetlink vsock_loopback vmw_vsock_virtio_transport_common hv_sock vmw_vsock_vmci_transport vmw_vmci vsock dmi_sysfs ip_tables x_tables autofs4
[13291.862630] CPU: 122 UID: 0 PID: 61796 Comm: kworker/122:2 Tainted: G W 6.17.0-3013-azure #13-Ubuntu VOLUNTARY
[13291.869902] Tainted: [W]=WARN
[13291.871901] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 01/08/2026
[13291.878086] Workqueue: events mana_serv_func
[13291.880718] pstate: 62400005 (nZCv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
[13291.884835] pc : mana_smc_poll_register+0x48/0xb0
[13291.887902] lr : mana_smc_setup_hwc+0x70/0x1c0
[13291.890493] sp : ffff8000ab79bbb0
[13291.892364] x29: ffff8000ab79bbb0 x28: ffff00410c8b5900 x27: ffff00410d630680
[13291.896252] x26: ffff004171f9fd80 x25: 000000016ed55000 x24: 000000017f37e000
[13291.899990] x23: 0000000000000000 x22: 000000016ed55000 x21: 0000000000000000
[13291.904497] x20: ffff8000a200001b x19: 0000000000004e20 x18: ffff8000a6183050
[13291.908308] x17: 0000000000000000 x16: 0000000000000000 x15: 000000000000000a
[13291.912542] x14: 0000000000000004 x13: 0000000000000000 x12: 0000000000000000
[13291.916298] x11: 0000000000000000 x10: 0000000000000001 x9 : ffffc45006af1bd8
[13291.920945] x8 : ffff000151129000 x7 : 0000000000000000 x6 : 0000000000000000
[13291.925293] x5 : 000000015f214000 x4 : 000000017217a000 x3 : 000000016ed50000
[13291.930436] x2 : 000000016ed55000 x1 : 0000000000000000 x0 : ffff8000a1ffffff
[13291.934342] Call trace:
[13291.935736] mana_smc_poll_register+0x48/0xb0 (P)
[13291.938611] mana_smc_setup_hwc+0x70/0x1c0
[13291.941113] mana_hwc_create_channel+0x1a0/0x3a0
[13291.944283] mana_gd_setup+0x16c/0x398
[13291.946584] mana_gd_resume+0x24/0x70
[13291.948917] mana_do_service+0x13c/0x1d0
[13291.951583] mana_serv_func+0x34/0x68
[13291.953732] process_one_work+0x168/0x3d0
[13291.956745] worker_thread+0x2ac/0x480
[13291.959104] kthread+0xf8/0x110
[13291.961026] ret_from_fork+0x10/0x20
[13291.963560] Code: d2807d00 9417c551 71000673 54000220 (b9400281)
[13291.967299] ---[ end trace 0000000000000000 ]---
Disassembly of mana_smc_poll_register() around the crash site:
Disassembly of section .text:
00000000000047c8 <mana_smc_poll_register>:
47c8: d503201f nop
47cc: d503201f nop
47d0: d503233f paciasp
47d4: f800865e str x30, [x18], #8
47d8: a9bd7bfd stp x29, x30, [sp, #-48]!
47dc: 910003fd mov x29, sp
47e0: a90153f3 stp x19, x20, [sp, #16]
47e4: 91007014 add x20, x0, #0x1c
47e8: 5289c413 mov w19, #0x4e20
47ec: f90013f5 str x21, [sp, #32]
47f0: 12001c35 and w21, w1, #0xff
47f4: 14000008 b 4814 <mana_smc_poll_register+0x4c>
47f8: 36f801e1 tbz w1, #31, 4834 <mana_smc_poll_register+0x6c>
47fc: 52800042 mov w2, #0x2
4800: d280fa01 mov x1, #0x7d0
4804: d2807d00 mov x0, #0x3e8
4808: 94000000 bl 0 <usleep_range_state>
480c: 71000673 subs w19, w19, #0x1
4810: 54000200 b.eq 4850 <mana_smc_poll_register+0x88>
4814: b9400281 ldr w1, [x20] <-- **** CRASHED HERE *****
4818: d50331bf dmb oshld
481c: 2a0103e2 mov w2, w1
...
In the register dump, x20 = ffff8000a200001b. This address ends in
0x1b, which is not 4-byte aligned, so the 'ldr w1, [x20]' instruction
(readl) triggers the arm64 alignment fault (FSC = 0x21).
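The arithmetic can be reproduced in user space. The following is only
a sketch: it assumes x0 still holds the base pointer passed to
mana_smc_poll_register() (consistent with 'add x20, x0, #0x1c' and with
x19 still being 0x4e20, i.e. the first loop iteration), and the helper
names are hypothetical, not driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if addr is a multiple of align (a power of two). */
static bool addr_is_aligned(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr & (align - 1)) == 0;
}

/* Values copied from the register dump in the oops. */
static const uint64_t crash_x0 = 0xffff8000a1ffffffULL;  /* base seen in x0 */
static const uint64_t crash_x20 = 0xffff8000a200001bULL; /* faulting address */

/* Address formed by 'add x20, x0, #0x1c' before the 32-bit load. */
static uint64_t poll_reg_addr(uint64_t base)
{
	return base + 0x1c;
}
```

With these values, poll_reg_addr(crash_x0) equals crash_x20, and
addr_is_aligned(crash_x20, 4) is false, which matches the reported
alignment fault.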
The root cause is in mana_gd_init_vf_regs(), which computes:
gc->shm_base = gc->bar0_va + mana_gd_r64(gc, GDMA_REG_SHM_OFFSET);
The offset is used without any validation. The same problem exists
in mana_gd_init_pf_regs() for sriov_base_off and sriov_shm_off.
Fix this by validating all offsets before use:
- VF: check shm_off is within BAR0, properly aligned to 4 bytes
(readl requirement), and leaves room for the full 256-bit
(32-byte) SMC aperture.
- PF: check sriov_base_off is within BAR0, aligned to 8 bytes
(readq requirement), and leaves room to safely read the
sriov_shm_off register at sriov_base_off + GDMA_PF_REG_SHM_OFF.
Then check sriov_shm_off leaves room for the full SMC aperture.
All arithmetic uses subtraction rather than addition to avoid
integer overflow on garbage firmware values.
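To illustrate why subtraction is preferred, here is a minimal
user-space sketch of the check pattern. The function names are
hypothetical and the code is not the driver's; it only assumes 64-bit
unsigned offsets and a 32-byte aperture as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SMC_APERTURE_SIZE 32u /* 256-bit aperture, in bytes */

/*
 * Hypothetical helper mirroring the checks described above: 'off' must lie
 * inside a region of 'size' bytes, leave at least 'need' bytes before the
 * end of the region, and be a multiple of 'align' (a power of two).
 */
static bool offset_is_valid(uint64_t off, uint64_t size, uint64_t need,
			    uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);

	if (off >= size)	/* start lies outside the region */
		return false;
	if (size - off < need)	/* cannot underflow, since off < size */
		return false;
	return (off & (align - 1)) == 0;
}

/* Naive variant for comparison: 'off + need' wraps around for huge 'off'. */
static bool offset_is_valid_naive(uint64_t off, uint64_t size, uint64_t need)
{
	return off + need <= size;
}
```

For a 64 KiB region, an offset of UINT64_MAX - 3 is rejected by
offset_is_valid() but accepted by the naive variant, because the sum
wraps around to a small number.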
Previously the driver used the offset without validating the value
read from hardware. If the register returns a garbage value that is
not within BAR0 bounds or not aligned to the 4-byte granularity, the
later readl() faults, as in the alignment fault shown above.
Define SMC_APERTURE_SIZE (32 bytes, derived from the 256-bit aperture
width) in shm_channel.h and use it in the aperture bounds checks.
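The derivation can be checked with a small user-space sketch. It
mirrors the macro names used in this patch but substitutes uint32_t
for the kernel's u32; the function smc_last_dword_offset() is
hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SMC_APERTURE_BITS	256
#define SMC_BASIC_UNIT		(sizeof(uint32_t))
#define SMC_APERTURE_DWORDS	(SMC_APERTURE_BITS / (SMC_BASIC_UNIT * 8))
#define SMC_LAST_DWORD		(SMC_APERTURE_DWORDS - 1)
#define SMC_APERTURE_SIZE	(SMC_APERTURE_BITS / 8)

static_assert(SMC_APERTURE_SIZE == 32, "256 bits is 32 bytes");
static_assert(SMC_APERTURE_DWORDS == 8, "eight 32-bit words");

/* Byte offset of the last 32-bit word in the aperture. */
static size_t smc_last_dword_offset(void)
{
	return SMC_LAST_DWORD * SMC_BASIC_UNIT;
}
```

The last word sits at byte offset 28 (0x1c), the same displacement
that appears in the disassembly, so a 4-byte read there ends exactly
at byte 32.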
Return -EPROTO on invalid values. The existing recovery path in
mana_serv_reset() already handles -EPROTO by falling through to PCI
device rescan, giving the hardware another chance to present valid
register values after reset.
Fixes: 9bf66036d686 ("net: mana: Handle hardware recovery events when probing the device")
Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
---
Changes in v2:
- Fix sriov_base_off alignment check: sizeof(u32) to sizeof(u64), since
mana_gd_r64() (readq) requires 8-byte alignment on arm64.
- Fix sriov_base_off bounds: also verify enough space remains in BAR0
to safely read sriov_shm_off at offset GDMA_PF_REG_SHM_OFF + 8 bytes.
- Fix integer overflow: rewrite bounds checks using subtraction
(remaining = bar0_size - base) instead of addition.
- Fix SMC aperture size: add gc->bar0_size - shm_off < SMC_APERTURE_SIZE
checks in both VF and PF paths; previously only the start address was
validated, but mana_smc_poll_register() accesses up to shm_base + 0x1c
(28 bytes from base, 32 bytes total).
- Export SMC_APERTURE_SIZE to shm_channel.h.
---
.../net/ethernet/microsoft/mana/gdma_main.c | 40 ++++++++++++++++---
include/net/mana/shm_channel.h | 6 +++
2 files changed, 41 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index 098fbda0d128..d8e816882f02 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -43,8 +43,9 @@ static u64 mana_gd_r64(struct gdma_context *g, u64 offset)
 static int mana_gd_init_pf_regs(struct pci_dev *pdev)
 {
 	struct gdma_context *gc = pci_get_drvdata(pdev);
-	void __iomem *sriov_base_va;
+	u64 remaining_barsize;
 	u64 sriov_base_off;
+	u64 sriov_shm_off;
 
 	gc->db_page_size = mana_gd_r32(gc, GDMA_PF_REG_DB_PAGE_SIZE) & 0xFFFF;
 
@@ -73,10 +74,28 @@ static int mana_gd_init_pf_regs(struct pci_dev *pdev)
 	gc->phys_db_page_base = gc->bar0_pa + gc->db_page_off;
 
 	sriov_base_off = mana_gd_r64(gc, GDMA_SRIOV_REG_CFG_BASE_OFF);
+	if (sriov_base_off >= gc->bar0_size ||
+	    gc->bar0_size - sriov_base_off <
+		GDMA_PF_REG_SHM_OFF + sizeof(u64) ||
+	    !IS_ALIGNED(sriov_base_off, sizeof(u64))) {
+		dev_err(gc->dev,
+			"SRIOV base offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n",
+			sriov_base_off, (u64)gc->bar0_size);
+		return -EPROTO;
+	}
 
-	sriov_base_va = gc->bar0_va + sriov_base_off;
-	gc->shm_base = sriov_base_va +
-			mana_gd_r64(gc, sriov_base_off + GDMA_PF_REG_SHM_OFF);
+	remaining_barsize = gc->bar0_size - sriov_base_off;
+	sriov_shm_off = mana_gd_r64(gc, sriov_base_off + GDMA_PF_REG_SHM_OFF);
+	if (sriov_shm_off >= remaining_barsize ||
+	    remaining_barsize - sriov_shm_off < SMC_APERTURE_SIZE ||
+	    !IS_ALIGNED(sriov_shm_off, sizeof(u32))) {
+		dev_err(gc->dev,
+			"SRIOV SHM offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n",
+			sriov_shm_off, (u64)gc->bar0_size);
+		return -EPROTO;
+	}
+
+	gc->shm_base = gc->bar0_va + sriov_base_off + sriov_shm_off;
 
 	return 0;
 }
@@ -84,6 +103,7 @@ static int mana_gd_init_pf_regs(struct pci_dev *pdev)
 static int mana_gd_init_vf_regs(struct pci_dev *pdev)
 {
 	struct gdma_context *gc = pci_get_drvdata(pdev);
+	u64 shm_off;
 
 	gc->db_page_size = mana_gd_r32(gc, GDMA_REG_DB_PAGE_SIZE) & 0xFFFF;
 
@@ -111,7 +131,17 @@ static int mana_gd_init_vf_regs(struct pci_dev *pdev)
 	gc->db_page_base = gc->bar0_va + gc->db_page_off;
 	gc->phys_db_page_base = gc->bar0_pa + gc->db_page_off;
 
-	gc->shm_base = gc->bar0_va + mana_gd_r64(gc, GDMA_REG_SHM_OFFSET);
+	shm_off = mana_gd_r64(gc, GDMA_REG_SHM_OFFSET);
+	if (shm_off >= gc->bar0_size ||
+	    gc->bar0_size - shm_off < SMC_APERTURE_SIZE ||
+	    !IS_ALIGNED(shm_off, sizeof(u32))) {
+		dev_err(gc->dev,
+			"SHM offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n",
+			shm_off, (u64)gc->bar0_size);
+		return -EPROTO;
+	}
+
+	gc->shm_base = gc->bar0_va + shm_off;
 
 	return 0;
 }
diff --git a/include/net/mana/shm_channel.h b/include/net/mana/shm_channel.h
index 5199b41497ff..dbabcfb95daf 100644
--- a/include/net/mana/shm_channel.h
+++ b/include/net/mana/shm_channel.h
@@ -4,6 +4,12 @@
 #ifndef _SHM_CHANNEL_H
 #define _SHM_CHANNEL_H
 
+#define SMC_APERTURE_BITS 256
+#define SMC_BASIC_UNIT (sizeof(u32))
+#define SMC_APERTURE_DWORDS (SMC_APERTURE_BITS / (SMC_BASIC_UNIT * 8))
+#define SMC_LAST_DWORD (SMC_APERTURE_DWORDS - 1)
+#define SMC_APERTURE_SIZE (SMC_APERTURE_BITS / 8)
+
 struct shm_channel {
 	struct device *dev;
 	void __iomem *base;
--
2.43.0
* Re: [PATCH net, v2] net: mana: Fix crash from unvalidated SHM offset read from BAR0 during FLR
2026-04-29 18:57 [PATCH net, v2] net: mana: Fix crash from unvalidated SHM offset read from BAR0 during FLR Dipayaan Roy
@ 2026-05-02 1:53 ` Jakub Kicinski
2026-05-03 3:03 ` Dipayaan Roy
0 siblings, 1 reply; 3+ messages in thread
From: Jakub Kicinski @ 2026-05-02 1:53 UTC
To: Dipayaan Roy
Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
pabeni, leon, longli, kotaranov, horms, shradhagupta, ssengar,
ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov
On Wed, 29 Apr 2026 11:57:55 -0700 Dipayaan Roy wrote:
> During Function Level Reset recovery, the MANA driver reads
> hardware BAR0 registers that may temporarily contain garbage values.
> The SHM (Shared Memory) offset read from GDMA_REG_SHM_OFFSET is used
> to compute gc->shm_base, which is later dereferenced via readl() in
> mana_smc_poll_register(). If the hardware returns an unaligned or
> out-of-range value, the driver must not blindly use it, as this would
> propagate the hardware error into a kernel crash.
>
> The following crash was observed on an arm64 Hyper-V guest running
> kernel 6.17.0-3013-azure during VF reset recovery triggered by HWC
> timeout.
>
> [13291.785274] Unable to handle kernel paging request at virtual address ffff8000a200001b
> [13291.785311] Mem abort info:
> [13291.785332] ESR = 0x0000000096000021
> [13291.785343] EC = 0x25: DABT (current EL), IL = 32 bits
> [13291.785355] SET = 0, FnV = 0
> [13291.785363] EA = 0, S1PTW = 0
> [13291.785372] FSC = 0x21: alignment fault
> [13291.785382] Data abort info:
> [13291.785391] ISV = 0, ISS = 0x00000021, ISS2 = 0x00000000
> [13291.785404] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [13291.785412] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [13291.785421] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000014df3a1000
> [13291.785432] [ffff8000a200001b] pgd=1000000100438403, p4d=1000000100438403, pud=1000000100439403, pmd=0068000fc2000711
> [13291.785703] Internal error: Oops: 0000000096000021 [#1] SMP
> [13291.830975] Modules linked in: tls qrtr mana_ib ib_uverbs ib_core xt_owner xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables cfg80211 8021q garp mrp stp llc binfmt_misc joydev serio_raw nls_iso8859_1 hid_generic aes_ce_blk aes_ce_cipher polyval_ce ghash_ce sm4_ce_gcm sm4_ce_ccm sm4_ce sm4_ce_cipher hid_hyperv sm4 sm3_ce sha3_ce hv_netvsc hid vmgenid hyperv_keyboard hyperv_drm sch_fq_codel nvme_fabrics efi_pstore dm_multipath nfnetlink vsock_loopback vmw_vsock_virtio_transport_common hv_sock vmw_vsock_vmci_transport vmw_vmci vsock dmi_sysfs ip_tables x_tables autofs4
> [13291.862630] CPU: 122 UID: 0 PID: 61796 Comm: kworker/122:2 Tainted: G W 6.17.0-3013-azure #13-Ubuntu VOLUNTARY
> [13291.869902] Tainted: [W]=WARN
> [13291.871901] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 01/08/2026
> [13291.878086] Workqueue: events mana_serv_func
> [13291.880718] pstate: 62400005 (nZCv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
> [13291.884835] pc : mana_smc_poll_register+0x48/0xb0
> [13291.887902] lr : mana_smc_setup_hwc+0x70/0x1c0
> [13291.890493] sp : ffff8000ab79bbb0
> [13291.892364] x29: ffff8000ab79bbb0 x28: ffff00410c8b5900 x27: ffff00410d630680
> [13291.896252] x26: ffff004171f9fd80 x25: 000000016ed55000 x24: 000000017f37e000
> [13291.899990] x23: 0000000000000000 x22: 000000016ed55000 x21: 0000000000000000
> [13291.904497] x20: ffff8000a200001b x19: 0000000000004e20 x18: ffff8000a6183050
> [13291.908308] x17: 0000000000000000 x16: 0000000000000000 x15: 000000000000000a
> [13291.912542] x14: 0000000000000004 x13: 0000000000000000 x12: 0000000000000000
> [13291.916298] x11: 0000000000000000 x10: 0000000000000001 x9 : ffffc45006af1bd8
> [13291.920945] x8 : ffff000151129000 x7 : 0000000000000000 x6 : 0000000000000000
> [13291.925293] x5 : 000000015f214000 x4 : 000000017217a000 x3 : 000000016ed50000
> [13291.930436] x2 : 000000016ed55000 x1 : 0000000000000000 x0 : ffff8000a1ffffff
> [13291.934342] Call trace:
> [13291.935736] mana_smc_poll_register+0x48/0xb0 (P)
> [13291.938611] mana_smc_setup_hwc+0x70/0x1c0
> [13291.941113] mana_hwc_create_channel+0x1a0/0x3a0
> [13291.944283] mana_gd_setup+0x16c/0x398
> [13291.946584] mana_gd_resume+0x24/0x70
> [13291.948917] mana_do_service+0x13c/0x1d0
> [13291.951583] mana_serv_func+0x34/0x68
> [13291.953732] process_one_work+0x168/0x3d0
> [13291.956745] worker_thread+0x2ac/0x480
> [13291.959104] kthread+0xf8/0x110
> [13291.961026] ret_from_fork+0x10/0x20
> [13291.963560] Code: d2807d00 9417c551 71000673 54000220 (b9400281)
> [13291.967299] ---[ end trace 0000000000000000 ]---
>
> Disassembly of mana_smc_poll_register() around the crash site:
>
> Disassembly of section .text:
>
> 00000000000047c8 <mana_smc_poll_register>:
> 47c8: d503201f nop
> 47cc: d503201f nop
> 47d0: d503233f paciasp
> 47d4: f800865e str x30, [x18], #8
> 47d8: a9bd7bfd stp x29, x30, [sp, #-48]!
> 47dc: 910003fd mov x29, sp
> 47e0: a90153f3 stp x19, x20, [sp, #16]
> 47e4: 91007014 add x20, x0, #0x1c
> 47e8: 5289c413 mov w19, #0x4e20
> 47ec: f90013f5 str x21, [sp, #32]
> 47f0: 12001c35 and w21, w1, #0xff
> 47f4: 14000008 b 4814 <mana_smc_poll_register+0x4c>
> 47f8: 36f801e1 tbz w1, #31, 4834 <mana_smc_poll_register+0x6c>
> 47fc: 52800042 mov w2, #0x2
> 4800: d280fa01 mov x1, #0x7d0
> 4804: d2807d00 mov x0, #0x3e8
> 4808: 94000000 bl 0 <usleep_range_state>
> 480c: 71000673 subs w19, w19, #0x1
> 4810: 54000200 b.eq 4850 <mana_smc_poll_register+0x88>
> 4814: b9400281 ldr w1, [x20] <-- **** CRASHED HERE *****
> 4818: d50331bf dmb oshld
> 481c: 2a0103e2 mov w2, w1
> ...
>
> From the crash signature x20 = ffff8000a200001b, this address
> ends in 0x1b which is not 4-byte aligned, so the 'ldr w1, [x20]'
> instruction (readl) triggers the arm64 alignment fault (FSC = 0x21).
>
> The root cause is in mana_gd_init_vf_regs(), which computes:
>
> gc->shm_base = gc->bar0_va + mana_gd_r64(gc, GDMA_REG_SHM_OFFSET);
>
> The offset is used without any validation. The same problem exists
> in mana_gd_init_pf_regs() for sriov_base_off and sriov_shm_off.
>
> Fix this by validating all offsets before use:
>
> - VF: check shm_off is within BAR0, properly aligned to 4 bytes
> (readl requirement), and leaves room for the full 256-bit
> (32-byte) SMC aperture.
>
> - PF: check sriov_base_off is within BAR0, aligned to 8 bytes
> (readq requirement), and leaves room to safely read the
> sriov_shm_off register at sriov_base_off + GDMA_PF_REG_SHM_OFF.
> Then check sriov_shm_off leaves room for the full SMC aperture.
> All arithmetic uses subtraction rather than addition to avoid
> integer overflow on garbage firmware values.
>
> without validating the offset read from hardware. If the register
> returns a garbage value that is neither within bar 0 bounds nor aligned
> to the 4-byte granularity, thus causing the alignment fault.
>
> Define SMC_APERTURE_SIZE (32 bytes, derived from the 256-bit aperture
> width)
>
> Return -EPROTO on invalid values. The existing recovery path in
> mana_serv_reset() already handles -EPROTO by falling through to PCI
> device rescan, giving the hardware another chance to present valid
> register values after reset.
>
> Fixes: 9bf66036d686 ("net: mana: Handle hardware recovery events when probing the device")
> Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
>
> ---
> Changes in v2:
> - Fix sriov_base_off alignment check: sizeof(u32) to sizeof(u64), since
> mana_gd_r64() (readq) requires 8-byte alignment on arm64.
> - Fix sriov_base_off bounds: also verify enough space remains in BAR0
> to safely read sriov_shm_off at offset GDMA_PF_REG_SHM_OFF + 8 bytes.
> - Fix integer overflow: rewrite bounds checks using subtraction
> (remaining = bar0_size - base) instead of addition.
> - Fix SMC aperture size: add gc->bar0_size - shm_off < SMC_APERTURE_SIZE
> checks in both VF and PF paths; previously only the start address was
> validated, but mana_smc_poll_register() accesses up to shm_base + 0x1c
> (28 bytes from base, 32 bytes total).
> - Export SMC_APERTURE_SIZE to shm_channel.h.
> ---
> .../net/ethernet/microsoft/mana/gdma_main.c | 40 ++++++++++++++++---
> include/net/mana/shm_channel.h | 6 +++
> 2 files changed, 41 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> index 098fbda0d128..d8e816882f02 100644
> --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> @@ -43,8 +43,9 @@ static u64 mana_gd_r64(struct gdma_context *g, u64 offset)
> static int mana_gd_init_pf_regs(struct pci_dev *pdev)
> {
> struct gdma_context *gc = pci_get_drvdata(pdev);
> - void __iomem *sriov_base_va;
> + u64 remaining_barsize;
> u64 sriov_base_off;
> + u64 sriov_shm_off;
>
> gc->db_page_size = mana_gd_r32(gc, GDMA_PF_REG_DB_PAGE_SIZE) & 0xFFFF;
>
> @@ -73,10 +74,28 @@ static int mana_gd_init_pf_regs(struct pci_dev *pdev)
> gc->phys_db_page_base = gc->bar0_pa + gc->db_page_off;
>
> sriov_base_off = mana_gd_r64(gc, GDMA_SRIOV_REG_CFG_BASE_OFF);
> + if (sriov_base_off >= gc->bar0_size ||
> + gc->bar0_size - sriov_base_off <
> + GDMA_PF_REG_SHM_OFF + sizeof(u64) ||

nit: fits on a single line, I think?

> + !IS_ALIGNED(sriov_base_off, sizeof(u64))) {
> + dev_err(gc->dev,
> + "SRIOV base offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n",
> + sriov_base_off, (u64)gc->bar0_size);
> + return -EPROTO;
> + }
>
> - sriov_base_va = gc->bar0_va + sriov_base_off;
> - gc->shm_base = sriov_base_va +
> - mana_gd_r64(gc, sriov_base_off + GDMA_PF_REG_SHM_OFF);
> + remaining_barsize = gc->bar0_size - sriov_base_off;
> + sriov_shm_off = mana_gd_r64(gc, sriov_base_off + GDMA_PF_REG_SHM_OFF);
> + if (sriov_shm_off >= remaining_barsize ||
> + remaining_barsize - sriov_shm_off < SMC_APERTURE_SIZE ||
> + !IS_ALIGNED(sriov_shm_off, sizeof(u32))) {
> + dev_err(gc->dev,
> + "SRIOV SHM offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n",
> + sriov_shm_off, (u64)gc->bar0_size);
> + return -EPROTO;
> + }
> +
> + gc->shm_base = gc->bar0_va + sriov_base_off + sriov_shm_off;
>
> return 0;
> }
> @@ -84,6 +103,7 @@ static int mana_gd_init_pf_regs(struct pci_dev *pdev)
> static int mana_gd_init_vf_regs(struct pci_dev *pdev)
> {
> struct gdma_context *gc = pci_get_drvdata(pdev);
> + u64 shm_off;
>
> gc->db_page_size = mana_gd_r32(gc, GDMA_REG_DB_PAGE_SIZE) & 0xFFFF;
>
> @@ -111,7 +131,17 @@ static int mana_gd_init_vf_regs(struct pci_dev *pdev)
> gc->db_page_base = gc->bar0_va + gc->db_page_off;
> gc->phys_db_page_base = gc->bar0_pa + gc->db_page_off;
>
> - gc->shm_base = gc->bar0_va + mana_gd_r64(gc, GDMA_REG_SHM_OFFSET);
> + shm_off = mana_gd_r64(gc, GDMA_REG_SHM_OFFSET);
> + if (shm_off >= gc->bar0_size ||
> + gc->bar0_size - shm_off < SMC_APERTURE_SIZE ||
> + !IS_ALIGNED(shm_off, sizeof(u32))) {
> + dev_err(gc->dev,
> + "SHM offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n",
> + shm_off, (u64)gc->bar0_size);
> + return -EPROTO;
> + }
> +
> + gc->shm_base = gc->bar0_va + shm_off;
>
> return 0;
> }
> diff --git a/include/net/mana/shm_channel.h b/include/net/mana/shm_channel.h
> index 5199b41497ff..dbabcfb95daf 100644
> --- a/include/net/mana/shm_channel.h
> +++ b/include/net/mana/shm_channel.h
> @@ -4,6 +4,12 @@
> #ifndef _SHM_CHANNEL_H
> #define _SHM_CHANNEL_H
>
> +#define SMC_APERTURE_BITS 256
> +#define SMC_BASIC_UNIT (sizeof(u32))
> +#define SMC_APERTURE_DWORDS (SMC_APERTURE_BITS / (SMC_BASIC_UNIT * 8))
> +#define SMC_LAST_DWORD (SMC_APERTURE_DWORDS - 1)
> +#define SMC_APERTURE_SIZE (SMC_APERTURE_BITS / 8)

AI bots complain that we're redefining this.
Since it's a fix I think it's better to remove the existing definition
even if it lives in a driver that goes via a different tree.

> struct shm_channel {
> struct device *dev;
> void __iomem *base;
--
pw-bot: cr
* Re: [PATCH net, v2] net: mana: Fix crash from unvalidated SHM offset read from BAR0 during FLR
2026-05-02 1:53 ` Jakub Kicinski
@ 2026-05-03 3:03 ` Dipayaan Roy
0 siblings, 0 replies; 3+ messages in thread
From: Dipayaan Roy @ 2026-05-03 3:03 UTC
To: Jakub Kicinski
Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
pabeni, leon, longli, kotaranov, horms, shradhagupta, ssengar,
ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov
On Fri, May 01, 2026 at 06:53:24PM -0700, Jakub Kicinski wrote:
> On Wed, 29 Apr 2026 11:57:55 -0700 Dipayaan Roy wrote:
> > During Function Level Reset recovery, the MANA driver reads
> > hardware BAR0 registers that may temporarily contain garbage values.
> > The SHM (Shared Memory) offset read from GDMA_REG_SHM_OFFSET is used
> > to compute gc->shm_base, which is later dereferenced via readl() in
> > mana_smc_poll_register(). If the hardware returns an unaligned or
> > out-of-range value, the driver must not blindly use it, as this would
> > propagate the hardware error into a kernel crash.
> >
> > The following crash was observed on an arm64 Hyper-V guest running
> > kernel 6.17.0-3013-azure during VF reset recovery triggered by HWC
> > timeout.
> >
> > [13291.785274] Unable to handle kernel paging request at virtual address ffff8000a200001b
> > [13291.785311] Mem abort info:
> > [13291.785332] ESR = 0x0000000096000021
> > [13291.785343] EC = 0x25: DABT (current EL), IL = 32 bits
> > [13291.785355] SET = 0, FnV = 0
> > [13291.785363] EA = 0, S1PTW = 0
> > [13291.785372] FSC = 0x21: alignment fault
> > [13291.785382] Data abort info:
> > [13291.785391] ISV = 0, ISS = 0x00000021, ISS2 = 0x00000000
> > [13291.785404] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> > [13291.785412] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> > [13291.785421] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000014df3a1000
> > [13291.785432] [ffff8000a200001b] pgd=1000000100438403, p4d=1000000100438403, pud=1000000100439403, pmd=0068000fc2000711
> > [13291.785703] Internal error: Oops: 0000000096000021 [#1] SMP
> > [13291.830975] Modules linked in: tls qrtr mana_ib ib_uverbs ib_core xt_owner xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables cfg80211 8021q garp mrp stp llc binfmt_misc joydev serio_raw nls_iso8859_1 hid_generic aes_ce_blk aes_ce_cipher polyval_ce ghash_ce sm4_ce_gcm sm4_ce_ccm sm4_ce sm4_ce_cipher hid_hyperv sm4 sm3_ce sha3_ce hv_netvsc hid vmgenid hyperv_keyboard hyperv_drm sch_fq_codel nvme_fabrics efi_pstore dm_multipath nfnetlink vsock_loopback vmw_vsock_virtio_transport_common hv_sock vmw_vsock_vmci_transport vmw_vmci vsock dmi_sysfs ip_tables x_tables autofs4
> > [13291.862630] CPU: 122 UID: 0 PID: 61796 Comm: kworker/122:2 Tainted: G W 6.17.0-3013-azure #13-Ubuntu VOLUNTARY
> > [13291.869902] Tainted: [W]=WARN
> > [13291.871901] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 01/08/2026
> > [13291.878086] Workqueue: events mana_serv_func
> > [13291.880718] pstate: 62400005 (nZCv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
> > [13291.884835] pc : mana_smc_poll_register+0x48/0xb0
> > [13291.887902] lr : mana_smc_setup_hwc+0x70/0x1c0
> > [13291.890493] sp : ffff8000ab79bbb0
> > [13291.892364] x29: ffff8000ab79bbb0 x28: ffff00410c8b5900 x27: ffff00410d630680
> > [13291.896252] x26: ffff004171f9fd80 x25: 000000016ed55000 x24: 000000017f37e000
> > [13291.899990] x23: 0000000000000000 x22: 000000016ed55000 x21: 0000000000000000
> > [13291.904497] x20: ffff8000a200001b x19: 0000000000004e20 x18: ffff8000a6183050
> > [13291.908308] x17: 0000000000000000 x16: 0000000000000000 x15: 000000000000000a
> > [13291.912542] x14: 0000000000000004 x13: 0000000000000000 x12: 0000000000000000
> > [13291.916298] x11: 0000000000000000 x10: 0000000000000001 x9 : ffffc45006af1bd8
> > [13291.920945] x8 : ffff000151129000 x7 : 0000000000000000 x6 : 0000000000000000
> > [13291.925293] x5 : 000000015f214000 x4 : 000000017217a000 x3 : 000000016ed50000
> > [13291.930436] x2 : 000000016ed55000 x1 : 0000000000000000 x0 : ffff8000a1ffffff
> > [13291.934342] Call trace:
> > [13291.935736] mana_smc_poll_register+0x48/0xb0 (P)
> > [13291.938611] mana_smc_setup_hwc+0x70/0x1c0
> > [13291.941113] mana_hwc_create_channel+0x1a0/0x3a0
> > [13291.944283] mana_gd_setup+0x16c/0x398
> > [13291.946584] mana_gd_resume+0x24/0x70
> > [13291.948917] mana_do_service+0x13c/0x1d0
> > [13291.951583] mana_serv_func+0x34/0x68
> > [13291.953732] process_one_work+0x168/0x3d0
> > [13291.956745] worker_thread+0x2ac/0x480
> > [13291.959104] kthread+0xf8/0x110
> > [13291.961026] ret_from_fork+0x10/0x20
> > [13291.963560] Code: d2807d00 9417c551 71000673 54000220 (b9400281)
> > [13291.967299] ---[ end trace 0000000000000000 ]---
> >
> > Disassembly of mana_smc_poll_register() around the crash site:
> >
> > Disassembly of section .text:
> >
> > 00000000000047c8 <mana_smc_poll_register>:
> > 47c8: d503201f nop
> > 47cc: d503201f nop
> > 47d0: d503233f paciasp
> > 47d4: f800865e str x30, [x18], #8
> > 47d8: a9bd7bfd stp x29, x30, [sp, #-48]!
> > 47dc: 910003fd mov x29, sp
> > 47e0: a90153f3 stp x19, x20, [sp, #16]
> > 47e4: 91007014 add x20, x0, #0x1c
> > 47e8: 5289c413 mov w19, #0x4e20
> > 47ec: f90013f5 str x21, [sp, #32]
> > 47f0: 12001c35 and w21, w1, #0xff
> > 47f4: 14000008 b 4814 <mana_smc_poll_register+0x4c>
> > 47f8: 36f801e1 tbz w1, #31, 4834 <mana_smc_poll_register+0x6c>
> > 47fc: 52800042 mov w2, #0x2
> > 4800: d280fa01 mov x1, #0x7d0
> > 4804: d2807d00 mov x0, #0x3e8
> > 4808: 94000000 bl 0 <usleep_range_state>
> > 480c: 71000673 subs w19, w19, #0x1
> > 4810: 54000200 b.eq 4850 <mana_smc_poll_register+0x88>
> > 4814: b9400281 ldr w1, [x20] <-- **** CRASHED HERE *****
> > 4818: d50331bf dmb oshld
> > 481c: 2a0103e2 mov w2, w1
> > ...
> >
> > From the crash signature x20 = ffff8000a200001b, this address
> > ends in 0x1b which is not 4-byte aligned, so the 'ldr w1, [x20]'
> > instruction (readl) triggers the arm64 alignment fault (FSC = 0x21).
> >
> > The root cause is in mana_gd_init_vf_regs(), which computes:
> >
> > gc->shm_base = gc->bar0_va + mana_gd_r64(gc, GDMA_REG_SHM_OFFSET);
> >
> > The offset is used without any validation. The same problem exists
> > in mana_gd_init_pf_regs() for sriov_base_off and sriov_shm_off.
> >
> > Fix this by validating all offsets before use:
> >
> > - VF: check shm_off is within BAR0, properly aligned to 4 bytes
> > (readl requirement), and leaves room for the full 256-bit
> > (32-byte) SMC aperture.
> >
> > - PF: check sriov_base_off is within BAR0, aligned to 8 bytes
> > (readq requirement), and leaves room to safely read the
> > sriov_shm_off register at sriov_base_off + GDMA_PF_REG_SHM_OFF.
> > Then check sriov_shm_off leaves room for the full SMC aperture.
> > All arithmetic uses subtraction rather than addition to avoid
> > integer overflow on garbage firmware values.
> >
> > without validating the offset read from hardware. If the register
> > returns a garbage value that is neither within bar 0 bounds nor aligned
> > to the 4-byte granularity, thus causing the alignment fault.
> >
> > Define SMC_APERTURE_SIZE (32 bytes, derived from the 256-bit aperture
> > width)
> >
> > Return -EPROTO on invalid values. The existing recovery path in
> > mana_serv_reset() already handles -EPROTO by falling through to PCI
> > device rescan, giving the hardware another chance to present valid
> > register values after reset.
> >
> > Fixes: 9bf66036d686 ("net: mana: Handle hardware recovery events when probing the device")
> > Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
> >
> > ---
> > Changes in v2:
> > - Fix sriov_base_off alignment check: sizeof(u32) to sizeof(u64), since
> > mana_gd_r64() (readq) requires 8-byte alignment on arm64.
> > - Fix sriov_base_off bounds: also verify enough space remains in BAR0
> > to safely read sriov_shm_off at offset GDMA_PF_REG_SHM_OFF + 8 bytes.
> > - Fix integer overflow: rewrite bounds checks using subtraction
> > (remaining = bar0_size - base) instead of addition.
> > - Fix SMC aperture size: add gc->bar0_size - shm_off < SMC_APERTURE_SIZE
> > checks in both VF and PF paths; previously only the start address was
> > validated, but mana_smc_poll_register() accesses up to shm_base + 0x1c
> > (28 bytes from base, 32 bytes total).
> > - Export SMC_APERTURE_SIZE to shm_channel.h.
> > ---
> > .../net/ethernet/microsoft/mana/gdma_main.c | 40 ++++++++++++++++---
> > include/net/mana/shm_channel.h | 6 +++
> > 2 files changed, 41 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > index 098fbda0d128..d8e816882f02 100644
> > --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > @@ -43,8 +43,9 @@ static u64 mana_gd_r64(struct gdma_context *g, u64 offset)
> > static int mana_gd_init_pf_regs(struct pci_dev *pdev)
> > {
> > struct gdma_context *gc = pci_get_drvdata(pdev);
> > - void __iomem *sriov_base_va;
> > + u64 remaining_barsize;
> > u64 sriov_base_off;
> > + u64 sriov_shm_off;
> >
> > gc->db_page_size = mana_gd_r32(gc, GDMA_PF_REG_DB_PAGE_SIZE) & 0xFFFF;
> >
> > @@ -73,10 +74,28 @@ static int mana_gd_init_pf_regs(struct pci_dev *pdev)
> > gc->phys_db_page_base = gc->bar0_pa + gc->db_page_off;
> >
> > sriov_base_off = mana_gd_r64(gc, GDMA_SRIOV_REG_CFG_BASE_OFF);
> > + if (sriov_base_off >= gc->bar0_size ||
> > + gc->bar0_size - sriov_base_off <
> > + GDMA_PF_REG_SHM_OFF + sizeof(u64) ||
>
> nit: fits on a single line, I think?
>
Joined onto one line it exceeds the 80-column limit, hence the wrap.
> > + !IS_ALIGNED(sriov_base_off, sizeof(u64))) {
> > + dev_err(gc->dev,
> > + "SRIOV base offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n",
> > + sriov_base_off, (u64)gc->bar0_size);
> > + return -EPROTO;
> > + }
> >
> > - sriov_base_va = gc->bar0_va + sriov_base_off;
> > - gc->shm_base = sriov_base_va +
> > - mana_gd_r64(gc, sriov_base_off + GDMA_PF_REG_SHM_OFF);
> > + remaining_barsize = gc->bar0_size - sriov_base_off;
> > + sriov_shm_off = mana_gd_r64(gc, sriov_base_off + GDMA_PF_REG_SHM_OFF);
> > + if (sriov_shm_off >= remaining_barsize ||
> > + remaining_barsize - sriov_shm_off < SMC_APERTURE_SIZE ||
> > + !IS_ALIGNED(sriov_shm_off, sizeof(u32))) {
> > + dev_err(gc->dev,
> > + "SRIOV SHM offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n",
> > + sriov_shm_off, (u64)gc->bar0_size);
> > + return -EPROTO;
> > + }
> > +
> > + gc->shm_base = gc->bar0_va + sriov_base_off + sriov_shm_off;
> >
> > return 0;
> > }
> > @@ -84,6 +103,7 @@ static int mana_gd_init_pf_regs(struct pci_dev *pdev)
> > static int mana_gd_init_vf_regs(struct pci_dev *pdev)
> > {
> > struct gdma_context *gc = pci_get_drvdata(pdev);
> > + u64 shm_off;
> >
> > gc->db_page_size = mana_gd_r32(gc, GDMA_REG_DB_PAGE_SIZE) & 0xFFFF;
> >
> > @@ -111,7 +131,17 @@ static int mana_gd_init_vf_regs(struct pci_dev *pdev)
> > gc->db_page_base = gc->bar0_va + gc->db_page_off;
> > gc->phys_db_page_base = gc->bar0_pa + gc->db_page_off;
> >
> > - gc->shm_base = gc->bar0_va + mana_gd_r64(gc, GDMA_REG_SHM_OFFSET);
> > + shm_off = mana_gd_r64(gc, GDMA_REG_SHM_OFFSET);
> > + if (shm_off >= gc->bar0_size ||
> > + gc->bar0_size - shm_off < SMC_APERTURE_SIZE ||
> > + !IS_ALIGNED(shm_off, sizeof(u32))) {
> > + dev_err(gc->dev,
> > + "SHM offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n",
> > + shm_off, (u64)gc->bar0_size);
> > + return -EPROTO;
> > + }
> > +
> > + gc->shm_base = gc->bar0_va + shm_off;
> >
> > return 0;
> > }
> > diff --git a/include/net/mana/shm_channel.h b/include/net/mana/shm_channel.h
> > index 5199b41497ff..dbabcfb95daf 100644
> > --- a/include/net/mana/shm_channel.h
> > +++ b/include/net/mana/shm_channel.h
> > @@ -4,6 +4,12 @@
> > #ifndef _SHM_CHANNEL_H
> > #define _SHM_CHANNEL_H
> >
> > +#define SMC_APERTURE_BITS 256
> > +#define SMC_BASIC_UNIT (sizeof(u32))
> > +#define SMC_APERTURE_DWORDS (SMC_APERTURE_BITS / (SMC_BASIC_UNIT * 8))
> > +#define SMC_LAST_DWORD (SMC_APERTURE_DWORDS - 1)
> > +#define SMC_APERTURE_SIZE (SMC_APERTURE_BITS / 8)
>
> AI bots complain that we're redefining this.
> Since it's a fix I think it's better to remove the existing definition
> even if it lives in a driver that goes via a different tree.
>
Ack, removed this in the next version.
> > struct shm_channel {
> > struct device *dev;
> > void __iomem *base;
> --
> pw-bot: cr
Hi Jakub,

Thanks for the comments; I have sent v3 addressing them.

Regards,
Dipayaan Roy
Thread overview: 3+ messages
2026-04-29 18:57 [PATCH net, v2] net: mana: Fix crash from unvalidated SHM offset read from BAR0 during FLR Dipayaan Roy
2026-05-02 1:53 ` Jakub Kicinski
2026-05-03 3:03 ` Dipayaan Roy