public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, andrew+netdev@lunn.ch, davem@davemloft.net,
	edumazet@google.com, pabeni@redhat.com, leon@kernel.org,
	longli@microsoft.com, kotaranov@microsoft.com, horms@kernel.org,
	shradhagupta@linux.microsoft.com, ssengar@linux.microsoft.com,
	ernis@linux.microsoft.com, shirazsaleem@microsoft.com,
	linux-hyperv@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	stephen@networkplumber.org, jacob.e.keller@intel.com,
	dipayanroy@microsoft.com, leitao@debian.org, kees@kernel.org,
	john.fastabend@gmail.com, hawk@kernel.org, bpf@vger.kernel.org,
	daniel@iogearbox.net, ast@kernel.org, sdf@fomichev.me,
	yury.norov@gmail.com
Subject: Re: [PATCH net, v2] net: mana: Fix crash from unvalidated SHM offset read from BAR0 during FLR
Date: Fri, 1 May 2026 18:53:24 -0700	[thread overview]
Message-ID: <20260501185324.0f02dc72@kernel.org> (raw)
In-Reply-To: <afJUszROT+yKjth0@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>

On Wed, 29 Apr 2026 11:57:55 -0700 Dipayaan Roy wrote:
> During Function Level Reset recovery, the MANA driver reads
> hardware BAR0 registers that may temporarily contain garbage values.
> The SHM (Shared Memory) offset read from GDMA_REG_SHM_OFFSET is used
> to compute gc->shm_base, which is later dereferenced via readl() in
> mana_smc_poll_register(). If the hardware returns an unaligned or
> out-of-range value, the driver must not blindly use it, as this would
> propagate the hardware error into a kernel crash.
> 
> The following crash was observed on an arm64 Hyper-V guest running
> kernel 6.17.0-3013-azure during VF reset recovery triggered by HWC
> timeout.
> 
> [13291.785274] Unable to handle kernel paging request at virtual address ffff8000a200001b
> [13291.785311] Mem abort info:
> [13291.785332]   ESR = 0x0000000096000021
> [13291.785343]   EC = 0x25: DABT (current EL), IL = 32 bits
> [13291.785355]   SET = 0, FnV = 0
> [13291.785363]   EA = 0, S1PTW = 0
> [13291.785372]   FSC = 0x21: alignment fault
> [13291.785382] Data abort info:
> [13291.785391]   ISV = 0, ISS = 0x00000021, ISS2 = 0x00000000
> [13291.785404]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [13291.785412]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [13291.785421] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000014df3a1000
> [13291.785432] [ffff8000a200001b] pgd=1000000100438403, p4d=1000000100438403, pud=1000000100439403, pmd=0068000fc2000711
> [13291.785703] Internal error: Oops: 0000000096000021 [#1]  SMP
> [13291.830975] Modules linked in: tls qrtr mana_ib ib_uverbs ib_core xt_owner xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables cfg80211 8021q garp mrp stp llc binfmt_misc joydev serio_raw nls_iso8859_1 hid_generic aes_ce_blk aes_ce_cipher polyval_ce ghash_ce sm4_ce_gcm sm4_ce_ccm sm4_ce sm4_ce_cipher hid_hyperv sm4 sm3_ce sha3_ce hv_netvsc hid vmgenid hyperv_keyboard hyperv_drm sch_fq_codel nvme_fabrics efi_pstore dm_multipath nfnetlink vsock_loopback vmw_vsock_virtio_transport_common hv_sock vmw_vsock_vmci_transport vmw_vmci vsock dmi_sysfs ip_tables x_tables autofs4
> [13291.862630] CPU: 122 UID: 0 PID: 61796 Comm: kworker/122:2 Tainted: G        W           6.17.0-3013-azure #13-Ubuntu VOLUNTARY
> [13291.869902] Tainted: [W]=WARN
> [13291.871901] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 01/08/2026
> [13291.878086] Workqueue: events mana_serv_func
> [13291.880718] pstate: 62400005 (nZCv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
> [13291.884835] pc : mana_smc_poll_register+0x48/0xb0
> [13291.887902] lr : mana_smc_setup_hwc+0x70/0x1c0
> [13291.890493] sp : ffff8000ab79bbb0
> [13291.892364] x29: ffff8000ab79bbb0 x28: ffff00410c8b5900 x27: ffff00410d630680
> [13291.896252] x26: ffff004171f9fd80 x25: 000000016ed55000 x24: 000000017f37e000
> [13291.899990] x23: 0000000000000000 x22: 000000016ed55000 x21: 0000000000000000
> [13291.904497] x20: ffff8000a200001b x19: 0000000000004e20 x18: ffff8000a6183050
> [13291.908308] x17: 0000000000000000 x16: 0000000000000000 x15: 000000000000000a
> [13291.912542] x14: 0000000000000004 x13: 0000000000000000 x12: 0000000000000000
> [13291.916298] x11: 0000000000000000 x10: 0000000000000001 x9 : ffffc45006af1bd8
> [13291.920945] x8 : ffff000151129000 x7 : 0000000000000000 x6 : 0000000000000000
> [13291.925293] x5 : 000000015f214000 x4 : 000000017217a000 x3 : 000000016ed50000
> [13291.930436] x2 : 000000016ed55000 x1 : 0000000000000000 x0 : ffff8000a1ffffff
> [13291.934342] Call trace:
> [13291.935736]  mana_smc_poll_register+0x48/0xb0 (P)
> [13291.938611]  mana_smc_setup_hwc+0x70/0x1c0
> [13291.941113]  mana_hwc_create_channel+0x1a0/0x3a0
> [13291.944283]  mana_gd_setup+0x16c/0x398
> [13291.946584]  mana_gd_resume+0x24/0x70
> [13291.948917]  mana_do_service+0x13c/0x1d0
> [13291.951583]  mana_serv_func+0x34/0x68
> [13291.953732]  process_one_work+0x168/0x3d0
> [13291.956745]  worker_thread+0x2ac/0x480
> [13291.959104]  kthread+0xf8/0x110
> [13291.961026]  ret_from_fork+0x10/0x20
> [13291.963560] Code: d2807d00 9417c551 71000673 54000220 (b9400281)
> [13291.967299] ---[ end trace 0000000000000000 ]---
> 
> Disassembly of mana_smc_poll_register() around the crash site:
> 
> Disassembly of section .text:
> 
> 00000000000047c8 <mana_smc_poll_register>:
>     47c8: d503201f        nop
>     47cc: d503201f        nop
>     47d0: d503233f        paciasp
>     47d4: f800865e        str     x30, [x18], #8
>     47d8: a9bd7bfd        stp     x29, x30, [sp, #-48]!
>     47dc: 910003fd        mov     x29, sp
>     47e0: a90153f3        stp     x19, x20, [sp, #16]
>     47e4: 91007014        add     x20, x0, #0x1c
>     47e8: 5289c413        mov     w19, #0x4e20
>     47ec: f90013f5        str     x21, [sp, #32]
>     47f0: 12001c35        and     w21, w1, #0xff
>     47f4: 14000008        b       4814 <mana_smc_poll_register+0x4c>
>     47f8: 36f801e1  tbz  w1, #31, 4834 <mana_smc_poll_register+0x6c>
>     47fc: 52800042        mov     w2, #0x2
>     4800: d280fa01        mov     x1, #0x7d0
>     4804: d2807d00        mov     x0, #0x3e8
>     4808: 94000000        bl      0 <usleep_range_state>
>     480c: 71000673        subs    w19, w19, #0x1
>     4810: 54000200        b.eq    4850 <mana_smc_poll_register+0x88>
>     4814: b9400281      ldr   w1, [x20] <-- **** CRASHED HERE *****
>     4818: d50331bf        dmb     oshld
>     481c: 2a0103e2        mov     w2, w1
>     ...
> 
> From the crash signature x20 = ffff8000a200001b, this address
> ends in 0x1b which is not 4-byte aligned, so the 'ldr w1, [x20]'
> instruction (readl) triggers the arm64 alignment fault (FSC = 0x21).
> 
> The root cause is in mana_gd_init_vf_regs(), which computes:
> 
>   gc->shm_base = gc->bar0_va + mana_gd_r64(gc, GDMA_REG_SHM_OFFSET);
> 
> The offset is used without any validation.  The same problem exists
> in mana_gd_init_pf_regs() for sriov_base_off and sriov_shm_off.
> 
> Fix this by validating all offsets before use:
> 
> - VF: check shm_off is within BAR0, properly aligned to 4 bytes
>   (readl requirement), and leaves room for the full 256-bit
>   (32-byte) SMC aperture.
> 
> - PF: check sriov_base_off is within BAR0, aligned to 8 bytes
>   (readq requirement), and leaves room to safely read the
>   sriov_shm_off register at sriov_base_off + GDMA_PF_REG_SHM_OFF.
>   Then check sriov_shm_off leaves room for the full SMC aperture.
>   All arithmetic uses subtraction rather than addition to avoid
>   integer overflow on garbage firmware values.
> 
> without validating the offset read from hardware. If the register
> returns a garbage value that is neither within bar 0 bounds nor aligned
> to the 4-byte granularity, thus causing the alignment fault.
> 
> Define SMC_APERTURE_SIZE (32 bytes, derived from the 256-bit aperture
> width)
> 
> Return -EPROTO on invalid values.  The existing recovery path in
> mana_serv_reset() already handles -EPROTO by falling through to PCI
> device rescan, giving the hardware another chance to present valid
> register values after reset.
> 
> Fixes: 9bf66036d686 ("net: mana: Handle hardware recovery events when probing the device")
> Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
> 
> ---
> Changes in v2:
> - Fix sriov_base_off alignment check: sizeof(u32) to sizeof(u64), since
>   mana_gd_r64() (readq) requires 8-byte alignment on arm64.
> - Fix sriov_base_off bounds: also verify enough space remains in BAR0
>   to safely read sriov_shm_off at offset GDMA_PF_REG_SHM_OFF + 8 bytes.
> - Fix integer overflow: rewrite bounds checks using subtraction
>   (remaining = bar0_size - base) instead of addition.
> - Fix SMC aperture size: add gc->bar0_size - shm_off < SMC_APERTURE_SIZE
>   checks in both VF and PF paths; previously only the start address was
>   validated, but mana_smc_poll_register() accesses up to shm_base + 0x1c
>   (28 bytes from base, 32 bytes total).
> - Export SMC_APERTURE_SIZE to shm_channel.h.
> ---
>  .../net/ethernet/microsoft/mana/gdma_main.c   | 40 ++++++++++++++++---
>  include/net/mana/shm_channel.h                |  6 +++
>  2 files changed, 41 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> index 098fbda0d128..d8e816882f02 100644
> --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> @@ -43,8 +43,9 @@ static u64 mana_gd_r64(struct gdma_context *g, u64 offset)
>  static int mana_gd_init_pf_regs(struct pci_dev *pdev)
>  {
>  	struct gdma_context *gc = pci_get_drvdata(pdev);
> -	void __iomem *sriov_base_va;
> +	u64 remaining_barsize;
>  	u64 sriov_base_off;
> +	u64 sriov_shm_off;
>  
>  	gc->db_page_size = mana_gd_r32(gc, GDMA_PF_REG_DB_PAGE_SIZE) & 0xFFFF;
>  
> @@ -73,10 +74,28 @@ static int mana_gd_init_pf_regs(struct pci_dev *pdev)
>  	gc->phys_db_page_base = gc->bar0_pa + gc->db_page_off;
>  
>  	sriov_base_off = mana_gd_r64(gc, GDMA_SRIOV_REG_CFG_BASE_OFF);
> +	if (sriov_base_off >= gc->bar0_size ||
> +	    gc->bar0_size - sriov_base_off <
> +		GDMA_PF_REG_SHM_OFF + sizeof(u64) ||

nit: fits on a single line, I think?

> +	    !IS_ALIGNED(sriov_base_off, sizeof(u64))) {
> +		dev_err(gc->dev,
> +			"SRIOV base offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n",
> +			sriov_base_off, (u64)gc->bar0_size);
> +		return -EPROTO;
> +	}
>  
> -	sriov_base_va = gc->bar0_va + sriov_base_off;
> -	gc->shm_base = sriov_base_va +
> -			mana_gd_r64(gc, sriov_base_off + GDMA_PF_REG_SHM_OFF);
> +	remaining_barsize = gc->bar0_size - sriov_base_off;
> +	sriov_shm_off = mana_gd_r64(gc, sriov_base_off + GDMA_PF_REG_SHM_OFF);
> +	if (sriov_shm_off >= remaining_barsize ||
> +	    remaining_barsize - sriov_shm_off < SMC_APERTURE_SIZE ||
> +	    !IS_ALIGNED(sriov_shm_off, sizeof(u32))) {
> +		dev_err(gc->dev,
> +			"SRIOV SHM offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n",
> +			sriov_shm_off, (u64)gc->bar0_size);
> +		return -EPROTO;
> +	}
> +
> +	gc->shm_base = gc->bar0_va + sriov_base_off + sriov_shm_off;
>  
>  	return 0;
>  }
> @@ -84,6 +103,7 @@ static int mana_gd_init_pf_regs(struct pci_dev *pdev)
>  static int mana_gd_init_vf_regs(struct pci_dev *pdev)
>  {
>  	struct gdma_context *gc = pci_get_drvdata(pdev);
> +	u64 shm_off;
>  
>  	gc->db_page_size = mana_gd_r32(gc, GDMA_REG_DB_PAGE_SIZE) & 0xFFFF;
>  
> @@ -111,7 +131,17 @@ static int mana_gd_init_vf_regs(struct pci_dev *pdev)
>  	gc->db_page_base = gc->bar0_va + gc->db_page_off;
>  	gc->phys_db_page_base = gc->bar0_pa + gc->db_page_off;
>  
> -	gc->shm_base = gc->bar0_va + mana_gd_r64(gc, GDMA_REG_SHM_OFFSET);
> +	shm_off = mana_gd_r64(gc, GDMA_REG_SHM_OFFSET);
> +	if (shm_off >= gc->bar0_size ||
> +	    gc->bar0_size - shm_off < SMC_APERTURE_SIZE ||
> +	    !IS_ALIGNED(shm_off, sizeof(u32))) {
> +		dev_err(gc->dev,
> +			"SHM offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n",
> +			shm_off, (u64)gc->bar0_size);
> +		return -EPROTO;
> +	}
> +
> +	gc->shm_base = gc->bar0_va + shm_off;
>  
>  	return 0;
>  }
> diff --git a/include/net/mana/shm_channel.h b/include/net/mana/shm_channel.h
> index 5199b41497ff..dbabcfb95daf 100644
> --- a/include/net/mana/shm_channel.h
> +++ b/include/net/mana/shm_channel.h
> @@ -4,6 +4,12 @@
>  #ifndef _SHM_CHANNEL_H
>  #define _SHM_CHANNEL_H
>  
> +#define SMC_APERTURE_BITS 256
> +#define SMC_BASIC_UNIT (sizeof(u32))
> +#define SMC_APERTURE_DWORDS (SMC_APERTURE_BITS / (SMC_BASIC_UNIT * 8))
> +#define SMC_LAST_DWORD (SMC_APERTURE_DWORDS - 1)
> +#define SMC_APERTURE_SIZE  (SMC_APERTURE_BITS / 8)

AI bots complain that we're redefining this.
Since it's a fix I think it's better to remove the existing definition
even if it lives in a driver that goes via a different tree.

>  struct shm_channel {
>  	struct device *dev;
>  	void __iomem *base;
-- 
pw-bot: cr

  reply	other threads:[~2026-05-02  1:53 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-29 18:57 [PATCH net, v2] net: mana: Fix crash from unvalidated SHM offset read from BAR0 during FLR Dipayaan Roy
2026-05-02  1:53 ` Jakub Kicinski [this message]
2026-05-03  3:03   ` Dipayaan Roy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260501185324.0f02dc72@kernel.org \
    --to=kuba@kernel.org \
    --cc=andrew+netdev@lunn.ch \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=decui@microsoft.com \
    --cc=dipayanroy@linux.microsoft.com \
    --cc=dipayanroy@microsoft.com \
    --cc=edumazet@google.com \
    --cc=ernis@linux.microsoft.com \
    --cc=haiyangz@microsoft.com \
    --cc=hawk@kernel.org \
    --cc=horms@kernel.org \
    --cc=jacob.e.keller@intel.com \
    --cc=john.fastabend@gmail.com \
    --cc=kees@kernel.org \
    --cc=kotaranov@microsoft.com \
    --cc=kys@microsoft.com \
    --cc=leitao@debian.org \
    --cc=leon@kernel.org \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=longli@microsoft.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sdf@fomichev.me \
    --cc=shirazsaleem@microsoft.com \
    --cc=shradhagupta@linux.microsoft.com \
    --cc=ssengar@linux.microsoft.com \
    --cc=stephen@networkplumber.org \
    --cc=wei.liu@kernel.org \
    --cc=yury.norov@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox