From: Marc Zyngier <maz@kernel.org>
To: Breno Leitao <leitao@debian.org>
Cc: kuba@kernel.org, Sunil Goutham <sgoutham@marvell.com>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Paolo Abeni <pabeni@redhat.com>,
horms@kernel.org,
linux-arm-kernel@lists.infradead.org (moderated list:ARM/CAVIUM
THUNDER NETWORK DRIVER),
netdev@vger.kernel.org (open list:NETWORKING DRIVERS),
linux-kernel@vger.kernel.org (open list)
Subject: Re: [PATCH net-next v2] net: thunderx: Unembed netdev structure
Date: Mon, 12 Aug 2024 15:15:41 +0100 [thread overview]
Message-ID: <864j7p25f6.wl-maz@kernel.org> (raw)
In-Reply-To: <20240626173503.87636-1-leitao@debian.org>
On Wed, 26 Jun 2024 18:35:02 +0100,
Breno Leitao <leitao@debian.org> wrote:
>
> Embedding net_device into structures prohibits the usage of flexible
> arrays in the net_device structure. For more details, see the discussion
> at [1].
>
> Un-embed the net_devices from struct lmac by converting them
> into pointers, and allocating them dynamically. Use the leverage
> alloc_netdev() to allocate the net_device object at
> bgx_lmac_enable().
>
> The free of the device occurs at bgx_lmac_disable().
>
> Do not free_netdevice() if bgx_lmac_enable() fails after lmac->netdev
> is allocated, since bgx_lmac_disable() is called if bgx_lmac_enable()
> fails, and lmac->netdev will be freed there (similarly to lmac->dmacs).
>
> Link: https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/ [1]
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
> Changelog:
>
> v2:
> * Fixed a wrong dereference in netdev_priv (Jakub)
>
> .../net/ethernet/cavium/thunder/thunder_bgx.c | 21 +++++++++++++------
> 1 file changed, 15 insertions(+), 6 deletions(-)
This patch causes my ThunderX box to explode badly:
[ 10.022118] thunder_bgx, ver 1.0
[ 10.022594] libata version 3.00 loaded.
[ 10.023226] mdio_thunder 0000:01:01.3: Added bus at 87e005003800
[ 10.023757] mdio_thunder 0000:01:01.3: Added bus at 87e005003880
[ 10.035431] thunder_bgx 0000:01:10.0: BGX0 QLM mode: XFI
[ 10.045225] Unable to handle kernel NULL pointer dereference at virtual address 00000000000005e8
[ 10.069901] Mem abort info:
[ 10.085236] ESR = 0x0000000096000044
[ 10.109767] EC = 0x25: DABT (current EL), IL = 32 bits
[ 10.145191] SET = 0, FnV = 0
[ 10.148272] EA = 0, S1PTW = 0
[ 10.151422] FSC = 0x04: level 0 translation fault
[ 10.156309] Data abort info:
[ 10.159196] ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
[ 10.164689] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[ 10.169752] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 10.175076] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000111b43000
[ 10.181533] [00000000000005e8] pgd=0000000000000000, p4d=0000000000000000
[ 10.188328] Internal error: Oops: 0000000096000044 [#1] PREEMPT SMP
[ 10.194585] Modules linked in: libahci(E) nvme(E) nvme_core(E) t10_pi(E) mdio_thunder(E) thunder_bgx(E+) libata(E) mdio_devres(E) crc64_rocksoft(E) scsi_mod(E) igb(E+) thunder_xcv(E) mdio_cavium(E) crc64(E) i2c_algo_bit(E) gpio_keys(E) usbhid(E) scsi_common(E) of_mdio(E) fixed_phy(E) fwnode_mdio(E) i2c_thunderx(E) libphy(E)
[ 10.223291] CPU: 0 PID: 341 Comm: kworker/0:4 Tainted: G E 6.10.0-rc5-01073-g94833addfaba #3309
[ 10.233368] Hardware name: GIGABYTE MT30-GS0/MT30-GS0, BIOS F02 08/06/2019
[ 10.240231] Workqueue: events work_for_cpu_fn
[ 10.244588] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 10.251540] pc : bgx_probe+0x44c/0x640 [thunder_bgx]
[ 10.256502] lr : bgx_probe+0x410/0x640 [thunder_bgx]
[ 10.261460] sp : ffff800084dd3c80
[ 10.264876] x29: ffff800084dd3c80 x28: 0000000000000000 x27: ffff000ff6772a70
[ 10.272006] x26: ffff00010cb02480 x25: 0000000000000000 x24: ffff80007a325700
[ 10.279136] x23: ffff00010c0e60c8 x22: ffff80008100a3d8 x21: ffff000ff6772a88
[ 10.286266] x20: ffff00010c0e6000 x19: ffff00010cb02480 x18: ffffffffffffffff
[ 10.293396] x17: 000000004b2d2331 x16: 00000000b606f3da x15: 0000000000000006
[ 10.300526] x14: 0000000000000000 x13: 3030383330303530 x12: 3065373820746120
[ 10.307656] x11: 7375622064656464 x10: ffff800081e158e8 x9 : ffff800080aa9a30
[ 10.314786] x8 : 0101010101010101 x7 : 0000000000000000 x6 : ffff00010c0e60c8
[ 10.321916] x5 : ffff800084dd3cf8 x4 : 0000000000000000 x3 : 0000000000000000
[ 10.329046] x2 : 0000000000000000 x1 : ffff80007a3296d0 x0 : ffff000ff6772a70
[ 10.336176] Call trace:
[ 10.338613] bgx_probe+0x44c/0x640 [thunder_bgx]
[ 10.343225] local_pci_probe+0x48/0xb8
[ 10.346966] work_for_cpu_fn+0x24/0x40
[ 10.350706] process_one_work+0x170/0x400
[ 10.354707] worker_thread+0x26c/0x388
[ 10.358446] kthread+0xfc/0x110
[ 10.361580] ret_from_fork+0x10/0x20
[ 10.365150] Code: 52800004 52800003 d2800002 f9401f47 (f902f4e6)
[ 10.371232] ---[ end trace 0000000000000000 ]---
and I've confirmed that reverting this patch on top of -rc3 restores
normal behaviour.
There are two issues with this change:
- bgx_lmac_enable() is called *after* bgx_acpi_register_phy() and
bgx_init_of_phy(), both expecting netdev to be a valid pointer.
- bgx_init_of_phy() populates the MAC addresses for *all* LMACs
attached to a given BGX instance, and thus needs netdev for each of
them to have been allocated.
I have posted a potential fix at [1].
Thanks,
M.
[1] https://lore.kernel.org/r/20240812141322.1742918-1-maz@kernel.org
--
Without deviation from the norm, progress is not possible.
prev parent reply other threads:[~2024-08-12 14:18 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-26 17:35 [PATCH net-next v2] net: thunderx: Unembed netdev structure Breno Leitao
2024-06-28 0:10 ` patchwork-bot+netdevbpf
2024-08-12 14:15 ` Marc Zyngier [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=864j7p25f6.wl-maz@kernel.org \
--to=maz@kernel.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=kuba@kernel.org \
--cc=leitao@debian.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sgoutham@marvell.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.