* [PATCH net] eth: bnxt: always recalculate features after XDP clearing, fix null-deref
@ 2025-01-09 4:30 Jakub Kicinski
2025-01-09 5:38 ` Michael Chan
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Jakub Kicinski @ 2025-01-09 4:30 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms, Jakub Kicinski,
michael.chan, daniel, hawk, john.fastabend, andrew.gospodarek,
pavan.chebbi
Recalculate features when XDP is detached.
Before:
# ip li set dev eth0 xdp obj xdp_dummy.bpf.o sec xdp
# ip li set dev eth0 xdp off
# ethtool -k eth0 | grep gro
rx-gro-hw: off [requested on]
After:
# ip li set dev eth0 xdp obj xdp_dummy.bpf.o sec xdp
# ip li set dev eth0 xdp off
# ethtool -k eth0 | grep gro
rx-gro-hw: on
The fact that HW-GRO doesn't get re-enabled automatically is just
a minor annoyance. The real issue is that the features will randomly
come back during another reconfiguration which just happens to invoke
netdev_update_features(). The driver doesn't handle reconfiguring
two things at a time very robustly.
Starting with commit 98ba1d931f61 ("bnxt_en: Fix RSS logic in
__bnxt_reserve_rings()") we only reconfigure the RSS hash table
if the "effective" number of Rx rings has changed. If HW-GRO is
enabled "effective" number of rings is 2x what user sees.
So if we are in the bad state, with HW-GRO re-enablement "pending"
after XDP off, and we lower the rings by / 2 - the HW-GRO rings
doing 2x and the ethtool -L doing / 2 may cancel each other out,
and the:
if (old_rx_rings != bp->hw_resc.resv_rx_rings &&
condition in __bnxt_reserve_rings() will be false.
The RSS map won't get updated, and we'll crash with:
BUG: kernel NULL pointer dereference, address: 0000000000000168
RIP: 0010:__bnxt_hwrm_vnic_set_rss+0x13a/0x1a0
bnxt_hwrm_vnic_rss_cfg_p5+0x47/0x180
__bnxt_setup_vnic_p5+0x58/0x110
bnxt_init_nic+0xb72/0xf50
__bnxt_open_nic+0x40d/0xab0
bnxt_open_nic+0x2b/0x60
ethtool_set_channels+0x18c/0x1d0
As we try to access a freed ring.
The issue is present since XDP support was added, really, but
prior to commit 98ba1d931f61 ("bnxt_en: Fix RSS logic in
__bnxt_reserve_rings()") it wasn't causing major issues.
Fixes: 1054aee82321 ("bnxt_en: Use NETIF_F_GRO_HW.")
Fixes: 98ba1d931f61 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: michael.chan@broadcom.com
CC: daniel@iogearbox.net
CC: hawk@kernel.org
CC: john.fastabend@gmail.com
CC: andrew.gospodarek@broadcom.com
CC: pavan.chebbi@broadcom.com
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 25 +++++++++++++++----
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2 +-
drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 7 ------
3 files changed, 21 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index aeaa74f03046..b6f844cac80e 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4708,7 +4708,7 @@ void bnxt_set_ring_params(struct bnxt *bp)
/* Changing allocation mode of RX rings.
* TODO: Update when extending xdp_rxq_info to support allocation modes.
*/
-int bnxt_set_rx_skb_mode(struct bnxt *bp, bool page_mode)
+static void __bnxt_set_rx_skb_mode(struct bnxt *bp, bool page_mode)
{
struct net_device *dev = bp->dev;
@@ -4729,15 +4729,30 @@ int bnxt_set_rx_skb_mode(struct bnxt *bp, bool page_mode)
bp->rx_skb_func = bnxt_rx_page_skb;
}
bp->rx_dir = DMA_BIDIRECTIONAL;
- /* Disable LRO or GRO_HW */
- netdev_update_features(dev);
} else {
dev->max_mtu = bp->max_mtu;
bp->flags &= ~BNXT_FLAG_RX_PAGE_MODE;
bp->rx_dir = DMA_FROM_DEVICE;
bp->rx_skb_func = bnxt_rx_skb;
}
- return 0;
+}
+
+void bnxt_set_rx_skb_mode(struct bnxt *bp, bool page_mode)
+{
+ __bnxt_set_rx_skb_mode(bp, page_mode);
+
+ if (!page_mode) {
+ int rx, tx;
+
+ bnxt_get_max_rings(bp, &rx, &tx, true);
+ if (rx > 1) {
+ bp->flags &= ~BNXT_FLAG_NO_AGG_RINGS;
+ bp->dev->hw_features |= NETIF_F_LRO;
+ }
+ }
+
+ /* Update LRO and GRO_HW availability */
+ netdev_update_features(bp->dev);
}
static void bnxt_free_vnic_attributes(struct bnxt *bp)
@@ -16214,7 +16229,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
if (bp->max_fltr < BNXT_MAX_FLTR)
bp->max_fltr = BNXT_MAX_FLTR;
bnxt_init_l2_fltr_tbl(bp);
- bnxt_set_rx_skb_mode(bp, false);
+ __bnxt_set_rx_skb_mode(bp, false);
bnxt_set_tpa_flags(bp);
bnxt_set_ring_params(bp);
bnxt_rdma_aux_device_init(bp);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 7df7a2233307..f11ed59203d9 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -2846,7 +2846,7 @@ u32 bnxt_fw_health_readl(struct bnxt *bp, int reg_idx);
bool bnxt_bs_trace_avail(struct bnxt *bp, u16 type);
void bnxt_set_tpa_flags(struct bnxt *bp);
void bnxt_set_ring_params(struct bnxt *);
-int bnxt_set_rx_skb_mode(struct bnxt *bp, bool page_mode);
+void bnxt_set_rx_skb_mode(struct bnxt *bp, bool page_mode);
void bnxt_insert_usr_fltr(struct bnxt *bp, struct bnxt_filter_base *fltr);
void bnxt_del_one_usr_fltr(struct bnxt *bp, struct bnxt_filter_base *fltr);
int bnxt_hwrm_func_drv_rgtr(struct bnxt *bp, unsigned long *bmap,
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
index f88b641533fc..dc51dce209d5 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
@@ -422,15 +422,8 @@ static int bnxt_xdp_set(struct bnxt *bp, struct bpf_prog *prog)
bnxt_set_rx_skb_mode(bp, true);
xdp_features_set_redirect_target(dev, true);
} else {
- int rx, tx;
-
xdp_features_clear_redirect_target(dev);
bnxt_set_rx_skb_mode(bp, false);
- bnxt_get_max_rings(bp, &rx, &tx, true);
- if (rx > 1) {
- bp->flags &= ~BNXT_FLAG_NO_AGG_RINGS;
- bp->dev->hw_features |= NETIF_F_LRO;
- }
}
bp->tx_nr_rings_xdp = tx_xdp;
bp->tx_nr_rings = bp->tx_nr_rings_per_tc * tc + tx_xdp;
--
2.47.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH net] eth: bnxt: always recalculate features after XDP clearing, fix null-deref
2025-01-09 4:30 [PATCH net] eth: bnxt: always recalculate features after XDP clearing, fix null-deref Jakub Kicinski
@ 2025-01-09 5:38 ` Michael Chan
2025-01-09 5:51 ` Somnath Kotur
2025-01-11 2:10 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 4+ messages in thread
From: Michael Chan @ 2025-01-09 5:38 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, daniel,
hawk, john.fastabend, andrew.gospodarek, pavan.chebbi
[-- Attachment #1: Type: text/plain, Size: 2350 bytes --]
On Wed, Jan 8, 2025 at 8:31 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> Recalculate features when XDP is detached.
>
> Before:
> # ip li set dev eth0 xdp obj xdp_dummy.bpf.o sec xdp
> # ip li set dev eth0 xdp off
> # ethtool -k eth0 | grep gro
> rx-gro-hw: off [requested on]
>
> After:
> # ip li set dev eth0 xdp obj xdp_dummy.bpf.o sec xdp
> # ip li set dev eth0 xdp off
> # ethtool -k eth0 | grep gro
> rx-gro-hw: on
>
> The fact that HW-GRO doesn't get re-enabled automatically is just
> a minor annoyance.
I knew about the annoyance, but didn't know about this possible crash.
> The real issue is that the features will randomly
> come back during another reconfiguration which just happens to invoke
> netdev_update_features(). The driver doesn't handle reconfiguring
> two things at a time very robustly.
>
> Starting with commit 98ba1d931f61 ("bnxt_en: Fix RSS logic in
> __bnxt_reserve_rings()") we only reconfigure the RSS hash table
> if the "effective" number of Rx rings has changed. If HW-GRO is
> enabled "effective" number of rings is 2x what user sees.
> So if we are in the bad state, with HW-GRO re-enablement "pending"
> after XDP off, and we lower the rings by / 2 - the HW-GRO rings
> doing 2x and the ethtool -L doing / 2 may cancel each other out,
> and the:
>
> if (old_rx_rings != bp->hw_resc.resv_rx_rings &&
>
> condition in __bnxt_reserve_rings() will be false.
> The RSS map won't get updated, and we'll crash with:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000168
> RIP: 0010:__bnxt_hwrm_vnic_set_rss+0x13a/0x1a0
> bnxt_hwrm_vnic_rss_cfg_p5+0x47/0x180
> __bnxt_setup_vnic_p5+0x58/0x110
> bnxt_init_nic+0xb72/0xf50
> __bnxt_open_nic+0x40d/0xab0
> bnxt_open_nic+0x2b/0x60
> ethtool_set_channels+0x18c/0x1d0
>
> As we try to access a freed ring.
>
> The issue is present since XDP support was added, really, but
> prior to commit 98ba1d931f61 ("bnxt_en: Fix RSS logic in
> __bnxt_reserve_rings()") it wasn't causing major issues.
>
> Fixes: 1054aee82321 ("bnxt_en: Use NETIF_F_GRO_HW.")
> Fixes: 98ba1d931f61 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()")
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Thanks for the patch.
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH net] eth: bnxt: always recalculate features after XDP clearing, fix null-deref
2025-01-09 4:30 [PATCH net] eth: bnxt: always recalculate features after XDP clearing, fix null-deref Jakub Kicinski
2025-01-09 5:38 ` Michael Chan
@ 2025-01-09 5:51 ` Somnath Kotur
2025-01-11 2:10 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 4+ messages in thread
From: Somnath Kotur @ 2025-01-09 5:51 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms,
michael.chan, daniel, hawk, john.fastabend, andrew.gospodarek,
pavan.chebbi
[-- Attachment #1: Type: text/plain, Size: 6827 bytes --]
On Thu, Jan 9, 2025 at 10:01 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> Recalculate features when XDP is detached.
>
> Before:
> # ip li set dev eth0 xdp obj xdp_dummy.bpf.o sec xdp
> # ip li set dev eth0 xdp off
> # ethtool -k eth0 | grep gro
> rx-gro-hw: off [requested on]
>
> After:
> # ip li set dev eth0 xdp obj xdp_dummy.bpf.o sec xdp
> # ip li set dev eth0 xdp off
> # ethtool -k eth0 | grep gro
> rx-gro-hw: on
>
> The fact that HW-GRO doesn't get re-enabled automatically is just
> a minor annoyance. The real issue is that the features will randomly
> come back during another reconfiguration which just happens to invoke
> netdev_update_features(). The driver doesn't handle reconfiguring
> two things at a time very robustly.
>
> Starting with commit 98ba1d931f61 ("bnxt_en: Fix RSS logic in
> __bnxt_reserve_rings()") we only reconfigure the RSS hash table
> if the "effective" number of Rx rings has changed. If HW-GRO is
> enabled "effective" number of rings is 2x what user sees.
> So if we are in the bad state, with HW-GRO re-enablement "pending"
> after XDP off, and we lower the rings by / 2 - the HW-GRO rings
> doing 2x and the ethtool -L doing / 2 may cancel each other out,
> and the:
>
> if (old_rx_rings != bp->hw_resc.resv_rx_rings &&
>
> condition in __bnxt_reserve_rings() will be false.
> The RSS map won't get updated, and we'll crash with:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000168
> RIP: 0010:__bnxt_hwrm_vnic_set_rss+0x13a/0x1a0
> bnxt_hwrm_vnic_rss_cfg_p5+0x47/0x180
> __bnxt_setup_vnic_p5+0x58/0x110
> bnxt_init_nic+0xb72/0xf50
> __bnxt_open_nic+0x40d/0xab0
> bnxt_open_nic+0x2b/0x60
> ethtool_set_channels+0x18c/0x1d0
>
> As we try to access a freed ring.
>
> The issue is present since XDP support was added, really, but
> prior to commit 98ba1d931f61 ("bnxt_en: Fix RSS logic in
> __bnxt_reserve_rings()") it wasn't causing major issues.
>
> Fixes: 1054aee82321 ("bnxt_en: Use NETIF_F_GRO_HW.")
> Fixes: 98ba1d931f61 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()")
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: michael.chan@broadcom.com
> CC: daniel@iogearbox.net
> CC: hawk@kernel.org
> CC: john.fastabend@gmail.com
> CC: andrew.gospodarek@broadcom.com
> CC: pavan.chebbi@broadcom.com
> ---
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 25 +++++++++++++++----
> drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2 +-
> drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 7 ------
> 3 files changed, 21 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index aeaa74f03046..b6f844cac80e 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -4708,7 +4708,7 @@ void bnxt_set_ring_params(struct bnxt *bp)
> /* Changing allocation mode of RX rings.
> * TODO: Update when extending xdp_rxq_info to support allocation modes.
> */
> -int bnxt_set_rx_skb_mode(struct bnxt *bp, bool page_mode)
> +static void __bnxt_set_rx_skb_mode(struct bnxt *bp, bool page_mode)
> {
> struct net_device *dev = bp->dev;
>
> @@ -4729,15 +4729,30 @@ int bnxt_set_rx_skb_mode(struct bnxt *bp, bool page_mode)
> bp->rx_skb_func = bnxt_rx_page_skb;
> }
> bp->rx_dir = DMA_BIDIRECTIONAL;
> - /* Disable LRO or GRO_HW */
> - netdev_update_features(dev);
> } else {
> dev->max_mtu = bp->max_mtu;
> bp->flags &= ~BNXT_FLAG_RX_PAGE_MODE;
> bp->rx_dir = DMA_FROM_DEVICE;
> bp->rx_skb_func = bnxt_rx_skb;
> }
> - return 0;
> +}
> +
> +void bnxt_set_rx_skb_mode(struct bnxt *bp, bool page_mode)
> +{
> + __bnxt_set_rx_skb_mode(bp, page_mode);
> +
> + if (!page_mode) {
> + int rx, tx;
> +
> + bnxt_get_max_rings(bp, &rx, &tx, true);
> + if (rx > 1) {
> + bp->flags &= ~BNXT_FLAG_NO_AGG_RINGS;
> + bp->dev->hw_features |= NETIF_F_LRO;
> + }
> + }
> +
> + /* Update LRO and GRO_HW availability */
> + netdev_update_features(bp->dev);
> }
>
> static void bnxt_free_vnic_attributes(struct bnxt *bp)
> @@ -16214,7 +16229,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
> if (bp->max_fltr < BNXT_MAX_FLTR)
> bp->max_fltr = BNXT_MAX_FLTR;
> bnxt_init_l2_fltr_tbl(bp);
> - bnxt_set_rx_skb_mode(bp, false);
> + __bnxt_set_rx_skb_mode(bp, false);
> bnxt_set_tpa_flags(bp);
> bnxt_set_ring_params(bp);
> bnxt_rdma_aux_device_init(bp);
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> index 7df7a2233307..f11ed59203d9 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> @@ -2846,7 +2846,7 @@ u32 bnxt_fw_health_readl(struct bnxt *bp, int reg_idx);
> bool bnxt_bs_trace_avail(struct bnxt *bp, u16 type);
> void bnxt_set_tpa_flags(struct bnxt *bp);
> void bnxt_set_ring_params(struct bnxt *);
> -int bnxt_set_rx_skb_mode(struct bnxt *bp, bool page_mode);
> +void bnxt_set_rx_skb_mode(struct bnxt *bp, bool page_mode);
> void bnxt_insert_usr_fltr(struct bnxt *bp, struct bnxt_filter_base *fltr);
> void bnxt_del_one_usr_fltr(struct bnxt *bp, struct bnxt_filter_base *fltr);
> int bnxt_hwrm_func_drv_rgtr(struct bnxt *bp, unsigned long *bmap,
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
> index f88b641533fc..dc51dce209d5 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
> @@ -422,15 +422,8 @@ static int bnxt_xdp_set(struct bnxt *bp, struct bpf_prog *prog)
> bnxt_set_rx_skb_mode(bp, true);
> xdp_features_set_redirect_target(dev, true);
> } else {
> - int rx, tx;
> -
> xdp_features_clear_redirect_target(dev);
> bnxt_set_rx_skb_mode(bp, false);
> - bnxt_get_max_rings(bp, &rx, &tx, true);
> - if (rx > 1) {
> - bp->flags &= ~BNXT_FLAG_NO_AGG_RINGS;
> - bp->dev->hw_features |= NETIF_F_LRO;
> - }
> }
> bp->tx_nr_rings_xdp = tx_xdp;
> bp->tx_nr_rings = bp->tx_nr_rings_per_tc * tc + tx_xdp;
> --
> 2.47.1
>
>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4212 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH net] eth: bnxt: always recalculate features after XDP clearing, fix null-deref
2025-01-09 4:30 [PATCH net] eth: bnxt: always recalculate features after XDP clearing, fix null-deref Jakub Kicinski
2025-01-09 5:38 ` Michael Chan
2025-01-09 5:51 ` Somnath Kotur
@ 2025-01-11 2:10 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 4+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-01-11 2:10 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms,
michael.chan, daniel, hawk, john.fastabend, andrew.gospodarek,
pavan.chebbi
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Wed, 8 Jan 2025 20:30:57 -0800 you wrote:
> Recalculate features when XDP is detached.
>
> Before:
> # ip li set dev eth0 xdp obj xdp_dummy.bpf.o sec xdp
> # ip li set dev eth0 xdp off
> # ethtool -k eth0 | grep gro
> rx-gro-hw: off [requested on]
>
> [...]
Here is the summary with links:
- [net] eth: bnxt: always recalculate features after XDP clearing, fix null-deref
https://git.kernel.org/netdev/net/c/f0aa6a37a3db
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-01-11 2:10 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-09 4:30 [PATCH net] eth: bnxt: always recalculate features after XDP clearing, fix null-deref Jakub Kicinski
2025-01-09 5:38 ` Michael Chan
2025-01-09 5:51 ` Somnath Kotur
2025-01-11 2:10 ` patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).