* [Intel-wired-lan] [net,v1] ice: Fix l2-fwd-offload toggle crash
@ 2022-10-12 15:55 Benjamin Mikailenko
2022-10-13 15:25 ` [Intel-wired-lan] [net, v1] " Michal Schmidt
0 siblings, 1 reply; 5+ messages in thread
From: Benjamin Mikailenko @ 2022-10-12 15:55 UTC (permalink / raw)
To: intel-wired-lan; +Cc: Benjamin Mikailenko
Running netperf traffic and toggling l2-fwd-offload in quick succession
caused the driver to crash.
BUG: kernel NULL pointer dereference, address: 0000000000000020
[ 861.517803] #PF: supervisor read access in kernel mode
[ 861.517805] #PF: error_code(0x0000) - not-present page
[ 861.517808] PGD 0 P4D 0
[ 861.517811] Oops: 0000 [#1] PREEMPT SMP PTI
[ 861.517815] CPU: 60 PID: 16471 Comm: netperf Kdump: loaded Tainted: G S
[ 861.517818] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE
[ 861.517820] RIP: 0010:ice_start_xmit+0xb0/0x1420 [ice]
This crash would happen because during l2-fwd-offload configuration,
ice_init_macvlan or ice_deinit_macvlan would temporarily work on Tx rings.
At the same time, ice_start_xmit would attempt to select the correct send
buffer from Tx rings but reach a NULL pointer.
Fix this by checking if ring exists before proceeding xmit. If ring does
not exist, return NETDEV_TX_BUSY.
Fixes: 2b245cb29421 ("ice: Implement transmit and NAPI support")
Signed-off-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com>
---
drivers/net/ethernet/intel/ice/ice_txrx.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 086f0b3ab68d..96bc8fad39c0 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -2405,6 +2405,8 @@ netdev_tx_t ice_start_xmit(struct sk_buff *skb, struct net_device *netdev)
struct ice_tx_ring *tx_ring;
tx_ring = vsi->tx_rings[skb->queue_mapping];
+ if (!tx_ring)
+ return NETDEV_TX_BUSY;
/* hardware can't handle really short frames, hardware padding works
* beyond this point
--
2.34.3
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [Intel-wired-lan] [net, v1] ice: Fix l2-fwd-offload toggle crash
2022-10-12 15:55 [Intel-wired-lan] [net,v1] ice: Fix l2-fwd-offload toggle crash Benjamin Mikailenko
@ 2022-10-13 15:25 ` Michal Schmidt
2022-10-13 16:51 ` Benjamin Mikailenko
2022-10-17 21:13 ` Jacob Keller
0 siblings, 2 replies; 5+ messages in thread
From: Michal Schmidt @ 2022-10-13 15:25 UTC (permalink / raw)
To: Benjamin Mikailenko; +Cc: intel-wired-lan
[-- Attachment #1.1: Type: text/plain, Size: 2400 bytes --]
On Wed, Oct 12, 2022 at 6:03 PM Benjamin Mikailenko <
benjamin.mikailenko@intel.com> wrote:
> Running netperf traffic and toggling l2-fwd-offload in quick succession
> caused the driver to crash.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000020
> [ 861.517803] #PF: supervisor read access in kernel mode
> [ 861.517805] #PF: error_code(0x0000) - not-present page
> [ 861.517808] PGD 0 P4D 0
> [ 861.517811] Oops: 0000 [#1] PREEMPT SMP PTI
> [ 861.517815] CPU: 60 PID: 16471 Comm: netperf Kdump: loaded Tainted: G S
> [ 861.517818] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE
> [ 861.517820] RIP: 0010:ice_start_xmit+0xb0/0x1420 [ice]
>
> This crash would happen because during l2-fwd-offload configuration,
> ice_init_macvlan or ice_deinit_macvlan would temporarily work on Tx rings.
>
What are ice_init_macvlan and ice_deinit_macvlan? Are they function names?
I do not see such functions in the code.
> At the same time, ice_start_xmit would attempt to select the correct send
> buffer from Tx rings but reach a NULL pointer.
>
> Fix this by checking if ring exists before proceeding xmit. If ring does
> not exist, return NETDEV_TX_BUSY.
>
Isn't it still racy though?
Shouldn't rather whatever is fiddling with the rings make sure the Tx
queues are stopped first with netif_tx_stop_queue or similar?
Michal
> Fixes: 2b245cb29421 ("ice: Implement transmit and NAPI support")
> Signed-off-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com>
> ---
> drivers/net/ethernet/intel/ice/ice_txrx.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c
> b/drivers/net/ethernet/intel/ice/ice_txrx.c
> index 086f0b3ab68d..96bc8fad39c0 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> @@ -2405,6 +2405,8 @@ netdev_tx_t ice_start_xmit(struct sk_buff *skb,
> struct net_device *netdev)
> struct ice_tx_ring *tx_ring;
>
> tx_ring = vsi->tx_rings[skb->queue_mapping];
> + if (!tx_ring)
> + return NETDEV_TX_BUSY;
>
> /* hardware can't handle really short frames, hardware padding
> works
> * beyond this point
> --
> 2.34.3
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
>
>
[-- Attachment #1.2: Type: text/html, Size: 3499 bytes --]
[-- Attachment #2: Type: text/plain, Size: 162 bytes --]
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Intel-wired-lan] [net, v1] ice: Fix l2-fwd-offload toggle crash
2022-10-13 15:25 ` [Intel-wired-lan] [net, v1] " Michal Schmidt
@ 2022-10-13 16:51 ` Benjamin Mikailenko
2022-10-17 21:14 ` Jacob Keller
2022-10-17 21:13 ` Jacob Keller
1 sibling, 1 reply; 5+ messages in thread
From: Benjamin Mikailenko @ 2022-10-13 16:51 UTC (permalink / raw)
To: Michal Schmidt; +Cc: intel-wired-lan
On 10/13/2022 8:25 AM, Michal Schmidt wrote:
> On Wed, Oct 12, 2022 at 6:03 PM Benjamin Mikailenko <benjamin.mikailenko@intel.com <mailto:benjamin.mikailenko@intel.com>> wrote:
>
> Running netperf traffic and toggling l2-fwd-offload in quick succession
> caused the driver to crash.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000020
> [ 861.517803] #PF: supervisor read access in kernel mode
> [ 861.517805] #PF: error_code(0x0000) - not-present page
> [ 861.517808] PGD 0 P4D 0
> [ 861.517811] Oops: 0000 [#1] PREEMPT SMP PTI
> [ 861.517815] CPU: 60 PID: 16471 Comm: netperf Kdump: loaded Tainted: G S
> [ 861.517818] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE
> [ 861.517820] RIP: 0010:ice_start_xmit+0xb0/0x1420 [ice]
>
> This crash would happen because during l2-fwd-offload configuration,
> ice_init_macvlan or ice_deinit_macvlan would temporarily work on Tx rings.
>
>
> What are ice_init_macvlan and ice_deinit_macvlan? Are they function names? I do not see such functions in the code.
>
Yes, but l2-fwd-offload for ICE is fixed to "off" in the kernel. So no need for those functions.
This patch should be rejected.
>
> At the same time, ice_start_xmit would attempt to select the correct send
> buffer from Tx rings but reach a NULL pointer.
>
> Fix this by checking if ring exists before proceeding xmit. If ring does
> not exist, return NETDEV_TX_BUSY.
>
>
> Isn't it still racy though?
> Shouldn't rather whatever is fiddling with the rings make sure the Tx queues are stopped first with netif_tx_stop_queue or similar?
> Michal
>
You're right here as well. I'll have a look into stopping queues.
Thanks Michal, really appreciate the review!
Ben
>
> Fixes: 2b245cb29421 ("ice: Implement transmit and NAPI support")
> Signed-off-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com <mailto:benjamin.mikailenko@intel.com>>
> ---
> drivers/net/ethernet/intel/ice/ice_txrx.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> index 086f0b3ab68d..96bc8fad39c0 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> @@ -2405,6 +2405,8 @@ netdev_tx_t ice_start_xmit(struct sk_buff *skb, struct net_device *netdev)
> struct ice_tx_ring *tx_ring;
>
> tx_ring = vsi->tx_rings[skb->queue_mapping];
> + if (!tx_ring)
> + return NETDEV_TX_BUSY;
>
> /* hardware can't handle really short frames, hardware padding works
> * beyond this point
> --
> 2.34.3
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org <mailto:Intel-wired-lan@osuosl.org>
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan <https://lists.osuosl.org/mailman/listinfo/intel-wired-lan>
>
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Intel-wired-lan] [net, v1] ice: Fix l2-fwd-offload toggle crash
2022-10-13 15:25 ` [Intel-wired-lan] [net, v1] " Michal Schmidt
2022-10-13 16:51 ` Benjamin Mikailenko
@ 2022-10-17 21:13 ` Jacob Keller
1 sibling, 0 replies; 5+ messages in thread
From: Jacob Keller @ 2022-10-17 21:13 UTC (permalink / raw)
To: intel-wired-lan
On 10/13/2022 8:25 AM, Michal Schmidt wrote:
> On Wed, Oct 12, 2022 at 6:03 PM Benjamin Mikailenko
> <benjamin.mikailenko@intel.com <mailto:benjamin.mikailenko@intel.com>>
> wrote:
>
> Running netperf traffic and toggling l2-fwd-offload in quick succession
> caused the driver to crash.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000020
> [ 861.517803] #PF: supervisor read access in kernel mode
> [ 861.517805] #PF: error_code(0x0000) - not-present page
> [ 861.517808] PGD 0 P4D 0
> [ 861.517811] Oops: 0000 [#1] PREEMPT SMP PTI
> [ 861.517815] CPU: 60 PID: 16471 Comm: netperf Kdump: loaded
> Tainted: G S
> [ 861.517818] Hardware name: Intel Corporation S2600WTT/S2600WTT,
> BIOS SE
> [ 861.517820] RIP: 0010:ice_start_xmit+0xb0/0x1420 [ice]
>
> This crash would happen because during l2-fwd-offload configuration,
> ice_init_macvlan or ice_deinit_macvlan would temporarily work on Tx
> rings.
>
>
> What are ice_init_macvlan and ice_deinit_macvlan? Are they function
> names? I do not see such functions in the code.
>
>
> At the same time, ice_start_xmit would attempt to select the correct
> send
> buffer from Tx rings but reach a NULL pointer.
>
> Fix this by checking if ring exists before proceeding xmit. If ring does
> not exist, return NETDEV_TX_BUSY.
>
>
> Isn't it still racy though?
> Shouldn't rather whatever is fiddling with the rings make sure the Tx
> queues are stopped first with netif_tx_stop_queue or similar?
> Michal
>
I was going to mention this. No sign of a lock or any other mechanism.
Ideally separately managed structures like this would benefit from
reference counting or such..
Still, sounds like this patch needs to be rejected per other comments.
Thanks,
Jake
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Intel-wired-lan] [net, v1] ice: Fix l2-fwd-offload toggle crash
2022-10-13 16:51 ` Benjamin Mikailenko
@ 2022-10-17 21:14 ` Jacob Keller
0 siblings, 0 replies; 5+ messages in thread
From: Jacob Keller @ 2022-10-17 21:14 UTC (permalink / raw)
To: intel-wired-lan
On 10/13/2022 9:51 AM, Benjamin Mikailenko wrote:
>
>
> On 10/13/2022 8:25 AM, Michal Schmidt wrote:
>> On Wed, Oct 12, 2022 at 6:03 PM Benjamin Mikailenko <benjamin.mikailenko@intel.com <mailto:benjamin.mikailenko@intel.com>> wrote:
>>
>> Running netperf traffic and toggling l2-fwd-offload in quick succession
>> caused the driver to crash.
>>
>> BUG: kernel NULL pointer dereference, address: 0000000000000020
>> [ 861.517803] #PF: supervisor read access in kernel mode
>> [ 861.517805] #PF: error_code(0x0000) - not-present page
>> [ 861.517808] PGD 0 P4D 0
>> [ 861.517811] Oops: 0000 [#1] PREEMPT SMP PTI
>> [ 861.517815] CPU: 60 PID: 16471 Comm: netperf Kdump: loaded Tainted: G S
>> [ 861.517818] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE
>> [ 861.517820] RIP: 0010:ice_start_xmit+0xb0/0x1420 [ice]
>>
>> This crash would happen because during l2-fwd-offload configuration,
>> ice_init_macvlan or ice_deinit_macvlan would temporarily work on Tx rings.
>>
>>
>> What are ice_init_macvlan and ice_deinit_macvlan? Are they function names? I do not see such functions in the code.
>>
>
> Yes, but l2-fwd-offload for ICE is fixed to "off" in the kernel. So no need for those functions.
> This patch should be rejected.
>
>>
>> At the same time, ice_start_xmit would attempt to select the correct send
>> buffer from Tx rings but reach a NULL pointer.
>>
>> Fix this by checking if ring exists before proceeding xmit. If ring does
>> not exist, return NETDEV_TX_BUSY.
>>
>>
>> Isn't it still racy though?
>> Shouldn't rather whatever is fiddling with the rings make sure the Tx queues are stopped first with netif_tx_stop_queue or similar?
>> Michal
>>
>
> You're right here as well. I'll have a look into stopping queues.
>
> Thanks Michal, really appreciate the review!
> Ben
>
Ok, I'll drop this from queue. What about this reported bug? Is it
something that can't happen on the in-kernel driver?
Thanks,
Jake
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2022-10-17 21:14 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-10-12 15:55 [Intel-wired-lan] [net,v1] ice: Fix l2-fwd-offload toggle crash Benjamin Mikailenko
2022-10-13 15:25 ` [Intel-wired-lan] [net, v1] " Michal Schmidt
2022-10-13 16:51 ` Benjamin Mikailenko
2022-10-17 21:14 ` Jacob Keller
2022-10-17 21:13 ` Jacob Keller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox