* [PATCH net v3 0/4] Fix race conditions in ndo_get_stats64
@ 2024-12-18 11:51 Shinas Rasheed
2024-12-18 11:51 ` [PATCH net v3 1/4] octeon_ep: fix " Shinas Rasheed
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Shinas Rasheed @ 2024-12-18 11:51 UTC (permalink / raw)
To: netdev, linux-kernel
Cc: hgani, sedara, vimleshk, thaller, wizhao, kheib, konguyen, horms,
einstein.xue, Shinas Rasheed
Fix race conditions in ndo_get_stats64 by implementing a state variable
check, and remove unnecessary firmware stats fetch which is currently
unnecessary
Changes:
V3:
- Added warn log that happened due to rcu_read_lock in commit message
V2: https://lore.kernel.org/all/20241216075842.2394606-1-srasheed@marvell.com/
- Changed sync mechanism to fix race conditions from using an atomic
set_bit ops to a much simpler synchronize_net()
V1: https://lore.kernel.org/all/20241203072130.2316913-1-srasheed@marvell.com/
Shinas Rasheed (4):
octeon_ep: fix race conditions in ndo_get_stats64
octeon_ep: remove firmware stats fetch in ndo_get_stats64
octeon_ep_vf: fix race conditions in ndo_get_stats64
octeon_ep_vf: remove firmware stats fetch in ndo_get_stats64
drivers/net/ethernet/marvell/octeon_ep/octep_main.c | 11 +----------
.../net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c | 9 +--------
2 files changed, 2 insertions(+), 18 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH net v3 1/4] octeon_ep: fix race conditions in ndo_get_stats64
2024-12-18 11:51 [PATCH net v3 0/4] Fix race conditions in ndo_get_stats64 Shinas Rasheed
@ 2024-12-18 11:51 ` Shinas Rasheed
2024-12-20 3:21 ` Jakub Kicinski
2024-12-18 11:51 ` [PATCH net v3 2/4] octeon_ep: remove firmware stats fetch " Shinas Rasheed
` (2 subsequent siblings)
3 siblings, 1 reply; 8+ messages in thread
From: Shinas Rasheed @ 2024-12-18 11:51 UTC (permalink / raw)
To: netdev, linux-kernel
Cc: hgani, sedara, vimleshk, thaller, wizhao, kheib, konguyen, horms,
einstein.xue, Shinas Rasheed, Veerasenareddy Burru, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Satananda Burla, Abhijit Ayarekar
ndo_get_stats64() can race with ndo_stop(), which frees input and
output queue resources. Call synchronize_net() to avoid such races.
Fixes: 6a610a46bad1 ("octeon_ep: add support for ndo ops")
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
---
V3:
- No changes
V2: https://lore.kernel.org/all/20241216075842.2394606-2-srasheed@marvell.com/
- Changed sync mechanism to fix race conditions from using an atomic
set_bit ops to a much simpler synchronize_net()
V1: https://lore.kernel.org/all/20241203072130.2316913-2-srasheed@marvell.com/
drivers/net/ethernet/marvell/octeon_ep/octep_main.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_main.c b/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
index 549436efc204..941bbaaa67b5 100644
--- a/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
+++ b/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
@@ -757,6 +757,7 @@ static int octep_stop(struct net_device *netdev)
{
struct octep_device *oct = netdev_priv(netdev);
+ synchronize_net();
netdev_info(netdev, "Stopping the device ...\n");
octep_ctrl_net_set_link_status(oct, OCTEP_CTRL_NET_INVALID_VFID, false,
--
2.25.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH net v3 2/4] octeon_ep: remove firmware stats fetch in ndo_get_stats64
2024-12-18 11:51 [PATCH net v3 0/4] Fix race conditions in ndo_get_stats64 Shinas Rasheed
2024-12-18 11:51 ` [PATCH net v3 1/4] octeon_ep: fix " Shinas Rasheed
@ 2024-12-18 11:51 ` Shinas Rasheed
2024-12-20 3:23 ` Jakub Kicinski
2024-12-18 11:51 ` [PATCH net v3 3/4] octeon_ep_vf: fix race conditions " Shinas Rasheed
2024-12-18 11:51 ` [PATCH net v3 4/4] octeon_ep_vf: remove firmware stats fetch " Shinas Rasheed
3 siblings, 1 reply; 8+ messages in thread
From: Shinas Rasheed @ 2024-12-18 11:51 UTC (permalink / raw)
To: netdev, linux-kernel
Cc: hgani, sedara, vimleshk, thaller, wizhao, kheib, konguyen, horms,
einstein.xue, Shinas Rasheed, Veerasenareddy Burru, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Abhijit Ayarekar, Satananda Burla
The per queue stats are available already and are retrieved
from register reads during ndo_get_stats64. The firmware stats
fetch call that happens in ndo_get_stats64() is currently not
required
The warn log is given below:
[ 123.316837] ------------[ cut here ]------------
[ 123.316840] Voluntary context switch within RCU read-side critical section!
[ 123.316917] pc : rcu_note_context_switch+0x2e4/0x300
[ 123.316919] lr : rcu_note_context_switch+0x2e4/0x300
[ 123.316947] Call trace:
[ 123.316949] rcu_note_context_switch+0x2e4/0x300
[ 123.316952] __schedule+0x84/0x584
[ 123.316955] schedule+0x38/0x90
[ 123.316956] schedule_timeout+0xa0/0x1d4
[ 123.316959] octep_send_mbox_req+0x190/0x230 [octeon_ep]
[ 123.316966] octep_ctrl_net_get_if_stats+0x78/0x100 [octeon_ep]
[ 123.316970] octep_get_stats64+0xd4/0xf0 [octeon_ep]
[ 123.316975] dev_get_stats+0x4c/0x114
[ 123.316977] dev_seq_printf_stats+0x3c/0x11c
[ 123.316980] dev_seq_show+0x1c/0x40
[ 123.316982] seq_read_iter+0x3cc/0x4e0
[ 123.316985] seq_read+0xc8/0x110
[ 123.316987] proc_reg_read+0x9c/0xec
[ 123.316990] vfs_read+0xc8/0x2ec
[ 123.316993] ksys_read+0x70/0x100
[ 123.316995] __arm64_sys_read+0x20/0x30
[ 123.316997] invoke_syscall.constprop.0+0x7c/0xd0
[ 123.317000] do_el0_svc+0xb4/0xd0
[ 123.317002] el0_svc+0xe8/0x1f4
[ 123.317005] el0t_64_sync_handler+0x134/0x150
[ 123.317006] el0t_64_sync+0x17c/0x180
[ 123.317008] ---[ end trace 63399811432ab69b ]---
Fixes: 6a610a46bad1 ("octeon_ep: add support for ndo ops")
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
---
V3:
- Added warn log that happened due to rcu_read_lock in commit message
V2: https://lore.kernel.org/all/20241216075842.2394606-3-srasheed@marvell.com/
- No changes
V1: https://lore.kernel.org/all/20241203072130.2316913-3-srasheed@marvell.com/
drivers/net/ethernet/marvell/octeon_ep/octep_main.c | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_main.c b/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
index 941bbaaa67b5..6400d6008097 100644
--- a/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
+++ b/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
@@ -996,12 +996,6 @@ static void octep_get_stats64(struct net_device *netdev,
struct octep_device *oct = netdev_priv(netdev);
int q;
- if (netif_running(netdev))
- octep_ctrl_net_get_if_stats(oct,
- OCTEP_CTRL_NET_INVALID_VFID,
- &oct->iface_rx_stats,
- &oct->iface_tx_stats);
-
tx_packets = 0;
tx_bytes = 0;
rx_packets = 0;
@@ -1019,10 +1013,6 @@ static void octep_get_stats64(struct net_device *netdev,
stats->tx_bytes = tx_bytes;
stats->rx_packets = rx_packets;
stats->rx_bytes = rx_bytes;
- stats->multicast = oct->iface_rx_stats.mcast_pkts;
- stats->rx_errors = oct->iface_rx_stats.err_pkts;
- stats->collisions = oct->iface_tx_stats.xscol;
- stats->tx_fifo_errors = oct->iface_tx_stats.undflw;
}
/**
--
2.25.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH net v3 3/4] octeon_ep_vf: fix race conditions in ndo_get_stats64
2024-12-18 11:51 [PATCH net v3 0/4] Fix race conditions in ndo_get_stats64 Shinas Rasheed
2024-12-18 11:51 ` [PATCH net v3 1/4] octeon_ep: fix " Shinas Rasheed
2024-12-18 11:51 ` [PATCH net v3 2/4] octeon_ep: remove firmware stats fetch " Shinas Rasheed
@ 2024-12-18 11:51 ` Shinas Rasheed
2024-12-20 3:21 ` Jakub Kicinski
2024-12-18 11:51 ` [PATCH net v3 4/4] octeon_ep_vf: remove firmware stats fetch " Shinas Rasheed
3 siblings, 1 reply; 8+ messages in thread
From: Shinas Rasheed @ 2024-12-18 11:51 UTC (permalink / raw)
To: netdev, linux-kernel
Cc: hgani, sedara, vimleshk, thaller, wizhao, kheib, konguyen, horms,
einstein.xue, Shinas Rasheed, Veerasenareddy Burru,
Satananda Burla, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni
ndo_get_stats64() can race with ndo_stop(), which frees input and
output queue resources. Call synchornize_net() in ndo_stop()
to avoid such races.
Fixes: c3fad23cdc06 ("octeon_ep_vf: add support for ndo ops")
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
---
V3:
- No changes
V2: https://lore.kernel.org/all/20241216075842.2394606-4-srasheed@marvell.com/
- Changed sync mechanism to fix race conditions from using an atomic
set_bit ops to a much simpler synchronize_net()
V1: https://lore.kernel.org/all/20241203072130.2316913-4-srasheed@marvell.com/
drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c
index 7e6771c9cdbb..e6253f85b623 100644
--- a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c
+++ b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c
@@ -523,6 +523,7 @@ static int octep_vf_stop(struct net_device *netdev)
{
struct octep_vf_device *oct = netdev_priv(netdev);
+ synchronize_rcu();
netdev_info(netdev, "Stopping the device ...\n");
/* Stop Tx from stack */
--
2.25.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH net v3 4/4] octeon_ep_vf: remove firmware stats fetch in ndo_get_stats64
2024-12-18 11:51 [PATCH net v3 0/4] Fix race conditions in ndo_get_stats64 Shinas Rasheed
` (2 preceding siblings ...)
2024-12-18 11:51 ` [PATCH net v3 3/4] octeon_ep_vf: fix race conditions " Shinas Rasheed
@ 2024-12-18 11:51 ` Shinas Rasheed
3 siblings, 0 replies; 8+ messages in thread
From: Shinas Rasheed @ 2024-12-18 11:51 UTC (permalink / raw)
To: netdev, linux-kernel
Cc: hgani, sedara, vimleshk, thaller, wizhao, kheib, konguyen, horms,
einstein.xue, Shinas Rasheed, Veerasenareddy Burru,
Satananda Burla, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni
The per queue stats are available already and are retrieved
from register reads during ndo_get_stats64. The firmware stats
fetch call that happens in ndo_get_stats64() is currently not
required
The warn log is given below:
[ 123.316837] ------------[ cut here ]------------
[ 123.316840] Voluntary context switch within RCU read-side critical section!
[ 123.316917] pc : rcu_note_context_switch+0x2e4/0x300
[ 123.316919] lr : rcu_note_context_switch+0x2e4/0x300
[ 123.316947] Call trace:
[ 123.316949] rcu_note_context_switch+0x2e4/0x300
[ 123.316952] __schedule+0x84/0x584
[ 123.316955] schedule+0x38/0x90
[ 123.316956] schedule_timeout+0xa0/0x1d4
[ 123.316959] octep_send_mbox_req+0x190/0x230 [octeon_ep]
[ 123.316966] octep_ctrl_net_get_if_stats+0x78/0x100 [octeon_ep]
[ 123.316970] octep_get_stats64+0xd4/0xf0 [octeon_ep]
[ 123.316975] dev_get_stats+0x4c/0x114
[ 123.316977] dev_seq_printf_stats+0x3c/0x11c
[ 123.316980] dev_seq_show+0x1c/0x40
[ 123.316982] seq_read_iter+0x3cc/0x4e0
[ 123.316985] seq_read+0xc8/0x110
[ 123.316987] proc_reg_read+0x9c/0xec
[ 123.316990] vfs_read+0xc8/0x2ec
[ 123.316993] ksys_read+0x70/0x100
[ 123.316995] __arm64_sys_read+0x20/0x30
[ 123.316997] invoke_syscall.constprop.0+0x7c/0xd0
[ 123.317000] do_el0_svc+0xb4/0xd0
[ 123.317002] el0_svc+0xe8/0x1f4
[ 123.317005] el0t_64_sync_handler+0x134/0x150
[ 123.317006] el0t_64_sync+0x17c/0x180
[ 123.317008] ---[ end trace 63399811432ab69b ]---
Fixes: c3fad23cdc06 ("octeon_ep_vf: add support for ndo ops")
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
---
V3:
- Added warn log that happened due to rcu_read_lock in commit message
V2: https://lore.kernel.org/all/20241216075842.2394606-5-srasheed@marvell.com/
- No changes
V1: https://lore.kernel.org/all/20241203072130.2316913-5-srasheed@marvell.com/
drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c | 8 --------
1 file changed, 8 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c
index e6253f85b623..75dd3bd2b9ba 100644
--- a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c
+++ b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c
@@ -800,14 +800,6 @@ static void octep_vf_get_stats64(struct net_device *netdev,
stats->tx_bytes = tx_bytes;
stats->rx_packets = rx_packets;
stats->rx_bytes = rx_bytes;
- if (!octep_vf_get_if_stats(oct)) {
- stats->multicast = oct->iface_rx_stats.mcast_pkts;
- stats->rx_errors = oct->iface_rx_stats.err_pkts;
- stats->rx_dropped = oct->iface_rx_stats.dropped_pkts_fifo_full +
- oct->iface_rx_stats.err_pkts;
- stats->rx_missed_errors = oct->iface_rx_stats.dropped_pkts_fifo_full;
- stats->tx_dropped = oct->iface_tx_stats.dropped;
- }
}
/**
--
2.25.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH net v3 1/4] octeon_ep: fix race conditions in ndo_get_stats64
2024-12-18 11:51 ` [PATCH net v3 1/4] octeon_ep: fix " Shinas Rasheed
@ 2024-12-20 3:21 ` Jakub Kicinski
0 siblings, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2024-12-20 3:21 UTC (permalink / raw)
To: Shinas Rasheed
Cc: netdev, linux-kernel, hgani, sedara, vimleshk, thaller, wizhao,
kheib, konguyen, horms, einstein.xue, Veerasenareddy Burru,
Andrew Lunn, David S. Miller, Eric Dumazet, Paolo Abeni,
Satananda Burla, Abhijit Ayarekar
On Wed, 18 Dec 2024 03:51:08 -0800 Shinas Rasheed wrote:
> ndo_get_stats64() can race with ndo_stop(), which frees input and
> output queue resources. Call synchronize_net() to avoid such races.
synchronize_rcu() acts as a barrier.
What are the two operations you are separating with this barrier?
> diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_main.c b/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
> index 549436efc204..941bbaaa67b5 100644
> --- a/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
> +++ b/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
> @@ -757,6 +757,7 @@ static int octep_stop(struct net_device *netdev)
> {
> struct octep_device *oct = netdev_priv(netdev);
>
> + synchronize_net();
> netdev_info(netdev, "Stopping the device ...\n");
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net v3 3/4] octeon_ep_vf: fix race conditions in ndo_get_stats64
2024-12-18 11:51 ` [PATCH net v3 3/4] octeon_ep_vf: fix race conditions " Shinas Rasheed
@ 2024-12-20 3:21 ` Jakub Kicinski
0 siblings, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2024-12-20 3:21 UTC (permalink / raw)
To: Shinas Rasheed
Cc: netdev, linux-kernel, hgani, sedara, vimleshk, thaller, wizhao,
kheib, konguyen, horms, einstein.xue, Veerasenareddy Burru,
Satananda Burla, Andrew Lunn, David S. Miller, Eric Dumazet,
Paolo Abeni
On Wed, 18 Dec 2024 03:51:10 -0800 Shinas Rasheed wrote:
> ndo_get_stats64() can race with ndo_stop(), which frees input and
> output queue resources. Call synchornize_net() in ndo_stop()
synchronize_net() here..
> + synchronize_rcu();
> netdev_info(netdev, "Stopping the device ...\n");
.. but not in the code
--
pw-bot: cr
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net v3 2/4] octeon_ep: remove firmware stats fetch in ndo_get_stats64
2024-12-18 11:51 ` [PATCH net v3 2/4] octeon_ep: remove firmware stats fetch " Shinas Rasheed
@ 2024-12-20 3:23 ` Jakub Kicinski
0 siblings, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2024-12-20 3:23 UTC (permalink / raw)
To: Shinas Rasheed
Cc: netdev, linux-kernel, hgani, sedara, vimleshk, thaller, wizhao,
kheib, konguyen, horms, einstein.xue, Veerasenareddy Burru,
Andrew Lunn, David S. Miller, Eric Dumazet, Paolo Abeni,
Abhijit Ayarekar, Satananda Burla
On Wed, 18 Dec 2024 03:51:09 -0800 Shinas Rasheed wrote:
> The per queue stats are available already and are retrieved
> from register reads during ndo_get_stats64. The firmware stats
> fetch call that happens in ndo_get_stats64() is currently not
> required
Because they are just additional error stats?
No longer reporting errors could cause a regression for monitoring
systems.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2024-12-20 3:23 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-18 11:51 [PATCH net v3 0/4] Fix race conditions in ndo_get_stats64 Shinas Rasheed
2024-12-18 11:51 ` [PATCH net v3 1/4] octeon_ep: fix " Shinas Rasheed
2024-12-20 3:21 ` Jakub Kicinski
2024-12-18 11:51 ` [PATCH net v3 2/4] octeon_ep: remove firmware stats fetch " Shinas Rasheed
2024-12-20 3:23 ` Jakub Kicinski
2024-12-18 11:51 ` [PATCH net v3 3/4] octeon_ep_vf: fix race conditions " Shinas Rasheed
2024-12-20 3:21 ` Jakub Kicinski
2024-12-18 11:51 ` [PATCH net v3 4/4] octeon_ep_vf: remove firmware stats fetch " Shinas Rasheed
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).