From: Saeed Mahameed <saeed@kernel.org>
To: "David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Eric Dumazet <edumazet@google.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>,
netdev@vger.kernel.org, Tariq Toukan <tariqt@nvidia.com>,
Leon Romanovsky <leonro@nvidia.com>,
Patrisious Haddad <phaddad@nvidia.com>,
Simon Horman <simon.horman@corigine.com>
Subject: [net 10/12] net/mlx5e: Fix ESN update kernel panic
Date: Fri, 16 Jun 2023 13:01:17 -0700 [thread overview]
Message-ID: <20230616200119.44163-11-saeed@kernel.org> (raw)
In-Reply-To: <20230616200119.44163-1-saeed@kernel.org>
From: Patrisious Haddad <phaddad@nvidia.com>
Previously during mlx5e_ipsec_handle_event the driver tried to execute
an operation that could sleep, while holding a spinlock, which caused
the kernel panic mentioned below.
Move the function call that can sleep outside of the spinlock context.
Call Trace:
<TASK>
dump_stack_lvl+0x49/0x6c
__schedule_bug.cold+0x42/0x4e
schedule_debug.constprop.0+0xe0/0x118
__schedule+0x59/0x58a
? __mod_timer+0x2a1/0x3ef
schedule+0x5e/0xd4
schedule_timeout+0x99/0x164
? __pfx_process_timeout+0x10/0x10
__wait_for_common+0x90/0x1da
? __pfx_schedule_timeout+0x10/0x10
wait_func+0x34/0x142 [mlx5_core]
mlx5_cmd_invoke+0x1f3/0x313 [mlx5_core]
cmd_exec+0x1fe/0x325 [mlx5_core]
mlx5_cmd_do+0x22/0x50 [mlx5_core]
mlx5_cmd_exec+0x1c/0x40 [mlx5_core]
mlx5_modify_ipsec_obj+0xb2/0x17f [mlx5_core]
mlx5e_ipsec_update_esn_state+0x69/0xf0 [mlx5_core]
? wake_affine+0x62/0x1f8
mlx5e_ipsec_handle_event+0xb1/0xc0 [mlx5_core]
process_one_work+0x1e2/0x3e6
? __pfx_worker_thread+0x10/0x10
worker_thread+0x54/0x3ad
? __pfx_worker_thread+0x10/0x10
kthread+0xda/0x101
? __pfx_kthread+0x10/0x10
ret_from_fork+0x29/0x37
</TASK>
BUG: workqueue leaked lock or atomic: kworker/u256:4/0x7fffffff/189754#012 last function: mlx5e_ipsec_handle_event [mlx5_core]
CPU: 66 PID: 189754 Comm: kworker/u256:4 Kdump: loaded Tainted: G W 6.2.0-2596.20230309201517_5.el8uek.rc1.x86_64 #2
Hardware name: Oracle Corporation ORACLE SERVER X9-2/ASMMBX9-2, BIOS 61070300 08/17/2022
Workqueue: mlx5e_ipsec: eth%d mlx5e_ipsec_handle_event [mlx5_core]
Call Trace:
<TASK>
dump_stack_lvl+0x49/0x6c
process_one_work.cold+0x2b/0x3c
? __pfx_worker_thread+0x10/0x10
worker_thread+0x54/0x3ad
? __pfx_worker_thread+0x10/0x10
kthread+0xda/0x101
? __pfx_kthread+0x10/0x10
ret_from_fork+0x29/0x37
</TASK>
BUG: scheduling while atomic: kworker/u256:4/189754/0x00000000
Fixes: cee137a63431 ("net/mlx5e: Handle ESN update events")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../mellanox/mlx5/core/en_accel/ipsec_offload.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_offload.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_offload.c
index df90e19066bc..ca16cb9807ea 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_offload.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_offload.c
@@ -305,7 +305,17 @@ static void mlx5e_ipsec_update_esn_state(struct mlx5e_ipsec_sa_entry *sa_entry,
}
mlx5e_ipsec_build_accel_xfrm_attrs(sa_entry, &attrs);
+
+ /* It is safe to execute the modify below unlocked since the only flows
+ * that could affect this HW object, are create, destroy and this work.
+ *
+ * Creation flow can't co-exist with this modify work, the destruction
+ * flow would cancel this work, and this work is a single entity that
+ * can't conflict with it self.
+ */
+ spin_unlock_bh(&sa_entry->x->lock);
mlx5_accel_esp_modify_xfrm(sa_entry, &attrs);
+ spin_lock_bh(&sa_entry->x->lock);
data.data_offset_condition_operand =
MLX5_IPSEC_ASO_REMOVE_FLOW_PKT_CNT_OFFSET;
@@ -431,7 +441,7 @@ static void mlx5e_ipsec_handle_event(struct work_struct *_work)
aso = sa_entry->ipsec->aso;
attrs = &sa_entry->attrs;
- spin_lock(&sa_entry->x->lock);
+ spin_lock_bh(&sa_entry->x->lock);
ret = mlx5e_ipsec_aso_query(sa_entry, NULL);
if (ret)
goto unlock;
@@ -447,7 +457,7 @@ static void mlx5e_ipsec_handle_event(struct work_struct *_work)
mlx5e_ipsec_handle_limits(sa_entry);
unlock:
- spin_unlock(&sa_entry->x->lock);
+ spin_unlock_bh(&sa_entry->x->lock);
kfree(work);
}
--
2.40.1
next prev parent reply other threads:[~2023-06-16 20:01 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-16 20:01 [pull request][net 00/12] mlx5 fixes 2023-06-16 Saeed Mahameed
2023-06-16 20:01 ` [net 01/12] net/mlx5e: XDP, Allow growing tail for XDP multi buffer Saeed Mahameed
2023-06-19 9:40 ` patchwork-bot+netdevbpf
2023-06-16 20:01 ` [net 02/12] net/mlx5e: xsk: Set napi_id to support busy polling on XSK RQ Saeed Mahameed
2023-06-16 20:01 ` [net 03/12] net/mlx5: Fix driver load with single msix vector Saeed Mahameed
2023-06-16 20:01 ` [net 04/12] net/mlx5e: TC, Add null pointer check for hardware miss support Saeed Mahameed
2023-06-16 20:01 ` [net 05/12] net/mlx5e: TC, Cleanup ct resources for nic flow Saeed Mahameed
2023-06-16 20:01 ` [net 06/12] net/mlx5: DR, Support SW created encap actions for FW table Saeed Mahameed
2023-06-16 20:01 ` [net 07/12] net/mlx5: DR, Fix wrong action data allocation in decap action Saeed Mahameed
2023-06-16 20:01 ` [net 08/12] net/mlx5: Free IRQ rmap and notifier on kernel shutdown Saeed Mahameed
2023-06-16 20:01 ` [net 09/12] net/mlx5e: Don't delay release of hardware objects Saeed Mahameed
2023-06-16 20:01 ` Saeed Mahameed [this message]
2023-06-16 20:01 ` [net 11/12] net/mlx5e: Drop XFRM state lock when modifying flow steering Saeed Mahameed
2023-06-16 20:01 ` [net 12/12] net/mlx5e: Fix scheduling of IPsec ASO query while in atomic Saeed Mahameed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230616200119.44163-11-saeed@kernel.org \
--to=saeed@kernel.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=kuba@kernel.org \
--cc=leonro@nvidia.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=phaddad@nvidia.com \
--cc=saeedm@nvidia.com \
--cc=simon.horman@corigine.com \
--cc=tariqt@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).