netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Simon Horman <simon.horman@corigine.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>,
	Leon Romanovsky <leonro@nvidia.com>,
	Eric Dumazet <edumazet@google.com>,
	netdev@vger.kernel.org, Paolo Abeni <pabeni@redhat.com>,
	Patrisious Haddad <phaddad@nvidia.com>,
	Raed Salem <raeds@nvidia.com>, Saeed Mahameed <saeedm@nvidia.com>,
	Steffen Klassert <steffen.klassert@secunet.com>
Subject: Re: [PATCH net 1/4] net/mlx5e: Don't delay release of hardware objects
Date: Tue, 6 Jun 2023 11:07:49 +0200	[thread overview]
Message-ID: <ZH73Zd+gP7/Gpyuy@corigine.com> (raw)
In-Reply-To: <e89e4c68b70d8b469e7a31613d56ce2974bc943d.1685950599.git.leonro@nvidia.com>

On Mon, Jun 05, 2023 at 11:09:49AM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
> 
> XFRM core provides two callbacks to release resources, one is .xdo_dev_policy_delete()
> and another is .xdo_dev_policy_free(). This separation allows delayed release so
> "ip xfrm policy free" commands won't starve. Unfortunately, mlx5 command interface
> can't run in .xdo_dev_policy_free() callbacks as the latter runs in ATOMIC context.
> 
>  BUG: scheduling while atomic: swapper/7/0/0x00000100
>  Modules linked in: act_mirred act_tunnel_key cls_flower sch_ingress vxlan mlx5_vdpa vringh vhost_iotlb vdpa rpcrdma rdma_ucm ib_iser libiscsi ib_umad scsi_transport_iscsi rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay mlx5_core zram zsmalloc fuse
>  CPU: 7 PID: 0 Comm: swapper/7 Not tainted 6.3.0+ #1
>  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>  Call Trace:
>   <IRQ>
>   dump_stack_lvl+0x33/0x50
>   __schedule_bug+0x4e/0x60
>   __schedule+0x5d5/0x780
>   ? __mod_timer+0x286/0x3d0
>   schedule+0x50/0x90
>   schedule_timeout+0x7c/0xf0
>   ? __bpf_trace_tick_stop+0x10/0x10
>   __wait_for_common+0x88/0x190
>   ? usleep_range_state+0x90/0x90
>   cmd_exec+0x42e/0xb40 [mlx5_core]
>   mlx5_cmd_do+0x1e/0x40 [mlx5_core]
>   mlx5_cmd_exec+0x18/0x30 [mlx5_core]
>   mlx5_cmd_delete_fte+0xa8/0xd0 [mlx5_core]
>   del_hw_fte+0x60/0x120 [mlx5_core]
>   mlx5_del_flow_rules+0xec/0x270 [mlx5_core]
>   ? default_send_IPI_single_phys+0x26/0x30
>   mlx5e_accel_ipsec_fs_del_pol+0x1a/0x60 [mlx5_core]
>   mlx5e_xfrm_free_policy+0x15/0x20 [mlx5_core]
>   xfrm_policy_destroy+0x5a/0xb0
>   xfrm4_dst_destroy+0x7b/0x100
>   dst_destroy+0x37/0x120
>   rcu_core+0x2d6/0x540
>   __do_softirq+0xcd/0x273
>   irq_exit_rcu+0x82/0xb0
>   sysvec_apic_timer_interrupt+0x72/0x90
>   </IRQ>
>   <TASK>
>   asm_sysvec_apic_timer_interrupt+0x16/0x20
>  RIP: 0010:default_idle+0x13/0x20
>  Code: c0 08 00 00 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 72 ff ff ff cc cc cc cc 8b 05 7a 4d ee 00 85 c0 7e 07 0f 00 2d 2f 98 2e 00 fb f4 <fa> c3 66 66 2e 0f 1f 84 00 00 00 00 00 65 48 8b 04 25 40 b4 02 00
>  RSP: 0018:ffff888100843ee0 EFLAGS: 00000242
>  RAX: 0000000000000001 RBX: ffff888100812b00 RCX: 4000000000000000
>  RDX: 0000000000000001 RSI: 0000000000000083 RDI: 000000000002d2ec
>  RBP: 0000000000000007 R08: 00000021daeded59 R09: 0000000000000001
>  R10: 0000000000000000 R11: 000000000000000f R12: 0000000000000000
>  R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
>   default_idle_call+0x30/0xb0
>   do_idle+0x1c1/0x1d0
>   cpu_startup_entry+0x19/0x20
>   start_secondary+0xfe/0x120
>   secondary_startup_64_no_verify+0xf3/0xfb
>   </TASK>
>  bad: scheduling from the idle thread!
> 
> Fixes: a5b8ca9471d3 ("net/mlx5e: Add XFRM policy offload logic")
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>


  reply	other threads:[~2023-06-06  9:07 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-05  8:09 [PATCH net 0/4] Fix mixing atomic/non-atomic contexts in mlx5 IPsec code Leon Romanovsky
2023-06-05  8:09 ` [PATCH net 1/4] net/mlx5e: Don't delay release of hardware objects Leon Romanovsky
2023-06-06  9:07   ` Simon Horman [this message]
2023-06-05  8:09 ` [PATCH net 2/4] net/mlx5e: Fix ESN update kernel panic Leon Romanovsky
2023-06-06  9:08   ` Simon Horman
2023-06-05  8:09 ` [PATCH net 3/4] net/mlx5e: Drop XFRM state lock when modifying flow steering Leon Romanovsky
2023-06-06  9:12   ` Simon Horman
2023-06-05  8:09 ` [PATCH net 4/4] net/mlx5e: Fix scheduling of IPsec ASO query while in atomic Leon Romanovsky
2023-06-06  9:13   ` Simon Horman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZH73Zd+gP7/Gpyuy@corigine.com \
    --to=simon.horman@corigine.com \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=leon@kernel.org \
    --cc=leonro@nvidia.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=phaddad@nvidia.com \
    --cc=raeds@nvidia.com \
    --cc=saeedm@nvidia.com \
    --cc=steffen.klassert@secunet.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).