From: Leon Romanovsky <leon@kernel.org>
To: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Cc: Johannes Berg <johannes.berg@intel.com>,
	linux-wireless@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Johannes Berg <johannes@sipsolutions.net>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Gregory Greenman <gregory.greenman@intel.com>,
	Alexander Wetzel <alexander@wetzel-home.de>
Subject: Re: [PATCH v4 1/1] wifi: mac80211: fortify the spinlock against deadlock by interrupt
Date: Wed, 26 Apr 2023 09:41:45 +0300
Message-ID: <20230426064145.GE27649@unreal>
In-Reply-To: <20230425164005.25272-1-mirsad.todorovac@alu.unizg.hr>

On Tue, Apr 25, 2023 at 06:40:08PM +0200, Mirsad Goran Todorovac wrote:
> In the function ieee80211_tx_dequeue() there is a particular locking
> sequence:
> 
> begin:
> 	spin_lock(&local->queue_stop_reason_lock);
> 	q_stopped = local->queue_stop_reasons[q];
> 	spin_unlock(&local->queue_stop_reason_lock);
> 
> However small the chance (increased by ftracetest), an asynchronous
> interrupt can occur between spin_lock() and spin_unlock(), and the
> interrupt routine will then attempt to take the same
> &local->queue_stop_reason_lock again.
> 
> This will cause a costly reset of the CPU and the wifi device, or a
> complete hang in the single-CPU, single-core scenario.
> 
> The only remaining spin_lock(&local->queue_stop_reason_lock) that
> did not disable interrupts was patched, which should prevent any
> deadlocks on the same CPU/core and the same wifi device.
> 
> This is the probable trace of the deadlock:
> 
> kernel: ================================
> kernel: WARNING: inconsistent lock state
> kernel: 6.3.0-rc6-mt-20230401-00001-gf86822a1170f #4 Tainted: G        W
> kernel: --------------------------------
> kernel: inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
> kernel: kworker/5:0/25656 [HC0[0]:SC0[0]:HE1:SE1] takes:
> kernel: ffff9d6190779478 (&local->queue_stop_reason_lock){+.?.}-{2:2}, at: return_to_handler+0x0/0x40
> kernel: {IN-SOFTIRQ-W} state was registered at:
> kernel:   lock_acquire+0xc7/0x2d0
> kernel:   _raw_spin_lock+0x36/0x50
> kernel:   ieee80211_tx_dequeue+0xb4/0x1330 [mac80211]
> kernel:   iwl_mvm_mac_itxq_xmit+0xae/0x210 [iwlmvm]
> kernel:   iwl_mvm_mac_wake_tx_queue+0x2d/0xd0 [iwlmvm]
> kernel:   ieee80211_queue_skb+0x450/0x730 [mac80211]
> kernel:   __ieee80211_xmit_fast.constprop.66+0x834/0xa50 [mac80211]
> kernel:   __ieee80211_subif_start_xmit+0x217/0x530 [mac80211]
> kernel:   ieee80211_subif_start_xmit+0x60/0x580 [mac80211]
> kernel:   dev_hard_start_xmit+0xb5/0x260
> kernel:   __dev_queue_xmit+0xdbe/0x1200
> kernel:   neigh_resolve_output+0x166/0x260
> kernel:   ip_finish_output2+0x216/0xb80
> kernel:   __ip_finish_output+0x2a4/0x4d0
> kernel:   ip_finish_output+0x2d/0xd0
> kernel:   ip_output+0x82/0x2b0
> kernel:   ip_local_out+0xec/0x110
> kernel:   igmpv3_sendpack+0x5c/0x90
> kernel:   igmp_ifc_timer_expire+0x26e/0x4e0
> kernel:   call_timer_fn+0xa5/0x230
> kernel:   run_timer_softirq+0x27f/0x550
> kernel:   __do_softirq+0xb4/0x3a4
> kernel:   irq_exit_rcu+0x9b/0xc0
> kernel:   sysvec_apic_timer_interrupt+0x80/0xa0
> kernel:   asm_sysvec_apic_timer_interrupt+0x1f/0x30
> kernel:   _raw_spin_unlock_irqrestore+0x3f/0x70
> kernel:   free_to_partial_list+0x3d6/0x590
> kernel:   __slab_free+0x1b7/0x310
> kernel:   kmem_cache_free+0x52d/0x550
> kernel:   putname+0x5d/0x70
> kernel:   do_sys_openat2+0x1d7/0x310
> kernel:   do_sys_open+0x51/0x80
> kernel:   __x64_sys_openat+0x24/0x30
> kernel:   do_syscall_64+0x5c/0x90
> kernel:   entry_SYSCALL_64_after_hwframe+0x72/0xdc
> kernel: irq event stamp: 5120729
> kernel: hardirqs last  enabled at (5120729): [<ffffffff9d149936>] trace_graph_return+0xd6/0x120
> kernel: hardirqs last disabled at (5120728): [<ffffffff9d149950>] trace_graph_return+0xf0/0x120
> kernel: softirqs last  enabled at (5069900): [<ffffffff9cf65b60>] return_to_handler+0x0/0x40
> kernel: softirqs last disabled at (5067555): [<ffffffff9cf65b60>] return_to_handler+0x0/0x40
> kernel:
>         other info that might help us debug this:
> kernel:  Possible unsafe locking scenario:
> kernel:        CPU0
> kernel:        ----
> kernel:   lock(&local->queue_stop_reason_lock);
> kernel:   <Interrupt>
> kernel:     lock(&local->queue_stop_reason_lock);
> kernel:
>          *** DEADLOCK ***
> kernel: 8 locks held by kworker/5:0/25656:
> kernel:  #0: ffff9d618009d138 ((wq_completion)events_freezable){+.+.}-{0:0}, at: process_one_work+0x1ca/0x530
> kernel:  #1: ffffb1ef4637fe68 ((work_completion)(&local->restart_work)){+.+.}-{0:0}, at: process_one_work+0x1ce/0x530
> kernel:  #2: ffffffff9f166548 (rtnl_mutex){+.+.}-{3:3}, at: return_to_handler+0x0/0x40
> kernel:  #3: ffff9d6190778728 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: return_to_handler+0x0/0x40
> kernel:  #4: ffff9d619077b480 (&mvm->mutex){+.+.}-{3:3}, at: return_to_handler+0x0/0x40
> kernel:  #5: ffff9d61907bacd8 (&trans_pcie->mutex){+.+.}-{3:3}, at: return_to_handler+0x0/0x40
> kernel:  #6: ffffffff9ef9cda0 (rcu_read_lock){....}-{1:2}, at: iwl_mvm_queue_state_change+0x59/0x3a0 [iwlmvm]
> kernel:  #7: ffffffff9ef9cda0 (rcu_read_lock){....}-{1:2}, at: iwl_mvm_mac_itxq_xmit+0x42/0x210 [iwlmvm]
> kernel:
>         stack backtrace:
> kernel: CPU: 5 PID: 25656 Comm: kworker/5:0 Tainted: G        W          6.3.0-rc6-mt-20230401-00001-gf86822a1170f #4
> kernel: Hardware name: LENOVO 82H8/LNVNB161216, BIOS GGCN51WW 11/16/2022
> kernel: Workqueue: events_freezable ieee80211_restart_work [mac80211]
> kernel: Call Trace:
> kernel:  <TASK>
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  dump_stack_lvl+0x5f/0xa0
> kernel:  dump_stack+0x14/0x20
> kernel:  print_usage_bug.part.46+0x208/0x2a0
> kernel:  mark_lock.part.47+0x605/0x630
> kernel:  ? sched_clock+0xd/0x20
> kernel:  ? trace_clock_local+0x14/0x30
> kernel:  ? __rb_reserve_next+0x5f/0x490
> kernel:  ? _raw_spin_lock+0x1b/0x50
> kernel:  __lock_acquire+0x464/0x1990
> kernel:  ? mark_held_locks+0x4e/0x80
> kernel:  lock_acquire+0xc7/0x2d0
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  ? ftrace_return_to_handler+0x8b/0x100
> kernel:  ? preempt_count_add+0x4/0x70
> kernel:  _raw_spin_lock+0x36/0x50
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  ieee80211_tx_dequeue+0xb4/0x1330 [mac80211]
> kernel:  ? prepare_ftrace_return+0xc5/0x190
> kernel:  ? ftrace_graph_func+0x16/0x20
> kernel:  ? 0xffffffffc02ab0b1
> kernel:  ? lock_acquire+0xc7/0x2d0
> kernel:  ? iwl_mvm_mac_itxq_xmit+0x42/0x210 [iwlmvm]
> kernel:  ? ieee80211_tx_dequeue+0x9/0x1330 [mac80211]
> kernel:  ? __rcu_read_lock+0x4/0x40
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  iwl_mvm_mac_itxq_xmit+0xae/0x210 [iwlmvm]
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  iwl_mvm_queue_state_change+0x311/0x3a0 [iwlmvm]
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  iwl_mvm_wake_sw_queue+0x17/0x20 [iwlmvm]
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  iwl_txq_gen2_unmap+0x1c9/0x1f0 [iwlwifi]
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  iwl_txq_gen2_free+0x55/0x130 [iwlwifi]
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  iwl_txq_gen2_tx_free+0x63/0x80 [iwlwifi]
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  _iwl_trans_pcie_gen2_stop_device+0x3f3/0x5b0 [iwlwifi]
> kernel:  ? _iwl_trans_pcie_gen2_stop_device+0x9/0x5b0 [iwlwifi]
> kernel:  ? mutex_lock_nested+0x4/0x30
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  iwl_trans_pcie_gen2_stop_device+0x5f/0x90 [iwlwifi]
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  iwl_mvm_stop_device+0x78/0xd0 [iwlmvm]
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  __iwl_mvm_mac_start+0x114/0x210 [iwlmvm]
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  iwl_mvm_mac_start+0x76/0x150 [iwlmvm]
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  drv_start+0x79/0x180 [mac80211]
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  ieee80211_reconfig+0x1523/0x1ce0 [mac80211]
> kernel:  ? synchronize_net+0x4/0x50
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  ieee80211_restart_work+0x108/0x170 [mac80211]
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  process_one_work+0x250/0x530
> kernel:  ? ftrace_regs_caller_end+0x66/0x66
> kernel:  worker_thread+0x48/0x3a0
> kernel:  ? __pfx_worker_thread+0x10/0x10
> kernel:  kthread+0x10f/0x140
> kernel:  ? __pfx_kthread+0x10/0x10
> kernel:  ret_from_fork+0x29/0x50
> kernel:  </TASK>
> 
> Fixes: 4444bc2116ae ("wifi: mac80211: Proper mark iTXQs for resumption")
> Link: https://lore.kernel.org/all/1f58a0d1-d2b9-d851-73c3-93fcc607501c@alu.unizg.hr/
> Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
> Cc: Gregory Greenman <gregory.greenman@intel.com>
> Cc: Johannes Berg <johannes.berg@intel.com>
> Link: https://lore.kernel.org/all/cdc80531-f25f-6f9d-b15f-25e16130b53a@alu.unizg.hr/
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: Leon Romanovsky <leon@kernel.org>
> Cc: Alexander Wetzel <alexander@wetzel-home.de>
> Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
> ---
> v3 -> v4:
> - Added whole lockdep trace as advised.
> - Trimmed irrelevant line prefix.
> v2 -> v3:
> - Fix the Fixes: tag as advised.
> - Change the net: to wifi: to comply with the original patch that
>   is being fixed.
> v1 -> v2:
> - Minor rewording and clarification.
> - Cc:-ed people that replied to the original bug report (forgotten
>   in v1 by omission).
> 
>  net/mac80211/tx.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 

Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
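
The scenario in the quoted description can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not the
patch: the diff is not shown in this message, and the description only
says that the remaining spin_lock() call that did not disable interrupts
was patched, presumably to an interrupt-disabling variant such as
spin_lock_irqsave(). Every identifier below (toy_cpu, toy_section_plain,
toy_section_irqsave, and so on) is hypothetical, and the model does not
distinguish hard interrupts from softirqs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU and one spinlock. Every name here is hypothetical. */
struct toy_cpu {
	bool irqs_enabled;	/* can an interrupt be delivered right now? */
	bool lock_held;		/* state of the single modeled spinlock */
	bool irq_pending;	/* an interrupt has arrived, not yet delivered */
	bool deadlocked;	/* the handler found the lock already held */
	int handled;		/* how many times the handler completed */
};

static struct toy_cpu toy_cpu_init(void)
{
	struct toy_cpu cpu = { .irqs_enabled = true };
	return cpu;
}

/* Interrupt handler that needs the same lock as the code it interrupts. */
static void toy_irq_handler(struct toy_cpu *cpu)
{
	if (cpu->lock_held) {
		/* A real spinlock would spin forever here: the owner cannot
		 * run again on this CPU until the handler returns. */
		cpu->deadlocked = true;
		return;
	}
	cpu->lock_held = true;
	cpu->handled++;
	cpu->lock_held = false;
}

/* Deliver a pending interrupt if the CPU currently accepts interrupts. */
static void toy_deliver(struct toy_cpu *cpu)
{
	if (cpu->irq_pending && cpu->irqs_enabled) {
		cpu->irq_pending = false;
		toy_irq_handler(cpu);
	}
}

/* Critical section in the style of spin_lock()/spin_unlock(). */
static void toy_section_plain(struct toy_cpu *cpu)
{
	assert(!cpu->lock_held);
	cpu->lock_held = true;
	cpu->irq_pending = true;	/* interrupt arrives inside the section */
	toy_deliver(cpu);		/* interrupts are on: handler runs now */
	if (cpu->deadlocked)
		return;			/* the CPU would never get past here */
	cpu->lock_held = false;
}

/* Critical section in the style of an interrupt-disabling lock variant. */
static void toy_section_irqsave(struct toy_cpu *cpu)
{
	bool flags = cpu->irqs_enabled;	/* save the previous interrupt state */

	assert(!cpu->lock_held);
	cpu->irqs_enabled = false;
	cpu->lock_held = true;
	cpu->irq_pending = true;	/* interrupt arrives inside the section */
	toy_deliver(cpu);		/* interrupts are off: nothing happens */
	cpu->lock_held = false;
	cpu->irqs_enabled = flags;	/* restore the previous state */
	toy_deliver(cpu);		/* handler runs now, with the lock free */
}
```

In this model the plain section ends with deadlocked set and the handler
never completing, while the interrupt-disabling section defers the
handler until the lock has been released. Saving and restoring the
previous interrupt state, rather than unconditionally re-enabling
interrupts, keeps the section correct when the caller already had
interrupts disabled.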

Thread overview: 5+ messages
2023-04-25 16:40 [PATCH v4 1/1] wifi: mac80211: fortify the spinlock against deadlock by interrupt Mirsad Goran Todorovac
2023-04-26  6:41 ` Leon Romanovsky [this message]
2023-04-26 14:02   ` Mirsad Todorovac
2023-04-26 15:05     ` Johannes Berg
2023-04-26 17:52       ` Mirsad Goran Todorovac
