All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kalle Valo <kvalo@kernel.org>
To: Harshitha Prem <quic_hprem@quicinc.com>
Cc: ath12k@lists.infradead.org,  linux-wireless@vger.kernel.org
Subject: Re: [PATCH v2 0/8] wifi: ath12k: Introduce device group abstraction
Date: Tue, 30 Apr 2024 14:54:45 +0300	[thread overview]
Message-ID: <87le4voyka.fsf@kernel.org> (raw)
In-Reply-To: <20240425055759.1181165-1-quic_hprem@quicinc.com> (Harshitha Prem's message of "Thu, 25 Apr 2024 11:27:51 +0530")

Harshitha Prem <quic_hprem@quicinc.com> writes:

> To support multi-link operation, multiple devices with different bands say
> 2 GHz or 5 GHz or 6 GHz can be combined together as a group and provide
> an abstraction to mac80211.
>
> Device group abstraction - when there are multiple devices that are
> connected by any means of communication interface between them, then these
> devices can be combined together as a single group using a group id to form
> a group abstraction. In ath12k driver, this abstraction would be named as
> ath12k_hw_group (ag).
>
> Please find below illustration of device group abstraction with two
> devices.
>
>                  Grouping of multiple devices (in future)
> +------------------------------------------------------------------------+
> |  +-------------------------------------+       +-------------------+   |
> |  |   +-----------+ | | +-----------+   |       |   +-----------+   |   |
> |  |   | ar (2GHz) | | | | ar (5GHz) |   |       |   | ar (6GHz) |   |   |
> |  |   +-----------+ | | +-----------+   |       |   +-----------+   |   |
> |  |          ath12k_base (ab)           |       | ath12k_base (ab)  |   |
> |  |         (Dual band device)          |       |                   |   |
> |  +-------------------------------------+       +-------------------+   |
> |                 ath12k_hw_group (ag) based on group id                 |
> +------------------------------------------------------------------------+
>
> Say for example, device 1 has two radios (2 GHz and 5 GHz band) and
> device 2 has one radio (6 GHz).
>
> In existing code -
>         device 1 will have two hardware abstractions hw1 (2 GHz) and hw2
>         (5 GHz) will be registered separately to mac80211 as phy0 and phy1
>         respectively. Similarly, device 2 will register its hw (6GHz) as
>         phy2 to mac80211.
>
> In future, with multi-link abstraction
>
>         combination 1 - Different group id for device1 and device 2
>                 Device 1 will create a single hardware abstraction hw1
>                 (2 GHz and  5 GHz) and will be registered to mac80211 as
>                 phy0. similarly, device 2 will register its hardware
>                 (6 GHz) to mac80211 as phy1.
>
>         combination 2 - Same group id for device1 and device 2
>                 Both device details are combined together as a group, say
>                 group1, with single hardware abstraction of radios 2 GHz,
>                 5 GHz and 6 GHz band details and will be registered to
>                 mac80211 as phy0.
>
> Add base infrastructure changes to add device grouping abstraction with
> a single device.
>
> This patch series brings the base code changes with following order:
>         1. Refactor existing code which would facilitate in introducing
>            device group abstraction.
>         2. Create a device group abstraction during device probe.
>         3. Start the device group only after QMI firmware ready event is
>            received for all the devices that are combined in the group.
>         4. Move the hardware abstractions (ath12k_hw - ah) from device
>            (ath12k_base - ab) to device group abstraction (ag) as it would
>            ease in having different combinations of group abstraction that
>            can be registered to mac80211.
>
>
> Depends-on:
>         [PATCH v2 0/3] wifi: ath12k: Refactor the hardware recovery
> 	procedures.
> 	Link - https://lore.kernel.org/ath12k/87a5ljt0p9.fsf@kernel.org/T/
>
> v2:
>  - Rebased to ToT
>
> Karthikeyan Periyasamy (8):
>   wifi: ath12k: Refactor core start api
>   wifi: ath12k: Add helpers to get or set ath12k_hw
>   wifi: ath12k: Add ath12k_get_num_hw api
>   wifi: ath12k: Introduce QMI firmware ready flag
>   wifi: ath12k: move ATH12K_FLAG_REGISTERED flag set to mac_register api
>   wifi: ath12k: Introduce device group abstraction
>   wifi: ath12k: refactor core start based on hardware group
>   wifi: ath12k: move ath12k_hw from per soc to group

I see a deadlock warning in master-pending branch (tag
ath-pending-202404291731) and based on manual bisect between patchsets
it seems to come from this patchset. Do note that I didn't look at the
patchset otherwise.

Here's the warning I see during rmmod with WCN7850 and I see it every
time:

[  147.211487] ======================================================
[  147.211547] WARNING: possible circular locking dependency detected
[  147.211599] 6.9.0-rc5-wt-ath+ #1403 Not tainted
[  147.211646] ------------------------------------------------------
[  147.211695] rmmod/1975 is trying to acquire lock:
[  147.211741] ffff888105c7f158 ((wq_completion)ath12k_qmi_driver_event){+.+.}-{0:0}, at: touch_wq_lockdep_map+0x46/0xf0
[  147.211815] #012[  147.211815] but task is already holding lock:
[  147.211864] ffff88810db290a8 (&ag->mutex_lock){+.+.}-{3:3}, at: ath12k_core_hw_group_destroy.part.0+0x26/0x290 [ath12k]
[  147.212003] #012[  147.212003] which lock already depends on the new lock.#012[  147.212003] 
[  147.212069] #012[  147.212069] the existing dependency chain (in reverse order) is:
[  147.212135] #012[  147.212135] -> #2 (&ag->mutex_lock){+.+.}-{3:3}:
[  147.212189]        __lock_acquire+0xd43/0x1dd0
[  147.212238]        lock_acquire+0x1b0/0x560
[  147.212280]        __mutex_lock+0x154/0x1430
[  147.212327]        mutex_lock_nested+0x16/0x20
[  147.212369]        ath12k_core_qmi_firmware_ready+0x9d/0x400 [ath12k]
[  147.212436]        ath12k_qmi_driver_event_work+0x4e9/0x6e0 [ath12k]
[  147.212507]        process_one_work+0x8a4/0x1980
[  147.212551]        worker_thread+0x715/0x1270
[  147.212594]        kthread+0x2fa/0x3f0
[  147.212636]        ret_from_fork+0x31/0x70
[  147.212679]        ret_from_fork_asm+0x11/0x20
[  147.212754] #012[  147.212754] -> #1 ((work_completion)(&ab->qmi.event_work)){+.+.}-{0:0}:
[  147.212828]        __lock_acquire+0xd43/0x1dd0
[  147.212885]        lock_acquire+0x1b0/0x560
[  147.213000]        process_one_work+0x82d/0x1980
[  147.213055]        worker_thread+0x715/0x1270
[  147.213779]        kthread+0x2fa/0x3f0
[  147.214529]        ret_from_fork+0x31/0x70
[  147.215291]        ret_from_fork_asm+0x11/0x20
[  147.216044] #012[  147.216044] -> #0 ((wq_completion)ath12k_qmi_driver_event){+.+.}-{0:0}:
[  147.217433]        check_prev_add+0x1bd/0x2330
[  147.218174]        validate_chain+0xf4e/0x1cf0
[  147.218853]        __lock_acquire+0xd43/0x1dd0
[  147.219589]        lock_acquire+0x1b0/0x560
[  147.220324]        touch_wq_lockdep_map+0x66/0xf0
[  147.221055]        __flush_workqueue+0xeb/0x1120
[  147.221731]        drain_workqueue+0xf5/0x320
[  147.222460]        destroy_workqueue+0xb2/0x920
[  147.223187]        ath12k_qmi_deinit_service+0x5a/0x1f0 [ath12k]
[  147.223876]        ath12k_core_hw_group_destroy.part.0+0x1f8/0x290 [ath12k]
[  147.224606]        ath12k_core_deinit+0x37/0x50 [ath12k]
[  147.225314]        ath12k_pci_remove+0xad/0x1b0 [ath12k]
[  147.226640]        pci_device_remove+0x9b/0x1b0
[  147.227326]        device_remove+0xbf/0x150
[  147.227986]        device_release_driver_internal+0x3c3/0x580
[  147.228619]        driver_detach+0xc4/0x190
[  147.229290]        bus_remove_driver+0x130/0x2a0
[  147.229900]        driver_unregister+0x68/0x90
[  147.230576]        pci_unregister_driver+0x24/0x240
[  147.231240]        ath12k_pci_exit+0x10/0x20 [ath12k]
[  147.231866]        __do_sys_delete_module+0x32c/0x580
[  147.232523]        __x64_sys_delete_module+0x4f/0x70
[  147.233168]        x64_sys_call+0x51b/0x9e0
[  147.233756]        do_syscall_64+0x65/0x130
[  147.234396]        entry_SYSCALL_64_after_hwframe+0x4b/0x53
[  147.235023] #012[  147.235023] other info that might help us debug this:#012[  147.235023] 
[  147.236688] Chain exists of:#012[  147.236688]   (wq_completion)ath12k_qmi_driver_event --> (work_completion)(&ab->qmi.event_work) --> &ag->mutex_lock#012[  147.236688] 
[  147.238414]  Possible unsafe locking scenario:#012[  147.238414] 
[  147.239520]        CPU0                    CPU1
[  147.240114]        ----                    ----
[  147.240633]   lock(&ag->mutex_lock);
[  147.241212]                                lock((work_completion)(&ab->qmi.event_work));
[  147.241746]                                lock(&ag->mutex_lock);
[  147.242341]   lock((wq_completion)ath12k_qmi_driver_event);
[  147.242876] #012[  147.242876]  *** DEADLOCK ***#012[  147.242876] 
[  147.244460] 2 locks held by rmmod/1975:
[  147.245040]  #0: ffff88810d7191b8 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x9d/0x580
[  147.245597]  #1: ffff88810db290a8 (&ag->mutex_lock){+.+.}-{3:3}, at: ath12k_core_hw_group_destroy.part.0+0x26/0x290 [ath12k]
[  147.246230] #012[  147.246230] stack backtrace:
[  147.247338] CPU: 5 PID: 1975 Comm: rmmod Not tainted 6.9.0-rc5-wt-ath+ #1403
[  147.247900] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0067.2021.0528.1339 05/28/2021
[  147.248545] Call Trace:
[  147.249181]  <TASK>
[  147.249757]  dump_stack_lvl+0x7d/0xe0
[  147.250401]  dump_stack+0x10/0x20
[  147.251031]  print_circular_bug+0x2e8/0x480
[  147.251606]  check_noncircular+0x2f2/0x3d0
[  147.252235]  ? print_circular_bug+0x480/0x480
[  147.252808]  ? validate_chain+0x15e/0x1cf0
[  147.253435]  ? __kasan_check_read+0x11/0x20
[  147.254065]  ? mark_lock+0xe6/0x1470
[  147.254636]  ? alloc_chain_hlocks+0x4cc/0x790
[  147.255265]  check_prev_add+0x1bd/0x2330
[  147.255839]  ? __kasan_check_read+0x11/0x20
[  147.256474]  validate_chain+0xf4e/0x1cf0
[  147.257104]  ? check_prev_add+0x2330/0x2330
[  147.257673]  __lock_acquire+0xd43/0x1dd0
[  147.258303]  lock_acquire+0x1b0/0x560
[  147.258875]  ? touch_wq_lockdep_map+0x46/0xf0
[  147.259502]  ? lock_sync+0x1a0/0x1a0
[  147.260131]  ? __lock_acquired+0x208/0x810
[  147.260699]  ? lockdep_init_map_type+0x1a3/0x850
[  147.261326]  ? lockdep_init_map_type+0x1a3/0x850
[  147.261892]  ? touch_wq_lockdep_map+0x46/0xf0
[  147.262513]  touch_wq_lockdep_map+0x66/0xf0
[  147.263126]  ? touch_wq_lockdep_map+0x46/0xf0
[  147.263684]  __flush_workqueue+0xeb/0x1120
[  147.264301]  ? drain_workqueue+0xae/0x320
[  147.264862]  ? drain_workqueue+0xae/0x320
[  147.265471]  ? __this_cpu_preempt_check+0x13/0x20
[  147.266086]  ? wq_update_node_max_active+0x540/0x540
[  147.266640]  ? destroy_workqueue+0xaa/0x920
[  147.267249]  ? __this_cpu_preempt_check+0x13/0x20
[  147.267806]  ? bit_wait_timeout+0x160/0x160
[  147.268420]  ? __kasan_check_write+0x14/0x20
[  147.269031]  drain_workqueue+0xf5/0x320
[  147.269586]  destroy_workqueue+0xb2/0x920
[  147.270202]  ath12k_qmi_deinit_service+0x5a/0x1f0 [ath12k]
[  147.270785]  ? debugfs_remove+0x52/0x60
[  147.271400]  ath12k_core_hw_group_destroy.part.0+0x1f8/0x290 [ath12k]
[  147.272046]  ath12k_core_deinit+0x37/0x50 [ath12k]
[  147.272628]  ath12k_pci_remove+0xad/0x1b0 [ath12k]
[  147.273271]  pci_device_remove+0x9b/0x1b0
[  147.273829]  device_remove+0xbf/0x150
[  147.274446]  device_release_driver_internal+0x3c3/0x580
[  147.275052]  ? __kasan_check_read+0x11/0x20
[  147.275592]  driver_detach+0xc4/0x190
[  147.276189]  bus_remove_driver+0x130/0x2a0
[  147.276728]  driver_unregister+0x68/0x90
[  147.277325]  pci_unregister_driver+0x24/0x240
[  147.277876]  ? find_module_all+0x13e/0x1e0
[  147.278469]  ath12k_pci_exit+0x10/0x20 [ath12k]
[  147.279087]  __do_sys_delete_module+0x32c/0x580
[  147.279626]  ? __kasan_slab_free+0x102/0x170
[  147.280209]  ? module_flags+0x2f0/0x2f0
[  147.280741]  ? kmem_cache_free+0xed/0x3e0
[  147.281334]  ? __fput+0x40c/0xa60
[  147.281870]  ? __fput+0x40c/0xa60
[  147.282462]  ? debug_smp_processor_id+0x17/0x20
[  147.283045]  __x64_sys_delete_module+0x4f/0x70
[  147.283558]  x64_sys_call+0x51b/0x9e0
[  147.284125]  do_syscall_64+0x65/0x130
[  147.284613]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[  147.285159] RIP: 0033:0x7f2df2adcc8b
[  147.286954] Code: 73 01 c3 48 8b 0d 05 c2 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d5 c1 0c 00 f7 d8 64 89 01 48
[  147.288033] RSP: 002b:00007ffdc52ca298 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[  147.288563] RAX: ffffffffffffffda RBX: 000055d10d0237e0 RCX: 00007f2df2adcc8b
[  147.289155] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055d10d023848
[  147.289691] RBP: 00007ffdc52ca2f8 R08: 0000000000000000 R09: 0000000000000000
[  147.290292] R10: 00007f2df2b58ac0 R11: 0000000000000206 R12: 00007ffdc52ca4d0
[  147.290830] R13: 00007ffdc52caebf R14: 000055d10d0222a0 R15: 000055d10d0237e0
[  147.291444]  </TASK>

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


  parent reply	other threads:[~2024-04-30 11:54 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-25  5:57 [PATCH v2 0/8] wifi: ath12k: Introduce device group abstraction Harshitha Prem
2024-04-25  5:57 ` [PATCH v2 1/8] wifi: ath12k: Refactor core start api Harshitha Prem
2024-04-25 21:26   ` Jeff Johnson
2024-04-25  5:57 ` [PATCH v2 2/8] wifi: ath12k: Add helpers to get or set ath12k_hw Harshitha Prem
2024-04-25 21:27   ` Jeff Johnson
2024-04-25  5:57 ` [PATCH v2 3/8] wifi: ath12k: Add ath12k_get_num_hw api Harshitha Prem
2024-04-25 21:28   ` Jeff Johnson
2024-04-25  5:57 ` [PATCH v2 4/8] wifi: ath12k: Introduce QMI firmware ready flag Harshitha Prem
2024-04-25 21:28   ` Jeff Johnson
2024-04-25  5:57 ` [PATCH v2 5/8] wifi: ath12k: move ATH12K_FLAG_REGISTERED flag set to mac_register api Harshitha Prem
2024-04-25 21:30   ` Jeff Johnson
2024-04-25  5:57 ` [PATCH v2 6/8] wifi: ath12k: Introduce device group abstraction Harshitha Prem
2024-04-25 21:32   ` Jeff Johnson
2024-04-25  5:57 ` [PATCH v2 7/8] wifi: ath12k: refactor core start based on hardware group Harshitha Prem
2024-04-25 21:39   ` Jeff Johnson
2024-04-25  5:57 ` [PATCH v2 8/8] wifi: ath12k: move ath12k_hw from per soc to group Harshitha Prem
2024-04-25 21:40   ` Jeff Johnson
2024-04-26  2:33     ` Ping-Ke Shih
2024-04-26 14:26       ` Jeff Johnson
2024-04-25 21:21 ` [PATCH v2 0/8] wifi: ath12k: Introduce device group abstraction Jeff Johnson
2024-04-30 11:54 ` Kalle Valo [this message]
2024-05-01 15:34   ` Harshitha Prem
2024-05-01 16:55     ` Jeff Johnson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87le4voyka.fsf@kernel.org \
    --to=kvalo@kernel.org \
    --cc=ath12k@lists.infradead.org \
    --cc=linux-wireless@vger.kernel.org \
    --cc=quic_hprem@quicinc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.