All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kalle Valo <kvalo@kernel.org>
To: "Yury Vostrikov" <mon@unformed.ru>
Cc: ath11k@lists.infradead.org
Subject: Re: hot busy loop inside ath11k
Date: Thu, 07 Nov 2024 16:36:42 +0200	[thread overview]
Message-ID: <87a5eb5cw5.fsf@kernel.org> (raw)
In-Reply-To: <7324ac7a-8b7a-42a5-aa19-de52138ff638@app.fastmail.com> (Yury Vostrikov's message of "Fri, 01 Nov 2024 22:39:26 +0100")

"Yury Vostrikov" <mon@unformed.ru> writes:

> Hi Kalle,
>
> It seems there is problem with busy wait inside
> ath11k_debugfs_fw_stats_request. I have a laptop with QCNFA765 WiFi
> controller. It is running vanilla v6.11.4 with the following fw:
>
>> [    3.934078] ath11k_pci 0000:01:00.0: wcn6855 hw2.1
>> [    4.801624] ath11k_pci 0000:01:00.0: chip_id 0x12 chip_family 0xb board_id 0xff soc_id 0x400c1211
>> [    4.802469] ath11k_pci 0000:01:00.0: fw_version 0x11088c35 fw_build_timestamp 2024-04-17 08:34 fw_build_id WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
>
> Sometimes after device wakes up, the system becomes
> sluggish and burns CPU. A assume it is because of a firmware bug, but driver
> should not waste CPU regardless.
>
> According to perf, the most time is spent in busy waitin inside  ath11k_debugfs_fw_stats_request:
>
> 94.60%     0.00%  i3status         [kernel.kallsyms]                 [k] do_syscall_64
>         |
>          --94.60%--do_syscall_64
>                    |
>                     --94.55%--__sys_sendmsg
>                               ___sys_sendmsg
>                               ____sys_sendmsg
>                               netlink_sendmsg
>                               netlink_unicast
>                               genl_rcv
>                               netlink_rcv_skb
>                               genl_rcv_msg
>                               |
>                                --94.55%--genl_family_rcv_msg_dumpit
>                                          __netlink_dump_start
>                                          netlink_dump
>                                          genl_dumpit
>                                          nl80211_dump_station
>                                          |
>                                           --94.55%--ieee80211_dump_station
>                                                     sta_set_sinfo
>                                                     |
>                                                      --94.55%--ath11k_mac_op_sta_statistics
>                                                                ath11k_debugfs_get_fw_stats
>                                                                |
>                                                                 --94.55%--ath11k_debugfs_fw_stats_request
>                                                                           |
>                                                                           |--41.73%--_raw_spin_lock_bh
>                                                                           |
>                                                                           |--22.74%--__local_bh_enable_ip
>                                                                           |
>                                                                           |--9.22%--_raw_spin_unlock_bh
>                                                                           |
>                                                                            --6.66%--srso_alias_safe_ret
>
> If I'm reading the code correctly, then ath11k_debugfs_fw_stats_request has 3 second timeout:
>
>> timeout = jiffies + msecs_to_jiffies(3 * 1000);
>
> however, it only waits for 1 second:
>
>> time_left = wait_for_completion_timeout(&ar->fw_stats_complete, 1 * HZ);
>
> the rest (2 seconds) is spent inside busy loop
>
>>	for (;;) {
>>		if (time_after(jiffies, timeout))
>>			break;
>>
>>		spin_lock_bh(&ar->data_lock);
>>		if (ar->fw_stats_done) {
>>			spin_unlock_bh(&ar->data_lock);
>>			break;
>>		}
>>		spin_unlock_bh(&ar->data_lock);
>>	}
>
>
> spinning  for 2 seconds seems excessive to me. What do you think?

Oh wow, excessive is an understatement :) That's horrible, I don't know
how we missed that. And an unlimited loop like that is a big no-no in
kernel, every loop should have a some kind of maximum limit. Can someone
send a patch?

> Also, if you happen to know how & where to report firmware bugs, I'd
> appreciate any pointers.

I recommend filing to bugzilla:

https://wireless.docs.kernel.org/en/latest/en/users/drivers/ath11k/bugreport.html

Though we are overwhelmed with everything right now so don't expect a
quick response :/

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


  reply	other threads:[~2024-11-07 14:40 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-01 21:39 hot busy loop inside ath11k Yury Vostrikov
2024-11-07 14:36 ` Kalle Valo [this message]
2024-11-11  0:37   ` Baochen Qiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a5eb5cw5.fsf@kernel.org \
    --to=kvalo@kernel.org \
    --cc=ath11k@lists.infradead.org \
    --cc=mon@unformed.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.