linux-wireless.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: James Prestwood <prestwoj@gmail.com>
To: Baochen Qiang <quic_bqiang@quicinc.com>, ath10k@lists.infradead.org
Cc: linux-wireless@vger.kernel.org
Subject: Re: [PATCH] wifi: ath10k: poll service ready message before failing
Date: Wed, 28 Feb 2024 10:46:45 -0800	[thread overview]
Message-ID: <6362d9b2-6ed2-4454-bf1b-8614d181bc93@gmail.com> (raw)
In-Reply-To: <0ee7ae2f-8034-4908-b6e3-fa17a995c661@quicinc.com>

Hi Baochen,

On 2/21/24 6:18 PM, Baochen Qiang wrote:
>
>
> On 2/21/2024 8:38 PM, James Prestwood wrote:
>> Hi Baochen,
>>
>> On 2/20/24 7:17 PM, Baochen Qiang wrote:
>>> Currently host relies on CE interrupts to get notified that
>>> the service ready message is ready. This results in timeout
>>> issue if the interrupt is not fired, due to some unknown
>>> reasons. See below logs:
>>>
>>> [76321.937866] ath10k_pci 0000:02:00.0: wmi service ready event not 
>>> received
>>> ...
>>> [76322.016738] ath10k_pci 0000:02:00.0: Could not init core: -110
>>>
>>> And finally it causes WLAN interface bring up failure.
>>>
>>> Change to give it one more chance here by polling CE rings,
>>> before failing directly.
>>>
>>> Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00157-QCARMSWPZ-1
>>>
>>> Fixes: 5e3dd157d7e7 ("ath10k: mac80211 driver for Qualcomm Atheros 
>>> 802.11ac CQA98xx devices")
>>> Reported-by: James Prestwood <prestwoj@gmail.com>
>>> Link: 
>>> https://lore.kernel.org/linux-wireless/304ce305-fbe6-420e-ac2a-d61ae5e6ca1a@gmail.com/
>>> Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
>>> ---
>>>   drivers/net/wireless/ath/ath10k/wmi.c | 22 +++++++++++++++++++---
>>>   1 file changed, 19 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/net/wireless/ath/ath10k/wmi.c 
>>> b/drivers/net/wireless/ath/ath10k/wmi.c
>>> index ddf15717d504..bf6cb2c73128 100644
>>> --- a/drivers/net/wireless/ath/ath10k/wmi.c
>>> +++ b/drivers/net/wireless/ath/ath10k/wmi.c
>>> @@ -1763,12 +1763,28 @@ void ath10k_wmi_put_wmi_channel(struct 
>>> ath10k *ar, struct wmi_channel *ch,
>>>   int ath10k_wmi_wait_for_service_ready(struct ath10k *ar)
>>>   {
>>> -    unsigned long time_left;
>>> +    unsigned long time_left, i;
>>>       time_left = wait_for_completion_timeout(&ar->wmi.service_ready,
>>>                           WMI_SERVICE_READY_TIMEOUT_HZ);
>>> -    if (!time_left)
>>> -        return -ETIMEDOUT;
>>> +    if (!time_left) {
>>> +        /* Sometimes the PCI HIF doesn't receive interrupt
>>> +         * for the service ready message even if the buffer
>>> +         * was completed. PCIe sniffer shows that it's
>>> +         * because the corresponding CE ring doesn't fires
>>> +         * it. Workaround here by polling CE rings once.
>>> +         */
>>> +        ath10k_warn(ar, "failed to receive service ready 
>>> completion, polling..\n");
>>> +
>>> +        for (i = 0; i < CE_COUNT; i++)
>>> +            ath10k_hif_send_complete_check(ar, i, 1);
>>> +
>>> +        time_left = 
>>> wait_for_completion_timeout(&ar->wmi.service_ready,
>>> +                            WMI_SERVICE_READY_TIMEOUT_HZ);
>>> +        if (!time_left)
>>> +            return -ETIMEDOUT;
>>> +    }
>>> +
>>>       return 0;
>>>   }
>>>
>>> base-commit: 707e306f3573fa321ae197d77366578e4566cff5
>>
>> Thank you for looking at this I will test this and see if it resolves 
>> the problem we're seeing but since its somewhat rare it may take me a 
>> bit to validate.
>>
>> Is this any different than just trying to bring up the interface 
>> again from userspace? I could be wrong, but my concern with this is 
>> that when I retried in userspace things got into a very odd state:
>>
>>   - IWD starts
>>
>>   - ifdown interface
>>
>>   - ifup interface, timeout -110
>>
>>   - Retry ifup, success
>>
>>   - Authenticate/associate succeed
>>
>>   - 4-way handshake fails because the device never received the 1/4 
>> frame.
>>
> Don't get time to look into this case, but I suppose there might be 
> some issues in error handling when interface up fails, kind of 
> incorrect irq enable/disable or something else impacting data path, so 
> no data frame received even after a second interface up retry succeeds,
>
> Anyway please test this patch, which is supposed to be the right fix 
> to this issue.

This does appear to have fixed it! For reference this was my test:

  for i in $(seq 1 100000); do sudo ip link set wlan0 down; sudo ip link 
set wlan0 up; echo $?; done

I never saw the up command fail, and after a while I noticed one of the 
iterations took a bit longer to complete. Checked dmesg and saw:

[ 1006.017198] ath10k_pci 0000:02:00.0: failed to receive service ready 
completion, polling..
[ 1006.017295] ath10k_pci 0000:02:00.0: service ready completion 
received, continuing normally

I then started IWD and it was able to connect fine (data frames were 
being passed). I was able to trigger this 3 times relatively quickly, 
each time IWD connected afterwards. So from my end this appears fixed.

You can add tested-by me if you like:

Tested-By: James Prestwood <prestwoj@gmail.com> # on QCA6174 hw3.2


>
>> IWD would then retry indefinitely with auth/assoc succeeding but 
>> never receiving any 4-way handshake frames. The only way to get 
>> things working again was reloading the ath10k driver/reboot. Maybe 
>> this patch is different because its waiting for the initial request 
>> and no issuing a second one? Just wanted to point that out in case it 
>> sheds any light.
>>
>> Thanks,
>>
>> James
>>

  reply	other threads:[~2024-02-28 18:46 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-21  3:17 [PATCH] wifi: ath10k: poll service ready message before failing Baochen Qiang
2024-02-21 12:38 ` James Prestwood
2024-02-22  2:18   ` Baochen Qiang
2024-02-28 18:46     ` James Prestwood [this message]
2024-02-28 21:31       ` Jeff Johnson
2024-02-26 10:45 ` Kalle Valo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6362d9b2-6ed2-4454-bf1b-8614d181bc93@gmail.com \
    --to=prestwoj@gmail.com \
    --cc=ath10k@lists.infradead.org \
    --cc=linux-wireless@vger.kernel.org \
    --cc=quic_bqiang@quicinc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).