* [PATCH RESEND] ath10k: Fix potential Rx ring corruption
@ 2015-01-09 17:19 Vasanthakumar Thiagarajan
2015-01-10 0:36 ` Ben Greear
2015-01-13 14:22 ` Kalle Valo
0 siblings, 2 replies; 6+ messages in thread
From: Vasanthakumar Thiagarajan @ 2015-01-09 17:19 UTC
To: ath10k; +Cc: linux-wireless, Vasanthakumar Thiagarajan
When replenishing Rx buffers, the driver updates the address of the
buffer and the index of the rx buffer in the rx ring for the firmware.
If the CPU reorders these writes, the rx ring can be corrupted. Add a
memory barrier before updating the rx buffer index to guarantee the
order.

This could fix some instances of rx ring corruption due to the done
bit in the rx attention flag not being set.

Signed-off-by: Vasanthakumar Thiagarajan <vthiagar@qti.qualcomm.com>
---
drivers/net/wireless/ath/ath10k/htt_rx.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c b/drivers/net/wireless/ath/ath10k/htt_rx.c
index 9c782a4..baa1c44 100644
--- a/drivers/net/wireless/ath/ath10k/htt_rx.c
+++ b/drivers/net/wireless/ath/ath10k/htt_rx.c
@@ -97,6 +97,11 @@ static int __ath10k_htt_rx_ring_fill_n(struct ath10k_htt *htt, int num)
 	}
 
 fail:
+	/*
+	 * Make sure the rx buffer is updated before available buffer
+	 * index to avoid any potential rx ring corruption.
+	 */
+	mb();
 	*htt->rx_ring.alloc_idx.vaddr = __cpu_to_le32(idx);
 	return ret;
 }
--
1.7.9.5
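
For readers following the ordering argument in the commit message, here is
a minimal userspace sketch of the same publish pattern: write the ring
slots first, issue a barrier, then publish the index. It is an analogy
only, not the driver code. The names demo_rx_ring, demo_ring_fill_n and
RING_SIZE are hypothetical, and the C11 release fence stands in for the
kernel's mb(). A C11 fence orders writes as seen by other CPU threads; it
says nothing about what a device or firmware observes, which is why the
patch uses a kernel barrier instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical size; must be a power of two */

/* Hypothetical stand-in for an rx ring shared with a consumer. */
struct demo_rx_ring {
	uint32_t paddrs[RING_SIZE]; /* buffer addresses the consumer reads */
	_Atomic uint32_t alloc_idx; /* index published to the consumer */
};

/*
 * Write n buffer addresses into the ring, then publish the new index.
 * Returns the published index. The caller must not pass more entries
 * than the ring has free slots.
 */
static uint32_t demo_ring_fill_n(struct demo_rx_ring *ring,
				 const uint32_t *addrs, uint32_t n)
{
	uint32_t idx = atomic_load_explicit(&ring->alloc_idx,
					    memory_order_relaxed);
	uint32_t i;

	assert(n <= RING_SIZE);

	for (i = 0; i < n; i++) {
		ring->paddrs[idx] = addrs[i];
		idx = (idx + 1) & (RING_SIZE - 1);
	}

	/*
	 * Userspace analogue of the mb() added by the patch: all slot
	 * writes above are ordered before the index store below.
	 */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ring->alloc_idx, idx, memory_order_relaxed);

	return idx;
}
```

Without the fence, a consumer that reads the new index could still see a
stale slot, which is the kind of reordering the commit message describes.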
* Re: [PATCH RESEND] ath10k: Fix potential Rx ring corruption
2015-01-09 17:19 [PATCH RESEND] ath10k: Fix potential Rx ring corruption Vasanthakumar Thiagarajan
@ 2015-01-10 0:36 ` Ben Greear
2015-01-10 19:01 ` Ben Greear
2015-01-13 14:22 ` Kalle Valo
1 sibling, 1 reply; 6+ messages in thread
From: Ben Greear @ 2015-01-10 0:36 UTC
To: Vasanthakumar Thiagarajan; +Cc: ath10k, linux-wireless
I added this to my tree (and a bunch more debug stuff to track
CE transport-ids), and I've done about 4500 station reconnects over
the last 2 hours and no tx-credits hang issue so far.
Could be my debugging code or that I'm getting lucky, but I'm hopeful
that your patch actually fixed the problem I was seeing!
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: [PATCH RESEND] ath10k: Fix potential Rx ring corruption
2015-01-10 0:36 ` Ben Greear
@ 2015-01-10 19:01 ` Ben Greear
[not found] ` <1420969591846.36256@qti.qualcomm.com>
0 siblings, 1 reply; 6+ messages in thread
From: Ben Greear @ 2015-01-10 19:01 UTC
To: Vasanthakumar Thiagarajan; +Cc: linux-wireless, ath10k
Well, the problem is not solved after all. I had a total of 5 crashes on the overnight run. They must
all have happened before midnight, because that is as far back as my logs go (journald was not configured
to use enough space; fixed for next time), and there have been no crashes since then.
Still, it is at least no worse.
I wonder if a similar mb() is needed in the firmware somewhere?
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: [PATCH RESEND] ath10k: Fix potential Rx ring corruption
[not found] ` <1420969591846.36256@qti.qualcomm.com>
@ 2015-01-11 10:06 ` Vasanthakumar Thiagarajan
2015-01-11 15:33 ` Ben Greear
0 siblings, 1 reply; 6+ messages in thread
From: Vasanthakumar Thiagarajan @ 2015-01-11 10:06 UTC
To: Ben Greear; +Cc: linux-wireless, ath10k
> Well, problem is not solved after all. Had total of 5 crashes on overnight run, must have
> all been before midnight, because that is the earliest logs I see (journald was not configured
> to use enough space...fixed for next time) and no crashes since then.
I'm not sure about the crash you were originally seeing. This commit fixes rx ring buffer corruption;
it could make some difference to buffer corruption in copy engine 1.
>
> Still, it is at least no worse.
>
> I wonder if a similar mb() is needed in the firmware somewhere?
Unlikely. There will be enough time for the host to see the updated index and
rx buffer after the fw updates them while sending the htt rx indication. The host
accesses them only when processing the htt message.
Vasanth
* Re: [PATCH RESEND] ath10k: Fix potential Rx ring corruption
2015-01-11 10:06 ` Vasanthakumar Thiagarajan
@ 2015-01-11 15:33 ` Ben Greear
0 siblings, 0 replies; 6+ messages in thread
From: Ben Greear @ 2015-01-11 15:33 UTC
To: Vasanthakumar Thiagarajan; +Cc: linux-wireless, ath10k
On 01/11/2015 02:06 AM, Vasanthakumar Thiagarajan wrote:
>
>
>
>> Well, problem is not solved after all. Had total of 5 crashes on overnight run, must have
>> all been before midnight, because that is the earliest logs I see (journald was not configured
>> to use enough space...fixed for next time) and no crashes since then.
>
> Not sure about the crash you are originally seeing. This commit fixes rx ring buffer corruption,
> this could make some difference in buffer corruption in copy engine 1.
Well, it wasn't a crash until I added a keep-alive timer and an assert. What I mean is that
the WMI transport hangs, apparently due to a lost message (or ack) or two between the firmware
and the host.
It is slow to debug, because the second-to-last dbglog message from the firmware is sent towards the host
but never seen by the host. So I am going to have to play some more tricks to see the missing dbglog messages.
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: [PATCH RESEND] ath10k: Fix potential Rx ring corruption
2015-01-09 17:19 [PATCH RESEND] ath10k: Fix potential Rx ring corruption Vasanthakumar Thiagarajan
2015-01-10 0:36 ` Ben Greear
@ 2015-01-13 14:22 ` Kalle Valo
1 sibling, 0 replies; 6+ messages in thread
From: Kalle Valo @ 2015-01-13 14:22 UTC
To: Vasanthakumar Thiagarajan; +Cc: ath10k, linux-wireless
Vasanthakumar Thiagarajan <vthiagar@qti.qualcomm.com> writes:
> When replenishing Rx buffers driver updates the address of the
> buffer and the index of rx buffer in rx ring to the firmware.
> Change in order by CPU can cause rx ring corruption. Add memory
> barrier before updating rx buffer index to guarantee the order.
>
> This could fix some instances of rx ring corruption due to done
> bit in rx attention flag not set.
>
> Signed-off-by: Vasanthakumar Thiagarajan <vthiagar@qti.qualcomm.com>
Thanks, applied to ath.git.
--
Kalle Valo