linux-wireless.vger.kernel.org archive mirror
* ath9k will not tx packets sometimes.
@ 2018-01-26 22:53 Ben Greear
  2018-01-27 13:11 ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 15+ messages in thread
From: Ben Greear @ 2018-01-26 22:53 UTC
  To: linux-wireless@vger.kernel.org

I'm doing a test with 200 virtual stations on each of 6 ath9k radios.

When I configure stations for DHCP, I see cases where stations on a particular
radio will not transmit anything sometimes.  I see no 'XMIT' logs that show indication of
frames being received in the driver from the upper stack, but if I use 'tshark' on
a station interface, it shows frames being 'transmitted'.

I do, however, see this, which looks like it might show
an issue.  It looks like whatever 'aqm' is, it has an ever expanding number
of backlog packets:

[root@2u-6n lanforge]# cat /debug/ieee80211/wiphy2/netdev\:sta30194/stations/00\:0e\:8e\:69\:b8\:f7/aqm
target 49999us interval 299999us ecn no
tid ac backlog-bytes backlog-packets new-flows drops marks overlimit collisions tx-bytes tx-packets
0 2 27616 440 9 0 0 0 428 998 7
1 3 0 0 0 0 0 0 0 0 0
2 3 0 0 0 0 0 0 0 0 0
3 2 0 0 0 0 0 0 0 0 0
4 1 0 0 0 0 0 0 0 0 0
5 1 0 0 0 0 0 0 0 0 0
6 0 0 0 1 0 0 0 0 390 1
7 0 0 0 637 0 0 0 792 22072 903
8 2 0 0 0 0 0 0 0 0 0
9 3 0 0 0 0 0 0 0 0 0
10 3 0 0 0 0 0 0 0 0 0
11 2 0 0 0 0 0 0 0 0 0
12 1 0 0 0 0 0 0 0 0 0
13 1 0 0 0 0 0 0 0 0 0
14 0 0 0 0 0 0 0 0 0 0
15 0 0 0 0 0 0 0 0 0 0


Anyone have any pointers as to whether this might be a real issue or not?  I'll go
digging in the meantime....

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: ath9k will not tx packets sometimes.
From: Toke Høiland-Jørgensen @ 2018-01-27 13:11 UTC
  To: Ben Greear, linux-wireless@vger.kernel.org

Ben Greear <greearb@candelatech.com> writes:

> I'm doing a test with 200 virtual stations on each of 6 ath9k radios.
>
> When I configure stations for DHCP, I see cases where stations on a particular
> radio will not transmit anything sometimes.  I see no 'XMIT' logs that show indication of
> frames being received in the driver from the upper stack, but if I use 'tshark' on
> a station interface, it shows frames being 'transmitted'.
>
> I do, however, see this, which looks like it might show
> an issue.  It looks like whatever 'aqm' is, it has an ever expanding number
> of backlog packets:

The aqm is the intermediate queues in mac80211. So this indicates that
the driver is not pulling packets for transmission.
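
For anyone following along, the columns of the aqm file can be read
mechanically. The C below is a minimal sketch that parses one per-TID row,
using the column order from the header line in the output quoted earlier in
this thread. The struct and function names are made up for illustration, and
the "looks stuck" check is only a heuristic suggested by the symptoms
described here, not something mac80211 defines.

```c
#include <assert.h>
#include <stdio.h>

/* One per-TID row of the aqm debugfs file, in the column order of the
 * header line: tid ac backlog-bytes backlog-packets new-flows drops marks
 * overlimit collisions tx-bytes tx-packets. */
struct aqm_row {
	unsigned int tid, ac;
	unsigned long backlog_bytes, backlog_packets;
	unsigned long new_flows, drops, marks, overlimit, collisions;
	unsigned long tx_bytes, tx_packets;
};

/* Parse one data row; returns 0 on success, -1 if the row is malformed.
 * A trailing "flags" column, if present, is ignored. */
int aqm_parse_row(const char *line, struct aqm_row *r)
{
	int n;

	assert(line && r);
	n = sscanf(line, "%u %u %lu %lu %lu %lu %lu %lu %lu %lu %lu",
		   &r->tid, &r->ac, &r->backlog_bytes, &r->backlog_packets,
		   &r->new_flows, &r->drops, &r->marks, &r->overlimit,
		   &r->collisions, &r->tx_bytes, &r->tx_packets);
	return n == 11 ? 0 : -1;
}

/* Heuristic (an assumption, not a kernel rule): between two samples, a TID
 * looks stuck if it still has a backlog that did not shrink while the
 * tx-packets counter did not advance. */
int aqm_tid_looks_stuck(const struct aqm_row *before,
			const struct aqm_row *after)
{
	return after->backlog_packets > 0 &&
	       after->backlog_packets >= before->backlog_packets &&
	       after->tx_packets == before->tx_packets;
}
```

Fed the row "0 2 27616 440 9 0 0 0 428 998 7", this reports TID 0 with 440
packets backlogged and only 7 packets ever sent. Sampling the file twice and
comparing the two rows is one way to tell a stalled queue from a busy one.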

With that many stations, I wonder whether it is due to the airtime
fairness scheduler throttling the station? What is the contents of
debug/ieee80211/wiphy2/netdev\:sta30194/stations/00\:0e\:8e\:69\:b8\:f7/airtime
while the station is not transmitting? And is it all stations on that
particular radio, or only some of them?

-Toke

* Re: ath9k will not tx packets sometimes.
From: Ben Greear @ 2018-01-29 21:34 UTC
  To: Toke Høiland-Jørgensen, linux-wireless@vger.kernel.org

On 01/27/2018 05:11 AM, Toke Høiland-Jørgensen wrote:
> Ben Greear <greearb@candelatech.com> writes:
>
>> I'm doing a test with 200 virtual stations on each of 6 ath9k radios.
>>
>> When I configure stations for DHCP, I see cases where stations on a particular
>> radio will not transmit anything sometimes.  I see no 'XMIT' logs that show indication of
>> frames being received in the driver from the upper stack, but if I use 'tshark' on
>> a station interface, it shows frames being 'transmitted'.
>>
>> I do, however, see this, which looks like it might show
>> an issue.  It looks like whatever 'aqm' is, it has an ever expanding number
>> of backlog packets:
>
> The aqm is the intermediate queues in mac80211. So this indicates that
> the driver is not pulling packets for transmission.
>
> With that many stations, I wonder whether it is due to the airtime
> fairness scheduler throttling the station? What is the contents of
> debug/ieee80211/wiphy2/netdev\:sta30194/stations/00\:0e\:8e\:69\:b8\:f7/airtime
> while the station is not transmitting? And is it all stations on that
> particular radio, or only some of them?

Here is the output of airtime and aqm on a hung station:

# cat /debug/ieee80211/wiphy0/netdev\:sta10057/stations/00\:0e\:8e\:50\:74\:8a/airtime
RX: 83706 us
TX: 4202 us
Deficit: VO: 198 us VI: 300 us BE: -8306 us BK: 300 us

# cat /debug/ieee80211/wiphy0/netdev\:sta10057/stations/00\:0e\:8e\:50\:74\:8a/aqm
target 49999us interval 299999us ecn no
tid ac backlog-bytes backlog-packets new-flows drops marks overlimit collisions tx-bytes tx-packets flags
0 2 3476 14 6 0 0 0 8 326 3 0x6(RUN AMPDU NO-AMSDU)
1 3 0 0 0 0 0 0 0 0 0 0x0(RUN)
2 3 0 0 0 0 0 0 0 0 0 0x0(RUN)
3 2 0 0 0 0 0 0 0 0 0 0x0(RUN)
4 1 0 0 0 0 0 0 0 0 0 0x0(RUN)
5 1 0 0 0 0 0 0 0 0 0 0x0(RUN)
6 0 0 0 0 0 0 0 0 0 0 0x0(RUN)
7 0 0 0 4 0 0 0 6 168 7 0x0(RUN)
8 2 0 0 0 0 0 0 0 0 0 0x0(RUN)
9 3 0 0 0 0 0 0 0 0 0 0x0(RUN)
10 3 0 0 0 0 0 0 0 0 0 0x0(RUN)
11 2 0 0 0 0 0 0 0 0 0 0x0(RUN)
12 1 0 0 0 0 0 0 0 0 0 0x0(RUN)
13 1 0 0 0 0 0 0 0 0 0 0x0(RUN)
14 0 0 0 0 0 0 0 0 0 0 0x0(RUN)
15 0 0 0 0 0 0 0 0 0 0 0x0(RUN)


My tool will effectively down/up the interface after 30 seconds if DHCP has not
been acquired.  Here is another set of debug output, taken after that has happened:

# cat /debug/ieee80211/wiphy0/netdev\:sta10057/stations/00\:0e\:8e\:50\:74\:8a/airtime
RX: 0 us
TX: 3946 us
Deficit: VO: 254 us VI: 300 us BE: 300 us BK: 300 us

# cat /debug/ieee80211/wiphy0/netdev\:sta10057/stations/00\:0e\:8e\:50\:74\:8a/aqm
target 49999us interval 299999us ecn no
tid ac backlog-bytes backlog-packets new-flows drops marks overlimit collisions tx-bytes tx-packets flags
0 2 1376 6 2 0 0 0 4 0 0 0x6(RUN AMPDU NO-AMSDU)
1 3 0 0 0 0 0 0 0 0 0 0x0(RUN)
2 3 0 0 0 0 0 0 0 0 0 0x0(RUN)
3 2 0 0 0 0 0 0 0 0 0 0x0(RUN)
4 1 0 0 0 0 0 0 0 0 0 0x0(RUN)
5 1 0 0 0 0 0 0 0 0 0 0x0(RUN)
6 0 0 0 0 0 0 0 0 0 0 0x0(RUN)
7 0 0 0 12 0 0 0 13 312 13 0x0(RUN)
8 2 0 0 0 0 0 0 0 0 0 0x0(RUN)
9 3 0 0 0 0 0 0 0 0 0 0x0(RUN)
10 3 0 0 0 0 0 0 0 0 0 0x0(RUN)
11 2 0 0 0 0 0 0 0 0 0 0x0(RUN)
12 1 0 0 0 0 0 0 0 0 0 0x0(RUN)
13 1 0 0 0 0 0 0 0 0 0 0x0(RUN)
14 0 0 0 0 0 0 0 0 0 0 0x0(RUN)
15 0 0 0 0 0 0 0 0 0 0 0x0(RUN)

I was able to reproduce this on a system by configuring "only" 200 stations on
a single ath9k radio, so a dedicated individual could probably reproduce
this on their own system as well.  Stock kernels are less optimized for this, but at least
for ath9k, they should function with multiple virtual station devices...

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: ath9k will not tx packets sometimes.
From: Toke Høiland-Jørgensen @ 2018-01-29 21:47 UTC
  To: Ben Greear, linux-wireless@vger.kernel.org

Ben Greear <greearb@candelatech.com> writes:

> On 01/27/2018 05:11 AM, Toke Høiland-Jørgensen wrote:
>> Ben Greear <greearb@candelatech.com> writes:
>>
>>> I'm doing a test with 200 virtual stations on each of 6 ath9k radios.
>>>
>>> When I configure stations for DHCP, I see cases where stations on a particular
>>> radio will not transmit anything sometimes.  I see no 'XMIT' logs that show indication of
>>> frames being received in the driver from the upper stack, but if I use 'tshark' on
>>> a station interface, it shows frames being 'transmitted'.
>>>
>>> I do, however, see this, which looks like it might show
>>> an issue.  It looks like whatever 'aqm' is, it has an ever expanding number
>>> of backlog packets:
>>
>> The aqm is the intermediate queues in mac80211. So this indicates that
>> the driver is not pulling packets for transmission.
>>
>> With that many stations, I wonder whether it is due to the airtime
>> fairness scheduler throttling the station? What is the contents of
>> debug/ieee80211/wiphy2/netdev\:sta30194/stations/00\:0e\:8e\:69\:b8\:f7/airtime
>> while the station is not transmitting? And is it all stations on that
>> particular radio, or only some of them?
>
> Here is the output of airtime and aqm on a hung station:
>
> # cat /debug/ieee80211/wiphy0/netdev\:sta10057/stations/00\:0e\:8e\:50\:74\:8a/airtime
> RX: 83706 us
> TX: 4202 us
> Deficit: VO: 198 us VI: 300 us BE: -8306 us BK: 300 us

Right. This looks like incoming traffic is depleting the airtime quantum
faster than it can be replenished by the scheduler, which means that the
station gets completely starved.
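
To make the arithmetic concrete, here is a toy model of the deficit
accounting, written as a sketch rather than the actual ath9k code. It assumes
a 300 us quantum (the ceiling seen in the Deficit line above) and assumes
that each scheduler visit to a station whose deficit is <= 0 adds one quantum
and skips the station. The toy_ names are hypothetical.

```c
#include <assert.h>

/* Assumed quantum: the Deficit output in this thread tops out at 300 us
 * per access category, so 300 is used here for illustration. */
#define TOY_AIRTIME_QUANTUM_US 300

/* Charge used airtime (RX or TX) against a station's deficit. */
int toy_charge_airtime(int deficit_us, int airtime_us)
{
	assert(airtime_us >= 0);
	return deficit_us - airtime_us;
}

/* Count how many scheduler visits pass before the station may transmit
 * again, assuming each visit to a station with deficit <= 0 adds one
 * quantum and moves on, and that no new airtime is charged meanwhile. */
int toy_rounds_until_eligible(int deficit_us)
{
	int rounds = 0;

	while (deficit_us <= 0) {
		deficit_us += TOY_AIRTIME_QUANTUM_US;
		rounds++;
	}
	return rounds;
}
```

With BE at -8306 us, toy_rounds_until_eligible(-8306) comes out at 28 visits,
and that is the best case where no further RX airtime is charged in between.
If received traffic keeps being charged faster than one quantum per visit,
the deficit never becomes positive in this model.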

Could you try turning off the airtime scheduler?

echo 0 > /sys/kernel/debug/ieee80211/wiphy0/ath9k/airtime_flags

and see if the problem goes away.

If it does, please check if the problem persists when setting
airtime_flags to 1 (which means only include TX airtime).
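
For reference, a sketch of how the flag values could gate the accounting.
The only things stated above are that 0 disables the airtime scheduler and
that 1 counts TX airtime only; treating the value as a bitmask with a
separate RX bit is an assumption, and the TOY_ names are not the driver's.

```c
#include <assert.h>

/* Bit values are an assumption for illustration. What is stated above is
 * only that 0 turns the scheduler off and 1 counts TX airtime only. */
#define TOY_AIRTIME_USE_TX (1u << 0)
#define TOY_AIRTIME_USE_RX (1u << 1)

/* Apply one RX and one TX airtime sample to a deficit under the given
 * flags and return the resulting deficit in microseconds. */
int toy_account(unsigned int flags, int deficit_us, int rx_us, int tx_us)
{
	assert(rx_us >= 0 && tx_us >= 0);
	if (flags & TOY_AIRTIME_USE_RX)
		deficit_us -= rx_us;
	if (flags & TOY_AIRTIME_USE_TX)
		deficit_us -= tx_us;
	return deficit_us;
}
```

In this model, flags = 0 leaves a deficit untouched no matter how much
traffic is seen, while flags = 1 charges only the station's own
transmissions, which is what separates "RX airtime is starving the station"
from other causes.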

-Toke

* Re: ath9k will not tx packets sometimes.
From: Ben Greear @ 2018-01-29 21:54 UTC
  To: Toke Høiland-Jørgensen, linux-wireless@vger.kernel.org

On 01/29/2018 01:47 PM, Toke Høiland-Jørgensen wrote:
> Ben Greear <greearb@candelatech.com> writes:
>
>> On 01/27/2018 05:11 AM, Toke Høiland-Jørgensen wrote:
>>> Ben Greear <greearb@candelatech.com> writes:
>>>
>>>> I'm doing a test with 200 virtual stations on each of 6 ath9k radios.
>>>>
>>>> When I configure stations for DHCP, I see cases where stations on a particular
>>>> radio will not transmit anything sometimes.  I see no 'XMIT' logs that show indication of
>>>> frames being received in the driver from the upper stack, but if I use 'tshark' on
>>>> a station interface, it shows frames being 'transmitted'.
>>>>
>>>> I do, however, see this, which looks like it might show
>>>> an issue.  It looks like whatever 'aqm' is, it has an ever expanding number
>>>> of backlog packets:
>>>
>>> The aqm is the intermediate queues in mac80211. So this indicates that
>>> the driver is not pulling packets for transmission.
>>>
>>> With that many stations, I wonder whether it is due to the airtime
>>> fairness scheduler throttling the station? What is the contents of
>>> debug/ieee80211/wiphy2/netdev\:sta30194/stations/00\:0e\:8e\:69\:b8\:f7/airtime
>>> while the station is not transmitting? And is it all stations on that
>>> particular radio, or only some of them?
>>
>> Here is the output of airtime and aqm on a hung station:
>>
>> # cat /debug/ieee80211/wiphy0/netdev\:sta10057/stations/00\:0e\:8e\:50\:74\:8a/airtime
>> RX: 83706 us
>> TX: 4202 us
>> Deficit: VO: 198 us VI: 300 us BE: -8306 us BK: 300 us
>
> Right. This looks like incoming traffic is depleting the airtime quantum
> faster than it can be replenished by the scheduler, which means that the
> station gets completely starved.
>
> Could you try turning off the airtime scheduler?
>
> echo 0 > /sys/kernel/debug/ieee80211/wiphy0/ath9k/airtime_flags
>
> and see if the problem goes away.
>
> If it does, please check if the problem persists when setting
> airtime_flags to 1 (which means only include TX airtime).
>
> -Toke
>

That did not seem to help:

# cat /debug/ieee80211/wiphy0/netdev\:sta10058/stations/00\:0e\:8e\:50\:74\:8a/node_aggr
Max-AMPDU: 65535
MPDU Density: 8


TID  SEQ_START  SEQ_NEXT  BAW_SIZE  BAW_HEAD  BAW_TAIL  BAR_IDX SCHED HAS-QUED
   0          0         0        64         0         0       -1     1        1

# cat /debug/ieee80211/wiphy0/netdev\:sta10058/stations/00\:0e\:8e\:50\:74\:8a/airtime
RX: 0 us
TX: 4682 us
Deficit: VO: 300 us VI: 300 us BE: 300 us BK: 300 us

# cat /debug/ieee80211/wiphy0/netdev\:sta10058/stations/00\:0e\:8e\:50\:74\:8a/aqm
target 49999us interval 299999us ecn no
tid ac backlog-bytes backlog-packets new-flows drops marks overlimit collisions tx-bytes tx-packets flags
0 2 2406 11 3 0 0 0 7 0 0 0x6(RUN AMPDU NO-AMSDU)
1 3 0 0 0 0 0 0 0 0 0 0x0(RUN)
2 3 0 0 0 0 0 0 0 0 0 0x0(RUN)
3 2 0 0 0 0 0 0 0 0 0 0x0(RUN)
4 1 0 0 0 0 0 0 0 0 0 0x0(RUN)
5 1 0 0 0 0 0 0 0 0 0 0x0(RUN)
6 0 0 0 0 0 0 0 0 0 0 0x0(RUN)
7 0 0 0 12 0 0 0 13 312 13 0x0(RUN)
8 2 0 0 0 0 0 0 0 0 0 0x0(RUN)
9 3 0 0 0 0 0 0 0 0 0 0x0(RUN)
10 3 0 0 0 0 0 0 0 0 0 0x0(RUN)
11 2 0 0 0 0 0 0 0 0 0 0x0(RUN)
12 1 0 0 0 0 0 0 0 0 0 0x0(RUN)
13 1 0 0 0 0 0 0 0 0 0 0x0(RUN)
14 0 0 0 0 0 0 0 0 0 0 0x0(RUN)
15 0 0 0 0 0 0 0 0 0 0 0x0(RUN)


Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: ath9k will not tx packets sometimes.
From: Toke Høiland-Jørgensen @ 2018-01-29 22:35 UTC
  To: Ben Greear, linux-wireless@vger.kernel.org

Ben Greear <greearb@candelatech.com> writes:

> On 01/29/2018 01:47 PM, Toke Høiland-Jørgensen wrote:
>> Ben Greear <greearb@candelatech.com> writes:
>>
>>> On 01/27/2018 05:11 AM, Toke Høiland-Jørgensen wrote:
>>>> Ben Greear <greearb@candelatech.com> writes:
>>>>
>>>>> I'm doing a test with 200 virtual stations on each of 6 ath9k radios.
>>>>>
>>>>> When I configure stations for DHCP, I see cases where stations on a particular
>>>>> radio will not transmit anything sometimes.  I see no 'XMIT' logs that show indication of
>>>>> frames being received in the driver from the upper stack, but if I use 'tshark' on
>>>>> a station interface, it shows frames being 'transmitted'.
>>>>>
>>>>> I do, however, see this, which looks like it might show
>>>>> an issue.  It looks like whatever 'aqm' is, it has an ever expanding number
>>>>> of backlog packets:
>>>>
>>>> The aqm is the intermediate queues in mac80211. So this indicates that
>>>> the driver is not pulling packets for transmission.
>>>>
>>>> With that many stations, I wonder whether it is due to the airtime
>>>> fairness scheduler throttling the station? What is the contents of
>>>> debug/ieee80211/wiphy2/netdev\:sta30194/stations/00\:0e\:8e\:69\:b8\:f7/airtime
>>>> while the station is not transmitting? And is it all stations on that
>>>> particular radio, or only some of them?
>>>
>>> Here is the output of airtime and aqm on a hung station:
>>>
>>> # cat /debug/ieee80211/wiphy0/netdev\:sta10057/stations/00\:0e\:8e\:50\:74\:8a/airtime
>>> RX: 83706 us
>>> TX: 4202 us
>>> Deficit: VO: 198 us VI: 300 us BE: -8306 us BK: 300 us
>>
>> Right. This looks like incoming traffic is depleting the airtime quantum
>> faster than it can be replenished by the scheduler, which means that the
>> station gets completely starved.
>>
>> Could you try turning off the airtime scheduler?
>>
>> echo 0 > /sys/kernel/debug/ieee80211/wiphy0/ath9k/airtime_flags
>>
>> and see if the problem goes away.
>>
>> If it does, please check if the problem persists when setting
>> airtime_flags to 1 (which means only include TX airtime).
>>
>> -Toke
>>
>
> That did not seem to help:
>
> # cat /debug/ieee80211/wiphy0/netdev\:sta10058/stations/00\:0e\:8e\:50\:74\:8a/node_aggr
> Max-AMPDU: 65535
> MPDU Density: 8
>
>
> TID  SEQ_START  SEQ_NEXT  BAW_SIZE  BAW_HEAD  BAW_TAIL  BAR_IDX SCHED HAS-QUED
>    0          0         0        64         0         0       -1     1        1

Hmm, SCHED and HAS-QUED are both set, so it should be scheduled. Is the
scheduler maybe simply taking too long to get round to scheduling that
station again?

What happens if you don't kill things after 30 seconds? Is it hanging
forever, or just long enough for your tools to lose patience?

If you have 200 stations all requesting DHCP addresses I could see how
things might take a while...

-Toke

* Re: ath9k will not tx packets sometimes.
From: Ben Greear @ 2018-01-29 23:00 UTC
  To: Toke Høiland-Jørgensen, linux-wireless@vger.kernel.org

On 01/29/2018 02:35 PM, Toke Høiland-Jørgensen wrote:
> Ben Greear <greearb@candelatech.com> writes:
>
>> On 01/29/2018 01:47 PM, Toke Høiland-Jørgensen wrote:
>>> Ben Greear <greearb@candelatech.com> writes:
>>>
>>>> On 01/27/2018 05:11 AM, Toke Høiland-Jørgensen wrote:
>>>>> Ben Greear <greearb@candelatech.com> writes:
>>>>>
>>>>>> I'm doing a test with 200 virtual stations on each of 6 ath9k radios.
>>>>>>
>>>>>> When I configure stations for DHCP, I see cases where stations on a particular
>>>>>> radio will not transmit anything sometimes.  I see no 'XMIT' logs that show indication of
>>>>>> frames being received in the driver from the upper stack, but if I use 'tshark' on
>>>>>> a station interface, it shows frames being 'transmitted'.
>>>>>>
>>>>>> I do, however, see this, which looks like it might show
>>>>>> an issue.  It looks like whatever 'aqm' is, it has an ever expanding number
>>>>>> of backlog packets:
>>>>>
>>>>> The aqm is the intermediate queues in mac80211. So this indicates that
>>>>> the driver is not pulling packets for transmission.
>>>>>
>>>>> With that many stations, I wonder whether it is due to the airtime
>>>>> fairness scheduler throttling the station? What is the contents of
>>>>> debug/ieee80211/wiphy2/netdev\:sta30194/stations/00\:0e\:8e\:69\:b8\:f7/airtime
>>>>> while the station is not transmitting? And is it all stations on that
>>>>> particular radio, or only some of them?
>>>>
>>>> Here is the output of airtime and aqm on a hung station:
>>>>
>>>> # cat /debug/ieee80211/wiphy0/netdev\:sta10057/stations/00\:0e\:8e\:50\:74\:8a/airtime
>>>> RX: 83706 us
>>>> TX: 4202 us
>>>> Deficit: VO: 198 us VI: 300 us BE: -8306 us BK: 300 us
>>>
>>> Right. This looks like incoming traffic is depleting the airtime quantum
>>> faster than it can be replenished by the scheduler, which means that the
>>> station gets completely starved.
>>>
>>> Could you try turning off the airtime scheduler?
>>>
>>> echo 0 > /sys/kernel/debug/ieee80211/wiphy0/ath9k/airtime_flags
>>>
>>> and see if the problem goes away.
>>>
>>> If it does, please check if the problem persists when setting
>>> airtime_flags to 1 (which means only include TX airtime).
>>>
>>> -Toke
>>>
>>
>> That did not seem to help:
>>
>> # cat /debug/ieee80211/wiphy0/netdev\:sta10058/stations/00\:0e\:8e\:50\:74\:8a/node_aggr
>> Max-AMPDU: 65535
>> MPDU Density: 8
>>
>>
>> TID  SEQ_START  SEQ_NEXT  BAW_SIZE  BAW_HEAD  BAW_TAIL  BAR_IDX SCHED HAS-QUED
>>     0          0         0        64         0         0       -1     1        1
>
> Hmm, SCHED and HAS-QUED are both set, so it should be scheduled. Is the
> scheduler maybe simply taking too long to get round to scheduling that
> station again?
>
> What happens if you don't kill things after 30 seconds? Is it hanging
> forever, or just long enough for your tools to lose patience?
>
> If you have 200 stations all requesting DHCP addresses I could see how
> things might take a while...

I bring them up in groups of 30 or so.  I typically see 1-10 of them get a
DHCP address, and then it seems that no data frames are ever tx'd again on
any interface on the radio...or at least tx is very rare.  Sometimes all 200 will come
up and pass traffic, but not reliably.  Once the system gets into this state,
down/up of the affected station interfaces does not fix it.  I have not tried
bouncing all of them at once yet.

Once it is hung, I never even see DHCP discovers on the air from any interface
when sniffing on another machine, so it should not be a simple over-busy
network issue.

Maybe there is some way for the scheduler to get stuck and not schedule anything?

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: ath9k will not tx packets sometimes.
From: Toke Høiland-Jørgensen @ 2018-01-30 12:07 UTC
  To: Ben Greear, linux-wireless@vger.kernel.org

Ben Greear <greearb@candelatech.com> writes:

> Maybe there is some way for the scheduler to get stuck and not
> schedule anything?

It would appear so, yeah. Do you do anything special other than
associate a whole bunch of stations to trigger this? I can try to see if
I can script something that works equivalently on my setup.

I'm actually reworking that whole scheduler logic and moving some of it
into mac80211. Could you test this (WiP) patch and see if it has the
same problem?
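
As a rough illustration of the interface described in the attached patch's
commit message, here is a userspace toy model of the two calls. It is a
sketch only: the toy_ names, the fixed-size ring, and the per-queue packet
counters are invented for the example, and it ignores the access-category
argument and all locking.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_TXQ 8

/* Toy stand-in for the scheduling state; not the mac80211 data structure. */
struct toy_sched {
	int ring[TOY_MAX_TXQ];       /* FIFO of scheduled queue ids */
	int head, count;
	bool scheduled[TOY_MAX_TXQ]; /* true while a queue sits in the ring */
	int backlog[TOY_MAX_TXQ];    /* packets waiting per queue */
};

/* Analogue of ieee80211_schedule_txq(): append unless already scheduled. */
void toy_schedule_txq(struct toy_sched *s, int q)
{
	assert(q >= 0 && q < TOY_MAX_TXQ);
	if (s->scheduled[q])
		return;
	s->ring[(s->head + s->count) % TOY_MAX_TXQ] = q;
	s->count++;
	s->scheduled[q] = true;
}

/* Analogue of ieee80211_next_txq(): pop the next queue, or -1 if none. */
int toy_next_txq(struct toy_sched *s)
{
	int q;

	if (s->count == 0)
		return -1;
	q = s->ring[s->head];
	s->head = (s->head + 1) % TOY_MAX_TXQ;
	s->count--;
	s->scheduled[q] = false;
	return q;
}

/* Driver-side loop: pull up to `burst` packets from each queue, then
 * re-schedule it unless it emptied. Returns the number of packets sent;
 * the service order is written to `order` (at most `max_order` entries). */
int toy_wake_tx_queue(struct toy_sched *s, int burst, int *order, int max_order)
{
	int q, sent = 0, n = 0;

	assert(burst > 0);
	while ((q = toy_next_txq(s)) >= 0) {
		int pulled = s->backlog[q] < burst ? s->backlog[q] : burst;

		s->backlog[q] -= pulled;
		sent += pulled;
		if (n < max_order)
			order[n++] = q;
		if (s->backlog[q] > 0)
			toy_schedule_txq(s, q);
	}
	return sent;
}
```

With queue 1 holding 3 packets, queue 4 holding 1, and a burst of 2, the
loop services 1, 4, 1 and then finds nothing left to schedule. The point of
the model is the contract: a queue that still has packets must be handed
back with the schedule call, otherwise nothing will ever pull from it again.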

-Toke


[-- Attachment #2: mac80211-add-txq-scheduling.patch --]
[-- Type: text/x-patch, Size: 30460 bytes --]

mac80211: Add TXQ scheduling API

From: Toke Høiland-Jørgensen <toke@toke.dk>

This adds an API to mac80211 to handle scheduling of TXQs and changes the
interface between driver and mac80211 for TXQ handling as follows:

- The wake_tx_queue callback interface no longer includes the TXQ. Instead,
  the driver is expected to retrieve that from ieee80211_next_txq()

- Two new mac80211 functions are added: ieee80211_next_txq() and
  ieee80211_schedule_txq(). The former returns the next TXQ that should be
  scheduled, and is how the driver gets a queue to pull packets from. The
  latter is called internally by mac80211 to start scheduling a queue, and
  the driver is supposed to call it to re-schedule the TXQ after it is
  finished pulling packets from it (unless the queue emptied).

The ath9k and ath10k drivers are changed to use the new API.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
---
 drivers/net/wireless/ath/ath10k/core.c |    2 
 drivers/net/wireless/ath/ath10k/core.h |    4 -
 drivers/net/wireless/ath/ath10k/mac.c  |   55 +++------
 drivers/net/wireless/ath/ath9k/ath9k.h |    4 -
 drivers/net/wireless/ath/ath9k/recv.c  |    2 
 drivers/net/wireless/ath/ath9k/xmit.c  |  192 +++++++++-----------------------
 include/net/mac80211.h                 |   38 +++++-
 net/mac80211/agg-tx.c                  |    6 +
 net/mac80211/driver-ops.h              |   12 +-
 net/mac80211/ieee80211_i.h             |    8 +
 net/mac80211/main.c                    |    3 +
 net/mac80211/sta_info.c                |    3 -
 net/mac80211/trace.h                   |   19 +--
 net/mac80211/tx.c                      |   57 +++++++++-
 14 files changed, 187 insertions(+), 218 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
index 6d065f8d7f78..042175a5653f 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -2655,9 +2655,7 @@ struct ath10k *ath10k_core_create(size_t priv_size, struct device *dev,
 
 	mutex_init(&ar->conf_mutex);
 	spin_lock_init(&ar->data_lock);
-	spin_lock_init(&ar->txqs_lock);
 
-	INIT_LIST_HEAD(&ar->txqs);
 	INIT_LIST_HEAD(&ar->peers);
 	init_waitqueue_head(&ar->peer_mapping_wq);
 	init_waitqueue_head(&ar->htt.empty_tx_wq);
diff --git a/drivers/net/wireless/ath/ath10k/core.h b/drivers/net/wireless/ath/ath10k/core.h
index 631df2137e25..398cd3c57139 100644
--- a/drivers/net/wireless/ath/ath10k/core.h
+++ b/drivers/net/wireless/ath/ath10k/core.h
@@ -346,7 +346,6 @@ struct ath10k_peer {
 };
 
 struct ath10k_txq {
-	struct list_head list;
 	unsigned long num_fw_queued;
 	unsigned long num_push_allowed;
 };
@@ -896,10 +895,7 @@ struct ath10k {
 
 	/* protects shared structure data */
 	spinlock_t data_lock;
-	/* protects: ar->txqs, artxq->list */
-	spinlock_t txqs_lock;
 
-	struct list_head txqs;
 	struct list_head arvifs;
 	struct list_head peers;
 	struct ath10k_peer *peer_map[ATH10K_MAX_NUM_PEER_IDS];
diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 75726f12a31c..e5dd97da3fa1 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -3832,12 +3832,10 @@ static void ath10k_mac_txq_init(struct ieee80211_txq *txq)
 		return;
 
 	artxq = (void *)txq->drv_priv;
-	INIT_LIST_HEAD(&artxq->list);
 }
 
 static void ath10k_mac_txq_unref(struct ath10k *ar, struct ieee80211_txq *txq)
 {
-	struct ath10k_txq *artxq;
 	struct ath10k_skb_cb *cb;
 	struct sk_buff *msdu;
 	int msdu_id;
@@ -3845,12 +3843,6 @@ static void ath10k_mac_txq_unref(struct ath10k *ar, struct ieee80211_txq *txq)
 	if (!txq)
 		return;
 
-	artxq = (void *)txq->drv_priv;
-	spin_lock_bh(&ar->txqs_lock);
-	if (!list_empty(&artxq->list))
-		list_del_init(&artxq->list);
-	spin_unlock_bh(&ar->txqs_lock);
-
 	spin_lock_bh(&ar->htt.tx_lock);
 	idr_for_each_entry(&ar->htt.pending_tx, msdu, msdu_id) {
 		cb = ATH10K_SKB_CB(msdu);
@@ -3980,23 +3972,17 @@ int ath10k_mac_tx_push_txq(struct ieee80211_hw *hw,
 void ath10k_mac_tx_push_pending(struct ath10k *ar)
 {
 	struct ieee80211_hw *hw = ar->hw;
-	struct ieee80211_txq *txq;
-	struct ath10k_txq *artxq;
-	struct ath10k_txq *last;
+	struct ieee80211_txq *txq, *first = NULL;
 	int ret;
 	int max;
 
 	if (ar->htt.num_pending_tx >= (ar->htt.max_num_pending_tx / 2))
 		return;
 
-	spin_lock_bh(&ar->txqs_lock);
 	rcu_read_lock();
 
-	last = list_last_entry(&ar->txqs, struct ath10k_txq, list);
-	while (!list_empty(&ar->txqs)) {
-		artxq = list_first_entry(&ar->txqs, struct ath10k_txq, list);
-		txq = container_of((void *)artxq, struct ieee80211_txq,
-				   drv_priv);
+	txq = ieee80211_next_txq(hw, -1);
+	while (txq) {
 
 		/* Prevent aggressive sta/tid taking over tx queue */
 		max = 16;
@@ -4007,18 +3993,21 @@ void ath10k_mac_tx_push_pending(struct ath10k *ar)
 				break;
 		}
 
-		list_del_init(&artxq->list);
 		if (ret != -ENOENT)
-			list_add_tail(&artxq->list, &ar->txqs);
+			ieee80211_schedule_txq(hw, txq);
 
 		ath10k_htt_tx_txq_update(hw, txq);
 
-		if (artxq == last || (ret < 0 && ret != -ENOENT))
+		if (first == txq || (ret < 0 && ret != -ENOENT))
 			break;
+
+		if (!first && ret != -ENOENT)
+			first = txq;
+
+		txq = ieee80211_next_txq(hw, -1);
 	}
 
 	rcu_read_unlock();
-	spin_unlock_bh(&ar->txqs_lock);
 }
 
 /************/
@@ -4252,34 +4241,22 @@ static void ath10k_mac_op_tx(struct ieee80211_hw *hw,
 	}
 }
 
-static void ath10k_mac_op_wake_tx_queue(struct ieee80211_hw *hw,
-					struct ieee80211_txq *txq)
+static void ath10k_mac_op_wake_tx_queue(struct ieee80211_hw *hw, u8 acno)
 {
-	struct ath10k *ar = hw->priv;
-	struct ath10k_txq *artxq = (void *)txq->drv_priv;
-	struct ieee80211_txq *f_txq;
-	struct ath10k_txq *f_artxq;
+	struct ieee80211_txq *txq;
 	int ret = 0;
 	int max = 16;
 
-	spin_lock_bh(&ar->txqs_lock);
-	if (list_empty(&artxq->list))
-		list_add_tail(&artxq->list, &ar->txqs);
-
-	f_artxq = list_first_entry(&ar->txqs, struct ath10k_txq, list);
-	f_txq = container_of((void *)f_artxq, struct ieee80211_txq, drv_priv);
-	list_del_init(&f_artxq->list);
+	txq = ieee80211_next_txq(hw, acno);
 
-	while (ath10k_mac_tx_can_push(hw, f_txq) && max--) {
-		ret = ath10k_mac_tx_push_txq(hw, f_txq);
+	while (ath10k_mac_tx_can_push(hw, txq) && max--) {
+		ret = ath10k_mac_tx_push_txq(hw, txq);
 		if (ret)
 			break;
 	}
 	if (ret != -ENOENT)
-		list_add_tail(&f_artxq->list, &ar->txqs);
-	spin_unlock_bh(&ar->txqs_lock);
+		ieee80211_schedule_txq(hw, txq);
 
-	ath10k_htt_tx_txq_update(hw, f_txq);
 	ath10k_htt_tx_txq_update(hw, txq);
 }
 
diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h
index ef0de4f1312c..43c2f6966366 100644
--- a/drivers/net/wireless/ath/ath9k/ath9k.h
+++ b/drivers/net/wireless/ath/ath9k/ath9k.h
@@ -246,10 +246,8 @@ struct ath_atx_tid {
 	s8 bar_index;
 	bool active;
 	bool clear_ps_filter;
-	bool has_queued;
 };
 
-void __ath_tx_queue_tid(struct ath_softc *sc, struct ath_atx_tid *tid);
 void ath_tx_queue_tid(struct ath_softc *sc, struct ath_atx_tid *tid);
 
 struct ath_node {
@@ -618,7 +616,7 @@ void ath9k_release_buffered_frames(struct ieee80211_hw *hw,
 				   u16 tids, int nframes,
 				   enum ieee80211_frame_release_type reason,
 				   bool more_data);
-void ath9k_wake_tx_queue(struct ieee80211_hw *hw, struct ieee80211_txq *queue);
+void ath9k_wake_tx_queue(struct ieee80211_hw *hw, u8 acno);
 
 /********/
 /* VIFs */
diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
index 2197aee2bb72..2032690a1335 100644
--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -1058,7 +1058,7 @@ static void ath_rx_count_airtime(struct ath_softc *sc,
 		spin_lock_bh(&acq->lock);
 		an->airtime_deficit[acno] -= airtime;
 		if (an->airtime_deficit[acno] <= 0)
-			__ath_tx_queue_tid(sc, ATH_AN_2_TID(an, tidno));
+			ath_tx_queue_tid(sc, ATH_AN_2_TID(an, tidno));
 		spin_unlock_bh(&acq->lock);
 	}
 	ath_debug_airtime(sc, an, airtime, 0);
diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
index 396bf05c6bf6..9e8bc99079a1 100644
--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -112,61 +112,24 @@ void ath_txq_unlock_complete(struct ath_softc *sc, struct ath_txq *txq)
 		ath_tx_status(hw, skb);
 }
 
-void __ath_tx_queue_tid(struct ath_softc *sc, struct ath_atx_tid *tid)
-{
-	struct ath_vif *avp = (struct ath_vif *) tid->an->vif->drv_priv;
-	struct ath_chanctx *ctx = avp->chanctx;
-	struct ath_acq *acq;
-	struct list_head *tid_list;
-	u8 acno = TID_TO_WME_AC(tid->tidno);
-
-	if (!ctx || !list_empty(&tid->list))
-		return;
-
-
-	acq = &ctx->acq[acno];
-	if ((sc->airtime_flags & AIRTIME_USE_NEW_QUEUES) &&
-	    tid->an->airtime_deficit[acno] > 0)
-		tid_list = &acq->acq_new;
-	else
-		tid_list = &acq->acq_old;
-
-	list_add_tail(&tid->list, tid_list);
-}
-
 void ath_tx_queue_tid(struct ath_softc *sc, struct ath_atx_tid *tid)
 {
-	struct ath_vif *avp = (struct ath_vif *) tid->an->vif->drv_priv;
-	struct ath_chanctx *ctx = avp->chanctx;
-	struct ath_acq *acq;
+	struct ieee80211_txq *queue = container_of(
+		(void *)tid, struct ieee80211_txq, drv_priv);
 
-	if (!ctx || !list_empty(&tid->list))
-		return;
-
-	acq = &ctx->acq[TID_TO_WME_AC(tid->tidno)];
-	spin_lock_bh(&acq->lock);
-	__ath_tx_queue_tid(sc, tid);
-	spin_unlock_bh(&acq->lock);
+	ieee80211_schedule_txq(sc->hw, queue);
 }
 
-
-void ath9k_wake_tx_queue(struct ieee80211_hw *hw, struct ieee80211_txq *queue)
+void ath9k_wake_tx_queue(struct ieee80211_hw *hw, u8 acno)
 {
 	struct ath_softc *sc = hw->priv;
 	struct ath_common *common = ath9k_hw_common(sc->sc_ah);
-	struct ath_atx_tid *tid = (struct ath_atx_tid *) queue->drv_priv;
-	struct ath_txq *txq = tid->txq;
+	struct ath_txq *txq = sc->tx.txq_map[acno];
 
-	ath_dbg(common, QUEUE, "Waking TX queue: %pM (%d)\n",
-		queue->sta ? queue->sta->addr : queue->vif->addr,
-		tid->tidno);
+	ath_dbg(common, QUEUE, "Waking TX queue: %d\n", acno);
 
 	ath_txq_lock(sc, txq);
-
-	tid->has_queued = true;
-	ath_tx_queue_tid(sc, tid);
 	ath_txq_schedule(sc, txq);
-
 	ath_txq_unlock(sc, txq);
 }
 
@@ -230,14 +193,9 @@ ath_tid_pull(struct ath_atx_tid *tid)
 	struct ath_frame_info *fi;
 	int q;
 
-	if (!tid->has_queued)
-		return NULL;
-
 	skb = ieee80211_tx_dequeue(hw, txq);
-	if (!skb) {
-		tid->has_queued = false;
+	if (!skb)
 		return NULL;
-	}
 
 	if (ath_tx_prepare(hw, skb, &txctl)) {
 		ieee80211_free_txskb(hw, skb);
@@ -254,12 +212,6 @@ ath_tid_pull(struct ath_atx_tid *tid)
 	return skb;
  }
 
-
-static bool ath_tid_has_buffered(struct ath_atx_tid *tid)
-{
-	return !skb_queue_empty(&tid->retry_q) || tid->has_queued;
-}
-
 static struct sk_buff *ath_tid_dequeue(struct ath_atx_tid *tid)
 {
 	struct sk_buff *skb;
@@ -672,7 +624,6 @@ static void ath_tx_complete_aggr(struct ath_softc *sc, struct ath_txq *txq,
 		skb_queue_splice_tail(&bf_pending, &tid->retry_q);
 		if (!an->sleeping) {
 			ath_tx_queue_tid(sc, tid);
-
 			if (ts->ts_status & (ATH9K_TXERR_FILT | ATH9K_TXERR_XRETRY))
 				tid->clear_ps_filter = true;
 		}
@@ -720,7 +671,7 @@ static void ath_tx_count_airtime(struct ath_softc *sc, struct ath_node *an,
 		spin_lock_bh(&acq->lock);
 		an->airtime_deficit[q] -= airtime;
 		if (an->airtime_deficit[q] <= 0)
-			__ath_tx_queue_tid(sc, tid);
+			ath_tx_queue_tid(sc, tid);
 		spin_unlock_bh(&acq->lock);
 	}
 	ath_debug_airtime(sc, an, 0, airtime);
@@ -937,20 +888,20 @@ static int ath_compute_num_delims(struct ath_softc *sc, struct ath_atx_tid *tid,
 	return ndelim;
 }
 
-static struct ath_buf *
+static int
 ath_tx_get_tid_subframe(struct ath_softc *sc, struct ath_txq *txq,
-			struct ath_atx_tid *tid)
+			struct ath_atx_tid *tid, struct ath_buf **buf)
 {
 	struct ieee80211_tx_info *tx_info;
 	struct ath_frame_info *fi;
-	struct sk_buff *skb, *first_skb = NULL;
 	struct ath_buf *bf;
+	struct sk_buff *skb, *first_skb = NULL;
 	u16 seqno;
 
 	while (1) {
 		skb = ath_tid_dequeue(tid);
 		if (!skb)
-			break;
+			return -ENOENT;
 
 		fi = get_frame_info(skb);
 		bf = fi->bf;
@@ -981,7 +932,7 @@ ath_tx_get_tid_subframe(struct ath_softc *sc, struct ath_txq *txq,
 
 		if (!(tx_info->flags & IEEE80211_TX_CTL_AMPDU)) {
 			bf->bf_state.bf_type = 0;
-			return bf;
+			break;
 		}
 
 		bf->bf_state.bf_type = BUF_AMPDU | BUF_AGGR;
@@ -1000,7 +951,7 @@ ath_tx_get_tid_subframe(struct ath_softc *sc, struct ath_txq *txq,
 					first_skb = skb;
 				continue;
 			}
-			break;
+			return -EBUSY;
 		}
 
 		if (tid->bar_index > ATH_BA_INDEX(tid->seq_start, seqno)) {
@@ -1014,10 +965,11 @@ ath_tx_get_tid_subframe(struct ath_softc *sc, struct ath_txq *txq,
 			continue;
 		}
 
-		return bf;
+		break;
 	}
 
-	return NULL;
+	*buf = bf;
+	return 0;
 }
 
 static int
@@ -1027,7 +979,7 @@ ath_tx_form_aggr(struct ath_softc *sc, struct ath_txq *txq,
 {
 #define PADBYTES(_len) ((4 - ((_len) % 4)) % 4)
 	struct ath_buf *bf = bf_first, *bf_prev = NULL;
-	int nframes = 0, ndelim;
+	int nframes = 0, ndelim, ret;
 	u16 aggr_limit = 0, al = 0, bpad = 0,
 	    al_delta, h_baw = tid->baw_size / 2;
 	struct ieee80211_tx_info *tx_info;
@@ -1081,7 +1033,9 @@ ath_tx_form_aggr(struct ath_softc *sc, struct ath_txq *txq,
 
 		bf_prev = bf;
 
-		bf = ath_tx_get_tid_subframe(sc, txq, tid);
+		ret = ath_tx_get_tid_subframe(sc, txq, tid, &bf);
+		if (ret < 0)
+			break;
 	}
 	goto finish;
 stop:
@@ -1478,7 +1432,7 @@ ath_tx_form_burst(struct ath_softc *sc, struct ath_txq *txq,
 		  struct ath_buf *bf_first)
 {
 	struct ath_buf *bf = bf_first, *bf_prev = NULL;
-	int nframes = 0;
+	int nframes = 0, ret;
 
 	do {
 		struct ieee80211_tx_info *tx_info;
@@ -1492,8 +1446,8 @@ ath_tx_form_burst(struct ath_softc *sc, struct ath_txq *txq,
 		if (nframes >= 2)
 			break;
 
-		bf = ath_tx_get_tid_subframe(sc, txq, tid);
-		if (!bf)
+		ret = ath_tx_get_tid_subframe(sc, txq, tid, &bf);
+		if (ret < 0)
 			break;
 
 		tx_info = IEEE80211_SKB_CB(bf->bf_mpdu);
@@ -1506,30 +1460,27 @@ ath_tx_form_burst(struct ath_softc *sc, struct ath_txq *txq,
 	} while (1);
 }
 
-static bool ath_tx_sched_aggr(struct ath_softc *sc, struct ath_txq *txq,
-			      struct ath_atx_tid *tid)
+static int ath_tx_sched_aggr(struct ath_softc *sc, struct ath_txq *txq,
+			     struct ath_atx_tid *tid)
 {
-	struct ath_buf *bf;
+	struct ath_buf *bf = NULL;
 	struct ieee80211_tx_info *tx_info;
 	struct list_head bf_q;
-	int aggr_len = 0;
+	int aggr_len = 0, ret;
 	bool aggr;
 
-	if (!ath_tid_has_buffered(tid))
-		return false;
-
 	INIT_LIST_HEAD(&bf_q);
 
-	bf = ath_tx_get_tid_subframe(sc, txq, tid);
-	if (!bf)
-		return false;
+	ret = ath_tx_get_tid_subframe(sc, txq, tid, &bf);
+	if (ret < 0)
+		return ret;
 
 	tx_info = IEEE80211_SKB_CB(bf->bf_mpdu);
 	aggr = !!(tx_info->flags & IEEE80211_TX_CTL_AMPDU);
 	if ((aggr && txq->axq_ampdu_depth >= ATH_AGGR_MIN_QDEPTH) ||
 	    (!aggr && txq->axq_depth >= ATH_NON_AGGR_MIN_QDEPTH)) {
 		__skb_queue_tail(&tid->retry_q, bf->bf_mpdu);
-		return false;
+		return -EBUSY;
 	}
 
 	ath_set_rates(tid->an->vif, tid->an->sta, bf);
@@ -1539,7 +1490,7 @@ static bool ath_tx_sched_aggr(struct ath_softc *sc, struct ath_txq *txq,
 		ath_tx_form_burst(sc, txq, tid, &bf_q, bf);
 
 	if (list_empty(&bf_q))
-		return false;
+		return -EAGAIN;
 
 	if (tid->clear_ps_filter || tid->an->no_ps_filter) {
 		tid->clear_ps_filter = false;
@@ -1548,7 +1499,7 @@ static bool ath_tx_sched_aggr(struct ath_softc *sc, struct ath_txq *txq,
 
 	ath_tx_fill_desc(sc, bf, txq, aggr_len);
 	ath_tx_txqaddbuf(sc, txq, &bf_q, false);
-	return true;
+	return 0;
 }
 
 int ath_tx_aggr_start(struct ath_softc *sc, struct ieee80211_sta *sta,
@@ -1611,28 +1562,16 @@ void ath_tx_aggr_sleep(struct ieee80211_sta *sta, struct ath_softc *sc,
 {
 	struct ath_common *common = ath9k_hw_common(sc->sc_ah);
 	struct ath_atx_tid *tid;
-	struct ath_txq *txq;
 	int tidno;
 
 	ath_dbg(common, XMIT, "%s called\n", __func__);
 
 	for (tidno = 0; tidno < IEEE80211_NUM_TIDS; tidno++) {
 		tid = ath_node_to_tid(an, tidno);
-		txq = tid->txq;
-
-		ath_txq_lock(sc, txq);
-
-		if (list_empty(&tid->list)) {
-			ath_txq_unlock(sc, txq);
-			continue;
-		}
 
 		if (!skb_queue_empty(&tid->retry_q))
 			ieee80211_sta_set_buffered(sta, tid->tidno, true);
 
-		list_del_init(&tid->list);
-
-		ath_txq_unlock(sc, txq);
 	}
 }
 
@@ -1651,11 +1590,12 @@ void ath_tx_aggr_wakeup(struct ath_softc *sc, struct ath_node *an)
 
 		ath_txq_lock(sc, txq);
 		tid->clear_ps_filter = true;
-		if (ath_tid_has_buffered(tid)) {
+		if (!skb_queue_empty(&tid->retry_q)) {
 			ath_tx_queue_tid(sc, tid);
 			ath_txq_schedule(sc, txq);
 		}
 		ath_txq_unlock_complete(sc, txq);
+
 	}
 }
 
@@ -1670,9 +1610,9 @@ void ath9k_release_buffered_frames(struct ieee80211_hw *hw,
 	struct ath_txq *txq = sc->tx.uapsdq;
 	struct ieee80211_tx_info *info;
 	struct list_head bf_q;
-	struct ath_buf *bf_tail = NULL, *bf;
+	struct ath_buf *bf_tail = NULL, *bf = NULL;
 	int sent = 0;
-	int i;
+	int i, ret;
 
 	INIT_LIST_HEAD(&bf_q);
 	for (i = 0; tids && nframes; i++, tids >>= 1) {
@@ -1685,8 +1625,8 @@ void ath9k_release_buffered_frames(struct ieee80211_hw *hw,
 
 		ath_txq_lock(sc, tid->txq);
 		while (nframes > 0) {
-			bf = ath_tx_get_tid_subframe(sc, sc->tx.uapsdq, tid);
-			if (!bf)
+			ret = ath_tx_get_tid_subframe(sc, sc->tx.uapsdq, tid, &bf);
+			if (ret < 0)
 				break;
 
 			list_add_tail(&bf->list, &bf_q);
@@ -1950,11 +1890,12 @@ void ath_tx_cleanupq(struct ath_softc *sc, struct ath_txq *txq)
  */
 void ath_txq_schedule(struct ath_softc *sc, struct ath_txq *txq)
 {
+	struct ieee80211_hw *hw = sc->hw;
 	struct ath_common *common = ath9k_hw_common(sc->sc_ah);
+	struct ieee80211_txq *queue;
 	struct ath_atx_tid *tid;
-	struct list_head *tid_list;
-	struct ath_acq *acq;
 	bool active = AIRTIME_ACTIVE(sc->airtime_flags);
+	int ret;
 
 	if (txq->mac80211_qnum < 0)
 		return;
@@ -1964,52 +1905,30 @@ void ath_txq_schedule(struct ath_softc *sc, struct ath_txq *txq)
 
 	spin_lock_bh(&sc->chan_lock);
 	rcu_read_lock();
-	acq = &sc->cur_chan->acq[txq->mac80211_qnum];
 
 	if (sc->cur_chan->stopped)
 		goto out;
 
 begin:
-	tid_list = &acq->acq_new;
-	if (list_empty(tid_list)) {
-		tid_list = &acq->acq_old;
-		if (list_empty(tid_list))
-			goto out;
-	}
-	tid = list_first_entry(tid_list, struct ath_atx_tid, list);
+	queue = ieee80211_next_txq(hw, txq->mac80211_qnum);
+	if (!queue)
+		goto out;
+
+	tid = (struct ath_atx_tid *) queue->drv_priv;
 
 	if (active && tid->an->airtime_deficit[txq->mac80211_qnum] <= 0) {
-		spin_lock_bh(&acq->lock);
 		tid->an->airtime_deficit[txq->mac80211_qnum] += ATH_AIRTIME_QUANTUM;
-		list_move_tail(&tid->list, &acq->acq_old);
-		spin_unlock_bh(&acq->lock);
+		ieee80211_schedule_txq(hw, queue);
 		goto begin;
 	}
 
-	if (!ath_tid_has_buffered(tid)) {
-		spin_lock_bh(&acq->lock);
-		if ((tid_list == &acq->acq_new) && !list_empty(&acq->acq_old))
-			list_move_tail(&tid->list, &acq->acq_old);
-		else {
-			list_del_init(&tid->list);
-		}
-		spin_unlock_bh(&acq->lock);
+	ret = ath_tx_sched_aggr(sc, txq, tid);
+	ath_dbg(common, QUEUE, "ath_tx_sched_aggr returned %d\n",
+		ret);
+	if (ret != -ENOENT)
+		ieee80211_schedule_txq(hw, queue);
+	if (ret != -EBUSY)
 		goto begin;
-	}
-
-
-	/*
-	 * If we succeed in scheduling something, immediately restart to make
-	 * sure we keep the HW busy.
-	 */
-	if(ath_tx_sched_aggr(sc, txq, tid)) {
-		if (!active) {
-			spin_lock_bh(&acq->lock);
-			list_move_tail(&tid->list, &acq->acq_old);
-			spin_unlock_bh(&acq->lock);
-		}
-		goto begin;
-	}
 
 out:
 	rcu_read_unlock();
@@ -2875,7 +2794,6 @@ void ath_tx_node_init(struct ath_softc *sc, struct ath_node *an)
 		tid->baw_head  = tid->baw_tail = 0;
 		tid->active	   = false;
 		tid->clear_ps_filter = true;
-		tid->has_queued  = false;
 		__skb_queue_head_init(&tid->retry_q);
 		INIT_LIST_HEAD(&tid->list);
 		acno = TID_TO_WME_AC(tidno);
diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index 906e90223066..842feae31277 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -105,9 +105,12 @@
  * The driver is expected to initialize its private per-queue data for stations
  * and interfaces in the .add_interface and .sta_add ops.
  *
- * The driver can't access the queue directly. To dequeue a frame, it calls
- * ieee80211_tx_dequeue(). Whenever mac80211 adds a new frame to a queue, it
- * calls the .wake_tx_queue driver op.
+ * The driver can't access the queue directly. To obtain the next queue to pull
+ * frames from, the driver calls ieee80211_next_txq(). To dequeue a frame from a
+ * txq, it calls ieee80211_tx_dequeue(). Whenever mac80211 adds a new frame to a
+ * queue, it calls the .wake_tx_queue driver op. The driver is expected to
+ * re-schedule the txq using ieee80211_schedule_txq() if it is still active
+ * after the driver has finished pulling packets from it.
  *
  * For AP powersave TIM handling, the driver only needs to indicate if it has
  * buffered packets in the driver specific data structures by calling
@@ -3731,8 +3734,7 @@ struct ieee80211_ops {
 					 struct ieee80211_vif *vif,
 					 struct ieee80211_tdls_ch_sw_params *params);
 
-	void (*wake_tx_queue)(struct ieee80211_hw *hw,
-			      struct ieee80211_txq *txq);
+	void (*wake_tx_queue)(struct ieee80211_hw *hw, u8 acno);
 	void (*sync_rx_queues)(struct ieee80211_hw *hw);
 
 	int (*start_nan)(struct ieee80211_hw *hw,
@@ -5883,13 +5885,37 @@ void ieee80211_unreserve_tid(struct ieee80211_sta *sta, u8 tid);
  * ieee80211_tx_dequeue - dequeue a packet from a software tx queue
  *
  * @hw: pointer as obtained from ieee80211_alloc_hw()
- * @txq: pointer obtained from station or virtual interface
+ * @txq: pointer obtained from ieee80211_next_txq()
  *
  * Returns the skb if successful, %NULL if no frame was available.
  */
 struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw,
 				     struct ieee80211_txq *txq);
 
+/**
+ * ieee80211_schedule_txq - add txq to scheduling loop
+ *
+ * @hw: pointer as obtained from ieee80211_alloc_hw()
+ * @txq: pointer obtained from station or virtual interface
+ *
+ * Returns %true if the txq was actually added to the scheduling,
+ * %false otherwise.
+ */
+bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
+			    struct ieee80211_txq *txq);
+
+/**
+ * ieee80211_next_txq - get next tx queue to pull packets from
+ *
+ * @hw: pointer as obtained from ieee80211_alloc_hw()
+ * @acno: filter returned txqs with this AC number. Pass -1 for no filtering.
+ *
+ * Returns the next txq if successful, %NULL if no queue is eligible. If a txq
+ * is returned, it will have been removed from the scheduler queue and needs to
+ * be re-scheduled with ieee80211_schedule_txq() to continue to be active.
+ */
+struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 acno);
+
 /**
  * ieee80211_txq_get_depth - get pending frame/byte count of given txq
  *
diff --git a/net/mac80211/agg-tx.c b/net/mac80211/agg-tx.c
index 595c662a61e8..efbfb8cfb503 100644
--- a/net/mac80211/agg-tx.c
+++ b/net/mac80211/agg-tx.c
@@ -226,9 +226,13 @@ ieee80211_agg_start_txq(struct sta_info *sta, int tid, bool enable)
 		clear_bit(IEEE80211_TXQ_AMPDU, &txqi->flags);
 
 	clear_bit(IEEE80211_TXQ_STOP, &txqi->flags);
+
+	if (!ieee80211_schedule_txq(&sta->sdata->local->hw, txq))
+		return;
+
 	local_bh_disable();
 	rcu_read_lock();
-	drv_wake_tx_queue(sta->sdata->local, txqi);
+	drv_wake_tx_queue(sta->sdata->local, txq->ac);
 	rcu_read_unlock();
 	local_bh_enable();
 }
diff --git a/net/mac80211/driver-ops.h b/net/mac80211/driver-ops.h
index 4d82fe7d627c..0c58dd8c947d 100644
--- a/net/mac80211/driver-ops.h
+++ b/net/mac80211/driver-ops.h
@@ -1159,16 +1159,10 @@ drv_tdls_recv_channel_switch(struct ieee80211_local *local,
 	trace_drv_return_void(local);
 }
 
-static inline void drv_wake_tx_queue(struct ieee80211_local *local,
-				     struct txq_info *txq)
+static inline void drv_wake_tx_queue(struct ieee80211_local *local, u8 acno)
 {
-	struct ieee80211_sub_if_data *sdata = vif_to_sdata(txq->txq.vif);
-
-	if (!check_sdata_in_driver(sdata))
-		return;
-
-	trace_drv_wake_tx_queue(local, sdata, txq);
-	local->ops->wake_tx_queue(&local->hw, &txq->txq);
+	trace_drv_wake_tx_queue(local, acno);
+	local->ops->wake_tx_queue(&local->hw, acno);
 }
 
 static inline int drv_start_nan(struct ieee80211_local *local,
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index 26900025de2f..cc7d9299658d 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -832,6 +832,7 @@ struct txq_info {
 	struct codel_vars def_cvars;
 	struct codel_stats cstats;
 	struct sk_buff_head frags;
+	struct list_head schedule_order;
 	unsigned long flags;
 
 	/* keep last! */
@@ -1112,6 +1113,9 @@ enum mac80211_scan_state {
 	SCAN_ABORT,
 };
 
+struct ieee80211_txq_list {
+};
+
 struct ieee80211_local {
 	/* embed the driver visible part.
 	 * don't cast (use the static inlines below), but we keep
@@ -1122,6 +1126,10 @@ struct ieee80211_local {
 	struct codel_vars *cvars;
 	struct codel_params cparams;
 
+	/* protects active_txqs and txqi->schedule_order */
+	spinlock_t active_txq_lock;
+	struct list_head active_txqs;
+
 	const struct ieee80211_ops *ops;
 
 	/*
diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index 0785d04a80bc..935d6e2491b1 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -619,6 +619,9 @@ struct ieee80211_hw *ieee80211_alloc_hw_nm(size_t priv_data_len,
 	spin_lock_init(&local->rx_path_lock);
 	spin_lock_init(&local->queue_stop_reason_lock);
 
+	INIT_LIST_HEAD(&local->active_txqs);
+	spin_lock_init(&local->active_txq_lock);
+
 	INIT_LIST_HEAD(&local->chanctx_list);
 	mutex_init(&local->chanctx_mtx);
 
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index 0c5627f8a104..7601c2871175 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -1241,7 +1241,8 @@ void ieee80211_sta_ps_deliver_wakeup(struct sta_info *sta)
 			if (!txq_has_queue(sta->sta.txq[i]))
 				continue;
 
-			drv_wake_tx_queue(local, to_txq_info(sta->sta.txq[i]));
+			ieee80211_schedule_txq(&local->hw, sta->sta.txq[i]);
+			drv_wake_tx_queue(local, sta->sta.txq[i]->ac);
 		}
 	}
 
diff --git a/net/mac80211/trace.h b/net/mac80211/trace.h
index 591ad02e1fa4..23bdc4dcc908 100644
--- a/net/mac80211/trace.h
+++ b/net/mac80211/trace.h
@@ -2552,32 +2552,23 @@ TRACE_EVENT(drv_tdls_recv_channel_switch,
 
 TRACE_EVENT(drv_wake_tx_queue,
 	TP_PROTO(struct ieee80211_local *local,
-		 struct ieee80211_sub_if_data *sdata,
-		 struct txq_info *txq),
+		 u8 ac),
 
-	TP_ARGS(local, sdata, txq),
+	TP_ARGS(local, ac),
 
 	TP_STRUCT__entry(
 		LOCAL_ENTRY
-		VIF_ENTRY
-		STA_ENTRY
 		__field(u8, ac)
-		__field(u8, tid)
 	),
 
 	TP_fast_assign(
-		struct ieee80211_sta *sta = txq->txq.sta;
-
 		LOCAL_ASSIGN;
-		VIF_ASSIGN;
-		STA_ASSIGN;
-		__entry->ac = txq->txq.ac;
-		__entry->tid = txq->txq.tid;
+		__entry->ac = ac;
 	),
 
 	TP_printk(
-		LOCAL_PR_FMT  VIF_PR_FMT  STA_PR_FMT " ac:%d tid:%d",
-		LOCAL_PR_ARG, VIF_PR_ARG, STA_PR_ARG, __entry->ac, __entry->tid
+		LOCAL_PR_FMT " ac:%d",
+		LOCAL_PR_ARG, __entry->ac
 	)
 );
 
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 25904af38839..0ab69e5b87ad 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -1439,6 +1439,7 @@ void ieee80211_txq_init(struct ieee80211_sub_if_data *sdata,
 	codel_vars_init(&txqi->def_cvars);
 	codel_stats_init(&txqi->cstats);
 	__skb_queue_head_init(&txqi->frags);
+	INIT_LIST_HEAD(&txqi->schedule_order);
 
 	txqi->txq.vif = &sdata->vif;
 
@@ -1462,6 +1463,9 @@ void ieee80211_txq_purge(struct ieee80211_local *local,
 
 	fq_tin_reset(fq, tin, fq_skb_free_func);
 	ieee80211_purge_tx_queue(&local->hw, &txqi->frags);
+	spin_lock_bh(&local->active_txq_lock);
+	list_del_init(&txqi->schedule_order);
+	spin_unlock_bh(&local->active_txq_lock);
 }
 
 int ieee80211_txq_setup_flows(struct ieee80211_local *local)
@@ -1558,7 +1562,8 @@ static bool ieee80211_queue_skb(struct ieee80211_local *local,
 	ieee80211_txq_enqueue(local, txqi, skb);
 	spin_unlock_bh(&fq->lock);
 
-	drv_wake_tx_queue(local, txqi);
+	ieee80211_schedule_txq(&local->hw, &txqi->txq);
+	drv_wake_tx_queue(local, txqi->txq.ac);
 
 	return true;
 }
@@ -3553,6 +3558,56 @@ struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw,
 }
 EXPORT_SYMBOL(ieee80211_tx_dequeue);
 
+bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
+			    struct ieee80211_txq *txq)
+{
+	struct ieee80211_local *local = hw_to_local(hw);
+	struct txq_info *txqi = to_txq_info(txq);
+	bool ret = false;
+
+	spin_lock_bh(&local->active_txq_lock);
+
+	if (list_empty(&txqi->schedule_order)) {
+		list_add_tail(&txqi->schedule_order, &local->active_txqs);
+		ret = true;
+	}
+
+	spin_unlock_bh(&local->active_txq_lock);
+
+	return ret;
+}
+EXPORT_SYMBOL(ieee80211_schedule_txq);
+
+struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 acno)
+{
+	struct ieee80211_local *local = hw_to_local(hw);
+	struct txq_info *txqi = NULL;
+
+	spin_lock_bh(&local->active_txq_lock);
+
+	if (list_empty(&local->active_txqs))
+		goto out;
+
+	list_for_each_entry(txqi, &local->active_txqs, schedule_order) {
+		if (acno < 0 || txqi->txq.ac == acno) {
+			list_del_init(&txqi->schedule_order);
+			goto out;
+		}
+	}
+
+	/* no txq with requested acno found */
+	txqi = NULL;
+
+out:
+	spin_unlock_bh(&local->active_txq_lock);
+
+	if (!txqi)
+		return NULL;
+
+	return &txqi->txq;
+}
+EXPORT_SYMBOL(ieee80211_next_txq);
+
 void __ieee80211_subif_start_xmit(struct sk_buff *skb,
 				  struct net_device *dev,
 				  u32 info_flags)
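
For anyone following the scheduling change, here is a minimal userspace sketch of the contract the patch above introduces: a txq is appended to the active list only if it is not already on it, the next-txq call unlinks the first queue matching the requested AC, and the driver loop re-schedules a queue unless it turned out to be empty. All `toy_*` names are hypothetical stand-ins, and locking, RCU and the airtime deficit check are deliberately left out; this is an illustration under those assumptions, not the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct txq_info: one software queue per (station, TID). */
struct toy_txq {
	int ac;                 /* access category, 0..3 */
	int backlog;            /* frames waiting in this queue */
	bool scheduled;         /* mirrors !list_empty(&txqi->schedule_order) */
	struct toy_txq *next;   /* singly linked FIFO of active queues */
};

/* Toy stand-in for local->active_txqs (no locking: single-threaded sketch). */
struct toy_sched {
	struct toy_txq *head, *tail;
};

/* Like ieee80211_schedule_txq(): append unless already on the list. */
static bool toy_schedule_txq(struct toy_sched *s, struct toy_txq *q)
{
	assert(s && q);
	if (q->scheduled)
		return false;
	q->next = NULL;
	if (s->tail)
		s->tail->next = q;
	else
		s->head = q;
	s->tail = q;
	q->scheduled = true;
	return true;
}

/* Like ieee80211_next_txq(): unlink and return the first queue matching
 * acno (any queue if acno < 0), or NULL if none is eligible. */
static struct toy_txq *toy_next_txq(struct toy_sched *s, int acno)
{
	struct toy_txq *prev = NULL, *q;

	for (q = s->head; q; prev = q, q = q->next) {
		if (acno >= 0 && q->ac != acno)
			continue;
		if (prev)
			prev->next = q->next;
		else
			s->head = q->next;
		if (s->tail == q)
			s->tail = prev;
		q->next = NULL;
		q->scheduled = false;
		return q;
	}
	return NULL;
}

/* Driver-side loop shaped like the patched ath_txq_schedule(): pull one
 * frame per turn, re-schedule the queue unless it turned out to be empty,
 * and stop once the (toy) hardware queue is full. Returns frames sent. */
static int toy_txq_schedule(struct toy_sched *s, int acno, int hw_slots)
{
	struct toy_txq *q;
	int sent = 0, ret;

	while ((q = toy_next_txq(s, acno)) != NULL) {
		if (q->backlog == 0)
			ret = -ENOENT;
		else if (sent >= hw_slots)
			ret = -EBUSY;
		else {
			q->backlog--;
			sent++;
			ret = 0;
		}
		if (ret != -ENOENT)
			toy_schedule_txq(s, q);
		if (ret == -EBUSY)
			break;
	}
	return sent;
}
```

The point of the sketch is the re-scheduling rule: a queue that still has frames goes back to the tail, so queues of the same AC take turns, while an empty queue drops off the list until something schedules it again.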


* Re: ath9k will not tx packets sometimes.
  2018-01-30 12:07             ` Toke Høiland-Jørgensen
@ 2018-01-30 17:32               ` Ben Greear
  2018-01-30 18:55                 ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 15+ messages in thread
From: Ben Greear @ 2018-01-30 17:32 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, linux-wireless@vger.kernel.org

On 01/30/2018 04:07 AM, Toke Høiland-Jørgensen wrote:
> Ben Greear <greearb@candelatech.com> writes:
>
>> Maybe there is some way for the scheduler to get stuck and not
>> schedule anything?
>
> It would appear so, yeah. Do you do anything special other than
> associate a whole bunch of stations to trigger this? I can try to see if
> I can script something that works equivalently on my setup.

Ok, and I'll give you my user-space software package to easily set up
this test case if you want to test with that.  Contact me off-list if
you want help setting this up.

>
> I'm actually working on reworking that whole scheduler logic, and move
> some of it into mac80211. Could you test this (WiP) patch and see if
> that has the same problem?

It had some serious conflicts in ath10k, due to my local changes, so
I did not actually test this.

But a revert of the ATF patches (a6e56d749 and 63fefa050) appears to have resolved the
issue.  I'll test more with these reverted, and maybe will have time to
work more on actually fixing the upstream code next time I move to a newer
kernel (and/or after your pending changes get in).

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: ath9k will not tx packets sometimes.
  2018-01-30 17:32               ` Ben Greear
@ 2018-01-30 18:55                 ` Toke Høiland-Jørgensen
  2018-01-30 20:09                   ` Sebastian Gottschall
  0 siblings, 1 reply; 15+ messages in thread
From: Toke Høiland-Jørgensen @ 2018-01-30 18:55 UTC (permalink / raw)
  To: Ben Greear, linux-wireless@vger.kernel.org

Ben Greear <greearb@candelatech.com> writes:

>> I'm actually working on reworking that whole scheduler logic, and move
>> some of it into mac80211. Could you test this (WiP) patch and see if
>> that has the same problem?
>
> It had some serious conflicts in ath10k, due to my local changes, so
> I did not actually test this.

Can send you a version without the ath10k changes tomorrow if you'd like
to test - but will try to reproduce myself as well...

> But, a revert of the atf patches (a6e56d749 and 63fefa050) appear to
> have resolved the issue. I'll test more with these reverted, and maybe
> will have time to work more on actually fixing upstream code next time
> I move to a newer kernel (and/or after your pending changes get in).

Ah, that narrows it down some. Well, that is the code I'm hacking on
currently anyway, so let's see if we can't get it fixed as part of that
series :)

-Toke


* Re: ath9k will not tx packets sometimes.
  2018-01-30 18:55                 ` Toke Høiland-Jørgensen
@ 2018-01-30 20:09                   ` Sebastian Gottschall
  2018-01-31 11:50                     ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 15+ messages in thread
From: Sebastian Gottschall @ 2018-01-30 20:09 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Ben Greear,
	linux-wireless@vger.kernel.org

On 30.01.2018 at 19:55, Toke Høiland-Jørgensen wrote:
> Ben Greear <greearb@candelatech.com> writes:
>
>>> I'm actually working on reworking that whole scheduler logic, and move
>>> some of it into mac80211. Could you test this (WiP) patch and see if
>>> that has the same problem?
>> It had some serious conflicts in ath10k, due to my local changes, so
>> I did not actually test this.
> Can send you a version without the ath10k changes tomorrow if you'd like
> to test - but will try to reproduce myself as well...
>
>> But, a revert of the atf patches (a6e56d749 and 63fefa050) appear to
>> have resolved the issue. I'll test more with these reverted, and maybe
>> will have time to work more on actually fixing upstream code next time
>> I move to a newer kernel (and/or after your pending changes get in).
> Ah, that narrows it down some. Well, that is the code I'm hacking on
> currently anyway, so let's see if we can't get it fixed as part of that
> series :)
I may have some additional information for you. In the same timeframe I
noticed increased memory usage for ath9k devices.
Maybe that helps. I now hit memory limits on embedded devices with
dual interfaces and just 32 MB of RAM, which wasn't the case before.
Is this patch worth trying from my side?

Sebastian

-- 
Regards

Sebastian Gottschall / CTO

NewMedia-NET GmbH - DD-WRT
Registered office: Stubenwaldallee 21a, 64625 Bensheim
Register court: Amtsgericht Darmstadt, HRB 25473
Managing directors: Peter Steinhäuser, Christian Scheele
http://www.dd-wrt.com
email: s.gottschall@dd-wrt.com
Tel.: +496251-582650 / Fax: +496251-5826565


* Re: ath9k will not tx packets sometimes.
  2018-01-30 20:09                   ` Sebastian Gottschall
@ 2018-01-31 11:50                     ` Toke Høiland-Jørgensen
  2018-01-31 13:37                       ` Sebastian Gottschall
  0 siblings, 1 reply; 15+ messages in thread
From: Toke Høiland-Jørgensen @ 2018-01-31 11:50 UTC (permalink / raw)
  To: Sebastian Gottschall, Ben Greear, linux-wireless@vger.kernel.org

Sebastian Gottschall <s.gottschall@dd-wrt.com> writes:

> On 30.01.2018 at 19:55, Toke Høiland-Jørgensen wrote:
>> Ben Greear <greearb@candelatech.com> writes:
>>
>>>> I'm actually working on reworking that whole scheduler logic, and move
>>>> some of it into mac80211. Could you test this (WiP) patch and see if
>>>> that has the same problem?
>>> It had some serious conflicts in ath10k, due to my local changes, so
>>> I did not actually test this.
>> Can send you a version without the ath10k changes tomorrow if you'd like
>> to test - but will try to reproduce myself as well...
>>
>>> But, a revert of the atf patches (a6e56d749 and 63fefa050) appear to
>>> have resolved the issue. I'll test more with these reverted, and maybe
>>> will have time to work more on actually fixing upstream code next time
>>> I move to a newer kernel (and/or after your pending changes get in).
>> Ah, that narrows it down some. Well, that is the code I'm hacking on
>> currently anyway, so let's see if we can't get it fixed as part of that
>> series :)
> i have some addition information for you maybe. in the same timeframe i
> noticed a increased memory usage for ath9k devices.
> maybe that helps. so i hit memory boundaries on embedded devices with
> dual interfaces and just 32 mb  ram now which wasnt the case before
> is this patch worth to try from my side?

This is probably because of the added queue space, which is sort of by
design. In 3ff23cd5654b9c8f4d567caa73439b4c39fbeaae we lowered the
default limit for non-VHT devices to 4MB. But if you have several PHYs
on a very memory-constrained device you could still run out, I guess.

`echo fq_memory_limit 2097152 > /sys/kernel/debug/ieee80211/phy0/aqm`
would limit it to 2MB for that phy...
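
To make the arithmetic concrete, here is a rough sketch of how one might size that limit for a multi-PHY box. The idea of picking a total budget and splitting it evenly is only an assumption for illustration (mac80211 applies the limit per phy and does not divide anything itself), and the helper names are hypothetical; the command string and debugfs path are the ones shown above.

```c
#include <assert.h>
#include <stdio.h>

/* Split a total queue-memory budget (in bytes) evenly across PHYs. */
static unsigned long fq_limit_per_phy(unsigned long budget_bytes,
				      unsigned int nphys)
{
	assert(nphys > 0);
	return budget_bytes / nphys;
}

/* Format the debugfs command for one phy into buf.
 * Returns the snprintf() result (length the full string needs). */
static int fq_limit_cmd(char *buf, size_t len, unsigned int phy,
			unsigned long limit)
{
	return snprintf(buf, len,
		"echo fq_memory_limit %lu > /sys/kernel/debug/ieee80211/phy%u/aqm",
		limit, phy);
}
```

With a 4MB total budget and two PHYs, `fq_limit_per_phy(4UL * 1024 * 1024, 2)` gives 2097152, i.e. the 2MB value used in the command above.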

-Toke


* Re: ath9k will not tx packets sometimes.
  2018-01-31 11:50                     ` Toke Høiland-Jørgensen
@ 2018-01-31 13:37                       ` Sebastian Gottschall
  2018-01-31 13:46                         ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 15+ messages in thread
From: Sebastian Gottschall @ 2018-01-31 13:37 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Ben Greear,
	linux-wireless@vger.kernel.org

On 31.01.2018 at 12:50, Toke Høiland-Jørgensen wrote:
> Sebastian Gottschall <s.gottschall@dd-wrt.com> writes:
>
>> On 30.01.2018 at 19:55, Toke Høiland-Jørgensen wrote:
>>> Ben Greear <greearb@candelatech.com> writes:
>>>
>>>>> I'm actually working on reworking that whole scheduler logic, and move
>>>>> some of it into mac80211. Could you test this (WiP) patch and see if
>>>>> that has the same problem?
>>>> It had some serious conflicts in ath10k, due to my local changes, so
>>>> I did not actually test this.
>>> Can send you a version without the ath10k changes tomorrow if you'd like
>>> to test - but will try to reproduce myself as well...
>>>
>>>> But, a revert of the atf patches (a6e56d749 and 63fefa050) appear to
>>>> have resolved the issue. I'll test more with these reverted, and maybe
>>>> will have time to work more on actually fixing upstream code next time
>>>> I move to a newer kernel (and/or after your pending changes get in).
>>> Ah, that narrows it down some. Well, that is the code I'm hacking on
>>> currently anyway, so let's see if we can't get it fixed as part of that
>>> series :)
>> i have some addition information for you maybe. in the same timeframe i
>> noticed a increased memory usage for ath9k devices.
>> maybe that helps. so i hit memory boundaries on embedded devices with
>> dual interfaces and just 32 mb  ram now which wasnt the case before
>> is this patch worth to try from my side?
> This is probably because of the added queue space. Which is sort of by
> design. In 3ff23cd5654b9c8f4d567caa73439b4c39fbeaae we lowered the
> default limit for non-VHT devices to 4MB. But if you have several PHYs
> on a very memory constrained device you could still run out I guess.
>
> `echo fq_memory_limit 2097152 > /sys/kernel/debug/ieee80211/phy0/aqm`
> would limit it to 2MB for that phy...
What if I tried that already? :-)

Sebastian
>
> -Toke
>

-- 
Regards

Sebastian Gottschall / CTO

NewMedia-NET GmbH - DD-WRT
Registered office: Stubenwaldallee 21a, 64625 Bensheim
Register court: Amtsgericht Darmstadt, HRB 25473
Managing directors: Peter Steinhäuser, Christian Scheele
http://www.dd-wrt.com
email: s.gottschall@dd-wrt.com
Tel.: +496251-582650 / Fax: +496251-5826565


* Re: ath9k will not tx packets sometimes.
  2018-01-31 13:37                       ` Sebastian Gottschall
@ 2018-01-31 13:46                         ` Toke Høiland-Jørgensen
  2018-01-31 14:52                           ` Sebastian Gottschall
  0 siblings, 1 reply; 15+ messages in thread
From: Toke Høiland-Jørgensen @ 2018-01-31 13:46 UTC (permalink / raw)
  To: Sebastian Gottschall, Ben Greear, linux-wireless@vger.kernel.org

Sebastian Gottschall <s.gottschall@dd-wrt.com> writes:

> On 31.01.2018 at 12:50, Toke Høiland-Jørgensen wrote:
>> Sebastian Gottschall <s.gottschall@dd-wrt.com> writes:
>>
>>> On 30.01.2018 at 19:55, Toke Høiland-Jørgensen wrote:
>>>> Ben Greear <greearb@candelatech.com> writes:
>>>>
>>>>>> I'm actually working on reworking that whole scheduler logic, and move
>>>>>> some of it into mac80211. Could you test this (WiP) patch and see if
>>>>>> that has the same problem?
>>>>> It had some serious conflicts in ath10k, due to my local changes, so
>>>>> I did not actually test this.
>>>> Can send you a version without the ath10k changes tomorrow if you'd like
>>>> to test - but will try to reproduce myself as well...
>>>>
>>>>> But, a revert of the atf patches (a6e56d749 and 63fefa050) appear to
>>>>> have resolved the issue. I'll test more with these reverted, and maybe
>>>>> will have time to work more on actually fixing upstream code next time
>>>>> I move to a newer kernel (and/or after your pending changes get in).
>>>> Ah, that narrows it down some. Well, that is the code I'm hacking on
>>>> currently anyway, so let's see if we can't get it fixed as part of that
>>>> series :)
>>> i have some addition information for you maybe. in the same timeframe i
>>> noticed a increased memory usage for ath9k devices.
>>> maybe that helps. so i hit memory boundaries on embedded devices with
>>> dual interfaces and just 32 mb  ram now which wasnt the case before
>>> is this patch worth to try from my side?
>> This is probably because of the added queue space. Which is sort of by
>> design. In 3ff23cd5654b9c8f4d567caa73439b4c39fbeaae we lowered the
>> default limit for non-VHT devices to 4MB. But if you have several PHYs
>> on a very memory constrained device you could still run out I guess.
>>
>> `echo fq_memory_limit 2097152 > /sys/kernel/debug/ieee80211/phy0/aqm`
>> would limit it to 2MB for that phy...
> what if i tried that already? :-)

Hmm, then maybe it's a bug? Changing the limit makes no difference at
all? Does your build include 0bfe649fbb133? What are the values of the
counters in /sys/kernel/debug/ieee80211/phy0/aqm?

-Toke
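
For anyone following along, here is a minimal shell sketch of the two checks
asked about above: applying fq_memory_limit to every PHY and reading a counter
back. The write syntax is the one quoted in this thread. The DEBUGFS_ROOT
variable and the function names are hypothetical helpers, and the reader
assumes the aqm file prints one "<access> <name> <value>" line per counter,
which may differ between kernel versions.

```shell
#!/bin/sh

# Write a byte limit to the aqm file of every PHY (needs root on a real system).
# DEBUGFS_ROOT is a hypothetical override for trying this on a scratch directory.
set_fq_memory_limit() {
    for aqm in "${DEBUGFS_ROOT:-/sys/kernel/debug}"/ieee80211/phy*/aqm; do
        [ -e "$aqm" ] || continue
        echo "fq_memory_limit $1" > "$aqm"
    done
}

# Print the value of one named counter from an aqm file.
# Assumption: lines are shaped like "<access> <name> <value>".
aqm_field() {
    awk -v name="$1" '$2 == name { print $3 }' "$2"
}
```

A possible use is `set_fq_memory_limit 2097152` followed by
`aqm_field fq_memory_limit /sys/kernel/debug/ieee80211/phy0/aqm`. If the value
read back does not match what was written, the write probably did not take
effect on that build.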


* Re: ath9k will not tx packets sometimes.
  2018-01-31 13:46                         ` Toke Høiland-Jørgensen
@ 2018-01-31 14:52                           ` Sebastian Gottschall
  0 siblings, 0 replies; 15+ messages in thread
From: Sebastian Gottschall @ 2018-01-31 14:52 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Ben Greear,
	linux-wireless@vger.kernel.org

On 31.01.2018 at 14:46, Toke Høiland-Jørgensen wrote:
> Sebastian Gottschall <s.gottschall@dd-wrt.com> writes:
>
>> On 31.01.2018 at 12:50, Toke Høiland-Jørgensen wrote:
>>> Sebastian Gottschall <s.gottschall@dd-wrt.com> writes:
>>>
>>>> On 30.01.2018 at 19:55, Toke Høiland-Jørgensen wrote:
>>>>> Ben Greear <greearb@candelatech.com> writes:
>>>>>
>>>>>>> I'm actually working on reworking that whole scheduler logic, and move
>>>>>>> some of it into mac80211. Could you test this (WiP) patch and see if
>>>>>>> that has the same problem?
>>>>>> It had some serious conflicts in ath10k, due to my local changes, so
>>>>>> I did not actually test this.
>>>>> Can send you a version without the ath10k changes tomorrow if you'd like
>>>>> to test - but will try to reproduce myself as well...
>>>>>
>>>>>> But, a revert of the atf patches (a6e56d749 and 63fefa050) appears to
>>>>>> have resolved the issue. I'll test more with these reverted, and maybe
>>>>>> will have time to work more on actually fixing upstream code next time
>>>>>> I move to a newer kernel (and/or after your pending changes get in).
>>>>> Ah, that narrows it down some. Well, that is the code I'm hacking on
>>>>> currently anyway, so let's see if we can't get it fixed as part of that
>>>>> series :)
>>>> I may have some additional information for you. In the same timeframe I
>>>> noticed increased memory usage on ath9k devices, in case that helps.
>>>> I now hit memory limits on embedded devices with dual interfaces and
>>>> just 32 MB of RAM, which wasn't the case before.
>>>> Is this patch worth trying on my side?
>>> This is probably because of the added queue space. Which is sort of by
>>> design. In 3ff23cd5654b9c8f4d567caa73439b4c39fbeaae we lowered the
>>> default limit for non-VHT devices to 4MB. But if you have several PHYs
>>> on a very memory constrained device you could still run out I guess.
>>>
>>> `echo fq_memory_limit 2097152 > /sys/kernel/debug/ieee80211/phy0/aqm`
>>> would limit it to 2MB for that phy...
>> What if I already tried that? :-)
> Hmm, then maybe it's a bug? Does changing the limit make no difference at
> all?
Maybe it makes a difference, but I still run into OOM after a while.
> Does your build include 0bfe649fbb133? What are the values of the
> counters in /sys/kernel/debug/ieee80211/phy0/aqm?
I will check the current state in the next few days. I haven't checked it on
the affected device in the last 2 months.

>
> -Toke
>
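
To put rough numbers on the memory discussion above, here is a small sketch of
the worst-case arithmetic. It assumes that "dual interfaces" means two PHYs,
that the limit applies per PHY (as the "for that phy" remark suggests), and
that 4MB and 32 MB mean 4 << 20 and 32 << 20 bytes. It only bounds queued
packet memory and says nothing about other allocations. The function name is
made up for illustration.

```shell
#!/bin/sh

# Worst-case queue memory across PHYs, as a share of total RAM.
# Args: number of PHYs, per-PHY limit in bytes, total RAM in bytes.
# Prints "<total bytes> <percent of RAM, rounded down>".
fq_worst_case() {
    total=$(( $1 * $2 ))
    echo "$total $(( total * 100 / $3 ))"
}

fq_worst_case 2 $(( 4 << 20 )) $(( 32 << 20 ))   # default 4MB limit: 8388608 25
fq_worst_case 2 2097152 $(( 32 << 20 ))          # lowered to 2MB:    4194304 12
```

Under these assumptions, two PHYs at the default limit could hold up to a
quarter of the RAM in queued packets, and halving the limit halves that bound.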

-- 
Kind regards

Sebastian Gottschall / CTO

NewMedia-NET GmbH - DD-WRT
Registered office: Stubenwaldallee 21a, 64625 Bensheim
Commercial register: Amtsgericht Darmstadt, HRB 25473
Managing directors: Peter Steinhäuser, Christian Scheele
http://www.dd-wrt.com
email: s.gottschall@dd-wrt.com
Tel.: +496251-582650 / Fax: +496251-5826565


End of thread (last message 2018-01-31 14:52 UTC).

Thread overview: 15+ messages
2018-01-26 22:53 ath9k will not tx packets sometimes Ben Greear
2018-01-27 13:11 ` Toke Høiland-Jørgensen
2018-01-29 21:34   ` Ben Greear
2018-01-29 21:47     ` Toke Høiland-Jørgensen
2018-01-29 21:54       ` Ben Greear
2018-01-29 22:35         ` Toke Høiland-Jørgensen
2018-01-29 23:00           ` Ben Greear
2018-01-30 12:07             ` Toke Høiland-Jørgensen
2018-01-30 17:32               ` Ben Greear
2018-01-30 18:55                 ` Toke Høiland-Jørgensen
2018-01-30 20:09                   ` Sebastian Gottschall
2018-01-31 11:50                     ` Toke Høiland-Jørgensen
2018-01-31 13:37                       ` Sebastian Gottschall
2018-01-31 13:46                         ` Toke Høiland-Jørgensen
2018-01-31 14:52                           ` Sebastian Gottschall
