From: Felix Fietkau <nbd@openwrt.org>
To: Emmanuel Grumbach <egrumbach@gmail.com>
Cc: linux-wireless <linux-wireless@vger.kernel.org>,
Johannes Berg <johannes@sipsolutions.net>
Subject: Re: [RFC v2] mac80211: add A-MSDU tx support
Date: Sun, 7 Feb 2016 11:48:53 +0100 [thread overview]
Message-ID: <56B72115.10709@openwrt.org> (raw)
In-Reply-To: <CANUX_P0QkRhSpdxrd9jAobCX9E=TED55=YaTOJ6QwwuJQHfTqA@mail.gmail.com>
On 2016-02-07 11:22, Emmanuel Grumbach wrote:
> On Sun, Feb 7, 2016 at 12:06 PM, Emmanuel Grumbach <egrumbach@gmail.com> wrote:
>> On Sun, Feb 7, 2016 at 11:08 AM, Felix Fietkau <nbd@openwrt.org> wrote:
>>> On 2016-02-07 08:25, Emmanuel Grumbach wrote:
>>>> On Fri, Feb 5, 2016 at 12:41 PM, Felix Fietkau <nbd@openwrt.org> wrote:
>>>>> Requires software tx queueing support. frag_list support (for zero-copy)
>>>>> is optional.
>>>>>
>>>>> Signed-off-by: Felix Fietkau <nbd@openwrt.org>
>>>>
>>>> Looks nice!
>>>> This would allow us to create aggregates of TCP Acks, the problem is
>>>> that when you are mostly receiving data, the hardware queues are
>>>> pretty much empty (nothing besides the TCP Acks which should go out
>>>> quickly) so that packets don't pile up in the software queues and
>>>> hence you don't have enough material to build an A-MSDU.
>>>> I guess that for AP oriented devices, this is ideal solution since you
>>>> can't rely on TSO (packets are not locally generated) and this allows
>>>> to build an A-MSDU without adding more latency since you build an
>>>> A-MSDU with packets that are already in the queue waiting instead of
>>>> delaying transmission of the first packet.
>>>> IIRC, the latter was the approach chose by the new Marvell driver
>>>> posted a few weeks ago. This approach is better in my eyes.
>>>> For iwlwifi which is much more station oriented (of GO which is
>>>> basically an AP with locally generated traffic), I took the TSO
>>>> approach. I guess we could try to change iwlwifi to use your tx
>>>> queues, and check how that works. This would allow us to have A-MSDU
>>>> on bridged traffic as well, although this use case is much less common
>>>> for Intel devices.
>>
>>
>>> Can the iwlwifi firmware maintain per-sta per-tid queues? Because that
>>> way you would get the most benefits from using that tx queueing
>>> infrastructure.
>>
>> Well... iwlwifi and athXk are very different. iwlwifi really has the
>> concept of queues and not flat descriptors.
>> Any Tx descriptor lives in the context of a Tx queue which is
>> per-sta-per tid when A-MPDUs is enabled.
>> We are now moving to a scheme in which we will have per-sta-per-tid
>> queue regardless of the A-MPDU state which will make much sense to tie
>> to the tx queueing infrastructure.
>> Thing is that in that case I am afraid we will not have enough packets
>> in the software tx queue to get A-MSDUs from your code. with TSO, it
>> is easier :) Still worth trying to work with this instead of TSO and
>> see how it goes. That won't happen anytime soon though.
>>
>>>
>>>>> +
>>>>> +static bool ieee80211_amsdu_aggregate(struct ieee80211_sub_if_data *sdata,
>>>>> + struct sta_info *sta,
>>>>> + struct ieee80211_fast_tx *fast_tx,
>>>>> + struct sk_buff *skb)
>>>>> +{
>>>>> + struct ieee80211_local *local = sdata->local;
>>>>> + u8 tid = skb->priority & IEEE80211_QOS_CTL_TAG1D_MASK;
>>>>> + struct ieee80211_txq *txq = sta->sta.txq[tid];
>>>>> + struct txq_info *txqi;
>>>>> + struct sk_buff **frag_tail, *head;
>>>>> + int subframe_len = skb->len - ETH_ALEN;
>>>>> + int max_amsdu_len;
>>>>> + __be16 len;
>>>>> + void *data;
>>>>> + bool ret = false;
>>>>> + int n = 1;
>>>>> +
>>>>> + if (!ieee80211_hw_check(&local->hw, TX_AMSDU))
>>>>> + return false;
>>>>> +
>>>>> + if (!txq)
>>>>> + return false;
>>>>> +
>>>>> + txqi = to_txq_info(txq);
>>>>> + if (test_bit(IEEE80211_TXQ_NO_AMSDU, &txqi->flags))
>>>>> + return false;
>>>>> +
>>>>> + /*
>>>>> + * A-MPDU limits maximum MPDU size to 4095 bytes. Since aggregation
>>>>> + * sessions are started/stopped without txq flush, use the limit here
>>>>> + * to avoid having to de-aggregate later.
>>>>> + */
>>>>> + max_amsdu_len = min_t(int, sta->sta.max_amsdu_len, 4095);
>>>>
>>>> So you can't get 10K A-MSDUs? I don't see where you check that you
>>>> have an A-MPDU session here. You seem to be applying the 4095 limit
>>>> also for streams that are not an A-MPDU?
>>>> I guess you could check if the sta is a VHT peer, in that case, no
>>>> limit applies.
>>> The explanation for the missing A-MPDU change is in that comment -
>>> checking for an active A-MPDU session would make it unnecessarily complex.
>>> Good point about checking for VHT capabilities to remove this limit, I
>>> will add that.
>
> Yes - I read the comment, but it seemed very sub-optimal to limit all
> the A-MSDUs to 4K. With TSO I can get up to 10K and it really helps
> TPT.
This was built with the assumption that most scenarios use A-MPDU anyway
and thus don't need really large A-MSDUs.
> One more point. In VHT, there may be a limit on the numbers of
> subframes in the A-MSDU. I don't see you handle that. Maybe I missed
> it?
I haven't looked at that much yet. Right now the driver can only specify
a limit for the number of subframes.
> And... in case the driver doesn't handle frag_list, you linearize the
> skb which is pretty much the only thing you can do at this stage. But,
> when you'll lift the 4095 bytes limit, you'll get 11K A-MSDU,
> linarizing such a long packet is really putting the memory manager
> under pressure.
I added no-frag_list support primarily for debugging purposes, it's not
supposed to perform well.
> This is an order 4 allocation, for each A-MSDU. Note
> that iwlwifi (and probably other drivers) can handle gather DMA in Tx,
> but they have a limited number of frags they can handle. iwlwifi e.g.
> can handle up to 20 frags, but 3 are taken for "paperwork". You'll
> have 2 frags per subframe at least (assuming that each subframe's
> payload is nicely contiguous and not on a page boundary). I think that
> it may be worthwhile to ask the driver how many frags it is supposed
> to handle. I can't promise iwlwifi will use it, but I guess it will be
> useful for someone.
You mean an extra frag limit in addition to the driver subframe limit,
in case individual subframes are fragmented as well?
- Felix
next prev parent reply other threads:[~2016-02-07 10:49 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-02-05 10:41 [RFC v2] mac80211: add A-MSDU tx support Felix Fietkau
2016-02-07 7:25 ` Emmanuel Grumbach
2016-02-07 9:08 ` Felix Fietkau
2016-02-07 10:06 ` Emmanuel Grumbach
2016-02-07 10:22 ` Emmanuel Grumbach
2016-02-07 10:48 ` Felix Fietkau [this message]
2016-02-07 11:32 ` Emmanuel Grumbach
2016-02-07 11:49 ` Felix Fietkau
2016-02-07 11:56 ` Emmanuel Grumbach
2016-02-07 13:21 ` Felix Fietkau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56B72115.10709@openwrt.org \
--to=nbd@openwrt.org \
--cc=egrumbach@gmail.com \
--cc=johannes@sipsolutions.net \
--cc=linux-wireless@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).