From: Ben Greear <greearb@candelatech.com>
To: Michal Kazior <michal.kazior@tieto.com>
Cc: linux-wireless <linux-wireless@vger.kernel.org>,
	"ath10k@lists.infradead.org" <ath10k@lists.infradead.org>,
	Matti Laakso <malaakso@elisanet.fi>
Subject: Re: [RFT] ath10k: restart fw on tx-credit timeout
Date: Thu, 12 Feb 2015 05:21:22 -0800
Message-ID: <54DCA8D2.5090006@candelatech.com>
In-Reply-To: <CA+BoTQ=cXL4pdLg==RVh5ARfNyC5qscthh4YJW1ufOM5=EsLyw@mail.gmail.com>



On 02/11/2015 10:55 PM, Michal Kazior wrote:
> On 11 February 2015 at 23:25, Ben Greear <greearb@candelatech.com> wrote:
>> On 02/10/2015 09:01 AM, Ben Greear wrote:
>>
>>> I've hacked CT firmware to do a flush of all vdevs itself when it detects WMI hang.
>>> I don't have a good test bed to reproduce the problem reliably, but I should know
>>> after a few days if the flush works at all.  If not, then it's a moot point anyway.
>>
>> So, this appears to at least partially work.
>>
>> But, what we notice is that when using multiple station vdevs, the system pretty much
>> becomes useless if we get any significant number of stuck or slow-to-transmit management
>> buffers over WMI.  Part of this is because WMI messages are sent when holding rtnl
>> much of the time, I think.
>
> Most, if not all, WMI commands are sent while holding conf_mutex. This
> lock is taken in many situations including when RTNL is held so your
> observation isn't entirely correct but isn't wrong either.
>
>
>> I would guess that an AP with lots of peers associated might have similar problems
>> if peers are not ACKing packets reliably.
>
> It's not the ACKing per se. It's whether stations are asleep and
> unresponsive or not. You could do funny DoS attacks with a single
> ath9k card (using virtual stations) on ath10k APs now I guess :-)

In our lab we have some setups where there should be no power-save at all,
but still see this issue.  Unlucky (or nefarious) broken-ness in the peer can seem to
mostly hang the local system due to the 'not entirely correct' assumption above :)


>> Probably the only useful way to fix this is to make the firmware and driver able to
>> send management frames over the normal transport like every other data packet?
>
> Agreed. HTT should've been used for entire traffic, including management frames.
>
> The workaround could've been to guarantee to have only 1 wmi-mgmt-tx
> in-flight but since tx-credits aren't replenished predictably you'll
> end up with the patch I originally did, i.e. sleep 2*bcn intval and
> wmi-peer-flush-tids after each unicast mgmt frame to a known station.

Even assuming I have the tx-credits replenishment fixed,
that work-around would make sending management frames to many peers
very slow when at least a few peers are not answering quickly, right?
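
To put a rough number on that concern, here is a back-of-the-envelope sketch. It is
not ath10k code: the function name, the parameters, and the assumption that a
responsive peer completes in fast_ms while an unresponsive one costs the full
2 * beacon-interval wait are all hypothetical and only illustrate the arithmetic.

```c
#include <assert.h>

/*
 * Hypothetical model (not driver code): management frames are sent one
 * at a time. A responsive peer is assumed to complete in fast_ms; an
 * unresponsive peer is assumed to cost the full 2 * bcn_intval_ms wait
 * before its TIDs are flushed.
 */
static unsigned int mgmt_serial_delay_ms(unsigned int peers,
					 unsigned int slow_peers,
					 unsigned int bcn_intval_ms,
					 unsigned int fast_ms)
{
	/* The slow peers are a subset of all peers. */
	assert(slow_peers <= peers);

	return slow_peers * 2 * bcn_intval_ms +
	       (peers - slow_peers) * fast_ms;
}
```

With 64 peers, 8 of them slow, a 100 ms beacon interval and 1 ms for a responsive
peer, mgmt_serial_delay_ms(64, 8, 100, 1) gives 8 * 200 + 56 * 1 = 1656 ms for a
single pass over all peers, and the cost grows linearly with the number of slow peers.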

>> Any idea why it wasn't written like that to begin with?
>
> Beats me.

This might be something I can fix in CT firmware, but I'm trying to kick a release out
the door, so I think I'll put this off for a bit.

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k
