From: Ben Greear <greearb@candelatech.com>
To: Michal Kazior <michal.kazior@tieto.com>
Cc: ath10k <ath10k@lists.infradead.org>
Subject: Re: More issues with ath10k_flush
Date: Fri, 06 Jun 2014 07:49:00 -0700
Message-ID: <5391D4DC.7060402@candelatech.com>
In-Reply-To: <CA+BoTQm8Ga2JH6cGcJ4kr0jbocusNecL1VL+LMPXk+du_OY5sw@mail.gmail.com>



On 06/05/2014 10:16 PM, Michal Kazior wrote:
> On 6 June 2014 01:37, Ben Greear <greearb@candelatech.com> wrote:
>> On 06/05/2014 11:47 AM, Ben Greear wrote:
>>> I'm back to debugging this charmer.
>>>
>>> Currently I see the flush fail (and take 5 seconds doing so)
>>> fairly often when creating lots of station vifs against my firmware.
>>>
>>> Once stations are connected, there are usually no more timeouts,
>>> even though I might be sending/receiving 100+Mbps of traffic for hours at
>>> a time.
>>>
>>> By printing out the firmware stats, I see that much of the time
>>> the hardware has accepted X packets for transmission, but has completed
>>> X-1.  It is possible the firmware's counters are screwed up somehow
>>> or that it lost a packet, but I think it may also be possible that
>>> the firmware is just being really slow about completing a packet
>>> every now and then.  I have looked at the firmware in detail and
>>> have found no way that it could actually leak tx descriptors.
>
> Interesting. This reminds me of the lazy wmi-htc tx credit replenishment
> after wmi mgmt tx is completed. Maybe it's a similar sort of thing?
> Maybe it's actually completed but for some reason the completion
> hasn't been fully processed yet..

I didn't see any reason for that to happen in the firmware, but it
is not the simplest code...

>>> So, I was thinking about changing the flush logic to try
>>> the current flush (that just waits) for up to 1/5 of the
>>> flush timeout, and if that fails, try telling the firmware to purge
>>> its tx buffers, and then wait up to 4/5ths more of the
>>> flush timeout.
>
> Sounds reasonable.

By flushing before we start waiting, maybe we don't need the extra
cleverness...but possibly it would be better to wait a short bit
of time and then flush firmware if we still have pending skbs?
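
A minimal sketch of the two-phase idea discussed above: wait for 1/5 of the
flush timeout, purge the firmware tx buffers if frames are still pending, then
wait for the remainder. This is not the ath10k code. The struct, the callback
names, and the simulated firmware behaviour are all hypothetical stand-ins
chosen for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for driver state; NOT the real struct ath10k. */
struct flush_ctx {
    unsigned int pending;      /* tx frames accepted but not yet completed */
    unsigned int purge_calls;  /* how many times the purge was requested */
    /* Wait up to timeout_ms for pending to reach zero; true if it did. */
    bool (*wait_empty)(struct flush_ctx *ctx, unsigned int timeout_ms);
    /* Ask the firmware to drop whatever it still has queued for tx. */
    void (*fw_purge)(struct flush_ctx *ctx);
};

/* Phase 1: plain wait for 1/5 of the budget.
 * Phase 2: purge, then wait for the remaining 4/5. */
static bool two_phase_flush(struct flush_ctx *ctx, unsigned int timeout_ms)
{
    unsigned int first = timeout_ms / 5;

    if (ctx->wait_empty(ctx, first))
        return true;

    ctx->fw_purge(ctx);
    ctx->purge_calls++;

    return ctx->wait_empty(ctx, timeout_ms - first);
}

/* Simulated callbacks (assumption): a stuck frame never completes by
 * itself, so the wait only reports the current state and does not sleep. */
static bool sim_wait_empty(struct flush_ctx *ctx, unsigned int timeout_ms)
{
    (void)timeout_ms;
    return ctx->pending == 0;
}

/* Simulated purge (assumption): the firmware drops everything queued. */
static void sim_fw_purge(struct flush_ctx *ctx)
{
    ctx->pending = 0;
}
```

With one frame stuck, the first wait fails, the purge runs once, and the second
wait succeeds. With nothing pending, the purge is never requested.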

>> After poking around, it seems there is no wmi command to tell
>> the firmware to just flush everything, so I hacked one into
>> my firmware, called it before ath10k_flush starts waiting,
>> and after several reboots, I do not see any timeouts trying
>> to flush.
>
> I thought WMI_PEER_FLUSH_TIDS_CMDID is for that. It didn't work for
> you? If so I would assume it's a firmware bug..

Well, actually, the command may have worked...but instead of iterating
through all peers for all vdevs and making lots of wmi calls, I just
made the firmware do the iteration by passing 0xFFFFFFFF as the vdev-id
and special-casing the firmware handling of the message.

Was only about 8 extra lines of code in the firmware...
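
A rough sketch of what that special case could look like, assuming a flat
table of peers. The actual firmware source is not shown in this thread, so the
struct, the function, and the VDEV_ID_ALL name are hypothetical; only the
0xFFFFFFFF wildcard value comes from the description above.

```c
#include <stddef.h>
#include <stdint.h>

/* Wildcard value from the message above; the macro name is hypothetical. */
#define VDEV_ID_ALL 0xFFFFFFFFu

/* Hypothetical peer entry; NOT the firmware's real data structure. */
struct peer {
    uint32_t vdev_id;     /* vdev this peer belongs to */
    unsigned int queued;  /* frames queued on this peer's tids */
};

/* Drop queued frames for every peer on vdev_id, or for every peer on
 * every vdev when vdev_id is the wildcard. Returns the frames dropped. */
static unsigned int flush_peers(struct peer *peers, size_t n, uint32_t vdev_id)
{
    unsigned int dropped = 0;

    for (size_t i = 0; i < n; i++) {
        if (vdev_id != VDEV_ID_ALL && peers[i].vdev_id != vdev_id)
            continue;
        dropped += peers[i].queued;
        peers[i].queued = 0;
    }
    return dropped;
}
```

The host then issues a single command with the wildcard instead of one command
per peer per vdev.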

I also noticed something where the firmware might not be
flushing its tids when a vdev goes down...I didn't bother
to change that yet, but possibly that is part of the issue.
(It only flushed if vdev was 'paused'...not sure why.)


>> So, maybe that will do the trick...other suggestions are
>> still welcome :)
>
> Did you try to find out what kind of frame is supposedly held? I
> recall you've posted a NullFunc hexdump once pointing that it's one of
> the offending frames that didn't complete.
>
> So.. maybe just not sending NullFunc frames (hell, they don't get a
> proper ack status anyway..) or somehow altering how they are sent is
> another way to work this around.

I haven't tried printing them lately...and if the flush logic continues
to work, I probably won't bother...

In the past, I know there were sometimes lots of larger frames as well, but possibly
that was a separate issue as I have not seen more than one frame hung lately.

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

