linux-wireless.vger.kernel.org archive mirror
From: Ben Greear <greearb@candelatech.com>
To: Emmanuel Grumbach <egrumbach@gmail.com>
Cc: Kalle Valo <kvalo@qca.qualcomm.com>,
	ath10k <ath10k@lists.infradead.org>,
	"linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>
Subject: Re: Firmware debugging patches?
Date: Mon, 02 Jun 2014 11:58:17 -0700
Message-ID: <538CC949.5020901@candelatech.com>
In-Reply-To: <538CC68C.10808@gmail.com>

On 06/02/2014 11:46 AM, Emmanuel Grumbach wrote:
>> [Good stuff snipped, adding linux-wireless as this is a more
>> general issue if we are going to consider general framework]
>>
>>
>> Maybe we should start with goals before getting to implementation
>> details.  Here's my wish list that is ath10k specific, but probably
>> similar to other firmware users:
>>
>> 1)  We need the firmware crash text currently printed to
>> /var/log/messages.
>>
>> 2)  It would be nice to get the firmware RAM and stack dumps at time of
>> crash to debug more interesting crashes.
> 
> Right - but typically you'll have closed source / IP / whatever there..

I mean that we need the raw data (i.e., a binary dump, something printed
in ASCII hex, etc.).  I understand it will take proprietary tools to
decode it into something a developer can actually debug.
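
To illustrate the "printed in ascii-hex" option, here is a minimal
user-space sketch that formats a raw buffer as offset-prefixed hex lines,
16 bytes per line.  The function name fw_dump_to_hex() is hypothetical and
is not part of ath10k or any existing kernel API; it only shows the kind of
encoding meant here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format `len` bytes of `buf` as lowercase ASCII hex
 * into `out`, 16 bytes per line, each line prefixed with its offset.
 * Returns the number of characters written (excluding the NUL terminator),
 * or -1 if `out` is too small.
 */
int fw_dump_to_hex(const unsigned char *buf, size_t len,
                   char *out, size_t out_sz)
{
    size_t pos = 0;

    assert(buf != NULL && out != NULL);
    if (out_sz == 0)
        return -1;
    out[0] = '\0';

    for (size_t i = 0; i < len; i++) {
        int n;

        /* Start a new line with the byte offset every 16 bytes. */
        if (i % 16 == 0) {
            n = snprintf(out + pos, out_sz - pos, "%s%08zx:",
                         i ? "\n" : "", i);
            if (n < 0 || (size_t)n >= out_sz - pos)
                return -1;
            pos += (size_t)n;
        }

        n = snprintf(out + pos, out_sz - pos, " %02x", (unsigned int)buf[i]);
        if (n < 0 || (size_t)n >= out_sz - pos)
            return -1;
        pos += (size_t)n;
    }

    /* Terminate the last line if anything was written. */
    if (len > 0) {
        if (out_sz - pos < 2)
            return -1;
        out[pos++] = '\n';
        out[pos] = '\0';
    }
    return (int)pos;
}
```

A text form like this can go straight into a log or a bug report, at the
cost of roughly three output characters per dumped byte.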

>> 3)  It would be nice to know about firmware debug messages for
>> the period of time directly before the crash (maybe 2-5 minutes?)
>>
>> 4)  It would be nice to have this interleaved with kernel, supplicant,
>> and related logs.
>>
>>
>> We need a solution for different types of users.  I suspect the number
>> of crashes seen in the wild will be more for users nearer the top
>> of this list.
>>
>> a) Normal Fedora/Ubuntu/etc default-installed distribution user
>> with ath10k NIC has wifi issues, firmware crashes, they don't
>> really know what firmware means or that it crashed, but some automated crash-log
>> tool notices and gathers debug info for automated bug reporting.
> 
> I am working on that for our firmware. I recently added such capability
> relying on udev to notify the userspace that something bad happens. I
> gather all the data and prepare a binary file that is sent through
> debugfs (pulled by a script triggered by udev). I remember the first
> crash only.

How is this binary blob encoded?

At least for drivers that can recover from firmware crashes, I think
we should continue to report crashes, not just the first.

Maybe we could store another one after the initial crash has been read
and 1 minute has elapsed, or if the initial crash has not been read
in 1 day, or something like that.
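
A rough sketch of that re-arm rule, assuming a single stored crash slot and
timestamps in seconds.  The struct, the function names and the two constants
are made up for illustration; the 1 minute and 1 day values are only the
examples from the paragraph above, not a proposal for exact numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define REARM_AFTER_READ_SECS 60LL               /* "1 minute" after a read */
#define REARM_UNREAD_SECS     (24LL * 60 * 60)   /* "1 day" if never read */

/* Hypothetical single-slot crash record; the dump payload is omitted. */
struct crash_slot {
    bool has_dump;        /* a crash dump is currently stored */
    bool was_read;        /* userspace has read the stored dump */
    long long stored_at;  /* time the dump was stored, in seconds */
    long long read_at;    /* time the dump was read, in seconds */
};

/* May a crash seen at `now` replace whatever is in the slot? */
bool crash_slot_may_store(const struct crash_slot *s, long long now)
{
    assert(s != NULL);
    if (!s->has_dump)
        return true;
    if (s->was_read)
        return now - s->read_at >= REARM_AFTER_READ_SECS;
    return now - s->stored_at >= REARM_UNREAD_SECS;
}

/* Record a new crash if the policy allows it; returns true if stored. */
bool crash_slot_store(struct crash_slot *s, long long now)
{
    if (!crash_slot_may_store(s, now))
        return false;
    s->has_dump = true;
    s->was_read = false;
    s->stored_at = now;
    return true;
}

/* Called when userspace has pulled the dump. */
void crash_slot_mark_read(struct crash_slot *s, long long now)
{
    assert(s != NULL);
    if (s->has_dump && !s->was_read) {
        s->was_read = true;
        s->read_at = now;
    }
}
```

With this rule a crash loop cannot keep overwriting a dump nobody has looked
at yet, but a dump that sits unread for a day does not block new reports
forever.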

Also, if we use debugfs then we require upstream kernels to have this
compiled in and mounted if we want to handle this class of user.

I am not sure this is really the case currently.  But, once the
blob is generated and stored in RAM, it would be easy enough to
add an ethtool option to dump it without debugfs support.  This will
still not really address my concerns because it may take a year
or two for the latest ethtool binary to make it to normal-ish users.

>>
>> b) Slightly more advanced user actually notices the problem at coffee shop
>> earlier today, posts about it when they get home, and we ask for
>> debug info.
>>
>> c) Experienced and determined user has similar issues, but is able to
>> reproduce the problem and/or turn on more advanced debugging efforts.
>>
>> d)  Even more determined user that can and will recompile kernels and/or
>> try patches.
>>
>>
>> Anything that has to be enabled before-hand will not help a) and b) above.
>>
>> If support is not compiled into default kernels, c) will not help you either.
>>
>> If it is difficult or requires acquiring cutting edge tools not in their
>> distribution by default, many of c) and some of d) will just ignore the problem or use
>> different hardware.
>>
>> If we are storing crashes for something like ethtool to report, we need
>> RAM and/or disk storage so the firmware RAM dumps and such can be stored until
>> the user and/or automated tools ask for them.  We need some way to automatically
>> clean up old crashes so disk/ram is not overly utilized.  For APs,
>> they are low on both RAM and 'disk', so storing crash logs for any
>> length of time may be problematic.
>  
> I did something simpler - but it works. I don't really know the ethtool infrastructure though.

I think ethtool support would not be overly hard to implement; the basic
framework is already in the wifi stack.
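
On the storage concern quoted above (keeping dumps in RAM until someone asks
for them, without over-using memory), one possible shape is a small ring of
dumps with a byte budget that evicts the oldest entries first.  This is a
user-space sketch only; struct dump_store, dump_store_add() and MAX_DUMPS
are hypothetical names, and a kernel version would use its own allocators
and locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DUMPS 8   /* hypothetical cap on the number of stored dumps */

struct dump {
    unsigned char *data;
    size_t len;
};

struct dump_store {
    struct dump slots[MAX_DUMPS];  /* ring buffer, oldest entry at head */
    size_t head;
    size_t count;
    size_t bytes_used;
    size_t byte_budget;            /* total bytes allowed across all dumps */
};

/* Drop the oldest stored dump and release its memory. */
static void dump_store_evict_oldest(struct dump_store *st)
{
    struct dump *d = &st->slots[st->head];

    st->bytes_used -= d->len;
    free(d->data);
    d->data = NULL;
    d->len = 0;
    st->head = (st->head + 1) % MAX_DUMPS;
    st->count--;
}

/*
 * Store a copy of a new dump, evicting the oldest dumps until both the
 * slot limit and the byte budget are respected.
 * Returns 0 on success, -1 if the dump can never fit or allocation fails.
 */
int dump_store_add(struct dump_store *st, const unsigned char *data,
                   size_t len)
{
    unsigned char *copy;
    size_t tail;

    assert(st != NULL && data != NULL);
    if (len == 0 || len > st->byte_budget)
        return -1;

    copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, data, len);

    while (st->count == MAX_DUMPS ||
           st->bytes_used + len > st->byte_budget)
        dump_store_evict_oldest(st);

    tail = (st->head + st->count) % MAX_DUMPS;
    st->slots[tail].data = copy;
    st->slots[tail].len = len;
    st->count++;
    st->bytes_used += len;
    return 0;
}
```

On an AP the budget could be set very low, or to a single dump, which
reduces this to the "remember one crash" approach.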

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

