netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: Jacob Keller <jacob.e.keller@intel.com>
Cc: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>,
	Tony Nguyen <anthony.l.nguyen@intel.com>, <davem@davemloft.net>,
	<pabeni@redhat.com>, <edumazet@google.com>,
	<netdev@vger.kernel.org>, <jiri@nvidia.com>, <idosch@idosch.org>
Subject: Re: [PATCH net-next 0/5][pull request] add v2 FW logging for ice driver
Date: Tue, 14 Feb 2023 15:19:10 -0800	[thread overview]
Message-ID: <20230214151910.419d72cf@kernel.org> (raw)
In-Reply-To: <6198f4e4-51ac-a71a-ba20-b452e42a7b42@intel.com>

On Tue, 14 Feb 2023 14:39:18 -0800 Jacob Keller wrote:
> On 2/14/2023 8:14 AM, Paul M Stillwell Jr wrote:
> >> I believe that's in line with devlink health. The devlink health log
> >> is "formatted" but I really doubt that any user can get far in debugging
> >> without vendor support.
> >>  
> > 
> > I agree, I just don't see what the trigger is in our case for FW logging.
> >   
> 
> Here's the thoughts I had for devlink health:
> 
> 1) support health reporters storing more than a single event. Currently
> all health reporters respond to a single event and then do not allow
> storing new captures until the current one is processed. This breaks for
> our firmware logging because we get separate events from firmware for
> each buffer of messages. We could make this configurable such that we
> limit the total maximum to prevent kernel memory overrun. (and some
> policy for how to discard events when the buffer is full?)

I think the idea is that the system keeps a continuous ring buffer of
logs and dumps it whenever bad events happens. That's in case of logs,
obviously you can expose other types of state with health.

> 2a) add some knobs to enable/disable a health reporter

For ad-hoc collection of the ring buffer there are dump and diagnose
callbacks which can be triggered at any time.

> 2b) add some firmware logging specific knobs as a "build on top of
> health reporters" or by creating a separate firmware logging bit that
> ties into a reporter. These knows would be how to set level, etc.

Right, the level setting is the part that I'm the least sure of.
That sounds like something more fitting to ethtool dumps.

> 3) for ice, once the health reporter is enabled we request the firmware
> to send us logging, then we get our admin queue message and simply copy
> this into the health reporter as a new event
> 
> 4) user space is in charge of monitoring health reports and can decide
> how to copy events out to disk and when to delete the health reports
> from the kernel.

That's also out of what's expected with health reporters. User should
not have to run vendor tools with devlink health. Decoding of the dump
may require vendor tools but checking if system is healthy or something
crashed should happen without any user space involvement.

> Basically: extend health reporters to allow multiple captures and add a
> related module to configure firmware logging via a health reporter,
> where the "event" is just "I have a new blob to store".
> 
> How does this sound?
> 
> For the specifics of 2b) I think we can probably agree that levels is
> fairly generic (i.e. the specifics of what each level are is vendor
> specific but the fact that there are numbers and that higher or lower
> numbers means more severe is fairly standard)
> 
> I know the ice firmware has many such modules we can enable or disable
> and we would ideally be able to set which modules are active or not.
> However all messages come through in the same blobs so we can't separate
> them and report them to individual health reporter events. I think we
> could have modules as a separate option for toggling which ones are on
> or off. I would expect other vendors to have something similar or have
> no modules at all and just an on/off switch?

I bet all vendors at this point have separate modules in the FW.
It's been the case for a while, that's why we have multiple versions
supported in devlink dev info.

  reply	other threads:[~2023-02-14 23:19 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-09 19:06 [PATCH net-next 0/5][pull request] add v2 FW logging for ice driver Tony Nguyen
2023-02-09 19:06 ` [PATCH net-next 1/5] ice: remove FW logging code Tony Nguyen
2023-02-09 19:06 ` [PATCH net-next 2/5] ice: enable devlink to check FW logging status Tony Nguyen
2023-02-09 19:07 ` [PATCH net-next 3/5] ice: add ability to query/set FW log level and resolution Tony Nguyen
2023-02-09 19:07 ` [PATCH net-next 4/5] ice: disable FW logging on driver unload Tony Nguyen
2023-02-09 19:07 ` [PATCH net-next 5/5] ice: use debugfs to output FW log data Tony Nguyen
2023-02-11  4:23 ` [PATCH net-next 0/5][pull request] add v2 FW logging for ice driver Jakub Kicinski
2023-02-13 23:46   ` Paul M Stillwell Jr
2023-02-14  0:40     ` Jakub Kicinski
2023-02-14 16:14       ` Paul M Stillwell Jr
2023-02-14 20:44         ` Jakub Kicinski
2023-02-14 22:39         ` Jacob Keller
2023-02-14 23:19           ` Jakub Kicinski [this message]
2023-02-15  0:07             ` Jacob Keller
2023-02-15  1:16               ` Jakub Kicinski
2023-02-15  1:33                 ` Jacob Keller
2023-02-15  2:07                   ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230214151910.419d72cf@kernel.org \
    --to=kuba@kernel.org \
    --cc=anthony.l.nguyen@intel.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=idosch@idosch.org \
    --cc=jacob.e.keller@intel.com \
    --cc=jiri@nvidia.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=paul.m.stillwell.jr@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).