netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jiri Pirko <jiri@resnulli.us>
To: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Eran Ben Elisha <eranbe@mellanox.com>,
	netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
	Jiri Pirko <jiri@mellanox.com>,
	Moshe Shemesh <moshe@mellanox.com>, Aya Levin <ayal@mellanox.com>,
	Tal Alon <talal@mellanox.com>, Ariel Almog <ariela@mellanox.com>
Subject: Re: [PATCH RFC net-next 00/19] Devlink health reporting and recovery system
Date: Fri, 4 Jan 2019 09:57:57 +0100	[thread overview]
Message-ID: <20190104085757.GA21274@nanopsycho.orion> (raw)
In-Reply-To: <20181231174727.6280fc3c@cakuba.hsd1.ca.comcast.net>

Tue, Jan 01, 2019 at 02:47:27AM CET, jakub.kicinski@netronome.com wrote:
>On Mon, 31 Dec 2018 16:31:54 +0200, Eran Ben Elisha wrote:
>> The health mechanism is targeted for Real Time Alerting, in order to know when
>> something bad had happened to a PCI device
>> - Provide alert debug information
>> - Self healing
>> - If problem needs vendor support, provide a way to gather all needed debugging
>>   information.
>> 
>> The main idea is to unify and centralize driver health reports in the
>> generic devlink instance and allow the user to set different
>> attributes of the health reporting and recovery procedures.
>> 
>> The devlink health reporter:
>> Device driver creates a "health reporter" per each error/health type.
>> Error/Health type can be a known/generic (eg pci error, fw error, rx/tx error)
>> or unknown (driver specific).
>> For each registered health reporter a driver can issue error/health reports
>> asynchronously. All health reports handling is done by devlink.
>> Device driver can provide specific callbacks for each "health reporter", e.g.
>>  - Recovery procedures
>>  - Diagnostics and object dump procedures
>>  - OOB initial attributes
>> Different parts of the driver can register different types of health reporters
>> with different handlers.
>> 
>> Once an error is reported, devlink health will do the following actions:
>>   * A log is being send to the kernel trace events buffer
>>   * Health status and statistics are being updated for the reporter instance
>>   * Object dump is being taken and saved at the reporter instance (as long as
>>     there is no other Objdump which is already stored)
>>   * Auto recovery attempt is being done. Depends on:
>>     - Auto-recovery configuration
>>     - Grace period vs. time passed since last recover
>> 
>> The user interface:
>> User can access/change each reporter attributes and driver specific callbacks
>> via devlink, e.g per error type (per health reporter)
>>  - Configure reporter's generic attributes (like: Disable/enable auto recovery)
>>  - Invoke recovery procedure
>>  - Run diagnostics
>>  - Object dump
>
>Thanks for continuing this work!
>
>> The devlink health interface (via netlink):
>> DEVLINK_CMD_HEALTH_REPORTER_GET
>>   Retrieves status and configuration info per DEV and reporter.
>> DEVLINK_CMD_HEALTH_REPORTER_SET
>>   Allows reporter-related configuration setting.
>> DEVLINK_CMD_HEALTH_REPORTER_RECOVER
>>   Triggers a reporter's recovery procedure.
>> DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE
>>   Retrieves diagnostics data from a reporter on a device.
>> DEVLINK_CMD_HEALTH_REPORTER_OBJDUMP_GET
>
>The addition of "objdump" and its marshalling is a bit disappointing.
>It seemed to me when region snapshots were added that they would serve
>this exact purpose.  Taking a region snapshot or "core dump", if you
>will, after something went south.  Now it's confusing to have two
>mechanisms serving the same purpose.
>
>It's unclear from quick reading of the code how if the buffers get
>timestamped.  Can you report more than one?
>
>About the marshalling, I'm not sure it belongs in the kernel.  There is
>precedent for adding interpretation of FW blobs in user space (ethtool).
>IMHO it's a more scalable approach, if we want to avoid having 100 kLoC
>drivers.  Amount of code you add to print the simple example from last
>patch is not inspiring confidence.
>
>And on the bike shedding side :) -> I think you should steer clear of
>calling this objdump, as it has very little to do with the
>functionality of well-known objdump tool.  Its not even clear what the
>object the name is referring to is.
>
>Long story short the overlap with region snapshots makes it unclear
>what purpose either serves, and IMHO we should avoid the marshalling by
>teaching user space how to interpret snapshots.  Preferably we only
>have one dump mechanism, and user space can be taught the interpretation
>once.

Unlike regions, which are binary blobs, the "objdump" we have here is a
tree of values (json like). It is not taken from FW/HW. It is generated
in driver and passed down to user to be shown. Originally, Eran just
pushed that info into a string buffer and the user had to parse it
again. I asked to format this in Netlink attributes JSON-style so the
user gets all hierarchy without need of parsing and can easily do
Netlink->human_stdout or Netlink->JSON.



>
>>   Retrieves the last stored objdump. Devlink health
>>   saves a single objdump. If an objdump is not already stored by the devlink
>>   for this reporter, devlink generates a new objdump.
>>   Objdump output is defined by the reporter.
>> DEVLINK_CMD_HEALTH_REPORTER_OBJDUMP_CLEAR
>>   Clears the last saved objdump file for the specified reporter.
>> 
>> 
>>                                                netlink
>>                                       +--------------------------+
>>                                       |                          |
>>                                       |            +             |
>>                                       |            |             |
>>                                       +--------------------------+
>>                                                    |request for ops
>>                                                    |(diagnose,
>>  mlx5_core                             devlink     |recover,
>>                                                    |dump)
>> +--------+                            +--------------------------+
>> |        |                            |    reporter|             |
>> |        |                            |  +---------v----------+  |
>> |        |   ops execution            |  |                    |  |
>> |     <----------------------------------+                    |  |
>> |        |                            |  |                    |  |
>> |        |                            |  + ^------------------+  |
>> |        |                            |    | request for ops     |
>> |        |                            |    | (recover, dump)     |
>> |        |                            |    |                     |
>> |        |                            |  +-+------------------+  |
>> |        |     health report          |  | health handler     |  |
>> |        +------------------------------->                    |  |
>> |        |                            |  +--------------------+  |
>> |        |     health reporter create |                          |
>> |        +---------------------------->                          |
>> +--------+                            +--------------------------+
>> 
>> Available reporters:
>> In this patchset, three reporters of mlx5e driver are included. The FW
>> reporters implement devlink_health_reporter diagnostic, object dump and
>> recovery procedures. The TX reporter implements devlink_health_reporter
>> diagnostic and recovery procedures.
>> 
>> This RFC is based on the same concepts as previous RFC I sent earlier this
>> year: "[RFC PATCH iproute2-next] System specification health API". The API was
>> changed, also devlink code and mlx5e reporters were not available at the
>> previous RFC.

      parent reply	other threads:[~2019-01-04  9:06 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-31 14:31 [PATCH RFC net-next 00/19] Devlink health reporting and recovery system Eran Ben Elisha
2018-12-31 14:31 ` [PATCH RFC net-next 01/19] devlink: Add health buffer support Eran Ben Elisha
2018-12-31 14:31 ` [PATCH RFC net-next 02/19] devlink: Add health reporter create/destroy functionality Eran Ben Elisha
2018-12-31 14:31 ` [PATCH RFC net-next 03/19] devlink: Add health report functionality Eran Ben Elisha
2018-12-31 14:31 ` [PATCH RFC net-next 04/19] devlink: Add health get command Eran Ben Elisha
2018-12-31 14:31 ` [PATCH RFC net-next 05/19] devlink: Add health set command Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 06/19] devlink: Add health recover command Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 07/19] devlink: Add health diagnose command Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 08/19] devlink: Add health objdump {get,clear} commands Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 09/19] net/mlx5e: Add TX reporter support Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 10/19] net/mlx5e: Add TX timeout support for mlx5e TX reporter Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 11/19] net/mlx5: Move all devlink related functions calls to devlink.c Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 12/19] net/mlx5: Refactor print health info Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 13/19] net/mlx5: Create FW devlink_health_reporter Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 14/19] net/mlx5: Add core dump register access functions Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 15/19] net/mlx5: Add support for FW reporter objdump Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 16/19] net/mlx5: Report devlink health on FW issues Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 17/19] net/mlx5: Add FW fatal devlink_health_reporter Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 18/19] net/mlx5: Report devlink health on FW fatal issues Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 19/19] devlink: Add Documentation/networking/devlink-health.txt Eran Ben Elisha
2019-01-01  1:47   ` Jakub Kicinski
2019-01-01 10:01     ` Eran Ben Elisha
2018-12-31 14:41 ` [PATCH RFC iproute2-next] devlink: Add health command support Aya Levin
2019-01-01  1:47 ` [PATCH RFC net-next 00/19] Devlink health reporting and recovery system Jakub Kicinski
2019-01-01  9:58   ` Eran Ben Elisha
2019-01-02 22:46     ` Jakub Kicinski
2019-01-03 13:31       ` Eran Ben Elisha
2019-01-03 22:28         ` Jakub Kicinski
2019-01-04  9:12           ` Jiri Pirko
2019-01-04  9:01         ` Jiri Pirko
2019-01-04  8:57   ` Jiri Pirko [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190104085757.GA21274@nanopsycho.orion \
    --to=jiri@resnulli.us \
    --cc=ariela@mellanox.com \
    --cc=ayal@mellanox.com \
    --cc=davem@davemloft.net \
    --cc=eranbe@mellanox.com \
    --cc=jakub.kicinski@netronome.com \
    --cc=jiri@mellanox.com \
    --cc=moshe@mellanox.com \
    --cc=netdev@vger.kernel.org \
    --cc=talal@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).