From: Jiri Pirko <jiri@resnulli.us>
To: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Eran Ben Elisha <eranbe@mellanox.com>,
netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
Jiri Pirko <jiri@mellanox.com>,
Moshe Shemesh <moshe@mellanox.com>, Aya Levin <ayal@mellanox.com>,
Tal Alon <talal@mellanox.com>, Ariel Almog <ariela@mellanox.com>
Subject: Re: [PATCH RFC net-next 00/19] Devlink health reporting and recovery system
Date: Fri, 4 Jan 2019 09:57:57 +0100 [thread overview]
Message-ID: <20190104085757.GA21274@nanopsycho.orion> (raw)
In-Reply-To: <20181231174727.6280fc3c@cakuba.hsd1.ca.comcast.net>
Tue, Jan 01, 2019 at 02:47:27AM CET, jakub.kicinski@netronome.com wrote:
>On Mon, 31 Dec 2018 16:31:54 +0200, Eran Ben Elisha wrote:
>> The health mechanism is targeted for Real Time Alerting, in order to know when
>> something bad had happened to a PCI device
>> - Provide alert debug information
>> - Self healing
>> - If problem needs vendor support, provide a way to gather all needed debugging
>> information.
>>
>> The main idea is to unify and centralize driver health reports in the
>> generic devlink instance and allow the user to set different
>> attributes of the health reporting and recovery procedures.
>>
>> The devlink health reporter:
>> Device driver creates a "health reporter" per each error/health type.
>> Error/Health type can be a known/generic (eg pci error, fw error, rx/tx error)
>> or unknown (driver specific).
>> For each registered health reporter a driver can issue error/health reports
>> asynchronously. All health reports handling is done by devlink.
>> Device driver can provide specific callbacks for each "health reporter", e.g.
>> - Recovery procedures
>> - Diagnostics and object dump procedures
>> - OOB initial attributes
>> Different parts of the driver can register different types of health reporters
>> with different handlers.
>>
>> Once an error is reported, devlink health will do the following actions:
>> * A log is being send to the kernel trace events buffer
>> * Health status and statistics are being updated for the reporter instance
>> * Object dump is being taken and saved at the reporter instance (as long as
>> there is no other Objdump which is already stored)
>> * Auto recovery attempt is being done. Depends on:
>> - Auto-recovery configuration
>> - Grace period vs. time passed since last recover
>>
>> The user interface:
>> User can access/change each reporter attributes and driver specific callbacks
>> via devlink, e.g per error type (per health reporter)
>> - Configure reporter's generic attributes (like: Disable/enable auto recovery)
>> - Invoke recovery procedure
>> - Run diagnostics
>> - Object dump
>
>Thanks for continuing this work!
>
>> The devlink health interface (via netlink):
>> DEVLINK_CMD_HEALTH_REPORTER_GET
>> Retrieves status and configuration info per DEV and reporter.
>> DEVLINK_CMD_HEALTH_REPORTER_SET
>> Allows reporter-related configuration setting.
>> DEVLINK_CMD_HEALTH_REPORTER_RECOVER
>> Triggers a reporter's recovery procedure.
>> DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE
>> Retrieves diagnostics data from a reporter on a device.
>> DEVLINK_CMD_HEALTH_REPORTER_OBJDUMP_GET
>
>The addition of "objdump" and its marshalling is a bit disappointing.
>It seemed to me when region snapshots were added that they would serve
>this exact purpose. Taking a region snapshot or "core dump", if you
>will, after something went south. Now it's confusing to have two
>mechanisms serving the same purpose.
>
>It's unclear from quick reading of the code how if the buffers get
>timestamped. Can you report more than one?
>
>About the marshalling, I'm not sure it belongs in the kernel. There is
>precedent for adding interpretation of FW blobs in user space (ethtool).
>IMHO it's a more scalable approach, if we want to avoid having 100 kLoC
>drivers. Amount of code you add to print the simple example from last
>patch is not inspiring confidence.
>
>And on the bike shedding side :) -> I think you should steer clear of
>calling this objdump, as it has very little to do with the
>functionality of well-known objdump tool. Its not even clear what the
>object the name is referring to is.
>
>Long story short the overlap with region snapshots makes it unclear
>what purpose either serves, and IMHO we should avoid the marshalling by
>teaching user space how to interpret snapshots. Preferably we only
>have one dump mechanism, and user space can be taught the interpretation
>once.
Unlike regions, which are binary blobs, the "objdump" we have here is a
tree of values (json like). It is not taken from FW/HW. It is generated
in driver and passed down to user to be shown. Originally, Eran just
pushed that info into a string buffer and the user had to parse it
again. I asked to format this in Netlink attributes JSON-style so the
user gets all hierarchy without need of parsing and can easily do
Netlink->human_stdout or Netlink->JSON.
>
>> Retrieves the last stored objdump. Devlink health
>> saves a single objdump. If an objdump is not already stored by the devlink
>> for this reporter, devlink generates a new objdump.
>> Objdump output is defined by the reporter.
>> DEVLINK_CMD_HEALTH_REPORTER_OBJDUMP_CLEAR
>> Clears the last saved objdump file for the specified reporter.
>>
>>
>> netlink
>> +--------------------------+
>> | |
>> | + |
>> | | |
>> +--------------------------+
>> |request for ops
>> |(diagnose,
>> mlx5_core devlink |recover,
>> |dump)
>> +--------+ +--------------------------+
>> | | | reporter| |
>> | | | +---------v----------+ |
>> | | ops execution | | | |
>> | <----------------------------------+ | |
>> | | | | | |
>> | | | + ^------------------+ |
>> | | | | request for ops |
>> | | | | (recover, dump) |
>> | | | | |
>> | | | +-+------------------+ |
>> | | health report | | health handler | |
>> | +-------------------------------> | |
>> | | | +--------------------+ |
>> | | health reporter create | |
>> | +----------------------------> |
>> +--------+ +--------------------------+
>>
>> Available reporters:
>> In this patchset, three reporters of mlx5e driver are included. The FW
>> reporters implement devlink_health_reporter diagnostic, object dump and
>> recovery procedures. The TX reporter implements devlink_health_reporter
>> diagnostic and recovery procedures.
>>
>> This RFC is based on the same concepts as previous RFC I sent earlier this
>> year: "[RFC PATCH iproute2-next] System specification health API". The API was
>> changed, also devlink code and mlx5e reporters were not available at the
>> previous RFC.
prev parent reply other threads:[~2019-01-04 9:06 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-12-31 14:31 [PATCH RFC net-next 00/19] Devlink health reporting and recovery system Eran Ben Elisha
2018-12-31 14:31 ` [PATCH RFC net-next 01/19] devlink: Add health buffer support Eran Ben Elisha
2018-12-31 14:31 ` [PATCH RFC net-next 02/19] devlink: Add health reporter create/destroy functionality Eran Ben Elisha
2018-12-31 14:31 ` [PATCH RFC net-next 03/19] devlink: Add health report functionality Eran Ben Elisha
2018-12-31 14:31 ` [PATCH RFC net-next 04/19] devlink: Add health get command Eran Ben Elisha
2018-12-31 14:31 ` [PATCH RFC net-next 05/19] devlink: Add health set command Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 06/19] devlink: Add health recover command Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 07/19] devlink: Add health diagnose command Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 08/19] devlink: Add health objdump {get,clear} commands Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 09/19] net/mlx5e: Add TX reporter support Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 10/19] net/mlx5e: Add TX timeout support for mlx5e TX reporter Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 11/19] net/mlx5: Move all devlink related functions calls to devlink.c Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 12/19] net/mlx5: Refactor print health info Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 13/19] net/mlx5: Create FW devlink_health_reporter Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 14/19] net/mlx5: Add core dump register access functions Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 15/19] net/mlx5: Add support for FW reporter objdump Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 16/19] net/mlx5: Report devlink health on FW issues Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 17/19] net/mlx5: Add FW fatal devlink_health_reporter Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 18/19] net/mlx5: Report devlink health on FW fatal issues Eran Ben Elisha
2018-12-31 14:32 ` [PATCH RFC net-next 19/19] devlink: Add Documentation/networking/devlink-health.txt Eran Ben Elisha
2019-01-01 1:47 ` Jakub Kicinski
2019-01-01 10:01 ` Eran Ben Elisha
2018-12-31 14:41 ` [PATCH RFC iproute2-next] devlink: Add health command support Aya Levin
2019-01-01 1:47 ` [PATCH RFC net-next 00/19] Devlink health reporting and recovery system Jakub Kicinski
2019-01-01 9:58 ` Eran Ben Elisha
2019-01-02 22:46 ` Jakub Kicinski
2019-01-03 13:31 ` Eran Ben Elisha
2019-01-03 22:28 ` Jakub Kicinski
2019-01-04 9:12 ` Jiri Pirko
2019-01-04 9:01 ` Jiri Pirko
2019-01-04 8:57 ` Jiri Pirko [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190104085757.GA21274@nanopsycho.orion \
--to=jiri@resnulli.us \
--cc=ariela@mellanox.com \
--cc=ayal@mellanox.com \
--cc=davem@davemloft.net \
--cc=eranbe@mellanox.com \
--cc=jakub.kicinski@netronome.com \
--cc=jiri@mellanox.com \
--cc=moshe@mellanox.com \
--cc=netdev@vger.kernel.org \
--cc=talal@mellanox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).