* [RFC PATCH iproute2-next] System specification health API
@ 2018-09-13  8:18 Eran Ben Elisha
  2018-09-13  8:18 ` [RFC PATCH iproute2-next] man: Add devlink health man page Eran Ben Elisha
  2018-09-13 17:36 ` [RFC PATCH iproute2-next] System specification health API Jakub Kicinski
  0 siblings, 2 replies; 17+ messages in thread
From: Eran Ben Elisha @ 2018-09-13  8:18 UTC
To: netdev, Jiri Pirko, Andy Gospodarek, Michael Chan, Jakub Kicinski, Simon Horman, Alexander Duyck
Cc: Andrew Lunn, Florian Fainelli, Tal Alon, Ariel Almog, Eran Ben Elisha

The health spec is targeted at Real Time Alerting, in order to know when
something bad has happened to a PCI device:
- Provide alert debug information
- Self healing
- If a problem needs vendor support, provide a way to gather all needed
  debugging information.

The health contains sensors which sense for malfunction. Once a sensor is
triggered, actions such as logs and correction can be taken. Sensors sense
the health state and can trigger a correction action.

The sensors are divided into the following groups:
- Hardware sensor - a sensor which is triggered by the device due to
  malfunction.
- Software sensor - a sensor which is triggered by the software due to
  malfunction.

Both groups of sensors can be triggered due to an error event or due to a
periodic check.

Actions are the way to handle sensor events. An action can be in one of the
following groups:
- Dump - SW trace, SW dump, HW trace, HW dump
- Reset - Surgical correction (e.g. modify Q, flush Q, reset of device, etc)

Actions can be performed by SW or HW.

The user is allowed to enable or disable sensors and the sensor2action
mapping.

This RFC man page patch describes the suggested API of devlink-health in
order to control sensors and actions.

Eran Ben Elisha (1):
  man: Add devlink health man page

 man/man8/devlink-health.8 | 171 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 171 insertions(+)
 create mode 100644 man/man8/devlink-health.8

-- 
1.8.3.1
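[Editor's sketch, not part of the original posting.] The sensor/action model in the cover letter can be pictured roughly as below. This is only an illustration under stated assumptions: the class and field names (`Sensor`, `SensorGroup`, `actions`) are hypothetical, and placing TX_COMP_ERROR in the software group is an arbitrary choice for the example, not something the RFC states.

```python
from dataclasses import dataclass, field
from enum import Enum


class SensorGroup(Enum):
    HARDWARE = "hw"  # triggered by the device
    SOFTWARE = "sw"  # triggered by the software (driver)


@dataclass
class Sensor:
    """Hypothetical model of one health sensor and its sensor2action mapping."""
    name: str
    group: SensorGroup
    enabled: bool = True
    # sensor2action mapping: action name -> whether it is active for this sensor
    actions: dict = field(default_factory=dict)

    def trigger(self):
        """Return the names of the actions to perform when the sensor fires."""
        if not self.enabled:
            return []
        return [name for name, active in self.actions.items() if active]


# Assumption for illustration: the group of TX_COMP_ERROR is not given in the RFC.
tx_comp = Sensor("TX_COMP_ERROR", SensorGroup.SOFTWARE,
                 actions={"reset": False, "dump": True})
print(tx_comp.trigger())
```

Disabling the sensor (`tx_comp.enabled = False`) makes `trigger()` return an empty list, which mirrors the "enable or disable sensors" knob described above.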
* [RFC PATCH iproute2-next] man: Add devlink health man page
  2018-09-13  8:18 [RFC PATCH iproute2-next] System specification health API Eran Ben Elisha
@ 2018-09-13  8:18 ` Eran Ben Elisha
  2018-09-13 10:27   ` Tobin C. Harding
  2018-09-13 12:08   ` Andrew Lunn
  2018-09-13 17:36 ` [RFC PATCH iproute2-next] System specification health API Jakub Kicinski
  1 sibling, 2 replies; 17+ messages in thread
From: Eran Ben Elisha @ 2018-09-13  8:18 UTC
To: netdev, Jiri Pirko, Andy Gospodarek, Michael Chan, Jakub Kicinski, Simon Horman, Alexander Duyck
Cc: Andrew Lunn, Florian Fainelli, Tal Alon, Ariel Almog, Eran Ben Elisha

Add devlink-health man page. Devlink-health tool will control device
health attributes, sensors, actions and logging.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>

-------------------------------------------------------
Copy paste man output to here for easier review process of the RFC.

DEVLINK-HEALTH(8)                    Linux                    DEVLINK-HEALTH(8)

NAME
       devlink-health - devlink health configuration

SYNOPSIS
       devlink [ OPTIONS ] health { COMMAND | help }

       OPTIONS := { -V[ersion] | -n[no-nice-names] }

       devlink health show [ DEV ] [ sensor NAME ]

       devlink health sensor set DEV name NAME [ action NAME { active | inactive } ]"

       devlink health action set DEV name NAME period PERIOD count COUNT fail { ignore | down }

       devlink health action reinit DEV name NAME

       devlink health help

DESCRIPTION
       devlink-health tool allows user to configure the way driver treats unexpected status. The tool allows configuration of the sensors that can trigger health activity. Set for each sensor the follow up operations, such as,
       reset and dump of info. In addition, set the health activity termination action.

   devlink health show - Display devlink health sensors and actions attributes
       DEV - Specifies the devlink device to show.  If this argument is omitted, all devices are listed.

           Format is:
             BUS_NAME/BUS_ADDRESS

       sensor NAME - Specifies the devlink sensor to show.

   devlink health sensor set - sets devlink health sensor attributes
       DEV    Specifies the devlink device to show.

       name NAME
              Name of the sensor to set.

       action NAME { active | inactive }
              Specify which actions to activate and which to deactivate once a sensor was triggered. actions can be dump, reset, etc.

   devlink health action set - sets devlink action attributes
       DEV    Specifies the devlink device to set.

       name NAME
              Specifies the devlink action to set.

       period PERIOD
              The period on which we limit the amount of performed actions, measured in seconds.

       count COUNT
              The maximum amount of actions performed in a limit time frame.

       fail { ignore | down }
              Specify the behavior once count limit was reached.

              ignore - Ignore errors without execution of any action.

              down - Driver will remain in nonoperational state.

   devlink health action reinit - reset devlink action attributes (period, count, fail, etc)
       DEV    Specifies the devlink device to set.

       name NAME
              Specifies the devlink action to set.

EXAMPLES
       devlink health show
           Shows the health state of all devlink devices on the system.

       devlink health show pci/0000:01:00.0
           Shows the health state of specified devlink device.

       devlink health sensor set pci/0000:01:00.0 name TX_COMP_ERROR action reset off action dump on
           Sets TX_COMP_ERROR sensor parameters for a specific device.

       devlink health action set pci/0000:01:00.0 name reset period 3600 count 5 fail ignore
           Sets health attributes for reset action.

SEE ALSO
       devlink(8), devlink-port(8), devlink-sb(8), devlink-monitor(8), devlink-dev(8),

AUTHOR
       Eran ben Elisha <eranbe@mellanox.com>

iproute2                          15 Aug 2018                 DEVLINK-HEALTH(8)
---
 man/man8/devlink-health.8 | 171 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 171 insertions(+)
 create mode 100644 man/man8/devlink-health.8

diff --git a/man/man8/devlink-health.8 b/man/man8/devlink-health.8
new file mode 100644
index 000000000000..ac28b020be0d
--- /dev/null
+++ b/man/man8/devlink-health.8
@@ -0,0 +1,171 @@
+.TH DEVLINK\-HEALTH 8 "15 Aug 2018" "iproute2" "Linux"
+.SH NAME
+devlink-health \- devlink health configuration
+.SH SYNOPSIS
+.sp
+.ad l
+.in +8
+.ti -8
+.B devlink
+.RI "[ " OPTIONS " ]"
+.BR health
+.RI " { " COMMAND " | "
+.BR help " }"
+.sp
+
+.ti -8
+.IR OPTIONS " := { "
+\fB\-V\fR[\fIersion\fR] |
+\fB\-n\fR[\fIno-nice-names\fR] }
+
+.ti -8
+.B devlink health show
+.RI "[ " DEV " ]"
+.RI "[ "
+.B sensor
+.IR NAME
+.RI "]"
+
+.ti -8
+.B devlink health sensor set
+.IR DEV
+.B name
+.IR NAME
+.RI "[ "
+.BR action
+.IR NAME
+.R "{" active "|" inactive "}" ]"
+
+.ti -8
+.B devlink health action set
+.IR DEV
+.B name
+.IR NAME
+.BR period
+.IR PERIOD
+.BR count
+.IR COUNT
+.BR fail " { "
+.IR ignore
+.BR "| "
+.IR down
+.R "} "
+
+.ti -8
+.B devlink health action reinit
+.IR DEV
+.B name
+.IR NAME
+
+.ti -8
+.B devlink health help
+
+.SH "DESCRIPTION"
+.B devlink-health
+tool allows user to configure the way driver treats unexpected status. The tool allows configuration of the sensors that can trigger health activity. Set for each sensor the follow up operations, such as, reset and dump of info. In addition, set the health activity termination action.
+
+.SS devlink health show - Display devlink health sensors and actions attributes
+.PP
+.B "DEV"
+- Specifies the devlink device to show.
+If this argument is omitted, all devices are listed.
+
+.in +4
+Format is:
+.in +2
+BUS_NAME/BUS_ADDRESS
+
+.PP
+.BR sensor
+.IR "NAME"
+- Specifies the devlink sensor to show.
+
+.SS devlink health sensor set - sets devlink health sensor attributes
+
+.TP
+.B "DEV"
+Specifies the devlink device to show.
+
+.TP
+.BI name " NAME"
+Name of the sensor to set.
+
+.TP
+.BR action
+.IR NAME
+.R "{" active "|" inactive "} "
+.in +4
+Specify which actions to activate and which to deactivate once a sensor was triggered. actions can be dump, reset, etc.
+
+.SS devlink health action set - sets devlink action attributes
+
+.TP
+.B "DEV"
+Specifies the devlink device to set.
+
+.TP
+.BI name " NAME"
+Specifies the devlink action to set.
+
+.TP
+.BI period " PERIOD"
+The period on which we limit the amount of performed actions, measured in seconds.
+
+.TP
+.BI count " COUNT"
+The maximum amount of actions performed in a limit time frame.
+
+.TP
+.BR fail
+.R "{" ignore "|" down "}"
+.in +4
+Specify the behavior once count limit was reached.
+
+.I ignore
+- Ignore errors without execution of any action.
+
+.I down
+- Driver will remain in nonoperational state.
+
+.SS devlink health action reinit - reset devlink action attributes (period, count, fail, etc)
+
+.TP
+.B "DEV"
+Specifies the devlink device to set.
+
+.TP
+.BI name " NAME"
+Specifies the devlink action to set.
+
+.SH "EXAMPLES"
+.PP
+devlink health show
+.RS 4
+Shows the health state of all devlink devices on the system.
+.RE
+.PP
+devlink health show pci/0000:01:00.0
+.RS 4
+Shows the health state of specified devlink device.
+.RE
+.PP
+devlink health sensor set pci/0000:01:00.0 name TX_COMP_ERROR action reset off action dump on
+.RS 4
+Sets TX_COMP_ERROR sensor parameters for a specific device.
+.RE
+.PP
+devlink health action set pci/0000:01:00.0 name reset period 3600 count 5 fail ignore
+.RS 4
+Sets health attributes for reset action.
+.RE
+
+.SH SEE ALSO
+.BR devlink (8),
+.BR devlink-port (8),
+.BR devlink-sb (8),
+.BR devlink-monitor (8),
+.BR devlink-dev (8),
+.br
+
+.SH AUTHOR
+Eran ben Elisha <eranbe@mellanox.com>
-- 
1.8.3.1
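[Editor's sketch, not part of the original posting.] One possible reading of the `period` / `count` / `fail` attributes of `devlink health action set` is a sliding-window limiter: at most COUNT executions of an action within any PERIOD seconds, and once the limit is hit either ignore further triggers or leave the driver down. The man page does not spell out the exact window semantics, so the sliding window, the `ActionLimiter` class, and its return strings are assumptions made for illustration only.

```python
class ActionLimiter:
    """Hypothetical model of 'action set ... period P count C fail {ignore|down}'."""

    def __init__(self, period, count, fail="ignore"):
        if fail not in ("ignore", "down"):
            raise ValueError("fail must be 'ignore' or 'down'")
        self.period = period   # window length in seconds
        self.count = count     # max actions performed per window
        self.fail = fail       # behavior once the count limit is reached
        self.history = []      # timestamps of actions that were performed
        self.is_down = False

    def on_trigger(self, now):
        """Decide what happens when a sensor requests this action at time 'now'."""
        if self.is_down:
            return "down"
        # Assumption: a sliding window, so forget actions older than the period.
        self.history = [t for t in self.history if now - t < self.period]
        if len(self.history) < self.count:
            self.history.append(now)
            return "run"
        if self.fail == "down":
            self.is_down = True
            return "down"
        return "ignore"


# Mirrors the man page example: name reset period 3600 count 5 fail ignore
limiter = ActionLimiter(period=3600, count=5, fail="ignore")
results = [limiter.on_trigger(t) for t in range(6)]
print(results)
```

With these settings the first five triggers run the action and the sixth is ignored; a trigger arriving after the oldest run has aged out of the window is allowed again. With `fail="down"`, the model stays down until it is recreated.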
* Re: [RFC PATCH iproute2-next] man: Add devlink health man page
  2018-09-13  8:18 ` [RFC PATCH iproute2-next] man: Add devlink health man page Eran Ben Elisha
@ 2018-09-13 10:27   ` Tobin C. Harding
  2018-09-13 11:58     ` Eran Ben Elisha
  2018-09-13 12:08   ` Andrew Lunn
  1 sibling, 1 reply; 17+ messages in thread
From: Tobin C. Harding @ 2018-09-13 10:27 UTC
To: Eran Ben Elisha
Cc: netdev, Jiri Pirko, Andy Gospodarek, Michael Chan, Jakub Kicinski, Simon Horman, Alexander Duyck, Andrew Lunn, Florian Fainelli, Tal Alon, Ariel Almog

On Thu, Sep 13, 2018 at 11:18:16AM +0300, Eran Ben Elisha wrote:
> Add devlink-health man page. Devlink-health tool will control device
> health attributes, sensors, actions and logging.
>
> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
>
> -------------------------------------------------------
> Copy paste man output to here for easier review process of the RFC.
>
> DEVLINK-HEALTH(8)                    Linux                    DEVLINK-HEALTH(8)
>
> NAME
>        devlink-health - devlink health configuration
>
> SYNOPSIS
>        devlink [ OPTIONS ] health { COMMAND | help }
>
>        OPTIONS := { -V[ersion] | -n[no-nice-names] }
>
>        devlink health show [ DEV ] [ sensor NAME ]
>
>        devlink health sensor set DEV name NAME [ action NAME { active | inactive } ]"
>
>        devlink health action set DEV name NAME period PERIOD count COUNT fail { ignore | down }
>
>        devlink health action reinit DEV name NAME
>
>        devlink health help
>
> DESCRIPTION
>        devlink-health tool allows user to configure the way driver treats unexpected status. The tool allows configuration of the sensors that can trigger health activity. Set for each sensor the follow up operations, such as,
>        reset and dump of info. In addition, set the health activity termination action.
>
>    devlink health show - Display devlink health sensors and actions attributes
>        DEV - Specifies the devlink device to show.  If this argument is omitted, all devices are listed.
>
>            Format is:
>              BUS_NAME/BUS_ADDRESS
>
>        sensor NAME - Specifies the devlink sensor to show.
>

Perhaps the commands should include the optional arguments so when
reading the description one doesn't have to scroll to the top of the
page all the time

e.g
	devlink health show [ DEV ] [ sensor NAME ] - Display devlink health sensors and actions attributes

>    devlink health sensor set - sets devlink health sensor attributes
>        DEV    Specifies the devlink device to show.

                                                  set

>        name NAME
>               Name of the sensor to set.
>
>        action NAME { active | inactive }
>               Specify which actions to activate and which to deactivate once a sensor was triggered. actions can be dump, reset, etc.
>
>    devlink health action set - sets devlink action attributes
>        DEV    Specifies the devlink device to set.
>
>        name NAME
>               Specifies the devlink action to set.

This is a little unclear to me?

>        period PERIOD
>               The period on which we limit the amount of performed actions, measured in seconds.
>
>        count COUNT
>               The maximum amount of actions performed in a limit time frame.

Perhaps
	The maximum number of actions performed in a limited time frame.

>        fail { ignore | down }
>               Specify the behavior once count limit was reached.
>
>               ignore - Ignore errors without execution of any action.
>
>               down - Driver will remain in nonoperational state.
>
>    devlink health action reinit - reset devlink action attributes (period, count, fail, etc)
>        DEV    Specifies the devlink device to set.
>
>        name NAME
>               Specifies the devlink action to set.

Perhaps s/set/reinitialise/g for the above two descriptions.

Hope this helps,
Tobin.
* Re: [RFC PATCH iproute2-next] man: Add devlink health man page
  2018-09-13 10:27   ` Tobin C. Harding
@ 2018-09-13 11:58     ` Eran Ben Elisha
  2018-09-13 22:06       ` Tobin C. Harding
  0 siblings, 1 reply; 17+ messages in thread
From: Eran Ben Elisha @ 2018-09-13 11:58 UTC
To: Tobin C. Harding
Cc: netdev, Jiri Pirko, Andy Gospodarek, Michael Chan, Jakub Kicinski, Simon Horman, Alexander Duyck, Andrew Lunn, Florian Fainelli, Tal Alon, Ariel Almog

On 9/13/2018 1:27 PM, Tobin C. Harding wrote:
> On Thu, Sep 13, 2018 at 11:18:16AM +0300, Eran Ben Elisha wrote:
>> Add devlink-health man page. Devlink-health tool will control device
>> health attributes, sensors, actions and logging.
>>
>> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
>>
>> -------------------------------------------------------
>> Copy paste man output to here for easier review process of the RFC.
>>
>> DEVLINK-HEALTH(8)                    Linux                    DEVLINK-HEALTH(8)
>>
>> NAME
>>        devlink-health - devlink health configuration
>>
>> SYNOPSIS
>>        devlink [ OPTIONS ] health { COMMAND | help }
>>
>>        OPTIONS := { -V[ersion] | -n[no-nice-names] }
>>
>>        devlink health show [ DEV ] [ sensor NAME ]
>>
>>        devlink health sensor set DEV name NAME [ action NAME { active | inactive } ]"
>>
>>        devlink health action set DEV name NAME period PERIOD count COUNT fail { ignore | down }
>>
>>        devlink health action reinit DEV name NAME
>>
>>        devlink health help
>>
>> DESCRIPTION
>>        devlink-health tool allows user to configure the way driver treats unexpected status. The tool allows configuration of the sensors that can trigger health activity. Set for each sensor the follow up operations, such as,
>>        reset and dump of info. In addition, set the health activity termination action.
>>
>>    devlink health show - Display devlink health sensors and actions attributes
>>        DEV - Specifies the devlink device to show.  If this argument is omitted, all devices are listed.
>>
>>            Format is:
>>              BUS_NAME/BUS_ADDRESS
>>
>>        sensor NAME - Specifies the devlink sensor to show.
>>
>
> Perhaps the commands should include the optional arguments so when
> reading the description one doesn't have to scroll to the top of the
> page all the time
>
> e.g
> 	devlink health show [ DEV ] [ sensor NAME ] - Display devlink health sensors and actions attributes
>

I followed the scheme presented in all other devlink man pages.
see devlink-region, devlink-port, etc.
From my perspective, I am fine with adding it to devlink-health, need ack
from the devlink maintainer to see if he likes it...

>>    devlink health sensor set - sets devlink health sensor attributes
>>        DEV    Specifies the devlink device to show.
>
>                                                   set
>
>>        name NAME
>>               Name of the sensor to set.
>>
>>        action NAME { active | inactive }
>>               Specify which actions to activate and which to deactivate once a sensor was triggered. actions can be dump, reset, etc.
>>
>>    devlink health action set - sets devlink action attributes
>>        DEV    Specifies the devlink device to set.
>>
>>        name NAME
>>               Specifies the devlink action to set.
>
> This is a little unclear to me?

what is not clear? the term 'action' or the naming? can you elaborate?

>
>>        period PERIOD
>>               The period on which we limit the amount of performed actions, measured in seconds.
>>
>>        count COUNT
>>               The maximum amount of actions performed in a limit time frame.
>
> Perhaps
> 	The maximum number of actions performed in a limited time frame.

ack

>
>>        fail { ignore | down }
>>               Specify the behavior once count limit was reached.
>>
>>               ignore - Ignore errors without execution of any action.
>>
>>               down - Driver will remain in nonoperational state.
>>
>>    devlink health action reinit - reset devlink action attributes (period, count, fail, etc)
>>        DEV    Specifies the devlink device to set.
>>
>>        name NAME
>>               Specifies the devlink action to set.
>
> Perhaps s/set/reinitialise/g for the above two descriptions.

ack

>
> Hope this helps,
> Tobin.

thanks
* Re: [RFC PATCH iproute2-next] man: Add devlink health man page
  2018-09-13 11:58     ` Eran Ben Elisha
@ 2018-09-13 22:06       ` Tobin C. Harding
  0 siblings, 0 replies; 17+ messages in thread
From: Tobin C. Harding @ 2018-09-13 22:06 UTC
To: Eran Ben Elisha
Cc: netdev, Jiri Pirko, Andy Gospodarek, Michael Chan, Jakub Kicinski, Simon Horman, Alexander Duyck, Andrew Lunn, Florian Fainelli, Tal Alon, Ariel Almog

On Thu, Sep 13, 2018 at 02:58:52PM +0300, Eran Ben Elisha wrote:
>
>
> On 9/13/2018 1:27 PM, Tobin C. Harding wrote:
> > On Thu, Sep 13, 2018 at 11:18:16AM +0300, Eran Ben Elisha wrote:
> > > Add devlink-health man page. Devlink-health tool will control device
> > > health attributes, sensors, actions and logging.
> > >
> > > Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
> > >
> > > -------------------------------------------------------
> > > Copy paste man output to here for easier review process of the RFC.
> > >
> > > DEVLINK-HEALTH(8)                    Linux                    DEVLINK-HEALTH(8)
> > >
> > > NAME
> > >        devlink-health - devlink health configuration
> > >
> > > SYNOPSIS
> > >        devlink [ OPTIONS ] health { COMMAND | help }
> > >
> > >        OPTIONS := { -V[ersion] | -n[no-nice-names] }
> > >
> > >        devlink health show [ DEV ] [ sensor NAME ]
> > >
> > >        devlink health sensor set DEV name NAME [ action NAME { active | inactive } ]"
> > >
> > >        devlink health action set DEV name NAME period PERIOD count COUNT fail { ignore | down }
> > >
> > >        devlink health action reinit DEV name NAME
> > >
> > >        devlink health help
> > >
> > > DESCRIPTION
> > >        devlink-health tool allows user to configure the way driver treats unexpected status. The tool allows configuration of the sensors that can trigger health activity. Set for each sensor the follow up operations, such as,
> > >        reset and dump of info. In addition, set the health activity termination action.
> > >
> > >    devlink health show - Display devlink health sensors and actions attributes
> > >        DEV - Specifies the devlink device to show.  If this argument is omitted, all devices are listed.
> > >
> > >            Format is:
> > >              BUS_NAME/BUS_ADDRESS
> > >
> > >        sensor NAME - Specifies the devlink sensor to show.
> > >
> >
> > Perhaps the commands should include the optional arguments so when
> > reading the description one doesn't have to scroll to the top of the
> > page all the time
> >
> > e.g
> > 	devlink health show [ DEV ] [ sensor NAME ] - Display devlink health sensors and actions attributes
> >
>
> I followed the scheme presented in all other devlink man pages.
> see devlink-region, devlink-port, etc.

Oh ok, my mistake. I'd stick with what you have then. Thanks for
pointing this out.

> From my perspective, I am fine with adding it to devlink-health, need ack
> from the devlink maintainer to see if he likes it...
>
> > >    devlink health sensor set - sets devlink health sensor attributes
> > >        DEV    Specifies the devlink device to show.
> >
> >                                                   set
> >
> > >        name NAME
> > >               Name of the sensor to set.
> > >
> > >        action NAME { active | inactive }
> > >               Specify which actions to activate and which to deactivate once a sensor was triggered. actions can be dump, reset, etc.
> > >
> > >    devlink health action set - sets devlink action attributes
> > >        DEV    Specifies the devlink device to set.
> > >
> > >        name NAME
> > >               Specifies the devlink action to set.
> >
> > This is a little unclear to me?
>
> what is not clear? the term 'action' or the naming? can you elaborate?

It wasn't immediately clear what 'name' referred to. But following on
from discussion above this may be because I have not read any of the
other devlink man pages.

thanks,
Tobin.
* Re: [RFC PATCH iproute2-next] man: Add devlink health man page
  2018-09-13  8:18 ` [RFC PATCH iproute2-next] man: Add devlink health man page Eran Ben Elisha
  2018-09-13 10:27   ` Tobin C. Harding
@ 2018-09-13 12:08   ` Andrew Lunn
  2018-09-13 12:49     ` Eran Ben Elisha
  1 sibling, 1 reply; 17+ messages in thread
From: Andrew Lunn @ 2018-09-13 12:08 UTC
To: Eran Ben Elisha
Cc: netdev, Jiri Pirko, Andy Gospodarek, Michael Chan, Jakub Kicinski, Simon Horman, Alexander Duyck, Florian Fainelli, Tal Alon, Ariel Almog

>        devlink health sensor set pci/0000:01:00.0 name TX_COMP_ERROR action reset off action dump on
>            Sets TX_COMP_ERROR sensor parameters for a specific device.

I hope the real sensors have more understandable names. If i remember
correctly, the same sort of comment was given for resource
management. It was pretty unclear what the resource names actually
mean. Is an average user going to have any idea how to actually use
these sensors and actions?

Can you give more examples of sensors. We should understand if there
are any overlaps with hwmon.

    Andrew
* Re: [RFC PATCH iproute2-next] man: Add devlink health man page
  2018-09-13 12:08   ` Andrew Lunn
@ 2018-09-13 12:49     ` Eran Ben Elisha
  2018-09-13 13:24       ` Andrew Lunn
  0 siblings, 1 reply; 17+ messages in thread
From: Eran Ben Elisha @ 2018-09-13 12:49 UTC
To: Andrew Lunn
Cc: netdev, Jiri Pirko, Andy Gospodarek, Michael Chan, Jakub Kicinski, Simon Horman, Alexander Duyck, Florian Fainelli, Tal Alon, Ariel Almog

On 9/13/2018 3:08 PM, Andrew Lunn wrote:
>>        devlink health sensor set pci/0000:01:00.0 name TX_COMP_ERROR action reset off action dump on
>>            Sets TX_COMP_ERROR sensor parameters for a specific device.
>
> I hope the real sensors have more understandable names. If i remember
> correctly, the same sort of comment was given for resource
> management. It was pretty unclear what the resource names actually
> mean. Is an average user going to have any idea how to actually use
> these sensors and actions?

well, hopefully. the whole point is to have it fully controlled by the
user. However, names for the command should be short. I guess we shall
have it documented (challenge is to fit to multi vendors).

>
> Can you give more examples of sensors. We should understand if there
> are any overlaps with hwmon.

I restate here that we shall have SW sensors as well, and not only HW
sensors.

This is what I had in mind:
1. command interface error
2. command interface timeout
3. stuck TX queue (like tx_timeout)
4. stuck TX completion queue (driver did not process packets in a
   reasonable time period)
5. stuck RX queue
6. RX completion error
7. TX completion error
8. HW / FW catastrophic error report
9. completion queue overrun

Eran

>
>     Andrew
>
* Re: [RFC PATCH iproute2-next] man: Add devlink health man page
  2018-09-13 12:49     ` Eran Ben Elisha
@ 2018-09-13 13:24       ` Andrew Lunn
  2018-09-13 14:30         ` Eran Ben Elisha
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Lunn @ 2018-09-13 13:24 UTC
To: Eran Ben Elisha
Cc: netdev, Jiri Pirko, Andy Gospodarek, Michael Chan, Jakub Kicinski, Simon Horman, Alexander Duyck, Florian Fainelli, Tal Alon, Ariel Almog

On Thu, Sep 13, 2018 at 03:49:37PM +0300, Eran Ben Elisha wrote:
>
>
> On 9/13/2018 3:08 PM, Andrew Lunn wrote:
> >>        devlink health sensor set pci/0000:01:00.0 name TX_COMP_ERROR action reset off action dump on
> >>            Sets TX_COMP_ERROR sensor parameters for a specific device.
> >
> >I hope the real sensors have more understandable names. If i remember
> >correctly, the same sort of comment was given for resource
> >management. It was pretty unclear what the resource names actually
> >mean. Is an average user going to have any idea how to actually use
> >these sensors and actions?
>
> well, hopefully. the whole point is to have it fully controlled by the user.
> However, names for the command should be short. I guess we shall have it
> documented (challenge is to fit to multi vendors).
>
> >
> >Can you give more examples of sensors. We should understand if there
> >are any overlaps with hwmon.
>
> I restate here that we shall have SW sensors as well, and not only HW
> sensors.
>
> This is what I had in mind:
> 1. command interface error
> 2. command interface timeout
> 3. stuck TX queue (like tx_timeout)
> 4. stuck TX completion queue (driver did not process packets in a reasonable
> time period)
> 5. stuck RX queue
> 6. RX completion error
> 7. TX completion error
> 8. HW / FW catastrophic error report
> 9. completion queue overrun

Hi Eran

I'm having trouble differentiating between these SW sensors and bugs
which need fixing. What causes a command interface error? Sending it a
command it does not understand? A wrongly formatted command? A command
the version of the firmware does not support? These all sound just
like plain old bugs which need fixing, not something which needs a
framework to detect them and try to recover from them by resetting
something.

I would have expected that all the issues are about physical
properties. Something similar to SMART for hard disks. The power
supplies are starting to droop, suggesting it might die soon. The
tacho on the fan suggests the FAN is not rotating as fast as it
should, so the motor is going to die soon. An SFP is giving i2c
errors, suggesting it is not seated correctly. The card as a whole is
overheating, despite the fan working, suggesting the ambient
temperature is just too high.

    Andrew
* Re: [RFC PATCH iproute2-next] man: Add devlink health man page
  2018-09-13 13:24       ` Andrew Lunn
@ 2018-09-13 14:30         ` Eran Ben Elisha
  2018-09-13 15:12           ` Andrew Lunn
  0 siblings, 1 reply; 17+ messages in thread
From: Eran Ben Elisha @ 2018-09-13 14:30 UTC
To: Andrew Lunn
Cc: netdev, Jiri Pirko, Andy Gospodarek, Michael Chan, Jakub Kicinski, Simon Horman, Alexander Duyck, Florian Fainelli, Tal Alon, Ariel Almog

On 9/13/2018 4:24 PM, Andrew Lunn wrote:
> On Thu, Sep 13, 2018 at 03:49:37PM +0300, Eran Ben Elisha wrote:
>>
>>
>> On 9/13/2018 3:08 PM, Andrew Lunn wrote:
>>>>        devlink health sensor set pci/0000:01:00.0 name TX_COMP_ERROR action reset off action dump on
>>>>            Sets TX_COMP_ERROR sensor parameters for a specific device.
>>>
>>> I hope the real sensors have more understandable names. If i remember
>>> correctly, the same sort of comment was given for resource
>>> management. It was pretty unclear what the resource names actually
>>> mean. Is an average user going to have any idea how to actually use
>>> these sensors and actions?
>>
>> well, hopefully. the whole point is to have it fully controlled by the user.
>> However, names for the command should be short. I guess we shall have it
>> documented (challenge is to fit to multi vendors).
>>
>>>
>>> Can you give more examples of sensors. We should understand if there
>>> are any overlaps with hwmon.
>>
>> I restate here that we shall have SW sensors as well, and not only HW
>> sensors.
>>
>> This is what I had in mind:
>> 1. command interface error
>> 2. command interface timeout
>> 3. stuck TX queue (like tx_timeout)
>> 4. stuck TX completion queue (driver did not process packets in a reasonable
>> time period)
>> 5. stuck RX queue
>> 6. RX completion error
>> 7. TX completion error
>> 8. HW / FW catastrophic error report
>> 9. completion queue overrun
>
> Hi Eran
>
> I'm having trouble differentiating between these SW sensors and bugs
> which need fixing. What causes a command interface error? Sending it a
> command it does not understand? A wrongly formatted command? A command
> the version of the firmware does not support? These all sound just
> like plain old bugs which need fixing, not something which needs a
> framework to detect them and try to recover from them by resetting
> something.

Such issues do exist in production environment, and need to be handled
even if root cause is a bug which will be fixed in latest release. My
feature should help developers / administrator to control and recover
their live systems, by auto correction and logging support.
Goal is:
- Provide alert debug information
- Self healing
- If problem needs vendor support, provide a way to gather all needed
  debugging information.

>
> I would have expected that all the issues are about physical
> properties. Something similar to SMART for hard disks. The power
> supplies are starting to droop, suggesting it might die soon. The
> tacho on the fan suggests the FAN is not rotating as fast as it
> should, so the motor is going to die soon. An SFP is giving i2c
> errors, suggesting it is not seated correctly. The card as a whole is
> overheating, despite the fan working, suggesting the ambient
> temperature is just too high.

AFAIU, the kind of sensors you suggest here requires manual fix /
physically approaching to the setup, replace HW, install new Fan, etc.
Monitor such events is easy, driver can just log events from HW to the
dmesg and end its handle there. None of these is a real networking issue
I would like to handle with devlink-health.

Eran

>
>     Andrew
>
* Re: [RFC PATCH iproute2-next] man: Add devlink health man page
  2018-09-13 14:30         ` Eran Ben Elisha
@ 2018-09-13 15:12           ` Andrew Lunn
  2018-09-16  9:14             ` Eran Ben Elisha
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Lunn @ 2018-09-13 15:12 UTC
To: Eran Ben Elisha
Cc: netdev, Jiri Pirko, Andy Gospodarek, Michael Chan, Jakub Kicinski, Simon Horman, Alexander Duyck, Florian Fainelli, Tal Alon, Ariel Almog

> >>>>        devlink health sensor set pci/0000:01:00.0 name TX_COMP_ERROR action reset off action dump on
> >>>>            Sets TX_COMP_ERROR sensor parameters for a specific device.

> >>This is what I had in mind:
> >>1. command interface error
> >>2. command interface timeout
> >>3. stuck TX queue (like tx_timeout)
> >>4. stuck TX completion queue (driver did not process packets in a reasonable
> >>time period)
> >>5. stuck RX queue
> >>6. RX completion error
> >>7. TX completion error
> >>8. HW / FW catastrophic error report
> >>9. completion queue overrun

> Such issues do exist in production environment, and need to be handled even
> if root cause is a bug which will be fixed in latest release. My feature
> should help developers / administrator to control and recover their live
> systems, by auto correction and logging support.
> Goal is:
> - Provide alert debug information
> - Self healing
> - If problem needs vendor support, provide a way to gather all needed
> debugging information.

So maybe you have the wrong name for this. Health is nice in terms of
Marketing, but we are actually talking about bug recovery.

devlink bug sensor set pci/0000:01:00.0 name command_interface_error action reset off action dump on
devlink bug sensor set pci/0000:01:00.0 name command_interface_timeout action reset off action dump on
devlink bug sensor set pci/0000:01:00.0 name transmit_completion_error action reset off action dump on
devlink bug sensor set pci/0000:01:00.0 name completion_queue_overrun action reset off action dump on

seems a lot more understandable than:

devlink health set pci/0000:01:00.0 name TX_COMP_ERROR action reset off action dump on

     Andrew
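[Editor's sketch, not part of the original posting.] The `sensor set` command lines discussed in this thread all follow the shape `DEV name NAME [ action NAME STATE ]...`. A small parser makes that grammar explicit. The function `parse_sensor_set` is hypothetical and is not the iproute2 implementation; it accepts both `on`/`off` (used in the examples) and `active`/`inactive` (used in the synopsis), which is an assumption made to cover both spellings seen in the RFC.

```python
def parse_sensor_set(argv):
    """Parse the arguments that follow 'sensor set': DEV name NAME [ action NAME STATE ]..."""
    # Assumption: both spellings seen in the RFC are accepted.
    states = {"on": True, "active": True, "off": False, "inactive": False}
    if len(argv) < 3 or argv[1] != "name":
        raise ValueError("expected: DEV name NAME [ action NAME STATE ]...")
    dev, sensor = argv[0], argv[2]
    rest = argv[3:]
    if len(rest) % 3 != 0:
        raise ValueError("each action clause needs a name and a state")
    actions = {}
    for i in range(0, len(rest), 3):
        keyword, action, state = rest[i:i + 3]
        if keyword != "action" or state not in states:
            raise ValueError("bad action clause: " + " ".join(rest[i:i + 3]))
        actions[action] = states[state]
    return dev, sensor, actions


args = ("pci/0000:01:00.0 name command_interface_error "
        "action reset off action dump on").split()
print(parse_sensor_set(args))
```

The result is a `(device, sensor, {action: enabled})` tuple, so whichever sensor naming scheme is chosen only changes the middle element.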
* Re: [RFC PATCH iproute2-next] man: Add devlink health man page
From: Eran Ben Elisha @ 2018-09-16 9:14 UTC
To: Andrew Lunn
Cc: netdev, Jiri Pirko, Andy Gospodarek, Michael Chan, Jakub Kicinski, Simon Horman, Alexander Duyck, Florian Fainelli, Tal Alon, Ariel Almog

On 9/13/2018 6:12 PM, Andrew Lunn wrote:
>>>>>> devlink health sensor set pci/0000:01:00.0 name TX_COMP_ERROR action reset off action dump on
>>>>>> Sets TX_COMP_ERROR sensor parameters for a specific device.
>
>>>> This is what I had in mind:
>>>> 1. command interface error
>>>> 2. command interface timeout
>>>> 3. stuck TX queue (like tx_timeout)
>>>> 4. stuck TX completion queue (driver did not process packets in a reasonable
>>>> time period)
>>>> 5. stuck RX queue
>>>> 6. RX completion error
>>>> 7. TX completion error
>>>> 8. HW / FW catastrophic error report
>>>> 9. completion queue overrun
>
>> Such issues do exist in production environment, and need to be handled even
>> if root cause is a bug which will be fixed in latest release. My feature
>> should help developers / administrator to control and recover their live
>> systems, by auto correction and logging support.
>> Goal is:
>> - Provide alert debug information
>> - Self healing
>> - If problem needs vendor support, provide a way to gather all needed
>> debugging information.
>
> So maybe you have the wrong name for this. Health is nice in terms of
> Marketing, but we are actually talking about bug recovery.

The way I see it, this feature is responsible for the health of the
system from the pci/xxxx perspective. I thought about devlink-recover,
for example, but I really wouldn't like to name the feature after just
one of its actions. The same goes for devlink-bug, which highlights only
part of the range of capabilities (the sensor).

My work is currently focused on error reporting and recovery, but I
wouldn't like to see the API limited to "bugs" only.

Eran

>
> devlink bug sensor set pci/0000:01:00.0 name command_interface_error action reset off action dump on
> devlink bug sensor set pci/0000:01:00.0 name command_interface_timeout action reset off action dump on
> devlink bug sensor set pci/0000:01:00.0 name transmit_completion_error action reset off action dump on
> devlink bug sensor set pci/0000:01:00.0 name completion_queue_overrun action reset off action dump on
>
> seems a lot more understandable than:
>
> devlink health set pci/0000:01:00.0 name TX_COMP_ERROR action reset off action dump on
>
> Andrew
>
* Re: [RFC PATCH iproute2-next] System specification health API
From: Jakub Kicinski @ 2018-09-13 17:36 UTC
To: Eran Ben Elisha
Cc: netdev, Jiri Pirko, Andy Gospodarek, Michael Chan, Simon Horman, Alexander Duyck, Andrew Lunn, Florian Fainelli, Tal Alon, Ariel Almog

On Thu, 13 Sep 2018 11:18:15 +0300, Eran Ben Elisha wrote:
> The health spec is targeted for Real Time Alerting, in order to know when
> something bad had happened to a PCI device

By spec you mean some standards body spec you implement or this
proposal is a spec?

> - Provide alert debug information
> - Self healing
> - If problem needs vendor support, provide a way to gather all needed debugging
> information.
>
> The health contains sensors which sense for malfunction. Once sensor triggered,
> actions such as logs and correction can be taken.
> Sensors are sensing the health state and can trigger correction action.
>
> The sensors are divided into the following groups
> - Hardware sensor - a sensor which is triggered by the device due to
> malfunction.
> - Software sensor - a sensor which is triggered by the software due to
> malfunction.
> Both group of sensors can be triggered due to error event or due to a periodic check.
>
> Actions are the way to handle sensor events. Action can be in one of the
> following groups:
> - Dump - SW trace, SW dump, HW trace, HW dump
> - Reset - Surgical correction (e.g. modify Q, flush Q, reset of device, etc)
> Actions can be performed by SW or HW.
>
> User is allowed to enable or disable sensors and sensor2action mapping.
>
> This RFC man page patch describes the suggested API of devlink-health in order
> to control sensors and actions.

I like the idea of configuring response to events like this, although
I'm not sure the name sensor is appropriate here - perhaps exception or
error would be better? Are there going to be values reported?

I'm not so sure about HW sensors in relation to existing HWMON
infrastructure... I assume you're targeting things like say some HW
engine/block reporting it encountered an error? Sounds good, too.

Are the actions all envisioned to be performed by the driver?
Firmware? Hardware? I guess that distinction can be added later.
For FW/HW actions we would go back to the problem of persistence of
the setting since it was only implemented for params :S

Is the dump option going to tie back into region snapshots?
* Re: [RFC PATCH iproute2-next] System specification health API
From: Eran Ben Elisha @ 2018-09-16 10:37 UTC
To: Jakub Kicinski
Cc: netdev, Jiri Pirko, Andy Gospodarek, Michael Chan, Simon Horman, Alexander Duyck, Andrew Lunn, Florian Fainelli, Tal Alon, Ariel Almog

On 9/13/2018 8:36 PM, Jakub Kicinski wrote:
> On Thu, 13 Sep 2018 11:18:15 +0300, Eran Ben Elisha wrote:
>> The health spec is targeted for Real Time Alerting, in order to know when
>> something bad had happened to a PCI device
>
> By spec you mean some standards body spec you implement or this
> proposal is a spec?

This proposal is the spec.

>
>> - Provide alert debug information
>> - Self healing
>> - If problem needs vendor support, provide a way to gather all needed debugging
>> information.
>>
>> The health contains sensors which sense for malfunction. Once sensor triggered,
>> actions such as logs and correction can be taken.
>> Sensors are sensing the health state and can trigger correction action.
>>
>> The sensors are divided into the following groups
>> - Hardware sensor - a sensor which is triggered by the device due to
>> malfunction.
>> - Software sensor - a sensor which is triggered by the software due to
>> malfunction.
>> Both group of sensors can be triggered due to error event or due to a periodic check.
>>
>> Actions are the way to handle sensor events. Action can be in one of the
>> following groups:
>> - Dump - SW trace, SW dump, HW trace, HW dump
>> - Reset - Surgical correction (e.g. modify Q, flush Q, reset of device, etc)
>> Actions can be performed by SW or HW.
>>
>> User is allowed to enable or disable sensors and sensor2action mapping.
>>
>> This RFC man page patch describes the suggested API of devlink-health in order
>> to control sensors and actions.
>
> I like the idea of configuring response to events like this, although
> I'm not sure the name sensor is appropriate here - perhaps exception or
> error would be better?

I was trying to avoid a negative description. I called it sensor to
avoid restricting the API to errors / exceptions only. I got the same
type of comment from Andrew as well (devlink-health -> devlink-bug).

But if other vendors' driver developers don't see how it can be expanded
to sensors which are not errors, then I guess we can refactor the names.

> Are there going to be values reported?

It depends on the sensor. If it has data that would help in debugging,
then I assume yes, via the dumps.

>
> I'm not so sure about HW sensors in relation to existing HWMON
> infrastructure... I assume you're targeting things like say some HW
> engine/block reporting it encountered an error? Sounds good, too.

Yes, exactly.

>
> Are the actions all envisioned to be performed by the driver?
> Firmware? Hardware? I guess that distinction can be added later.
> For FW/HW actions we would go back to the problem of persistence of
> the setting since it was only implemented for params :S

The problem is not with the FW action; the problem is when you try to
set the sensor2action mapping for the FW/HW. This will need a persistent
configuration mode. Sensor2action in SW shall be run-time mode (at least
as a start).
But it sounds like this needs some more tuning to make it clear.

>
> Is the dump option going to tie back into region snapshots?
>

Not necessarily, dumping SW objects can be helpful as well.
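The sensor2action idea discussed in this message can be illustrated with a small run-time model. This is only a sketch under stated assumptions: `Sensor`, `HealthModel`, `register_action` and `trigger` are hypothetical names, and nothing here reflects how a driver or devlink actually implements the proposal. It only shows a per-sensor table of enabled actions being consulted when a sensor fires.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Sensor:
    """Hypothetical sensor: may be enabled/disabled and maps actions to on/off."""
    name: str
    enabled: bool = True
    actions: Dict[str, bool] = field(default_factory=dict)  # sensor2action mapping


class HealthModel:
    """Toy run-time (non-persistent) model of sensors and their actions."""

    def __init__(self) -> None:
        self.sensors: Dict[str, Sensor] = {}
        self.handlers: Dict[str, Callable[[str], None]] = {}

    def register_action(self, action: str, handler: Callable[[str], None]) -> None:
        # An action (e.g. "dump" or "reset") is a callback taking the sensor name.
        self.handlers[action] = handler

    def add_sensor(self, sensor: Sensor) -> None:
        self.sensors[sensor.name] = sensor

    def trigger(self, sensor_name: str) -> List[str]:
        """Run every enabled action mapped to the sensor; return what ran."""
        sensor = self.sensors[sensor_name]
        if not sensor.enabled:
            return []
        ran = []
        for action, on in sensor.actions.items():
            if on and action in self.handlers:
                self.handlers[action](sensor_name)
                ran.append(action)
        return ran
```

With a `TX_COMP_ERROR` sensor mapped as `{"reset": False, "dump": True}`, triggering it runs only the dump handler, and disabling the sensor suppresses all actions. Because the table lives only in memory, it is lost on reload, which is the run-time versus persistent distinction raised above.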
* Re: [RFC PATCH iproute2-next] System specification health API
From: Eran Ben Elisha @ 2018-09-25 12:00 UTC
To: Jakub Kicinski
Cc: netdev, Jiri Pirko, Andy Gospodarek, Michael Chan, Simon Horman, Alexander Duyck, Andrew Lunn, Florian Fainelli, Tal Alon, Ariel Almog

On 9/16/2018 1:37 PM, Eran Ben Elisha wrote:
>
>
> On 9/13/2018 8:36 PM, Jakub Kicinski wrote:
>> On Thu, 13 Sep 2018 11:18:15 +0300, Eran Ben Elisha wrote:
>>> The health spec is targeted for Real Time Alerting, in order to know
>>> when
>>> something bad had happened to a PCI device
>>
>> By spec you mean some standards body spec you implement or this
>> proposal is a spec?
>
> This proposal is a spec
>
>>
>>> - Provide alert debug information
>>> - Self healing
>>> - If problem needs vendor support, provide a way to gather all needed
>>> debugging
>>> information.
>>>
>>> The health contains sensors which sense for malfunction. Once sensor
>>> triggered,
>>> actions such as logs and correction can be taken.
>>> Sensors are sensing the health state and can trigger correction action.
>>>
>>> The sensors are divided into the following groups
>>> - Hardware sensor - a sensor which is triggered by the device due to
>>> malfunction.
>>> - Software sensor - a sensor which is triggered by the software due to
>>> malfunction.
>>> Both group of sensors can be triggered due to error event or due to a
>>> periodic check.
>>>
>>> Actions are the way to handle sensor events. Action can be in one of the
>>> following groups:
>>> - Dump - SW trace, SW dump, HW trace, HW dump
>>> - Reset - Surgical correction (e.g. modify Q, flush Q, reset of
>>> device, etc)
>>> Actions can be performed by SW or HW.
>>>
>>> User is allowed to enable or disable sensors and sensor2action mapping.
>>>
>>> This RFC man page patch describes the suggested API of devlink-health
>>> in order
>>> to control sensors and actions.
>>
>> I like the idea of configuring response to events like this, although
>> I'm not sure the name sensor is appropriate here - perhaps exception or
>> error would be better?
>
> I was trying to avoid the negativity description. Have it called sensor
> to avoid restricting the API for errors / exceptions only. I got the
> same type of comment from Andrew as well devlink-health->devlink-bug.
>
> But if other vendors driver developers don't see it can be expanded to
> sensor which are not errors, then I guess we can refactor the names.
>
> Are there going to be values reported?
>
> It depends on the sensor. If it has data that would help in the debug,
> then I assume yes, via the dumps.
>
>>
>> I'm not so sure about HW sensors in relation to existing HWMON
>> infrastructure... I assume you're targeting things like say some HW
>> engine/block reporting it encountered an error? Sounds good, too.
>
> yes, exactly.
>
>>
>> Are the actions all envisioned to be performed by the driver?
>> Firmware? Hardware? I guess that distinction can be added later.
>> For FW/HW actions we would go back to the problem of persistence of
>> the setting since it was only implemented for params :S
>
> The problem is not with FW action, the problem is when you try to set
> sensor2action mapping for the FW/HW. this will need persistence
> configuration mode. Sensor2action in SW shall be run-time mode (at least
> as a start).
> But it sound as this need some more tuning, to make it clear.

Revisiting this (before sending V2).
My guideline is that persistence inside the device is needed only when
persistent information is needed before the driver loads. For any other
configuration (i.e. post HW boot), one can use standard Linux scripts to
control its persistent information.
If a new sensor is added that requires pre-HW-boot information, the API
can be extended later.

>
>>
>> Is the dump option going to tie back into region snapshots?
>>
> no necessarily, dumping SW objects as well can be helpful
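The "standard Linux scripts" approach for post-boot settings could look roughly like the following Python sketch: saved settings are kept in a JSON document and turned back into command lines to replay once the driver is loaded. Everything here is an assumption made for illustration: `replay_commands` is a hypothetical helper, the JSON layout is invented, and the generated command uses the syntax proposed in this RFC rather than any released devlink command. The sketch only builds the command strings and does not execute them.

```python
import json
import shlex
from typing import List


def replay_commands(saved_json: str) -> List[str]:
    """Build one command line per saved sensor.

    Assumed JSON layout (hypothetical):
        { "<device>": { "<sensor>": { "<action>": true|false, ... } } }
    """
    saved = json.loads(saved_json)
    commands = []
    for device, sensors in saved.items():
        for sensor_name, actions in sensors.items():
            # Command form as proposed in the RFC; not a released devlink syntax.
            argv = ["devlink", "health", "sensor", "set",
                    device, "name", sensor_name]
            for action, enabled in actions.items():
                argv += ["action", action, "on" if enabled else "off"]
            commands.append(" ".join(shlex.quote(a) for a in argv))
    return commands
```

A boot-time script would read the saved file after the driver loads and run each returned line. Settings that must exist before the driver loads cannot be handled this way, which is the case this message leaves for a later API extension.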
* Re: [RFC PATCH iproute2-next] System specification health API
From: Stephen Hemminger @ 2018-09-16 19:29 UTC
To: Jakub Kicinski
Cc: Eran Ben Elisha, netdev, Jiri Pirko, Andy Gospodarek, Michael Chan, Simon Horman, Alexander Duyck, Andrew Lunn, Florian Fainelli, Tal Alon, Ariel Almog

On Thu, 13 Sep 2018 10:36:04 -0700
Jakub Kicinski <jakub.kicinski@netronome.com> wrote:

> On Thu, 13 Sep 2018 11:18:15 +0300, Eran Ben Elisha wrote:
> > The health spec is targeted for Real Time Alerting, in order to know when
> > something bad had happened to a PCI device
>
> By spec you mean some standards body spec you implement or this
> proposal is a spec?
>
> > - Provide alert debug information
> > - Self healing
> > - If problem needs vendor support, provide a way to gather all needed debugging
> > information.
> >
> > The health contains sensors which sense for malfunction. Once sensor triggered,
> > actions such as logs and correction can be taken.
> > Sensors are sensing the health state and can trigger correction action.
> >
> > The sensors are divided into the following groups
> > - Hardware sensor - a sensor which is triggered by the device due to
> > malfunction.
> > - Software sensor - a sensor which is triggered by the software due to
> > malfunction.
> > Both group of sensors can be triggered due to error event or due to a periodic check.
> >
> > Actions are the way to handle sensor events. Action can be in one of the
> > following groups:
> > - Dump - SW trace, SW dump, HW trace, HW dump
> > - Reset - Surgical correction (e.g. modify Q, flush Q, reset of device, etc)
> > Actions can be performed by SW or HW.
> >
> > User is allowed to enable or disable sensors and sensor2action mapping.
> >
> > This RFC man page patch describes the suggested API of devlink-health in order
> > to control sensors and actions.
>
> I like the idea of configuring response to events like this, although
> I'm not sure the name sensor is appropriate here - perhaps exception or
> error would be better? Are there going to be values reported?
>
> I'm not so sure about HW sensors in relation to existing HWMON
> infrastructure... I assume you're targeting things like say some HW
> engine/block reporting it encountered an error? Sounds good, too.
>
> Are the actions all envisioned to be performed by the driver?
> Firmware? Hardware? I guess that distinction can be added later.
> For FW/HW actions we would go back to the problem of persistence of
> the setting since it was only implemented for params :S
>
> Is the dump option going to tie back into region snapshots?

Why is this going under iproute rather than using one of the existing sensor APIs?
For example, Intel NICs have thermal sensors, etc.
* Re: [RFC PATCH iproute2-next] System specification health API
From: Andrew Lunn @ 2018-09-16 19:57 UTC
To: Stephen Hemminger
Cc: Jakub Kicinski, Eran Ben Elisha, netdev, Jiri Pirko, Andy Gospodarek, Michael Chan, Simon Horman, Alexander Duyck, Florian Fainelli, Tal Alon, Ariel Almog

> Why is this going under iproute rather than using one of the existing sensor API's.
> For example Intel NIC's have thermal sensors etc.

Hi Stephen

These are not that sort of sensors. This is part of the naming problem
here. It is not really to do with health, it is about exceptions and
bugs. And the sensors are more like timeouts and watchdogs.

It is clear that the current names lead to a lot of confusion. Maybe:

health -> exception
sensor -> condition

Andrew
* Re: [RFC PATCH iproute2-next] System specification health API
From: Eran Ben Elisha @ 2018-09-25 12:17 UTC
To: Andrew Lunn, Stephen Hemminger
Cc: Jakub Kicinski, netdev, Jiri Pirko, Andy Gospodarek, Michael Chan, Simon Horman, Alexander Duyck, Florian Fainelli, Tal Alon, Ariel Almog

On 9/16/2018 10:57 PM, Andrew Lunn wrote:
>> Why is this going under iproute rather than using one of the existing sensor API's.
>> For example Intel NIC's have thermal sensors etc.
>
> Hi Stephen
>
> These are not that sort of sensors. This is part of the naming problem
> here. It is not really to do with health, it is about exceptions and
> bugs. And the sensors are more like timeouts and watchdogs.
>
> It is clear that the current names lead to a lot of confusion. Maybe:
>
> health -> exception
> sensor -> condition
>
> Andrew
>

I think those renamings can work well.
(Sorry for the late response, local holiday season...)

Eran