public inbox for netdev@vger.kernel.org
From: Jiri Pirko <jiri@resnulli.us>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Eran Ben Elisha <eranbe@mellanox.com>,
	netdev@vger.kernel.org, Jiri Pirko <jiri@mellanox.com>,
	Michael Chan <michael.chan@broadcom.com>,
	"David S. Miller" <davem@davemloft.net>,
	Saeed Mahameed <saeedm@mellanox.com>
Subject: Re: [PATCH net-next 2/2] devlink: Add auto dump flag to health reporter
Date: Wed, 25 Mar 2020 20:08:21 +0100
Message-ID: <20200325190821.GE11304@nanopsycho.orion>
In-Reply-To: <20200325114529.3f4179c1@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>

Wed, Mar 25, 2020 at 07:45:29PM CET, kuba@kernel.org wrote:
>On Wed, 25 Mar 2020 15:26:24 +0200 Eran Ben Elisha wrote:
>> On low memory system, run time dumps can consume too much memory. Add
>> administrator ability to disable auto dumps per reporter as part of the
>> error flow handle routine.
>> 
>> This attribute is not relevant while executing
>> DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET.
>> 
>> By default, auto dump is activated for any reporter that has a dump method,
>> as part of the reporter registration to devlink.
>> 
>> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
>> Reviewed-by: Jiri Pirko <jiri@mellanox.com>
>> ---
>>  include/uapi/linux/devlink.h |  2 ++
>>  net/core/devlink.c           | 26 ++++++++++++++++++++++----
>>  2 files changed, 24 insertions(+), 4 deletions(-)
>> 
>> diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
>> index dfdffc42e87d..e7891d1d2ebd 100644
>> --- a/include/uapi/linux/devlink.h
>> +++ b/include/uapi/linux/devlink.h
>> @@ -429,6 +429,8 @@ enum devlink_attr {
>>  	DEVLINK_ATTR_NETNS_FD,			/* u32 */
>>  	DEVLINK_ATTR_NETNS_PID,			/* u32 */
>>  	DEVLINK_ATTR_NETNS_ID,			/* u32 */
>> +
>> +	DEVLINK_ATTR_HEALTH_REPORTER_AUTO_DUMP,	/* u8 */
>>  	/* add new attributes above here, update the policy in devlink.c */
>>  
>>  	__DEVLINK_ATTR_MAX,
>> diff --git a/net/core/devlink.c b/net/core/devlink.c
>> index ad69379747ef..e14bf3052289 100644
>> --- a/net/core/devlink.c
>> +++ b/net/core/devlink.c
>> @@ -4837,6 +4837,7 @@ struct devlink_health_reporter {
>>  	struct mutex dump_lock; /* lock parallel read/write from dump buffers */
>>  	u64 graceful_period;
>>  	bool auto_recover;
>> +	bool auto_dump;
>>  	u8 health_state;
>>  	u64 dump_ts;
>>  	u64 dump_real_ts;
>> @@ -4903,6 +4904,7 @@ devlink_health_reporter_create(struct devlink *devlink,
>>  	reporter->devlink = devlink;
>>  	reporter->graceful_period = graceful_period;
>>  	reporter->auto_recover = !!ops->recover;
>> +	reporter->auto_dump = !!ops->dump;
>>  	mutex_init(&reporter->dump_lock);
>>  	refcount_set(&reporter->refcount, 1);
>>  	list_add_tail(&reporter->list, &devlink->reporter_list);
>> @@ -4983,6 +4985,10 @@ devlink_nl_health_reporter_fill(struct sk_buff *msg,
>>  	    nla_put_u64_64bit(msg, DEVLINK_ATTR_HEALTH_REPORTER_DUMP_TS_NS,
>>  			      reporter->dump_real_ts, DEVLINK_ATTR_PAD))
>>  		goto reporter_nest_cancel;
>> +	if (reporter->ops->dump &&
>> +	    nla_put_u8(msg, DEVLINK_ATTR_HEALTH_REPORTER_AUTO_DUMP,
>> +		       reporter->auto_dump))
>> +		goto reporter_nest_cancel;
>
>Since you're making it a u8 - does it make sense to indicate to user

Please don't be mistaken. u8 carries a bool here.


>space whether the dump is disabled or not supported?

If you want to expose "not supported", I suggest doing it in another
attr, because this attr is here to carry the config from userspace. It
would be odd if the same enum also carried "not supported".

But anyway, since you opened this can of worms, the supported
capabilities should probably be passed in a separate bitfield covering
all features.
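
As an illustration of the separate-bitfield idea, here is a minimal
userspace sketch. It is not devlink code: the enum reporter_cap bits,
struct reporter_ops stand-in, reporter_caps() and auto_dump_state() are
all hypothetical names invented for this example. It only shows how a
capability mask kept apart from the config value lets a reader tell
"not supported" from "disabled".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits; these names do not exist in devlink. */
enum reporter_cap {
	REPORTER_CAP_RECOVER = 1u << 0,
	REPORTER_CAP_DUMP    = 1u << 1,
};

/* Userspace stand-in for the reporter ops: a NULL callback means the
 * driver does not implement that feature.
 */
struct reporter_ops {
	int (*recover)(void);
	int (*dump)(void);
};

/* Placeholder callback so an ops table can be populated in examples. */
static int noop_cb(void)
{
	return 0;
}

/* Derive the capability mask from the callbacks that are present. */
static uint32_t reporter_caps(const struct reporter_ops *ops)
{
	uint32_t caps = 0;

	assert(ops != NULL);
	if (ops->recover)
		caps |= REPORTER_CAP_RECOVER;
	if (ops->dump)
		caps |= REPORTER_CAP_DUMP;
	return caps;
}

/* What a reader can conclude once it has both pieces of information. */
enum auto_dump_state {
	AUTO_DUMP_UNSUPPORTED,
	AUTO_DUMP_DISABLED,
	AUTO_DUMP_ENABLED,
};

/* The config flag stays a plain bool; "supported" comes from the mask. */
static enum auto_dump_state auto_dump_state(uint32_t caps, bool auto_dump)
{
	if (!(caps & REPORTER_CAP_DUMP))
		return AUTO_DUMP_UNSUPPORTED;
	return auto_dump ? AUTO_DUMP_ENABLED : AUTO_DUMP_DISABLED;
}
```

With this split, the auto dump attr keeps carrying only a bool, and a
missing capability bit is what signals "not supported".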


>
>Right now no attribute means either old kernel or dump not possible..
>
>>  	nla_nest_end(msg, reporter_attr);
>>  	genlmsg_end(msg, hdr);
>> @@ -5129,10 +5135,12 @@ int devlink_health_report(struct devlink_health_reporter *reporter,
>>  
>>  	reporter->health_state = DEVLINK_HEALTH_REPORTER_STATE_ERROR;
>>  
>> -	mutex_lock(&reporter->dump_lock);
>> -	/* store current dump of current error, for later analysis */
>> -	devlink_health_do_dump(reporter, priv_ctx, NULL);
>> -	mutex_unlock(&reporter->dump_lock);
>> +	if (reporter->auto_dump) {
>> +		mutex_lock(&reporter->dump_lock);
>> +		/* store current dump of current error, for later analysis */
>> +		devlink_health_do_dump(reporter, priv_ctx, NULL);
>> +		mutex_unlock(&reporter->dump_lock);
>> +	}
>>  
>>  	if (reporter->auto_recover)
>>  		return devlink_health_reporter_recover(reporter,
>> @@ -5306,6 +5314,11 @@ devlink_nl_cmd_health_reporter_set_doit(struct sk_buff *skb,
>>  		err = -EOPNOTSUPP;
>>  		goto out;
>>  	}
>> +	if (!reporter->ops->dump &&
>> +	    info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_DUMP]) {
>
>... and then this behavior may have to change, I think?
>
>> +		err = -EOPNOTSUPP;
>> +		goto out;
>> +	}
>>  
>>  	if (info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD])
>>  		reporter->graceful_period =
>> @@ -5315,6 +5328,10 @@ devlink_nl_cmd_health_reporter_set_doit(struct sk_buff *skb,
>>  		reporter->auto_recover =
>>  			nla_get_u8(info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER]);
>>  
>> +	if (info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_DUMP])
>> +		reporter->auto_dump =
>> +		nla_get_u8(info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_DUMP]);
>> +
>>  	devlink_health_reporter_put(reporter);
>>  	return 0;
>>  out:
>> @@ -6053,6 +6070,7 @@ static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
>>  	[DEVLINK_ATTR_HEALTH_REPORTER_NAME] = { .type = NLA_NUL_STRING },
>>  	[DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD] = { .type = NLA_U64 },
>>  	[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER] = { .type = NLA_U8 },
>> +	[DEVLINK_ATTR_HEALTH_REPORTER_AUTO_DUMP] = { .type = NLA_U8 },
>
>I'd suggest we keep the attrs in order of definition, because we should
>set .strict_start_type, and then it matters which are before and which
>are after.
>
>Also please set max value of 1.
>
>>  	[DEVLINK_ATTR_FLASH_UPDATE_FILE_NAME] = { .type = NLA_NUL_STRING },
>>  	[DEVLINK_ATTR_FLASH_UPDATE_COMPONENT] = { .type = NLA_NUL_STRING },
>>  	[DEVLINK_ATTR_TRAP_NAME] = { .type = NLA_NUL_STRING },
>
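
The behavior discussed in the quoted patch can be modelled in plain
userspace C. This is only a sketch under stated assumptions: struct
reporter, reporter_init(), reporter_set_auto_dump() and
reporter_report() are hypothetical stand-ins, not the devlink
functions, and returning -ERANGE for a value above 1 is this sketch's
choice for modelling the "max value of 1" request.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a health reporter. */
struct reporter {
	bool has_dump;            /* models ops->dump != NULL */
	bool auto_dump;           /* the flag added by the patch */
	unsigned int dumps_taken; /* counts dumps stored on error */
};

/* Mirrors the patch default: auto dump is on iff a dump op exists. */
static void reporter_init(struct reporter *r, bool has_dump)
{
	assert(r != NULL);
	r->has_dump = has_dump;
	r->auto_dump = has_dump;
	r->dumps_taken = 0;
}

/* Models the set handler. attr == NULL means the attribute was not
 * present in the request, so nothing changes.
 */
static int reporter_set_auto_dump(struct reporter *r, const uint8_t *attr)
{
	assert(r != NULL);
	if (!attr)
		return 0;
	/* Sketch of the suggested range limit: the u8 carries a bool. */
	if (*attr > 1)
		return -ERANGE;
	/* As in the patch: reject the attribute when there is no dump op. */
	if (!r->has_dump)
		return -EOPNOTSUPP;
	r->auto_dump = (*attr != 0);
	return 0;
}

/* Models the error flow: a dump is stored only when auto_dump is set. */
static void reporter_report(struct reporter *r)
{
	assert(r != NULL);
	if (r->auto_dump)
		r->dumps_taken++;
}
```

Calling reporter_report() after disabling the flag leaves dumps_taken
unchanged, which is the memory-saving effect the commit message
describes for low-memory systems.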
