All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: brouer@redhat.com, netdev@vger.kernel.org,
	Stephen Hemminger <sthemmin@microsoft.com>,
	Daniel Borkmann <borkmann@iogearbox.net>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH net-next] net: add network device notifier trace points
Date: Wed, 19 Dec 2018 08:36:43 +0100	[thread overview]
Message-ID: <20181219083643.7f724e59@redhat.com> (raw)
In-Reply-To: <20181219022706.10611-1-sthemmin@microsoft.com>

On Tue, 18 Dec 2018 18:27:06 -0800
Stephen Hemminger <stephen@networkplumber.org> wrote:

> This is the result of a conversation about monitoring of link
> state changes with BPF.

If you want to use this from BPF then you are in for a surprise.  As
tracepoints BPF cannot read these "__string" constructs, here the netdev
name.  I tried a lot of different tricks that didn't work, see [1],
until Alexei explained that it simply isn't supported.

I instead recommend adding the ifindex to the tracepoint.  The__string
and __assign_str is also a performance concern as it does strcpy behind
your back.

I have an year old TODO list item about improving this:
 ** TODO Make perf-script plugin for ifindex to name translation
    SCHEDULED: <2017-11-20 Mon>

Today, the existing network tracepoints using dev->name is not that
usable by BPF, as BPF cannot identify the interface.  Thus IMHO it would
make sense to convert the existing network tracepoints dev->name into
dev->ifindex, and then let perf-script convert this to the interface
name.  Either in userspace via if_indextoname(3), or (as ACME pointed
out at the time) we might want to have a lookup table stored together
with perf.data for later inspection (in-case ifindexes changed).


[1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/samples/bpf/napi_monitor_kern.c#L34-L130

> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
>  include/trace/events/net.h | 115 +++++++++++++++++++++++++++++++++++++
>  net/core/dev.c             |   9 ++-
>  2 files changed, 123 insertions(+), 1 deletion(-)
> 
> diff --git a/include/trace/events/net.h b/include/trace/events/net.h
> index 1efd7d9b25fe..141310d24610 100644
> --- a/include/trace/events/net.h
> +++ b/include/trace/events/net.h
[...]
> +TRACE_EVENT(net_dev_notifier_entry,
> +
> +	TP_PROTO(const struct netdev_notifier_info *info, unsigned long val),
> +
> +	TP_ARGS(info, val),
> +
> +	TP_STRUCT__entry(
> +		__string(	name,		 info->dev->name )
> +		__field(	enum netdev_cmd, event	         )
> +	),
> +
> +	TP_fast_assign(
> +		__assign_str(name, info->dev->name);
> +		__entry->event = val;
> +       ),

These __string and __assign_str are costly and behind the scenes does a
strcpy.

> +
> +	TP_printk("dev=%s event=%s",
> +		  __get_str(name), netdev_event_type(__entry->event))
> +);
> +
> +TRACE_EVENT(net_dev_notifier,
> +
> +	TP_PROTO(const struct netdev_notifier_info *info, int rc, unsigned long val),
> +
> +	TP_ARGS(info, rc, val),
> +
> +	TP_STRUCT__entry(
> +		__string(	name,		 info->dev->name   )
> +		__field(	enum netdev_cmd, event	           )
> +		__field(	int,		 rc		   )
> +	),
> +
> +	TP_fast_assign(
> +		__assign_str(name, info->dev->name);
> +		__entry->event = val;
> +		__entry->rc = rc;
> +       ),
> +
> +	TP_printk("dev=%s event=%s ret=%d",
> +		  __get_str(name), netdev_event_type(__entry->event),
> +		  __entry->rc)
> +);
> +

The bare minimum change is to _also_ add (info->dev->)ifindex to the
tracepoint, as this makes it usable from BPF.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

  parent reply	other threads:[~2018-12-19  7:36 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-19  2:27 [PATCH net-next] net: add network device notifier trace points Stephen Hemminger
2018-12-19  3:38 ` David Ahern
2018-12-19  7:36 ` Jesper Dangaard Brouer [this message]
2018-12-19 15:43   ` Stephen Hemminger
2018-12-19 15:46   ` Daniel Borkmann
2018-12-19 16:40     ` Jesper Dangaard Brouer
2018-12-19 16:55       ` Steven Rostedt
2018-12-19 17:07       ` Daniel Borkmann
2018-12-19 15:51   ` Steven Rostedt
2018-12-19 16:48   ` David Ahern

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181219083643.7f724e59@redhat.com \
    --to=brouer@redhat.com \
    --cc=acme@redhat.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=borkmann@iogearbox.net \
    --cc=netdev@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=stephen@networkplumber.org \
    --cc=sthemmin@microsoft.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.