netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ido Schimmel <idosch@idosch.org>
To: Adrian Moreno <amorenoz@redhat.com>
Cc: netdev@vger.kernel.org, aconole@redhat.com, echaudro@redhat.com,
	horms@kernel.org, i.maximets@ovn.org,
	Yotam Gigi <yotam.gi@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH net-next 4/8] net: psample: add tracepoint
Date: Thu, 25 Apr 2024 18:25:00 +0300	[thread overview]
Message-ID: <Zip1zKzG5aF1ceom@shredder> (raw)
In-Reply-To: <542ed8dd-2d9c-4e4f-81dc-e2a9bdaac3b0@redhat.com>

On Thu, Apr 25, 2024 at 10:06:20AM +0200, Adrian Moreno wrote:
> 
> 
> On 4/25/24 09:18, Ido Schimmel wrote:
> > On Wed, Apr 24, 2024 at 03:50:51PM +0200, Adrian Moreno wrote:
> > > Currently there are no widely-available tools to dump the metadata and
> > > group information when a packet is sampled, making it difficult to
> > > troubleshoot related issues.
> > > 
> > > This makes psample use the event tracing framework to log the sampling
> > > of a packet so that it's easier to quickly identify the source
> > > (i.e: group) and context (i.e: metadata) of a packet being sampled.
> > > 
> > > This patch creates some checkpatch splats, but the style of the
> > > tracepoint definition mimics that of other modules so it seems
> > > acceptable.
> > 
> > I don't see a good reason to add this tracepoint (which we won't be able
> > to remove) when you can easily do that with bpftrace which by now should
> > be widely available:
> > 
> > #!/usr/bin/bpftrace
> > 
> > kfunc:psample_sample_packet
> > {
> >          $ts_us = nsecs() / 1000;
> >          $secs = $ts_us / 1000000;
> >          $us = $ts_us % 1000000;
> >          $group = args.group;
> >          $skb = args.skb;
> >          $md = args.md;
> > 
> >          printf("%-16s %-6d %6llu.%6llu group_num = %u refcount=%u seq=%u skbaddr=%p len=%u data_len=%u sample_rate=%u in_ifindex=%d out_ifindex=%d user_cookie=%rx\n",
> >                 comm, pid, $secs, $us, $group->group_num, $group->refcount, $group->seq,
> >                 $skb, $skb->len, $skb->data_len, args.sample_rate,
> >                 $md->in_ifindex, $md->out_ifindex,
> >                 buf($md->user_cookie, $md->user_cookie_len));
> > }
> > 
> > Example output:
> > 
> > mausezahn        984      3299.200626 group_num = 1 refcount=1 seq=13775 skbaddr=0xffffa21143fd4000 len=42 data_len=0 sample_rate=10 in_ifindex=0 out_ifindex=20 user_cookie=
> > \xde\xad\xbe\xef
> > mausezahn        984      3299.281424 group_num = 1 refcount=1 seq=13776 skbaddr=0xffffa21143fd4000 len=42 data_len=0 sample_rate=10 in_ifindex=0 out_ifindex=20 user_cookie=
> > \xde\xad\xbe\xef
> > 
> > Note that it prints the cookie itself unlike the tracepoint which only
> > prints the hashed pointer.
> > 
> 
> I agree that bpftrace can do the work relying on kfuncs/kprobes. But I guess
> that also true for many other tracepoints out there, right?

Maybe, but this particular tracepoint is not buried deep inside some
complex function with manipulated data being passed as arguments.
Instead, this tracepoint is placed at the very beginning of the function
and takes the function arguments as its own arguments. The tracepoint
can be easily replaced with fentry/kprobes like I've shown with the
example above.

> For development and labs bpftrace is perfectly fine, but using kfuncs and
> requiring recompilation is harder in production systems compared with using
> smaller CO-RE tools.

I used bpftrace because it is very easy to write, but I could have done
the same with libbpf. I have a bunch of such tools that I wrote over the
years that I compiled once on my laptop and which I copy to various
machines where I need them.

> If OVS starts using psample heavily and users need to troubleshoot or merely
> observe packets as they are sampled in a more efficient way, they are likely
> to use ebpf for that. I think making it a bit easier (as in, providing a
> sligthly more stable tracepoint) is worth considering.

I'm not saying that it's not worth considering, I'm simply saying that
it should be done after gathering operational experience with existing
mechanisms. It's possible you will conclude that this tracepoint is not
actually needed.

Also, there are some disadvantages in using tracepoints compared to
fentry:

https://github.com/Mellanox/mlxsw/commit/e996fd583eff1c43aacb9c79e55f5add12402d7d
https://lore.kernel.org/all/CAEf4BzbhvD_f=y3SDAiFqNvuErcnXt4fErMRSfanjYQg5=7GJg@mail.gmail.com/#t

Not saying that's the case here, but worth considering / being aware.

> Can you please expand on your concerns about the tracepoint? It's on the
> main internal function of the module so, even though the function name or
> its arguments might change, it doesn't seem probable that it'll disappear
> altogether. Why else would we want to remove the tracepoint?

It's not really concerns, but dissatisfaction. It's my impression (might
be wrong) that this series commits to adding new interfaces without
first seriously evaluating existing ones. This is true for this patch
and patch #2 that adds a new netlink command instead of using
SO_ATTACH_FILTER like existing applications are doing to achieve the
same goal.

I guess some will disagree, but wanted to voice my opinion nonetheless.

  reply	other threads:[~2024-04-25 15:25 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-24 13:50 [PATCH net-next 0/8] net: openvswitch: Add sample multicasting Adrian Moreno
2024-04-24 13:50 ` [PATCH net-next 1/8] net: netlink: export genl private pointer getters Adrian Moreno
2024-04-24 13:50 ` [PATCH net-next 2/8] net: psample: add multicast filtering on group_id Adrian Moreno
2024-04-24 14:54   ` Jiri Pirko
2024-04-25  7:23     ` Adrian Moreno
2024-04-24 13:50 ` [PATCH net-next 3/8] net: psample: add user cookie Adrian Moreno
2024-04-25  7:32   ` Ido Schimmel
2024-04-25  8:09     ` Adrian Moreno
2024-04-24 13:50 ` [PATCH net-next 4/8] net: psample: add tracepoint Adrian Moreno
2024-04-25  7:18   ` Ido Schimmel
2024-04-25  8:06     ` Adrian Moreno
2024-04-25 15:25       ` Ido Schimmel [this message]
2024-04-29  5:33         ` Adrian Moreno
2024-04-30 12:53           ` Ido Schimmel
2024-04-24 13:50 ` [PATCH net-next 5/8] net: sched: act_sample: add action cookie to sample Adrian Moreno
2024-04-25  7:39   ` Ido Schimmel
2024-04-25 21:43   ` Jamal Hadi Salim
2024-04-24 13:50 ` [PATCH net-next 6/8] net:openvswitch: add psample support Adrian Moreno
2024-04-30  7:29   ` Dan Carpenter
2024-05-03  9:43   ` Eelco Chaudron
2024-05-07 14:18     ` Adrian Moreno
2024-05-08  9:48       ` Eelco Chaudron
2024-05-08 15:25       ` Aaron Conole
2024-04-24 13:50 ` [PATCH net-next 7/8] selftests: openvswitch: add sample action Adrian Moreno
2024-04-24 13:50 ` [PATCH net-next 8/8] selftests: openvswitch: add psample test Adrian Moreno

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zip1zKzG5aF1ceom@shredder \
    --to=idosch@idosch.org \
    --cc=aconole@redhat.com \
    --cc=amorenoz@redhat.com \
    --cc=davem@davemloft.net \
    --cc=echaudro@redhat.com \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=i.maximets@ovn.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=yotam.gi@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).