From: Ido Schimmel <idosch@idosch.org>
To: David Ahern <dsahern@gmail.com>
Cc: "Toke Høiland-Jørgensen" <toke@redhat.com>,
netdev@vger.kernel.org, davem@davemloft.net,
nhorman@tuxdriver.com, roopa@cumulusnetworks.com,
nikolay@cumulusnetworks.com, jakub.kicinski@netronome.com,
andy@greyhouse.net, f.fainelli@gmail.com, andrew@lunn.ch,
vivien.didelot@gmail.com, mlxsw@mellanox.com,
"Ido Schimmel" <idosch@mellanox.com>
Subject: Re: [RFC PATCH net-next 00/12] drop_monitor: Capture dropped packets and metadata
Date: Wed, 24 Jul 2019 10:57:00 +0300 [thread overview]
Message-ID: <20190724075700.GA15878@splinter> (raw)
In-Reply-To: <c02f9b6a-f343-89ee-1047-79c1fb4e3436@gmail.com>
On Tue, Jul 23, 2019 at 08:47:57AM -0700, David Ahern wrote:
> On 7/23/19 8:14 AM, Ido Schimmel wrote:
> > On Tue, Jul 23, 2019 at 02:17:49PM +0200, Toke Høiland-Jørgensen wrote:
> >> Ido Schimmel <idosch@idosch.org> writes:
> >>
> >>> On Mon, Jul 22, 2019 at 09:43:15PM +0200, Toke Høiland-Jørgensen wrote:
> >>>> Is there a mechanism for the user to filter the packets before they are
> >>>> sent to userspace? A bpf filter would be the obvious choice I guess...
> >>>
> >>> Hi Toke,
> >>>
> >>> Yes, it's on my TODO list to write an eBPF program that only lets
> >>> "unique" packets to be enqueued on the netlink socket. Where "unique" is
> >>> defined as {5-tuple, PC}. The rest of the copies will be counted in an
> >>> eBPF map, which is just a hash table keyed by {5-tuple, PC}.
> >>
> >> Yeah, that's a good idea. Or even something simpler like tcpdump-style
> >> filters for the packets returned by drop monitor (say if I'm just trying
> >> to figure out what happens to my HTTP requests).
> >
> > Yep, that's a good idea. I guess different users will use different
> > programs. Will look into both options.
>
> Perhaps I am missing something, but the dropmon code only allows a
> single user at the moment (in my attempts to run 2 instances the second
> one failed).
Yes, you're correct. By "different users" I meant users on different
systems with different needs. For example, someone trying to monitor
dropped packets on a laptop versus someone trying to do the same on a
ToR switch.
> If that part stays with the design
This stays the same.
> it afford better options for the design. e.g., attributes that control
> the enqueued packets when the event occurs as opposed to bpf filters
> which run much later when the message is enqueued to the socket.
I'm going to add an attribute that will control the number of packets
we're enqueuing on the per-CPU drop list. I'm not sure, but are you
suggesting to add even more attributes? If so, how do you imagine these
will look like?
>
> >
> >>> I think it would be good to have the program as part of the bcc
> >>> repository [1]. What do you think?
> >>
> >> Sure. We could also add it to the XDP tutorial[2]; it could go into a
> >> section on introspection and debugging (just added a TODO about that[3]).
> >
> > Great!
> >
> >>>> For integrating with XDP the trick would be to find a way to do it that
> >>>> doesn't incur any overhead when it's not enabled. Are you envisioning
> >>>> that this would be enabled separately for the different "modes" (kernel,
> >>>> hardware, XDP, etc)?
> >>>
> >>> Yes. Drop monitor have commands to enable and disable tracing, but they
> >>> don't carry any attributes at the moment. My plan is to add an attribute
> >>> (e.g., 'NET_DM_ATTR_DROP_TYPE') that will specify the type of drops
> >>> you're interested in - SW/HW/XDP. If the attribute is not specified,
> >>> then current behavior is maintained and all the drop types are traced.
> >>> But if you're only interested in SW drops, then overhead for the rest
> >>> should be zero.
> >>
> >> Makes sense (although "should be" is the key here ;)).
>
> static_key is used in other parts of the packet fast path.
>
> Toke/Jesper: Any reason to believe it is too much overhead for this path?
>
> >>
> >> I'm also worried about the drop monitor getting overwhelmed; if you turn
> >> it on for XDP and you're running a filtering program there, you'll
> >> suddenly get *a lot* of drops.
> >>
> >> As I read your patch, the current code can basically queue up an
> >> unbounded number of packets waiting to go out over netlink, can't it?
> >
> > That's a very good point. Each CPU holds a drop list. It probably makes
> > sense to limit it by default (to 1000?) and allow user to change it
> > later, if needed. I can expose a counter that shows how many packets
> > were dropped because of this limit. It can be used as an indication to
> > adjust the queue length (or flip to 'summary' mode).
> >
>
> And then with a single user limit, you can have an attribute that
> controls the backlog.
Yep, already on my list of changes for v1 :)
Thanks, David.
next prev parent reply other threads:[~2019-07-24 7:57 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-07-22 18:31 [RFC PATCH net-next 00/12] drop_monitor: Capture dropped packets and metadata Ido Schimmel
2019-07-22 18:31 ` [RFC PATCH net-next 01/12] drop_monitor: Use correct error code Ido Schimmel
2019-07-22 18:31 ` [RFC PATCH net-next 02/12] drop_monitor: Rename and document scope of mutex Ido Schimmel
2019-07-22 18:31 ` [RFC PATCH net-next 03/12] drop_monitor: Document scope of spinlock Ido Schimmel
2019-07-22 18:31 ` [RFC PATCH net-next 04/12] drop_monitor: Avoid multiple blank lines Ido Schimmel
2019-07-22 18:31 ` [RFC PATCH net-next 05/12] drop_monitor: Add extack support Ido Schimmel
2019-07-22 18:31 ` [RFC PATCH net-next 06/12] drop_monitor: Use pre_doit / post_doit hooks Ido Schimmel
2019-07-22 18:31 ` [RFC PATCH net-next 07/12] drop_monitor: Split tracing enable / disable to different functions Ido Schimmel
2019-07-22 18:31 ` [RFC PATCH net-next 08/12] drop_monitor: Initialize timer and work item upon tracing enable Ido Schimmel
2019-07-24 9:01 ` Jiri Pirko
2019-07-24 17:02 ` Ido Schimmel
2019-07-22 18:31 ` [RFC PATCH net-next 09/12] drop_monitor: Require CAP_NET_ADMIN for drop monitor configuration Ido Schimmel
2019-07-22 18:31 ` [RFC PATCH net-next 10/12] drop_monitor: Add packet alert mode Ido Schimmel
2019-07-23 12:43 ` Neil Horman
2019-07-23 14:16 ` Ido Schimmel
2019-07-23 15:14 ` Neil Horman
2019-07-24 7:10 ` Ido Schimmel
2019-07-24 12:53 ` Jiri Pirko
2019-07-24 16:57 ` Ido Schimmel
2019-07-29 9:52 ` [drop_monitor] 98ffbd6cd2: will-it-scale.per_thread_ops -17.5% regression kernel test robot
2019-08-05 11:56 ` Ido Schimmel
2019-07-22 18:31 ` [RFC PATCH net-next 11/12] drop_monitor: Allow truncation of dropped packets Ido Schimmel
[not found] ` <20190724125537.GC2225@nanopsycho>
2019-07-24 16:49 ` Ido Schimmel
2019-07-22 18:31 ` [RFC PATCH net-next 12/12] drop_monitor: Add a command to query current configuration Ido Schimmel
2019-07-22 19:43 ` [RFC PATCH net-next 00/12] drop_monitor: Capture dropped packets and metadata Toke Høiland-Jørgensen
2019-07-23 6:46 ` Ido Schimmel
2019-07-23 12:17 ` Toke Høiland-Jørgensen
2019-07-23 15:14 ` Ido Schimmel
2019-07-23 15:47 ` David Ahern
2019-07-24 7:57 ` Ido Schimmel [this message]
2019-07-23 16:08 ` Toke Høiland-Jørgensen
2019-07-24 8:10 ` Ido Schimmel
2019-07-24 9:51 ` Toke Høiland-Jørgensen
2019-07-24 12:58 ` Jiri Pirko
2019-07-24 16:48 ` Ido Schimmel
2019-07-24 22:48 ` David Miller
2019-07-24 15:15 ` Jiri Pirko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190724075700.GA15878@splinter \
--to=idosch@idosch.org \
--cc=andrew@lunn.ch \
--cc=andy@greyhouse.net \
--cc=davem@davemloft.net \
--cc=dsahern@gmail.com \
--cc=f.fainelli@gmail.com \
--cc=idosch@mellanox.com \
--cc=jakub.kicinski@netronome.com \
--cc=mlxsw@mellanox.com \
--cc=netdev@vger.kernel.org \
--cc=nhorman@tuxdriver.com \
--cc=nikolay@cumulusnetworks.com \
--cc=roopa@cumulusnetworks.com \
--cc=toke@redhat.com \
--cc=vivien.didelot@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).