All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tobias Waldekranz <tobias@waldekranz.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: davem@davemloft.net, kuba@kernel.org, roopa@nvidia.com,
	razor@blackwall.org, bridge@lists.linux.dev,
	netdev@vger.kernel.org, jiri@resnulli.us, ivecera@redhat.com,
	mhiramat@kernel.org, linux-trace-kernel@vger.kernel.org
Subject: Re: [PATCH v3 net-next 4/4] net: switchdev: Add tracepoints
Date: Mon, 26 Feb 2024 14:05:20 +0100	[thread overview]
Message-ID: <87cysjgyun.fsf@waldekranz.com> (raw)
In-Reply-To: <20240223103815.35fdf430@gandalf.local.home>

On fre, feb 23, 2024 at 10:38, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Fri, 23 Feb 2024 12:44:53 +0100
> Tobias Waldekranz <tobias@waldekranz.com> wrote:
>
>> Add a basic set of tracepoints:
>> 
>> - switchdev_defer: Fires whenever an operation is enqueued to the
>>   switchdev workqueue for deferred delivery.
>> 
>> - switchdev_call_{atomic,blocking}: Fires whenever a notification is
>>   sent to the corresponding switchdev notifier chain.
>> 
>> - switchdev_call_replay: Fires whenever a notification is sent to a
>>   specific driver's notifier block, in response to a replay request.
>> 
>> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
>> ---
>>  include/trace/events/switchdev.h | 74 ++++++++++++++++++++++++++++++++
>>  net/switchdev/switchdev.c        | 71 +++++++++++++++++++++++++-----
>>  2 files changed, 135 insertions(+), 10 deletions(-)
>>  create mode 100644 include/trace/events/switchdev.h
>> 
>> diff --git a/include/trace/events/switchdev.h b/include/trace/events/switchdev.h
>> new file mode 100644
>> index 000000000000..dcaf6870d017
>> --- /dev/null
>> +++ b/include/trace/events/switchdev.h
>> @@ -0,0 +1,74 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#undef TRACE_SYSTEM
>> +#define TRACE_SYSTEM	switchdev
>> +
>> +#if !defined(_TRACE_SWITCHDEV_H) || defined(TRACE_HEADER_MULTI_READ)
>> +#define _TRACE_SWITCHDEV_H
>> +
>> +#include <linux/tracepoint.h>
>> +#include <net/switchdev.h>
>> +
>> +#define SWITCHDEV_TRACE_MSG_MAX 128
>
> 128 bytes is awfully big to waste on the ring buffer. What's the average
> size of a string?

I would say the typical message is around 60-80 bytes. Some common examples:

    PORT_OBJ_ADD PORT_VLAN(flags 0x0 orig br0) vid 1 flags 0x27
    PORT_OBJ_DEL HOST_MDB(flags 0x0 orig br0) vid 100 addr 33:33:ff:ff:00:09

The worst case I can think of currently is 95 characters:

    VXLAN_FDB_ADD_TO_DEVICE vid 1000 addr de:ad:be:ef:00:01 added_by_user is_local locked offloaded

>> +
>> +DECLARE_EVENT_CLASS(switchdev_call,
>> +	TP_PROTO(unsigned long val,
>> +		 const struct switchdev_notifier_info *info,
>> +		 int err),
>> +
>> +	TP_ARGS(val, info, err),
>> +
>> +	TP_STRUCT__entry(
>> +		__field(unsigned long, val)
>> +		__string(dev, info->dev ? netdev_name(info->dev) : "(null)")
>> +		__field(const struct switchdev_notifier_info *, info)
>> +		__field(int, err)
>> +		__array(char, msg, SWITCHDEV_TRACE_MSG_MAX)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__entry->val = val;
>> +		__assign_str(dev, info->dev ? netdev_name(info->dev) : "(null)");
>> +		__entry->info = info;
>> +		__entry->err = err;
>> +		switchdev_notifier_str(val, info, __entry->msg, SWITCHDEV_TRACE_MSG_MAX);
>
> Is it possible to just store the information in the trace event and then
> call the above function in the read stage? There's helpers to pass strings
> around (namely a struct trace_seq *p).

I'm a complete novice when it comes to tracepoint internals. Am I right
in assuming that TP_printk may execute at a much later time than
TP_fast_assign? E.g., the object referenced by __entry->info may very
well have been freed by that time? That at least seems to be what my
naive refactor to replace __entry->msg with __get_buf() suggests :)

If so, the layout of the switchdev_notifier_* structs makes it a bit
cumbersome to clone the notification in the assign phase, as the size of
a specific notification (e.g., switchdev_notifier_port_obj_info) is not
known by the common notification (switchdev_notifier_info). In the case
of switchdev objects, the problem repeats a second time. E.g., the size
of switchdev_obj_port_vlan is not known by switchdev_obj.

> It would require a plugin for libtraceevent if you want to expose it to
> user space tools for trace-cmd and perf though.
>
> Another possibility is if this event will not race with other events on he
> same CPU, you could create a per-cpu buffer, write into that, and then use
> __string() and __assign_str() to save it, as traces happen with preemption
> disabled.

But bottom halves are still enabled I suppose? Notifications can be
generated both from process context (e.g., users configuring the bridge
with iproute2), and from bridge packet processing (e.g., adding new
neighbors to the FDB). So I don't think that would work in this case.

> -- Steve

Thanks for taking the time!

>> +	),
>> +
>> +	TP_printk("dev %s %s -> %d", __get_str(dev), __entry->msg, __entry->err)
>> +);
>> +
>> +DEFINE_EVENT(switchdev_call, switchdev_defer,
>> +	TP_PROTO(unsigned long val,
>> +		 const struct switchdev_notifier_info *info,
>> +		 int err),
>> +
>> +	TP_ARGS(val, info, err)
>> +);
>> +
>> +DEFINE_EVENT(switchdev_call, switchdev_call_atomic,
>> +	TP_PROTO(unsigned long val,
>> +		 const struct switchdev_notifier_info *info,
>> +		 int err),
>> +
>> +	TP_ARGS(val, info, err)
>> +);
>> +
>> +DEFINE_EVENT(switchdev_call, switchdev_call_blocking,
>> +	TP_PROTO(unsigned long val,
>> +		 const struct switchdev_notifier_info *info,
>> +		 int err),
>> +
>> +	TP_ARGS(val, info, err)
>> +);
>> +
>> +DEFINE_EVENT(switchdev_call, switchdev_call_replay,
>> +	TP_PROTO(unsigned long val,
>> +		 const struct switchdev_notifier_info *info,
>> +		 int err),
>> +
>> +	TP_ARGS(val, info, err)
>> +);
>> +
>> +#endif /* _TRACE_SWITCHDEV_H */
>> +

  reply	other threads:[~2024-02-26 13:05 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-23 11:44 [PATCH v3 net-next 0/4] net: switchdev: Tracepoints Tobias Waldekranz
2024-02-23 11:44 ` [PATCH v3 net-next 1/4] net: switchdev: Wrap enums in mapper macros Tobias Waldekranz
2024-02-23 11:44 ` [PATCH v3 net-next 2/4] net: switchdev: Add helpers to display switchdev objects as strings Tobias Waldekranz
2024-02-23 11:44 ` [PATCH v3 net-next 3/4] net: switchdev: Relay all replay messages through a central function Tobias Waldekranz
2024-02-23 11:44 ` [PATCH v3 net-next 4/4] net: switchdev: Add tracepoints Tobias Waldekranz
2024-02-23 15:38   ` Steven Rostedt
2024-02-26 13:05     ` Tobias Waldekranz [this message]
2024-02-27 10:04     ` Paolo Abeni
2024-02-28 10:47       ` Tobias Waldekranz
2024-02-28 14:56         ` Steven Rostedt
2024-03-04 22:40           ` Tobias Waldekranz
2024-03-06 15:15             ` Steven Rostedt
2024-03-06 20:02               ` Tobias Waldekranz
2024-03-06 21:46                 ` Steven Rostedt
2024-03-06 22:45                   ` Tobias Waldekranz
2024-03-06 23:33                     ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87cysjgyun.fsf@waldekranz.com \
    --to=tobias@waldekranz.com \
    --cc=bridge@lists.linux.dev \
    --cc=davem@davemloft.net \
    --cc=ivecera@redhat.com \
    --cc=jiri@resnulli.us \
    --cc=kuba@kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=razor@blackwall.org \
    --cc=roopa@nvidia.com \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.