From: Tobias Waldekranz <tobias@waldekranz.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: davem@davemloft.net, kuba@kernel.org, roopa@nvidia.com,
razor@blackwall.org, bridge@lists.linux.dev,
netdev@vger.kernel.org, jiri@resnulli.us, ivecera@redhat.com,
mhiramat@kernel.org, linux-trace-kernel@vger.kernel.org
Subject: Re: [PATCH v3 net-next 4/4] net: switchdev: Add tracepoints
Date: Mon, 26 Feb 2024 14:05:20 +0100 [thread overview]
Message-ID: <87cysjgyun.fsf@waldekranz.com> (raw)
In-Reply-To: <20240223103815.35fdf430@gandalf.local.home>
On fre, feb 23, 2024 at 10:38, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Fri, 23 Feb 2024 12:44:53 +0100
> Tobias Waldekranz <tobias@waldekranz.com> wrote:
>
>> Add a basic set of tracepoints:
>>
>> - switchdev_defer: Fires whenever an operation is enqueued to the
>> switchdev workqueue for deferred delivery.
>>
>> - switchdev_call_{atomic,blocking}: Fires whenever a notification is
>> sent to the corresponding switchdev notifier chain.
>>
>> - switchdev_call_replay: Fires whenever a notification is sent to a
>> specific driver's notifier block, in response to a replay request.
>>
>> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
>> ---
>> include/trace/events/switchdev.h | 74 ++++++++++++++++++++++++++++++++
>> net/switchdev/switchdev.c | 71 +++++++++++++++++++++++++-----
>> 2 files changed, 135 insertions(+), 10 deletions(-)
>> create mode 100644 include/trace/events/switchdev.h
>>
>> diff --git a/include/trace/events/switchdev.h b/include/trace/events/switchdev.h
>> new file mode 100644
>> index 000000000000..dcaf6870d017
>> --- /dev/null
>> +++ b/include/trace/events/switchdev.h
>> @@ -0,0 +1,74 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#undef TRACE_SYSTEM
>> +#define TRACE_SYSTEM switchdev
>> +
>> +#if !defined(_TRACE_SWITCHDEV_H) || defined(TRACE_HEADER_MULTI_READ)
>> +#define _TRACE_SWITCHDEV_H
>> +
>> +#include <linux/tracepoint.h>
>> +#include <net/switchdev.h>
>> +
>> +#define SWITCHDEV_TRACE_MSG_MAX 128
>
> 128 bytes is awfully big to waste on the ring buffer. What's the average
> size of a string?
I would say the typical message is around 60-80 bytes. Some common examples:
PORT_OBJ_ADD PORT_VLAN(flags 0x0 orig br0) vid 1 flags 0x27
PORT_OBJ_DEL HOST_MDB(flags 0x0 orig br0) vid 100 addr 33:33:ff:ff:00:09
The worst case I can think of currently is 95 characters:
VXLAN_FDB_ADD_TO_DEVICE vid 1000 addr de:ad:be:ef:00:01 added_by_user is_local locked offloaded
>> +
>> +DECLARE_EVENT_CLASS(switchdev_call,
>> + TP_PROTO(unsigned long val,
>> + const struct switchdev_notifier_info *info,
>> + int err),
>> +
>> + TP_ARGS(val, info, err),
>> +
>> + TP_STRUCT__entry(
>> + __field(unsigned long, val)
>> + __string(dev, info->dev ? netdev_name(info->dev) : "(null)")
>> + __field(const struct switchdev_notifier_info *, info)
>> + __field(int, err)
>> + __array(char, msg, SWITCHDEV_TRACE_MSG_MAX)
>> + ),
>> +
>> + TP_fast_assign(
>> + __entry->val = val;
>> + __assign_str(dev, info->dev ? netdev_name(info->dev) : "(null)");
>> + __entry->info = info;
>> + __entry->err = err;
>> + switchdev_notifier_str(val, info, __entry->msg, SWITCHDEV_TRACE_MSG_MAX);
>
> Is it possible to just store the information in the trace event and then
> call the above function in the read stage? There's helpers to pass strings
> around (namely a struct trace_seq *p).
I'm a complete novice when it comes to tracepoint internals. Am I right
in assuming that TP_printk may execute at a much later time than
TP_fast_assign? E.g., the object referenced by __entry->info may very
well have been freed by that time? That at least seems to be what my
naive refactor to replace __entry->msg with __get_buf() suggests :)
If so, the layout of the switchdev_notifier_* structs makes it a bit
cumbersome to clone the notification in the assign phase, as the size of
a specific notification (e.g., switchdev_notifier_port_obj_info) is not
known by the common notification (switchdev_notifier_info). In the case
of switchdev objects, the problem repeats a second time. E.g., the size
of switchdev_obj_port_vlan is not known by switchdev_obj.
> It would require a plugin for libtraceevent if you want to expose it to
> user space tools for trace-cmd and perf though.
>
> Another possibility is if this event will not race with other events on he
> same CPU, you could create a per-cpu buffer, write into that, and then use
> __string() and __assign_str() to save it, as traces happen with preemption
> disabled.
But bottom halves are still enabled I suppose? Notifications can be
generated both from process context (e.g., users configuring the bridge
with iproute2), and from bridge packet processing (e.g., adding new
neighbors to the FDB). So I don't think that would work in this case.
> -- Steve
Thanks for taking the time!
>> + ),
>> +
>> + TP_printk("dev %s %s -> %d", __get_str(dev), __entry->msg, __entry->err)
>> +);
>> +
>> +DEFINE_EVENT(switchdev_call, switchdev_defer,
>> + TP_PROTO(unsigned long val,
>> + const struct switchdev_notifier_info *info,
>> + int err),
>> +
>> + TP_ARGS(val, info, err)
>> +);
>> +
>> +DEFINE_EVENT(switchdev_call, switchdev_call_atomic,
>> + TP_PROTO(unsigned long val,
>> + const struct switchdev_notifier_info *info,
>> + int err),
>> +
>> + TP_ARGS(val, info, err)
>> +);
>> +
>> +DEFINE_EVENT(switchdev_call, switchdev_call_blocking,
>> + TP_PROTO(unsigned long val,
>> + const struct switchdev_notifier_info *info,
>> + int err),
>> +
>> + TP_ARGS(val, info, err)
>> +);
>> +
>> +DEFINE_EVENT(switchdev_call, switchdev_call_replay,
>> + TP_PROTO(unsigned long val,
>> + const struct switchdev_notifier_info *info,
>> + int err),
>> +
>> + TP_ARGS(val, info, err)
>> +);
>> +
>> +#endif /* _TRACE_SWITCHDEV_H */
>> +
next prev parent reply other threads:[~2024-02-26 13:05 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-23 11:44 [PATCH v3 net-next 0/4] net: switchdev: Tracepoints Tobias Waldekranz
2024-02-23 11:44 ` [PATCH v3 net-next 1/4] net: switchdev: Wrap enums in mapper macros Tobias Waldekranz
2024-02-23 11:44 ` [PATCH v3 net-next 2/4] net: switchdev: Add helpers to display switchdev objects as strings Tobias Waldekranz
2024-02-23 11:44 ` [PATCH v3 net-next 3/4] net: switchdev: Relay all replay messages through a central function Tobias Waldekranz
2024-02-23 11:44 ` [PATCH v3 net-next 4/4] net: switchdev: Add tracepoints Tobias Waldekranz
2024-02-23 15:38 ` Steven Rostedt
2024-02-26 13:05 ` Tobias Waldekranz [this message]
2024-02-27 10:04 ` Paolo Abeni
2024-02-28 10:47 ` Tobias Waldekranz
2024-02-28 14:56 ` Steven Rostedt
2024-03-04 22:40 ` Tobias Waldekranz
2024-03-06 15:15 ` Steven Rostedt
2024-03-06 20:02 ` Tobias Waldekranz
2024-03-06 21:46 ` Steven Rostedt
2024-03-06 22:45 ` Tobias Waldekranz
2024-03-06 23:33 ` Steven Rostedt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87cysjgyun.fsf@waldekranz.com \
--to=tobias@waldekranz.com \
--cc=bridge@lists.linux.dev \
--cc=davem@davemloft.net \
--cc=ivecera@redhat.com \
--cc=jiri@resnulli.us \
--cc=kuba@kernel.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=mhiramat@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=razor@blackwall.org \
--cc=roopa@nvidia.com \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).