public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 0/2] DSA trace events
@ 2023-04-07 14:14 Vladimir Oltean
  2023-04-07 14:14 ` [PATCH net-next 1/2] net: dsa: add trace points for FDB/MDB operations Vladimir Oltean
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Vladimir Oltean @ 2023-04-07 14:14 UTC (permalink / raw)
  To: netdev
  Cc: Jakub Kicinski, David S. Miller, Eric Dumazet, Paolo Abeni,
	Andrew Lunn, Florian Fainelli

This series introduces the "dsa" trace event class, with the following
events:

$ trace-cmd list | grep dsa
dsa
dsa:dsa_fdb_add_hw
dsa:dsa_mdb_add_hw
dsa:dsa_fdb_del_hw
dsa:dsa_mdb_del_hw
dsa:dsa_fdb_add_bump
dsa:dsa_mdb_add_bump
dsa:dsa_fdb_del_drop
dsa:dsa_mdb_del_drop
dsa:dsa_fdb_del_not_found
dsa:dsa_mdb_del_not_found
dsa:dsa_lag_fdb_add_hw
dsa:dsa_lag_fdb_add_bump
dsa:dsa_lag_fdb_del_hw
dsa:dsa_lag_fdb_del_drop
dsa:dsa_lag_fdb_del_not_found
dsa:dsa_vlan_add_hw
dsa:dsa_vlan_del_hw
dsa:dsa_vlan_add_bump
dsa:dsa_vlan_del_drop
dsa:dsa_vlan_del_not_found

These are useful to debug refcounting issues on CPU and DSA ports, where
entries may remain lingering, or may be removed too soon, depending on
bugs in higher layers of the network stack.

Vladimir Oltean (2):
  net: dsa: add trace points for FDB/MDB operations
  net: dsa: add trace points for VLAN operations

 net/dsa/Makefile |   6 +-
 net/dsa/switch.c |  85 +++++++--
 net/dsa/trace.c  |  39 +++++
 net/dsa/trace.h  | 447 +++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 560 insertions(+), 17 deletions(-)
 create mode 100644 net/dsa/trace.c
 create mode 100644 net/dsa/trace.h

-- 
2.34.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH net-next 1/2] net: dsa: add trace points for FDB/MDB operations
  2023-04-07 14:14 [PATCH net-next 0/2] DSA trace events Vladimir Oltean
@ 2023-04-07 14:14 ` Vladimir Oltean
  2023-04-07 14:14 ` [PATCH net-next 2/2] net: dsa: add trace points for VLAN operations Vladimir Oltean
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: Vladimir Oltean @ 2023-04-07 14:14 UTC (permalink / raw)
  To: netdev
  Cc: Jakub Kicinski, David S. Miller, Eric Dumazet, Paolo Abeni,
	Andrew Lunn, Florian Fainelli

DSA performs non-trivial housekeeping of unicast and multicast addresses
on shared (CPU and DSA) ports, and puts a bit of pressure on higher
layers, requiring them to behave correctly (remove these addresses
exactly as many times as they were added). Otherwise, either addresses
linger around forever, or DSA returns -ENOENT complaining that entries
that were already deleted must be deleted again.

To aid debugging, introduce some trace points specifically for FDB and
MDB - that's where some of the bugs still are right now.

Some bugs I have seen were also due to race conditions, see:
- 630fd4822af2 ("net: dsa: flush switchdev workqueue on bridge join error path")
- a2614140dc0f ("net: dsa: mv88e6xxx: flush switchdev FDB workqueue before removing VLAN")

so it would be good to not disturb the timing too much, hence the choice
to use trace points vs regular dev_dbg().

I've had these for some time on my computer in a less polished form, and
they've proven useful. What I found most useful was to enable
CONFIG_BOOTTIME_TRACING, add "trace_event=dsa" to the kernel cmdline,
and run "cat /sys/kernel/debug/tracing/trace". This is to debug more
complex environments with network managers started by the init system,
things like that.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/dsa/Makefile |   6 +-
 net/dsa/switch.c |  61 +++++++--
 net/dsa/trace.c  |  39 ++++++
 net/dsa/trace.h  | 329 +++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 423 insertions(+), 12 deletions(-)
 create mode 100644 net/dsa/trace.c
 create mode 100644 net/dsa/trace.h

diff --git a/net/dsa/Makefile b/net/dsa/Makefile
index cc7e93a562fe..281907e53632 100644
--- a/net/dsa/Makefile
+++ b/net/dsa/Makefile
@@ -10,7 +10,8 @@ dsa_core-y += \
 	slave.o \
 	switch.o \
 	tag.o \
-	tag_8021q.o
+	tag_8021q.o \
+	trace.o
 
 # tagging formats
 obj-$(CONFIG_NET_DSA_TAG_AR9331) += tag_ar9331.o
@@ -31,3 +32,6 @@ obj-$(CONFIG_NET_DSA_TAG_RZN1_A5PSW) += tag_rzn1_a5psw.o
 obj-$(CONFIG_NET_DSA_TAG_SJA1105) += tag_sja1105.o
 obj-$(CONFIG_NET_DSA_TAG_TRAILER) += tag_trailer.o
 obj-$(CONFIG_NET_DSA_TAG_XRS700X) += tag_xrs700x.o
+
+# for tracing framework to find trace.h
+CFLAGS_trace.o := -I$(src)
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index d5bc4bb7310d..ff1b5d980e37 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -18,6 +18,7 @@
 #include "slave.h"
 #include "switch.h"
 #include "tag_8021q.h"
+#include "trace.h"
 
 static unsigned int dsa_switch_fastest_ageing_time(struct dsa_switch *ds,
 						   unsigned int ageing_time)
@@ -164,14 +165,20 @@ static int dsa_port_do_mdb_add(struct dsa_port *dp,
 	int err = 0;
 
 	/* No need to bother with refcounting for user ports */
-	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp)))
-		return ds->ops->port_mdb_add(ds, port, mdb, db);
+	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp))) {
+		err = ds->ops->port_mdb_add(ds, port, mdb, db);
+		trace_dsa_mdb_add_hw(dp, mdb->addr, mdb->vid, &db, err);
+
+		return err;
+	}
 
 	mutex_lock(&dp->addr_lists_lock);
 
 	a = dsa_mac_addr_find(&dp->mdbs, mdb->addr, mdb->vid, db);
 	if (a) {
 		refcount_inc(&a->refcount);
+		trace_dsa_mdb_add_bump(dp, mdb->addr, mdb->vid, &db,
+				       &a->refcount);
 		goto out;
 	}
 
@@ -182,6 +189,7 @@ static int dsa_port_do_mdb_add(struct dsa_port *dp,
 	}
 
 	err = ds->ops->port_mdb_add(ds, port, mdb, db);
+	trace_dsa_mdb_add_hw(dp, mdb->addr, mdb->vid, &db, err);
 	if (err) {
 		kfree(a);
 		goto out;
@@ -209,21 +217,30 @@ static int dsa_port_do_mdb_del(struct dsa_port *dp,
 	int err = 0;
 
 	/* No need to bother with refcounting for user ports */
-	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp)))
-		return ds->ops->port_mdb_del(ds, port, mdb, db);
+	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp))) {
+		err = ds->ops->port_mdb_del(ds, port, mdb, db);
+		trace_dsa_mdb_del_hw(dp, mdb->addr, mdb->vid, &db, err);
+
+		return err;
+	}
 
 	mutex_lock(&dp->addr_lists_lock);
 
 	a = dsa_mac_addr_find(&dp->mdbs, mdb->addr, mdb->vid, db);
 	if (!a) {
+		trace_dsa_mdb_del_not_found(dp, mdb->addr, mdb->vid, &db);
 		err = -ENOENT;
 		goto out;
 	}
 
-	if (!refcount_dec_and_test(&a->refcount))
+	if (!refcount_dec_and_test(&a->refcount)) {
+		trace_dsa_mdb_del_drop(dp, mdb->addr, mdb->vid, &db,
+				       &a->refcount);
 		goto out;
+	}
 
 	err = ds->ops->port_mdb_del(ds, port, mdb, db);
+	trace_dsa_mdb_del_hw(dp, mdb->addr, mdb->vid, &db, err);
 	if (err) {
 		refcount_set(&a->refcount, 1);
 		goto out;
@@ -247,14 +264,19 @@ static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
 	int err = 0;
 
 	/* No need to bother with refcounting for user ports */
-	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp)))
-		return ds->ops->port_fdb_add(ds, port, addr, vid, db);
+	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp))) {
+		err = ds->ops->port_fdb_add(ds, port, addr, vid, db);
+		trace_dsa_fdb_add_hw(dp, addr, vid, &db, err);
+
+		return err;
+	}
 
 	mutex_lock(&dp->addr_lists_lock);
 
 	a = dsa_mac_addr_find(&dp->fdbs, addr, vid, db);
 	if (a) {
 		refcount_inc(&a->refcount);
+		trace_dsa_fdb_add_bump(dp, addr, vid, &db, &a->refcount);
 		goto out;
 	}
 
@@ -265,6 +287,7 @@ static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
 	}
 
 	err = ds->ops->port_fdb_add(ds, port, addr, vid, db);
+	trace_dsa_fdb_add_hw(dp, addr, vid, &db, err);
 	if (err) {
 		kfree(a);
 		goto out;
@@ -291,21 +314,29 @@ static int dsa_port_do_fdb_del(struct dsa_port *dp, const unsigned char *addr,
 	int err = 0;
 
 	/* No need to bother with refcounting for user ports */
-	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp)))
-		return ds->ops->port_fdb_del(ds, port, addr, vid, db);
+	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp))) {
+		err = ds->ops->port_fdb_del(ds, port, addr, vid, db);
+		trace_dsa_fdb_del_hw(dp, addr, vid, &db, err);
+
+		return err;
+	}
 
 	mutex_lock(&dp->addr_lists_lock);
 
 	a = dsa_mac_addr_find(&dp->fdbs, addr, vid, db);
 	if (!a) {
+		trace_dsa_fdb_del_not_found(dp, addr, vid, &db);
 		err = -ENOENT;
 		goto out;
 	}
 
-	if (!refcount_dec_and_test(&a->refcount))
+	if (!refcount_dec_and_test(&a->refcount)) {
+		trace_dsa_fdb_del_drop(dp, addr, vid, &db, &a->refcount);
 		goto out;
+	}
 
 	err = ds->ops->port_fdb_del(ds, port, addr, vid, db);
+	trace_dsa_fdb_del_hw(dp, addr, vid, &db, err);
 	if (err) {
 		refcount_set(&a->refcount, 1);
 		goto out;
@@ -332,6 +363,8 @@ static int dsa_switch_do_lag_fdb_add(struct dsa_switch *ds, struct dsa_lag *lag,
 	a = dsa_mac_addr_find(&lag->fdbs, addr, vid, db);
 	if (a) {
 		refcount_inc(&a->refcount);
+		trace_dsa_lag_fdb_add_bump(lag->dev, addr, vid, &db,
+					   &a->refcount);
 		goto out;
 	}
 
@@ -342,6 +375,7 @@ static int dsa_switch_do_lag_fdb_add(struct dsa_switch *ds, struct dsa_lag *lag,
 	}
 
 	err = ds->ops->lag_fdb_add(ds, *lag, addr, vid, db);
+	trace_dsa_lag_fdb_add_hw(lag->dev, addr, vid, &db, err);
 	if (err) {
 		kfree(a);
 		goto out;
@@ -370,14 +404,19 @@ static int dsa_switch_do_lag_fdb_del(struct dsa_switch *ds, struct dsa_lag *lag,
 
 	a = dsa_mac_addr_find(&lag->fdbs, addr, vid, db);
 	if (!a) {
+		trace_dsa_lag_fdb_del_not_found(lag->dev, addr, vid, &db);
 		err = -ENOENT;
 		goto out;
 	}
 
-	if (!refcount_dec_and_test(&a->refcount))
+	if (!refcount_dec_and_test(&a->refcount)) {
+		trace_dsa_lag_fdb_del_drop(lag->dev, addr, vid, &db,
+					   &a->refcount);
 		goto out;
+	}
 
 	err = ds->ops->lag_fdb_del(ds, *lag, addr, vid, db);
+	trace_dsa_lag_fdb_del_hw(lag->dev, addr, vid, &db, err);
 	if (err) {
 		refcount_set(&a->refcount, 1);
 		goto out;
diff --git a/net/dsa/trace.c b/net/dsa/trace.c
new file mode 100644
index 000000000000..1b107165d331
--- /dev/null
+++ b/net/dsa/trace.c
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/* Copyright 2022-2023 NXP
+ */
+
+#define CREATE_TRACE_POINTS
+#include "trace.h"
+
+void dsa_db_print(const struct dsa_db *db, char buf[DSA_DB_BUFSIZ])
+{
+	switch (db->type) {
+	case DSA_DB_PORT:
+		sprintf(buf, "port %s", db->dp->name);
+		break;
+	case DSA_DB_LAG:
+		sprintf(buf, "lag %s id %d", db->lag.dev->name, db->lag.id);
+		break;
+	case DSA_DB_BRIDGE:
+		sprintf(buf, "bridge %s num %d", db->bridge.dev->name,
+			db->bridge.num);
+		break;
+	default:
+		sprintf(buf, "unknown");
+		break;
+	}
+}
+
+const char *dsa_port_kind(const struct dsa_port *dp)
+{
+	switch (dp->type) {
+	case DSA_PORT_TYPE_USER:
+		return "user";
+	case DSA_PORT_TYPE_CPU:
+		return "cpu";
+	case DSA_PORT_TYPE_DSA:
+		return "dsa";
+	default:
+		return "unused";
+	}
+}
diff --git a/net/dsa/trace.h b/net/dsa/trace.h
new file mode 100644
index 000000000000..42c8bbc7d472
--- /dev/null
+++ b/net/dsa/trace.h
@@ -0,0 +1,329 @@
+/* SPDX-License-Identifier: GPL-2.0
+ * Copyright 2022-2023 NXP
+ */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM	dsa
+
+#if !defined(_NET_DSA_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _NET_DSA_TRACE_H
+
+#include <net/dsa.h>
+#include <linux/etherdevice.h>
+#include <linux/refcount.h>
+#include <linux/tracepoint.h>
+
+/* Enough to fit "bridge %s num %d" where num has 3 digits */
+#define DSA_DB_BUFSIZ	(IFNAMSIZ + 16)
+
+void dsa_db_print(const struct dsa_db *db, char buf[DSA_DB_BUFSIZ]);
+const char *dsa_port_kind(const struct dsa_port *dp);
+
+DECLARE_EVENT_CLASS(dsa_port_addr_op_hw,
+
+	TP_PROTO(const struct dsa_port *dp, const unsigned char *addr, u16 vid,
+		 const struct dsa_db *db, int err),
+
+	TP_ARGS(dp, addr, vid, db, err),
+
+	TP_STRUCT__entry(
+		__string(dev, dev_name(dp->ds->dev))
+		__string(kind, dsa_port_kind(dp))
+		__field(int, port)
+		__array(unsigned char, addr, ETH_ALEN)
+		__field(u16, vid)
+		__array(char, db_buf, DSA_DB_BUFSIZ)
+		__field(int, err)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev, dev_name(dp->ds->dev));
+		__assign_str(kind, dsa_port_kind(dp));
+		__entry->port = dp->index;
+		ether_addr_copy(__entry->addr, addr);
+		__entry->vid = vid;
+		dsa_db_print(db, __entry->db_buf);
+		__entry->err = err;
+	),
+
+	TP_printk("%s %s port %d addr %pM vid %u db \"%s\" err %d",
+		  __get_str(dev), __get_str(kind), __entry->port, __entry->addr,
+		  __entry->vid, __entry->db_buf, __entry->err)
+);
+
+/* Add unicast/multicast address to hardware, either on user ports
+ * (where no refcounting is kept), or on shared ports when the entry
+ * is first seen and its refcount is 1.
+ */
+DEFINE_EVENT(dsa_port_addr_op_hw, dsa_fdb_add_hw,
+	     TP_PROTO(const struct dsa_port *dp, const unsigned char *addr,
+		      u16 vid, const struct dsa_db *db, int err),
+	     TP_ARGS(dp, addr, vid, db, err));
+
+DEFINE_EVENT(dsa_port_addr_op_hw, dsa_mdb_add_hw,
+	     TP_PROTO(const struct dsa_port *dp, const unsigned char *addr,
+		      u16 vid, const struct dsa_db *db, int err),
+	     TP_ARGS(dp, addr, vid, db, err));
+
+/* Delete unicast/multicast address from hardware, either on user ports or
+ * when the refcount on shared ports reaches 0
+ */
+DEFINE_EVENT(dsa_port_addr_op_hw, dsa_fdb_del_hw,
+	     TP_PROTO(const struct dsa_port *dp, const unsigned char *addr,
+		      u16 vid, const struct dsa_db *db, int err),
+	     TP_ARGS(dp, addr, vid, db, err));
+
+DEFINE_EVENT(dsa_port_addr_op_hw, dsa_mdb_del_hw,
+	     TP_PROTO(const struct dsa_port *dp, const unsigned char *addr,
+		      u16 vid, const struct dsa_db *db, int err),
+	     TP_ARGS(dp, addr, vid, db, err));
+
+DECLARE_EVENT_CLASS(dsa_port_addr_op_refcount,
+
+	TP_PROTO(const struct dsa_port *dp, const unsigned char *addr, u16 vid,
+		 const struct dsa_db *db, const refcount_t *refcount),
+
+	TP_ARGS(dp, addr, vid, db, refcount),
+
+	TP_STRUCT__entry(
+		__string(dev, dev_name(dp->ds->dev))
+		__string(kind, dsa_port_kind(dp))
+		__field(int, port)
+		__array(unsigned char, addr, ETH_ALEN)
+		__field(u16, vid)
+		__array(char, db_buf, DSA_DB_BUFSIZ)
+		__field(unsigned int, refcount)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev, dev_name(dp->ds->dev));
+		__assign_str(kind, dsa_port_kind(dp));
+		__entry->port = dp->index;
+		ether_addr_copy(__entry->addr, addr);
+		__entry->vid = vid;
+		dsa_db_print(db, __entry->db_buf);
+		__entry->refcount = refcount_read(refcount);
+	),
+
+	TP_printk("%s %s port %d addr %pM vid %u db \"%s\" refcount %u",
+		  __get_str(dev), __get_str(kind), __entry->port, __entry->addr,
+		  __entry->vid, __entry->db_buf, __entry->refcount)
+);
+
+/* Bump the refcount of an existing unicast/multicast address on shared ports */
+DEFINE_EVENT(dsa_port_addr_op_refcount, dsa_fdb_add_bump,
+	     TP_PROTO(const struct dsa_port *dp, const unsigned char *addr,
+		      u16 vid, const struct dsa_db *db,
+		      const refcount_t *refcount),
+	     TP_ARGS(dp, addr, vid, db, refcount));
+
+DEFINE_EVENT(dsa_port_addr_op_refcount, dsa_mdb_add_bump,
+	     TP_PROTO(const struct dsa_port *dp, const unsigned char *addr,
+		      u16 vid, const struct dsa_db *db,
+		      const refcount_t *refcount),
+	     TP_ARGS(dp, addr, vid, db, refcount));
+
+/* Drop the refcount of a multicast address that we still keep on
+ * shared ports
+ */
+DEFINE_EVENT(dsa_port_addr_op_refcount, dsa_fdb_del_drop,
+	     TP_PROTO(const struct dsa_port *dp, const unsigned char *addr,
+		      u16 vid, const struct dsa_db *db,
+		      const refcount_t *refcount),
+	     TP_ARGS(dp, addr, vid, db, refcount));
+
+DEFINE_EVENT(dsa_port_addr_op_refcount, dsa_mdb_del_drop,
+	     TP_PROTO(const struct dsa_port *dp, const unsigned char *addr,
+		      u16 vid, const struct dsa_db *db,
+		      const refcount_t *refcount),
+	     TP_ARGS(dp, addr, vid, db, refcount));
+
+DECLARE_EVENT_CLASS(dsa_port_addr_del_not_found,
+
+	TP_PROTO(const struct dsa_port *dp, const unsigned char *addr, u16 vid,
+		 const struct dsa_db *db),
+
+	TP_ARGS(dp, addr, vid, db),
+
+	TP_STRUCT__entry(
+		__string(dev, dev_name(dp->ds->dev))
+		__string(kind, dsa_port_kind(dp))
+		__field(int, port)
+		__array(unsigned char, addr, ETH_ALEN)
+		__field(u16, vid)
+		__array(char, db_buf, DSA_DB_BUFSIZ)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev, dev_name(dp->ds->dev));
+		__assign_str(kind, dsa_port_kind(dp));
+		__entry->port = dp->index;
+		ether_addr_copy(__entry->addr, addr);
+		__entry->vid = vid;
+		dsa_db_print(db, __entry->db_buf);
+	),
+
+	TP_printk("%s %s port %d addr %pM vid %u db \"%s\"",
+		  __get_str(dev), __get_str(kind), __entry->port,
+		  __entry->addr, __entry->vid, __entry->db_buf)
+);
+
+/* Attempt to delete a unicast/multicast address on shared ports for which
+ * the delete operation was called more times than the addition
+ */
+DEFINE_EVENT(dsa_port_addr_del_not_found, dsa_fdb_del_not_found,
+	     TP_PROTO(const struct dsa_port *dp, const unsigned char *addr,
+		      u16 vid, const struct dsa_db *db),
+	     TP_ARGS(dp, addr, vid, db));
+
+DEFINE_EVENT(dsa_port_addr_del_not_found, dsa_mdb_del_not_found,
+	     TP_PROTO(const struct dsa_port *dp, const unsigned char *addr,
+		      u16 vid, const struct dsa_db *db),
+	     TP_ARGS(dp, addr, vid, db));
+
+TRACE_EVENT(dsa_lag_fdb_add_hw,
+
+	TP_PROTO(const struct net_device *lag_dev, const unsigned char *addr,
+		 u16 vid, const struct dsa_db *db, int err),
+
+	TP_ARGS(lag_dev, addr, vid, db, err),
+
+	TP_STRUCT__entry(
+		__string(dev, lag_dev->name)
+		__array(unsigned char, addr, ETH_ALEN)
+		__field(u16, vid)
+		__array(char, db_buf, DSA_DB_BUFSIZ)
+		__field(int, err)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev, lag_dev->name);
+		ether_addr_copy(__entry->addr, addr);
+		__entry->vid = vid;
+		dsa_db_print(db, __entry->db_buf);
+		__entry->err = err;
+	),
+
+	TP_printk("%s addr %pM vid %u db \"%s\" err %d",
+		  __get_str(dev), __entry->addr, __entry->vid,
+		  __entry->db_buf, __entry->err)
+);
+
+TRACE_EVENT(dsa_lag_fdb_add_bump,
+
+	TP_PROTO(const struct net_device *lag_dev, const unsigned char *addr,
+		 u16 vid, const struct dsa_db *db, const refcount_t *refcount),
+
+	TP_ARGS(lag_dev, addr, vid, db, refcount),
+
+	TP_STRUCT__entry(
+		__string(dev, lag_dev->name)
+		__array(unsigned char, addr, ETH_ALEN)
+		__field(u16, vid)
+		__array(char, db_buf, DSA_DB_BUFSIZ)
+		__field(unsigned int, refcount)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev, lag_dev->name);
+		ether_addr_copy(__entry->addr, addr);
+		__entry->vid = vid;
+		dsa_db_print(db, __entry->db_buf);
+		__entry->refcount = refcount_read(refcount);
+	),
+
+	TP_printk("%s addr %pM vid %u db \"%s\" refcount %u",
+		  __get_str(dev), __entry->addr, __entry->vid,
+		  __entry->db_buf, __entry->refcount)
+);
+
+TRACE_EVENT(dsa_lag_fdb_del_hw,
+
+	TP_PROTO(const struct net_device *lag_dev, const unsigned char *addr,
+		 u16 vid, const struct dsa_db *db, int err),
+
+	TP_ARGS(lag_dev, addr, vid, db, err),
+
+	TP_STRUCT__entry(
+		__string(dev, lag_dev->name)
+		__array(unsigned char, addr, ETH_ALEN)
+		__field(u16, vid)
+		__array(char, db_buf, DSA_DB_BUFSIZ)
+		__field(int, err)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev, lag_dev->name);
+		ether_addr_copy(__entry->addr, addr);
+		__entry->vid = vid;
+		dsa_db_print(db, __entry->db_buf);
+		__entry->err = err;
+	),
+
+	TP_printk("%s addr %pM vid %u db \"%s\" err %d",
+		  __get_str(dev), __entry->addr, __entry->vid,
+		  __entry->db_buf, __entry->err)
+);
+
+TRACE_EVENT(dsa_lag_fdb_del_drop,
+
+	TP_PROTO(const struct net_device *lag_dev, const unsigned char *addr,
+		 u16 vid, const struct dsa_db *db, const refcount_t *refcount),
+
+	TP_ARGS(lag_dev, addr, vid, db, refcount),
+
+	TP_STRUCT__entry(
+		__string(dev, lag_dev->name)
+		__array(unsigned char, addr, ETH_ALEN)
+		__field(u16, vid)
+		__array(char, db_buf, DSA_DB_BUFSIZ)
+		__field(unsigned int, refcount)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev, lag_dev->name);
+		ether_addr_copy(__entry->addr, addr);
+		__entry->vid = vid;
+		dsa_db_print(db, __entry->db_buf);
+		__entry->refcount = refcount_read(refcount);
+	),
+
+	TP_printk("%s addr %pM vid %u db \"%s\" refcount %u",
+		  __get_str(dev), __entry->addr, __entry->vid,
+		  __entry->db_buf, __entry->refcount)
+);
+
+TRACE_EVENT(dsa_lag_fdb_del_not_found,
+
+	TP_PROTO(const struct net_device *lag_dev, const unsigned char *addr,
+		 u16 vid, const struct dsa_db *db),
+
+	TP_ARGS(lag_dev, addr, vid, db),
+
+	TP_STRUCT__entry(
+		__string(dev, lag_dev->name)
+		__array(unsigned char, addr, ETH_ALEN)
+		__field(u16, vid)
+		__array(char, db_buf, DSA_DB_BUFSIZ)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev, lag_dev->name);
+		ether_addr_copy(__entry->addr, addr);
+		__entry->vid = vid;
+		dsa_db_print(db, __entry->db_buf);
+	),
+
+	TP_printk("%s addr %pM vid %u db \"%s\"",
+		  __get_str(dev), __entry->addr, __entry->vid, __entry->db_buf)
+);
+
+#endif /* _NET_DSA_TRACE_H */
+
+/* We don't want to use include/trace/events */
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE	trace
+/* This part must be outside protection */
+#include <trace/define_trace.h>
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH net-next 2/2] net: dsa: add trace points for VLAN operations
  2023-04-07 14:14 [PATCH net-next 0/2] DSA trace events Vladimir Oltean
  2023-04-07 14:14 ` [PATCH net-next 1/2] net: dsa: add trace points for FDB/MDB operations Vladimir Oltean
@ 2023-04-07 14:14 ` Vladimir Oltean
  2023-04-12  0:48 ` [PATCH net-next 0/2] DSA trace events Andrew Lunn
  2023-04-12  8:30 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 15+ messages in thread
From: Vladimir Oltean @ 2023-04-07 14:14 UTC (permalink / raw)
  To: netdev
  Cc: Jakub Kicinski, David S. Miller, Eric Dumazet, Paolo Abeni,
	Andrew Lunn, Florian Fainelli

These are not as critical as the FDB/MDB trace points (I'm not aware of
outstanding VLAN related bugs), but maybe they are useful to somebody,
either debugging something or simply trying to learn more.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/dsa/switch.c |  24 ++++++++--
 net/dsa/trace.h  | 118 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 137 insertions(+), 5 deletions(-)

diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index ff1b5d980e37..8c9a9f94b756 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -695,8 +695,12 @@ static int dsa_port_do_vlan_add(struct dsa_port *dp,
 	int err = 0;
 
 	/* No need to bother with refcounting for user ports. */
-	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp)))
-		return ds->ops->port_vlan_add(ds, port, vlan, extack);
+	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp))) {
+		err = ds->ops->port_vlan_add(ds, port, vlan, extack);
+		trace_dsa_vlan_add_hw(dp, vlan, err);
+
+		return err;
+	}
 
 	/* No need to propagate on shared ports the existing VLANs that were
 	 * re-notified after just the flags have changed. This would cause a
@@ -711,6 +715,7 @@ static int dsa_port_do_vlan_add(struct dsa_port *dp,
 	v = dsa_vlan_find(&dp->vlans, vlan);
 	if (v) {
 		refcount_inc(&v->refcount);
+		trace_dsa_vlan_add_bump(dp, vlan, &v->refcount);
 		goto out;
 	}
 
@@ -721,6 +726,7 @@ static int dsa_port_do_vlan_add(struct dsa_port *dp,
 	}
 
 	err = ds->ops->port_vlan_add(ds, port, vlan, extack);
+	trace_dsa_vlan_add_hw(dp, vlan, err);
 	if (err) {
 		kfree(v);
 		goto out;
@@ -745,21 +751,29 @@ static int dsa_port_do_vlan_del(struct dsa_port *dp,
 	int err = 0;
 
 	/* No need to bother with refcounting for user ports */
-	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp)))
-		return ds->ops->port_vlan_del(ds, port, vlan);
+	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp))) {
+		err = ds->ops->port_vlan_del(ds, port, vlan);
+		trace_dsa_vlan_del_hw(dp, vlan, err);
+
+		return err;
+	}
 
 	mutex_lock(&dp->vlans_lock);
 
 	v = dsa_vlan_find(&dp->vlans, vlan);
 	if (!v) {
+		trace_dsa_vlan_del_not_found(dp, vlan);
 		err = -ENOENT;
 		goto out;
 	}
 
-	if (!refcount_dec_and_test(&v->refcount))
+	if (!refcount_dec_and_test(&v->refcount)) {
+		trace_dsa_vlan_del_drop(dp, vlan, &v->refcount);
 		goto out;
+	}
 
 	err = ds->ops->port_vlan_del(ds, port, vlan);
+	trace_dsa_vlan_del_hw(dp, vlan, err);
 	if (err) {
 		refcount_set(&v->refcount, 1);
 		goto out;
diff --git a/net/dsa/trace.h b/net/dsa/trace.h
index 42c8bbc7d472..567f29a39707 100644
--- a/net/dsa/trace.h
+++ b/net/dsa/trace.h
@@ -9,7 +9,9 @@
 #define _NET_DSA_TRACE_H
 
 #include <net/dsa.h>
+#include <net/switchdev.h>
 #include <linux/etherdevice.h>
+#include <linux/if_bridge.h>
 #include <linux/refcount.h>
 #include <linux/tracepoint.h>
 
@@ -318,6 +320,122 @@ TRACE_EVENT(dsa_lag_fdb_del_not_found,
 		  __get_str(dev), __entry->addr, __entry->vid, __entry->db_buf)
 );
 
+DECLARE_EVENT_CLASS(dsa_vlan_op_hw,
+
+	TP_PROTO(const struct dsa_port *dp,
+		 const struct switchdev_obj_port_vlan *vlan, int err),
+
+	TP_ARGS(dp, vlan, err),
+
+	TP_STRUCT__entry(
+		__string(dev, dev_name(dp->ds->dev))
+		__string(kind, dsa_port_kind(dp))
+		__field(int, port)
+		__field(u16, vid)
+		__field(u16, flags)
+		__field(bool, changed)
+		__field(int, err)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev, dev_name(dp->ds->dev));
+		__assign_str(kind, dsa_port_kind(dp));
+		__entry->port = dp->index;
+		__entry->vid = vlan->vid;
+		__entry->flags = vlan->flags;
+		__entry->changed = vlan->changed;
+		__entry->err = err;
+	),
+
+	TP_printk("%s %s port %d vid %u%s%s%s",
+		  __get_str(dev), __get_str(kind), __entry->port, __entry->vid,
+		  __entry->flags & BRIDGE_VLAN_INFO_PVID ? " pvid" : "",
+		  __entry->flags & BRIDGE_VLAN_INFO_UNTAGGED ? " untagged" : "",
+		  __entry->changed ? " (changed)" : "")
+);
+
+DEFINE_EVENT(dsa_vlan_op_hw, dsa_vlan_add_hw,
+	     TP_PROTO(const struct dsa_port *dp,
+		      const struct switchdev_obj_port_vlan *vlan, int err),
+	     TP_ARGS(dp, vlan, err));
+
+DEFINE_EVENT(dsa_vlan_op_hw, dsa_vlan_del_hw,
+	     TP_PROTO(const struct dsa_port *dp,
+		      const struct switchdev_obj_port_vlan *vlan, int err),
+	     TP_ARGS(dp, vlan, err));
+
+DECLARE_EVENT_CLASS(dsa_vlan_op_refcount,
+
+	TP_PROTO(const struct dsa_port *dp,
+		 const struct switchdev_obj_port_vlan *vlan,
+		 const refcount_t *refcount),
+
+	TP_ARGS(dp, vlan, refcount),
+
+	TP_STRUCT__entry(
+		__string(dev, dev_name(dp->ds->dev))
+		__string(kind, dsa_port_kind(dp))
+		__field(int, port)
+		__field(u16, vid)
+		__field(u16, flags)
+		__field(bool, changed)
+		__field(unsigned int, refcount)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev, dev_name(dp->ds->dev));
+		__assign_str(kind, dsa_port_kind(dp));
+		__entry->port = dp->index;
+		__entry->vid = vlan->vid;
+		__entry->flags = vlan->flags;
+		__entry->changed = vlan->changed;
+		__entry->refcount = refcount_read(refcount);
+	),
+
+	TP_printk("%s %s port %d vid %u%s%s%s refcount %u",
+		  __get_str(dev), __get_str(kind), __entry->port, __entry->vid,
+		  __entry->flags & BRIDGE_VLAN_INFO_PVID ? " pvid" : "",
+		  __entry->flags & BRIDGE_VLAN_INFO_UNTAGGED ? " untagged" : "",
+		  __entry->changed ? " (changed)" : "", __entry->refcount)
+);
+
+DEFINE_EVENT(dsa_vlan_op_refcount, dsa_vlan_add_bump,
+	     TP_PROTO(const struct dsa_port *dp,
+		      const struct switchdev_obj_port_vlan *vlan,
+		      const refcount_t *refcount),
+	     TP_ARGS(dp, vlan, refcount));
+
+DEFINE_EVENT(dsa_vlan_op_refcount, dsa_vlan_del_drop,
+	     TP_PROTO(const struct dsa_port *dp,
+		      const struct switchdev_obj_port_vlan *vlan,
+		      const refcount_t *refcount),
+	     TP_ARGS(dp, vlan, refcount));
+
+TRACE_EVENT(dsa_vlan_del_not_found,
+
+	TP_PROTO(const struct dsa_port *dp,
+		 const struct switchdev_obj_port_vlan *vlan),
+
+	TP_ARGS(dp, vlan),
+
+	TP_STRUCT__entry(
+		__string(dev, dev_name(dp->ds->dev))
+		__string(kind, dsa_port_kind(dp))
+		__field(int, port)
+		__field(u16, vid)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev, dev_name(dp->ds->dev));
+		__assign_str(kind, dsa_port_kind(dp));
+		__entry->port = dp->index;
+		__entry->vid = vlan->vid;
+	),
+
+	TP_printk("%s %s port %d vid %u",
+		  __get_str(dev), __get_str(kind), __entry->port, __entry->vid)
+);
+
 #endif /* _NET_DSA_TRACE_H */
 
 /* We don't want to use include/trace/events */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 0/2] DSA trace events
  2023-04-07 14:14 [PATCH net-next 0/2] DSA trace events Vladimir Oltean
  2023-04-07 14:14 ` [PATCH net-next 1/2] net: dsa: add trace points for FDB/MDB operations Vladimir Oltean
  2023-04-07 14:14 ` [PATCH net-next 2/2] net: dsa: add trace points for VLAN operations Vladimir Oltean
@ 2023-04-12  0:48 ` Andrew Lunn
  2023-04-12  9:55   ` Vladimir Oltean
  2023-04-12  8:30 ` patchwork-bot+netdevbpf
  3 siblings, 1 reply; 15+ messages in thread
From: Andrew Lunn @ 2023-04-12  0:48 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Florian Fainelli

On Fri, Apr 07, 2023 at 05:14:49PM +0300, Vladimir Oltean wrote:
> This series introduces the "dsa" trace event class, with the following
> events:
> 
> $ trace-cmd list | grep dsa
> dsa
> dsa:dsa_fdb_add_hw
> dsa:dsa_mdb_add_hw
> dsa:dsa_fdb_del_hw
> dsa:dsa_mdb_del_hw
> dsa:dsa_fdb_add_bump
> dsa:dsa_mdb_add_bump
> dsa:dsa_fdb_del_drop
> dsa:dsa_mdb_del_drop
> dsa:dsa_fdb_del_not_found
> dsa:dsa_mdb_del_not_found
> dsa:dsa_lag_fdb_add_hw
> dsa:dsa_lag_fdb_add_bump
> dsa:dsa_lag_fdb_del_hw
> dsa:dsa_lag_fdb_del_drop
> dsa:dsa_lag_fdb_del_not_found
> dsa:dsa_vlan_add_hw
> dsa:dsa_vlan_del_hw
> dsa:dsa_vlan_add_bump
> dsa:dsa_vlan_del_drop
> dsa:dsa_vlan_del_not_found
> 
> These are useful to debug refcounting issues on CPU and DSA ports, where
> entries may remain lingering, or may be removed too soon, depending on
> bugs in higher layers of the network stack.

Hi Vladimir

I don't know anything about trace points. Should you Cc: 

Steven Rostedt <rostedt@goodmis.org> (maintainer:TRACING)
Masami Hiramatsu <mhiramat@kernel.org> (maintainer:TRACING)

to get some feedback from people who do?

   Andrew

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 0/2] DSA trace events
  2023-04-07 14:14 [PATCH net-next 0/2] DSA trace events Vladimir Oltean
                   ` (2 preceding siblings ...)
  2023-04-12  0:48 ` [PATCH net-next 0/2] DSA trace events Andrew Lunn
@ 2023-04-12  8:30 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 15+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-04-12  8:30 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: netdev, kuba, davem, edumazet, pabeni, andrew, f.fainelli

Hello:

This series was applied to netdev/net-next.git (main)
by David S. Miller <davem@davemloft.net>:

On Fri,  7 Apr 2023 17:14:49 +0300 you wrote:
> This series introduces the "dsa" trace event class, with the following
> events:
> 
> $ trace-cmd list | grep dsa
> dsa
> dsa:dsa_fdb_add_hw
> dsa:dsa_mdb_add_hw
> dsa:dsa_fdb_del_hw
> dsa:dsa_mdb_del_hw
> dsa:dsa_fdb_add_bump
> dsa:dsa_mdb_add_bump
> dsa:dsa_fdb_del_drop
> dsa:dsa_mdb_del_drop
> dsa:dsa_fdb_del_not_found
> dsa:dsa_mdb_del_not_found
> dsa:dsa_lag_fdb_add_hw
> dsa:dsa_lag_fdb_add_bump
> dsa:dsa_lag_fdb_del_hw
> dsa:dsa_lag_fdb_del_drop
> dsa:dsa_lag_fdb_del_not_found
> dsa:dsa_vlan_add_hw
> dsa:dsa_vlan_del_hw
> dsa:dsa_vlan_add_bump
> dsa:dsa_vlan_del_drop
> dsa:dsa_vlan_del_not_found
> 
> [...]

Here is the summary with links:
  - [net-next,1/2] net: dsa: add trace points for FDB/MDB operations
    https://git.kernel.org/netdev/net-next/c/9538ebce88ff
  - [net-next,2/2] net: dsa: add trace points for VLAN operations
    https://git.kernel.org/netdev/net-next/c/02020bd70fa6

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 0/2] DSA trace events
  2023-04-12  0:48 ` [PATCH net-next 0/2] DSA trace events Andrew Lunn
@ 2023-04-12  9:55   ` Vladimir Oltean
  2023-04-21 12:38     ` Masami Hiramatsu
  0 siblings, 1 reply; 15+ messages in thread
From: Vladimir Oltean @ 2023-04-12  9:55 UTC (permalink / raw)
  To: Andrew Lunn, Steven Rostedt, Masami Hiramatsu
  Cc: netdev, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Florian Fainelli

On Wed, Apr 12, 2023 at 02:48:35AM +0200, Andrew Lunn wrote:
> On Fri, Apr 07, 2023 at 05:14:49PM +0300, Vladimir Oltean wrote:
> > This series introduces the "dsa" trace event class, with the following
> > events:
> > 
> > $ trace-cmd list | grep dsa
> > dsa
> > dsa:dsa_fdb_add_hw
> > dsa:dsa_mdb_add_hw
> > dsa:dsa_fdb_del_hw
> > dsa:dsa_mdb_del_hw
> > dsa:dsa_fdb_add_bump
> > dsa:dsa_mdb_add_bump
> > dsa:dsa_fdb_del_drop
> > dsa:dsa_mdb_del_drop
> > dsa:dsa_fdb_del_not_found
> > dsa:dsa_mdb_del_not_found
> > dsa:dsa_lag_fdb_add_hw
> > dsa:dsa_lag_fdb_add_bump
> > dsa:dsa_lag_fdb_del_hw
> > dsa:dsa_lag_fdb_del_drop
> > dsa:dsa_lag_fdb_del_not_found
> > dsa:dsa_vlan_add_hw
> > dsa:dsa_vlan_del_hw
> > dsa:dsa_vlan_add_bump
> > dsa:dsa_vlan_del_drop
> > dsa:dsa_vlan_del_not_found
> > 
> > These are useful to debug refcounting issues on CPU and DSA ports, where
> > entries may remain lingering, or may be removed too soon, depending on
> > bugs in higher layers of the network stack.
> 
> Hi Vladimir
> 
> I don't know anything about trace points. Should you Cc: 
> 
> Steven Rostedt <rostedt@goodmis.org> (maintainer:TRACING)
> Masami Hiramatsu <mhiramat@kernel.org> (maintainer:TRACING)
> 
> to get some feedback from people who do?
> 
>    Andrew

I suppose I could.

Hi Steven, Masami, would you mind taking a look and letting me know
if the trace API was used reasonably? The patches were merged as:
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=9538ebce88ffa074202d592d468521995cb1e714
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=02020bd70fa6abcb1c2a8525ce7c1500dd4f44a8
but I can make incremental changes if necessary.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 0/2] DSA trace events
  2023-04-12  9:55   ` Vladimir Oltean
@ 2023-04-21 12:38     ` Masami Hiramatsu
  2023-04-21 12:47       ` Vladimir Oltean
  0 siblings, 1 reply; 15+ messages in thread
From: Masami Hiramatsu @ 2023-04-21 12:38 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Andrew Lunn, Steven Rostedt, netdev, Jakub Kicinski,
	David S. Miller, Eric Dumazet, Paolo Abeni, Florian Fainelli

On Wed, 12 Apr 2023 12:55:34 +0300
Vladimir Oltean <vladimir.oltean@nxp.com> wrote:

> On Wed, Apr 12, 2023 at 02:48:35AM +0200, Andrew Lunn wrote:
> > On Fri, Apr 07, 2023 at 05:14:49PM +0300, Vladimir Oltean wrote:
> > > This series introduces the "dsa" trace event class, with the following
> > > events:
> > > 
> > > $ trace-cmd list | grep dsa
> > > dsa
> > > dsa:dsa_fdb_add_hw
> > > dsa:dsa_mdb_add_hw
> > > dsa:dsa_fdb_del_hw
> > > dsa:dsa_mdb_del_hw
> > > dsa:dsa_fdb_add_bump
> > > dsa:dsa_mdb_add_bump
> > > dsa:dsa_fdb_del_drop
> > > dsa:dsa_mdb_del_drop
> > > dsa:dsa_fdb_del_not_found
> > > dsa:dsa_mdb_del_not_found
> > > dsa:dsa_lag_fdb_add_hw
> > > dsa:dsa_lag_fdb_add_bump
> > > dsa:dsa_lag_fdb_del_hw
> > > dsa:dsa_lag_fdb_del_drop
> > > dsa:dsa_lag_fdb_del_not_found
> > > dsa:dsa_vlan_add_hw
> > > dsa:dsa_vlan_del_hw
> > > dsa:dsa_vlan_add_bump
> > > dsa:dsa_vlan_del_drop
> > > dsa:dsa_vlan_del_not_found
> > > 
> > > These are useful to debug refcounting issues on CPU and DSA ports, where
> > > entries may remain lingering, or may be removed too soon, depending on
> > > bugs in higher layers of the network stack.
> > 
> > Hi Vladimir
> > 
> > I don't know anything about trace points. Should you Cc: 
> > 
> > Steven Rostedt <rostedt@goodmis.org> (maintainer:TRACING)
> > Masami Hiramatsu <mhiramat@kernel.org> (maintainer:TRACING)
> > 
> > to get some feedback from people who do?
> > 
> >    Andrew
> 
> I suppose I could.
> 
> Hi Steven, Masami, would you mind taking a look and letting me know
> if the trace API was used reasonably? The patches were merged as:
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=9538ebce88ffa074202d592d468521995cb1e714
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=02020bd70fa6abcb1c2a8525ce7c1500dd4f44a8
> but I can make incremental changes if necessary.

If the subsystem maintainers are OK for including this, it is OK.
But basically, since the event is exposed to userland and you may keep
these events maintained, you should carefully add the events.
If those are only for debugging (after debug, it will not be used
frequently), can you consider to use kprobe events?
'perf probe' command will also help you to trace local variables and
structure members as like gdb does.


Thank you,


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 0/2] DSA trace events
  2023-04-21 12:38     ` Masami Hiramatsu
@ 2023-04-21 12:47       ` Vladimir Oltean
  2023-04-24 22:25         ` Steven Rostedt
  0 siblings, 1 reply; 15+ messages in thread
From: Vladimir Oltean @ 2023-04-21 12:47 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Andrew Lunn, Steven Rostedt, netdev, Jakub Kicinski,
	David S. Miller, Eric Dumazet, Paolo Abeni, Florian Fainelli

On Fri, Apr 21, 2023 at 09:38:50PM +0900, Masami Hiramatsu wrote:
> If the subsystem maintainers are OK for including this, it is OK.
> But basically, since the event is exposed to userland and you may keep
> these events maintained, you should carefully add the events.
> If those are only for debugging (after debug, it will not be used
> frequently), can you consider to use kprobe events?
> 'perf probe' command will also help you to trace local variables and
> structure members as like gdb does.

Thanks for taking a look. I haven't looked at kprobe events. I also
wasn't planning on maintaining these assuming stable UABI terms, just
for debugging. What are some user space consumers that expect the UABI
to be stable, and what is it about the trace events that can/can't change?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 0/2] DSA trace events
  2023-04-21 12:47       ` Vladimir Oltean
@ 2023-04-24 22:25         ` Steven Rostedt
  2023-04-26 19:13           ` Vladimir Oltean
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2023-04-24 22:25 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Masami Hiramatsu, Andrew Lunn, netdev, Jakub Kicinski,
	David S. Miller, Eric Dumazet, Paolo Abeni, Florian Fainelli

On Fri, 21 Apr 2023 15:47:08 +0300
Vladimir Oltean <vladimir.oltean@nxp.com> wrote:

> On Fri, Apr 21, 2023 at 09:38:50PM +0900, Masami Hiramatsu wrote:
> > If the subsystem maintainers are OK for including this, it is OK.
> > But basically, since the event is exposed to userland and you may keep
> > these events maintained, you should carefully add the events.
> > If those are only for debugging (after debug, it will not be used
> > frequently), can you consider to use kprobe events?
> > 'perf probe' command will also help you to trace local variables and
> > structure members as like gdb does.  
> 
> Thanks for taking a look. I haven't looked at kprobe events. I also
> wasn't planning on maintaining these assuming stable UABI terms, just
> for debugging. What are some user space consumers that expect the UABI
> to be stable, and what is it about the trace events that can/can't change?

Ideally, tooling will use the libtraceevent library[1] to parse the
events. In that case, if an event is used by tooling, you'll need to
keep around the fields that are used by the tooling.

-- Steve

[1] https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 0/2] DSA trace events
  2023-04-24 22:25         ` Steven Rostedt
@ 2023-04-26 19:13           ` Vladimir Oltean
  2023-04-26 19:23             ` Steven Rostedt
  0 siblings, 1 reply; 15+ messages in thread
From: Vladimir Oltean @ 2023-04-26 19:13 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Andrew Lunn, netdev, Jakub Kicinski,
	David S. Miller, Eric Dumazet, Paolo Abeni, Florian Fainelli

On Mon, Apr 24, 2023 at 06:25:54PM -0400, Steven Rostedt wrote:
> On Fri, 21 Apr 2023 15:47:08 +0300
> Vladimir Oltean <vladimir.oltean@nxp.com> wrote:
> 
> > On Fri, Apr 21, 2023 at 09:38:50PM +0900, Masami Hiramatsu wrote:
> > > If the subsystem maintainers are OK for including this, it is OK.
> > > But basically, since the event is exposed to userland and you may keep
> > > these events maintained, you should carefully add the events.
> > > If those are only for debugging (after debug, it will not be used
> > > frequently), can you consider to use kprobe events?
> > > 'perf probe' command will also help you to trace local variables and
> > > structure members as like gdb does.  
> > 
> > Thanks for taking a look. I haven't looked at kprobe events. I also
> > wasn't planning on maintaining these assuming stable UABI terms, just
> > for debugging. What are some user space consumers that expect the UABI
> > to be stable, and what is it about the trace events that can/can't change?
> 
> Ideally, tooling will use the libtraceevent library[1] to parse the
> events. In that case, if an event is used by tooling, you'll need to
> keep around the fields that are used by the tooling.

Ok, I did not plan for user space to treat these events as something
stable to pick up on. The Linux bridge already notifies VLANs, FDBs,
MDBs through the rtnetlink socket, and that's what I would consider to
be the stable ABI. What can be seen here (DSA) is essentially a
framework used by multiple hardware drivers, but ultimately still device
driver-level code.

What would you recommend here? A revert?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 0/2] DSA trace events
  2023-04-26 19:13           ` Vladimir Oltean
@ 2023-04-26 19:23             ` Steven Rostedt
  2023-04-26 19:43               ` Vladimir Oltean
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2023-04-26 19:23 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Masami Hiramatsu, Andrew Lunn, netdev, Jakub Kicinski,
	David S. Miller, Eric Dumazet, Paolo Abeni, Florian Fainelli

On Wed, 26 Apr 2023 22:13:36 +0300
Vladimir Oltean <vladimir.oltean@nxp.com> wrote:

> Ok, I did not plan for user space to treat these events as something
> stable to pick up on. The Linux bridge already notifies VLANs, FDBs,
> MDBs through the rtnetlink socket, and that's what I would consider to
> be the stable ABI. What can be seen here (DSA) is essentially a
> framework used by multiple hardware drivers, but ultimately still device
> driver-level code.
> 
> What would you recommend here? A revert?

There's lots of events in the kernel that no tools use. Do you expect
anyone to create a tool that uses these events?

We break user space API all the time. As long as nothing notices, it's OK.
We take the "tree in the forest" approach. If user space API breaks, but no
tool uses it, did it break? The answer according to Linus, is "no".

Al Viro refuses to have trace events in VFS, because there's lots of places
that could become useful for tooling, and he doesn't want to support it.
But if the events are not useful for user space tooling, they should be
generally safe to keep.

There's tons of events in the wifi code, because they are very useful for
debugging remote applications out in the world, that the wifi maintainers
have tooling for. But those are not considered "stable", because the only
tools are the ones that the maintainer of the trace events, created.

If you don't see anything using these events for useful tooling outside
your own use, then I'd just keep them. There's a thousand other events in
the kernel that are not used by tools, I doubt these will be any different.

If you think that a tool that will end up in a distribution will start
using them, then you need to take care.

-- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 0/2] DSA trace events
  2023-04-26 19:23             ` Steven Rostedt
@ 2023-04-26 19:43               ` Vladimir Oltean
  2023-04-26 19:47                 ` Steven Rostedt
  2023-04-27  0:33                 ` Andrew Lunn
  0 siblings, 2 replies; 15+ messages in thread
From: Vladimir Oltean @ 2023-04-26 19:43 UTC (permalink / raw)
  To: Steven Rostedt, Andrew Lunn, Florian Fainelli
  Cc: Masami Hiramatsu, netdev, Jakub Kicinski, David S. Miller,
	Eric Dumazet, Paolo Abeni

On Wed, Apr 26, 2023 at 03:23:45PM -0400, Steven Rostedt wrote:
> There's lots of events in the kernel that no tools use. Do you expect
> anyone to create a tool that uses these events?
> 
> We break user space API all the time. As long as nothing notices, it's OK.
> We take the "tree in the forest" approach. If user space API breaks, but no
> tool uses it, did it break? The answer according to Linus, is "no".
> 
> Al Viro refuses to have trace events in VFS, because there's lots of places
> that could become useful for tooling, and he doesn't want to support it.
> But if the events are not useful for user space tooling, they should be
> generally safe to keep.
> 
> There's tons of events in the wifi code, because they are very useful for
> debugging remote applications out in the world, that the wifi maintainers
> have tooling for. But those are not considered "stable", because the only
> tools are the ones that the maintainer of the trace events, created.
> 
> If you don't see anything using these events for useful tooling outside
> your own use, then I'd just keep them. There's a thousand other events in
> the kernel that are not used by tools, I doubt these will be any different.
> 
> If you think that a tool that will end up in a distribution will start
> using them, then you need to take care.

Well, there's one thing that could become useful for tooling, and
that's determining the resource utilization of the hardware (number of
dsa_fdb_add_hw events minus dsa_fdb_del_hw, number of dsa_vlan_add_hw
minus dsa_vlan_del_hw, etc) relative to some hardcoded maximum capacity
which would be somehow determined by userspace for each driver. There
have been requests for this in the not so distant past.

Instead of living in fear that this might happen, I think what would be
the most productive thing to do would be to just create a proper API in
the next kernel development cycle to expose just that information, and
point other people to that other API, and keep the trace events just for
debugging.

Andrew, Florian, what do you think?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 0/2] DSA trace events
  2023-04-26 19:43               ` Vladimir Oltean
@ 2023-04-26 19:47                 ` Steven Rostedt
  2023-04-26 21:07                   ` Vladimir Oltean
  2023-04-27  0:33                 ` Andrew Lunn
  1 sibling, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2023-04-26 19:47 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Andrew Lunn, Florian Fainelli, Masami Hiramatsu, netdev,
	Jakub Kicinski, David S. Miller, Eric Dumazet, Paolo Abeni

On Wed, 26 Apr 2023 22:43:01 +0300
Vladimir Oltean <vladimir.oltean@nxp.com> wrote:

> Instead of living in fear that this might happen, I think what would be
> the most productive thing to do would be to just create a proper API in
> the next kernel development cycle to expose just that information, and
> point other people to that other API, and keep the trace events just for
> debugging.

I also want to add that if a tool does use a trace event that was not your
intention, you can then fix the tool to do it properly.

I had this with powertop. It had hardcoded events (did not use
libtraceevent) and when I modified the layout, it broke, and Linus reverted
my changes. After fixing powertop to do things properly, I was able to get
my changes back in.

So even if you do break user space, you can still fix it later.

-- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 0/2] DSA trace events
  2023-04-26 19:47                 ` Steven Rostedt
@ 2023-04-26 21:07                   ` Vladimir Oltean
  0 siblings, 0 replies; 15+ messages in thread
From: Vladimir Oltean @ 2023-04-26 21:07 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Andrew Lunn, Florian Fainelli, Masami Hiramatsu, netdev,
	Jakub Kicinski, David S. Miller, Eric Dumazet, Paolo Abeni

On Wed, Apr 26, 2023 at 03:47:13PM -0400, Steven Rostedt wrote:
> I also want to add that if a tool does use a trace event that was not your
> intention, you can then fix the tool to do it properly.
> 
> I had this with powertop. It had hardcoded events (did not use
> libtraceevent) and when I modified the layout, it broke, and Linus reverted
> my changes. After fixing powertop to do things properly, I was able to get
> my changes back in.
> 
> So even if you do break user space, you can still fix it later.

Thanks, this is helpful.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 0/2] DSA trace events
  2023-04-26 19:43               ` Vladimir Oltean
  2023-04-26 19:47                 ` Steven Rostedt
@ 2023-04-27  0:33                 ` Andrew Lunn
  1 sibling, 0 replies; 15+ messages in thread
From: Andrew Lunn @ 2023-04-27  0:33 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Steven Rostedt, Florian Fainelli, Masami Hiramatsu, netdev,
	Jakub Kicinski, David S. Miller, Eric Dumazet, Paolo Abeni

> Well, there's one thing that could become useful for tooling, and
> that's determining the resource utilization of the hardware (number of
> dsa_fdb_add_hw events minus dsa_fdb_del_hw, number of dsa_vlan_add_hw
> minus dsa_vlan_del_hw, etc) relative to some hardcoded maximum capacity
> which would be somehow determined by userspace for each driver. There
> have been requests for this in the not so distant past.

That sounds similar to devlink resources. The mv88e6xxx driver already
returns the number of entries in the ATU, and the size of the
ATU. Maybe that just needs generalising?

     Andrew

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-04-27  0:34 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-04-07 14:14 [PATCH net-next 0/2] DSA trace events Vladimir Oltean
2023-04-07 14:14 ` [PATCH net-next 1/2] net: dsa: add trace points for FDB/MDB operations Vladimir Oltean
2023-04-07 14:14 ` [PATCH net-next 2/2] net: dsa: add trace points for VLAN operations Vladimir Oltean
2023-04-12  0:48 ` [PATCH net-next 0/2] DSA trace events Andrew Lunn
2023-04-12  9:55   ` Vladimir Oltean
2023-04-21 12:38     ` Masami Hiramatsu
2023-04-21 12:47       ` Vladimir Oltean
2023-04-24 22:25         ` Steven Rostedt
2023-04-26 19:13           ` Vladimir Oltean
2023-04-26 19:23             ` Steven Rostedt
2023-04-26 19:43               ` Vladimir Oltean
2023-04-26 19:47                 ` Steven Rostedt
2023-04-26 21:07                   ` Vladimir Oltean
2023-04-27  0:33                 ` Andrew Lunn
2023-04-12  8:30 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox