Netdev List
* Re: [PATCH iproute2-next v3 0/2] f_flower: match on the number of vlan tags
From: Stephen Hemminger @ 2022-04-26 15:11 UTC
  To: Boris Sukholitko; +Cc: netdev, David Ahern, Ilya Lifshits
In-Reply-To: <20220426091417.7153-1-boris.sukholitko@broadcom.com>

On Tue, 26 Apr 2022 12:14:15 +0300
Boris Sukholitko <boris.sukholitko@broadcom.com> wrote:

> Hi,
> 
> Our customers in the fiber telecom world have network configurations
> where they would like to control their traffic according to the number
> of tags appearing in the packet.
> 
> For example, TR247 GPON conformance test suite specification mostly
> talks about untagged, single, double tagged packets and gives lax
> guidelines on the vlan protocol vs. number of vlan tags.
> 
> This is different from the common IT networks where 802.1Q and 802.1ad
> protocols usually describe single and double tagged packets. GPON
> configurations that we work with have an arbitrary mix of the above
> protocols and number of vlan tags in the packet.
> 
> The following patch series implements a number-of-vlans flower filter. It
> adds the num_of_vlans flower filter as an alternative to vlan ethtype protocol
> matching. The end result is that the following command becomes possible:
> 
> tc filter add dev eth1 ingress flower \
>   num_of_vlans 1 vlan_prio 5 action drop
> 
> Also, from our logs, we have redirect rules such that:
> 
> tc filter add dev $GPON ingress flower num_of_vlans $N \
>      action mirred egress redirect dev $DEV
> 
> where N can range from 0 to 3 and $DEV is a function of $N.
> 
> Also there are rules setting skb mark based on the number of vlans:
> 
> tc filter add dev $GPON ingress flower num_of_vlans $N vlan_prio \
>     $P action skbedit mark $M
> 
> Thanks,
> Boris.
> 
> - v3: rebased to the latest iproute2-next
> - v2: add missing f_flower subject prefix
> 
> Boris Sukholitko (2):
>   f_flower: Add num of vlans parameter
>   f_flower: Check args with num_of_vlans
> 
>  tc/f_flower.c | 57 ++++++++++++++++++++++++++++++++++++---------------
>  1 file changed, 41 insertions(+), 16 deletions(-)

Can you do this with BPF instead of a kernel change?
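
A minimal sketch of the tag-counting walk that a BPF classifier would have
to perform, written here as plain userspace C rather than as a loadable BPF
program. The function name count_vlan_tags and the TPID_* constant names are
illustrative and do not come from the patch. It assumes every tag is still
present in the frame bytes, which may not hold on Linux when a tag has been
stripped into packet metadata by VLAN offload.

```c
#include <stddef.h>
#include <stdint.h>

#define TPID_8021Q	0x8100	/* 802.1Q customer tag */
#define TPID_8021AD	0x88A8	/* 802.1ad service tag */

/* Count consecutive VLAN tags after the two MAC addresses.
 * Returns the number of complete 4-byte tags (TPID + TCI) found.
 */
static int count_vlan_tags(const uint8_t *frame, size_t len)
{
	size_t off = 12;	/* first EtherType/TPID follows dst + src MAC */
	int n = 0;

	while (off + 2 <= len) {
		uint16_t proto = (uint16_t)((frame[off] << 8) | frame[off + 1]);

		if (proto != TPID_8021Q && proto != TPID_8021AD)
			break;
		if (off + 4 > len)	/* truncated tag: TCI missing */
			break;
		n++;
		off += 4;		/* skip TPID + TCI */
	}

	return n;
}
```

The walk ignores which TPID carries which tag, which matches the "arbitrary
mix" of protocols described in the cover letter.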

* Re: [PATCH] drivers, ixgbe: show VF statistics via ethtool
From: Jakub Kicinski @ 2022-04-26 15:09 UTC
  To: Maximilian Heyne
  Cc: Jesse Brandeburg, Tony Nguyen, David S. Miller, Paolo Abeni,
	intel-wired-lan, netdev, linux-kernel
In-Reply-To: <20220426084636.31609-1-mheyne@amazon.de>

On Tue, 26 Apr 2022 08:46:35 +0000 Maximilian Heyne wrote:
> +		for (i = 0; i < adapter->num_vfs; i++) {
> +			ethtool_sprintf(&p, "VF %u Rx Packets", i);
> +			ethtool_sprintf(&p, "VF %u Rx Bytes", i);
> +			ethtool_sprintf(&p, "VF %u Tx Packets", i);
> +			ethtool_sprintf(&p, "VF %u Tx Bytes", i);
> +			ethtool_sprintf(&p, "VF %u MC Packets", i);

Please use ndo_get_vf_stats / IFLA_VF_STATS.

* Re: [PATCH RFC 4/5] net/tls: Add support for PF_TLSH (a TLS handshake listener)
From: Jakub Kicinski @ 2022-04-26 15:02 UTC
  To: Sagi Grimberg
  Cc: Hannes Reinecke, Chuck Lever, netdev, linux-nfs, linux-nvme,
	linux-cifs, linux-fsdevel, ak, borisp, simo
In-Reply-To: <1fca2eda-83e4-fe39-13c8-0e5e7553689b@grimberg.me>

On Tue, 26 Apr 2022 17:29:03 +0300 Sagi Grimberg wrote:
> >> Create the socket in user space, do all the handshakes you need there
> >> and then pass it to the kernel.  This is how NBD + TLS works.  Scales
> >> better and requires much less kernel code.
> >>  
> > But we can't, as the existing mechanisms (at least for NVMe) create the 
> > socket in-kernel.
> > Having to create the socket in userspace would require a completely new 
> > interface for nvme and will not be backwards compatible.  
> 
> And we will still need the upcall anyways when we reconnect 
> (re-establish the socket)

That totally flew over my head, I have zero familiarity with in-kernel
storage network users :S

In all honesty the tls code in the kernel is a bit of a dumping ground.
People come, dump a bunch of code and disappear. Nobody seems to care
that the result is still (years in) not ready for production use :/
Until a month ago it'd break connections even under moderate memory
pressure. This set does not even have selftests.

Plus there are more protocols being actively worked on (QUIC, PSP etc.)
Having per ULP special sauce to invoke a user space helper is not the
paradigm we chose, and the time is as inopportune as ever to change that.

* Re: [PATCH v2 net-next 07/10] net: dsa: request drivers to perform FDB isolation
From: Hans Schultz @ 2022-04-26 15:01 UTC
  To: Vladimir Oltean, netdev, Jakub Kicinski, David S. Miller
  Cc: Florian Fainelli, Andrew Lunn, Vivien Didelot, Vladimir Oltean,
	Kurt Kanzenbach, Hauke Mehrtens, Woojung Huh, UNGLinuxDriver,
	Sean Wang, Landen Chao, DENG Qingfang, Claudiu Manoil,
	Alexandre Belloni, Linus Walleij, Alvin Šipraga,
	George McCollister
In-Reply-To: <20220225092225.594851-8-vladimir.oltean@nxp.com>

On Fri, Feb 25, 2022 at 11:22, Vladimir Oltean <vladimir.oltean@nxp.com> wrote:
> For DSA, to encourage drivers to perform FDB isolation simply means to
> track which bridge each FDB and MDB entry belongs to. It then
> becomes the driver's responsibility to use something that makes the FDB
> entry from one bridge not match the FDB lookup of ports from other
> bridges.
>
> The top-level functions where the bridge is determined are:
> - dsa_port_fdb_{add,del}
> - dsa_port_host_fdb_{add,del}
> - dsa_port_mdb_{add,del}
> - dsa_port_host_mdb_{add,del}
>
> aka the pre-crosschip-notifier functions.
>
> Changing the API to pass a reference to a bridge is not superfluous, and
> looking at the passed bridge argument is not the same as having the
> driver look at dsa_to_port(ds, port)->bridge from the ->port_fdb_add()
> method.
>
> DSA installs FDB and MDB entries on shared (CPU and DSA) ports as well,
> and those do not have any dp->bridge information to retrieve, because
> they are not in any bridge - they are merely the pipes that serve the
> user ports that are in one or multiple bridges.
>
> The struct dsa_bridge associated with each FDB/MDB entry is encapsulated
> in a larger "struct dsa_db" database. Although only databases associated
> to bridges are notified for now, this API will be the starting point for
> implementing IFF_UNICAST_FLT in DSA. There, the idea is to install FDB
> entries on the CPU port which belong to the corresponding user port's
> port database. These are supposed to match only when the port is
> standalone.
>
> It is better to introduce the API in its expected final form than to
> introduce it for bridges first, then to have to change drivers which may
> have made one or more assumptions.
>
> Drivers can use the provided bridge.num, but they can also use a
> different numbering scheme that is more convenient.
>
> DSA must perform refcounting on the CPU and DSA ports by also taking
> into account the bridge number. So if two bridges request the same local
> address, DSA must notify the driver twice, once for each bridge.
>
> In fact, if the driver supports FDB isolation, DSA must perform
> refcounting per bridge, but if the driver doesn't, DSA must refcount
> host addresses across all bridges, otherwise it would be telling the
> driver to delete an FDB entry for a bridge and the driver would delete
> it for all bridges. So introduce a bool fdb_isolation in drivers which
> would make all bridge databases passed to the cross-chip notifier have
> the same number (0). This makes dsa_mac_addr_find() -> dsa_db_equal()
> say that all bridge databases are the same database - which is
> essentially the legacy behavior.
>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
>  drivers/net/dsa/b53/b53_common.c       | 12 ++--
>  drivers/net/dsa/b53/b53_priv.h         | 12 ++--
>  drivers/net/dsa/hirschmann/hellcreek.c |  6 +-
>  drivers/net/dsa/lan9303-core.c         | 13 ++--
>  drivers/net/dsa/lantiq_gswip.c         |  6 +-
>  drivers/net/dsa/microchip/ksz9477.c    | 12 ++--
>  drivers/net/dsa/microchip/ksz_common.c |  6 +-
>  drivers/net/dsa/microchip/ksz_common.h |  6 +-
>  drivers/net/dsa/mt7530.c               | 12 ++--
>  drivers/net/dsa/mv88e6xxx/chip.c       | 12 ++--
>  drivers/net/dsa/ocelot/felix.c         | 18 +++--
>  drivers/net/dsa/qca8k.c                | 12 ++--
>  drivers/net/dsa/sja1105/sja1105_main.c | 26 +++++--
>  include/net/dsa.h                      | 42 +++++++++--
>  net/dsa/dsa_priv.h                     |  3 +
>  net/dsa/port.c                         | 75 ++++++++++++++++++-
>  net/dsa/switch.c                       | 99 +++++++++++++++++---------
>  17 files changed, 280 insertions(+), 92 deletions(-)
>
> diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
> index 83bf30349c26..a8cc6e182c45 100644
> --- a/drivers/net/dsa/b53/b53_common.c
> +++ b/drivers/net/dsa/b53/b53_common.c
> @@ -1708,7 +1708,8 @@ static int b53_arl_op(struct b53_device *dev, int op, int port,
>  }
>  
>  int b53_fdb_add(struct dsa_switch *ds, int port,
> -		const unsigned char *addr, u16 vid)
> +		const unsigned char *addr, u16 vid,
> +		struct dsa_db db)
>  {
>  	struct b53_device *priv = ds->priv;
>  	int ret;
> @@ -1728,7 +1729,8 @@ int b53_fdb_add(struct dsa_switch *ds, int port,
>  EXPORT_SYMBOL(b53_fdb_add);
>  
>  int b53_fdb_del(struct dsa_switch *ds, int port,
> -		const unsigned char *addr, u16 vid)
> +		const unsigned char *addr, u16 vid,
> +		struct dsa_db db)
>  {
>  	struct b53_device *priv = ds->priv;
>  	int ret;
> @@ -1829,7 +1831,8 @@ int b53_fdb_dump(struct dsa_switch *ds, int port,
>  EXPORT_SYMBOL(b53_fdb_dump);
>  
>  int b53_mdb_add(struct dsa_switch *ds, int port,
> -		const struct switchdev_obj_port_mdb *mdb)
> +		const struct switchdev_obj_port_mdb *mdb,
> +		struct dsa_db db)
>  {
>  	struct b53_device *priv = ds->priv;
>  	int ret;
> @@ -1849,7 +1852,8 @@ int b53_mdb_add(struct dsa_switch *ds, int port,
>  EXPORT_SYMBOL(b53_mdb_add);
>  
>  int b53_mdb_del(struct dsa_switch *ds, int port,
> -		const struct switchdev_obj_port_mdb *mdb)
> +		const struct switchdev_obj_port_mdb *mdb,
> +		struct dsa_db db)
>  {
>  	struct b53_device *priv = ds->priv;
>  	int ret;
> diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
> index a6b339fcb17e..d3091f0ad3e6 100644
> --- a/drivers/net/dsa/b53/b53_priv.h
> +++ b/drivers/net/dsa/b53/b53_priv.h
> @@ -359,15 +359,19 @@ int b53_vlan_add(struct dsa_switch *ds, int port,
>  int b53_vlan_del(struct dsa_switch *ds, int port,
>  		 const struct switchdev_obj_port_vlan *vlan);
>  int b53_fdb_add(struct dsa_switch *ds, int port,
> -		const unsigned char *addr, u16 vid);
> +		const unsigned char *addr, u16 vid,
> +		struct dsa_db db);
>  int b53_fdb_del(struct dsa_switch *ds, int port,
> -		const unsigned char *addr, u16 vid);
> +		const unsigned char *addr, u16 vid,
> +		struct dsa_db db);
>  int b53_fdb_dump(struct dsa_switch *ds, int port,
>  		 dsa_fdb_dump_cb_t *cb, void *data);
>  int b53_mdb_add(struct dsa_switch *ds, int port,
> -		const struct switchdev_obj_port_mdb *mdb);
> +		const struct switchdev_obj_port_mdb *mdb,
> +		struct dsa_db db);
>  int b53_mdb_del(struct dsa_switch *ds, int port,
> -		const struct switchdev_obj_port_mdb *mdb);
> +		const struct switchdev_obj_port_mdb *mdb,
> +		struct dsa_db db);
>  int b53_mirror_add(struct dsa_switch *ds, int port,
>  		   struct dsa_mall_mirror_tc_entry *mirror, bool ingress);
>  enum dsa_tag_protocol b53_get_tag_protocol(struct dsa_switch *ds, int port,
> diff --git a/drivers/net/dsa/hirschmann/hellcreek.c b/drivers/net/dsa/hirschmann/hellcreek.c
> index 726f267cb228..cb89be9de43a 100644
> --- a/drivers/net/dsa/hirschmann/hellcreek.c
> +++ b/drivers/net/dsa/hirschmann/hellcreek.c
> @@ -827,7 +827,8 @@ static int hellcreek_fdb_get(struct hellcreek *hellcreek,
>  }
>  
>  static int hellcreek_fdb_add(struct dsa_switch *ds, int port,
> -			     const unsigned char *addr, u16 vid)
> +			     const unsigned char *addr, u16 vid,
> +			     struct dsa_db db)
>  {
>  	struct hellcreek_fdb_entry entry = { 0 };
>  	struct hellcreek *hellcreek = ds->priv;
> @@ -872,7 +873,8 @@ static int hellcreek_fdb_add(struct dsa_switch *ds, int port,
>  }
>  
>  static int hellcreek_fdb_del(struct dsa_switch *ds, int port,
> -			     const unsigned char *addr, u16 vid)
> +			     const unsigned char *addr, u16 vid,
> +			     struct dsa_db db)
>  {
>  	struct hellcreek_fdb_entry entry = { 0 };
>  	struct hellcreek *hellcreek = ds->priv;
> diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
> index 3969d89fa4db..a21184e7fcb6 100644
> --- a/drivers/net/dsa/lan9303-core.c
> +++ b/drivers/net/dsa/lan9303-core.c
> @@ -1188,7 +1188,8 @@ static void lan9303_port_fast_age(struct dsa_switch *ds, int port)
>  }
>  
>  static int lan9303_port_fdb_add(struct dsa_switch *ds, int port,
> -				const unsigned char *addr, u16 vid)
> +				const unsigned char *addr, u16 vid,
> +				struct dsa_db db)
>  {
>  	struct lan9303 *chip = ds->priv;
>  
> @@ -1200,8 +1201,8 @@ static int lan9303_port_fdb_add(struct dsa_switch *ds, int port,
>  }
>  
>  static int lan9303_port_fdb_del(struct dsa_switch *ds, int port,
> -				const unsigned char *addr, u16 vid)
> -
> +				const unsigned char *addr, u16 vid,
> +				struct dsa_db db)
>  {
>  	struct lan9303 *chip = ds->priv;
>  
> @@ -1245,7 +1246,8 @@ static int lan9303_port_mdb_prepare(struct dsa_switch *ds, int port,
>  }
>  
>  static int lan9303_port_mdb_add(struct dsa_switch *ds, int port,
> -				const struct switchdev_obj_port_mdb *mdb)
> +				const struct switchdev_obj_port_mdb *mdb,
> +				struct dsa_db db)
>  {
>  	struct lan9303 *chip = ds->priv;
>  	int err;
> @@ -1260,7 +1262,8 @@ static int lan9303_port_mdb_add(struct dsa_switch *ds, int port,
>  }
>  
>  static int lan9303_port_mdb_del(struct dsa_switch *ds, int port,
> -				const struct switchdev_obj_port_mdb *mdb)
> +				const struct switchdev_obj_port_mdb *mdb,
> +				struct dsa_db db)
>  {
>  	struct lan9303 *chip = ds->priv;
>  
> diff --git a/drivers/net/dsa/lantiq_gswip.c b/drivers/net/dsa/lantiq_gswip.c
> index 8a7a8093a156..3dfb532b7784 100644
> --- a/drivers/net/dsa/lantiq_gswip.c
> +++ b/drivers/net/dsa/lantiq_gswip.c
> @@ -1389,13 +1389,15 @@ static int gswip_port_fdb(struct dsa_switch *ds, int port,
>  }
>  
>  static int gswip_port_fdb_add(struct dsa_switch *ds, int port,
> -			      const unsigned char *addr, u16 vid)
> +			      const unsigned char *addr, u16 vid,
> +			      struct dsa_db db)
>  {
>  	return gswip_port_fdb(ds, port, addr, vid, true);
>  }
>  
>  static int gswip_port_fdb_del(struct dsa_switch *ds, int port,
> -			      const unsigned char *addr, u16 vid)
> +			      const unsigned char *addr, u16 vid,
> +			      struct dsa_db db)
>  {
>  	return gswip_port_fdb(ds, port, addr, vid, false);
>  }
> diff --git a/drivers/net/dsa/microchip/ksz9477.c b/drivers/net/dsa/microchip/ksz9477.c
> index 18ffc8ded7ee..94ad6d9504f4 100644
> --- a/drivers/net/dsa/microchip/ksz9477.c
> +++ b/drivers/net/dsa/microchip/ksz9477.c
> @@ -640,7 +640,8 @@ static int ksz9477_port_vlan_del(struct dsa_switch *ds, int port,
>  }
>  
>  static int ksz9477_port_fdb_add(struct dsa_switch *ds, int port,
> -				const unsigned char *addr, u16 vid)
> +				const unsigned char *addr, u16 vid,
> +				struct dsa_db db)
>  {
>  	struct ksz_device *dev = ds->priv;
>  	u32 alu_table[4];
> @@ -697,7 +698,8 @@ static int ksz9477_port_fdb_add(struct dsa_switch *ds, int port,
>  }
>  
>  static int ksz9477_port_fdb_del(struct dsa_switch *ds, int port,
> -				const unsigned char *addr, u16 vid)
> +				const unsigned char *addr, u16 vid,
> +				struct dsa_db db)
>  {
>  	struct ksz_device *dev = ds->priv;
>  	u32 alu_table[4];
> @@ -839,7 +841,8 @@ static int ksz9477_port_fdb_dump(struct dsa_switch *ds, int port,
>  }
>  
>  static int ksz9477_port_mdb_add(struct dsa_switch *ds, int port,
> -				const struct switchdev_obj_port_mdb *mdb)
> +				const struct switchdev_obj_port_mdb *mdb,
> +				struct dsa_db db)
>  {
>  	struct ksz_device *dev = ds->priv;
>  	u32 static_table[4];
> @@ -914,7 +917,8 @@ static int ksz9477_port_mdb_add(struct dsa_switch *ds, int port,
>  }
>  
>  static int ksz9477_port_mdb_del(struct dsa_switch *ds, int port,
> -				const struct switchdev_obj_port_mdb *mdb)
> +				const struct switchdev_obj_port_mdb *mdb,
> +				struct dsa_db db)
>  {
>  	struct ksz_device *dev = ds->priv;
>  	u32 static_table[4];
> diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
> index 94e618b8352b..104458ec9cbc 100644
> --- a/drivers/net/dsa/microchip/ksz_common.c
> +++ b/drivers/net/dsa/microchip/ksz_common.c
> @@ -276,7 +276,8 @@ int ksz_port_fdb_dump(struct dsa_switch *ds, int port, dsa_fdb_dump_cb_t *cb,
>  EXPORT_SYMBOL_GPL(ksz_port_fdb_dump);
>  
>  int ksz_port_mdb_add(struct dsa_switch *ds, int port,
> -		     const struct switchdev_obj_port_mdb *mdb)
> +		     const struct switchdev_obj_port_mdb *mdb,
> +		     struct dsa_db db)
>  {
>  	struct ksz_device *dev = ds->priv;
>  	struct alu_struct alu;
> @@ -321,7 +322,8 @@ int ksz_port_mdb_add(struct dsa_switch *ds, int port,
>  EXPORT_SYMBOL_GPL(ksz_port_mdb_add);
>  
>  int ksz_port_mdb_del(struct dsa_switch *ds, int port,
> -		     const struct switchdev_obj_port_mdb *mdb)
> +		     const struct switchdev_obj_port_mdb *mdb,
> +		     struct dsa_db db)
>  {
>  	struct ksz_device *dev = ds->priv;
>  	struct alu_struct alu;
> diff --git a/drivers/net/dsa/microchip/ksz_common.h b/drivers/net/dsa/microchip/ksz_common.h
> index c6fa487fb006..66933445a447 100644
> --- a/drivers/net/dsa/microchip/ksz_common.h
> +++ b/drivers/net/dsa/microchip/ksz_common.h
> @@ -166,9 +166,11 @@ void ksz_port_fast_age(struct dsa_switch *ds, int port);
>  int ksz_port_fdb_dump(struct dsa_switch *ds, int port, dsa_fdb_dump_cb_t *cb,
>  		      void *data);
>  int ksz_port_mdb_add(struct dsa_switch *ds, int port,
> -		     const struct switchdev_obj_port_mdb *mdb);
> +		     const struct switchdev_obj_port_mdb *mdb,
> +		     struct dsa_db db);
>  int ksz_port_mdb_del(struct dsa_switch *ds, int port,
> -		     const struct switchdev_obj_port_mdb *mdb);
> +		     const struct switchdev_obj_port_mdb *mdb,
> +		     struct dsa_db db);
>  int ksz_enable_port(struct dsa_switch *ds, int port, struct phy_device *phy);
>  
>  /* Common register access functions */
> diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
> index f74f25f479ed..abe63ec05066 100644
> --- a/drivers/net/dsa/mt7530.c
> +++ b/drivers/net/dsa/mt7530.c
> @@ -1349,7 +1349,8 @@ mt7530_port_bridge_leave(struct dsa_switch *ds, int port,
>  
>  static int
>  mt7530_port_fdb_add(struct dsa_switch *ds, int port,
> -		    const unsigned char *addr, u16 vid)
> +		    const unsigned char *addr, u16 vid,
> +		    struct dsa_db db)
>  {
>  	struct mt7530_priv *priv = ds->priv;
>  	int ret;
> @@ -1365,7 +1366,8 @@ mt7530_port_fdb_add(struct dsa_switch *ds, int port,
>  
>  static int
>  mt7530_port_fdb_del(struct dsa_switch *ds, int port,
> -		    const unsigned char *addr, u16 vid)
> +		    const unsigned char *addr, u16 vid,
> +		    struct dsa_db db)
>  {
>  	struct mt7530_priv *priv = ds->priv;
>  	int ret;
> @@ -1416,7 +1418,8 @@ mt7530_port_fdb_dump(struct dsa_switch *ds, int port,
>  
>  static int
>  mt7530_port_mdb_add(struct dsa_switch *ds, int port,
> -		    const struct switchdev_obj_port_mdb *mdb)
> +		    const struct switchdev_obj_port_mdb *mdb,
> +		    struct dsa_db db)
>  {
>  	struct mt7530_priv *priv = ds->priv;
>  	const u8 *addr = mdb->addr;
> @@ -1442,7 +1445,8 @@ mt7530_port_mdb_add(struct dsa_switch *ds, int port,
>  
>  static int
>  mt7530_port_mdb_del(struct dsa_switch *ds, int port,
> -		    const struct switchdev_obj_port_mdb *mdb)
> +		    const struct switchdev_obj_port_mdb *mdb,
> +		    struct dsa_db db)
>  {
>  	struct mt7530_priv *priv = ds->priv;
>  	const u8 *addr = mdb->addr;
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
> index 1b9a20bf1bd6..d79c65bb227e 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -2456,7 +2456,8 @@ static int mv88e6xxx_port_vlan_del(struct dsa_switch *ds, int port,
>  }
>  
>  static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
> -				  const unsigned char *addr, u16 vid)
> +				  const unsigned char *addr, u16 vid,
> +				  struct dsa_db db)
>  {
>  	struct mv88e6xxx_chip *chip = ds->priv;
>  	int err;
> @@ -2470,7 +2471,8 @@ static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
>  }
>  
>  static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port,
> -				  const unsigned char *addr, u16 vid)
> +				  const unsigned char *addr, u16 vid,
> +				  struct dsa_db db)
>  {
>  	struct mv88e6xxx_chip *chip = ds->priv;
>  	int err;
> @@ -6002,7 +6004,8 @@ static int mv88e6xxx_change_tag_protocol(struct dsa_switch *ds, int port,
>  }
>  
>  static int mv88e6xxx_port_mdb_add(struct dsa_switch *ds, int port,
> -				  const struct switchdev_obj_port_mdb *mdb)
> +				  const struct switchdev_obj_port_mdb *mdb,
> +				  struct dsa_db db)
>  {
>  	struct mv88e6xxx_chip *chip = ds->priv;
>  	int err;
> @@ -6016,7 +6019,8 @@ static int mv88e6xxx_port_mdb_add(struct dsa_switch *ds, int port,
>  }
>  
>  static int mv88e6xxx_port_mdb_del(struct dsa_switch *ds, int port,
> -				  const struct switchdev_obj_port_mdb *mdb)
> +				  const struct switchdev_obj_port_mdb *mdb,
> +				  struct dsa_db db)
>  {
>  	struct mv88e6xxx_chip *chip = ds->priv;
>  	int err;
> diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c
> index 04f5da33b944..d92feee97c63 100644
> --- a/drivers/net/dsa/ocelot/felix.c
> +++ b/drivers/net/dsa/ocelot/felix.c
> @@ -592,7 +592,8 @@ static int felix_fdb_dump(struct dsa_switch *ds, int port,
>  }
>  
>  static int felix_fdb_add(struct dsa_switch *ds, int port,
> -			 const unsigned char *addr, u16 vid)
> +			 const unsigned char *addr, u16 vid,
> +			 struct dsa_db db)
>  {
>  	struct ocelot *ocelot = ds->priv;
>  
> @@ -600,7 +601,8 @@ static int felix_fdb_add(struct dsa_switch *ds, int port,
>  }
>  
>  static int felix_fdb_del(struct dsa_switch *ds, int port,
> -			 const unsigned char *addr, u16 vid)
> +			 const unsigned char *addr, u16 vid,
> +			 struct dsa_db db)
>  {
>  	struct ocelot *ocelot = ds->priv;
>  
> @@ -608,7 +610,8 @@ static int felix_fdb_del(struct dsa_switch *ds, int port,
>  }
>  
>  static int felix_lag_fdb_add(struct dsa_switch *ds, struct dsa_lag lag,
> -			     const unsigned char *addr, u16 vid)
> +			     const unsigned char *addr, u16 vid,
> +			     struct dsa_db db)
>  {
>  	struct ocelot *ocelot = ds->priv;
>  
> @@ -616,7 +619,8 @@ static int felix_lag_fdb_add(struct dsa_switch *ds, struct dsa_lag lag,
>  }
>  
>  static int felix_lag_fdb_del(struct dsa_switch *ds, struct dsa_lag lag,
> -			     const unsigned char *addr, u16 vid)
> +			     const unsigned char *addr, u16 vid,
> +			     struct dsa_db db)
>  {
>  	struct ocelot *ocelot = ds->priv;
>  
> @@ -624,7 +628,8 @@ static int felix_lag_fdb_del(struct dsa_switch *ds, struct dsa_lag lag,
>  }
>  
>  static int felix_mdb_add(struct dsa_switch *ds, int port,
> -			 const struct switchdev_obj_port_mdb *mdb)
> +			 const struct switchdev_obj_port_mdb *mdb,
> +			 struct dsa_db db)
>  {
>  	struct ocelot *ocelot = ds->priv;
>  
> @@ -632,7 +637,8 @@ static int felix_mdb_add(struct dsa_switch *ds, int port,
>  }
>  
>  static int felix_mdb_del(struct dsa_switch *ds, int port,
> -			 const struct switchdev_obj_port_mdb *mdb)
> +			 const struct switchdev_obj_port_mdb *mdb,
> +			 struct dsa_db db)
>  {
>  	struct ocelot *ocelot = ds->priv;
>  
> diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
> index 6844106975a9..7189fd8120d7 100644
> --- a/drivers/net/dsa/qca8k.c
> +++ b/drivers/net/dsa/qca8k.c
> @@ -2397,7 +2397,8 @@ qca8k_port_fdb_insert(struct qca8k_priv *priv, const u8 *addr,
>  
>  static int
>  qca8k_port_fdb_add(struct dsa_switch *ds, int port,
> -		   const unsigned char *addr, u16 vid)
> +		   const unsigned char *addr, u16 vid,
> +		   struct dsa_db db)
>  {
>  	struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv;
>  	u16 port_mask = BIT(port);
> @@ -2407,7 +2408,8 @@ qca8k_port_fdb_add(struct dsa_switch *ds, int port,
>  
>  static int
>  qca8k_port_fdb_del(struct dsa_switch *ds, int port,
> -		   const unsigned char *addr, u16 vid)
> +		   const unsigned char *addr, u16 vid,
> +		   struct dsa_db db)
>  {
>  	struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv;
>  	u16 port_mask = BIT(port);
> @@ -2444,7 +2446,8 @@ qca8k_port_fdb_dump(struct dsa_switch *ds, int port,
>  
>  static int
>  qca8k_port_mdb_add(struct dsa_switch *ds, int port,
> -		   const struct switchdev_obj_port_mdb *mdb)
> +		   const struct switchdev_obj_port_mdb *mdb,
> +		   struct dsa_db db)
>  {
>  	struct qca8k_priv *priv = ds->priv;
>  	const u8 *addr = mdb->addr;
> @@ -2455,7 +2458,8 @@ qca8k_port_mdb_add(struct dsa_switch *ds, int port,
>  
>  static int
>  qca8k_port_mdb_del(struct dsa_switch *ds, int port,
> -		   const struct switchdev_obj_port_mdb *mdb)
> +		   const struct switchdev_obj_port_mdb *mdb,
> +		   struct dsa_db db)
>  {
>  	struct qca8k_priv *priv = ds->priv;
>  	const u8 *addr = mdb->addr;
> diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
> index dd89b077aae6..91b0e636d194 100644
> --- a/drivers/net/dsa/sja1105/sja1105_main.c
> +++ b/drivers/net/dsa/sja1105/sja1105_main.c
> @@ -1819,7 +1819,8 @@ int sja1105pqrs_fdb_del(struct dsa_switch *ds, int port,
>  }
>  
>  static int sja1105_fdb_add(struct dsa_switch *ds, int port,
> -			   const unsigned char *addr, u16 vid)
> +			   const unsigned char *addr, u16 vid,
> +			   struct dsa_db db)
>  {
>  	struct sja1105_private *priv = ds->priv;
>  
> @@ -1827,7 +1828,8 @@ static int sja1105_fdb_add(struct dsa_switch *ds, int port,
>  }
>  
>  static int sja1105_fdb_del(struct dsa_switch *ds, int port,
> -			   const unsigned char *addr, u16 vid)
> +			   const unsigned char *addr, u16 vid,
> +			   struct dsa_db db)
>  {
>  	struct sja1105_private *priv = ds->priv;
>  
> @@ -1885,7 +1887,15 @@ static int sja1105_fdb_dump(struct dsa_switch *ds, int port,
>  
>  static void sja1105_fast_age(struct dsa_switch *ds, int port)
>  {
> +	struct dsa_port *dp = dsa_to_port(ds, port);
>  	struct sja1105_private *priv = ds->priv;
> +	struct dsa_db db = {
> +		.type = DSA_DB_BRIDGE,
> +		.bridge = {
> +			.dev = dsa_port_bridge_dev_get(dp),
> +			.num = dsa_port_bridge_num_get(dp),
> +		},
> +	};
>  	int i;
>  
>  	for (i = 0; i < SJA1105_MAX_L2_LOOKUP_COUNT; i++) {
> @@ -1913,7 +1923,7 @@ static void sja1105_fast_age(struct dsa_switch *ds, int port)
>  
>  		u64_to_ether_addr(l2_lookup.macaddr, macaddr);
>  
> -		rc = sja1105_fdb_del(ds, port, macaddr, l2_lookup.vlanid);
> +		rc = sja1105_fdb_del(ds, port, macaddr, l2_lookup.vlanid, db);
>  		if (rc) {
>  			dev_err(ds->dev,
>  				"Failed to delete FDB entry %pM vid %lld: %pe\n",
> @@ -1924,15 +1934,17 @@ static void sja1105_fast_age(struct dsa_switch *ds, int port)
>  }
>  
>  static int sja1105_mdb_add(struct dsa_switch *ds, int port,
> -			   const struct switchdev_obj_port_mdb *mdb)
> +			   const struct switchdev_obj_port_mdb *mdb,
> +			   struct dsa_db db)
>  {
> -	return sja1105_fdb_add(ds, port, mdb->addr, mdb->vid);
> +	return sja1105_fdb_add(ds, port, mdb->addr, mdb->vid, db);
>  }
>  
>  static int sja1105_mdb_del(struct dsa_switch *ds, int port,
> -			   const struct switchdev_obj_port_mdb *mdb)
> +			   const struct switchdev_obj_port_mdb *mdb,
> +			   struct dsa_db db)
>  {
> -	return sja1105_fdb_del(ds, port, mdb->addr, mdb->vid);
> +	return sja1105_fdb_del(ds, port, mdb->addr, mdb->vid, db);
>  }
>  
>  /* Common function for unicast and broadcast flood configuration.
> diff --git a/include/net/dsa.h b/include/net/dsa.h
> index 01faba89c987..87c5f18eb381 100644
> --- a/include/net/dsa.h
> +++ b/include/net/dsa.h
> @@ -341,11 +341,28 @@ struct dsa_link {
>  	struct list_head list;
>  };
>  
> +enum dsa_db_type {
> +	DSA_DB_PORT,
> +	DSA_DB_LAG,
> +	DSA_DB_BRIDGE,
> +};
> +
> +struct dsa_db {
> +	enum dsa_db_type type;
> +
> +	union {
> +		const struct dsa_port *dp;
> +		struct dsa_lag lag;
> +		struct dsa_bridge bridge;
> +	};
> +};
> +
>  struct dsa_mac_addr {
>  	unsigned char addr[ETH_ALEN];
>  	u16 vid;
>  	refcount_t refcount;
>  	struct list_head list;
> +	struct dsa_db db;
>  };
>  
>  struct dsa_vlan {
> @@ -409,6 +426,13 @@ struct dsa_switch {
>  	 */
>  	u32			mtu_enforcement_ingress:1;
>  
> +	/* Drivers that isolate the FDBs of multiple bridges must set this
> +	 * to true to receive the bridge as an argument in .port_fdb_{add,del}
> +	 * and .port_mdb_{add,del}. Otherwise, the bridge.num will always be
> +	 * passed as zero.
> +	 */
> +	u32			fdb_isolation:1;
> +
>  	/* Listener for switch fabric events */
>  	struct notifier_block	nb;
>  
> @@ -941,23 +965,29 @@ struct dsa_switch_ops {
>  	 * Forwarding database
>  	 */
>  	int	(*port_fdb_add)(struct dsa_switch *ds, int port,
> -				const unsigned char *addr, u16 vid);
> +				const unsigned char *addr, u16 vid,
> +				struct dsa_db db);

Hi! Wouldn't it be better to have a struct that holds all the function
parameters in one place instead of adding further parameters to these
functions?

I am asking because I also need to add a parameter to
port_fdb_add(), and it would be more future-proof to have a single
function parameter as a struct, so that it is easier to add parameters
to these functions without having to change the prototype of the
function every time.
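
To illustrate the suggestion, here is a minimal, self-contained sketch of
passing one argument struct to the callback. All names (struct fdb_op_args,
struct mock_fdb_entry, struct mock_ops, mock_fdb_add) are hypothetical and
use mock types; this is not the DSA API, and db_num merely stands in for the
struct dsa_db that the patch adds.

```c
#include <stdint.h>
#include <string.h>

#define MOCK_ETH_ALEN 6

/* Hypothetical argument bundle: new fields can be appended here
 * without touching the callback prototype below.
 */
struct fdb_op_args {
	int port;
	const unsigned char *addr;
	uint16_t vid;
	unsigned int db_num;	/* stand-in for struct dsa_db */
};

/* Mock of a one-entry hardware table, for illustration only. */
struct mock_fdb_entry {
	unsigned char addr[MOCK_ETH_ALEN];
	uint16_t vid;
	int port;
	unsigned int db_num;
};

/* Mock ops table: one pointer-to-struct parameter per operation. */
struct mock_ops {
	int (*port_fdb_add)(struct mock_fdb_entry *hw,
			    const struct fdb_op_args *args);
};

/* Mock driver callback: copies the bundled arguments into the table. */
static int mock_fdb_add(struct mock_fdb_entry *hw,
			const struct fdb_op_args *args)
{
	memcpy(hw->addr, args->addr, MOCK_ETH_ALEN);
	hw->vid = args->vid;
	hw->port = args->port;
	hw->db_num = args->db_num;
	return 0;
}

static const struct mock_ops mock_driver_ops = {
	.port_fdb_add = mock_fdb_add,
};
```

With this shape, adding a field to the argument struct leaves every driver's
prototype unchanged, at the cost of callers having to fill in the struct
before each call.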

>  	int	(*port_fdb_del)(struct dsa_switch *ds, int port,
> -				const unsigned char *addr, u16 vid);
> +				const unsigned char *addr, u16 vid,
> +				struct dsa_db db);
>  	int	(*port_fdb_dump)(struct dsa_switch *ds, int port,
>  				 dsa_fdb_dump_cb_t *cb, void *data);
>  	int	(*lag_fdb_add)(struct dsa_switch *ds, struct dsa_lag lag,
> -			       const unsigned char *addr, u16 vid);
> +			       const unsigned char *addr, u16 vid,
> +			       struct dsa_db db);
>  	int	(*lag_fdb_del)(struct dsa_switch *ds, struct dsa_lag lag,
> -			       const unsigned char *addr, u16 vid);
> +			       const unsigned char *addr, u16 vid,
> +			       struct dsa_db db);
>  
>  	/*
>  	 * Multicast database
>  	 */
>  	int	(*port_mdb_add)(struct dsa_switch *ds, int port,
> -				const struct switchdev_obj_port_mdb *mdb);
> +				const struct switchdev_obj_port_mdb *mdb,
> +				struct dsa_db db);
>  	int	(*port_mdb_del)(struct dsa_switch *ds, int port,
> -				const struct switchdev_obj_port_mdb *mdb);
> +				const struct switchdev_obj_port_mdb *mdb,
> +				struct dsa_db db);
>  	/*
>  	 * RXNFC
>  	 */
> diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
> index 7a1c98581f53..27575fc3883e 100644
> --- a/net/dsa/dsa_priv.h
> +++ b/net/dsa/dsa_priv.h
> @@ -67,6 +67,7 @@ struct dsa_notifier_fdb_info {
>  	int port;
>  	const unsigned char *addr;
>  	u16 vid;
> +	struct dsa_db db;
>  };
>  
>  /* DSA_NOTIFIER_LAG_FDB_* */
> @@ -74,6 +75,7 @@ struct dsa_notifier_lag_fdb_info {
>  	struct dsa_lag *lag;
>  	const unsigned char *addr;
>  	u16 vid;
> +	struct dsa_db db;
>  };
>  
>  /* DSA_NOTIFIER_MDB_* */
> @@ -81,6 +83,7 @@ struct dsa_notifier_mdb_info {
>  	const struct switchdev_obj_port_mdb *mdb;
>  	int sw_index;
>  	int port;
> +	struct dsa_db db;
>  };
>  
>  /* DSA_NOTIFIER_LAG_* */
> diff --git a/net/dsa/port.c b/net/dsa/port.c
> index adab159c8c21..7af44a28f032 100644
> --- a/net/dsa/port.c
> +++ b/net/dsa/port.c
> @@ -798,8 +798,19 @@ int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
>  		.port = dp->index,
>  		.addr = addr,
>  		.vid = vid,
> +		.db = {
> +			.type = DSA_DB_BRIDGE,
> +			.bridge = *dp->bridge,
> +		},
>  	};
>  
> +	/* Refcounting takes bridge.num as a key, and should be global for all
> +	 * bridges in the absence of FDB isolation, and per bridge otherwise.
> +	 * Force the bridge.num to zero here in the absence of FDB isolation.
> +	 */
> +	if (!dp->ds->fdb_isolation)
> +		info.db.bridge.num = 0;
> +
>  	return dsa_port_notify(dp, DSA_NOTIFIER_FDB_ADD, &info);
>  }
>  
> @@ -811,9 +822,15 @@ int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr,
>  		.port = dp->index,
>  		.addr = addr,
>  		.vid = vid,
> -
> +		.db = {
> +			.type = DSA_DB_BRIDGE,
> +			.bridge = *dp->bridge,
> +		},
>  	};
>  
> +	if (!dp->ds->fdb_isolation)
> +		info.db.bridge.num = 0;
> +
>  	return dsa_port_notify(dp, DSA_NOTIFIER_FDB_DEL, &info);
>  }
>  
> @@ -825,6 +842,10 @@ int dsa_port_host_fdb_add(struct dsa_port *dp, const unsigned char *addr,
>  		.port = dp->index,
>  		.addr = addr,
>  		.vid = vid,
> +		.db = {
> +			.type = DSA_DB_BRIDGE,
> +			.bridge = *dp->bridge,
> +		},
>  	};
>  	struct dsa_port *cpu_dp = dp->cpu_dp;
>  	int err;
> @@ -839,6 +860,9 @@ int dsa_port_host_fdb_add(struct dsa_port *dp, const unsigned char *addr,
>  			return err;
>  	}
>  
> +	if (!dp->ds->fdb_isolation)
> +		info.db.bridge.num = 0;
> +
>  	return dsa_port_notify(dp, DSA_NOTIFIER_HOST_FDB_ADD, &info);
>  }
>  
> @@ -850,6 +874,10 @@ int dsa_port_host_fdb_del(struct dsa_port *dp, const unsigned char *addr,
>  		.port = dp->index,
>  		.addr = addr,
>  		.vid = vid,
> +		.db = {
> +			.type = DSA_DB_BRIDGE,
> +			.bridge = *dp->bridge,
> +		},
>  	};
>  	struct dsa_port *cpu_dp = dp->cpu_dp;
>  	int err;
> @@ -860,6 +888,9 @@ int dsa_port_host_fdb_del(struct dsa_port *dp, const unsigned char *addr,
>  			return err;
>  	}
>  
> +	if (!dp->ds->fdb_isolation)
> +		info.db.bridge.num = 0;
> +
>  	return dsa_port_notify(dp, DSA_NOTIFIER_HOST_FDB_DEL, &info);
>  }
>  
> @@ -870,8 +901,15 @@ int dsa_port_lag_fdb_add(struct dsa_port *dp, const unsigned char *addr,
>  		.lag = dp->lag,
>  		.addr = addr,
>  		.vid = vid,
> +		.db = {
> +			.type = DSA_DB_BRIDGE,
> +			.bridge = *dp->bridge,
> +		},
>  	};
>  
> +	if (!dp->ds->fdb_isolation)
> +		info.db.bridge.num = 0;
> +
>  	return dsa_port_notify(dp, DSA_NOTIFIER_LAG_FDB_ADD, &info);
>  }
>  
> @@ -882,8 +920,15 @@ int dsa_port_lag_fdb_del(struct dsa_port *dp, const unsigned char *addr,
>  		.lag = dp->lag,
>  		.addr = addr,
>  		.vid = vid,
> +		.db = {
> +			.type = DSA_DB_BRIDGE,
> +			.bridge = *dp->bridge,
> +		},
>  	};
>  
> +	if (!dp->ds->fdb_isolation)
> +		info.db.bridge.num = 0;
> +
>  	return dsa_port_notify(dp, DSA_NOTIFIER_LAG_FDB_DEL, &info);
>  }
>  
> @@ -905,8 +950,15 @@ int dsa_port_mdb_add(const struct dsa_port *dp,
>  		.sw_index = dp->ds->index,
>  		.port = dp->index,
>  		.mdb = mdb,
> +		.db = {
> +			.type = DSA_DB_BRIDGE,
> +			.bridge = *dp->bridge,
> +		},
>  	};
>  
> +	if (!dp->ds->fdb_isolation)
> +		info.db.bridge.num = 0;
> +
>  	return dsa_port_notify(dp, DSA_NOTIFIER_MDB_ADD, &info);
>  }
>  
> @@ -917,8 +969,15 @@ int dsa_port_mdb_del(const struct dsa_port *dp,
>  		.sw_index = dp->ds->index,
>  		.port = dp->index,
>  		.mdb = mdb,
> +		.db = {
> +			.type = DSA_DB_BRIDGE,
> +			.bridge = *dp->bridge,
> +		},
>  	};
>  
> +	if (!dp->ds->fdb_isolation)
> +		info.db.bridge.num = 0;
> +
>  	return dsa_port_notify(dp, DSA_NOTIFIER_MDB_DEL, &info);
>  }
>  
> @@ -929,6 +988,10 @@ int dsa_port_host_mdb_add(const struct dsa_port *dp,
>  		.sw_index = dp->ds->index,
>  		.port = dp->index,
>  		.mdb = mdb,
> +		.db = {
> +			.type = DSA_DB_BRIDGE,
> +			.bridge = *dp->bridge,
> +		},
>  	};
>  	struct dsa_port *cpu_dp = dp->cpu_dp;
>  	int err;
> @@ -937,6 +1000,9 @@ int dsa_port_host_mdb_add(const struct dsa_port *dp,
>  	if (err)
>  		return err;
>  
> +	if (!dp->ds->fdb_isolation)
> +		info.db.bridge.num = 0;
> +
>  	return dsa_port_notify(dp, DSA_NOTIFIER_HOST_MDB_ADD, &info);
>  }
>  
> @@ -947,6 +1013,10 @@ int dsa_port_host_mdb_del(const struct dsa_port *dp,
>  		.sw_index = dp->ds->index,
>  		.port = dp->index,
>  		.mdb = mdb,
> +		.db = {
> +			.type = DSA_DB_BRIDGE,
> +			.bridge = *dp->bridge,
> +		},
>  	};
>  	struct dsa_port *cpu_dp = dp->cpu_dp;
>  	int err;
> @@ -955,6 +1025,9 @@ int dsa_port_host_mdb_del(const struct dsa_port *dp,
>  	if (err)
>  		return err;
>  
> +	if (!dp->ds->fdb_isolation)
> +		info.db.bridge.num = 0;
> +
>  	return dsa_port_notify(dp, DSA_NOTIFIER_HOST_MDB_DEL, &info);
>  }
>  
> diff --git a/net/dsa/switch.c b/net/dsa/switch.c
> index eb38beb10147..1d3c161e3131 100644
> --- a/net/dsa/switch.c
> +++ b/net/dsa/switch.c
> @@ -210,21 +210,41 @@ static bool dsa_port_host_address_match(struct dsa_port *dp,
>  	return false;
>  }
>  
> +static bool dsa_db_equal(const struct dsa_db *a, const struct dsa_db *b)
> +{
> +	if (a->type != b->type)
> +		return false;
> +
> +	switch (a->type) {
> +	case DSA_DB_PORT:
> +		return a->dp == b->dp;
> +	case DSA_DB_LAG:
> +		return a->lag.dev == b->lag.dev;
> +	case DSA_DB_BRIDGE:
> +		return a->bridge.num == b->bridge.num;
> +	default:
> +		WARN_ON(1);
> +		return false;
> +	}
> +}
> +
>  static struct dsa_mac_addr *dsa_mac_addr_find(struct list_head *addr_list,
> -					      const unsigned char *addr,
> -					      u16 vid)
> +					      const unsigned char *addr, u16 vid,
> +					      struct dsa_db db)
>  {
>  	struct dsa_mac_addr *a;
>  
>  	list_for_each_entry(a, addr_list, list)
> -		if (ether_addr_equal(a->addr, addr) && a->vid == vid)
> +		if (ether_addr_equal(a->addr, addr) && a->vid == vid &&
> +		    dsa_db_equal(&a->db, &db))
>  			return a;
>  
>  	return NULL;
>  }
>  
>  static int dsa_port_do_mdb_add(struct dsa_port *dp,
> -			       const struct switchdev_obj_port_mdb *mdb)
> +			       const struct switchdev_obj_port_mdb *mdb,
> +			       struct dsa_db db)
>  {
>  	struct dsa_switch *ds = dp->ds;
>  	struct dsa_mac_addr *a;
> @@ -233,11 +253,11 @@ static int dsa_port_do_mdb_add(struct dsa_port *dp,
>  
>  	/* No need to bother with refcounting for user ports */
>  	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp)))
> -		return ds->ops->port_mdb_add(ds, port, mdb);
> +		return ds->ops->port_mdb_add(ds, port, mdb, db);
>  
>  	mutex_lock(&dp->addr_lists_lock);
>  
> -	a = dsa_mac_addr_find(&dp->mdbs, mdb->addr, mdb->vid);
> +	a = dsa_mac_addr_find(&dp->mdbs, mdb->addr, mdb->vid, db);
>  	if (a) {
>  		refcount_inc(&a->refcount);
>  		goto out;
> @@ -249,7 +269,7 @@ static int dsa_port_do_mdb_add(struct dsa_port *dp,
>  		goto out;
>  	}
>  
> -	err = ds->ops->port_mdb_add(ds, port, mdb);
> +	err = ds->ops->port_mdb_add(ds, port, mdb, db);
>  	if (err) {
>  		kfree(a);
>  		goto out;
> @@ -257,6 +277,7 @@ static int dsa_port_do_mdb_add(struct dsa_port *dp,
>  
>  	ether_addr_copy(a->addr, mdb->addr);
>  	a->vid = mdb->vid;
> +	a->db = db;
>  	refcount_set(&a->refcount, 1);
>  	list_add_tail(&a->list, &dp->mdbs);
>  
> @@ -267,7 +288,8 @@ static int dsa_port_do_mdb_add(struct dsa_port *dp,
>  }
>  
>  static int dsa_port_do_mdb_del(struct dsa_port *dp,
> -			       const struct switchdev_obj_port_mdb *mdb)
> +			       const struct switchdev_obj_port_mdb *mdb,
> +			       struct dsa_db db)
>  {
>  	struct dsa_switch *ds = dp->ds;
>  	struct dsa_mac_addr *a;
> @@ -276,11 +298,11 @@ static int dsa_port_do_mdb_del(struct dsa_port *dp,
>  
>  	/* No need to bother with refcounting for user ports */
>  	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp)))
> -		return ds->ops->port_mdb_del(ds, port, mdb);
> +		return ds->ops->port_mdb_del(ds, port, mdb, db);
>  
>  	mutex_lock(&dp->addr_lists_lock);
>  
> -	a = dsa_mac_addr_find(&dp->mdbs, mdb->addr, mdb->vid);
> +	a = dsa_mac_addr_find(&dp->mdbs, mdb->addr, mdb->vid, db);
>  	if (!a) {
>  		err = -ENOENT;
>  		goto out;
> @@ -289,7 +311,7 @@ static int dsa_port_do_mdb_del(struct dsa_port *dp,
>  	if (!refcount_dec_and_test(&a->refcount))
>  		goto out;
>  
> -	err = ds->ops->port_mdb_del(ds, port, mdb);
> +	err = ds->ops->port_mdb_del(ds, port, mdb, db);
>  	if (err) {
>  		refcount_set(&a->refcount, 1);
>  		goto out;
> @@ -305,7 +327,7 @@ static int dsa_port_do_mdb_del(struct dsa_port *dp,
>  }
>  
>  static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
> -			       u16 vid)
> +			       u16 vid, struct dsa_db db)
>  {
>  	struct dsa_switch *ds = dp->ds;
>  	struct dsa_mac_addr *a;
> @@ -314,11 +336,11 @@ static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
>  
>  	/* No need to bother with refcounting for user ports */
>  	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp)))
> -		return ds->ops->port_fdb_add(ds, port, addr, vid);
> +		return ds->ops->port_fdb_add(ds, port, addr, vid, db);
>  
>  	mutex_lock(&dp->addr_lists_lock);
>  
> -	a = dsa_mac_addr_find(&dp->fdbs, addr, vid);
> +	a = dsa_mac_addr_find(&dp->fdbs, addr, vid, db);
>  	if (a) {
>  		refcount_inc(&a->refcount);
>  		goto out;
> @@ -330,7 +352,7 @@ static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
>  		goto out;
>  	}
>  
> -	err = ds->ops->port_fdb_add(ds, port, addr, vid);
> +	err = ds->ops->port_fdb_add(ds, port, addr, vid, db);
>  	if (err) {
>  		kfree(a);
>  		goto out;
> @@ -338,6 +360,7 @@ static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
>  
>  	ether_addr_copy(a->addr, addr);
>  	a->vid = vid;
> +	a->db = db;
>  	refcount_set(&a->refcount, 1);
>  	list_add_tail(&a->list, &dp->fdbs);
>  
> @@ -348,7 +371,7 @@ static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
>  }
>  
>  static int dsa_port_do_fdb_del(struct dsa_port *dp, const unsigned char *addr,
> -			       u16 vid)
> +			       u16 vid, struct dsa_db db)
>  {
>  	struct dsa_switch *ds = dp->ds;
>  	struct dsa_mac_addr *a;
> @@ -357,11 +380,11 @@ static int dsa_port_do_fdb_del(struct dsa_port *dp, const unsigned char *addr,
>  
>  	/* No need to bother with refcounting for user ports */
>  	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp)))
> -		return ds->ops->port_fdb_del(ds, port, addr, vid);
> +		return ds->ops->port_fdb_del(ds, port, addr, vid, db);
>  
>  	mutex_lock(&dp->addr_lists_lock);
>  
> -	a = dsa_mac_addr_find(&dp->fdbs, addr, vid);
> +	a = dsa_mac_addr_find(&dp->fdbs, addr, vid, db);
>  	if (!a) {
>  		err = -ENOENT;
>  		goto out;
> @@ -370,7 +393,7 @@ static int dsa_port_do_fdb_del(struct dsa_port *dp, const unsigned char *addr,
>  	if (!refcount_dec_and_test(&a->refcount))
>  		goto out;
>  
> -	err = ds->ops->port_fdb_del(ds, port, addr, vid);
> +	err = ds->ops->port_fdb_del(ds, port, addr, vid, db);
>  	if (err) {
>  		refcount_set(&a->refcount, 1);
>  		goto out;
> @@ -386,14 +409,15 @@ static int dsa_port_do_fdb_del(struct dsa_port *dp, const unsigned char *addr,
>  }
>  
>  static int dsa_switch_do_lag_fdb_add(struct dsa_switch *ds, struct dsa_lag *lag,
> -				     const unsigned char *addr, u16 vid)
> +				     const unsigned char *addr, u16 vid,
> +				     struct dsa_db db)
>  {
>  	struct dsa_mac_addr *a;
>  	int err = 0;
>  
>  	mutex_lock(&lag->fdb_lock);
>  
> -	a = dsa_mac_addr_find(&lag->fdbs, addr, vid);
> +	a = dsa_mac_addr_find(&lag->fdbs, addr, vid, db);
>  	if (a) {
>  		refcount_inc(&a->refcount);
>  		goto out;
> @@ -405,7 +429,7 @@ static int dsa_switch_do_lag_fdb_add(struct dsa_switch *ds, struct dsa_lag *lag,
>  		goto out;
>  	}
>  
> -	err = ds->ops->lag_fdb_add(ds, *lag, addr, vid);
> +	err = ds->ops->lag_fdb_add(ds, *lag, addr, vid, db);
>  	if (err) {
>  		kfree(a);
>  		goto out;
> @@ -423,14 +447,15 @@ static int dsa_switch_do_lag_fdb_add(struct dsa_switch *ds, struct dsa_lag *lag,
>  }
>  
>  static int dsa_switch_do_lag_fdb_del(struct dsa_switch *ds, struct dsa_lag *lag,
> -				     const unsigned char *addr, u16 vid)
> +				     const unsigned char *addr, u16 vid,
> +				     struct dsa_db db)
>  {
>  	struct dsa_mac_addr *a;
>  	int err = 0;
>  
>  	mutex_lock(&lag->fdb_lock);
>  
> -	a = dsa_mac_addr_find(&lag->fdbs, addr, vid);
> +	a = dsa_mac_addr_find(&lag->fdbs, addr, vid, db);
>  	if (!a) {
>  		err = -ENOENT;
>  		goto out;
> @@ -439,7 +464,7 @@ static int dsa_switch_do_lag_fdb_del(struct dsa_switch *ds, struct dsa_lag *lag,
>  	if (!refcount_dec_and_test(&a->refcount))
>  		goto out;
>  
> -	err = ds->ops->lag_fdb_del(ds, *lag, addr, vid);
> +	err = ds->ops->lag_fdb_del(ds, *lag, addr, vid, db);
>  	if (err) {
>  		refcount_set(&a->refcount, 1);
>  		goto out;
> @@ -466,7 +491,8 @@ static int dsa_switch_host_fdb_add(struct dsa_switch *ds,
>  	dsa_switch_for_each_port(dp, ds) {
>  		if (dsa_port_host_address_match(dp, info->sw_index,
>  						info->port)) {
> -			err = dsa_port_do_fdb_add(dp, info->addr, info->vid);
> +			err = dsa_port_do_fdb_add(dp, info->addr, info->vid,
> +						  info->db);
>  			if (err)
>  				break;
>  		}
> @@ -487,7 +513,8 @@ static int dsa_switch_host_fdb_del(struct dsa_switch *ds,
>  	dsa_switch_for_each_port(dp, ds) {
>  		if (dsa_port_host_address_match(dp, info->sw_index,
>  						info->port)) {
> -			err = dsa_port_do_fdb_del(dp, info->addr, info->vid);
> +			err = dsa_port_do_fdb_del(dp, info->addr, info->vid,
> +						  info->db);
>  			if (err)
>  				break;
>  		}
> @@ -505,7 +532,7 @@ static int dsa_switch_fdb_add(struct dsa_switch *ds,
>  	if (!ds->ops->port_fdb_add)
>  		return -EOPNOTSUPP;
>  
> -	return dsa_port_do_fdb_add(dp, info->addr, info->vid);
> +	return dsa_port_do_fdb_add(dp, info->addr, info->vid, info->db);
>  }
>  
>  static int dsa_switch_fdb_del(struct dsa_switch *ds,
> @@ -517,7 +544,7 @@ static int dsa_switch_fdb_del(struct dsa_switch *ds,
>  	if (!ds->ops->port_fdb_del)
>  		return -EOPNOTSUPP;
>  
> -	return dsa_port_do_fdb_del(dp, info->addr, info->vid);
> +	return dsa_port_do_fdb_del(dp, info->addr, info->vid, info->db);
>  }
>  
>  static int dsa_switch_lag_fdb_add(struct dsa_switch *ds,
> @@ -532,7 +559,8 @@ static int dsa_switch_lag_fdb_add(struct dsa_switch *ds,
>  	dsa_switch_for_each_port(dp, ds)
>  		if (dsa_port_offloads_lag(dp, info->lag))
>  			return dsa_switch_do_lag_fdb_add(ds, info->lag,
> -							 info->addr, info->vid);
> +							 info->addr, info->vid,
> +							 info->db);
>  
>  	return 0;
>  }
> @@ -549,7 +577,8 @@ static int dsa_switch_lag_fdb_del(struct dsa_switch *ds,
>  	dsa_switch_for_each_port(dp, ds)
>  		if (dsa_port_offloads_lag(dp, info->lag))
>  			return dsa_switch_do_lag_fdb_del(ds, info->lag,
> -							 info->addr, info->vid);
> +							 info->addr, info->vid,
> +							 info->db);
>  
>  	return 0;
>  }
> @@ -604,7 +633,7 @@ static int dsa_switch_mdb_add(struct dsa_switch *ds,
>  	if (!ds->ops->port_mdb_add)
>  		return -EOPNOTSUPP;
>  
> -	return dsa_port_do_mdb_add(dp, info->mdb);
> +	return dsa_port_do_mdb_add(dp, info->mdb, info->db);
>  }
>  
>  static int dsa_switch_mdb_del(struct dsa_switch *ds,
> @@ -616,7 +645,7 @@ static int dsa_switch_mdb_del(struct dsa_switch *ds,
>  	if (!ds->ops->port_mdb_del)
>  		return -EOPNOTSUPP;
>  
> -	return dsa_port_do_mdb_del(dp, info->mdb);
> +	return dsa_port_do_mdb_del(dp, info->mdb, info->db);
>  }
>  
>  static int dsa_switch_host_mdb_add(struct dsa_switch *ds,
> @@ -631,7 +660,7 @@ static int dsa_switch_host_mdb_add(struct dsa_switch *ds,
>  	dsa_switch_for_each_port(dp, ds) {
>  		if (dsa_port_host_address_match(dp, info->sw_index,
>  						info->port)) {
> -			err = dsa_port_do_mdb_add(dp, info->mdb);
> +			err = dsa_port_do_mdb_add(dp, info->mdb, info->db);
>  			if (err)
>  				break;
>  		}
> @@ -652,7 +681,7 @@ static int dsa_switch_host_mdb_del(struct dsa_switch *ds,
>  	dsa_switch_for_each_port(dp, ds) {
>  		if (dsa_port_host_address_match(dp, info->sw_index,
>  						info->port)) {
> -			err = dsa_port_do_mdb_del(dp, info->mdb);
> +			err = dsa_port_do_mdb_del(dp, info->mdb, info->db);
>  			if (err)
>  				break;
>  		}
> -- 
> 2.25.1
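
For anyone following the refcounting change, here is a minimal user-space sketch of the lookup keyed on (addr, vid, db). It is not the kernel code: struct db, struct mac_entry, struct addr_list, entry_find() and entry_add() are hypothetical stand-ins, and only the DSA_DB_BRIDGE case is modelled. It only illustrates why bridge.num is forced to zero when fdb_isolation is absent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct db {
	unsigned int bridge_num;	/* models db.bridge.num only */
};

struct mac_entry {
	uint8_t addr[6];
	uint16_t vid;
	struct db db;
	unsigned int refcount;
	bool used;
};

#define MAX_ENTRIES 8

struct addr_list {
	struct mac_entry e[MAX_ENTRIES];
};

/* Match on address, VID and database, like the patched lookup. */
static struct mac_entry *entry_find(struct addr_list *l, const uint8_t *addr,
				    uint16_t vid, struct db db)
{
	for (size_t i = 0; i < MAX_ENTRIES; i++) {
		struct mac_entry *a = &l->e[i];

		if (a->used && !memcmp(a->addr, addr, 6) && a->vid == vid &&
		    a->db.bridge_num == db.bridge_num)
			return a;
	}
	return NULL;
}

/* Returns the refcount after the add, or 0 if the table is full. */
static unsigned int entry_add(struct addr_list *l, const uint8_t *addr,
			      uint16_t vid, struct db db, bool fdb_isolation)
{
	struct mac_entry *a;

	assert(l && addr);

	/* Without isolation all bridges share one refcounting key. */
	if (!fdb_isolation)
		db.bridge_num = 0;

	a = entry_find(l, addr, vid, db);
	if (a)
		return ++a->refcount;

	for (size_t i = 0; i < MAX_ENTRIES; i++) {
		a = &l->e[i];
		if (a->used)
			continue;
		memcpy(a->addr, addr, 6);
		a->vid = vid;
		a->db = db;
		a->refcount = 1;
		a->used = true;
		return 1;
	}
	return 0;
}
```

In this sketch, adding the same address and VID from two bridges bumps a single entry when isolation is off, and creates two separate entries when it is on.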


* Re: [linux-next:master] BUILD REGRESSION e7d6987e09a328d4a949701db40ef63fbb970670
From: Jiri Pirko @ 2022-04-26 14:59 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: kernel test robot, Andrew Morton, netdev, Ido Schimmel
In-Reply-To: <Ymfol/Cf66KCYKA1@nanopsycho>

Tue, Apr 26, 2022 at 02:41:59PM CEST, jiri@resnulli.us wrote:
>Tue, Apr 26, 2022 at 02:17:16PM CEST, kuba@kernel.org wrote:
>>On Tue, 26 Apr 2022 13:42:04 +0800 kernel test robot wrote:
>>> drivers/net/ethernet/mellanox/mlxsw/core_linecards.c:851:8: warning: Use of memory after it is freed [clang-analyzer-unix.Malloc]
>>
>>Hi Ido, Jiri,
>>
>>is this one on your radar?
>
>Will send a fix for this, thanks.

Can't find the line. I don't see
e7d6987e09a328d4a949701db40ef63fbb970670 in linux-next :/

>


* Re: [RFC v1 3/3] arm64: dts: rockchip: Add mt7531 dsa node to BPI-R2-Pro board
From: Frank Wunderlich @ 2022-04-26 14:57 UTC (permalink / raw)
  To: Frank Wunderlich
  Cc: linux-mediatek, linux-rockchip, Rob Herring, Krzysztof Kozlowski,
	Heiko Stuebner, Sean Wang, Landen Chao, DENG Qingfang,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, Paolo Abeni, Matthias Brugger,
	Peter Geis, devicetree, linux-arm-kernel, linux-kernel, netdev
In-Reply-To: <20220426134924.30372-4-linux@fw-web.de>

> Sent: Tuesday, 26 April 2022 at 15:49
> From: "Frank Wunderlich" <linux@fw-web.de>

> +&mdio0 {
> +	#address-cells = <1>;
> +	#size-cells = <0>;
> +
> +	switch@0 {
> +		compatible = "mediatek,mt7531";
> +		reg = <0>;
> +		status = "disabled";

It seems I forgot to delete this, but it looks like it was ignored, as the switch was probed.

> +		ports {
> +			#address-cells = <1>;
> +			#size-cells = <0>;



* Re: [PATCH RFC 4/5] net/tls: Add support for PF_TLSH (a TLS handshake listener)
From: Jakub Kicinski @ 2022-04-26 14:55 UTC (permalink / raw)
  To: Chuck Lever III
  Cc: netdev, Linux NFS Mailing List, linux-nvme@lists.infradead.org,
	linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	ak@tempesta-tech.com, borisp@nvidia.com, simo@redhat.com
In-Reply-To: <E8809EC2-D49A-4171-8C88-D5E24FFA4079@oracle.com>

On Tue, 26 Apr 2022 13:48:20 +0000 Chuck Lever III wrote:
> > Create the socket in user space, do all the handshakes you need there
> > and then pass it to the kernel.  This is how NBD + TLS works.  Scales
> > better and requires much less kernel code.  
> 
> The RPC-with-TLS standard allows unencrypted RPC traffic on the connection
> before sending ClientHello. I think we'd like to stick with creating the
> socket in the kernel, for this reason and for the reasons Hannes mentions
> in his reply.

Umpf, I presume that's reviewed by security people in IETF so I guess
it's done right this time (tm).

Your wording seems careful not to imply that you actually need that,
tho. Am I over-interpreting?


* Re: [PATCH RFC 4/5] net/tls: Add support for PF_TLSH (a TLS handshake listener)
From: Jakub Kicinski @ 2022-04-26 14:55 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Chuck Lever, netdev, linux-nfs, linux-nvme, linux-cifs,
	linux-fsdevel, ak, borisp, simo
In-Reply-To: <66077b73-c1a4-d2ae-c8e4-3e19e9053171@suse.de>

On Tue, 26 Apr 2022 11:43:37 +0200 Hannes Reinecke wrote:
> > Create the socket in user space, do all the handshakes you need there
> > and then pass it to the kernel.  This is how NBD + TLS works.  Scales
> > better and requires much less kernel code.
> >   
> But we can't, as the existing mechanisms (at least for NVMe) create the 
> socket in-kernel.
> Having to create the socket in userspace would require a completely new 
> interface for nvme and will not be backwards compatible.
> Not to mention having to rework the nvme driver to accept sockets from 
> userspace instead of creating them internally.
> 
> With this approach we can keep existing infrastructure, and can get a 
> common implementation for either transport.

You add 1.5kLoC and require running a user space agent, surely you're
adding new interfaces and are not backward-compatible already.

I don't understand your argument, maybe you could rephrase / dumb it
down for me?


* Fw: [Bug 215888] New: raw socket test with stress-ng trigger soft lockup
From: Stephen Hemminger @ 2022-04-26 14:54 UTC (permalink / raw)
  To: netdev



Begin forwarded message:

Date: Tue, 26 Apr 2022 09:31:50 +0000
From: bugzilla-daemon@kernel.org
To: stephen@networkplumber.org
Subject: [Bug 215888] New: raw socket test with stress-ng trigger soft lockup


https://bugzilla.kernel.org/show_bug.cgi?id=215888

            Bug ID: 215888
           Summary: raw socket test with stress-ng trigger soft lockup
           Product: Networking
           Version: 2.5
    Kernel Version: 5.17
          Hardware: Intel
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: blocking
          Priority: P1
         Component: IPV4
          Assignee: stephen@networkplumber.org
          Reporter: colin.king@canonical.com
        Regression: No

Running stress-ng [1] with the following raw socket stressor triggers a
softlockup on a SMP NUMA x86-64 system:

sudo stress-ng --rawsock 20 -t 60

kernel:watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [stress-ng:49781]

Tested this on 5.17. User has also reported this against the stress-ng project:

https://github.com/ColinIanKing/stress-ng/issues/187

[1] Stress-ng:
https://github.com/ColinIanKing/stress-ng
git clone https://github.com/ColinIanKing/stress-ng
cd stress-ng
make
sudo ./stress-ng --rawsock 0 -t 60

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are the assignee for the bug.


* Re: [PATCH net-next 00/11] mlxsw: extend line card model by devices and info
From: Jakub Kicinski @ 2022-04-26 14:51 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Ido Schimmel, Ido Schimmel, netdev, davem, pabeni, jiri, petrm,
	dsahern, andrew, mlxsw
In-Reply-To: <Ymf66h5dMNOLun8k@nanopsycho>

On Tue, 26 Apr 2022 16:00:10 +0200 Jiri Pirko wrote:
> >> But you have to somehow link the component to the particular gearbox on
> >> particular line card. Say, you need to flash GB on line card 8. This is
> >> basically providing a way to expose this relationship to user.
> >> 
> >> Also, the "lc info" shows the FW version for gearboxes. As Ido
> >> mentioned, the GB versions could be listed in "devlink dev info" in
> >> theory. But then, you need to somehow expose the relationship with
> >> line card as well.  
> >
> >Why would the automation which comes to update the firmware care 
> >at all where the component is? Humans can see what the component 
> >is by looking at the name.  
> 
> The relationship-by-name sounds a bit fragile to me. The names of
> components are up to the individual drivers.

I asked you how the automation will operate. You must answer questions
if you want to have a discussion. Automation is the relevant part.
You're not designing an interface for SDK users but for end users.

> >If we do need to know (*if*!) you can list FW components as a lc
> >attribute, no need for new commands and objects.  
> 
> There is no new command for that, only one nested attribute which
> carries the device list added to the existing command. They are no new
> objects, they are just few nested values.

DEVLINK_CMD_LINECARD_INFO_GET

> >IMHO we should either keep lc objects simple and self contained or 
> >give them a devlink instance. Creating sub-objects from them is very  
> 
> Give them a devlink instance? I don't understand how. LC is not a
> separate device, far from that. That does not make any sense to me.

You can put a name of another devlink instance as an attribute of a lc.
See below.

> >worrying. If there is _any_ chance we'll need per-lc health reporters 
> >or sbs or params(🤢) etc. etc. - let's bite the bullet _now_ and create
> >full devlink sub-instances!  
> 
> Does not make sense to me at all. Line cards are detachable PHY sets in
> essence, very basic functionality. They do not have buffers, health
> and params, I don't think so. 

I guess the definition of a "line card" has become somewhat murky over
the years, since the olden days of serial lines.

Perhaps David and others can enlighten us but what I'm used to hearing
about as a line card these days in a chassis system is a full-on switch.
Chassis being effectively a Clos network in a box, the main difference
being the line cards talk cells to the backplane, not full packets.

Back in my Netronome days we called those detachable front panel gear
boxes "phy mods". Those had nowhere near the complexity of a real line
card. Sounds like that's more aligned with what you have.

To summarize, since your definition of a line card is a bit special,
the less uAPI we add to fit your definition the better.

> >> I don't see any simpler iface than this.  
> >
> >Based on the assumptions you've made, maybe, but the uAPI should
> >abstract away the irrelevant details. I'm questioning the assumptions.  
> 
> Is the FW version of gearbox on a line card irrelevant detail?

Not what I said.

> If so, how does the user know if/when to flash it?
> If not, where would you list it if devices nest is not the correct place?

Let me mock up what I had in mind for you since it did not come thru 
in the explanation:

$ devlink dev info show pci/0000:01:00.0
    versions:
        fixed:
          hw.revision 0
          lc2.hw.revision a
          lc8.hw.revision b
        running:
          ini.version 4
          lc2.gearbox 1.1.3
          lc8.gearbox 1.2.3

$ devlink lc show pci/0000:01:00.0 lc 8
pci/0000:01:00.0:
  lc 8 state active type 16x100G
    supported_types:
      16x100G
    versions: 
      lc8.hw.revision (b)
      lc8.gearbox (1.2.3)

Where the data in the brackets is optionally fetched thru the existing
"dev info" API, but rendered together by the user space.
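
A rough sketch of that user space side, assuming the per-line-card versions can be picked out of the "dev info" list by an "lc<N>." name prefix as in the mock-up. struct version_entry and lc_versions() are made-up names for illustration, not devlink or iproute2 code.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical name/value pair as reported by "dev info". */
struct version_entry {
	const char *name;
	const char *value;
};

/*
 * Collect pointers to the entries whose name starts with "lc<lc>.".
 * Returns the number of entries stored in out[], at most max.
 */
static size_t lc_versions(const struct version_entry *all, size_t n,
			  unsigned int lc,
			  const struct version_entry **out, size_t max)
{
	char prefix[16];
	size_t found = 0;
	int len;

	len = snprintf(prefix, sizeof(prefix), "lc%u.", lc);
	if (len < 0 || (size_t)len >= sizeof(prefix))
		return 0;

	for (size_t i = 0; i < n && found < max; i++)
		if (!strncmp(all[i].name, prefix, (size_t)len))
			out[found++] = &all[i];

	return found;
}
```

With the entries from the mock-up, asking for line card 8 would return lc8.hw.revision and lc8.gearbox, which "lc show" could then print next to the line card.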

> >> There are 4 gearboxes on the line card. They share the same flash. So
> >> if you flash gearbox 0, the rest will use the same FW.  
> >
> >o_0 so the FW component is called lcX_dev0 and yet it applies to _all_
> >devices, not just dev0?! Looking at the output above I thought other
> >devices simply don't have FW ("flashable false").  
> 
> Yes, device 0 is "flash master" (RW). 1-3 are RO. I know it is a bit
> confusing. Maybe Andy's suggestion of "shared" flag of some sort might
> help.
> 
> >> I'm exposing them for the sake of completeness. Also, the interface
> >> needs to be designed as a list anyway, as different line cards may
> >> have separate flash per gearbox.
> >> 
> >> What is the harm in exposing devices 1-3? If you insist, we can hide
> >> them.  
> >
> >Well, they are unnecessary (this is uAPI), and coming from the outside
> >I misinterpreted what the information reported means, so yeah, I'd
> >classify it as harmful :(  
> 
> UAPI is the "devices nest". It has to be list one way or another
> (we may need to expose more gearboxes anyway). So what is differently
> harmful with having list [0] or list [0,1,2,3] ?


* Re: [PATCH bpf] xsk: fix possible crash when multiple sockets are created
From: patchwork-bot+netdevbpf @ 2022-04-26 14:30 UTC (permalink / raw)
  To: Maciej Fijalkowski; +Cc: bpf, ast, daniel, andriin, netdev, magnus.karlsson
In-Reply-To: <20220425153745.481322-1-maciej.fijalkowski@intel.com>

Hello:

This patch was applied to bpf/bpf.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Mon, 25 Apr 2022 17:37:45 +0200 you wrote:
> Fix a crash that happens if an Rx only socket is created first, then a
> second socket is created that is Tx only and bound to the same umem as
> the first socket and also the same netdev and queue_id together with the
> XDP_SHARED_UMEM flag. In this specific case, the tx_descs array page
> pool was not created by the first socket as it was an Rx only socket.
> When the second socket is bound it needs this tx_descs array of this
> shared page pool as it has a Tx component, but unfortunately it was
> never allocated, leading to a crash. Note that this array is only used
> for zero-copy drivers using the batched Tx APIs, currently only ice and
> i40e.
> 
> [...]

Here is the summary with links:
  - [bpf] xsk: fix possible crash when multiple sockets are created
    https://git.kernel.org/bpf/bpf/c/ba3beec2ec1d

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH RFC 4/5] net/tls: Add support for PF_TLSH (a TLS handshake listener)
From: Sagi Grimberg @ 2022-04-26 14:29 UTC (permalink / raw)
  To: Hannes Reinecke, Jakub Kicinski, Chuck Lever
  Cc: netdev, linux-nfs, linux-nvme, linux-cifs, linux-fsdevel, ak,
	borisp, simo
In-Reply-To: <66077b73-c1a4-d2ae-c8e4-3e19e9053171@suse.de>


>>> Currently the prototype does not handle multiple listeners that
>>> overlap -- multiple listeners in the same net namespace that have
>>> overlapping bind addresses.
>>
>> Create the socket in user space, do all the handshakes you need there
>> and then pass it to the kernel.  This is how NBD + TLS works.  Scales
>> better and requires much less kernel code.
>>
> But we can't, as the existing mechanisms (at least for NVMe) create the 
> socket in-kernel.
> Having to create the socket in userspace would require a completely new 
> interface for nvme and will not be backwards compatible.

And we will still need the upcall anyways when we reconnect 
(re-establish the socket)


* Re: [PATCH net-next 00/11] mlxsw: extend line card model by devices and info
From: Jiri Pirko @ 2022-04-26 14:05 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Ido Schimmel, Jakub Kicinski, Ido Schimmel, netdev, davem, pabeni,
	jiri, petrm, dsahern, mlxsw
In-Reply-To: <Ymf3jKNeyuYHzsBC@lunn.ch>

Tue, Apr 26, 2022 at 03:45:48PM CEST, andrew@lunn.ch wrote:
>> Well, I got your point. If the HW would be designed in the way the
>> building blocks are exposed to the host, that would work. However, that
>> is not the case here, unfortunately.
>
>I'm with Jakub. It is the uAPI which matters here. It should look the
>same for a SoC style enterprise router and your discombobulated TOR
>router. How you talk to the different building blocks is an
>implementation detail.

It's not that simple. Take the gearbox for example. You say a bunch of
MDIO registers. ASIC FW has a custom SDK internally that is used to
talk to the gearbox.

As for the flash, you say expose it via MTD, but there is no direct
access to it from the host. Can't be done. There are HW design
limitations that are blocking your concept.


* Re: [PATCH net-next 1/5] net: ipqess: introduce the Qualcomm IPQESS driver
From: Maxime Chevallier @ 2022-04-26 13:59 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: davem, Rob Herring, netdev, linux-kernel, devicetree,
	thomas.petazzoni, Florian Fainelli, Heiner Kallweit, Russell King,
	linux-arm-kernel, Vladimir Oltean, Luka Perkov, Robert Marko
In-Reply-To: <YmMN37VjQNwhLDuX@lunn.ch>

Hello Andrew,

On Fri, 22 Apr 2022 22:19:43 +0200
Andrew Lunn <andrew@lunn.ch> wrote:

Thanks for the review :)

> > +static int ipqess_axi_probe(struct platform_device *pdev)
> > +{
> > +	struct device_node *np = pdev->dev.of_node;
> > +	struct net_device *netdev;
> > +	phy_interface_t phy_mode;
> > +	struct resource *res;
> > +	struct ipqess *ess;
> > +	int i, err = 0;
> > +
> > +	netdev = devm_alloc_etherdev_mqs(&pdev->dev, sizeof(struct
> > ipqess),
> > +					 IPQESS_NETDEV_QUEUES,
> > +					 IPQESS_NETDEV_QUEUES);
> > +	if (!netdev)
> > +		return -ENOMEM;
> > +
> > +	ess = netdev_priv(netdev);
> > +	ess->netdev = netdev;
> > +	ess->pdev = pdev;
> > +	spin_lock_init(&ess->stats_lock);
> > +	SET_NETDEV_DEV(netdev, &pdev->dev);
> > +	platform_set_drvdata(pdev, netdev);  
> 
> ....
> 
> > +
> > +	ipqess_set_ethtool_ops(netdev);
> > +
> > +	err = register_netdev(netdev);
> > +	if (err)
> > +		goto err_out;  
> 
> Before register_netdev() even returns, your devices can be in use, the
> open callback called and packets sent. This is particularly true for
> NFS root. Which means any setup done after this is probably wrong.

Nice catch, thank you !

> > +
> > +	err = ipqess_hw_init(ess);
> > +	if (err)
> > +		goto err_out;
> > +
> > +	for (i = 0; i < IPQESS_NETDEV_QUEUES; i++) {
> > +		int qid;
> > +
> > +		netif_tx_napi_add(netdev, &ess->tx_ring[i].napi_tx,
> > +				  ipqess_tx_napi, 64);
> > +		netif_napi_add(netdev,
> > +			       &ess->rx_ring[i].napi_rx,
> > +			       ipqess_rx_napi, 64);
> > +
> > +		qid = ess->tx_ring[i].idx;
> > +		err = devm_request_irq(&ess->netdev->dev, ess->tx_irq[qid],
> > +				       ipqess_interrupt_tx, 0,
> > +				       ess->tx_irq_names[qid],
> > +				       &ess->tx_ring[i]);
> > +		if (err)
> > +			goto err_out;
> > +
> > +		qid = ess->rx_ring[i].idx;
> > +		err = devm_request_irq(&ess->netdev->dev, ess->rx_irq[qid],
> > +				       ipqess_interrupt_rx, 0,
> > +				       ess->rx_irq_names[qid],
> > +				       &ess->rx_ring[i]);
> > +		if (err)
> > +			goto err_out;
> > +	}  
> 
> All this should probably go before register_netdev().

I'll fix this for V2.
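
The ordering problem raised above can be modelled in plain userspace C.
This is only a sketch under stated assumptions: struct fake_dev and all
fake_*() helpers are hypothetical stand-ins, not the real ipqess or
netdev API. The model treats registration as the moment the device can
be opened, and records whether an open happened before hardware setup
was finished.

```c
#include <stdbool.h>

/* Hypothetical stand-in for driver state; not the real struct ipqess. */
struct fake_dev {
	bool hw_ready;      /* set once hardware init and IRQ setup are done */
	bool registered;    /* set once the device is visible to the stack */
	bool opened_early;  /* records an open that saw unready hardware */
};

/* Models the open callback: may run as soon as the device is registered. */
static void fake_open(struct fake_dev *d)
{
	if (!d->hw_ready)
		d->opened_early = true;
}

/* Models registration: the device is published and opened immediately. */
static void fake_register(struct fake_dev *d)
{
	d->registered = true;
	fake_open(d); /* e.g. NFS root bringing the interface up at once */
}

/* Models hardware init plus NAPI and IRQ setup. */
static void fake_hw_init(struct fake_dev *d)
{
	d->hw_ready = true;
}

/* Ordering from the patch under review: register first, set up after. */
bool probe_register_first(void)
{
	struct fake_dev d = { 0 };

	fake_register(&d);
	fake_hw_init(&d);
	return d.opened_early;
}

/* Ordering suggested in the review: finish setup, register last. */
bool probe_register_last(void)
{
	struct fake_dev d = { 0 };

	fake_hw_init(&d);
	fake_register(&d);
	return d.opened_early;
}
```

In this model only probe_register_first() reports an early open, which
is the situation the review warns about.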

> > +static int ipqess_get_strset_count(struct net_device *netdev, int sset)
> > +{
> > +	switch (sset) {
> > +	case ETH_SS_STATS:
> > +		return ARRAY_SIZE(ipqess_stats);
> > +	default:
> > +		netdev_dbg(netdev, "%s: Invalid string set", __func__);
> 
> Unsupported would be better than invalid.

That's right, thanks

> > +		return -EOPNOTSUPP;
> > +	}
> > +}  
> 
>   Andrew

Best Regards,

Maxime


* Re: [PATCH net-next 4/5] net: dt-bindings: Introduce the Qualcomm IPQESS Ethernet controller
From: Maxime Chevallier @ 2022-04-26 14:02 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: davem, Rob Herring, netdev, linux-kernel, devicetree,
	thomas.petazzoni, Andrew Lunn, Florian Fainelli, Heiner Kallweit,
	Russell King, linux-arm-kernel, Vladimir Oltean, Luka Perkov,
	Robert Marko
In-Reply-To: <34d3bfdc-cf8c-bf63-4f67-57c8d6c9b780@linaro.org>

Hi Krzysztof

On Sat, 23 Apr 2022 19:49:30 +0200
Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> wrote:

Thanks a lot for the review, I'll address all your comments in a V2.

> On 22/04/2022 20:03, Maxime Chevallier wrote:
> > Add the DT binding for the IPQESS Ethernet Controller. This is a
> > simple controller, only requiring the phy-mode, interrupts, clocks,
> > and possibly a MAC address setting.
> > 
> > Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
> > ---
> >  .../devicetree/bindings/net/qcom,ipqess.yaml  | 94 +++++++++++++++++++
> >  1 file changed, 94 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/net/qcom,ipqess.yaml
> > 
> > diff --git a/Documentation/devicetree/bindings/net/qcom,ipqess.yaml b/Documentation/devicetree/bindings/net/qcom,ipqess.yaml
> > new file mode 100644
> > index 000000000000..8fec5633692f
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/net/qcom,ipqess.yaml
> > @@ -0,0 +1,94 @@
> > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > +%YAML 1.2
> > +---
> > +$id: http://devicetree.org/schemas/net/qcom,ipqess.yaml#
> > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > +
> > +title: Qualcomm IPQ ESS EDMA Ethernet Controller Device Tree Bindings
> 
> s/Device Tree Bindings//
> 
> > +
> > +allOf:
> > +  - $ref: "ethernet-controller.yaml#"  
> 
> allOf goes after maintainers.
> 
> > +
> > +maintainers:
> > +  - Maxime Chevallier <maxime.chevallier@bootlin.com>
> > +
> > +properties:
> > +  compatible:
> > +    const: qcom,ipq4019-ess-edma
> > +
> > +  reg:
> > +    maxItems: 1
> > +
> > +  interrupts:
> > +    minItems: 2
> > +    maxItems: 32
> > +    description: One interrupt per tx and rx queue, with up to 16 queues.
> > +
> > +  clocks:
> > +    maxItems: 1
> > +
> > +  phy-mode: true
> > +
> > +  fixed-link: true
> > +
> > +  mac-address: true  
> 
> You don't need all these three. They come from ethernet-controller and
> you use unevaluatedProperties.
> 
> > +
> > +required:
> > +  - compatible
> > +  - reg
> > +  - interrupts
> > +  - clocks
> > +  - phy-mode
> > +
> > +unevaluatedProperties: false
> > +
> > +examples:
> > +  - |
> > +    gmac: ethernet@c080000 {
> > +        compatible = "qcom,ipq4019-ess-edma";
> > +        reg = <0xc080000 0x8000>;
> > +        interrupts = <GIC_SPI  65 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI  66 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI  67 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI  68 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI  69 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI  70 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI  71 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI  72 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI  73 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI  74 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI  75 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI  76 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI  77 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI  78 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI  79 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI  80 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI 240 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI 241 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI 242 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI 243 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI 244 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI 245 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI 246 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI 247 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI 248 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI 249 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI 250 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI 251 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI 252 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI 253 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI 254 IRQ_TYPE_EDGE_RISING>,
> > +                     <GIC_SPI 255 IRQ_TYPE_EDGE_RISING>;
> > +
> > +        status = "okay";  
> 
> No status in the example.
> 
> > +
> > +        phy-mode = "internal";
> > +        fixed-link {
> > +            speed = <1000>;
> > +            full-duplex;
> > +            pause;
> > +            asym-pause;
> > +        };
> > +    };
> > +
> > +...  
> 
> 
> Best regards,
> Krzysztof

Best Regards,

Maxime


* Re: [PATCH net-next 00/11] mlxsw: extend line card model by devices and info
From: Jiri Pirko @ 2022-04-26 14:00 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Ido Schimmel, Ido Schimmel, netdev, davem, pabeni, jiri, petrm,
	dsahern, andrew, mlxsw
In-Reply-To: <20220426054130.7d997821@kernel.org>

Tue, Apr 26, 2022 at 02:41:30PM CEST, kuba@kernel.org wrote:
>On Tue, 26 Apr 2022 08:57:15 +0200 Jiri Pirko wrote:
>> >> In this particular case, these devices are gearboxes. They are running
>> >> their own firmware and we want user space to be able to query and update
>> >> the running firmware version.  
>> >
>> >Nothing too special, then, we don't create "devices" for every
>> >component of the system which can have a separate FW. That's where
>> >"components" are intended to be used..  
>> 
>> *
>> Sure, that is why I re-used components :)
>
>Well, right, I guess you did reuse them a little :)

I use them a lot. It is not visible in this patchset, but in the
flashing follow-up patchset.


>
>> But you have to somehow link the component to the particular gearbox on
>> particular line card. Say, you need to flash GB on line card 8. This is
>> basically providing a way to expose this relationship to user.
>> 
>> Also, the "lc info" shows the FW version for gearboxes. As Ido
>> mentioned, the GB versions could be listed in "devlink dev info" in
>> theory. But then, you need to somehow expose the relationship with
>> line card as well.
>
>Why would the automation which comes to update the firmware care 
>at all where the component is? Humans can see what the component 
>is by looking at the name.

The relationship-by-name sounds a bit fragile to me. The names of
components are up to the individual drivers.


>
>If we do need to know (*if*!) you can list FW components as a lc
>attribute, no need for new commands and objects.

There is no new command for that, only one nested attribute, carrying
the device list, added to the existing command. There are no new
objects, just a few nested values.


>
>IMHO we should either keep lc objects simple and self contained or 
>give them a devlink instance. Creating sub-objects from them is very

Give them a devlink instance? I don't understand how. LC is not a
separate device, far from that. That does not make any sense to me.


>worrying. If there is _any_ chance we'll need per-lc health reporters 
>or sbs or params(🤢) etc. etc. - let's bite the bullet _now_ and create
>full devlink sub-instances!

That does not make sense to me at all. Line cards are, in essence,
detachable PHY sets with very basic functionality. I don't think they
have buffers, health or params.


>
>> I don't see any simpler iface than this.
>
>Based on the assumptions you've made, maybe, but the uAPI should
>abstract away the irrelevant details. I'm questioning the assumptions.

Is the FW version of a gearbox on a line card an irrelevant detail?
If so, how does the user know if/when to flash it?
If not, where would you list it if devices nest is not the correct place?


>
>> >> The idea (implemented in the next patchset) is to let these devices
>> >> expose their own "component name", which can then be plugged into
>> >> the existing flash command:
>> >> 
>> >>     $ devlink lc show pci/0000:01:00.0 lc 8
>> >>     pci/0000:01:00.0:
>> >>       lc 8 state active type 16x100G
>> >>         supported_types:
>> >>            16x100G
>> >>         devices:
>> >>           device 0 flashable true component lc8_dev0
>> >>           device 1 flashable false
>> >>           device 2 flashable false
>> >>           device 3 flashable false
>> >>     $ devlink dev flash pci/0000:01:00.0 file some_file.mfa2 component lc8_dev0
>> >
>> >IDK if it's just me or this assumes deep knowledge of the system.
>> >I don't understand why we need to list devices 1-3 at all. And they
>> >don't even have names. No information is exposed.   
>> 
>> There are 4 gearboxes on the line card. They share the same flash. So
>> if you flash gearbox 0, the rest will use the same FW.
>
>o_0 so the FW component is called lcX_dev0 and yet it applies to _all_
>devices, not just dev0?! Looking at the output above I thought other
>devices simply don't have FW ("flashable false").

Yes, device 0 is "flash master" (RW). 1-3 are RO. I know it is a bit
confusing. Maybe Andy's suggestion of "shared" flag of some sort might
help.


>
>> I'm exposing them for the sake of completeness. Also, the interface
>> needs to be designed as a list anyway, as different line cards may
>> have separate flash per gearbox.
>> 
>> What is the harm in exposing devices 1-3? If you insist, we can hide
>> them.
>
>Well, they are unnecessary (this is uAPI), and coming from the outside
>I misinterpreted what the information reported means, so yeah, I'd
>classify it as harmful :(

The uAPI is the "devices nest". It has to be a list one way or another
(we may need to expose more gearboxes anyway). So what makes a list
[0,1,2,3] more harmful than a list [0]?



* [RFC v1 0/3] Add MT7531 switch to BPI-R2Pro Board
From: Frank Wunderlich @ 2022-04-26 13:49 UTC (permalink / raw)
  To: linux-mediatek, linux-rockchip
  Cc: Frank Wunderlich, Rob Herring, Krzysztof Kozlowski,
	Heiko Stuebner, Sean Wang, Landen Chao, DENG Qingfang,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, Paolo Abeni, Matthias Brugger,
	Peter Geis, devicetree, linux-arm-kernel, linux-kernel, netdev

From: Frank Wunderlich <frank-w@public-files.de>

The Rockchip board Bananapi-R2-Pro has a Mediatek MT7531 switch
connected to GMAC0.
This series changes the DSA driver where needed to work on this board
and adds the necessary Device Tree node.

Frank Wunderlich (3):
  net: dsa: mt753x: make reset optional
  net: dsa: mt753x: make CPU-Port dynamic
  arm64: dts: rockchip: Add mt7531 dsa node to BPI-R2-Pro board

 .../boot/dts/rockchip/rk3568-bpi-r2-pro.dts   | 49 +++++++++++++++++
 drivers/net/dsa/mt7530.c                      | 53 ++++++++++---------
 drivers/net/dsa/mt7530.h                      |  2 +-
 3 files changed, 78 insertions(+), 26 deletions(-)

-- 
2.25.1



* Re: [PATCH net 1/1] net: stmmac: disable Split Header (SPH) for Intel platforms
From: Kurt Kanzenbach @ 2022-04-26 13:58 UTC (permalink / raw)
  To: Tan Tee Min, Giuseppe Cavallaro, Alexandre Torgue, Jose Abreu,
	David S . Miller, Jakub Kicinski, Paolo Abeni, Maxime Coquelin,
	Michael Sit Wei Hong, Xiaoliang Yang, Wong Vee Khee, Tan Tee Min,
	Ling Pei Lee, Bhupesh Sharma
  Cc: netdev, linux-stm32, linux-arm-kernel, linux-kernel, stable,
	Voon Wei Feng, Song Yoong Siang, Ong, Boon Leong, Tan Tee Min
In-Reply-To: <20220426074531.4115683-1-tee.min.tan@linux.intel.com>

Hi,

On Tue Apr 26 2022, Tan Tee Min wrote:
> Based on the DesignWare Ethernet QoS datasheet, the Split Header (SPH)
> feature is not supported for IPv4 fragmented packets. This limitation
> causes ping failures when the packet size exceeds the MTU size: once
> the basic ping packet is larger than the configured MTU, the data
> inside the fragmented packet is lost, replaced by zeros or corrupted
> values, and the ping fails.
>
> So, disable the Split Header for Intel platforms.

Does this issue only apply on Intel platforms?

Thanks,
Kurt
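
For reference, the condition described in the quoted commit message
(a ping that exceeds the MTU and therefore gets fragmented) comes down
to simple header arithmetic. The sketch below assumes an IPv4 header
without options (20 bytes) and an 8-byte ICMP echo header; the function
name is made up for illustration and is not part of stmmac.

```c
#include <stdbool.h>

#define IPV4_HDR_LEN 20 /* IPv4 header without options */
#define ICMP_HDR_LEN 8  /* ICMP echo request header */

/*
 * Returns true if an ICMP echo request carrying `payload` bytes of data
 * does not fit into one IPv4 packet of at most `mtu` bytes and therefore
 * has to be fragmented.
 */
bool ping_needs_fragmentation(unsigned int payload, unsigned int mtu)
{
	return payload + ICMP_HDR_LEN + IPV4_HDR_LEN > mtu;
}
```

With an MTU of 1500, a payload of 1472 bytes still fits in one packet,
while 1473 bytes is the first size that would be fragmented and so could
hit the SPH limitation.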


* Re: [PATCH net-next 2/5] net: dsa: add out-of-band tagging protocol
From: Maxime Chevallier @ 2022-04-26 13:57 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: davem, Rob Herring, netdev, linux-kernel, devicetree,
	thomas.petazzoni, Andrew Lunn, Heiner Kallweit, Russell King,
	linux-arm-kernel, Vladimir Oltean, Luka Perkov, Robert Marko
In-Reply-To: <68c4710d-013e-85e0-154d-413f4e13b27e@gmail.com>

Hello Florian,

On Fri, 22 Apr 2022 11:28:30 -0700
Florian Fainelli <f.fainelli@gmail.com> wrote:

Thanks for the review :)

> On 4/22/22 11:03, Maxime Chevallier wrote:
> > This tagging protocol is designed for the situation where the link
> > between the MAC and the Switch is designed such that the Destination
> > Port, which is usually embedded in some part of the Ethernet
> > Header, is sent out-of-band, and isn't present at all in the
> > Ethernet frame.
> > 
> > This can happen when the MAC and Switch are tightly integrated on an
> > SoC, as is the case with the Qualcomm IPQ4019 for example, where
> > the DSA tag is inserted directly into the DMA descriptors. In that
> > case, the MAC driver is responsible for sending the tag to the
> > switch using the out-of-band medium. To do so, the MAC driver needs
> > to have the information of the destination port for that skb.
> > 
> > This tagging protocol relies on a new set of fields in skb->shinfo
> > to transmit the dsa tagging information to and from the MAC driver.
> > 
> > Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>  
> 
> First off, I am not a big fan of expanding skb::shared_info because
> it is sensitive to cache line sizes and is critical for performance
> at much higher speeds, I would expect Eric and Jakub to not be
> terribly happy about it.

No problem, I'm testing the skb->cb approach as you suggested and will
see how it goes.

> The Broadcom systemport (bcmsysport.c) has a mode where it can
> extract the Broadcom tag and put it in front of the actual packet
> contents which appears to be very similar here. From there on, you
> can have two strategies:
> 
> - have the Ethernet controller mangle the packet contents such that
> the QCA tag is located in front of the actual Ethernet frame and
> create a new tagging protocol variant for QCA, similar to the
> TAG_BRCM versus TAG_BRCM_PREPEND
> 
> - provide the necessary information for the tagger to work using an
> out of band mechanism, which is what you have done, in which case,
> maybe you can use skb->cb[] instead of using skb::shared_info?

One of the reasons why I chose the second is to support possible future
cases where another controller would face a similar situation, and also
make use of the out-of-band tagger.

I understand that it's not very elegant in the sense that this breaks
the nice tagging model we have, but adding/removing data before the
payload also seems convoluted to achieve the same thing :) It seems
that this approach comes with a bit of an overhead since it implies
mangling the skb a bit, but I've yet to test this myself.

That's actually what I wanted your opinion on, it also seems like
Andrew likes the idea of putting the tag ahead of the frame to stick
with the actual model.

I don't have strong feelings myself on the way of doing this, I'm
looking for an approach that is efficient yet easily maintainable.
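
To make the skb->cb[] option concrete, here is a rough userspace sketch
of carrying the port number in a fixed-size per-packet scratch area.
Everything in it is an assumption made for illustration: struct fake_skb
only models the 48-byte cb[] array, and struct oob_tag with its helpers
is a hypothetical layout, not the one from this series.

```c
#include <assert.h>
#include <string.h>

#define FAKE_CB_SIZE 48 /* same size as the cb[] array in struct sk_buff */

/* Hypothetical stand-in for struct sk_buff; only the scratch area. */
struct fake_skb {
	char cb[FAKE_CB_SIZE];
};

/* Hypothetical out-of-band tag; the field name is illustrative only. */
struct oob_tag {
	unsigned int port; /* switch port the frame is destined to or from */
};

/* The tag has to fit into the scratch area, otherwise the build fails. */
_Static_assert(sizeof(struct oob_tag) <= FAKE_CB_SIZE,
	       "tag too large for cb");

/* Tagger side: store the port so the MAC driver can pick it up. */
void oob_tag_set(struct fake_skb *skb, unsigned int port)
{
	struct oob_tag tag = { .port = port };

	memcpy(skb->cb, &tag, sizeof(tag));
}

/* MAC driver side: read the port back, e.g. to fill a DMA descriptor. */
unsigned int oob_tag_get(const struct fake_skb *skb)
{
	struct oob_tag tag;

	memcpy(&tag, skb->cb, sizeof(tag));
	return tag.port;
}
```

One caveat with this approach is that cb[] belongs to whichever layer
currently handles the skb, so the value is only reliable if nothing in
between the tagger and the MAC driver reuses the area.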

Thanks,

Maxime


* [RFC v1 1/3] net: dsa: mt753x: make reset optional
From: Frank Wunderlich @ 2022-04-26 13:49 UTC (permalink / raw)
  To: linux-mediatek, linux-rockchip
  Cc: Frank Wunderlich, Rob Herring, Krzysztof Kozlowski,
	Heiko Stuebner, Sean Wang, Landen Chao, DENG Qingfang,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, Paolo Abeni, Matthias Brugger,
	Peter Geis, devicetree, linux-arm-kernel, linux-kernel, netdev
In-Reply-To: <20220426134924.30372-1-linux@fw-web.de>

From: Frank Wunderlich <frank-w@public-files.de>

Currently a reset line is required, but on the BPI-R2-Pro board
this reset is shared with the gmac and prevents the switch from
being initialized because mdio is not ready fast enough after
the reset.

So make the reset optional to allow shared reset lines.

Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
---
 drivers/net/dsa/mt7530.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index 19f0035d4410..ccf4cb944167 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -2134,7 +2134,7 @@ mt7530_setup(struct dsa_switch *ds)
 		reset_control_assert(priv->rstc);
 		usleep_range(1000, 1100);
 		reset_control_deassert(priv->rstc);
-	} else {
+	} else if (priv->reset) {
 		gpiod_set_value_cansleep(priv->reset, 0);
 		usleep_range(1000, 1100);
 		gpiod_set_value_cansleep(priv->reset, 1);
@@ -2276,7 +2276,7 @@ mt7531_setup(struct dsa_switch *ds)
 		reset_control_assert(priv->rstc);
 		usleep_range(1000, 1100);
 		reset_control_deassert(priv->rstc);
-	} else {
+	} else if (priv->reset) {
 		gpiod_set_value_cansleep(priv->reset, 0);
 		usleep_range(1000, 1100);
 		gpiod_set_value_cansleep(priv->reset, 1);
@@ -3272,8 +3272,7 @@ mt7530_probe(struct mdio_device *mdiodev)
 		priv->reset = devm_gpiod_get_optional(&mdiodev->dev, "reset",
 						      GPIOD_OUT_LOW);
 		if (IS_ERR(priv->reset)) {
-			dev_err(&mdiodev->dev, "Couldn't get our reset line\n");
-			return PTR_ERR(priv->reset);
+			dev_warn(&mdiodev->dev, "Couldn't get our reset line\n");
 		}
 	}
 
-- 
2.25.1



* [RFC v1 3/3] arm64: dts: rockchip: Add mt7531 dsa node to BPI-R2-Pro board
From: Frank Wunderlich @ 2022-04-26 13:49 UTC (permalink / raw)
  To: linux-mediatek, linux-rockchip
  Cc: Frank Wunderlich, Rob Herring, Krzysztof Kozlowski,
	Heiko Stuebner, Sean Wang, Landen Chao, DENG Qingfang,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, Paolo Abeni, Matthias Brugger,
	Peter Geis, devicetree, linux-arm-kernel, linux-kernel, netdev
In-Reply-To: <20220426134924.30372-1-linux@fw-web.de>

From: Frank Wunderlich <frank-w@public-files.de>

Add Device Tree node for mt7531 switch connected to gmac0.

Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
---
 .../boot/dts/rockchip/rk3568-bpi-r2-pro.dts   | 49 +++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3568-bpi-r2-pro.dts b/arch/arm64/boot/dts/rockchip/rk3568-bpi-r2-pro.dts
index e091f0407460..ea5b01a90ee0 100644
--- a/arch/arm64/boot/dts/rockchip/rk3568-bpi-r2-pro.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3568-bpi-r2-pro.dts
@@ -437,6 +437,55 @@ &i2c5 {
 	status = "disabled";
 };
 
+&mdio0 {
+	#address-cells = <1>;
+	#size-cells = <0>;
+
+	switch@0 {
+		compatible = "mediatek,mt7531";
+		reg = <0>;
+		status = "disabled";
+
+		ports {
+			#address-cells = <1>;
+			#size-cells = <0>;
+
+			port@1 {
+				reg = <1>;
+				label = "lan0";
+			};
+
+			port@2 {
+				reg = <2>;
+				label = "lan1";
+			};
+
+			port@3 {
+				reg = <3>;
+				label = "lan2";
+			};
+
+			port@4 {
+				reg = <4>;
+				label = "lan3";
+			};
+
+			port@5 {
+				reg = <5>;
+				label = "cpu";
+				ethernet = <&gmac0>;
+				phy-mode = "rgmii";
+
+				fixed-link {
+					speed = <1000>;
+					full-duplex;
+					pause;
+				};
+			};
+		};
+	};
+};
+
 &mdio1 {
 	rgmii_phy1: ethernet-phy@0 {
 		compatible = "ethernet-phy-ieee802.3-c22";
-- 
2.25.1



* [RFC v1 2/3] net: dsa: mt753x: make CPU-Port dynamic
From: Frank Wunderlich @ 2022-04-26 13:49 UTC (permalink / raw)
  To: linux-mediatek, linux-rockchip
  Cc: Frank Wunderlich, Rob Herring, Krzysztof Kozlowski,
	Heiko Stuebner, Sean Wang, Landen Chao, DENG Qingfang,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, Paolo Abeni, Matthias Brugger,
	Peter Geis, devicetree, linux-arm-kernel, linux-kernel, netdev
In-Reply-To: <20220426134924.30372-1-linux@fw-web.de>

From: Frank Wunderlich <frank-w@public-files.de>

Currently the CPU port is hardcoded to port 6.

On the BPI-R2-Pro board this port is not connected; only port 5 is
connected to the gmac of the SoC.

Replace the hardcoded CPU port with a member in struct mt7530_priv,
which mt753x_cpu_port_enable sets to the right port.

A default is set in probe (in case no CPU port gets set up). If both
CPU ports are set up, port 6 is used, matching the constant used prior
to this patch.

In mt7531_setup the first access happens before we know which port
should be used (mt753x_cpu_port_enable), so the "BPDU to CPU port"
section needs to be moved down.

Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
---
 drivers/net/dsa/mt7530.c | 46 ++++++++++++++++++++++------------------
 drivers/net/dsa/mt7530.h |  2 +-
 2 files changed, 26 insertions(+), 22 deletions(-)

diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index ccf4cb944167..4789105b8137 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -1004,6 +1004,7 @@ mt753x_cpu_port_enable(struct dsa_switch *ds, int port)
 			return ret;
 	}
 
+	priv->cpu_port = port;
 	/* Enable Mediatek header mode on the cpu port */
 	mt7530_write(priv, MT7530_PVC_P(port),
 		     PORT_SPEC_TAG);
@@ -1041,7 +1042,7 @@ mt7530_port_enable(struct dsa_switch *ds, int port,
 	 * restore the port matrix if the port is the member of a certain
 	 * bridge.
 	 */
-	priv->ports[port].pm |= PCR_MATRIX(BIT(MT7530_CPU_PORT));
+	priv->ports[port].pm |= PCR_MATRIX(BIT(priv->cpu_port));
 	priv->ports[port].enable = true;
 	mt7530_rmw(priv, MT7530_PCR_P(port), PCR_MATRIX_MASK,
 		   priv->ports[port].pm);
@@ -1190,8 +1191,8 @@ mt7530_port_bridge_join(struct dsa_switch *ds, int port,
 			struct netlink_ext_ack *extack)
 {
 	struct dsa_port *dp = dsa_to_port(ds, port), *other_dp;
-	u32 port_bitmap = BIT(MT7530_CPU_PORT);
 	struct mt7530_priv *priv = ds->priv;
+	u32 port_bitmap = BIT(priv->cpu_port);
 
 	mutex_lock(&priv->reg_mutex);
 
@@ -1267,9 +1268,9 @@ mt7530_port_set_vlan_unaware(struct dsa_switch *ds, int port)
 	 * the CPU port get out of VLAN filtering mode.
 	 */
 	if (all_user_ports_removed) {
-		mt7530_write(priv, MT7530_PCR_P(MT7530_CPU_PORT),
+		mt7530_write(priv, MT7530_PCR_P(priv->cpu_port),
 			     PCR_MATRIX(dsa_user_ports(priv->ds)));
-		mt7530_write(priv, MT7530_PVC_P(MT7530_CPU_PORT), PORT_SPEC_TAG
+		mt7530_write(priv, MT7530_PVC_P(priv->cpu_port), PORT_SPEC_TAG
 			     | PVC_EG_TAG(MT7530_VLAN_EG_CONSISTENT));
 	}
 }
@@ -1335,8 +1336,8 @@ mt7530_port_bridge_leave(struct dsa_switch *ds, int port,
 	 */
 	if (priv->ports[port].enable)
 		mt7530_rmw(priv, MT7530_PCR_P(port), PCR_MATRIX_MASK,
-			   PCR_MATRIX(BIT(MT7530_CPU_PORT)));
-	priv->ports[port].pm = PCR_MATRIX(BIT(MT7530_CPU_PORT));
+			   PCR_MATRIX(BIT(priv->cpu_port)));
+	priv->ports[port].pm = PCR_MATRIX(BIT(priv->cpu_port));
 
 	/* When a port is removed from the bridge, the port would be set up
 	 * back to the default as is at initial boot which is a VLAN-unaware
@@ -1503,6 +1504,7 @@ static int
 mt7530_port_vlan_filtering(struct dsa_switch *ds, int port, bool vlan_filtering,
 			   struct netlink_ext_ack *extack)
 {
+	struct mt7530_priv *priv = ds->priv;
 	if (vlan_filtering) {
 		/* The port is being kept as VLAN-unaware port when bridge is
 		 * set up with vlan_filtering not being set, Otherwise, the
@@ -1510,7 +1512,7 @@ mt7530_port_vlan_filtering(struct dsa_switch *ds, int port, bool vlan_filtering,
 		 * for becoming a VLAN-aware port.
 		 */
 		mt7530_port_set_vlan_aware(ds, port);
-		mt7530_port_set_vlan_aware(ds, MT7530_CPU_PORT);
+		mt7530_port_set_vlan_aware(ds, priv->cpu_port);
 	} else {
 		mt7530_port_set_vlan_unaware(ds, port);
 	}
@@ -1526,7 +1528,7 @@ mt7530_hw_vlan_add(struct mt7530_priv *priv,
 	u32 val;
 
 	new_members = entry->old_members | BIT(entry->port) |
-		      BIT(MT7530_CPU_PORT);
+		      BIT(priv->cpu_port);
 
 	/* Validate the entry with independent learning, create egress tag per
 	 * VLAN and joining the port as one of the port members.
@@ -1550,8 +1552,8 @@ mt7530_hw_vlan_add(struct mt7530_priv *priv,
 	 * DSA tag.
 	 */
 	mt7530_rmw(priv, MT7530_VAWD2,
-		   ETAG_CTRL_P_MASK(MT7530_CPU_PORT),
-		   ETAG_CTRL_P(MT7530_CPU_PORT,
+		   ETAG_CTRL_P_MASK(priv->cpu_port),
+		   ETAG_CTRL_P(priv->cpu_port,
 			       MT7530_VLAN_EGRESS_STACK));
 }
 
@@ -1575,7 +1577,7 @@ mt7530_hw_vlan_del(struct mt7530_priv *priv,
 	 * the entry would be kept valid. Otherwise, the entry is got to be
 	 * disabled.
 	 */
-	if (new_members && new_members != BIT(MT7530_CPU_PORT)) {
+	if (new_members && new_members != BIT(priv->cpu_port)) {
 		val = IVL_MAC | VTAG_EN | PORT_MEM(new_members) |
 		      VLAN_VALID;
 		mt7530_write(priv, MT7530_VAWD1, val);
@@ -2105,7 +2107,7 @@ mt7530_setup(struct dsa_switch *ds)
 	 * controller also is the container for two GMACs nodes representing
 	 * as two netdev instances.
 	 */
-	dn = dsa_to_port(ds, MT7530_CPU_PORT)->master->dev.of_node->parent;
+	dn = dsa_to_port(ds, priv->cpu_port)->master->dev.of_node->parent;
 	ds->assisted_learning_on_cpu_port = true;
 	ds->mtu_enforcement_ingress = true;
 
@@ -2337,15 +2339,6 @@ mt7531_setup(struct dsa_switch *ds)
 	mt7531_ind_c45_phy_write(priv, MT753X_CTRL_PHY_ADDR, MDIO_MMD_VEND2,
 				 CORE_PLL_GROUP4, val);
 
-	/* BPDU to CPU port */
-	mt7530_rmw(priv, MT7531_CFC, MT7531_CPU_PMAP_MASK,
-		   BIT(MT7530_CPU_PORT));
-	mt7530_rmw(priv, MT753X_BPC, MT753X_BPDU_PORT_FW_MASK,
-		   MT753X_BPDU_CPU_ONLY);
-
-	/* Enable and reset MIB counters */
-	mt7530_mib_reset(ds);
-
 	for (i = 0; i < MT7530_NUM_PORTS; i++) {
 		/* Disable forwarding by default on all ports */
 		mt7530_rmw(priv, MT7530_PCR_P(i), PCR_MATRIX_MASK,
@@ -2373,6 +2366,15 @@ mt7531_setup(struct dsa_switch *ds)
 			   PVC_EG_TAG(MT7530_VLAN_EG_CONSISTENT));
 	}
 
+	/* BPDU to CPU port */
+	mt7530_rmw(priv, MT7531_CFC, MT7531_CPU_PMAP_MASK,
+		   BIT(priv->cpu_port));
+	mt7530_rmw(priv, MT753X_BPC, MT753X_BPDU_PORT_FW_MASK,
+		   MT753X_BPDU_CPU_ONLY);
+
+	/* Enable and reset MIB counters */
+	mt7530_mib_reset(ds);
+
 	/* Setup VLAN ID 0 for VLAN-unaware bridges */
 	ret = mt7530_setup_vlan0(priv);
 	if (ret)
@@ -3213,6 +3215,8 @@ mt7530_probe(struct mdio_device *mdiodev)
 	if (!priv)
 		return -ENOMEM;
 
+	priv->cpu_port = 6;
+
 	priv->ds = devm_kzalloc(&mdiodev->dev, sizeof(*priv->ds), GFP_KERNEL);
 	if (!priv->ds)
 		return -ENOMEM;
diff --git a/drivers/net/dsa/mt7530.h b/drivers/net/dsa/mt7530.h
index 91508e2feef9..62df8d10f6d4 100644
--- a/drivers/net/dsa/mt7530.h
+++ b/drivers/net/dsa/mt7530.h
@@ -8,7 +8,6 @@
 
 #define MT7530_NUM_PORTS		7
 #define MT7530_NUM_PHYS			5
-#define MT7530_CPU_PORT			6
 #define MT7530_NUM_FDB_RECORDS		2048
 #define MT7530_ALL_MEMBERS		0xff
 
@@ -823,6 +822,7 @@ struct mt7530_priv {
 	u8			mirror_tx;
 
 	struct mt7530_port	ports[MT7530_NUM_PORTS];
+	int			cpu_port;
 	/* protect among processes for registers access*/
 	struct mutex reg_mutex;
 	int irq;
-- 
2.25.1
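
The selection behaviour described in the commit message (default of 6,
and port 6 winning when both ports 5 and 6 are CPU ports) can be
sketched in userspace as follows. This is a model under assumptions, not
driver code: struct fake_priv and the fake_*() and resolve_*() helpers
are hypothetical, and the sketch assumes ports are enabled in ascending
order so that the last enabled CPU port is the one that sticks.

```c
#include <stdint.h>

#define BIT(n) (1U << (n))
#define NUM_PORTS 7        /* mirrors MT7530_NUM_PORTS */
#define DEFAULT_CPU_PORT 6 /* default assigned in probe by the patch */

/* Hypothetical, reduced stand-in for struct mt7530_priv. */
struct fake_priv {
	int cpu_port;
};

/* Models the assignment added to mt753x_cpu_port_enable(). */
static void fake_cpu_port_enable(struct fake_priv *priv, int port)
{
	priv->cpu_port = port;
}

/*
 * Walk the ports in ascending order and enable every port marked as a
 * CPU port in cpu_mask. Returns the port left in priv->cpu_port.
 */
int resolve_cpu_port(uint32_t cpu_mask)
{
	struct fake_priv priv = { .cpu_port = DEFAULT_CPU_PORT };
	int i;

	for (i = 0; i < NUM_PORTS; i++)
		if (cpu_mask & BIT(i))
			fake_cpu_port_enable(&priv, i);

	return priv.cpu_port;
}

/* Bitmap that user ports would use, like BIT(priv->cpu_port). */
uint32_t cpu_port_bitmap(uint32_t cpu_mask)
{
	return BIT(resolve_cpu_port(cpu_mask));
}
```

Under these assumptions a board with only port 5 wired up resolves to
port 5, while no CPU port at all, or both 5 and 6, resolves to port 6.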



* Re: [PATCH RFC 4/5] net/tls: Add support for PF_TLSH (a TLS handshake listener)
From: Chuck Lever III @ 2022-04-26 13:48 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Linux NFS Mailing List, linux-nvme@lists.infradead.org,
	linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	ak@tempesta-tech.com, borisp@nvidia.com, simo@redhat.com
In-Reply-To: <20220425101459.15484d17@kernel.org>

Hi Jakub-

> On Apr 25, 2022, at 1:14 PM, Jakub Kicinski <kuba@kernel.org> wrote:
> 
> On Mon, 18 Apr 2022 12:49:50 -0400 Chuck Lever wrote:
>> In-kernel TLS consumers need a way to perform a TLS handshake. In
>> the absence of a handshake implementation in the kernel itself, a
>> mechanism to perform the handshake in user space, using an existing
>> TLS handshake library, is necessary.
>> 
>> I've designed a way to pass a connected kernel socket endpoint to
>> user space using the traditional listen/accept mechanism. accept(2)
>> gives us a well-understood way to materialize a socket endpoint as a
>> normal file descriptor in a specific user space process. Like any
>> open socket descriptor, the accepted FD can then be passed to a
>> library such as openSSL to perform a TLS handshake.
>> 
>> This prototype currently handles only initiating client-side TLS
>> handshakes. Server-side handshakes and key renegotiation are left
>> to do.
>> 
>> Security Considerations
>> ~~~~~~~~ ~~~~~~~~~~~~~~
>> 
>> This prototype is net-namespace aware.
>> 
>> The kernel has no mechanism to attest that the listening user space
>> agent is trustworthy.
>> 
>> Currently the prototype does not handle multiple listeners that
>> overlap -- multiple listeners in the same net namespace that have
>> overlapping bind addresses.
> 
> Create the socket in user space, do all the handshakes you need there
> and then pass it to the kernel.  This is how NBD + TLS works.  Scales
> better and requires much less kernel code.

The RPC-with-TLS standard allows unencrypted RPC traffic on the connection
before sending ClientHello. I think we'd like to stick with creating the
socket in the kernel, for this reason and for the reasons Hannes mentions
in his reply.

--
Chuck Lever





* Re: [PATCH net-next 00/11] mlxsw: extend line card model by devices and info
From: Andrew Lunn @ 2022-04-26 13:45 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Ido Schimmel, Jakub Kicinski, Ido Schimmel, netdev, davem, pabeni,
	jiri, petrm, dsahern, mlxsw
In-Reply-To: <YmfoXsw+o9LE9dF3@nanopsycho>

> Well, I got your point. If the HW were designed so that the building
> blocks are exposed to the host, that would work. However, that
> is not the case here, unfortunately.

I'm with Jakub. It is the uAPI which matters here. It should look the
same for a SoC style enterprise router and your discombobulated TOR
router. How you talk to the different building blocks is an
implementation detail.

	Andrew


* [PATCH -next 0/2] Support riscv jit to provide bpf_line_info
From: Pu Lehui @ 2022-04-26 14:09 UTC (permalink / raw)
  To: bpf, linux-riscv, netdev, linux-kernel
  Cc: bjorn, luke.r.nels, xi.wang, ast, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, paul.walmsley,
	palmer, aou, pulehui

Patch 1 fixes an issue where bpf line info could not be printed due
to data inconsistency in 32-bit environments.

Patch 2 adds support for the riscv jit to provide bpf_line_info.
Both RV32 and RV64 tests pass, as follows:

./test_progs -a btf
#19 btf:OK
Summary: 1/215 PASSED, 0 SKIPPED, 0 FAILED

Pu Lehui (2):
  bpf: Unify data extension operation of jited_ksyms and jited_linfo
  riscv, bpf: Support riscv jit to provide bpf_line_info

 arch/riscv/net/bpf_jit.h                     |  1 +
 arch/riscv/net/bpf_jit_core.c                |  7 ++++++-
 kernel/bpf/syscall.c                         |  5 ++++-
 tools/lib/bpf/bpf_prog_linfo.c               |  8 ++++----
 tools/testing/selftests/bpf/prog_tests/btf.c | 18 +++++++++---------
 5 files changed, 24 insertions(+), 15 deletions(-)

-- 
2.25.1



