* Re: [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering [not found] <20241212215132.3111392-1-tharvey@gateworks.com> @ 2024-12-13 0:00 ` Vladimir Oltean 2024-12-13 0:32 ` Tim Harvey 0 siblings, 1 reply; 15+ messages in thread From: Vladimir Oltean @ 2024-12-13 0:00 UTC (permalink / raw) To: Tim Harvey Cc: Woojung Huh, UNGLinuxDriver, Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Arun Ramadoss, netdev, linux-kernel On Thu, Dec 12, 2024 at 01:51:32PM -0800, Tim Harvey wrote: > commit 331d64f752bb ("net: dsa: microchip: add the enable_stp_addr > pointer in ksz_dev_ops") introduced enabling of the reserved multicast > address table function to filter packets based on multicast MAC address > but only configured one MAC address group, group 0 for > (01-80-C2-00)-00-00 for bridge group data. > > This causes other multicast groups to fail to be received such as LLDP > which uses a MAC address of 01-80-c2-00-00-0e (group 6). > > Enabling the reserved multicast address table requires configuring the > port forward mask for all eight address groups as the mask depends on > the port configuration. Personal experience reading your commit message: it took me a long while to realize that the reason why the 8 pre-configured Reserved Multicast table entries don't work is written here: "the mask depends on the port configuration." It is absolutely understated IMO. > The table determines the forwarding ports for > 48 specific multicast addresses and is addressed by the least > significant 6 bits of the multicast address. Changing a forwarding > port mask for one address also makes the same change for all other > addresses in the same group. > > Add configuration of the groups as such: > - leave these as default: > group 1 (01-80-C2-00)-00-01 (MAC Control Frame) (drop) > group 3 (01-80-C2-00)-00-10) (Bridge Management) (all ports) > - forward to cpu port: > group 0 (01-80-C2-00)-00-00 (Bridge Group Data) > group 2 (01-80-C2-00)-00-03 (802.1X access control) > group 6 (01-80-C2-00)-00-02, (01-80-C2-00)-00-04 – (01-80-C2-00)-00-0F > - forward to all but cpu port: Why would you not forward packets to the CPU port as a hardcoded configuration? What if the KSZ ports are bridged together with a foreign interface (different NIC, WLAN, tunnel etc), how should the packets reach that? > group 4 (01-80-C2-00)-00-20 (GMRP) > group 5 (01-80-C2-00)-00-21 (GVRP) > group 7 (01-80-C2-00)-00-11 - (01-80-C2-00)-00-1F, > (01-80-C2-00)-00-22 - (01-80-C2-00)-00-2F Don't you want to forgo the (odd) hardware defaults for the Reserved Multicast table, and instead follow what the Linux bridge does in br_handle_frame()? Which is to trap all is_link_local_ether_addr() addresses to the CPU, do _not_ call dsa_default_offload_fwd_mark() for those packets (aka let the bridge know that they haven't been forwarded in hardware, and if they should reach other bridge ports, this must be done in software), and let the user choose, via the bridge group_fwd_mask, if they should be forwarded to other bridge ports or not? > > Datasheets: > [1] https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9897S-Data-Sheet-DS00002394C.pdf > [2] https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9896C-Data-Sheet-DS00002390C.pdf > [3] https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9893R-Data-Sheet-DS00002420D.pdf > [4] https://ww1.microchip.com/downloads/en/DeviceDoc/00002330B.pdf > [5] https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9563R-Data-Sheet-DS00002419D.pdf > [6] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/KSZ9567R-Data-Sheet-DS00002329.pdf > [7] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/KSZ9567R-Data-Sheet-DS00002329.pdf [6] and [7] are the same. Also, you'd better specify in the commit message what's with these datasheet links, which to me and I suppose all other non-expert readers, are pasted here out of the blue, with no context. Like for example: "KSZ9897, ..., have arbitrary CPU port assignments, as can be seen in the driver's ksz_chip_data :: cpu_ports entries for these families, and the CPU port selection on a certain board rarely coincides with the default host port selection in the Reserved Multicast address table". > > Fixes: 331d64f752bb ("net: dsa: microchip: add the enable_stp_addr pointer in ksz_dev_ops") > Signed-off-by: Tim Harvey <tharvey@gateworks.com> > --- > drivers/net/dsa/microchip/ksz9477.c | 84 +++++++++++++++++++++++++---- > 1 file changed, 75 insertions(+), 9 deletions(-) > > diff --git a/drivers/net/dsa/microchip/ksz9477.c b/drivers/net/dsa/microchip/ksz9477.c > index d16817e0476f..d8fe809dd461 100644 > --- a/drivers/net/dsa/microchip/ksz9477.c > +++ b/drivers/net/dsa/microchip/ksz9477.c > @@ -1138,25 +1138,24 @@ void ksz9477_config_cpu_port(struct dsa_switch *ds) > } > } > > -int ksz9477_enable_stp_addr(struct ksz_device *dev) > +static int ksz9477_reserved_muticast_group(struct ksz_device *dev, int index, int mask) > { > + const u8 *shifts; > const u32 *masks; > u32 data; > int ret; > > + shifts = dev->info->shifts; > masks = dev->info->masks; > > - /* Enable Reserved multicast table */ > - ksz_cfg(dev, REG_SW_LUE_CTRL_0, SW_RESV_MCAST_ENABLE, true); > - > - /* Set the Override bit for forwarding BPDU packet to CPU */ > - ret = ksz_write32(dev, REG_SW_ALU_VAL_B, > - ALU_V_OVERRIDE | BIT(dev->cpu_port)); > + /* write the PORT_FORWARD value to the Reserved Multicast Address Table Entry 2 Register */ In netdev the coding style limits the line length to 80 characters where that is easy, like here. > + ret = ksz_write32(dev, REG_SW_ALU_VAL_B, mask); > if (ret < 0) > return ret; > > - data = ALU_STAT_START | ALU_RESV_MCAST_ADDR | masks[ALU_STAT_WRITE]; > - > + /* write to the Static Address and Reserved Multicast Table Control Register */ > + data = (index << shifts[ALU_STAT_INDEX]) | > + ALU_STAT_START | ALU_RESV_MCAST_ADDR | masks[ALU_STAT_WRITE]; > ret = ksz_write32(dev, REG_SW_ALU_STAT_CTRL__4, data); > if (ret < 0) > return ret; > @@ -1167,8 +1166,75 @@ int ksz9477_enable_stp_addr(struct ksz_device *dev) > dev_err(dev->dev, "Failed to update Reserved Multicast table\n"); > return ret; > } > + return ksz9477_wait_alu_sta_ready(dev); > +} > + > +int ksz9477_enable_stp_addr(struct ksz_device *dev) > +{ > + int ret; > + int cpu_mask = dsa_cpu_ports(dev->ds); > + int user_mask = dsa_user_ports(dev->ds); Also, in netdev, the coding style is to sort lines with variable declarations in the reverse order of their length (so-called reverse Christmas tree). > + /* array of indexes into table: > + * The table is indexed by the low 6 bits of the MAC address. > + * Changing the PORT_FORWARD value for any single address affects > + * all others in group > + */ > + u16 addr_groups[8] = { Array can be static const. Also, since all elements are initialized, specifying its size explicitly is not necessary ("[8]" can be "[]"). > + /* group 0: (01-80-C2-00)-00-00 (Bridge Group Data) */ > + 0x000, > + /* group 1: (01-80-C2-00)-00-01 (MAC Control Frame) */ > + 0x001, > + /* group 2: (01-80-C2-00)-00-03 (802.1X access control) */ > + 0x003, > + /* group 3: (01-80-C2-00)-00-10) (Bridge Management) */ > + 0x010, > + /* group 4: (01-80-C2-00)-00-20 (GMRP) */ > + 0x020, > + /* group 5: (01-80-C2-00)-00-21 (GVRP) */ > + 0x021, > + /* group 6: (01-80-C2-00)-00-02, (01-80-C2-00)-00-04 – (01-80-C2-00)-00-0F */ > + 0x002, > + /* group 7: (01-80-C2-00)-00-11 - (01-80-C2-00)-00-1F, > + * (01-80-C2-00)-00-22 - (01-80-C2-00)-00-2F > + */ > + 0x011, > + }; > + > + /* Enable Reserved multicast table */ > + ksz_cfg(dev, REG_SW_LUE_CTRL_0, SW_RESV_MCAST_ENABLE, true); > + > + /* update reserved multicast address table: > + * leave as default: > + * - group 1 (01-80-C2-00)-00-01 (MAC Control Frame) (drop) > + * - group 3 (01-80-C2-00)-00-10) (Bridge Management) (all ports) > + * forward to cpu port: > + * - group 0 (01-80-C2-00)-00-00 (Bridge Group Data) > + * - group 2 (01-80-C2-00)-00-03 (802.1X access control) > + * - group 6 (01-80-C2-00)-00-02, (01-80-C2-00)-00-04 – (01-80-C2-00)-00-0F > + * forward to all but cpu port: > + * - group 4 (01-80-C2-00)-00-20 (GMRP) > + * - group 5 (01-80-C2-00)-00-21 (GVRP) > + * - group 7 (01-80-C2-00)-00-11 - (01-80-C2-00)-00-1F, > + * (01-80-C2-00)-00-22 - (01-80-C2-00)-00-2F > + */ > + if (ksz9477_reserved_muticast_group(dev, addr_groups[0], cpu_mask)) > + goto exit; err = (function return code), and print it with %pe, ERR_PTR(err) please. We want to distinguish between -ETIMEDOUT in ksz9477_wait_alu_sta_ready() vs whatever ksz_write32() may return. > + if (ksz9477_reserved_muticast_group(dev, addr_groups[2], cpu_mask)) > + goto exit; > + if (ksz9477_reserved_muticast_group(dev, addr_groups[6], cpu_mask)) > + goto exit; > + if (ksz9477_reserved_muticast_group(dev, addr_groups[4], user_mask)) > + goto exit; > + if (ksz9477_reserved_muticast_group(dev, addr_groups[5], user_mask)) > + goto exit; > + if (ksz9477_reserved_muticast_group(dev, addr_groups[7], user_mask)) > + goto exit; > > return 0; > + > +exit: > + dev_err(dev->dev, "Failed to update Reserved Multicast table\n"); > + return ret; > } > > int ksz9477_setup(struct dsa_switch *ds) > -- > 2.34.1 > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering 2024-12-13 0:00 ` [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering Vladimir Oltean @ 2024-12-13 0:32 ` Tim Harvey 2024-12-13 1:14 ` Vladimir Oltean 0 siblings, 1 reply; 15+ messages in thread From: Tim Harvey @ 2024-12-13 0:32 UTC (permalink / raw) To: Vladimir Oltean, Arun Ramadoss Cc: Woojung Huh, UNGLinuxDriver, Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel On Thu, Dec 12, 2024 at 4:00 PM Vladimir Oltean <olteanv@gmail.com> wrote: > > On Thu, Dec 12, 2024 at 01:51:32PM -0800, Tim Harvey wrote: > > commit 331d64f752bb ("net: dsa: microchip: add the enable_stp_addr > > pointer in ksz_dev_ops") introduced enabling of the reserved multicast > > address table function to filter packets based on multicast MAC address > > but only configured one MAC address group, group 0 for > > (01-80-C2-00)-00-00 for bridge group data. > > > > This causes other multicast groups to fail to be received such as LLDP > > which uses a MAC address of 01-80-c2-00-00-0e (group 6). > > > > Enabling the reserved multicast address table requires configuring the > > port forward mask for all eight address groups as the mask depends on > > the port configuration. > Hi Vladimir, > Personal experience reading your commit message: it took me a long while > to realize that the reason why the 8 pre-configured Reserved Multicast > table entries don't work is written here: "the mask depends on the port > configuration." It is absolutely understated IMO. > Yes, if you are going to enable the reserved multicast address table it should be configured fully as the default configuration makes an assumption on what user/host ports are valid. > > The table determines the forwarding ports for > > 48 specific multicast addresses and is addressed by the least > > significant 6 bits of the multicast address. Changing a forwarding > > port mask for one address also makes the same change for all other > > addresses in the same group. > > > > Add configuration of the groups as such: > > - leave these as default: > > group 1 (01-80-C2-00)-00-01 (MAC Control Frame) (drop) > > group 3 (01-80-C2-00)-00-10) (Bridge Management) (all ports) > > - forward to cpu port: > > group 0 (01-80-C2-00)-00-00 (Bridge Group Data) > > group 2 (01-80-C2-00)-00-03 (802.1X access control) > > group 6 (01-80-C2-00)-00-02, (01-80-C2-00)-00-04 – (01-80-C2-00)-00-0F > > - forward to all but cpu port: > > Why would you not forward packets to the CPU port as a hardcoded configuration? > What if the KSZ ports are bridged together with a foreign interface > (different NIC, WLAN, tunnel etc), how should the packets reach that? > If that is the correct thing to do I can certainly do that. I was assuming that the default policy above must be somewhat standard. This patch leaves the policy that was created by the default table configuration and just updates the port configuration based on the dt definition of the user vs host ports. > > group 4 (01-80-C2-00)-00-20 (GMRP) > > group 5 (01-80-C2-00)-00-21 (GVRP) > > group 7 (01-80-C2-00)-00-11 - (01-80-C2-00)-00-1F, > > (01-80-C2-00)-00-22 - (01-80-C2-00)-00-2F > > Don't you want to forgo the (odd) hardware defaults for the Reserved Multicast > table, and instead follow what the Linux bridge does in br_handle_frame()? > Which is to trap all is_link_local_ether_addr() addresses to the CPU, do > _not_ call dsa_default_offload_fwd_mark() for those packets (aka let the > bridge know that they haven't been forwarded in hardware, and if they > should reach other bridge ports, this must be done in software), and let the > user choose, via the bridge group_fwd_mask, if they should be forwarded > to other bridge ports or not? Again, I really don't know what the 'right' thing to do is for multicast packets but the enabling of the reserved multicast table done in commit 331d64f752bb ("net: dsa: microchip: add the enable_stp_addr pointer in ksz_dev_ops") breaks forwarding of all multicast packets that are not sent to 01-80-C2-00-00-00 > > > > Datasheets: > > [1] https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9897S-Data-Sheet-DS00002394C.pdf > > [2] https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9896C-Data-Sheet-DS00002390C.pdf > > [3] https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9893R-Data-Sheet-DS00002420D.pdf > > [4] https://ww1.microchip.com/downloads/en/DeviceDoc/00002330B.pdf > > [5] https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9563R-Data-Sheet-DS00002419D.pdf > > [6] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/KSZ9567R-Data-Sheet-DS00002329.pdf > > [7] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/KSZ9567R-Data-Sheet-DS00002329.pdf > > [6] and [7] are the same. > > Also, you'd better specify in the commit message what's with these datasheet > links, which to me and I suppose all other non-expert readers, are pasted here > out of the blue, with no context. > > Like for example: "KSZ9897, ..., have arbitrary CPU port assignments, as > can be seen in the driver's ksz_chip_data :: cpu_ports entries for these > families, and the CPU port selection on a certain board rarely coincides > with the default host port selection in the Reserved Multicast address > table". I was just trying to be thorough. I took the time to look up the datasheets for all the switches that the ksz9447 driver supports to ensure they all had the same default configuration policy and same configuration method/registers so I thought I would include them in the message. I can drop the datasheet links if they add no value. I was also expecting perhaps the commit message was confusing so I wanted to show where the information came from. What you're suggesting above regarding trapping all is_link_local_ether_addr() addresses to the CPU and not calling dsa_default_offload_fwd_mark() is beyond my understanding. If the behavior of the reserved multicast address table is non-standard then it should be disabled and the content of ksz9477_enable_stp_addr() removed. However based on Arun's commit message it seems that prior to that patch STP BPDU packets were not being forwarded to the CPU so it's unclear to me what the default behavior was for multicast without the reserved muticast address table being enabled. I know that if the table is disabled by removing the call to ksz9477_enable_stp_addr then LLDP packets are forwarded to the cpu port like they were before that patch. All of your coding style comments below make complete sense to me so as soon as we figure out what the proper fix is for commit 331d64f752bb ("net: dsa: microchip: add the enable_stp_addr pointer in ksz_dev_ops") breaking multicast I can resubmit with those resolved. Best Regards, Tim > > > > > Fixes: 331d64f752bb ("net: dsa: microchip: add the enable_stp_addr pointer in ksz_dev_ops") > > Signed-off-by: Tim Harvey <tharvey@gateworks.com> > > --- > > drivers/net/dsa/microchip/ksz9477.c | 84 +++++++++++++++++++++++++---- > > 1 file changed, 75 insertions(+), 9 deletions(-) > > > > diff --git a/drivers/net/dsa/microchip/ksz9477.c b/drivers/net/dsa/microchip/ksz9477.c > > index d16817e0476f..d8fe809dd461 100644 > > --- a/drivers/net/dsa/microchip/ksz9477.c > > +++ b/drivers/net/dsa/microchip/ksz9477.c > > @@ -1138,25 +1138,24 @@ void ksz9477_config_cpu_port(struct dsa_switch *ds) > > } > > } > > > > -int ksz9477_enable_stp_addr(struct ksz_device *dev) > > +static int ksz9477_reserved_muticast_group(struct ksz_device *dev, int index, int mask) > > { > > + const u8 *shifts; > > const u32 *masks; > > u32 data; > > int ret; > > > > + shifts = dev->info->shifts; > > masks = dev->info->masks; > > > > - /* Enable Reserved multicast table */ > > - ksz_cfg(dev, REG_SW_LUE_CTRL_0, SW_RESV_MCAST_ENABLE, true); > > - > > - /* Set the Override bit for forwarding BPDU packet to CPU */ > > - ret = ksz_write32(dev, REG_SW_ALU_VAL_B, > > - ALU_V_OVERRIDE | BIT(dev->cpu_port)); > > + /* write the PORT_FORWARD value to the Reserved Multicast Address Table Entry 2 Register */ > > In netdev the coding style limits the line length to 80 characters where > that is easy, like here. > > > + ret = ksz_write32(dev, REG_SW_ALU_VAL_B, mask); > > if (ret < 0) > > return ret; > > > > - data = ALU_STAT_START | ALU_RESV_MCAST_ADDR | masks[ALU_STAT_WRITE]; > > - > > + /* write to the Static Address and Reserved Multicast Table Control Register */ > > + data = (index << shifts[ALU_STAT_INDEX]) | > > + ALU_STAT_START | ALU_RESV_MCAST_ADDR | masks[ALU_STAT_WRITE]; > > ret = ksz_write32(dev, REG_SW_ALU_STAT_CTRL__4, data); > > if (ret < 0) > > return ret; > > @@ -1167,8 +1166,75 @@ int ksz9477_enable_stp_addr(struct ksz_device *dev) > > dev_err(dev->dev, "Failed to update Reserved Multicast table\n"); > > return ret; > > } > > + return ksz9477_wait_alu_sta_ready(dev); > > +} > > + > > +int ksz9477_enable_stp_addr(struct ksz_device *dev) > > +{ > > + int ret; > > + int cpu_mask = dsa_cpu_ports(dev->ds); > > + int user_mask = dsa_user_ports(dev->ds); > > Also, in netdev, the coding style is to sort lines with variable > declarations in the reverse order of their length (so-called reverse > Christmas tree). > > > + /* array of indexes into table: > > + * The table is indexed by the low 6 bits of the MAC address. > > + * Changing the PORT_FORWARD value for any single address affects > > + * all others in group > > + */ > > + u16 addr_groups[8] = { > > Array can be static const. Also, since all elements are initialized, > specifying its size explicitly is not necessary ("[8]" can be "[]"). > > > + /* group 0: (01-80-C2-00)-00-00 (Bridge Group Data) */ > > + 0x000, > > + /* group 1: (01-80-C2-00)-00-01 (MAC Control Frame) */ > > + 0x001, > > + /* group 2: (01-80-C2-00)-00-03 (802.1X access control) */ > > + 0x003, > > + /* group 3: (01-80-C2-00)-00-10) (Bridge Management) */ > > + 0x010, > > + /* group 4: (01-80-C2-00)-00-20 (GMRP) */ > > + 0x020, > > + /* group 5: (01-80-C2-00)-00-21 (GVRP) */ > > + 0x021, > > + /* group 6: (01-80-C2-00)-00-02, (01-80-C2-00)-00-04 – (01-80-C2-00)-00-0F */ > > + 0x002, > > + /* group 7: (01-80-C2-00)-00-11 - (01-80-C2-00)-00-1F, > > + * (01-80-C2-00)-00-22 - (01-80-C2-00)-00-2F > > + */ > > + 0x011, > > + }; > > + > > + /* Enable Reserved multicast table */ > > + ksz_cfg(dev, REG_SW_LUE_CTRL_0, SW_RESV_MCAST_ENABLE, true); > > + > > + /* update reserved multicast address table: > > + * leave as default: > > + * - group 1 (01-80-C2-00)-00-01 (MAC Control Frame) (drop) > > + * - group 3 (01-80-C2-00)-00-10) (Bridge Management) (all ports) > > + * forward to cpu port: > > + * - group 0 (01-80-C2-00)-00-00 (Bridge Group Data) > > + * - group 2 (01-80-C2-00)-00-03 (802.1X access control) > > + * - group 6 (01-80-C2-00)-00-02, (01-80-C2-00)-00-04 – (01-80-C2-00)-00-0F > > + * forward to all but cpu port: > > + * - group 4 (01-80-C2-00)-00-20 (GMRP) > > + * - group 5 (01-80-C2-00)-00-21 (GVRP) > > + * - group 7 (01-80-C2-00)-00-11 - (01-80-C2-00)-00-1F, > > + * (01-80-C2-00)-00-22 - (01-80-C2-00)-00-2F > > + */ > > + if (ksz9477_reserved_muticast_group(dev, addr_groups[0], cpu_mask)) > > + goto exit; > > err = (function return code), and print it with %pe, ERR_PTR(err) please. > We want to distinguish between -ETIMEDOUT in ksz9477_wait_alu_sta_ready() > vs whatever ksz_write32() may return. > > > + if (ksz9477_reserved_muticast_group(dev, addr_groups[2], cpu_mask)) > > + goto exit; > > + if (ksz9477_reserved_muticast_group(dev, addr_groups[6], cpu_mask)) > > + goto exit; > > + if (ksz9477_reserved_muticast_group(dev, addr_groups[4], user_mask)) > > + goto exit; > > + if (ksz9477_reserved_muticast_group(dev, addr_groups[5], user_mask)) > > + goto exit; > > + if (ksz9477_reserved_muticast_group(dev, addr_groups[7], user_mask)) > > + goto exit; > > > > return 0; > > + > > +exit: > > + dev_err(dev->dev, "Failed to update Reserved Multicast table\n"); > > + return ret; > > } > > > > int ksz9477_setup(struct dsa_switch *ds) > > -- > > 2.34.1 > > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering 2024-12-13 0:32 ` Tim Harvey @ 2024-12-13 1:14 ` Vladimir Oltean 2024-12-27 19:06 ` Tim Harvey 0 siblings, 1 reply; 15+ messages in thread From: Vladimir Oltean @ 2024-12-13 1:14 UTC (permalink / raw) To: Tim Harvey Cc: Arun Ramadoss, Woojung Huh, UNGLinuxDriver, Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel On Thu, Dec 12, 2024 at 04:32:28PM -0800, Tim Harvey wrote: > > > - forward to all but cpu port: > > > > Why would you not forward packets to the CPU port as a hardcoded configuration? > > What if the KSZ ports are bridged together with a foreign interface > > (different NIC, WLAN, tunnel etc), how should the packets reach that? > > If that is the correct thing to do I can certainly do that. I was > assuming that the default policy above must be somewhat standard. This > patch leaves the policy that was created by the default table > configuration and just updates the port configuration based on the dt > definition of the user vs host ports. I think you misunderstood my comment which you've quoted here, it was: "why would you hardcode a configuration which can't be changed and which doesn't include _at least_ the CPU port in the list of destination ports?! isn't that needed for so many reasons?" This particular paragraph did not contain any suggestion of the form "the correct thing to do is ...", just an observation that the choice you've made is most likely wrong. > > > group 4 (01-80-C2-00)-00-20 (GMRP) > > > group 5 (01-80-C2-00)-00-21 (GVRP) > > > group 7 (01-80-C2-00)-00-11 - (01-80-C2-00)-00-1F, > > > (01-80-C2-00)-00-22 - (01-80-C2-00)-00-2F > > > > Don't you want to forgo the (odd) hardware defaults for the Reserved Multicast > > table, and instead follow what the Linux bridge does in br_handle_frame()? > > Which is to trap all is_link_local_ether_addr() addresses to the CPU, do > > _not_ call dsa_default_offload_fwd_mark() for those packets (aka let the > > bridge know that they haven't been forwarded in hardware, and if they > > should reach other bridge ports, this must be done in software), and let the > > user choose, via the bridge group_fwd_mask, if they should be forwarded > > to other bridge ports or not? > > Again, I really don't know what the 'right' thing to do is for > multicast packets but the enabling of the reserved multicast table > done in commit 331d64f752bb ("net: dsa: microchip: add the > enable_stp_addr pointer in ksz_dev_ops") breaks forwarding of all > multicast packets that are not sent to 01-80-C2-00-00-00 Yes, because prior to that commit, this table wasn't consulted by the data path of the switch. > > > Datasheets: > > > [1] https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9897S-Data-Sheet-DS00002394C.pdf > > > [2] https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9896C-Data-Sheet-DS00002390C.pdf > > > [3] https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9893R-Data-Sheet-DS00002420D.pdf > > > [4] https://ww1.microchip.com/downloads/en/DeviceDoc/00002330B.pdf > > > [5] https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9563R-Data-Sheet-DS00002419D.pdf > > > [6] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/KSZ9567R-Data-Sheet-DS00002329.pdf > > > [7] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/KSZ9567R-Data-Sheet-DS00002329.pdf > > > > [6] and [7] are the same. > > > > Also, you'd better specify in the commit message what's with these datasheet > > links, which to me and I suppose all other non-expert readers, are pasted here > > out of the blue, with no context. > > > > Like for example: "KSZ9897, ..., have arbitrary CPU port assignments, as > > can be seen in the driver's ksz_chip_data :: cpu_ports entries for these > > families, and the CPU port selection on a certain board rarely coincides > > with the default host port selection in the Reserved Multicast address > > table". > > I was just trying to be thorough. I took the time to look up the > datasheets for all the switches that the ksz9447 driver supports to > ensure they all had the same default configuration policy and same > configuration method/registers so I thought I would include them in > the message. I can drop the datasheet links if they add no value. I > was also expecting perhaps the commit message was confusing so I > wanted to show where the information came from. They do add value, just guide the reader towards what they should be looking at, rather than throwing 6 books at them. I gave my own interpretation above of what I think should be the takeaway, after spending more than 1 hour studying, and I still might have not seen the same things as you. Just don't expect everybody to spend the same amount of time. > What you're suggesting above regarding trapping all > is_link_local_ether_addr() addresses to the CPU and not calling > dsa_default_offload_fwd_mark() is beyond my understanding. > If the behavior of the reserved multicast address table is non-standard The behavior of that table is customizable, in fact, and can be brought into compliance with the Linux network stack's expectations. Other network stacks might be different, but in Linux, the easiest way to achieve configurability of the group forwarding would be to let software do it. The bridge group_fwd_mask is a bit mask of reserved multicast groups (group X in 01-80-C2-00-00-0X). For example, setting bit 14 (0xe) (aka set group_fwd_mask to 0x4000) would mean "let group 01-80-C2-00-00-0E be forwarded on all bridge ports, and all other groups be just trapped". Conceivably, even in Linux there might be other ways to do it, like reprogram the hardware tables according to the group_fwd_mask value communicated through switchdev. But that infrastructure doesn't exist. > then it should be disabled and the content of ksz9477_enable_stp_addr() > removed. I wouldn't jump the gun so soon. > However based on Arun's commit message it seems that prior to > that patch STP BPDU packets were not being forwarded to the CPU so > it's unclear to me what the default behavior was for multicast without > the reserved muticast address table being enabled. I know that if the > table is disabled by removing the call to ksz9477_enable_stp_addr then > LLDP packets are forwarded to the cpu port like they were before that > patch. Reading the commit message: "In order to transmit the STP BPDU packet to the CPU port, the STP address (...) has to be added to static alu table", I think the correct key in which it should be interpreted is: "In order to transmit the STP BPDU packets [just] to the CPU port [rather than flood them towards all ports], the STP address has to ...". I will give it to you that it is quite a stretch to interpret it in this way, and it is also frustrating to me to have to extract technically valid information from loose formulations like these. Plus, unlike both you and Arun, no access to this hardware. But at least, this is the only interpretation with which I see no contradictions. I will let Arun confirm that the commit message should be interpreted in this way and not in another. But at the same time, you could also confirm that when the Reserved Multicast address table lookup is disabled, they are treated just like any other multicast traffic with no hit in the address table: aka broadcast. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering 2024-12-13 1:14 ` Vladimir Oltean @ 2024-12-27 19:06 ` Tim Harvey 2025-01-03 3:29 ` Arun.Ramadoss 0 siblings, 1 reply; 15+ messages in thread From: Tim Harvey @ 2024-12-27 19:06 UTC (permalink / raw) To: Arun Ramadoss Cc: Woojung Huh, UNGLinuxDriver, Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel, Vladimir Oltean On Thu, Dec 12, 2024 at 5:14 PM Vladimir Oltean <olteanv@gmail.com> wrote: > > On Thu, Dec 12, 2024 at 04:32:28PM -0800, Tim Harvey wrote: > > > > - forward to all but cpu port: > > > > > > Why would you not forward packets to the CPU port as a hardcoded configuration? > > > What if the KSZ ports are bridged together with a foreign interface > > > (different NIC, WLAN, tunnel etc), how should the packets reach that? > > > > If that is the correct thing to do I can certainly do that. I was > > assuming that the default policy above must be somewhat standard. This > > patch leaves the policy that was created by the default table > > configuration and just updates the port configuration based on the dt > > definition of the user vs host ports. > > I think you misunderstood my comment which you've quoted here, it was: > "why would you hardcode a configuration which can't be changed and which > doesn't include _at least_ the CPU port in the list of destination > ports?! isn't that needed for so many reasons?" > > This particular paragraph did not contain any suggestion of the form > "the correct thing to do is ...", just an observation that the choice > you've made is most likely wrong. > > > > > group 4 (01-80-C2-00)-00-20 (GMRP) > > > > group 5 (01-80-C2-00)-00-21 (GVRP) > > > > group 7 (01-80-C2-00)-00-11 - (01-80-C2-00)-00-1F, > > > > (01-80-C2-00)-00-22 - (01-80-C2-00)-00-2F > > > > > > Don't you want to forgo the (odd) hardware defaults for the Reserved Multicast > > > table, and instead follow what the Linux bridge does in br_handle_frame()? > > > Which is to trap all is_link_local_ether_addr() addresses to the CPU, do > > > _not_ call dsa_default_offload_fwd_mark() for those packets (aka let the > > > bridge know that they haven't been forwarded in hardware, and if they > > > should reach other bridge ports, this must be done in software), and let the > > > user choose, via the bridge group_fwd_mask, if they should be forwarded > > > to other bridge ports or not? > > > > Again, I really don't know what the 'right' thing to do is for > > multicast packets but the enabling of the reserved multicast table > > done in commit 331d64f752bb ("net: dsa: microchip: add the > > enable_stp_addr pointer in ksz_dev_ops") breaks forwarding of all > > multicast packets that are not sent to 01-80-C2-00-00-00 > > Yes, because prior to that commit, this table wasn't consulted by the > data path of the switch. > > > > > Datasheets: > > > > [1] https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9897S-Data-Sheet-DS00002394C.pdf > > > > [2] https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9896C-Data-Sheet-DS00002390C.pdf > > > > [3] https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9893R-Data-Sheet-DS00002420D.pdf > > > > [4] https://ww1.microchip.com/downloads/en/DeviceDoc/00002330B.pdf > > > > [5] https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9563R-Data-Sheet-DS00002419D.pdf > > > > [6] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/KSZ9567R-Data-Sheet-DS00002329.pdf > > > > [7] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/KSZ9567R-Data-Sheet-DS00002329.pdf > > > > > > [6] and [7] are the same. > > > > > > Also, you'd better specify in the commit message what's with these datasheet > > > links, which to me and I suppose all other non-expert readers, are pasted here > > > out of the blue, with no context. > > > > > > Like for example: "KSZ9897, ..., have arbitrary CPU port assignments, as > > > can be seen in the driver's ksz_chip_data :: cpu_ports entries for these > > > families, and the CPU port selection on a certain board rarely coincides > > > with the default host port selection in the Reserved Multicast address > > > table". > > > > I was just trying to be thorough. I took the time to look up the > > datasheets for all the switches that the ksz9447 driver supports to > > ensure they all had the same default configuration policy and same > > configuration method/registers so I thought I would include them in > > the message. I can drop the datasheet links if they add no value. I > > was also expecting perhaps the commit message was confusing so I > > wanted to show where the information came from. > > They do add value, just guide the reader towards what they should be > looking at, rather than throwing 6 books at them. I gave my own > interpretation above of what I think should be the takeaway, after > spending more than 1 hour studying, and I still might have not seen the > same things as you. Just don't expect everybody to spend the same amount > of time. > > > What you're suggesting above regarding trapping all > > is_link_local_ether_addr() addresses to the CPU and not calling > > dsa_default_offload_fwd_mark() is beyond my understanding. > > If the behavior of the reserved multicast address table is non-standard > > The behavior of that table is customizable, in fact, and can be brought > into compliance with the Linux network stack's expectations. > > Other network stacks might be different, but in Linux, the easiest way > to achieve configurability of the group forwarding would be to let > software do it. The bridge group_fwd_mask is a bit mask of reserved > multicast groups (group X in 01-80-C2-00-00-0X). For example, setting > bit 14 (0xe) (aka set group_fwd_mask to 0x4000) would mean "let group > 01-80-C2-00-00-0E be forwarded on all bridge ports, and all other groups > be just trapped". > > Conceivably, even in Linux there might be other ways to do it, like > reprogram the hardware tables according to the group_fwd_mask value > communicated through switchdev. But that infrastructure doesn't exist. > > > then it should be disabled and the content of ksz9477_enable_stp_addr() > > removed. > > I wouldn't jump the gun so soon. > > > However based on Arun's commit message it seems that prior to > > that patch STP BPDU packets were not being forwarded to the CPU so > > it's unclear to me what the default behavior was for multicast without > > the reserved muticast address table being enabled. I know that if the > > table is disabled by removing the call to ksz9477_enable_stp_addr then > > LLDP packets are forwarded to the cpu port like they were before that > > patch. > > Reading the commit message: "In order to transmit the STP BPDU packet to > the CPU port, the STP address (...) has to be added to static alu table", > I think the correct key in which it should be interpreted is: > > "In order to transmit the STP BPDU packets [just] to the CPU port > [rather than flood them towards all ports], the STP address has to ...". > > I will give it to you that it is quite a stretch to interpret it in this > way, and it is also frustrating to me to have to extract technically > valid information from loose formulations like these. Plus, unlike both > you and Arun, no access to this hardware. But at least, this is the only > interpretation with which I see no contradictions. > > I will let Arun confirm that the commit message should be interpreted in > this way and not in another. But at the same time, you could also > confirm that when the Reserved Multicast address table lookup is disabled, > they are treated just like any other multicast traffic with no hit in > the address table: aka broadcast. Arun, Can you confirm that prior to commit 331d64f752bb ("net: dsa: microchip: add the enable_stp_addr pointer in ksz_dev_ops") that packets in the bridge group were being forwarded to all ports and that the intention of the patch was to limit them to only go to 'only' the cpu port? Do you have any comments on this patch with regards to how the other packet groups should be configured? Best Regards, Tim ^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering 2024-12-27 19:06 ` Tim Harvey @ 2025-01-03 3:29 ` Arun.Ramadoss 2025-01-14 18:27 ` Tim Harvey 0 siblings, 1 reply; 15+ messages in thread From: Arun.Ramadoss @ 2025-01-03 3:29 UTC (permalink / raw) To: tharvey Cc: Woojung.Huh, UNGLinuxDriver, andrew, davem, edumazet, kuba, pabeni, netdev, linux-kernel, olteanv Hi Tim, > > Arun, > > Can you confirm that prior to commit 331d64f752bb ("net: dsa: > microchip: add the enable_stp_addr pointer in ksz_dev_ops") that packets in the bridge group were being forwarded to all ports and > that the intention of the patch was to limit them to only go to 'only' the cpu port? By Default, In the reserved multicast table, the forwarding port for the mac address 01:80:C2:00:00:00 (BPDU Mac address) is 10h. It means, the STP BPDU packet will be forwarded to Port 5. However, the CPU port varies depending on the board schematics. Hence to fix this, forwarding port needs to changed in reserved multicast table. Procedure for updating Override bit: - 0x0424 = 0x8000_0000 | dev->cpu_port. (0x10 for lan9370 and 0x20 for lan9374) - 0x41c = 0x0000_0085 (write, reserved multicast table and start. Table Index is 0 which is the index for BPDU packet 01:80:c2:00:00:00) - 0x41c = wait till start bit is cleared. Let me know, if you need any more information. > > Do you have any comments on this patch with regards to how the other packet groups should be configured? > > Best Regards, > > Tim ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering 2025-01-03 3:29 ` Arun.Ramadoss @ 2025-01-14 18:27 ` Tim Harvey 2025-01-16 9:06 ` Arun.Ramadoss 0 siblings, 1 reply; 15+ messages in thread From: Tim Harvey @ 2025-01-14 18:27 UTC (permalink / raw) To: Arun.Ramadoss Cc: Woojung.Huh, UNGLinuxDriver, andrew, davem, edumazet, kuba, pabeni, netdev, linux-kernel, olteanv On Thu, Jan 2, 2025 at 7:29 PM <Arun.Ramadoss@microchip.com> wrote: > > Hi Tim, > > > > > Arun, > > > > Can you confirm that prior to commit 331d64f752bb ("net: dsa: > > microchip: add the enable_stp_addr pointer in ksz_dev_ops") that packets in the bridge group were being forwarded to all ports and > > that the intention of the patch was to limit them to only go to 'only' the cpu port? > > By Default, In the reserved multicast table, the forwarding port for the mac address 01:80:C2:00:00:00 (BPDU Mac address) is 10h. It means, the STP BPDU packet will be forwarded to Port 5. However, the CPU port varies depending on the board schematics. Hence to fix this, forwarding port needs to changed in reserved multicast table. > > Procedure for updating Override bit: > - 0x0424 = 0x8000_0000 | dev->cpu_port. (0x10 for lan9370 and 0x20 for lan9374) > - 0x41c = 0x0000_0085 (write, reserved multicast table and start. Table Index is 0 which is the index for BPDU packet 01:80:c2:00:00:00) > - 0x41c = wait till start bit is cleared. > > Let me know, if you need any more information. > Hi Arun, Ok, that makes sense to me and falls in line with what my patch here was trying to do. When you enable the reserved multicast table it makes sense to update the entire table right? You are only updating one address/group. Can you please review and comment on my patch here? Best Regards, Tim ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering 2025-01-14 18:27 ` Tim Harvey @ 2025-01-16 9:06 ` Arun.Ramadoss 2025-01-17 1:09 ` Tristram.Ha 0 siblings, 1 reply; 15+ messages in thread From: Arun.Ramadoss @ 2025-01-16 9:06 UTC (permalink / raw) To: tharvey Cc: andrew, davem, olteanv, Woojung.Huh, linux-kernel, pabeni, netdev, edumazet, UNGLinuxDriver, kuba Hi Tim, > Hi Arun, > > Ok, that makes sense to me and falls in line with what my patch here > was trying to do. When you enable the reserved multicast table it > makes sense to update the entire table right? You are only updating > one address/group. Can you please review and comment on my patch > here? During my testing of STP protocol, I found that Group 0 of reserved multicast table needs to be updated. Since I have not worked on other groups in the multicast table, I didn't update it. I could not find the original patch to review, it shows "not found" in lore.kernel.org. Below are my comments, - Why override bit is not set in REG_SW_ALU_VAL_B register. - ksz9477_enable_stp_addr() can be renamed since it updates all the table entries. > Best Regards, > > Tim ^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering 2025-01-16 9:06 ` Arun.Ramadoss @ 2025-01-17 1:09 ` Tristram.Ha 2025-01-17 1:25 ` Tim Harvey 0 siblings, 1 reply; 15+ messages in thread From: Tristram.Ha @ 2025-01-17 1:09 UTC (permalink / raw) To: Arun.Ramadoss, tharvey Cc: andrew, davem, olteanv, Woojung.Huh, linux-kernel, pabeni, netdev, edumazet, UNGLinuxDriver, kuba > Hi Tim, > > > Hi Arun, > > > > Ok, that makes sense to me and falls in line with what my patch here > > was trying to do. When you enable the reserved multicast table it > > makes sense to update the entire table right? You are only updating > > one address/group. Can you please review and comment on my patch > > here? > > > During my testing of STP protocol, I found that Group 0 of reserved > multicast table needs to be updated. Since I have not worked on other > groups in the multicast table, I didn't update it. > > I could not find the original patch to review, it shows "not found" in > lore.kernel.org. > > Below are my comments, > > - Why override bit is not set in REG_SW_ALU_VAL_B register. > - ksz9477_enable_stp_addr() can be renamed since it updates all the > table entries. The reserved multicast table has only 8 entries that apply to 48 multicast addresses, so some addresses share one entry. Some entries that are supposed to forward only to the host port or skip should be updated when that host port is not the default one. The override bit should be set for the STP address as that is required for receiving when the port is closed. Some entries for MVRP/MSRP should forward to the host port when the host can process those messages and broadcast to all ports when the host does not process those messages, but that is not controllable by the switch driver so I do not know how to handle in this situation. The default reserved multicast table forwards to host port on entries 0, 2, and 6; skips host port on entries 4, 5, and 7; forwards to all ports on entry 3; and drops on entry 1. enable_stp_addr() is used to enable STP support in all KSZ switches, so ksz9477_enable_stp_addr() cannot be simply renamed. It is probably best to have a specific setup_reserved_multicast_table In KSZ9477 and LAN937X drivers to update those entries. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering 2025-01-17 1:09 ` Tristram.Ha @ 2025-01-17 1:25 ` Tim Harvey 2025-01-17 16:13 ` Vladimir Oltean 2025-01-18 0:49 ` Tristram.Ha 0 siblings, 2 replies; 15+ messages in thread From: Tim Harvey @ 2025-01-17 1:25 UTC (permalink / raw) To: Tristram.Ha Cc: Arun.Ramadoss, andrew, davem, olteanv, Woojung.Huh, linux-kernel, pabeni, netdev, edumazet, UNGLinuxDriver, kuba On Thu, Jan 16, 2025 at 5:09 PM <Tristram.Ha@microchip.com> wrote: > > > Hi Tim, > > > > > Hi Arun, > > > > > > Ok, that makes sense to me and falls in line with what my patch here > > > was trying to do. When you enable the reserved multicast table it > > > makes sense to update the entire table right? You are only updating > > > one address/group. Can you please review and comment on my patch > > > here? > > > > > > During my testing of STP protocol, I found that Group 0 of reserved > > multicast table needs to be updated. Since I have not worked on other > > groups in the multicast table, I didn't update it. > > > > I could not find the original patch to review, it shows "not found" in > > lore.kernel.org. > > > > Below are my comments, > > > > - Why override bit is not set in REG_SW_ALU_VAL_B register. > > - ksz9477_enable_stp_addr() can be renamed since it updates all the > > table entries. > > The reserved multicast table has only 8 entries that apply to 48 > multicast addresses, so some addresses share one entry. > > Some entries that are supposed to forward only to the host port or skip > should be updated when that host port is not the default one. > > The override bit should be set for the STP address as that is required > for receiving when the port is closed. > > Some entries for MVRP/MSRP should forward to the host port when the host > can process those messages and broadcast to all ports when the host does > not process those messages, but that is not controllable by the switch > driver so I do not know how to handle in this situation. > Hi Tristram, Thanks for your feedback. What is the behavior when the reserved multicast table is not enabled (does it forward to all ports, drop all mcast, something else?) > The default reserved multicast table forwards to host port on entries 0, > 2, and 6; skips host port on entries 4, 5, and 7; forwards to all ports > on entry 3; and drops on entry 1. > Is this behavior the desired behavior as far as the Linux DSA folks would want? commit 331d64f752bb ("net: dsa: microchip: add the enable_stp_addr pointer in ksz_dev_ops") enables the reserved multicast table and adjusts the cpu port for entry 0 leaving the rest the same (and wrong if the cpu port is not the highest port in the switch). My patch adjusts the entries but keeps the rules the same and the question that is posed is that the right thing to do with respect to Linux DSA? Best Regard, Tim > enable_stp_addr() is used to enable STP support in all KSZ switches, so > ksz9477_enable_stp_addr() cannot be simply renamed. > > It is probably best to have a specific setup_reserved_multicast_table > In KSZ9477 and LAN937X drivers to update those entries. > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering 2025-01-17 1:25 ` Tim Harvey @ 2025-01-17 16:13 ` Vladimir Oltean 2025-01-17 21:02 ` Tim Harvey 2025-01-18 0:49 ` Tristram.Ha 1 sibling, 1 reply; 15+ messages in thread From: Vladimir Oltean @ 2025-01-17 16:13 UTC (permalink / raw) To: Tim Harvey Cc: Tristram.Ha, Arun.Ramadoss, andrew, davem, Woojung.Huh, linux-kernel, pabeni, netdev, edumazet, UNGLinuxDriver, kuba On Thu, Jan 16, 2025 at 05:25:49PM -0800, Tim Harvey wrote: > On Thu, Jan 16, 2025 at 5:09 PM <Tristram.Ha@microchip.com> wrote: > > > > > Hi Tim, > > > > > > > Hi Arun, > > > > > > > > Ok, that makes sense to me and falls in line with what my patch here > > > > was trying to do. When you enable the reserved multicast table it > > > > makes sense to update the entire table right? You are only updating > > > > one address/group. Can you please review and comment on my patch > > > > here? > > > > > > > > > During my testing of STP protocol, I found that Group 0 of reserved > > > multicast table needs to be updated. Since I have not worked on other > > > groups in the multicast table, I didn't update it. > > > > > > I could not find the original patch to review, it shows "not found" in > > > lore.kernel.org. > > > > > > Below are my comments, > > > > > > - Why override bit is not set in REG_SW_ALU_VAL_B register. > > > - ksz9477_enable_stp_addr() can be renamed since it updates all the > > > table entries. > > > > The reserved multicast table has only 8 entries that apply to 48 > > multicast addresses, so some addresses share one entry. > > > > Some entries that are supposed to forward only to the host port or skip > > should be updated when that host port is not the default one. > > > > The override bit should be set for the STP address as that is required > > for receiving when the port is closed. > > > > Some entries for MVRP/MSRP should forward to the host port when the host > > can process those messages and broadcast to all ports when the host does > > not process those messages, but that is not controllable by the switch > > driver so I do not know how to handle in this situation. > > > > Hi Tristram, > > Thanks for your feedback. > > What is the behavior when the reserved multicast table is not enabled > (does it forward to all ports, drop all mcast, something else?) > > > The default reserved multicast table forwards to host port on entries 0, > > 2, and 6; skips host port on entries 4, 5, and 7; forwards to all ports > > on entry 3; and drops on entry 1. > > > > Is this behavior the desired behavior as far as the Linux DSA folks would want? > > commit 331d64f752bb ("net: dsa: microchip: add the enable_stp_addr > pointer in ksz_dev_ops") enables the reserved multicast table and > adjusts the cpu port for entry 0 leaving the rest the same (and wrong > if the cpu port is not the highest port in the switch). > > My patch adjusts the entries but keeps the rules the same and the > question that is posed is that the right thing to do with respect to > Linux DSA? > > Best Regard, > > Tim Not sure if that's a question for the DSA maintainers or for Tristram, because I've expressed already earlier in this thread what is the current way in which the Linux bridge expects this mechanism to work. This is not the Microchip KSZ SDK, and thus, following the network stack rules is not optional, and inventing local conventions when there already exist global ones is a no-go. If you don't like the conventions you are of course free to challenge them, but this is not the right audience to do that. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering 2025-01-17 16:13 ` Vladimir Oltean @ 2025-01-17 21:02 ` Tim Harvey 2025-01-18 1:18 ` Vladimir Oltean 0 siblings, 1 reply; 15+ messages in thread From: Tim Harvey @ 2025-01-17 21:02 UTC (permalink / raw) To: Vladimir Oltean Cc: Tristram.Ha, Arun.Ramadoss, andrew, davem, Woojung.Huh, linux-kernel, pabeni, netdev, edumazet, UNGLinuxDriver, kuba On Fri, Jan 17, 2025 at 8:13 AM Vladimir Oltean <olteanv@gmail.com> wrote: > > On Thu, Jan 16, 2025 at 05:25:49PM -0800, Tim Harvey wrote: > > On Thu, Jan 16, 2025 at 5:09 PM <Tristram.Ha@microchip.com> wrote: > > > > > > > Hi Tim, > > > > > > > > > Hi Arun, > > > > > > > > > > Ok, that makes sense to me and falls in line with what my patch here > > > > > was trying to do. When you enable the reserved multicast table it > > > > > makes sense to update the entire table right? You are only updating > > > > > one address/group. Can you please review and comment on my patch > > > > > here? > > > > > > > > > > > > During my testing of STP protocol, I found that Group 0 of reserved > > > > multicast table needs to be updated. Since I have not worked on other > > > > groups in the multicast table, I didn't update it. > > > > > > > > I could not find the original patch to review, it shows "not found" in > > > > lore.kernel.org. > > > > > > > > Below are my comments, > > > > > > > > - Why override bit is not set in REG_SW_ALU_VAL_B register. > > > > - ksz9477_enable_stp_addr() can be renamed since it updates all the > > > > table entries. > > > > > > The reserved multicast table has only 8 entries that apply to 48 > > > multicast addresses, so some addresses share one entry. > > > > > > Some entries that are supposed to forward only to the host port or skip > > > should be updated when that host port is not the default one. > > > > > > The override bit should be set for the STP address as that is required > > > for receiving when the port is closed. > > > > > > Some entries for MVRP/MSRP should forward to the host port when the host > > > can process those messages and broadcast to all ports when the host does > > > not process those messages, but that is not controllable by the switch > > > driver so I do not know how to handle in this situation. > > > > > > > Hi Tristram, > > > > Thanks for your feedback. > > > > What is the behavior when the reserved multicast table is not enabled > > (does it forward to all ports, drop all mcast, something else?) > > > > > The default reserved multicast table forwards to host port on entries 0, > > > 2, and 6; skips host port on entries 4, 5, and 7; forwards to all ports > > > on entry 3; and drops on entry 1. > > > > > > > Is this behavior the desired behavior as far as the Linux DSA folks would want? > > > > commit 331d64f752bb ("net: dsa: microchip: add the enable_stp_addr > > pointer in ksz_dev_ops") enables the reserved multicast table and > > adjusts the cpu port for entry 0 leaving the rest the same (and wrong > > if the cpu port is not the highest port in the switch). > > > > My patch adjusts the entries but keeps the rules the same and the > > question that is posed is that the right thing to do with respect to > > Linux DSA? > > > > Best Regard, > > > > Tim > > Not sure if that's a question for the DSA maintainers or for Tristram, > because I've expressed already earlier in this thread what is the > current way in which the Linux bridge expects this mechanism to work. > This is not the Microchip KSZ SDK, and thus, following the network stack > rules is not optional, and inventing local conventions when there > already exist global ones is a no-go. If you don't like the conventions > you are of course free to challenge them, but this is not the right > audience to do that. Vladimir, The question to Tristram was what is the behavior if the multicast address table is disabled which was the case prior to commit 331d64f752bb ("net: dsa: microchip: add the enable_stp_addr pointer in ksz_dev_ops"). I have tested and been able to determine that the behavior without the multicast address table enabled is that multicast packets are forwarded to all ports. What commit 331d64f752bb ("net: dsa: microchip: add the enable_stp_addr pointer in ksz_dev_ops") does is to effectively 'disable' forwarding of packets to 01-80-C2-00-00-00 (group 0) to downstream ports. I misread or confused the commit log by thinking the patch was 'enabling' forwarding... it's the opposite. The flaw with that patch is that enabling the multicast address table invokes other default rules in the table that need to be re-configured for the cpu port but the patch only configures group 0 (01-80-C2-00-00-00). It fails to configure group 6 (01-80-C2-00-00-08) which is also used for STP so I would argue that it doesn't even do what the commit log says it does. It also has the side effect of disabling forwarding of other groups that were previously forwarded: - group 1 01-80-C2-00-00-01 (MAC Control Frame) (previously were forwarded, now are dropped) - group 2 01-80-C2-00-00-03 (802.1X access control) (previously were forwarded, now are forwarded to the highest port which may not be the cpu port) - group 4 01-80-C2-00-00-20 (GMRP) (previously were forwarded, now forwarded to all except the highest port number which may not be the cpu port) - group 5 01-80-C2-00-00-21 (GVRP) (previously were forwarded, now forwarded to all except the highest port number which may not be the cpu port) - group 6 01-80-C2-00-00-02, 01-80-C2-00-00-04 - 01-80-C2-00-00-0F (previously were forwarded, now are forwarded to the highest port which may not be the cpu port) - group 7 01-80-C2-00-00-11 - 01-80-C2-00-00-1F, 01-80-C2-00-00-22 - 01-80-C2-00-00-2F (previously were forwarded, now forwarded to all except the highest port number which may not be the cpu port) To fix this, I propose adding a function to configure each of the above groups (which are hardware filtering functions of the switch) with proper port masks but I need to know from the DSA experts what is desired for the port mask of those groups. The multicast address table can only invoke rules based on those groups of addresses so if that is not flexible enough then the multicast address table should instead be disabled. Obviously Alun felt that forwarding STP packets to only the cpu port instead of all ports was desired and there were no complaints about that. If that is wrong then I can revert the enabling of the multicast address table for ksz9477 and restore the default forwarding of multicast packets to all ports. I am not that familiar with Linux DSA so I don't know if it's normal to forward multicast packets to upstream cpu port only, all ports, or no ports. If there is some software table where the decision making is done then the multicast address table should not have been enabled at all in order to allow the software to determine the course of action. My goal was to restore multicast packets flowing through to the CPU port which in my board's case is not the highest number port so the above default rules are just wrong and would need to be updated if the multicast address table is to be used. How would you like me to proceed? Best Regards, Tim ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering 2025-01-17 21:02 ` Tim Harvey @ 2025-01-18 1:18 ` Vladimir Oltean 2025-01-23 1:48 ` Tim Harvey 0 siblings, 1 reply; 15+ messages in thread From: Vladimir Oltean @ 2025-01-18 1:18 UTC (permalink / raw) To: Tim Harvey Cc: Tristram.Ha, Arun.Ramadoss, andrew, davem, Woojung.Huh, linux-kernel, pabeni, netdev, edumazet, UNGLinuxDriver, kuba On Fri, Jan 17, 2025 at 01:02:31PM -0800, Tim Harvey wrote: > The flaw with that patch is that enabling the multicast address table > invokes other default rules in the table that need to be re-configured > for the cpu port but the patch only configures group 0 > (01-80-C2-00-00-00). It fails to configure group 6 (01-80-C2-00-00-08) > which is also used for STP so I would argue that it doesn't even do > what the commit log says it does. It also has the side effect of > disabling forwarding of other groups that were previously forwarded: > - group 1 01-80-C2-00-00-01 (MAC Control Frame) (previously were > forwarded, now are dropped) > - group 2 01-80-C2-00-00-03 (802.1X access control) (previously were > forwarded, now are forwarded to the highest port which may not be the > cpu port) > - group 4 01-80-C2-00-00-20 (GMRP) (previously were forwarded, now > forwarded to all except the highest port number which may not be the > cpu port) > - group 5 01-80-C2-00-00-21 (GVRP) (previously were forwarded, now > forwarded to all except the highest port number which may not be the > cpu port) > - group 6 01-80-C2-00-00-02, 01-80-C2-00-00-04 - 01-80-C2-00-00-0F > (previously were forwarded, now are forwarded to the highest port > which may not be the cpu port) > - group 7 01-80-C2-00-00-11 - 01-80-C2-00-00-1F, 01-80-C2-00-00-22 - > 01-80-C2-00-00-2F (previously were forwarded, now forwarded to all > except the highest port number which may not be the cpu port) > To fix this, I propose adding a function to configure each of the > above groups (which are hardware filtering functions of the switch) > with proper port masks but I need to know from the DSA experts what is > desired for the port mask of those groups. The multicast address table > can only invoke rules based on those groups of addresses so if that is > not flexible enough then the multicast address table should instead be > disabled. The recommendation from the DSA maintainers will be to follow what the software bridge data path does, which just means testing and seeing how each group reacts to the known inputs which might affect it, i.e.: - is it a group of the form 01-80-c2-00-00-0X? if yes, group_fwd_mask should dictate how it is forwarded by software. All that hardware needs to take care of is to send it just to the CPU. - is multicast flooding enabled on the egress port? - is there an MDB entry towards the egress port? how about another port? The groups outside the 01-80-c2-00-00-0X range should be treated as regular multicast, i.e. group_fwd_mask doesn't matter, and mdb/flooding does. One easy way out, if synchronizing the hardware port masks with the software state turns out too hard, is to configure the switch to send all these groups just to the CPU, and make sure skb->offload_fwd_mark is unset for packets belonging to these groups (don't call dsa_default_offload_fwd_mark() from the tagger). The software takes this as a cue that it should forward them where the hardware didn't reach. Also, never exclude the CPU port from the destination port mask, unless you really, really know what you're doing. The software bridge might need to forward to another foreign (non-switch) bridge port which is an Intel e1000 card, or a Wi-Fi AP, or a tunnel, and by cutting out the CPU from the flood path, you're taking that possibility away from it. Here's a script to get you started with testing. #!/bin/bash ARP=" \ ff:ff:ff:ff:ff:ff 00:00:de:ad:be:ef 08 06 00 01 \ 08 00 06 04 00 01 e0 07 1b 81 13 40 c0 a8 01 ad \ 00 00 00 00 00 00 c0 a8 01 ea" groups=( \ 01:80:C2:00:00:00 \ 01:80:C2:00:00:08 \ 01:80:C2:00:00:01 \ 01:80:C2:00:00:03 \ 01:80:C2:00:00:20 \ 01:80:C2:00:00:21 \ 01:80:C2:00:00:02 \ 01:80:C2:00:00:04 \ 01:80:C2:00:00:0F \ 01:80:C2:00:00:11 \ 01:80:C2:00:00:1F \ 01:80:C2:00:00:22 \ 01:80:C2:00:00:2F \ ) pkt_count=1000 mac_get() { local if_name=$1 ip -j link show dev $if_name | jq -r '.[]["address"]' } get_rx_stats() { local if_name=$1 ip -j -s link show $if_name | jq '.[].stats64.rx.packets' } last_nibble() { local macaddr=$1 echo "0x${macaddr:0-1}" } send_raw() { local if_name=$1; shift local group=$1; shift local pkt="$1"; shift local smac=$(mac_get $if_name) pkt="${pkt/ff:ff:ff:ff:ff:ff/$group}" pkt="${pkt/00:00:de:ad:be:ef/$smac}" mausezahn -c $pkt_count -q $if_name "$pkt" } run_test() { before=$(get_rx_stats veth4) send_raw veth0 $group "$ARP" after=$(get_rx_stats veth4) delta=$((after - before)) [ $delta -ge $pkt_count ] && echo "forwarded" || echo "not forwarded" } # br0 # / | \ # / | \ # / | \ # / | \ # veth1 veth3 veth5 # | | | # veth0 veth2 veth4 ip link add veth0 type veth peer name veth1 ip link add veth2 type veth peer name veth3 ip link add veth4 type veth peer name veth5 ip link add br0 type bridge && ip link set br0 up ip link set veth1 master br0 && ip link set veth1 up ip link set veth3 master br0 && ip link set veth3 up ip link set veth5 master br0 && ip link set veth5 up ip link set veth0 up && ip link set veth2 up && ip link set veth4 up for group in "${groups[@]}"; do ip link set veth5 type bridge_slave mcast_flood on with_flooding=$(run_test $group) ip link set veth5 type bridge_slave mcast_flood off without_flooding=$(run_test $group) bridge mdb add dev br0 port veth5 grp $group permanent with_mdb_and_no_flooding=$(run_test $group) bridge mdb del dev br0 port veth5 grp $group permanent # restore ip link set veth5 type bridge_slave mcast_flood on # restore bridge mdb add dev br0 port veth3 grp $group permanent with_mdb_on_another_port=$(run_test $group) bridge mdb del dev br0 port veth3 grp $group permanent # restore ip link set br0 type bridge group_fwd_mask $((1 << $(last_nibble $group))) 2>/dev/null if [ $? = 0 ]; then with_group_fwd_mask=$(run_test $group) ip link set br0 type bridge group_fwd_mask 0 # restore else with_group_fwd_mask="can't test" fi printf "Group %s: %s with flooding, %s without flooding, %s with mdb and no flooding, %s with mdb on another port and flooding, %s with group_fwd_mask\n" \ "$group" \ "$with_flooding" \ "$without_flooding" \ "$with_mdb_and_no_flooding" \ "$with_mdb_on_another_port" \ "$with_group_fwd_mask" \ done ip link del veth0 ip link del veth2 ip link del veth4 ip link del br0 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering 2025-01-18 1:18 ` Vladimir Oltean @ 2025-01-23 1:48 ` Tim Harvey 2025-01-27 12:54 ` Vladimir Oltean 0 siblings, 1 reply; 15+ messages in thread From: Tim Harvey @ 2025-01-23 1:48 UTC (permalink / raw) To: Vladimir Oltean Cc: Tristram.Ha, Arun.Ramadoss, andrew, davem, Woojung.Huh, linux-kernel, pabeni, netdev, edumazet, UNGLinuxDriver, kuba On Fri, Jan 17, 2025 at 5:18 PM Vladimir Oltean <olteanv@gmail.com> wrote: > > On Fri, Jan 17, 2025 at 01:02:31PM -0800, Tim Harvey wrote: > > The flaw with that patch is that enabling the multicast address table > > invokes other default rules in the table that need to be re-configured > > for the cpu port but the patch only configures group 0 > > (01-80-C2-00-00-00). It fails to configure group 6 (01-80-C2-00-00-08) > > which is also used for STP so I would argue that it doesn't even do > > what the commit log says it does. It also has the side effect of > > disabling forwarding of other groups that were previously forwarded: > > - group 1 01-80-C2-00-00-01 (MAC Control Frame) (previously were > > forwarded, now are dropped) > > - group 2 01-80-C2-00-00-03 (802.1X access control) (previously were > > forwarded, now are forwarded to the highest port which may not be the > > cpu port) > > - group 4 01-80-C2-00-00-20 (GMRP) (previously were forwarded, now > > forwarded to all except the highest port number which may not be the > > cpu port) > > - group 5 01-80-C2-00-00-21 (GVRP) (previously were forwarded, now > > forwarded to all except the highest port number which may not be the > > cpu port) > > - group 6 01-80-C2-00-00-02, 01-80-C2-00-00-04 - 01-80-C2-00-00-0F > > (previously were forwarded, now are forwarded to the highest port > > which may not be the cpu port) > > - group 7 01-80-C2-00-00-11 - 01-80-C2-00-00-1F, 01-80-C2-00-00-22 - > > 01-80-C2-00-00-2F (previously were forwarded, now forwarded to all > > except the highest port number which may not be the cpu port) > > > To fix this, I propose adding a function to configure each of the > > above groups (which are hardware filtering functions of the switch) > > with proper port masks but I need to know from the DSA experts what is > > desired for the port mask of those groups. The multicast address table > > can only invoke rules based on those groups of addresses so if that is > > not flexible enough then the multicast address table should instead be > > disabled. > > The recommendation from the DSA maintainers will be to follow what the > software bridge data path does, which just means testing and seeing how > each group reacts to the known inputs which might affect it, i.e.: > > - is it a group of the form 01-80-c2-00-00-0X? if yes, group_fwd_mask > should dictate how it is forwarded by software. All that hardware > needs to take care of is to send it just to the CPU. > > - is multicast flooding enabled on the egress port? > > - is there an MDB entry towards the egress port? how about another port? > The groups outside the 01-80-c2-00-00-0X range should be treated as > regular multicast, i.e. group_fwd_mask doesn't matter, and mdb/flooding > does. > > One easy way out, if synchronizing the hardware port masks with the > software state turns out too hard, is to configure the switch to send > all these groups just to the CPU, and make sure skb->offload_fwd_mark is > unset for packets belonging to these groups (don't call > dsa_default_offload_fwd_mark() from the tagger). The software takes this > as a cue that it should forward them where the hardware didn't reach. > > Also, never exclude the CPU port from the destination port mask, unless > you really, really know what you're doing. The software bridge might > need to forward to another foreign (non-switch) bridge port which is an > Intel e1000 card, or a Wi-Fi AP, or a tunnel, and by cutting out the CPU > from the flood path, you're taking that possibility away from it. > > Here's a script to get you started with testing. > > #!/bin/bash > > ARP=" \ > ff:ff:ff:ff:ff:ff 00:00:de:ad:be:ef 08 06 00 01 \ > 08 00 06 04 00 01 e0 07 1b 81 13 40 c0 a8 01 ad \ > 00 00 00 00 00 00 c0 a8 01 ea" > groups=( \ > 01:80:C2:00:00:00 \ > 01:80:C2:00:00:08 \ > 01:80:C2:00:00:01 \ > 01:80:C2:00:00:03 \ > 01:80:C2:00:00:20 \ > 01:80:C2:00:00:21 \ > 01:80:C2:00:00:02 \ > 01:80:C2:00:00:04 \ > 01:80:C2:00:00:0F \ > 01:80:C2:00:00:11 \ > 01:80:C2:00:00:1F \ > 01:80:C2:00:00:22 \ > 01:80:C2:00:00:2F \ > ) > pkt_count=1000 > > mac_get() > { > local if_name=$1 > > ip -j link show dev $if_name | jq -r '.[]["address"]' > } > > get_rx_stats() > { > local if_name=$1 > > ip -j -s link show $if_name | jq '.[].stats64.rx.packets' > } > > last_nibble() > { > local macaddr=$1 > > echo "0x${macaddr:0-1}" > } > > send_raw() > { > local if_name=$1; shift > local group=$1; shift > local pkt="$1"; shift > local smac=$(mac_get $if_name) > > pkt="${pkt/ff:ff:ff:ff:ff:ff/$group}" > pkt="${pkt/00:00:de:ad:be:ef/$smac}" > > mausezahn -c $pkt_count -q $if_name "$pkt" > } > > run_test() > { > before=$(get_rx_stats veth4) > send_raw veth0 $group "$ARP" > after=$(get_rx_stats veth4) > delta=$((after - before)) > > [ $delta -ge $pkt_count ] && echo "forwarded" || echo "not forwarded" > } > > # br0 > # / | \ > # / | \ > # / | \ > # / | \ > # veth1 veth3 veth5 > # | | | > # veth0 veth2 veth4 > ip link add veth0 type veth peer name veth1 > ip link add veth2 type veth peer name veth3 > ip link add veth4 type veth peer name veth5 > ip link add br0 type bridge && ip link set br0 up > ip link set veth1 master br0 && ip link set veth1 up > ip link set veth3 master br0 && ip link set veth3 up > ip link set veth5 master br0 && ip link set veth5 up > ip link set veth0 up && ip link set veth2 up && ip link set veth4 up > > for group in "${groups[@]}"; do > ip link set veth5 type bridge_slave mcast_flood on > with_flooding=$(run_test $group) > > ip link set veth5 type bridge_slave mcast_flood off > without_flooding=$(run_test $group) > > bridge mdb add dev br0 port veth5 grp $group permanent > with_mdb_and_no_flooding=$(run_test $group) > bridge mdb del dev br0 port veth5 grp $group permanent # restore > > ip link set veth5 type bridge_slave mcast_flood on # restore > > bridge mdb add dev br0 port veth3 grp $group permanent > with_mdb_on_another_port=$(run_test $group) > bridge mdb del dev br0 port veth3 grp $group permanent # restore > > ip link set br0 type bridge group_fwd_mask $((1 << $(last_nibble $group))) 2>/dev/null > if [ $? = 0 ]; then > with_group_fwd_mask=$(run_test $group) > ip link set br0 type bridge group_fwd_mask 0 # restore > else > with_group_fwd_mask="can't test" > fi > > printf "Group %s: %s with flooding, %s without flooding, %s with mdb and no flooding, %s with mdb on another port and flooding, %s with group_fwd_mask\n" \ > "$group" \ > "$with_flooding" \ > "$without_flooding" \ > "$with_mdb_and_no_flooding" \ > "$with_mdb_on_another_port" \ > "$with_group_fwd_mask" \ > > done > > ip link del veth0 > ip link del veth2 > ip link del veth4 > ip link del br0 Hi Vladimir, Here is the output of your script with Linux 6.13: Group 01:80:C2:00:00:00: forwarded with flooding, not forwarded without flooding, forwarded with mdb and no flooding, not forwarded with mdb on another port and flooding, can't test with group_fwd_mask Group 01:80:C2:00:00:08: not forwarded with flooding, not forwarded without flooding, not forwarded with mdb and no flooding, not forwarded with mdb on another port and flooding, forwarded with group_fwd_mask Group 01:80:C2:00:00:01: not forwarded with flooding, not forwarded without flooding, not forwarded with mdb and no flooding, not forwarded with mdb on another port and flooding, can't test with group_fwd_mask Group 01:80:C2:00:00:03: not forwarded with flooding, not forwarded without flooding, not forwarded with mdb and no flooding, not forwarded with mdb on another port and flooding, forwarded with group_fwd_mask Group 01:80:C2:00:00:20: forwarded with flooding, not forwarded without flooding, forwarded with mdb and no flooding, not forwarded with mdb on another port and flooding, can't test with group_fwd_mask Group 01:80:C2:00:00:21: forwarded with flooding, not forwarded without flooding, forwarded with mdb and no flooding, not forwarded with mdb on another port and flooding, can't test with group_fwd_mask Group 01:80:C2:00:00:02: not forwarded with flooding, not forwarded without flooding, not forwarded with mdb and no flooding, not forwarded with mdb on another port and flooding, can't test with group_fwd_mask Group 01:80:C2:00:00:04: not forwarded with flooding, not forwarded without flooding, not forwarded with mdb and no flooding, not forwarded with mdb on another port and flooding, forwarded with group_fwd_mask Group 01:80:C2:00:00:0F: not forwarded with flooding, not forwarded without flooding, not forwarded with mdb and no flooding, not forwarded with mdb on another port and flooding, forwarded with group_fwd_mask Group 01:80:C2:00:00:11: forwarded with flooding, not forwarded without flooding, forwarded with mdb and no flooding, not forwarded with mdb on another port and flooding, can't test with group_fwd_mask Group 01:80:C2:00:00:1F: forwarded with flooding, not forwarded without flooding, forwarded with mdb and no flooding, not forwarded with mdb on another port and flooding, forwarded with group_fwd_mask Group 01:80:C2:00:00:22: forwarded with flooding, not forwarded without flooding, forwarded with mdb and no flooding, not forwarded with mdb on another port and flooding, can't test with group_fwd_mask Group 01:80:C2:00:00:2F: forwarded with flooding, not forwarded without flooding, forwarded with mdb and no flooding, not forwarded with mdb on another port and flooding, forwarded with group_fwd_mask Why did you choose these addresses? The original complaint I'm trying to address was that LLDP used to be forwarded on the ksz9477 prior to the enabling of the hw multicast address table and now is not. LLDP uses both 01-80-c2-00-00-00 and 01-80-c2-00-00-0e and while 01-80-c2-00-00-00 is forwarded currently on the ksz9477 01-80-c2-00-00-0e is not. It's the same for the software bridge scenario above - when I add 01-80-c2-00-00-0e to the test, it's not forwarded. Where are the above rules implemented for the software bridge and why are these the choices? Best Regards, Tim ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering 2025-01-23 1:48 ` Tim Harvey @ 2025-01-27 12:54 ` Vladimir Oltean 0 siblings, 0 replies; 15+ messages in thread From: Vladimir Oltean @ 2025-01-27 12:54 UTC (permalink / raw) To: Tim Harvey Cc: Tristram.Ha, Arun.Ramadoss, andrew, davem, Woojung.Huh, linux-kernel, pabeni, netdev, edumazet, UNGLinuxDriver, kuba On Wed, Jan 22, 2025 at 05:48:51PM -0800, Tim Harvey wrote: > On Fri, Jan 17, 2025 at 5:18 PM Vladimir Oltean <olteanv@gmail.com> wrote: > > > > On Fri, Jan 17, 2025 at 01:02:31PM -0800, Tim Harvey wrote: > > > The flaw with that patch is that enabling the multicast address table > > > invokes other default rules in the table that need to be re-configured > > > for the cpu port but the patch only configures group 0 > > > (01-80-c2-00-00-00). it fails to configure group 6 (01-80-c2-00-00-08) > > > which is also used for stp so i would argue that it doesn't even do > > > what the commit log says it does. it also has the side effect of > > > disabling forwarding of other groups that were previously forwarded: > > > - group 1 01-80-c2-00-00-01 (mac control frame) (previously were > > > forwarded, now are dropped) > > > - group 2 01-80-c2-00-00-03 (802.1x access control) (previously were > > > forwarded, now are forwarded to the highest port which may not be the > > > cpu port) > > > - group 4 01-80-c2-00-00-20 (gmrp) (previously were forwarded, now > > > forwarded to all except the highest port number which may not be the > > > cpu port) > > > - group 5 01-80-c2-00-00-21 (gvrp) (previously were forwarded, now > > > forwarded to all except the highest port number which may not be the > > > cpu port) > > > - group 6 01-80-c2-00-00-02, 01-80-c2-00-00-04 - 01-80-c2-00-00-0f > > > (previously were forwarded, now are forwarded to the highest port > > > which may not be the cpu port) > > > - group 7 01-80-c2-00-00-11 - 01-80-c2-00-00-1f, 01-80-c2-00-00-22 - > > > 01-80-c2-00-00-2f (previously were forwarded, now forwarded to all > > > except the highest port number which may not be the cpu port) > > > > > To fix this, I propose adding a function to configure each of the > > > above groups (which are hardware filtering functions of the switch) > > > with proper port masks but I need to know from the DSA experts what is > > > desired for the port mask of those groups. The multicast address table > > > can only invoke rules based on those groups of addresses so if that is > > > not flexible enough then the multicast address table should instead be > > > disabled. > > > > The recommendation from the DSA maintainers will be to follow what the > > software bridge data path does, which just means testing and seeing how > > each group reacts to the known inputs which might affect it, i.e.: > > > > - is it a group of the form 01-80-c2-00-00-0X? if yes, group_fwd_mask > > should dictate how it is forwarded by software. All that hardware > > needs to take care of is to send it just to the CPU. > > > > - is multicast flooding enabled on the egress port? > > > > - is there an MDB entry towards the egress port? how about another port? > > The groups outside the 01-80-c2-00-00-0X range should be treated as > > regular multicast, i.e. group_fwd_mask doesn't matter, and mdb/flooding > > does. > > > > One easy way out, if synchronizing the hardware port masks with the > > software state turns out too hard, is to configure the switch to send > > all these groups just to the CPU, and make sure skb->offload_fwd_mark is > > unset for packets belonging to these groups (don't call > > dsa_default_offload_fwd_mark() from the tagger). The software takes this > > as a cue that it should forward them where the hardware didn't reach. > > > > Also, never exclude the CPU port from the destination port mask, unless > > you really, really know what you're doing. The software bridge might > > need to forward to another foreign (non-switch) bridge port which is an > > Intel e1000 card, or a Wi-Fi AP, or a tunnel, and by cutting out the CPU > > from the flood path, you're taking that possibility away from it. > > > > Here's a script to get you started with testing. > > > > #!/bin/bash > > > > ARP=" \ > > ff:ff:ff:ff:ff:ff 00:00:de:ad:be:ef 08 06 00 01 \ > > 08 00 06 04 00 01 e0 07 1b 81 13 40 c0 a8 01 ad \ > > 00 00 00 00 00 00 c0 a8 01 ea" > > groups=( \ > > 01:80:C2:00:00:00 \ > > 01:80:C2:00:00:08 \ > > 01:80:C2:00:00:01 \ > > 01:80:C2:00:00:03 \ > > 01:80:C2:00:00:20 \ > > 01:80:C2:00:00:21 \ > > 01:80:C2:00:00:02 \ > > 01:80:C2:00:00:04 \ > > 01:80:C2:00:00:0F \ > > 01:80:C2:00:00:11 \ > > 01:80:C2:00:00:1F \ > > 01:80:C2:00:00:22 \ > > 01:80:C2:00:00:2F \ > > ) > > pkt_count=1000 > > > > mac_get() > > { > > local if_name=$1 > > > > ip -j link show dev $if_name | jq -r '.[]["address"]' > > } > > > > get_rx_stats() > > { > > local if_name=$1 > > > > ip -j -s link show $if_name | jq '.[].stats64.rx.packets' > > } > > > > last_nibble() > > { > > local macaddr=$1 > > > > echo "0x${macaddr:0-1}" > > } > > > > send_raw() > > { > > local if_name=$1; shift > > local group=$1; shift > > local pkt="$1"; shift > > local smac=$(mac_get $if_name) > > > > pkt="${pkt/ff:ff:ff:ff:ff:ff/$group}" > > pkt="${pkt/00:00:de:ad:be:ef/$smac}" > > > > mausezahn -c $pkt_count -q $if_name "$pkt" > > } > > > > run_test() > > { > > before=$(get_rx_stats veth4) > > send_raw veth0 $group "$ARP" > > after=$(get_rx_stats veth4) > > delta=$((after - before)) > > > > [ $delta -ge $pkt_count ] && echo "forwarded" || echo "not forwarded" > > } > > > > # br0 > > # / | \ > > # / | \ > > # / | \ > > # / | \ > > # veth1 veth3 veth5 > > # | | | > > # veth0 veth2 veth4 > > ip link add veth0 type veth peer name veth1 > > ip link add veth2 type veth peer name veth3 > > ip link add veth4 type veth peer name veth5 > > ip link add br0 type bridge && ip link set br0 up > > ip link set veth1 master br0 && ip link set veth1 up > > ip link set veth3 master br0 && ip link set veth3 up > > ip link set veth5 master br0 && ip link set veth5 up > > ip link set veth0 up && ip link set veth2 up && ip link set veth4 up > > > > for group in "${groups[@]}"; do > > ip link set veth5 type bridge_slave mcast_flood on > > with_flooding=$(run_test $group) > > > > ip link set veth5 type bridge_slave mcast_flood off > > without_flooding=$(run_test $group) > > > > bridge mdb add dev br0 port veth5 grp $group permanent > > with_mdb_and_no_flooding=$(run_test $group) > > bridge mdb del dev br0 port veth5 grp $group permanent # restore > > > > ip link set veth5 type bridge_slave mcast_flood on # restore > > > > bridge mdb add dev br0 port veth3 grp $group permanent > > with_mdb_on_another_port=$(run_test $group) > > bridge mdb del dev br0 port veth3 grp $group permanent # restore > > > > ip link set br0 type bridge group_fwd_mask $((1 << $(last_nibble $group))) 2>/dev/null > > if [ $? = 0 ]; then > > with_group_fwd_mask=$(run_test $group) > > ip link set br0 type bridge group_fwd_mask 0 # restore > > else > > with_group_fwd_mask="can't test" > > fi > > > > printf "Group %s: %s with flooding, %s without flooding, %s with mdb and no flooding, %s with mdb on another port and flooding, %s with group_fwd_mask\n" \ > > "$group" \ > > "$with_flooding" \ > > "$without_flooding" \ > > "$with_mdb_and_no_flooding" \ > > "$with_mdb_on_another_port" \ > > "$with_group_fwd_mask" \ > > > > done > > > > ip link del veth0 > > ip link del veth2 > > ip link del veth4 > > ip link del br0 > > Hi Vladimir, > > Here is the output of your script with Linux 6.13: > Group 01:80:C2:00:00:00: forwarded with flooding, not forwarded > without flooding, forwarded with mdb and no flooding, not forwarded > with mdb on another port and flooding, can't test with group_fwd_mask > Group 01:80:C2:00:00:08: not forwarded with flooding, not forwarded > without flooding, not forwarded with mdb and no flooding, not > forwarded with mdb on another port and flooding, forwarded with > group_fwd_mask > Group 01:80:C2:00:00:01: not forwarded with flooding, not forwarded > without flooding, not forwarded with mdb and no flooding, not > forwarded with mdb on another port and flooding, can't test with > group_fwd_mask > Group 01:80:C2:00:00:03: not forwarded with flooding, not forwarded > without flooding, not forwarded with mdb and no flooding, not > forwarded with mdb on another port and flooding, forwarded with > group_fwd_mask > Group 01:80:C2:00:00:20: forwarded with flooding, not forwarded > without flooding, forwarded with mdb and no flooding, not forwarded > with mdb on another port and flooding, can't test with group_fwd_mask > Group 01:80:C2:00:00:21: forwarded with flooding, not forwarded > without flooding, forwarded with mdb and no flooding, not forwarded > with mdb on another port and flooding, can't test with group_fwd_mask > Group 01:80:C2:00:00:02: not forwarded with flooding, not forwarded > without flooding, not forwarded with mdb and no flooding, not > forwarded with mdb on another port and flooding, can't test with > group_fwd_mask > Group 01:80:C2:00:00:04: not forwarded with flooding, not forwarded > without flooding, not forwarded with mdb and no flooding, not > forwarded with mdb on another port and flooding, forwarded with > group_fwd_mask > Group 01:80:C2:00:00:0F: not forwarded with flooding, not forwarded > without flooding, not forwarded with mdb and no flooding, not > forwarded with mdb on another port and flooding, forwarded with > group_fwd_mask > Group 01:80:C2:00:00:11: forwarded with flooding, not forwarded > without flooding, forwarded with mdb and no flooding, not forwarded > with mdb on another port and flooding, can't test with group_fwd_mask > Group 01:80:C2:00:00:1F: forwarded with flooding, not forwarded > without flooding, forwarded with mdb and no flooding, not forwarded > with mdb on another port and flooding, forwarded with group_fwd_mask > Group 01:80:C2:00:00:22: forwarded with flooding, not forwarded > without flooding, forwarded with mdb and no flooding, not forwarded > with mdb on another port and flooding, can't test with group_fwd_mask > Group 01:80:C2:00:00:2F: forwarded with flooding, not forwarded > without flooding, forwarded with mdb and no flooding, not forwarded > with mdb on another port and flooding, forwarded with group_fwd_mask > > Why did you choose these addresses? I took these addresses from your previous reply. You can customize as needed, to find out the bridge behavior for any group, of course. > > The original complaint I'm trying to address was that LLDP used to be > forwarded on the ksz9477 prior to the enabling of the hw multicast > address table and now is not. LLDP uses both 01-80-c2-00-00-00 and > 01-80-c2-00-00-0e and while 01-80-c2-00-00-00 is forwarded currently > on the ksz9477 01-80-c2-00-00-0e is not. It's the same for the > software bridge scenario above - when I add 01-80-c2-00-00-0e to the > test, it's not forwarded. Where are the above rules implemented for > the software bridge and why are these the choices? If you see the "can't test with group_fwd_mask" error, it means that the bridge is outright refusing to forward this particular group. See https://elixir.bootlin.com/linux/v6.12.6/A/ident/BR_GROUPFWD_RESTRICTED for more details. For example, the bridge refuses to forward 01-80-c2-00-00-00 and the question is why you would want to do that. "Previous behavior" doesn't always mean "correct behavior". Whereas group 01-80-c2-00-00-0e, as far as I can see, can be forwarded fine by the software bridge when BIT(14) is set in the bridge group_fwd_mask. For such groups, an accelerator has nothing more to do than ensure skb->offload_fwd_mark = 0 on RX, and trap them exclusively to the CPU. ^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering 2025-01-17 1:25 ` Tim Harvey 2025-01-17 16:13 ` Vladimir Oltean @ 2025-01-18 0:49 ` Tristram.Ha 1 sibling, 0 replies; 15+ messages in thread From: Tristram.Ha @ 2025-01-18 0:49 UTC (permalink / raw) To: tharvey Cc: Arun.Ramadoss, andrew, davem, olteanv, Woojung.Huh, linux-kernel, pabeni, netdev, edumazet, UNGLinuxDriver, kuba > Hi Tristram, > > Thanks for your feedback. > > What is the behavior when the reserved multicast table is not enabled > (does it forward to all ports, drop all mcast, something else?) > > > The default reserved multicast table forwards to host port on entries 0, > > 2, and 6; skips host port on entries 4, 5, and 7; forwards to all ports > > on entry 3; and drops on entry 1. > > > > Is this behavior the desired behavior as far as the Linux DSA folks would want? > > commit 331d64f752bb ("net: dsa: microchip: add the enable_stp_addr > pointer in ksz_dev_ops") enables the reserved multicast table and > adjusts the cpu port for entry 0 leaving the rest the same (and wrong > if the cpu port is not the highest port in the switch). > > My patch adjusts the entries but keeps the rules the same and the > question that is posed is that the right thing to do with respect to > Linux DSA? When reserved multicast address table is not enabled the multicast forwarding follows the standard operation which is broadcast to all ports. Note KSZ9477 and LAN937X are able to program multicast forwarding in either static or dynamic MAC table with high entry count, so it is probably best to not use the reserved multicast address table and let the DSA stack program each multicast address individually. ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2025-01-27 12:54 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20241212215132.3111392-1-tharvey@gateworks.com>
2024-12-13 0:00 ` [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering Vladimir Oltean
2024-12-13 0:32 ` Tim Harvey
2024-12-13 1:14 ` Vladimir Oltean
2024-12-27 19:06 ` Tim Harvey
2025-01-03 3:29 ` Arun.Ramadoss
2025-01-14 18:27 ` Tim Harvey
2025-01-16 9:06 ` Arun.Ramadoss
2025-01-17 1:09 ` Tristram.Ha
2025-01-17 1:25 ` Tim Harvey
2025-01-17 16:13 ` Vladimir Oltean
2025-01-17 21:02 ` Tim Harvey
2025-01-18 1:18 ` Vladimir Oltean
2025-01-23 1:48 ` Tim Harvey
2025-01-27 12:54 ` Vladimir Oltean
2025-01-18 0:49 ` Tristram.Ha
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox