Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH net-next v3 09/12] net: stmmac: selftests: Add tests for SA Insertion/Replacement
From: Jose Abreu @ 2019-08-17 18:54 UTC (permalink / raw)
  To: netdev
  Cc: Joao Pinto, Jakub Kicinski, Jose Abreu, Giuseppe Cavallaro,
	Alexandre Torgue, David S. Miller, Maxime Coquelin, linux-stm32,
	linux-arm-kernel, linux-kernel
In-Reply-To: <cover.1566067802.git.joabreu@synopsys.com>

Add 4 new tests:
	- SA Insertion (register based)
	- SA Insertion (descriptor based)
	- SA Replacament (register based)
	- SA Replacement (descriptor based)

Signed-off-by: Jose Abreu <joabreu@synopsys.com>

---
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: Jose Abreu <joabreu@synopsys.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: netdev@vger.kernel.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 .../net/ethernet/stmicro/stmmac/stmmac_selftests.c | 98 +++++++++++++++++++++-
 1 file changed, 97 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c
index abab84f2ef8b..acfab86431b1 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c
@@ -45,6 +45,7 @@ struct stmmac_packet_attrs {
 	int size;
 	int remove_sa;
 	u8 id;
+	int sarc;
 };
 
 static u8 stmmac_test_next_id;
@@ -230,7 +231,10 @@ static int stmmac_test_loopback_validate(struct sk_buff *skb,
 		if (!ether_addr_equal(ehdr->h_dest, tpriv->packet->dst))
 			goto out;
 	}
-	if (tpriv->packet->src) {
+	if (tpriv->packet->sarc) {
+		if (!ether_addr_equal(ehdr->h_source, ehdr->h_dest))
+			goto out;
+	} else if (tpriv->packet->src) {
 		if (!ether_addr_equal(ehdr->h_source, tpriv->packet->src))
 			goto out;
 	}
@@ -1004,6 +1008,82 @@ static int stmmac_test_rxp(struct stmmac_priv *priv)
 }
 #endif
 
+static int stmmac_test_desc_sai(struct stmmac_priv *priv)
+{
+	unsigned char src[ETH_ALEN] = {0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
+	struct stmmac_packet_attrs attr = { };
+	int ret;
+
+	attr.remove_sa = true;
+	attr.sarc = true;
+	attr.src = src;
+	attr.dst = priv->dev->dev_addr;
+
+	priv->sarc_type = 0x1;
+
+	ret = __stmmac_test_loopback(priv, &attr);
+
+	priv->sarc_type = 0x0;
+	return ret;
+}
+
+static int stmmac_test_desc_sar(struct stmmac_priv *priv)
+{
+	unsigned char src[ETH_ALEN] = {0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
+	struct stmmac_packet_attrs attr = { };
+	int ret;
+
+	attr.sarc = true;
+	attr.src = src;
+	attr.dst = priv->dev->dev_addr;
+
+	priv->sarc_type = 0x2;
+
+	ret = __stmmac_test_loopback(priv, &attr);
+
+	priv->sarc_type = 0x0;
+	return ret;
+}
+
+static int stmmac_test_reg_sai(struct stmmac_priv *priv)
+{
+	unsigned char src[ETH_ALEN] = {0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
+	struct stmmac_packet_attrs attr = { };
+	int ret;
+
+	attr.remove_sa = true;
+	attr.sarc = true;
+	attr.src = src;
+	attr.dst = priv->dev->dev_addr;
+
+	if (stmmac_sarc_configure(priv, priv->ioaddr, 0x2))
+		return -EOPNOTSUPP;
+
+	ret = __stmmac_test_loopback(priv, &attr);
+
+	stmmac_sarc_configure(priv, priv->ioaddr, 0x0);
+	return ret;
+}
+
+static int stmmac_test_reg_sar(struct stmmac_priv *priv)
+{
+	unsigned char src[ETH_ALEN] = {0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
+	struct stmmac_packet_attrs attr = { };
+	int ret;
+
+	attr.sarc = true;
+	attr.src = src;
+	attr.dst = priv->dev->dev_addr;
+
+	if (stmmac_sarc_configure(priv, priv->ioaddr, 0x3))
+		return -EOPNOTSUPP;
+
+	ret = __stmmac_test_loopback(priv, &attr);
+
+	stmmac_sarc_configure(priv, priv->ioaddr, 0x0);
+	return ret;
+}
+
 #define STMMAC_LOOPBACK_NONE	0
 #define STMMAC_LOOPBACK_MAC	1
 #define STMMAC_LOOPBACK_PHY	2
@@ -1065,6 +1145,22 @@ static const struct stmmac_test {
 		.name = "Flexible RX Parser   ",
 		.lb = STMMAC_LOOPBACK_PHY,
 		.fn = stmmac_test_rxp,
+	}, {
+		.name = "SA Insertion (desc)  ",
+		.lb = STMMAC_LOOPBACK_PHY,
+		.fn = stmmac_test_desc_sai,
+	}, {
+		.name = "SA Replacement (desc)",
+		.lb = STMMAC_LOOPBACK_PHY,
+		.fn = stmmac_test_desc_sar,
+	}, {
+		.name = "SA Insertion (reg)  ",
+		.lb = STMMAC_LOOPBACK_PHY,
+		.fn = stmmac_test_reg_sai,
+	}, {
+		.name = "SA Replacement (reg)",
+		.lb = STMMAC_LOOPBACK_PHY,
+		.fn = stmmac_test_reg_sar,
 	},
 };
 
-- 
2.7.4


^ permalink raw reply related

* Re: Unable to create htb tc classes more than 64K
From: Akshat Kakkar @ 2019-08-17 19:04 UTC (permalink / raw)
  To: Cong Wang; +Cc: NetFilter, lartc, netdev
In-Reply-To: <CAM_iQpUH6y8oEct3FXUhqNekQ3sn3N7LoSR0chJXAPYUzvWbxA@mail.gmail.com>

On Sat, Aug 17, 2019 at 11:54 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Sat, Aug 17, 2019 at 5:46 AM Akshat Kakkar <akshat.1984@gmail.com> wrote:
> >
> > I agree that it is because of 16bit of minor I'd of class which
> > restricts it to 64K.
> > Point is, can we use multilevel qdisc and classes to extend it to more
> > no. of classes i.e. to more than 64K classes
>
> If your goal is merely having as many classes as you can, then yes.
My goal is not just to make as many classes as possible, but also to
use them to do rate limiting per ip per server. Say, I have a list of
10000 IPs and more than 100 servers. So simply if I want few IPs to
get speed of says 1Mbps per server but others say speed of 2 Mbps per
server. How can I achieve this without having 10000 x 100 classes.
These numbers can be large than this and hence I am looking for a
generic solution to this.

>
>
> >
> > One scheme can be like
> >                                       100: root qdisc
> >                                          |
> >                                        / | \
> >                                      /   |   \
> >                                    /     |     \
> >                                  /       |       \
> >                           100:1   100:2   100:3        child classes
> >                             |              |           |
> >                             |              |           |
> >                             |              |           |
> >                            1:            2:          3:     qdisc
> >                            / \           / \           / \
> >                          /     \                     /     \
> >                       1:1    1:2             3:1      3:2 leaf classes
> >
> > with all qdisc and classes defined as htb.
> >
> > Is this correct approach? Any alternative??
>
> Again, depends on what your goal is.
>
>
> >
> > Besides, in order to direct traffic to leaf classes 1:1, 1:2, 2:1,
> > 2:2, 3:1, 3:2 .... , instead of using filters I am using ipset with
> > skbprio and iptables map-set match rule.
> > But even after all this it don't work. Why?
>
> Again, the filters you use to classify the packets could only
> work for the classes on the same level, no the next level.

I am using ipset +  iptables to classify and not filters. Besides, if
tc is allowing me to define qdisc -> classes -> qdsic -> classes
(1,2,3 ...) sort of structure (ie like the one shown in ascii tree)
then how can those lowest child classes be actually used or consumed?

>
>
> Thanks.

^ permalink raw reply

* [PATCH RFC v2 net-next 0/4] mv88e6xxx: setting 2500base-x mode for CPU/DSA port in dts
From: Marek Behún @ 2019-08-17 19:14 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	Marek Behún

Hi,

here is another proposal for supporting setting 2500base-x mode for
CPU/DSA ports in device tree correctly.

The changes from v1 are that instead of adding .port_setup() and
.port_teardown() methods to the DSA operations struct we instead, for
CPU/DSA ports, call dsa_port_enable() from dsa_port_setup(), but only
after the port is registered (and required phylink/devlink structures
exist).

The .port_enable/.port_disable methods are now only meant to be used
for user ports, when the slave interface is brought up/down. This
proposal changes that in such a way that these methods are also called
for CPU/DSA ports, but only just after the switch is set up (and just
before the switch is tore down).

If we went this way, we would have to patch the other DSA drivers to
check if user port is being given in their respective .port_enable
and .port_disable implmentations.

What do you think about this?

Marek

Marek Behún (4):
  net: dsa: mv88e6xxx: support 2500base-x in SGMII IRQ handler
  net: dsa: call port_enable/port_disable for CPU/DSA ports
  net: dsa: mv88e6xxx: check for port type in port_disable
  net: dsa: mv88e6xxx: do not enable SERDESes in mv88e6xxx_setup

 drivers/net/dsa/mv88e6xxx/chip.c   | 15 +++------------
 drivers/net/dsa/mv88e6xxx/serdes.c | 23 +++++++++++++++++++++--
 net/dsa/dsa2.c                     | 21 ++++++++++++++++++++-
 net/dsa/port.c                     |  4 ++--
 4 files changed, 46 insertions(+), 17 deletions(-)

-- 
2.21.0

^ permalink raw reply

* [PATCH RFC v2 net-next 4/4] net: dsa: mv88e6xxx: do not enable SERDESes in mv88e6xxx_setup
From: Marek Behún @ 2019-08-17 19:14 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	Marek Behún
In-Reply-To: <20190817191452.16716-1-marek.behun@nic.cz>

CPU/DSA ports are now enabled/disabled in the .port_enable() and
.port_disable() methods. We do not need to enable SERDESes for these
ports in mv88e6xxx_setup.

Signed-off-by: Marek Behún <marek.behun@nic.cz>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Vladimir Oltean <olteanv@gmail.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
---
 drivers/net/dsa/mv88e6xxx/chip.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index ad27f2fc5c33..cca9f1e2038f 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2151,16 +2151,6 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
 	if (err)
 		return err;
 
-	/* Enable the SERDES interface for DSA and CPU ports. Normal
-	 * ports SERDES are enabled when the port is enabled, thus
-	 * saving a bit of power.
-	 */
-	if ((dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))) {
-		err = mv88e6xxx_serdes_power(chip, port, true);
-		if (err)
-			return err;
-	}
-
 	/* Port Control 2: don't force a good FCS, set the maximum frame size to
 	 * 10240 bytes, disable 802.1q tags checking, don't discard tagged or
 	 * untagged frames on this port, do a destination address lookup on all
-- 
2.21.0


^ permalink raw reply related

* [PATCH RFC v2 net-next 3/4] net: dsa: mv88e6xxx: check for port type in port_disable
From: Marek Behún @ 2019-08-17 19:14 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	Marek Behún
In-Reply-To: <20190817191452.16716-1-marek.behun@nic.cz>

The mv88e6xxx_port_disable method calls mv88e6xxx_port_set_state, which
should be called only for user ports.

Signed-off-by: Marek Behún <marek.behun@nic.cz>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Vladimir Oltean <olteanv@gmail.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
---
 drivers/net/dsa/mv88e6xxx/chip.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 9b3ad22a5b98..ad27f2fc5c33 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2285,8 +2285,9 @@ static void mv88e6xxx_port_disable(struct dsa_switch *ds, int port)
 
 	mv88e6xxx_reg_lock(chip);
 
-	if (mv88e6xxx_port_set_state(chip, port, BR_STATE_DISABLED))
-		dev_err(chip->dev, "failed to disable port\n");
+	if (dsa_is_user_port(ds, port))
+		if (mv88e6xxx_port_set_state(chip, port, BR_STATE_DISABLED))
+			dev_err(chip->dev, "failed to disable port\n");
 
 	if (chip->info->ops->serdes_irq_free)
 		chip->info->ops->serdes_irq_free(chip, port);
-- 
2.21.0


^ permalink raw reply related

* [PATCH RFC v2 net-next 1/4] net: dsa: mv88e6xxx: support 2500base-x in SGMII IRQ handler
From: Marek Behún @ 2019-08-17 19:14 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	Marek Behún
In-Reply-To: <20190817191452.16716-1-marek.behun@nic.cz>

The mv88e6390_serdes_irq_link_sgmii IRQ handler reads the SERDES PHY
status register to determine speed, among other things. If cmode of the
port is set to 2500base-x, though, the PHY still reports 1000 Mbps (the
PHY register itself does not differentiate between 1000 Mbps and 2500
Mbps - it thinks it is running at 1000 Mbps, although clock is 2.5x
faster).
Look at the cmode and set SPEED_2500 if cmode is set to 2500base-x.
Also tell mv88e6xxx_port_setup_mac the PHY interface mode corresponding
to current cmode in terms of phy_interface_t.

Signed-off-by: Marek Behún <marek.behun@nic.cz>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Vladimir Oltean <olteanv@gmail.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
---
 drivers/net/dsa/mv88e6xxx/serdes.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/serdes.c b/drivers/net/dsa/mv88e6xxx/serdes.c
index 20c526c2a9ee..0f3d7cbb696b 100644
--- a/drivers/net/dsa/mv88e6xxx/serdes.c
+++ b/drivers/net/dsa/mv88e6xxx/serdes.c
@@ -506,9 +506,11 @@ static void mv88e6390_serdes_irq_link_sgmii(struct mv88e6xxx_chip *chip,
 					    int port, int lane)
 {
 	struct dsa_switch *ds = chip->ds;
+	u8 cmode = chip->ports[port].cmode;
 	int duplex = DUPLEX_UNKNOWN;
 	int speed = SPEED_UNKNOWN;
 	int link, err;
+	phy_interface_t mode;
 	u16 status;
 
 	err = mv88e6390_serdes_read(chip, lane, MDIO_MMD_PHYXS,
@@ -527,7 +529,10 @@ static void mv88e6390_serdes_irq_link_sgmii(struct mv88e6xxx_chip *chip,
 
 		switch (status & MV88E6390_SGMII_PHY_STATUS_SPEED_MASK) {
 		case MV88E6390_SGMII_PHY_STATUS_SPEED_1000:
-			speed = SPEED_1000;
+			if (cmode == MV88E6XXX_PORT_STS_CMODE_2500BASEX)
+				speed = SPEED_2500;
+			else
+				speed = SPEED_1000;
 			break;
 		case MV88E6390_SGMII_PHY_STATUS_SPEED_100:
 			speed = SPEED_100;
@@ -541,8 +546,22 @@ static void mv88e6390_serdes_irq_link_sgmii(struct mv88e6xxx_chip *chip,
 		}
 	}
 
+	switch (cmode) {
+	case MV88E6XXX_PORT_STS_CMODE_SGMII:
+		mode = PHY_INTERFACE_MODE_SGMII;
+		break;
+	case MV88E6XXX_PORT_STS_CMODE_1000BASE_X:
+		mode = PHY_INTERFACE_MODE_1000BASEX;
+		break;
+	case MV88E6XXX_PORT_STS_CMODE_2500BASEX:
+		mode = PHY_INTERFACE_MODE_2500BASEX;
+		break;
+	default:
+		mode = PHY_INTERFACE_MODE_NA;
+	}
+
 	err = mv88e6xxx_port_setup_mac(chip, port, link, speed, duplex,
-				       PAUSE_OFF, PHY_INTERFACE_MODE_NA);
+				       PAUSE_OFF, mode);
 	if (err)
 		dev_err(chip->dev, "can't propagate PHY settings to MAC: %d\n",
 			err);
-- 
2.21.0


^ permalink raw reply related

* [PATCH RFC v2 net-next 2/4] net: dsa: call port_enable/port_disable for CPU/DSA ports
From: Marek Behún @ 2019-08-17 19:14 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	Marek Behún
In-Reply-To: <20190817191452.16716-1-marek.behun@nic.cz>

Call dsa_port_enable for CPU/DSA ports in dsa_port_setup, and
dsa_port_disable for CPU/DSA ports in dsa_port_teardown. This requires
changing all DSA drivers, since they expect the port_enable/port_disable
methods to be called only for user ports.

Signed-off-by: Marek Behún <marek.behun@nic.cz>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Vladimir Oltean <olteanv@gmail.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
---
 net/dsa/dsa2.c | 21 ++++++++++++++++++++-
 net/dsa/port.c |  4 ++--
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 3abd173ebacb..98ea5c158ee3 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -315,6 +315,16 @@ static int dsa_port_setup(struct dsa_port *dp)
 		break;
 	}
 
+	switch (dp->type) {
+	case DSA_PORT_TYPE_CPU:
+	case DSA_PORT_TYPE_DSA:
+		if (!err)
+			err = dsa_port_enable(dp, NULL);
+		break;
+	default:
+		break;
+	}
+
 	if (err)
 		devlink_port_unregister(&dp->devlink_port);
 
@@ -323,8 +333,17 @@ static int dsa_port_setup(struct dsa_port *dp)
 
 static void dsa_port_teardown(struct dsa_port *dp)
 {
-	if (dp->type != DSA_PORT_TYPE_UNUSED)
+	switch (dp->type) {
+	case DSA_PORT_TYPE_UNUSED:
+		break;
+	case DSA_PORT_TYPE_CPU:
+	case DSA_PORT_TYPE_DSA:
+		dsa_port_disable(dp);
+		/* fall-through */
+	case DSA_PORT_TYPE_USER:
 		devlink_port_unregister(&dp->devlink_port);
+		break;
+	}
 
 	switch (dp->type) {
 	case DSA_PORT_TYPE_UNUSED:
diff --git a/net/dsa/port.c b/net/dsa/port.c
index f071acf2842b..0cadda57df1f 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -75,7 +75,7 @@ int dsa_port_enable(struct dsa_port *dp, struct phy_device *phy)
 			return err;
 	}
 
-	if (!dp->bridge_dev)
+	if (dp->type == DSA_PORT_TYPE_USER && !dp->bridge_dev)
 		dsa_port_set_state_now(dp, BR_STATE_FORWARDING);
 
 	return 0;
@@ -86,7 +86,7 @@ void dsa_port_disable(struct dsa_port *dp)
 	struct dsa_switch *ds = dp->ds;
 	int port = dp->index;
 
-	if (!dp->bridge_dev)
+	if (dp->type == DSA_PORT_TYPE_USER && !dp->bridge_dev)
 		dsa_port_set_state_now(dp, BR_STATE_DISABLED);
 
 	if (ds->ops->port_disable)
-- 
2.21.0


^ permalink raw reply related

* Re: [PATCH RFC net-next 3/3] net: dsa: mv88e6xxx: setup SERDES irq also for CPU/DSA ports
From: Vivien Didelot @ 2019-08-17 19:27 UTC (permalink / raw)
  To: Marek Behun; +Cc: netdev, Andrew Lunn, Vladimir Oltean, Florian Fainelli
In-Reply-To: <20190817201552.06c39d3e@nic.cz>

Hi Marek,

On Sat, 17 Aug 2019 20:15:52 +0200, Marek Behun <marek.behun@nic.cz> wrote:
> On Sat, 17 Aug 2019 20:03:42 +0200
> Marek Behun <marek.behun@nic.cz> wrote:
> 
> > One way would be to rename the mv88e6xxx_setup_port function to
> > mv88e6xxx_setup_port_regs, or mv88e6xxx_port_pre_setup, or something
> > like that. Would the names mv88e6xxx_port_setup and
> > mv88e6xxx_setup_port_regs still be very confusing and error prone?
> > I think maybe yes...
> > 
> > Other solution would be to, instead of the .port_setup()
> > and .port_teardown() DSA ops, create the .after_setup()
> > and .before_teardown() ops I mentioned in the previous mail.
> > 
> > And yet another (in my opinion very improper) solution could be that
> > the .setup() method could call dsa_port_setup() from within itself, to
> > ensure that the needed structres exist.
> 
> I thought of another solution, one that does not need new DSA
> operations. What if dsa_port_enable was called for CPU/DSA ports after
> in dsa_port_setup_switches, after all ports are setup, and
> dsa_port_disable called for CPU/DSA ports in dsa_port_teardown_switches?
> 
> This seems to me as cleaner solution.
> 
> Marek

Yes DSA needs to initialize a switch before its ports, but the driver may
need the opposite. Splitting the switch and ports setup and moving it up to
the DSA stack is a separate topic in fact that I'll handle soon.

I guess you meant dsa_tree_setup_switches instead of dsa_port_setup_switches.

Let's call dsa_port_enable/dsa_port_disable for the CPU and DSA ports as well.


Thanks,

	Vivien

^ permalink raw reply

* Re: [PATCH net-next v3 0/3] net: phy: remove genphy_config_init
From: David Miller @ 2019-08-17 19:35 UTC (permalink / raw)
  To: hkallweit1
  Cc: andrew, f.fainelli, khilman, vivien.didelot, netdev,
	linux-amlogic
In-Reply-To: <8790db9d-af10-c3b1-bc65-ee21bb99e6d9@gmail.com>

From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Sat, 17 Aug 2019 12:28:18 +0200

> Supported PHY features are either auto-detected or explicitly set.
> In both cases calling genphy_config_init isn't needed. All that
> genphy_config_init does is removing features that are set as
> supported but can't be auto-detected. Basically it duplicates the
> code in genphy_read_abilities. Therefore remove genphy_config_init.
> 
> v2:
> - remove call also from new adin driver
> v3:
> - pass NULL as config_init function pointer for dp83848

Series applied.

^ permalink raw reply

* Re: [PATCH net-next v3 0/4] net: bridge: mdb: allow dump/add/del of host-joined entries
From: David Miller @ 2019-08-17 19:37 UTC (permalink / raw)
  To: nikolay; +Cc: netdev, roopa, bridge
In-Reply-To: <20190817112213.27097-1-nikolay@cumulusnetworks.com>

From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Date: Sat, 17 Aug 2019 14:22:09 +0300

> This set makes the bridge dump host-joined mdb entries, they should be
> treated as normal entries since they take a slot and are aging out.
> We already have notifications for them but we couldn't dump them until
> now so they remained hidden. We dump them similar to how they're
> notified, in order to keep user-space compatibility with the dumped
> objects (e.g. iproute2 dumps mdbs in a format which can be fed into
> add/del commands) we allow host-joined groups also to be added/deleted via
> mdb commands. That can later be used for L2 mcast MAC manipulation as
> was recently discussed. Note that iproute2 changes are not necessary,
> this set will work with the current user-space mdb code.
> 
> Patch 01 - a trivial comment move
> Patch 02 - factors out the mdb filling code so it can be
>            re-used for the host-joined entries
> Patch 03 - dumps host-joined entries
> Patch 04 - allows manipulation of host-joined entries via standard mdb
>            calls
> 
> v3: fix compiler warning in patch 04 (DaveM)
> v2: change patch 04 to avoid double notification and improve host group
>     manual removal if no ports are present in the group

Series applied.

^ permalink raw reply

* Re: pull request: bluetooth 2019-08-17
From: David Miller @ 2019-08-17 19:38 UTC (permalink / raw)
  To: johan.hedberg; +Cc: netdev, linux-bluetooth
In-Reply-To: <20190817114451.GA10661@abukhnin-mobl1.ccr.corp.intel.com>

From: Johan Hedberg <johan.hedberg@gmail.com>
Date: Sat, 17 Aug 2019 14:44:51 +0300

> Here's a set of Bluetooth fixes for the 5.3-rc series:
> 
>  - Multiple fixes for Qualcomm (btqca & hci_qca) drivers
>  - Minimum encryption key size debugfs setting (this is required for
>    Bluetooth Qualification)
>  - Fix hidp_send_message() to have a meaningful return value

Pulled, thanks.

^ permalink raw reply

* Re: [PATCH net-next v3 00/16] Add drop monitor for offloaded data paths
From: David Miller @ 2019-08-17 19:41 UTC (permalink / raw)
  To: idosch
  Cc: netdev, nhorman, jiri, toke, dsahern, roopa, nikolay,
	jakub.kicinski, andy, f.fainelli, andrew, vivien.didelot, mlxsw,
	idosch
In-Reply-To: <20190817132825.29790-1-idosch@idosch.org>

From: Ido Schimmel <idosch@idosch.org>
Date: Sat, 17 Aug 2019 16:28:09 +0300

> Users have several ways to debug the kernel and understand why a packet
> was dropped. For example, using drop monitor and perf. Both utilities
> trace kfree_skb(), which is the function called when a packet is freed
> as part of a failure. The information provided by these tools is
> invaluable when trying to understand the cause of a packet loss.
> 
> In recent years, large portions of the kernel data path were offloaded
> to capable devices. Today, it is possible to perform L2 and L3
> forwarding in hardware, as well as tunneling (IP-in-IP and VXLAN).
> Different TC classifiers and actions are also offloaded to capable
> devices, at both ingress and egress.
> 
> However, when the data path is offloaded it is not possible to achieve
> the same level of introspection since packets are dropped by the
> underlying device and never reach the kernel.
> 
> This patchset aims to solve this by allowing users to monitor packets
> that the underlying device decided to drop along with relevant metadata
> such as the drop reason and ingress port.
 ...

Looks great, series applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next v3 00/12] net: stmmac: Improvements for -next
From: David Miller @ 2019-08-17 19:44 UTC (permalink / raw)
  To: Jose.Abreu
  Cc: netdev, Joao.Pinto, jakub.kicinski, peppe.cavallaro,
	alexandre.torgue, mcoquelin.stm32, linux-stm32, linux-arm-kernel,
	linux-kernel
In-Reply-To: <cover.1566067802.git.joabreu@synopsys.com>

From: Jose Abreu <Jose.Abreu@synopsys.com>
Date: Sat, 17 Aug 2019 20:54:39 +0200

> Couple of improvements for -next tree. More info in commit logs.

Series applied.

^ permalink raw reply

* Re: [net-next 11/16] net/mlx5e: Report and recover from rx timeout
From: David Miller @ 2019-08-17 19:48 UTC (permalink / raw)
  To: saeedm; +Cc: netdev, ayal, tariqt, jiri
In-Reply-To: <20190815190911.12050-12-saeedm@mellanox.com>

From: Saeed Mahameed <saeedm@mellanox.com>
Date: Thu, 15 Aug 2019 19:10:07 +0000

> +static int mlx5e_rx_reporter_timeout_recover(void *ctx)
> +{
> +	struct mlx5e_rq *rq = ctx;
> +	struct mlx5e_icosq *icosq = &rq->channel->icosq;
> +	struct mlx5_eq_comp *eq = rq->cq.mcq.eq;
> +	int err;

In this and several further patches, this non-reverse-christmas tree
sequence appears.  Please fix it.

Put the variable assignments into the body of the function if you have
to in order to make this styled correctly.

Thanks.

^ permalink raw reply

* Re: [PATCH RFC v2 net-next 0/4] mv88e6xxx: setting 2500base-x mode for CPU/DSA port in dts
From: Vivien Didelot @ 2019-08-17 19:50 UTC (permalink / raw)
  To: Marek Behún
  Cc: netdev, Andrew Lunn, Florian Fainelli, Vladimir Oltean,
	Marek Behún
In-Reply-To: <20190817191452.16716-1-marek.behun@nic.cz>

Hi Marek,

On Sat, 17 Aug 2019 21:14:48 +0200, Marek Behún <marek.behun@nic.cz> wrote:
> Hi,
> 
> here is another proposal for supporting setting 2500base-x mode for
> CPU/DSA ports in device tree correctly.
> 
> The changes from v1 are that instead of adding .port_setup() and
> .port_teardown() methods to the DSA operations struct we instead, for
> CPU/DSA ports, call dsa_port_enable() from dsa_port_setup(), but only
> after the port is registered (and required phylink/devlink structures
> exist).
> 
> The .port_enable/.port_disable methods are now only meant to be used
> for user ports, when the slave interface is brought up/down. This
> proposal changes that in such a way that these methods are also called
> for CPU/DSA ports, but only just after the switch is set up (and just
> before the switch is tore down).
> 
> If we went this way, we would have to patch the other DSA drivers to
> check if user port is being given in their respective .port_enable
> and .port_disable implmentations.
> 
> What do you think about this?

This looks much better. Let me pass through all patches of this RFC so that
I can include bits I would like to see in your next series.


Thanks,

	Vivien

^ permalink raw reply

* [PATCH net v2 0/6] bnxt_en: Bug fixes.
From: Michael Chan @ 2019-08-17 21:04 UTC (permalink / raw)
  To: davem; +Cc: netdev

2 Bug fixes related to 57500 shutdown sequence and doorbell sequence,
2 TC Flower bug fixes related to the setting of the flow direction,
1 NVRAM update bug fix, and a minor fix to suppress an unnecessary
error message.  Please queue for -stable as well.  Thanks.

v2: Fix compile warning reported by kbuild test robot on patch #6.

Michael Chan (2):
  bnxt_en: Fix VNIC clearing logic for 57500 chips.
  bnxt_en: Improve RX doorbell sequence.

Somnath Kotur (1):
  bnxt_en: Fix to include flow direction in L2 key

Vasundhara Volam (2):
  bnxt_en: Fix handling FRAG_ERR when NVM_INSTALL_UPDATE cmd fails
  bnxt_en: Suppress HWRM errors for HWRM_NVM_GET_VARIABLE command

Venkat Duvvuru (1):
  bnxt_en: Use correct src_fid to determine direction of the flow

 drivers/net/ethernet/broadcom/bnxt/bnxt.c         | 36 ++++++++++++++++-------
 drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c |  9 ++++--
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 12 ++++----
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c      |  8 ++---
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.h      |  6 ++--
 5 files changed, 42 insertions(+), 29 deletions(-)

-- 
2.5.1

^ permalink raw reply

* [PATCH net v2 1/6] bnxt_en: Fix VNIC clearing logic for 57500 chips.
From: Michael Chan @ 2019-08-17 21:04 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1566075892-30064-1-git-send-email-michael.chan@broadcom.com>

During device shutdown, the VNIC clearing sequence needs to be modified
to free the VNIC first before freeing the RSS contexts.  The current
code is doing the reverse and we can get mis-directed RX completions
to CP ring ID 0 when the RSS contexts are freed and zeroed.  The clearing
of RSS contexts is not required with the new sequence.

Refactor the VNIC clearing logic into a new function bnxt_clear_vnic()
and do the chip specific VNIC clearing sequence.

Fixes: 7b3af4f75b81 ("bnxt_en: Add RSS support for 57500 chips.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 7070349..1ef224f 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -7016,19 +7016,29 @@ static void bnxt_hwrm_clear_vnic_rss(struct bnxt *bp)
 		bnxt_hwrm_vnic_set_rss(bp, i, false);
 }
 
-static void bnxt_hwrm_resource_free(struct bnxt *bp, bool close_path,
-				    bool irq_re_init)
+static void bnxt_clear_vnic(struct bnxt *bp)
 {
-	if (bp->vnic_info) {
-		bnxt_hwrm_clear_vnic_filter(bp);
+	if (!bp->vnic_info)
+		return;
+
+	bnxt_hwrm_clear_vnic_filter(bp);
+	if (!(bp->flags & BNXT_FLAG_CHIP_P5)) {
 		/* clear all RSS setting before free vnic ctx */
 		bnxt_hwrm_clear_vnic_rss(bp);
 		bnxt_hwrm_vnic_ctx_free(bp);
-		/* before free the vnic, undo the vnic tpa settings */
-		if (bp->flags & BNXT_FLAG_TPA)
-			bnxt_set_tpa(bp, false);
-		bnxt_hwrm_vnic_free(bp);
 	}
+	/* before free the vnic, undo the vnic tpa settings */
+	if (bp->flags & BNXT_FLAG_TPA)
+		bnxt_set_tpa(bp, false);
+	bnxt_hwrm_vnic_free(bp);
+	if (bp->flags & BNXT_FLAG_CHIP_P5)
+		bnxt_hwrm_vnic_ctx_free(bp);
+}
+
+static void bnxt_hwrm_resource_free(struct bnxt *bp, bool close_path,
+				    bool irq_re_init)
+{
+	bnxt_clear_vnic(bp);
 	bnxt_hwrm_ring_free(bp, close_path);
 	bnxt_hwrm_ring_grp_free(bp);
 	if (irq_re_init) {
-- 
2.5.1


^ permalink raw reply related

* [PATCH net v2 2/6] bnxt_en: Improve RX doorbell sequence.
From: Michael Chan @ 2019-08-17 21:04 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1566075892-30064-1-git-send-email-michael.chan@broadcom.com>

When both RX buffers and RX aggregation buffers have to be
replenished at the end of NAPI, post the RX aggregation buffers first
before RX buffers.  Otherwise, we may run into a situation where
there are only RX buffers without RX aggregation buffers for a split
second.  This will cause the hardware to abort the RX packet and
report buffer errors, which will cause unnecessary cleanup by the
driver.

Ringing the Aggregation ring doorbell first before the RX ring doorbell
will prevent some of these buffer errors.  Use the same sequence during
ring initialization as well.

Fixes: 697197e5a173 ("bnxt_en: Re-structure doorbells.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 1ef224f..8dce406 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2021,9 +2021,9 @@ static void __bnxt_poll_work_done(struct bnxt *bp, struct bnxt_napi *bnapi)
 	if (bnapi->events & BNXT_RX_EVENT) {
 		struct bnxt_rx_ring_info *rxr = bnapi->rx_ring;
 
-		bnxt_db_write(bp, &rxr->rx_db, rxr->rx_prod);
 		if (bnapi->events & BNXT_AGG_EVENT)
 			bnxt_db_write(bp, &rxr->rx_agg_db, rxr->rx_agg_prod);
+		bnxt_db_write(bp, &rxr->rx_db, rxr->rx_prod);
 	}
 	bnapi->events = 0;
 }
@@ -5064,6 +5064,7 @@ static void bnxt_set_db(struct bnxt *bp, struct bnxt_db_info *db, u32 ring_type,
 
 static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
 {
+	bool agg_rings = !!(bp->flags & BNXT_FLAG_AGG_RINGS);
 	int i, rc = 0;
 	u32 type;
 
@@ -5139,7 +5140,9 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
 		if (rc)
 			goto err_out;
 		bnxt_set_db(bp, &rxr->rx_db, type, map_idx, ring->fw_ring_id);
-		bnxt_db_write(bp, &rxr->rx_db, rxr->rx_prod);
+		/* If we have agg rings, post agg buffers first. */
+		if (!agg_rings)
+			bnxt_db_write(bp, &rxr->rx_db, rxr->rx_prod);
 		bp->grp_info[map_idx].rx_fw_ring_id = ring->fw_ring_id;
 		if (bp->flags & BNXT_FLAG_CHIP_P5) {
 			struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
@@ -5158,7 +5161,7 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
 		}
 	}
 
-	if (bp->flags & BNXT_FLAG_AGG_RINGS) {
+	if (agg_rings) {
 		type = HWRM_RING_ALLOC_AGG;
 		for (i = 0; i < bp->rx_nr_rings; i++) {
 			struct bnxt_rx_ring_info *rxr = &bp->rx_ring[i];
@@ -5174,6 +5177,7 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
 			bnxt_set_db(bp, &rxr->rx_agg_db, type, map_idx,
 				    ring->fw_ring_id);
 			bnxt_db_write(bp, &rxr->rx_agg_db, rxr->rx_agg_prod);
+			bnxt_db_write(bp, &rxr->rx_db, rxr->rx_prod);
 			bp->grp_info[grp_idx].agg_fw_ring_id = ring->fw_ring_id;
 		}
 	}
-- 
2.5.1


^ permalink raw reply related

* [PATCH net v2 4/6] bnxt_en: Suppress HWRM errors for HWRM_NVM_GET_VARIABLE command
From: Michael Chan @ 2019-08-17 21:04 UTC (permalink / raw)
  To: davem; +Cc: netdev, Vasundhara Volam
In-Reply-To: <1566075892-30064-1-git-send-email-michael.chan@broadcom.com>

From: Vasundhara Volam <vasundhara-v.volam@broadcom.com>

For newly added NVM parameters, older firmware may not have the support.
Suppress the error message to avoid the unncessary error message which is
triggered when devlink calls the driver during initialization.

Fixes: 782a624d00fa ("bnxt_en: Add bnxt_en initial params table and register it.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
index 549c90d3..c05d663 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
@@ -98,10 +98,13 @@ static int bnxt_hwrm_nvm_req(struct bnxt *bp, u32 param_id, void *msg,
 	if (idx)
 		req->dimensions = cpu_to_le16(1);
 
-	if (req->req_type == cpu_to_le16(HWRM_NVM_SET_VARIABLE))
+	if (req->req_type == cpu_to_le16(HWRM_NVM_SET_VARIABLE)) {
 		memcpy(data_addr, buf, bytesize);
-
-	rc = hwrm_send_message(bp, msg, msg_len, HWRM_CMD_TIMEOUT);
+		rc = hwrm_send_message(bp, msg, msg_len, HWRM_CMD_TIMEOUT);
+	} else {
+		rc = hwrm_send_message_silent(bp, msg, msg_len,
+					      HWRM_CMD_TIMEOUT);
+	}
 	if (!rc && req->req_type == cpu_to_le16(HWRM_NVM_GET_VARIABLE))
 		memcpy(buf, data_addr, bytesize);
 
-- 
2.5.1


^ permalink raw reply related

* [PATCH net v2 3/6] bnxt_en: Fix handling FRAG_ERR when NVM_INSTALL_UPDATE cmd fails
From: Michael Chan @ 2019-08-17 21:04 UTC (permalink / raw)
  To: davem; +Cc: netdev, Vasundhara Volam
In-Reply-To: <1566075892-30064-1-git-send-email-michael.chan@broadcom.com>

From: Vasundhara Volam <vasundhara-v.volam@broadcom.com>

If FW returns FRAG_ERR in response error code, driver is resending the
command only when HWRM command returns success. Fix the code to resend
NVM_INSTALL_UPDATE command with DEFRAG install flags, if FW returns
FRAG_ERR in its response error code.

Fixes: cb4d1d626145 ("bnxt_en: Retry failed NVM_INSTALL_UPDATE with defragmentation flag enabled.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index c7ee63d..8445a0c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -2016,21 +2016,19 @@ static int bnxt_flash_package_from_file(struct net_device *dev,
 	mutex_lock(&bp->hwrm_cmd_lock);
 	hwrm_err = _hwrm_send_message(bp, &install, sizeof(install),
 				      INSTALL_PACKAGE_TIMEOUT);
-	if (hwrm_err)
-		goto flash_pkg_exit;
-
-	if (resp->error_code) {
+	if (hwrm_err) {
 		u8 error_code = ((struct hwrm_err_output *)resp)->cmd_err;
 
-		if (error_code == NVM_INSTALL_UPDATE_CMD_ERR_CODE_FRAG_ERR) {
+		if (resp->error_code && error_code ==
+		    NVM_INSTALL_UPDATE_CMD_ERR_CODE_FRAG_ERR) {
 			install.flags |= cpu_to_le16(
 			       NVM_INSTALL_UPDATE_REQ_FLAGS_ALLOWED_TO_DEFRAG);
 			hwrm_err = _hwrm_send_message(bp, &install,
 						      sizeof(install),
 						      INSTALL_PACKAGE_TIMEOUT);
-			if (hwrm_err)
-				goto flash_pkg_exit;
 		}
+		if (hwrm_err)
+			goto flash_pkg_exit;
 	}
 
 	if (resp->result) {
-- 
2.5.1


^ permalink raw reply related

* [PATCH net v2 5/6] bnxt_en: Use correct src_fid to determine direction of the flow
From: Michael Chan @ 2019-08-17 21:04 UTC (permalink / raw)
  To: davem; +Cc: netdev, Venkat Duvvuru
In-Reply-To: <1566075892-30064-1-git-send-email-michael.chan@broadcom.com>

From: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>

Direction of the flow is determined using src_fid. For an RX flow,
src_fid is PF's fid and for TX flow, src_fid is VF's fid. Direction
of the flow must be specified, when getting statistics for that flow.
Currently, for DECAP flow, direction is determined incorrectly, i.e.,
direction is initialized as TX for DECAP flow, instead of RX. Because
of which, stats are not reported for this DECAP flow, though it is
offloaded and there is traffic for that flow, resulting in flow age out.

This patch fixes the problem by determining the DECAP flow's direction
using correct fid.  Set the flow direction in all cases for consistency
even if 64-bit flow handle is not used.

Fixes: abd43a13525d ("bnxt_en: Support for 64-bit flow handle.")
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
index 6fe4a71..6224c30 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
@@ -1285,9 +1285,7 @@ static int bnxt_tc_add_flow(struct bnxt *bp, u16 src_fid,
 		goto free_node;

 	bnxt_tc_set_src_fid(bp, flow, src_fid);
-
-	if (bp->fw_cap & BNXT_FW_CAP_OVS_64BIT_HANDLE)
-		bnxt_tc_set_flow_dir(bp, flow, src_fid);
+	bnxt_tc_set_flow_dir(bp, flow, flow->src_fid);

 	if (!bnxt_tc_can_offload(bp, flow)) {
 		rc = -EOPNOTSUPP;
-- 
2.5.1

^ permalink raw reply related

* [PATCH net v2 6/6] bnxt_en: Fix to include flow direction in L2 key
From: Michael Chan @ 2019-08-17 21:04 UTC (permalink / raw)
  To: davem; +Cc: netdev, Somnath Kotur
In-Reply-To: <1566075892-30064-1-git-send-email-michael.chan@broadcom.com>

From: Somnath Kotur <somnath.kotur@broadcom.com>

FW expects the driver to provide unique flow reference handles
for Tx or Rx flows. When a Tx flow and an Rx flow end up sharing
a reference handle, flow offload does not seem to work.
This could happen in the case of 2 flows having their L2 fields
wildcarded but in different direction.
Fix to incorporate the flow direction as part of the L2 key

v2: Move the dir field to the end of the bnxt_tc_l2_key struct to
fix the warning reported by kbuild test robot <lkp@intel.com>.
There is existing code that initializes the structure using
nested initializer and will warn with the new u8 field added to
the beginning.  The structure also packs nicer when this new u8 is
added to the end of the structure [MChan].

Fixes: abd43a13525d ("bnxt_en: Support for 64-bit flow handle.")
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c | 4 ++--
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.h | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
index 6224c30..dd621f6 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
@@ -1236,7 +1236,7 @@ static int __bnxt_tc_del_flow(struct bnxt *bp,
 static void bnxt_tc_set_flow_dir(struct bnxt *bp, struct bnxt_tc_flow *flow,
 				 u16 src_fid)
 {
-	flow->dir = (bp->pf.fw_fid == src_fid) ? BNXT_DIR_RX : BNXT_DIR_TX;
+	flow->l2_key.dir = (bp->pf.fw_fid == src_fid) ? BNXT_DIR_RX : BNXT_DIR_TX;
 }
 
 static void bnxt_tc_set_src_fid(struct bnxt *bp, struct bnxt_tc_flow *flow,
@@ -1405,7 +1405,7 @@ static void bnxt_fill_cfa_stats_req(struct bnxt *bp,
 		 * 2. 15th bit of flow_handle must specify the flow
 		 *    direction (TX/RX).
 		 */
-		if (flow_node->flow.dir == BNXT_DIR_RX)
+		if (flow_node->flow.l2_key.dir == BNXT_DIR_RX)
 			handle = CFA_FLOW_INFO_REQ_FLOW_HANDLE_DIR_RX |
 				 CFA_FLOW_INFO_REQ_FLOW_HANDLE_MAX_MASK;
 		else
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.h
index ffec57d..4f05305 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.h
@@ -23,6 +23,9 @@ struct bnxt_tc_l2_key {
 	__be16		inner_vlan_tci;
 	__be16		ether_type;
 	u8		num_vlans;
+	u8		dir;
+#define BNXT_DIR_RX	1
+#define BNXT_DIR_TX	0
 };
 
 struct bnxt_tc_l3_key {
@@ -98,9 +101,6 @@ struct bnxt_tc_flow {
 
 	/* flow applicable to pkts ingressing on this fid */
 	u16				src_fid;
-	u8				dir;
-#define BNXT_DIR_RX	1
-#define BNXT_DIR_TX	0
 	struct bnxt_tc_l2_key		l2_key;
 	struct bnxt_tc_l2_key		l2_mask;
 	struct bnxt_tc_l3_key		l3_key;
-- 
2.5.1


^ permalink raw reply related

* Re: [PATCH bpf-next v5 0/2] net: xdp: XSKMAP improvements
From: Daniel Borkmann @ 2019-08-17 21:28 UTC (permalink / raw)
  To: Björn Töpel, ast, netdev
  Cc: magnus.karlsson, jonathan.lemon, bjorn.topel, bruce.richardson,
	songliubraving, bpf
In-Reply-To: <20190815093014.31174-1-bjorn.topel@gmail.com>

On 8/15/19 11:30 AM, Björn Töpel wrote:
> This series (v5 and counting) add two improvements for the XSKMAP,
> used by AF_XDP sockets.
> 
> 1. Automatic cleanup when an AF_XDP socket goes out of scope/is
>     released. Instead of require that the user manually clears the
>     "released" state socket from the map, this is done
>     automatically. Each socket tracks which maps it resides in, and
>     remove itself from those maps at relase. A notable implementation
>     change, is that the sockets references the map, instead of the map
>     referencing the sockets. Which implies that when the XSKMAP is
>     freed, it is by definition cleared of sockets.
> 
> 2. The XSKMAP did not honor the BPF_EXIST/BPF_NOEXIST flag on insert,
>     which this patch addresses.
> 
> 
> Thanks,
> Björn
> 
> v1->v2: Fixed deadlock and broken cleanup. (Daniel)
> v2->v3: Rebased onto bpf-next
> v3->v4: {READ, WRITE}_ONCE consistency. (Daniel)
>          Socket release/map update race. (Daniel)
> v4->v5: Avoid use-after-free on XSKMAP self-assignment [1]. (Daniel)
>          Removed redundant assignment in xsk_map_update_elem().
>          Variable name consistency; Use map_entry everywhere.
> 
> [1] https://lore.kernel.org/bpf/20190802081154.30962-1-bjorn.topel@gmail.com/T/#mc68439e97bc07fa301dad9fc4850ed5aa392f385
> 
> Björn Töpel (2):
>    xsk: remove AF_XDP socket from map when the socket is released
>    xsk: support BPF_EXIST and BPF_NOEXIST flags in XSKMAP
> 
>   include/net/xdp_sock.h |  18 ++++++
>   kernel/bpf/xskmap.c    | 133 ++++++++++++++++++++++++++++++++++-------
>   net/xdp/xsk.c          |  50 ++++++++++++++++
>   3 files changed, 179 insertions(+), 22 deletions(-)
> 

Looks better, applied thanks!

^ permalink raw reply

* Re: [PATCH bpf-next v4 0/8] add need_wakeup flag to the AF_XDP rings
From: Daniel Borkmann @ 2019-08-17 21:29 UTC (permalink / raw)
  To: Magnus Karlsson, bjorn.topel, ast, netdev, brouer, maximmi
  Cc: bpf, bruce.richardson, ciara.loftus, jakub.kicinski, xiaolong.ye,
	qi.z.zhang, sridhar.samudrala, kevin.laatz, ilias.apalodimas,
	jonathan.lemon, kiran.patil, axboe, maciej.fijalkowski,
	maciejromanfijalkowski, intel-wired-lan
In-Reply-To: <1565767643-4908-1-git-send-email-magnus.karlsson@intel.com>

On 8/14/19 9:27 AM, Magnus Karlsson wrote:
> This patch set adds support for a new flag called need_wakeup in the
> AF_XDP Tx and fill rings. When this flag is set by the driver, it
> means that the application has to explicitly wake up the kernel Rx
> (for the bit in the fill ring) or kernel Tx (for bit in the Tx ring)
> processing by issuing a syscall. Poll() can wake up both and sendto()
> will wake up Tx processing only.
> 
> The main reason for introducing this new flag is to be able to
> efficiently support the case when application and driver is executing
> on the same core. Previously, the driver was just busy-spinning on the
> fill ring if it ran out of buffers in the HW and there were none to
> get from the fill ring. This approach works when the application and
> driver is running on different cores as the application can replenish
> the fill ring while the driver is busy-spinning. Though, this is a
> lousy approach if both of them are running on the same core as the
> probability of the fill ring getting more entries when the driver is
> busy-spinning is zero. With this new feature the driver now sets the
> need_wakeup flag and returns to the application. The application can
> then replenish the fill queue and then explicitly wake up the Rx
> processing in the kernel using the syscall poll(). For Tx, the flag is
> only set to one if the driver has no outstanding Tx completion
> interrupts. If it has some, the flag is zero as it will be woken up by
> a completion interrupt anyway. This flag can also be used in other
> situations where the driver needs to be woken up explicitly.
> 
> As a nice side effect, this new flag also improves the Tx performance
> of the case where application and driver are running on two different
> cores as it reduces the number of syscalls to the kernel. The kernel
> tells user space if it needs to be woken up by a syscall, and this
> eliminates many of the syscalls. The Rx performance of the 2-core case
> is on the other hand slightly worse, since there is a need to use a
> syscall now to wake up the driver, instead of the driver
> busy-spinning. It does waste less CPU cycles though, which might lead
> to better overall system performance.
> 
> This new flag needs some simple driver support. If the driver does not
> support it, the Rx flag is always zero and the Tx flag is always
> one. This makes any application relying on this feature default to the
> old behavior of not requiring any syscalls in the Rx path and always
> having to call sendto() in the Tx path.
> 
> For backwards compatibility reasons, this feature has to be explicitly
> turned on using a new bind flag (XDP_USE_NEED_WAKEUP). I recommend
> that you always turn it on as it has a large positive performance
> impact for the one core case and does not degrade 2 core performance
> and actually improves it for Tx heavy workloads.
> 
> Here are some performance numbers measured on my local,
> non-performance optimized development system. That is why you are
> seeing numbers lower than the ones from Björn and Jesper. 64 byte
> packets at 40Gbit/s line rate. All results in Mpps. Cores == 1 means
> that both application and driver is executing on the same core. Cores
> == 2 that they are on different cores.
> 
>                                Applications
> need_wakeup  cores    txpush    rxdrop      l2fwd
> ---------------------------------------------------------------
>       n         1       0.07      0.06        0.03
>       y         1       21.6      8.2         6.5
>       n         2       32.3      11.7        8.7
>       y         2       33.1      11.7        8.7
> 
> Overall, the need_wakeup flag provides the same or better performance
> in all the micro-benchmarks. The reduction of sendto() calls in txpush
> is large. Only a few per second is needed. For l2fwd, the drop is 50%
> for the 1 core case and more than 99.9% for the 2 core case. Do not
> know why I am not seeing the same drop for the 1 core case yet.
> 
> The name and inspiration of the flag has been taken from io_uring by
> Jens Axboe. Details about this feature in io_uring can be found in
> http://kernel.dk/io_uring.pdf, section 8.3. It also addresses most of
> the denial of service and sendto() concerns raised by Maxim
> Mikityanskiy in https://www.spinics.net/lists/netdev/msg554657.html.
> 
> The typical Tx part of an application will have to change from:
> 
> ret = sendto(fd,....)
> 
> to:
> 
> if (xsk_ring_prod__needs_wakeup(&xsk->tx))
>         ret = sendto(fd,....)
> 
> and th Rx part from:
> 
> rcvd = xsk_ring_cons__peek(&xsk->rx, BATCH_SIZE, &idx_rx);
> if (!rcvd)
>         return;
> 
> to:
> 
> rcvd = xsk_ring_cons__peek(&xsk->rx, BATCH_SIZE, &idx_rx);
> if (!rcvd) {
>         if (xsk_ring_prod__needs_wakeup(&xsk->umem->fq))
>                ret = poll(fd,.....);
>         return;
> }
> 
> v3 -> v4:
> * Maxim found a possible race in the Tx part of the driver. The
>    setting of the flag needs to happen before the sending, otherwise it
>    might trigger this race. Fixed in ixgbe and i40e driver.
> * Mellanox support contributed by Maxim
> * Removed the XSK_DRV_CAN_SLEEP flag as it was not used
>    anymore. Thanks to Sridhar for discovering this.
> * For consistency the feature is now always called need_wakeup. There
>    were some places where it was referred to as might_sleep, but they
>    have been removed. Thanks to Sridhar for spotting.
> * Fixed some typos in the commit messages
> 
> v2 -> v3:
> * Converted the Mellanox driver to the new ndo in patch 1 as pointed out
>    by Maxim
> * Fixed the compatibility code of XDP_MMAP_OFFSETS so it now works.
> 
> v1 -> v2:
> * Fixed bisectability problem pointed out by Jakub
> * Added missing initiliztion of the Tx need_wakeup flag to 1
> 
> This patch has been applied against commit b753c5a7f99f ("Merge branch 'r8152-RX-improve'")
> 
> Structure of the patch set:
> 
> Patch 1: Replaces the ndo_xsk_async_xmit with ndo_xsk_wakeup to
>           support waking up both Rx and Tx processing
> Patch 2: Implements the need_wakeup functionality in common code
> Patch 3-4: Add need_wakeup support to the i40e and ixgbe drivers
> Patch 5: Add need_wakeup support to libbpf
> Patch 6: Add need_wakeup support to the xdpsock sample application
> Patch 7-8: Add need_wakeup support to the Mellanox mlx5 driver
> 
> Thanks: Magnus
> 
> Magnus Karlsson (6):
>    xsk: replace ndo_xsk_async_xmit with ndo_xsk_wakeup
>    xsk: add support for need_wakeup flag in AF_XDP rings
>    i40e: add support for AF_XDP need_wakeup feature
>    ixgbe: add support for AF_XDP need_wakeup feature
>    libbpf: add support for need_wakeup flag in AF_XDP part
>    samples/bpf: add use of need_wakeup flag in xdpsock
> 
> Maxim Mikityanskiy (2):
>    net/mlx5e: Move the SW XSK code from NAPI poll to a separate function
>    net/mlx5e: Add AF_XDP need_wakeup support
> 
>   drivers/net/ethernet/intel/i40e/i40e_main.c        |   5 +-
>   drivers/net/ethernet/intel/i40e/i40e_xsk.c         |  25 ++-
>   drivers/net/ethernet/intel/i40e/i40e_xsk.h         |   2 +-
>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c      |   5 +-
>   .../net/ethernet/intel/ixgbe/ixgbe_txrx_common.h   |   2 +-
>   drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c       |  22 ++-
>   .../net/ethernet/mellanox/mlx5/core/en/xsk/rx.h    |  14 ++
>   .../net/ethernet/mellanox/mlx5/core/en/xsk/tx.c    |   2 +-
>   .../net/ethernet/mellanox/mlx5/core/en/xsk/tx.h    |  14 +-
>   drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |   2 +-
>   drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |   7 +-
>   drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c  |  27 ++-
>   include/linux/netdevice.h                          |  14 +-
>   include/net/xdp_sock.h                             |  33 +++-
>   include/uapi/linux/if_xdp.h                        |  13 ++
>   net/xdp/xdp_umem.c                                 |  12 +-
>   net/xdp/xsk.c                                      | 149 +++++++++++++---
>   net/xdp/xsk.h                                      |  13 ++
>   net/xdp/xsk_queue.h                                |   1 +
>   samples/bpf/xdpsock_user.c                         | 192 +++++++++++++--------
>   tools/include/uapi/linux/if_xdp.h                  |  13 ++
>   tools/lib/bpf/xsk.c                                |   4 +
>   tools/lib/bpf/xsk.h                                |   6 +
>   23 files changed, 462 insertions(+), 115 deletions(-)
> 
> --
> 2.7.4
> 

Series applied, thanks everyone!

^ permalink raw reply

* Re: [PATCH bpf-next v2] net: Don't call XDP_SETUP_PROG when nothing is changed
From: Daniel Borkmann @ 2019-08-17 21:29 UTC (permalink / raw)
  To: Maxim Mikityanskiy, Alexei Starovoitov, Jakub Kicinski
  Cc: bpf@vger.kernel.org, netdev@vger.kernel.org, David S. Miller,
	Björn Töpel, Saeed Mahameed, Jesper Dangaard Brouer,
	John Fastabend, Martin KaFai Lau, Song Liu, Yonghong Song
In-Reply-To: <20190814143352.3759-1-maximmi@mellanox.com>

On 8/14/19 4:34 PM, Maxim Mikityanskiy wrote:
> Don't uninstall an XDP program when none is installed, and don't install
> an XDP program that has the same ID as the one already installed.
> 
> dev_change_xdp_fd doesn't perform any checks in case it uninstalls an
> XDP program. It means that the driver's ndo_bpf can be called with
> XDP_SETUP_PROG asking to set it to NULL even if it's already NULL. This
> case happens if the user runs `ip link set eth0 xdp off` when there is
> no XDP program attached.
> 
> The symmetrical case is possible when the user tries to set the program
> that is already set.
> 
> The drivers typically perform some heavy operations on XDP_SETUP_PROG,
> so they all have to handle these cases internally to return early if
> they happen. This patch puts this check into the kernel code, so that
> all drivers will benefit from it.
> 
> Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>

Applied, thanks!

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox