Netdev List
* Re: [bpf PATCH 4/6] bpf: sockmap, tcp_disconnect to listen transition
From: John Fastabend @ 2018-06-14  5:48 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: ast, daniel, netdev
In-Reply-To: <20180614005601.bsjijlsr7smpngde@kafai-mbp>

On 06/13/2018 05:56 PM, Martin KaFai Lau wrote:
> On Wed, Jun 13, 2018 at 10:50:14AM -0700, John Fastabend wrote:
>> After adding checks to ensure TCP is in ESTABLISHED state when a
>> sock is added we need to also ensure that user does not transition
>> through tcp_disconnect() and back into ESTABLISHED state without
>> sockmap removing the sock.
>>
>> To do this add unhash hook and remove sock from map there.
> In bpf_tcp_init():
>         sk->sk_prot = &tcp_bpf_proto;
> 
> I may have missed a lock while reading sockmap.c.
> Is it possible that tcp_disconnect() is being called while
> the above assignment is also being done (e.g. through BPF_MAP_UPDATE_ELEM)?
> The same situation goes for the ESTABLISHED check.
> 

Right, because ESTABLISHED is checked without any locking, it's
possible that the state changes during the update (from userspace
BPF_MAP_UPDATE; from a sock_ops program it is locked). I have
the patch below in my tree now. I was thinking of sending it as
a follow-on, but on second thought it likely makes more sense
as part of the patch that adds the ESTABLISHED check.

Also, after the patch below, the sk_callback lock used to protect
psock->maps is becoming increasingly pointless: all it does is allow
the delete and map free ops to be called without taking the full
sock lock. It might be time to just drop it in bpf-next and
use the sock lock in the delete cases. The more annoying part
is that delete will need different userspace and bpf program
helpers so we know when we need the lock.
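
To make the race concrete, here is a minimal userspace sketch of the
"check the state and update under the same lock" pattern. It is not the
kernel code: struct conn, conn_map_update() and conn_disconnect() are
hypothetical stand-ins, and a pthread mutex plays the role of
lock_sock()/release_sock().

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the TCP state and struct sock. */
enum conn_state { CONN_ESTABLISHED, CONN_CLOSED };

struct conn {
	pthread_mutex_t lock;	/* plays the role of the sock lock */
	enum conn_state state;
	int in_map;		/* non-zero once added to the map */
};

/*
 * Check the state and add the connection while holding the lock, so a
 * concurrent disconnect cannot slip in between the check and the update.
 */
static int conn_map_update(struct conn *c)
{
	int err = 0;

	pthread_mutex_lock(&c->lock);
	if (c->state != CONN_ESTABLISHED) {
		err = -EOPNOTSUPP;
		goto out;
	}
	c->in_map = 1;
out:
	pthread_mutex_unlock(&c->lock);
	return err;
}

/* Mirrors the idea of the unhash hook: leaving ESTABLISHED drops the entry. */
static void conn_disconnect(struct conn *c)
{
	pthread_mutex_lock(&c->lock);
	c->state = CONN_CLOSED;
	c->in_map = 0;
	pthread_mutex_unlock(&c->lock);
}
```

Because both paths take the same lock, an update either sees ESTABLISHED
and completes before the disconnect runs, or sees the closed state and
fails with -EOPNOTSUPP.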

--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -2074,17 +2074,20 @@ static int sock_map_update_elem(struct bpf_map *map,
                return -EINVAL;
        }

+       lock_sock(skops.sk);
        /* ULPs are currently supported only for TCP sockets in ESTABLISHED
         * state.
         */
        if (skops.sk->sk_type != SOCK_STREAM ||
            skops.sk->sk_protocol != IPPROTO_TCP ||
            skops.sk->sk_state != TCP_ESTABLISHED) {
-               fput(socket->file);
-               return -EOPNOTSUPP;
+               err = -EOPNOTSUPP;
+               goto out;
        }

        err = sock_map_ctx_update_elem(&skops, map, key, flags);
+out:
+       release_sock(skops.sk);
        fput(socket->file);
        return err;
 }
@@ -2423,17 +2426,20 @@ static int sock_hash_update_elem(struct bpf_map *map,
                return -EINVAL;
        }

+       lock_sock(skops.sk);
        /* ULPs are currently supported only for TCP sockets in ESTABLISHED
         * state.
         */
        if (skops.sk->sk_type != SOCK_STREAM ||
            skops.sk->sk_protocol != IPPROTO_TCP ||
            skops.sk->sk_state != TCP_ESTABLISHED) {
-               fput(socket->file);
-               return -EOPNOTSUPP;
+               err = -EOPNOTSUPP;
+               goto out;
        }

        err = sock_hash_ctx_update_elem(&skops, map, key, flags);
+out:
+       release_sock(skops.sk);
        fput(socket->file);
        return err;


* [PATCH v2] net: ethernet: stmmac: dwmac-rk: Add GMAC support for PX30
From: David Wu @ 2018-06-14  6:15 UTC (permalink / raw)
  To: davem, heiko, robh+dt
  Cc: mark.rutland, huangtao, netdev, linux-arm-kernel, linux-rockchip,
	linux-kernel, David Wu

Add constants and callback functions for the dwmac on the PX30 SoC.
The base structure is the same, but the registers and the bits in
them have moved slightly. Also add clk_mac_speed for selecting the
MAC speed.
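
As a rough illustration of the constants this patch adds, the sketch
below expands them in plain C. It assumes the usual Rockchip GRF
convention, where the upper 16 bits of a written value act as
write-enable bits for the lower 16; the BIT/GRF_BIT/GRF_CLR_BIT
definitions are assumed to match the driver's helpers, and grf_apply()
and px30_rmii_clk_rate() are hypothetical helpers for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the driver's helper macros (write-masked register). */
#define BIT(nr)			(UINT32_C(1) << (nr))
#define GRF_BIT(nr)		(BIT(nr) | BIT((nr) + 16))
#define GRF_CLR_BIT(nr)		(BIT((nr) + 16))

/* Values taken from the patch. */
#define PX30_GMAC_PHY_INTF_SEL_RMII	(GRF_CLR_BIT(4) | GRF_CLR_BIT(5) | \
					 GRF_BIT(6))
#define PX30_GMAC_SPEED_10M		GRF_CLR_BIT(2)
#define PX30_GMAC_SPEED_100M		GRF_BIT(2)

/*
 * Hypothetical model of a write-masked register: only the low bits whose
 * enable bit (same position + 16) is set in val are updated.
 */
static uint32_t grf_apply(uint32_t reg, uint32_t val)
{
	uint32_t mask = val >> 16;

	return (reg & ~mask & 0xffff) | (val & mask);
}

/* clk_mac_speed rate in Hz for an RMII speed in Mbit/s, 0 if unsupported. */
static unsigned long px30_rmii_clk_rate(int speed)
{
	switch (speed) {
	case 10:
		return 2500000UL;
	case 100:
		return 25000000UL;
	default:
		return 0;
	}
}
```

Under that assumption, writing PX30_GMAC_PHY_INTF_SEL_RMII touches only
bits 4..6 of PX30_GRF_GMAC_CON1 and leaves the speed bit alone.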

Signed-off-by: David Wu <david.wu@rock-chips.com>
---
Change in v2:
- Fix some errors in commit title and message

 .../devicetree/bindings/net/rockchip-dwmac.txt     |  1 +
 drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c     | 64 ++++++++++++++++++++++
 2 files changed, 65 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/rockchip-dwmac.txt b/Documentation/devicetree/bindings/net/rockchip-dwmac.txt
index 9c16ee2..3b71da7 100644
--- a/Documentation/devicetree/bindings/net/rockchip-dwmac.txt
+++ b/Documentation/devicetree/bindings/net/rockchip-dwmac.txt
@@ -4,6 +4,7 @@ The device node has following properties.
 
 Required properties:
  - compatible: should be "rockchip,<name>-gamc"
+   "rockchip,px30-gmac":   found on PX30 SoCs
    "rockchip,rk3128-gmac": found on RK312x SoCs
    "rockchip,rk3228-gmac": found on RK322x SoCs
    "rockchip,rk3288-gmac": found on RK3288 SoCs
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
index 13133b3..5e69dd0 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
@@ -61,6 +61,7 @@ struct rk_priv_data {
 	struct clk *mac_clk_tx;
 	struct clk *clk_mac_ref;
 	struct clk *clk_mac_refout;
+	struct clk *clk_mac_speed;
 	struct clk *aclk_mac;
 	struct clk *pclk_mac;
 	struct clk *clk_phy;
@@ -83,6 +84,64 @@ struct rk_priv_data {
 	(((tx) ? soc##_GMAC_TXCLK_DLY_ENABLE : soc##_GMAC_TXCLK_DLY_DISABLE) | \
 	 ((rx) ? soc##_GMAC_RXCLK_DLY_ENABLE : soc##_GMAC_RXCLK_DLY_DISABLE))
 
+#define PX30_GRF_GMAC_CON1		0x0904
+
+/* PX30_GRF_GMAC_CON1 */
+#define PX30_GMAC_PHY_INTF_SEL_RMII	(GRF_CLR_BIT(4) | GRF_CLR_BIT(5) | \
+					 GRF_BIT(6))
+#define PX30_GMAC_SPEED_10M		GRF_CLR_BIT(2)
+#define PX30_GMAC_SPEED_100M		GRF_BIT(2)
+
+static void px30_set_to_rmii(struct rk_priv_data *bsp_priv)
+{
+	struct device *dev = &bsp_priv->pdev->dev;
+
+	if (IS_ERR(bsp_priv->grf)) {
+		dev_err(dev, "%s: Missing rockchip,grf property\n", __func__);
+		return;
+	}
+
+	regmap_write(bsp_priv->grf, PX30_GRF_GMAC_CON1,
+		     PX30_GMAC_PHY_INTF_SEL_RMII);
+}
+
+static void px30_set_rmii_speed(struct rk_priv_data *bsp_priv, int speed)
+{
+	struct device *dev = &bsp_priv->pdev->dev;
+	int ret;
+
+	if (IS_ERR(bsp_priv->clk_mac_speed)) {
+		dev_err(dev, "%s: Missing clk_mac_speed clock\n", __func__);
+		return;
+	}
+
+	if (speed == 10) {
+		regmap_write(bsp_priv->grf, PX30_GRF_GMAC_CON1,
+			     PX30_GMAC_SPEED_10M);
+
+		ret = clk_set_rate(bsp_priv->clk_mac_speed, 2500000);
+		if (ret)
+			dev_err(dev, "%s: set clk_mac_speed rate 2500000 failed: %d\n",
+				__func__, ret);
+	} else if (speed == 100) {
+		regmap_write(bsp_priv->grf, PX30_GRF_GMAC_CON1,
+			     PX30_GMAC_SPEED_100M);
+
+		ret = clk_set_rate(bsp_priv->clk_mac_speed, 25000000);
+		if (ret)
+			dev_err(dev, "%s: set clk_mac_speed rate 25000000 failed: %d\n",
+				__func__, ret);
+
+	} else {
+		dev_err(dev, "unknown speed value for RMII! speed=%d", speed);
+	}
+}
+
+static const struct rk_gmac_ops px30_ops = {
+	.set_to_rmii = px30_set_to_rmii,
+	.set_rmii_speed = px30_set_rmii_speed,
+};
+
 #define RK3128_GRF_MAC_CON0	0x0168
 #define RK3128_GRF_MAC_CON1	0x016c
 
@@ -1042,6 +1101,10 @@ static int rk_gmac_clk_init(struct plat_stmmacenet_data *plat)
 		}
 	}
 
+	bsp_priv->clk_mac_speed = devm_clk_get(dev, "clk_mac_speed");
+	if (IS_ERR(bsp_priv->clk_mac_speed))
+		dev_err(dev, "cannot get clock %s\n", "clk_mac_speed");
+
 	if (bsp_priv->clock_input) {
 		dev_info(dev, "clock input from PHY\n");
 	} else {
@@ -1424,6 +1487,7 @@ static int rk_gmac_resume(struct device *dev)
 static SIMPLE_DEV_PM_OPS(rk_gmac_pm_ops, rk_gmac_suspend, rk_gmac_resume);
 
 static const struct of_device_id rk_gmac_dwmac_match[] = {
+	{ .compatible = "rockchip,px30-gmac",	.data = &px30_ops   },
 	{ .compatible = "rockchip,rk3128-gmac", .data = &rk3128_ops },
 	{ .compatible = "rockchip,rk3228-gmac", .data = &rk3228_ops },
 	{ .compatible = "rockchip,rk3288-gmac", .data = &rk3288_ops },
-- 
2.7.4


* [PATCH v2 net-next 1/6] net: ethernet: ti: cpsw: use cpdma channels in backward order for txq
From: Ivan Khoronzhuk @ 2018-06-14  7:36 UTC (permalink / raw)
  To: grygorii.strashko, davem
  Cc: corbet, akpm, netdev, linux-doc, linux-kernel, linux-omap,
	vinicius.gomes, henrik, jesus.sanchez-palencia, ilias.apalodimas,
	p-varis, spatton, francois.ozog, yogeshs, nsekhar, andrew,
	Ivan Khoronzhuk
In-Reply-To: <20180614073650.29659-1-ivan.khoronzhuk@linaro.org>

The cpdma channel priority runs from the highest channel number to the
lowest. The driver has a limited number of descriptors, which are shared
among the cpdma channels. The number of queues can be tuned with ethtool,
which avoids spending descriptors on unneeded cpdma channels.
In AVB, 2 rate-limited tx queues are usually enough.
Rate limitation can be used only for high-priority queues. Thus, to
use only 2 queues, all 8 have to be created, which is wasteful.

So, in order to allow using only the needed number of rate-limited
tx queues, save resources, and still be able to set rate limitation
for them, assign tx cpdma channels to queues in backward order.
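
The reversed mapping can be sketched in standalone C as follows. This is
only an illustration of the idea in the diff below; txq_to_cpdma_ch()
and pending_txqs() are hypothetical helper names, not driver functions.

```c
#include <assert.h>

#define CPSW_MAX_QUEUES	8

/* tx queue index -> cpdma channel, reversed so that txq0 gets channel 7. */
static int txq_to_cpdma_ch(int txq)
{
	assert(txq >= 0 && txq < CPSW_MAX_QUEUES);
	return CPSW_MAX_QUEUES - 1 - txq;
}

/*
 * Walk an 8-bit bitmap of pending cpdma channels from bit 7 downwards and
 * record the tx queue (vector) index for every set bit.
 * Returns the number of entries written to out[].
 */
static int pending_txqs(unsigned int ch_map, int out[CPSW_MAX_QUEUES])
{
	int txq, n = 0;

	for (txq = 0; ch_map & 0xff; ch_map <<= 1, txq++) {
		if (!(ch_map & 0x80))
			continue;

		out[n++] = txq;
	}

	return n;
}
```

With this layout, enabling only txq0 and txq1 creates just cpdma
channels 7 and 6, which are the ones that can be rate limited.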

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 drivers/net/ethernet/ti/cpsw.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 534596ce00d3..406537d74ec1 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -967,8 +967,8 @@ static int cpsw_tx_mq_poll(struct napi_struct *napi_tx, int budget)
 
 	/* process every unprocessed channel */
 	ch_map = cpdma_ctrl_txchs_state(cpsw->dma);
-	for (ch = 0, num_tx = 0; ch_map; ch_map >>= 1, ch++) {
-		if (!(ch_map & 0x01))
+	for (ch = 0, num_tx = 0; ch_map & 0xff; ch_map <<= 1, ch++) {
+		if (!(ch_map & 0x80))
 			continue;
 
 		txv = &cpsw->txv[ch];
@@ -2431,7 +2431,7 @@ static int cpsw_update_channels_res(struct cpsw_priv *priv, int ch_num, int rx)
 	void (*handler)(void *, int, int);
 	struct netdev_queue *queue;
 	struct cpsw_vector *vec;
-	int ret, *ch;
+	int ret, *ch, vch;
 
 	if (rx) {
 		ch = &cpsw->rx_ch_num;
@@ -2444,7 +2444,8 @@ static int cpsw_update_channels_res(struct cpsw_priv *priv, int ch_num, int rx)
 	}
 
 	while (*ch < ch_num) {
-		vec[*ch].ch = cpdma_chan_create(cpsw->dma, *ch, handler, rx);
+		vch = rx ? *ch : 7 - *ch;
+		vec[*ch].ch = cpdma_chan_create(cpsw->dma, vch, handler, rx);
 		queue = netdev_get_tx_queue(priv->ndev, *ch);
 		queue->tx_maxrate = 0;
 
@@ -2980,7 +2981,7 @@ static int cpsw_probe(struct platform_device *pdev)
 	u32 slave_offset, sliver_offset, slave_size;
 	const struct soc_device_attribute *soc;
 	struct cpsw_common		*cpsw;
-	int ret = 0, i;
+	int ret = 0, i, ch;
 	int irq;
 
 	cpsw = devm_kzalloc(&pdev->dev, sizeof(struct cpsw_common), GFP_KERNEL);
@@ -3155,7 +3156,8 @@ static int cpsw_probe(struct platform_device *pdev)
 	if (soc)
 		cpsw->quirk_irq = 1;
 
-	cpsw->txv[0].ch = cpdma_chan_create(cpsw->dma, 0, cpsw_tx_handler, 0);
+	ch = cpsw->quirk_irq ? 0 : 7;
+	cpsw->txv[0].ch = cpdma_chan_create(cpsw->dma, ch, cpsw_tx_handler, 0);
 	if (IS_ERR(cpsw->txv[0].ch)) {
 		dev_err(priv->dev, "error initializing tx dma channel\n");
 		ret = PTR_ERR(cpsw->txv[0].ch);
-- 
2.17.1


* [PATCH v2 net-next 2/6] net: ethernet: ti: cpdma: fit rated channels in backward order
From: Ivan Khoronzhuk @ 2018-06-14  7:36 UTC (permalink / raw)
  To: grygorii.strashko, davem
  Cc: corbet, akpm, netdev, linux-doc, linux-kernel, linux-omap,
	vinicius.gomes, henrik, jesus.sanchez-palencia, ilias.apalodimas,
	p-varis, spatton, francois.ozog, yogeshs, nsekhar, andrew,
	Ivan Khoronzhuk
In-Reply-To: <20180614073650.29659-1-ivan.khoronzhuk@linaro.org>

According to the TRM, tx rated channels should be in 7..0 order,
so correct it.
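
A simplified sketch of the resulting check: scanning channels from 0 to
7, once a rate-limited channel is seen, every higher channel must be
rate limited too. Unlike the driver, this sketch treats all 8 channels
as existing (the driver skips channels that were not created), takes
BIT(i) as the channel mask, and uses the hypothetical name
fit_rated_channels().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define CPDMA_MAX_CHANNELS	8

/*
 * rate[i] is the configured rate of tx channel i, 0 meaning "not rated".
 * On success stores the mask of rated channels and whether any channel is
 * rated at all; returns -EINVAL if an unrated channel sits above a rated one.
 */
static int fit_rated_channels(const uint32_t rate[CPDMA_MAX_CHANNELS],
			      uint32_t *rmask, int *prio_mode)
{
	uint32_t new_rmask = 0;
	int rlim = 0;
	int i;

	for (i = 0; i < CPDMA_MAX_CHANNELS; i++) {
		if (rate[i]) {
			rlim = 1;
			new_rmask |= UINT32_C(1) << i;
			continue;
		}

		/* unrated channel above a rated one: not allowed */
		if (rlim)
			return -EINVAL;
	}

	*rmask = new_rmask;
	*prio_mode = rlim;
	return 0;
}
```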

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 drivers/net/ethernet/ti/davinci_cpdma.c | 31 ++++++++++++-------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/ti/davinci_cpdma.c b/drivers/net/ethernet/ti/davinci_cpdma.c
index cdbddf16dd29..19bb63902997 100644
--- a/drivers/net/ethernet/ti/davinci_cpdma.c
+++ b/drivers/net/ethernet/ti/davinci_cpdma.c
@@ -406,37 +406,36 @@ static int cpdma_chan_fit_rate(struct cpdma_chan *ch, u32 rate,
 	struct cpdma_chan *chan;
 	u32 old_rate = ch->rate;
 	u32 new_rmask = 0;
-	int rlim = 1;
+	int rlim = 0;
 	int i;
 
-	*prio_mode = 0;
 	for (i = tx_chan_num(0); i < tx_chan_num(CPDMA_MAX_CHANNELS); i++) {
 		chan = ctlr->channels[i];
-		if (!chan) {
-			rlim = 0;
+		if (!chan)
 			continue;
-		}
 
 		if (chan == ch)
 			chan->rate = rate;
 
 		if (chan->rate) {
-			if (rlim) {
-				new_rmask |= chan->mask;
-			} else {
-				ch->rate = old_rate;
-				dev_err(ctlr->dev, "Prev channel of %dch is not rate limited\n",
-					chan->chan_num);
-				return -EINVAL;
-			}
-		} else {
-			*prio_mode = 1;
-			rlim = 0;
+			rlim = 1;
+			new_rmask |= chan->mask;
+			continue;
 		}
+
+		if (rlim)
+			goto err;
 	}
 
 	*rmask = new_rmask;
+	*prio_mode = rlim;
 	return 0;
+
+err:
+	ch->rate = old_rate;
+	dev_err(ctlr->dev, "Upper cpdma ch%d is not rate limited\n",
+		chan->chan_num);
+	return -EINVAL;
 }
 
 static u32 cpdma_chan_set_factors(struct cpdma_ctlr *ctlr,
-- 
2.17.1


* [PATCH v2 net-next 4/6] net: ethernet: ti: cpsw: add CBS Qdisc offload
From: Ivan Khoronzhuk @ 2018-06-14  7:36 UTC (permalink / raw)
  To: grygorii.strashko, davem
  Cc: corbet, akpm, netdev, linux-doc, linux-kernel, linux-omap,
	vinicius.gomes, henrik, jesus.sanchez-palencia, ilias.apalodimas,
	p-varis, spatton, francois.ozog, yogeshs, nsekhar, andrew,
	Ivan Khoronzhuk
In-Reply-To: <20180614073650.29659-1-ivan.khoronzhuk@linaro.org>

The cpsw has up to 4 FIFOs per port, and the upper 3 FIFOs can feed
rate-limited queues with shaping. To set and enable shaping for
those 3 FIFOs, a network device with a CBS qdisc attached is
needed. The CBS configuration is added for dual-emac/single port mode
only, but it could potentially be used in switch mode as well, based
on switchdev for instance.

Although the FIFO shapers can work without cpdma level shapers, the
basic usage must be in combination with cpdma level shapers as
described in the TRM; those are set as maximum rates for interface
queues via sysfs.

One possible configuration with txq shapers and CBS shapers:

                      Configured with echo RATE >
                  /sys/class/net/eth0/queues/tx-0/tx_maxrate
             /---------------------------------------------------
            /
           /            cpdma level shapers
        +----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+
        | c7 | | c6 | | c5 | | c4 | | c3 | | c2 | | c1 | | c0 |
        \    / \    / \    / \    / \    / \    / \    / \    /
         \  /   \  /   \  /   \  /   \  /   \  /   \  /   \  /
          \/     \/     \/     \/     \/     \/     \/     \/
+---------|------|------|------|-------------------------------------+
|    +----+      |      |  +---+                                     |
|    |      +----+      |  |                                         |
|    v      v           v  v                                         |
| +----+ +----+ +----+ +----+ p        p+----+ +----+ +----+ +----+  |
| |    | |    | |    | |    | o        o|    | |    | |    | |    |  |
| | f3 | | f2 | | f1 | | f0 | r  CPSW  r| f3 | | f2 | | f1 | | f0 |  |
| |    | |    | |    | |    | t        t|    | |    | |    | |    |  |
| \    / \    / \    / \    / 0        1\    / \    / \    / \    /  |
|  \  X   \  /   \  /   \  /             \  /   \  /   \  /   \  /   |
|   \/ \   \/     \/     \/               \/     \/     \/     \/    |
+-------\------------------------------------------------------------+
         \
          \ FIFO shaper, set with CBS offload added in this patch,
           \ FIFO0 cannot be rate limited
            ------------------------------------------------------

The CBS shaper configuration is supposed to be used with the root MQPRIO
Qdisc offload, which allows adding sk_prio->tc->txq maps that direct
traffic to the appropriate tx queue and map L2 priority to a FIFO shaper.

The CBS shaper is intended to be used for AVB, where L2 priority
(pcp field) is used to differentiate classes of traffic. So additionally
a vlan needs to be created with an appropriate egress sk_prio->l2 prio map.

If CBS has several tx queues assigned to it, the sum of their
bandwidths must not exceed the bandwidth set for CBS. It's recommended
that the CBS bandwidth be a little higher.

The CBS shaper is configured through the CBS qdisc offload interface
using the tc tool from the iproute2 package.

For instance:

$ tc qdisc replace dev eth0 handle 100: parent root mqprio num_tc 3 \
map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 1

$ tc -g class show dev eth0
+---(100:ffe2) mqprio
|    +---(100:3) mqprio
|    +---(100:4) mqprio
|    
+---(100:ffe1) mqprio
|    +---(100:2) mqprio
|    
+---(100:ffe0) mqprio
     +---(100:1) mqprio

$ tc qdisc add dev eth0 parent 100:1 cbs locredit -1440 \
hicredit 60 sendslope -960000 idleslope 40000 offload 1

$ tc qdisc add dev eth0 parent 100:2 cbs locredit -1470 \
hicredit 62 sendslope -980000 idleslope 20000 offload 1

The above commands set CBS shapers for tc0 and tc1, for which txq0 and
txq1 are used. Note that the bandwidth actually set can differ slightly
due to the discreteness of the configuration parameters.
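
The granularity comes from the FIFO bandwidth being programmed as a
whole percentage of the link speed. A standalone sketch of the rounding
done in cpsw_set_fifo_bw() below, with hypothetical helper names and
assuming idleslope is given in kbit/s and the link speed in Mbit/s:

```c
#include <assert.h>

/*
 * Requested bandwidth (kbit/s) -> send percentage of the link speed
 * (Mbit/s), rounded up, with a minimum of 1 percent.
 */
static unsigned int cbs_bw_to_pct(unsigned int bw_kbps,
				  unsigned int speed_mbps)
{
	unsigned int unit, pct;

	assert(speed_mbps > 0);
	unit = speed_mbps * 10;		/* kbit/s per one percent */
	pct = (bw_kbps + unit - 1) / unit;

	return pct ? pct : 1;
}

/* Send percentage -> bandwidth actually set, in Mbit/s (rounded). */
static unsigned int cbs_pct_to_mbps(unsigned int pct,
				    unsigned int speed_mbps)
{
	return (pct * speed_mbps + 50) / 100;
}
```

For example, on a 1000 Mbit/s link a request of 25000 kbit/s rounds up
to 3 percent, so the shaper ends up at 30 Mbit/s rather than 25.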

Here parameters like locredit, hicredit and sendslope are ignored
internally; they are supposed to be set assuming a maximum frame
size of 1500.

It's assumed that the interface speed does not change on reconnection,
which is not always true, so inform the user if the interface speed
has changed, as it can affect the dependent shaper configuration.

For more examples see Documentation.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 drivers/net/ethernet/ti/cpsw.c | 221 +++++++++++++++++++++++++++++++++
 1 file changed, 221 insertions(+)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index edd14def98df..3623c2994ddf 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -46,6 +46,8 @@
 #include "cpts.h"
 #include "davinci_cpdma.h"
 
+#include <net/pkt_sched.h>
+
 #define CPSW_DEBUG	(NETIF_MSG_HW		| NETIF_MSG_WOL		| \
 			 NETIF_MSG_DRV		| NETIF_MSG_LINK	| \
 			 NETIF_MSG_IFUP		| NETIF_MSG_INTR	| \
@@ -154,8 +156,12 @@ do {								\
 #define IRQ_NUM			2
 #define CPSW_MAX_QUEUES		8
 #define CPSW_CPDMA_DESCS_POOL_SIZE_DEFAULT 256
+#define CPSW_FIFO_QUEUE_TYPE_SHIFT	16
+#define CPSW_FIFO_SHAPE_EN_SHIFT	16
+#define CPSW_FIFO_RATE_EN_SHIFT		20
 #define CPSW_TC_NUM			4
 #define CPSW_FIFO_SHAPERS_NUM		(CPSW_TC_NUM - 1)
+#define CPSW_PCT_MASK			0x7f
 
 #define CPSW_RX_VLAN_ENCAP_HDR_PRIO_SHIFT	29
 #define CPSW_RX_VLAN_ENCAP_HDR_PRIO_MSK		GENMASK(2, 0)
@@ -457,6 +463,8 @@ struct cpsw_priv {
 	bool				rx_pause;
 	bool				tx_pause;
 	bool				mqprio_hw;
+	int				fifo_bw[CPSW_TC_NUM];
+	int				shp_cfg_speed;
 	u32 emac_port;
 	struct cpsw_common *cpsw;
 };
@@ -1081,6 +1089,38 @@ static void cpsw_set_slave_mac(struct cpsw_slave *slave,
 	slave_write(slave, mac_lo(priv->mac_addr), SA_LO);
 }
 
+static bool cpsw_shp_is_off(struct cpsw_priv *priv)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	struct cpsw_slave *slave;
+	u32 shift, mask, val;
+
+	val = readl_relaxed(&cpsw->regs->ptype);
+
+	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
+	shift = CPSW_FIFO_SHAPE_EN_SHIFT + 3 * slave->slave_num;
+	mask = 7 << shift;
+	val = val & mask;
+
+	return !val;
+}
+
+static void cpsw_fifo_shp_on(struct cpsw_priv *priv, int fifo, int on)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	struct cpsw_slave *slave;
+	u32 shift, mask, val;
+
+	val = readl_relaxed(&cpsw->regs->ptype);
+
+	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
+	shift = CPSW_FIFO_SHAPE_EN_SHIFT + 3 * slave->slave_num;
+	mask = (1 << --fifo) << shift;
+	val = on ? val | mask : val & ~mask;
+
+	writel_relaxed(val, &cpsw->regs->ptype);
+}
+
 static void _cpsw_adjust_link(struct cpsw_slave *slave,
 			      struct cpsw_priv *priv, bool *link)
 {
@@ -1120,6 +1160,12 @@ static void _cpsw_adjust_link(struct cpsw_slave *slave,
 			mac_control |= BIT(4);
 
 		*link = true;
+
+		if (priv->shp_cfg_speed &&
+		    priv->shp_cfg_speed != slave->phy->speed &&
+		    !cpsw_shp_is_off(priv))
+			dev_warn(priv->dev,
+				 "Speed was changed, CBS shaper speeds are changed!");
 	} else {
 		mac_control = 0;
 		/* disable forwarding */
@@ -1589,6 +1635,178 @@ static int cpsw_tc_to_fifo(int tc, int num_tc)
 	return CPSW_FIFO_SHAPERS_NUM - tc;
 }
 
+static int cpsw_set_fifo_bw(struct cpsw_priv *priv, int fifo, int bw)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	u32 val = 0, send_pct, shift;
+	struct cpsw_slave *slave;
+	int pct = 0, i;
+
+	if (bw > priv->shp_cfg_speed * 1000)
+		goto err;
+
+	/* shaping has to stay enabled for highest fifos linearly
+	 * and fifo bw no more than interface can allow
+	 */
+	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
+	send_pct = slave_read(slave, SEND_PERCENT);
+	for (i = CPSW_FIFO_SHAPERS_NUM; i > 0; i--) {
+		if (!bw) {
+			if (i >= fifo || !priv->fifo_bw[i])
+				continue;
+
+			dev_warn(priv->dev, "Prev FIFO%d is shaped", i);
+			continue;
+		}
+
+		if (!priv->fifo_bw[i] && i > fifo) {
+			dev_err(priv->dev, "Upper FIFO%d is not shaped", i);
+			return -EINVAL;
+		}
+
+		shift = (i - 1) * 8;
+		if (i == fifo) {
+			send_pct &= ~(CPSW_PCT_MASK << shift);
+			val = DIV_ROUND_UP(bw, priv->shp_cfg_speed * 10);
+			if (!val)
+				val = 1;
+
+			send_pct |= val << shift;
+			pct += val;
+			continue;
+		}
+
+		if (priv->fifo_bw[i])
+			pct += (send_pct >> shift) & CPSW_PCT_MASK;
+	}
+
+	if (pct >= 100)
+		goto err;
+
+	slave_write(slave, send_pct, SEND_PERCENT);
+	priv->fifo_bw[fifo] = bw;
+
+	dev_warn(priv->dev, "set FIFO%d bw = %d\n", fifo,
+		 DIV_ROUND_CLOSEST(val * priv->shp_cfg_speed, 100));
+
+	return 0;
+err:
+	dev_err(priv->dev, "Bandwidth doesn't fit in tc configuration");
+	return -EINVAL;
+}
+
+static int cpsw_set_fifo_rlimit(struct cpsw_priv *priv, int fifo, int bw)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	struct cpsw_slave *slave;
+	u32 tx_in_ctl_rg, val;
+	int ret;
+
+	ret = cpsw_set_fifo_bw(priv, fifo, bw);
+	if (ret)
+		return ret;
+
+	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
+	tx_in_ctl_rg = cpsw->version == CPSW_VERSION_1 ?
+		       CPSW1_TX_IN_CTL : CPSW2_TX_IN_CTL;
+
+	if (!bw)
+		cpsw_fifo_shp_on(priv, fifo, bw);
+
+	val = slave_read(slave, tx_in_ctl_rg);
+	if (cpsw_shp_is_off(priv)) {
+		/* disable FIFOs rate limited queues */
+		val &= ~(0xf << CPSW_FIFO_RATE_EN_SHIFT);
+
+		/* set type of FIFO queues to normal priority mode */
+		val &= ~(3 << CPSW_FIFO_QUEUE_TYPE_SHIFT);
+
+		/* set type of FIFO queues to be rate limited */
+		if (bw)
+			val |= 2 << CPSW_FIFO_QUEUE_TYPE_SHIFT;
+		else
+			priv->shp_cfg_speed = 0;
+	}
+
+	/* toggle a FIFO rate limited queue */
+	if (bw)
+		val |= BIT(fifo + CPSW_FIFO_RATE_EN_SHIFT);
+	else
+		val &= ~BIT(fifo + CPSW_FIFO_RATE_EN_SHIFT);
+	slave_write(slave, val, tx_in_ctl_rg);
+
+	/* FIFO transmit shape enable */
+	cpsw_fifo_shp_on(priv, fifo, bw);
+	return 0;
+}
+
+/* Defaults:
+ * class A - prio 3
+ * class B - prio 2
+ * shaping for class A should be set first
+ */
+static int cpsw_set_cbs(struct net_device *ndev,
+			struct tc_cbs_qopt_offload *qopt)
+{
+	struct cpsw_priv *priv = netdev_priv(ndev);
+	struct cpsw_common *cpsw = priv->cpsw;
+	struct cpsw_slave *slave;
+	int prev_speed = 0;
+	int tc, ret, fifo;
+	u32 bw = 0;
+
+	tc = netdev_txq_to_tc(priv->ndev, qopt->queue);
+
+	/* enable channels in backward order, as highest FIFOs must be rate
+	 * limited first and for compliance with CPDMA rate limited channels
+	 * that are also used in backward order. FIFO0 cannot be rate limited.
+	 */
+	fifo = cpsw_tc_to_fifo(tc, ndev->num_tc);
+	if (!fifo) {
+		dev_err(priv->dev, "Last tc%d can't be rate limited", tc);
+		return -EINVAL;
+	}
+
+	/* do nothing, it's disabled anyway */
+	if (!qopt->enable && !priv->fifo_bw[fifo])
+		return 0;
+
+	/* shapers can be set if link speed is known */
+	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
+	if (slave->phy && slave->phy->link) {
+		if (priv->shp_cfg_speed &&
+		    priv->shp_cfg_speed != slave->phy->speed)
+			prev_speed = priv->shp_cfg_speed;
+
+		priv->shp_cfg_speed = slave->phy->speed;
+	}
+
+	if (!priv->shp_cfg_speed) {
+		dev_err(priv->dev, "Link speed is not known");
+		return -1;
+	}
+
+	ret = pm_runtime_get_sync(cpsw->dev);
+	if (ret < 0) {
+		pm_runtime_put_noidle(cpsw->dev);
+		return ret;
+	}
+
+	bw = qopt->enable ? qopt->idleslope : 0;
+	ret = cpsw_set_fifo_rlimit(priv, fifo, bw);
+	if (ret) {
+		priv->shp_cfg_speed = prev_speed;
+		prev_speed = 0;
+	}
+
+	if (bw && prev_speed)
+		dev_warn(priv->dev,
+			 "Speed was changed, CBS shaper speeds are changed!");
+
+	pm_runtime_put_sync(cpsw->dev);
+	return ret;
+}
+
 static int cpsw_ndo_open(struct net_device *ndev)
 {
 	struct cpsw_priv *priv = netdev_priv(ndev);
@@ -2263,6 +2481,9 @@ static int cpsw_ndo_setup_tc(struct net_device *ndev, enum tc_setup_type type,
 			     void *type_data)
 {
 	switch (type) {
+	case TC_SETUP_QDISC_CBS:
+		return cpsw_set_cbs(ndev, type_data);
+
 	case TC_SETUP_QDISC_MQPRIO:
 		return cpsw_set_mqprio(ndev, type_data);
 
-- 
2.17.1


* [PATCH v2 net-next 5/6] net: ethernet: ti: cpsw: restore shaper configuration while down/up
From: Ivan Khoronzhuk @ 2018-06-14  7:36 UTC (permalink / raw)
  To: grygorii.strashko, davem
  Cc: corbet, akpm, netdev, linux-doc, linux-kernel, linux-omap,
	vinicius.gomes, henrik, jesus.sanchez-palencia, ilias.apalodimas,
	p-varis, spatton, francois.ozog, yogeshs, nsekhar, andrew,
	Ivan Khoronzhuk
In-Reply-To: <20180614073650.29659-1-ivan.khoronzhuk@linaro.org>

The shaper configuration needs to be restored after the interface goes
down and up, because the corresponding configuration is still kept in
the kernel settings. This restores only the shaper context, so the vlan
configuration should be restored by the user if needed, especially for
devices with one port where vlan frames are sent via ALE.
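
As an illustration of what the MQPRIO part of the restore recomputes,
the sketch below rebuilds a tx priority map from a priority-to-tc table
in the same way as cpsw_mqprio_resume() in the diff below: one 4-bit
field per priority, holding the FIFO number. The function name and the
sample table are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define CPSW_TC_NUM		4
#define CPSW_FIFO_SHAPERS_NUM	(CPSW_TC_NUM - 1)

/*
 * prio_tc[i] is the traffic class assigned to priority i (0..7).
 * tc0 maps to the highest FIFO (3), tc3 to FIFO0.
 */
static uint32_t cpsw_build_tx_prio_map(const int prio_tc[8])
{
	uint32_t tx_prio_map = 0;
	int i, fifo;

	for (i = 0; i < 8; i++) {
		assert(prio_tc[i] >= 0 && prio_tc[i] < CPSW_TC_NUM);
		fifo = CPSW_FIFO_SHAPERS_NUM - prio_tc[i];
		tx_prio_map |= (uint32_t)fifo << (4 * i);
	}

	return tx_prio_map;
}
```

With a sample table of { 2, 2, 1, 0, 2, 2, 2, 2 }, priority 3 lands in
FIFO3, priority 2 in FIFO2 and everything else in FIFO1.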

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 drivers/net/ethernet/ti/cpsw.c | 47 ++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 3623c2994ddf..b2dfa7053b40 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1807,6 +1807,51 @@ static int cpsw_set_cbs(struct net_device *ndev,
 	return ret;
 }
 
+static void cpsw_cbs_resume(struct cpsw_slave *slave, struct cpsw_priv *priv)
+{
+	int fifo, bw;
+
+	for (fifo = CPSW_FIFO_SHAPERS_NUM; fifo > 0; fifo--) {
+		bw = priv->fifo_bw[fifo];
+		if (!bw)
+			continue;
+
+		cpsw_set_fifo_rlimit(priv, fifo, bw);
+	}
+}
+
+static void cpsw_mqprio_resume(struct cpsw_slave *slave, struct cpsw_priv *priv)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	u32 tx_prio_map = 0;
+	int i, tc, fifo;
+	u32 tx_prio_rg;
+
+	if (!priv->mqprio_hw)
+		return;
+
+	for (i = 0; i < 8; i++) {
+		tc = netdev_get_prio_tc_map(priv->ndev, i);
+		fifo = CPSW_FIFO_SHAPERS_NUM - tc;
+		tx_prio_map |= fifo << (4 * i);
+	}
+
+	tx_prio_rg = cpsw->version == CPSW_VERSION_1 ?
+		     CPSW1_TX_PRI_MAP : CPSW2_TX_PRI_MAP;
+
+	slave_write(slave, tx_prio_map, tx_prio_rg);
+}
+
+/* restore resources after port reset */
+static void cpsw_restore(struct cpsw_priv *priv)
+{
+	/* restore MQPRIO offload */
+	for_each_slave(priv, cpsw_mqprio_resume, priv);
+
+	/* restore CBS offload */
+	for_each_slave(priv, cpsw_cbs_resume, priv);
+}
+
 static int cpsw_ndo_open(struct net_device *ndev)
 {
 	struct cpsw_priv *priv = netdev_priv(ndev);
@@ -1886,6 +1931,8 @@ static int cpsw_ndo_open(struct net_device *ndev)
 
 	}
 
+	cpsw_restore(priv);
+
 	/* Enable Interrupt pacing if configured */
 	if (cpsw->coal_intvl != 0) {
 		struct ethtool_coalesce coal;
-- 
2.17.1


* [PATCH v2 net-next 6/6] Documentation: networking: cpsw: add MQPRIO & CBS offload examples
From: Ivan Khoronzhuk @ 2018-06-14  7:36 UTC (permalink / raw)
  To: grygorii.strashko, davem
  Cc: corbet, akpm, netdev, linux-doc, linux-kernel, linux-omap,
	vinicius.gomes, henrik, jesus.sanchez-palencia, ilias.apalodimas,
	p-varis, spatton, francois.ozog, yogeshs, nsekhar, andrew,
	Ivan Khoronzhuk
In-Reply-To: <20180614073650.29659-1-ivan.khoronzhuk@linaro.org>

This document describes MQPRIO and CBS Qdisc offload configuration
for the cpsw driver, based on examples. It can potentially be used in
audio video bridging (AVB) and time sensitive networking (TSN).

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 Documentation/networking/ti-cpsw.txt | 540 +++++++++++++++++++++++++++
 1 file changed, 540 insertions(+)
 create mode 100644 Documentation/networking/ti-cpsw.txt

diff --git a/Documentation/networking/ti-cpsw.txt b/Documentation/networking/ti-cpsw.txt
new file mode 100644
index 000000000000..f5d58f502e52
--- /dev/null
+++ b/Documentation/networking/ti-cpsw.txt
@@ -0,0 +1,540 @@
+* Texas Instruments CPSW ethernet driver
+
+Multiqueue & CBS & MQPRIO
+=====================================================================
+=====================================================================
+
+The cpsw has 3 CBS shapers for each external port. This document
+describes MQPRIO and CBS Qdisc offload configuration for the cpsw driver
+based on examples. It can potentially be used in audio video bridging
+(AVB) and time sensitive networking (TSN).
+
+The following examples were tested on AM572x EVM and BBB boards.
+
+Test setup
+==========
+
+Two examples are considered, with AM52xx EVM running the cpsw driver
+in dual_emac mode.
+
+Several prerequisites:
+- TX queues must be rated starting from txq0 that has highest priority
+- Traffic classes are used starting from 0, that has highest priority
+- CBS shapers should be used with rated queues
+- The bandwidth for CBS shapers has to be set a little bit more than
+  potential incoming rate, thus, rate of all incoming tx queues has
+  to be a little less
+- Real rates can differ, due to discreteness
+- Mapping skb-priority to txq is not enough, also skb-priority to l2 prio
+  map has to be created with ip or vconfig tool
+- Any l2/socket prio (0 - 7) for classes can be used, but for
+  simplicity default values are used: 3 and 2
+- Only 2 classes were tested: A and B, but it was checked that more can
+  work; at most 4 are allowed, and a rate can be set for only 3 of them.
+Test setup for examples
+=======================
+                                    +-------------------------------+
+                                    |--+                            |
+                                    |  |      Workstation0          |
+                                    |E |  MAC 18:03:73:66:87:42     |
++-----------------------------+  +--|t |                            |
+|                    | 1  | E |  |  |h |./tsn_listener -d \         |
+|  Target board:     | 0  | t |--+  |0 | 18:03:73:66:87:42 -i eth0 \|
+|  AM572x EVM        | 0  | h |     |  | -s 1500                    |
+|                    | 0  | 0 |     |--+                            |
+|  Only 2 classes:   |Mb  +---|     +-------------------------------+
+|  class A, class B  |        |
+|                    |    +---|     +-------------------------------+
+|                    | 1  | E |     |--+                            |
+|                    | 0  | t |     |  |      Workstation1          |
+|                    | 0  | h |--+  |E |  MAC 20:cf:30:85:7d:fd     |
+|                    |Mb  | 1 |  +--|t |                            |
++-----------------------------+     |h |./tsn_listener -d \         |
+                                    |0 | 20:cf:30:85:7d:fd -i eth0 \|
+                                    |  | -s 1500                    |
+                                    |--+                            |
+                                    +-------------------------------+
+
+*********************************************************************
+*********************************************************************
+*********************************************************************
+Example 1: One port tx AVB configuration scheme for target board
+----------------------------------------------------------------------
+(prints and scheme for AM572x evm, applicable for single port boards)
+
+tc - traffic class
+txq - transmit queue
+p - priority
+f - fifo (cpsw fifo)
+S - shaper configured
+
++------------------------------------------------------------------+ u
+| +---------------+  +---------------+  +------+ +------+          | s
+| |               |  |               |  |      | |      |          | e
+| | App 1         |  | App 2         |  | Apps | | Apps |          | r
+| | Class A       |  | Class B       |  | Rest | | Rest |          |
+| | Eth0          |  | Eth0          |  | Eth0 | | Eth1 |          | s
+| | VLAN100       |  | VLAN100       |  |   |  | |   |  |          | p
+| | 40 Mb/s       |  | 20 Mb/s       |  |   |  | |   |  |          | a
+| | SO_PRIORITY=3 |  | SO_PRIORITY=2 |  |   |  | |   |  |          | c
+| |   |           |  |   |           |  |   |  | |   |  |          | e
+| +---|-----------+  +---|-----------+  +---|--+ +---|--+          |
++-----|------------------|------------------|--------|-------------+
+    +-+     +------------+                  |        |
+    |       |             +-----------------+     +--+
+    |       |             |                       |
++---|-------|-------------|-----------------------|----------------+
+| +----+ +----+ +----+ +----+                   +----+             |
+| | p3 | | p2 | | p1 | | p0 |                   | p0 |             | k
+| \    / \    / \    / \    /                   \    /             | e
+|  \  /   \  /   \  /   \  /                     \  /              | r
+|   \/     \/     \/     \/                       \/               | n
+|    |     |             |                        |                | e
+|    |     |       +-----+                        |                | l
+|    |     |       |                              |                |
+| +----+ +----+ +----+                          +----+             | s
+| |tc0 | |tc1 | |tc2 |                          |tc0 |             | p
+| \    / \    / \    /                          \    /             | a
+|  \  /   \  /   \  /                            \  /              | c
+|   \/     \/     \/                              \/               | e
+|   |      |       +-----+                        |                |
+|   |      |       |     |                        |                |
+|   |      |       |     |                        |                |
+|   |      |       |     |                        |                |
+| +----+ +----+ +----+ +----+                   +----+             |
+| |txq0| |txq1| |txq2| |txq3|                   |txq4|             |
+| \    / \    / \    / \    /                   \    /             |
+|  \  /   \  /   \  /   \  /                     \  /              |
+|   \/     \/     \/     \/                       \/               |
+| +-|------|------|------|--+                  +--|--------------+ |
+| | |      |      |      |  | Eth0.100         |  |     Eth1     | |
++---|------|------|------|------------------------|----------------+
+    |      |      |      |                        |
+    p      p      p      p                        |
+    3      2      0-1, 4-7  <- L2 priority        |
+    |      |      |      |                        |
+    |      |      |      |                        |
++---|------|------|------|------------------------|----------------+
+|   |      |      |      |             |----------+                |
+| +----+ +----+ +----+ +----+       +----+                         |
+| |dma7| |dma6| |dma5| |dma4|       |dma3|                         |
+| \    / \    / \    / \    /       \    /                         | c
+|  \S /   \S /   \  /   \  /         \  /                          | p
+|   \/     \/     \/     \/           \/                           | s
+|   |      |      | +-----            |                            | w
+|   |      |      | |                 |                            |
+|   |      |      | |                 |                            | d
+| +----+ +----+ +----+p            p+----+                         | r
+| |    | |    | |    |o            o|    |                         | i
+| | f3 | | f2 | | f0 |r            r| f0 |                         | v
+| |tc0 | |tc1 | |tc2 |t            t|tc0 |                         | e
+| \CBS / \CBS / \CBS /1            2\CBS /                         | r
+|  \S /   \S /   \  /                \  /                          |
+|   \/     \/     \/                  \/                           |
++------------------------------------------------------------------+
+========================================Eth==========================>
+
+1)
+// Add 4 tx queues, for interface Eth0, and 1 tx queue for Eth1
+$ ethtool -L eth0 rx 1 tx 5
+rx unmodified, ignoring
+
+2)
+// Check if num of queues is set correctly:
+$ ethtool -l eth0
+Channel parameters for eth0:
+Pre-set maximums:
+RX:             8
+TX:             8
+Other:          0
+Combined:       0
+Current hardware settings:
+RX:             1
+TX:             5
+Other:          0
+Combined:       0
+
+3)
+// TX queues must be rated starting from 0, so set bandwidth for tx0 and tx1.
+// Set rates to 40 and 20 Mb/s respectively.
+// Note that the real speed can differ a bit due to discreteness.
+// Leave the last 2 tx queues unrated.
+$ echo 40 > /sys/class/net/eth0/queues/tx-0/tx_maxrate
+$ echo 20 > /sys/class/net/eth0/queues/tx-1/tx_maxrate
+
+4)
+// Check maximum rate of tx (cpdma) queues:
+$ cat /sys/class/net/eth0/queues/tx-*/tx_maxrate
+40
+20
+0
+0
+0
+
+5)
+// Map skb->priority to traffic class:
+// 3pri -> tc0, 2pri -> tc1, (0,1,4-7)pri -> tc2
+// Map traffic class to transmit queue:
+// tc0 -> txq0, tc1 -> txq1, tc2 -> (txq2, txq3)
+$ tc qdisc replace dev eth0 handle 100: parent root mqprio num_tc 3 \
+map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 1
+
+5a)
+// As the two interfaces share the same set of tx queues, assign all traffic
+// going to interface Eth1 to a separate queue so that it does not mix
+// with traffic from interface Eth0: use a separate txq to send
+// packets to Eth1, so all prio -> tc0 and tc0 -> txq4.
+// Here hw is 0, so eth1 keeps its default configuration in hw.
+$ tc qdisc replace dev eth1 handle 100: parent root mqprio num_tc 1 \
+map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 queues 1@4 hw 0
+
+6)
+// Check classes settings
+$ tc -g class show dev eth0
++---(100:ffe2) mqprio
+|    +---(100:3) mqprio
+|    +---(100:4) mqprio
+|
++---(100:ffe1) mqprio
+|    +---(100:2) mqprio
+|
++---(100:ffe0) mqprio
+     +---(100:1) mqprio
+
+$ tc -g class show dev eth1
++---(100:ffe0) mqprio
+     +---(100:5) mqprio
+
+7)
+// Set rate for class A - 41 Mbit (tc0, txq0) using CBS Qdisc
+// Set it +1 Mb for reserve (important!)
+// Here only the idle slope is important, the other args are ignored.
+// Note that the real speed can differ a bit due to discreteness.
+$ tc qdisc add dev eth0 parent 100:1 cbs locredit -1438 \
+hicredit 62 sendslope -959000 idleslope 41000 offload 1
+net eth0: set FIFO3 bw = 50
+
+8)
+// Set rate for class B - 21 Mbit (tc1, txq1) using CBS Qdisc:
+// Set it +1 Mb for reserve (important!)
+$ tc qdisc add dev eth0 parent 100:2 cbs locredit -1468 \
+hicredit 65 sendslope -979000 idleslope 21000 offload 1
+net eth0: set FIFO2 bw = 30
+
+9)
+// Create vlan 100 to map sk->priority to vlan qos
+$ ip link add link eth0 name eth0.100 type vlan id 100
+8021q: 802.1Q VLAN Support v1.8
+8021q: adding VLAN 0 to HW filter on device eth0
+8021q: adding VLAN 0 to HW filter on device eth1
+net eth0: Adding vlanid 100 to vlan filter
+
+10)
+// Map skb->priority to L2 prio, 1 to 1
+$ ip link set eth0.100 type vlan \
+egress 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+11)
+// Check egress map for vlan 100
+$ cat /proc/net/vlan/eth0.100
+[...]
+INGRESS priority mappings: 0:0  1:0  2:0  3:0  4:0  5:0  6:0 7:0
+EGRESS priority mappings: 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+12)
+// Run your appropriate tools with socket option "SO_PRIORITY"
+// to 3 for class A and/or to 2 for class B
+// (I took them from https://www.spinics.net/lists/netdev/msg460869.html)
+./tsn_talker -d 18:03:73:66:87:42 -i eth0.100 -p3 -s 1500&
+./tsn_talker -d 18:03:73:66:87:42 -i eth0.100 -p2 -s 1500&
+
+13)
+// run your listener on workstation
+// (I took them from https://www.spinics.net/lists/netdev/msg460869.html)
+./tsn_listener -d 18:03:73:66:87:42 -i enp5s0 -s 1500
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39000 kbps
+
+14)
+// Restore default configuration if needed
+$ ip link del eth0.100
+$ tc qdisc del dev eth1 root
+$ tc qdisc del dev eth0 root
+net eth0: Prev FIFO2 is shaped
+net eth0: set FIFO3 bw = 0
+net eth0: set FIFO2 bw = 0
+$ ethtool -L eth0 rx 1 tx 1
+
+*********************************************************************
+*********************************************************************
+*********************************************************************
+Example 2: Two port tx AVB configuration scheme for target board
+----------------------------------------------------------------------
+(prints and scheme for AM572x evm, for dual emac boards only)
+
++------------------------------------------------------------------+ u
+| +----------+  +----------+  +------+  +----------+  +----------+ | s
+| |          |  |          |  |      |  |          |  |          | | e
+| | App 1    |  | App 2    |  | Apps |  | App 3    |  | App 4    | | r
+| | Class A  |  | Class B  |  | Rest |  | Class B  |  | Class A  | |
+| | Eth0     |  | Eth0     |  |   |  |  | Eth1     |  | Eth1     | | s
+| | VLAN100  |  | VLAN100  |  |   |  |  | VLAN100  |  | VLAN100  | | p
+| | 40 Mb/s  |  | 20 Mb/s  |  |   |  |  | 10 Mb/s  |  | 30 Mb/s  | | a
+| | SO_PRI=3 |  | SO_PRI=2 |  |   |  |  | SO_PRI=3 |  | SO_PRI=2 | | c
+| |   |      |  |   |      |  |   |  |  |   |      |  |   |      | | e
+| +---|------+  +---|------+  +---|--+  +---|------+  +---|------+ |
++-----|-------------|-------------|---------|-------------|--------+
+    +-+     +-------+             |         +----------+  +----+
+    |       |             +-------+------+             |       |
+    |       |             |              |             |       |
++---|-------|-------------|--------------|-------------|-------|---+
+| +----+ +----+ +----+ +----+          +----+ +----+ +----+ +----+ |
+| | p3 | | p2 | | p1 | | p0 |          | p0 | | p1 | | p2 | | p3 | | k
+| \    / \    / \    / \    /          \    / \    / \    / \    / | e
+|  \  /   \  /   \  /   \  /            \  /   \  /   \  /   \  /  | r
+|   \/     \/     \/     \/              \/     \/     \/     \/   | n
+|   |      |             |                |             |      |   | e
+|   |      |        +----+                +----+        |      |   | l
+|   |      |        |                          |        |      |   |
+| +----+ +----+ +----+                        +----+ +----+ +----+ | s
+| |tc0 | |tc1 | |tc2 |                        |tc2 | |tc1 | |tc0 | | p
+| \    / \    / \    /                        \    / \    / \    / | a
+|  \  /   \  /   \  /                          \  /   \  /   \  /  | c
+|   \/     \/     \/                            \/     \/     \/   | e
+|   |      |       +-----+                +-----+      |       |   |
+|   |      |       |     |                |     |      |       |   |
+|   |      |       |     |                |     |      |       |   |
+|   |      |       |     |    E      E    |     |      |       |   |
+| +----+ +----+ +----+ +----+ t      t +----+ +----+ +----+ +----+ |
+| |txq0| |txq1| |txq4| |txq5| h      h |txq6| |txq7| |txq3| |txq2| |
+| \    / \    / \    / \    / 0      1 \    / \    / \    / \    / |
+|  \  /   \  /   \  /   \  /  .      .  \  /   \  /   \  /   \  /  |
+|   \/     \/     \/     \/   1      1   \/     \/     \/     \/   |
+| +-|------|------|------|--+ 0      0 +-|------|------|------|--+ |
+| | |      |      |      |  | 0      0 | |      |      |      |  | |
++---|------|------|------|---------------|------|------|------|----+
+    |      |      |      |               |      |      |      |
+    p      p      p      p               p      p      p      p
+    3      2      0-1, 4-7   <-L2 pri->  0-1, 4-7      2      3
+    |      |      |      |               |      |      |      |
+    |      |      |      |               |      |      |      |
++---|------|------|------|---------------|------|------|------|----+
+|   |      |      |      |               |      |      |      |    |
+| +----+ +----+ +----+ +----+          +----+ +----+ +----+ +----+ |
+| |dma7| |dma6| |dma3| |dma2|          |dma1| |dma0| |dma4| |dma5| |
+| \    / \    / \    / \    /          \    / \    / \    / \    / | c
+|  \S /   \S /   \  /   \  /            \  /   \  /   \S /   \S /  | p
+|   \/     \/     \/     \/              \/     \/     \/     \/   | s
+|   |      |      | +-----                |      |      |      |   | w
+|   |      |      | |                     +----+ |      |      |   |
+|   |      |      | |                          | |      |      |   | d
+| +----+ +----+ +----+p                      p+----+ +----+ +----+ | r
+| |    | |    | |    |o                      o|    | |    | |    | | i
+| | f3 | | f2 | | f0 |r        CPSW          r| f3 | | f2 | | f0 | | v
+| |tc0 | |tc1 | |tc2 |t                      t|tc0 | |tc1 | |tc2 | | e
+| \CBS / \CBS / \CBS /1                      2\CBS / \CBS / \CBS / | r
+|  \S /   \S /   \  /                          \S /   \S /   \  /  |
+|   \/     \/     \/                            \/     \/     \/   |
++------------------------------------------------------------------+
+========================================Eth==========================>
+
+1)
+// Add 8 tx queues, for interface Eth0, but they are common, so are accessed
+// by two interfaces Eth0 and Eth1.
+$ ethtool -L eth1 rx 1 tx 8
+rx unmodified, ignoring
+
+2)
+// Check if num of queues is set correctly:
+$ ethtool -l eth0
+Channel parameters for eth0:
+Pre-set maximums:
+RX:             8
+TX:             8
+Other:          0
+Combined:       0
+Current hardware settings:
+RX:             1
+TX:             8
+Other:          0
+Combined:       0
+
+3)
+// TX queues must be rated starting from 0, so set bandwidth for tx0 and tx1
+// for Eth0 and for tx2 and tx3 for Eth1: rates of 40 and 20 Mb/s respectively
+// for Eth0 and 30 and 10 Mb/s for Eth1.
+// The real speed can differ a bit due to discreteness.
+// Leave the last 4 tx queues unrated.
+$ echo 40 > /sys/class/net/eth0/queues/tx-0/tx_maxrate
+$ echo 20 > /sys/class/net/eth0/queues/tx-1/tx_maxrate
+$ echo 30 > /sys/class/net/eth1/queues/tx-2/tx_maxrate
+$ echo 10 > /sys/class/net/eth1/queues/tx-3/tx_maxrate
+
+4)
+// Check maximum rate of tx (cpdma) queues:
+$ cat /sys/class/net/eth0/queues/tx-*/tx_maxrate
+40
+20
+30
+10
+0
+0
+0
+0
+
+5)
+// Map skb->priority to traffic class for Eth0:
+// 3pri -> tc0, 2pri -> tc1, (0,1,4-7)pri -> tc2
+// Map traffic class to transmit queue:
+// tc0 -> txq0, tc1 -> txq1, tc2 -> (txq4, txq5)
+$ tc qdisc replace dev eth0 handle 100: parent root mqprio num_tc 3 \
+map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@4 hw 1
+
+6)
+// Check classes settings
+$ tc -g class show dev eth0
++---(100:ffe2) mqprio
+|    +---(100:5) mqprio
+|    +---(100:6) mqprio
+|
++---(100:ffe1) mqprio
+|    +---(100:2) mqprio
+|
++---(100:ffe0) mqprio
+     +---(100:1) mqprio
+
+7)
+// Set rate for class A - 41 Mbit (tc0, txq0) using CBS Qdisc for Eth0
+// Here only the idle slope is important, the other args are ignored.
+// The real speed can differ a bit due to discreteness.
+$ tc qdisc add dev eth0 parent 100:1 cbs locredit -1470 \
+hicredit 62 sendslope -959000 idleslope 41000 offload 1
+net eth0: set FIFO3 bw = 50
+
+8)
+// Set rate for class B - 21 Mbit (tc1, txq1) using CBS Qdisc for Eth0
+$ tc qdisc add dev eth0 parent 100:2 cbs locredit -1470 \
+hicredit 65 sendslope -979000 idleslope 21000 offload 1
+net eth0: set FIFO2 bw = 30
+
+9)
+// Create vlan 100 to map sk->priority to vlan qos for Eth0
+$ ip link add link eth0 name eth0.100 type vlan id 100
+net eth0: Adding vlanid 100 to vlan filter
+
+10)
+// Map skb->priority to L2 prio for Eth0.100, one to one
+$ ip link set eth0.100 type vlan \
+egress 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+11)
+// Check egress map for vlan 100
+$ cat /proc/net/vlan/eth0.100
+[...]
+INGRESS priority mappings: 0:0  1:0  2:0  3:0  4:0  5:0  6:0 7:0
+EGRESS priority mappings: 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+12)
+// Map skb->priority to traffic class for Eth1:
+// 3pri -> tc0, 2pri -> tc1, (0,1,4-7)pri -> tc2
+// Map traffic class to transmit queue:
+// tc0 -> txq2, tc1 -> txq3, tc2 -> (txq6, txq7)
+$ tc qdisc replace dev eth1 handle 100: parent root mqprio num_tc 3 \
+map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@2 1@3 2@6 hw 1
+
+13)
+// Check classes settings
+$ tc -g class show dev eth1
++---(100:ffe2) mqprio
+|    +---(100:7) mqprio
+|    +---(100:8) mqprio
+|
++---(100:ffe1) mqprio
+|    +---(100:4) mqprio
+|
++---(100:ffe0) mqprio
+     +---(100:3) mqprio
+
+14)
+// Set rate for class A - 31 Mbit (tc0, txq2) using CBS Qdisc for Eth1
+// Here only the idle slope is important, the other args are ignored.
+// Set it +1 Mb for reserve (important!)
+$ tc qdisc add dev eth1 parent 100:3 cbs locredit -1453 \
+hicredit 47 sendslope -969000 idleslope 31000 offload 1
+net eth1: set FIFO3 bw = 31
+
+15)
+// Set rate for class B - 11 Mbit (tc1, txq3) using CBS Qdisc for Eth1
+// Set it +1 Mb for reserve (important!)
+$ tc qdisc add dev eth1 parent 100:4 cbs locredit -1483 \
+hicredit 34 sendslope -989000 idleslope 11000 offload 1
+net eth1: set FIFO2 bw = 11
+
+16)
+// Create vlan 100 to map sk->priority to vlan qos for Eth1
+$ ip link add link eth1 name eth1.100 type vlan id 100
+net eth1: Adding vlanid 100 to vlan filter
+
+17)
+// Map skb->priority to L2 prio for Eth1.100, one to one
+$ ip link set eth1.100 type vlan \
+egress 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+18)
+// Check egress map for vlan 100
+$ cat /proc/net/vlan/eth1.100
+[...]
+INGRESS priority mappings: 0:0  1:0  2:0  3:0  4:0  5:0  6:0 7:0
+EGRESS priority mappings: 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+19)
+// Run appropriate tools with socket option "SO_PRIORITY" to 3
+// for class A and to 2 for class B. For both interfaces
+./tsn_talker -d 18:03:73:66:87:42 -i eth0.100 -p2 -s 1500&
+./tsn_talker -d 18:03:73:66:87:42 -i eth0.100 -p3 -s 1500&
+./tsn_talker -d 20:cf:30:85:7d:fd -i eth1.100 -p2 -s 1500&
+./tsn_talker -d 20:cf:30:85:7d:fd -i eth1.100 -p3 -s 1500&
+
+20)
+// run your listeners on workstations
+// (I took them from https://www.spinics.net/lists/netdev/msg460869.html)
+./tsn_listener -d 18:03:73:66:87:42 -i enp5s0 -s 1500
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39000 kbps
+
+21)
+// Restore default configuration if needed
+$ ip link del eth1.100
+$ ip link del eth0.100
+$ tc qdisc del dev eth1 root
+net eth1: Prev FIFO2 is shaped
+net eth1: set FIFO3 bw = 0
+net eth1: set FIFO2 bw = 0
+$ tc qdisc del dev eth0 root
+net eth0: Prev FIFO2 is shaped
+net eth0: set FIFO3 bw = 0
+net eth0: set FIFO2 bw = 0
+$ ethtool -L eth0 rx 1 tx 1
-- 
2.17.1
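
For reference, the sendslope arguments in the CBS commands of both examples are consistent with the relation documented in tc-cbs(8), sendslope = idleslope - port transmit rate, for a 1 Gbit/s port (all values in kbit/s). A minimal sketch of that arithmetic follows; the function name cbs_sendslope is made up for illustration, and as the examples note, only idleslope matters to the cpsw offload.

```shell
# Hypothetical helper, for illustration only.
# usage: cbs_sendslope IDLESLOPE_KBIT PORT_RATE_KBIT
cbs_sendslope() {
    echo $(( $1 - $2 ))
}

cbs_sendslope 41000 1000000   # class A on eth0, prints -959000
cbs_sendslope 21000 1000000   # class B on eth0, prints -979000
```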


* WARNING in bpf_prog_select_runtime
From: syzbot @ 2018-06-14  7:37 UTC (permalink / raw)
  To: ast, daniel, linux-kernel, netdev, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    ee946c36be21 Merge tag 'platform-drivers-x86-v4.17-2' of g..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11ca275b800000
kernel config:  https://syzkaller.appspot.com/x/.config?x=889265cebaf9bda1
dashboard link: https://syzkaller.appspot.com/bug?extid=3b889862e65a98317058
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=17530b5b800000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+3b889862e65a98317058@syzkaller.appspotmail.com

RAX: ffffffffffffffda RBX: 00000000014b0914 RCX: 0000000000455979
RDX: 0000000000000033 RSI: 0000000000000001 RDI: 0000000000000004
RBP: 000000000072bea0 R08: 0000000000000010 R09: 0000000000000000
R10: 0000000020000500 R11: 0000000000000246 R12: 0000000000000005
R13: 0000000000000578 R14: 00000000006fc3e0 R15: 000000000000000a
WARNING: CPU: 0 PID: 4502 at include/linux/filter.h:651 bpf_prog_lock_ro include/linux/filter.h:651 [inline]
WARNING: CPU: 0 PID: 4502 at include/linux/filter.h:651 bpf_prog_select_runtime+0x53c/0x640 kernel/bpf/core.c:1503
Kernel panic - not syncing: panic_on_warn set ...

CPU: 0 PID: 4502 Comm: syz-executor0 Not tainted 4.17.0-rc3+ #35
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
  panic+0x22f/0x4de kernel/panic.c:184
  __warn.cold.8+0x163/0x1b3 kernel/panic.c:536
  report_bug+0x252/0x2d0 lib/bug.c:186
  fixup_bug arch/x86/kernel/traps.c:178 [inline]
  do_error_trap+0x1de/0x490 arch/x86/kernel/traps.c:296
  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
  invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:992
RIP: 0010:bpf_prog_lock_ro include/linux/filter.h:651 [inline]
RIP: 0010:bpf_prog_select_runtime+0x53c/0x640 kernel/bpf/core.c:1503
RSP: 0018:ffff8801b094f8a8 EFLAGS: 00010293
RAX: ffff8801b194e040 RBX: ffffc90001944000 RCX: ffffffff81857b67
RDX: 0000000000000000 RSI: ffffffff81857efc RDI: 0000000000000005
RBP: ffff8801b094f908 R08: ffff8801b194e040 R09: 0000000000000006
R10: ffff8801b194e040 R11: 0000000000000000 R12: 00000000fffffff4
R13: ffffffff81862050 R14: 0000000000000000 R15: ffff8801d7186480
  bpf_migrate_filter net/core/filter.c:1069 [inline]
  bpf_prepare_filter+0xb65/0x1060 net/core/filter.c:1117
  __get_filter+0x1e0/0x280 net/core/filter.c:1310
  sk_reuseport_attach_filter+0x1d/0x90 net/core/filter.c:1343
  sock_setsockopt+0x1ad3/0x1f40 net/core/sock.c:954
  __sys_setsockopt+0x2df/0x390 net/socket.c:1899
  __do_sys_setsockopt net/socket.c:1914 [inline]
  __se_sys_setsockopt net/socket.c:1911 [inline]
  __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1911
  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x455979
RSP: 002b:00007ffd0a44c648 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 00000000014b0914 RCX: 0000000000455979
RDX: 0000000000000033 RSI: 0000000000000001 RDI: 0000000000000004
RBP: 000000000072bea0 R08: 0000000000000010 R09: 0000000000000000
R10: 0000000020000500 R11: 0000000000000246 R12: 0000000000000005
R13: 0000000000000578 R14: 00000000006fc3e0 R15: 000000000000000a
Dumping ftrace buffer:
    (ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches


* [PATCH v2 net-next 0/6] net: ethernet: ti: cpsw: add MQPRIO and CBS Qdisc offload
From: Ivan Khoronzhuk @ 2018-06-14  7:36 UTC (permalink / raw)
  To: grygorii.strashko, davem
  Cc: corbet, akpm, netdev, linux-doc, linux-kernel, linux-omap,
	vinicius.gomes, henrik, jesus.sanchez-palencia, ilias.apalodimas,
	p-varis, spatton, francois.ozog, yogeshs, nsekhar, andrew,
	Ivan Khoronzhuk

This series adds MQPRIO and CBS Qdisc offload for the TI cpsw driver.
It can potentially be used in audio video bridging (AVB) and time
sensitive networking (TSN).

The patchset was tested on AM572x EVM and BBB boards. The last patch in
this series adds a detailed description of the configuration with
examples. For consistency, the talker and listener tools from the
patchset "TSN: Add qdisc based config interface for CBS" were used;
they can be found here: https://www.spinics.net/lists/netdev/msg460869.html

Based on net-next/master

v2..v1:
 - renamed cpsw.txt to ti-cpsw.txt
 - renamed cpsw_set_tc() to cpsw_set_mqprio()

Ivan Khoronzhuk (6):
  net: ethernet: ti: cpsw: use cpdma channels in backward order for txq
  net: ethernet: ti: cpdma: fit rated channels in backward order
  net: ethernet: ti: cpsw: add MQPRIO Qdisc offload
  net: ethernet: ti: cpsw: add CBS Qdisc offload
  net: ethernet: ti: cpsw: restore shaper configuration while down/up
  Documentation: networking: cpsw: add MQPRIO & CBS offload examples

 Documentation/networking/ti-cpsw.txt    | 540 ++++++++++++++++++++++++
 drivers/net/ethernet/ti/cpsw.c          | 364 +++++++++++++++-
 drivers/net/ethernet/ti/davinci_cpdma.c |  31 +-
 3 files changed, 913 insertions(+), 22 deletions(-)
 create mode 100644 Documentation/networking/ti-cpsw.txt

-- 
2.17.1


* Re: KASAN: slab-out-of-bounds Read in bpf_skb_vlan_push
From: Dmitry Vyukov @ 2018-06-14  7:37 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: syzbot, Alexei Starovoitov, David Miller, LKML, netdev,
	syzkaller-bugs
In-Reply-To: <26c434ee-0a0a-fbba-282c-dabddfac652e@iogearbox.net>

On Wed, Jun 13, 2018 at 8:15 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 06/13/2018 08:13 PM, syzbot wrote:
>>> On 06/13/2018 06:17 PM, syzbot wrote:
>>>> Hello,
>>
>>>> syzbot found the following crash on:
>>
>>>> HEAD commit:    75d4e704fa8d netdev-FAQ: clarify DaveM's position for stab..
>>>> git tree:       bpf-next
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1754783f800000
>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=a601a80fec461d44
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=76de61614cb1abdd73fc
>>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=12c1e1bf800000
>>
>>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>>> Reported-by: syzbot+76de61614cb1abdd73fc@syzkaller.appspotmail.com
>>
>>>> IPv6: ADDRCONF(NETDEV_CHANGE): veth1: link becomes ready
>>>> IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
>>>> 8021q: adding VLAN 0 to HW filter on device team0
>>>> 8021q: adding VLAN 0 to HW filter on device team0
>>>> ==================================================================
>>>> BUG: KASAN: slab-out-of-bounds in skb_at_tc_ingress include/net/sch_generic.h:535 [inline]
>>>> BUG: KASAN: slab-out-of-bounds in bpf_push_mac_rcsum net/core/filter.c:1625 [inline]
>>>> BUG: KASAN: slab-out-of-bounds in ____bpf_skb_vlan_push net/core/filter.c:2446 [inline]
>>>> BUG: KASAN: slab-out-of-bounds in bpf_skb_vlan_push+0x6b7/0x720 net/core/filter.c:2437
>>>> Read of size 5 at addr ffff8801b77347d0 by task syz-executor6/6529
>>
>>> Should be fixed already by:
>>
>>>    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=58990d1ff3f7896ee341030e9a7c2e4002570683
>>
>>
>>> #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>>
>> want 2 args (repo, branch), got 1
>
> Fair enough ... assumed default would have been master. ;-)

There is an issue with making defaults. Some email clients reflow
emails and split lines, and a git repo address can be lengthy and trigger
such a reflow. To work around this, syzbot currently looks for 2 "tokens"
after syz test, not necessarily on the same line. A default for the branch
would cause ambiguity in parsing: is it only a repo without a branch, or is
it a repo with the branch on the next line? Engineering hits reality...
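
To illustrate the two-token rule, here is a rough shell sketch. It is not syzbot's actual parser: the function name parse_syz_test and the quote-stripping step are assumptions made for the example. It takes the first two whitespace-separated tokens after "#syz test:" even when a mail client has wrapped the branch onto the next line.

```shell
# Hypothetical sketch, not syzbot's real implementation.
# Reads an email body on stdin and prints "repo branch" for the first
# "#syz test:" command it finds.
parse_syz_test() {
    # Drop leading quote markers, flatten line breaks into spaces, then
    # pick the two tokens that follow "#syz test:".
    sed 's/^[> ]*//' | tr '\n' ' ' | awk '{
        for (i = 1; i <= NF; i++)
            if ($i == "#syz" && $(i + 1) == "test:") {
                if (i + 3 <= NF)
                    print $(i + 2), $(i + 3)
                else
                    print "want 2 args (repo, branch), got " (NF - i - 1)
                exit
            }
    }'
}

# The branch wrapped onto the next line is still picked up as the second token.
printf '#syz test: git://example.org/linux.git\nmaster\n' | parse_syz_test
```

With a default branch, the second input line could no longer be told apart from ordinary text following a repo-only command, which is the ambiguity described above.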

> #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master


* [PATCH v2 net-next 3/6] net: ethernet: ti: cpsw: add MQPRIO Qdisc offload
From: Ivan Khoronzhuk @ 2018-06-14  7:36 UTC (permalink / raw)
  To: grygorii.strashko, davem
  Cc: corbet, akpm, netdev, linux-doc, linux-kernel, linux-omap,
	vinicius.gomes, henrik, jesus.sanchez-palencia, ilias.apalodimas,
	p-varis, spatton, francois.ozog, yogeshs, nsekhar, andrew,
	Ivan Khoronzhuk
In-Reply-To: <20180614073650.29659-1-ivan.khoronzhuk@linaro.org>

It is possible to offload the vlan to tc priority mapping, assuming
sk_prio == L2 prio.

Example:
$ ethtool -L eth0 rx 1 tx 4

$ tc qdisc replace dev eth0 handle 100: parent root mqprio num_tc 3 \
map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 1

$ tc -g class show dev eth0
+---(100:ffe2) mqprio
|    +---(100:3) mqprio
|    +---(100:4) mqprio
|    
+---(100:ffe1) mqprio
|    +---(100:2) mqprio
|    
+---(100:ffe0) mqprio
     +---(100:1) mqprio

Here, 100:1 is txq0, 100:2 is txq1, 100:3 is txq2 and 100:4 is txq3;
txq0 belongs to tc0, txq1 to tc1, and txq2 and txq3 to tc2.
The offload part only maps L2 prio to traffic classes, not to
transmit queues, so to direct traffic to a traffic class, a vlan has
to be created with an appropriate egress map.
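
The class handles in the listing above follow from the "queues count@offset" arguments. A small sketch of that expansion, assuming (as the listing suggests) that leaf handle 100:N corresponds to txq N-1; the helper name mqprio_classes is hypothetical:

```shell
# Hypothetical helper: expands mqprio "count@offset" pairs into the leaf
# class handles shown by `tc -g class show`, one output line per tc.
# Assumes handle 100:N (N in hex) corresponds to txq N-1.
mqprio_classes() {
    tc=0
    for spec in "$@"; do
        count=${spec%@*}     # part before '@'
        offset=${spec#*@}    # part after '@'
        q=$offset
        line="tc$tc:"
        while [ "$q" -lt $(( offset + count )) ]; do
            line="$line 100:$(printf '%x' $(( q + 1 )))"
            q=$(( q + 1 ))
        done
        echo "$line"
        tc=$(( tc + 1 ))
    done
}

mqprio_classes 1@0 1@1 2@2
# prints:
# tc0: 100:1
# tc1: 100:2
# tc2: 100:3 100:4
```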

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 drivers/net/ethernet/ti/cpsw.c | 82 ++++++++++++++++++++++++++++++++++
 1 file changed, 82 insertions(+)
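
As an illustration of the mapping implemented in the diff below, this shell sketch reproduces the arithmetic of cpsw_tc_to_fifo() and the priority loop in cpsw_set_mqprio() for the mqprio map from the example. The shell function names are hypothetical; the 4 bits per priority layout is taken from the `fifo << (4 * i)` expression in this patch.

```shell
# Illustrative only: mirrors cpsw_tc_to_fifo() and the prio loop in
# cpsw_set_mqprio(). Shell function names are hypothetical.
CPSW_FIFO_SHAPERS_NUM=3   # CPSW_TC_NUM - 1

# usage: tc_to_fifo TC NUM_TC
tc_to_fifo() {
    if [ "$1" -eq $(( $2 - 1 )) ]; then
        echo 0                                  # last tc goes to fifo 0
    else
        echo $(( CPSW_FIFO_SHAPERS_NUM - $1 ))  # tc0 -> f3, tc1 -> f2, ...
    fi
}

# usage: tx_prio_map NUM_TC TC_FOR_PRIO0 ... TC_FOR_PRIO7
tx_prio_map() {
    num_tc=$1; shift
    map=0; i=0
    for tc in "$@"; do
        [ "$i" -lt 8 ] || break                 # only 8 L2 priorities
        fifo=$(tc_to_fifo "$tc" "$num_tc")
        map=$(( map | (fifo << (4 * i)) ))      # 4 bits per priority
        i=$(( i + 1 ))
    done
    printf '0x%08x\n' "$map"
}

# First 8 entries of "map 2 2 1 0 2 2 2 2 ..." with num_tc 3
tx_prio_map 3 2 2 1 0 2 2 2 2   # prints 0x00003200
```

With this map, priority 3 lands in fifo 3 and priority 2 in fifo 2, while all other priorities share fifo 0.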

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 406537d74ec1..edd14def98df 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -39,6 +39,7 @@
 #include <linux/sys_soc.h>
 
 #include <linux/pinctrl/consumer.h>
+#include <net/pkt_cls.h>
 
 #include "cpsw.h"
 #include "cpsw_ale.h"
@@ -153,6 +154,8 @@ do {								\
 #define IRQ_NUM			2
 #define CPSW_MAX_QUEUES		8
 #define CPSW_CPDMA_DESCS_POOL_SIZE_DEFAULT 256
+#define CPSW_TC_NUM			4
+#define CPSW_FIFO_SHAPERS_NUM		(CPSW_TC_NUM - 1)
 
 #define CPSW_RX_VLAN_ENCAP_HDR_PRIO_SHIFT	29
 #define CPSW_RX_VLAN_ENCAP_HDR_PRIO_MSK		GENMASK(2, 0)
@@ -453,6 +456,7 @@ struct cpsw_priv {
 	u8				mac_addr[ETH_ALEN];
 	bool				rx_pause;
 	bool				tx_pause;
+	bool				mqprio_hw;
 	u32 emac_port;
 	struct cpsw_common *cpsw;
 };
@@ -1577,6 +1581,14 @@ static void cpsw_slave_stop(struct cpsw_slave *slave, struct cpsw_common *cpsw)
 	soft_reset_slave(slave);
 }
 
+static int cpsw_tc_to_fifo(int tc, int num_tc)
+{
+	if (tc == num_tc - 1)
+		return 0;
+
+	return CPSW_FIFO_SHAPERS_NUM - tc;
+}
+
 static int cpsw_ndo_open(struct net_device *ndev)
 {
 	struct cpsw_priv *priv = netdev_priv(ndev);
@@ -2190,6 +2202,75 @@ static int cpsw_ndo_set_tx_maxrate(struct net_device *ndev, int queue, u32 rate)
 	return ret;
 }
 
+static int cpsw_set_mqprio(struct net_device *ndev, void *type_data)
+{
+	struct tc_mqprio_qopt_offload *mqprio = type_data;
+	struct cpsw_priv *priv = netdev_priv(ndev);
+	struct cpsw_common *cpsw = priv->cpsw;
+	int fifo, num_tc, count, offset;
+	struct cpsw_slave *slave;
+	u32 tx_prio_map = 0;
+	int i, tc, ret;
+
+	num_tc = mqprio->qopt.num_tc;
+	if (num_tc > CPSW_TC_NUM)
+		return -EINVAL;
+
+	if (mqprio->mode != TC_MQPRIO_MODE_DCB)
+		return -EINVAL;
+
+	ret = pm_runtime_get_sync(cpsw->dev);
+	if (ret < 0) {
+		pm_runtime_put_noidle(cpsw->dev);
+		return ret;
+	}
+
+	if (num_tc) {
+		for (i = 0; i < 8; i++) {
+			tc = mqprio->qopt.prio_tc_map[i];
+			fifo = cpsw_tc_to_fifo(tc, num_tc);
+			tx_prio_map |= fifo << (4 * i);
+		}
+
+		netdev_set_num_tc(ndev, num_tc);
+		for (i = 0; i < num_tc; i++) {
+			count = mqprio->qopt.count[i];
+			offset = mqprio->qopt.offset[i];
+			netdev_set_tc_queue(ndev, i, count, offset);
+		}
+	}
+
+	if (!mqprio->qopt.hw) {
+		/* restore default configuration */
+		netdev_reset_tc(ndev);
+		tx_prio_map = TX_PRIORITY_MAPPING;
+	}
+
+	priv->mqprio_hw = mqprio->qopt.hw;
+
+	offset = cpsw->version == CPSW_VERSION_1 ?
+		 CPSW1_TX_PRI_MAP : CPSW2_TX_PRI_MAP;
+
+	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
+	slave_write(slave, tx_prio_map, offset);
+
+	pm_runtime_put_sync(cpsw->dev);
+
+	return 0;
+}
+
+static int cpsw_ndo_setup_tc(struct net_device *ndev, enum tc_setup_type type,
+			     void *type_data)
+{
+	switch (type) {
+	case TC_SETUP_QDISC_MQPRIO:
+		return cpsw_set_mqprio(ndev, type_data);
+
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
 static const struct net_device_ops cpsw_netdev_ops = {
 	.ndo_open		= cpsw_ndo_open,
 	.ndo_stop		= cpsw_ndo_stop,
@@ -2205,6 +2286,7 @@ static const struct net_device_ops cpsw_netdev_ops = {
 #endif
 	.ndo_vlan_rx_add_vid	= cpsw_ndo_vlan_rx_add_vid,
 	.ndo_vlan_rx_kill_vid	= cpsw_ndo_vlan_rx_kill_vid,
+	.ndo_setup_tc           = cpsw_ndo_setup_tc,
 };
 
 static int cpsw_get_regs_len(struct net_device *ndev)
-- 
2.17.1
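
For readers following the mapping logic, here is a minimal user-space sketch of how the patch turns a prio->tc map into the TX priority map value. The function bodies mirror cpsw_tc_to_fifo() and the loop in cpsw_set_mqprio(); the name build_tx_prio_map() and the sample map below are assumptions made for illustration, not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

#define CPSW_TC_NUM            4
#define CPSW_FIFO_SHAPERS_NUM  (CPSW_TC_NUM - 1)

/* Same mapping as cpsw_tc_to_fifo() in the patch: the last traffic class
 * goes to FIFO0, the remaining classes count down from FIFO3.
 */
static int tc_to_fifo(int tc, int num_tc)
{
	if (tc == num_tc - 1)
		return 0;

	return CPSW_FIFO_SHAPERS_NUM - tc;
}

/* Hypothetical helper: pack one 4-bit FIFO number per priority, as the
 * loop in cpsw_set_mqprio() does before writing the TX_PRI_MAP register.
 */
static uint32_t build_tx_prio_map(const uint8_t prio_tc_map[8], int num_tc)
{
	uint32_t map = 0;
	int i;

	assert(num_tc >= 1 && num_tc <= CPSW_TC_NUM);

	for (i = 0; i < 8; i++)
		map |= (uint32_t)tc_to_fifo(prio_tc_map[i], num_tc) << (4 * i);

	return map;
}
```

With num_tc = 3 and an assumed prio_tc_map of {2, 2, 1, 0, 2, 2, 2, 2}, tc0 lands on FIFO3, tc1 on FIFO2 and tc2 on FIFO0, so the packed value is 0x3200.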

^ permalink raw reply related

* Re: [BUG] net: stmmac: socfpga ethernet no longer working on linux-next
From: Jose Abreu @ 2018-06-14  7:38 UTC (permalink / raw)
  To: Dinh Nguyen, netdev
  Cc: David Miller, clabbe, Jose.Abreu, Dinh Nguyen, Marek Vasut
In-Reply-To: <CADhT+wewQKmqXDqBj4fk_zhJbQdSSK5LTneJGJ9je3tP2eVMDw@mail.gmail.com>

Hello,

On 13-06-2018 21:46, Dinh Nguyen wrote:
> Hi,
>
> The stmmac ethernet has stopped working in linux-next and linus/master
> branch(v4.17-11782-gbe779f03d563)
>
> It appears that the stmmac ethernet has stopped working after these 2 commits:
>
> 4dbbe8dde848 net: stmmac: Add support for U32 TC filter using Flexible RX Parser
> 5f0456b43140 net: stmmac: Implement logic to automatically select HW Interface
>
> If I move to this commit "565020aaeebf net: stmmac: Disable ACS
> Feature for GMAC >= 4", then the stmmac works again on SoCFPGA.
>
> I was following this thread:
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.spinics.net_lists_netdev_msg502858.html&d=DwIBaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=yaVFU4TjGY0gVF8El1uKcisy6TPsyCl9uN7Wsis-qhY&m=fvPkLp2xlWolmIYwoFLmALhxlycg1w0UmxiYdT7qojc&s=aC4a2U3X_siDxSNz3c5OeadhEJWll31yP-oi5nNar94&e=
>
> Was wondering if there was a patch to fix dwmac-sun8i that the socfpga
> platform needs as well?

Probably. I will check and get back to you ASAP.

Thanks and Best Regards,
Jose Miguel Abreu

>
> Thanks,
> Dinh

^ permalink raw reply

* Re: [PATCH v2] net: ethernet: stmmac: dwmac-rk: Add GMAC support for PX30
From: Heiko Stübner @ 2018-06-14  7:54 UTC (permalink / raw)
  To: David Wu
  Cc: davem, robh+dt, mark.rutland, huangtao, netdev, linux-arm-kernel,
	linux-rockchip, linux-kernel
In-Reply-To: <1528956927-32440-1-git-send-email-david.wu@rock-chips.com>

Hi David,

On Thursday, 14 June 2018, 08:15:27 CEST, David Wu wrote:
> Add constants and callback functions for the dwmac on PX30 Soc.
> The base structure is the same, but registers and the bits in
> them are moved slightly, and add the clk_mac_speed for selecting
> mac speed.
> 
> Signed-off-by: David Wu <david.wu@rock-chips.com>

[...]

> @@ -1042,6 +1101,10 @@ static int rk_gmac_clk_init(struct
> plat_stmmacenet_data *plat) }
>  	}
> 
> +	bsp_priv->clk_mac_speed = devm_clk_get(dev, "clk_mac_speed");
> +	if (IS_ERR(bsp_priv->clk_mac_speed))
> +		dev_err(dev, "cannot get clock %s\n", "clk_mac_speed");
> +

I don't see that new clock documented in the dt-binding.
Also, which clock from the clock-controller does this connect to?


Thanks
Heiko

^ permalink raw reply

* Request to enable setting the nested network namespace
From: Pamela Mei @ 2018-06-14  8:04 UTC (permalink / raw)
  To: netdev

On Linux, I set up two network namespaces, ns1 and ns2; "ip netns list"
shows both of them.
I move one network device (eth1) from the root namespace to ns1, then
from ns1 to ns2, and then delete ns2.
I expected the device to move back to ns1, but eth1 actually ends up
back in the root network namespace. I'm not sure whether this is the
intended behavior.

Here are the detailed test steps:

1. ip netns add ns1

2. ip netns add ns2

3. ip link set eth1 netns ns1

4. ip netns exec ns1 ip link set eth1 netns ns2

5. ip netns del ns2

Expected result: eth1 will be in ns1

Actual result: eth1 is back in the Linux root network namespace

Question: is there any way to make sure the device goes back to ns1
rather than to the root network namespace?

How about adding a way to nest network namespaces, e.g. setting ns1 as
the parent namespace of ns2, so that the device returns to ns1 when ns2
is gone?


Cheers,

Pamela MEI

^ permalink raw reply

* Re: [PATCH v2 net-next 4/6] net: ethernet: ti: cpsw: add CBS Qdisc offload
From: Ilias Apalodimas @ 2018-06-14  8:09 UTC (permalink / raw)
  To: Ivan Khoronzhuk
  Cc: grygorii.strashko, davem, corbet, akpm, netdev, linux-doc,
	linux-kernel, linux-omap, vinicius.gomes, henrik,
	jesus.sanchez-palencia, p-varis, spatton, francois.ozog, yogeshs,
	nsekhar, andrew
In-Reply-To: <20180614073650.29659-5-ivan.khoronzhuk@linaro.org>

On Thu, Jun 14, 2018 at 10:36:48AM +0300, Ivan Khoronzhuk wrote:
> The cpsw has up to 4 FIFOs per port and upper 3 FIFOs can feed rate
> limited queue with shaping. In order to set and enable shaping for
> those 3 FIFOs queues the network device with CBS qdisc attached is
> needed. The CBS configuration is added for dual-emac/single port mode
> only, but potentially can be used in switch mode also, based on
> switchdev for instance.
> 
> Although the FIFO shapers can work w/o cpdma level shapers, the base
> usage must be in combination with cpdma level shapers, as described in
> the TRM, which are set as maximum rates for interface queues with sysfs.
> 
> One of the possible configuration with txq shapers and CBS shapers:
> 
>                       Configured with echo RATE >
>                   /sys/class/net/eth0/queues/tx-0/tx_maxrate
>              /---------------------------------------------------
>             /
>            /            cpdma level shapers
>         +----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+
>         | c7 | | c6 | | c5 | | c4 | | c3 | | c2 | | c1 | | c0 |
>         \    / \    / \    / \    / \    / \    / \    / \    /
>          \  /   \  /   \  /   \  /   \  /   \  /   \  /   \  /
>           \/     \/     \/     \/     \/     \/     \/     \/
> +---------|------|------|------|-------------------------------------+
> |    +----+      |      |  +---+                                     |
> |    |      +----+      |  |                                         |
> |    v      v           v  v                                         |
> | +----+ +----+ +----+ +----+ p        p+----+ +----+ +----+ +----+  |
> | |    | |    | |    | |    | o        o|    | |    | |    | |    |  |
> | | f3 | | f2 | | f1 | | f0 | r  CPSW  r| f3 | | f2 | | f1 | | f0 |  |
> | |    | |    | |    | |    | t        t|    | |    | |    | |    |  |
> | \    / \    / \    / \    / 0        1\    / \    / \    / \    /  |
> |  \  X   \  /   \  /   \  /             \  /   \  /   \  /   \  /   |
> |   \/ \   \/     \/     \/               \/     \/     \/     \/    |
> +-------\------------------------------------------------------------+
>          \
>           \ FIFO shaper, set with CBS offload added in this patch,
>            \ FIFO0 cannot be rate limited
>             ------------------------------------------------------
> 
> CBS shaper configuration is supposed to be used with root MQPRIO Qdisc
> offload allowing to add sk_prio->tc->txq maps that direct traffic to
> appropriate tx queue and maps L2 priority to FIFO shaper.
> 
> The CBS shaper is intended to be used for AVB where L2 priority
> (pcp field) is used to differentiate class of traffic. So additionally
> vlan needs to be created with appropriate egress sk_prio->l2 prio map.
> 
> If CBS has several tx queues assigned to it, the sum of their
> bandwidth must not exceed the bandwidth set for CBS. It's recommended
> that the CBS bandwidth be a little higher.
> 
> The CBS shaper is configured with CBS qdisc offload interface using tc
> tool from iproute2 package.
> 
> For instance:
> 
> $ tc qdisc replace dev eth0 handle 100: parent root mqprio num_tc 3 \
> map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 1
> 
> $ tc -g class show dev eth0
> +---(100:ffe2) mqprio
> |    +---(100:3) mqprio
> |    +---(100:4) mqprio
> |    
> +---(100:ffe1) mqprio
> |    +---(100:2) mqprio
> |    
> +---(100:ffe0) mqprio
>      +---(100:1) mqprio
> 
> $ tc qdisc add dev eth0 parent 100:1 cbs locredit -1440 \
> hicredit 60 sendslope -960000 idleslope 40000 offload 1
> 
> $ tc qdisc add dev eth0 parent 100:2 cbs locredit -1470 \
> hicredit 62 sendslope -980000 idleslope 20000 offload 1
> 
> The above commands set CBS shapers for tc0 and tc1, for which txq0 and
> txq1 are used. Note that the actual bandwidth set can differ a bit
> due to the discreteness of the configuration parameters.
> 
> Here parameters like locredit, hicredit and sendslope are ignored
> internally and are supposed to be set assuming a maximum frame size
> of 1500.
> 
> It's assumed that the interface speed does not change on reconnection,
> which is not always true, so inform the user if the interface speed
> has changed, as it can impact the dependent shapers' configuration.
> 
> For more examples see Documentation.
> 
> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
> ---
>  drivers/net/ethernet/ti/cpsw.c | 221 +++++++++++++++++++++++++++++++++
>  1 file changed, 221 insertions(+)
> 
> diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
> index edd14def98df..3623c2994ddf 100644
> --- a/drivers/net/ethernet/ti/cpsw.c
> +++ b/drivers/net/ethernet/ti/cpsw.c
> @@ -46,6 +46,8 @@
>  #include "cpts.h"
>  #include "davinci_cpdma.h"
>  
> +#include <net/pkt_sched.h>
> +
>  #define CPSW_DEBUG	(NETIF_MSG_HW		| NETIF_MSG_WOL		| \
>  			 NETIF_MSG_DRV		| NETIF_MSG_LINK	| \
>  			 NETIF_MSG_IFUP		| NETIF_MSG_INTR	| \
> @@ -154,8 +156,12 @@ do {								\
>  #define IRQ_NUM			2
>  #define CPSW_MAX_QUEUES		8
>  #define CPSW_CPDMA_DESCS_POOL_SIZE_DEFAULT 256
> +#define CPSW_FIFO_QUEUE_TYPE_SHIFT	16
> +#define CPSW_FIFO_SHAPE_EN_SHIFT	16
> +#define CPSW_FIFO_RATE_EN_SHIFT		20
>  #define CPSW_TC_NUM			4
>  #define CPSW_FIFO_SHAPERS_NUM		(CPSW_TC_NUM - 1)
> +#define CPSW_PCT_MASK			0x7f
>  
>  #define CPSW_RX_VLAN_ENCAP_HDR_PRIO_SHIFT	29
>  #define CPSW_RX_VLAN_ENCAP_HDR_PRIO_MSK		GENMASK(2, 0)
> @@ -457,6 +463,8 @@ struct cpsw_priv {
>  	bool				rx_pause;
>  	bool				tx_pause;
>  	bool				mqprio_hw;
> +	int				fifo_bw[CPSW_TC_NUM];
> +	int				shp_cfg_speed;
>  	u32 emac_port;
>  	struct cpsw_common *cpsw;
>  };
> @@ -1081,6 +1089,38 @@ static void cpsw_set_slave_mac(struct cpsw_slave *slave,
>  	slave_write(slave, mac_lo(priv->mac_addr), SA_LO);
>  }
>  
> +static bool cpsw_shp_is_off(struct cpsw_priv *priv)
> +{
> +	struct cpsw_common *cpsw = priv->cpsw;
> +	struct cpsw_slave *slave;
> +	u32 shift, mask, val;
> +
> +	val = readl_relaxed(&cpsw->regs->ptype);
> +
> +	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
> +	shift = CPSW_FIFO_SHAPE_EN_SHIFT + 3 * slave->slave_num;
> +	mask = 7 << shift;
> +	val = val & mask;
> +
> +	return !val;
> +}
> +
> +static void cpsw_fifo_shp_on(struct cpsw_priv *priv, int fifo, int on)
> +{
> +	struct cpsw_common *cpsw = priv->cpsw;
> +	struct cpsw_slave *slave;
> +	u32 shift, mask, val;
> +
> +	val = readl_relaxed(&cpsw->regs->ptype);
> +
> +	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
> +	shift = CPSW_FIFO_SHAPE_EN_SHIFT + 3 * slave->slave_num;
> +	mask = (1 << --fifo) << shift;
> +	val = on ? val | mask : val & ~mask;
> +
> +	writel_relaxed(val, &cpsw->regs->ptype);
> +}
> +
>  static void _cpsw_adjust_link(struct cpsw_slave *slave,
>  			      struct cpsw_priv *priv, bool *link)
>  {
> @@ -1120,6 +1160,12 @@ static void _cpsw_adjust_link(struct cpsw_slave *slave,
>  			mac_control |= BIT(4);
>  
>  		*link = true;
> +
> +		if (priv->shp_cfg_speed &&
> +		    priv->shp_cfg_speed != slave->phy->speed &&
> +		    !cpsw_shp_is_off(priv))
> +			dev_warn(priv->dev,
> +				 "Speed was changed, CBS sahper speeds are changed!");
typo here, should be shaper
>  	} else {
>  		mac_control = 0;
>  		/* disable forwarding */
> @@ -1589,6 +1635,178 @@ static int cpsw_tc_to_fifo(int tc, int num_tc)
>  	return CPSW_FIFO_SHAPERS_NUM - tc;
>  }
>  
> +static int cpsw_set_fifo_bw(struct cpsw_priv *priv, int fifo, int bw)
> +{
> +	struct cpsw_common *cpsw = priv->cpsw;
> +	u32 val = 0, send_pct, shift;
> +	struct cpsw_slave *slave;
> +	int pct = 0, i;
> +
> +	if (bw > priv->shp_cfg_speed * 1000)
> +		goto err;
> +
> +	/* shaping has to stay enabled for highest fifos linearly
> +	 * and fifo bw no more then interface can allow
> +	 */
> +	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
> +	send_pct = slave_read(slave, SEND_PERCENT);
> +	for (i = CPSW_FIFO_SHAPERS_NUM; i > 0; i--) {
> +		if (!bw) {
> +			if (i >= fifo || !priv->fifo_bw[i])
> +				continue;
> +
> +			dev_warn(priv->dev, "Prev FIFO%d is shaped", i);
> +			continue;
> +		}
> +
> +		if (!priv->fifo_bw[i] && i > fifo) {
> +			dev_err(priv->dev, "Upper FIFO%d is not shaped", i);
> +			return -EINVAL;
> +		}
> +
> +		shift = (i - 1) * 8;
> +		if (i == fifo) {
> +			send_pct &= ~(CPSW_PCT_MASK << shift);
> +			val = DIV_ROUND_UP(bw, priv->shp_cfg_speed * 10);
> +			if (!val)
> +				val = 1;
> +
> +			send_pct |= val << shift;
> +			pct += val;
> +			continue;
> +		}
> +
> +		if (priv->fifo_bw[i])
> +			pct += (send_pct >> shift) & CPSW_PCT_MASK;
> +	}
> +
> +	if (pct >= 100)
> +		goto err;
> +
> +	slave_write(slave, send_pct, SEND_PERCENT);
> +	priv->fifo_bw[fifo] = bw;
> +
> +	dev_warn(priv->dev, "set FIFO%d bw = %d\n", fifo,
> +		 DIV_ROUND_CLOSEST(val * priv->shp_cfg_speed, 100));
> +
> +	return 0;
> +err:
> +	dev_err(priv->dev, "Bandwidth doesn't fit in tc configuration");
> +	return -EINVAL;
> +}
> +
> +static int cpsw_set_fifo_rlimit(struct cpsw_priv *priv, int fifo, int bw)
> +{
> +	struct cpsw_common *cpsw = priv->cpsw;
> +	struct cpsw_slave *slave;
> +	u32 tx_in_ctl_rg, val;
> +	int ret;
> +
> +	ret = cpsw_set_fifo_bw(priv, fifo, bw);
> +	if (ret)
> +		return ret;
> +
> +	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
> +	tx_in_ctl_rg = cpsw->version == CPSW_VERSION_1 ?
> +		       CPSW1_TX_IN_CTL : CPSW2_TX_IN_CTL;
> +
> +	if (!bw)
> +		cpsw_fifo_shp_on(priv, fifo, bw);
> +
> +	val = slave_read(slave, tx_in_ctl_rg);
> +	if (cpsw_shp_is_off(priv)) {
> +		/* disable FIFOs rate limited queues */
> +		val &= ~(0xf << CPSW_FIFO_RATE_EN_SHIFT);
> +
> +		/* set type of FIFO queues to normal priority mode */
> +		val &= ~(3 << CPSW_FIFO_QUEUE_TYPE_SHIFT);
> +
> +		/* set type of FIFO queues to be rate limited */
> +		if (bw)
> +			val |= 2 << CPSW_FIFO_QUEUE_TYPE_SHIFT;
> +		else
> +			priv->shp_cfg_speed = 0;
> +	}
> +
> +	/* toggle a FIFO rate limited queue */
> +	if (bw)
> +		val |= BIT(fifo + CPSW_FIFO_RATE_EN_SHIFT);
> +	else
> +		val &= ~BIT(fifo + CPSW_FIFO_RATE_EN_SHIFT);
> +	slave_write(slave, val, tx_in_ctl_rg);
> +
> +	/* FIFO transmit shape enable */
> +	cpsw_fifo_shp_on(priv, fifo, bw);
> +	return 0;
> +}
> +
> +/* Defaults:
> + * class A - prio 3
> + * class B - prio 2
> + * shaping for class A should be set first
> + */
> +static int cpsw_set_cbs(struct net_device *ndev,
> +			struct tc_cbs_qopt_offload *qopt)
> +{
> +	struct cpsw_priv *priv = netdev_priv(ndev);
> +	struct cpsw_common *cpsw = priv->cpsw;
> +	struct cpsw_slave *slave;
> +	int prev_speed = 0;
> +	int tc, ret, fifo;
> +	u32 bw = 0;
> +
> +	tc = netdev_txq_to_tc(priv->ndev, qopt->queue);
> +
> +	/* enable channels in backward order, as highest FIFOs must be rate
> +	 * limited first and for compliance with CPDMA rate limited channels
> +	 * that also used in bacward order. FIFO0 cannot be rate limited.
> +	 */
> +	fifo = cpsw_tc_to_fifo(tc, ndev->num_tc);
> +	if (!fifo) {
> +		dev_err(priv->dev, "Last tc%d can't be rate limited", tc);
> +		return -EINVAL;
> +	}
> +
> +	/* do nothing, it's disabled anyway */
> +	if (!qopt->enable && !priv->fifo_bw[fifo])
> +		return 0;
> +
> +	/* shapers can be set if link speed is known */
> +	slave = &cpsw->slaves[cpsw_slave_index(cpsw, priv)];
> +	if (slave->phy && slave->phy->link) {
> +		if (priv->shp_cfg_speed &&
> +		    priv->shp_cfg_speed != slave->phy->speed)
> +			prev_speed = priv->shp_cfg_speed;
> +
> +		priv->shp_cfg_speed = slave->phy->speed;
> +	}
> +
> +	if (!priv->shp_cfg_speed) {
> +		dev_err(priv->dev, "Link speed is not known");
> +		return -1;
> +	}
> +
> +	ret = pm_runtime_get_sync(cpsw->dev);
> +	if (ret < 0) {
> +		pm_runtime_put_noidle(cpsw->dev);
> +		return ret;
> +	}
> +
> +	bw = qopt->enable ? qopt->idleslope : 0;
> +	ret = cpsw_set_fifo_rlimit(priv, fifo, bw);
> +	if (ret) {
> +		priv->shp_cfg_speed = prev_speed;
> +		prev_speed = 0;
> +	}
> +
> +	if (bw && prev_speed)
> +		dev_warn(priv->dev,
> +			 "Speed was changed, CBS sahper speeds are changed!");
same c/p typo
> +
> +	pm_runtime_put_sync(cpsw->dev);
> +	return ret;
> +}
> +
>  static int cpsw_ndo_open(struct net_device *ndev)
>  {
>  	struct cpsw_priv *priv = netdev_priv(ndev);
> @@ -2263,6 +2481,9 @@ static int cpsw_ndo_setup_tc(struct net_device *ndev, enum tc_setup_type type,
>  			     void *type_data)
>  {
>  	switch (type) {
> +	case TC_SETUP_QDISC_CBS:
> +		return cpsw_set_cbs(ndev, type_data);
> +
>  	case TC_SETUP_QDISC_MQPRIO:
>  		return cpsw_set_mqprio(ndev, type_data);
>  
> -- 
> 2.17.1
> 
Other than that looks good 
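
As a side note on the "real set bandwidth can differ a bit" remark in the commit message, here is a small user-space sketch of the percent arithmetic in cpsw_set_fifo_bw(). It assumes idleslope is given in kbit/s and the link speed in Mbit/s, which matches the bw > speed * 1000 check in the patch; the helper names are made up for illustration, and only the nonzero-bandwidth path is covered.

```c
#include <assert.h>

/* User-space stand-ins for the kernel macros, non-negative operands only. */
#define DIV_ROUND_UP(n, d)       (((n) + (d) - 1) / (d))
#define DIV_ROUND_CLOSEST(n, d)  (((n) + (d) / 2) / (d))

/* Hypothetical helper: convert a requested bandwidth into the SEND_PERCENT
 * field value, rounding up and never returning less than 1 percent, as the
 * patch does for the FIFO being configured.
 */
static unsigned int bw_to_send_pct(unsigned int bw_kbps, unsigned int speed_mbps)
{
	unsigned int pct;

	assert(bw_kbps > 0 && speed_mbps > 0);

	pct = DIV_ROUND_UP(bw_kbps, speed_mbps * 10);
	return pct ? pct : 1;
}

/* Hypothetical helper: bandwidth in Mbit/s that a percent value stands for,
 * mirroring the value printed by the "set FIFO%d bw" message.
 */
static unsigned int send_pct_to_mbps(unsigned int pct, unsigned int speed_mbps)
{
	return DIV_ROUND_CLOSEST(pct * speed_mbps, 100);
}
```

On a 1000 Mbit/s link, idleslope 40000 maps to 4 percent and idleslope 20000 to 2 percent, while a request such as 15000 kbit/s is rounded up to 2 percent, i.e. 20 Mbit/s. That rounding is where the granularity comes from.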

^ permalink raw reply

* Re: [PATCH v2] net: ethernet: stmmac: dwmac-rk: Add GMAC support for PX30
From: David Wu @ 2018-06-14  8:14 UTC (permalink / raw)
  To: Heiko Stübner
  Cc: davem, robh+dt, mark.rutland, huangtao, netdev, linux-arm-kernel,
	linux-rockchip, linux-kernel
In-Reply-To: <1961033.25ax7s0Z5i@diego>

Hi Heiko,

On 2018-06-14 15:54, Heiko Stübner wrote:
> I don't see that new clock documented in the dt-binding.
> Also, which clock from the clock-controller does this connect to?

The clock is "SCLK_GMAC_RMII" on the clock controller; its rate can be
set according to the link speed.

^ permalink raw reply

* Re: [BUG] net: stmmac: socfpga ethernet no longer working on linux-next
From: Jose Abreu @ 2018-06-14  8:18 UTC (permalink / raw)
  To: Dinh Nguyen, netdev
  Cc: David Miller, clabbe, Jose.Abreu, Dinh Nguyen, Marek Vasut
In-Reply-To: <63bfbeb4-ca90-cc8e-3b06-ea257a34a261@synopsys.com>

On 14-06-2018 08:38, Jose Abreu wrote:
> Hello,
>
> On 13-06-2018 21:46, Dinh Nguyen wrote:
>> Hi,
>>
>> The stmmac ethernet has stopped working in linux-next and linus/master
>> branch(v4.17-11782-gbe779f03d563)
>>
>> It appears that the stmmac ethernet has stopped working after these 2 commits:
>>
>> 4dbbe8dde848 net: stmmac: Add support for U32 TC filter using Flexible RX Parser
>> 5f0456b43140 net: stmmac: Implement logic to automatically select HW Interface
>>
>> If I move to this commit "565020aaeebf net: stmmac: Disable ACS
>> Feature for GMAC >= 4", then the stmmac works again on SoCFPGA.
>>
>> I was following this thread:
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.spinics.net_lists_netdev_msg502858.html&d=DwIBaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=yaVFU4TjGY0gVF8El1uKcisy6TPsyCl9uN7Wsis-qhY&m=fvPkLp2xlWolmIYwoFLmALhxlycg1w0UmxiYdT7qojc&s=aC4a2U3X_siDxSNz3c5OeadhEJWll31yP-oi5nNar94&e=
>>
>> Was wondering if there was a patch to fix dwmac-sun8i that the socfpga
>> platform needs as well?
> Probably. I will check and get back to you ASAP.

This seems to be a different problem. Can you send me your dmesg
log and DT bindings you are using?

>
> Thanks and Best Regards,
> Jose Miguel Abreu
>
>> Thanks,
>> Dinh

^ permalink raw reply

* Re: [PATCH v2] net: ethernet: stmmac: dwmac-rk: Add GMAC support for PX30
From: Heiko Stübner @ 2018-06-14  8:30 UTC (permalink / raw)
  To: David Wu
  Cc: davem, robh+dt, mark.rutland, huangtao, netdev, linux-arm-kernel,
	linux-rockchip, linux-kernel
In-Reply-To: <3aa2445f-ab2a-93b6-3a49-36be6c98d327@rock-chips.com>

On Thursday, 14 June 2018, 10:14:31 CEST, David Wu wrote:
> Hi Heiko,
> 
> On 2018-06-14 15:54, Heiko Stübner wrote:
> > I don't see that new clock documented in the dt-binding.
> > Also, which clock from the clock-controller does this connect to?
> 
> The clock is "SCLK_GMAC_RMII" on the clock controller; its rate can be
> set according to the link speed.

Hmm, while this huge number of clocks is somewhat strange,
shouldn't it be named something with _rmii instead of _speed then?

Also, I don't see any clk_enable call for that new clock, so it could
end up being off?

And someone could convert the driver to use the new clk-bulk APIs [0],
so the large number of clk_prepare_enable calls would be a bit
trimmed down.


Heiko

[0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/clk/clk-bulk.c

^ permalink raw reply

* Re: BUG: 4.14.11 unable to handle kernel NULL pointer dereference in xfrm_lookup
From: Kristian Evensen @ 2018-06-14  8:38 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Tobias Hommel, Markus Berner, Network Development,
	Florian Westphal
In-Reply-To: <CAKfDRXjFC4L7Rmv_-nUbuOLLstvid64JxF9ECOL4Dbzn3FZwLA@mail.gmail.com>

Hello,

On Tue, Jun 12, 2018 at 10:29 AM, Kristian Evensen
<kristian.evensen@gmail.com> wrote:
> Thanks for spending time on this. I will see what I can manage in
> terms of a bisect. Our last good kernel was 4.9, so at least it
> narrows the scope down a bit compared to 4.4 or 4.1.

I hope we might have got somewhere. While looking more into ipsec and
4.14, we noticed large performance regressions (-~20%) on some
low-powered devices we are also using. We quickly identified the
removal of the flow cache as the "culprit", and the performance
regression is discussed in the netdev-thread for the removal of the
cache ("xfrm: remove flow cache"). For the time being and in order to
restore the performance, we have reverted the patch series removing
the flow cache. When running our tests (on the APU) after the revert,
we no longer see the crash. Before the revert, the APU would always
crash within some hours. After the revert, our tests have been running
for 24 hours+. Our test is quite basic: we establish 1, 2, 3, ..., 50
tunnels and then run iperf on all tunnels in parallel. The tunnels are
torn down between iterations.

We are still running the test and will keep doing so, but I thought I
should share this finding in case it can help in fixing the error. I
will report back in case we find out something more, and please let me
know if you have any suggestions for things I can test. I don't, for
example, know if it is safe to revert the flow cache commits one by
one, to try to pin down the crash further.

BR,
Kristian

^ permalink raw reply

* Re: [RFC PATCH RESEND] tcp: avoid F-RTO if SACK and timestamps are disabled
From: Ilpo Järvinen @ 2018-06-14  8:42 UTC (permalink / raw)
  To: Yuchung Cheng; +Cc: Michal Kubecek, netdev, Eric Dumazet, LKML
In-Reply-To: <CAK6E8=eCOLU9AX0+bSrOg_UYBm1mFxrGT=ybksba9B0OUfp7jg@mail.gmail.com>

On Wed, 13 Jun 2018, Yuchung Cheng wrote:

> On Wed, Jun 13, 2018 at 9:55 AM, Michal Kubecek <mkubecek@suse.cz> wrote:
> >
> > When F-RTO algorithm (RFC 5682) is used on connection without both SACK and
> > timestamps (either because of (mis)configuration or because the other
> > endpoint does not advertise them), specific pattern loss can make RTO grow
> > exponentially until the sender is only able to send one packet per two
> > minutes (TCP_RTO_MAX).
> >
> > One way to reproduce is to
> >
> >   - make sure the connection uses neither SACK nor timestamps
> >   - let tp->reorder grow enough so that lost packets are retransmitted
> >     after RTO (rather than when high_seq - snd_una > reorder * MSS)
> >   - let the data flow stabilize
> >   - drop multiple sender packets in "every second" pattern

Hmm? What is deterministically dropping every second packet for a 
particular flow that has RTOs in between?

Years back I was privately contacted by somebody from a middlebox vendor 
for a case with very similar exponentially growing RTO due to the FRTO 
heuristic. It turned out that they didn't want to send dupacks for 
out-of-order packets because they wanted to keep the TCP side of their 
deep packet inspection middlebox primitive. He claimed that the middlebox 
doesn't need to send dupacks because there could be a TCP 
implementation that doesn't send them either (not that he had anything 
to point to besides their middlebox ;-)), which according to him was 
not required because of his interpretation of RFC793 (IIRC). ...Nevermind 
anything that has occurred since that era.

...Back then, I also envisioned in that mail exchange with him that a 
middlebox could break FRTO by always forcing a drop on the key packet
FRTO depends on. Ironically, that is exactly what is required to trigger 
this issue? Sure, every heuristic can be fooled if a deterministic (or
crafted) pattern is introduced to defeat that particular heuristic. ...But 
I'd prefer that networks "dropping every second packet" of a flow be 
fixed rather than FRTO?

In addition, one could even argue that the sender is sending the whole 
time at a lower and lower rate (given the exponentially increasing RTO) 
and still gets losses, so that a further rate reduction would be the 
correct action. ...But take this intuitive reasoning with a grain of 
salt (that is, I can see reasons myself to disagree with it :-)).
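
To put numbers on the "exponentially increasing RTO", here is a deliberately simplified user-space sketch of the doubling. It assumes that every timeout doubles the RTO, that no RTT sample ever resets it, and that it is clamped at the two-minute TCP_RTO_MAX mentioned in the patch description. The starting RTO and the helper names are assumptions for illustration; the real stack tracks this through a backoff counter rather than a single value.

```c
#include <assert.h>

/* Two minutes in milliseconds, the TCP_RTO_MAX figure quoted in the thread. */
#define RTO_MAX_MS 120000u

/* Hypothetical helper: one backoff step, doubling with a clamp. */
static unsigned int rto_backoff(unsigned int rto_ms)
{
	unsigned int doubled = rto_ms * 2;

	return doubled > RTO_MAX_MS ? RTO_MAX_MS : doubled;
}

/* Hypothetical helper: number of consecutive timeouts, with no usable RTT
 * sample in between, before the RTO reaches the clamp.
 */
static unsigned int timeouts_until_max(unsigned int initial_rto_ms)
{
	unsigned int rto = initial_rto_ms;
	unsigned int n = 0;

	assert(initial_rto_ms > 0 && initial_rto_ms <= RTO_MAX_MS);

	while (rto < RTO_MAX_MS) {
		rto = rto_backoff(rto);
		n++;
	}

	return n;
}
```

Starting from an assumed 200 ms RTO, ten consecutive timeouts are enough to reach the two-minute cap; starting from 1 s it takes seven.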

> >   - either there is no new data to send or acks received in response to new
> >     data are also window updates (i.e. not dupacks by definition)

Can you explain what exactly you mean by this "no new data to send" 
condition here, as F-RTO is/should not be used if there's no new data to 
send?!?

...Or, why is the receiver going against SHOULD in RFC5681:
   "A TCP receiver SHOULD send an immediate duplicate ACK when an out-
   of-order segment arrives."
? ...And yes, I know there's this very issue with window updates masking 
duplicate ACKs in Linux TCP receiver but I was met with some skepticism 
on whether fixing it is worth it or not.

> > In this scenario, the sender keeps cycling between retransmitting first
> > lost packet (step 1 of RFC 5682), sending new data by (2b) and timing out
> > again. In this loop, the sender only gets
> >
> >   (a) acks for retransmitted segments (possibly together with old ones)
> >   (b) window updates
> >
> > Without timestamps, neither can be used for RTT estimator and without SACK,
> > we have no newly sacked segments to estimate RTT either. Therefore each
> > timeout doubles RTO and without usable RTT samples so that there is nothing
> > to counter the exponential growth.
> >
> > While disabling both SACK and timestamps doesn't make any sense, the
> > resulting behaviour is so pathological that it deserves an improvement.
> > (Also, both can be disabled on the other side.) Avoid F-RTO algorithm in
> > case both SACK and timestamps are disabled so that the sender falls back to
> > traditional slow start retransmission.
> >
> > Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
> Acked-by: Yuchung Cheng <ycheng@google.com>
> 
> Thanks for the patch (and packedrill test)! I would encourage
> submitting an errata to F-RTO RFC about this case.

Unless there's a convincing explanation of how such a drop pattern would 
occur in the real world except due to serious brokenness/misconfiguration 
on the network side (that should not be there), I'm not that sure it's 
exactly what errata are meant for.


-- 
 i.

^ permalink raw reply

* Re: [RFC PATCH 06/12] xen-blkfront: add callbacks for PM suspend and hibernation
From: Roger Pau Monné @ 2018-06-14  8:43 UTC (permalink / raw)
  To: Anchal Agarwal
  Cc: tglx, mingo, hpa, x86, boris.ostrovsky, konrad.wilk, netdev,
	jgross, xen-devel, linux-kernel, kamatam, eduval, vallish,
	fllinden, guruanb, rjw, pavel, len.brown, linux-pm, cyberax
In-Reply-To: <20180613222048.GB33296@kaos-source-ops-60001.pdx1.amazon.com>

Please try to avoid top posting.

On Wed, Jun 13, 2018 at 10:20:48PM +0000, Anchal Agarwal wrote:
> Hi Roger,
> To answer your question, due to the lack of mentioned commit
> (commit 12ea729645ac ("xen/blkback: unmap all persistent grants when
> frontend gets disconnected") in the older dom0 kernels(<3.2),resume from

This fix that you mention is only present in kernels >= 3.18 AFAICT,
and persistent grants were introduced in 3.8 (0a8704a51f38), so
anything < 3.8 should work fine. Not sure why you mention 3.2 here.

> hibernation can fail on guest side. In the absence of the commit,
> Persistant Grants are not unmapped immediately when frontend is 
> disconnected from backend and hence leave the block device in an 
> inconsistent state. To avoid this unstability and work with larger set 
> of kernel versions, this approach had been used. Once you don't have 
> any pending req/resp it is safer for guest to resume from hibernation.

I think the fix should be backported (if it hasn't been done yet) to
kernels between 3.8 and 3.18. I don't like to add all this code just
to work around a Linux backend kernel bug.

AFAICT if persistent grants work as expected you could use almost the
same path that's used for migration, greatly reducing the amount of
code that you need to add.

Thanks, Roger.

^ permalink raw reply

* Re: [PATCH bpf v2] xdp: Fix handling of devmap in generic XDP
From: Jesper Dangaard Brouer @ 2018-06-14  8:49 UTC (permalink / raw)
  To: Toshiaki Makita; +Cc: Alexei Starovoitov, Daniel Borkmann, netdev, brouer
In-Reply-To: <1528942062-2353-1-git-send-email-makita.toshiaki@lab.ntt.co.jp>

On Thu, 14 Jun 2018 11:07:42 +0900
Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:

> Commit 67f29e07e131 ("bpf: devmap introduce dev_map_enqueue") changed
> the return value type of __devmap_lookup_elem() from struct net_device *
> to struct bpf_dtab_netdev * but forgot to modify generic XDP code
> accordingly.
> Thus generic XDP incorrectly used struct bpf_dtab_netdev where struct
> net_device is expected, then skb->dev was set to invalid value.
> 
> v2:
> - Fix compiler warning without CONFIG_BPF_SYSCALL.
> 
> Fixes: 67f29e07e131 ("bpf: devmap introduce dev_map_enqueue")
> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>

Thanks for catching this!

Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>

Note that the current code works (and does not crash), but only by
pure luck, because struct bpf_dtab_netdev happens to have the
net_device as the first member.

struct bpf_dtab_netdev {
	struct net_device *dev; /* must be first member, due to tracepoint */
	struct bpf_dtab *dtab;
	unsigned int bit;
	struct xdp_bulk_queue __percpu *bulkq;
	struct rcu_head rcu;
};

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [PATCH bpf v2] xdp: Fix handling of devmap in generic XDP
From: Toshiaki Makita @ 2018-06-14  9:00 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: Alexei Starovoitov, Daniel Borkmann, netdev
In-Reply-To: <20180614104959.4e4e57b8@redhat.com>

On 2018/06/14 17:49, Jesper Dangaard Brouer wrote:
> On Thu, 14 Jun 2018 11:07:42 +0900
> Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:
> 
>> Commit 67f29e07e131 ("bpf: devmap introduce dev_map_enqueue") changed
>> the return value type of __devmap_lookup_elem() from struct net_device *
>> to struct bpf_dtab_netdev * but forgot to modify generic XDP code
>> accordingly.
>> Thus generic XDP incorrectly used struct bpf_dtab_netdev where struct
>> net_device is expected, then skb->dev was set to invalid value.
>>
>> v2:
>> - Fix compiler warning without CONFIG_BPF_SYSCALL.
>>
>> Fixes: 67f29e07e131 ("bpf: devmap introduce dev_map_enqueue")
>> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
> 
> Thanks for catching this!
> 
> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
> 
> Note that the current code works (and does not crash), but only by
> pure luck, because struct bpf_dtab_netdev happens to have the
> net_device pointer as its first member.
> 
> struct bpf_dtab_netdev {
> 	struct net_device *dev; /* must be first member, due to tracepoint */
> 	struct bpf_dtab *dtab;
> 	unsigned int bit;
> 	struct xdp_bulk_queue __percpu *bulkq;
> 	struct rcu_head rcu;
> };
> 

Actually no, the current code does not work and can crash, because we
need to dereference the pointer, i.e. we need fwd->dev (in other words
*fwd), not fwd itself.
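
To illustrate the difference, here is a minimal userspace sketch. The
struct definitions are simplified stand-ins, not the real kernel types,
and lookup_buggy(), lookup_fixed() and demo() are hypothetical names used
only for this example. Because the first member is a pointer to a
net_device rather than an embedded net_device, casting the map entry
yields the address of the entry, not the address of the device.

```c
#include <assert.h>

/* Simplified stand-ins for the kernel types (assumption: not the real layout). */
struct net_device {
	int ifindex;
};

struct bpf_dtab_netdev {
	struct net_device *dev;	/* first member is a pointer, not an embedded struct */
	unsigned int bit;
};

/* Buggy pattern: treats the map entry itself as if it were the device. */
static struct net_device *lookup_buggy(struct bpf_dtab_netdev *fwd)
{
	return (struct net_device *)fwd;
}

/* Fixed pattern: follows the pointer stored in the map entry. */
static struct net_device *lookup_fixed(struct bpf_dtab_netdev *fwd)
{
	return fwd->dev;
}

/* Small self-check showing which object each variant points at. */
void demo(void)
{
	struct net_device eth = { .ifindex = 7 };
	struct bpf_dtab_netdev entry = { .dev = &eth, .bit = 0 };

	/* The cast returns the address of the entry, which is not a net_device. */
	assert((void *)lookup_buggy(&entry) == (void *)&entry);
	assert((void *)lookup_buggy(&entry) != (void *)&eth);

	/* Dereferencing the entry returns the real device. */
	assert(lookup_fixed(&entry) == &eth);
	assert(lookup_fixed(&entry)->ifindex == 7);
}
```

In this sketch, any field access through the pointer returned by
lookup_buggy() would read the bytes of the entry as if they were a
net_device, which is why the result cannot be used safely as a device
pointer.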

-- 
Toshiaki Makita


* RE: [Intel-wired-lan] [PATCH net-queue] i40e: Fix incorrect skb reserved size on rx
From: Malek, Patryk @ 2018-06-14  9:14 UTC (permalink / raw)
  To: Toshiaki Makita, Daniel Borkmann, Kirsher, Jeffrey T
  Cc: intel-wired-lan@lists.osuosl.org, netdev@vger.kernel.org
In-Reply-To: <8963a38e-0583-1a3f-bcfe-8a62d5da6dbf@lab.ntt.co.jp>

> On 2018/06/13 18:06, Daniel Borkmann wrote:
> > On 06/13/2018 10:08 AM, Toshiaki Makita wrote:
> >> i40e_build_skb() reserves I40E_SKB_PAD + (xdp->data -
> >> xdp->data_hard_start) but obviously I40E_SKB_PAD is unnecessary here
> >> and mac_header/data feilds in skb becomes incorrect, and breaks

Shouldn't this be fields instead of feilds?


* Re: [Intel-wired-lan] [PATCH net-queue] i40e: Fix incorrect skb reserved size on rx
From: Toshiaki Makita @ 2018-06-14  9:21 UTC (permalink / raw)
  To: Malek, Patryk
  Cc: Daniel Borkmann, Kirsher, Jeffrey T,
	intel-wired-lan@lists.osuosl.org, netdev@vger.kernel.org
In-Reply-To: <FA03331EB45A2544B0CBCB1A14B6429E2B2A3ED7@IRSMSX104.ger.corp.intel.com>

On 2018/06/14 18:14, Malek, Patryk wrote:
>> On 2018/06/13 18:06, Daniel Borkmann wrote:
>>> On 06/13/2018 10:08 AM, Toshiaki Makita wrote:
>>>> i40e_build_skb() reserves I40E_SKB_PAD + (xdp->data -
>>>> xdp->data_hard_start) but obviously I40E_SKB_PAD is unnecessary here
>>>> and mac_header/data feilds in skb becomes incorrect, and breaks
> 
> Shouldn't this be fields instead of feilds?

Thanks, but this patch is now superseded by Daniel's patch, so I think
it has been dropped.
http://patchwork.ozlabs.org/patch/928778/

-- 
Toshiaki Makita


