Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH net-next 1/1] net/smc: parameter cleanup in smc_cdc_get_free_slot()
From: Ursula Braun @ 2017-09-21  7:17 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-rdma, linux-s390, jwi, schwidefsky, heiko.carstens,
	raspl, ubraun

Use the smc_connection as first parameter with smc_cdc_get_free_slot().
This is just a small code cleanup, no functional change.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
---
 net/smc/smc_cdc.c | 7 ++++---
 net/smc/smc_cdc.h | 3 ++-
 net/smc/smc_tx.c  | 6 ++----
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c
index a7294edbc221..5ef97e5a5f78 100644
--- a/net/smc/smc_cdc.c
+++ b/net/smc/smc_cdc.c
@@ -62,10 +62,12 @@ static void smc_cdc_tx_handler(struct smc_wr_tx_pend_priv *pnd_snd,
 	bh_unlock_sock(&smc->sk);
 }
 
-int smc_cdc_get_free_slot(struct smc_link *link,
+int smc_cdc_get_free_slot(struct smc_connection *conn,
 			  struct smc_wr_buf **wr_buf,
 			  struct smc_cdc_tx_pend **pend)
 {
+	struct smc_link *link = &conn->lgr->lnk[SMC_SINGLE_LINK];
+
 	return smc_wr_tx_get_free_slot(link, smc_cdc_tx_handler, wr_buf,
 				       (struct smc_wr_tx_pend_priv **)pend);
 }
@@ -118,8 +120,7 @@ int smc_cdc_get_slot_and_msg_send(struct smc_connection *conn)
 	struct smc_wr_buf *wr_buf;
 	int rc;
 
-	rc = smc_cdc_get_free_slot(&conn->lgr->lnk[SMC_SINGLE_LINK], &wr_buf,
-				   &pend);
+	rc = smc_cdc_get_free_slot(conn, &wr_buf, &pend);
 	if (rc)
 		return rc;
 
diff --git a/net/smc/smc_cdc.h b/net/smc/smc_cdc.h
index 8e1d76f26007..56f883d1159c 100644
--- a/net/smc/smc_cdc.h
+++ b/net/smc/smc_cdc.h
@@ -206,7 +206,8 @@ static inline void smc_cdc_msg_to_host(struct smc_host_cdc_msg *local,
 
 struct smc_cdc_tx_pend;
 
-int smc_cdc_get_free_slot(struct smc_link *link, struct smc_wr_buf **wr_buf,
+int smc_cdc_get_free_slot(struct smc_connection *conn,
+			  struct smc_wr_buf **wr_buf,
 			  struct smc_cdc_tx_pend **pend);
 void smc_cdc_tx_dismiss_slots(struct smc_connection *conn);
 int smc_cdc_msg_send(struct smc_connection *conn, struct smc_wr_buf *wr_buf,
diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
index 3c656beb8820..e2228f6d1c25 100644
--- a/net/smc/smc_tx.c
+++ b/net/smc/smc_tx.c
@@ -394,8 +394,7 @@ int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
 	int rc;
 
 	spin_lock_bh(&conn->send_lock);
-	rc = smc_cdc_get_free_slot(&conn->lgr->lnk[SMC_SINGLE_LINK], &wr_buf,
-				   &pend);
+	rc = smc_cdc_get_free_slot(conn, &wr_buf, &pend);
 	if (rc < 0) {
 		if (rc == -EBUSY) {
 			struct smc_sock *smc =
@@ -463,8 +462,7 @@ void smc_tx_consumer_update(struct smc_connection *conn)
 	    ((to_confirm > conn->rmbe_update_limit) &&
 	     ((to_confirm > (conn->rmbe_size / 2)) ||
 	      conn->local_rx_ctrl.prod_flags.write_blocked))) {
-		rc = smc_cdc_get_free_slot(&conn->lgr->lnk[SMC_SINGLE_LINK],
-					   &wr_buf, &pend);
+		rc = smc_cdc_get_free_slot(conn, &wr_buf, &pend);
 		if (!rc)
 			rc = smc_cdc_msg_send(conn, wr_buf, pend);
 		if (rc < 0) {
-- 
2.13.5

^ permalink raw reply related

* Re: [PATCHv3 iproute2 1/2] lib/libnetlink: re malloc buff if size is not enough
From: Hangbin Liu @ 2017-09-21  7:20 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Michal Kubecek, Phil Sutter
In-Reply-To: <20170920095605.1ea527fc@xeon-e3>

Hi Stephen,
On Wed, Sep 20, 2017 at 09:56:05AM -0700, Stephen Hemminger wrote:
> > +realloc:
> > +	bufp = realloc(buf, buf_len);
> > +
> > +	if (bufp == NULL) {
> 
> Minor personal style issue:
> To me, blank lines are like paragraphs in writing.
> Code reads better assignment and condition check are next to
> each other.

OK, I will remove the blank lines.
> 
> > +recv:
> > +	len = recvmsg(fd, msg, flag);
> > +
> > +	if (len < 0) {
> > +		if (errno == EINTR || errno == EAGAIN)
> > +			goto recv;
> > +		fprintf(stderr, "netlink receive error %s (%d)\n",
> > +			strerror(errno), errno);
> > +		free(buf);
> > +		return -errno;
> > +	}
> > +
> > +	if (len == 0) {
> > +		fprintf(stderr, "EOF on netlink\n");
> > +		free(buf);
> > +		return -ENODATA;
> > +	}
> > +
> > +	if (len > buf_len) {
> > +		buf_len = len;
> > +		flag = 0;
> > +		goto realloc;
> > +	}
> > +
> > +	if (flag != 0) {
> > +		flag = 0;
> > +		goto recv;
> 
> Although I programmed in BASIC years ago. I never liked code
> with loops via goto. To me it indicates the logic is not well thought
> through.  Not sure exactly how to rearrange the control flow, but it
> should be possible to rewrite this so that it reads cleaner.

Hmm, if we remove goto. Then the logic should look like

	bufp = realloc(buf, buf_len);
	/* check bufp and set msg */

	len = recvmsg(fd, msg, flag);
	/* check len */

	if (len > buf_len) {
		buf_len = len;
		bufp = realloc(buf, buf_len);
		/* check bufp and set msg */

		len = recvmsg(fd, msg, flag);
		/* check len */
	}

	len = recvmsg(fd, msg, flag);
	/* check len */

Or maybe we can set buf_len very small first. Then it will force to realloc at
the second time. And the code would like

	int buf_len = 16;
	bufp = realloc(buf, buf_len);
	/* check bufp and set msg */

	len = recvmsg(fd, msg, flag);
	/* check len */

	buf_len = len;
	bufp = realloc(buf, buf_len);
	/* check bufp and set msg */

	len = recvmsg(fd, msg, flag);
	/* check len */

What do you think?

Thanks
Hangbin

^ permalink raw reply

* [PATCH net-next] cxgb4: avoid stall while shutting down the adapter
From: Ganesh Goudar @ 2017-09-21  7:20 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, indranil, venkatesh, Ganesh Goudar

do not wait for completion while deleting the filters
when the adapter is shutting down because we may not get
the response as interrupts will be disabled.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h        | 1 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c | 7 ++++++-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c   | 4 ++++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index ea72d2d..c4e997f 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -549,6 +549,7 @@ enum {                                 /* adapter flags */
 	MASTER_PF          = (1 << 7),
 	FW_OFLD_CONN       = (1 << 9),
 	ROOT_NO_RELAXED_ORDERING = (1 << 10),
+	SHUTTING_DOWN	   = (1 << 11),
 };
 
 enum {
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
index 45b5853..97ead2c 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
@@ -191,7 +191,8 @@ static int del_filter_wr(struct adapter *adapter, int fidx)
 		return -ENOMEM;
 
 	fwr = __skb_put(skb, len);
-	t4_mk_filtdelwr(f->tid, fwr, adapter->sge.fw_evtq.abs_id);
+	t4_mk_filtdelwr(f->tid, fwr, (adapter->flags & SHUTTING_DOWN) ? -1
+			: adapter->sge.fw_evtq.abs_id);
 
 	/* Mark the filter as "pending" and ship off the Filter Work Request.
 	 * When we get the Work Request Reply we'll clear the pending status.
@@ -636,6 +637,10 @@ int cxgb4_del_filter(struct net_device *dev, int filter_id)
 	struct filter_ctx ctx;
 	int ret;
 
+	/* If we are shutting down the adapter do not wait for completion */
+	if (netdev2adap(dev)->flags & SHUTTING_DOWN)
+		return __cxgb4_del_filter(dev, filter_id, NULL);
+
 	init_completion(&ctx.completion);
 
 	ret = __cxgb4_del_filter(dev, filter_id, &ctx);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 92d9d79..5fe81a4 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -5254,6 +5254,8 @@ static void remove_one(struct pci_dev *pdev)
 		return;
 	}
 
+	adapter->flags |= SHUTTING_DOWN;
+
 	if (adapter->pf == 4) {
 		int i;
 
@@ -5339,6 +5341,8 @@ static void shutdown_one(struct pci_dev *pdev)
 		return;
 	}
 
+	adapter->flags |= SHUTTING_DOWN;
+
 	if (adapter->pf == 4) {
 		int i;
 
-- 
2.1.0

^ permalink raw reply related

* Re: [PATCHv3 iproute2 1/2] lib/libnetlink: re malloc buff if size is not enough
From: Michal Kubecek @ 2017-09-21  7:34 UTC (permalink / raw)
  To: Hangbin Liu; +Cc: Stephen Hemminger, netdev, Phil Sutter
In-Reply-To: <20170921072002.GM5465@leo.usersys.redhat.com>

On Thu, Sep 21, 2017 at 03:20:02PM +0800, Hangbin Liu wrote:
> 
> Or maybe we can set buf_len very small first. Then it will force to realloc at
> the second time. And the code would like
> 
> 	int buf_len = 16;
> 	bufp = realloc(buf, buf_len);
> 	/* check bufp and set msg */
> 
> 	len = recvmsg(fd, msg, flag);
> 	/* check len */
> 
> 	buf_len = len;
> 	bufp = realloc(buf, buf_len);
> 	/* check bufp and set msg */
> 
> 	len = recvmsg(fd, msg, flag);
> 	/* check len */
> 
> What do you think?

I will have to check but IIRC it might be possible to use zero length
for the peek to only check the length which could help you to avoid both
the reallocation and copying the same data from kernel to userspace
twice.

Michal Kubecek

^ permalink raw reply

* [PATCH net-next 0/4] cxgb4: add support to offload tc flower
From: Rahul Lakkireddy @ 2017-09-21  7:33 UTC (permalink / raw)
  To: netdev; +Cc: davem, kumaras, ganeshgr, nirranjan, indranil, Rahul Lakkireddy

This series of patches add support to offload tc flower onto Chelsio
NICs.

Patch 1 adds basic skeleton to prepare for offloading tc flower flows.

Patch 2 adds support to add/remove flows for offload.  Flows can have
accompanying masks.  Following match and action are currently supported
for offload:
Match:  ether-protocol, IPv4/IPv6 addresses, L4 ports (TCP/UDP)
Action: drop, redirect to another port on the device.

Patch 3 adds support to offload tc-flower flows having
vlan actions: pop, push, and modify.

Patch 4 adds support to fetch stats for the offloaded tc flower flows
from hardware.

Support for offloading more match and action types are to be followed
in subsequent series.

Thanks,
Rahul

Kumar Sanghvi (4):
  cxgb4: add tc flower offload skeleton
  cxgb4: add basic tc flower offload support
  cxgb4: add support to offload action vlan
  cxgb4: fetch stats for offloaded tc flower flows

 drivers/net/ethernet/chelsio/cxgb4/Makefile        |   4 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h         |   4 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c  | 102 +++++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c    |  25 ++
 .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c   | 458 +++++++++++++++++++++
 .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h   |  66 +++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h     |   3 +
 7 files changed, 661 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h

-- 
2.14.1

^ permalink raw reply

* [PATCH net-next 1/4] cxgb4: add tc flower offload skeleton
From: Rahul Lakkireddy @ 2017-09-21  7:33 UTC (permalink / raw)
  To: netdev; +Cc: davem, kumaras, ganeshgr, nirranjan, indranil, Rahul Lakkireddy
In-Reply-To: <cover.1505977744.git.rahul.lakkireddy@chelsio.com>

From: Kumar Sanghvi <kumaras@chelsio.com>

Add basic skeleton to prepare for offloading tc-flower flows.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/Makefile        |  4 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c    | 22 +++++++++
 .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c   | 57 ++++++++++++++++++++++
 .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h   | 46 +++++++++++++++++
 4 files changed, 128 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h

diff --git a/drivers/net/ethernet/chelsio/cxgb4/Makefile b/drivers/net/ethernet/chelsio/cxgb4/Makefile
index 817212702f0a..fecd7aab673b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/Makefile
+++ b/drivers/net/ethernet/chelsio/cxgb4/Makefile
@@ -4,7 +4,9 @@
 
 obj-$(CONFIG_CHELSIO_T4) += cxgb4.o
 
-cxgb4-objs := cxgb4_main.o l2t.o t4_hw.o sge.o clip_tbl.o cxgb4_ethtool.o cxgb4_uld.o sched.o cxgb4_filter.o cxgb4_tc_u32.o cxgb4_ptp.o
+cxgb4-objs := cxgb4_main.o l2t.o t4_hw.o sge.o clip_tbl.o cxgb4_ethtool.o \
+	      cxgb4_uld.o sched.o cxgb4_filter.o cxgb4_tc_u32.o \
+	      cxgb4_ptp.o cxgb4_tc_flower.o
 cxgb4-$(CONFIG_CHELSIO_T4_DCB) +=  cxgb4_dcb.o
 cxgb4-$(CONFIG_CHELSIO_T4_FCOE) +=  cxgb4_fcoe.o
 cxgb4-$(CONFIG_DEBUG_FS) += cxgb4_debugfs.o
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 92d9d795d874..8923affbdaf8 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -79,6 +79,7 @@
 #include "l2t.h"
 #include "sched.h"
 #include "cxgb4_tc_u32.h"
+#include "cxgb4_tc_flower.h"
 #include "cxgb4_ptp.h"
 
 char cxgb4_driver_name[] = KBUILD_MODNAME;
@@ -2873,6 +2874,25 @@ static int cxgb_set_tx_maxrate(struct net_device *dev, int index, u32 rate)
 	return err;
 }
 
+static int cxgb_setup_tc_flower(struct net_device *dev,
+				struct tc_cls_flower_offload *cls_flower)
+{
+	if (!is_classid_clsact_ingress(cls_flower->common.classid) ||
+	    cls_flower->common.chain_index)
+		return -EOPNOTSUPP;
+
+	switch (cls_flower->command) {
+	case TC_CLSFLOWER_REPLACE:
+		return cxgb4_tc_flower_replace(dev, cls_flower);
+	case TC_CLSFLOWER_DESTROY:
+		return cxgb4_tc_flower_destroy(dev, cls_flower);
+	case TC_CLSFLOWER_STATS:
+		return cxgb4_tc_flower_stats(dev, cls_flower);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
 static int cxgb_setup_tc_cls_u32(struct net_device *dev,
 				 struct tc_cls_u32_offload *cls_u32)
 {
@@ -2907,6 +2927,8 @@ static int cxgb_setup_tc(struct net_device *dev, enum tc_setup_type type,
 	switch (type) {
 	case TC_SETUP_CLSU32:
 		return cxgb_setup_tc_cls_u32(dev, type_data);
+	case TC_SETUP_CLSFLOWER:
+		return cxgb_setup_tc_flower(dev, type_data);
 	default:
 		return -EOPNOTSUPP;
 	}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
new file mode 100644
index 000000000000..16dff71e4d02
--- /dev/null
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
@@ -0,0 +1,57 @@
+/*
+ * This file is part of the Chelsio T4/T5/T6 Ethernet driver for Linux.
+ *
+ * Copyright (c) 2017 Chelsio Communications, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <net/tc_act/tc_gact.h>
+#include <net/tc_act/tc_mirred.h>
+
+#include "cxgb4.h"
+#include "cxgb4_tc_flower.h"
+
+int cxgb4_tc_flower_replace(struct net_device *dev,
+			    struct tc_cls_flower_offload *cls)
+{
+	return -EOPNOTSUPP;
+}
+
+int cxgb4_tc_flower_destroy(struct net_device *dev,
+			    struct tc_cls_flower_offload *cls)
+{
+	return -EOPNOTSUPP;
+}
+
+int cxgb4_tc_flower_stats(struct net_device *dev,
+			  struct tc_cls_flower_offload *cls)
+{
+	return -EOPNOTSUPP;
+}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
new file mode 100644
index 000000000000..b321fc205b5a
--- /dev/null
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
@@ -0,0 +1,46 @@
+/*
+ * This file is part of the Chelsio T4/T5/T6 Ethernet driver for Linux.
+ *
+ * Copyright (c) 2017 Chelsio Communications, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __CXGB4_TC_FLOWER_H
+#define __CXGB4_TC_FLOWER_H
+
+#include <net/pkt_cls.h>
+
+int cxgb4_tc_flower_replace(struct net_device *dev,
+			    struct tc_cls_flower_offload *cls);
+int cxgb4_tc_flower_destroy(struct net_device *dev,
+			    struct tc_cls_flower_offload *cls);
+int cxgb4_tc_flower_stats(struct net_device *dev,
+			  struct tc_cls_flower_offload *cls);
+#endif /* __CXGB4_TC_FLOWER_H */
-- 
2.14.1

^ permalink raw reply related

* [PATCH net-next 3/4] cxgb4: add support to offload action vlan
From: Rahul Lakkireddy @ 2017-09-21  7:33 UTC (permalink / raw)
  To: netdev; +Cc: davem, kumaras, ganeshgr, nirranjan, indranil, Rahul Lakkireddy
In-Reply-To: <cover.1505977744.git.rahul.lakkireddy@chelsio.com>

From: Kumar Sanghvi <kumaras@chelsio.com>

Add support for offloading tc-flower flows having
vlan actions: pop, push and modify.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
 .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c   | 43 ++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
index 1af01101faaf..fddb0c419edc 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
@@ -34,6 +34,7 @@
 
 #include <net/tc_act/tc_gact.h>
 #include <net/tc_act/tc_mirred.h>
+#include <net/tc_act/tc_vlan.h>
 
 #include "cxgb4.h"
 #include "cxgb4_tc_flower.h"
@@ -185,6 +186,27 @@ static void cxgb4_process_flow_actions(struct net_device *in,
 
 			fs->action = FILTER_SWITCH;
 			fs->eport = pi->port_id;
+		} else if (is_tcf_vlan(a)) {
+			u32 vlan_action = tcf_vlan_action(a);
+			u8 prio = tcf_vlan_push_prio(a);
+			u16 vid = tcf_vlan_push_vid(a);
+			u16 vlan_tci = (prio << VLAN_PRIO_SHIFT) | vid;
+
+			switch (vlan_action) {
+			case TCA_VLAN_ACT_POP:
+				fs->newvlan |= VLAN_REMOVE;
+				break;
+			case TCA_VLAN_ACT_PUSH:
+				fs->newvlan |= VLAN_INSERT;
+				fs->vlan = vlan_tci;
+				break;
+			case TCA_VLAN_ACT_MODIFY:
+				fs->newvlan |= VLAN_REWRITE;
+				fs->vlan = vlan_tci;
+				break;
+			default:
+				break;
+			}
 		}
 	}
 }
@@ -222,6 +244,27 @@ static int cxgb4_validate_flow_actions(struct net_device *dev,
 					   __func__);
 				return -EINVAL;
 			}
+		} else if (is_tcf_vlan(a)) {
+			u16 proto = be16_to_cpu(tcf_vlan_push_proto(a));
+			u32 vlan_action = tcf_vlan_action(a);
+
+			switch (vlan_action) {
+			case TCA_VLAN_ACT_POP:
+				break;
+			case TCA_VLAN_ACT_PUSH:
+			case TCA_VLAN_ACT_MODIFY:
+				if (proto != ETH_P_8021Q) {
+					netdev_err(dev,
+						   "%s: Unsupp. vlan proto\n",
+						   __func__);
+					return -EOPNOTSUPP;
+				}
+				break;
+			default:
+				netdev_err(dev, "%s: Unsupported vlan action\n",
+					   __func__);
+				return -EOPNOTSUPP;
+			}
 		} else {
 			netdev_err(dev, "%s: Unsupported action\n", __func__);
 			return -EOPNOTSUPP;
-- 
2.14.1

^ permalink raw reply related

* [PATCH net-next 2/4] cxgb4: add basic tc flower offload support
From: Rahul Lakkireddy @ 2017-09-21  7:33 UTC (permalink / raw)
  To: netdev; +Cc: davem, kumaras, ganeshgr, nirranjan, indranil, Rahul Lakkireddy
In-Reply-To: <cover.1505977744.git.rahul.lakkireddy@chelsio.com>

From: Kumar Sanghvi <kumaras@chelsio.com>

Add support to add/remove flows for offload.  Following match
and action are supported for offloading a flow:

Match: ether-protocol, IPv4/IPv6 addresses, L4 ports (TCP/UDP)
Action: drop, redirect to another port on the device.

The qualifying flows can have accompanying mask information.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h         |   3 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c  |  26 ++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c    |   2 +
 .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c   | 285 ++++++++++++++++++++-
 .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h   |  17 ++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h     |   1 +
 6 files changed, 332 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index ea72d2d2e1b4..26eac599ab2c 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -904,6 +904,9 @@ struct adapter {
 	/* TC u32 offload */
 	struct cxgb4_tc_u32_table *tc_u32;
 	struct chcr_stats_debug chcr_stats;
+
+	/* TC flower offload */
+	DECLARE_HASHTABLE(flower_anymatch_tbl, 9);
 };
 
 /* Support for "sched-class" command to allow a TX Scheduling Class to be
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
index 45b5853ca2f1..07a4619e2164 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
@@ -148,6 +148,32 @@ static int get_filter_steerq(struct net_device *dev,
 	return iq;
 }
 
+int cxgb4_get_free_ftid(struct net_device *dev, int family)
+{
+	struct adapter *adap = netdev2adap(dev);
+	struct tid_info *t = &adap->tids;
+	int ftid;
+
+	spin_lock_bh(&t->ftid_lock);
+	if (family == PF_INET) {
+		ftid = find_first_zero_bit(t->ftid_bmap, t->nftids);
+		if (ftid >= t->nftids)
+			ftid = -1;
+	} else {
+		ftid = bitmap_find_free_region(t->ftid_bmap, t->nftids, 2);
+		if (ftid < 0) {
+			ftid = -1;
+			goto out_unlock;
+		}
+
+		/* this is only a lookup, keep the found region unallocated */
+		bitmap_release_region(t->ftid_bmap, ftid, 2);
+	}
+out_unlock:
+	spin_unlock_bh(&t->ftid_lock);
+	return ftid;
+}
+
 static int cxgb4_set_ftid(struct tid_info *t, int fidx, int family)
 {
 	spin_lock_bh(&t->ftid_lock);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 8923affbdaf8..3ba4e1ff8486 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -5105,6 +5105,8 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		if (!adapter->tc_u32)
 			dev_warn(&pdev->dev,
 				 "could not offload tc u32, continuing\n");
+
+		cxgb4_init_tc_flower(adapter);
 	}
 
 	if (is_offload(adapter)) {
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
index 16dff71e4d02..1af01101faaf 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
@@ -38,16 +38,292 @@
 #include "cxgb4.h"
 #include "cxgb4_tc_flower.h"
 
+static struct ch_tc_flower_entry *allocate_flower_entry(void)
+{
+	struct ch_tc_flower_entry *new = kzalloc(sizeof(*new), GFP_KERNEL);
+	return new;
+}
+
+/* Must be called with either RTNL or rcu_read_lock */
+static struct ch_tc_flower_entry *ch_flower_lookup(struct adapter *adap,
+						   unsigned long flower_cookie)
+{
+	struct ch_tc_flower_entry *flower_entry;
+
+	hash_for_each_possible_rcu(adap->flower_anymatch_tbl, flower_entry,
+				   link, flower_cookie)
+		if (flower_entry->tc_flower_cookie == flower_cookie)
+			return flower_entry;
+	return NULL;
+}
+
+static void cxgb4_process_flow_match(struct net_device *dev,
+				     struct tc_cls_flower_offload *cls,
+				     struct ch_filter_specification *fs)
+{
+	u16 addr_type = 0;
+
+	if (dissector_uses_key(cls->dissector, FLOW_DISSECTOR_KEY_CONTROL)) {
+		struct flow_dissector_key_control *key =
+			skb_flow_dissector_target(cls->dissector,
+						  FLOW_DISSECTOR_KEY_CONTROL,
+						  cls->key);
+
+		addr_type = key->addr_type;
+	}
+
+	if (dissector_uses_key(cls->dissector, FLOW_DISSECTOR_KEY_BASIC)) {
+		struct flow_dissector_key_basic *key =
+			skb_flow_dissector_target(cls->dissector,
+						  FLOW_DISSECTOR_KEY_BASIC,
+						  cls->key);
+		struct flow_dissector_key_basic *mask =
+			skb_flow_dissector_target(cls->dissector,
+						  FLOW_DISSECTOR_KEY_BASIC,
+						  cls->mask);
+		u16 ethtype_key = ntohs(key->n_proto);
+		u16 ethtype_mask = ntohs(mask->n_proto);
+
+		if (ethtype_key == ETH_P_ALL) {
+			ethtype_key = 0;
+			ethtype_mask = 0;
+		}
+
+		fs->val.ethtype = ethtype_key;
+		fs->mask.ethtype = ethtype_mask;
+		fs->val.proto = key->ip_proto;
+		fs->mask.proto = mask->ip_proto;
+	}
+
+	if (addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS) {
+		struct flow_dissector_key_ipv4_addrs *key =
+			skb_flow_dissector_target(cls->dissector,
+						  FLOW_DISSECTOR_KEY_IPV4_ADDRS,
+						  cls->key);
+		struct flow_dissector_key_ipv4_addrs *mask =
+			skb_flow_dissector_target(cls->dissector,
+						  FLOW_DISSECTOR_KEY_IPV4_ADDRS,
+						  cls->mask);
+		fs->type = 0;
+		memcpy(&fs->val.lip[0], &key->dst, sizeof(key->dst));
+		memcpy(&fs->val.fip[0], &key->src, sizeof(key->src));
+		memcpy(&fs->mask.lip[0], &mask->dst, sizeof(mask->dst));
+		memcpy(&fs->mask.fip[0], &mask->src, sizeof(mask->src));
+	}
+
+	if (addr_type == FLOW_DISSECTOR_KEY_IPV6_ADDRS) {
+		struct flow_dissector_key_ipv6_addrs *key =
+			skb_flow_dissector_target(cls->dissector,
+						  FLOW_DISSECTOR_KEY_IPV6_ADDRS,
+						  cls->key);
+		struct flow_dissector_key_ipv6_addrs *mask =
+			skb_flow_dissector_target(cls->dissector,
+						  FLOW_DISSECTOR_KEY_IPV6_ADDRS,
+						  cls->mask);
+
+		fs->type = 1;
+		memcpy(&fs->val.lip[0], key->dst.s6_addr, sizeof(key->dst));
+		memcpy(&fs->val.fip[0], key->src.s6_addr, sizeof(key->src));
+		memcpy(&fs->mask.lip[0], mask->dst.s6_addr, sizeof(mask->dst));
+		memcpy(&fs->mask.fip[0], mask->src.s6_addr, sizeof(mask->src));
+	}
+
+	if (dissector_uses_key(cls->dissector, FLOW_DISSECTOR_KEY_PORTS)) {
+		struct flow_dissector_key_ports *key, *mask;
+
+		key = skb_flow_dissector_target(cls->dissector,
+						FLOW_DISSECTOR_KEY_PORTS,
+						cls->key);
+		mask = skb_flow_dissector_target(cls->dissector,
+						 FLOW_DISSECTOR_KEY_PORTS,
+						 cls->mask);
+		fs->val.lport = cpu_to_be16(key->dst);
+		fs->mask.lport = cpu_to_be16(mask->dst);
+		fs->val.fport = cpu_to_be16(key->src);
+		fs->mask.fport = cpu_to_be16(mask->src);
+	}
+
+	/* Match only packets coming from the ingress port where this
+	 * filter will be created.
+	 */
+	fs->val.iport = netdev2pinfo(dev)->port_id;
+	fs->mask.iport = ~0;
+}
+
+static int cxgb4_validate_flow_match(struct net_device *dev,
+				     struct tc_cls_flower_offload *cls)
+{
+	if (cls->dissector->used_keys &
+	    ~(BIT(FLOW_DISSECTOR_KEY_CONTROL) |
+	      BIT(FLOW_DISSECTOR_KEY_BASIC) |
+	      BIT(FLOW_DISSECTOR_KEY_IPV4_ADDRS) |
+	      BIT(FLOW_DISSECTOR_KEY_IPV6_ADDRS) |
+	      BIT(FLOW_DISSECTOR_KEY_PORTS))) {
+		netdev_warn(dev, "Unsupported key used: 0x%x\n",
+			    cls->dissector->used_keys);
+		return -EOPNOTSUPP;
+	}
+	return 0;
+}
+
+static void cxgb4_process_flow_actions(struct net_device *in,
+				       struct tc_cls_flower_offload *cls,
+				       struct ch_filter_specification *fs)
+{
+	const struct tc_action *a;
+	LIST_HEAD(actions);
+
+	tcf_exts_to_list(cls->exts, &actions);
+	list_for_each_entry(a, &actions, list) {
+		if (is_tcf_gact_shot(a)) {
+			fs->action = FILTER_DROP;
+		} else if (is_tcf_mirred_egress_redirect(a)) {
+			int ifindex = tcf_mirred_ifindex(a);
+			struct net_device *out = __dev_get_by_index(dev_net(in),
+								    ifindex);
+			struct port_info *pi = netdev_priv(out);
+
+			fs->action = FILTER_SWITCH;
+			fs->eport = pi->port_id;
+		}
+	}
+}
+
+static int cxgb4_validate_flow_actions(struct net_device *dev,
+				       struct tc_cls_flower_offload *cls)
+{
+	const struct tc_action *a;
+	LIST_HEAD(actions);
+
+	tcf_exts_to_list(cls->exts, &actions);
+	list_for_each_entry(a, &actions, list) {
+		if (is_tcf_gact_shot(a)) {
+			/* Do nothing */
+		} else if (is_tcf_mirred_egress_redirect(a)) {
+			struct adapter *adap = netdev2adap(dev);
+			struct net_device *n_dev;
+			unsigned int i, ifindex;
+			bool found = false;
+
+			ifindex = tcf_mirred_ifindex(a);
+			for_each_port(adap, i) {
+				n_dev = adap->port[i];
+				if (ifindex == n_dev->ifindex) {
+					found = true;
+					break;
+				}
+			}
+
+			/* If interface doesn't belong to our hw, then
+			 * the provided output port is not valid
+			 */
+			if (!found) {
+				netdev_err(dev, "%s: Out port invalid\n",
+					   __func__);
+				return -EINVAL;
+			}
+		} else {
+			netdev_err(dev, "%s: Unsupported action\n", __func__);
+			return -EOPNOTSUPP;
+		}
+	}
+	return 0;
+}
+
 int cxgb4_tc_flower_replace(struct net_device *dev,
 			    struct tc_cls_flower_offload *cls)
 {
-	return -EOPNOTSUPP;
+	struct adapter *adap = netdev2adap(dev);
+	struct ch_tc_flower_entry *ch_flower;
+	struct ch_filter_specification *fs;
+	struct filter_ctx ctx;
+	int fidx;
+	int ret;
+
+	if (cxgb4_validate_flow_actions(dev, cls))
+		return -EOPNOTSUPP;
+
+	if (cxgb4_validate_flow_match(dev, cls))
+		return -EOPNOTSUPP;
+
+	ch_flower = allocate_flower_entry();
+	if (!ch_flower) {
+		netdev_err(dev, "%s: ch_flower alloc failed.\n", __func__);
+		ret = -ENOMEM;
+		goto err;
+	}
+
+	fs = &ch_flower->fs;
+	fs->hitcnts = 1;
+	cxgb4_process_flow_actions(dev, cls, fs);
+	cxgb4_process_flow_match(dev, cls, fs);
+
+	fidx = cxgb4_get_free_ftid(dev, fs->type ? PF_INET6 : PF_INET);
+	if (fidx < 0) {
+		netdev_err(dev, "%s: No fidx for offload.\n", __func__);
+		ret = -ENOMEM;
+		goto free_entry;
+	}
+
+	init_completion(&ctx.completion);
+	ret = __cxgb4_set_filter(dev, fidx, fs, &ctx);
+	if (ret) {
+		netdev_err(dev, "%s: filter creation err %d\n",
+			   __func__, ret);
+		goto free_entry;
+	}
+
+	/* Wait for reply */
+	ret = wait_for_completion_timeout(&ctx.completion, 10 * HZ);
+	if (!ret) {
+		ret = -ETIMEDOUT;
+		goto free_entry;
+	}
+
+	ret = ctx.result;
+	/* Check if hw returned error for filter creation */
+	if (ret) {
+		netdev_err(dev, "%s: filter creation err %d\n",
+			   __func__, ret);
+		goto free_entry;
+	}
+
+	INIT_HLIST_NODE(&ch_flower->link);
+	ch_flower->tc_flower_cookie = cls->cookie;
+	ch_flower->filter_id = ctx.tid;
+	hash_add_rcu(adap->flower_anymatch_tbl, &ch_flower->link, cls->cookie);
+
+	return ret;
+
+free_entry:
+	kfree(ch_flower);
+err:
+	return ret;
 }
 
 int cxgb4_tc_flower_destroy(struct net_device *dev,
 			    struct tc_cls_flower_offload *cls)
 {
-	return -EOPNOTSUPP;
+	struct adapter *adap = netdev2adap(dev);
+	struct ch_tc_flower_entry *ch_flower;
+	int ret;
+
+	ch_flower = ch_flower_lookup(adap, cls->cookie);
+	if (!ch_flower) {
+		ret = -ENOENT;
+		goto err;
+	}
+
+	ret = cxgb4_del_filter(dev, ch_flower->filter_id);
+	if (ret)
+		goto err;
+
+	hash_del_rcu(&ch_flower->link);
+	kfree_rcu(ch_flower, rcu);
+	return ret;
+
+err:
+	return ret;
 }
 
 int cxgb4_tc_flower_stats(struct net_device *dev,
@@ -55,3 +331,8 @@ int cxgb4_tc_flower_stats(struct net_device *dev,
 {
 	return -EOPNOTSUPP;
 }
+
+void cxgb4_init_tc_flower(struct adapter *adap)
+{
+	hash_init(adap->flower_anymatch_tbl);
+}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
index b321fc205b5a..6145a9e056eb 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
@@ -37,10 +37,27 @@
 
 #include <net/pkt_cls.h>
 
+struct ch_tc_flower_stats {
+	u64 packet_count;
+	u64 byte_count;
+	u64 last_used;
+};
+
+struct ch_tc_flower_entry {
+	struct ch_filter_specification fs;
+	struct ch_tc_flower_stats stats;
+	unsigned long tc_flower_cookie;
+	struct hlist_node link;
+	struct rcu_head rcu;
+	u32 filter_id;
+};
+
 int cxgb4_tc_flower_replace(struct net_device *dev,
 			    struct tc_cls_flower_offload *cls);
 int cxgb4_tc_flower_destroy(struct net_device *dev,
 			    struct tc_cls_flower_offload *cls);
 int cxgb4_tc_flower_stats(struct net_device *dev,
 			  struct tc_cls_flower_offload *cls);
+
+void cxgb4_init_tc_flower(struct adapter *adap);
 #endif /* __CXGB4_TC_FLOWER_H */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
index 84541fce94c5..88487095d14f 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
@@ -212,6 +212,7 @@ struct filter_ctx {
 
 struct ch_filter_specification;
 
+int cxgb4_get_free_ftid(struct net_device *dev, int family);
 int __cxgb4_set_filter(struct net_device *dev, int filter_id,
 		       struct ch_filter_specification *fs,
 		       struct filter_ctx *ctx);
-- 
2.14.1

^ permalink raw reply related

* [PATCH net-next 4/4] cxgb4: fetch stats for offloaded tc flower flows
From: Rahul Lakkireddy @ 2017-09-21  7:33 UTC (permalink / raw)
  To: netdev; +Cc: davem, kumaras, ganeshgr, nirranjan, indranil, Rahul Lakkireddy
In-Reply-To: <cover.1505977744.git.rahul.lakkireddy@chelsio.com>

From: Kumar Sanghvi <kumaras@chelsio.com>

Add support to retrieve stats from hardware for offloaded tc flower
flows.  Also, poll for the stats of offloaded flows via timer callback.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h         |  1 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c  | 76 +++++++++++++++++++++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c    |  1 +
 .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c   | 79 +++++++++++++++++++++-
 .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h   |  3 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h     |  2 +
 6 files changed, 161 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 26eac599ab2c..8a94d97df025 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -907,6 +907,7 @@ struct adapter {
 
 	/* TC flower offload */
 	DECLARE_HASHTABLE(flower_anymatch_tbl, 9);
+	struct timer_list flower_stats_timer;
 };
 
 /* Support for "sched-class" command to allow a TX Scheduling Class to be
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
index 07a4619e2164..c09c4de8c9fb 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
@@ -148,6 +148,82 @@ static int get_filter_steerq(struct net_device *dev,
 	return iq;
 }
 
+static int get_filter_count(struct adapter *adapter, unsigned int fidx,
+			    u64 *pkts, u64 *bytes)
+{
+	unsigned int tcb_base, tcbaddr;
+	unsigned int word_offset;
+	struct filter_entry *f;
+	__be64 be64_byte_count;
+	int ret;
+
+	tcb_base = t4_read_reg(adapter, TP_CMM_TCB_BASE_A);
+	if ((fidx != (adapter->tids.nftids + adapter->tids.nsftids - 1)) &&
+	    fidx >= adapter->tids.nftids)
+		return -E2BIG;
+
+	f = &adapter->tids.ftid_tab[fidx];
+	if (!f->valid)
+		return -EINVAL;
+
+	tcbaddr = tcb_base + f->tid * TCB_SIZE;
+
+	spin_lock(&adapter->win0_lock);
+	if (is_t4(adapter->params.chip)) {
+		__be64 be64_count;
+
+		/* T4 doesn't maintain byte counts in hw */
+		*bytes = 0;
+
+		/* Get pkts */
+		word_offset = 4;
+		ret = t4_memory_rw(adapter, MEMWIN_NIC, MEM_EDC0,
+				   tcbaddr + (word_offset * sizeof(__be32)),
+				   sizeof(be64_count),
+				   (__be32 *)&be64_count,
+				   T4_MEMORY_READ);
+		if (ret < 0)
+			goto out;
+		*pkts = be64_to_cpu(be64_count);
+	} else {
+		__be32 be32_count;
+
+		/* Get bytes */
+		word_offset = 4;
+		ret = t4_memory_rw(adapter, MEMWIN_NIC, MEM_EDC0,
+				   tcbaddr + (word_offset * sizeof(__be32)),
+				   sizeof(be64_byte_count),
+				   &be64_byte_count,
+				   T4_MEMORY_READ);
+		if (ret < 0)
+			goto out;
+		*bytes = be64_to_cpu(be64_byte_count);
+
+		/* Get pkts */
+		word_offset = 6;
+		ret = t4_memory_rw(adapter, MEMWIN_NIC, MEM_EDC0,
+				   tcbaddr + (word_offset * sizeof(__be32)),
+				   sizeof(be32_count),
+				   &be32_count,
+				   T4_MEMORY_READ);
+		if (ret < 0)
+			goto out;
+		*pkts = (u64)be32_to_cpu(be32_count);
+	}
+
+out:
+	spin_unlock(&adapter->win0_lock);
+	return ret;
+}
+
+int cxgb4_get_filter_counters(struct net_device *dev, unsigned int fidx,
+			      u64 *hitcnt, u64 *bytecnt)
+{
+	struct adapter *adapter = netdev2adap(dev);
+
+	return get_filter_count(adapter, fidx, hitcnt, bytecnt);
+}
+
 int cxgb4_get_free_ftid(struct net_device *dev, int family)
 {
 	struct adapter *adap = netdev2adap(dev);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 3ba4e1ff8486..d634098d52ab 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -4637,6 +4637,7 @@ static void free_some_resources(struct adapter *adapter)
 	kvfree(adapter->l2t);
 	t4_cleanup_sched(adapter);
 	kvfree(adapter->tids.tid_tab);
+	cxgb4_cleanup_tc_flower(adapter);
 	cxgb4_cleanup_tc_u32(adapter);
 	kfree(adapter->sge.egr_map);
 	kfree(adapter->sge.ingr_map);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
index fddb0c419edc..7a47d4e88a57 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
@@ -39,9 +39,12 @@
 #include "cxgb4.h"
 #include "cxgb4_tc_flower.h"
 
+#define STATS_CHECK_PERIOD (HZ / 2)
+
 static struct ch_tc_flower_entry *allocate_flower_entry(void)
 {
 	struct ch_tc_flower_entry *new = kzalloc(sizeof(*new), GFP_KERNEL);
+	spin_lock_init(&new->lock);
 	return new;
 }
 
@@ -369,13 +372,87 @@ int cxgb4_tc_flower_destroy(struct net_device *dev,
 	return ret;
 }
 
+void ch_flower_stats_cb(unsigned long data)
+{
+	struct adapter *adap = (struct adapter *)data;
+	struct ch_tc_flower_entry *flower_entry;
+	struct ch_tc_flower_stats *ofld_stats;
+	unsigned int i;
+	u64 packets;
+	u64 bytes;
+	int ret;
+
+	rcu_read_lock();
+	hash_for_each_rcu(adap->flower_anymatch_tbl, i, flower_entry, link) {
+		ret = cxgb4_get_filter_counters(adap->port[0],
+						flower_entry->filter_id,
+						&packets, &bytes);
+		if (!ret) {
+			spin_lock(&flower_entry->lock);
+			ofld_stats = &flower_entry->stats;
+
+			if (ofld_stats->prev_packet_count != packets) {
+				ofld_stats->prev_packet_count = packets;
+				ofld_stats->last_used = jiffies;
+			}
+			spin_unlock(&flower_entry->lock);
+		}
+	}
+	rcu_read_unlock();
+	mod_timer(&adap->flower_stats_timer, jiffies + STATS_CHECK_PERIOD);
+}
+
 int cxgb4_tc_flower_stats(struct net_device *dev,
 			  struct tc_cls_flower_offload *cls)
 {
-	return -EOPNOTSUPP;
+	struct adapter *adap = netdev2adap(dev);
+	struct ch_tc_flower_stats *ofld_stats;
+	struct ch_tc_flower_entry *ch_flower;
+	u64 packets;
+	u64 bytes;
+	int ret;
+
+	ch_flower = ch_flower_lookup(adap, cls->cookie);
+	if (!ch_flower) {
+		ret = -ENOENT;
+		goto err;
+	}
+
+	ret = cxgb4_get_filter_counters(dev, ch_flower->filter_id,
+					&packets, &bytes);
+	if (ret < 0)
+		goto err;
+
+	spin_lock_bh(&ch_flower->lock);
+	ofld_stats = &ch_flower->stats;
+	if (ofld_stats->packet_count != packets) {
+		if (ofld_stats->prev_packet_count != packets)
+			ofld_stats->last_used = jiffies;
+		tcf_exts_stats_update(cls->exts, bytes - ofld_stats->byte_count,
+				      packets - ofld_stats->packet_count,
+				      ofld_stats->last_used);
+
+		ofld_stats->packet_count = packets;
+		ofld_stats->byte_count = bytes;
+		ofld_stats->prev_packet_count = packets;
+	}
+	spin_unlock_bh(&ch_flower->lock);
+	return 0;
+
+err:
+	return ret;
 }
 
 void cxgb4_init_tc_flower(struct adapter *adap)
 {
 	hash_init(adap->flower_anymatch_tbl);
+	setup_timer(&adap->flower_stats_timer, ch_flower_stats_cb,
+		    (unsigned long)adap);
+	mod_timer(&adap->flower_stats_timer, jiffies + STATS_CHECK_PERIOD);
+}
+
+void cxgb4_cleanup_tc_flower(struct adapter *adap)
+{
+	if (adap->flower_stats_timer.function)
+		del_timer_sync(&adap->flower_stats_timer);
 }
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
index 6145a9e056eb..604feffc752e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
@@ -38,6 +38,7 @@
 #include <net/pkt_cls.h>
 
 struct ch_tc_flower_stats {
+	u64 prev_packet_count;
 	u64 packet_count;
 	u64 byte_count;
 	u64 last_used;
@@ -49,6 +50,7 @@ struct ch_tc_flower_entry {
 	unsigned long tc_flower_cookie;
 	struct hlist_node link;
 	struct rcu_head rcu;
+	spinlock_t lock; /* lock for stats */
 	u32 filter_id;
 };
 
@@ -60,4 +62,5 @@ int cxgb4_tc_flower_stats(struct net_device *dev,
 			  struct tc_cls_flower_offload *cls);
 
 void cxgb4_init_tc_flower(struct adapter *adap);
+void cxgb4_cleanup_tc_flower(struct adapter *adap);
 #endif /* __CXGB4_TC_FLOWER_H */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
index 88487095d14f..52324c77a4fe 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
@@ -221,6 +221,8 @@ int __cxgb4_del_filter(struct net_device *dev, int filter_id,
 int cxgb4_set_filter(struct net_device *dev, int filter_id,
 		     struct ch_filter_specification *fs);
 int cxgb4_del_filter(struct net_device *dev, int filter_id);
+int cxgb4_get_filter_counters(struct net_device *dev, unsigned int fidx,
+			      u64 *hitcnt, u64 *bytecnt);
 
 static inline void set_wr_txq(struct sk_buff *skb, int prio, int queue)
 {
-- 
2.14.1

^ permalink raw reply related

* [PATCH iproute2 master 0/2] BPF/XDP json follow-up
From: Daniel Borkmann @ 2017-09-21  8:42 UTC (permalink / raw)
  To: stephen; +Cc: ast, netdev, Daniel Borkmann

After merging net-next branch into master, Stephen asked to
fix up json dump for XDP as there were some merge conflicts,
so here it is.

Thanks!

Daniel Borkmann (2):
  json: move json printer to common library
  bpf: properly output json for xdp

 include/json_print.h |  71 ++++++++++++++++
 ip/Makefile          |   2 +-
 ip/ip_common.h       |  65 ++------------
 ip/ip_print.c        | 233 ---------------------------------------------------
 ip/iplink_xdp.c      |  74 +++++++++-------
 lib/Makefile         |   2 +-
 lib/bpf.c            |  19 +++--
 lib/json_print.c     | 231 ++++++++++++++++++++++++++++++++++++++++++++++++++
 8 files changed, 369 insertions(+), 328 deletions(-)
 create mode 100644 include/json_print.h
 delete mode 100644 ip/ip_print.c
 create mode 100644 lib/json_print.c

-- 
1.9.3

^ permalink raw reply

* [PATCH iproute2 master 1/2] json: move json printer to common library
From: Daniel Borkmann @ 2017-09-21  8:42 UTC (permalink / raw)
  To: stephen; +Cc: ast, netdev, Daniel Borkmann
In-Reply-To: <cover.1505956723.git.daniel@iogearbox.net>

Move the json printer which is based on json writer into the
iproute2 library, so it can be used by library code and tools
other than ip. Should probably have been done from the beginning
like that given json writer is in the library already anyway.
No functional changes.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 include/json_print.h |  71 ++++++++++++++++
 ip/Makefile          |   2 +-
 ip/ip_common.h       |  65 ++------------
 ip/ip_print.c        | 233 ---------------------------------------------------
 lib/Makefile         |   2 +-
 lib/json_print.c     | 231 ++++++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 312 insertions(+), 292 deletions(-)
 create mode 100644 include/json_print.h
 delete mode 100644 ip/ip_print.c
 create mode 100644 lib/json_print.c

diff --git a/include/json_print.h b/include/json_print.h
new file mode 100644
index 0000000..44cf5ac
--- /dev/null
+++ b/include/json_print.h
@@ -0,0 +1,71 @@
+/*
+ * json_print.h		"print regular or json output, based on json_writer".
+ *
+ *             This program is free software; you can redistribute it and/or
+ *             modify it under the terms of the GNU General Public License
+ *             as published by the Free Software Foundation; either version
+ *             2 of the License, or (at your option) any later version.
+ *
+ * Authors:    Julien Fortin, <julien@cumulusnetworks.com>
+ */
+
+#ifndef _JSON_PRINT_H_
+#define _JSON_PRINT_H_
+
+#include "json_writer.h"
+#include "color.h"
+
+json_writer_t *get_json_writer(void);
+
+/*
+ * use:
+ *      - PRINT_ANY for context based output
+ *      - PRINT_FP for non json specific output
+ *      - PRINT_JSON for json specific output
+ */
+enum output_type {
+	PRINT_FP = 1,
+	PRINT_JSON = 2,
+	PRINT_ANY = 4,
+};
+
+void new_json_obj(int json, FILE *fp);
+void delete_json_obj(void);
+
+bool is_json_context(void);
+
+void set_current_fp(FILE *fp);
+
+void fflush_fp(void);
+
+void open_json_object(const char *str);
+void close_json_object(void);
+void open_json_array(enum output_type type, const char *delim);
+void close_json_array(enum output_type type, const char *delim);
+
+#define _PRINT_FUNC(type_name, type)					\
+	void print_color_##type_name(enum output_type t,		\
+				     enum color_attr color,		\
+				     const char *key,			\
+				     const char *fmt,			\
+				     type value);			\
+									\
+	static inline void print_##type_name(enum output_type t,	\
+					     const char *key,		\
+					     const char *fmt,		\
+					     type value)		\
+	{								\
+		print_color_##type_name(t, -1, key, fmt, value);	\
+	}
+_PRINT_FUNC(int, int);
+_PRINT_FUNC(bool, bool);
+_PRINT_FUNC(null, const char*);
+_PRINT_FUNC(string, const char*);
+_PRINT_FUNC(uint, uint64_t);
+_PRINT_FUNC(hu, unsigned short);
+_PRINT_FUNC(hex, unsigned int);
+_PRINT_FUNC(0xhex, unsigned int);
+_PRINT_FUNC(lluint, unsigned long long int);
+#undef _PRINT_FUNC
+
+#endif /* _JSON_PRINT_H_ */
diff --git a/ip/Makefile b/ip/Makefile
index 52c9a2e..5a1c7ad 100644
--- a/ip/Makefile
+++ b/ip/Makefile
@@ -9,7 +9,7 @@ IPOBJ=ip.o ipaddress.o ipaddrlabel.o iproute.o iprule.o ipnetns.o \
     link_iptnl.o link_gre6.o iplink_bond.o iplink_bond_slave.o iplink_hsr.o \
     iplink_bridge.o iplink_bridge_slave.o ipfou.o iplink_ipvlan.o \
     iplink_geneve.o iplink_vrf.o iproute_lwtunnel.o ipmacsec.o ipila.o \
-    ipvrf.o iplink_xstats.o ipseg6.o ip_print.o
+    ipvrf.o iplink_xstats.o ipseg6.o
 
 RTMONOBJ=rtmon.o
 
diff --git a/ip/ip_common.h b/ip/ip_common.h
index efc789c..4b8b0a7 100644
--- a/ip/ip_common.h
+++ b/ip/ip_common.h
@@ -1,3 +1,10 @@
+#ifndef _IP_COMMON_H_
+#define _IP_COMMON_H_
+
+#include <stdbool.h>
+
+#include "json_print.h"
+
 struct link_filter {
 	int ifindex;
 	int family;
@@ -101,8 +108,6 @@ static inline int rtm_get_table(struct rtmsg *r, struct rtattr **tb)
 
 extern struct rtnl_handle rth;
 
-#include <stdbool.h>
-
 struct link_util {
 	struct link_util	*next;
 	const char		*id;
@@ -141,58 +146,4 @@ int name_is_vrf(const char *name);
 
 void print_num(FILE *fp, unsigned int width, uint64_t count);
 
-#include "json_writer.h"
-
-json_writer_t   *get_json_writer(void);
-/*
- * use:
- *      - PRINT_ANY for context based output
- *      - PRINT_FP for non json specific output
- *      - PRINT_JSON for json specific output
- */
-enum output_type {
-	PRINT_FP = 1,
-	PRINT_JSON = 2,
-	PRINT_ANY = 4,
-};
-
-void new_json_obj(int json, FILE *fp);
-void delete_json_obj(void);
-
-bool is_json_context(void);
-
-void set_current_fp(FILE *fp);
-
-void fflush_fp(void);
-
-void open_json_object(const char *str);
-void close_json_object(void);
-void open_json_array(enum output_type type, const char *delim);
-void close_json_array(enum output_type type, const char *delim);
-
-#include "color.h"
-
-#define _PRINT_FUNC(type_name, type)					\
-	void print_color_##type_name(enum output_type t,		\
-				     enum color_attr color,		\
-				     const char *key,			\
-				     const char *fmt,			\
-				     type value);			\
-									\
-	static inline void print_##type_name(enum output_type t,	\
-					     const char *key,		\
-					     const char *fmt,		\
-					     type value)		\
-	{								\
-		print_color_##type_name(t, -1, key, fmt, value);	\
-	}
-_PRINT_FUNC(int, int);
-_PRINT_FUNC(bool, bool);
-_PRINT_FUNC(null, const char*);
-_PRINT_FUNC(string, const char*);
-_PRINT_FUNC(uint, uint64_t);
-_PRINT_FUNC(hu, unsigned short);
-_PRINT_FUNC(hex, unsigned int);
-_PRINT_FUNC(0xhex, unsigned int);
-_PRINT_FUNC(lluint, unsigned long long int);
-#undef _PRINT_FUNC
+#endif /* _IP_COMMON_H_ */
diff --git a/ip/ip_print.c b/ip/ip_print.c
deleted file mode 100644
index 4cd6a0b..0000000
--- a/ip/ip_print.c
+++ /dev/null
@@ -1,233 +0,0 @@
-/*
- * ip_print.c          "ip print regular or json output".
- *
- *             This program is free software; you can redistribute it and/or
- *             modify it under the terms of the GNU General Public License
- *             as published by the Free Software Foundation; either version
- *             2 of the License, or (at your option) any later version.
- *
- * Authors:    Julien Fortin, <julien@cumulusnetworks.com>
- *
- */
-
-#include <stdarg.h>
-#include <stdio.h>
-
-#include "utils.h"
-#include "ip_common.h"
-#include "json_writer.h"
-
-static json_writer_t *_jw;
-static FILE *_fp;
-
-#define _IS_JSON_CONTEXT(type) ((type & PRINT_JSON || type & PRINT_ANY) && _jw)
-#define _IS_FP_CONTEXT(type) (!_jw && (type & PRINT_FP || type & PRINT_ANY))
-
-void new_json_obj(int json, FILE *fp)
-{
-	if (json) {
-		_jw = jsonw_new(fp);
-		if (!_jw) {
-			perror("json object");
-			exit(1);
-		}
-		jsonw_pretty(_jw, true);
-		jsonw_start_array(_jw);
-	}
-	set_current_fp(fp);
-}
-
-void delete_json_obj(void)
-{
-	if (_jw) {
-		jsonw_end_array(_jw);
-		jsonw_destroy(&_jw);
-	}
-}
-
-bool is_json_context(void)
-{
-	return _jw != NULL;
-}
-
-void set_current_fp(FILE *fp)
-{
-	if (!fp) {
-		fprintf(stderr, "Error: invalid file pointer.\n");
-		exit(1);
-	}
-	_fp = fp;
-}
-
-json_writer_t *get_json_writer(void)
-{
-	return _jw;
-}
-
-void open_json_object(const char *str)
-{
-	if (_IS_JSON_CONTEXT(PRINT_JSON)) {
-		if (str)
-			jsonw_name(_jw, str);
-		jsonw_start_object(_jw);
-	}
-}
-
-void close_json_object(void)
-{
-	if (_IS_JSON_CONTEXT(PRINT_JSON))
-		jsonw_end_object(_jw);
-}
-
-/*
- * Start json array or string array using
- * the provided string as json key (if not null)
- * or as array delimiter in non-json context.
- */
-void open_json_array(enum output_type type, const char *str)
-{
-	if (_IS_JSON_CONTEXT(type)) {
-		if (str)
-			jsonw_name(_jw, str);
-		jsonw_start_array(_jw);
-	} else if (_IS_FP_CONTEXT(type)) {
-		fprintf(_fp, "%s", str);
-	}
-}
-
-/*
- * End json array or string array
- */
-void close_json_array(enum output_type type, const char *str)
-{
-	if (_IS_JSON_CONTEXT(type)) {
-		jsonw_pretty(_jw, false);
-		jsonw_end_array(_jw);
-		jsonw_pretty(_jw, true);
-	} else if (_IS_FP_CONTEXT(type)) {
-		fprintf(_fp, "%s", str);
-	}
-}
-
-/*
- * pre-processor directive to generate similar
- * functions handling different types
- */
-#define _PRINT_FUNC(type_name, type)					\
-	void print_color_##type_name(enum output_type t,		\
-				     enum color_attr color,		\
-				     const char *key,			\
-				     const char *fmt,			\
-				     type value)			\
-	{								\
-		if (_IS_JSON_CONTEXT(t)) {				\
-			if (!key)					\
-				jsonw_##type_name(_jw, value);		\
-			else						\
-				jsonw_##type_name##_field(_jw, key, value); \
-		} else if (_IS_FP_CONTEXT(t)) {				\
-			color_fprintf(_fp, color, fmt, value);          \
-		}							\
-	}
-_PRINT_FUNC(int, int);
-_PRINT_FUNC(hu, unsigned short);
-_PRINT_FUNC(uint, uint64_t);
-_PRINT_FUNC(lluint, unsigned long long int);
-#undef _PRINT_FUNC
-
-void print_color_string(enum output_type type,
-			enum color_attr color,
-			const char *key,
-			const char *fmt,
-			const char *value)
-{
-	if (_IS_JSON_CONTEXT(type)) {
-		if (key && !value)
-			jsonw_name(_jw, key);
-		else if (!key && value)
-			jsonw_string(_jw, value);
-		else
-			jsonw_string_field(_jw, key, value);
-	} else if (_IS_FP_CONTEXT(type)) {
-		color_fprintf(_fp, color, fmt, value);
-	}
-}
-
-/*
- * value's type is bool. When using this function in FP context you can't pass
- * a value to it, you will need to use "is_json_context()" to have different
- * branch for json and regular output. grep -r "print_bool" for example
- */
-void print_color_bool(enum output_type type,
-		      enum color_attr color,
-		      const char *key,
-		      const char *fmt,
-		      bool value)
-{
-	if (_IS_JSON_CONTEXT(type)) {
-		if (key)
-			jsonw_bool_field(_jw, key, value);
-		else
-			jsonw_bool(_jw, value);
-	} else if (_IS_FP_CONTEXT(type)) {
-		color_fprintf(_fp, color, fmt, value ? "true" : "false");
-	}
-}
-
-/*
- * In JSON context uses hardcode %#x format: 42 -> 0x2a
- */
-void print_color_0xhex(enum output_type type,
-		       enum color_attr color,
-		       const char *key,
-		       const char *fmt,
-		       unsigned int hex)
-{
-	if (_IS_JSON_CONTEXT(type)) {
-		SPRINT_BUF(b1);
-
-		snprintf(b1, sizeof(b1), "%#x", hex);
-		print_string(PRINT_JSON, key, NULL, b1);
-	} else if (_IS_FP_CONTEXT(type)) {
-		color_fprintf(_fp, color, fmt, hex);
-	}
-}
-
-void print_color_hex(enum output_type type,
-		     enum color_attr color,
-		     const char *key,
-		     const char *fmt,
-		     unsigned int hex)
-{
-	if (_IS_JSON_CONTEXT(type)) {
-		SPRINT_BUF(b1);
-
-		snprintf(b1, sizeof(b1), "%x", hex);
-		if (key)
-			jsonw_string_field(_jw, key, b1);
-		else
-			jsonw_string(_jw, b1);
-	} else if (_IS_FP_CONTEXT(type)) {
-		color_fprintf(_fp, color, fmt, hex);
-	}
-}
-
-/*
- * In JSON context we don't use the argument "value" we simply call jsonw_null
- * whereas FP context can use "value" to output anything
- */
-void print_color_null(enum output_type type,
-		      enum color_attr color,
-		      const char *key,
-		      const char *fmt,
-		      const char *value)
-{
-	if (_IS_JSON_CONTEXT(type)) {
-		if (key)
-			jsonw_null_field(_jw, key);
-		else
-			jsonw_null(_jw);
-	} else if (_IS_FP_CONTEXT(type)) {
-		color_fprintf(_fp, color, fmt, value);
-	}
-}
diff --git a/lib/Makefile b/lib/Makefile
index 5e9f72f..0fbdf4c 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -3,7 +3,7 @@ include ../config.mk
 CFLAGS += -fPIC
 
 UTILOBJ = utils.o rt_names.o ll_types.o ll_proto.o ll_addr.o \
-	inet_proto.o namespace.o json_writer.o \
+	inet_proto.o namespace.o json_writer.o json_print.o \
 	names.o color.o bpf.o exec.o fs.o
 
 NLOBJ=libgenl.o ll_map.o libnetlink.o
diff --git a/lib/json_print.c b/lib/json_print.c
new file mode 100644
index 0000000..93b4119
--- /dev/null
+++ b/lib/json_print.c
@@ -0,0 +1,231 @@
+/*
+ * json_print.c		"print regular or json output, based on json_writer".
+ *
+ *             This program is free software; you can redistribute it and/or
+ *             modify it under the terms of the GNU General Public License
+ *             as published by the Free Software Foundation; either version
+ *             2 of the License, or (at your option) any later version.
+ *
+ * Authors:    Julien Fortin, <julien@cumulusnetworks.com>
+ */
+
+#include <stdarg.h>
+#include <stdio.h>
+
+#include "utils.h"
+#include "json_print.h"
+
+static json_writer_t *_jw;
+static FILE *_fp;
+
+#define _IS_JSON_CONTEXT(type) ((type & PRINT_JSON || type & PRINT_ANY) && _jw)
+#define _IS_FP_CONTEXT(type) (!_jw && (type & PRINT_FP || type & PRINT_ANY))
+
+void new_json_obj(int json, FILE *fp)
+{
+	if (json) {
+		_jw = jsonw_new(fp);
+		if (!_jw) {
+			perror("json object");
+			exit(1);
+		}
+		jsonw_pretty(_jw, true);
+		jsonw_start_array(_jw);
+	}
+	set_current_fp(fp);
+}
+
+void delete_json_obj(void)
+{
+	if (_jw) {
+		jsonw_end_array(_jw);
+		jsonw_destroy(&_jw);
+	}
+}
+
+bool is_json_context(void)
+{
+	return _jw != NULL;
+}
+
+void set_current_fp(FILE *fp)
+{
+	if (!fp) {
+		fprintf(stderr, "Error: invalid file pointer.\n");
+		exit(1);
+	}
+	_fp = fp;
+}
+
+json_writer_t *get_json_writer(void)
+{
+	return _jw;
+}
+
+void open_json_object(const char *str)
+{
+	if (_IS_JSON_CONTEXT(PRINT_JSON)) {
+		if (str)
+			jsonw_name(_jw, str);
+		jsonw_start_object(_jw);
+	}
+}
+
+void close_json_object(void)
+{
+	if (_IS_JSON_CONTEXT(PRINT_JSON))
+		jsonw_end_object(_jw);
+}
+
+/*
+ * Start json array or string array using
+ * the provided string as json key (if not null)
+ * or as array delimiter in non-json context.
+ */
+void open_json_array(enum output_type type, const char *str)
+{
+	if (_IS_JSON_CONTEXT(type)) {
+		if (str)
+			jsonw_name(_jw, str);
+		jsonw_start_array(_jw);
+	} else if (_IS_FP_CONTEXT(type)) {
+		fprintf(_fp, "%s", str);
+	}
+}
+
+/*
+ * End json array or string array
+ */
+void close_json_array(enum output_type type, const char *str)
+{
+	if (_IS_JSON_CONTEXT(type)) {
+		jsonw_pretty(_jw, false);
+		jsonw_end_array(_jw);
+		jsonw_pretty(_jw, true);
+	} else if (_IS_FP_CONTEXT(type)) {
+		fprintf(_fp, "%s", str);
+	}
+}
+
+/*
+ * pre-processor directive to generate similar
+ * functions handling different types
+ */
+#define _PRINT_FUNC(type_name, type)					\
+	void print_color_##type_name(enum output_type t,		\
+				     enum color_attr color,		\
+				     const char *key,			\
+				     const char *fmt,			\
+				     type value)			\
+	{								\
+		if (_IS_JSON_CONTEXT(t)) {				\
+			if (!key)					\
+				jsonw_##type_name(_jw, value);		\
+			else						\
+				jsonw_##type_name##_field(_jw, key, value); \
+		} else if (_IS_FP_CONTEXT(t)) {				\
+			color_fprintf(_fp, color, fmt, value);          \
+		}							\
+	}
+_PRINT_FUNC(int, int);
+_PRINT_FUNC(hu, unsigned short);
+_PRINT_FUNC(uint, uint64_t);
+_PRINT_FUNC(lluint, unsigned long long int);
+#undef _PRINT_FUNC
+
+void print_color_string(enum output_type type,
+			enum color_attr color,
+			const char *key,
+			const char *fmt,
+			const char *value)
+{
+	if (_IS_JSON_CONTEXT(type)) {
+		if (key && !value)
+			jsonw_name(_jw, key);
+		else if (!key && value)
+			jsonw_string(_jw, value);
+		else
+			jsonw_string_field(_jw, key, value);
+	} else if (_IS_FP_CONTEXT(type)) {
+		color_fprintf(_fp, color, fmt, value);
+	}
+}
+
+/*
+ * value's type is bool. When using this function in FP context you can't pass
+ * a value to it, you will need to use "is_json_context()" to have different
+ * branch for json and regular output. grep -r "print_bool" for example
+ */
+void print_color_bool(enum output_type type,
+		      enum color_attr color,
+		      const char *key,
+		      const char *fmt,
+		      bool value)
+{
+	if (_IS_JSON_CONTEXT(type)) {
+		if (key)
+			jsonw_bool_field(_jw, key, value);
+		else
+			jsonw_bool(_jw, value);
+	} else if (_IS_FP_CONTEXT(type)) {
+		color_fprintf(_fp, color, fmt, value ? "true" : "false");
+	}
+}
+
+/*
+ * In JSON context uses hardcode %#x format: 42 -> 0x2a
+ */
+void print_color_0xhex(enum output_type type,
+		       enum color_attr color,
+		       const char *key,
+		       const char *fmt,
+		       unsigned int hex)
+{
+	if (_IS_JSON_CONTEXT(type)) {
+		SPRINT_BUF(b1);
+
+		snprintf(b1, sizeof(b1), "%#x", hex);
+		print_string(PRINT_JSON, key, NULL, b1);
+	} else if (_IS_FP_CONTEXT(type)) {
+		color_fprintf(_fp, color, fmt, hex);
+	}
+}
+
+void print_color_hex(enum output_type type,
+		     enum color_attr color,
+		     const char *key,
+		     const char *fmt,
+		     unsigned int hex)
+{
+	if (_IS_JSON_CONTEXT(type)) {
+		SPRINT_BUF(b1);
+
+		snprintf(b1, sizeof(b1), "%x", hex);
+		if (key)
+			jsonw_string_field(_jw, key, b1);
+		else
+			jsonw_string(_jw, b1);
+	} else if (_IS_FP_CONTEXT(type)) {
+		color_fprintf(_fp, color, fmt, hex);
+	}
+}
+
+/*
+ * In JSON context we don't use the argument "value" we simply call jsonw_null
+ * whereas FP context can use "value" to output anything
+ */
+void print_color_null(enum output_type type,
+		      enum color_attr color,
+		      const char *key,
+		      const char *fmt,
+		      const char *value)
+{
+	if (_IS_JSON_CONTEXT(type)) {
+		if (key)
+			jsonw_null_field(_jw, key);
+		else
+			jsonw_null(_jw);
+	} else if (_IS_FP_CONTEXT(type)) {
+		color_fprintf(_fp, color, fmt, value);
+	}
+}
-- 
1.9.3

^ permalink raw reply related

* [PATCH iproute2 master 2/2] bpf: properly output json for xdp
From: Daniel Borkmann @ 2017-09-21  8:42 UTC (permalink / raw)
  To: stephen; +Cc: ast, netdev, Daniel Borkmann
In-Reply-To: <cover.1505956723.git.daniel@iogearbox.net>

After merging net-next branch into master, Stephen asked
to fix up json dump for XDP. Thus, rework the json dump a
bit, such that 'ip -json l' looks as below.

  [{
        "ifindex": 1,
        "ifname": "lo",
        "flags": ["LOOPBACK","UP","LOWER_UP"],
        "mtu": 65536,
        "xdp": {
            "mode": 2,
            "prog": {
                "id": 5,
                "tag": "e1e9d0ec0f55d638",
                "jited": 1
            }
        },
        "qdisc": "noqueue",
        "operstate": "UNKNOWN",
        "linkmode": "DEFAULT",
        "group": "default",
        "txqlen": 1000,
        "link_type": "loopback",
        "address": "00:00:00:00:00:00",
        "broadcast": "00:00:00:00:00:00"
    },[...]
  ]

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 ip/iplink_xdp.c | 74 ++++++++++++++++++++++++++++++++++-----------------------
 lib/bpf.c       | 19 ++++++++++-----
 2 files changed, 57 insertions(+), 36 deletions(-)

diff --git a/ip/iplink_xdp.c b/ip/iplink_xdp.c
index 71f7798..2d2953a 100644
--- a/ip/iplink_xdp.c
+++ b/ip/iplink_xdp.c
@@ -14,9 +14,9 @@
 
 #include <linux/bpf.h>
 
+#include "json_print.h"
 #include "xdp.h"
 #include "bpf_util.h"
-#include "ip_common.h"
 
 extern int force;
 
@@ -82,6 +82,22 @@ int xdp_parse(int *argc, char ***argv, struct iplink_req *req, bool generic,
 	return 0;
 }
 
+static void xdp_dump_json(struct rtattr *tb[IFLA_XDP_MAX + 1])
+{
+	__u32 prog_id = 0;
+	__u8 mode;
+
+	mode = rta_getattr_u8(tb[IFLA_XDP_ATTACHED]);
+	if (tb[IFLA_XDP_PROG_ID])
+		prog_id = rta_getattr_u32(tb[IFLA_XDP_PROG_ID]);
+
+	open_json_object("xdp");
+	print_uint(PRINT_JSON, "mode", NULL, mode);
+	if (prog_id)
+		bpf_dump_prog_info(NULL, prog_id);
+	close_json_object();
+}
+
 void xdp_dump(FILE *fp, struct rtattr *xdp, bool link, bool details)
 {
 	struct rtattr *tb[IFLA_XDP_MAX + 1];
@@ -94,34 +110,32 @@ void xdp_dump(FILE *fp, struct rtattr *xdp, bool link, bool details)
 		return;
 
 	mode = rta_getattr_u8(tb[IFLA_XDP_ATTACHED]);
-	if (is_json_context()) {
-		print_uint(PRINT_JSON, "attached", NULL, mode);
-	} else {
-		if (mode == XDP_ATTACHED_NONE)
-			return;
-		else if (details && link)
-			fprintf(fp, "%s    prog/xdp", _SL_);
-		else if (mode == XDP_ATTACHED_DRV)
-			fprintf(fp, "xdp");
-		else if (mode == XDP_ATTACHED_SKB)
-			fprintf(fp, "xdpgeneric");
-		else if (mode == XDP_ATTACHED_HW)
-			fprintf(fp, "xdpoffload");
-		else
-			fprintf(fp, "xdp[%u]", mode);
-
-		if (tb[IFLA_XDP_PROG_ID])
-			prog_id = rta_getattr_u32(tb[IFLA_XDP_PROG_ID]);
-		if (!details) {
-			if (prog_id && !link)
-				fprintf(fp, "/id:%u", prog_id);
-			fprintf(fp, " ");
-			return;
-		}
-
-		if (prog_id) {
-			fprintf(fp, " ");
-			bpf_dump_prog_info(fp, prog_id);
-		}
+	if (mode == XDP_ATTACHED_NONE)
+		return;
+	else if (is_json_context())
+		return details ? (void)0 : xdp_dump_json(tb);
+	else if (details && link)
+		fprintf(fp, "%s    prog/xdp", _SL_);
+	else if (mode == XDP_ATTACHED_DRV)
+		fprintf(fp, "xdp");
+	else if (mode == XDP_ATTACHED_SKB)
+		fprintf(fp, "xdpgeneric");
+	else if (mode == XDP_ATTACHED_HW)
+		fprintf(fp, "xdpoffload");
+	else
+		fprintf(fp, "xdp[%u]", mode);
+
+	if (tb[IFLA_XDP_PROG_ID])
+		prog_id = rta_getattr_u32(tb[IFLA_XDP_PROG_ID]);
+	if (!details) {
+		if (prog_id && !link)
+			fprintf(fp, "/id:%u", prog_id);
+		fprintf(fp, " ");
+		return;
+	}
+
+	if (prog_id) {
+		fprintf(fp, " ");
+		bpf_dump_prog_info(fp, prog_id);
 	}
 }
diff --git a/lib/bpf.c b/lib/bpf.c
index cfa1f79..10ea23a 100644
--- a/lib/bpf.c
+++ b/lib/bpf.c
@@ -40,6 +40,7 @@
 #include <arpa/inet.h>
 
 #include "utils.h"
+#include "json_print.h"
 
 #include "bpf_util.h"
 #include "bpf_elf.h"
@@ -186,23 +187,29 @@ int bpf_dump_prog_info(FILE *f, uint32_t id)
 	int fd, ret, dump_ok = 0;
 	SPRINT_BUF(tmp);
 
-	fprintf(f, "id %u ", id);
+	open_json_object("prog");
+	print_uint(PRINT_ANY, "id", "id %u ", id);
 
 	fd = bpf_prog_fd_by_id(id);
 	if (fd < 0)
-		return dump_ok;
+		goto out;
 
 	ret = bpf_prog_info_by_fd(fd, &info, &len);
 	if (!ret && len) {
-		fprintf(f, "tag %s ",
-			hexstring_n2a(info.tag, sizeof(info.tag),
-				      tmp, sizeof(tmp)));
-		if (info.jited_prog_len)
+		int jited = !!info.jited_prog_len;
+
+		print_string(PRINT_ANY, "tag", "tag %s ",
+			     hexstring_n2a(info.tag, sizeof(info.tag),
+					   tmp, sizeof(tmp)));
+		print_uint(PRINT_JSON, "jited", NULL, jited);
+		if (jited && !is_json_context())
 			fprintf(f, "jited ");
 		dump_ok = 1;
 	}
 
 	close(fd);
+out:
+	close_json_object();
 	return dump_ok;
 }
 
-- 
1.9.3

^ permalink raw reply related

* Re: [PATCH net-next 3/4] cxgb4: add support to offload action vlan
From: Jiri Pirko @ 2017-09-21  8:55 UTC (permalink / raw)
  To: Rahul Lakkireddy; +Cc: netdev, davem, kumaras, ganeshgr, nirranjan, indranil
In-Reply-To: <016c3bf21a7bfe45e73275d3191cf61cceffd362.1505977744.git.rahul.lakkireddy@chelsio.com>

Thu, Sep 21, 2017 at 09:33:36AM CEST, rahul.lakkireddy@chelsio.com wrote:
>From: Kumar Sanghvi <kumaras@chelsio.com>
>
>Add support for offloading tc-flower flows having
>vlan actions: pop, push and modify.
>
>Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
>Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
>Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
>---
> .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c   | 43 ++++++++++++++++++++++
> 1 file changed, 43 insertions(+)
>
>diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c

[...]


>+			switch (vlan_action) {
>+			case TCA_VLAN_ACT_POP:
>+				break;
>+			case TCA_VLAN_ACT_PUSH:
>+			case TCA_VLAN_ACT_MODIFY:
>+				if (proto != ETH_P_8021Q) {
>+					netdev_err(dev,
>+						   "%s: Unsupp. vlan proto\n",

Don't wrap this. Also "Unsupp."vs"Unsupported". Please be consistent.


>+						   __func__);
>+					return -EOPNOTSUPP;
>+				}
>+				break;
>+			default:
>+				netdev_err(dev, "%s: Unsupported vlan action\n",
>+					   __func__);
>+				return -EOPNOTSUPP;
>+			}
> 		} else {
> 			netdev_err(dev, "%s: Unsupported action\n", __func__);
> 			return -EOPNOTSUPP;
>-- 
>2.14.1
>

^ permalink raw reply

* Re: [PATCH net-next 0/4] cxgb4: add support to offload tc flower
From: Jiri Pirko @ 2017-09-21  8:56 UTC (permalink / raw)
  To: Rahul Lakkireddy; +Cc: netdev, davem, kumaras, ganeshgr, nirranjan, indranil
In-Reply-To: <cover.1505977744.git.rahul.lakkireddy@chelsio.com>

Thu, Sep 21, 2017 at 09:33:33AM CEST, rahul.lakkireddy@chelsio.com wrote:
>This series of patches add support to offload tc flower onto Chelsio
>NICs.
>
>Patch 1 adds basic skeleton to prepare for offloading tc flower flows.
>
>Patch 2 adds support to add/remove flows for offload.  Flows can have
>accompanying masks.  Following match and action are currently supported
>for offload:
>Match:  ether-protocol, IPv4/IPv6 addresses, L4 ports (TCP/UDP)
>Action: drop, redirect to another port on the device.
>
>Patch 3 adds support to offload tc-flower flows having
>vlan actions: pop, push, and modify.
>
>Patch 4 adds support to fetch stats for the offloaded tc flower flows
>from hardware.
>
>Support for offloading more match and action types are to be followed
>in subsequent series.

Looks good to me. Thanks!

^ permalink raw reply

* Re: Latest net-next from GIT panic
From: Paweł Staszewski @ 2017-09-21  9:06 UTC (permalink / raw)
  To: Eric Dumazet, Wei Wang
  Cc: Cong Wang, Linux Kernel Network Developers, Eric Dumazet
In-Reply-To: <1505956639.29839.108.camel@edumazet-glaptop3.roam.corp.google.com>



W dniu 2017-09-21 o 03:17, Eric Dumazet pisze:
> On Wed, 2017-09-20 at 18:09 -0700, Wei Wang wrote:
>>> Thanks very much Pawel for the feedback.
>>>
>>> I was looking into the code (specifically IPv4 part) and found that in
>>> free_fib_info_rcu(), we call free_nh_exceptions() without holding the
>>> fnhe_lock. I am wondering if that could cause some race condition on
>>> fnhe->fnhe_rth_input/output so a double call on dst_dev_put() on the
>>> same dst could be happening.
>>>
>>> But as we call free_fib_info_rcu() only after the grace period, and
>>> the lookup code which could potentially modify
>>> fnhe->fnhe_rth_input/output all holds rcu_read_lock(), it seems
>>> fine...
>>>
>> Hi Pawel,
>>
>> Could you try the following debug patch on top of net-next branch and
>> reproduce the issue check if there are warning msg showing?
>>
>> diff --git a/include/net/dst.h b/include/net/dst.h
>> index 93568bd0a352..82aff41c6f63 100644
>> --- a/include/net/dst.h
>> +++ b/include/net/dst.h
>> @@ -271,7 +271,7 @@ static inline void dst_use_noref(struct dst_entry
>> *dst, unsigned long time)
>>   static inline struct dst_entry *dst_clone(struct dst_entry *dst)
>>   {
>>          if (dst)
>> -               atomic_inc(&dst->__refcnt);
>> +               dst_hold(dst);
>>          return dst;
>>   }
>>
>> Thanks.
>> Wei
>>
>
> Yes, we believe skb_dst_force() and skb_dst_force_safe() should be
> unified  (to the 'safe' version)
>
> We no longer have gc to protect from 0 -> 1 transition of dst refcount.
>
>
>
>

After adding patch from Wei
https://bugzilla.kernel.org/show_bug.cgi?id=197005#c14

^ permalink raw reply

* Re: [PATCH net-next 2/5] net: allow early demux to fetch noref socket
From: Paolo Abeni @ 2017-09-21  9:13 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Pablo Neira Ayuso, Florian Westphal,
	Eric Dumazet, Hannes Frederic Sowa
In-Reply-To: <db75c6a6872040712a9ab97b0bac04b697c42a4c.1505926196.git.pabeni@redhat.com>

On Wed, 2017-09-20 at 18:54 +0200, Paolo Abeni wrote:
> We must be careful to avoid leaking such sockets outside
> the RCU section containing the early demux call; we clear
> them on nonlocal delivery.
> 
> For ipv4 we must take care of local mcast delivery, too,
> since udp early demux works also for mcast addresses.
> 
> Also update all iptables/nftables extension that can
> happen in the input chain and can transmit the skb outside
> such patch, namely TEE, nft_dup and nfqueue.
> 
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  net/ipv4/ip_input.c              | 12 ++++++++++++
>  net/ipv4/ipmr.c                  | 18 ++++++++++++++----
>  net/ipv4/netfilter/nf_dup_ipv4.c |  3 +++
>  net/ipv6/ip6_input.c             |  7 ++++++-
>  net/ipv6/netfilter/nf_dup_ipv6.c |  3 +++
>  net/netfilter/nf_queue.c         |  3 +++
>  6 files changed, 41 insertions(+), 5 deletions(-)
> 
> diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
> index fa2dc8f692c6..e71abc8b698c 100644
> --- a/net/ipv4/ip_input.c
> +++ b/net/ipv4/ip_input.c
> @@ -349,6 +349,18 @@ static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
>  				__NET_INC_STATS(net, LINUX_MIB_IPRPFILTER);
>  			goto drop;
>  		}
> +
> +		/* Since the sk has no reference to the socket, we must
> +		 * clear it before escaping this RCU section.
> +		 * The sk is just an hint and we know we are not going to use
> +		 * it outside the input path.
> +		 */
> +		if (skb_dst(skb)->input != ip_local_deliver
> +#ifdef CONFIG_IP_MROUTE
> +		    && skb_dst(skb)->input != ip_mr_input
> +#endif
> +		    )
> +			skb_clear_noref_sk(skb);
>  	}

The above is to allow early demux for multicast sockets even on hosts
acting as multicast router. This is probably overkill: an host will
probably act as a multicast router or receive large amount of locally
terminate mcast traffic.

We can instead preserve the sknoref only for ip_local_deliver(),
dropping the early demux optimization in the above scenario, which
should not be very relevant. Will simplify the above chunk and drop the
need for the ipmr.c changes below; overall this patch will become much
simpler.

Paolo

^ permalink raw reply

* Re: [PATCH net-next 1/5] net: add support for noref skb->sk
From: Paolo Abeni @ 2017-09-21  9:14 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, David S. Miller, Pablo Neira Ayuso, Florian Westphal,
	Eric Dumazet, Hannes Frederic Sowa
In-Reply-To: <1505929295.29839.103.camel@edumazet-glaptop3.roam.corp.google.com>

Hi,

Thank you for looking at it!

On Wed, 2017-09-20 at 10:41 -0700, Eric Dumazet wrote:
> On Wed, 2017-09-20 at 18:54 +0200, Paolo Abeni wrote:
> > Noref sk do not carry a socket refcount, are valid
> > only inside the current RCU section and must be
> > explicitly cleared before exiting such section.
> > 
> > They will be used in a later patch to allow early demux
> > without sock refcounting.
> 
> 
> 
> 
> > +/* dummy destructor used by noref sockets */
> > +void sock_dummyfree(struct sk_buff *skb)
> > +{
> 
> BUG();
> 
> > +}
> > +EXPORT_SYMBOL(sock_dummyfree);
> > +

We can call sock_dummyfree() in legitimate paths, see below, but we can
add a:

WARN_ON_ONCE(!rcu_read_lock_held());

here and in  skb_clear_noref_sk(). That should help much to catch
possible bugs.

> I do not see how you ensure we do not leave RCU section with an skb
> destructor pointing to this sock_dummyfree()
> 
> This patch series looks quite dangerous to me.

The idea is to explicitly clear the sknoref references before leaving
the RCU section. Quite alike what we currently do for dst noref, but
here the only place where we get a noref socket is the socket early
demux, thus the scope of this change is more limited to what we have
with noref dst_entries.

The relevant code is in the next 2 patches; after the demux we preserve
the sknoref only if the skb has a local destination. The UDP socket
will then set the noref on early demux lookup, and the skb will either:

* land on the corresponding UDP socket, the receive function will steal
the sknoref
* be dropped by some nft/iptables target - the dummy destructor is
called
* forwarded by some nft/iptables target outside the input path; we
clear the skref explicitly in such targets. 

Currently there are an handful of places affected, and we can simplify
the code dropping the early demux result for locally terminated
multicast sockets on a host acting as a multicast router, please see
the comment on the next patch.

> Do we really have real applications using connected UDP sockets and
> wanting very high pps throughput ?

The ultimate goal is to improve the unconnected UDP sockets scenario,
we do actually have use cases for that - DNS servers and VoIP SBCs.

Thanks,

Paolo

^ permalink raw reply

* [RFC PATCH 0/0] Introduce MPLS over GRE
From: Amine Kherbouche @ 2017-09-21  9:25 UTC (permalink / raw)
  To: netdev, xeb, roopa; +Cc: amine.kherbouche, equinox

This series introduces the MPLS over GRE encapsulation (RFC 4023).

Various applications of MPLS make use of label stacks with multiple
entries.  In some cases, it is possible to replace the top label of
the stack with an IP-based encapsulation, thereby, it is possible for
two LSRs that are adjacent on an LSP to be separated by an IP network,
even if that IP network does not provide MPLS.

An example of configuration:


         node1                LER1                       LER2                node2
        +-----+             +------+                   +------+             +-----+
        |     |             |      |                   |      |             |     |
        |     |             |      |p3  GRE tunnel   p4|      |             |     |
        |     |p1         p2|      +-------------------+      |p5         p6|     |
        |     +-------------+      +-------------------+      +------------+|     |
        |     |10.100.0.0/24|      |                   |      |10.200.0.0/24|     |
        |     |fd00:100::/64|      |  10.125.0.0/24    |      |fd00:200::/64|     |
        |     |             |      |  fd00:125::/64    |      |             |     |
        |     |             |      |                   |      |             |     |
        |     |             |      |                   |      |             |     |
        |     |             |      |                   |      |             |     |
        |     |             |      |                   |      |             |     |
        +-----+             +------+                   +------+             +-----+


		###	node1	###

ip link set p1 up
ip addr add 10.100.0.1/24 dev p1

		###	LER1	###

ip link set p2 up
ip addr add 10.100.0.2/24 dev p2

ip link set p3 up
ip addr add 10.125.0.1/24 dev p3

modprobe mpls_router
sysctl -w net.mpls.conf.p2.input=1
sysctl -w net.mpls.conf.p3.input=1
sysctl -w net.mpls.platform_labels=1000

ip link add gre1 type gre ttl 64 local 10.125.0.1 remote 10.125.0.2 dev p3
ip link set dev gre1 up

ip -M route add 111 as 222 dev gre1
ip -M route add 555 as 666 via inet 10.100.0.1 dev p2

		###	LER2	###

ip link set p5 up
ip addr add 10.200.0.2/24 dev p5

ip link set p4 up
ip addr add 10.125.0.2/24 dev p4

modprobe mpls_router
sysctl -w net.mpls.conf.p4.input=1
sysctl -w net.mpls.conf.p5.input=1
sysctl -w net.mpls.platform_labels=1000

ip link add gre1 type gre ttl 64 local 10.125.0.2 remote 10.125.0.1 dev p4
ip link set dev gre1 up

ip -M route add 444 as 555 dev gre1
ip -M route add 222 as 333 via inet 10.200.0.1 dev p5

		###	node2	###

ip link set p6 up
ip addr add 10.200.0.1/24 dev p6


Now using this scapy to forge and send packets from the port p1 of node1:

p = Ether(src='de:ed:01:0c:41:09', dst='de:ed:01:2f:3b:ba')
p /= MPLS(s=1, ttl=64, label=111)/Raw(load='\xde')
sendp(p, iface="p1", count=20, inter=0.1)

^ permalink raw reply

* [PATCH 1/2] mpls: expose stack entry function
From: Amine Kherbouche @ 2017-09-21  9:25 UTC (permalink / raw)
  To: netdev, xeb, roopa; +Cc: amine.kherbouche, equinox
In-Reply-To: <1505985924-12479-1-git-send-email-amine.kherbouche@6wind.com>

Exposing mpls_forward() function to be able to be called from elsewhere
such as MPLS over GRE in the next commit.

Signed-off-by: Amine Kherbouche <amine.kherbouche@6wind.com>
---
 include/linux/mpls.h | 3 +++
 net/mpls/af_mpls.c   | 5 +++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/linux/mpls.h b/include/linux/mpls.h
index 384fb22..d5c7599 100644
--- a/include/linux/mpls.h
+++ b/include/linux/mpls.h
@@ -2,10 +2,13 @@
 #define _LINUX_MPLS_H
 
 #include <uapi/linux/mpls.h>
+#include <linux/netdevice.h>
 
 #define MPLS_TTL_MASK		(MPLS_LS_TTL_MASK >> MPLS_LS_TTL_SHIFT)
 #define MPLS_BOS_MASK		(MPLS_LS_S_MASK >> MPLS_LS_S_SHIFT)
 #define MPLS_TC_MASK		(MPLS_LS_TC_MASK >> MPLS_LS_TC_SHIFT)
 #define MPLS_LABEL_MASK		(MPLS_LS_LABEL_MASK >> MPLS_LS_LABEL_SHIFT)
 
+int mpls_forward(struct sk_buff *skb, struct net_device *dev,
+		 struct packet_type *pt, struct net_device *orig_dev);
 #endif  /* _LINUX_MPLS_H */
diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index c5b9ce4..36ea2ad 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -307,8 +307,8 @@ static bool mpls_egress(struct net *net, struct mpls_route *rt,
 	return success;
 }
 
-static int mpls_forward(struct sk_buff *skb, struct net_device *dev,
-			struct packet_type *pt, struct net_device *orig_dev)
+int mpls_forward(struct sk_buff *skb, struct net_device *dev,
+		 struct packet_type *pt, struct net_device *orig_dev)
 {
 	struct net *net = dev_net(dev);
 	struct mpls_shim_hdr *hdr;
@@ -442,6 +442,7 @@ static int mpls_forward(struct sk_buff *skb, struct net_device *dev,
 	kfree_skb(skb);
 	return NET_RX_DROP;
 }
+EXPORT_SYMBOL(mpls_forward);
 
 static struct packet_type mpls_packet_type __read_mostly = {
 	.type = cpu_to_be16(ETH_P_MPLS_UC),
-- 
2.1.4

^ permalink raw reply related

* [PATCH 2/2] ip_tunnel: add mpls over gre encapsulation
From: Amine Kherbouche @ 2017-09-21  9:25 UTC (permalink / raw)
  To: netdev, xeb, roopa; +Cc: amine.kherbouche, equinox
In-Reply-To: <1505985924-12479-1-git-send-email-amine.kherbouche@6wind.com>

This commit introduces the MPLSoGRE support (RFC 4023), using ip tunnel
API.

Encap:
  - Add a new iptunnel type mpls.

Decap:
  - pull gre hdr and call mpls_forward().

Signed-off-by: Amine Kherbouche <amine.kherbouche@6wind.com>
---
 include/net/gre.h              |  3 +++
 include/uapi/linux/if_tunnel.h |  1 +
 net/ipv4/gre_demux.c           | 22 ++++++++++++++++++++++
 net/ipv4/ip_gre.c              |  9 +++++++++
 net/ipv6/ip6_gre.c             |  7 +++++++
 net/mpls/af_mpls.c             | 37 +++++++++++++++++++++++++++++++++++++
 6 files changed, 79 insertions(+)

diff --git a/include/net/gre.h b/include/net/gre.h
index d25d836..88a8343 100644
--- a/include/net/gre.h
+++ b/include/net/gre.h
@@ -35,6 +35,9 @@ struct net_device *gretap_fb_dev_create(struct net *net, const char *name,
 				       u8 name_assign_type);
 int gre_parse_header(struct sk_buff *skb, struct tnl_ptk_info *tpi,
 		     bool *csum_err, __be16 proto, int nhs);
+#if IS_ENABLED(CONFIG_MPLS)
+int mpls_gre_rcv(struct sk_buff *skb, int gre_hdr_len);
+#endif
 
 static inline int gre_calc_hlen(__be16 o_flags)
 {
diff --git a/include/uapi/linux/if_tunnel.h b/include/uapi/linux/if_tunnel.h
index 2e52088..a2f48c0 100644
--- a/include/uapi/linux/if_tunnel.h
+++ b/include/uapi/linux/if_tunnel.h
@@ -84,6 +84,7 @@ enum tunnel_encap_types {
 	TUNNEL_ENCAP_NONE,
 	TUNNEL_ENCAP_FOU,
 	TUNNEL_ENCAP_GUE,
+	TUNNEL_ENCAP_MPLS,
 };
 
 #define TUNNEL_ENCAP_FLAG_CSUM		(1<<0)
diff --git a/net/ipv4/gre_demux.c b/net/ipv4/gre_demux.c
index b798862..a6a937e 100644
--- a/net/ipv4/gre_demux.c
+++ b/net/ipv4/gre_demux.c
@@ -23,6 +23,9 @@
 #include <linux/netdevice.h>
 #include <linux/if_tunnel.h>
 #include <linux/spinlock.h>
+#if IS_ENABLED(CONFIG_MPLS)
+#include <linux/mpls.h>
+#endif
 #include <net/protocol.h>
 #include <net/gre.h>
 
@@ -122,6 +125,25 @@ int gre_parse_header(struct sk_buff *skb, struct tnl_ptk_info *tpi,
 }
 EXPORT_SYMBOL(gre_parse_header);
 
+#if IS_ENABLED(CONFIG_MPLS)
+int mpls_gre_rcv(struct sk_buff *skb, int gre_hdr_len)
+{
+	if (unlikely(!pskb_may_pull(skb, gre_hdr_len)))
+		goto drop;
+
+	/* Pop GRE hdr and reset the skb */
+	skb_pull(skb, gre_hdr_len);
+	skb_reset_network_header(skb);
+
+	mpls_forward(skb, skb->dev, NULL, NULL);
+
+	return 0;
+drop:
+	return NET_RX_DROP;
+}
+EXPORT_SYMBOL(mpls_gre_rcv);
+#endif
+
 static int gre_rcv(struct sk_buff *skb)
 {
 	const struct gre_protocol *proto;
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index 9cee986..dd4431c 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -412,10 +412,19 @@ static int gre_rcv(struct sk_buff *skb)
 			return 0;
 	}
 
+#if IS_ENABLED(CONFIG_MPLS)
+	if (unlikely(tpi.proto == htons(ETH_P_MPLS_UC))) {
+		if (mpls_gre_rcv(skb, hdr_len))
+			goto drop;
+		return 0;
+	}
+#endif
+
 	if (ipgre_rcv(skb, &tpi, hdr_len) == PACKET_RCVD)
 		return 0;
 
 	icmp_send(skb, ICMP_DEST_UNREACH, ICMP_PORT_UNREACH, 0);
+
 drop:
 	kfree_skb(skb);
 	return 0;
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index c82d41e..e52396d 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -476,6 +476,13 @@ static int gre_rcv(struct sk_buff *skb)
 	if (hdr_len < 0)
 		goto drop;
 
+#if IS_ENABLED(CONFIG_MPLS)
+	if (unlikely(tpi.proto == htons(ETH_P_MPLS_UC))) {
+		if (mpls_gre_rcv(skb, hdr_len))
+			goto drop;
+		return 0;
+	}
+#endif
 	if (iptunnel_pull_header(skb, hdr_len, tpi.proto, false))
 		goto drop;
 
diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index 36ea2ad..060ed07 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -16,6 +16,7 @@
 #include <net/arp.h>
 #include <net/ip_fib.h>
 #include <net/netevent.h>
+#include <net/ip_tunnels.h>
 #include <net/netns/generic.h>
 #if IS_ENABLED(CONFIG_IPV6)
 #include <net/ipv6.h>
@@ -39,6 +40,40 @@ static int one = 1;
 static int label_limit = (1 << 20) - 1;
 static int ttl_max = 255;
 
+size_t ipgre_mpls_encap_hlen(struct ip_tunnel_encap *e)
+{
+	return sizeof(struct mpls_shim_hdr);
+}
+
+int ipgre_mpls_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
+			    u8 *protocol, struct flowi4 *fl4)
+{
+	return 0;
+}
+
+static const struct ip_tunnel_encap_ops mpls_iptun_ops = {
+	.encap_hlen = ipgre_mpls_encap_hlen,
+	.build_header = ipgre_mpls_build_header,
+};
+
+int ipgre_tunnel_encap_add_mpls_ops(void)
+{
+	int ret;
+
+	ret = ip_tunnel_encap_add_ops(&mpls_iptun_ops, TUNNEL_ENCAP_MPLS);
+	if (ret < 0) {
+		pr_err("can't add mplsgre ops\n");
+		return ret;
+	}
+
+	return 0;
+}
+
+static void ipgre_tunnel_encap_del_mpls_ops(void)
+{
+	ip_tunnel_encap_del_ops(&mpls_iptun_ops, TUNNEL_ENCAP_MPLS);
+}
+
 static void rtmsg_lfib(int event, u32 label, struct mpls_route *rt,
 		       struct nlmsghdr *nlh, struct net *net, u32 portid,
 		       unsigned int nlm_flags);
@@ -2486,6 +2521,7 @@ static int __init mpls_init(void)
 		      0);
 	rtnl_register(PF_MPLS, RTM_GETNETCONF, mpls_netconf_get_devconf,
 		      mpls_netconf_dump_devconf, 0);
+	ipgre_tunnel_encap_add_mpls_ops();
 	err = 0;
 out:
 	return err;
@@ -2503,6 +2539,7 @@ static void __exit mpls_exit(void)
 	dev_remove_pack(&mpls_packet_type);
 	unregister_netdevice_notifier(&mpls_dev_notifier);
 	unregister_pernet_subsys(&mpls_net_ops);
+	ipgre_tunnel_encap_del_mpls_ops();
 }
 module_exit(mpls_exit);
 
-- 
2.1.4

^ permalink raw reply related

* Re: [PATCH v3 10/31] befs: Define usercopy region in befs_inode_cache slab cache
From: Luis de Bethencourt @ 2017-09-21  9:34 UTC (permalink / raw)
  To: Kees Cook, linux-kernel
  Cc: David Windsor, Salah Triki, linux-fsdevel, netdev, linux-mm,
	kernel-hardening
In-Reply-To: <1505940337-79069-11-git-send-email-keescook@chromium.org>

On 09/20/2017 09:45 PM, Kees Cook wrote:
> From: David Windsor <dave@nullcore.net>
> 
> befs symlink pathnames, stored in struct befs_inode_info.i_data.symlink
> and therefore contained in the befs_inode_cache slab cache, need to be
> copied to/from userspace.
> 
> cache object allocation:
>      fs/befs/linuxvfs.c:
>          befs_alloc_inode(...):
>              ...
>              bi = kmem_cache_alloc(befs_inode_cachep, GFP_KERNEL);
>              ...
>              return &bi->vfs_inode;
> 
>          befs_iget(...):
>              ...
>              strlcpy(befs_ino->i_data.symlink, raw_inode->data.symlink,
>                      BEFS_SYMLINK_LEN);
>              ...
>              inode->i_link = befs_ino->i_data.symlink;
> 
> example usage trace:
>      readlink_copy+0x43/0x70
>      vfs_readlink+0x62/0x110
>      SyS_readlinkat+0x100/0x130
> 
>      fs/namei.c:
>          readlink_copy(..., link):
>              ...
>              copy_to_user(..., link, len);
> 
>          (inlined in vfs_readlink)
>          generic_readlink(dentry, ...):
>              struct inode *inode = d_inode(dentry);
>              const char *link = inode->i_link;
>              ...
>              readlink_copy(..., link);
> 
> In support of usercopy hardening, this patch defines a region in the
> befs_inode_cache slab cache in which userspace copy operations are
> allowed.
> 
> This region is known as the slab cache's usercopy region. Slab caches can
> now check that each copy operation involving cache-managed memory falls
> entirely within the slab's usercopy region.
> 
> This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
> whitelisting code in the last public patch of grsecurity/PaX based on my
> understanding of the code. Changes or omissions from the original code are
> mine and don't reflect the original grsecurity/PaX code.
> 
> Signed-off-by: David Windsor <dave@nullcore.net>
> [kees: adjust commit log, provide usage trace]
> Cc: Luis de Bethencourt <luisbg@kernel.org>
> Cc: Salah Triki <salah.triki@gmail.com>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> Acked-by: Luis de Bethencourt <luisbg@kernel.org>
> ---
>   fs/befs/linuxvfs.c | 14 +++++++++-----
>   1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c
> index a92355cc453b..e5dcd26003dc 100644
> --- a/fs/befs/linuxvfs.c
> +++ b/fs/befs/linuxvfs.c
> @@ -444,11 +444,15 @@ static struct inode *befs_iget(struct super_block *sb, unsigned long ino)
>   static int __init
>   befs_init_inodecache(void)
>   {
> -	befs_inode_cachep = kmem_cache_create("befs_inode_cache",
> -					      sizeof (struct befs_inode_info),
> -					      0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
> -					      init_once);
> +	befs_inode_cachep = kmem_cache_create_usercopy("befs_inode_cache",
> +				sizeof(struct befs_inode_info), 0,
> +				(SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
> +					SLAB_ACCOUNT),
> +				offsetof(struct befs_inode_info,
> +					i_data.symlink),
> +				sizeof_field(struct befs_inode_info,
> +					i_data.symlink),
> +				init_once);
>   	if (befs_inode_cachep == NULL)
>   		return -ENOMEM;
>   
> 

No changes in the befs patch in v3. It goes without saying I continue to 
Ack this.

Thanks Kees and David,
Luis

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* [PATCH net-next 0/2] lan9303: Add basic offloading of unicast traffic
From: Egil Hjelmeland @ 2017-09-21  9:41 UTC (permalink / raw)
  To: andrew, vivien.didelot, f.fainelli, netdev, linux-kernel; +Cc: Egil Hjelmeland

This series add basic offloading of unicast traffic to the lan9303
DSA driver.

Comments welcome!

Egil Hjelmeland (2):
  net: dsa: lan9303: Move tag setup to new lan9303_setup_tagging
  net: dsa: lan9303: Add basic offloading of unicast traffic

 drivers/net/dsa/lan9303-core.c | 130 +++++++++++++++++++++++++++++++++++------
 drivers/net/dsa/lan9303.h      |   1 +
 2 files changed, 114 insertions(+), 17 deletions(-)

-- 
2.11.0

^ permalink raw reply

* [PATCH net-next 1/2] net: dsa: lan9303: Move tag setup to new lan9303_setup_tagging
From: Egil Hjelmeland @ 2017-09-21  9:41 UTC (permalink / raw)
  To: andrew, vivien.didelot, f.fainelli, netdev, linux-kernel; +Cc: Egil Hjelmeland
In-Reply-To: <20170921094139.4250-1-privat@egil-hjelmeland.no>

Prepare for next patch:
Move tag setup from lan9303_separate_ports() to new function
lan9303_setup_tagging()

Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
---
 drivers/net/dsa/lan9303-core.c | 42 +++++++++++++++++++++++++-----------------
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index 07355db2ad81..bba875e114e7 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -157,6 +157,7 @@
 # define LAN9303_SWE_PORT_MIRROR_ENABLE_RX_MIRRORING BIT(1)
 # define LAN9303_SWE_PORT_MIRROR_ENABLE_TX_MIRRORING BIT(0)
 #define LAN9303_SWE_INGRESS_PORT_TYPE 0x1847
+#define  LAN9303_SWE_INGRESS_PORT_TYPE_VLAN 3
 #define LAN9303_BM_CFG 0x1c00
 #define LAN9303_BM_EGRSS_PORT_TYPE 0x1c0c
 # define LAN9303_BM_EGRSS_PORT_TYPE_SPECIAL_TAG_PORT2 (BIT(17) | BIT(16))
@@ -510,11 +511,30 @@ static int lan9303_enable_processing_port(struct lan9303 *chip,
 				LAN9303_MAC_TX_CFG_X_TX_ENABLE);
 }
 
+/* forward special tagged packets from port 0 to port 1 *or* port 2 */
+static int lan9303_setup_tagging(struct lan9303 *chip)
+{
+	int ret;
+	/* enable defining the destination port via special VLAN tagging
+	 * for port 0
+	 */
+	ret = lan9303_write_switch_reg(chip, LAN9303_SWE_INGRESS_PORT_TYPE,
+				       LAN9303_SWE_INGRESS_PORT_TYPE_VLAN);
+	if (ret)
+		return ret;
+
+	/* tag incoming packets at port 1 and 2 on their way to port 0 to be
+	 * able to discover their source port
+	 */
+	return lan9303_write_switch_reg(
+		chip, LAN9303_BM_EGRSS_PORT_TYPE,
+		LAN9303_BM_EGRSS_PORT_TYPE_SPECIAL_TAG_PORT0);
+}
+
 /* We want a special working switch:
  * - do not forward packets between port 1 and 2
  * - forward everything from port 1 to port 0
  * - forward everything from port 2 to port 0
- * - forward special tagged packets from port 0 to port 1 *or* port 2
  */
 static int lan9303_separate_ports(struct lan9303 *chip)
 {
@@ -529,22 +549,6 @@ static int lan9303_separate_ports(struct lan9303 *chip)
 	if (ret)
 		return ret;
 
-	/* enable defining the destination port via special VLAN tagging
-	 * for port 0
-	 */
-	ret = lan9303_write_switch_reg(chip, LAN9303_SWE_INGRESS_PORT_TYPE,
-				       0x03);
-	if (ret)
-		return ret;
-
-	/* tag incoming packets at port 1 and 2 on their way to port 0 to be
-	 * able to discover their source port
-	 */
-	ret = lan9303_write_switch_reg(chip, LAN9303_BM_EGRSS_PORT_TYPE,
-			LAN9303_BM_EGRSS_PORT_TYPE_SPECIAL_TAG_PORT0);
-	if (ret)
-		return ret;
-
 	/* prevent port 1 and 2 from forwarding packets by their own */
 	return lan9303_write_switch_reg(chip, LAN9303_SWE_PORT_STATE,
 				LAN9303_SWE_PORT_STATE_FORWARDING_PORT0 |
@@ -644,6 +648,10 @@ static int lan9303_setup(struct dsa_switch *ds)
 		return -EINVAL;
 	}
 
+	ret = lan9303_setup_tagging(chip);
+	if (ret)
+		dev_err(chip->dev, "failed to setup port tagging %d\n", ret);
+
 	ret = lan9303_separate_ports(chip);
 	if (ret)
 		dev_err(chip->dev, "failed to separate ports %d\n", ret);
-- 
2.11.0

^ permalink raw reply related

* [PATCH net-next 2/2] net: dsa: lan9303: Add basic offloading of unicast traffic
From: Egil Hjelmeland @ 2017-09-21  9:41 UTC (permalink / raw)
  To: andrew, vivien.didelot, f.fainelli, netdev, linux-kernel; +Cc: Egil Hjelmeland
In-Reply-To: <20170921094139.4250-1-privat@egil-hjelmeland.no>

When both user ports are joined to the same bridge, the normal
HW MAC learning is enabled. This means that unicast traffic is forwarded
in HW.

If one of the user ports leave the bridge,
the ports goes back to the initial separated operation.

Port separation relies on disabled HW MAC learning. Hence the condition
that both ports must join same bridge.

Add brigde methods port_bridge_join, port_bridge_leave and
port_stp_state_set.

Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
---
 drivers/net/dsa/lan9303-core.c | 88 ++++++++++++++++++++++++++++++++++++++++++
 drivers/net/dsa/lan9303.h      |  1 +
 2 files changed, 89 insertions(+)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index bba875e114e7..76112674fe6a 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -18,6 +18,7 @@
 #include <linux/mutex.h>
 #include <linux/mii.h>
 #include <linux/phy.h>
+#include <linux/if_bridge.h>
 
 #include "lan9303.h"
 
@@ -146,6 +147,7 @@
 # define LAN9303_SWE_PORT_STATE_FORWARDING_PORT0 (0)
 # define LAN9303_SWE_PORT_STATE_LEARNING_PORT0 BIT(1)
 # define LAN9303_SWE_PORT_STATE_BLOCKING_PORT0 BIT(0)
+# define LAN9303_SWE_PORT_STATE_DISABLED_PORT0 (3)
 #define LAN9303_SWE_PORT_MIRROR 0x1846
 # define LAN9303_SWE_PORT_MIRROR_SNIFF_ALL BIT(8)
 # define LAN9303_SWE_PORT_MIRROR_SNIFFER_PORT2 BIT(7)
@@ -431,6 +433,20 @@ static int lan9303_read_switch_reg(struct lan9303 *chip, u16 regnum, u32 *val)
 	return ret;
 }
 
+static int lan9303_write_switch_reg_mask(
+	struct lan9303 *chip, u16 regnum, u32 val, u32 mask)
+{
+	int ret;
+	u32 reg;
+
+	ret = lan9303_read_switch_reg(chip, regnum, &reg);
+	if (ret)
+		return ret;
+	reg = (reg & ~mask) | val;
+
+	return lan9303_write_switch_reg(chip, regnum, reg);
+}
+
 static int lan9303_write_switch_port(struct lan9303 *chip, int port,
 				     u16 regnum, u32 val)
 {
@@ -556,6 +572,12 @@ static int lan9303_separate_ports(struct lan9303 *chip)
 				LAN9303_SWE_PORT_STATE_BLOCKING_PORT2);
 }
 
+static void lan9303_bridge_ports(struct lan9303 *chip)
+{
+	/* ports bridged: remove mirroring */
+	lan9303_write_switch_reg(chip, LAN9303_SWE_PORT_MIRROR, 0);
+}
+
 static int lan9303_handle_reset(struct lan9303 *chip)
 {
 	if (!chip->reset_gpio)
@@ -844,6 +866,69 @@ static void lan9303_port_disable(struct dsa_switch *ds, int port,
 	}
 }
 
+static int lan9303_port_bridge_join(struct dsa_switch *ds, int port,
+				    struct net_device *br)
+{
+	struct lan9303 *chip = ds->priv;
+
+	dev_dbg(chip->dev, "%s(port %d)\n", __func__, port);
+	if (ds->ports[1].bridge_dev ==  ds->ports[2].bridge_dev) {
+		lan9303_bridge_ports(chip);
+		chip->is_bridged = true;  /* unleash stp_state_set() */
+	}
+
+	return 0;
+}
+
+static void lan9303_port_bridge_leave(struct dsa_switch *ds, int port,
+				      struct net_device *br)
+{
+	struct lan9303 *chip = ds->priv;
+
+	dev_dbg(chip->dev, "%s(port %d)\n", __func__, port);
+	if (chip->is_bridged) {
+		lan9303_separate_ports(chip);
+		chip->is_bridged = false;
+	}
+}
+
+static void lan9303_port_stp_state_set(struct dsa_switch *ds, int port,
+				       u8 state)
+{
+	int portmask, portstate;
+	struct lan9303 *chip = ds->priv;
+
+	dev_dbg(chip->dev, "%s(port %d, state %d)\n",
+		__func__, port, state);
+	if (!chip->is_bridged)
+		return; /* touching SWE_PORT_STATE will break port separation */
+
+	switch (state) {
+	case BR_STATE_DISABLED:
+		portstate = LAN9303_SWE_PORT_STATE_DISABLED_PORT0;
+		break;
+	case BR_STATE_BLOCKING:
+	case BR_STATE_LISTENING:
+		portstate = LAN9303_SWE_PORT_STATE_BLOCKING_PORT0;
+		break;
+	case BR_STATE_LEARNING:
+		portstate = LAN9303_SWE_PORT_STATE_LEARNING_PORT0;
+		break;
+	case BR_STATE_FORWARDING:
+		portstate = LAN9303_SWE_PORT_STATE_FORWARDING_PORT0;
+		break;
+	default:
+		portstate = LAN9303_SWE_PORT_STATE_DISABLED_PORT0;
+		dev_err(chip->dev, "unknown stp state: port %d, state %d\n",
+			port, state);
+	}
+
+	portmask = 0x3 << (port * 2);
+	portstate     <<= (port * 2);
+	lan9303_write_switch_reg_mask(chip, LAN9303_SWE_PORT_STATE,
+				      portstate, portmask);
+}
+
 static const struct dsa_switch_ops lan9303_switch_ops = {
 	.get_tag_protocol = lan9303_get_tag_protocol,
 	.setup = lan9303_setup,
@@ -855,6 +940,9 @@ static const struct dsa_switch_ops lan9303_switch_ops = {
 	.get_sset_count = lan9303_get_sset_count,
 	.port_enable = lan9303_port_enable,
 	.port_disable = lan9303_port_disable,
+	.port_bridge_join       = lan9303_port_bridge_join,
+	.port_bridge_leave      = lan9303_port_bridge_leave,
+	.port_stp_state_set     = lan9303_port_stp_state_set,
 };
 
 static int lan9303_register_switch(struct lan9303 *chip)
diff --git a/drivers/net/dsa/lan9303.h b/drivers/net/dsa/lan9303.h
index 4d8be555ff4d..5be246f05965 100644
--- a/drivers/net/dsa/lan9303.h
+++ b/drivers/net/dsa/lan9303.h
@@ -21,6 +21,7 @@ struct lan9303 {
 	struct dsa_switch *ds;
 	struct mutex indirect_mutex; /* protect indexed register access */
 	const struct lan9303_phy_ops *ops;
+	bool is_bridged; /* true if port 1 and 2 is bridged */
 };
 
 extern const struct regmap_access_table lan9303_register_set;
-- 
2.11.0

^ permalink raw reply related

* Re: [PATCH net-next 0/5] net: introduce noref sk
From: Paolo Abeni @ 2017-09-21  9:42 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, pablo, fw, edumazet, hannes
In-Reply-To: <20170920.202022.2073272587145961156.davem@davemloft.net>

Hi,

Thanks for the feedback!

On Wed, 2017-09-20 at 20:20 -0700, David Miller wrote:
> From: Paolo Abeni <pabeni@redhat.com>
> Date: Wed, 20 Sep 2017 18:54:00 +0200
> 
> > This series introduce the infrastructure to store inside the skb a socket
> > pointer without carrying a refcount to the socket.
> > 
> > Such infrastructure is then used in the network receive path - and
> > specifically the early demux operation.
> > 
> > This allows the UDP early demux to perform a full lookup for UDP sockets,
> > with many benefits:
> > 
> > - the UDP early demux code is now much simpler
> > - the early demux does not hit any performance penalties in case of UDP hash
> >   table collision - previously the early demux performed a partial, unsuccesful,
> >   lookup
> > - early demux is now operational also for unconnected sockets.
> > 
> > This infrastrcture will be used in follow-up series to allow dst caching for
> > unconnected UDP sockets, and than to extend the same features to TCP listening
> > sockets.
> 
> Like Eric, I find this series (while exciting) quite scary :-)
> 
> You really have to post some kind of performance numbers in your
> header posting in order to justify something with these ramifications
> and scale.

This is actually a preparatory work for the next series which will
bring in the real gain. The next patches are still to be polished so we
 posted this separately to get some early feedback. 

If that would help, I can post the follow-up soon as RFC. Overall -
with the follow-up appplied, too - when using a single rx ingress
queue, I measured ~20% tput gain for unconnected ipv4 sockets - with
rp_filter disabled - and ~30% for ipv6 sockets. In case of multiple
ingress queues, the gain is smaller but still measurable (roughly 5%). 

Please let me know if you prefer the see the full work early. 

Thanks,

Paolo

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox