Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next 0/6] mlxsw: Support bridge router interfaces with non-default VLAN
From: David Miller @ 2018-06-26  9:06 UTC (permalink / raw)
  To: idosch; +Cc: netdev, jiri, petrm, mlxsw
In-Reply-To: <20180625074818.17073-1-idosch@mellanox.com>

From: Ido Schimmel <idosch@mellanox.com>
Date: Mon, 25 Jun 2018 10:48:12 +0300

> Petr says:
> 
> When traffic is inserted on a router interface associated with an 802.1q
> bridge, the VLAN that the traffic appears on is determined by PVID of
> the bridge device itself. However currently mlxsw always configures such
> traffic to be forwarded to VLAN 1, regardless of the bridge PVID.
> 
> Fix the problem by modifying the FID-handling code to assign such
> traffic not to FID that corresponds to VLAN 1, but to a FID that
> corresponds to the configured PVID. Bail out if there is no PVID. This
> is implemented in patches #1 and #2.
> 
> From that point on, also forbid any changes to bridge device PVID,
> because such changes would not be reflected. This is implemented in
> patches #3, #4 and #5.
> 
> Finally in patch #6, introduce tests that use bridge as a routed
> interface, and test mlxsw in both the currently-supported scenario of
> using PVID 1, and the newly-supported one of using a custom PVID.

Series applied, thank you.

^ permalink raw reply

* Re: [PATCH 1/1] r8152: napi hangup fix after disconnect
From: David Miller @ 2018-06-26  9:01 UTC (permalink / raw)
  To: jslaby; +Cc: linux-kernel, linux-usb, netdev
In-Reply-To: <20180625072627.9109-1-jslaby@suse.cz>

From: Jiri Slaby <jslaby@suse.cz>
Date: Mon, 25 Jun 2018 09:26:27 +0200

> When unplugging an r8152 adapter while the interface is UP, the NIC
> becomes unusable.  usb->disconnect (aka rtl8152_disconnect) deletes
> napi. Then, rtl8152_disconnect calls unregister_netdev and that invokes
> netdev->ndo_stop (aka rtl8152_close). rtl8152_close tries to
> napi_disable, but the napi is already deleted by disconnect above. So
> the first while loop in napi_disable never finishes. This results in
> complete deadlock of the network layer as there is rtnl_mutex held by
> unregister_netdev.
> 
> So avoid the call to napi_disable in rtl8152_close when the device is
> already gone.
> 
> The other calls to usb_kill_urb, cancel_delayed_work_sync,
> netif_stop_queue etc. seem to be fine. The urb and netdev is not
> destroyed yet.
> 
> Signed-off-by: Jiri Slaby <jslaby@suse.cz>

Applied, thank you.

^ permalink raw reply

* [PATCH net] net: macb: initialize bp->queues[0].bp for at91rm9200
From: Alexandre Belloni @ 2018-06-26  8:44 UTC (permalink / raw)
  To: David S. Miller
  Cc: Rafal Ozieblo, Nicolas Ferre, netdev, linux-kernel,
	Alexandre Belloni

The macb driver currently crashes on at91rm9200 with the following trace:

Unable to handle kernel NULL pointer dereference at virtual address 00000014
[...]
[<c031da44>] (macb_rx_desc) from [<c031f2bc>] (at91ether_open+0x2e8/0x3f8)
[<c031f2bc>] (at91ether_open) from [<c041e8d8>] (__dev_open+0x120/0x13c)
[<c041e8d8>] (__dev_open) from [<c041ec08>] (__dev_change_flags+0x17c/0x1a8)
[<c041ec08>] (__dev_change_flags) from [<c041ec4c>] (dev_change_flags+0x18/0x4c)
[<c041ec4c>] (dev_change_flags) from [<c07a5f4c>] (ip_auto_config+0x220/0x10b0)
[<c07a5f4c>] (ip_auto_config) from [<c000a4fc>] (do_one_initcall+0x78/0x18c)
[<c000a4fc>] (do_one_initcall) from [<c0783e50>] (kernel_init_freeable+0x184/0x1c4)
[<c0783e50>] (kernel_init_freeable) from [<c0574d70>] (kernel_init+0x8/0xe8)
[<c0574d70>] (kernel_init) from [<c00090e0>] (ret_from_fork+0x14/0x34)

Solve that by initializing bp->queues[0].bp in at91ether_init (as is done
in macb_init).

Fixes: ae1f2a56d273 ("net: macb: Added support for many RX queues")
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
---
 drivers/net/ethernet/cadence/macb_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 3e93df5d4e3b..96cc03a6d942 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -3726,6 +3726,8 @@ static int at91ether_init(struct platform_device *pdev)
 	int err;
 	u32 reg;
 
+	bp->queues[0].bp = bp;
+
 	dev->netdev_ops = &at91ether_netdev_ops;
 	dev->ethtool_ops = &macb_ethtool_ops;
 
-- 
2.18.0

^ permalink raw reply related

* Re: [PATCH 00/14] ARM: davinci: step towards removing at24_platform_data
From: Andrew Lunn @ 2018-06-26  8:38 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: Rob Herring, Grygorii Strashko, David Lechner, Ivan Khoronzhuk,
	Kevin Hilman, Greg Kroah-Hartman, Sekhar Nori, Russell King,
	Linux Kernel Mailing List, Bartosz Golaszewski, Lukas Wunner,
	Srinivas Kandagatla, Linux ARM, netdev, Florian Fainelli,
	linux-omap, David S . Miller, Dan Carpenter
In-Reply-To: <CAMRc=MejDn=t6EPbKOfhMfRAb_O-5hxtWRyjAWH1mb5BUySrDA@mail.gmail.com>

On Tue, Jun 26, 2018 at 09:44:19AM +0200, Bartosz Golaszewski wrote:
> 2018-06-25 20:02 GMT+02:00 Andrew Lunn <andrew@lunn.ch>:
> >> With my patch 1/14 you'll get -EPROBE_DEFER from nvmem_cell_get() if
> >> the nvmem provider is not yet registered. Will that help in your case?
> >
> > I don't think so. My driver instantiates the AT24 device. So if i get
> > -EPROBE_DEFER, i need to cleanup the probe, and return -EPROBDE_DEFER
> > to the code. Which means i need to remove the AT24 device...
> >
> >        Andrew
> 
> Are you sure this is the correct approach? I understand that you want
> to use something like board files for your machine? Wouldn't it be
> better to register a platform device for at24, register a cell lookup
> and then depend on that cell from your driver?

Hi Bartosz

The machine is based around a Kontron Com Express module, with an
intel Ivy Bridge CPU. This is then placed into a custom carrier board,
which has a number of i2c devices.

I have a platform driver which matches on the DMI system ID for the
Kontron module.

The Com Express module has a PLD which implements i2c, gpio,
etc. There is an MFD for this, which instantiates the i2c-kempld i2c
bus driver.

My platform driver finds this i2c-kempld bus driver. If it does not
exist yet, it return -EPROBE_DEFER. If it is found it instantiates an
at24 device on it. I need to look at the content of the EEPROM to
determine the hardware revision, plus do a checksum.  From that, i
need to instantiates 1 or 2 additional AT24, up to 4 GPIO expanders,
and i2c to spi converter, add some gpio-leds on the gpio expanders,
create a bit-banging MDIO bus, instantiate an Ethernet switch on the
MDIO bus, maybe add an Ethernet switch to the SPI bus, etc.

As you can see, i have a chain of events. I cannot move onto the next
part of the chain until i know the probe for the previous part has
finished. e.g. i cannot add gpio-leds until i know the gpio expander
has probed. But the gpio expander provides a call back, similar to the
at24 setup(). The MDIO bus and the SPI bus has a mechanism to register
an info structure, just like you have done for NVMEM cells. So that
works out.

The weak link in this chain is that first at24 probe, and knowing when
i can access the nvmem cells for the revision and checksum
information. -EPROBE_DEFER does not help me here. I need either some
sort of blocking wait for the cells to become available, or a callback
in a context which allows me to instantiate more devices.

I also have some steps which cannot be undone. You don't provide a
mechanism to unregister the nvmem info structure. The I2C and MDIO
equivalent also does not provide an unregister for bus info. So once i
register the first info structure, i'm past the point of no return. I
cannot return -EPROBE_DEFER because i cannot unregister the info
structures, so that i can register them again the next time the
platform driver gets probed.

	 Andrew

^ permalink raw reply

* [PATCH net-next V3 2/2] cxgb4: Support ethtool private flags
From: Ganesh Goudar @ 2018-06-26  8:25 UTC (permalink / raw)
  To: netdev, davem
  Cc: nirranjan, indranil, venkatesh, Arjun Vynipadath, Casey Leedom,
	Ganesh Goudar

From: Arjun Vynipadath <arjun@chelsio.com>

This is used to change TX workrequests, which helps in
host->vf communication.

Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c | 42 ++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
index f7eef93..ddb8b9e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
@@ -177,6 +177,10 @@ static char loopback_stats_strings[][ETH_GSTRING_LEN] = {
 	"bg3_frames_trunc       ",
 };
 
+static const char cxgb4_priv_flags_strings[][ETH_GSTRING_LEN] = {
+	[PRIV_FLAG_PORT_TX_VM_BIT] = "port_tx_vm_wr",
+};
+
 static int get_sset_count(struct net_device *dev, int sset)
 {
 	switch (sset) {
@@ -185,6 +189,8 @@ static int get_sset_count(struct net_device *dev, int sset)
 		       ARRAY_SIZE(adapter_stats_strings) +
 		       ARRAY_SIZE(channel_stats_strings) +
 		       ARRAY_SIZE(loopback_stats_strings);
+	case ETH_SS_PRIV_FLAGS:
+		return ARRAY_SIZE(cxgb4_priv_flags_strings);
 	default:
 		return -EOPNOTSUPP;
 	}
@@ -235,6 +241,7 @@ static void get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info)
 			 FW_HDR_FW_VER_MINOR_G(exprom_vers),
 			 FW_HDR_FW_VER_MICRO_G(exprom_vers),
 			 FW_HDR_FW_VER_BUILD_G(exprom_vers));
+	info->n_priv_flags = ARRAY_SIZE(cxgb4_priv_flags_strings);
 }
 
 static void get_strings(struct net_device *dev, u32 stringset, u8 *data)
@@ -250,6 +257,9 @@ static void get_strings(struct net_device *dev, u32 stringset, u8 *data)
 		data += sizeof(channel_stats_strings);
 		memcpy(data, loopback_stats_strings,
 		       sizeof(loopback_stats_strings));
+	} else if (stringset == ETH_SS_PRIV_FLAGS) {
+		memcpy(data, cxgb4_priv_flags_strings,
+		       sizeof(cxgb4_priv_flags_strings));
 	}
 }
 
@@ -1499,6 +1509,36 @@ static int cxgb4_get_module_eeprom(struct net_device *dev,
 			 offset, len, &data[eprom->len - len]);
 }
 
+static u32 cxgb4_get_priv_flags(struct net_device *netdev)
+{
+	struct port_info *pi = netdev_priv(netdev);
+	struct adapter *adapter = pi->adapter;
+
+	return (adapter->eth_flags | pi->eth_flags);
+}
+
+/**
+ *	set_flags - set/unset specified flags if passed in new_flags
+ *	@cur_flags: pointer to current flags
+ *	@new_flags: new incoming flags
+ *	@flags: set of flags to set/unset
+ */
+static inline void set_flags(u32 *cur_flags, u32 new_flags, u32 flags)
+{
+	*cur_flags = (*cur_flags & ~flags) | (new_flags & flags);
+}
+
+static int cxgb4_set_priv_flags(struct net_device *netdev, u32 flags)
+{
+	struct port_info *pi = netdev_priv(netdev);
+	struct adapter *adapter = pi->adapter;
+
+	set_flags(&adapter->eth_flags, flags, PRIV_FLAGS_ADAP);
+	set_flags(&pi->eth_flags, flags, PRIV_FLAGS_PORT);
+
+	return 0;
+}
+
 static const struct ethtool_ops cxgb_ethtool_ops = {
 	.get_link_ksettings = get_link_ksettings,
 	.set_link_ksettings = set_link_ksettings,
@@ -1535,6 +1575,8 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
 	.get_dump_data     = get_dump_data,
 	.get_module_info   = cxgb4_get_module_info,
 	.get_module_eeprom = cxgb4_get_module_eeprom,
+	.get_priv_flags    = cxgb4_get_priv_flags,
+	.set_priv_flags    = cxgb4_set_priv_flags,
 };
 
 void cxgb4_set_ethtool_ops(struct net_device *netdev)
-- 
2.1.0

^ permalink raw reply related

* [PATCH net-next V3 1/2] cxgb4: Add support for FW_ETH_TX_PKT_VM_WR
From: Ganesh Goudar @ 2018-06-26  8:24 UTC (permalink / raw)
  To: netdev, davem
  Cc: nirranjan, indranil, venkatesh, Arjun Vynipadath, Casey Leedom,
	Ganesh Goudar

From: Arjun Vynipadath <arjun@chelsio.com>

The present TX workrequest(FW_ETH_TX_PKT_WR) cant be used for
host->vf communication, since it doesn't loopback the outgoing
packets to virtual interfaces on the same port. This can be done using
FW_ETH_TX_PKT_VM_WR.
This fix depends on ethtool_flags to determine what WR to use for
TX path. Support for setting this flags by user is added in next commit.

Based on the original work by : Casey Leedom <leedom@chelsio.com>

V3
- Made eth_flags type consistent across struct adapter and
  struct port_info.
V2
- Renamed t4_eth_xmit() and t4vf_eth_xmit(), since some compilers
  were warning about conflicting definition in cxgb4vf driver

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h      |  13 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c |   2 +-
 drivers/net/ethernet/chelsio/cxgb4/sge.c        | 372 +++++++++++++++++++++++-
 3 files changed, 383 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 1adb968..a4ea53d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -522,6 +522,15 @@ enum {
 	MAX_INGQ = MAX_ETH_QSETS + INGQ_EXTRAS,
 };
 
+enum {
+	PRIV_FLAG_PORT_TX_VM_BIT,
+};
+
+#define PRIV_FLAG_PORT_TX_VM		BIT(PRIV_FLAG_PORT_TX_VM_BIT)
+
+#define PRIV_FLAGS_ADAP			0
+#define PRIV_FLAGS_PORT			PRIV_FLAG_PORT_TX_VM
+
 struct adapter;
 struct sge_rspq;
 
@@ -558,6 +567,7 @@ struct port_info {
 	struct hwtstamp_config tstamp_config;
 	bool ptp_enable;
 	struct sched_table *sched_tbl;
+	u32 eth_flags;
 };
 
 struct dentry;
@@ -868,6 +878,7 @@ struct adapter {
 	unsigned int flags;
 	unsigned int adap_idx;
 	enum chip_type chip;
+	u32 eth_flags;
 
 	int msg_enable;
 	__be16 vxlan_port;
@@ -1334,7 +1345,7 @@ void t4_os_link_changed(struct adapter *adap, int port_id, int link_stat);
 void t4_free_sge_resources(struct adapter *adap);
 void t4_free_ofld_rxqs(struct adapter *adap, int n, struct sge_ofld_rxq *q);
 irq_handler_t t4_intr_handler(struct adapter *adap);
-netdev_tx_t t4_eth_xmit(struct sk_buff *skb, struct net_device *dev);
+netdev_tx_t t4_start_xmit(struct sk_buff *skb, struct net_device *dev);
 int t4_ethrx_handler(struct sge_rspq *q, const __be64 *rsp,
 		     const struct pkt_gl *gl);
 int t4_mgmt_tx(struct adapter *adap, struct sk_buff *skb);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index bc03c17..d3b0f9c 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -3217,7 +3217,7 @@ static netdev_features_t cxgb_fix_features(struct net_device *dev,
 static const struct net_device_ops cxgb4_netdev_ops = {
 	.ndo_open             = cxgb_open,
 	.ndo_stop             = cxgb_close,
-	.ndo_start_xmit       = t4_eth_xmit,
+	.ndo_start_xmit       = t4_start_xmit,
 	.ndo_select_queue     =	cxgb_select_queue,
 	.ndo_get_stats64      = cxgb_get_stats,
 	.ndo_set_rx_mode      = cxgb_set_rxmode,
diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c
index 395e2a0..f1311fd 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c
@@ -1288,13 +1288,13 @@ static inline void t6_fill_tnl_lso(struct sk_buff *skb,
 }
 
 /**
- *	t4_eth_xmit - add a packet to an Ethernet Tx queue
+ *	cxgb4_eth_xmit - add a packet to an Ethernet Tx queue
  *	@skb: the packet
  *	@dev: the egress net device
  *
  *	Add a packet to an SGE Ethernet Tx queue.  Runs with softirqs disabled.
  */
-netdev_tx_t t4_eth_xmit(struct sk_buff *skb, struct net_device *dev)
+static netdev_tx_t cxgb4_eth_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	u32 wr_mid, ctrl0, op;
 	u64 cntrl, *end, *sgl;
@@ -1547,6 +1547,374 @@ out_free:	dev_kfree_skb_any(skb);
 	return NETDEV_TX_OK;
 }
 
+/* Constants ... */
+enum {
+	/* Egress Queue sizes, producer and consumer indices are all in units
+	 * of Egress Context Units bytes.  Note that as far as the hardware is
+	 * concerned, the free list is an Egress Queue (the host produces free
+	 * buffers which the hardware consumes) and free list entries are
+	 * 64-bit PCI DMA addresses.
+	 */
+	EQ_UNIT = SGE_EQ_IDXSIZE,
+	FL_PER_EQ_UNIT = EQ_UNIT / sizeof(__be64),
+	TXD_PER_EQ_UNIT = EQ_UNIT / sizeof(__be64),
+
+	T4VF_ETHTXQ_MAX_HDR = (sizeof(struct fw_eth_tx_pkt_vm_wr) +
+			       sizeof(struct cpl_tx_pkt_lso_core) +
+			       sizeof(struct cpl_tx_pkt_core)) / sizeof(__be64),
+};
+
+/**
+ *	t4vf_is_eth_imm - can an Ethernet packet be sent as immediate data?
+ *	@skb: the packet
+ *
+ *	Returns whether an Ethernet packet is small enough to fit completely as
+ *	immediate data.
+ */
+static inline int t4vf_is_eth_imm(const struct sk_buff *skb)
+{
+	/* The VF Driver uses the FW_ETH_TX_PKT_VM_WR firmware Work Request
+	 * which does not accommodate immediate data.  We could dike out all
+	 * of the support code for immediate data but that would tie our hands
+	 * too much if we ever want to enhace the firmware.  It would also
+	 * create more differences between the PF and VF Drivers.
+	 */
+	return false;
+}
+
+/**
+ *	t4vf_calc_tx_flits - calculate the number of flits for a packet TX WR
+ *	@skb: the packet
+ *
+ *	Returns the number of flits needed for a TX Work Request for the
+ *	given Ethernet packet, including the needed WR and CPL headers.
+ */
+static inline unsigned int t4vf_calc_tx_flits(const struct sk_buff *skb)
+{
+	unsigned int flits;
+
+	/* If the skb is small enough, we can pump it out as a work request
+	 * with only immediate data.  In that case we just have to have the
+	 * TX Packet header plus the skb data in the Work Request.
+	 */
+	if (t4vf_is_eth_imm(skb))
+		return DIV_ROUND_UP(skb->len + sizeof(struct cpl_tx_pkt),
+				    sizeof(__be64));
+
+	/* Otherwise, we're going to have to construct a Scatter gather list
+	 * of the skb body and fragments.  We also include the flits necessary
+	 * for the TX Packet Work Request and CPL.  We always have a firmware
+	 * Write Header (incorporated as part of the cpl_tx_pkt_lso and
+	 * cpl_tx_pkt structures), followed by either a TX Packet Write CPL
+	 * message or, if we're doing a Large Send Offload, an LSO CPL message
+	 * with an embedded TX Packet Write CPL message.
+	 */
+	flits = sgl_len(skb_shinfo(skb)->nr_frags + 1);
+	if (skb_shinfo(skb)->gso_size)
+		flits += (sizeof(struct fw_eth_tx_pkt_vm_wr) +
+			  sizeof(struct cpl_tx_pkt_lso_core) +
+			  sizeof(struct cpl_tx_pkt_core)) / sizeof(__be64);
+	else
+		flits += (sizeof(struct fw_eth_tx_pkt_vm_wr) +
+			  sizeof(struct cpl_tx_pkt_core)) / sizeof(__be64);
+	return flits;
+}
+
+/**
+ *	cxgb4_vf_eth_xmit - add a packet to an Ethernet TX queue
+ *	@skb: the packet
+ *	@dev: the egress net device
+ *
+ *	Add a packet to an SGE Ethernet TX queue.  Runs with softirqs disabled.
+ */
+static netdev_tx_t cxgb4_vf_eth_xmit(struct sk_buff *skb,
+				     struct net_device *dev)
+{
+	dma_addr_t addr[MAX_SKB_FRAGS + 1];
+	const struct skb_shared_info *ssi;
+	struct fw_eth_tx_pkt_vm_wr *wr;
+	int qidx, credits, max_pkt_len;
+	const size_t fw_hdr_copy_len;
+	struct cpl_tx_pkt_core *cpl;
+	const struct port_info *pi;
+	unsigned int flits, ndesc;
+	struct sge_eth_txq *txq;
+	struct adapter *adapter;
+	u64 cntrl, *end;
+	u32 wr_mid;
+
+	fw_hdr_copy_len = (sizeof(wr->ethmacdst) + sizeof(wr->ethmacsrc) +
+			   sizeof(wr->ethtype) + sizeof(wr->vlantci));
+
+	/* The chip minimum packet length is 10 octets but the firmware
+	 * command that we are using requires that we copy the Ethernet header
+	 * (including the VLAN tag) into the header so we reject anything
+	 * smaller than that ...
+	 */
+	if (unlikely(skb->len < fw_hdr_copy_len))
+		goto out_free;
+
+	/* Discard the packet if the length is greater than mtu */
+	max_pkt_len = ETH_HLEN + dev->mtu;
+	if (skb_vlan_tag_present(skb))
+		max_pkt_len += VLAN_HLEN;
+	if (!skb_shinfo(skb)->gso_size && (unlikely(skb->len > max_pkt_len)))
+		goto out_free;
+
+	/* Figure out which TX Queue we're going to use. */
+	pi = netdev_priv(dev);
+	adapter = pi->adapter;
+	qidx = skb_get_queue_mapping(skb);
+	WARN_ON(qidx >= pi->nqsets);
+	txq = &adapter->sge.ethtxq[pi->first_qset + qidx];
+
+	/* Take this opportunity to reclaim any TX Descriptors whose DMA
+	 * transfers have completed.
+	 */
+	cxgb4_reclaim_completed_tx(adapter, &txq->q, true);
+
+	/* Calculate the number of flits and TX Descriptors we're going to
+	 * need along with how many TX Descriptors will be left over after
+	 * we inject our Work Request.
+	 */
+	flits = t4vf_calc_tx_flits(skb);
+	ndesc = flits_to_desc(flits);
+	credits = txq_avail(&txq->q) - ndesc;
+
+	if (unlikely(credits < 0)) {
+		/* Not enough room for this packet's Work Request.  Stop the
+		 * TX Queue and return a "busy" condition.  The queue will get
+		 * started later on when the firmware informs us that space
+		 * has opened up.
+		 */
+		eth_txq_stop(txq);
+		dev_err(adapter->pdev_dev,
+			"%s: TX ring %u full while queue awake!\n",
+			dev->name, qidx);
+		return NETDEV_TX_BUSY;
+	}
+
+	if (!t4vf_is_eth_imm(skb) &&
+	    unlikely(cxgb4_map_skb(adapter->pdev_dev, skb, addr) < 0)) {
+		/* We need to map the skb into PCI DMA space (because it can't
+		 * be in-lined directly into the Work Request) and the mapping
+		 * operation failed.  Record the error and drop the packet.
+		 */
+		txq->mapping_err++;
+		goto out_free;
+	}
+
+	wr_mid = FW_WR_LEN16_V(DIV_ROUND_UP(flits, 2));
+	if (unlikely(credits < ETHTXQ_STOP_THRES)) {
+		/* After we're done injecting the Work Request for this
+		 * packet, we'll be below our "stop threshold" so stop the TX
+		 * Queue now and schedule a request for an SGE Egress Queue
+		 * Update message.  The queue will get started later on when
+		 * the firmware processes this Work Request and sends us an
+		 * Egress Queue Status Update message indicating that space
+		 * has opened up.
+		 */
+		eth_txq_stop(txq);
+		wr_mid |= FW_WR_EQUEQ_F | FW_WR_EQUIQ_F;
+	}
+
+	/* Start filling in our Work Request.  Note that we do _not_ handle
+	 * the WR Header wrapping around the TX Descriptor Ring.  If our
+	 * maximum header size ever exceeds one TX Descriptor, we'll need to
+	 * do something else here.
+	 */
+	WARN_ON(DIV_ROUND_UP(T4VF_ETHTXQ_MAX_HDR, TXD_PER_EQ_UNIT) > 1);
+	wr = (void *)&txq->q.desc[txq->q.pidx];
+	wr->equiq_to_len16 = cpu_to_be32(wr_mid);
+	wr->r3[0] = cpu_to_be32(0);
+	wr->r3[1] = cpu_to_be32(0);
+	skb_copy_from_linear_data(skb, (void *)wr->ethmacdst, fw_hdr_copy_len);
+	end = (u64 *)wr + flits;
+
+	/* If this is a Large Send Offload packet we'll put in an LSO CPL
+	 * message with an encapsulated TX Packet CPL message.  Otherwise we
+	 * just use a TX Packet CPL message.
+	 */
+	ssi = skb_shinfo(skb);
+	if (ssi->gso_size) {
+		struct cpl_tx_pkt_lso_core *lso = (void *)(wr + 1);
+		bool v6 = (ssi->gso_type & SKB_GSO_TCPV6) != 0;
+		int l3hdr_len = skb_network_header_len(skb);
+		int eth_xtra_len = skb_network_offset(skb) - ETH_HLEN;
+
+		wr->op_immdlen =
+			cpu_to_be32(FW_WR_OP_V(FW_ETH_TX_PKT_VM_WR) |
+				    FW_WR_IMMDLEN_V(sizeof(*lso) +
+						    sizeof(*cpl)));
+		 /* Fill in the LSO CPL message. */
+		lso->lso_ctrl =
+			cpu_to_be32(LSO_OPCODE_V(CPL_TX_PKT_LSO) |
+				    LSO_FIRST_SLICE_F |
+				    LSO_LAST_SLICE_F |
+				    LSO_IPV6_V(v6) |
+				    LSO_ETHHDR_LEN_V(eth_xtra_len / 4) |
+				    LSO_IPHDR_LEN_V(l3hdr_len / 4) |
+				    LSO_TCPHDR_LEN_V(tcp_hdr(skb)->doff));
+		lso->ipid_ofst = cpu_to_be16(0);
+		lso->mss = cpu_to_be16(ssi->gso_size);
+		lso->seqno_offset = cpu_to_be32(0);
+		if (is_t4(adapter->params.chip))
+			lso->len = cpu_to_be32(skb->len);
+		else
+			lso->len = cpu_to_be32(LSO_T5_XFER_SIZE_V(skb->len));
+
+		/* Set up TX Packet CPL pointer, control word and perform
+		 * accounting.
+		 */
+		cpl = (void *)(lso + 1);
+
+		if (CHELSIO_CHIP_VERSION(adapter->params.chip) <= CHELSIO_T5)
+			cntrl = TXPKT_ETHHDR_LEN_V(eth_xtra_len);
+		else
+			cntrl = T6_TXPKT_ETHHDR_LEN_V(eth_xtra_len);
+
+		cntrl |= TXPKT_CSUM_TYPE_V(v6 ?
+					   TX_CSUM_TCPIP6 : TX_CSUM_TCPIP) |
+			 TXPKT_IPHDR_LEN_V(l3hdr_len);
+		txq->tso++;
+		txq->tx_cso += ssi->gso_segs;
+	} else {
+		int len;
+
+		len = (t4vf_is_eth_imm(skb)
+		       ? skb->len + sizeof(*cpl)
+		       : sizeof(*cpl));
+		wr->op_immdlen =
+			cpu_to_be32(FW_WR_OP_V(FW_ETH_TX_PKT_VM_WR) |
+				    FW_WR_IMMDLEN_V(len));
+
+		/* Set up TX Packet CPL pointer, control word and perform
+		 * accounting.
+		 */
+		cpl = (void *)(wr + 1);
+		if (skb->ip_summed == CHECKSUM_PARTIAL) {
+			cntrl = hwcsum(adapter->params.chip, skb) |
+				TXPKT_IPCSUM_DIS_F;
+			txq->tx_cso++;
+		} else {
+			cntrl = TXPKT_L4CSUM_DIS_F | TXPKT_IPCSUM_DIS_F;
+		}
+	}
+
+	/* If there's a VLAN tag present, add that to the list of things to
+	 * do in this Work Request.
+	 */
+	if (skb_vlan_tag_present(skb)) {
+		txq->vlan_ins++;
+		cntrl |= TXPKT_VLAN_VLD_F | TXPKT_VLAN_V(skb_vlan_tag_get(skb));
+	}
+
+	 /* Fill in the TX Packet CPL message header. */
+	cpl->ctrl0 = cpu_to_be32(TXPKT_OPCODE_V(CPL_TX_PKT_XT) |
+				 TXPKT_INTF_V(pi->port_id) |
+				 TXPKT_PF_V(0));
+	cpl->pack = cpu_to_be16(0);
+	cpl->len = cpu_to_be16(skb->len);
+	cpl->ctrl1 = cpu_to_be64(cntrl);
+
+	/* Fill in the body of the TX Packet CPL message with either in-lined
+	 * data or a Scatter/Gather List.
+	 */
+	if (t4vf_is_eth_imm(skb)) {
+		/* In-line the packet's data and free the skb since we don't
+		 * need it any longer.
+		 */
+		cxgb4_inline_tx_skb(skb, &txq->q, cpl + 1);
+		dev_consume_skb_any(skb);
+	} else {
+		/* Write the skb's Scatter/Gather list into the TX Packet CPL
+		 * message and retain a pointer to the skb so we can free it
+		 * later when its DMA completes.  (We store the skb pointer
+		 * in the Software Descriptor corresponding to the last TX
+		 * Descriptor used by the Work Request.)
+		 *
+		 * The retained skb will be freed when the corresponding TX
+		 * Descriptors are reclaimed after their DMAs complete.
+		 * However, this could take quite a while since, in general,
+		 * the hardware is set up to be lazy about sending DMA
+		 * completion notifications to us and we mostly perform TX
+		 * reclaims in the transmit routine.
+		 *
+		 * This is good for performamce but means that we rely on new
+		 * TX packets arriving to run the destructors of completed
+		 * packets, which open up space in their sockets' send queues.
+		 * Sometimes we do not get such new packets causing TX to
+		 * stall.  A single UDP transmitter is a good example of this
+		 * situation.  We have a clean up timer that periodically
+		 * reclaims completed packets but it doesn't run often enough
+		 * (nor do we want it to) to prevent lengthy stalls.  A
+		 * solution to this problem is to run the destructor early,
+		 * after the packet is queued but before it's DMAd.  A con is
+		 * that we lie to socket memory accounting, but the amount of
+		 * extra memory is reasonable (limited by the number of TX
+		 * descriptors), the packets do actually get freed quickly by
+		 * new packets almost always, and for protocols like TCP that
+		 * wait for acks to really free up the data the extra memory
+		 * is even less.  On the positive side we run the destructors
+		 * on the sending CPU rather than on a potentially different
+		 * completing CPU, usually a good thing.
+		 *
+		 * Run the destructor before telling the DMA engine about the
+		 * packet to make sure it doesn't complete and get freed
+		 * prematurely.
+		 */
+		struct ulptx_sgl *sgl = (struct ulptx_sgl *)(cpl + 1);
+		struct sge_txq *tq = &txq->q;
+		int last_desc;
+
+		/* If the Work Request header was an exact multiple of our TX
+		 * Descriptor length, then it's possible that the starting SGL
+		 * pointer lines up exactly with the end of our TX Descriptor
+		 * ring.  If that's the case, wrap around to the beginning
+		 * here ...
+		 */
+		if (unlikely((void *)sgl == (void *)tq->stat)) {
+			sgl = (void *)tq->desc;
+			end = (void *)((void *)tq->desc +
+				       ((void *)end - (void *)tq->stat));
+		}
+
+		cxgb4_write_sgl(skb, tq, sgl, end, 0, addr);
+		skb_orphan(skb);
+
+		last_desc = tq->pidx + ndesc - 1;
+		if (last_desc >= tq->size)
+			last_desc -= tq->size;
+		tq->sdesc[last_desc].skb = skb;
+		tq->sdesc[last_desc].sgl = sgl;
+	}
+
+	/* Advance our internal TX Queue state, tell the hardware about
+	 * the new TX descriptors and return success.
+	 */
+	txq_advance(&txq->q, ndesc);
+
+	cxgb4_ring_tx_db(adapter, &txq->q, ndesc);
+	return NETDEV_TX_OK;
+
+out_free:
+	/* An error of some sort happened.  Free the TX skb and tell the
+	 * OS that we've "dealt" with the packet ...
+	 */
+	dev_kfree_skb_any(skb);
+	return NETDEV_TX_OK;
+}
+
+netdev_tx_t t4_start_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+	struct port_info *pi = netdev_priv(dev);
+
+	if (unlikely(pi->eth_flags & PRIV_FLAG_PORT_TX_VM))
+		return cxgb4_vf_eth_xmit(skb, dev);
+
+	return cxgb4_eth_xmit(skb, dev);
+}
+
 /**
  *	reclaim_completed_tx_imm - reclaim completed control-queue Tx descs
  *	@q: the SGE control Tx queue
-- 
2.1.0

^ permalink raw reply related

* Re: [PATCH v2 net] nfp: cast sizeof() to int when comparing with error code
From: cgxu519 @ 2018-06-26  8:21 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Joe Perches, jakub.kicinski, davem, LKML, cocci, oss-drivers,
	netdev, Dmitry Torokhov, linux-input, linux-s390
In-Reply-To: <alpine.DEB.2.20.1806261005190.4512@hadrien>

On 06/26/2018 04:06 PM, Julia Lawall wrote:
>
> On Mon, 25 Jun 2018, Joe Perches wrote:
>
>> On Tue, 2018-06-26 at 09:16 +0800, Chengguang Xu wrote:
>>> sizeof() will return unsigned value so in the error check
>>> negative error code will be always larger than sizeof().
>> This looks like a general class of error in the kernel
>> where a signed result that could be returning a -errno
>> is tested against < or <= sizeof()
>>
>> A couple examples:
>>
>> drivers/input/mouse/elan_i2c_smbus.c:
>>
>> 		len = i2c_smbus_read_block_data(client,
>> 						ETP_SMBUS_IAP_PASSWORD_READ,
>> 						val);
>> 		if (len < sizeof(u16)) {
>>
>> i2c_smbus_read_block_data can return a negative errno
>>
>>
>> net/smc/smc_clc.c:
>>
>> 	len = kernel_sendmsg(smc->clcsock, &msg, &vec, 1,
>> 			     sizeof(struct smc_clc_msg_decline));
>> 	if (len < sizeof(struct smc_clc_msg_decline))
>>
>> where kernel_sendmsg can return a negative errno
>>
>> There are probably others, I didn't look hard.
>>
>> Perhaps a cocci script to find these could be generated?
> Here's another one:
>
> drivers/usb/serial/ir-usb.c
> @@ -126,13 +126,8 @@ irda_usb_find_class_desc(struct usb_seri
>   	if (!desc)
>   		return NULL;
>
> -	ret = usb_control_msg(dev, usb_rcvctrlpipe(dev, 0),
> -			USB_REQ_CS_IRDA_GET_CLASS_DESC,
> -			USB_DIR_IN | USB_TYPE_CLASS | USB_RECIP_INTERFACE,
> -			0, ifnum, desc, sizeof(*desc), 1000);
>
>   	dev_dbg(&serial->dev->dev, "%s -  ret=%d\n", __func__, ret);
> -	if (ret < sizeof(*desc)) {
>   		dev_dbg(&serial->dev->dev,
>   			"%s - class descriptor read %s (%d)\n", __func__,
>   			(ret < 0) ? "failed" : "too short", ret);
>
> There are other results, but I haven't checked all of them.

Hi Julia,

Thanks for your check. I posted a patch yesterday to fix three places in 
usb subsystem
and the patch is just in queue now, so you can skip these places.

The detail of patch.
---

diff --git a/drivers/usb/serial/ir-usb.c b/drivers/usb/serial/ir-usb.c
index 24b06c7e5e2d..7643716b5299 100644
--- a/drivers/usb/serial/ir-usb.c
+++ b/drivers/usb/serial/ir-usb.c
@@ -132,7 +132,7 @@ irda_usb_find_class_desc(struct usb_serial *serial, unsigned int ifnum)
  			0, ifnum, desc, sizeof(*desc), 1000);
  
  	dev_dbg(&serial->dev->dev, "%s -  ret=%d\n", __func__, ret);
-	if (ret < sizeof(*desc)) {
+	if (ret < (int)sizeof(*desc)) {
  		dev_dbg(&serial->dev->dev,
  			"%s - class descriptor read %s (%d)\n", __func__,
  			(ret < 0) ? "failed" : "too short", ret);
diff --git a/drivers/usb/serial/quatech2.c b/drivers/usb/serial/quatech2.c
index 958e12e1e7c7..ff2322ea5e14 100644
--- a/drivers/usb/serial/quatech2.c
+++ b/drivers/usb/serial/quatech2.c
@@ -194,7 +194,7 @@ static inline int qt2_getregister(struct usb_device *dev,
  	ret = usb_control_msg(dev, usb_rcvctrlpipe(dev, 0),
  			      QT_SET_GET_REGISTER, 0xc0, reg,
  			      uart, data, sizeof(*data), QT2_USB_TIMEOUT);
-	if (ret < sizeof(*data)) {
+	if (ret < (int)sizeof(*data)) {
  		if (ret >= 0)
  			ret = -EIO;
  	}
diff --git a/drivers/usb/serial/ssu100.c b/drivers/usb/serial/ssu100.c
index 2083c267787b..0900b47b5f57 100644
--- a/drivers/usb/serial/ssu100.c
+++ b/drivers/usb/serial/ssu100.c
@@ -104,7 +104,7 @@ static inline int ssu100_getregister(struct usb_device *dev,
  	ret = usb_control_msg(dev, usb_rcvctrlpipe(dev, 0),
  			      QT_SET_GET_REGISTER, 0xc0, reg,
  			      uart, data, sizeof(*data), 300);
-	if (ret < sizeof(*data)) {
+	if (ret < (int)sizeof(*data)) {
  		if (ret >= 0)
  			ret = -EIO;
  	}
---




Thanks,
Chengguang.

^ permalink raw reply related

* [PATCH v3] net: ethernet: stmmac: dwmac-rk: Add GMAC support for px30
From: David Wu @ 2018-06-26  8:19 UTC (permalink / raw)
  To: davem, heiko, robh+dt
  Cc: mark.rutland, huangtao, netdev, linux-kernel, linux-rockchip,
	David Wu, linux-arm-kernel

Add constants and callback functions for the dwmac on px30 Soc.
The base structure is the same, but registers and the bits in
them are moved slightly, and add the clk_mac_speed for selecting
mac speed.

Signed-off-by: David Wu <david.wu@rock-chips.com>
---
Change in v3:
- Add the clock enable/disable for clk_mac_speed.

Change in v2:
- Fix some error in commit title and message.

 .../devicetree/bindings/net/rockchip-dwmac.txt     |  1 +
 drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c     | 71 ++++++++++++++++++++++
 2 files changed, 72 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/rockchip-dwmac.txt b/Documentation/devicetree/bindings/net/rockchip-dwmac.txt
index 9c16ee2..3b71da7 100644
--- a/Documentation/devicetree/bindings/net/rockchip-dwmac.txt
+++ b/Documentation/devicetree/bindings/net/rockchip-dwmac.txt
@@ -4,6 +4,7 @@ The device node has following properties.
 
 Required properties:
  - compatible: should be "rockchip,<name>-gamc"
+   "rockchip,px30-gmac":   found on PX30 SoCs
    "rockchip,rk3128-gmac": found on RK312x SoCs
    "rockchip,rk3228-gmac": found on RK322x SoCs
    "rockchip,rk3288-gmac": found on RK3288 SoCs
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
index 13133b3..fc5fef7 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
@@ -61,6 +61,7 @@ struct rk_priv_data {
 	struct clk *mac_clk_tx;
 	struct clk *clk_mac_ref;
 	struct clk *clk_mac_refout;
+	struct clk *clk_mac_speed;
 	struct clk *aclk_mac;
 	struct clk *pclk_mac;
 	struct clk *clk_phy;
@@ -83,6 +84,64 @@ struct rk_priv_data {
 	(((tx) ? soc##_GMAC_TXCLK_DLY_ENABLE : soc##_GMAC_TXCLK_DLY_DISABLE) | \
 	 ((rx) ? soc##_GMAC_RXCLK_DLY_ENABLE : soc##_GMAC_RXCLK_DLY_DISABLE))
 
+#define PX30_GRF_GMAC_CON1		0x0904
+
+/* PX30_GRF_GMAC_CON1 */
+#define PX30_GMAC_PHY_INTF_SEL_RMII	(GRF_CLR_BIT(4) | GRF_CLR_BIT(5) | \
+					 GRF_BIT(6))
+#define PX30_GMAC_SPEED_10M		GRF_CLR_BIT(2)
+#define PX30_GMAC_SPEED_100M		GRF_BIT(2)
+
+static void px30_set_to_rmii(struct rk_priv_data *bsp_priv)
+{
+	struct device *dev = &bsp_priv->pdev->dev;
+
+	if (IS_ERR(bsp_priv->grf)) {
+		dev_err(dev, "%s: Missing rockchip,grf property\n", __func__);
+		return;
+	}
+
+	regmap_write(bsp_priv->grf, PX30_GRF_GMAC_CON1,
+		     PX30_GMAC_PHY_INTF_SEL_RMII);
+}
+
+static void px30_set_rmii_speed(struct rk_priv_data *bsp_priv, int speed)
+{
+	struct device *dev = &bsp_priv->pdev->dev;
+	int ret;
+
+	if (IS_ERR(bsp_priv->clk_mac_speed)) {
+		dev_err(dev, "%s: Missing clk_mac_speed clock\n", __func__);
+		return;
+	}
+
+	if (speed == 10) {
+		regmap_write(bsp_priv->grf, PX30_GRF_GMAC_CON1,
+			     PX30_GMAC_SPEED_10M);
+
+		ret = clk_set_rate(bsp_priv->clk_mac_speed, 2500000);
+		if (ret)
+			dev_err(dev, "%s: set clk_mac_speed rate 2500000 failed: %d\n",
+				__func__, ret);
+	} else if (speed == 100) {
+		regmap_write(bsp_priv->grf, PX30_GRF_GMAC_CON1,
+			     PX30_GMAC_SPEED_100M);
+
+		ret = clk_set_rate(bsp_priv->clk_mac_speed, 25000000);
+		if (ret)
+			dev_err(dev, "%s: set clk_mac_speed rate 25000000 failed: %d\n",
+				__func__, ret);
+
+	} else {
+		dev_err(dev, "unknown speed value for RMII! speed=%d", speed);
+	}
+}
+
+static const struct rk_gmac_ops px30_ops = {
+	.set_to_rmii = px30_set_to_rmii,
+	.set_rmii_speed = px30_set_rmii_speed,
+};
+
 #define RK3128_GRF_MAC_CON0	0x0168
 #define RK3128_GRF_MAC_CON1	0x016c
 
@@ -1042,6 +1101,10 @@ static int rk_gmac_clk_init(struct plat_stmmacenet_data *plat)
 		}
 	}
 
+	bsp_priv->clk_mac_speed = devm_clk_get(dev, "clk_mac_speed");
+	if (IS_ERR(bsp_priv->clk_mac_speed))
+		dev_err(dev, "cannot get clock %s\n", "clk_mac_speed");
+
 	if (bsp_priv->clock_input) {
 		dev_info(dev, "clock input from PHY\n");
 	} else {
@@ -1094,6 +1157,9 @@ static int gmac_clk_enable(struct rk_priv_data *bsp_priv, bool enable)
 			if (!IS_ERR(bsp_priv->mac_clk_tx))
 				clk_prepare_enable(bsp_priv->mac_clk_tx);
 
+			if (!IS_ERR(bsp_priv->clk_mac_speed))
+				clk_prepare_enable(bsp_priv->clk_mac_speed);
+
 			/**
 			 * if (!IS_ERR(bsp_priv->clk_mac))
 			 *	clk_prepare_enable(bsp_priv->clk_mac);
@@ -1128,6 +1194,10 @@ static int gmac_clk_enable(struct rk_priv_data *bsp_priv, bool enable)
 
 			if (!IS_ERR(bsp_priv->mac_clk_tx))
 				clk_disable_unprepare(bsp_priv->mac_clk_tx);
+
+			if (!IS_ERR(bsp_priv->clk_mac_speed))
+				clk_disable_unprepare(bsp_priv->clk_mac_speed);
+
 			/**
 			 * if (!IS_ERR(bsp_priv->clk_mac))
 			 *	clk_disable_unprepare(bsp_priv->clk_mac);
@@ -1424,6 +1494,7 @@ static int rk_gmac_resume(struct device *dev)
 static SIMPLE_DEV_PM_OPS(rk_gmac_pm_ops, rk_gmac_suspend, rk_gmac_resume);
 
 static const struct of_device_id rk_gmac_dwmac_match[] = {
+	{ .compatible = "rockchip,px30-gmac",	.data = &px30_ops   },
 	{ .compatible = "rockchip,rk3128-gmac", .data = &rk3128_ops },
 	{ .compatible = "rockchip,rk3228-gmac", .data = &rk3228_ops },
 	{ .compatible = "rockchip,rk3288-gmac", .data = &rk3288_ops },
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH v2 net] nfp: cast sizeof() to int when comparing with error code
From: Julia Lawall @ 2018-06-26  8:06 UTC (permalink / raw)
  To: Joe Perches
  Cc: linux-s390, jakub.kicinski, Chengguang Xu, netdev,
	Dmitry Torokhov, LKML, davem, oss-drivers, linux-input, cocci
In-Reply-To: <ff0b2834b771a467246b9f7c1b4600c38f5bb463.camel@perches.com>



On Mon, 25 Jun 2018, Joe Perches wrote:

> On Tue, 2018-06-26 at 09:16 +0800, Chengguang Xu wrote:
> > sizeof() will return unsigned value so in the error check
> > negative error code will be always larger than sizeof().
>
> This looks like a general class of error in the kernel
> where a signed result that could be returning a -errno
> is tested against < or <= sizeof()
>
> A couple examples:
>
> drivers/input/mouse/elan_i2c_smbus.c:
>
> 		len = i2c_smbus_read_block_data(client,
> 						ETP_SMBUS_IAP_PASSWORD_READ,
> 						val);
> 		if (len < sizeof(u16)) {
>
> i2c_smbus_read_block_data can return a negative errno
>
>
> net/smc/smc_clc.c:
>
> 	len = kernel_sendmsg(smc->clcsock, &msg, &vec, 1,
> 			     sizeof(struct smc_clc_msg_decline));
> 	if (len < sizeof(struct smc_clc_msg_decline))
>
> where kernel_sendmsg can return a negative errno
>
> There are probably others, I didn't look hard.
>
> Perhaps a cocci script to find these could be generated?

Here's another one:

drivers/usb/serial/ir-usb.c
@@ -126,13 +126,8 @@ irda_usb_find_class_desc(struct usb_seri
 	if (!desc)
 		return NULL;

-	ret = usb_control_msg(dev, usb_rcvctrlpipe(dev, 0),
-			USB_REQ_CS_IRDA_GET_CLASS_DESC,
-			USB_DIR_IN | USB_TYPE_CLASS | USB_RECIP_INTERFACE,
-			0, ifnum, desc, sizeof(*desc), 1000);

 	dev_dbg(&serial->dev->dev, "%s -  ret=%d\n", __func__, ret);
-	if (ret < sizeof(*desc)) {
 		dev_dbg(&serial->dev->dev,
 			"%s - class descriptor read %s (%d)\n", __func__,
 			(ret < 0) ? "failed" : "too short", ret);

There are other results, but I haven't checked all of them.

julia

^ permalink raw reply

* [patch net-next v2 9/9] selftests: forwarding: add tests for TC chain templates
From: Jiri Pirko @ 2018-06-26  8:00 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, jakub.kicinski, simon.horman,
	john.hurley, dsahern, mlxsw
In-Reply-To: <20180626080000.12964-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Add basic sanity tests for TC chain templates.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 tools/testing/selftests/net/forwarding/lib.sh      |   9 ++
 .../selftests/net/forwarding/tc_chaintemplates.sh  | 160 +++++++++++++++++++++
 2 files changed, 169 insertions(+)
 create mode 100755 tools/testing/selftests/net/forwarding/tc_chaintemplates.sh

diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index a736d1d7ecdb..128a5b5a8ea9 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -39,6 +39,15 @@ check_tc_shblock_support()
 	fi
 }
 
+check_tc_chaintemplate_support()
+{
+	tc filter help 2>&1|grep template &> /dev/null
+	if [[ $? -ne 0 ]]; then
+		echo "SKIP: iproute2 too old; tc is missing chain template support"
+		exit 1
+	fi
+}
+
 if [[ "$(id -u)" -ne 0 ]]; then
 	echo "SKIP: need root privileges"
 	exit 0
diff --git a/tools/testing/selftests/net/forwarding/tc_chaintemplates.sh b/tools/testing/selftests/net/forwarding/tc_chaintemplates.sh
new file mode 100755
index 000000000000..21f2c18e973a
--- /dev/null
+++ b/tools/testing/selftests/net/forwarding/tc_chaintemplates.sh
@@ -0,0 +1,160 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+ALL_TESTS="template_create_destroy template_filter_fits \
+	   template_create_nonempty template_destroy_nonempty"
+NUM_NETIFS=2
+source tc_common.sh
+source lib.sh
+
+h1_create()
+{
+	simple_if_init $h1 192.0.2.1/24
+}
+
+h1_destroy()
+{
+	simple_if_fini $h1 192.0.2.1/24
+}
+
+h2_create()
+{
+	simple_if_init $h2 192.0.2.2/24
+	tc qdisc add dev $h2 clsact
+}
+
+h2_destroy()
+{
+	tc qdisc del dev $h2 clsact
+	simple_if_fini $h2 192.0.2.2/24
+}
+
+template_create_destroy()
+{
+	RET=0
+
+	tc filter template add dev $h2 ingress protocol ip \
+		flower dst_mac 00:00:00:00:00:00/FF:FF:FF:FF:FF:FF
+	check_err $? "Failed to create template for default chain"
+
+	tc filter template add dev $h2 ingress chain 1 protocol ip \
+		flower dst_mac 00:00:00:00:00:00/FF:FF:FF:FF:FF:FF
+	check_err $? "Failed to create template for chain 1"
+
+	tc filter template del dev $h2 ingress
+	check_err $? "Failed to destroy template for default chain"
+
+	tc filter template del dev $h2 ingress chain 1
+	check_err $? "Failed to destroy template for chain 1"
+
+	log_test "template create destroy"
+}
+
+template_filter_fits()
+{
+	RET=0
+
+	tc filter template add dev $h2 ingress protocol ip \
+		flower dst_mac 00:00:00:00:00:00/FF:FF:FF:FF:FF:FF &> /dev/null
+	tc filter template add dev $h2 ingress chain 1 protocol ip \
+		flower src_mac 00:00:00:00:00:00/FF:FF:FF:FF:FF:FF &> /dev/null
+
+	tc filter add dev $h2 ingress protocol ip pref 1 handle 1101 \
+		flower dst_mac $h2mac action drop
+	check_err $? "Failed to insert filter which fits template"
+
+	tc filter add dev $h2 ingress protocol ip pref 1 handle 1102 \
+		flower src_mac $h2mac action drop &> /dev/null
+	check_fail $? "Incorrectly succeded to insert filter which does not template"
+
+	tc filter add dev $h2 ingress chain 1 protocol ip pref 1 handle 1101 \
+		flower src_mac $h2mac action drop
+	check_err $? "Failed to insert filter which fits template"
+
+	tc filter add dev $h2 ingress chain 1protocol ip pref 1 handle 1102 \
+		flower dst_mac $h2mac action drop &> /dev/null
+	check_fail $? "Incorrectly succeded to insert filter which does not template"
+
+	tc filter del dev $h2 ingress chain 1 protocol ip pref 1 handle 1102 \
+		flower &> /dev/null
+	tc filter del dev $h2 ingress chain 1 protocol ip pref 1 handle 1101 \
+		flower &> /dev/null
+
+	tc filter del dev $h2 ingress protocol ip pref 1 handle 1102 \
+		flower &> /dev/null
+	tc filter del dev $h2 ingress protocol ip pref 1 handle 1101 \
+		flower &> /dev/null
+
+	tc filter template del dev $h2 ingress chain 1
+	tc filter template del dev $h2 ingress
+
+	log_test "template filter fits"
+}
+
+template_create_nonempty()
+{
+	RET=0
+
+	tc filter add dev $h2 ingress protocol ip pref 1 handle 1101 \
+		flower dst_mac $h2mac action drop
+	tc filter template add dev $h2 ingress protocol ip \
+		flower dst_mac 00:00:00:00:00:00/FF:FF:FF:FF:FF:FF &> /dev/null
+	check_fail $? "Incorrectly succeded to create template for non-empty chain"
+
+	tc filter template del dev $h2 ingress &> /dev/null
+	tc filter del dev $h2 ingress protocol ip pref 1 handle 1101 flower
+
+	log_test "template create non-empty"
+}
+
+template_destroy_nonempty()
+{
+	RET=0
+
+	tc filter template add dev $h2 ingress protocol ip \
+		flower dst_mac 00:00:00:00:00:00/FF:FF:FF:FF:FF:FF
+	tc filter add dev $h2 ingress protocol ip pref 1 handle 1101 \
+		flower dst_mac $h2mac action drop
+
+	tc filter template del dev $h2 ingress &> /dev/null
+	check_fail $? "Incorrectly succeded to destroy template for non-empty chain"
+	tc filter del dev $h2 ingress protocol ip pref 1 handle 1101 flower
+	tc filter template del dev $h2 ingress &> /dev/null
+	check_err $? "Failed to destroy template for empty chain"
+
+	log_test "template destroy non-empty"
+}
+
+setup_prepare()
+{
+	h1=${NETIFS[p1]}
+	h2=${NETIFS[p2]}
+	h1mac=$(mac_get $h1)
+	h2mac=$(mac_get $h2)
+
+	vrf_prepare
+
+	h1_create
+	h2_create
+}
+
+cleanup()
+{
+	pre_cleanup
+
+	h2_destroy
+	h1_destroy
+
+	vrf_cleanup
+}
+
+check_tc_chaintemplate_support
+
+trap cleanup EXIT
+
+setup_prepare
+setup_wait
+
+tests_run
+
+exit $EXIT_STATUS
-- 
2.14.4

^ permalink raw reply related

* [patch net-next v2 7/9] mlxsw: spectrum: Implement chain template hinting
From: Jiri Pirko @ 2018-06-26  7:59 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, jakub.kicinski, simon.horman,
	john.hurley, dsahern, mlxsw
In-Reply-To: <20180626080000.12964-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Since cld_flower provides information about the filter template for
specific chain, use this information in order to prepare a region.
Use the template to find out what elements are going to be used
and pass that down to mlxsw_sp_acl_tcam_group_add(). Later on, when the
first filter is inserted, the mlxsw_sp_acl_tcam_group_use_patterns()
function would use this element usage information instead of looking
up a pattern.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c     |  5 +++
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     | 12 +++++-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c | 12 ++++--
 .../ethernet/mellanox/mlxsw/spectrum_acl_tcam.c    | 25 ++++++++++--
 .../net/ethernet/mellanox/mlxsw/spectrum_flower.c  | 44 ++++++++++++++++++++--
 5 files changed, 86 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 968b88af2ef5..da19fa343d0b 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -1441,6 +1441,11 @@ mlxsw_sp_setup_tc_cls_flower(struct mlxsw_sp_acl_block *acl_block,
 		return 0;
 	case TC_CLSFLOWER_STATS:
 		return mlxsw_sp_flower_stats(mlxsw_sp, acl_block, f);
+	case TC_CLSFLOWER_TMPLT_CREATE:
+		return mlxsw_sp_flower_tmplt_create(mlxsw_sp, acl_block, f);
+	case TC_CLSFLOWER_TMPLT_DESTROY:
+		mlxsw_sp_flower_tmplt_destroy(mlxsw_sp, acl_block, f);
+		return 0;
 	default:
 		return -EOPNOTSUPP;
 	}
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index 4a519d8edec8..b0a8e611e730 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -459,7 +459,8 @@ enum mlxsw_sp_acl_profile {
 struct mlxsw_sp_acl_profile_ops {
 	size_t ruleset_priv_size;
 	int (*ruleset_add)(struct mlxsw_sp *mlxsw_sp,
-			   void *priv, void *ruleset_priv);
+			   void *priv, void *ruleset_priv,
+			   struct mlxsw_afk_element_usage *tmplt_elusage);
 	void (*ruleset_del)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv);
 	int (*ruleset_bind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv,
 			    struct mlxsw_sp_port *mlxsw_sp_port,
@@ -514,7 +515,8 @@ mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp,
 struct mlxsw_sp_acl_ruleset *
 mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp,
 			 struct mlxsw_sp_acl_block *block, u32 chain_index,
-			 enum mlxsw_sp_acl_profile profile);
+			 enum mlxsw_sp_acl_profile profile,
+			 struct mlxsw_afk_element_usage *tmplt_elusage);
 void mlxsw_sp_acl_ruleset_put(struct mlxsw_sp *mlxsw_sp,
 			      struct mlxsw_sp_acl_ruleset *ruleset);
 u16 mlxsw_sp_acl_ruleset_group_id(struct mlxsw_sp_acl_ruleset *ruleset);
@@ -594,6 +596,12 @@ void mlxsw_sp_flower_destroy(struct mlxsw_sp *mlxsw_sp,
 int mlxsw_sp_flower_stats(struct mlxsw_sp *mlxsw_sp,
 			  struct mlxsw_sp_acl_block *block,
 			  struct tc_cls_flower_offload *f);
+int mlxsw_sp_flower_tmplt_create(struct mlxsw_sp *mlxsw_sp,
+				 struct mlxsw_sp_acl_block *block,
+				 struct tc_cls_flower_offload *f);
+void mlxsw_sp_flower_tmplt_destroy(struct mlxsw_sp *mlxsw_sp,
+				   struct mlxsw_sp_acl_block *block,
+				   struct tc_cls_flower_offload *f);
 
 /* spectrum_qdisc.c */
 int mlxsw_sp_tc_qdisc_init(struct mlxsw_sp_port *mlxsw_sp_port);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index 79b1fa27a9a4..ea42605c451d 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -319,7 +319,8 @@ int mlxsw_sp_acl_block_unbind(struct mlxsw_sp *mlxsw_sp,
 static struct mlxsw_sp_acl_ruleset *
 mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
 			    struct mlxsw_sp_acl_block *block, u32 chain_index,
-			    const struct mlxsw_sp_acl_profile_ops *ops)
+			    const struct mlxsw_sp_acl_profile_ops *ops,
+			    struct mlxsw_afk_element_usage *tmplt_elusage)
 {
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
 	struct mlxsw_sp_acl_ruleset *ruleset;
@@ -339,7 +340,8 @@ mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
 	if (err)
 		goto err_rhashtable_init;
 
-	err = ops->ruleset_add(mlxsw_sp, acl->priv, ruleset->priv);
+	err = ops->ruleset_add(mlxsw_sp, acl->priv, ruleset->priv,
+			       tmplt_elusage);
 	if (err)
 		goto err_ops_ruleset_add;
 
@@ -421,7 +423,8 @@ mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp,
 struct mlxsw_sp_acl_ruleset *
 mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp,
 			 struct mlxsw_sp_acl_block *block, u32 chain_index,
-			 enum mlxsw_sp_acl_profile profile)
+			 enum mlxsw_sp_acl_profile profile,
+			 struct mlxsw_afk_element_usage *tmplt_elusage)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops;
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
@@ -436,7 +439,8 @@ mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp,
 		mlxsw_sp_acl_ruleset_ref_inc(ruleset);
 		return ruleset;
 	}
-	return mlxsw_sp_acl_ruleset_create(mlxsw_sp, block, chain_index, ops);
+	return mlxsw_sp_acl_ruleset_create(mlxsw_sp, block, chain_index, ops,
+					   tmplt_elusage);
 }
 
 void mlxsw_sp_acl_ruleset_put(struct mlxsw_sp *mlxsw_sp,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
index ad1b548e3cac..d6e4e00dfd3d 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
@@ -157,6 +157,8 @@ struct mlxsw_sp_acl_tcam_group {
 	struct mlxsw_sp_acl_tcam_group_ops *ops;
 	const struct mlxsw_sp_acl_tcam_pattern *patterns;
 	unsigned int patterns_count;
+	bool tmplt_elusage_set;
+	struct mlxsw_afk_element_usage tmplt_elusage;
 };
 
 struct mlxsw_sp_acl_tcam_region {
@@ -216,13 +218,19 @@ mlxsw_sp_acl_tcam_group_add(struct mlxsw_sp *mlxsw_sp,
 			    struct mlxsw_sp_acl_tcam *tcam,
 			    struct mlxsw_sp_acl_tcam_group *group,
 			    const struct mlxsw_sp_acl_tcam_pattern *patterns,
-			    unsigned int patterns_count)
+			    unsigned int patterns_count,
+			    struct mlxsw_afk_element_usage *tmplt_elusage)
 {
 	int err;
 
 	group->tcam = tcam;
 	group->patterns = patterns;
 	group->patterns_count = patterns_count;
+	if (tmplt_elusage) {
+		group->tmplt_elusage_set = true;
+		memcpy(&group->tmplt_elusage, tmplt_elusage,
+		       sizeof(group->tmplt_elusage));
+	}
 	INIT_LIST_HEAD(&group->region_list);
 	err = mlxsw_sp_acl_tcam_group_id_get(tcam, &group->id);
 	if (err)
@@ -431,6 +439,15 @@ mlxsw_sp_acl_tcam_group_use_patterns(struct mlxsw_sp_acl_tcam_group *group,
 	const struct mlxsw_sp_acl_tcam_pattern *pattern;
 	int i;
 
+	/* In case the template is set, we don't have to look up the pattern
+	 * and just use the template.
+	 */
+	if (group->tmplt_elusage_set) {
+		memcpy(out, &group->tmplt_elusage, sizeof(*out));
+		WARN_ON(!mlxsw_afk_element_usage_subset(elusage, out));
+		return;
+	}
+
 	for (i = 0; i < group->patterns_count; i++) {
 		pattern = &group->patterns[i];
 		mlxsw_afk_element_usage_fill(out, pattern->elements,
@@ -1019,14 +1036,16 @@ struct mlxsw_sp_acl_tcam_flower_rule {
 
 static int
 mlxsw_sp_acl_tcam_flower_ruleset_add(struct mlxsw_sp *mlxsw_sp,
-				     void *priv, void *ruleset_priv)
+				     void *priv, void *ruleset_priv,
+				     struct mlxsw_afk_element_usage *tmplt_elusage)
 {
 	struct mlxsw_sp_acl_tcam_flower_ruleset *ruleset = ruleset_priv;
 	struct mlxsw_sp_acl_tcam *tcam = priv;
 
 	return mlxsw_sp_acl_tcam_group_add(mlxsw_sp, tcam, &ruleset->group,
 					   mlxsw_sp_acl_tcam_patterns,
-					   MLXSW_SP_ACL_TCAM_PATTERNS_COUNT);
+					   MLXSW_SP_ACL_TCAM_PATTERNS_COUNT,
+					   tmplt_elusage);
 }
 
 static void
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
index 89dbf569dff5..f34b410e5048 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
@@ -401,7 +401,7 @@ int mlxsw_sp_flower_replace(struct mlxsw_sp *mlxsw_sp,
 
 	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, block,
 					   f->common.chain_index,
-					   MLXSW_SP_ACL_PROFILE_FLOWER);
+					   MLXSW_SP_ACL_PROFILE_FLOWER, NULL);
 	if (IS_ERR(ruleset))
 		return PTR_ERR(ruleset);
 
@@ -445,7 +445,7 @@ void mlxsw_sp_flower_destroy(struct mlxsw_sp *mlxsw_sp,
 
 	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, block,
 					   f->common.chain_index,
-					   MLXSW_SP_ACL_PROFILE_FLOWER);
+					   MLXSW_SP_ACL_PROFILE_FLOWER, NULL);
 	if (IS_ERR(ruleset))
 		return;
 
@@ -471,7 +471,7 @@ int mlxsw_sp_flower_stats(struct mlxsw_sp *mlxsw_sp,
 
 	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, block,
 					   f->common.chain_index,
-					   MLXSW_SP_ACL_PROFILE_FLOWER);
+					   MLXSW_SP_ACL_PROFILE_FLOWER, NULL);
 	if (WARN_ON(IS_ERR(ruleset)))
 		return -EINVAL;
 
@@ -493,3 +493,41 @@ int mlxsw_sp_flower_stats(struct mlxsw_sp *mlxsw_sp,
 	mlxsw_sp_acl_ruleset_put(mlxsw_sp, ruleset);
 	return err;
 }
+
+int mlxsw_sp_flower_tmplt_create(struct mlxsw_sp *mlxsw_sp,
+				 struct mlxsw_sp_acl_block *block,
+				 struct tc_cls_flower_offload *f)
+{
+	struct mlxsw_sp_acl_ruleset *ruleset;
+	struct mlxsw_sp_acl_rule_info rulei;
+	int err;
+
+	memset(&rulei, 0, sizeof(rulei));
+	err = mlxsw_sp_flower_parse(mlxsw_sp, block, &rulei, f);
+	if (err)
+		return err;
+	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, block,
+					   f->common.chain_index,
+					   MLXSW_SP_ACL_PROFILE_FLOWER,
+					   &rulei.values.elusage);
+	if (IS_ERR(ruleset))
+		return PTR_ERR(ruleset);
+	/* keep the reference to the ruleset */
+	return 0;
+}
+
+void mlxsw_sp_flower_tmplt_destroy(struct mlxsw_sp *mlxsw_sp,
+				   struct mlxsw_sp_acl_block *block,
+				   struct tc_cls_flower_offload *f)
+{
+	struct mlxsw_sp_acl_ruleset *ruleset;
+
+	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, block,
+					   f->common.chain_index,
+					   MLXSW_SP_ACL_PROFILE_FLOWER, NULL);
+	if (IS_ERR(ruleset))
+		return;
+	/* put the reference to the ruleset kept in create */
+	mlxsw_sp_acl_ruleset_put(mlxsw_sp, ruleset);
+	mlxsw_sp_acl_ruleset_put(mlxsw_sp, ruleset);
+}
-- 
2.14.4

^ permalink raw reply related

* [patch net-next v2 8/9] selftests: forwarding: move shblock tc support check to a separate helper
From: Jiri Pirko @ 2018-06-26  7:59 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, jakub.kicinski, simon.horman,
	john.hurley, dsahern, mlxsw
In-Reply-To: <20180626080000.12964-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

The shared block support is only needed for tc_shblock.sh. No need to
require that for other test.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 tools/testing/selftests/net/forwarding/lib.sh         | 3 +++
 tools/testing/selftests/net/forwarding/tc_shblocks.sh | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index 7b18a53aa556..a736d1d7ecdb 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -28,7 +28,10 @@ check_tc_version()
 		echo "SKIP: iproute2 too old; tc is missing JSON support"
 		exit 1
 	fi
+}
 
+check_tc_shblock_support()
+{
 	tc filter help 2>&1 | grep block &> /dev/null
 	if [[ $? -ne 0 ]]; then
 		echo "SKIP: iproute2 too old; tc is missing shared block support"
diff --git a/tools/testing/selftests/net/forwarding/tc_shblocks.sh b/tools/testing/selftests/net/forwarding/tc_shblocks.sh
index b5b917203815..9826a446e2c0 100755
--- a/tools/testing/selftests/net/forwarding/tc_shblocks.sh
+++ b/tools/testing/selftests/net/forwarding/tc_shblocks.sh
@@ -105,6 +105,8 @@ cleanup()
 	ip link set $swp2 address $swp2origmac
 }
 
+check_tc_shblock_support
+
 trap cleanup EXIT
 
 setup_prepare
-- 
2.14.4

^ permalink raw reply related

* [patch net-next v2 6/9] net: sched: cls_flower: propagate chain teplate creation and destruction to drivers
From: Jiri Pirko @ 2018-06-26  7:59 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, jakub.kicinski, simon.horman,
	john.hurley, dsahern, mlxsw
In-Reply-To: <20180626080000.12964-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Introduce a couple of flower offload commands in order to propagate
template creation/destruction events down to device drivers.
Drivers may use this information to prepare HW in an optimal way
for future filter insertions.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v1->v2:
- remove leftover extack arg in fl_hw_create_tmplt()
---
 include/net/pkt_cls.h  |  2 ++
 net/sched/cls_flower.c | 39 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index a3c1a2c47cd4..e83968cf9a70 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -715,6 +715,8 @@ enum tc_fl_command {
 	TC_CLSFLOWER_REPLACE,
 	TC_CLSFLOWER_DESTROY,
 	TC_CLSFLOWER_STATS,
+	TC_CLSFLOWER_TMPLT_CREATE,
+	TC_CLSFLOWER_TMPLT_DESTROY,
 };
 
 struct tc_cls_flower_offload {
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index d64d43843a3a..614dd558d5f1 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -1120,6 +1120,42 @@ static void fl_walk(struct tcf_proto *tp, struct tcf_walker *arg)
 	}
 }
 
+static void fl_hw_create_tmplt(struct tcf_chain *chain,
+			       struct fl_flow_tmplt *tmplt)
+{
+	struct tc_cls_flower_offload cls_flower = {};
+	struct tcf_block *block = chain->block;
+	struct tcf_exts dummy_exts = { 0, };
+
+	cls_flower.common.chain_index = chain->index;
+	cls_flower.command = TC_CLSFLOWER_TMPLT_CREATE;
+	cls_flower.cookie = (unsigned long) tmplt;
+	cls_flower.dissector = &tmplt->dissector;
+	cls_flower.mask = &tmplt->mask;
+	cls_flower.key = &tmplt->dummy_key;
+	cls_flower.exts = &dummy_exts;
+
+	/* We don't care if driver (any of them) fails to handle this
+	 * call. It serves just as a hint for it.
+	 */
+	tc_setup_cb_call(block, NULL, TC_SETUP_CLSFLOWER,
+			 &cls_flower, false);
+}
+
+static void fl_hw_destroy_tmplt(struct tcf_chain *chain,
+				struct fl_flow_tmplt *tmplt)
+{
+	struct tc_cls_flower_offload cls_flower = {};
+	struct tcf_block *block = chain->block;
+
+	cls_flower.common.chain_index = chain->index;
+	cls_flower.command = TC_CLSFLOWER_TMPLT_DESTROY;
+	cls_flower.cookie = (unsigned long) tmplt;
+
+	tc_setup_cb_call(block, NULL, TC_SETUP_CLSFLOWER,
+			 &cls_flower, false);
+}
+
 static void *fl_tmplt_create(struct net *net, struct tcf_chain *chain,
 			     struct nlattr **tca,
 			     struct netlink_ext_ack *extack)
@@ -1150,6 +1186,8 @@ static void *fl_tmplt_create(struct net *net, struct tcf_chain *chain,
 
 	fl_init_dissector(&tmplt->dissector, &tmplt->mask);
 
+	fl_hw_create_tmplt(chain, tmplt);
+
 	return tmplt;
 
 errout_tmplt:
@@ -1163,6 +1201,7 @@ static void fl_tmplt_destroy(void *tmplt_priv)
 {
 	struct fl_flow_tmplt *tmplt = tmplt_priv;
 
+	fl_hw_destroy_tmplt(tmplt->chain, tmplt);
 	kfree(tmplt);
 }
 
-- 
2.14.4

^ permalink raw reply related

* [patch net-next v2 5/9] net: sched: cls_flower: implement chain templates
From: Jiri Pirko @ 2018-06-26  7:59 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, jakub.kicinski, simon.horman,
	john.hurley, dsahern, mlxsw
In-Reply-To: <20180626080000.12964-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Use the previously introduced template extension and implement
callback to create, destroy and dump chain template. The existing
parsing and dumping functions are re-used. Also, check if newly added
filters fit the template if it is set.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/sched/cls_flower.c | 107 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 106 insertions(+), 1 deletion(-)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 9ce4375b3252..d64d43843a3a 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -70,6 +70,13 @@ struct fl_flow_mask {
 	struct list_head list;
 };
 
+struct fl_flow_tmplt {
+	struct fl_flow_key dummy_key;
+	struct fl_flow_key mask;
+	struct flow_dissector dissector;
+	struct tcf_chain *chain;
+};
+
 struct cls_fl_head {
 	struct rhashtable ht;
 	struct list_head masks;
@@ -144,6 +151,23 @@ static void fl_set_masked_key(struct fl_flow_key *mkey, struct fl_flow_key *key,
 		*lmkey++ = *lkey++ & *lmask++;
 }
 
+static bool fl_mask_fits_tmplt(struct fl_flow_tmplt *tmplt,
+			       struct fl_flow_mask *mask)
+{
+	const long *lmask = fl_key_get_start(&mask->key, mask);
+	const long *ltmplt;
+	int i;
+
+	if (!tmplt)
+		return true;
+	ltmplt = fl_key_get_start(&tmplt->mask, mask);
+	for (i = 0; i < fl_mask_range(mask); i += sizeof(long)) {
+		if (~*ltmplt++ & *lmask++)
+			return false;
+	}
+	return true;
+}
+
 static void fl_clear_masked_range(struct fl_flow_key *key,
 				  struct fl_flow_mask *mask)
 {
@@ -902,6 +926,7 @@ static int fl_set_parms(struct net *net, struct tcf_proto *tp,
 			struct cls_fl_filter *f, struct fl_flow_mask *mask,
 			unsigned long base, struct nlattr **tb,
 			struct nlattr *est, bool ovr,
+			struct fl_flow_tmplt *tmplt,
 			struct netlink_ext_ack *extack)
 {
 	int err;
@@ -922,6 +947,11 @@ static int fl_set_parms(struct net *net, struct tcf_proto *tp,
 	fl_mask_update_range(mask);
 	fl_set_masked_key(&f->mkey, &f->key, mask);
 
+	if (!fl_mask_fits_tmplt(tmplt, mask)) {
+		NL_SET_ERR_MSG_MOD(extack, "Mask does not fit the template");
+		return -EINVAL;
+	}
+
 	return 0;
 }
 
@@ -932,6 +962,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 		     struct netlink_ext_ack *extack)
 {
 	struct cls_fl_head *head = rtnl_dereference(tp->root);
+	struct fl_flow_tmplt *tmplt = tmplt_priv;
 	struct cls_fl_filter *fold = *arg;
 	struct cls_fl_filter *fnew;
 	struct nlattr **tb;
@@ -988,7 +1019,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 	}
 
 	err = fl_set_parms(net, tp, fnew, &mask, base, tb, tca[TCA_RATE], ovr,
-			   extack);
+			   tmplt, extack);
 	if (err)
 		goto errout_idr;
 
@@ -1089,6 +1120,52 @@ static void fl_walk(struct tcf_proto *tp, struct tcf_walker *arg)
 	}
 }
 
+static void *fl_tmplt_create(struct net *net, struct tcf_chain *chain,
+			     struct nlattr **tca,
+			     struct netlink_ext_ack *extack)
+{
+	struct fl_flow_tmplt *tmplt;
+	struct nlattr **tb;
+	int err;
+
+	if (!tca[TCA_OPTIONS])
+		return ERR_PTR(-EINVAL);
+
+	tb = kcalloc(TCA_FLOWER_MAX + 1, sizeof(struct nlattr *), GFP_KERNEL);
+	if (!tb)
+		return ERR_PTR(-ENOBUFS);
+	err = nla_parse_nested(tb, TCA_FLOWER_MAX, tca[TCA_OPTIONS],
+			       fl_policy, NULL);
+	if (err)
+		goto errout_tb;
+
+	tmplt = kzalloc(sizeof(*tmplt), GFP_KERNEL);
+	if (!tmplt)
+		goto errout_tb;
+	tmplt->chain = chain;
+	err = fl_set_key(net, tb, &tmplt->dummy_key, &tmplt->mask, extack);
+	if (err)
+		goto errout_tmplt;
+	kfree(tb);
+
+	fl_init_dissector(&tmplt->dissector, &tmplt->mask);
+
+	return tmplt;
+
+errout_tmplt:
+	kfree(tmplt);
+errout_tb:
+	kfree(tb);
+	return ERR_PTR(err);
+}
+
+static void fl_tmplt_destroy(void *tmplt_priv)
+{
+	struct fl_flow_tmplt *tmplt = tmplt_priv;
+
+	kfree(tmplt);
+}
+
 static int fl_dump_key_val(struct sk_buff *skb,
 			   void *val, int val_type,
 			   void *mask, int mask_type, int len)
@@ -1435,6 +1512,31 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, void *fh,
 	return -1;
 }
 
+static int fl_tmplt_dump(struct sk_buff *skb, struct net *net, void *tmplt_priv)
+{
+	struct fl_flow_tmplt *tmplt = tmplt_priv;
+	struct fl_flow_key *key, *mask;
+	struct nlattr *nest;
+
+	nest = nla_nest_start(skb, TCA_OPTIONS);
+	if (!nest)
+		goto nla_put_failure;
+
+	key = &tmplt->dummy_key;
+	mask = &tmplt->mask;
+
+	if (fl_dump_key(skb, net, key, mask))
+		goto nla_put_failure;
+
+	nla_nest_end(skb, nest);
+
+	return skb->len;
+
+nla_put_failure:
+	nla_nest_cancel(skb, nest);
+	return -EMSGSIZE;
+}
+
 static void fl_bind_class(void *fh, u32 classid, unsigned long cl)
 {
 	struct cls_fl_filter *f = fh;
@@ -1454,6 +1556,9 @@ static struct tcf_proto_ops cls_fl_ops __read_mostly = {
 	.walk		= fl_walk,
 	.dump		= fl_dump,
 	.bind_class	= fl_bind_class,
+	.tmplt_create	= fl_tmplt_create,
+	.tmplt_destroy	= fl_tmplt_destroy,
+	.tmplt_dump	= fl_tmplt_dump,
 	.owner		= THIS_MODULE,
 };
 
-- 
2.14.4

^ permalink raw reply related

* [patch net-next v2 4/9] net: sched: cls_flower: change fl_init_dissector to accept mask and dissector
From: Jiri Pirko @ 2018-06-26  7:59 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, jakub.kicinski, simon.horman,
	john.hurley, dsahern, mlxsw
In-Reply-To: <20180626080000.12964-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

This function is going to be used for templates as well, so we need to
pass the pointer separately.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/sched/cls_flower.c | 39 ++++++++++++++++++++-------------------
 1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 76c5516357d5..9ce4375b3252 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -793,47 +793,48 @@ static int fl_init_mask_hashtable(struct fl_flow_mask *mask)
 			FL_KEY_SET(keys, cnt, id, member);			\
 	} while(0);
 
-static void fl_init_dissector(struct fl_flow_mask *mask)
+static void fl_init_dissector(struct flow_dissector *dissector,
+			      struct fl_flow_key *mask)
 {
 	struct flow_dissector_key keys[FLOW_DISSECTOR_KEY_MAX];
 	size_t cnt = 0;
 
 	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_CONTROL, control);
 	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_BASIC, basic);
-	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
+	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
 			     FLOW_DISSECTOR_KEY_ETH_ADDRS, eth);
-	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
+	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
 			     FLOW_DISSECTOR_KEY_IPV4_ADDRS, ipv4);
-	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
+	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
 			     FLOW_DISSECTOR_KEY_IPV6_ADDRS, ipv6);
-	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
+	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
 			     FLOW_DISSECTOR_KEY_PORTS, tp);
-	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
+	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
 			     FLOW_DISSECTOR_KEY_IP, ip);
-	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
+	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
 			     FLOW_DISSECTOR_KEY_TCP, tcp);
-	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
+	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
 			     FLOW_DISSECTOR_KEY_ICMP, icmp);
-	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
+	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
 			     FLOW_DISSECTOR_KEY_ARP, arp);
-	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
+	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
 			     FLOW_DISSECTOR_KEY_MPLS, mpls);
-	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
+	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
 			     FLOW_DISSECTOR_KEY_VLAN, vlan);
-	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
+	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
 			     FLOW_DISSECTOR_KEY_ENC_KEYID, enc_key_id);
-	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
+	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
 			     FLOW_DISSECTOR_KEY_ENC_IPV4_ADDRS, enc_ipv4);
-	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
+	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
 			     FLOW_DISSECTOR_KEY_ENC_IPV6_ADDRS, enc_ipv6);
-	if (FL_KEY_IS_MASKED(&mask->key, enc_ipv4) ||
-	    FL_KEY_IS_MASKED(&mask->key, enc_ipv6))
+	if (FL_KEY_IS_MASKED(mask, enc_ipv4) ||
+	    FL_KEY_IS_MASKED(mask, enc_ipv6))
 		FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_ENC_CONTROL,
 			   enc_control);
-	FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
+	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
 			     FLOW_DISSECTOR_KEY_ENC_PORTS, enc_tp);
 
-	skb_flow_dissector_init(&mask->dissector, keys, cnt);
+	skb_flow_dissector_init(dissector, keys, cnt);
 }
 
 static struct fl_flow_mask *fl_create_new_mask(struct cls_fl_head *head,
@@ -852,7 +853,7 @@ static struct fl_flow_mask *fl_create_new_mask(struct cls_fl_head *head,
 	if (err)
 		goto errout_free;
 
-	fl_init_dissector(newmask);
+	fl_init_dissector(&newmask->dissector, &newmask->key);
 
 	INIT_LIST_HEAD_RCU(&newmask->filters);
 
-- 
2.14.4

^ permalink raw reply related

* [patch net-next v2 3/9] net: sched: cls_flower: move key/mask dumping into a separate function
From: Jiri Pirko @ 2018-06-26  7:59 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, jakub.kicinski, simon.horman,
	john.hurley, dsahern, mlxsw
In-Reply-To: <20180626080000.12964-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Push key/mask dumping from fl_dump() into a separate function
fl_dump_key(), that will be reused for template dumping.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/sched/cls_flower.c | 62 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 37 insertions(+), 25 deletions(-)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 09d6c6e67f9d..76c5516357d5 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -1217,29 +1217,9 @@ static int fl_dump_key_flags(struct sk_buff *skb, u32 flags_key, u32 flags_mask)
 	return nla_put(skb, TCA_FLOWER_KEY_FLAGS_MASK, 4, &_mask);
 }
 
-static int fl_dump(struct net *net, struct tcf_proto *tp, void *fh,
-		   struct sk_buff *skb, struct tcmsg *t)
+static int fl_dump_key(struct sk_buff *skb, struct net *net,
+		       struct fl_flow_key *key, struct fl_flow_key *mask)
 {
-	struct cls_fl_filter *f = fh;
-	struct nlattr *nest;
-	struct fl_flow_key *key, *mask;
-
-	if (!f)
-		return skb->len;
-
-	t->tcm_handle = f->handle;
-
-	nest = nla_nest_start(skb, TCA_OPTIONS);
-	if (!nest)
-		goto nla_put_failure;
-
-	if (f->res.classid &&
-	    nla_put_u32(skb, TCA_FLOWER_CLASSID, f->res.classid))
-		goto nla_put_failure;
-
-	key = &f->key;
-	mask = &f->mask->key;
-
 	if (mask->indev_ifindex) {
 		struct net_device *dev;
 
@@ -1248,9 +1228,6 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, void *fh,
 			goto nla_put_failure;
 	}
 
-	if (!tc_skip_hw(f->flags))
-		fl_hw_update_stats(tp, f);
-
 	if (fl_dump_key_val(skb, key->eth.dst, TCA_FLOWER_KEY_ETH_DST,
 			    mask->eth.dst, TCA_FLOWER_KEY_ETH_DST_MASK,
 			    sizeof(key->eth.dst)) ||
@@ -1404,6 +1381,41 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, void *fh,
 	if (fl_dump_key_flags(skb, key->control.flags, mask->control.flags))
 		goto nla_put_failure;
 
+	return 0;
+
+nla_put_failure:
+	return -EMSGSIZE;
+}
+
+static int fl_dump(struct net *net, struct tcf_proto *tp, void *fh,
+		   struct sk_buff *skb, struct tcmsg *t)
+{
+	struct cls_fl_filter *f = fh;
+	struct nlattr *nest;
+	struct fl_flow_key *key, *mask;
+
+	if (!f)
+		return skb->len;
+
+	t->tcm_handle = f->handle;
+
+	nest = nla_nest_start(skb, TCA_OPTIONS);
+	if (!nest)
+		goto nla_put_failure;
+
+	if (f->res.classid &&
+	    nla_put_u32(skb, TCA_FLOWER_CLASSID, f->res.classid))
+		goto nla_put_failure;
+
+	key = &f->key;
+	mask = &f->mask->key;
+
+	if (fl_dump_key(skb, net, key, mask))
+		goto nla_put_failure;
+
+	if (!tc_skip_hw(f->flags))
+		fl_hw_update_stats(tp, f);
+
 	if (f->flags && nla_put_u32(skb, TCA_FLOWER_FLAGS, f->flags))
 		goto nla_put_failure;
 
-- 
2.14.4

^ permalink raw reply related

* [patch net-next v2 2/9] net: sched: introduce chain templates
From: Jiri Pirko @ 2018-06-26  7:59 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, jakub.kicinski, simon.horman,
	john.hurley, dsahern, mlxsw
In-Reply-To: <20180626080000.12964-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Introduce a group of new tc-rtnl commands to allow user to set per-chain
template. Templates lock down individual chains for particular
classifier type/options combinations. The classifier needs to support
templates, otherwise kernel would reply with error.

For example, to lock chain 22 to allow only filters of type
flower with destination mac address, user needs to do:
  chain 22 flower dst_mac 00:00:00:00:00:00/FF:FF:FF:FF:FF:FF

In case the chain already contains some filters it is not possible to
add or remove template. That is permitted only for empty chains.

Alongside with add/del commands, introduce also get/dump and
notifications.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/sch_generic.h      |  14 +-
 include/uapi/linux/rtnetlink.h |   7 +
 net/sched/cls_api.c            | 371 ++++++++++++++++++++++++++++++++++++++++-
 net/sched/cls_basic.c          |   2 +-
 net/sched/cls_bpf.c            |   3 +-
 net/sched/cls_cgroup.c         |   2 +-
 net/sched/cls_flow.c           |   3 +-
 net/sched/cls_flower.c         |   3 +-
 net/sched/cls_fw.c             |   3 +-
 net/sched/cls_matchall.c       |   3 +-
 net/sched/cls_route.c          |   2 +-
 net/sched/cls_rsvp.h           |   3 +-
 net/sched/cls_tcindex.c        |   2 +-
 net/sched/cls_u32.c            |   2 +-
 security/selinux/nlmsgtab.c    |   2 +-
 15 files changed, 405 insertions(+), 17 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 6488daa32f82..f2a27d41fed5 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -235,6 +235,8 @@ struct tcf_result {
 	};
 };
 
+struct tcf_chain;
+
 struct tcf_proto_ops {
 	struct list_head	head;
 	char			kind[IFNAMSIZ];
@@ -250,17 +252,25 @@ struct tcf_proto_ops {
 	int			(*change)(struct net *net, struct sk_buff *,
 					struct tcf_proto*, unsigned long,
 					u32 handle, struct nlattr **,
-					void **, bool,
+					void **, bool, void *tmplt_priv,
 					struct netlink_ext_ack *);
 	int			(*delete)(struct tcf_proto *tp, void *arg,
 					  bool *last,
 					  struct netlink_ext_ack *);
 	void			(*walk)(struct tcf_proto*, struct tcf_walker *arg);
 	void			(*bind_class)(void *, u32, unsigned long);
+	void *			(*tmplt_create)(struct net *net,
+						struct tcf_chain *chain,
+						struct nlattr **tca,
+						struct netlink_ext_ack *extack);
+	void			(*tmplt_destroy)(void *tmplt_priv);
 
 	/* rtnetlink specific */
 	int			(*dump)(struct net*, struct tcf_proto*, void *,
 					struct sk_buff *skb, struct tcmsg*);
+	int			(*tmplt_dump)(struct sk_buff *skb,
+					      struct net *net,
+					      void *tmplt_priv);
 
 	struct module		*owner;
 };
@@ -299,6 +309,8 @@ struct tcf_chain {
 	struct tcf_block *block;
 	u32 index; /* chain index */
 	unsigned int refcnt;
+	const struct tcf_proto_ops *tmplt_ops;
+	void *tmplt_priv;
 };
 
 struct tcf_block {
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 7d8502313c99..45fd8cc1fdb2 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -150,6 +150,13 @@ enum {
 	RTM_NEWCACHEREPORT = 96,
 #define RTM_NEWCACHEREPORT RTM_NEWCACHEREPORT
 
+	RTM_NEWCHAINTMPLT = 100,
+#define RTM_NEWCHAINTMPLT RTM_NEWCHAINTMPLT
+	RTM_DELCHAINTMPLT,
+#define RTM_DELCHAINTMPLT RTM_DELCHAINTMPLT
+	RTM_GETCHAINTMPLT,
+#define RTM_GETCHAINTMPLT RTM_GETCHAINTMPLT
+
 	__RTM_MAX,
 #define RTM_MAX		(((__RTM_MAX + 3) & ~3) - 1)
 };
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index db45931bbada..0c88520f80f2 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -227,7 +227,7 @@ static void tcf_chain_head_change(struct tcf_chain *chain,
 		tcf_chain_head_change_item(item, tp_head);
 }
 
-static void tcf_chain_flush(struct tcf_chain *chain)
+static void tcf_chain_flush(struct tcf_chain *chain, bool destroy_template)
 {
 	struct tcf_proto *tp = rtnl_dereference(chain->filter_chain);
 
@@ -238,6 +238,11 @@ static void tcf_chain_flush(struct tcf_chain *chain)
 		tp = rtnl_dereference(chain->filter_chain);
 		tcf_chain_put(chain);
 	}
+	if (destroy_template && chain->tmplt_ops) {
+		chain->tmplt_ops->tmplt_destroy(chain->tmplt_priv);
+		module_put(chain->tmplt_ops->owner);
+		tcf_chain_put(chain);
+	}
 }
 
 static void tcf_chain_destroy(struct tcf_chain *chain)
@@ -691,7 +696,7 @@ void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
 			tcf_chain_hold(chain);
 
 		list_for_each_entry(chain, &block->chain_list, list)
-			tcf_chain_flush(chain);
+			tcf_chain_flush(chain, true);
 	}
 
 	tcf_block_offload_unbind(block, q, ei);
@@ -1191,9 +1196,15 @@ static int tc_new_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 		goto errout;
 	}
 
+	if (chain->tmplt_ops && chain->tmplt_ops != tp->ops) {
+		NL_SET_ERR_MSG(extack, "Chain template is set to a different filter kind");
+		err = -EINVAL;
+		goto errout;
+	}
+
 	err = tp->ops->change(net, skb, tp, cl, t->tcm_handle, tca, &fh,
 			      n->nlmsg_flags & NLM_F_CREATE ? TCA_ACT_NOREPLACE : TCA_ACT_REPLACE,
-			      extack);
+			      chain->tmplt_priv, extack);
 	if (err == 0) {
 		if (tp_created)
 			tcf_chain_tp_insert(chain, &chain_info, tp);
@@ -1274,7 +1285,7 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 	if (prio == 0) {
 		tfilter_notify_chain(net, skb, block, q, parent, n,
 				     chain, RTM_DELTFILTER);
-		tcf_chain_flush(chain);
+		tcf_chain_flush(chain, false);
 		err = 0;
 		goto errout;
 	}
@@ -1570,6 +1581,354 @@ static int tc_dump_tfilter(struct sk_buff *skb, struct netlink_callback *cb)
 	return skb->len;
 }
 
+static int tc_tmplt_fill_node(struct tcf_chain *chain, struct net *net,
+			      struct sk_buff *skb, struct tcf_block *block,
+			      struct Qdisc *q, u32 parent,
+			      u32 portid, u32 seq, u16 flags, int event)
+{
+	unsigned char *b = skb_tail_pointer(skb);
+	const struct tcf_proto_ops *ops;
+	struct nlmsghdr *nlh;
+	struct tcmsg *tcm;
+	void *priv;
+
+	ops = chain->tmplt_ops;
+	priv = chain->tmplt_priv;
+
+	nlh = nlmsg_put(skb, portid, seq, event, sizeof(*tcm), flags);
+	if (!nlh)
+		goto out_nlmsg_trim;
+	tcm = nlmsg_data(nlh);
+	tcm->tcm_family = AF_UNSPEC;
+	tcm->tcm__pad1 = 0;
+	tcm->tcm__pad2 = 0;
+	tcm->tcm_handle = 0;
+	if (q) {
+		tcm->tcm_ifindex = qdisc_dev(q)->ifindex;
+		tcm->tcm_parent = parent;
+	} else {
+		tcm->tcm_ifindex = TCM_IFINDEX_MAGIC_BLOCK;
+		tcm->tcm_block_index = block->index;
+	}
+
+	if (nla_put_string(skb, TCA_KIND, ops->kind))
+		goto nla_put_failure;
+	if (nla_put_u32(skb, TCA_CHAIN, chain->index))
+		goto nla_put_failure;
+	if (ops->tmplt_dump(skb, net, priv) < 0)
+		goto nla_put_failure;
+
+	nlh->nlmsg_len = skb_tail_pointer(skb) - b;
+	return skb->len;
+
+out_nlmsg_trim:
+nla_put_failure:
+	nlmsg_trim(skb, b);
+	return -EMSGSIZE;
+}
+
+static int tc_tmplt_notify(struct tcf_chain *chain, struct net *net,
+			   struct sk_buff *oskb, struct nlmsghdr *n,
+			   struct tcf_block *block, struct Qdisc *q,
+			   u32 parent, int event, bool unicast)
+{
+	u32 portid = oskb ? NETLINK_CB(oskb).portid : 0;
+	struct sk_buff *skb;
+
+	skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL);
+	if (!skb)
+		return -ENOBUFS;
+
+	if (tc_tmplt_fill_node(chain, net, skb, block, q, parent, portid,
+			       n->nlmsg_seq, n->nlmsg_flags, event) <= 0) {
+		kfree_skb(skb);
+		return -EINVAL;
+	}
+
+	if (unicast)
+		return netlink_unicast(net->rtnl, skb, portid, MSG_DONTWAIT);
+
+	return rtnetlink_send(skb, net, portid, RTNLGRP_TC,
+			      n->nlmsg_flags & NLM_F_ECHO);
+}
+
+static int tc_ctl_tmplt_add(struct tcf_chain *chain, struct net *net,
+			    struct sk_buff *skb, struct nlmsghdr *n,
+			    struct tcf_block *block, struct Qdisc *q,
+			    u32 parent, struct nlattr **tca,
+			    struct netlink_ext_ack *extack)
+{
+	const struct tcf_proto_ops *ops;
+	void *tmplt_priv;
+
+	if (chain->tmplt_ops) {
+		NL_SET_ERR_MSG(extack, "A template is already set for the chain");
+		return -EBUSY;
+	}
+	if (chain->filter_chain) {
+		NL_SET_ERR_MSG(extack, "The chain is not empty, unable to add template");
+		return -EBUSY;
+	}
+	if (!(n->nlmsg_flags & NLM_F_CREATE)) {
+		NL_SET_ERR_MSG(extack, "Need NLM_F_CREATE to create a new chain template");
+		return -ENOENT;
+	}
+	ops = tcf_proto_lookup_ops(nla_data(tca[TCA_KIND]), extack);
+	if (IS_ERR(ops))
+		return PTR_ERR(ops);
+	if (!ops->tmplt_create || !ops->tmplt_destroy || !ops->tmplt_dump) {
+		NL_SET_ERR_MSG(extack, "Chain templates are not supported with this classifier");
+		return -EOPNOTSUPP;
+	}
+
+	tmplt_priv = ops->tmplt_create(net, chain, tca, extack);
+	if (IS_ERR(tmplt_priv)) {
+		module_put(ops->owner);
+		return PTR_ERR(tmplt_priv);
+	}
+	chain->tmplt_ops = ops;
+	chain->tmplt_priv = tmplt_priv;
+	tc_tmplt_notify(chain, net, skb, n, block, q, parent,
+			RTM_NEWCHAINTMPLT, false);
+	return 0;
+}
+
+static int tc_ctl_tmplt_del(struct tcf_chain *chain, struct net *net,
+			    struct sk_buff *skb, struct nlmsghdr *n,
+			    struct tcf_block *block, struct Qdisc *q,
+			    u32 parent, struct netlink_ext_ack *extack)
+{
+	const struct tcf_proto_ops *ops = chain->tmplt_ops;
+
+	if (!ops) {
+		NL_SET_ERR_MSG(extack, "Unable to delete template as this chain does not have template");
+		return -ENOENT;
+	}
+	if (chain->filter_chain) {
+		NL_SET_ERR_MSG(extack, "The chain is not empty, unable to delete template");
+		return -EBUSY;
+	}
+	if (!ops->tmplt_create) {
+		NL_SET_ERR_MSG(extack, "Chain templates are not supported with this classifier");
+		return -EOPNOTSUPP;
+	}
+	tc_tmplt_notify(chain, net, skb, n, block, q, parent,
+			RTM_DELCHAINTMPLT, false);
+	ops->tmplt_destroy(chain->tmplt_priv);
+	module_put(ops->owner);
+	chain->tmplt_ops = NULL;
+	chain->tmplt_priv = NULL;
+	return 0;
+}
+
+static int tc_ctl_tmplt_get(struct tcf_chain *chain, struct net *net,
+			    struct sk_buff *skb, struct nlmsghdr *n,
+			    struct tcf_block *block, struct Qdisc *q,
+			    u32 parent, struct netlink_ext_ack *extack)
+{
+	const struct tcf_proto_ops *ops = chain->tmplt_ops;
+	int err;
+
+	if (!ops) {
+		NL_SET_ERR_MSG(extack, "Unable to get template as this chain does not have template");
+		return -ENOENT;
+	}
+	err = tc_tmplt_notify(chain, net, skb, n, block, q, parent,
+			      RTM_NEWCHAINTMPLT, true);
+	if (err < 0)
+		NL_SET_ERR_MSG(extack, "Failed to send chain template notify message");
+	return err;
+}
+
+/* Add/delete/get a chain template */
+
+static int tc_ctl_tmplt(struct sk_buff *skb, struct nlmsghdr *n,
+			struct netlink_ext_ack *extack)
+{
+	struct net *net = sock_net(skb->sk);
+	struct nlattr *tca[TCA_MAX + 1];
+	struct tcmsg *t;
+	u32 parent;
+	u32 chain_index;
+	struct Qdisc *q = NULL;
+	struct tcf_chain *chain = NULL;
+	struct tcf_block *block;
+	unsigned long cl;
+	int err;
+
+	if (n->nlmsg_type != RTM_GETCHAINTMPLT &&
+	    !netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN))
+		return -EPERM;
+
+replay:
+	err = nlmsg_parse(n, sizeof(*t), tca, TCA_MAX, NULL, extack);
+	if (err < 0)
+		return err;
+
+	t = nlmsg_data(n);
+	parent = t->tcm_parent;
+	cl = 0;
+
+	/* Find filter chain. */
+
+	block = tcf_block_find(net, &q, &parent, &cl,
+			       t->tcm_ifindex, t->tcm_block_index, extack);
+	if (IS_ERR(block)) {
+		err = PTR_ERR(block);
+		goto errout;
+	}
+
+	chain_index = tca[TCA_CHAIN] ? nla_get_u32(tca[TCA_CHAIN]) : 0;
+	if (chain_index > TC_ACT_EXT_VAL_MASK) {
+		NL_SET_ERR_MSG(extack, "Specified chain index exceeds upper limit");
+		err = -EINVAL;
+		goto errout;
+	}
+	chain = tcf_chain_get(block, chain_index,
+			      n->nlmsg_type == RTM_NEWCHAINTMPLT);
+	if (!chain) {
+		NL_SET_ERR_MSG(extack, "Cannot find specified filter chain");
+		err = n->nlmsg_type == RTM_NEWTFILTER ? -ENOMEM : -EINVAL;
+		goto errout;
+	}
+
+	switch (n->nlmsg_type) {
+	case RTM_NEWCHAINTMPLT:
+		err = tc_ctl_tmplt_add(chain, net, skb, n, block,
+				       q, parent, tca, extack);
+		/* In case the chain template was successfully added,
+		 * take a reference to the chain. This ensures that
+		 * an empty chain with template does not disappear
+		 * at the end of this function.
+		 */
+		if (!err)
+			tcf_chain_hold(chain);
+		break;
+	case RTM_DELCHAINTMPLT:
+		err = tc_ctl_tmplt_del(chain, net, skb, n, block,
+				       q, parent, extack);
+		/* In case the chain template was successfully deleted,
+		 * put a reference to the chain previously taken
+		 * during template addition.
+		 */
+		if (!err)
+			tcf_chain_put(chain);
+		break;
+	case RTM_GETCHAINTMPLT:
+		err = tc_ctl_tmplt_get(chain, net, skb, n, block,
+				       q, parent, extack);
+		break;
+	default:
+		err = -EOPNOTSUPP;
+		NL_SET_ERR_MSG(extack, "Unsupported message type");
+		break;
+	}
+errout:
+	if (chain)
+		tcf_chain_put(chain);
+	if (err == -EAGAIN)
+		/* Replay the request. */
+		goto replay;
+	return err;
+}
+
+/* called with RTNL */
+static int tc_dump_tmplt(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	struct net *net = sock_net(skb->sk);
+	struct nlattr *tca[TCA_MAX + 1];
+	struct Qdisc *q = NULL;
+	struct tcf_block *block;
+	struct tcf_chain *chain;
+	struct tcmsg *tcm = nlmsg_data(cb->nlh);
+	long index_start;
+	long index;
+	u32 parent;
+	int err;
+
+	if (nlmsg_len(cb->nlh) < sizeof(*tcm))
+		return skb->len;
+
+	err = nlmsg_parse(cb->nlh, sizeof(*tcm), tca, TCA_MAX, NULL, NULL);
+	if (err)
+		return err;
+
+	if (tcm->tcm_ifindex == TCM_IFINDEX_MAGIC_BLOCK) {
+		block = tcf_block_lookup(net, tcm->tcm_block_index);
+		if (!block)
+			goto out;
+		/* If we work with block index, q is NULL and parent value
+		 * will never be used in the following code. The check
+		 * in tcf_fill_node prevents it. However, compiler does not
+		 * see that far, so set parent to zero to silence the warning
+		 * about parent being uninitialized.
+		 */
+		parent = 0;
+	} else {
+		const struct Qdisc_class_ops *cops;
+		struct net_device *dev;
+		unsigned long cl = 0;
+
+		dev = __dev_get_by_index(net, tcm->tcm_ifindex);
+		if (!dev)
+			return skb->len;
+
+		parent = tcm->tcm_parent;
+		if (!parent) {
+			q = dev->qdisc;
+			parent = q->handle;
+		} else {
+			q = qdisc_lookup(dev, TC_H_MAJ(tcm->tcm_parent));
+		}
+		if (!q)
+			goto out;
+		cops = q->ops->cl_ops;
+		if (!cops)
+			goto out;
+		if (!cops->tcf_block)
+			goto out;
+		if (TC_H_MIN(tcm->tcm_parent)) {
+			cl = cops->find(q, tcm->tcm_parent);
+			if (cl == 0)
+				goto out;
+		}
+		block = cops->tcf_block(q, cl, NULL);
+		if (!block)
+			goto out;
+		if (tcf_block_shared(block))
+			q = NULL;
+	}
+
+	index_start = cb->args[0];
+	index = 0;
+
+	list_for_each_entry(chain, &block->chain_list, list) {
+		if ((tca[TCA_CHAIN] &&
+		     nla_get_u32(tca[TCA_CHAIN]) != chain->index) ||
+		    !chain->tmplt_ops)
+			continue;
+		if (index < index_start) {
+			index++;
+			continue;
+		}
+		err = tc_tmplt_fill_node(chain, net, skb, block, q, parent,
+					 NETLINK_CB(cb->skb).portid,
+					 cb->nlh->nlmsg_seq, NLM_F_MULTI,
+					 RTM_NEWCHAINTMPLT);
+		if (err <= 0)
+			break;
+		index++;
+	}
+
+	cb->args[0] = index;
+
+out:
+	/* If we did no progress, the error (EMSGSIZE) is real */
+	if (skb->len == 0 && err)
+		return err;
+	return skb->len;
+}
+
 void tcf_exts_destroy(struct tcf_exts *exts)
 {
 #ifdef CONFIG_NET_CLS_ACT
@@ -1795,6 +2154,10 @@ static int __init tc_filter_init(void)
 	rtnl_register(PF_UNSPEC, RTM_DELTFILTER, tc_del_tfilter, NULL, 0);
 	rtnl_register(PF_UNSPEC, RTM_GETTFILTER, tc_get_tfilter,
 		      tc_dump_tfilter, 0);
+	rtnl_register(PF_UNSPEC, RTM_NEWCHAINTMPLT, tc_ctl_tmplt, NULL, 0);
+	rtnl_register(PF_UNSPEC, RTM_DELCHAINTMPLT, tc_ctl_tmplt, NULL, 0);
+	rtnl_register(PF_UNSPEC, RTM_GETCHAINTMPLT, tc_ctl_tmplt,
+		      tc_dump_tmplt, 0);
 
 	return 0;
 
diff --git a/net/sched/cls_basic.c b/net/sched/cls_basic.c
index 95367f37098d..a690acac7e6e 100644
--- a/net/sched/cls_basic.c
+++ b/net/sched/cls_basic.c
@@ -168,7 +168,7 @@ static int basic_set_parms(struct net *net, struct tcf_proto *tp,
 static int basic_change(struct net *net, struct sk_buff *in_skb,
 			struct tcf_proto *tp, unsigned long base, u32 handle,
 			struct nlattr **tca, void **arg, bool ovr,
-			struct netlink_ext_ack *extack)
+			void *tmplt_priv, struct netlink_ext_ack *extack)
 {
 	int err;
 	struct basic_head *head = rtnl_dereference(tp->root);
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 1aa7f6511065..363c43dfb894 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -455,7 +455,8 @@ static int cls_bpf_set_parms(struct net *net, struct tcf_proto *tp,
 static int cls_bpf_change(struct net *net, struct sk_buff *in_skb,
 			  struct tcf_proto *tp, unsigned long base,
 			  u32 handle, struct nlattr **tca,
-			  void **arg, bool ovr, struct netlink_ext_ack *extack)
+			  void **arg, bool ovr, void *tmplt_priv,
+			  struct netlink_ext_ack *extack)
 {
 	struct cls_bpf_head *head = rtnl_dereference(tp->root);
 	struct cls_bpf_prog *oldprog = *arg;
diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
index 3bc01bdde165..ca5d0315432c 100644
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -78,7 +78,7 @@ static void cls_cgroup_destroy_work(struct work_struct *work)
 static int cls_cgroup_change(struct net *net, struct sk_buff *in_skb,
 			     struct tcf_proto *tp, unsigned long base,
 			     u32 handle, struct nlattr **tca,
-			     void **arg, bool ovr,
+			     void **arg, bool ovr, void *tmplt_priv,
 			     struct netlink_ext_ack *extack)
 {
 	struct nlattr *tb[TCA_CGROUP_MAX + 1];
diff --git a/net/sched/cls_flow.c b/net/sched/cls_flow.c
index 2bb043cd436b..2a21e26fcee0 100644
--- a/net/sched/cls_flow.c
+++ b/net/sched/cls_flow.c
@@ -391,7 +391,8 @@ static void flow_destroy_filter_work(struct work_struct *work)
 static int flow_change(struct net *net, struct sk_buff *in_skb,
 		       struct tcf_proto *tp, unsigned long base,
 		       u32 handle, struct nlattr **tca,
-		       void **arg, bool ovr, struct netlink_ext_ack *extack)
+		       void **arg, bool ovr, void *tmplt_priv,
+		       struct netlink_ext_ack *extack)
 {
 	struct flow_head *head = rtnl_dereference(tp->root);
 	struct flow_filter *fold, *fnew;
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 9e8b26a80fb3..09d6c6e67f9d 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -927,7 +927,8 @@ static int fl_set_parms(struct net *net, struct tcf_proto *tp,
 static int fl_change(struct net *net, struct sk_buff *in_skb,
 		     struct tcf_proto *tp, unsigned long base,
 		     u32 handle, struct nlattr **tca,
-		     void **arg, bool ovr, struct netlink_ext_ack *extack)
+		     void **arg, bool ovr, void *tmplt_priv,
+		     struct netlink_ext_ack *extack)
 {
 	struct cls_fl_head *head = rtnl_dereference(tp->root);
 	struct cls_fl_filter *fold = *arg;
diff --git a/net/sched/cls_fw.c b/net/sched/cls_fw.c
index 29eeeaf3ea44..a1d40d48aa24 100644
--- a/net/sched/cls_fw.c
+++ b/net/sched/cls_fw.c
@@ -250,7 +250,8 @@ static int fw_set_parms(struct net *net, struct tcf_proto *tp,
 static int fw_change(struct net *net, struct sk_buff *in_skb,
 		     struct tcf_proto *tp, unsigned long base,
 		     u32 handle, struct nlattr **tca, void **arg,
-		     bool ovr, struct netlink_ext_ack *extack)
+		     bool ovr, void *tmplt_priv,
+		     struct netlink_ext_ack *extack)
 {
 	struct fw_head *head = rtnl_dereference(tp->root);
 	struct fw_filter *f = *arg;
diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
index 47b207ef7762..481e77cbf501 100644
--- a/net/sched/cls_matchall.c
+++ b/net/sched/cls_matchall.c
@@ -152,7 +152,8 @@ static int mall_set_parms(struct net *net, struct tcf_proto *tp,
 static int mall_change(struct net *net, struct sk_buff *in_skb,
 		       struct tcf_proto *tp, unsigned long base,
 		       u32 handle, struct nlattr **tca,
-		       void **arg, bool ovr, struct netlink_ext_ack *extack)
+		       void **arg, bool ovr, void *tmplt_priv,
+		       struct netlink_ext_ack *extack)
 {
 	struct cls_mall_head *head = rtnl_dereference(tp->root);
 	struct nlattr *tb[TCA_MATCHALL_MAX + 1];
diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c
index 0404aa5fa7cb..321eb746fe01 100644
--- a/net/sched/cls_route.c
+++ b/net/sched/cls_route.c
@@ -468,7 +468,7 @@ static int route4_set_parms(struct net *net, struct tcf_proto *tp,
 static int route4_change(struct net *net, struct sk_buff *in_skb,
 			 struct tcf_proto *tp, unsigned long base, u32 handle,
 			 struct nlattr **tca, void **arg, bool ovr,
-			 struct netlink_ext_ack *extack)
+			 void *tmplt_priv, struct netlink_ext_ack *extack)
 {
 	struct route4_head *head = rtnl_dereference(tp->root);
 	struct route4_filter __rcu **fp;
diff --git a/net/sched/cls_rsvp.h b/net/sched/cls_rsvp.h
index e9ccf7daea7d..371618720ef2 100644
--- a/net/sched/cls_rsvp.h
+++ b/net/sched/cls_rsvp.h
@@ -477,7 +477,8 @@ static int rsvp_change(struct net *net, struct sk_buff *in_skb,
 		       struct tcf_proto *tp, unsigned long base,
 		       u32 handle,
 		       struct nlattr **tca,
-		       void **arg, bool ovr, struct netlink_ext_ack *extack)
+		       void **arg, bool ovr, void *tmplt_priv,
+		       struct netlink_ext_ack *extack)
 {
 	struct rsvp_head *data = rtnl_dereference(tp->root);
 	struct rsvp_filter *f, *nfp;
diff --git a/net/sched/cls_tcindex.c b/net/sched/cls_tcindex.c
index 32f4bbd82f35..d9fb5d56c60d 100644
--- a/net/sched/cls_tcindex.c
+++ b/net/sched/cls_tcindex.c
@@ -500,7 +500,7 @@ tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base,
 static int
 tcindex_change(struct net *net, struct sk_buff *in_skb,
 	       struct tcf_proto *tp, unsigned long base, u32 handle,
-	       struct nlattr **tca, void **arg, bool ovr,
+	       struct nlattr **tca, void **arg, bool ovr, void *tmplt_priv,
 	       struct netlink_ext_ack *extack)
 {
 	struct nlattr *opt = tca[TCA_OPTIONS];
diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index fb861f90fde6..b500ce62ef3c 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -903,7 +903,7 @@ static struct tc_u_knode *u32_init_knode(struct tcf_proto *tp,
 static int u32_change(struct net *net, struct sk_buff *in_skb,
 		      struct tcf_proto *tp, unsigned long base, u32 handle,
 		      struct nlattr **tca, void **arg, bool ovr,
-		      struct netlink_ext_ack *extack)
+		      void *tmplt_priv, struct netlink_ext_ack *extack)
 {
 	struct tc_u_common *tp_c = tp->data;
 	struct tc_u_hnode *ht;
diff --git a/security/selinux/nlmsgtab.c b/security/selinux/nlmsgtab.c
index 7b7433a1a34c..825777efc83e 100644
--- a/security/selinux/nlmsgtab.c
+++ b/security/selinux/nlmsgtab.c
@@ -159,7 +159,7 @@ int selinux_nlmsg_lookup(u16 sclass, u16 nlmsg_type, u32 *perm)
 	switch (sclass) {
 	case SECCLASS_NETLINK_ROUTE_SOCKET:
 		/* RTM_MAX always point to RTM_SETxxxx, ie RTM_NEWxxx + 3 */
-		BUILD_BUG_ON(RTM_MAX != (RTM_NEWCACHEREPORT + 3));
+		BUILD_BUG_ON(RTM_MAX != (RTM_NEWCHAINTMPLT + 3));
 		err = nlmsg_perm(nlmsg_type, perm, nlmsg_route_perms,
 				 sizeof(nlmsg_route_perms));
 		break;
-- 
2.14.4

^ permalink raw reply related

* [patch net-next v2 1/9] net: sched: push ops lookup bits into tcf_proto_lookup_ops()
From: Jiri Pirko @ 2018-06-26  7:59 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, jakub.kicinski, simon.horman,
	john.hurley, dsahern, mlxsw
In-Reply-To: <20180626080000.12964-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Push all bits that take care of ops lookup, including module loading
outside tcf_proto_create() function, into tcf_proto_lookup_ops()

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/sched/cls_api.c | 53 +++++++++++++++++++++++++++++++----------------------
 1 file changed, 31 insertions(+), 22 deletions(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index cdc3c87c53e6..db45931bbada 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -39,7 +39,7 @@ static DEFINE_RWLOCK(cls_mod_lock);
 
 /* Find classifier type by string name */
 
-static const struct tcf_proto_ops *tcf_proto_lookup_ops(const char *kind)
+static const struct tcf_proto_ops *__tcf_proto_lookup_ops(const char *kind)
 {
 	const struct tcf_proto_ops *t, *res = NULL;
 
@@ -57,6 +57,33 @@ static const struct tcf_proto_ops *tcf_proto_lookup_ops(const char *kind)
 	return res;
 }
 
+static const struct tcf_proto_ops *
+tcf_proto_lookup_ops(const char *kind, struct netlink_ext_ack *extack)
+{
+	const struct tcf_proto_ops *ops;
+
+	ops = __tcf_proto_lookup_ops(kind);
+	if (ops)
+		return ops;
+#ifdef CONFIG_MODULES
+	rtnl_unlock();
+	request_module("cls_%s", kind);
+	rtnl_lock();
+	ops = __tcf_proto_lookup_ops(kind);
+	/* We dropped the RTNL semaphore in order to perform
+	 * the module load. So, even if we succeeded in loading
+	 * the module we have to replay the request. We indicate
+	 * this using -EAGAIN.
+	 */
+	if (ops) {
+		module_put(ops->owner);
+		return ERR_PTR(-EAGAIN);
+	}
+#endif
+	NL_SET_ERR_MSG(extack, "TC classifier not found");
+	return ERR_PTR(-ENOENT);
+}
+
 /* Register(unregister) new classifier type */
 
 int register_tcf_proto_ops(struct tcf_proto_ops *ops)
@@ -133,27 +160,9 @@ static struct tcf_proto *tcf_proto_create(const char *kind, u32 protocol,
 	if (!tp)
 		return ERR_PTR(-ENOBUFS);
 
-	err = -ENOENT;
-	tp->ops = tcf_proto_lookup_ops(kind);
-	if (!tp->ops) {
-#ifdef CONFIG_MODULES
-		rtnl_unlock();
-		request_module("cls_%s", kind);
-		rtnl_lock();
-		tp->ops = tcf_proto_lookup_ops(kind);
-		/* We dropped the RTNL semaphore in order to perform
-		 * the module load. So, even if we succeeded in loading
-		 * the module we have to replay the request. We indicate
-		 * this using -EAGAIN.
-		 */
-		if (tp->ops) {
-			module_put(tp->ops->owner);
-			err = -EAGAIN;
-		} else {
-			NL_SET_ERR_MSG(extack, "TC classifier not found");
-			err = -ENOENT;
-		}
-#endif
+	tp->ops = tcf_proto_lookup_ops(kind, extack);
+	if (IS_ERR(tp->ops)) {
+		err = PTR_ERR(tp->ops);
 		goto errout;
 	}
 	tp->classify = tp->ops->classify;
-- 
2.14.4

^ permalink raw reply related

* [patch net-next v2 0/9] net: sched: introduce chain templates support with offloading to mlxsw
From: Jiri Pirko @ 2018-06-26  7:59 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, jakub.kicinski, simon.horman,
	john.hurley, dsahern, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

For the TC clsact offload these days, some of HW drivers need
to hold a magic ball. The reason is, with the first inserted rule inside
HW they need to guess what fields will be used for the matching. If
later on this guess proves to be wrong and user adds a filter with a
different field to match, there's a problem. Mlxsw resolves it now with
couple of patterns. Those try to cover as many match fields as possible.
This aproach is far from optimal, both performance-wise and scale-wise.
Also, there is a combination of filters that in certain order won't
succeed.

Most of the time, when user inserts filters in chain, he knows right away
how the filters are going to look like - what type and option will they
have. For example, he knows that he will only insert filters of type
flower matching destination IP address. He can specify a template that
would cover all the filters in the chain.

This patchset is providing the possibility to user to provide such
template  to kernel and propagate it all the way down to device
drivers.

See the examples below.

Create dummy device with clsact first:
# ip link add type dummy
# tc qdisc add dev dummy0 clsact

There is no template assigned by default:
# tc filter template show dev dummy0 ingress

Add a template of type flower allowing to insert rules matching on last
2 bytes of destination mac address:
# tc filter template add dev dummy0 ingress proto ip flower dst_mac 00:00:00:00:00:00/00:00:00:00:FF:FF

The template is now showed in the list:
# tc filter template show dev dummy0 ingress
filter flower chain 0
  dst_mac 00:00:00:00:00:00/00:00:00:00:ff:ff
  eth_type ipv4

Add another template, this time for chain number 22:
# tc filter template add dev dummy0 ingress proto ip chain 22 flower dst_ip 0.0.0.0/16
# tc filter template show dev dummy0 ingress
filter flower chain 0
  dst_mac 00:00:00:00:00:00/00:00:00:00:ff:ff
  eth_type ipv4
filter flower chain 22
  eth_type ipv4
  dst_ip 0.0.0.0/16

Add a filter that fits the template:
# tc filter add dev dummy0 ingress proto ip flower dst_mac aa:bb:cc:dd:ee:ff/00:00:00:00:00:0F action drop

Addition of filters that does not fit the template would fail:
# tc filter add dev dummy0 ingress proto ip flower dst_mac aa:11:22:33:44:55/00:00:00:FF:00:00 action drop
Error: Mask does not fit the template.
We have an error talking to the kernel, -1
# tc filter add dev dummy0 ingress proto ip flower dst_ip 10.0.0.1 action drop
Error: Mask does not fit the template.
We have an error talking to the kernel, -1

Additions of filters to chain 22:
# tc filter add dev dummy0 ingress proto ip chain 22 flower dst_ip 10.0.0.1/8 action drop
# tc filter add dev dummy0 ingress proto ip chain 22 flower dst_ip 10.0.0.1 action drop
Error: Mask does not fit the template.
We have an error talking to the kernel, -1
# tc filter add dev dummy0 ingress proto ip chain 22 flower dst_ip 10.0.0.1/24 action drop
Error: Mask does not fit the template.
We have an error talking to the kernel, -1

Removal of a template from non-empty chain would fail:
# tc filter template del dev dummy0 ingress
Error: The chain is not empty, unable to delete template.
We have an error talking to the kernel, -1

Once the chain is flushed, the template could be removed:
# tc filter del dev dummy0 ingress
# tc filter template del dev dummy0 ingress

---
v1->v2:
-patch 6:
  - remove leftover extack arg in fl_hw_create_tmplt()

Jiri Pirko (9):
  net: sched: push ops lookup bits into tcf_proto_lookup_ops()
  net: sched: introduce chain templates
  net: sched: cls_flower: move key/mask dumping into a separate function
  net: sched: cls_flower: change fl_init_dissector to accept mask and
    dissector
  net: sched: cls_flower: implement chain templates
  net: sched: cls_flower: propagate chain teplate creation and
    destruction to drivers
  mlxsw: spectrum: Implement chain template hinting
  selftests: forwarding: move shblock tc support check to a separate
    helper
  selftests: forwarding: add tests for TC chain templates

 drivers/net/ethernet/mellanox/mlxsw/spectrum.c     |   5 +
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  12 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c |  12 +-
 .../ethernet/mellanox/mlxsw/spectrum_acl_tcam.c    |  25 +-
 .../net/ethernet/mellanox/mlxsw/spectrum_flower.c  |  44 ++-
 include/net/pkt_cls.h                              |   2 +
 include/net/sch_generic.h                          |  14 +-
 include/uapi/linux/rtnetlink.h                     |   7 +
 net/sched/cls_api.c                                | 424 +++++++++++++++++++--
 net/sched/cls_basic.c                              |   2 +-
 net/sched/cls_bpf.c                                |   3 +-
 net/sched/cls_cgroup.c                             |   2 +-
 net/sched/cls_flow.c                               |   3 +-
 net/sched/cls_flower.c                             | 250 +++++++++---
 net/sched/cls_fw.c                                 |   3 +-
 net/sched/cls_matchall.c                           |   3 +-
 net/sched/cls_route.c                              |   2 +-
 net/sched/cls_rsvp.h                               |   3 +-
 net/sched/cls_tcindex.c                            |   2 +-
 net/sched/cls_u32.c                                |   2 +-
 security/selinux/nlmsgtab.c                        |   2 +-
 tools/testing/selftests/net/forwarding/lib.sh      |  12 +
 .../selftests/net/forwarding/tc_chaintemplates.sh  | 160 ++++++++
 .../selftests/net/forwarding/tc_shblocks.sh        |   2 +
 24 files changed, 900 insertions(+), 96 deletions(-)
 create mode 100755 tools/testing/selftests/net/forwarding/tc_chaintemplates.sh

-- 
2.14.4

^ permalink raw reply

* Re: [PATCH 00/14] ARM: davinci: step towards removing at24_platform_data
From: Bartosz Golaszewski @ 2018-06-26  7:44 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Sekhar Nori, Kevin Hilman, Russell King, Grygorii Strashko,
	David S . Miller, Srinivas Kandagatla, Lukas Wunner, Rob Herring,
	Florian Fainelli, Dan Carpenter, Ivan Khoronzhuk, David Lechner,
	Greg Kroah-Hartman, Linux ARM, Linux Kernel Mailing List,
	linux-omap, netdev, Bartosz Golaszewski
In-Reply-To: <20180625180237.GC17417@lunn.ch>

2018-06-25 20:02 GMT+02:00 Andrew Lunn <andrew@lunn.ch>:
>> With my patch 1/14 you'll get -EPROBE_DEFER from nvmem_cell_get() if
>> the nvmem provider is not yet registered. Will that help in your case?
>
> I don't think so. My driver instantiates the AT24 device. So if i get
> -EPROBE_DEFER, i need to cleanup the probe, and return -EPROBDE_DEFER
> to the code. Which means i need to remove the AT24 device...
>
>        Andrew

Are you sure this is the correct approach? I understand that you want
to use something like board files for your machine? Wouldn't it be
better to register a platform device for at24, register a cell lookup
and then depend on that cell from your driver?

Would you mind describing your platform in detail a bit more?

Best regards,
Bartosz

^ permalink raw reply

* Re: [PATCH 2/2] sh_eth: remove sh_eth_cpu_data::rpadir_value
From: Geert Uytterhoeven @ 2018-06-26  7:25 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: netdev, David S. Miller, Linux-Renesas
In-Reply-To: <d2b67329-7bed-80e2-76e8-d0514f8545ae@cogentembedded.com>

On Mon, Jun 25, 2018 at 10:38 PM Sergei Shtylyov
<sergei.shtylyov@cogentembedded.com> wrote:
> If RPADIR exists, the value written to it is always the same for all SoCs
> (and derived from NET_IP_ALIGN), so there has not  been any need to store
> it in the *struct* sh_eth_cpu_data...
>
> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH 1/2] sh_eth: fix *enum* RPADIR_BIT
From: Geert Uytterhoeven @ 2018-06-26  7:25 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: netdev, David S. Miller, Linux-Renesas
In-Reply-To: <8c72d27f-8b1a-23cf-3f41-781944cd1388@cogentembedded.com>

Hi Sergei,

On Mon, Jun 25, 2018 at 10:37 PM Sergei Shtylyov
<sergei.shtylyov@cogentembedded.com> wrote:
> The *enum*  RPADIR_BIT  was declared in the commit 86a74ff21a7a ("net:
> sh_eth: add support for Renesas SuperH Ethernet") adding SH771x support,
> however the SH771x manual doesn't have the RPADIR register described and,
> moreover, tells why the padding insertion must not be used. The newer SoC
> manuals do have RPADIR documented, though with somewhat different layout --
> update the *enum* according to these manuals...
>
> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

Thanks for your patch!

Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>

> --- net-next.orig/drivers/net/ethernet/renesas/sh_eth.h
> +++ net-next/drivers/net/ethernet/renesas/sh_eth.h
> @@ -403,8 +403,7 @@ enum DESC_I_BIT {
>
>  /* RPADIR */
>  enum RPADIR_BIT {
> -       RPADIR_PADS1 = 0x20000, RPADIR_PADS0 = 0x10000,
> -       RPADIR_PADR = 0x0003f,
> +       RPADIR_PADS = 0x1f0000, RPADIR_PADR = 0xffff,

Perhaps add some comments?

        RPADIR_PADS = 0x1f0000; /* Padding Size (insert N bytes of padding) */
        RPADIR_PADR = 0xffff;   /* Padding Slot (insert padding at byte N) */
>  };

Note that none of the RPADIR enums are actually used.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH v2 net] nfp: cast sizeof() to int when comparing with error code
From: Julia Lawall @ 2018-06-26  7:21 UTC (permalink / raw)
  To: Joe Perches
  Cc: Chengguang Xu, jakub.kicinski, davem, LKML, cocci, oss-drivers,
	netdev, Dmitry Torokhov, linux-input, linux-s390
In-Reply-To: <ff0b2834b771a467246b9f7c1b4600c38f5bb463.camel@perches.com>



On Mon, 25 Jun 2018, Joe Perches wrote:

> On Tue, 2018-06-26 at 09:16 +0800, Chengguang Xu wrote:
> > sizeof() will return unsigned value so in the error check
> > negative error code will be always larger than sizeof().
>
> This looks like a general class of error in the kernel
> where a signed result that could be returning a -errno
> is tested against < or <= sizeof()
>
> A couple examples:
>
> drivers/input/mouse/elan_i2c_smbus.c:
>
> 		len = i2c_smbus_read_block_data(client,
> 						ETP_SMBUS_IAP_PASSWORD_READ,
> 						val);
> 		if (len < sizeof(u16)) {
>
> i2c_smbus_read_block_data can return a negative errno
>
>
> net/smc/smc_clc.c:
>
> 	len = kernel_sendmsg(smc->clcsock, &msg, &vec, 1,
> 			     sizeof(struct smc_clc_msg_decline));
> 	if (len < sizeof(struct smc_clc_msg_decline))
>
> where kernel_sendmsg can return a negative errno
>
> There are probably others, I didn't look hard.
>
> Perhaps a cocci script to find these could be generated?

Currently there is a rule for comparison of unsigneds to 0.  It would be
reasonable to extend it for sizes.  I will see what it gives.

julia

^ permalink raw reply

* Re: [patch net-next 0/9] net: sched: introduce chain templates support with offloading to mlxsw
From: Jiri Pirko @ 2018-06-26  7:12 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Linux Netdev List, David Miller, Jamal Hadi Salim, Cong Wang,
	Simon Horman, John Hurley, David Ahern, mlxsw
In-Reply-To: <CAJpBn1xigrmrHdfP4itPCPSnfkJVi3H3513K5KO=ZJEAnRUgNQ@mail.gmail.com>

Tue, Jun 26, 2018 at 09:00:45AM CEST, jakub.kicinski@netronome.com wrote:
>On Mon, Jun 25, 2018 at 11:43 PM, Jiri Pirko <jiri@resnulli.us> wrote:
>> Tue, Jun 26, 2018 at 06:58:50AM CEST, jakub.kicinski@netronome.com wrote:
>>>On Mon, 25 Jun 2018 23:01:39 +0200, Jiri Pirko wrote:
>>>> From: Jiri Pirko <jiri@mellanox.com>
>>>>
>>>> For the TC clsact offload these days, some of HW drivers need
>>>> to hold a magic ball. The reason is, with the first inserted rule inside
>>>> HW they need to guess what fields will be used for the matching. If
>>>> later on this guess proves to be wrong and user adds a filter with a
>>>> different field to match, there's a problem. Mlxsw resolves it now with
>>>> couple of patterns. Those try to cover as many match fields as possible.
>>>> This aproach is far from optimal, both performance-wise and scale-wise.
>>>> Also, there is a combination of filters that in certain order won't
>>>> succeed.
>>>>
>>>> Most of the time, when user inserts filters in chain, he knows right away
>>>> how the filters are going to look like - what type and option will they
>>>> have. For example, he knows that he will only insert filters of type
>>>> flower matching destination IP address. He can specify a template that
>>>> would cover all the filters in the chain.
>>>
>>>Perhaps it's lack of sleep, but this paragraph threw me a little off
>>>the track.  IIUC the goal of this set is to provide a way to inform the
>>>HW about expected matches before any rule is programmed into the HW.
>>>Not before any rule is added to a particular chain.  One can just use
>>>the first rule in the chain to make a guess about the chain, but thanks
>>>to this set user can configure *all* chains before any rules are added.
>>
>> The template is per-chain. User can use template for chain x and
>> not-use it for chain y. Up to him.
>
>Makes sense.
>
>I can't help but wonder if it'd be better to associate the
>constraints/rules with chains instead of creating a new "template"
>object.  It seems more natural to create a chain with specific
>constraints in place than add and delete template of which there can
>be at most one to a chain...  Perhaps that's more about the user space
>tc command line.  Anyway, not a strong objection, just a thought.

Hmm. I don't think it is good idea. User should see the template in a
"show" command per chain. We would have to have 2 show commands, one to
list the template objects and one to list templates per chains. It makes
things more complicated for no good reason. I think that this simple
chain-lock is easier and serves the purpose.

>
>>>And that's needed because once any rule is added the tcam config can no
>>>longer be easily modified?
>>
>> Yes.

^ permalink raw reply

* Re: [PATCH bpf-next 1/7] nfp: bpf: allow source ptr type be map ptr in memcpy optimization
From: Jakub Kicinski @ 2018-06-26  7:08 UTC (permalink / raw)
  To: Song Liu
  Cc: Alexei Starovoitov, Daniel Borkmann, oss-drivers, Networking,
	Jiong Wang
In-Reply-To: <CAPhsuW4yqanLp6WEYihwMwKny8VxZQTa3G-+nOLu_e4peokNGQ@mail.gmail.com>

On Mon, Jun 25, 2018 at 10:50 PM, Song Liu <liu.song.a23@gmail.com> wrote:
> On Sun, Jun 24, 2018 at 8:54 PM, Jakub Kicinski
> <jakub.kicinski@netronome.com> wrote:
>> From: Jiong Wang <jiong.wang@netronome.com>
>>
>> Map read has been supported on NFP, this patch enables optimization for
>> memcpy from map to packet.
>>
>> This patch also fixed one latent bug which will cause copying from
>> unexpected address once memcpy for map pointer enabled.
>>
>> Reported-by: Mary Pham <mary.pham@netronome.com>
>> Reported-by: David Beckett <david.beckett@netronome.com>
>> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
>> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
>> ---
>>  drivers/net/ethernet/netronome/nfp/bpf/jit.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
>> index 8a92088df0d7..33111739b210 100644
>> --- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
>> +++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
>> @@ -670,7 +670,7 @@ static int nfp_cpp_memcpy(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
>>         xfer_num = round_up(len, 4) / 4;
>>
>>         if (src_40bit_addr)
>> -               addr40_offset(nfp_prog, meta->insn.src_reg, off, &src_base,
>> +               addr40_offset(nfp_prog, meta->insn.src_reg * 2, off, &src_base,
>>                               &off);
>
> Did this break other cases before this patch?
>
> I am sorry if this is a dumb question. I don't think I fully
> understand addr40_offset().

Only map memory uses 40 bit addressing right now, so the if was pretty
much dead code before the patch.

The memcpy optimization was left out of the initial map support due to
insufficient test coverage, I should have probably left more of the 40
bit addressing code out back then.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox