Netdev List
* [bpf-next V2 PATCH 0/4] xdp: introduce bulking for ndo_xdp_xmit API
From: Jesper Dangaard Brouer @ 2018-05-11 18:11 UTC (permalink / raw)
  To: netdev, Daniel Borkmann, Alexei Starovoitov,
	Jesper Dangaard Brouer
  Cc: Christoph Hellwig, Björn Töpel, Magnus Karlsson

This patchset changes the ndo_xdp_xmit API to take a bulk of xdp frames.

When the kernel is compiled with CONFIG_RETPOLINE, every indirect
function pointer (branch) call hurts performance. For XDP this has a
huge negative performance impact.

This patchset reduces the needed (indirect) calls to ndo_xdp_xmit, but
also prepares for further optimizations.  The DMA API's use of indirect
function pointer calls is the primary source of the regression.  It is
left for a followup patchset to use bulking calls towards the DMA API
(via the scatter-gather calls).

The other advantage of this API change is that drivers can more easily
amortize the cost of any sync/locking scheme over the bulk of
packets.  The assumption of the current API is that the driver
implementing the NDO will also allocate a dedicated XDP TX queue for
every CPU in the system, which is not always possible or practical to
configure. E.g. ixgbe cannot load an XDP program on a machine with
more than 96 CPUs, due to limited hardware TX queues.  E.g. virtio_net
is hard to configure as it requires manually increasing the
queues. E.g. the tun driver chooses to use a per XDP frame producer
lock, picking the queue as smp_processor_id modulo the available queues.
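
As a rough user-space illustration of why bulking helps (this is not the
code in the patchset), the sketch below queues frames and hands them to a
transmit hook with one indirect call per bulk. The names bulk_queue,
bq_enqueue, bq_flush, xmit_all and calls_for, and the bulk size of 16, are
assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define BULK_SIZE 16	/* assumed bulk size, chosen for the example */

struct frame {
	int id;
};

/* Hypothetical stand-in for a bulk-capable transmit hook. */
typedef int (*xmit_bulk_fn)(struct frame **frames, int n);

struct bulk_queue {
	struct frame *q[BULK_SIZE];
	int count;
	xmit_bulk_fn xmit;
	unsigned long calls;	/* indirect calls made so far */
};

/* Hand all queued frames to the hook with a single indirect call. */
static int bq_flush(struct bulk_queue *bq)
{
	int sent = 0;

	if (bq->count) {
		sent = bq->xmit(bq->q, bq->count);
		bq->calls++;
		bq->count = 0;
	}
	return sent;
}

/* Queue one frame, flushing first if the bulk is already full. */
static void bq_enqueue(struct bulk_queue *bq, struct frame *f)
{
	if (bq->count == BULK_SIZE)
		bq_flush(bq);
	assert(bq->count < BULK_SIZE);
	bq->q[bq->count++] = f;
}

/* Toy hook: pretends every frame was sent. */
static int xmit_all(struct frame **frames, int n)
{
	(void)frames;
	return n;
}

/* Number of indirect calls needed to send nframes frames. */
static unsigned long calls_for(int nframes)
{
	static struct frame f;
	struct bulk_queue bq = { .xmit = xmit_all };
	int i;

	for (i = 0; i < nframes; i++)
		bq_enqueue(&bq, &f);
	bq_flush(&bq);	/* final flush, as at the end of a poll cycle */
	return bq.calls;
}
```

In this model, sending 64 frames costs 4 indirect calls rather than 64, and
any per-call locking in the hook would be amortized the same way.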

---

Jesper Dangaard Brouer (4):
      bpf: devmap introduce dev_map_enqueue
      bpf: devmap prepare xdp frames for bulking
      xdp: add tracepoint for devmap like cpumap have
      xdp: change ndo_xdp_xmit API to support bulking


 drivers/net/ethernet/intel/i40e/i40e_txrx.c   |   26 ++++-
 drivers/net/ethernet/intel/i40e/i40e_txrx.h   |    2 
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   21 +++-
 drivers/net/tun.c                             |   37 ++++---
 drivers/net/virtio_net.c                      |   66 +++++++++---
 include/linux/bpf.h                           |   16 ++-
 include/linux/netdevice.h                     |   14 ++-
 include/net/page_pool.h                       |    5 +
 include/net/xdp.h                             |    1 
 include/trace/events/xdp.h                    |   50 +++++++++
 kernel/bpf/devmap.c                           |  134 ++++++++++++++++++++++++-
 net/core/filter.c                             |   19 +---
 net/core/xdp.c                                |   20 +++-
 samples/bpf/xdp_monitor_kern.c                |   49 +++++++++
 samples/bpf/xdp_monitor_user.c                |   69 +++++++++++++
 15 files changed, 446 insertions(+), 83 deletions(-)

* Re: [PATCH net-next v10 2/4] net: Introduce generic failover module
From: Michael S. Tsirkin @ 2018-05-11 18:09 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Sridhar Samudrala, stephen, davem, netdev, virtualization,
	virtio-dev, jesse.brandeburg, alexander.h.duyck, kubakici,
	jasowang, loseweigh, jiri, aaron.f.brown
In-Reply-To: <460f3d8f-b2ec-2118-e296-03f4f9655c5a@infradead.org>

On Mon, May 07, 2018 at 03:39:19PM -0700, Randy Dunlap wrote:
> Hi,
> 
> On 05/07/2018 03:10 PM, Sridhar Samudrala wrote:
> > 
> > Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
> > ---
> >  MAINTAINERS                |    7 +
> >  include/linux/netdevice.h  |   16 +
> >  include/net/net_failover.h |   52 +++
> >  net/Kconfig                |   10 +
> >  net/core/Makefile          |    1 +
> >  net/core/net_failover.c    | 1044 ++++++++++++++++++++++++++++++++++++++++++++
> >  6 files changed, 1130 insertions(+)
> >  create mode 100644 include/net/net_failover.h
> >  create mode 100644 net/core/net_failover.c
> 
> 
> > diff --git a/net/Kconfig b/net/Kconfig
> > index b62089fb1332..0540856676de 100644
> > --- a/net/Kconfig
> > +++ b/net/Kconfig
> > @@ -429,6 +429,16 @@ config MAY_USE_DEVLINK
> >  config PAGE_POOL
> >         bool
> >  
> > +config NET_FAILOVER
> > +	tristate "Failover interface"
> > +	default m
> 
> Need some justification for default m (as opposed to n).

Or one can just leave the default line out.

* [PATCH v3] net: phy: DP83TC811: Introduce support for the DP83TC811 phy
From: Dan Murphy @ 2018-05-11 18:08 UTC (permalink / raw)
  To: andrew, f.fainelli; +Cc: netdev, linux-kernel, Dan Murphy

Add support for the DP83811 phy.

The DP83811 supports both RGMII and SGMII interfaces.
There are 2 part numbers for this device: the DP83TC811R does not
reliably support the SGMII interface, but the DP83TC811S does.

There is no way to differentiate these parts from the
hardware or register set, so the DT is used to indicate
which phy mode is required.  Alternatively, the part can be
strapped to a certain interface.

Data sheet can be found here:
http://www.ti.com/product/DP83TC811S-Q1/description
http://www.ti.com/product/DP83TC811R-Q1/description

Signed-off-by: Dan Murphy <dmurphy@ti.com>
---

v3 - Variable length alignment - https://patchwork.kernel.org/patch/10389657/

v2 - Remove extra config_init in reset, update config_init call back function
fix a checkpatch alignment issue, add SGMII check in autoneg api - https://patchwork.kernel.org/patch/10389323/

 drivers/net/phy/Kconfig     |   5 +
 drivers/net/phy/Makefile    |   1 +
 drivers/net/phy/dp83tc811.c | 347 ++++++++++++++++++++++++++++++++++++
 3 files changed, 353 insertions(+)
 create mode 100644 drivers/net/phy/dp83tc811.c

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index bdfbabb86ee0..810140a9e114 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -285,6 +285,11 @@ config DP83822_PHY
 	---help---
 	  Supports the DP83822 PHY.
 
+config DP83TC811_PHY
+	tristate "Texas Instruments DP83TC811 PHY"
+	---help---
+	  Supports the DP83TC811 PHY.
+
 config DP83848_PHY
 	tristate "Texas Instruments DP83848 PHY"
 	---help---
diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
index 01acbcb2c798..00445b61a9a8 100644
--- a/drivers/net/phy/Makefile
+++ b/drivers/net/phy/Makefile
@@ -57,6 +57,7 @@ obj-$(CONFIG_CORTINA_PHY)	+= cortina.o
 obj-$(CONFIG_DAVICOM_PHY)	+= davicom.o
 obj-$(CONFIG_DP83640_PHY)	+= dp83640.o
 obj-$(CONFIG_DP83822_PHY)	+= dp83822.o
+obj-$(CONFIG_DP83TC811_PHY)	+= dp83tc811.o
 obj-$(CONFIG_DP83848_PHY)	+= dp83848.o
 obj-$(CONFIG_DP83867_PHY)	+= dp83867.o
 obj-$(CONFIG_FIXED_PHY)		+= fixed_phy.o
diff --git a/drivers/net/phy/dp83tc811.c b/drivers/net/phy/dp83tc811.c
new file mode 100644
index 000000000000..081d99aa3985
--- /dev/null
+++ b/drivers/net/phy/dp83tc811.c
@@ -0,0 +1,347 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for the Texas Instruments DP83TC811 PHY
+ *
+ * Copyright (C) 2018 Texas Instruments Incorporated - http://www.ti.com/
+ *
+ */
+
+#include <linux/ethtool.h>
+#include <linux/etherdevice.h>
+#include <linux/kernel.h>
+#include <linux/mii.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/phy.h>
+#include <linux/netdevice.h>
+
+#define DP83TC811_PHY_ID	0x2000a253
+#define DP83811_DEVADDR		0x1f
+
+#define MII_DP83811_SGMII_CTRL	0x09
+#define MII_DP83811_INT_STAT1	0x12
+#define MII_DP83811_INT_STAT2	0x13
+#define MII_DP83811_RESET_CTRL	0x1f
+
+#define DP83811_HW_RESET	BIT(15)
+#define DP83811_SW_RESET	BIT(14)
+
+/* INT_STAT1 bits */
+#define DP83811_RX_ERR_HF_INT_EN	BIT(0)
+#define DP83811_MS_TRAINING_INT_EN	BIT(1)
+#define DP83811_ANEG_COMPLETE_INT_EN	BIT(2)
+#define DP83811_ESD_EVENT_INT_EN	BIT(3)
+#define DP83811_WOL_INT_EN		BIT(4)
+#define DP83811_LINK_STAT_INT_EN	BIT(5)
+#define DP83811_ENERGY_DET_INT_EN	BIT(6)
+#define DP83811_LINK_QUAL_INT_EN	BIT(7)
+
+/* INT_STAT2 bits */
+#define DP83811_JABBER_DET_INT_EN	BIT(0)
+#define DP83811_POLARITY_INT_EN		BIT(1)
+#define DP83811_SLEEP_MODE_INT_EN	BIT(2)
+#define DP83811_OVERTEMP_INT_EN		BIT(3)
+#define DP83811_OVERVOLTAGE_INT_EN	BIT(6)
+#define DP83811_UNDERVOLTAGE_INT_EN	BIT(7)
+
+#define MII_DP83811_RXSOP1	0x04a5
+#define MII_DP83811_RXSOP2	0x04a6
+#define MII_DP83811_RXSOP3	0x04a7
+
+/* WoL Registers */
+#define MII_DP83811_WOL_CFG	0x04a0
+#define MII_DP83811_WOL_STAT	0x04a1
+#define MII_DP83811_WOL_DA1	0x04a2
+#define MII_DP83811_WOL_DA2	0x04a3
+#define MII_DP83811_WOL_DA3	0x04a4
+
+/* WoL bits */
+#define DP83811_WOL_MAGIC_EN	BIT(0)
+#define DP83811_WOL_SECURE_ON	BIT(5)
+#define DP83811_WOL_EN		BIT(7)
+#define DP83811_WOL_INDICATION_SEL BIT(8)
+#define DP83811_WOL_CLR_INDICATION BIT(11)
+
+/* SGMII CTRL bits */
+#define DP83811_TDR_AUTO		BIT(8)
+#define DP83811_SGMII_EN		BIT(12)
+#define DP83811_SGMII_AUTO_NEG_EN	BIT(13)
+#define DP83811_SGMII_TX_ERR_DIS	BIT(14)
+#define DP83811_SGMII_SOFT_RESET	BIT(15)
+
+static int dp83811_ack_interrupt(struct phy_device *phydev)
+{
+	int err;
+
+	err = phy_read(phydev, MII_DP83811_INT_STAT1);
+	if (err < 0)
+		return err;
+
+	err = phy_read(phydev, MII_DP83811_INT_STAT2);
+	if (err < 0)
+		return err;
+
+	return 0;
+}
+
+static int dp83811_set_wol(struct phy_device *phydev,
+			   struct ethtool_wolinfo *wol)
+{
+	struct net_device *ndev = phydev->attached_dev;
+	const u8 *mac;
+	u16 value;
+
+	if (wol->wolopts & (WAKE_MAGIC | WAKE_MAGICSECURE)) {
+		mac = (const u8 *)ndev->dev_addr;
+
+		if (!is_valid_ether_addr(mac))
+			return -EINVAL;
+
+		/* MAC addresses start with byte 5, but stored in mac[0].
+		 * 811 PHYs store bytes 4|5, 2|3, 0|1
+		 */
+		phy_write_mmd(phydev, DP83811_DEVADDR, MII_DP83811_WOL_DA1,
+			      (mac[1] << 8) | mac[0]);
+		phy_write_mmd(phydev, DP83811_DEVADDR, MII_DP83811_WOL_DA2,
+			      (mac[3] << 8) | mac[2]);
+		phy_write_mmd(phydev, DP83811_DEVADDR, MII_DP83811_WOL_DA3,
+			      (mac[5] << 8) | mac[4]);
+
+		value = phy_read_mmd(phydev, DP83811_DEVADDR,
+				     MII_DP83811_WOL_CFG);
+		if (wol->wolopts & WAKE_MAGIC)
+			value |= DP83811_WOL_MAGIC_EN;
+		else
+			value &= ~DP83811_WOL_MAGIC_EN;
+
+		if (wol->wolopts & WAKE_MAGICSECURE) {
+			phy_write_mmd(phydev, DP83811_DEVADDR,
+				      MII_DP83811_RXSOP1,
+				      (wol->sopass[1] << 8) | wol->sopass[0]);
+			phy_write_mmd(phydev, DP83811_DEVADDR,
+				      MII_DP83811_RXSOP2,
+				      (wol->sopass[3] << 8) | wol->sopass[2]);
+			phy_write_mmd(phydev, DP83811_DEVADDR,
+				      MII_DP83811_RXSOP3,
+				      (wol->sopass[5] << 8) | wol->sopass[4]);
+			value |= DP83811_WOL_SECURE_ON;
+		} else {
+			value &= ~DP83811_WOL_SECURE_ON;
+		}
+
+		value |= (DP83811_WOL_EN | DP83811_WOL_INDICATION_SEL |
+			  DP83811_WOL_CLR_INDICATION);
+		phy_write_mmd(phydev, DP83811_DEVADDR, MII_DP83811_WOL_CFG,
+			      value);
+	} else {
+		value = phy_read_mmd(phydev, DP83811_DEVADDR,
+				     MII_DP83811_WOL_CFG);
+		value &= ~DP83811_WOL_EN;
+		phy_write_mmd(phydev, DP83811_DEVADDR, MII_DP83811_WOL_CFG,
+			      value);
+	}
+
+	return 0;
+}
+
+static void dp83811_get_wol(struct phy_device *phydev,
+			    struct ethtool_wolinfo *wol)
+{
+	u16 sopass_val;
+	int value;
+
+	wol->supported = (WAKE_MAGIC | WAKE_MAGICSECURE);
+	wol->wolopts = 0;
+
+	value = phy_read_mmd(phydev, DP83811_DEVADDR, MII_DP83811_WOL_CFG);
+
+	if (value & DP83811_WOL_MAGIC_EN)
+		wol->wolopts |= WAKE_MAGIC;
+
+	if (value & DP83811_WOL_SECURE_ON) {
+		sopass_val = phy_read_mmd(phydev, DP83811_DEVADDR,
+					  MII_DP83811_RXSOP1);
+		wol->sopass[0] = (sopass_val & 0xff);
+		wol->sopass[1] = (sopass_val >> 8);
+
+		sopass_val = phy_read_mmd(phydev, DP83811_DEVADDR,
+					  MII_DP83811_RXSOP2);
+		wol->sopass[2] = (sopass_val & 0xff);
+		wol->sopass[3] = (sopass_val >> 8);
+
+		sopass_val = phy_read_mmd(phydev, DP83811_DEVADDR,
+					  MII_DP83811_RXSOP3);
+		wol->sopass[4] = (sopass_val & 0xff);
+		wol->sopass[5] = (sopass_val >> 8);
+
+		wol->wolopts |= WAKE_MAGICSECURE;
+	}
+
+	/* WoL is not enabled so set wolopts to 0 */
+	if (!(value & DP83811_WOL_EN))
+		wol->wolopts = 0;
+}
+
+static int dp83811_config_intr(struct phy_device *phydev)
+{
+	int misr_status, err;
+
+	if (phydev->interrupts == PHY_INTERRUPT_ENABLED) {
+		misr_status = phy_read(phydev, MII_DP83811_INT_STAT1);
+		if (misr_status < 0)
+			return misr_status;
+
+		misr_status |= (DP83811_RX_ERR_HF_INT_EN |
+				DP83811_MS_TRAINING_INT_EN |
+				DP83811_ANEG_COMPLETE_INT_EN |
+				DP83811_ESD_EVENT_INT_EN |
+				DP83811_WOL_INT_EN |
+				DP83811_LINK_STAT_INT_EN |
+				DP83811_ENERGY_DET_INT_EN |
+				DP83811_LINK_QUAL_INT_EN);
+
+		err = phy_write(phydev, MII_DP83811_INT_STAT1, misr_status);
+		if (err < 0)
+			return err;
+
+		misr_status = phy_read(phydev, MII_DP83811_INT_STAT2);
+		if (misr_status < 0)
+			return misr_status;
+
+		misr_status |= (DP83811_JABBER_DET_INT_EN |
+				DP83811_POLARITY_INT_EN |
+				DP83811_SLEEP_MODE_INT_EN |
+				DP83811_OVERTEMP_INT_EN |
+				DP83811_OVERVOLTAGE_INT_EN |
+				DP83811_UNDERVOLTAGE_INT_EN);
+
+		err = phy_write(phydev, MII_DP83811_INT_STAT2, misr_status);
+
+	} else {
+		err = phy_write(phydev, MII_DP83811_INT_STAT1, 0);
+		if (err < 0)
+			return err;
+
+		err = phy_write(phydev, MII_DP83811_INT_STAT2, 0);
+	}
+
+	return err;
+}
+
+static int dp83811_config_aneg(struct phy_device *phydev)
+{
+	int value, err;
+
+	if (phydev->interface == PHY_INTERFACE_MODE_SGMII) {
+		value = phy_read(phydev, MII_DP83811_SGMII_CTRL);
+		if (phydev->autoneg == AUTONEG_ENABLE) {
+			err = phy_write(phydev, MII_DP83811_SGMII_CTRL,
+					(DP83811_SGMII_AUTO_NEG_EN | value));
+			if (err < 0)
+				return err;
+		} else {
+			err = phy_write(phydev, MII_DP83811_SGMII_CTRL,
+					(~DP83811_SGMII_AUTO_NEG_EN & value));
+			if (err < 0)
+				return err;
+		}
+	}
+
+	return genphy_config_aneg(phydev);
+}
+
+static int dp83811_config_init(struct phy_device *phydev)
+{
+	int value, err;
+
+	err = genphy_config_init(phydev);
+	if (err < 0)
+		return err;
+
+	if (phydev->interface == PHY_INTERFACE_MODE_SGMII) {
+		value = phy_read(phydev, MII_DP83811_SGMII_CTRL);
+		if (!(value & DP83811_SGMII_EN)) {
+			err = phy_write(phydev, MII_DP83811_SGMII_CTRL,
+					(DP83811_SGMII_EN | value));
+			if (err < 0)
+				return err;
+		} else {
+			err = phy_write(phydev, MII_DP83811_SGMII_CTRL,
+					(~DP83811_SGMII_EN & value));
+			if (err < 0)
+				return err;
+		}
+	}
+
+	value = DP83811_WOL_MAGIC_EN | DP83811_WOL_SECURE_ON | DP83811_WOL_EN;
+
+	return phy_write_mmd(phydev, DP83811_DEVADDR, MII_DP83811_WOL_CFG,
+	      value);
+}
+
+static int dp83811_phy_reset(struct phy_device *phydev)
+{
+	int err;
+
+	err = phy_write(phydev, MII_DP83811_RESET_CTRL, DP83811_HW_RESET);
+	if (err < 0)
+		return err;
+
+	return 0;
+}
+
+static int dp83811_suspend(struct phy_device *phydev)
+{
+	int value;
+
+	value = phy_read_mmd(phydev, DP83811_DEVADDR, MII_DP83811_WOL_CFG);
+
+	if (!(value & DP83811_WOL_EN))
+		genphy_suspend(phydev);
+
+	return 0;
+}
+
+static int dp83811_resume(struct phy_device *phydev)
+{
+	int value;
+
+	genphy_resume(phydev);
+
+	value = phy_read_mmd(phydev, DP83811_DEVADDR, MII_DP83811_WOL_CFG);
+
+	phy_write_mmd(phydev, DP83811_DEVADDR, MII_DP83811_WOL_CFG, value |
+		      DP83811_WOL_CLR_INDICATION);
+
+	return 0;
+}
+
+static struct phy_driver dp83811_driver[] = {
+	{
+		.phy_id = DP83TC811_PHY_ID,
+		.phy_id_mask = 0xfffffff0,
+		.name = "TI DP83TC811",
+		.features = PHY_BASIC_FEATURES,
+		.flags = PHY_HAS_INTERRUPT,
+		.config_init = dp83811_config_init,
+		.config_aneg = dp83811_config_aneg,
+		.soft_reset = dp83811_phy_reset,
+		.get_wol = dp83811_get_wol,
+		.set_wol = dp83811_set_wol,
+		.ack_interrupt = dp83811_ack_interrupt,
+		.config_intr = dp83811_config_intr,
+		.suspend = dp83811_suspend,
+		.resume = dp83811_resume,
+	 },
+};
+module_phy_driver(dp83811_driver);
+
+static struct mdio_device_id __maybe_unused dp83811_tbl[] = {
+	{ DP83TC811_PHY_ID, 0xfffffff0 },
+	{ },
+};
+MODULE_DEVICE_TABLE(mdio, dp83811_tbl);
+
+MODULE_DESCRIPTION("Texas Instruments DP83TC811 PHY driver");
+MODULE_AUTHOR("Dan Murphy <dmurphy@ti.com>");
+MODULE_LICENSE("GPL");
-- 
2.17.0.582.gccdcbd54c
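
The byte packing used for the WoL address and SecureOn password in the patch
above can be tried out in user space. This is a minimal sketch that only
mirrors the `(mac[1] << 8) | mac[0]` pattern from dp83811_set_wol() and the
unpacking from dp83811_get_wol(); the helper names pack_words and
unpack_words are hypothetical, and no PHY registers are touched.

```c
#include <stdint.h>

/* Pack 6 bytes into three 16-bit words: word i holds byte 2i in its
 * low half and byte 2i+1 in its high half, as the driver writes them.
 */
static void pack_words(const uint8_t bytes[6], uint16_t words[3])
{
	int i;

	for (i = 0; i < 3; i++)
		words[i] = (uint16_t)((bytes[2 * i + 1] << 8) | bytes[2 * i]);
}

/* Inverse operation, matching how the driver reads the password back. */
static void unpack_words(const uint16_t words[3], uint8_t bytes[6])
{
	int i;

	for (i = 0; i < 3; i++) {
		bytes[2 * i] = (uint8_t)(words[i] & 0xff);
		bytes[2 * i + 1] = (uint8_t)(words[i] >> 8);
	}
}
```

For the address 00:11:22:33:44:55 this yields the words 0x1100, 0x3322 and
0x5544, and unpacking them restores the original bytes.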

* Re: [patch net] net: sched: fix error path in tcf_proto_create() when modules are not configured
From: Cong Wang @ 2018-05-11 17:56 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Linux Kernel Network Developers, David Miller, Jamal Hadi Salim,
	mlxsw
In-Reply-To: <20180511154532.2391-1-jiri@resnulli.us>

On Fri, May 11, 2018 at 8:45 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> From: Jiri Pirko <jiri@mellanox.com>
>
> In case modules are not configured, error out when tp->ops is null
> and prevent later null pointer dereference.
>
> Fixes: 33a48927c193 ("sched: push TC filter protocol creation into a separate function")
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>

Acked-by: Cong Wang <xiyou.wangcong@gmail.com>

* Re: [PATCH net] tun: fix use after free for ptr_ring
From: Michael S. Tsirkin @ 2018-05-11 17:52 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, linux-kernel, Eric Dumazet, Cong Wang
In-Reply-To: <1525849198-9786-1-git-send-email-jasowang@redhat.com>

On Wed, May 09, 2018 at 02:59:58PM +0800, Jason Wang wrote:
> We used to initialize ptr_ring during TUNSETIFF, because its
> size depends on the tx_queue_len of the netdevice. And we try to clean it
> up when the socket is detached from the netdevice. A race was spotted when
> trying to do uninit during a read, which leads to a use after free for
> the pointer ring. Solve this by always initializing a zero size ptr_ring
> in open() and resizing it during TUNSETIFF, so that we can safely do
> cleanup during close(). With this, there's no need for the workaround
> that was introduced by commit 4df0bfc79904 ("tun: fix a memory leak
> for tfile->tx_array").
> 
> Reported-by: syzbot+e8b902c3c3fadf0a9dba@syzkaller.appspotmail.com
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Cong Wang <xiyou.wangcong@gmail.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Fixes: 1576d9860599 ("tun: switch to use skb array for tx")
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  drivers/net/tun.c | 26 +++++++++++---------------
>  1 file changed, 11 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index ef33950..298cb96 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -681,15 +681,6 @@ static void tun_queue_purge(struct tun_file *tfile)
>  	skb_queue_purge(&tfile->sk.sk_error_queue);
>  }
>  
> -static void tun_cleanup_tx_ring(struct tun_file *tfile)
> -{
> -	if (tfile->tx_ring.queue) {
> -		ptr_ring_cleanup(&tfile->tx_ring, tun_ptr_free);
> -		xdp_rxq_info_unreg(&tfile->xdp_rxq);
> -		memset(&tfile->tx_ring, 0, sizeof(tfile->tx_ring));
> -	}
> -}
> -
>  static void __tun_detach(struct tun_file *tfile, bool clean)
>  {
>  	struct tun_file *ntfile;
> @@ -736,7 +727,8 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
>  			    tun->dev->reg_state == NETREG_REGISTERED)
>  				unregister_netdevice(tun->dev);
>  		}
> -		tun_cleanup_tx_ring(tfile);
> +		if (tun)
> +			xdp_rxq_info_unreg(&tfile->xdp_rxq);
>  		sock_put(&tfile->sk);
>  	}
>  }
> @@ -783,14 +775,14 @@ static void tun_detach_all(struct net_device *dev)
>  		tun_napi_del(tun, tfile);
>  		/* Drop read queue */
>  		tun_queue_purge(tfile);
> +		xdp_rxq_info_unreg(&tfile->xdp_rxq);
>  		sock_put(&tfile->sk);
> -		tun_cleanup_tx_ring(tfile);
>  	}
>  	list_for_each_entry_safe(tfile, tmp, &tun->disabled, next) {
>  		tun_enable_queue(tfile);
>  		tun_queue_purge(tfile);
> +		xdp_rxq_info_unreg(&tfile->xdp_rxq);
>  		sock_put(&tfile->sk);
> -		tun_cleanup_tx_ring(tfile);
>  	}
>  	BUG_ON(tun->numdisabled != 0);
>  
> @@ -834,7 +826,8 @@ static int tun_attach(struct tun_struct *tun, struct file *file,
>  	}
>  
>  	if (!tfile->detached &&
> -	    ptr_ring_init(&tfile->tx_ring, dev->tx_queue_len, GFP_KERNEL)) {
> +	    ptr_ring_resize(&tfile->tx_ring, dev->tx_queue_len,
> +			    GFP_KERNEL, __skb_array_destroy_skb)) {
>  		err = -ENOMEM;
>  		goto out;
>  	}
> @@ -3219,6 +3212,11 @@ static int tun_chr_open(struct inode *inode, struct file * file)
>  					    &tun_proto, 0);
>  	if (!tfile)
>  		return -ENOMEM;
> +	if (ptr_ring_init(&tfile->tx_ring, 0, GFP_KERNEL)) {
> +		sk_free(&tfile->sk);
> +		return -ENOMEM;
> +	}
> +
>  	RCU_INIT_POINTER(tfile->tun, NULL);
>  	tfile->flags = 0;
>  	tfile->ifindex = 0;
> @@ -3239,8 +3237,6 @@ static int tun_chr_open(struct inode *inode, struct file * file)
>  
>  	sock_set_flag(&tfile->sk, SOCK_ZEROCOPY);
>  
> -	memset(&tfile->tx_ring, 0, sizeof(tfile->tx_ring));
> -
>  	return 0;
>  }
>  
> -- 
> 2.7.4
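
The lifecycle described in the quoted commit message (create an empty ring in
open(), resize it on attach, free it only in close()) can be modelled in user
space. This is a simplified sketch, not the kernel ptr_ring API: toy_ring and
its helpers are hypothetical, and unlike the real resize this toy version
does not migrate queued entries.

```c
#include <stdlib.h>

struct toy_ring {
	void **queue;	/* always valid between init and cleanup */
	int size;
};

/* Allocate a ring; size 0 is allowed and still yields a valid queue. */
static int toy_ring_init(struct toy_ring *r, int size)
{
	r->queue = calloc(size ? (size_t)size : 1, sizeof(*r->queue));
	if (!r->queue)
		return -1;
	r->size = size;
	return 0;
}

/* Swap in a new queue of the requested size (entries are dropped here). */
static int toy_ring_resize(struct toy_ring *r, int size)
{
	void **q = calloc(size ? (size_t)size : 1, sizeof(*q));

	if (!q)
		return -1;
	free(r->queue);
	r->queue = q;
	r->size = size;
	return 0;
}

/* Free the ring; meant to be called once, from the close path. */
static void toy_ring_cleanup(struct toy_ring *r)
{
	free(r->queue);
	r->queue = NULL;
	r->size = 0;
}
```

Because the queue pointer exists from open to close, a reader never observes
a ring that was torn down by a detach in between.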

* Re: [PATCH v2 net 1/1] net sched actions: fix invalid pointer dereferencing if skbedit flags missing
From: Cong Wang @ 2018-05-11 17:47 UTC (permalink / raw)
  To: Roman Mashak
  Cc: David Miller, Linux Kernel Network Developers, kernel,
	Jamal Hadi Salim, Jiri Pirko, Alexander Duyck
In-Reply-To: <1526050509-30487-1-git-send-email-mrv@mojatatu.com>

On Fri, May 11, 2018 at 7:55 AM, Roman Mashak <mrv@mojatatu.com> wrote:
> When an application fails to pass flags in the netlink TLV for a new skbedit action,
> the kernel produces the following oops:
>
> [    8.307732] BUG: unable to handle kernel paging request at 0000000000021130
> [    8.309167] PGD 80000000193d1067 P4D 80000000193d1067 PUD 180e0067 PMD 0
> [    8.310595] Oops: 0000 [#1] SMP PTI
> [    8.311334] Modules linked in: kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper serio_raw
> [    8.314190] CPU: 1 PID: 397 Comm: tc Not tainted 4.17.0-rc3+ #357
> [    8.315252] RIP: 0010:__tcf_idr_release+0x33/0x140
> [    8.316203] RSP: 0018:ffffa0718038f840 EFLAGS: 00010246
> [    8.317123] RAX: 0000000000000001 RBX: 0000000000021100 RCX: 0000000000000000
> [    8.319831] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000021100
> [    8.321181] RBP: 0000000000000000 R08: 000000000004adf8 R09: 0000000000000122
> [    8.322645] R10: 0000000000000000 R11: ffffffff9e5b01ed R12: 0000000000000000
> [    8.324157] R13: ffffffff9e0d3cc0 R14: 0000000000000000 R15: 0000000000000000
> [    8.325590] FS:  00007f591292e700(0000) GS:ffff8fcf5bc40000(0000) knlGS:0000000000000000
> [    8.327001] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    8.327987] CR2: 0000000000021130 CR3: 00000000180e6004 CR4: 00000000001606a0
> [    8.329289] Call Trace:
> [    8.329735]  tcf_skbedit_init+0xa7/0xb0
> [    8.330423]  tcf_action_init_1+0x362/0x410
> [    8.331139]  ? try_to_wake_up+0x44/0x430
> [    8.331817]  tcf_action_init+0x103/0x190
> [    8.332511]  tc_ctl_action+0x11a/0x220
> [    8.333174]  rtnetlink_rcv_msg+0x23d/0x2e0
> [    8.333902]  ? _cond_resched+0x16/0x40
> [    8.334569]  ? __kmalloc_node_track_caller+0x5b/0x2c0
> [    8.335440]  ? rtnl_calcit.isra.31+0xf0/0xf0
> [    8.336178]  netlink_rcv_skb+0xdb/0x110
> [    8.336855]  netlink_unicast+0x167/0x220
> [    8.337550]  netlink_sendmsg+0x2a7/0x390
> [    8.338258]  sock_sendmsg+0x30/0x40
> [    8.338865]  ___sys_sendmsg+0x2c5/0x2e0
> [    8.339531]  ? pagecache_get_page+0x27/0x210
> [    8.340271]  ? filemap_fault+0xa2/0x630
> [    8.340943]  ? page_add_file_rmap+0x108/0x200
> [    8.341732]  ? alloc_set_pte+0x2aa/0x530
> [    8.342573]  ? finish_fault+0x4e/0x70
> [    8.343332]  ? __handle_mm_fault+0xbc1/0x10d0
> [    8.344337]  ? __sys_sendmsg+0x53/0x80
> [    8.345040]  __sys_sendmsg+0x53/0x80
> [    8.345678]  do_syscall_64+0x4f/0x100
> [    8.346339]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [    8.347206] RIP: 0033:0x7f591191da67
> [    8.347831] RSP: 002b:00007fff745abd48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> [    8.349179] RAX: ffffffffffffffda RBX: 00007fff745abe70 RCX: 00007f591191da67
> [    8.350431] RDX: 0000000000000000 RSI: 00007fff745abdc0 RDI: 0000000000000003
> [    8.351659] RBP: 000000005af35251 R08: 0000000000000001 R09: 0000000000000000
> [    8.352922] R10: 00000000000005f1 R11: 0000000000000246 R12: 0000000000000000
> [    8.354183] R13: 00007fff745afed0 R14: 0000000000000001 R15: 00000000006767c0
> [    8.355400] Code: 41 89 d4 53 89 f5 48 89 fb e8 aa 20 fd ff 85 c0 0f 84 ed 00
> 00 00 48 85 db 0f 84 cf 00 00 00 40 84 ed 0f 85 cd 00 00 00 45 84 e4 <8b> 53 30
> 74 0d 85 d2 b8 ff ff ff ff 0f 8f b3 00 00 00 8b 43 2c
> [    8.358699] RIP: __tcf_idr_release+0x33/0x140 RSP: ffffa0718038f840
> [    8.359770] CR2: 0000000000021130
> [    8.360438] ---[ end trace 60c66be45dfc14f0 ]---
>
> The caller calls the action's ->init() and passes a pointer to "struct tc_action *a",
> which may later be initialized to point at an existing action. Otherwise
> "struct tc_action *a" is still invalid, and dereferencing it is an
> error, as happens in tcf_idr_release, where refcnt is decremented.
>
> So in case of missing flags, tcf_idr_release must be called only for
> existing actions.
>
> v2:
>     - prepare patch for net tree
>
> Signed-off-by: Roman Mashak <mrv@mojatatu.com>

Fixes: 5e1567aeb7fe ("net sched: skbedit action fix late binding")

Acked-by: Cong Wang <xiyou.wangcong@gmail.com>

* Re: [PATCH net V2] tun: fix use after free for ptr_ring
From: Cong Wang @ 2018-05-11 17:39 UTC (permalink / raw)
  To: Jason Wang
  Cc: Linux Kernel Network Developers, LKML, Eric Dumazet,
	Michael S. Tsirkin
In-Reply-To: <1526006965-9124-1-git-send-email-jasowang@redhat.com>

On Thu, May 10, 2018 at 7:49 PM, Jason Wang <jasowang@redhat.com> wrote:
>  static void __tun_detach(struct tun_file *tfile, bool clean)
>  {
>         struct tun_file *ntfile;
> @@ -736,7 +727,8 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
>                             tun->dev->reg_state == NETREG_REGISTERED)
>                                 unregister_netdevice(tun->dev);
>                 }
> -               tun_cleanup_tx_ring(tfile);
> +               if (tun)
> +                       xdp_rxq_info_unreg(&tfile->xdp_rxq);
>                 sock_put(&tfile->sk);
>         }
>  }
> @@ -783,14 +775,14 @@ static void tun_detach_all(struct net_device *dev)
>                 tun_napi_del(tun, tfile);
>                 /* Drop read queue */
>                 tun_queue_purge(tfile);
> +               xdp_rxq_info_unreg(&tfile->xdp_rxq);
>                 sock_put(&tfile->sk);
> -               tun_cleanup_tx_ring(tfile);
>         }
>         list_for_each_entry_safe(tfile, tmp, &tun->disabled, next) {
>                 tun_enable_queue(tfile);
>                 tun_queue_purge(tfile);
> +               xdp_rxq_info_unreg(&tfile->xdp_rxq);
>                 sock_put(&tfile->sk);
> -               tun_cleanup_tx_ring(tfile);

Are you sure this is safe?

xdp_rxq_info_unreg() can't be called more than once either,
please make sure the warning that commit c13da21cdb80
("tun: avoid calling xdp_rxq_info_unreg() twice") fixed will not
show up again.

* [PATCH net] packet: in packet_snd start writing at link layer allocation
From: Willem de Bruijn @ 2018-05-11 17:24 UTC (permalink / raw)
  To: netdev; +Cc: davem, eric.dumazet, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Packet sockets allow construction of packets shorter than
dev->hard_header_len to accommodate protocols with variable length
link layer headers. These packets are padded to dev->hard_header_len,
because some device drivers interpret that as a minimum packet size.

packet_snd reserves dev->hard_header_len bytes on allocation.
SOCK_DGRAM sockets call skb_push in dev_hard_header() to ensure that
link layer headers are stored in the reserved range. SOCK_RAW sockets
do the same in tpacket_snd, but not in packet_snd.

Syzbot was able to send a zero byte packet to a device with a massive
116B link layer header, causing padding to cross over into skb_shinfo.
Fix this by writing from the start of the llheader reserved range also
in the case of packet_snd/SOCK_RAW.

Update skb_set_network_header to the new offset. This also corrects
it for SOCK_DGRAM, where it incorrectly double counted reserve due to
the skb_push in dev_hard_header.

Fixes: 9ed988cd5915 ("packet: validate variable length ll headers")
Reported-by: syzbot+71d74a5406d02057d559@syzkaller.appspotmail.com
Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 net/packet/af_packet.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 01f3515cada0d..e9422fe451793 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2903,13 +2903,15 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	if (skb == NULL)
 		goto out_unlock;
 
-	skb_set_network_header(skb, reserve);
+	skb_reset_network_header(skb);
 
 	err = -EINVAL;
 	if (sock->type == SOCK_DGRAM) {
 		offset = dev_hard_header(skb, dev, ntohs(proto), addr, NULL, len);
 		if (unlikely(offset < 0))
 			goto out_free;
+	} else if (reserve) {
+		skb_push(skb, reserve);
 	}
 
 	/* Returns -EFAULT on error */
-- 
2.17.0.441.gb46fe60e1d-goog
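
The overflow described in the commit message above can be modelled with plain
offsets. This is a simplified sketch under stated assumptions: toy_skb and its
helpers are hypothetical, the buffer holds exactly reserve + len bytes, and
padding is modelled as extending the data area to a minimum length starting
at the current data offset.

```c
#include <stddef.h>

struct toy_skb {
	size_t size;	/* bytes allocated: reserve + payload length */
	size_t data;	/* offset of the first valid byte */
	size_t tail;	/* offset one past the last valid byte */
};

/* Allocate and reserve headroom, so data starts after the reserve. */
static void toy_alloc(struct toy_skb *skb, size_t reserve, size_t len)
{
	skb->size = reserve + len;
	skb->data = reserve;
	skb->tail = reserve;
}

/* Move the data pointer back into the headroom by n bytes. */
static void toy_push(struct toy_skb *skb, size_t n)
{
	skb->data -= n;
}

/* Offset where the packet ends once padded to at least min_len bytes. */
static size_t toy_pad_end(const struct toy_skb *skb, size_t min_len)
{
	size_t cur = skb->tail - skb->data;

	return cur < min_len ? skb->data + min_len : skb->tail;
}

/* Nonzero if the padded packet stays inside the allocation. */
static int toy_pad_fits(const struct toy_skb *skb, size_t min_len)
{
	return toy_pad_end(skb, min_len) <= skb->size;
}
```

With a 116 byte reserve and a zero byte payload, padding from the data offset
ends at offset 232 in a 116 byte buffer. After pushing the reserve, the same
padding ends at offset 116 and fits.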

* Re: [PATCH net-next v10 2/4] net: Introduce generic failover module
From: Michael S. Tsirkin @ 2018-05-11 17:15 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: alexander.h.duyck, virtio-dev, jiri, kubakici, netdev,
	virtualization, loseweigh, aaron.f.brown, davem
In-Reply-To: <e8454b29-d66b-9e20-a887-cb312a63847e@intel.com>

On Mon, May 07, 2018 at 05:24:27PM -0700, Samudrala, Sridhar wrote:
> 
> 
> On 5/7/2018 4:53 PM, Stephen Hemminger wrote:
> > On Mon,  7 May 2018 15:10:44 -0700
> > Sridhar Samudrala <sridhar.samudrala@intel.com> wrote:
> > 
> > > +static struct net_device *net_failover_get_bymac(u8 *mac,
> > > +						 struct net_failover_ops **ops)
> > > +{
> > > +	struct net_device *failover_dev;
> > > +	struct net_failover *failover;
> > > +
> > > +	spin_lock(&net_failover_lock);
> > > +	list_for_each_entry(failover, &net_failover_list, list) {
> > > +		failover_dev = rtnl_dereference(failover->failover_dev);
> > > +		if (ether_addr_equal(failover_dev->perm_addr, mac)) {
> > > +			*ops = rtnl_dereference(failover->ops);
> > > +			spin_unlock(&net_failover_lock);
> > > +			return failover_dev;
> > > +		}
> > > +	}
> > > +	spin_unlock(&net_failover_lock);
> > > +	return NULL;
> > > +}
> > This is broken if non-ethernet devices such as Infiniband are present.
> 
> There is a check to make sure that the slave and failover devices are of the same type in
> net_failover_slave_register()
> 
> 	failover_dev = net_failover_get_bymac(slave_dev->perm_addr, &nfo_ops);
>         if (!failover_dev)
>                 goto done;
> 
>         if (failover_dev->type != slave_dev->type)
>                 goto done;
> 
> Do you think this is not good enough? I had an explicit check for ARPHRD_ETHER in
> earlier patchsets, but removed it based on Jiri's comment.

Right but how is ether_addr_equal supposed to work if types are
identical but not ethernet?

This can also benefit from a comment referring to the check in
net_failover_slave_register.

-- 
MST
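
To make the concern above concrete: comparing only the first 6 bytes treats
two longer hardware addresses as equal when they differ later on. The sketch
below is a user-space model, not a proposed fix for the patch; toy_dev and
toy_addr_equal are hypothetical, and the 20 byte address length used in the
usage note is the InfiniBand value.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_ADDR_LEN 32

struct toy_dev {
	unsigned short type;		/* link type, e.g. an ARPHRD_* value */
	unsigned char addr_len;		/* hardware address length in bytes */
	unsigned char perm_addr[TOY_MAX_ADDR_LEN];
};

/* Compare full hardware addresses, honouring type and address length. */
static bool toy_addr_equal(const struct toy_dev *a, const struct toy_dev *b)
{
	return a->type == b->type &&
	       a->addr_len == b->addr_len &&
	       memcmp(a->perm_addr, b->perm_addr, a->addr_len) == 0;
}
```

Two devices with 20 byte addresses that differ only at byte 10 compare equal
over the first 6 bytes, but toy_addr_equal() reports them as different.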

* [PATCH v2 3/3] selinux: correctly handle sa_family cases in selinux_sctp_bind_connect()
From: Alexey Kodanev @ 2018-05-11 17:15 UTC (permalink / raw)
  To: selinux
  Cc: Richard Haines, Paul Moore, Stephen Smalley, Eric Paris,
	linux-security-module, netdev, Alexey Kodanev
In-Reply-To: <1526058913-14198-1-git-send-email-alexey.kodanev@oracle.com>

Allow passing a socket address structure with the AF_UNSPEC family for
compatibility purposes. selinux_socket_bind() will further check it
for INADDR_ANY and selinux_socket_connect_helper() should return
EINVAL.

For a bad address family return EINVAL instead of the EAFNOSUPPORT
error, i.e. what is expected from the SCTP protocol in such a case.

Fixes: d452930fd3b9 ("selinux: Add SCTP support")
Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
---

v2: new patch in v2

 security/selinux/hooks.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index e7882e5a..be5817d 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -5277,6 +5277,7 @@ static int selinux_sctp_bind_connect(struct sock *sk, int optname,
 	while (walk_size < addrlen) {
 		addr = addr_buf;
 		switch (addr->sa_family) {
+		case AF_UNSPEC:
 		case AF_INET:
 			len = sizeof(struct sockaddr_in);
 			break;
@@ -5284,7 +5285,7 @@ static int selinux_sctp_bind_connect(struct sock *sk, int optname,
 			len = sizeof(struct sockaddr_in6);
 			break;
 		default:
-			return -EAFNOSUPPORT;
+			return -EINVAL;
 		}
 
 		err = -EINVAL;
-- 
1.8.3.1
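
For readers following along, the loop touched by this patch walks a packed
buffer of addresses, as handed over by sctp_bindx() or sctp_connectx(). The
following is a rough userspace sketch of that walk under the new rules
(AF_UNSPEC sized like AF_INET, anything else unknown is -EINVAL). The names
sketch_addr_len, sketch_walk and demo_walk are made up for illustration and
no SELinux checks are modelled.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Size of the address that starts at addr, or -EINVAL for a bad family. */
static int sketch_addr_len(const struct sockaddr *addr)
{
	switch (addr->sa_family) {
	case AF_UNSPEC:	/* sized like AF_INET, for compatibility */
	case AF_INET:
		return sizeof(struct sockaddr_in);
	case AF_INET6:
		return sizeof(struct sockaddr_in6);
	default:
		return -EINVAL;
	}
}

/* Count the addresses in a packed buffer. Returns a negative errno on a
 * bad family or on an entry that does not fit in the buffer.
 */
static int sketch_walk(const void *buf, size_t buflen)
{
	const char *p = buf;
	size_t off = 0;
	int n = 0;

	assert(buf != NULL);
	while (off < buflen) {
		struct sockaddr sa;
		int len;

		if (buflen - off < sizeof(sa))
			return -EINVAL;
		memcpy(&sa, p + off, sizeof(sa));	/* avoids unaligned access */
		len = sketch_addr_len(&sa);
		if (len < 0)
			return len;
		if ((size_t)len > buflen - off)
			return -EINVAL;
		off += len;
		n++;
	}
	return n;
}

/* Build "AF_UNSPEC sockaddr_in + sockaddr_in6-sized entry" and walk it. */
static int demo_walk(int second_family)
{
	char buf[sizeof(struct sockaddr_in) + sizeof(struct sockaddr_in6)];
	struct sockaddr_in a4;
	struct sockaddr_in6 a6;

	memset(&a4, 0, sizeof(a4));
	memset(&a6, 0, sizeof(a6));
	a4.sin_family = AF_UNSPEC;
	a6.sin6_family = second_family;
	memcpy(buf, &a4, sizeof(a4));
	memcpy(buf + sizeof(a4), &a6, sizeof(a6));
	return sketch_walk(buf, sizeof(buf));
}
```

With this model, demo_walk(AF_INET6) counts two addresses, while an
unsupported family in the second slot makes the walk fail with -EINVAL.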

^ permalink raw reply related

* [PATCH v2 1/3] selinux: add AF_UNSPEC and INADDR_ANY checks to selinux_socket_bind()
From: Alexey Kodanev @ 2018-05-11 17:15 UTC (permalink / raw)
  To: selinux
  Cc: Richard Haines, Paul Moore, Stephen Smalley, Eric Paris,
	linux-security-module, netdev, Alexey Kodanev

Commit d452930fd3b9 ("selinux: Add SCTP support") breaks compatibility
with the old programs that can pass sockaddr_in structure with AF_UNSPEC
and INADDR_ANY to bind(). As a result, bind() returns EAFNOSUPPORT error.
This was found with LTP/asapi_01 test.

Similar to commit 29c486df6a20 ("net: ipv4: relax AF_INET check in
bind()"), which relaxed AF_INET check for compatibility, add AF_UNSPEC
case to AF_INET and make sure that the address is INADDR_ANY.

Fixes: d452930fd3b9 ("selinux: Add SCTP support")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
---

v2: As suggested by Paul:
    * return EINVAL for SCTP socket if sa_family is AF_UNSPEC and
      address is not INADDR_ANY
    * add new 'family_sa' variable so that it equals either AF_INET
      or AF_INET6. Besides, it will be used in the next patch that
      fixes the audit record.

 security/selinux/hooks.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 4cafe6a..1ed7004 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -4576,6 +4576,7 @@ static int selinux_socket_post_create(struct socket *sock, int family,
 static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, int addrlen)
 {
 	struct sock *sk = sock->sk;
+	struct sk_security_struct *sksec = sk->sk_security;
 	u16 family;
 	int err;
 
@@ -4587,11 +4588,11 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
 	family = sk->sk_family;
 	if (family == PF_INET || family == PF_INET6) {
 		char *addrp;
-		struct sk_security_struct *sksec = sk->sk_security;
 		struct common_audit_data ad;
 		struct lsm_network_audit net = {0,};
 		struct sockaddr_in *addr4 = NULL;
 		struct sockaddr_in6 *addr6 = NULL;
+		u16 family_sa = address->sa_family;
 		unsigned short snum;
 		u32 sid, node_perm;
 
@@ -4601,11 +4602,20 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
 		 * need to check address->sa_family as it is possible to have
 		 * sk->sk_family = PF_INET6 with addr->sa_family = AF_INET.
 		 */
-		switch (address->sa_family) {
+		switch (family_sa) {
+		case AF_UNSPEC:
 		case AF_INET:
 			if (addrlen < sizeof(struct sockaddr_in))
 				return -EINVAL;
 			addr4 = (struct sockaddr_in *)address;
+			if (family_sa == AF_UNSPEC) {
+				/* see __inet_bind(), we only want to allow
+				 * AF_UNSPEC if the address is INADDR_ANY
+				 */
+				if (addr4->sin_addr.s_addr != htonl(INADDR_ANY))
+					goto err_af;
+				family_sa = AF_INET;
+			}
 			snum = ntohs(addr4->sin_port);
 			addrp = (char *)&addr4->sin_addr.s_addr;
 			break;
@@ -4617,13 +4627,7 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
 			addrp = (char *)&addr6->sin6_addr.s6_addr;
 			break;
 		default:
-			/* Note that SCTP services expect -EINVAL, whereas
-			 * others expect -EAFNOSUPPORT.
-			 */
-			if (sksec->sclass == SECCLASS_SCTP_SOCKET)
-				return -EINVAL;
-			else
-				return -EAFNOSUPPORT;
+			goto err_af;
 		}
 
 		if (snum) {
@@ -4681,7 +4685,7 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
 		ad.u.net->sport = htons(snum);
 		ad.u.net->family = family;
 
-		if (address->sa_family == AF_INET)
+		if (family_sa == AF_INET)
 			ad.u.net->v4info.saddr = addr4->sin_addr.s_addr;
 		else
 			ad.u.net->v6info.saddr = addr6->sin6_addr;
@@ -4694,6 +4698,11 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
 	}
 out:
 	return err;
+err_af:
+	/* Note that SCTP services expect -EINVAL, others -EAFNOSUPPORT. */
+	if (sksec->sclass == SECCLASS_SCTP_SOCKET)
+		return -EINVAL;
+	return -EAFNOSUPPORT;
 }
 
 /* This supports connect(2) and SCTP connect services such as sctp_connectx(3)
-- 
1.8.3.1
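
As a reading aid, the family handling added to selinux_socket_bind() can be
modelled in userspace roughly as below. This is only a sketch for the
IPv4-sized case: sketch_bind_family and demo_bind are hypothetical names,
the is_sctp flag stands in for the sksec->sclass test, and nothing else from
the hook (ports, AVC checks, IPv6) is modelled.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Resolve the effective family of an IPv4-sized bind address.
 * Returns AF_INET on success or a negative errno.
 */
static int sketch_bind_family(const struct sockaddr_in *addr4, int is_sctp)
{
	assert(addr4 != NULL);
	switch (addr4->sin_family) {
	case AF_UNSPEC:
		/* AF_UNSPEC is only tolerated together with INADDR_ANY */
		if (addr4->sin_addr.s_addr != htonl(INADDR_ANY))
			break;
		return AF_INET;
	case AF_INET:
		return AF_INET;
	default:
		break;
	}
	/* SCTP callers expect -EINVAL, everyone else -EAFNOSUPPORT */
	return is_sctp ? -EINVAL : -EAFNOSUPPORT;
}

/* Convenience wrapper: host_addr is in host byte order. */
static int demo_bind(int family, uint32_t host_addr, int is_sctp)
{
	struct sockaddr_in a;

	memset(&a, 0, sizeof(a));
	a.sin_family = family;
	a.sin_addr.s_addr = htonl(host_addr);
	return sketch_bind_family(&a, is_sctp);
}
```

For example, demo_bind(AF_UNSPEC, INADDR_ANY, 0) resolves to AF_INET, while
AF_UNSPEC with a loopback address is rejected with the errno matching the
socket class.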

^ permalink raw reply related

* [PATCH v2 2/3] selinux: fix address family in bind() and connect() to match address/port
From: Alexey Kodanev @ 2018-05-11 17:15 UTC (permalink / raw)
  To: selinux
  Cc: Richard Haines, Paul Moore, Stephen Smalley, Eric Paris,
	linux-security-module, netdev, Alexey Kodanev
In-Reply-To: <1526058913-14198-1-git-send-email-alexey.kodanev@oracle.com>

Since sctp_bindx() and sctp_connectx() can have multiple addresses,
sk_family can differ from sa_family. Therefore, selinux_socket_bind()
and selinux_socket_connect_helper(), which process sockaddr structure
(address and port), should use the address family from that structure
too, and not from the socket one.

The initialization of the data for the audit record is moved up
in selinux_socket_bind(), so that there are no duplicate changes or
code.

Fixes: d452930fd3b9 ("selinux: Add SCTP support")
Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
---

v2: new patch in v2

 security/selinux/hooks.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 1ed7004..e7882e5a 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -4630,6 +4630,11 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
 			goto err_af;
 		}
 
+		ad.type = LSM_AUDIT_DATA_NET;
+		ad.u.net = &net;
+		ad.u.net->sport = htons(snum);
+		ad.u.net->family = family_sa;
+
 		if (snum) {
 			int low, high;
 
@@ -4641,10 +4646,6 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
 						      snum, &sid);
 				if (err)
 					goto out;
-				ad.type = LSM_AUDIT_DATA_NET;
-				ad.u.net = &net;
-				ad.u.net->sport = htons(snum);
-				ad.u.net->family = family;
 				err = avc_has_perm(&selinux_state,
 						   sksec->sid, sid,
 						   sksec->sclass,
@@ -4676,15 +4677,10 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
 			break;
 		}
 
-		err = sel_netnode_sid(addrp, family, &sid);
+		err = sel_netnode_sid(addrp, family_sa, &sid);
 		if (err)
 			goto out;
 
-		ad.type = LSM_AUDIT_DATA_NET;
-		ad.u.net = &net;
-		ad.u.net->sport = htons(snum);
-		ad.u.net->family = family;
-
 		if (family_sa == AF_INET)
 			ad.u.net->v4info.saddr = addr4->sin_addr.s_addr;
 		else
@@ -4780,7 +4776,7 @@ static int selinux_socket_connect_helper(struct socket *sock,
 		ad.type = LSM_AUDIT_DATA_NET;
 		ad.u.net = &net;
 		ad.u.net->dport = htons(snum);
-		ad.u.net->family = sk->sk_family;
+		ad.u.net->family = address->sa_family;
 		err = avc_has_perm(&selinux_state,
 				   sksec->sid, sid, sksec->sclass, perm, &ad);
 		if (err)
-- 
1.8.3.1

^ permalink raw reply related

* Re: KASAN: null-ptr-deref Read in rds_ib_get_mr
From: Santosh Shilimkar @ 2018-05-11 16:58 UTC (permalink / raw)
  To: Yanjun Zhu, DaeRyong Jeong, davem
  Cc: netdev, linux-rdma, rds-devel, linux-kernel, byoungyoung, kt0755
In-Reply-To: <fa3461d4-8872-48af-9b67-be0affd16bbd@oracle.com>

On 5/11/2018 12:48 AM, Yanjun Zhu wrote:
> 
> 
> On 2018/5/11 13:20, DaeRyong Jeong wrote:
>> We report the crash: KASAN: null-ptr-deref Read in rds_ib_get_mr
>>
>> Note that this bug was previously reported by syzkaller.
>> https://syzkaller.appspot.com/bug?id=0bb56a5a48b000b52aa2b0d8dd20b1f545214d91
>>
>> Nonetheless, this bug has not been fixed yet, and we hope that this report
>> and our analysis, which was helped by RaceFuzzer's features, will be helpful
>> in fixing the crash.
>>
>> This crash has been found in v4.17-rc1 using RaceFuzzer (a modified
>> version of Syzkaller), which we describe more at the end of this
>> report. Our analysis shows that the race occurs when invoking two
>> syscalls concurrently, bind$rds and setsockopt$RDS_GET_MR.
>>
>>
>> Analysis:
>> We think the concurrent execution of __rds_rdma_map() and rds_bind()
>> causes the problem. __rds_rdma_map() checks whether rs->rs_bound_addr
>> is 0 or not. But the concurrent execution with rds_bind() can bypass this
>> check. Therefore, __rds_rdma_map() calls rs->rs_transport->get_mr() and
>> rds_ib_get_mr() causes the null deref at ib_rdma.c:544 in v4.17-rc1, when
>> dereferencing rs_conn.
>>
>>
>> Thread interleaving:
>> CPU0 (__rds_rdma_map)                    CPU1 (rds_bind)
>>                             // rds_add_bound() sets rs->bound_addr to non-zero
>>                             ret = rds_add_bound(rs, sin->sin_addr.s_addr, &sin->sin_port);
>> if (rs->rs_bound_addr == 0 || !rs->rs_transport) {
>>     ret = -ENOTCONN; /* XXX not a great errno */
>>     goto out;
>> }
>>                             if (rs->rs_transport) { /* previously bound */
>>                                 trans = rs->rs_transport;
>>                                 if (trans->laddr_check(sock_net(sock->sk),
>>                                                sin->sin_addr.s_addr) != 0) {
>>                                     ret = -ENOPROTOOPT;
>>                                     // rds_remove_bound() sets rs->bound_addr to 0
>>                                     rds_remove_bound(rs);
>> ...
>> trans_private = rs->rs_transport->get_mr(sg, nents, rs,
>>                      &mr->r_key);
>> (in rds_ib_get_mr())
>> struct rds_ib_connection *ic = rs->rs_conn->c_transport_data;
>>
>>
>> Call sequence (v4.17-rc1):
>> CPU0
>> rds_setsockopt
>>     rds_get_mr
>>         __rds_rdma_map
>>             rds_ib_get_mr
>>
>>
>> CPU1
>> rds_bind
>>     rds_add_bound
>>     ...
>>     rds_remove_bound
>>
>>
>> Crash log:
>> ==================================================================
>> BUG: KASAN: null-ptr-deref in rds_ib_get_mr+0x3a/0x150 net/rds/ib_rdma.c:544
>> Read of size 8 at addr 0000000000000068 by task syz-executor0/32067
>>
>> CPU: 0 PID: 32067 Comm: syz-executor0 Not tainted 4.17.0-rc1 #1
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
>> Call Trace:
>>   __dump_stack lib/dump_stack.c:77 [inline]
>>   dump_stack+0x166/0x21c lib/dump_stack.c:113
>>   kasan_report_error mm/kasan/report.c:352 [inline]
>>   kasan_report+0x140/0x360 mm/kasan/report.c:412
>>   check_memory_region_inline mm/kasan/kasan.c:260 [inline]
>>   __asan_load8+0x54/0x90 mm/kasan/kasan.c:699
>>   rds_ib_get_mr+0x3a/0x150 net/rds/ib_rdma.c:544
>>   __rds_rdma_map+0x521/0x9d0 net/rds/rdma.c:271
>>   rds_get_mr+0xad/0xf0 net/rds/rdma.c:333
>>   rds_setsockopt+0x57f/0x720 net/rds/af_rds.c:347
>>   __sys_setsockopt+0x147/0x230 net/socket.c:1903
>>   __do_sys_setsockopt net/socket.c:1914 [inline]
>>   __se_sys_setsockopt net/socket.c:1911 [inline]
>>   __x64_sys_setsockopt+0x67/0x80 net/socket.c:1911
>>   do_syscall_64+0x15f/0x4a0 arch/x86/entry/common.c:287
>>   entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> RIP: 0033:0x4563f9
>> RSP: 002b:00007f6a2b3c2b28 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
>> RAX: ffffffffffffffda RBX: 000000000072bee0 RCX: 00000000004563f9
>> RDX: 0000000000000002 RSI: 0000000000000114 RDI: 0000000000000015
>> RBP: 0000000000000575 R08: 0000000000000020 R09: 0000000000000000
>> R10: 0000000020000140 R11: 0000000000000246 R12: 00007f6a2b3c36d4
>> R13: 00000000ffffffff R14: 00000000006fd398 R15: 0000000000000000
>> ==================================================================
> diff --git a/net/rds/ib_rdma.c b/net/rds/ib_rdma.c
> index e678699..2228b50 100644
> --- a/net/rds/ib_rdma.c
> +++ b/net/rds/ib_rdma.c
> @@ -539,11 +539,17 @@ void rds_ib_flush_mrs(void)
>   void *rds_ib_get_mr(struct scatterlist *sg, unsigned long nents,
>                      struct rds_sock *rs, u32 *key_ret)
>   {
> -       struct rds_ib_device *rds_ibdev;
> +       struct rds_ib_device *rds_ibdev = NULL;
>          struct rds_ib_mr *ibmr = NULL;
> -       struct rds_ib_connection *ic = rs->rs_conn->c_transport_data;
> +       struct rds_ib_connection *ic = NULL;
>          int ret;
> 
> +       if (rs->rs_bound_addr == 0) {
> +               ret = -EPERM;
> +               goto out;
> +       }
> +
No, you can't return such an error for this API, and the
socket-related checks need to be done at the core layer.
I remember fixing this race but probably never pushed the
fix upstream.

The MR code is due for an update with optimized FRWR code,
which is now stable enough. We will address this issue as
well, as part of that patchset.

Thanks for looking into it.

Regards,
Santosh

^ permalink raw reply

* Re: [PATCH bpf-next] samples/bpf: xdp_monitor, accept short options
From: Jesper Dangaard Brouer @ 2018-05-11 16:31 UTC (permalink / raw)
  To: Prashant Bhole
  Cc: Daniel Borkmann, Alexei Starovoitov, David S . Miller, netdev,
	brouer
In-Reply-To: <20180511013751.4360-1-bhole_prashant_q7@lab.ntt.co.jp>

On Fri, 11 May 2018 10:37:51 +0900
Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> wrote:

> Update optstring to accept short options
> 
> Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
> ---
>  samples/bpf/xdp_monitor_user.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/samples/bpf/xdp_monitor_user.c b/samples/bpf/xdp_monitor_user.c
> index 894bc64c2cac..668511c77aaf 100644
> --- a/samples/bpf/xdp_monitor_user.c
> +++ b/samples/bpf/xdp_monitor_user.c
> @@ -594,7 +594,7 @@ int main(int argc, char **argv)
>  	snprintf(bpf_obj_file, sizeof(bpf_obj_file), "%s_kern.o", argv[0]);
>  
>  	/* Parse commands line args */
> -	while ((opt = getopt_long(argc, argv, "h",
> +	while ((opt = getopt_long(argc, argv, "hDSs:",
>  				  long_options, &longindex)) != -1) {
>  		switch (opt) {
>  		case 'D':

It was actually on purpose that I didn't add the short options,
in order to force people to use those "self-documenting" long options when
they show the usage on public mailing lists or in blog posts.

If you want these short options, you also have to correct the "usage"
function, which states that these are "internal" short options.

Notice that the long-option parsing done by getopt_long() allows you to
specify only part of the string.  Although, I can see --s is ambiguous.

$ sudo ./xdp_monitor --s
./xdp_monitor: option '--s' is ambiguous; possibilities: '--stats' '--sec'
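
The prefix matching can be tried out with a small standalone sketch. The
option table below is an assumption, reconstructed from the "hDSs:" optstring
in the patch and the error message above; sketch_parse_one is a made-up
helper that feeds a single word to getopt_long() with the original "h"
optstring and returns what it reports.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Assumed long options: help, debug, stats and sec (takes an argument). */
static const struct option sketch_long_options[] = {
	{ "help",  no_argument,       NULL, 'h' },
	{ "debug", no_argument,       NULL, 'D' },
	{ "stats", no_argument,       NULL, 'S' },
	{ "sec",   required_argument, NULL, 's' },
	{ NULL,    0,                 NULL, 0   }
};

/* Parse one command-line word and return getopt_long()'s verdict. */
static int sketch_parse_one(const char *word)
{
	char *argv[] = { (char *)"xdp_monitor", (char *)word, NULL };
	int longindex = 0;

	assert(word != NULL);
	optind = 1;	/* restart scanning on every call */
	opterr = 0;	/* keep getopt_long() from printing diagnostics */
	return getopt_long(2, argv, "h", sketch_long_options, &longindex);
}
```

A unique prefix such as --st or --deb resolves to the full long option,
--s is reported as '?', and so is -D as long as the optstring stays "h".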

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [PATCH net-next] udp: avoid refcount_t saturation in __udp_gso_segment()
From: David Miller @ 2018-05-11 16:30 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet, willemb, alexander.h.duyck
In-Reply-To: <20180511020713.159465-1-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Thu, 10 May 2018 19:07:13 -0700

> For some reason, Willem thought that the issue we fixed for TCP
> in commit 7ec318feeed1 ("tcp: gso: avoid refcount_t warning from
> tcp_gso_segment()") was not relevant for UDP GSO.
> 
> But syzbot found its way.
 ...
> Fixes: ad405857b174 ("udp: better wmem accounting on gso")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Alexander Duyck <alexander.h.duyck@intel.com>
> Reported-by: syzbot <syzkaller@googlegroups.com>

Applied, thanks Eric.

^ permalink raw reply

* Re: [resend PATCH] rxrpc: Neaten logging macros and add KERN_DEBUG logging level
From: Joe Perches @ 2018-05-11 16:29 UTC (permalink / raw)
  To: David Howells; +Cc: David S. Miller, linux-afs, netdev, linux-kernel
In-Reply-To: <35831b4769a0415ae7b975e88badb3033dbfe82d.1522176274.git.joe@perches.com>

On Tue, 2018-03-27 at 11:52 -0700, Joe Perches wrote:
> When enabled, the current debug logging does not have a KERN_<LEVEL>.
> Add KERN_DEBUG to the logging macros.
> 
> Miscellanea:
> 
> o Remove #define redundancy and neaten the macros a bit

ping?

> Signed-off-by: Joe Perches <joe@perches.com>
> ---
> 
> Resend of patch: https://lkml.org/lkml/2017/11/30/573
> 
> No change in patch.
> 
> David Howells is now a listed maintainer for net/rxrpc/ so he should receive
> this patch via get_maintainer
> 
>  net/rxrpc/ar-internal.h | 75 ++++++++++++++++++-------------------------------
>  1 file changed, 28 insertions(+), 47 deletions(-)
> 
> diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
> index 416688381eb7..d4b53b2339b3 100644
> --- a/net/rxrpc/ar-internal.h
> +++ b/net/rxrpc/ar-internal.h
> @@ -1147,66 +1147,47 @@ static inline bool after_eq(u32 seq1, u32 seq2)
>   */
>  extern unsigned int rxrpc_debug;
>  
> -#define dbgprintk(FMT,...) \
> -	printk("[%-6.6s] "FMT"\n", current->comm ,##__VA_ARGS__)
> +#if defined(__KDEBUG) || defined(CONFIG_AF_RXRPC_DEBUG)
> +#define dbgprintk(fmt, ...)						\
> +	printk(KERN_DEBUG "[%-6.6s] " fmt "\n", current->comm, ##__VA_ARGS__)
> +#else
> +#define dbgprintk(fmt, ...)						\
> +	no_printk(KERN_DEBUG "[%-6.6s] " fmt "\n", current->comm, ##__VA_ARGS__)
> +#endif
>  
> -#define kenter(FMT,...)	dbgprintk("==> %s("FMT")",__func__ ,##__VA_ARGS__)
> -#define kleave(FMT,...)	dbgprintk("<== %s()"FMT"",__func__ ,##__VA_ARGS__)
> -#define kdebug(FMT,...)	dbgprintk("    "FMT ,##__VA_ARGS__)
> -#define kproto(FMT,...)	dbgprintk("### "FMT ,##__VA_ARGS__)
> -#define knet(FMT,...)	dbgprintk("@@@ "FMT ,##__VA_ARGS__)
> +#define kenter(fmt, ...)	dbgprintk("==> %s(" fmt ")", __func__, ##__VA_ARGS__)
> +#define kleave(fmt, ...)	dbgprintk("<== %s()" fmt "", __func__, ##__VA_ARGS__)
> +#define kdebug(fmt, ...)	dbgprintk("    " fmt, ##__VA_ARGS__)
> +#define kproto(fmt, ...)	dbgprintk("### " fmt, ##__VA_ARGS__)
> +#define knet(fmt, ...)		dbgprintk("@@@ " fmt, ##__VA_ARGS__)
>  
> +#if defined(__KDEBUG) || !defined(CONFIG_AF_RXRPC_DEBUG)
> +#define _enter(fmt, ...)	kenter(fmt, ##__VA_ARGS__)
> +#define _leave(fmt, ...)	kleave(fmt, ##__VA_ARGS__)
> +#define _debug(fmt, ...)	kdebug(fmt, ##__VA_ARGS__)
> +#define _proto(fmt, ...)	kproto(fmt, ##__VA_ARGS__)
> +#define _net(fmt, ...)		knet(fmt, ##__VA_ARGS__)
>  
> -#if defined(__KDEBUG)
> -#define _enter(FMT,...)	kenter(FMT,##__VA_ARGS__)
> -#define _leave(FMT,...)	kleave(FMT,##__VA_ARGS__)
> -#define _debug(FMT,...)	kdebug(FMT,##__VA_ARGS__)
> -#define _proto(FMT,...)	kproto(FMT,##__VA_ARGS__)
> -#define _net(FMT,...)	knet(FMT,##__VA_ARGS__)
> +#else
>  
> -#elif defined(CONFIG_AF_RXRPC_DEBUG)
>  #define RXRPC_DEBUG_KENTER	0x01
>  #define RXRPC_DEBUG_KLEAVE	0x02
>  #define RXRPC_DEBUG_KDEBUG	0x04
>  #define RXRPC_DEBUG_KPROTO	0x08
>  #define RXRPC_DEBUG_KNET	0x10
>  
> -#define _enter(FMT,...)					\
> -do {							\
> -	if (unlikely(rxrpc_debug & RXRPC_DEBUG_KENTER))	\
> -		kenter(FMT,##__VA_ARGS__);		\
> -} while (0)
> -
> -#define _leave(FMT,...)					\
> -do {							\
> -	if (unlikely(rxrpc_debug & RXRPC_DEBUG_KLEAVE))	\
> -		kleave(FMT,##__VA_ARGS__);		\
> -} while (0)
> -
> -#define _debug(FMT,...)					\
> -do {							\
> -	if (unlikely(rxrpc_debug & RXRPC_DEBUG_KDEBUG))	\
> -		kdebug(FMT,##__VA_ARGS__);		\
> -} while (0)
> -
> -#define _proto(FMT,...)					\
> -do {							\
> -	if (unlikely(rxrpc_debug & RXRPC_DEBUG_KPROTO))	\
> -		kproto(FMT,##__VA_ARGS__);		\
> +#define RXRPC_DEBUG(TYPE, type, fmt, ...)				\
> +do {									\
> +	if (unlikely(rxrpc_debug & RXRPC_DEBUG_##TYPE))			\
> +		type(fmt, ##__VA_ARGS__);				\
>  } while (0)
>  
> -#define _net(FMT,...)					\
> -do {							\
> -	if (unlikely(rxrpc_debug & RXRPC_DEBUG_KNET))	\
> -		knet(FMT,##__VA_ARGS__);		\
> -} while (0)
> +#define _enter(fmt, ...)	RXRPC_DEBUG(KENTER, kenter, fmt, ##__VA_ARGS__)
> +#define _leave(fmt, ...)	RXRPC_DEBUG(KLEAVE, kleave, fmt, ##__VA_ARGS__)
> +#define _debug(fmt, ...)	RXRPC_DEBUG(KDEBUG, kdebug, fmt, ##__VA_ARGS__)
> +#define _proto(fmt, ...)	RXRPC_DEBUG(KPROTO, kproto, fmt, ##__VA_ARGS__)
> +#define _net(fmt, ...)		RXRPC_DEBUG(KNET, knet, fmt, ##__VA_ARGS__)
>  
> -#else
> -#define _enter(FMT,...)	no_printk("==> %s("FMT")",__func__ ,##__VA_ARGS__)
> -#define _leave(FMT,...)	no_printk("<== %s()"FMT"",__func__ ,##__VA_ARGS__)
> -#define _debug(FMT,...)	no_printk("    "FMT ,##__VA_ARGS__)
> -#define _proto(FMT,...)	no_printk("### "FMT ,##__VA_ARGS__)
> -#define _net(FMT,...)	no_printk("@@@ "FMT ,##__VA_ARGS__)
>  #endif
>  
>  /*
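
For anyone reviewing the macro consolidation, here is a small userspace
sketch of the same token-pasting idea. It is not the rxrpc header: the names
DBG, dbg_mask, dbg_last, dbg_enter and demo are invented, and messages are
formatted into a buffer with snprintf() instead of going through printk().
Like the patch, it relies on the GNU ##__VA_ARGS__ extension.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DBG_KENTER	0x01
#define DBG_KLEAVE	0x02
#define DBG_KDEBUG	0x04
#define DBG_ALL		(DBG_KENTER | DBG_KLEAVE | DBG_KDEBUG)

static unsigned int dbg_mask;	/* plays the role of rxrpc_debug */
static char dbg_last[128];	/* last message, instead of the kernel log */

#define dbgprint(fmt, ...)						\
	snprintf(dbg_last, sizeof(dbg_last), fmt, ##__VA_ARGS__)

#define kenter(fmt, ...)	dbgprint("==> %s(" fmt ")", __func__, ##__VA_ARGS__)
#define kleave(fmt, ...)	dbgprint("<== %s()" fmt, __func__, ##__VA_ARGS__)
#define kdebug(fmt, ...)	dbgprint("    " fmt, ##__VA_ARGS__)

/* One guarded wrapper: TYPE is pasted onto DBG_ to select the mask bit,
 * 'type' names the macro that does the actual formatting.
 */
#define DBG(TYPE, type, fmt, ...)					\
do {									\
	if (dbg_mask & DBG_##TYPE)					\
		type(fmt, ##__VA_ARGS__);				\
} while (0)

#define dbg_enter(fmt, ...)	DBG(KENTER, kenter, fmt, ##__VA_ARGS__)
#define dbg_leave(fmt, ...)	DBG(KLEAVE, kleave, fmt, ##__VA_ARGS__)
#define dbg_debug(fmt, ...)	DBG(KDEBUG, kdebug, fmt, ##__VA_ARGS__)

/* Set the mask, emit one "enter" message and return what was logged. */
static const char *demo(unsigned int mask, int v)
{
	assert((mask & ~DBG_ALL) == 0);
	dbg_mask = mask;
	dbg_last[0] = '\0';
	dbg_enter("v=%d", v);
	return dbg_last;
}
```

With the KENTER bit set, demo() logs "==> demo(v=7)" for v = 7; with any
other mask the buffer stays empty, which is the per-type gating the five
hand-written do/while blocks used to provide.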

^ permalink raw reply

* Re: [pull request][net 0/3] Mellanox, mlx5 fixes 2018-05-10
From: David Miller @ 2018-05-11 16:27 UTC (permalink / raw)
  To: saeedm; +Cc: netdev
In-Reply-To: <20180510231915.23754-1-saeedm@mellanox.com>

From: Saeed Mahameed <saeedm@mellanox.com>
Date: Thu, 10 May 2018 16:19:12 -0700

> the following series includes some fixes for mlx5 core driver.
> Please pull and let me know if there's any problem.

Pulled.

> For -stable v4.5
> ("net/mlx5: E-Switch, Include VF RDMA stats in vport statistics")
> 
> For -stable v4.10
> ("net/mlx5e: Err if asked to offload TC match on frag being first")

Queued up for -stable.

Thanks.

^ permalink raw reply

* Re: [PATCH v2 net-next] tcp: switch pacing timer to softirq based hrtimer
From: David Miller @ 2018-05-11 16:24 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet
In-Reply-To: <20180510215943.94513-1-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Thu, 10 May 2018 14:59:43 -0700

> linux-4.16 got support for softirq based hrtimers.
> TCP can switch its pacing hrtimer to this variant, since this
> avoids going through a tasklet and some atomic operations.
> 
> pacing timer logic looks like other (jiffies based) tcp timers.
> 
> v2: use hrtimer_try_to_cancel() in tcp_clear_xmit_timers()
>     to correctly release reference on socket if needed.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Looks great, applied, thanks Eric.

^ permalink raw reply

* Re: [PATCH net-next v2 0/9] net: dsa: Plug in PHYLINK support
From: David Miller @ 2018-05-11 16:23 UTC (permalink / raw)
  To: f.fainelli
  Cc: netdev, privat, andrew, vivien.didelot, rmk+kernel, sean.wang,
	Woojung.Huh, john, cphealy
In-Reply-To: <20180510201737.13887-1-f.fainelli@gmail.com>

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Thu, 10 May 2018 13:17:28 -0700

> This patch series adds PHYLINK support to DSA which is necessary to
> support more complex PHY and pluggable modules setups.

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH] coredump: rename umh_pipe_setup() to coredump_pipe_setup()
From: Luis R. Rodriguez @ 2018-05-11 16:17 UTC (permalink / raw)
  To: Al Viro, David S. Miller
  Cc: Luis R. Rodriguez, Alexei Starovoitov, ast, linux-fsdevel,
	linux-kernel, netdev
In-Reply-To: <20180511024851.GB30522@ZenIV.linux.org.uk>

On Fri, May 11, 2018 at 03:48:51AM +0100, Al Viro wrote:
> On Thu, May 10, 2018 at 11:32:47PM +0000, Luis R. Rodriguez wrote:
> 
> > I think net-next makes sense if Al Viro is OK with that. This way it could go
> > in regardless of the state of your series, but it also lines up with your work.
> 
> Fine by me...

OK thanks.

Dave, I'll bounce a copy of the original patch to you, if anything else is needed
please let me know.

  Luis

^ permalink raw reply

* Re: [PATCH net] ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
From: David Miller @ 2018-05-11 16:01 UTC (permalink / raw)
  To: rdna; +Cc: netdev, ast, eric.dumazet, kernel-team
In-Reply-To: <20180510175934.2259802-1-rdna@fb.com>

From: Andrey Ignatov <rdna@fb.com>
Date: Thu, 10 May 2018 10:59:34 -0700

> Fix more memory leaks in ip_cmsg_send() callers. Part of them were fixed
> earlier in 919483096bfe.
> 
> * udp_sendmsg one was there since the beginning when linux sources were
>   first added to git;
> * ping_v4_sendmsg one was copy/pasted in c319b4d76b9e.
> 
> Whenever return happens in udp_sendmsg() or ping_v4_sendmsg() IP options
> have to be freed if they were allocated previously.
> 
> Add label so that future callers (if any) can use it instead of kfree()
> before return that is easy to forget.
> 
> Fixes: c319b4d76b9e (net: ipv4: add IPPROTO_ICMP socket kind)
> Signed-off-by: Andrey Ignatov <rdna@fb.com>

Applied and queued up for -stable, thank you.
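
For reference, the "single label instead of kfree() before every return"
pattern described in the commit message looks roughly like this in a
userspace model. All names (sketch_sendmsg, sketch_alloc, sketch_free,
demo_leaks_after) are hypothetical, and the allocation merely stands in for
the IP options; the real functions free them only when they were allocated.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

static int sketch_live_allocs;	/* outstanding allocations, for the demo */

static void *sketch_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		sketch_live_allocs++;
	return p;
}

static void sketch_free(void *p)
{
	if (!p)
		return;
	assert(sketch_live_allocs > 0);
	sketch_live_allocs--;
	free(p);
}

/* fail_step selects which later check fails (0 means none does). */
static int sketch_sendmsg(int fail_step)
{
	void *opts = sketch_alloc(16);	/* stands in for the IP options */
	int err;

	if (!opts)
		return -ENOMEM;

	if (fail_step == 1) {
		err = -EINVAL;
		goto out_free;
	}
	if (fail_step == 2) {
		err = -EHOSTUNREACH;
		goto out_free;
	}
	err = 0;
out_free:
	/* every exit after the allocation passes through here */
	sketch_free(opts);
	return err;
}

/* Run one call and report how many allocations are still live. */
static int demo_leaks_after(int fail_step)
{
	sketch_sendmsg(fail_step);
	return sketch_live_allocs;
}
```

A new early exit only needs to set err and jump to out_free, so there is no
separate free call to forget.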

^ permalink raw reply

* Re: [PATCH] mlxsw: core: Fix an error handling path in 'mlxsw_core_bus_device_register()'
From: David Miller @ 2018-05-11 15:57 UTC (permalink / raw)
  To: idosch
  Cc: christophe.jaillet, jiri, idosch, netdev, linux-kernel,
	kernel-janitors
In-Reply-To: <20180510115821.GA10270@splinter.mtl.com>

From: Ido Schimmel <idosch@idosch.org>
Date: Thu, 10 May 2018 14:58:21 +0300

> On Thu, May 10, 2018 at 01:26:16PM +0200, Christophe JAILLET wrote:
>> Resources are not freed in the reverse order of the allocation.
>> Labels are also mixed-up.
>> 
>> Fix it and reorder code and labels in the error handling path of
>> 'mlxsw_core_bus_device_register()'
>> 
>> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
> 
> For net:
> 
> Fixes: ef3116e5403e ("mlxsw: spectrum: Register KVD resources with devlink")
> Reviewed-by: Ido Schimmel <idosch@mellanox.com>

Applied and queued up for -stable, thanks.
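
As an aside, the rule the fix restores (undo in the reverse order of setup,
with each label undoing only what has already succeeded) can be shown with a
toy model. The steps A, B, C and all function names are invented; setup is
logged in upper case and teardown in lower case.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <string.h>

static char sketch_log[32];	/* records the order of init/fini calls */

static void sketch_note(char c)
{
	size_t n = strlen(sketch_log);

	assert(n + 1 < sizeof(sketch_log));
	sketch_log[n] = c;
	sketch_log[n + 1] = '\0';
}

static int step_init(char id, int fail)
{
	if (fail)
		return -EIO;
	sketch_note(id);
	return 0;
}

static void step_fini(char id)
{
	sketch_note((char)tolower((unsigned char)id));
}

/* fail_at selects which of the three steps fails (0 means none). */
static int sketch_register(int fail_at)
{
	int err;

	sketch_log[0] = '\0';
	err = step_init('A', fail_at == 1);
	if (err)
		return err;
	err = step_init('B', fail_at == 2);
	if (err)
		goto err_b;
	err = step_init('C', fail_at == 3);
	if (err)
		goto err_c;
	return 0;

err_c:			/* C failed: B and A are set up */
	step_fini('B');
err_b:			/* B failed: only A is set up */
	step_fini('A');
	return err;
}

static const char *demo_unwind(int fail_at)
{
	sketch_register(fail_at);
	return sketch_log;
}
```

A failure in the third step yields the log "ABba": B is torn down before A,
mirroring the order in which they were set up.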

^ permalink raw reply

* Re: [PATCH net 0/2] bonding: bug fixes and regressions
From: David Miller @ 2018-05-11 15:51 UTC (permalink / raw)
  To: dbanerje; +Cc: netdev, vyasevic, j.vosburgh, vfalico, andy
In-Reply-To: <20180509233211.28207-1-dbanerje@akamai.com>

From: Debabrata Banerjee <dbanerje@akamai.com>
Date: Wed,  9 May 2018 19:32:09 -0400

> Fixes to bonding driver for balance-alb mode, suitable for stable.

Series applied and queued up for -stable, thanks.

^ permalink raw reply

* [patch net] net: sched: fix error path in tcf_proto_create() when modules are not configured
From: Jiri Pirko @ 2018-05-11 15:45 UTC (permalink / raw)
  To: netdev; +Cc: davem, jhs, xiyou.wangcong, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

In case modules are not configured, error out when tp->ops is null
and prevent later null pointer dereference.

Fixes: 33a48927c193 ("sched: push TC filter protocol creation into a separate function")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/sched/cls_api.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index b66754f52a9f..963e4bf0aab8 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -152,8 +152,8 @@ static struct tcf_proto *tcf_proto_create(const char *kind, u32 protocol,
 			NL_SET_ERR_MSG(extack, "TC classifier not found");
 			err = -ENOENT;
 		}
-		goto errout;
 #endif
+		goto errout;
 	}
 	tp->classify = tp->ops->classify;
 	tp->protocol = protocol;
-- 
2.14.3
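
To spell out why the position of the goto matters, here is a simplified
userspace model of the control flow. It is a sketch only: sketch_lookup,
sketch_proto_create and demo_create are made-up names, SKETCH_MODULES stands
in for CONFIG_MODULES, and the module-loading and replay logic of the real
function is reduced to a comment.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct sketch_ops {
	const char *kind;
};

static const struct sketch_ops sketch_basic = { "basic" };

/* Only one classifier kind is "built in" in this model. */
static const struct sketch_ops *sketch_lookup(const char *kind)
{
	return strcmp(kind, sketch_basic.kind) == 0 ? &sketch_basic : NULL;
}

/* Returns 0 and sets *out on success, a negative errno otherwise. */
static int sketch_proto_create(const char *kind, const struct sketch_ops **out)
{
	const struct sketch_ops *ops;
	int err = -ENOENT;

	assert(kind && out);
	ops = sketch_lookup(kind);
	if (!ops) {
#ifdef SKETCH_MODULES
		/* The kernel would try to load a module here; the sketch
		 * just keeps reporting that nothing was found.
		 */
		err = -ENOENT;
#endif
		goto errout;	/* after the #endif: taken in both builds */
	}
	*out = ops;		/* only reached with a valid ops pointer */
	return 0;

errout:
	*out = NULL;
	return err;
}

static int demo_create(const char *kind)
{
	const struct sketch_ops *ops;

	return sketch_proto_create(kind, &ops);
}
```

If the goto sat inside the #ifdef, a build without SKETCH_MODULES would fall
through to the line that uses ops, which is the NULL dereference the patch
prevents.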

^ permalink raw reply related

* Re: [PATCH net-next v10 2/4] net: Introduce generic failover module
From: Samudrala, Sridhar @ 2018-05-11 15:43 UTC (permalink / raw)
  To: Randy Dunlap, mst, stephen, davem, netdev, virtualization,
	virtio-dev, jesse.brandeburg, alexander.h.duyck, kubakici,
	jasowang, loseweigh, jiri, aaron.f.brown
In-Reply-To: <460f3d8f-b2ec-2118-e296-03f4f9655c5a@infradead.org>

On 5/7/2018 3:39 PM, Randy Dunlap wrote:
> Hi,
>
> On 05/07/2018 03:10 PM, Sridhar Samudrala wrote:
>> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>> ---
>>   MAINTAINERS                |    7 +
>>   include/linux/netdevice.h  |   16 +
>>   include/net/net_failover.h |   52 +++
>>   net/Kconfig                |   10 +
>>   net/core/Makefile          |    1 +
>>   net/core/net_failover.c    | 1044 ++++++++++++++++++++++++++++++++++++++++++++
>>   6 files changed, 1130 insertions(+)
>>   create mode 100644 include/net/net_failover.h
>>   create mode 100644 net/core/net_failover.c
>
>> diff --git a/net/Kconfig b/net/Kconfig
>> index b62089fb1332..0540856676de 100644
>> --- a/net/Kconfig
>> +++ b/net/Kconfig
>> @@ -429,6 +429,16 @@ config MAY_USE_DEVLINK
>>   config PAGE_POOL
>>          bool
>>   
>> +config NET_FAILOVER
>> +	tristate "Failover interface"
>> +	default m
> Need some justification for default m (as opposed to n).

default n should be fine.  It will get selected automatically when virtio_net or
netvsc is enabled. Will fix in the next revision.


>
>


^ permalink raw reply

