Netdev List
 help / color / mirror / Atom feed
* [PATCH net-next v3 6/7] net: dsa: qca8k: Replace GPL boilerplate by SPDX
From: Michal Vokáč @ 2018-05-23  6:20 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, devicetree, f.fainelli, vivien.didelot, andrew,
	mark.rutland, robh+dt, davem, michal.vokac
In-Reply-To: <1527056424-14528-1-git-send-email-michal.vokac@ysoft.com>

Replace the GPLv2 license boilerplate with the SPDX license identifier.

Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v3:
 - none

Changes in v2:
 - Add commit message.
 - Add "Reviewed-by" tags from Andrew and Florian.

 drivers/net/dsa/qca8k.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index 7eba987..c834893 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -1,17 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * Copyright (C) 2009 Felix Fietkau <nbd@nbd.name>
  * Copyright (C) 2011-2012 Gabor Juhos <juhosg@openwrt.org>
  * Copyright (c) 2015, The Linux Foundation. All rights reserved.
  * Copyright (c) 2016 John Crispin <john@phrozen.org>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 and
- * only version 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
  */
 
 #define DEBUG
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next v3 5/7] net: dsa: qca8k: Allow overwriting CPU port setting
From: Michal Vokáč @ 2018-05-23  6:20 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, devicetree, f.fainelli, vivien.didelot, andrew,
	mark.rutland, robh+dt, davem, michal.vokac
In-Reply-To: <1527056424-14528-1-git-send-email-michal.vokac@ysoft.com>

Implement adjust_link function that allows to overwrite default CPU port
setting using fixed-link device tree subnode.

Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v3:
 - none

Changes in v2:
 - Add "Reviewed-by" tags from Andrew and Florian.

 drivers/net/dsa/qca8k.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 drivers/net/dsa/qca8k.h |  1 +
 2 files changed, 44 insertions(+)

diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index 14a108b38..7eba987 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -636,6 +636,47 @@ qca8k_setup(struct dsa_switch *ds)
 	return 0;
 }
 
+static void
+qca8k_adjust_link(struct dsa_switch *ds, int port, struct phy_device *phy)
+{
+	struct qca8k_priv *priv = ds->priv;
+	u32 reg;
+
+	/* Force fixed-link setting for CPU port, skip others. */
+	if (!phy_is_pseudo_fixed_link(phy))
+		return;
+
+	/* Set port speed */
+	switch (phy->speed) {
+	case 10:
+		reg = QCA8K_PORT_STATUS_SPEED_10;
+		break;
+	case 100:
+		reg = QCA8K_PORT_STATUS_SPEED_100;
+		break;
+	case 1000:
+		reg = QCA8K_PORT_STATUS_SPEED_1000;
+		break;
+	default:
+		dev_dbg(priv->dev, "port%d link speed %dMbps not supported.\n",
+			port, phy->speed);
+		return;
+	}
+
+	/* Set duplex mode */
+	if (phy->duplex == DUPLEX_FULL)
+		reg |= QCA8K_PORT_STATUS_DUPLEX;
+
+	/* Force flow control */
+	if (dsa_is_cpu_port(ds, port))
+		reg |= QCA8K_PORT_STATUS_RXFLOW | QCA8K_PORT_STATUS_TXFLOW;
+
+	/* Force link down before changing MAC options */
+	qca8k_port_set_status(priv, port, 0);
+	qca8k_write(priv, QCA8K_REG_PORT_STATUS(port), reg);
+	qca8k_port_set_status(priv, port, 1);
+}
+
 static int
 qca8k_phy_read(struct dsa_switch *ds, int phy, int regnum)
 {
@@ -909,6 +950,7 @@ qca8k_get_tag_protocol(struct dsa_switch *ds, int port)
 static const struct dsa_switch_ops qca8k_switch_ops = {
 	.get_tag_protocol	= qca8k_get_tag_protocol,
 	.setup			= qca8k_setup,
+	.adjust_link            = qca8k_adjust_link,
 	.get_strings		= qca8k_get_strings,
 	.phy_read		= qca8k_phy_read,
 	.phy_write		= qca8k_phy_write,
@@ -942,6 +984,7 @@ qca8k_sw_probe(struct mdio_device *mdiodev)
 		return -ENOMEM;
 
 	priv->bus = mdiodev->bus;
+	priv->dev = &mdiodev->dev;
 
 	/* read the switches ID register */
 	id = qca8k_read(priv, QCA8K_REG_MASK_CTRL);
diff --git a/drivers/net/dsa/qca8k.h b/drivers/net/dsa/qca8k.h
index 5bda165..613fe5c5 100644
--- a/drivers/net/dsa/qca8k.h
+++ b/drivers/net/dsa/qca8k.h
@@ -167,6 +167,7 @@ struct qca8k_priv {
 	struct ar8xxx_port_status port_sts[QCA8K_NUM_PORTS];
 	struct dsa_switch *ds;
 	struct mutex reg_mutex;
+	struct device *dev;
 };
 
 struct qca8k_mib_desc {
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next v3 3/7] net: dsa: qca8k: Enable RXMAC when bringing up a port
From: Michal Vokáč @ 2018-05-23  6:20 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, devicetree, f.fainelli, vivien.didelot, andrew,
	mark.rutland, robh+dt, davem, michal.vokac
In-Reply-To: <1527056424-14528-1-git-send-email-michal.vokac@ysoft.com>

When a port is brought up/down do not enable/disable only the TXMAC
but the RXMAC as well. This is essential for the CPU port to work.

Fixes: 6b93fb46480a ("net-next: dsa: add new driver for qca8xxx family")
Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v3:
 - none

Changes in v2:
 - Add "Fixes" tag as pointed out by Florian.
 - Add "Reviewed-by" tags from Andrew and Florian.

 drivers/net/dsa/qca8k.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index 6a3ffb2..0d224f3 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -516,7 +516,7 @@ qca8k_set_pad_ctrl(struct qca8k_priv *priv, int port, int mode)
 static void
 qca8k_port_set_status(struct qca8k_priv *priv, int port, int enable)
 {
-	u32 mask = QCA8K_PORT_STATUS_TXMAC;
+	u32 mask = QCA8K_PORT_STATUS_TXMAC | QCA8K_PORT_STATUS_RXMAC;
 
 	pr_debug("qca: port %i set status %i\n", port, enable);
 
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next v3 2/7] net: dsa: qca8k: Add support for QCA8334 switch
From: Michal Vokáč @ 2018-05-23  6:20 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, devicetree, f.fainelli, vivien.didelot, andrew,
	mark.rutland, robh+dt, davem, michal.vokac
In-Reply-To: <1527056424-14528-1-git-send-email-michal.vokac@ysoft.com>

Add support for the four-port variant of the Qualcomm QCA833x switch.

Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
---
Changes in v3:
 - Add "Reviewed-by" tag from Andrew

Changes in v2:
 - Add commit message

 drivers/net/dsa/qca8k.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index 3684e56..6a3ffb2 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -1010,6 +1010,7 @@ static SIMPLE_DEV_PM_OPS(qca8k_pm_ops,
 			 qca8k_suspend, qca8k_resume);
 
 static const struct of_device_id qca8k_of_match[] = {
+	{ .compatible = "qca,qca8334" },
 	{ .compatible = "qca,qca8337" },
 	{ /* sentinel */ },
 };
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next v3 1/7] net: dsa: qca8k: Add QCA8334 binding documentation
From: Michal Vokáč @ 2018-05-23  6:20 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, devicetree, f.fainelli, vivien.didelot, andrew,
	mark.rutland, robh+dt, davem, michal.vokac
In-Reply-To: <1527056424-14528-1-git-send-email-michal.vokac@ysoft.com>

Add support for the four-port variant of the Qualcomm QCA833x switch.

The CPU port default link settings can be reconfigured using
a fixed-link sub-node.

Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
---
Changes in v3:
 - Correct fixed-link node documentation term: s/property/node.
 - Add "Reviewed-by" tag from Rob and Andrew.

Changes in v2:
 - Add commit message and document fixed-link binding.

 .../devicetree/bindings/net/dsa/qca8k.txt          | 23 +++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/net/dsa/qca8k.txt b/Documentation/devicetree/bindings/net/dsa/qca8k.txt
index 9c67ee4..bbcb255 100644
--- a/Documentation/devicetree/bindings/net/dsa/qca8k.txt
+++ b/Documentation/devicetree/bindings/net/dsa/qca8k.txt
@@ -2,7 +2,10 @@
 
 Required properties:
 
-- compatible: should be "qca,qca8337"
+- compatible: should be one of:
+    "qca,qca8334"
+    "qca,qca8337"
+
 - #size-cells: must be 0
 - #address-cells: must be 1
 
@@ -14,6 +17,20 @@ port and PHY id, each subnode describing a port needs to have a valid phandle
 referencing the internal PHY connected to it. The CPU port of this switch is
 always port 0.
 
+A CPU port node has the following optional node:
+
+- fixed-link            : Fixed-link subnode describing a link to a non-MDIO
+                          managed entity. See
+                          Documentation/devicetree/bindings/net/fixed-link.txt
+                          for details.
+
+For QCA8K the 'fixed-link' sub-node supports only the following properties:
+
+- 'speed' (integer, mandatory), to indicate the link speed. Accepted
+  values are 10, 100 and 1000
+- 'full-duplex' (boolean, optional), to indicate that full duplex is
+  used. When absent, half duplex is assumed.
+
 Example:
 
 
@@ -53,6 +70,10 @@ Example:
 					label = "cpu";
 					ethernet = <&gmac1>;
 					phy-mode = "rgmii";
+					fixed-link {
+						speed = 1000;
+						full-duplex;
+					};
 				};
 
 				port@1 {
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next v3 0/7] Add support for QCA8334 switch
From: Michal Vokáč @ 2018-05-23  6:20 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, devicetree, f.fainelli, vivien.didelot, andrew,
	mark.rutland, robh+dt, davem, michal.vokac

This series basically adds support for a QCA8334 ethernet switch to the
qca8k driver. It is a four-port variant of the already supported seven
port QCA8337. Register map is the same for the whole familly and all chips
have the same device ID.

Major part of this series enhances the CPU port setting. Currently the CPU
port is not set to any sensible defaults compatible with the xGMII
interface. This series forces the CPU port to its maximum bandwidth and
also allows to adjust the new defaults using fixed-link device tree
sub-node.

Alongside these changes I fixed two checkpatch warnings regarding SPDX and
redundant parentheses.

Changes in v3:
 - Rebased on latest net-next/master.
 - Corrected fixed-link documentation.

Michal Vokáč (7):
  net: dsa: qca8k: Add QCA8334 binding documentation
  net: dsa: qca8k: Add support for QCA8334 switch
  net: dsa: qca8k: Enable RXMAC when bringing up a port
  net: dsa: qca8k: Force CPU port to its highest bandwidth
  net: dsa: qca8k: Allow overwriting CPU port setting
  net: dsa: qca8k: Replace GPL boilerplate by SPDX
  net: dsa: qca8k: Remove redundant parentheses

 .../devicetree/bindings/net/dsa/qca8k.txt          | 23 +++++++-
 drivers/net/dsa/qca8k.c                            | 64 ++++++++++++++++++----
 drivers/net/dsa/qca8k.h                            |  7 ++-
 3 files changed, 79 insertions(+), 15 deletions(-)

-- 
2.7.4

^ permalink raw reply

* Re: [net-next] i40iw/i40e: Remove link dependency on i40e
From: Christoph Hellwig @ 2018-05-23  6:19 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: Jason Gunthorpe, davem, dledford, Sindhu Devale, netdev,
	linux-rdma, nhorman, sassmann, jogreene, Shiraz Saleem
In-Reply-To: <079ceee3bc8cd0ea50dd7ddc12b27512ca5ac49e.camel@intel.com>

On Tue, May 22, 2018 at 02:04:06PM -0700, Jeff Kirsher wrote:
> > Why would you want to do this? The rdma driver is non-functional
> > without the ethernet driver, so why on earth would we want to defeat
> > the module dependency mechanism?
> 
> This change is driven by the OSV's like Red Hat, where customer's were
> updating the i40e driver, which in turn broke i40iw.

Doctor it hurts when I do this..

There is no reason to make a mess of our drivers because people are
doing things they should haver never done and that aren't supported
in Linux.

If Intel didn;t offer any out of tree drivers I'm pretty sure no
customer would even attempt this.  So fix this where the problem is.

^ permalink raw reply

* Re: [net-next 1/6] net/dcb: Add dcbnl buffer attribute
From: Or Gerlitz @ 2018-05-23  6:15 UTC (permalink / raw)
  To: Huy Nguyen; +Cc: Linux Netdev List
In-Reply-To: <1576e986-6e04-63f3-3569-a105493929a6@mellanox.com>

On Wed, May 23, 2018 at 4:01 AM, Huy Nguyen <huyn@mellanox.com> wrote:
> Dear Jakub, PSB.
> On 5/22/2018 1:32 PM, Jakub Kicinski wrote:

>> Devlink API accommodates requirements of simpler (SwitchX2?) and more
>> advanced schemes (present in Spectrum).  The simpler/basic static
>> threshold configurations is exactly what you are doing here, AFAIU.

> [HQN] Devlink API is tailored specifically for switch. We don't configure
> threshold configuration
> explicitly. It is done via PFC. Once PFC is enabled on priority, threshold
> is setup based on our
> proprietary formula that were tested rigorously for performance.

Huy, please do not prefix your reply lines with your name, it's not needed
and confusing, the email clients used by people in this list do the job.

^ permalink raw reply

* [PATCH net-next] cxgb4: Add new T6 device ids
From: Ganesh Goudar @ 2018-05-23  6:06 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, indranil, venkatesh, Ganesh Goudar

Add 0x6088 and 0x6089 device ids for new T6 cards.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h b/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h
index adacc63..c7f8d04 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h
@@ -212,6 +212,8 @@ CH_PCI_DEVICE_ID_TABLE_DEFINE_BEGIN
 	CH_PCI_ID_TABLE_FENTRY(0x6085), /* Custom T6240-SO */
 	CH_PCI_ID_TABLE_FENTRY(0x6086), /* Custom T6225-SO-CR */
 	CH_PCI_ID_TABLE_FENTRY(0x6087), /* Custom T6225-CR */
+	CH_PCI_ID_TABLE_FENTRY(0x6088), /* Custom T62100-CR */
+	CH_PCI_ID_TABLE_FENTRY(0x6089), /* Custom T62100-KR */
 CH_PCI_DEVICE_ID_TABLE_DEFINE_END;
 
 #endif /* __T4_PCI_ID_TBL_H__ */
-- 
2.1.0

^ permalink raw reply related

* [PATCH] net: phy: replace bool members in struct phy_device with bit-fields
From: Heiner Kallweit @ 2018-05-23  6:05 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn; +Cc: netdev@vger.kernel.org

In struct phy_device we have a number of flags being defined as type
bool. Similar to e.g. struct pci_dev we can save some space by using
bit-fields.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 include/linux/phy.h | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/include/linux/phy.h b/include/linux/phy.h
index 073235e70..6cd090984 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -406,13 +406,17 @@ struct phy_device {
 	u32 phy_id;
 
 	struct phy_c45_device_ids c45_ids;
-	bool is_c45;
-	bool is_internal;
-	bool is_pseudo_fixed_link;
-	bool has_fixups;
-	bool suspended;
-	bool sysfs_links;
-	bool loopback_enabled;
+	unsigned is_c45:1;
+	unsigned is_internal:1;
+	unsigned is_pseudo_fixed_link:1;
+	unsigned has_fixups:1;
+	unsigned suspended:1;
+	unsigned sysfs_links:1;
+	unsigned loopback_enabled:1;
+
+	unsigned autoneg:1;
+	/* The most recently read link state */
+	unsigned link:1;
 
 	enum phy_state state;
 
@@ -429,9 +433,6 @@ struct phy_device {
 	int pause;
 	int asym_pause;
 
-	/* The most recently read link state */
-	int link;
-
 	/* Enabled Interrupts */
 	u32 interrupts;
 
@@ -444,8 +445,6 @@ struct phy_device {
 	/* Energy efficient ethernet modes which should be prohibited */
 	u32 eee_broken_modes;
 
-	int autoneg;
-
 	int link_timeout;
 
 #ifdef CONFIG_LED_TRIGGER_PHY
-- 
2.17.0

^ permalink raw reply related

* Re: [PATCH] [RFC] bpf: tracing: new helper bpf_get_current_cgroup_ino
From: Y Song @ 2018-05-23  4:31 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alban Crequy, netdev, linux-kernel, containers, cgroups,
	Alban Crequy, tj
In-Reply-To: <20180523033550.z3tqo4lhd3zrmtdu@ast-mbp>

On Tue, May 22, 2018 at 8:35 PM, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> On Tue, May 22, 2018 at 08:33:24PM -0700, Y Song wrote:
>> +       struct cgroup *cgrp = task_dfl_cgroup(current);
>> +       if (!cgrp)
>> +               return -EINVAL;
>
> why this check is needed?

No reason :-) Originally I am concerned whether it is possible cgrp
could be NULL.
By looking at the code, it SEEMS to me that it could not be NULL, but I am not
100% sure (as I am not a cgroup expert). Since you are asking,
probably means it cannot be NULL, so will remove it in formal upstream patch.

^ permalink raw reply

* [PATCH] ppp: remove the PPPIOCDETACH ioctl
From: Eric Biggers @ 2018-05-23  3:59 UTC (permalink / raw)
  To: linux-ppp, Paul Mackerras
  Cc: netdev, linux-fsdevel, linux-kernel, Guillaume Nault,
	syzkaller-bugs, Eric Biggers
In-Reply-To: <20180523032958.GE658@sol.localdomain>

From: Eric Biggers <ebiggers@google.com>

The PPPIOCDETACH ioctl effectively tries to "close" the given ppp file
before f_count has reached 0, which is fundamentally a bad idea.  It
does check 'f_count < 2', which excludes concurrent operations on the
file since they would only be possible with a shared fd table, in which
case each fdget() would take a file reference.  However, it fails to
account for the fact that even with 'f_count == 1' the file can still be
linked into epoll instances.  As reported by syzbot, this can trivially
be used to cause a use-after-free.

Yet, the only known user of PPPIOCDETACH is pppd versions older than
ppp-2.4.2, which was released almost 15 years ago (November 2003).
Also, PPPIOCDETACH apparently stopped working reliably at around the
same time, when the f_count check was added to the kernel, e.g. see
https://lkml.org/lkml/2002/12/31/83.  Also, the current 'f_count < 2'
check makes PPPIOCDETACH only work in single-threaded applications; it
always fails if called from a multithreaded application.

All pppd versions released in the last 15 years just close() the file
descriptor instead.

Therefore, instead of hacking around this bug by exporting epoll
internals to modules, and probably missing other related bugs, just
remove the PPPIOCDETACH ioctl and see if anyone actually notices.

Reported-by: syzbot+16363c99d4134717c05b@syzkaller.appspotmail.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 Documentation/networking/ppp_generic.txt |  6 -----
 drivers/net/ppp/ppp_generic.c            | 29 ------------------------
 fs/compat_ioctl.c                        |  1 -
 include/uapi/linux/ppp-ioctl.h           |  1 -
 4 files changed, 37 deletions(-)

diff --git a/Documentation/networking/ppp_generic.txt b/Documentation/networking/ppp_generic.txt
index 091d20273dcb..61daf4b39600 100644
--- a/Documentation/networking/ppp_generic.txt
+++ b/Documentation/networking/ppp_generic.txt
@@ -300,12 +300,6 @@ unattached instance are:
 The ioctl calls available on an instance of /dev/ppp attached to a
 channel are:
 
-* PPPIOCDETACH detaches the instance from the channel.  This ioctl is
-  deprecated since the same effect can be achieved by closing the
-  instance.  In order to prevent possible races this ioctl will fail
-  with an EINVAL error if more than one file descriptor refers to this
-  instance (i.e. as a result of dup(), dup2() or fork()).
-
 * PPPIOCCONNECT connects this channel to a PPP interface.  The
   argument should point to an int containing the interface unit
   number.  It will return an EINVAL error if the channel is already
diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
index dc7c7ec43202..dce8812fe802 100644
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -603,35 +603,6 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		goto out;
 	}
 
-	if (cmd == PPPIOCDETACH) {
-		/*
-		 * We have to be careful here... if the file descriptor
-		 * has been dup'd, we could have another process in the
-		 * middle of a poll using the same file *, so we had
-		 * better not free the interface data structures -
-		 * instead we fail the ioctl.  Even in this case, we
-		 * shut down the interface if we are the owner of it.
-		 * Actually, we should get rid of PPPIOCDETACH, userland
-		 * (i.e. pppd) could achieve the same effect by closing
-		 * this fd and reopening /dev/ppp.
-		 */
-		err = -EINVAL;
-		if (pf->kind == INTERFACE) {
-			ppp = PF_TO_PPP(pf);
-			rtnl_lock();
-			if (file == ppp->owner)
-				unregister_netdevice(ppp->dev);
-			rtnl_unlock();
-		}
-		if (atomic_long_read(&file->f_count) < 2) {
-			ppp_release(NULL, file);
-			err = 0;
-		} else
-			pr_warn("PPPIOCDETACH file->f_count=%ld\n",
-				atomic_long_read(&file->f_count));
-		goto out;
-	}
-
 	if (pf->kind == CHANNEL) {
 		struct channel *pch;
 		struct ppp_channel *chan;
diff --git a/fs/compat_ioctl.c b/fs/compat_ioctl.c
index ef80085ed564..8285b570d635 100644
--- a/fs/compat_ioctl.c
+++ b/fs/compat_ioctl.c
@@ -917,7 +917,6 @@ COMPATIBLE_IOCTL(PPPIOCSDEBUG)
 /* PPPIOCGIDLE is translated */
 COMPATIBLE_IOCTL(PPPIOCNEWUNIT)
 COMPATIBLE_IOCTL(PPPIOCATTACH)
-COMPATIBLE_IOCTL(PPPIOCDETACH)
 COMPATIBLE_IOCTL(PPPIOCSMRRU)
 COMPATIBLE_IOCTL(PPPIOCCONNECT)
 COMPATIBLE_IOCTL(PPPIOCDISCONN)
diff --git a/include/uapi/linux/ppp-ioctl.h b/include/uapi/linux/ppp-ioctl.h
index b19a9c249b15..d46caf217ea4 100644
--- a/include/uapi/linux/ppp-ioctl.h
+++ b/include/uapi/linux/ppp-ioctl.h
@@ -106,7 +106,6 @@ struct pppol2tp_ioc_stats {
 #define PPPIOCGIDLE	_IOR('t', 63, struct ppp_idle) /* get idle time */
 #define PPPIOCNEWUNIT	_IOWR('t', 62, int)	/* create new ppp unit */
 #define PPPIOCATTACH	_IOW('t', 61, int)	/* attach to ppp unit */
-#define PPPIOCDETACH	_IOW('t', 60, int)	/* detach from ppp unit/chan */
 #define PPPIOCSMRRU	_IOW('t', 59, int)	/* set multilink MRU */
 #define PPPIOCCONNECT	_IOW('t', 58, int)	/* connect channel to unit */
 #define PPPIOCDISCONN	_IO('t', 57)		/* disconnect channel */
-- 
2.17.0

^ permalink raw reply related

* Re: [PATCH] [RFC] bpf: tracing: new helper bpf_get_current_cgroup_ino
From: Alexei Starovoitov @ 2018-05-23  3:35 UTC (permalink / raw)
  To: Y Song
  Cc: Alban Crequy, netdev, linux-kernel, containers, cgroups,
	Alban Crequy, tj
In-Reply-To: <CAH3MdRVdfw52atavT3KL8MpPw7zDM_hR6aUcqDP1PogLn_sH+w@mail.gmail.com>

On Tue, May 22, 2018 at 08:33:24PM -0700, Y Song wrote:
> +       struct cgroup *cgrp = task_dfl_cgroup(current);
> +       if (!cgrp)
> +               return -EINVAL;

why this check is needed?

^ permalink raw reply

* Re: [PATCH] [RFC] bpf: tracing: new helper bpf_get_current_cgroup_ino
From: Y Song @ 2018-05-23  3:33 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alban Crequy, netdev, linux-kernel, containers, cgroups,
	Alban Crequy, tj
In-Reply-To: <CAH3MdRWgruVq+3r+2pHTah-c2zTO03vPkepjWDZ0_KrYcroy9A@mail.gmail.com>

I did a quick prototyping and the above interface seems working fine.

The kernel change:
===============

[yhs@localhost bpf-next]$ git diff
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 97446bbe2ca5..669b7383fddb 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1976,7 +1976,8 @@ union bpf_attr {
        FN(fib_lookup),                 \
        FN(sock_hash_update),           \
        FN(msg_redirect_hash),          \
-       FN(sk_redirect_hash),
+       FN(sk_redirect_hash),           \
+       FN(get_current_cgroup_id),

 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index ce2cbbff27e4..e11e3298f911 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -493,6 +493,21 @@ static const struct bpf_func_proto
bpf_current_task_under_cgroup_proto = {
        .arg2_type      = ARG_ANYTHING,
 };

+BPF_CALL_0(bpf_get_current_cgroup_id)
+{
+       struct cgroup *cgrp = task_dfl_cgroup(current);
+       if (!cgrp)
+               return -EINVAL;
+
+       return cgrp->kn->id.id;
+}
+
+static const struct bpf_func_proto bpf_get_current_cgroup_id_proto = {
+       .func           = bpf_get_current_cgroup_id,
+       .gpl_only       = false,
+       .ret_type       = RET_INTEGER,
+};
+
 BPF_CALL_3(bpf_probe_read_str, void *, dst, u32, size,
           const void *, unsafe_ptr)
 {
@@ -563,6 +578,8 @@ tracing_func_proto(enum bpf_func_id func_id, const
struct bpf_prog *prog)
                return &bpf_get_prandom_u32_proto;
        case BPF_FUNC_probe_read_str:
                return &bpf_probe_read_str_proto;
+       case BPF_FUNC_get_current_cgroup_id:
+               return &bpf_get_current_cgroup_id_proto;
        default:
                return NULL;
        }

The following program can be used to print out a cgroup id given a cgroup path.
[yhs@localhost cg]$ cat get_cgroup_id.c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(int argc, char **argv)
{
    int dirfd, err, flags, mount_id, fhsize;
    struct file_handle *fhp;
    char *pathname;

    if (argc != 2) {
        printf("usage: %s <cgroup_path>\n", argv[0]);
        return 1;
    }

    pathname = argv[1];
    dirfd = AT_FDCWD;
    flags = 0;

    fhsize = sizeof(*fhp);
    fhp = malloc(fhsize);
    if (!fhp)
        return 1;

    err = name_to_handle_at(dirfd, pathname, fhp, &mount_id, flags);
    if (err >= 0) {
        printf("error\n");
        return 1;
    }

    fhsize = sizeof(struct file_handle) + fhp->handle_bytes;
    fhp = realloc(fhp, fhsize);
    if (!fhp)
        return 1;

    err = name_to_handle_at(dirfd, pathname, fhp, &mount_id, flags);
    if (err < 0)
        perror("name_to_handle_at");
    else {
        int i;

        printf("dir = %s, mount_id = %d\n", pathname, mount_id);
        printf("handle_bytes = %d, handle_type = %d\n", fhp->handle_bytes,
            fhp->handle_type);
        if (fhp->handle_bytes != 8)
            return 1;

        printf("cgroup_id = 0x%llx\n", *(unsigned long long *)fhp->f_handle);
    }

    return 0;
}
[yhs@localhost cg]$

Given a cgroup path, the user can get cgroup_id and use it in their bpf
program for filtering purpose.

I run a simple program t.c
   int main() { while(1) sleep(1); return 0; }
in the cgroup v2 directory /home/yhs/tmp/yhs
   none on /home/yhs/tmp type cgroup2 (rw,relatime,seclabel)

$ ./get_cgroup_id /home/yhs/tmp/yhs
dir = /home/yhs/tmp/yhs, mount_id = 124
handle_bytes = 8, handle_type = 1
cgroup_id = 0x1000006b2

// the below command to get cgroup_id from the kernel for the
// process compiled with t.c and ran under /home/yhs/tmp/yhs:
$ sudo ./trace.py -p 4067 '__x64_sys_nanosleep "cgid = %llx", $cgid'
PID     TID     COMM            FUNC             -
4067    4067    a.out           __x64_sys_nanosleep cgid = 1000006b2
4067    4067    a.out           __x64_sys_nanosleep cgid = 1000006b2
4067    4067    a.out           __x64_sys_nanosleep cgid = 1000006b2
^C[yhs@localhost tools]$

The kernel and user space cgid matches. Will provide a
formal patch later.




On Mon, May 21, 2018 at 5:24 PM, Y Song <ys114321@gmail.com> wrote:
> On Mon, May 21, 2018 at 9:26 AM, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>> On Sun, May 13, 2018 at 07:33:18PM +0200, Alban Crequy wrote:
>>>
>>> +BPF_CALL_2(bpf_get_current_cgroup_ino, u32, hierarchy, u64, flags)
>>> +{
>>> +     // TODO: pick the correct hierarchy instead of the mem controller
>>> +     struct cgroup *cgrp = task_cgroup(current, memory_cgrp_id);
>>> +
>>> +     if (unlikely(!cgrp))
>>> +             return -EINVAL;
>>> +     if (unlikely(hierarchy))
>>> +             return -EINVAL;
>>> +     if (unlikely(flags))
>>> +             return -EINVAL;
>>> +
>>> +     return cgrp->kn->id.ino;
>>
>> ino only is not enough to identify cgroup. It needs generation number too.
>> I don't quite see how hierarchy and flags can be used in the future.
>> Also why limit it to memcg?
>>
>> How about something like this instead:
>>
>> BPF_CALL_2(bpf_get_current_cgroup_id)
>> {
>>         struct cgroup *cgrp = task_dfl_cgroup(current);
>>
>>         return cgrp->kn->id.id;
>> }
>> The user space can use fhandle api to get the same 64-bit id.
>
> I think this should work. This will also be useful to bcc as user
> space can encode desired id
> in the bpf program and compared that id to the current cgroup id, so we can have
> cgroup level tracing (esp. stat collection) support. To cope with
> cgroup hierarchy, user can use
> cgroup-array based approach or explicitly compare against multiple cgroup id's.

^ permalink raw reply related

* Re: [PATCH] e1000: check the return of pci_get_drvdata() in e1000_remove()
From: Bo Chen @ 2018-05-23  3:31 UTC (permalink / raw)
  Cc: netdev, linux-kernel
In-Reply-To: <20180522184702.64cd62f6@xeon-e3>

Re-send to mailing lists as the previous email was rejected because of
not using plain text.

-----

Hi Stephen,

Thanks for the quick reply and your comments. I am new to network
drivers, and certainly trust your assessment.

I was assuming 'pdev->dev.p' can be NULL with a 'kzalloc()' failure in
'pci_set_drvdata()' invoked by 'e1000_probe()', in which case the
"pci_get_drvdata()" will return a NULL pointer. But I did not trace
back to confirm whether 'pdev->dev.p' has to be valid before
'e1000_probe()' is invoked. Maybe this is what I am missing? Please
excuse my newbie questions and mistakes.

Here is some context about why I started this patch. I am building a
tool to perform "grey-box" fault injection on linux-kernel-module
binaries. In my tool, there is a set of kernel API functions that
faults can be generated, for example making the return of
'__kmalloc()' be NULL. My tool fired an alarm (a kernel panic), when a
fault on ''dev_get_drvdata()" is injected to 'e1000.ko' (and other
drivers were fine, such as e100.ko and pcnet32.ko). It seems that I
was wrong to assume 'dev_get_drvdata()' can return NULL and inject
faults from it. Do you have any suggestions about how I can avoid such
wrong assumptions?

Thanks again for your time and attention. I hope this patch is not
wasting too much effort from the netdev community. Any other comments
or suggestions will be appreciated.

Best Regards,
Bo Chen

On Tue, May 22, 2018 at 6:47 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Tue, 22 May 2018 17:17:43 -0700
> Bo Chen <chenbo@pdx.edu> wrote:
>
> > This check on pci_get_drvdata() prevents potential invalid pointer dereferences,
> > and is a common practice in *_remove() functions from other drivers, such as
> > 'intel/e100.c', 'amd/pcnet32.c', 'realtek/8139too.c', and 'broadcom/tg3.c'.
> >
> > Signed-off-by: Bo Chen <chenbo@pdx.edu>
>
> Why check for something that can never be true.
> You are creating dead code paths that can never be exercised.

^ permalink raw reply

* Re: [PATCH net-next v2 00/10] qed*: Add support for management firmware TLV request.
From: David Miller @ 2018-05-23  3:31 UTC (permalink / raw)
  To: sudarsana.kalluru; +Cc: netdev, Ariel.Elior, chad.dupuis, manish.rangankar
In-Reply-To: <20180522072846.2454-1-sudarsana.kalluru@cavium.com>

From: Sudarsana Reddy Kalluru <sudarsana.kalluru@cavium.com>
Date: Tue, 22 May 2018 00:28:36 -0700

> From: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
> 
> Management firmware (MFW) requires config and state information from
> the driver. It queries this via TLV (type-length-value) request wherein
> mfw specificies the list of required TLVs. Driver fills the TLV data
> and responds back to MFW.
> This patch series adds qed/qede/qedf/qedi driver implementation for
> supporting the TLV queries from MFW.
> 
> Changes from previous versions:
> -------------------------------
> v2: Split patch (2) into multiple simpler patches.
> v2: Update qed_tlv_parsed_buf->p_val datatype to void pointer to avoid
>     bunch of unnecessary typecasts.
> 
> Please consider applying this series to "net-next".

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next] cxgb4: do L1 config when module is inserted
From: kbuild test robot @ 2018-05-23  3:29 UTC (permalink / raw)
  To: Ganesh Goudar
  Cc: kbuild-all, netdev, davem, nirranjan, indranil, venkatesh,
	Ganesh Goudar, Casey Leedom
In-Reply-To: <1526894143-4986-1-git-send-email-ganeshgr@chelsio.com>

[-- Attachment #1: Type: text/plain, Size: 4438 bytes --]

Hi Ganesh,

I love your patch! Yet something to improve:

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Ganesh-Goudar/cxgb4-do-L1-config-when-module-is-inserted/20180523-085637
config: x86_64-rhel (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/scsi/csiostor/csio_hw.c: In function 'fwcaps16_to_caps32':
>> drivers/scsi/csiostor/csio_hw.c:1490:17: error: 'FW_PORT_CAP_MDIX' undeclared (first use in this function); did you mean 'FW_PORT_CAP_MDI_V'?
       if (caps16 & FW_PORT_CAP_##__cap) \
                    ^
   drivers/scsi/csiostor/csio_hw.c:1503:2: note: in expansion of macro 'CAP16_TO_CAP32'
     CAP16_TO_CAP32(MDIX);
     ^~~~~~~~~~~~~~
   drivers/scsi/csiostor/csio_hw.c:1490:17: note: each undeclared identifier is reported only once for each function it appears in
       if (caps16 & FW_PORT_CAP_##__cap) \
                    ^
   drivers/scsi/csiostor/csio_hw.c:1503:2: note: in expansion of macro 'CAP16_TO_CAP32'
     CAP16_TO_CAP32(MDIX);
     ^~~~~~~~~~~~~~
   drivers/scsi/csiostor/csio_hw.c:1491:15: error: 'FW_PORT_CAP32_MDIX' undeclared (first use in this function); did you mean 'FW_PORT_CAP_MDIX'?
        caps32 |= FW_PORT_CAP32_##__cap; \
                  ^
   drivers/scsi/csiostor/csio_hw.c:1503:2: note: in expansion of macro 'CAP16_TO_CAP32'
     CAP16_TO_CAP32(MDIX);
     ^~~~~~~~~~~~~~

vim +1490 drivers/scsi/csiostor/csio_hw.c

e1735d9a Varun Prakash 2018-03-11  1477  
e1735d9a Varun Prakash 2018-03-11  1478  /**
e1735d9a Varun Prakash 2018-03-11  1479   *      fwcaps16_to_caps32 - convert 16-bit Port Capabilities to 32-bits
e1735d9a Varun Prakash 2018-03-11  1480   *      @caps16: a 16-bit Port Capabilities value
e1735d9a Varun Prakash 2018-03-11  1481   *
e1735d9a Varun Prakash 2018-03-11  1482   *      Returns the equivalent 32-bit Port Capabilities value.
e1735d9a Varun Prakash 2018-03-11  1483   */
e1735d9a Varun Prakash 2018-03-11  1484  fw_port_cap32_t fwcaps16_to_caps32(fw_port_cap16_t caps16)
e1735d9a Varun Prakash 2018-03-11  1485  {
e1735d9a Varun Prakash 2018-03-11  1486  	fw_port_cap32_t caps32 = 0;
e1735d9a Varun Prakash 2018-03-11  1487  
e1735d9a Varun Prakash 2018-03-11  1488  	#define CAP16_TO_CAP32(__cap) \
e1735d9a Varun Prakash 2018-03-11  1489  		do { \
e1735d9a Varun Prakash 2018-03-11 @1490  			if (caps16 & FW_PORT_CAP_##__cap) \
e1735d9a Varun Prakash 2018-03-11  1491  				caps32 |= FW_PORT_CAP32_##__cap; \
e1735d9a Varun Prakash 2018-03-11  1492  		} while (0)
e1735d9a Varun Prakash 2018-03-11  1493  
e1735d9a Varun Prakash 2018-03-11  1494  	CAP16_TO_CAP32(SPEED_100M);
e1735d9a Varun Prakash 2018-03-11  1495  	CAP16_TO_CAP32(SPEED_1G);
e1735d9a Varun Prakash 2018-03-11  1496  	CAP16_TO_CAP32(SPEED_25G);
e1735d9a Varun Prakash 2018-03-11  1497  	CAP16_TO_CAP32(SPEED_10G);
e1735d9a Varun Prakash 2018-03-11  1498  	CAP16_TO_CAP32(SPEED_40G);
e1735d9a Varun Prakash 2018-03-11  1499  	CAP16_TO_CAP32(SPEED_100G);
e1735d9a Varun Prakash 2018-03-11  1500  	CAP16_TO_CAP32(FC_RX);
e1735d9a Varun Prakash 2018-03-11  1501  	CAP16_TO_CAP32(FC_TX);
e1735d9a Varun Prakash 2018-03-11  1502  	CAP16_TO_CAP32(ANEG);
e1735d9a Varun Prakash 2018-03-11  1503  	CAP16_TO_CAP32(MDIX);
e1735d9a Varun Prakash 2018-03-11  1504  	CAP16_TO_CAP32(MDIAUTO);
e1735d9a Varun Prakash 2018-03-11  1505  	CAP16_TO_CAP32(FEC_RS);
e1735d9a Varun Prakash 2018-03-11  1506  	CAP16_TO_CAP32(FEC_BASER_RS);
e1735d9a Varun Prakash 2018-03-11  1507  	CAP16_TO_CAP32(802_3_PAUSE);
e1735d9a Varun Prakash 2018-03-11  1508  	CAP16_TO_CAP32(802_3_ASM_DIR);
e1735d9a Varun Prakash 2018-03-11  1509  
e1735d9a Varun Prakash 2018-03-11  1510  	#undef CAP16_TO_CAP32
e1735d9a Varun Prakash 2018-03-11  1511  
e1735d9a Varun Prakash 2018-03-11  1512  	return caps32;
e1735d9a Varun Prakash 2018-03-11  1513  }
e1735d9a Varun Prakash 2018-03-11  1514  

:::::: The code at line 1490 was first introduced by commit
:::::: e1735d9a98ab5593484bbba1933e362a261e0de0 scsi: csiostor: add support for 32 bit port capabilities

:::::: TO: Varun Prakash <varun@chelsio.com>
:::::: CC: Martin K. Petersen <martin.petersen@oracle.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 40459 bytes --]

^ permalink raw reply

* Re: KASAN: use-after-free Read in remove_wait_queue (2)
From: Eric Biggers @ 2018-05-23  3:29 UTC (permalink / raw)
  To: Guillaume Nault
  Cc: linux-ppp, Paul Mackerras, netdev, linux-kernel, syzkaller-bugs,
	syzbot, viro
In-Reply-To: <20180518160223.GF1534@alphalink.fr>

On Fri, May 18, 2018 at 06:02:23PM +0200, Guillaume Nault wrote:
> On Sun, May 13, 2018 at 11:11:55PM -0700, Eric Biggers wrote:
> > [+ppp list and maintainer]
> > 
> > This is a bug in ppp_generic.c; it still happens on Linus' tree and it's easily
> > reproducible, see program below.  The bug is that the PPPIOCDETACH ioctl doesn't
> > consider that the file can still be attached to epoll instances even when
> > ->f_count == 1.
> 
> Right. What would it take to remove the file for the epoll instances?
> Sorry for the naive question, but I'm not familiar with VFS and didn't
> find a helper function we could call.
> 

There is eventpoll_release_file(), but it's not exported to modules.  It might
work to call it, but it seems like a hack.

> > Also, the reproducer doesn't test this but I think ppp_poll(),
> > ppp_read(), and ppp_write() can all race with PPPIOCDETACH, causing
> > use-after-frees as well.
> 
> I also believe so. ppp_release() resets ->private_data, and even though
> functions like ppp_read() test ->private_data before executing, there's
> no synchronisation mechanism to ensure that the update is visible
> before the unit or channel is destroyed. Is that the kind of race you
> had in mind?

Yes, though after looking into it more I *think* these additional races aren't
actually possible, due to the 'f_count < 2' check.  These races could only
happen with a shared fd table, but in that case fdget() would increment f_count
for the duration of each operation, resulting in 'f_count >= 2' if both ioctl()
and something else is running on the same file concurrently.

Note that this also means PPPIOCDETACH doesn't work at all if called from a
multithreaded application...

> 
> > Any chance that PPPIOCDETACH can simply be removed,
> > given that it's apparently been "deprecated" for 16 years?
> > Does anyone use it?
> 
> The only users I'm aware of are pppd versions older than ppp-2.4.2
> (released in November 2003). But even at that time, there were issues
> with PPPIOCDETACH as pppd didn't seem to react properly when this call
> failed. An Internet search on the "PPPIOCDETACH file->f_count=" kernel
> log string, or on the "Couldn't release PPP unit: Invalid argument"
> error message of pppd, returns several related bug reports.
> 
> Originally, PPPIOCDETACH never failed, but testing ->f_count was
> later introduced to fix crashes when the file descriptor had been
> duplicated. It seems that this was motivated by polling issues too.
> 
> Long story short, it looks like PPPIOCDETACH never has worked well
> and we have at least two more bugs to fix. Given how it has proven
> fragile, I wouldn't be surprised if there were even more lurking
> around. I'd say that it's probably safer to drop it than to add more
> workarounds and playing wack-a-mole with those bugs.

IMO, if we can get away with removing it without any users noticing, that would
be much better than trying to fix it with a VFS-level hack, and probably missing
some cases.  I'll send a patch to get things started...

- Eric

^ permalink raw reply

* Re: [PATCH net-next] cxgb4: do L1 config when module is inserted
From: kbuild test robot @ 2018-05-23  1:47 UTC (permalink / raw)
  To: Ganesh Goudar
  Cc: kbuild-all, netdev, davem, nirranjan, indranil, venkatesh,
	Ganesh Goudar, Casey Leedom
In-Reply-To: <1526894143-4986-1-git-send-email-ganeshgr@chelsio.com>

[-- Attachment #1: Type: text/plain, Size: 4618 bytes --]

Hi Ganesh,

I love your patch! Yet something to improve:

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Ganesh-Goudar/cxgb4-do-L1-config-when-module-is-inserted/20180523-085637
config: sparc64-allyesconfig (attached as .config)
compiler: sparc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=sparc64 

All error/warnings (new ones prefixed by >>):

   drivers/scsi/csiostor/csio_hw.c: In function 'fwcaps16_to_caps32':
>> drivers/scsi/csiostor/csio_hw.c:1490:17: error: 'FW_PORT_CAP_MDIX' undeclared (first use in this function); did you mean 'FW_PORT_CAP_MDI_S'?
       if (caps16 & FW_PORT_CAP_##__cap) \
                    ^
>> drivers/scsi/csiostor/csio_hw.c:1503:2: note: in expansion of macro 'CAP16_TO_CAP32'
     CAP16_TO_CAP32(MDIX);
     ^~~~~~~~~~~~~~
   drivers/scsi/csiostor/csio_hw.c:1490:17: note: each undeclared identifier is reported only once for each function it appears in
       if (caps16 & FW_PORT_CAP_##__cap) \
                    ^
>> drivers/scsi/csiostor/csio_hw.c:1503:2: note: in expansion of macro 'CAP16_TO_CAP32'
     CAP16_TO_CAP32(MDIX);
     ^~~~~~~~~~~~~~
>> drivers/scsi/csiostor/csio_hw.c:1491:15: error: 'FW_PORT_CAP32_MDIX' undeclared (first use in this function); did you mean 'FW_PORT_CAP_MDIX'?
        caps32 |= FW_PORT_CAP32_##__cap; \
                  ^
>> drivers/scsi/csiostor/csio_hw.c:1503:2: note: in expansion of macro 'CAP16_TO_CAP32'
     CAP16_TO_CAP32(MDIX);
     ^~~~~~~~~~~~~~

vim +1490 drivers/scsi/csiostor/csio_hw.c

e1735d9a Varun Prakash 2018-03-11  1477  
e1735d9a Varun Prakash 2018-03-11  1478  /**
e1735d9a Varun Prakash 2018-03-11  1479   *      fwcaps16_to_caps32 - convert 16-bit Port Capabilities to 32-bits
e1735d9a Varun Prakash 2018-03-11  1480   *      @caps16: a 16-bit Port Capabilities value
e1735d9a Varun Prakash 2018-03-11  1481   *
e1735d9a Varun Prakash 2018-03-11  1482   *      Returns the equivalent 32-bit Port Capabilities value.
e1735d9a Varun Prakash 2018-03-11  1483   */
e1735d9a Varun Prakash 2018-03-11  1484  fw_port_cap32_t fwcaps16_to_caps32(fw_port_cap16_t caps16)
e1735d9a Varun Prakash 2018-03-11  1485  {
e1735d9a Varun Prakash 2018-03-11  1486  	fw_port_cap32_t caps32 = 0;
e1735d9a Varun Prakash 2018-03-11  1487  
e1735d9a Varun Prakash 2018-03-11  1488  	#define CAP16_TO_CAP32(__cap) \
e1735d9a Varun Prakash 2018-03-11  1489  		do { \
e1735d9a Varun Prakash 2018-03-11 @1490  			if (caps16 & FW_PORT_CAP_##__cap) \
e1735d9a Varun Prakash 2018-03-11 @1491  				caps32 |= FW_PORT_CAP32_##__cap; \
e1735d9a Varun Prakash 2018-03-11  1492  		} while (0)
e1735d9a Varun Prakash 2018-03-11  1493  
e1735d9a Varun Prakash 2018-03-11  1494  	CAP16_TO_CAP32(SPEED_100M);
e1735d9a Varun Prakash 2018-03-11  1495  	CAP16_TO_CAP32(SPEED_1G);
e1735d9a Varun Prakash 2018-03-11  1496  	CAP16_TO_CAP32(SPEED_25G);
e1735d9a Varun Prakash 2018-03-11  1497  	CAP16_TO_CAP32(SPEED_10G);
e1735d9a Varun Prakash 2018-03-11  1498  	CAP16_TO_CAP32(SPEED_40G);
e1735d9a Varun Prakash 2018-03-11  1499  	CAP16_TO_CAP32(SPEED_100G);
e1735d9a Varun Prakash 2018-03-11  1500  	CAP16_TO_CAP32(FC_RX);
e1735d9a Varun Prakash 2018-03-11  1501  	CAP16_TO_CAP32(FC_TX);
e1735d9a Varun Prakash 2018-03-11  1502  	CAP16_TO_CAP32(ANEG);
e1735d9a Varun Prakash 2018-03-11 @1503  	CAP16_TO_CAP32(MDIX);
e1735d9a Varun Prakash 2018-03-11  1504  	CAP16_TO_CAP32(MDIAUTO);
e1735d9a Varun Prakash 2018-03-11  1505  	CAP16_TO_CAP32(FEC_RS);
e1735d9a Varun Prakash 2018-03-11  1506  	CAP16_TO_CAP32(FEC_BASER_RS);
e1735d9a Varun Prakash 2018-03-11  1507  	CAP16_TO_CAP32(802_3_PAUSE);
e1735d9a Varun Prakash 2018-03-11  1508  	CAP16_TO_CAP32(802_3_ASM_DIR);
e1735d9a Varun Prakash 2018-03-11  1509  
e1735d9a Varun Prakash 2018-03-11  1510  	#undef CAP16_TO_CAP32
e1735d9a Varun Prakash 2018-03-11  1511  
e1735d9a Varun Prakash 2018-03-11  1512  	return caps32;
e1735d9a Varun Prakash 2018-03-11  1513  }
e1735d9a Varun Prakash 2018-03-11  1514  

:::::: The code at line 1490 was first introduced by commit
:::::: e1735d9a98ab5593484bbba1933e362a261e0de0 scsi: csiostor: add support for 32 bit port capabilities

:::::: TO: Varun Prakash <varun@chelsio.com>
:::::: CC: Martin K. Petersen <martin.petersen@oracle.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 53396 bytes --]

^ permalink raw reply

* Re: [PATCH] e1000: check the return of pci_get_drvdata() in e1000_remove()
From: Stephen Hemminger @ 2018-05-23  1:47 UTC (permalink / raw)
  To: Bo Chen; +Cc: jeffrey.t.kirsher, davem, intel-wired-lan, netdev, linux-kernel
In-Reply-To: <20180523001743.8492-1-chenbo@pdx.edu>

On Tue, 22 May 2018 17:17:43 -0700
Bo Chen <chenbo@pdx.edu> wrote:

> This check on pci_get_drvdata() prevents potential invalid pointer dereferences,
> and is a common practice in *_remove() functions from other drivers, such as
> 'intel/e100.c', 'amd/pcnet32.c', 'realtek/8139too.c', and 'broadcom/tg3.c'.
> 
> Signed-off-by: Bo Chen <chenbo@pdx.edu>

Why check for something that can never be true.
You are creating dead code paths that can never be exercised.

^ permalink raw reply

* YAaioRace (was Re: [PATCH 08/31] aio: implement IOCB_CMD_POLL)
From: Al Viro @ 2018-05-23  1:43 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Avi Kivity, linux-aio, linux-fsdevel, netdev, linux-api,
	linux-kernel, Kent Overstreet, Christoph Hellwig
In-Reply-To: <20180523004904.GH30522@ZenIV.linux.org.uk>

On Wed, May 23, 2018 at 01:49:04AM +0100, Al Viro wrote:

> > Looks like we want to call ->ki_cancel() *BEFORE* removing from the list,
> > as well as doing fput() after aio_complete().  The same ordering, BTW, goes
> > for aio_read() et.al.
> > 
> > Look:
> > CPU1:	io_cancel() grabs ->ctx_lock, finds iocb and removes it from the list.
> > CPU2:	aio_rw_complete() on that iocb.  Since the sucker is not in the list
> > anymore, we do NOT spin on ->ctx_lock and proceed to free iocb
> > CPU1:	pass freed iocb to ->ki_cancel().  BOOM.
> 
> BTW, it seems that the mainline is vulnerable to this one.  I might be
> missing something, but...

It is, but with a different attack vector - io_cancel(2) won't do it (it
does not remove from the list at all), but io_destroy(2) bloody well will.

IMO, we need this in mainline; unless somebody has a problem with it, to
#fixes it goes:

fix io_destroy()/aio_complete() race

If io_destroy() gets to cancelling everything that can be cancelled and
gets to kiocb_cancel() calling the function driver has left in ->ki_cancel,
it becomes vulnerable to a race with IO completion.  At that point req
is already taken off the list and aio_complete() does *NOT* spin until
we (in free_ioctx_users()) releases ->ctx_lock.  As the result, it proceeds
to kiocb_free(), freing req just it gets passed to ->ki_cancel().

Fix is simple - remove from the list after the call of kiocb_cancel().  All
instances of ->ki_cancel() already have to cope with the being called with
iocb still on list - that's what happens in io_cancel(2).

Cc: stable@kernel.org
Fixes: 0460fef2a921 "aio: use cancellation list lazily"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/fs/aio.c b/fs/aio.c
index 8061d9787e54..49f53516eef0 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -634,9 +634,8 @@ static void free_ioctx_users(struct percpu_ref *ref)
 	while (!list_empty(&ctx->active_reqs)) {
 		req = list_first_entry(&ctx->active_reqs,
 				       struct aio_kiocb, ki_list);
-
-		list_del_init(&req->ki_list);
 		kiocb_cancel(req);
+		list_del_init(&req->ki_list);
 	}
 
 	spin_unlock_irq(&ctx->ctx_lock);

--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org.  For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>

^ permalink raw reply related

* Re: [PATCH v3] mlx4_core: allocate ICM memory in page size chunks
From: Qing Huang @ 2018-05-23  1:41 UTC (permalink / raw)
  To: Tariq Toukan, Eric Dumazet, davem, haakon.bugge, yanjun.zhu
  Cc: netdev, linux-rdma, linux-kernel, gi-oh.kim
In-Reply-To: <35ba0f14-7b24-96ff-6b2d-610a4b2980c2@mellanox.com>



On 5/22/2018 8:33 AM, Tariq Toukan wrote:
>
>
> On 18/05/2018 12:45 AM, Qing Huang wrote:
>>
>>
>> On 5/17/2018 2:14 PM, Eric Dumazet wrote:
>>> On 05/17/2018 01:53 PM, Qing Huang wrote:
>>>> When a system is under memory presure (high usage with fragments),
>>>> the original 256KB ICM chunk allocations will likely trigger kernel
>>>> memory management to enter slow path doing memory compact/migration
>>>> ops in order to complete high order memory allocations.
>>>>
>>>> When that happens, user processes calling uverb APIs may get stuck
>>>> for more than 120s easily even though there are a lot of free pages
>>>> in smaller chunks available in the system.
>>>>
>>>> Syslog:
>>>> ...
>>>> Dec 10 09:04:51 slcc03db02 kernel: [397078.572732] INFO: task
>>>> oracle_205573_e:205573 blocked for more than 120 seconds.
>>>> ...
>>>>
>>> NACK on this patch.
>>>
>>> You have been asked repeatedly to use kvmalloc()
>>>
>>> This is not a minor suggestion.
>>>
>>> Take a look 
>>> athttps://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d8c13f2271ec5178c52fbde072ec7b562651ed9d 
>>>
>>
>> Would you please take a look at how table->icm is being used in the 
>> mlx4 driver? It's a meta data used for individual pointer variable 
>> referencing,
>> not as data frag or in/out buffer. It has no need for contiguous phy. 
>> memory.
>>
>> Thanks.
>>
>
> NACK.
>
> This would cause a degradation when iterating the entries of table->icm.
> For example, in mlx4_table_get_range.
E.g.
int mlx4_table_get_range(struct mlx4_dev *dev, struct mlx4_icm_table *table,
                          u32 start, u32 end)
{
         int inc = MLX4_TABLE_CHUNK_SIZE / table->obj_size;
         int err;
         u32 i;

         for (i = start; i <= end; i += inc) {
                 err = mlx4_table_get(dev, table, i);
                 if (err)
                         goto fail;
         }

         return 0;
...
}

E.g. mtt obj is 8 bytes, so a 4KB ICM block would have 512 mtt objects. 
So you will have to allocate
more 512 mtt objects in order to have table->icm pointer to increment by 
1 to fetch next pointer
value.  So 256K mtt objects are needed in order to traverse table->icm 
pointer across a page boundary
in the call stacks.

Considering mlx4_table_get_range() is only used in control path, there 
is no significant gain by using kvzalloc
vs. vzalloc for table->icm.

Anyway, if a user makes sure mlx4 driver to be loaded very early and 
doesn't remove and reload it afterwards,
we should have enough (and not wasting) contiguous phy mem for 
table->icm allocation. I will use kvzalloc to
replace vzalloc and send a V4 patch.

Thanks,
Qing


>
> Thanks,
> Tariq
>
>>> And you'll understand some people care about this.
>>>
>>> Strongly.
>>>
>>> Thanks.
>>>
>>
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [RFC V4 PATCH 7/8] vhost: packed ring support
From: Jason Wang @ 2018-05-23  1:39 UTC (permalink / raw)
  To: Wei Xu; +Cc: mst, kvm, virtualization, netdev, linux-kernel, jfreimann,
	tiwei.bie
In-Reply-To: <20180522165448.GA13523@wei-ubt>



On 2018年05月23日 00:54, Wei Xu wrote:
> On Wed, May 16, 2018 at 08:32:20PM +0800, Jason Wang wrote:
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>   drivers/vhost/net.c   |   3 +-
>>   drivers/vhost/vhost.c | 539 ++++++++++++++++++++++++++++++++++++++++++++++----
>>   drivers/vhost/vhost.h |   8 +-
>>   3 files changed, 513 insertions(+), 37 deletions(-)
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index 8304c30..f2a0f5b 100644
>> --- a/drivers/vhost/vhost.c
>> +++ b/drivers/vhost/vhost.c
>> @@ -1358,6 +1382,8 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg
>>   			break;
>>   		}
>>   		vq->last_avail_idx = s.num;
>> +		if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED))
>> +			vq->avail_wrap_counter = s.num >> 31;
>>   		/* Forget the cached index value. */
>>   		vq->avail_idx = vq->last_avail_idx;
>>   		break;
>> @@ -1366,6 +1392,8 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg
>>   		s.num = vq->last_avail_idx;
>>   		if (copy_to_user(argp, &s, sizeof s))
>>   			r = -EFAULT;
>> +		if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED))
>> +			s.num |= vq->avail_wrap_counter << 31;
>>   		break;
>>   	case VHOST_SET_VRING_ADDR:
>>   		if (copy_from_user(&a, argp, sizeof a)) {
> 'last_used_idx' also needs to be saved/restored here.
>
> I have figured out the root cause of broken device after reloading
> 'virtio-net' module, all indices have been reset for a reloading but
> 'last_used_idx' is not properly reset in this case. This confuses
> handle_rx()/tx().
>
> Wei
>

Good catch, so we probably need a new ioctl to sync between qemu and vhost.

Something like VHOST_SET/GET_USED_BASE.

Thanks

^ permalink raw reply

* Re: [PATCH net-next 1/2] tcp: add max_quickacks param to tcp_incr_quickack and tcp_enter_quickack_mode
From: Neal Cardwell @ 2018-05-23  1:37 UTC (permalink / raw)
  To: kbuild test robot
  Cc: Eric Dumazet, kbuild-all, David Miller, Netdev, Van Jacobson,
	Yuchung Cheng, Soheil Hassas Yeganeh, Eric Dumazet
In-Reply-To: <201805230804.KP8LODgK%fengguang.wu@intel.com>

On Tue, May 22, 2018 at 8:31 PM kbuild test robot <lkp@intel.com> wrote:

> Hi Eric,

> Thank you for the patch! Yet something to improve:

> [auto build test ERROR on net/master]
> [also build test ERROR on v4.17-rc6 next-20180517]
> [cannot apply to net-next/master]
> [if your patch is applied to the wrong git tree, please drop us a note to
help improve the system]

> url:
https://github.com/0day-ci/linux/commits/Eric-Dumazet/tcp-add-max_quickacks-param-to-tcp_incr_quickack-and-tcp_enter_quickack_mode/20180523-075103
> config: i386-randconfig-x012-201820 (attached as .config)
> compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
> reproduce:
>          # save the attached .config to linux build tree
>          make ARCH=i386

> All errors (new ones prefixed by >>):

>     net//ipv4/tcp_input.c: In function 'tcp_data_queue':
> >> net//ipv4/tcp_input.c:4656:2: error: too few arguments to function
'tcp_enter_quickack_mode'
>       tcp_enter_quickack_mode(sk);
>       ^~~~~~~~~~~~~~~~~~~~~~~
>     net//ipv4/tcp_input.c:199:13: note: declared here
>      static void tcp_enter_quickack_mode(struct sock *sk, unsigned int
max_quickacks)
>                  ^~~~~~~~~~~~~~~~~~~~~~~
...

For the record, this is an error in the tool, rather than the patch. The
tool seems to be using a stale net-next tree for building this patch.

The compile error is here in line 4656:

> ^1da177e4 Linus Torvalds           2005-04-16  4652     /* Out of window.
F.e. zero window probe. */
> ^1da177e4 Linus Torvalds           2005-04-16  4653     if
(!before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt + tcp_receive_window(tp)))
> ^1da177e4 Linus Torvalds           2005-04-16  4654             goto
out_of_window;
> ^1da177e4 Linus Torvalds           2005-04-16  4655
> 463c84b97 Arnaldo Carvalho de Melo 2005-08-09 @4656
tcp_enter_quickack_mode(sk);
> ^1da177e4 Linus Torvalds           2005-04-16  4657
> ^1da177e4 Linus Torvalds           2005-04-16  4658     if
(before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) {
> ^1da177e4 Linus Torvalds           2005-04-16  4659             /*
Partial packet, seq < rcv_next < end_seq */
...

But that line is not in net-next any more, after Eric's recent net-next
commit:

a3893637e1eb0e ("tcp: do not force quickack when receiving out-of-order
packets")

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/net/ipv4/tcp_input.c?id=a3893637e1eb0ef5eb1bbc52b3a8d2dfa317a35d

That commit removed that line:

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 0bf032839548f..f5622b2506651 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4715,8 +4715,6 @@ drop:
         if (!before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt +
tcp_receive_window(tp)))
                 goto out_of_window;

-       tcp_enter_quickack_mode(sk);
-
         if (before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) {
                 /* Partial packet, seq < rcv_next < end_seq */
                 SOCK_DEBUG(sk, "partial packet: rcv_next %X seq %X - %X\n",

cheers,
neal

^ permalink raw reply related

* Re: [PATCH net] net: phy: broadcom: Fix bcm_write_exp()
From: Florian Fainelli @ 2018-05-23  1:20 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: netdev, arunp, David S. Miller, Ray Jui, Scott Branden, Jon Mason,
	maintainer:BROADCOM IPROC ARM ARCHITECTURE,
	moderated list:BROADCOM IPROC ARM ARCHITECTURE, open list
In-Reply-To: <20180523001503.GA16062@lunn.ch>

Hi Andrew,

On 05/22/2018 05:15 PM, Andrew Lunn wrote:
> On Tue, May 22, 2018 at 05:04:49PM -0700, Florian Fainelli wrote:
>> On newer PHYs, we need to select the expansion register to write with
>> setting bits [11:8] to 0xf. This was done correctly by bcm7xxx.c prior
>> to being migrated to generic code under bcm-phy-lib.c which
>> unfortunately used the older implementation from the BCM54xx days.
> 
> Hi Florian
> 
> Does selecting the expansion register affect access to the standard
> registers? Does this need locking like the Marvell PHY has when
> changing pages?

We should probably convert this to the page accessors since the
expansion, misc and other shadow 0x1c accesses are all indirection
layers to poke into a different address space of the PHY. That would be
a separate fix though for a number of reasons.
-- 
Florian

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox