Netdev List
 help / color / mirror / Atom feed
* [PATCH v5 13/13] dt-bindings: net: add bindings for ADIN PHY driver
From: Alexandru Ardelean @ 2019-08-16 13:10 UTC (permalink / raw)
  To: netdev, devicetree, linux-kernel
  Cc: davem, robh+dt, mark.rutland, f.fainelli, hkallweit1, andrew,
	Alexandru Ardelean, Rob Herring
In-Reply-To: <20190816131011.23264-1-alexandru.ardelean@analog.com>

This change adds bindings for the Analog Devices ADIN PHY driver, detailing
all the properties implemented by the driver.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
---
 .../devicetree/bindings/net/adi,adin.yaml     | 73 +++++++++++++++++++
 MAINTAINERS                                   |  1 +
 2 files changed, 74 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/adi,adin.yaml

diff --git a/Documentation/devicetree/bindings/net/adi,adin.yaml b/Documentation/devicetree/bindings/net/adi,adin.yaml
new file mode 100644
index 000000000000..69375cb28e92
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/adi,adin.yaml
@@ -0,0 +1,73 @@
+# SPDX-License-Identifier: GPL-2.0+
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/adi,adin.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Analog Devices ADIN1200/ADIN1300 PHY
+
+maintainers:
+  - Alexandru Ardelean <alexandru.ardelean@analog.com>
+
+description: |
+  Bindings for Analog Devices Industrial Ethernet PHYs
+
+allOf:
+  - $ref: ethernet-phy.yaml#
+
+properties:
+  adi,rx-internal-delay-ps:
+    description: |
+      RGMII RX Clock Delay used only when PHY operates in RGMII mode with
+      internal delay (phy-mode is 'rgmii-id' or 'rgmii-rxid') in pico-seconds.
+    enum: [ 1600, 1800, 2000, 2200, 2400 ]
+    default: 2000
+
+  adi,tx-internal-delay-ps:
+    description: |
+      RGMII TX Clock Delay used only when PHY operates in RGMII mode with
+      internal delay (phy-mode is 'rgmii-id' or 'rgmii-txid') in pico-seconds.
+    enum: [ 1600, 1800, 2000, 2200, 2400 ]
+    default: 2000
+
+  adi,fifo-depth-bits:
+    description: |
+      When operating in RMII mode, this option configures the FIFO depth.
+    enum: [ 4, 8, 12, 16, 20, 24 ]
+    default: 8
+
+  adi,disable-energy-detect:
+    description: |
+      Disables Energy Detect Powerdown Mode (default disabled, i.e energy detect
+      is enabled if this property is unspecified)
+    type: boolean
+
+examples:
+  - |
+    ethernet {
+        #address-cells = <1>;
+        #size-cells = <0>;
+
+        phy-mode = "rgmii-id";
+
+        ethernet-phy@0 {
+            reg = <0>;
+
+            adi,rx-internal-delay-ps = <1800>;
+            adi,tx-internal-delay-ps = <2200>;
+        };
+    };
+  - |
+    ethernet {
+        #address-cells = <1>;
+        #size-cells = <0>;
+
+        phy-mode = "rmii";
+
+        ethernet-phy@1 {
+            reg = <1>;
+
+            adi,fifo-depth-bits = <16>;
+            adi,disable-energy-detect;
+        };
+    };
diff --git a/MAINTAINERS b/MAINTAINERS
index e8aa8a667864..fd9ab61c2670 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -944,6 +944,7 @@ L:	netdev@vger.kernel.org
 W:	http://ez.analog.com/community/linux-device-drivers
 S:	Supported
 F:	drivers/net/phy/adin.c
+F:	Documentation/devicetree/bindings/net/adi,adin.yaml
 
 ANALOG DEVICES INC ADIS DRIVER LIBRARY
 M:	Alexandru Ardelean <alexandru.ardelean@analog.com>
-- 
2.20.1


^ permalink raw reply related

* [PATCH v5 07/13] net: phy: adin: make RMII fifo depth configurable
From: Alexandru Ardelean @ 2019-08-16 13:10 UTC (permalink / raw)
  To: netdev, devicetree, linux-kernel
  Cc: davem, robh+dt, mark.rutland, f.fainelli, hkallweit1, andrew,
	Alexandru Ardelean
In-Reply-To: <20190816131011.23264-1-alexandru.ardelean@analog.com>

The FIFO depth can be configured for the RMII mode. This change adds
support for doing this via device-tree (or ACPI).

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
---
 drivers/net/phy/adin.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/drivers/net/phy/adin.c b/drivers/net/phy/adin.c
index c882fcd9ada5..4ca685780622 100644
--- a/drivers/net/phy/adin.c
+++ b/drivers/net/phy/adin.c
@@ -52,8 +52,19 @@
 #define	ADIN1300_RGMII_2_40_NS			0x0007
 
 #define ADIN1300_GE_RMII_CFG_REG		0xff24
+#define   ADIN1300_GE_RMII_FIFO_DEPTH_MSK	GENMASK(6, 4)
+#define   ADIN1300_GE_RMII_FIFO_DEPTH_SEL(x)	\
+		FIELD_PREP(ADIN1300_GE_RMII_FIFO_DEPTH_MSK, x)
 #define   ADIN1300_GE_RMII_EN			BIT(0)
 
+/* RMII fifo depth values */
+#define ADIN1300_RMII_4_BITS			0x0000
+#define ADIN1300_RMII_8_BITS			0x0001
+#define ADIN1300_RMII_12_BITS			0x0002
+#define ADIN1300_RMII_16_BITS			0x0003
+#define ADIN1300_RMII_20_BITS			0x0004
+#define ADIN1300_RMII_24_BITS			0x0005
+
 /**
  * struct adin_cfg_reg_map - map a config value to aregister value
  * @cfg		value in device configuration
@@ -73,6 +84,16 @@ static const struct adin_cfg_reg_map adin_rgmii_delays[] = {
 	{ },
 };
 
+static const struct adin_cfg_reg_map adin_rmii_fifo_depths[] = {
+	{ 4,  ADIN1300_RMII_4_BITS },
+	{ 8,  ADIN1300_RMII_8_BITS },
+	{ 12, ADIN1300_RMII_12_BITS },
+	{ 16, ADIN1300_RMII_16_BITS },
+	{ 20, ADIN1300_RMII_20_BITS },
+	{ 24, ADIN1300_RMII_24_BITS },
+	{ },
+};
+
 static int adin_lookup_reg_value(const struct adin_cfg_reg_map *tbl, int cfg)
 {
 	size_t i;
@@ -156,6 +177,7 @@ static int adin_config_rgmii_mode(struct phy_device *phydev)
 
 static int adin_config_rmii_mode(struct phy_device *phydev)
 {
+	u32 val;
 	int reg;
 
 	if (phydev->interface != PHY_INTERFACE_MODE_RMII)
@@ -169,6 +191,13 @@ static int adin_config_rmii_mode(struct phy_device *phydev)
 
 	reg |= ADIN1300_GE_RMII_EN;
 
+	val = adin_get_reg_value(phydev, "adi,fifo-depth-bits",
+				 adin_rmii_fifo_depths,
+				 ADIN1300_RMII_8_BITS);
+
+	reg &= ~ADIN1300_GE_RMII_FIFO_DEPTH_MSK;
+	reg |= ADIN1300_GE_RMII_FIFO_DEPTH_SEL(val);
+
 	return phy_write_mmd(phydev, MDIO_MMD_VEND1,
 			     ADIN1300_GE_RMII_CFG_REG, reg);
 }
-- 
2.20.1


^ permalink raw reply related

* [PATCH v5 08/13] net: phy: adin: add support MDI/MDIX/Auto-MDI selection
From: Alexandru Ardelean @ 2019-08-16 13:10 UTC (permalink / raw)
  To: netdev, devicetree, linux-kernel
  Cc: davem, robh+dt, mark.rutland, f.fainelli, hkallweit1, andrew,
	Alexandru Ardelean
In-Reply-To: <20190816131011.23264-1-alexandru.ardelean@analog.com>

The ADIN PHYs support automatic MDI/MDIX negotiation. By default this is
disabled, so this is enabled at `config_init`.

This is controlled via the PHY Control 1 register.
The supported modes are:
  1. Manual MDI
  2. Manual MDIX
  3. Auto MDIX - prefer MDIX
  4. Auto MDIX - prefer MDI

The phydev mdix & mdix_ctrl fields include modes 3 & 4 into a single
auto-mode. So, the default mode this driver enables is 4 when Auto-MDI mode
is used.

When detecting MDI/MDIX mode, a combination of the PHY Control 1 register
and PHY Status 1 register is used to determine the correct MDI/MDIX mode.

If Auto-MDI mode is not set, then the manual MDI/MDIX mode is returned.
If Auto-MDI mode is set, then MDIX mode is returned differs from the
preferred MDI/MDIX mode.
This covers all cases where:
  1. MDI preferred  & Pair01Swapped   == MDIX
  2. MDIX preferred & Pair01Swapped   == MDI
  3. MDI preferred  & ! Pair01Swapped == MDIX
  4. MDIX preferred & ! Pair01Swapped == MDI

The preferred MDI/MDIX mode is not configured via SW, but can be configured
via HW pins. Note that the `Pair01Swapped` is the Green-Yellow physical
pairs.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
---
 drivers/net/phy/adin.c | 117 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 113 insertions(+), 4 deletions(-)

diff --git a/drivers/net/phy/adin.c b/drivers/net/phy/adin.c
index 4ca685780622..51c0d17577de 100644
--- a/drivers/net/phy/adin.c
+++ b/drivers/net/phy/adin.c
@@ -19,6 +19,10 @@
 #define ADIN1300_MII_EXT_REG_PTR		0x0010
 #define ADIN1300_MII_EXT_REG_DATA		0x0011
 
+#define ADIN1300_PHY_CTRL1			0x0012
+#define   ADIN1300_AUTO_MDI_EN			BIT(10)
+#define   ADIN1300_MAN_MDIX_EN			BIT(9)
+
 #define ADIN1300_INT_MASK_REG			0x0018
 #define   ADIN1300_INT_MDIO_SYNC_EN		BIT(9)
 #define   ADIN1300_INT_ANEG_STAT_CHNG_EN	BIT(8)
@@ -33,6 +37,9 @@
 	(ADIN1300_INT_LINK_STAT_CHNG_EN | ADIN1300_INT_HW_IRQ_EN)
 #define ADIN1300_INT_STATUS_REG			0x0019
 
+#define ADIN1300_PHY_STATUS1			0x001a
+#define   ADIN1300_PAIR_01_SWAP			BIT(11)
+
 #define ADIN1300_GE_RGMII_CFG_REG		0xff23
 #define   ADIN1300_GE_RGMII_RX_MSK		GENMASK(8, 6)
 #define   ADIN1300_GE_RGMII_RX_SEL(x)		\
@@ -206,6 +213,8 @@ static int adin_config_init(struct phy_device *phydev)
 {
 	int rc;
 
+	phydev->mdix_ctrl = ETH_TP_MDI_AUTO;
+
 	rc = genphy_config_init(phydev);
 	if (rc < 0)
 		return rc;
@@ -269,13 +278,113 @@ static int adin_write_mmd(struct phy_device *phydev, int devad, u16 regnum,
 	return __mdiobus_write(bus, phy_addr, ADIN1300_MII_EXT_REG_DATA, val);
 }
 
+static int adin_config_mdix(struct phy_device *phydev)
+{
+	bool auto_en, mdix_en;
+	int reg;
+
+	mdix_en = false;
+	auto_en = false;
+	switch (phydev->mdix_ctrl) {
+	case ETH_TP_MDI:
+		break;
+	case ETH_TP_MDI_X:
+		mdix_en = true;
+		break;
+	case ETH_TP_MDI_AUTO:
+		auto_en = true;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	reg = phy_read(phydev, ADIN1300_PHY_CTRL1);
+	if (reg < 0)
+		return reg;
+
+	if (mdix_en)
+		reg |= ADIN1300_MAN_MDIX_EN;
+	else
+		reg &= ~ADIN1300_MAN_MDIX_EN;
+
+	if (auto_en)
+		reg |= ADIN1300_AUTO_MDI_EN;
+	else
+		reg &= ~ADIN1300_AUTO_MDI_EN;
+
+	return phy_write(phydev, ADIN1300_PHY_CTRL1, reg);
+}
+
+static int adin_config_aneg(struct phy_device *phydev)
+{
+	int ret;
+
+	ret = adin_config_mdix(phydev);
+	if (ret)
+		return ret;
+
+	return genphy_config_aneg(phydev);
+}
+
+static int adin_mdix_update(struct phy_device *phydev)
+{
+	bool auto_en, mdix_en;
+	bool swapped;
+	int reg;
+
+	reg = phy_read(phydev, ADIN1300_PHY_CTRL1);
+	if (reg < 0)
+		return reg;
+
+	auto_en = !!(reg & ADIN1300_AUTO_MDI_EN);
+	mdix_en = !!(reg & ADIN1300_MAN_MDIX_EN);
+
+	/* If MDI/MDIX is forced, just read it from the control reg */
+	if (!auto_en) {
+		if (mdix_en)
+			phydev->mdix = ETH_TP_MDI_X;
+		else
+			phydev->mdix = ETH_TP_MDI;
+		return 0;
+	}
+
+	/**
+	 * Otherwise, we need to deduce it from the PHY status2 reg.
+	 * When Auto-MDI is enabled, the ADIN1300_MAN_MDIX_EN bit implies
+	 * a preference for MDIX when it is set.
+	 */
+	reg = phy_read(phydev, ADIN1300_PHY_STATUS1);
+	if (reg < 0)
+		return reg;
+
+	swapped = !!(reg & ADIN1300_PAIR_01_SWAP);
+
+	if (mdix_en != swapped)
+		phydev->mdix = ETH_TP_MDI_X;
+	else
+		phydev->mdix = ETH_TP_MDI;
+
+	return 0;
+}
+
+static int adin_read_status(struct phy_device *phydev)
+{
+	int ret;
+
+	ret = adin_mdix_update(phydev);
+	if (ret < 0)
+		return ret;
+
+	return genphy_read_status(phydev);
+}
+
 static struct phy_driver adin_driver[] = {
 	{
 		PHY_ID_MATCH_MODEL(PHY_ID_ADIN1200),
 		.name		= "ADIN1200",
 		.config_init	= adin_config_init,
-		.config_aneg	= genphy_config_aneg,
-		.read_status	= genphy_read_status,
+		.config_aneg	= adin_config_aneg,
+		.read_status	= adin_read_status,
 		.ack_interrupt	= adin_phy_ack_intr,
 		.config_intr	= adin_phy_config_intr,
 		.resume		= genphy_resume,
@@ -287,8 +396,8 @@ static struct phy_driver adin_driver[] = {
 		PHY_ID_MATCH_MODEL(PHY_ID_ADIN1300),
 		.name		= "ADIN1300",
 		.config_init	= adin_config_init,
-		.config_aneg	= genphy_config_aneg,
-		.read_status	= genphy_read_status,
+		.config_aneg	= adin_config_aneg,
+		.read_status	= adin_read_status,
 		.ack_interrupt	= adin_phy_ack_intr,
 		.config_intr	= adin_phy_config_intr,
 		.resume		= genphy_resume,
-- 
2.20.1


^ permalink raw reply related

* [PATCH v5 04/13] net: phy: adin: add {write,read}_mmd hooks
From: Alexandru Ardelean @ 2019-08-16 13:10 UTC (permalink / raw)
  To: netdev, devicetree, linux-kernel
  Cc: davem, robh+dt, mark.rutland, f.fainelli, hkallweit1, andrew,
	Alexandru Ardelean
In-Reply-To: <20190816131011.23264-1-alexandru.ardelean@analog.com>

Both ADIN1200 & ADIN1300 support Clause 45 access for some registers.
The Extended Management Interface (EMI) registers are accessible via both
Clause 45 (at register MDIO_MMD_VEND1) and using Clause 22.

The Clause 22 access for MMD regs differs from the standard one defined by
802.3. The ADIN PHYs  use registers ExtRegPtr (0x0010) and ExtRegData
(0x0011) to access Clause 45 & EMI registers.

The indirect access is done via the following mechanism (for both R/W):
1. Write the address of the register in the ExtRegPtr
2. Read/write the value of the register via reg ExtRegData

This mechanism is needed to manage configuration of chip settings and to
access EEE registers via Clause 22.

Since Clause 45 access will likely never be used, it is not implemented via
this hook.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
---
 drivers/net/phy/adin.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/drivers/net/phy/adin.c b/drivers/net/phy/adin.c
index f4ee611e33df..efbb732f0398 100644
--- a/drivers/net/phy/adin.c
+++ b/drivers/net/phy/adin.c
@@ -14,6 +14,9 @@
 #define PHY_ID_ADIN1200				0x0283bc20
 #define PHY_ID_ADIN1300				0x0283bc30
 
+#define ADIN1300_MII_EXT_REG_PTR		0x0010
+#define ADIN1300_MII_EXT_REG_DATA		0x0011
+
 #define ADIN1300_INT_MASK_REG			0x0018
 #define   ADIN1300_INT_MDIO_SYNC_EN		BIT(9)
 #define   ADIN1300_INT_ANEG_STAT_CHNG_EN	BIT(8)
@@ -51,6 +54,33 @@ static int adin_phy_config_intr(struct phy_device *phydev)
 			      ADIN1300_INT_MASK_EN);
 }
 
+static int adin_read_mmd(struct phy_device *phydev, int devad, u16 regnum)
+{
+	struct mii_bus *bus = phydev->mdio.bus;
+	int phy_addr = phydev->mdio.addr;
+	int err;
+
+	err = __mdiobus_write(bus, phy_addr, ADIN1300_MII_EXT_REG_PTR, regnum);
+	if (err)
+		return err;
+
+	return __mdiobus_read(bus, phy_addr, ADIN1300_MII_EXT_REG_DATA);
+}
+
+static int adin_write_mmd(struct phy_device *phydev, int devad, u16 regnum,
+			  u16 val)
+{
+	struct mii_bus *bus = phydev->mdio.bus;
+	int phy_addr = phydev->mdio.addr;
+	int err;
+
+	err = __mdiobus_write(bus, phy_addr, ADIN1300_MII_EXT_REG_PTR, regnum);
+	if (err)
+		return err;
+
+	return __mdiobus_write(bus, phy_addr, ADIN1300_MII_EXT_REG_DATA, val);
+}
+
 static struct phy_driver adin_driver[] = {
 	{
 		PHY_ID_MATCH_MODEL(PHY_ID_ADIN1200),
@@ -62,6 +92,8 @@ static struct phy_driver adin_driver[] = {
 		.config_intr	= adin_phy_config_intr,
 		.resume		= genphy_resume,
 		.suspend	= genphy_suspend,
+		.read_mmd	= adin_read_mmd,
+		.write_mmd	= adin_write_mmd,
 	},
 	{
 		PHY_ID_MATCH_MODEL(PHY_ID_ADIN1300),
@@ -73,6 +105,8 @@ static struct phy_driver adin_driver[] = {
 		.config_intr	= adin_phy_config_intr,
 		.resume		= genphy_resume,
 		.suspend	= genphy_suspend,
+		.read_mmd	= adin_read_mmd,
+		.write_mmd	= adin_write_mmd,
 	},
 };
 
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH v5] perf machine: arm/arm64: Improve completeness for kernel address space
From: Adrian Hunter @ 2019-08-16 13:00 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, linux-kernel, netdev,
	bpf, clang-built-linux, Mathieu Poirier, Peter Zijlstra,
	Suzuki Poulouse, coresight, linux-arm-kernel
In-Reply-To: <20190816014541.GA17960@leoy-ThinkPad-X240s>

On 16/08/19 4:45 AM, Leo Yan wrote:
> Hi Adrian,
> 
> On Thu, Aug 15, 2019 at 02:45:57PM +0300, Adrian Hunter wrote:
> 
> [...]
> 
>>>> How come you cannot use kallsyms to get the information?
>>>
>>> Thanks for pointing out this.  Sorry I skipped your comment "I don't
>>> know how you intend to calculate ARM_PRE_START_SIZE" when you reviewed
>>> the patch v3, I should use that chance to elaborate the detailed idea
>>> and so can get more feedback/guidance before procceed.
>>>
>>> Actually, I have considered to use kallsyms when worked on the previous
>>> patch set.
>>>
>>> As mentioned in patch set v4's cover letter, I tried to implement
>>> machine__create_extra_kernel_maps() for arm/arm64, the purpose is to
>>> parse kallsyms so can find more kernel maps and thus also can fixup
>>> the kernel start address.  But I found the 'perf script' tool directly
>>> calls machine__get_kernel_start() instead of running into the flow for
>>> machine__create_extra_kernel_maps();
>>
>> Doesn't it just need to loop through each kernel map to find the lowest
>> start address?
> 
> Based on your suggestion, I worked out below change and verified it
> can work well on arm64 for fixing up start address; please let me know
> if the change works for you?

How does that work if take a perf.data file to a machine with a different
architecture?

> 
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index f6ee7fbad3e4..51d78313dca1 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -2671,9 +2671,26 @@ int machine__nr_cpus_avail(struct machine *machine)
>  	return machine ? perf_env__nr_cpus_avail(machine->env) : 0;
>  }
>  
> +static int machine__fixup_kernel_start(void *arg,
> +				       const char *name __maybe_unused,
> +				       char type,
> +				       u64 start)
> +{
> +	struct machine *machine = arg;
> +
> +	type = toupper(type);
> +
> +	/* Fixup for text, weak, data and bss sections. */
> +	if (type == 'T' || type == 'W' || type == 'D' || type == 'B')
> +		machine->kernel_start = min(machine->kernel_start, start);
> +
> +	return 0;
> +}
> +
>  int machine__get_kernel_start(struct machine *machine)
>  {
>  	struct map *map = machine__kernel_map(machine);
> +	char filename[PATH_MAX];
>  	int err = 0;
>  
>  	/*
> @@ -2687,6 +2704,7 @@ int machine__get_kernel_start(struct machine *machine)
>  	machine->kernel_start = 1ULL << 63;
>  	if (map) {
>  		err = map__load(map);
>  		/*
>  		 * On x86_64, PTI entry trampolines are less than the
>  		 * start of kernel text, but still above 2^63. So leave
> @@ -2695,6 +2713,16 @@ int machine__get_kernel_start(struct machine *machine)
>  		if (!err && !machine__is(machine, "x86_64"))
>  			machine->kernel_start = map->start;
>  	}
> +
> +	machine__get_kallsyms_filename(machine, filename, PATH_MAX);
> +
> +	if (symbol__restricted_filename(filename, "/proc/kallsyms"))
> +		goto out;
> +
> +	if (kallsyms__parse(filename, machine, machine__fixup_kernel_start))
> +		pr_warning("Fail to fixup kernel start address. skipping...\n");
> +
> +out:
>  	return err;
>  }
> 
> Thanks,
> Leo Yan
> 


^ permalink raw reply

* Re: [RFC PATCH net-next 04/11] spi: spi-fsl-dspi: Cosmetic cleanup
From: Mark Brown @ 2019-08-16 12:59 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Hubert Feurstein, mlichvar, Richard Cochran, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev
In-Reply-To: <CA+h21hoP3t6j2mTd2BLwizqbFap+9Z2vdxQ4ahHS3-7Vr31Lxw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 993 bytes --]

On Fri, Aug 16, 2019 at 03:37:46PM +0300, Vladimir Oltean wrote:
> On Fri, 16 Aug 2019 at 15:21, Mark Brown <broonie@kernel.org> wrote:

> > This is difficult to review since there's a bunch of largely unrelated
> > changes all munged into one patch.  It'd be better to split this up so
> > each change makes one kind of fix, and better to do this separately to
> > the rest of the series.  In particular having alignment changes along
> > with other changes hurts reviewability as it's less immediately clear
> > what's a like for liken substitution.

> Yes, the diff of this patch looks relatively bad. But I don't know if
> splitting it in more patches isn't in fact going to pollute the git
> history, so I can just as well drop it.

No problem with lots of patches in git history if you want to split it
up (and probably split it out of the series).  Like I say it's mainly
the alignment changes that it'd be better to pull out, the others really
should be but it's easier to cope there.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [RFC PATCH net-next 03/11] spi: Add a PTP system timestamp to the transfer structure
From: Mark Brown @ 2019-08-16 12:58 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Hubert Feurstein, mlichvar, Richard Cochran, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev
In-Reply-To: <CA+h21hqatTeS2shV9QSiPzkjSeNj2Z4SOTrycffDjRHj=9s=nQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2533 bytes --]

On Fri, Aug 16, 2019 at 03:35:30PM +0300, Vladimir Oltean wrote:
> On Fri, 16 Aug 2019 at 15:18, Mark Brown <broonie@kernel.org> wrote:
> > On Fri, Aug 16, 2019 at 03:44:41AM +0300, Vladimir Oltean wrote:

> > > @@ -842,6 +843,9 @@ struct spi_transfer {
> > >
> > >       u32             effective_speed_hz;
> > >
> > > +     struct ptp_system_timestamp *ptp_sts;
> > > +     unsigned int    ptp_sts_word_offset;
> > > +

> > You've not documented these fields at all so it's not clear what the
> > intended usage is.

> Thanks for looking into this.
> Indeed I didn't document them as the patch is part of a RFC and I
> thought the purpose was more clear from the context (cover letter
> etc).
> If I do ever send a patchset for submission I will document the newly
> introduced fields properly.

The issue I'm having is that I have zero idea about the PTP API so I've
got nothing to go on when thinking about if this approach makes any
sense unless I go do some research.

> So let me clarify:
> The SPI slave device driver is populating these fields to indicate to
> the controller driver that it wants word number @ptp_sts_word_offset
> from the tx buffer snapshotted. The controller driver is supposed to
> put the snapshot into the @ptp_sts field, which is a pointer to a
> memory location under the control of the SPI slave device driver.

Snapshot here basically meaning recording a timestamp?  This interface
does seem like it basically precludes DMA based controllers from using
it unless someone happened to implement some very specific stuff in
hardware which seems implausible.  I'd be inclined to just require that
users can only snapshot the first (and possibly also the last, though
DMA completions make that fun) word of a transfer, we could then pull
this out into the core a bit by providing a wrapper function drivers
should call at the appropriate moment.

> It is ok if the ptp_sts pointer is NULL (no need to check), because
> the API for taking snapshots already checks for that.
> At the moment there is yet no proposed mechanism for the SPI slave
> driver to ensure that the controller will really act upon this
> request. That would be really nice to have, since some SPI slave
> devices are time-sensitive and warning early is a good way to prevent
> unnecessary troubleshooting.

Yes, that's one of the things I was thinking about looking at the series
- we should at least be able to warn if we can't timestamp so nobody
gets confused, possibly error out if the calling code particularly
depends on it.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Unable to create htb tc classes more than 64K
From: Akshat Kakkar @ 2019-08-16 12:48 UTC (permalink / raw)
  To: netfilter-devel, lartc, netdev

I want to have around 1 Million htb tc classes.
The simple structure of htb tc class, allow having only 64K classes at once.
But, it is possible to make it more hierarchical using hierarchy of
qdisc and classes.
For this I tried something like this

tc qdisc add dev eno2 root handle 100: htb
tc class add dev eno2 parent 100: classid 100:1 htb rate 100Mbps
tc class add dev eno2 parent 100: classid 100:2 htb rate 100Mbps

tc qdisc add dev eno2 parent 100:1 handle 1: htb
tc class add dev eno2 parent 1: classid 1:10 htb rate 100kbps
tc class add dev eno2 parent 1: classid 1:20 htb rate 300kbps

tc qdisc add dev eno2 parent 100:2 handle 2: htb
tc class add dev eno2 parent 2: classid 2:10 htb rate 100kbps
tc class add dev eno2 parent 2: classid 2:20 htb rate 300kbps

What I want is something like:
tc filter add dev eno2 parent 100: protocol ip prio 1 handle
0x00000001 fw flowid 1:10
tc filter add dev eno2 parent 100: protocol ip prio 1 handle
0x00000002 fw flowid 1:20
tc filter add dev eno2 parent 100: protocol ip prio 1 handle
0x00000003 fw flowid 2:10
tc filter add dev eno2 parent 100: protocol ip prio 1 handle
0x00000004 fw flowid 2:20

But I am unable to shape my traffic by any of 1:10, 1:20, 2:10 or 2:20.

Can you please suggest, where is it going wrong?
Is it not possible altogether?

-Akshat

^ permalink raw reply

* Re: linux-next: Signed-off-by missing for commits in the net-next tree
From: Chris Mason @ 2019-08-16 12:45 UTC (permalink / raw)
  To: Andy Grover
  Cc: Gerd Rausch, Stephen Rothwell, David Miller, Networking,
	Linux Next Mailing List, Linux Kernel Mailing List, Andy Grover,
	Chris Mason
In-Reply-To: <e85146f3-93a0-b23f-6a6e-11e42815946d@groveronline.com>

On 16 Aug 2019, at 5:15, Andy Grover wrote:

> On 8/16/19 3:06 PM, Gerd Rausch wrote:
>> Hi,
>>
>> Just added the e-mail addresses I found using a simple "google 
>> search",
>> in order to reach out to the original authors of these commits:
>> Chris Mason and Andy Grover.
>>
>> I'm hoping they still remember their work from 7-8 years ago.
>
> Yes looks like what I was working on. What did you need from me? It's
> too late to amend the commitlogs...

Same question ;)  The missing signed-off-by is a mistake, but from the 
point of view of the DCO, these patches are totally fine by me.

-chris

^ permalink raw reply

* Re: WARNING in tracepoint_probe_register_prio (3)
From: Sebastian Andrzej Siewior @ 2019-08-16 12:41 UTC (permalink / raw)
  To: syzbot
  Cc: antoine.tenart, ard.biesheuvel, baruch, davem, gregkh, gustavo,
	jeyu, linux-kernel, mathieu.desnoyers, maxime.chevallier, mingo,
	netdev, paulmck, paulmck, rmk+kernel, rostedt, syzkaller-bugs,
	tglx
In-Reply-To: <000000000000479a1705903b2dc9@google.com>

On 2019-08-16 05:32:00 [-0700], syzbot wrote:
> syzbot has bisected this bug to:
> 
> commit ecb9f80db23a7ab09b46b298b404e41dd7aff6e6
> Author: Thomas Gleixner <tglx@linutronix.de>
> Date:   Tue Aug 13 08:00:25 2019 +0000
> 
>     net/mvpp2: Replace tasklet with softirq hrtimer
> 
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13ffb9ee600000

that bisect is wrong. That warning triggered once and this commit was
the top most one in net-next at the time…

Sebastian

^ permalink raw reply

* Re: linux-next: manual merge of the net-next tree with the kbuild tree
From: Stephen Rothwell @ 2019-08-16 12:39 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David Miller, Networking, Masahiro Yamada,
	Linux Next Mailing List, Linux Kernel Mailing List, Kees Cook,
	Andrii Nakryiko, Daniel Borkmann
In-Reply-To: <20190816160128.36e5cc4e@canb.auug.org.au>

[-- Attachment #1: Type: text/plain, Size: 638 bytes --]

Hi all,

On Fri, 16 Aug 2019 16:01:28 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> On Thu, 15 Aug 2019 22:21:29 -0700 Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> > Thanks, Stephen! Looks good except one minor issue below.  
> 
> Thanks for checking.
> 
> > >   vmlinux_link()
> > >   {
> > >  +      info LD ${2}    
> > 
> > This needs to be ${1}.  
> 
> At least its only an information message and doesn't affect the build.
> I will fix my resolution for Monday.

I also fixed it up in today's linux-next (just so people aren't
suprised and report it :-)).
-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [RFC PATCH net-next 04/11] spi: spi-fsl-dspi: Cosmetic cleanup
From: Vladimir Oltean @ 2019-08-16 12:37 UTC (permalink / raw)
  To: Mark Brown
  Cc: Hubert Feurstein, mlichvar, Richard Cochran, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev
In-Reply-To: <20190816122103.GE4039@sirena.co.uk>

Hi Mark,

On Fri, 16 Aug 2019 at 15:21, Mark Brown <broonie@kernel.org> wrote:
>
> On Fri, Aug 16, 2019 at 03:44:42AM +0300, Vladimir Oltean wrote:
> > This patch addresses some cosmetic issues:
> > - Alignment
> > - Typos
> > - (Non-)use of BIT() and GENMASK() macros
> > - Unused definitions
> > - Unused includes
> > - Abuse of ternary operator in detriment of readability
> > - Reduce indentation level
>
> This is difficult to review since there's a bunch of largely unrelated
> changes all munged into one patch.  It'd be better to split this up so
> each change makes one kind of fix, and better to do this separately to
> the rest of the series.  In particular having alignment changes along
> with other changes hurts reviewability as it's less immediately clear
> what's a like for liken substitution.

Yes, the diff of this patch looks relatively bad. But I don't know if
splitting it in more patches isn't in fact going to pollute the git
history, so I can just as well drop it.

Regards,
-Vladimir

^ permalink raw reply

* Re: [RFC PATCH net-next 03/11] spi: Add a PTP system timestamp to the transfer structure
From: Vladimir Oltean @ 2019-08-16 12:35 UTC (permalink / raw)
  To: Mark Brown
  Cc: Hubert Feurstein, mlichvar, Richard Cochran, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev
In-Reply-To: <20190816121837.GD4039@sirena.co.uk>

Hi Mark,

On Fri, 16 Aug 2019 at 15:18, Mark Brown <broonie@kernel.org> wrote:
>
> On Fri, Aug 16, 2019 at 03:44:41AM +0300, Vladimir Oltean wrote:
>
> > @@ -842,6 +843,9 @@ struct spi_transfer {
> >
> >       u32             effective_speed_hz;
> >
> > +     struct ptp_system_timestamp *ptp_sts;
> > +     unsigned int    ptp_sts_word_offset;
> > +
>
> You've not documented these fields at all so it's not clear what the
> intended usage is.

Thanks for looking into this.
Indeed I didn't document them as the patch is part of a RFC and I
thought the purpose was more clear from the context (cover letter
etc).
If I do ever send a patchset for submission I will document the newly
introduced fields properly.
So let me clarify:
The SPI slave device driver is populating these fields to indicate to
the controller driver that it wants word number @ptp_sts_word_offset
from the tx buffer snapshotted. The controller driver is supposed to
put the snapshot into the @ptp_sts field, which is a pointer to a
memory location under the control of the SPI slave device driver.
It is ok if the ptp_sts pointer is NULL (no need to check), because
the API for taking snapshots already checks for that.
At the moment there is yet no proposed mechanism for the SPI slave
driver to ensure that the controller will really act upon this
request. That would be really nice to have, since some SPI slave
devices are time-sensitive and warning early is a good way to prevent
unnecessary troubleshooting.

Regards,
-Vladimir

^ permalink raw reply

* Re: r8169: Performance regression and latency instability
From: Eric Dumazet @ 2019-08-16 12:35 UTC (permalink / raw)
  To: Juliana Rodrigueiro, netdev; +Cc: edumazet, hkallweit1
In-Reply-To: <72898d5b-9424-0bcd-3d8a-fc2e2dd0dbf1@intra2net.com>



On 8/16/19 2:09 PM, Juliana Rodrigueiro wrote:
> Greetings!
> 
> During migration from kernel 3.14 to 4.19, we noticed a regression on the network performance. Under the exact same circumstances, the standard deviation of the latency is more than double than before on the Realtek RTL8111/8168B (10ec:8168) using the r8169 driver.
> 
> Kernel 3.14:
>     # netperf -v 2 -P 0 -H <netserver-IP>,4 -I 99,5 -t omni -l 1 -- -O STDDEV_LATENCY -m 64K -d Send
>     313.37
> 
> Kernel 4.19:
>     # netperf -v 2 -P 0 -H <netserver-IP>,4 -I 99,5 -t omni -l 1 -- -O STDDEV_LATENCY -m 64K -d Send
>     632.96
> 
> In contrast, we noticed small improvements in performance with other non-Realtek network cards (igb, tg3). Which suggested a possible driver related bug.
> 
> However after bisecting the code, I ended up with the following patch, which was introduced in kernel 4.17 and modifies net/ipv4:
> 
>     commit 0a6b2a1dc2a2105f178255fe495eb914b09cb37a
>     Author: Eric Dumazet <edumazet@google.com>
>     Date:   Mon Feb 19 11:56:47 2018 -0800
> 
>         tcp: switch to GSO being always on
> 
> Could you please help me to clarify, should GSO be always on on my device? Or does it just affect TCP? According to ethtool it is always off, "ethtool -K eth0 gso on" has no effect, unless I switch SG on.
> 
>     # ethtool -k eth0
>     Offload parameters for eth0:
>     Cannot get device udp large send offload settings: Operation not supported
>     rx-checksumming: on
>     tx-checksumming: off
>     scatter-gather: off
>     tcp-segmentation-offload: off
>     udp-fragmentation-offload: off
>     generic-segmentation-offload: off
>     generic-receive-offload: on
>     large-receive-offload: off
> 
> I validated that reverting "tcp: switch to GSO being always on" successfully brings back the better performance for the r8169 driver.
> 
> I'm sure that reverting that commit is not the optimal solution, so I would like to kindly ask for help to shed some light in this issue.

Hi Juliana

I am sure that all commits done in TCP stack can show a regression on a particular
combination of packet sizes, MTU size, NIC, and measured metric.

Basically if your NIC does not support SG and TSO, there is a possibility
that the changes we did to enter the era of 100Gbit and 200Gbit NIC might
hurt a bit.

Lack of SG means that the lower stack might have to perform memory  allocations
to perform the segmentation and this might be slow (or even fail) under memory pressure.

I have no idea why you can even turn on SG, if it is turned off by default.

Please give us more information on the NIC

ethtool -i eth0 ; ifconfig eth0

Possibly try to use a recent ethtool, it seems yours is pretty old.

I also see this relevant commit : I have no idea why SG would have any relation with TSO.

commit a7eb6a4f2560d5ae64bfac98d79d11378ca2de6c
Author: Holger Hoffstätte <holger@applied-asynchrony.com>
Date:   Fri Aug 9 00:02:40 2019 +0200

    r8169: fix performance issue on RTL8168evl
    
    Disabling TSO but leaving SG active results is a significant
    performance drop. Therefore disable also SG on RTL8168evl.
    This restores the original performance.
    
    Fixes: 93681cd7d94f ("r8169: enable HW csum and TSO")
    Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com>
    Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index b2a275d8504c..912bd41eaa1b 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -6898,9 +6898,9 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 
        /* RTL8168e-vl has a HW issue with TSO */
        if (tp->mac_version == RTL_GIGA_MAC_VER_34) {
-               dev->vlan_features &= ~NETIF_F_ALL_TSO;
-               dev->hw_features &= ~NETIF_F_ALL_TSO;
-               dev->features &= ~NETIF_F_ALL_TSO;
+               dev->vlan_features &= ~(NETIF_F_ALL_TSO | NETIF_F_SG);
+               dev->hw_features &= ~(NETIF_F_ALL_TSO | NETIF_F_SG);
+               dev->features &= ~(NETIF_F_ALL_TSO | NETIF_F_SG);
        }
 
        dev->hw_features |= NETIF_F_RXALL;


^ permalink raw reply related

* Re: WARNING in tracepoint_probe_register_prio (3)
From: syzbot @ 2019-08-16 12:32 UTC (permalink / raw)
  To: antoine.tenart, ard.biesheuvel, baruch, bigeasy, davem, gregkh,
	gustavo, jeyu, linux-kernel, mathieu.desnoyers, maxime.chevallier,
	mingo, netdev, paulmck, paulmck, rmk+kernel, rostedt,
	syzkaller-bugs, tglx
In-Reply-To: <000000000000ab6f84056c786b93@google.com>

syzbot has bisected this bug to:

commit ecb9f80db23a7ab09b46b298b404e41dd7aff6e6
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Tue Aug 13 08:00:25 2019 +0000

     net/mvpp2: Replace tasklet with softirq hrtimer

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13ffb9ee600000
start commit:   ecb9f80d net/mvpp2: Replace tasklet with softirq hrtimer
git tree:       net-next
final crash:    https://syzkaller.appspot.com/x/report.txt?x=100079ee600000
console output: https://syzkaller.appspot.com/x/log.txt?x=17ffb9ee600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=d4cf1ffb87d590d7
dashboard link: https://syzkaller.appspot.com/bug?extid=774fddf07b7ab29a1e55
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11b02a22600000

Reported-by: syzbot+774fddf07b7ab29a1e55@syzkaller.appspotmail.com
Fixes: ecb9f80db23a ("net/mvpp2: Replace tasklet with softirq hrtimer")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

^ permalink raw reply

* Re: [PATCH net-next, 2/6] PCI: hv: Add a Hyper-V PCI mini driver for software backchannel interface
From: Vitaly Kuznetsov @ 2019-08-16 12:27 UTC (permalink / raw)
  To: Haiyang Zhang, sashal@kernel.org, davem@davemloft.net,
	saeedm@mellanox.com, leon@kernel.org, eranbe@mellanox.com,
	lorenzo.pieralisi@arm.com, bhelgaas@google.com,
	linux-pci@vger.kernel.org, linux-hyperv@vger.kernel.org,
	netdev@vger.kernel.org
  Cc: Haiyang Zhang, KY Srinivasan, Stephen Hemminger,
	linux-kernel@vger.kernel.org
In-Reply-To: <1565809632-39138-3-git-send-email-haiyangz@microsoft.com>

Haiyang Zhang <haiyangz@microsoft.com> writes:

> This mini driver is a helper driver allows other drivers to
> have a common interface with the Hyper-V PCI frontend driver.
>
> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> ---
>  MAINTAINERS                              |  1 +
>  drivers/pci/Kconfig                      |  1 +
>  drivers/pci/controller/Kconfig           |  7 ++++
>  drivers/pci/controller/Makefile          |  1 +
>  drivers/pci/controller/pci-hyperv-mini.c | 70 ++++++++++++++++++++++++++++++++
>  drivers/pci/controller/pci-hyperv.c      | 12 ++++--
>  include/linux/hyperv.h                   | 30 ++++++++++----
>  7 files changed, 111 insertions(+), 11 deletions(-)
>  create mode 100644 drivers/pci/controller/pci-hyperv-mini.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index e352550..c4962b9 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -7453,6 +7453,7 @@ F:	drivers/hid/hid-hyperv.c
>  F:	drivers/hv/
>  F:	drivers/input/serio/hyperv-keyboard.c
>  F:	drivers/pci/controller/pci-hyperv.c
> +F:	drivers/pci/controller/pci-hyperv-mini.c
>  F:	drivers/net/hyperv/
>  F:	drivers/scsi/storvsc_drv.c
>  F:	drivers/uio/uio_hv_generic.c
> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
> index 2ab9240..bb852f5 100644
> --- a/drivers/pci/Kconfig
> +++ b/drivers/pci/Kconfig
> @@ -182,6 +182,7 @@ config PCI_LABEL
>  config PCI_HYPERV
>          tristate "Hyper-V PCI Frontend"
>          depends on X86 && HYPERV && PCI_MSI && PCI_MSI_IRQ_DOMAIN && X86_64
> +	select PCI_HYPERV_MINI
>          help
>            The PCI device frontend driver allows the kernel to import arbitrary
>            PCI devices from a PCI backend to support PCI driver domains.
> diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
> index fe9f9f1..8e31cba 100644
> --- a/drivers/pci/controller/Kconfig
> +++ b/drivers/pci/controller/Kconfig
> @@ -281,5 +281,12 @@ config VMD
>  	  To compile this driver as a module, choose M here: the
>  	  module will be called vmd.
>  
> +config PCI_HYPERV_MINI
> +	tristate "Hyper-V PCI Mini"
> +	depends on X86 && HYPERV && PCI_MSI && PCI_MSI_IRQ_DOMAIN && X86_64
> +	help
> +	  The Hyper-V PCI Mini is a helper driver allows other drivers to
> +	  have a common interface with the Hyper-V PCI frontend driver.
> +

Out of pure curiosity, why not just export this interface from
PCI_HYPERV directly? Why do we need this stub?

>  source "drivers/pci/controller/dwc/Kconfig"
>  endmenu
> diff --git a/drivers/pci/controller/Makefile b/drivers/pci/controller/Makefile
> index d56a507..77e0132 100644
> --- a/drivers/pci/controller/Makefile
> +++ b/drivers/pci/controller/Makefile
> @@ -4,6 +4,7 @@ obj-$(CONFIG_PCIE_CADENCE_HOST) += pcie-cadence-host.o
>  obj-$(CONFIG_PCIE_CADENCE_EP) += pcie-cadence-ep.o
>  obj-$(CONFIG_PCI_FTPCI100) += pci-ftpci100.o
>  obj-$(CONFIG_PCI_HYPERV) += pci-hyperv.o
> +obj-$(CONFIG_PCI_HYPERV_MINI) += pci-hyperv-mini.o
>  obj-$(CONFIG_PCI_MVEBU) += pci-mvebu.o
>  obj-$(CONFIG_PCI_AARDVARK) += pci-aardvark.o
>  obj-$(CONFIG_PCI_TEGRA) += pci-tegra.o
> diff --git a/drivers/pci/controller/pci-hyperv-mini.c b/drivers/pci/controller/pci-hyperv-mini.c
> new file mode 100644
> index 0000000..9b6cd1c
> --- /dev/null
> +++ b/drivers/pci/controller/pci-hyperv-mini.c
> @@ -0,0 +1,70 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) Microsoft Corporation.
> + *
> + * Author:
> + *   Haiyang Zhang <haiyangz@microsoft.com>
> + *
> + * This mini driver is a helper driver allows other drivers to
> + * have a common interface with the Hyper-V PCI frontend driver.
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/hyperv.h>
> +
> +struct hyperv_pci_block_ops hvpci_block_ops;
> +EXPORT_SYMBOL(hvpci_block_ops);
> +
> +int hyperv_read_cfg_blk(struct pci_dev *dev, void *buf, unsigned int buf_len,
> +			unsigned int block_id, unsigned int *bytes_returned)
> +{
> +	if (!hvpci_block_ops.read_block)
> +		return -EOPNOTSUPP;
> +
> +	return hvpci_block_ops.read_block(dev, buf, buf_len, block_id,
> +					  bytes_returned);
> +}
> +EXPORT_SYMBOL(hyperv_read_cfg_blk);
> +
> +int hyperv_write_cfg_blk(struct pci_dev *dev, void *buf, unsigned int len,
> +			 unsigned int block_id)
> +{
> +	if (!hvpci_block_ops.write_block)
> +		return -EOPNOTSUPP;
> +
> +	return hvpci_block_ops.write_block(dev, buf, len, block_id);
> +}
> +EXPORT_SYMBOL(hyperv_write_cfg_blk);
> +
> +int hyperv_reg_block_invalidate(struct pci_dev *dev, void *context,
> +				void (*block_invalidate)(void *context,
> +							 u64 block_mask))
> +{
> +	if (!hvpci_block_ops.reg_blk_invalidate)
> +		return -EOPNOTSUPP;
> +
> +	return hvpci_block_ops.reg_blk_invalidate(dev, context,
> +						  block_invalidate);
> +}
> +EXPORT_SYMBOL(hyperv_reg_block_invalidate);
> +
> +static void __exit exit_hv_pci_mini(void)
> +{
> +	pr_info("unloaded\n");
> +}
> +
> +static int __init init_hv_pci_mini(void)
> +{
> +	pr_info("loaded\n");
> +
> +	return 0;
> +}
> +
> +module_init(init_hv_pci_mini);
> +module_exit(exit_hv_pci_mini);
> +
> +MODULE_DESCRIPTION("Hyper-V PCI Mini");
> +MODULE_LICENSE("GPL v2");
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index 57adeca..9c93ac2 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -983,7 +983,6 @@ int hv_read_config_block(struct pci_dev *pdev, void *buf, unsigned int len,
>  	*bytes_returned = comp_pkt.bytes_returned;
>  	return 0;
>  }
> -EXPORT_SYMBOL(hv_read_config_block);
>  
>  /**
>   * hv_pci_write_config_compl() - Invoked when a response packet for a write
> @@ -1070,7 +1069,6 @@ int hv_write_config_block(struct pci_dev *pdev, void *buf, unsigned int len,
>  
>  	return 0;
>  }
> -EXPORT_SYMBOL(hv_write_config_block);
>  
>  /**
>   * hv_register_block_invalidate() - Invoked when a config block invalidation
> @@ -1101,7 +1099,6 @@ int hv_register_block_invalidate(struct pci_dev *pdev, void *context,
>  	return 0;
>  
>  }
> -EXPORT_SYMBOL(hv_register_block_invalidate);
>  
>  /* Interrupt management hooks */
>  static void hv_int_desc_free(struct hv_pci_dev *hpdev,
> @@ -3045,10 +3042,19 @@ static int hv_pci_remove(struct hv_device *hdev)
>  static void __exit exit_hv_pci_drv(void)
>  {
>  	vmbus_driver_unregister(&hv_pci_drv);
> +
> +	hvpci_block_ops.read_block = NULL;
> +	hvpci_block_ops.write_block = NULL;
> +	hvpci_block_ops.reg_blk_invalidate = NULL;
>  }
>  
>  static int __init init_hv_pci_drv(void)
>  {
> +	/* Initialize PCI block r/w interface */
> +	hvpci_block_ops.read_block = hv_read_config_block;
> +	hvpci_block_ops.write_block = hv_write_config_block;
> +	hvpci_block_ops.reg_blk_invalidate = hv_register_block_invalidate;
> +
>  	return vmbus_driver_register(&hv_pci_drv);
>  }
>  
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index 9d37f8c..2afe6fd 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -1579,18 +1579,32 @@ struct vmpacket_descriptor *
>  	    pkt = hv_pkt_iter_next(channel, pkt))
>  
>  /*
> - * Functions for passing data between SR-IOV PF and VF drivers.  The VF driver
> + * Interface for passing data between SR-IOV PF and VF drivers. The VF driver
>   * sends requests to read and write blocks. Each block must be 128 bytes or
>   * smaller. Optionally, the VF driver can register a callback function which
>   * will be invoked when the host says that one or more of the first 64 block
>   * IDs is "invalid" which means that the VF driver should reread them.
>   */
>  #define HV_CONFIG_BLOCK_SIZE_MAX 128
> -int hv_read_config_block(struct pci_dev *dev, void *buf, unsigned int buf_len,
> -			 unsigned int block_id, unsigned int *bytes_returned);
> -int hv_write_config_block(struct pci_dev *dev, void *buf, unsigned int len,
> -			  unsigned int block_id);
> -int hv_register_block_invalidate(struct pci_dev *dev, void *context,
> -				 void (*block_invalidate)(void *context,
> -							  u64 block_mask));
> +
> +int hyperv_read_cfg_blk(struct pci_dev *dev, void *buf, unsigned int buf_len,
> +			unsigned int block_id, unsigned int *bytes_returned);
> +int hyperv_write_cfg_blk(struct pci_dev *dev, void *buf, unsigned int len,
> +			 unsigned int block_id);
> +int hyperv_reg_block_invalidate(struct pci_dev *dev, void *context,
> +				void (*block_invalidate)(void *context,
> +							 u64 block_mask));
> +
> +struct hyperv_pci_block_ops {
> +	int (*read_block)(struct pci_dev *dev, void *buf, unsigned int buf_len,
> +			  unsigned int block_id, unsigned int *bytes_returned);
> +	int (*write_block)(struct pci_dev *dev, void *buf, unsigned int len,
> +			   unsigned int block_id);
> +	int (*reg_blk_invalidate)(struct pci_dev *dev, void *context,
> +				  void (*block_invalidate)(void *context,
> +							   u64 block_mask));
> +};
> +
> +extern struct hyperv_pci_block_ops hvpci_block_ops;
> +
>  #endif /* _HYPERV_H */

-- 
Vitaly

^ permalink raw reply

* Re: [RFC PATCH net-next 04/11] spi: spi-fsl-dspi: Cosmetic cleanup
From: Mark Brown @ 2019-08-16 12:21 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	linux-spi, netdev
In-Reply-To: <20190816004449.10100-5-olteanv@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 700 bytes --]

On Fri, Aug 16, 2019 at 03:44:42AM +0300, Vladimir Oltean wrote:
> This patch addresses some cosmetic issues:
> - Alignment
> - Typos
> - (Non-)use of BIT() and GENMASK() macros
> - Unused definitions
> - Unused includes
> - Abuse of ternary operator in detriment of readability
> - Reduce indentation level

This is difficult to review since there's a bunch of largely unrelated
changes all munged into one patch.  It'd be better to split this up so
each change makes one kind of fix, and better to do this separately to
the rest of the series.  In particular having alignment changes along
with other changes hurts reviewability as it's less immediately clear
what's a like for liken substitution.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [RFC PATCH net-next 03/11] spi: Add a PTP system timestamp to the transfer structure
From: Mark Brown @ 2019-08-16 12:18 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	linux-spi, netdev
In-Reply-To: <20190816004449.10100-4-olteanv@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 332 bytes --]

On Fri, Aug 16, 2019 at 03:44:41AM +0300, Vladimir Oltean wrote:

> @@ -842,6 +843,9 @@ struct spi_transfer {
>  
>  	u32		effective_speed_hz;
>  
> +	struct ptp_system_timestamp *ptp_sts;
> +	unsigned int	ptp_sts_word_offset;
> +

You've not documented these fields at all so it's not clear what the
intended usage is.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* r8169: Performance regression and latency instability
From: Juliana Rodrigueiro @ 2019-08-16 12:09 UTC (permalink / raw)
  To: netdev; +Cc: edumazet, hkallweit1

Greetings!

During migration from kernel 3.14 to 4.19, we noticed a regression on 
the network performance. Under the exact same circumstances, the 
standard deviation of the latency is more than double than before on the 
Realtek RTL8111/8168B (10ec:8168) using the r8169 driver.

Kernel 3.14:
     # netperf -v 2 -P 0 -H <netserver-IP>,4 -I 99,5 -t omni -l 1 -- -O 
STDDEV_LATENCY -m 64K -d Send
     313.37

Kernel 4.19:
     # netperf -v 2 -P 0 -H <netserver-IP>,4 -I 99,5 -t omni -l 1 -- -O 
STDDEV_LATENCY -m 64K -d Send
     632.96

In contrast, we noticed small improvements in performance with other 
non-Realtek network cards (igb, tg3). Which suggested a possible driver 
related bug.

However after bisecting the code, I ended up with the following patch, 
which was introduced in kernel 4.17 and modifies net/ipv4:

     commit 0a6b2a1dc2a2105f178255fe495eb914b09cb37a
     Author: Eric Dumazet <edumazet@google.com>
     Date:   Mon Feb 19 11:56:47 2018 -0800

         tcp: switch to GSO being always on

Could you please help me to clarify, should GSO be always on on my 
device? Or does it just affect TCP? According to ethtool it is always 
off, "ethtool -K eth0 gso on" has no effect, unless I switch SG on.

     # ethtool -k eth0
     Offload parameters for eth0:
     Cannot get device udp large send offload settings: Operation not 
supported
     rx-checksumming: on
     tx-checksumming: off
     scatter-gather: off
     tcp-segmentation-offload: off
     udp-fragmentation-offload: off
     generic-segmentation-offload: off
     generic-receive-offload: on
     large-receive-offload: off

I validated that reverting "tcp: switch to GSO being always on" 
successfully brings back the better performance for the r8169 driver.

I'm sure that reverting that commit is not the optimal solution, so I 
would like to kindly ask for help to shed some light in this issue.

Best regards,
Juliana.

^ permalink raw reply

* Re: [PATCH bpf-next] libbpf: relicense bpf_helpers.h and bpf_endian.h
From: Jesper Dangaard Brouer @ 2019-08-16 12:10 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: brouer, bpf, netdev, andrii.nakryiko, kernel-team,
	Michael Holzheu, Naveen N . Rao, David S . Miller,
	Michal Rostecki, John Fastabend, Sargun Dhillon
In-Reply-To: <20190816054543.2215626-1-andriin@fb.com>

On Thu, 15 Aug 2019 22:45:43 -0700
Andrii Nakryiko <andriin@fb.com> wrote:

> bpf_helpers.h and bpf_endian.h contain useful macros and BPF helper
> definitions essential to almost every BPF program. Which makes them
> useful not just for selftests. To be able to expose them as part of
> libbpf, though, we need them to be dual-licensed as LGPL-2.1 OR
> BSD-2-Clause. This patch updates licensing of those two files.

I've already ACKed this, and is fine with (LGPL-2.1 OR BSD-2-Clause).

I just want to understand, why "BSD-2-Clause" and not "Apache-2.0" ?

The original argument was that this needed to be compatible with
"Apache-2.0", then why not simply add this in the "OR" ?

> Acked-by: Alexei Starovoitov <ast@kernel.org>
> Acked-by: Hechao Li <hechaol@fb.com>
> Acked-by: Martin KaFai Lau <kafai@fb.com>
> Acked-by: Andrey Ignatov <rdna@fb.com>
> Acked-by: Yonghong Song <yhs@fb.com>
> Acked-by: Lawrence Brakmo <brakmo@fb.com>
> Acked-by: Adam Barth <arb@fb.com>
> Acked-by: Roman Gushchin <guro@fb.com>
> Acked-by: Josef Bacik <jbacik@fb.com>
> Acked-by: Joe Stringer <joe@wand.net.nz>
> Acked-by: Daniel Borkmann <daniel@iogearbox.net>
> Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> Acked-by: David Ahern <dsahern@gmail.com>
> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>

Confirming I acked this.

> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
> Acked-by: Lorenz Bauer <lmb@cloudflare.com>
> Acked-by: Adrian Ratiu <adrian.ratiu@collabora.com>
> Acked-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
> Acked-by: Willem de Bruijn <willemb@google.com>
> Acked-by: Petar Penkov <ppenkov@google.com>
> Acked-by: Teng Qin <palmtenor@gmail.com>
> Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Michal Rostecki <mrostecki@opensuse.org>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: Sargun Dhillon <sargun@sargun.me>
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---
>  tools/testing/selftests/bpf/bpf_endian.h  | 2 +-
>  tools/testing/selftests/bpf/bpf_helpers.h | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/bpf_endian.h b/tools/testing/selftests/bpf/bpf_endian.h
> index 05f036df8a4c..ff3593b0ae03 100644
> --- a/tools/testing/selftests/bpf/bpf_endian.h
> +++ b/tools/testing/selftests/bpf/bpf_endian.h
> @@ -1,4 +1,4 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
>  #ifndef __BPF_ENDIAN__
>  #define __BPF_ENDIAN__
>  
> diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
> index 8b503ea142f0..6c4930bc6e2e 100644
> --- a/tools/testing/selftests/bpf/bpf_helpers.h
> +++ b/tools/testing/selftests/bpf/bpf_helpers.h
> @@ -1,4 +1,4 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
>  #ifndef __BPF_HELPERS_H
>  #define __BPF_HELPERS_H
>  



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [EXT] Re: [PATCH net-next 1/1] Added BASE-T1 PHY support to PHY Subsystem
From: Christian Herber @ 2019-08-16 12:05 UTC (permalink / raw)
  To: Heiner Kallweit, Andrew Lunn
  Cc: davem@davemloft.net, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Florian Fainelli
In-Reply-To: <2ca68436-8e49-b0b2-2460-4fcac3094b09@gmail.com>

On 15.08.2019 18:34, Heiner Kallweit wrote:
> Caution: EXT Email
> 
> On 15.08.2019 17:56, Andrew Lunn wrote:
>> On Thu, Aug 15, 2019 at 03:32:29PM +0000, Christian Herber wrote:
>>> BASE-T1 is a category of Ethernet PHYs.
>>> They use a single copper pair for transmission.
>>> This patch add basic support for this category of PHYs.
>>> It coveres the discovery of abilities and basic configuration.
>>> It includes setting fixed speed and enabling auto-negotiation.
>>> BASE-T1 devices should always Clause-45 managed.
>>> Therefore, this patch extends phy-c45.c.
>>> While for some functions like auto-neogtiation different registers are
>>> used, the layout of these registers is the same for the used fields.
>>> Thus, much of the logic of basic Clause-45 devices can be reused.
>>>
>>> Signed-off-by: Christian Herber <christian.herber@nxp.com>
>>> ---
>>>   drivers/net/phy/phy-c45.c    | 113 +++++++++++++++++++++++++++++++----
>>>   drivers/net/phy/phy-core.c   |   4 +-
>>>   include/uapi/linux/ethtool.h |   2 +
>>>   include/uapi/linux/mdio.h    |  21 +++++++
>>>   4 files changed, 129 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/net/phy/phy-c45.c b/drivers/net/phy/phy-c45.c
>>> index b9d4145781ca..9ff0b8c785de 100644
>>> --- a/drivers/net/phy/phy-c45.c
>>> +++ b/drivers/net/phy/phy-c45.c
>>> @@ -8,13 +8,23 @@
>>>   #include <linux/mii.h>
>>>   #include <linux/phy.h>
>>>
>>> +#define IS_100BASET1(phy) (linkmode_test_bit( \
>>> +                       ETHTOOL_LINK_MODE_100baseT1_Full_BIT, \
>>> +                       (phy)->supported))
>>> +#define IS_1000BASET1(phy) (linkmode_test_bit( \
>>> +                        ETHTOOL_LINK_MODE_1000baseT1_Full_BIT, \
>>> +                        (phy)->supported))
>>
>> Hi Christian
>>
>> We already have the flag phydev->is_gigabit_capable. Maybe add a flag
>> phydev->is_t1_capable
>>
>>> +
>>> +static u32 get_aneg_ctrl(struct phy_device *phydev);
>>> +static u32 get_aneg_stat(struct phy_device *phydev);
>>
>> No forward declarations please. Put the code in the right order so
>> they are not needed.
>>
>> Thanks
>>
>>       Andrew
>>
> 
> For whatever reason I don't have the original mail in my netdev inbox (yet).
> 
> +       if (IS_100BASET1(phydev) || IS_1000BASET1(phydev))
> +               ctrl = MDIO_AN_BT1_CTRL;
> 
> Code like this could be problematic once a PHY supports one of the T1 modes
> AND normal modes. Then normal modes would be unusable.
> 
> I think this scenario isn't completely hypothetical. See the Aquantia
> AQCS109 that supports normal modes and (proprietary) 1000Base-T2.
> 
> Maybe we need separate versions of the generic functions for T1.
> Then it would be up to the PHY driver to decide when to use which
> version.
> 
> Heiner
> 

Integrating this with the existing driver or creating a new on is an 
interesting question. I came to the conclusion that it is most efficient 
to integrate to avoid all to much copy paste code.

So far, I am not aware of any device that supports T1 and something else 
at the same time. From a HW perspective, I also consider this quite 
unlikely. In the unlikely case that such a device comes up, support from 
the genphy driver would be limited to the BASE-T1 modes. But i would 
rather create the special case for the special device and cater the 
current mainstream support to the mainstream devices.

I think this boils down to a general strategy for the PHY framework, as 
this can happen for other classes of devices also like NGBASE-T1, 
MultiGBASE-T and future unknown devices. For now, I think the 
architecture is sufficiently scalable with a single c45 genphy driver

Christian

^ permalink raw reply

* Re: [EXT] Re: [PATCH net-next 1/1] Added BASE-T1 PHY support to PHY Subsystem
From: Christian Herber @ 2019-08-16 11:56 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: davem@davemloft.net, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <20190815155613.GE15291@lunn.ch>

On 15.08.2019 17:56, Andrew Lunn wrote:
> Caution: EXT Email
> 
> On Thu, Aug 15, 2019 at 03:32:29PM +0000, Christian Herber wrote:
>> BASE-T1 is a category of Ethernet PHYs.
>> They use a single copper pair for transmission.
>> This patch add basic support for this category of PHYs.
>> It coveres the discovery of abilities and basic configuration.
>> It includes setting fixed speed and enabling auto-negotiation.
>> BASE-T1 devices should always Clause-45 managed.
>> Therefore, this patch extends phy-c45.c.
>> While for some functions like auto-neogtiation different registers are
>> used, the layout of these registers is the same for the used fields.
>> Thus, much of the logic of basic Clause-45 devices can be reused.
>>
>> Signed-off-by: Christian Herber <christian.herber@nxp.com>
>> ---
>>   drivers/net/phy/phy-c45.c    | 113 +++++++++++++++++++++++++++++++----
>>   drivers/net/phy/phy-core.c   |   4 +-
>>   include/uapi/linux/ethtool.h |   2 +
>>   include/uapi/linux/mdio.h    |  21 +++++++
>>   4 files changed, 129 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/net/phy/phy-c45.c b/drivers/net/phy/phy-c45.c
>> index b9d4145781ca..9ff0b8c785de 100644
>> --- a/drivers/net/phy/phy-c45.c
>> +++ b/drivers/net/phy/phy-c45.c
>> @@ -8,13 +8,23 @@
>>   #include <linux/mii.h>
>>   #include <linux/phy.h>
>>
>> +#define IS_100BASET1(phy) (linkmode_test_bit( \
>> +                        ETHTOOL_LINK_MODE_100baseT1_Full_BIT, \
>> +                        (phy)->supported))
>> +#define IS_1000BASET1(phy) (linkmode_test_bit( \
>> +                         ETHTOOL_LINK_MODE_1000baseT1_Full_BIT, \
>> +                         (phy)->supported))
> 
> Hi Christian
> 
> We already have the flag phydev->is_gigabit_capable. Maybe add a flag
> phydev->is_t1_capable
> 
>> +
>> +static u32 get_aneg_ctrl(struct phy_device *phydev);
>> +static u32 get_aneg_stat(struct phy_device *phydev);
> 
> No forward declarations please. Put the code in the right order so
> they are not needed.
> 
> Thanks
> 
>       Andrew
> 

Hi Andrew,

thanks for feedback. The use of an additional flag is a good proposal.
I was hesitant to touch the phydev structure.
I will add this along with removing the forward declaration in v2.

Regards,

Christian

^ permalink raw reply

* Re: [PATCH v2 bpf-next 1/4] bpf: unprivileged BPF access via /dev/bpf
From: Jordan Glover @ 2019-08-16 11:33 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Alexei Starovoitov, Andy Lutomirski, Daniel Colascione, Song Liu,
	Kees Cook, Networking, bpf, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, Lorenz Bauer, Jann Horn, Greg KH, Linux API,
	LSM List
In-Reply-To: <alpine.DEB.2.21.1908161158490.1873@nanos.tec.linutronix.de>

On Friday, August 16, 2019 9:59 AM, Thomas Gleixner <tglx@linutronix.de> wrote:

> On Fri, 16 Aug 2019, Jordan Glover wrote:
>
> > "systemd --user" service? Trying to do so will fail with:
> > "Failed to apply ambient capabilities (before UID change): Operation not permitted"
> > I think it's crucial to clear that point to avoid confusion in this discussion
> > where people are talking about different things.
> > On the other hand running "systemd --system" service with:
> > User=nobody
> > AmbientCapabilities=CAP_NET_ADMIN
> > is perfectly legit and clears some security concerns as only privileged user
> > can start such service.
>
> While we are at it, can we please stop looking at this from a systemd only
> perspective. There is a world outside of systemd.
>
> Thanks,
>
> tglx

If you define:

"systemd --user" == unprivileged process started by unprivileged user
"systemd --system" == process started by privileged user but run as another
user which keeps some of parent user privileges and drops others

you can get rid of "systemd" from the equation.

"systemd --user" was the example provided by Alexei when asked about the usecase
but his description didn't match what it does so it's not obvious what the real
usecase is. I'm sure there can be many more examples and systemd isn't important
here in particular beside to understand this specific example.

Jordan

^ permalink raw reply

* Re: [PATCH net] tunnel: fix dev null pointer dereference when send pkg larger than mtu in collect_md mode
From: Hangbin Liu @ 2019-08-16 10:51 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, Stefano Brivio, wenxu, Alexei Starovoitov,
	David S . Miller
In-Reply-To: <fe103dee-bba8-1e0d-83b2-f91c2c37089d@gmail.com>

On Fri, Aug 16, 2019 at 10:23:55AM +0200, Eric Dumazet wrote:
> 
> 
> On 8/16/19 5:24 AM, Hangbin Liu wrote:
> > Hi Eric,
> > 
> > Thanks for the review.
> > On Thu, Aug 15, 2019 at 11:16:58AM +0200, Eric Dumazet wrote:
> >>> diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
> >>> index 38c02bb62e2c..c6713c7287df 100644
> >>> --- a/net/ipv4/ip_tunnel.c
> >>> +++ b/net/ipv4/ip_tunnel.c
> >>> @@ -597,6 +597,9 @@ void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
> >>>  		goto tx_error;
> >>>  	}
> >>>  
> >>> +	if (skb_dst(skb) && !skb_dst(skb)->dev)
> >>> +		skb_dst(skb)->dev = rt->dst.dev;
> >>> +
> >>
> >>
> >> IMO this looks wrong.
> >> This dst seems shared. 
> > 
> > If the dst is shared, it may cause some problem. Could you point me where the
> > dst may be shared possibly?
> >
> 
> dst are inherently shared.
> 
> This is why we have a refcount on them.
> 
> Only when the dst has been allocated by the current thread we can make changes on them.
> 

OK, I see now.

Then how about fix the issue in __icmp_send and decode_session{4,6}. The
fix in there is safe, as in __icmp_send() we only want to get net,
dev_net(skb_in->dev) could also do the work, just as icmp6_send() does.

For decode_session{4,6} the oif is also not needed in this scenario as this
is called by xfrm_decode_session_reverse(), we only need the skb_iif
fl4->flowi4_oif = reverse ? skb->skb_iif : oif;

I also need to check more code in OVS..

diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 1510e951f451..95d803543df5 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -582,7 +582,11 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,

        if (!rt)
                goto out;
-       net = dev_net(rt->dst.dev);
+
+       if (skb_in->dev)
+               net = dev_net(skb_in->dev);
+       else
+               goto out;

        /*
         *      Find the original header. It is expected to be valid, of course.
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 8ca637a72697..ec94f5795ea4 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -3269,7 +3269,7 @@ decode_session4(struct sk_buff *skb, struct flowi *fl, bool reverse)
        struct flowi4 *fl4 = &fl->u.ip4;
        int oif = 0;

-       if (skb_dst(skb))
+       if (skb_dst(skb) && skb_dst(skb)->dev)
                oif = skb_dst(skb)->dev->ifindex;

        memset(fl4, 0, sizeof(struct flowi4));
@@ -3387,7 +3387,7 @@ decode_session6(struct sk_buff *skb, struct flowi *fl, bool reverse)

        nexthdr = nh[nhoff];

-       if (skb_dst(skb))
+       if (skb_dst(skb) && skb_dst(skb)->dev)
                oif = skb_dst(skb)->dev->ifindex;

        memset(fl6, 0, sizeof(struct flowi6));


Thanks
Hangbin

^ permalink raw reply related

* Re: [PATCH v2 00/10] Add definition for the number of standard PCI BARs
From: Andrew Murray @ 2019-08-16 10:51 UTC (permalink / raw)
  To: Denis Efremov
  Cc: Bjorn Helgaas, linux-kernel, linux-pci, Sebastian Ott,
	Gerald Schaefer, H. Peter Anvin, Giuseppe Cavallaro,
	Alexandre Torgue, Matt Porter, Alexandre Bounine, Peter Jones,
	Bartlomiej Zolnierkiewicz, Cornelia Huck, Alex Williamson,
	Jose Abreu, kvm, linux-fbdev, netdev, x86, linux-s390
In-Reply-To: <20190816092437.31846-1-efremov@linux.com>

On Fri, Aug 16, 2019 at 12:24:27PM +0300, Denis Efremov wrote:
> Code that iterates over all standard PCI BARs typically uses
> PCI_STD_RESOURCE_END, but this is error-prone because it requires
> "i <= PCI_STD_RESOURCE_END" rather than something like
> "i < PCI_STD_NUM_BARS". We could add such a definition and use it the same
> way PCI_SRIOV_NUM_BARS is used. There is already the definition
> PCI_BAR_COUNT for s390 only. Thus, this patchset introduces it globally.
> 
> Changes in v2:
>   - Reverse checks in pci_iomap_range,pci_iomap_wc_range.
>   - Refactor loops in vfio_pci to keep PCI_STD_RESOURCES.
>   - Add 2 new patches to replace the magic constant with new define.
>   - Split net patch in v1 to separate stmmac and dwc-xlgmac patches.
> 
> Denis Efremov (10):
>   PCI: Add define for the number of standard PCI BARs
>   s390/pci: Loop using PCI_STD_NUM_BARS
>   x86/PCI: Loop using PCI_STD_NUM_BARS
>   stmmac: pci: Loop using PCI_STD_NUM_BARS
>   net: dwc-xlgmac: Loop using PCI_STD_NUM_BARS
>   rapidio/tsi721: Loop using PCI_STD_NUM_BARS
>   efifb: Loop using PCI_STD_NUM_BARS
>   vfio_pci: Loop using PCI_STD_NUM_BARS
>   PCI: hv: Use PCI_STD_NUM_BARS
>   PCI: Use PCI_STD_NUM_BARS
> 
>  arch/s390/include/asm/pci.h                      |  5 +----
>  arch/s390/include/asm/pci_clp.h                  |  6 +++---
>  arch/s390/pci/pci.c                              | 16 ++++++++--------
>  arch/s390/pci/pci_clp.c                          |  6 +++---
>  arch/x86/pci/common.c                            |  2 +-
>  drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c |  4 ++--
>  drivers/net/ethernet/synopsys/dwc-xlgmac-pci.c   |  2 +-
>  drivers/pci/controller/pci-hyperv.c              | 10 +++++-----
>  drivers/pci/pci.c                                | 11 ++++++-----
>  drivers/pci/quirks.c                             |  4 ++--
>  drivers/rapidio/devices/tsi721.c                 |  2 +-
>  drivers/vfio/pci/vfio_pci.c                      | 11 +++++++----
>  drivers/vfio/pci/vfio_pci_config.c               | 10 ++++++----
>  drivers/vfio/pci/vfio_pci_private.h              |  4 ++--
>  drivers/video/fbdev/efifb.c                      |  2 +-
>  include/linux/pci.h                              |  2 +-
>  include/uapi/linux/pci_regs.h                    |  1 +
>  17 files changed, 51 insertions(+), 47 deletions(-)

I've come across a few more places where this change can be made. There
may be multiple instances in the same file, but only the first is shown
below:

drivers/misc/pci_endpoint_test.c:       for (bar = BAR_0; bar <= BAR_5; bar++) {
drivers/net/ethernet/intel/e1000/e1000_main.c:          for (i = BAR_1; i <= BAR_5; i++) {
drivers/net/ethernet/intel/ixgb/ixgb_main.c:    for (i = BAR_1; i <= BAR_5; i++) {
drivers/pci/controller/dwc/pci-dra7xx.c:        for (bar = BAR_0; bar <= BAR_5; bar++)
drivers/pci/controller/dwc/pci-layerscape-ep.c: for (bar = BAR_0; bar <= BAR_5; bar++)
drivers/pci/controller/dwc/pcie-artpec6.c:      for (bar = BAR_0; bar <= BAR_5; bar++)
drivers/pci/controller/dwc/pcie-designware-plat.c:      for (bar = BAR_0; bar <= BAR_5; bar++)
drivers/pci/endpoint/functions/pci-epf-test.c:  for (bar = BAR_0; bar <= BAR_5; bar++) {
include/linux/pci-epc.h:        u64     bar_fixed_size[BAR_5 + 1];
drivers/scsi/pm8001/pm8001_hwi.c:       for (bar = 0; bar < 6; bar++) {
drivers/scsi/pm8001/pm8001_init.c:      for (bar = 0; bar < 6; bar++) {
drivers/ata/sata_nv.c:  for (bar = 0; bar < 6; bar++)
drivers/video/fbdev/core/fbmem.c:       for (idx = 0, bar = 0; bar < PCI_ROM_RESOURCE; bar++) {
drivers/staging/gasket/gasket_core.c:   for (i = 0; i < GASKET_NUM_BARS; i++) {
drivers/tty/serial/8250/8250_pci.c:     for (i = 0; i < PCI_NUM_BAR_RESOURCES; i++) { <-----------

It looks like BARs are often iterated with PCI_NUM_BAR_RESOURCES, there
are a load of these too found with:

git grep PCI_ROM_RESOURCE | grep "< "

I'm happy to share patches if preferred.

Thanks,

Andrew Murray

> 
> -- 
> 2.21.0
> 

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox