Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: pull-request: can 2019-07-24
From: David Miller @ 2019-07-24 21:15 UTC (permalink / raw)
  To: mkl; +Cc: netdev, linux-can, kernel
In-Reply-To: <20190724130322.31702-1-mkl@pengutronix.de>

From: Marc Kleine-Budde <mkl@pengutronix.de>
Date: Wed, 24 Jul 2019 15:03:15 +0200

> this is a pull reqeust of 7 patches for net/master.

Pulled, thank you.

^ permalink raw reply

* Re: [PATCH net-next] r8169: improve rtl_set_rx_mode
From: Saeed Mahameed @ 2019-07-24 21:24 UTC (permalink / raw)
  To: nic_swsd@realtek.com, hkallweit1@gmail.com, davem@davemloft.net
  Cc: netdev@vger.kernel.org
In-Reply-To: <d9900738-0eaf-63cc-dbbf-41ca05539794@gmail.com>

On Wed, 2019-07-24 at 23:04 +0200, Heiner Kallweit wrote:
> This patch improves and simplifies rtl_set_rx_mode a little.
> No functional change intended.
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
> ---
>  drivers/net/ethernet/realtek/r8169_main.c | 52 ++++++++++-----------
> --
>  1 file changed, 22 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c
> b/drivers/net/ethernet/realtek/r8169_main.c
> index 9c743d2fc..c39d3a77c 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> @@ -61,7 +61,7 @@
>  
>  /* Maximum number of multicast addresses to filter (vs. Rx-all-
> multicast).
>     The RTL chips use a 64 element hash table based on the Ethernet
> CRC. */
> -static const int multicast_filter_limit = 32;
> +#define	MC_FILTER_LIMIT	32
>  
>  #define TX_DMA_BURST	7	/* Maximum PCI burst, '7' is unlimited */
>  #define InterFrameGap	0x03	/* 3 means InterFrameGap =
> the shortest one */
> @@ -4147,53 +4147,45 @@ static void rtl8169_set_magic_reg(struct
> rtl8169_private *tp, unsigned mac_versi
>  static void rtl_set_rx_mode(struct net_device *dev)
>  {
>  	struct rtl8169_private *tp = netdev_priv(dev);
> -	u32 mc_filter[2];	/* Multicast hash filter */
> -	int rx_mode;
> -	u32 tmp = 0;
> +	/* Multicast hash filter */
> +	u32 mc_filter[2] = { 0xffffffff, 0xffffffff };
> +	u32 rx_mode = AcceptBroadcast | AcceptMyPhys | AcceptMulticast;
> +	u32 tmp;
>  

While you are here, maybe improve the declaration order with a reversed
xmas tree ..


^ permalink raw reply

* [net-next v2 0/5][pull request] 1GbE Intel Wired LAN Driver Updates 2019-07-24
From: Jeff Kirsher @ 2019-07-24 21:26 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann

This series contains updates to igc and e1000e client drivers only.

Sasha provides a couple of cleanups to remove code that is not needed
and reduce structure sizes.  Updated the MAC reset flow to use the
device reset flow instead of a port reset flow.  Added addition device
id's that will be supported.

Kai-Heng Feng provides a workaround for a possible stalled packet issue
in our ICH devices due to a clock recovery from the PCH being too slow.

v2: removed the last patch in the series that supposedly fixed a MAC/PHY
    de-sync potential issue while waiting for additional information from
    hardware engineers.

The following are changes since commit 92493a2f8a8d5a5bc1188fc71ef02df859ebd932:
  Build fixes for skb_frag_size conversion
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 1GbE

Kai-Heng Feng (1):
  e1000e: add workaround for possible stalled packet

Sasha Neftin (4):
  igc: Remove the polarity field from a PHY information structure
  igc: Remove the unused field from a device specification structure
  igc: Update the MAC reset flow
  igc: Add more SKUs for i225 device

 drivers/net/ethernet/intel/e1000e/ich8lan.c  | 10 ++++++++++
 drivers/net/ethernet/intel/e1000e/ich8lan.h  |  2 +-
 drivers/net/ethernet/intel/igc/igc_base.c    |  5 ++++-
 drivers/net/ethernet/intel/igc/igc_defines.h |  2 +-
 drivers/net/ethernet/intel/igc/igc_hw.h      | 14 +++-----------
 drivers/net/ethernet/intel/igc/igc_main.c    |  3 +++
 6 files changed, 22 insertions(+), 14 deletions(-)

-- 
2.21.0


^ permalink raw reply

* [net-next v2 2/5] igc: Remove the unused field from a device specification structure
From: Jeff Kirsher @ 2019-07-24 21:26 UTC (permalink / raw)
  To: davem; +Cc: Sasha Neftin, netdev, nhorman, sassmann, Aaron Brown,
	Jeff Kirsher
In-Reply-To: <20190724212613.1580-1-jeffrey.t.kirsher@intel.com>

From: Sasha Neftin <sasha.neftin@intel.com>

This patch comes to clean up the device specification structure.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_hw.h | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_hw.h b/drivers/net/ethernet/intel/igc/igc_hw.h
index f689f0a02b5d..9a338fbf671c 100644
--- a/drivers/net/ethernet/intel/igc/igc_hw.h
+++ b/drivers/net/ethernet/intel/igc/igc_hw.h
@@ -184,12 +184,7 @@ struct igc_fc_info {
 };
 
 struct igc_dev_spec_base {
-	bool global_device_reset;
-	bool eee_disable;
 	bool clear_semaphore_once;
-	bool module_plugged;
-	u8 media_port;
-	bool mas_capable;
 };
 
 struct igc_hw {
-- 
2.21.0


^ permalink raw reply related

* [net-next v2 1/5] igc: Remove the polarity field from a PHY information structure
From: Jeff Kirsher @ 2019-07-24 21:26 UTC (permalink / raw)
  To: davem; +Cc: Sasha Neftin, netdev, nhorman, sassmann, Aaron Brown,
	Jeff Kirsher
In-Reply-To: <20190724212613.1580-1-jeffrey.t.kirsher@intel.com>

From: Sasha Neftin <sasha.neftin@intel.com>

Polarity and cable length fields is not applicable for the i225 device.
This patch comes to clean up PHY information structure.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_hw.h | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_hw.h b/drivers/net/ethernet/intel/igc/igc_hw.h
index 1039a224ac80..f689f0a02b5d 100644
--- a/drivers/net/ethernet/intel/igc/igc_hw.h
+++ b/drivers/net/ethernet/intel/igc/igc_hw.h
@@ -151,16 +151,10 @@ struct igc_phy_info {
 
 	u16 autoneg_advertised;
 	u16 autoneg_mask;
-	u16 cable_length;
-	u16 max_cable_length;
-	u16 min_cable_length;
-	u16 pair_length[4];
 
 	u8 mdix;
 
-	bool disable_polarity_correction;
 	bool is_mdix;
-	bool polarity_correction;
 	bool reset_disable;
 	bool speed_downgraded;
 	bool autoneg_wait_to_complete;
-- 
2.21.0


^ permalink raw reply related

* [net-next v2 5/5] e1000e: add workaround for possible stalled packet
From: Jeff Kirsher @ 2019-07-24 21:26 UTC (permalink / raw)
  To: davem; +Cc: Kai-Heng Feng, netdev, nhorman, sassmann, Aaron Brown,
	Jeff Kirsher
In-Reply-To: <20190724212613.1580-1-jeffrey.t.kirsher@intel.com>

From: Kai-Heng Feng <kai.heng.feng@canonical.com>

This works around a possible stalled packet issue, which may occur due to
clock recovery from the PCH being too slow, when the LAN is transitioning
from K1 at 1G link speed.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204057

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/ich8lan.c | 10 ++++++++++
 drivers/net/ethernet/intel/e1000e/ich8lan.h |  2 +-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c b/drivers/net/ethernet/intel/e1000e/ich8lan.c
index 395b05701480..a1fab77b2096 100644
--- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
+++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
@@ -1429,6 +1429,16 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
 			else
 				phy_reg |= 0xFA;
 			e1e_wphy_locked(hw, I217_PLL_CLOCK_GATE_REG, phy_reg);
+
+			if (speed == SPEED_1000) {
+				hw->phy.ops.read_reg_locked(hw, HV_PM_CTRL,
+							    &phy_reg);
+
+				phy_reg |= HV_PM_CTRL_K1_CLK_REQ;
+
+				hw->phy.ops.write_reg_locked(hw, HV_PM_CTRL,
+							     phy_reg);
+			}
 		}
 		hw->phy.ops.release(hw);
 
diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.h b/drivers/net/ethernet/intel/e1000e/ich8lan.h
index eb09c755fa17..1502895eb45d 100644
--- a/drivers/net/ethernet/intel/e1000e/ich8lan.h
+++ b/drivers/net/ethernet/intel/e1000e/ich8lan.h
@@ -210,7 +210,7 @@
 
 /* PHY Power Management Control */
 #define HV_PM_CTRL		PHY_REG(770, 17)
-#define HV_PM_CTRL_PLL_STOP_IN_K1_GIGA	0x100
+#define HV_PM_CTRL_K1_CLK_REQ		0x200
 #define HV_PM_CTRL_K1_ENABLE		0x4000
 
 #define I217_PLL_CLOCK_GATE_REG	PHY_REG(772, 28)
-- 
2.21.0


^ permalink raw reply related

* [net-next v2 4/5] igc: Add more SKUs for i225 device
From: Jeff Kirsher @ 2019-07-24 21:26 UTC (permalink / raw)
  To: davem; +Cc: Sasha Neftin, netdev, nhorman, sassmann, Aaron Brown,
	Jeff Kirsher
In-Reply-To: <20190724212613.1580-1-jeffrey.t.kirsher@intel.com>

From: Sasha Neftin <sasha.neftin@intel.com>

Add support for more SKUs.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_base.c | 3 +++
 drivers/net/ethernet/intel/igc/igc_hw.h   | 3 +++
 drivers/net/ethernet/intel/igc/igc_main.c | 3 +++
 3 files changed, 9 insertions(+)

diff --git a/drivers/net/ethernet/intel/igc/igc_base.c b/drivers/net/ethernet/intel/igc/igc_base.c
index 46206b3dabfb..db289bcce21d 100644
--- a/drivers/net/ethernet/intel/igc/igc_base.c
+++ b/drivers/net/ethernet/intel/igc/igc_base.c
@@ -209,6 +209,9 @@ static s32 igc_get_invariants_base(struct igc_hw *hw)
 	switch (hw->device_id) {
 	case IGC_DEV_ID_I225_LM:
 	case IGC_DEV_ID_I225_V:
+	case IGC_DEV_ID_I225_I:
+	case IGC_DEV_ID_I220_V:
+	case IGC_DEV_ID_I225_K:
 		mac->type = igc_i225;
 		break;
 	default:
diff --git a/drivers/net/ethernet/intel/igc/igc_hw.h b/drivers/net/ethernet/intel/igc/igc_hw.h
index 9a338fbf671c..abb2d72911ff 100644
--- a/drivers/net/ethernet/intel/igc/igc_hw.h
+++ b/drivers/net/ethernet/intel/igc/igc_hw.h
@@ -18,6 +18,9 @@
 
 #define IGC_DEV_ID_I225_LM			0x15F2
 #define IGC_DEV_ID_I225_V			0x15F3
+#define IGC_DEV_ID_I225_I			0x15F8
+#define IGC_DEV_ID_I220_V			0x15F7
+#define IGC_DEV_ID_I225_K			0x3100
 
 #define IGC_FUNC_0				0
 
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 9ffe71424ece..e5114bebd30b 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -36,6 +36,9 @@ static const struct igc_info *igc_info_tbl[] = {
 static const struct pci_device_id igc_pci_tbl[] = {
 	{ PCI_VDEVICE(INTEL, IGC_DEV_ID_I225_LM), board_base },
 	{ PCI_VDEVICE(INTEL, IGC_DEV_ID_I225_V), board_base },
+	{ PCI_VDEVICE(INTEL, IGC_DEV_ID_I225_I), board_base },
+	{ PCI_VDEVICE(INTEL, IGC_DEV_ID_I220_V), board_base },
+	{ PCI_VDEVICE(INTEL, IGC_DEV_ID_I225_K), board_base },
 	/* required last entry */
 	{0, }
 };
-- 
2.21.0


^ permalink raw reply related

* [net-next v2 3/5] igc: Update the MAC reset flow
From: Jeff Kirsher @ 2019-07-24 21:26 UTC (permalink / raw)
  To: davem; +Cc: Sasha Neftin, netdev, nhorman, sassmann, Aaron Brown,
	Jeff Kirsher
In-Reply-To: <20190724212613.1580-1-jeffrey.t.kirsher@intel.com>

From: Sasha Neftin <sasha.neftin@intel.com>

Use Device Reset flow instead of Port Reset flow.
This flow performs a reset of the entire controller device,
resulting in a state nearly approximating the state
following a power-up reset or internal PCIe reset,
except for system PCI configuration.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_base.c    | 2 +-
 drivers/net/ethernet/intel/igc/igc_defines.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_base.c b/drivers/net/ethernet/intel/igc/igc_base.c
index 59258d791106..46206b3dabfb 100644
--- a/drivers/net/ethernet/intel/igc/igc_base.c
+++ b/drivers/net/ethernet/intel/igc/igc_base.c
@@ -40,7 +40,7 @@ static s32 igc_reset_hw_base(struct igc_hw *hw)
 	ctrl = rd32(IGC_CTRL);
 
 	hw_dbg("Issuing a global reset to MAC\n");
-	wr32(IGC_CTRL, ctrl | IGC_CTRL_RST);
+	wr32(IGC_CTRL, ctrl | IGC_CTRL_DEV_RST);
 
 	ret_val = igc_get_auto_rd_done(hw);
 	if (ret_val) {
diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index fc0ccfe38a20..11b99acf4abe 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -54,7 +54,7 @@
 #define IGC_ERR_SWFW_SYNC		13
 
 /* Device Control */
-#define IGC_CTRL_RST		0x04000000  /* Global reset */
+#define IGC_CTRL_DEV_RST	0x20000000  /* Device reset */
 
 #define IGC_CTRL_PHY_RST	0x80000000  /* PHY Reset */
 #define IGC_CTRL_SLU		0x00000040  /* Set link up (Force Link) */
-- 
2.21.0


^ permalink raw reply related

* [PATCH v2 net-next] r8169: improve rtl_set_rx_mode
From: Heiner Kallweit @ 2019-07-24 21:34 UTC (permalink / raw)
  To: Realtek linux nic maintainers, David Miller; +Cc: netdev@vger.kernel.org

This patch improves and simplifies rtl_set_rx_mode a little.
No functional change intended.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
v2:
- change variable declarations to reverse xmas tree
---
 drivers/net/ethernet/realtek/r8169_main.c | 52 ++++++++++-------------
 1 file changed, 22 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 9c743d2fc..9bd310938 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -61,7 +61,7 @@
 
 /* Maximum number of multicast addresses to filter (vs. Rx-all-multicast).
    The RTL chips use a 64 element hash table based on the Ethernet CRC. */
-static const int multicast_filter_limit = 32;
+#define	MC_FILTER_LIMIT	32
 
 #define TX_DMA_BURST	7	/* Maximum PCI burst, '7' is unlimited */
 #define InterFrameGap	0x03	/* 3 means InterFrameGap = the shortest one */
@@ -4146,54 +4146,46 @@ static void rtl8169_set_magic_reg(struct rtl8169_private *tp, unsigned mac_versi
 
 static void rtl_set_rx_mode(struct net_device *dev)
 {
+	u32 rx_mode = AcceptBroadcast | AcceptMyPhys | AcceptMulticast;
+	/* Multicast hash filter */
+	u32 mc_filter[2] = { 0xffffffff, 0xffffffff };
 	struct rtl8169_private *tp = netdev_priv(dev);
-	u32 mc_filter[2];	/* Multicast hash filter */
-	int rx_mode;
-	u32 tmp = 0;
+	u32 tmp;
 
 	if (dev->flags & IFF_PROMISC) {
 		/* Unconditionally log net taps. */
 		netif_notice(tp, link, dev, "Promiscuous mode enabled\n");
-		rx_mode =
-		    AcceptBroadcast | AcceptMulticast | AcceptMyPhys |
-		    AcceptAllPhys;
-		mc_filter[1] = mc_filter[0] = 0xffffffff;
-	} else if ((netdev_mc_count(dev) > multicast_filter_limit) ||
-		   (dev->flags & IFF_ALLMULTI)) {
-		/* Too many to filter perfectly -- accept all multicasts. */
-		rx_mode = AcceptBroadcast | AcceptMulticast | AcceptMyPhys;
-		mc_filter[1] = mc_filter[0] = 0xffffffff;
+		rx_mode |= AcceptAllPhys;
+	} else if (netdev_mc_count(dev) > MC_FILTER_LIMIT ||
+		   dev->flags & IFF_ALLMULTI ||
+		   tp->mac_version == RTL_GIGA_MAC_VER_35) {
+		/* accept all multicasts */
+	} else if (netdev_mc_empty(dev)) {
+		rx_mode &= ~AcceptMulticast;
 	} else {
 		struct netdev_hw_addr *ha;
 
-		rx_mode = AcceptBroadcast | AcceptMyPhys;
 		mc_filter[1] = mc_filter[0] = 0;
 		netdev_for_each_mc_addr(ha, dev) {
-			int bit_nr = ether_crc(ETH_ALEN, ha->addr) >> 26;
-			mc_filter[bit_nr >> 5] |= 1 << (bit_nr & 31);
-			rx_mode |= AcceptMulticast;
+			u32 bit_nr = ether_crc(ETH_ALEN, ha->addr) >> 26;
+			mc_filter[bit_nr >> 5] |= BIT(bit_nr & 31);
+		}
+
+		if (tp->mac_version > RTL_GIGA_MAC_VER_06) {
+			tmp = mc_filter[0];
+			mc_filter[0] = swab32(mc_filter[1]);
+			mc_filter[1] = swab32(tmp);
 		}
 	}
 
 	if (dev->features & NETIF_F_RXALL)
 		rx_mode |= (AcceptErr | AcceptRunt);
 
-	tmp = (RTL_R32(tp, RxConfig) & ~RX_CONFIG_ACCEPT_MASK) | rx_mode;
-
-	if (tp->mac_version > RTL_GIGA_MAC_VER_06) {
-		u32 data = mc_filter[0];
-
-		mc_filter[0] = swab32(mc_filter[1]);
-		mc_filter[1] = swab32(data);
-	}
-
-	if (tp->mac_version == RTL_GIGA_MAC_VER_35)
-		mc_filter[1] = mc_filter[0] = 0xffffffff;
-
 	RTL_W32(tp, MAR0 + 4, mc_filter[1]);
 	RTL_W32(tp, MAR0 + 0, mc_filter[0]);
 
-	RTL_W32(tp, RxConfig, tmp);
+	tmp = RTL_R32(tp, RxConfig);
+	RTL_W32(tp, RxConfig, (tmp & ~RX_CONFIG_ACCEPT_MASK) | rx_mode);
 }
 
 DECLARE_RTL_COND(rtl_csiar_cond)
-- 
2.22.0


^ permalink raw reply related

* Re: [PATCH net-next] r8169: improve rtl_set_rx_mode
From: Heiner Kallweit @ 2019-07-24 21:36 UTC (permalink / raw)
  To: Saeed Mahameed, nic_swsd@realtek.com, davem@davemloft.net
  Cc: netdev@vger.kernel.org
In-Reply-To: <228e1254af247a66ebbae649f5a1385b8da64597.camel@mellanox.com>

On 24.07.2019 23:24, Saeed Mahameed wrote:
> On Wed, 2019-07-24 at 23:04 +0200, Heiner Kallweit wrote:
>> This patch improves and simplifies rtl_set_rx_mode a little.
>> No functional change intended.
>>
>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>> ---
>>  drivers/net/ethernet/realtek/r8169_main.c | 52 ++++++++++-----------
>> --
>>  1 file changed, 22 insertions(+), 30 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/realtek/r8169_main.c
>> b/drivers/net/ethernet/realtek/r8169_main.c
>> index 9c743d2fc..c39d3a77c 100644
>> --- a/drivers/net/ethernet/realtek/r8169_main.c
>> +++ b/drivers/net/ethernet/realtek/r8169_main.c
>> @@ -61,7 +61,7 @@
>>  
>>  /* Maximum number of multicast addresses to filter (vs. Rx-all-
>> multicast).
>>     The RTL chips use a 64 element hash table based on the Ethernet
>> CRC. */
>> -static const int multicast_filter_limit = 32;
>> +#define	MC_FILTER_LIMIT	32
>>  
>>  #define TX_DMA_BURST	7	/* Maximum PCI burst, '7' is unlimited */
>>  #define InterFrameGap	0x03	/* 3 means InterFrameGap =
>> the shortest one */
>> @@ -4147,53 +4147,45 @@ static void rtl8169_set_magic_reg(struct
>> rtl8169_private *tp, unsigned mac_versi
>>  static void rtl_set_rx_mode(struct net_device *dev)
>>  {
>>  	struct rtl8169_private *tp = netdev_priv(dev);
>> -	u32 mc_filter[2];	/* Multicast hash filter */
>> -	int rx_mode;
>> -	u32 tmp = 0;
>> +	/* Multicast hash filter */
>> +	u32 mc_filter[2] = { 0xffffffff, 0xffffffff };
>> +	u32 rx_mode = AcceptBroadcast | AcceptMyPhys | AcceptMulticast;
>> +	u32 tmp;
>>  
> 
> While you are here, maybe improve the declaration order with a reversed
> xmas tree ..
> 
Indeed .. Thanks.

^ permalink raw reply

* Re: [PATCH v2] net: phylink: don't start and stop SGMII PHYs in SFP modules twice
From: David Miller @ 2019-07-24 21:38 UTC (permalink / raw)
  To: asolokha; +Cc: claudiu.manoil, ioana.ciornei, linux, andrew, netdev
In-Reply-To: <20190724133139.8356-1-asolokha@kb.kras.ru>

From: Arseny Solokha <asolokha@kb.kras.ru>
Date: Wed, 24 Jul 2019 20:31:39 +0700

> SFP modules connected using the SGMII interface have their own PHYs which
> are handled by the struct phylink's phydev field. On the other hand, for
> the modules connected using 1000Base-X interface that field is not set.
> 
> Since commit ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network
> devices and sfp cages") phylink_start() ends up setting the phydev field
> using the sfp-bus infrastructure, which eventually calls phy_start() on it,
> and then calling phy_start() again on the same phydev from phylink_start()
> itself. Similar call sequence holds for phylink_stop(), only in the reverse
> order. This results in WARNs during network interface bringup and shutdown
> when a copper SFP module is connected, as phy_start() and phy_stop() are
> called twice in a row for the same phy_device:
 ...
> SFP modules with the 1000Base-X interface are not affected.
> 
> Place explicit calls to phy_start() and phy_stop() before enabling or after
> disabling an attached SFP module, where phydev is not yet set (or is
> already unset), so they will be made only from the inside of sfp-bus, if
> needed.
> 
> Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>

Applied with appropriate Fixes: tag added and queued up for -stable.

Thanks.

^ permalink raw reply

* Re: [PATCH bpf-next 2/6] bpf: add BPF_MAP_DUMP command to dump more than one entry per call
From: Song Liu @ 2019-07-24 21:40 UTC (permalink / raw)
  To: Brian Vazquez
  Cc: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann,
	David S . Miller, Stanislav Fomichev, Willem de Bruijn,
	Petar Penkov, open list, Networking, bpf
In-Reply-To: <20190724165803.87470-3-brianvv@google.com>

On Wed, Jul 24, 2019 at 10:10 AM Brian Vazquez <brianvv@google.com> wrote:
>
> This introduces a new command to retrieve multiple number of entries
> from a bpf map, wrapping the existing bpf methods:
> map_get_next_key and map_lookup_elem
>
> To start dumping the map from the beginning you must specify NULL as
> the prev_key.
>
> The new API returns 0 when it successfully copied all the elements
> requested or it copied less because there weren't more elements to
> retrieved (i.e err == -ENOENT). In last scenario err will be masked to 0.
>
> On a successful call buf and buf_len will contain correct data and in
> case prev_key was provided (not for the first walk, since prev_key is
> NULL) it will contain the last_key copied into the prev_key which will
> simplify next call.
>
> Only when it can't find a single element it will return -ENOENT meaning
> that the map has been entirely walked. When an error is return buf,
> buf_len and prev_key shouldn't be read nor used.
>
> Because maps can be called from userspace and kernel code, this function
> can have a scenario where the next_key was found but by the time we
> try to retrieve the value the element is not there, in this case the
> function continues and tries to get a new next_key value, skipping the
> deleted key. If at some point the function find itself trap in a loop,
> it will return -EINTR.
>
> The function will try to fit as much as possible in the buf provided and
> will return -EINVAL if buf_len is smaller than elem_size.
>
> QUEUE and STACK maps are not supported.
>
> Note that map_dump doesn't guarantee that reading the entire table is
> consistent since this function is always racing with kernel and user code
> but the same behaviour is found when the entire table is walked using
> the current interfaces: map_get_next_key + map_lookup_elem.
> It is also important to note that with  a locked map, the lock is grabbed
> for 1 entry at the time, meaning that the returned buf might or might not
> be consistent.
>
> Suggested-by: Stanislav Fomichev <sdf@google.com>
> Signed-off-by: Brian Vazquez <brianvv@google.com>
> ---
>  include/uapi/linux/bpf.h |   9 +++
>  kernel/bpf/syscall.c     | 117 +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 126 insertions(+)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index fa1c753dcdbc7..66dab5385170d 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -106,6 +106,7 @@ enum bpf_cmd {
>         BPF_TASK_FD_QUERY,
>         BPF_MAP_LOOKUP_AND_DELETE_ELEM,
>         BPF_MAP_FREEZE,
> +       BPF_MAP_DUMP,
>  };
>
>  enum bpf_map_type {
> @@ -388,6 +389,14 @@ union bpf_attr {
>                 __u64           flags;
>         };
>
> +       struct { /* struct used by BPF_MAP_DUMP command */
> +               __aligned_u64   prev_key;
> +               __aligned_u64   buf;
> +               __aligned_u64   buf_len; /* input/output: len of buf */
> +               __u64           flags;

Please add explanation of flags here. Also, we need to update the
comments of BPF_F_LOCK for BPF_MAP_DUMP.

> +               __u32           map_fd;
> +       } dump;
> +
>         struct { /* anonymous struct used by BPF_PROG_LOAD command */
>                 __u32           prog_type;      /* one of enum bpf_prog_type */
>                 __u32           insn_cnt;
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 86cdc2f7bb56e..0c35505aa219f 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -1097,6 +1097,120 @@ static int map_get_next_key(union bpf_attr *attr)
>         return err;
>  }
>
> +/* last field in 'union bpf_attr' used by this command */
> +#define BPF_MAP_DUMP_LAST_FIELD dump.map_fd
> +
> +static int map_dump(union bpf_attr *attr)
> +{
> +       void __user *ukey = u64_to_user_ptr(attr->dump.prev_key);
> +       void __user *ubuf = u64_to_user_ptr(attr->dump.buf);
> +       u32 __user *ubuf_len = u64_to_user_ptr(attr->dump.buf_len);
> +       int ufd = attr->dump.map_fd;
> +       struct bpf_map *map;
> +       void *buf, *prev_key, *key, *value;
> +       u32 value_size, elem_size, buf_len, cp_len;
> +       struct fd f;
> +       int err;
> +       bool first_key = false;
> +
> +       if (CHECK_ATTR(BPF_MAP_DUMP))
> +               return -EINVAL;
> +
> +       if (attr->dump.flags & ~BPF_F_LOCK)
> +               return -EINVAL;
> +
> +       f = fdget(ufd);
> +       map = __bpf_map_get(f);
> +       if (IS_ERR(map))
> +               return PTR_ERR(map);
> +       if (!(map_get_sys_perms(map, f) & FMODE_CAN_READ)) {
> +               err = -EPERM;
> +               goto err_put;
> +       }
> +
> +       if ((attr->dump.flags & BPF_F_LOCK) &&
> +           !map_value_has_spin_lock(map)) {
> +               err = -EINVAL;
> +               goto err_put;
> +       }

We can share these lines with map_lookup_elem(). Maybe
add another helper function?

> +
> +       if (map->map_type == BPF_MAP_TYPE_QUEUE ||
> +           map->map_type == BPF_MAP_TYPE_STACK) {
> +               err = -ENOTSUPP;
> +               goto err_put;
> +       }
> +
> +       value_size = bpf_map_value_size(map);
> +
> +       err = get_user(buf_len, ubuf_len);
> +       if (err)
> +               goto err_put;
> +
> +       elem_size = map->key_size + value_size;
> +       if (buf_len < elem_size) {
> +               err = -EINVAL;
> +               goto err_put;
> +       }
> +
> +       if (ukey) {
> +               prev_key = __bpf_copy_key(ukey, map->key_size);
> +               if (IS_ERR(prev_key)) {
> +                       err = PTR_ERR(prev_key);
> +                       goto err_put;
> +               }
> +       } else {
> +               prev_key = NULL;
> +               first_key = true;
> +       }
> +
> +       err = -ENOMEM;
> +       buf = kmalloc(elem_size, GFP_USER | __GFP_NOWARN);
> +       if (!buf)
> +               goto err_put;
> +
> +       key = buf;
> +       value = key + map->key_size;
> +       for (cp_len = 0; cp_len + elem_size <= buf_len;) {
> +               if (signal_pending(current)) {
> +                       err = -EINTR;
> +                       break;
> +               }
> +
> +               rcu_read_lock();
> +               err = map->ops->map_get_next_key(map, prev_key, key);

If prev_key is deleted before map_get_next_key(), we get the first key
again. This is pretty weird.

> +               rcu_read_unlock();
> +
> +               if (err)
> +                       break;
> +
> +               err = bpf_map_copy_value(map, key, value, attr->dump.flags);
> +
> +               if (err == -ENOENT)
> +                       continue;
> +               if (err)
> +                       goto free_buf;
> +
> +               if (copy_to_user(ubuf + cp_len, buf, elem_size)) {
> +                       err = -EFAULT;
> +                       goto free_buf;
> +               }
> +
> +               prev_key = key;
> +               cp_len += elem_size;
> +       }
> +
> +       if (err == -ENOENT && cp_len)
> +               err = 0;
> +       if (!err && (copy_to_user(ubuf_len, &cp_len, sizeof(cp_len)) ||
> +                   (!first_key && copy_to_user(ukey, key, map->key_size))))
> +               err = -EFAULT;
> +free_buf:
> +       kfree(buf);
> +err_put:
> +       fdput(f);
> +       return err;
> +}
> +
>  #define BPF_MAP_LOOKUP_AND_DELETE_ELEM_LAST_FIELD value
>
>  static int map_lookup_and_delete_elem(union bpf_attr *attr)
> @@ -2910,6 +3024,9 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
>         case BPF_MAP_LOOKUP_AND_DELETE_ELEM:
>                 err = map_lookup_and_delete_elem(&attr);
>                 break;
> +       case BPF_MAP_DUMP:
> +               err = map_dump(&attr);
> +               break;
>         default:
>                 err = -EINVAL;
>                 break;
> --
> 2.22.0.657.g960e92d24f-goog
>

^ permalink raw reply

* Re: [PATCH bpf-next 3/6] bpf: keep bpf.h in sync with tools/
From: Song Liu @ 2019-07-24 21:41 UTC (permalink / raw)
  To: Brian Vazquez
  Cc: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann,
	David S . Miller, Stanislav Fomichev, Willem de Bruijn,
	Petar Penkov, open list, Networking, bpf
In-Reply-To: <20190724165803.87470-4-brianvv@google.com>

On Wed, Jul 24, 2019 at 10:10 AM Brian Vazquez <brianvv@google.com> wrote:
>
> Adds bpf_attr.dump structure to libbpf.
>
> Suggested-by: Stanislav Fomichev <sdf@google.com>
> Signed-off-by: Brian Vazquez <brianvv@google.com>
> ---
>  tools/include/uapi/linux/bpf.h | 9 +++++++++
>  tools/lib/bpf/libbpf.map       | 2 ++
>  2 files changed, 11 insertions(+)
>
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 4e455018da65f..e127f16e4e932 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -106,6 +106,7 @@ enum bpf_cmd {
>         BPF_TASK_FD_QUERY,
>         BPF_MAP_LOOKUP_AND_DELETE_ELEM,
>         BPF_MAP_FREEZE,
> +       BPF_MAP_DUMP,
>  };
>
>  enum bpf_map_type {
> @@ -388,6 +389,14 @@ union bpf_attr {
>                 __u64           flags;
>         };
>
> +       struct { /* struct used by BPF_MAP_DUMP command */
> +               __aligned_u64   prev_key;
> +               __aligned_u64   buf;
> +               __aligned_u64   buf_len; /* input/output: len of buf */
> +               __u64           flags;
> +               __u32           map_fd;
> +       } dump;
> +
>         struct { /* anonymous struct used by BPF_PROG_LOAD command */
>                 __u32           prog_type;      /* one of enum bpf_prog_type */
>                 __u32           insn_cnt;
> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index f9d316e873d8d..cac3723d5c45c 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -183,4 +183,6 @@ LIBBPF_0.0.4 {
>                 perf_buffer__new;
>                 perf_buffer__new_raw;
>                 perf_buffer__poll;
> +               bpf_map_dump;
> +               bpf_map_dump_flags;
>  } LIBBPF_0.0.3;

libbpf.map change should go with 4/6.

> --
> 2.22.0.657.g960e92d24f-goog
>

^ permalink raw reply

* Re: [PATCH -next v2] net/ixgbevf: fix a compilation error of skb_frag_t
From: David Miller @ 2019-07-24 21:41 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: cai, netdev, linux-kernel
In-Reply-To: <4b5abf35a7b78ceae788ad7c2609d84dd33e5e9e.camel@intel.com>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Wed, 24 Jul 2019 09:39:26 -0700

> Dave I will pick this up and add it to my queue.

How soon will you get that to me?  The sooner this build fix is cured the
better.

Thanks.

^ permalink raw reply

* Re: [PATCH bpf-next 01/10] libbpf: add .BTF.ext offset relocation section loading
From: Andrii Nakryiko @ 2019-07-24 21:42 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Yonghong Song, Kernel Team
In-Reply-To: <20190724192742.1419254-2-andriin@fb.com>

On Wed, Jul 24, 2019 at 12:28 PM Andrii Nakryiko <andriin@fb.com> wrote:
>
> Add support for BPF CO-RE offset relocations. Add section/record
> iteration macros for .BTF.ext. These macro are useful for iterating over
> each .BTF.ext record, either for dumping out contents or later for BPF
> CO-RE relocation handling.
>
> To enable other parts of libbpf to work with .BTF.ext contents, moved
> a bunch of type definitions into libbpf_internal.h.
>
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---

[...]

> + *
> + * Example to provide a better feel.
> + *
> + *   struct sample {
> + *       int a;
> + *       struct {
> + *           int b[10];
> + *       };
> + *   };
> + *
> + *   struct sample *s = ...;
> + *   int x = &s->a;     // encoded as "0:0" (a is field #0)
> + *   int y = &s->b[5];  // encoded as "0:1:5" (b is field #1, arr elem #5)

This should be "0:1:0:5", actually. Anon struct is field #1 in BTF, b
is field #0 inside that anon struct + array index 5.
Will update it locally and incorporate into next version once the rest
of patch set is reviewed.

> + *   int z = &s[10]->b; // encoded as "10:1" (ptr is used as an array)
> + *
> + * type_id for all relocs in this example  will capture BTF type id of
> + * `struct sample`.
> + *
> + *   [0] https://llvm.org/docs/LangRef.html#getelementptr-instruction
> + */
> +struct bpf_offset_reloc {
> +       __u32   insn_off;
> +       __u32   type_id;
> +       __u32   access_str_off;
> +};
> +
>  #endif /* __LIBBPF_LIBBPF_INTERNAL_H */
> --
> 2.17.1
>

^ permalink raw reply

* Re: [PATCH 3/3] net: dsa: ksz: Add Microchip KSZ8795 DSA driver
From: David Miller @ 2019-07-24 21:46 UTC (permalink / raw)
  To: marex; +Cc: netdev, Tristram.Ha, andrew, f.fainelli, vivien.didelot,
	woojung.huh
In-Reply-To: <20190724134048.31029-4-marex@denx.de>

From: Marek Vasut <marex@denx.de>
Date: Wed, 24 Jul 2019 15:40:48 +0200

> +static void ksz8795_r_mib_cnt(struct ksz_device *dev, int port, u16 addr,
> +			      u64 *cnt)
> +{
> +	u32 data;
> +	u16 ctrl_addr;
> +	u8 check;
> +	int loop;

Reverse christmas tree for these local variable declarations.

> +static void ksz8795_r_mib_pkt(struct ksz_device *dev, int port, u16 addr,
> +			      u64 *dropped, u64 *cnt)
> +{
> +	u32 data;
> +	u16 ctrl_addr;
> +	u8 check;
> +	int loop;

Likewise.

> +static int ksz8795_r_dyn_mac_table(struct ksz_device *dev, u16 addr,
> +				   u8 *mac_addr, u8 *fid, u8 *src_port,
> +				   u8 *timestamp, u16 *entries)
> +{
> +	u32 data_hi;
> +	u32 data_lo;
> +	u16 ctrl_addr;
> +	int rc;
> +	u8 data;

Likewise.

> +static int ksz8795_r_sta_mac_table(struct ksz_device *dev, u16 addr,
> +				   struct alu_struct *alu)
> +{
> +	u64 data;
> +	u32 data_hi;
> +	u32 data_lo;

Likewise.

> +static void ksz8795_w_sta_mac_table(struct ksz_device *dev, u16 addr,
> +				    struct alu_struct *alu)
> +{
> +	u64 data;
> +	u32 data_hi;
> +	u32 data_lo;

Likewise.

> +static inline void ksz8795_from_vlan(u16 vlan, u8 *fid, u8 *member, u8 *valid)

Never use the inline directive in foo.c files, let the compiler decide.

> +static inline void ksz8795_to_vlan(u8 fid, u8 member, u8 valid, u16 *vlan)

Likewise.

> +static void ksz8795_r_vlan_table(struct ksz_device *dev, u16 vid, u16 *vlan)
> +{
> +	u64 buf;
> +	u16 *data = (u16 *)&buf;
> +	u16 addr;
> +	int index;

Reverse christmas tree please.

> +static void ksz8795_w_vlan_table(struct ksz_device *dev, u16 vid, u16 vlan)
> +{
> +	u64 buf;
> +	u16 *data = (u16 *)&buf;
> +	u16 addr;
> +	int index;

Likewise.

> +static void ksz8795_r_phy(struct ksz_device *dev, u16 phy, u16 reg, u16 *val)
> +{
> +	struct ksz_port *port;
> +	u8 ctrl;
> +	u8 restart;
> +	u8 link;
> +	u8 speed;
> +	u8 p = phy;
> +	u16 data = 0;
> +	int processed = true;

Likewise.

> +static void ksz8795_w_phy(struct ksz_device *dev, u16 phy, u16 reg, u16 val)
> +{
> +	u8 ctrl;
> +	u8 restart;
> +	u8 speed;
> +	u8 data;
> +	u8 p = phy;

Likewise.

> +static void ksz8795_port_stp_state_set(struct dsa_switch *ds, int port,
> +				       u8 state)
> +{
> +	struct ksz_device *dev = ds->priv;
> +	struct ksz_port *p = &dev->ports[port];
> +	u8 data;
> +	int member = -1;
> +	int forward = dev->member;

Likewise.

> +static void ksz8795_flush_dyn_mac_table(struct ksz_device *dev, int port)
> +{
> +	struct ksz_port *p;
> +	int cnt;
> +	int first;
> +	int index;
> +	u8 learn[TOTAL_PORT_NUM];

Likewise.

> +static void ksz8795_port_vlan_add(struct dsa_switch *ds, int port,
> +				  const struct switchdev_obj_port_vlan *vlan)
> +{
> +	struct ksz_device *dev = ds->priv;
> +	bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
> +	u16 data;
> +	u16 vid;
> +	u8 fid;
> +	u8 member;
> +	u8 valid;
> +	u16 new_pvid = 0;

Likewise.  And seriously, combine all of the same typed variables onto one line
to compress the space a bit:

	bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
	struct ksz_device *dev = ds->priv;
	u16 data, vid, new_pvid = 0;
	u8 fid, member, valid;

Doesn't that look like 1,000 times nicer? :-)

> +static int ksz8795_port_vlan_del(struct dsa_switch *ds, int port,
> +				 const struct switchdev_obj_port_vlan *vlan)
> +{
> +	struct ksz_device *dev = ds->priv;
> +	bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
> +	u16 data;
> +	u16 vid;
> +	u16 pvid;
> +	u8 fid;
> +	u8 member;
> +	u8 valid;
> +	u16 new_pvid = 0;

Again, same thing.

> +static void ksz8795_port_setup(struct ksz_device *dev, int port, bool cpu_port)
> +{
> +	u8 data8;
> +	u8 member;
> +	struct ksz_port *p = &dev->ports[port];

Likewise.
> +static void ksz8795_config_cpu_port(struct dsa_switch *ds)
> +{
> +	struct ksz_device *dev = ds->priv;
> +	struct ksz_port *p;
> +	int i;
> +	u8 remote;

Likewise.

> +static int ksz8795_setup(struct dsa_switch *ds)
> +{
> +	u8 data8;
> +	u16 data16;
> +	u32 value;
> +	int i;
> +	struct alu_struct alu;
> +	struct ksz_device *dev = ds->priv;
> +	int ret = 0;

Likewise.
> +static int ksz8795_switch_detect(struct ksz_device *dev)
> +{
> +	u16 id16;
> +	u8 id1;
> +	u8 id2;
> +	int ret;

Likewise.

^ permalink raw reply

* [PATCH RFC net-next] net: use helper skb_ensure_writable in skb_checksum_help and skb_crc32c_csum_help
From: Heiner Kallweit @ 2019-07-24 21:47 UTC (permalink / raw)
  To: David Miller; +Cc: netdev@vger.kernel.org

Instead of open-coding making the checksum writable we can use an
appropriate helper. skb_ensure_writable is a candidate, however we
might also use skb_header_unclone. Hints welcome.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 net/core/dev.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index fc676b261..90516a800 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2939,12 +2939,9 @@ int skb_checksum_help(struct sk_buff *skb)
 	offset += skb->csum_offset;
 	BUG_ON(offset + sizeof(__sum16) > skb_headlen(skb));
 
-	if (skb_cloned(skb) &&
-	    !skb_clone_writable(skb, offset + sizeof(__sum16))) {
-		ret = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
-		if (ret)
-			goto out;
-	}
+	ret = skb_ensure_writable(skb, offset + sizeof(__sum16));
+	if (ret)
+		goto out;
 
 	*(__sum16 *)(skb->data + offset) = csum_fold(csum) ?: CSUM_MANGLED_0;
 out_set_summed:
@@ -2979,12 +2976,11 @@ int skb_crc32c_csum_help(struct sk_buff *skb)
 		ret = -EINVAL;
 		goto out;
 	}
-	if (skb_cloned(skb) &&
-	    !skb_clone_writable(skb, offset + sizeof(__le32))) {
-		ret = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
-		if (ret)
-			goto out;
-	}
+
+	ret = skb_ensure_writable(skb, offset + sizeof(__le32));
+	if (ret)
+		goto out;
+
 	crc32c_csum = cpu_to_le32(~__skb_checksum(skb, start,
 						  skb->len - start, ~(__u32)0,
 						  crc32c_csum_stub));
-- 
2.22.0


^ permalink raw reply related

* [PATCH v2 bpf] libbpf: silence GCC8 warning about string truncation
From: Andrii Nakryiko @ 2019-07-24 21:47 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel
  Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko, Magnus Karlsson

Despite a proper NULL-termination after strncpy(..., ..., IFNAMSIZ - 1),
GCC8 still complains about *expected* string truncation:

  xsk.c:330:2: error: 'strncpy' output may be truncated copying 15 bytes
  from a string of length 15 [-Werror=stringop-truncation]
    strncpy(ifr.ifr_name, xsk->ifname, IFNAMSIZ - 1);

This patch gets rid of the issue altogether by using memcpy instead.
There is no performance regression, as strncpy will still copy and fill
all of the bytes anyway.

v1->v2:
- rebase against bpf tree.

Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Song Liu <songliubraving@fb.com>
---
 tools/lib/bpf/xsk.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index e02025bbe36d..680e63066cf3 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -326,7 +326,7 @@ static int xsk_get_max_queues(struct xsk_socket *xsk)
 		return -errno;
 
 	ifr.ifr_data = (void *)&channels;
-	strncpy(ifr.ifr_name, xsk->ifname, IFNAMSIZ - 1);
+	memcpy(ifr.ifr_name, xsk->ifname, IFNAMSIZ - 1);
 	ifr.ifr_name[IFNAMSIZ - 1] = '\0';
 	err = ioctl(fd, SIOCETHTOOL, &ifr);
 	if (err && errno != EOPNOTSUPP) {
@@ -516,7 +516,7 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
 		err = -errno;
 		goto out_socket;
 	}
-	strncpy(xsk->ifname, ifname, IFNAMSIZ - 1);
+	memcpy(xsk->ifname, ifname, IFNAMSIZ - 1);
 	xsk->ifname[IFNAMSIZ - 1] = '\0';
 
 	err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
-- 
2.17.1


^ permalink raw reply related

* Re: [RFC PATCH net-next 0/3] net: batched receive in GRO path
From: Edward Cree @ 2019-07-24 21:49 UTC (permalink / raw)
  To: Eric Dumazet, Paolo Abeni, David Miller; +Cc: netdev
In-Reply-To: <d7ca6e7a-b80e-12e8-9050-c25b8b92bf26@gmail.com>

On 12/07/2019 17:48, Eric Dumazet wrote:
>> but the rest is the stuff we're polling for for low latency.
>> I'm putting a gro_normal_list() call after the trace_napi_poll() in
>>  napi_busy_loop() and testing that, let's see how it goes...
One thing that's causing me some uncertainty: busy_poll_stop() does a
 napi->poll(), which can potentially gro_normal_one() something.  But
 when I tried to put a gro_normal_list() just after that, I ran into
 list corruption because it could race against the one in
 napi_complete_done().  I'm not entirely sure how, my current theory
 goes something like:
- clear_bit(IN_BUSY_POLL)
- task switch, start napi poll
- get as far as starting gro_normal_list()
- task switch back to busy_poll_stop()
- local_bh_disable()
- do a napi poll
- start gro_normal_list()
- list corruption ensues as we have two instances of
  netif_receive_skb_list_internal() trying to consume the same list
But I may be wildly mistaken.
Questions that arise from that:
1) Is it safe to potentially be adding to the rx_list (gro_normal_one(),
   which in theory can end up calling gro_normal_list() as well) within
   busy_poll_stop()?  I haven't ever seen a splat from that, but it seems
   every bit as possible as what I have been seeing.
2) Why does busy_poll_stop() not do its local_bh_disable() *before*
   clearing napi state bits, which (if I'm understanding correctly) would
   ensure an ordinary napi poll can't race with the one in busy_poll_stop()?

Apart from that I have indeed established that with the patches as posted
 busy-polling latency is awful, but adding a gro_normal_list() into
 napi_busy_loop() fixes that, as expected.

-Ed

^ permalink raw reply

* Re: [PATCH net-next] selftests: mlxsw: Fix typo in qos_mc_aware.sh
From: David Miller @ 2019-07-24 21:51 UTC (permalink / raw)
  To: standby24x7
  Cc: shuah, linux-kernel, jiri, idosch, linux-kselftest, rdunlap,
	netdev
In-Reply-To: <20190724152951.4618-1-standby24x7@gmail.com>

From: Masanari Iida <standby24x7@gmail.com>
Date: Thu, 25 Jul 2019 00:29:51 +0900

> This patch fix some spelling typo in qos_mc_aware.sh
> 
> Signed-off-by: Masanari Iida <standby24x7@gmail.com>
> Acked-by: Randy Dunlap <rdunlap@infradead.org>

Applied.

^ permalink raw reply

* Re: [PATCH bpf-next 5/6] selftests/bpf: test BPF_MAP_DUMP command on a bpf hashmap
From: Song Liu @ 2019-07-24 21:58 UTC (permalink / raw)
  To: Brian Vazquez
  Cc: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann,
	David S . Miller, Stanislav Fomichev, Willem de Bruijn,
	Petar Penkov, open list, Networking, bpf
In-Reply-To: <20190724165803.87470-6-brianvv@google.com>

On Wed, Jul 24, 2019 at 10:10 AM Brian Vazquez <brianvv@google.com> wrote:
>
> This tests exercise the new command on a bpf hashmap and make sure it
> works as expected.
>
> Signed-off-by: Brian Vazquez <brianvv@google.com>
> ---
>  tools/testing/selftests/bpf/test_maps.c | 83 ++++++++++++++++++++++++-
>  1 file changed, 81 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
> index 5443b9bd75ed7..f7ab401399d40 100644
> --- a/tools/testing/selftests/bpf/test_maps.c
> +++ b/tools/testing/selftests/bpf/test_maps.c
> @@ -309,6 +309,86 @@ static void test_hashmap_walk(unsigned int task, void *data)
>         close(fd);
>  }
>
> +static void test_hashmap_dump(void)
> +{
> +       int fd, i, max_entries = 5;

5 is too small for map_dump.

> +       uint64_t keys[max_entries], values[max_entries];
> +       uint64_t key, value, next_key, prev_key;
> +       bool next_key_valid = true;
> +       void *buf, *elem;
> +       u32 buf_len;
> +       const int elem_size = sizeof(key) + sizeof(value);
> +
> +       fd = helper_fill_hashmap(max_entries);
> +
> +       // Get the elements in the hashmap, and store them in that order

Please use /* */ instead of //.

> +       assert(bpf_map_get_next_key(fd, NULL, &key) == 0);
> +       i = 0;
> +       keys[i] = key;
> +       for (i = 1; next_key_valid; i++) {
> +               next_key_valid = bpf_map_get_next_key(fd, &key, &next_key) == 0;
> +               assert(bpf_map_lookup_elem(fd, &key, &values[i - 1]) == 0);
> +               keys[i-1] = key;
> +               key = next_key;
> +       }
> +
> +       // Alloc memory for the whole table
> +       buf = malloc(elem_size * max_entries);
> +       assert(buf != NULL);
> +
> +       // Check that buf_len < elem_size returns EINVAL
> +       buf_len = elem_size-1;
> +       errno = 0;
> +       assert(bpf_map_dump(fd, NULL, buf, &buf_len) == -1 && errno == EINVAL);
> +
> +       // Check that it returns the first two elements
> +       errno = 0;
> +       buf_len = elem_size * 2;
> +       i = 0;
> +       assert(bpf_map_dump(fd, NULL, buf, &buf_len) == 0 &&
> +              buf_len == 2*elem_size);
> +       elem = buf;
> +       assert((*(uint64_t *)elem) == keys[i] &&
> +              (*(uint64_t *)(elem + sizeof(key))) == values[i]);
> +       elem = buf + elem_size;
> +       i++;
> +       assert((*(uint64_t *)elem) == keys[i] &&
> +              (*(uint64_t *)(elem + sizeof(key))) == values[i]);
> +       i++;
> +
> +       /* Check that prev_key contains key from last_elem retrieved in previous
> +        * call
> +        */
> +       prev_key = *((uint64_t *)elem);
> +       assert(bpf_map_dump(fd, &prev_key, buf, &buf_len) == 0 &&
> +              buf_len == elem_size*2);
> +       elem = buf;
> +       assert((*(uint64_t *)elem) == keys[i] &&
> +              (*(uint64_t *)(elem + sizeof(key))) == values[i]);
> +       elem = buf + elem_size;
> +       i++;
> +       assert((*(uint64_t *)elem) == keys[i] &&
> +              (*(uint64_t *)(elem + sizeof(key))) == values[i]);
> +       i++;
> +       assert(prev_key == (*(uint64_t *)elem));
> +
> +       /* Continue reading from map and verify buf_len only contains 1 element
> +        * even though buf_len is 2 elem_size and it returns err = 0.
> +        */
> +       assert(bpf_map_dump(fd, &prev_key, buf, &buf_len) == 0 &&
> +              buf_len == elem_size);
> +       elem = buf;
> +       assert((*(uint64_t *)elem) == keys[i] &&
> +              (*(uint64_t *)(elem + sizeof(key))) == values[i]);
> +
> +       // Verify there's no more entries and err = ENOENT
> +       assert(bpf_map_dump(fd, &prev_key, buf, &buf_len) == -1 &&
> +              errno == ENOENT);
> +
> +       free(buf);
> +       close(fd);
> +}
> +
>  static void test_hashmap_zero_seed(void)
>  {
>         int i, first, second, old_flags;
> @@ -1677,6 +1757,7 @@ static void run_all_tests(void)
>         test_hashmap_percpu(0, NULL);
>         test_hashmap_walk(0, NULL);
>         test_hashmap_zero_seed();
> +       test_hashmap_dump();
>
>         test_arraymap(0, NULL);
>         test_arraymap_percpu(0, NULL);
> @@ -1714,11 +1795,9 @@ int main(void)
>
>         map_flags = BPF_F_NO_PREALLOC;
>         run_all_tests();
> -
>  #define CALL
>  #include <map_tests/tests.h>
>  #undef CALL
> -
>         printf("test_maps: OK, %d SKIPPED\n", skips);
>         return 0;
>  }
> --
> 2.22.0.657.g960e92d24f-goog
>

^ permalink raw reply

* Re: [PATCH net-next] selftests: mlxsw: Fix typo in qos_mc_aware.sh
From: shuah @ 2019-07-24 22:00 UTC (permalink / raw)
  To: David Miller, standby24x7
  Cc: linux-kernel, jiri, idosch, linux-kselftest, rdunlap, netdev,
	shuah
In-Reply-To: <20190724.145123.912916059374852633.davem@davemloft.net>

On 7/24/19 3:51 PM, David Miller wrote:
> From: Masanari Iida <standby24x7@gmail.com>
> Date: Thu, 25 Jul 2019 00:29:51 +0900
> 
>> This patch fix some spelling typo in qos_mc_aware.sh
>>
>> Signed-off-by: Masanari Iida <standby24x7@gmail.com>
>> Acked-by: Randy Dunlap <rdunlap@infradead.org>
> 
> Applied.
> 

I applied to this my fixes branch this morning on auto-pilot
without realizing that it is in your domain :)

Would you like like me to drop it from mine?

thanks,
-- Shuah

^ permalink raw reply

* Re: [PATCH bpf-next 6/6] selftests/bpf: add test to measure performance of BPF_MAP_DUMP
From: Song Liu @ 2019-07-24 22:01 UTC (permalink / raw)
  To: Brian Vazquez
  Cc: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann,
	David S . Miller, Stanislav Fomichev, Willem de Bruijn,
	Petar Penkov, open list, Networking, bpf
In-Reply-To: <20190724165803.87470-7-brianvv@google.com>

On Wed, Jul 24, 2019 at 10:10 AM Brian Vazquez <brianvv@google.com> wrote:
>
> This tests compares the amount of time that takes to read an entire
> table of 100K elements on a bpf hashmap using both BPF_MAP_DUMP and
> BPF_MAP_GET_NEXT_KEY + BPF_MAP_LOOKUP_ELEM.
>
> Signed-off-by: Brian Vazquez <brianvv@google.com>
> ---
>  tools/testing/selftests/bpf/test_maps.c | 65 +++++++++++++++++++++++++
>  1 file changed, 65 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
> index f7ab401399d40..c4593a8904ca6 100644
> --- a/tools/testing/selftests/bpf/test_maps.c
> +++ b/tools/testing/selftests/bpf/test_maps.c
> @@ -18,6 +18,7 @@
>  #include <sys/socket.h>
>  #include <netinet/in.h>
>  #include <linux/bpf.h>
> +#include <linux/time64.h>
>
>  #include <bpf/bpf.h>
>  #include <bpf/libbpf.h>
> @@ -389,6 +390,69 @@ static void test_hashmap_dump(void)
>         close(fd);
>  }
>
> +static void test_hashmap_dump_perf(void)
> +{
> +       int fd, i, max_entries = 100000;
> +       uint64_t key, value, next_key;
> +       bool next_key_valid = true;
> +       void *buf;
> +       u32 buf_len, entries;
> +       int j = 0;
> +       int clk_id = CLOCK_MONOTONIC;
> +       struct timespec begin, end;
> +       long long time_spent, dump_time_spent;
> +       double res;
> +       int tests[] = {1, 2, 230, 5000, 73000, 100000, 234567};
> +       int test_len = ARRAY_SIZE(tests);
> +       const int elem_size = sizeof(key) + sizeof(value);
> +
> +       fd = helper_fill_hashmap(max_entries);
> +       // Alloc memory considering the largest buffer
> +       buf = malloc(elem_size * tests[test_len-1]);
> +       assert(buf != NULL);
> +
> +test:
> +       entries = tests[j];
> +       buf_len = elem_size*tests[j];
> +       j++;
> +       clock_gettime(clk_id, &begin);
> +       errno = 0;
> +       i = 0;
> +       while (errno == 0) {
> +               bpf_map_dump(fd, !i ? NULL : &key,
> +                                 buf, &buf_len);
> +               if (errno)
> +                       break;
> +               if (!i)
> +                       key = *((uint64_t *)(buf + buf_len - elem_size));
> +               i += buf_len / elem_size;
> +       }
> +       clock_gettime(clk_id, &end);
> +       assert(i  == max_entries);
> +       dump_time_spent = NSEC_PER_SEC * (end.tv_sec - begin.tv_sec) +
> +                         end.tv_nsec - begin.tv_nsec;
> +       next_key_valid = true;
> +       clock_gettime(clk_id, &begin);
> +       assert(bpf_map_get_next_key(fd, NULL, &key) == 0);
> +       for (i = 0; next_key_valid; i++) {
> +               next_key_valid = bpf_map_get_next_key(fd, &key, &next_key) == 0;
> +               assert(bpf_map_lookup_elem(fd, &key, &value) == 0);
> +               key = next_key;
> +       }
> +       clock_gettime(clk_id, &end);
> +       time_spent = NSEC_PER_SEC * (end.tv_sec - begin.tv_sec) +
> +                    end.tv_nsec - begin.tv_nsec;
> +       res = (1-((double)dump_time_spent/time_spent))*100;
> +       printf("buf_len_%u:\t %llu entry-by-entry: %llu improvement %lf\n",
> +              entries, dump_time_spent, time_spent, res);
> +       assert(i  == max_entries);
              ^ extra space after i.

> +
> +       if (j < test_len)
> +               goto test;
> +       free(buf);
> +       close(fd);
> +}
> +
>  static void test_hashmap_zero_seed(void)
>  {
>         int i, first, second, old_flags;
> @@ -1758,6 +1822,7 @@ static void run_all_tests(void)
>         test_hashmap_walk(0, NULL);
>         test_hashmap_zero_seed();
>         test_hashmap_dump();
> +       test_hashmap_dump_perf();
>
>         test_arraymap(0, NULL);
>         test_arraymap_percpu(0, NULL);
> --
> 2.22.0.657.g960e92d24f-goog
>

^ permalink raw reply

* Re: [PATCH -next v2] net/ixgbevf: fix a compilation error of skb_frag_t
From: Jeff Kirsher @ 2019-07-24 22:02 UTC (permalink / raw)
  To: David Miller; +Cc: cai, netdev, linux-kernel
In-Reply-To: <20190724.144143.2055459359936675888.davem@davemloft.net>

[-- Attachment #1: Type: text/plain, Size: 482 bytes --]

On Wed, 2019-07-24 at 14:41 -0700, David Miller wrote:
> From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Date: Wed, 24 Jul 2019 09:39:26 -0700
> 
> > Dave I will pick this up and add it to my queue.
> 
> How soon will you get that to me?  The sooner this build fix is cured
> the
> better.

Go ahead and pick it up, I was able to get it through an initial round
of testing with no issues.  No need to wait for me to re-send it, I
will go ahead and ACK Qian's patch.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH -next] net/ixgbevf: fix a compilation error of skb_frag_t
From: Jeff Kirsher @ 2019-07-24 22:02 UTC (permalink / raw)
  To: Qian Cai, willy; +Cc: davem, netdev, linux-kernel
In-Reply-To: <1563975157-30691-1-git-send-email-cai@lca.pw>

[-- Attachment #1: Type: text/plain, Size: 1535 bytes --]

On Wed, 2019-07-24 at 09:32 -0400, Qian Cai wrote:
> The linux-next commit "net: Rename skb_frag_t size to bv_len" [1]
> introduced a compilation error on powerpc as it forgot to rename
> "size"
> to "bv_len" for ixgbevf.
> 
> [1] 
> https://lore.kernel.org/netdev/20190723030831.11879-1-willy@infradead.org/T/#md052f1c7de965ccd1bdcb6f92e1990a52298eac5
> 
> In file included from ./include/linux/cache.h:5,
>                  from ./include/linux/printk.h:9,
>                  from ./include/linux/kernel.h:15,
>                  from ./include/linux/list.h:9,
>                  from ./include/linux/module.h:9,
>                  from
> drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:12:
> drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c: In function
> 'ixgbevf_xmit_frame_ring':
> drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:4138:51: error:
> 'skb_frag_t' {aka 'struct bio_vec'} has no member named 'size'
>    count += TXD_USE_COUNT(skb_shinfo(skb)->frags[f].size);
>                                                    ^
> ./include/uapi/linux/kernel.h:13:40: note: in definition of macro
> '__KERNEL_DIV_ROUND_UP'
>  #define __KERNEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
>                                         ^
> drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:4138:12: note: in
> expansion of macro 'TXD_USE_COUNT'
>    count += TXD_USE_COUNT(skb_shinfo(skb)->frags[f].size);
> 
> Signed-off-by: Qian Cai <cai@lca.pw>

Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox