* [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23
@ 2016-02-24  4:26 Jeff Kirsher
  2016-02-24  4:26 ` [net-next 01/20] e1000e: Increase ULP timer Jeff Kirsher
                   ` (19 more replies)
  0 siblings, 20 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann, jogreene, john.ronciak

This series contains updates to e1000e, igb and igbvf.

Raanan provides updates for e1000e.  The first increases the ULP timer,
since ULP exit now takes longer to complete on Skylake.  The next fixes
the configuration of the internal hardware PHY clock gating mechanism,
whose misconfiguration was causing packet loss.  He also fixes additional
ULP configuration settings which were not being properly cleared after
cable connect in V-Pro capable systems, and adds support for more i219
devices.

Gangfeng Huang provides a couple of patches for igb to add support for the
QAV transmit mode and character device support for AVB, which is supported
in I210 devices.  The character device can be used for developing Audio
and Video Bridging or Industrial Ethernet (whatever that is) applications.

Julia Lawall then provides three fixes to the code added by Gangfeng Huang
to add character device support for AVB.  Guess I could have just fixed up
Gangfeng's patches the first time, but then Julia would not get credit for
fixing his code. :-)

Takuma Ueba provides a fix for I210 where the IPv6 autoconf test sometimes
fails because the DAD NS for the link-local address is not transmitted.  To
avoid this issue, we need to wait until the 1000BASE-T status register
reports "Remote receiver status OK".

Todd provides a patch to override EEPROM WoL settings for specific OEM
devices.  He then renames an igb define to be more generic, since the define
E1000_MRQC_ENABLE_RSS_4Q enables 4 and 8 queues depending on the part.

Roland Hii fixes an issue where the I210 clock output function was used
only when the half cycle time was less than or equal to 70 milliseconds.
His patch adds conditions so that a half cycle time of exactly 125, 250 or
500 milliseconds also uses the clock output function.
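
Roland's conditions can be read as a simple predicate.  The following is a
minimal user-space sketch, not the driver code: the function name, the
nanosecond unit and the positive-duration precondition are assumptions made
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000LL

/* Hypothetical helper: true when a half cycle time (in nanoseconds) meets
 * the conditions described above for using the clock output function.
 */
bool half_cycle_uses_clock_output(int64_t half_cycle_ns)
{
	/* Assumed precondition: callers pass a positive duration. */
	assert(half_cycle_ns > 0);

	/* Original condition: anything up to and including 70 msec. */
	if (half_cycle_ns <= 70 * NSEC_PER_MSEC)
		return true;

	/* Added conditions: exactly 125, 250 or 500 msec. */
	return half_cycle_ns == 125 * NSEC_PER_MSEC ||
	       half_cycle_ns == 250 * NSEC_PER_MSEC ||
	       half_cycle_ns == 500 * NSEC_PER_MSEC;
}
```

Under this reading a 100 msec half cycle still does not qualify, while a
125 msec one does.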

Alex Duyck adds support for generic transmit checksums for igb and igbvf.

Jon Maxwell fixes an issue where customer applications that register and
un-register multicast addresses every few seconds cause many "Link is up"
messages in the logs, as a result of the netif_carrier_off(netdev) in
igbvf_msix_other().  His patch removes the "Link is up" message when
registering multicast addresses.

Corinna Vinschen provides a fix for the VLAN interface becoming unusable
when VLAN offloading is switched off on i350.

Stefan Assmann updates the driver to use ndo_stop() instead of
dev_close() when running the ethtool offline self test, since dev_close()
clears IFF_UP, which removes the interface's routes and some addresses.

The following are changes since commit a30a9ea6e21b495372aff549f3dfd63198bd1f45:
  rocker: fix rocker_world_port_obj_vlan_add()
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 1GbE

Alexander Duyck (2):
  igb: Add support for generic Tx checksums
  igbvf: Add support for generic Tx checksums

Corinna Vinschen (1):
  igb: Fix VLAN tag stripping on Intel i350

Gangfeng Huang (2):
  igb: add function to set I210 transmit mode
  igb: add a character device to support AVB

Jon Maxwell (1):
  igbvf: remove "link is Up" message when registering mcast address

Julia Lawall (4):
  igb: fix compare_const_fl.cocci warnings
  igb: fix itnull.cocci warnings
  igb: fix semicolon.cocci warnings
  igb: constify e1000_phy_operations structure

Raanan Avargil (5):
  e1000e: Increase ULP timer
  e1000e: Increase PHY PLL clock gate timing
  e1000e: Set HW FIFO minimum pointer gap for non-gig speeds
  e1000e: Clear ULP configuration register on ULP exit
  e1000e: Initial support for KabeLake

Roland Hii (1):
  igb: add conditions for I210 to generate periodic clock output

Stefan Assmann (1):
  igb: call ndo_stop() instead of dev_close() when running offline
    selftest

Takuma Ueba (1):
  igb: When GbE link up, wait for Remote receiver status condition

Todd Fujinaka (2):
  igb: enable WoL for OEM devices regardless of EEPROM setting
  igb: rename igb define to be more generic

 drivers/net/ethernet/intel/e1000e/hw.h         |   4 +
 drivers/net/ethernet/intel/e1000e/ich8lan.c    |  30 +-
 drivers/net/ethernet/intel/e1000e/ich8lan.h    |   7 +
 drivers/net/ethernet/intel/e1000e/netdev.c     |   4 +
 drivers/net/ethernet/intel/igb/Makefile        |   2 +-
 drivers/net/ethernet/intel/igb/e1000_82575.c   |   2 +-
 drivers/net/ethernet/intel/igb/e1000_82575.h   |   4 +-
 drivers/net/ethernet/intel/igb/e1000_defines.h |  22 ++
 drivers/net/ethernet/intel/igb/e1000_hw.h      |   2 +-
 drivers/net/ethernet/intel/igb/e1000_regs.h    |   7 +
 drivers/net/ethernet/intel/igb/igb.h           |  21 +-
 drivers/net/ethernet/intel/igb/igb_cdev.c      | 510 +++++++++++++++++++++++++
 drivers/net/ethernet/intel/igb/igb_cdev.h      |  45 +++
 drivers/net/ethernet/intel/igb/igb_ethtool.c   |   4 +-
 drivers/net/ethernet/intel/igb/igb_main.c      | 484 ++++++++++++++++++-----
 drivers/net/ethernet/intel/igb/igb_ptp.c       |   3 +-
 drivers/net/ethernet/intel/igbvf/netdev.c      | 143 ++++---
 drivers/net/ethernet/intel/igbvf/vf.h          |   1 +
 18 files changed, 1136 insertions(+), 159 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/igb/igb_cdev.c
 create mode 100644 drivers/net/ethernet/intel/igb/igb_cdev.h

-- 
2.5.0

* [net-next 01/20] e1000e: Increase ULP timer
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 02/20] e1000e: Increase PHY PLL clock gate timing Jeff Kirsher
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC
  To: davem; +Cc: Raanan Avargil, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Raanan Avargil <raanan.avargil@intel.com>

Due to system level changes introduced in Skylake, ULP exit takes
significantly longer to occur.  Therefore, the driver must wait longer for
it to complete.

Signed-off-by: Raanan Avargil <raanan.avargil@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
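
The change is a bounded polling loop whose budget grows from 10 to 30
iterations.  Below is a rough user-space sketch of that pattern, not the
driver code.  The 10 msec interval is inferred from "300msec" over 30 polls,
and poll_ops, fake_fw and simulate_ulp_exit() are hypothetical names used
only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POLL_INTERVAL_MS 10	/* assumed: 300 msec budget / 30 polls */

/* Hypothetical stand-ins for the FWSM register read and the delay call. */
struct poll_ops {
	bool (*cfg_done_set)(void *ctx);
	void (*sleep_ms)(void *ctx, unsigned int ms);
	void *ctx;
};

/* Returns 0 once the flag clears, -1 if it is still set after max_polls
 * delays.  Mirrors the "if (i++ == N)" shape of the loop in the hunk.
 */
int wait_cfg_done_clear(const struct poll_ops *ops, int max_polls)
{
	int i = 0;

	assert(ops != NULL && max_polls > 0);
	while (ops->cfg_done_set(ops->ctx)) {
		if (i++ == max_polls)
			return -1;
		ops->sleep_ms(ops->ctx, POLL_INTERVAL_MS);
	}
	return 0;
}

/* Fake firmware: the flag reads as set for a fixed number of reads. */
struct fake_fw {
	int reads_until_clear;
	int slept_ms;
};

static bool fake_cfg_done_set(void *ctx)
{
	struct fake_fw *fw = ctx;

	if (fw->reads_until_clear > 0) {
		fw->reads_until_clear--;
		return true;
	}
	return false;
}

static void fake_sleep_ms(void *ctx, unsigned int ms)
{
	struct fake_fw *fw = ctx;

	fw->slept_ms += (int)ms;
}

/* Returns the simulated wait in msec, or -1 on timeout. */
int simulate_ulp_exit(int reads_until_clear, int max_polls)
{
	struct fake_fw fw = { reads_until_clear, 0 };
	struct poll_ops ops = { fake_cfg_done_set, fake_sleep_ms, &fw };

	if (wait_cfg_done_clear(&ops, max_polls))
		return -1;
	return fw.slept_ms;
}
```

In this model a firmware that needs 20 polls to clear the flag times out
with the old limit of 10 but completes with the new limit of 30.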
 drivers/net/ethernet/intel/e1000e/ich8lan.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c b/drivers/net/ethernet/intel/e1000e/ich8lan.c
index a049e30..c731465 100644
--- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
+++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
@@ -1252,9 +1252,9 @@ static s32 e1000_disable_ulp_lpt_lp(struct e1000_hw *hw, bool force)
 			ew32(H2ME, mac_reg);
 		}
 
-		/* Poll up to 100msec for ME to clear ULP_CFG_DONE */
+		/* Poll up to 300msec for ME to clear ULP_CFG_DONE. */
 		while (er32(FWSM) & E1000_FWSM_ULP_CFG_DONE) {
-			if (i++ == 10) {
+			if (i++ == 30) {
 				ret_val = -E1000_ERR_PHY;
 				goto out;
 			}
-- 
2.5.0


* [net-next 02/20] e1000e: Increase PHY PLL clock gate timing
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
  2016-02-24  4:26 ` [net-next 01/20] e1000e: Increase ULP timer Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 03/20] e1000e: Set HW FIFO minimum pointer gap for non-gig speeds Jeff Kirsher
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC
  To: davem; +Cc: Raanan Avargil, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Raanan Avargil <raanan.avargil@intel.com>

Several packet loss issues were reported whose root cause was an incorrect
configuration of the internal HW PHY clock gating mechanism by SW.
This patch provides the correct mechanism.

Signed-off-by: Raanan Avargil <raanan.avargil@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
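
The register update in this patch is a read-modify-write of an 11-bit field
chosen by link speed.  A small user-space sketch of just the value
computation follows.  The mask and the two field values are taken from the
hunks below, while the enum and the function name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PLL_CLOCK_GATE_MASK 0x07FF	/* I217_PLL_CLOCK_GATE_MASK */

/* Hypothetical stand-in for the SPEED_* constants used by the driver. */
enum link_speed { LINK_10 = 10, LINK_100 = 100, LINK_1000 = 1000 };

/* Compute the new register value from the value that was read back. */
uint16_t pll_clock_gate_update(uint16_t phy_reg, enum link_speed speed)
{
	assert(speed == LINK_10 || speed == LINK_100 || speed == LINK_1000);

	/* Clear the old field, keeping the upper bits untouched. */
	phy_reg &= (uint16_t)~PLL_CLOCK_GATE_MASK;

	/* Non-gig speeds get 0x3E8 (1000), gig speed gets 0xFA (250). */
	if (speed == LINK_100 || speed == LINK_10)
		phy_reg |= 0x3E8;
	else
		phy_reg |= 0xFA;

	return phy_reg;
}
```

Clearing the field first matters: without the mask, stale bits from a
previous speed would be OR-ed together with the new value.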
 drivers/net/ethernet/intel/e1000e/ich8lan.c | 12 ++++++++++++
 drivers/net/ethernet/intel/e1000e/ich8lan.h |  3 +++
 2 files changed, 15 insertions(+)

diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c b/drivers/net/ethernet/intel/e1000e/ich8lan.c
index c731465..786d214 100644
--- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
+++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
@@ -1433,6 +1433,18 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
 			emi_addr = I217_RX_CONFIG;
 		ret_val = e1000_write_emi_reg_locked(hw, emi_addr, emi_val);
 
+		if (hw->mac.type == e1000_pch_lpt ||
+		    hw->mac.type == e1000_pch_spt) {
+			u16 phy_reg;
+
+			e1e_rphy_locked(hw, I217_PLL_CLOCK_GATE_REG, &phy_reg);
+			phy_reg &= ~I217_PLL_CLOCK_GATE_MASK;
+			if (speed == SPEED_100 || speed == SPEED_10)
+				phy_reg |= 0x3E8;
+			else
+				phy_reg |= 0xFA;
+			e1e_wphy_locked(hw, I217_PLL_CLOCK_GATE_REG, phy_reg);
+		}
 		hw->phy.ops.release(hw);
 
 		if (ret_val)
diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.h b/drivers/net/ethernet/intel/e1000e/ich8lan.h
index 34c551e..7d85f00 100644
--- a/drivers/net/ethernet/intel/e1000e/ich8lan.h
+++ b/drivers/net/ethernet/intel/e1000e/ich8lan.h
@@ -226,6 +226,9 @@
 #define HV_PM_CTRL_PLL_STOP_IN_K1_GIGA	0x100
 #define HV_PM_CTRL_K1_ENABLE		0x4000
 
+#define I217_PLL_CLOCK_GATE_REG	PHY_REG(772, 28)
+#define I217_PLL_CLOCK_GATE_MASK	0x07FF
+
 #define SW_FLAG_TIMEOUT		1000	/* SW Semaphore flag timeout in ms */
 
 /* Inband Control */
-- 
2.5.0


* [net-next 03/20] e1000e: Set HW FIFO minimum pointer gap for non-gig speeds
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
  2016-02-24  4:26 ` [net-next 01/20] e1000e: Increase ULP timer Jeff Kirsher
  2016-02-24  4:26 ` [net-next 02/20] e1000e: Increase PHY PLL clock gate timing Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 04/20] e1000e: Clear ULP configuration register on ULP exit Jeff Kirsher
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC
  To: davem; +Cc: Raanan Avargil, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Raanan Avargil <raanan.avargil@intel.com>

Based on feedback from the HW team, the configured value of the internal
PHY HW FIFO pointer gap was incorrect for non-gig speeds.
This patch provides the correct configuration.

Signed-off-by: Raanan Avargil <raanan.avargil@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/ich8lan.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c b/drivers/net/ethernet/intel/e1000e/ich8lan.c
index 786d214..e7ccf5f 100644
--- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
+++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
@@ -1479,6 +1479,18 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
 				hw->phy.ops.release(hw);
 				if (ret_val)
 					return ret_val;
+			} else {
+				ret_val = hw->phy.ops.acquire(hw);
+				if (ret_val)
+					return ret_val;
+
+				ret_val = e1e_wphy_locked(hw,
+							  PHY_REG(776, 20),
+							  0xC023);
+				hw->phy.ops.release(hw);
+				if (ret_val)
+					return ret_val;
+
 			}
 		}
 	}
-- 
2.5.0


* [net-next 04/20] e1000e: Clear ULP configuration register on ULP exit
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
                   ` (2 preceding siblings ...)
  2016-02-24  4:26 ` [net-next 03/20] e1000e: Set HW FIFO minimum pointer gap for non-gig speeds Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 05/20] e1000e: Initial support for KabeLake Jeff Kirsher
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC
  To: davem; +Cc: Raanan Avargil, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Raanan Avargil <raanan.avargil@intel.com>

There have been bugs caused by HW ULP configuration settings not being
properly cleared after cable connect in V-Pro capable systems.
This caused HW to get out of sync occasionally.
The fix ensures that ULP settings are cleared in HW after
LAN cable re-connect.

Signed-off-by: Raanan Avargil <raanan.avargil@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
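
The effect of adding two bits to the clear mask can be shown with plain bit
arithmetic.  This is only a sketch: the bit values are copied from the
ich8lan.h hunk below, the masks cover just the flags visible in this patch
(the full list in the driver is longer), and the function name is
hypothetical.

```c
#include <stdint.h>

/* Bit values as defined in the ich8lan.h hunk of this patch. */
#define ULP_INBAND_EXIT			0x0020
#define ULP_WOL_HOST			0x0040
#define ULP_RESET_TO_SMBUS		0x0100
#define ULP_EN_ULP_LANPHYPC		0x0400	/* newly cleared */
#define ULP_DIS_CLR_STICKY_ON_PERST	0x0800	/* newly cleared */
#define ULP_DISABLE_SMB_PERST		0x1000

/* Only the flags visible in this patch, before and after the change. */
#define ULP_CLEAR_MASK_OLD (ULP_INBAND_EXIT | ULP_WOL_HOST | \
			    ULP_RESET_TO_SMBUS | ULP_DISABLE_SMB_PERST)
#define ULP_CLEAR_MASK_NEW (ULP_CLEAR_MASK_OLD | ULP_EN_ULP_LANPHYPC | \
			    ULP_DIS_CLR_STICKY_ON_PERST)

/* Hypothetical helper: clear the bits in mask from a register value. */
uint16_t ulp_config1_clear(uint16_t reg, uint16_t mask)
{
	return (uint16_t)(reg & ~mask);
}
```

With the old mask a register value of 0x0C00 is left unchanged, which is the
stale state the commit message describes.  With the new mask it becomes 0.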
 drivers/net/ethernet/intel/e1000e/ich8lan.c | 2 ++
 drivers/net/ethernet/intel/e1000e/ich8lan.h | 4 ++++
 2 files changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c b/drivers/net/ethernet/intel/e1000e/ich8lan.c
index e7ccf5f..c0f4887 100644
--- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
+++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
@@ -1328,6 +1328,8 @@ static s32 e1000_disable_ulp_lpt_lp(struct e1000_hw *hw, bool force)
 		     I218_ULP_CONFIG1_RESET_TO_SMBUS |
 		     I218_ULP_CONFIG1_WOL_HOST |
 		     I218_ULP_CONFIG1_INBAND_EXIT |
+		     I218_ULP_CONFIG1_EN_ULP_LANPHYPC |
+		     I218_ULP_CONFIG1_DIS_CLR_STICKY_ON_PERST |
 		     I218_ULP_CONFIG1_DISABLE_SMB_PERST);
 	e1000_write_phy_reg_hv_locked(hw, I218_ULP_CONFIG1, phy_reg);
 
diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.h b/drivers/net/ethernet/intel/e1000e/ich8lan.h
index 7d85f00..2311f60 100644
--- a/drivers/net/ethernet/intel/e1000e/ich8lan.h
+++ b/drivers/net/ethernet/intel/e1000e/ich8lan.h
@@ -188,6 +188,10 @@
 #define I218_ULP_CONFIG1_INBAND_EXIT	0x0020	/* Inband on ULP exit */
 #define I218_ULP_CONFIG1_WOL_HOST	0x0040	/* WoL Host on ULP exit */
 #define I218_ULP_CONFIG1_RESET_TO_SMBUS	0x0100	/* Reset to SMBus mode */
+/* enable ULP even if when phy powered down via lanphypc */
+#define I218_ULP_CONFIG1_EN_ULP_LANPHYPC	0x0400
+/* disable clear of sticky ULP on PERST */
+#define I218_ULP_CONFIG1_DIS_CLR_STICKY_ON_PERST	0x0800
 #define I218_ULP_CONFIG1_DISABLE_SMB_PERST	0x1000	/* Disable on PERST# */
 
 /* SMBus Address Phy Register */
-- 
2.5.0


* [net-next 05/20] e1000e: Initial support for KabeLake
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
                   ` (3 preceding siblings ...)
  2016-02-24  4:26 ` [net-next 04/20] e1000e: Clear ULP configuration register on ULP exit Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 06/20] igb: add function to set I210 transmit mode Jeff Kirsher
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC
  To: davem; +Cc: Raanan Avargil, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Raanan Avargil <raanan.avargil@intel.com>

i219 (4) and i219 (5) are the next LOM generations that will be
available on the next Intel platform (Kaby Lake).
This patch provides the initial support for the devices.

Signed-off-by: Raanan Avargil <raanan.avargil@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/hw.h     | 4 ++++
 drivers/net/ethernet/intel/e1000e/netdev.c | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/intel/e1000e/hw.h b/drivers/net/ethernet/intel/e1000e/hw.h
index b3949d5..4e733bf 100644
--- a/drivers/net/ethernet/intel/e1000e/hw.h
+++ b/drivers/net/ethernet/intel/e1000e/hw.h
@@ -92,6 +92,10 @@ struct e1000_hw;
 #define E1000_DEV_ID_PCH_SPT_I219_LM2		0x15B7	/* SPT-H PCH */
 #define E1000_DEV_ID_PCH_SPT_I219_V2		0x15B8	/* SPT-H PCH */
 #define E1000_DEV_ID_PCH_LBG_I219_LM3		0x15B9	/* LBG PCH */
+#define E1000_DEV_ID_PCH_SPT_I219_LM4		0x15D7
+#define E1000_DEV_ID_PCH_SPT_I219_V4		0x15D8
+#define E1000_DEV_ID_PCH_SPT_I219_LM5		0x15E3
+#define E1000_DEV_ID_PCH_SPT_I219_V5		0x15D6
 
 #define E1000_REVISION_4	4
 
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index c71ba1b..9b4ec13 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -7452,6 +7452,10 @@ static const struct pci_device_id e1000_pci_tbl[] = {
 	{ PCI_VDEVICE(INTEL, E1000_DEV_ID_PCH_SPT_I219_LM2), board_pch_spt },
 	{ PCI_VDEVICE(INTEL, E1000_DEV_ID_PCH_SPT_I219_V2), board_pch_spt },
 	{ PCI_VDEVICE(INTEL, E1000_DEV_ID_PCH_LBG_I219_LM3), board_pch_spt },
+	{ PCI_VDEVICE(INTEL, E1000_DEV_ID_PCH_SPT_I219_LM4), board_pch_spt },
+	{ PCI_VDEVICE(INTEL, E1000_DEV_ID_PCH_SPT_I219_V4), board_pch_spt },
+	{ PCI_VDEVICE(INTEL, E1000_DEV_ID_PCH_SPT_I219_LM5), board_pch_spt },
+	{ PCI_VDEVICE(INTEL, E1000_DEV_ID_PCH_SPT_I219_V5), board_pch_spt },
 
 	{ 0, 0, 0, 0, 0, 0, 0 }	/* terminate list */
 };
-- 
2.5.0


* [net-next 06/20] igb: add function to set I210 transmit mode
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
                   ` (4 preceding siblings ...)
  2016-02-24  4:26 ` [net-next 05/20] e1000e: Initial support for KabeLake Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 07/20] igb: add a character device to support AVB Jeff Kirsher
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC
  To: davem; +Cc: Gangfeng Huang, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Gangfeng Huang <gangfeng.huang@ni.com>

I210 supports two transmit modes, legacy and QAV. The transmit mode is
configured in the TQAVCTRL.QavMode field. Before this patch, the igb
driver only supported legacy mode. This patch makes it possible to
configure the transmit mode.

Example:
Get the transmit mode:
$ cat /sys/class/net/eth0/qav_mode
0
Set the transmit mode to QAV mode:
$ echo 1 > /sys/class/net/eth0/qav_mode

Tested:
Setting /sys/class/net/eth0/qav_mode to Qav mode,
 1) Switch back and forth between Qav mode and legacy mode
 2) Send/recv packets in both modes.

Signed-off-by: Gangfeng Huang <gangfeng.huang@ni.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
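
The register values that igb_setup_qav_mode() writes are built from small
shifted fields.  The sketch below reproduces only that arithmetic in user
space.  The shift values and constants come from the hunks in this patch,
while the function names, the 6-bit field width (inferred from the spacing
of the shifts) and the 64-byte DTXMXPKT unit (inferred from the division in
the code) are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Shift values copied from the E1000_TXPBSIZE_TXnPB_SHIFT defines. */
#define TX0PB_SHIFT	0
#define TX1PB_SHIFT	6
#define TX2PB_SHIFT	12
#define TX3PB_SHIFT	18

#define FETCH_TM_SHIFT	16	/* E1000_TQAVCTRL_FETCH_TM_SHIFT */

/* Pack four per-queue Tx packet buffer sizes into one register value. */
uint32_t pack_txpbsize(uint32_t q0, uint32_t q1, uint32_t q2, uint32_t q3)
{
	/* Assumed: each field is 6 bits wide, so values must be below 64. */
	assert(q0 < 64 && q1 < 64 && q2 < 64 && q3 < 64);

	return (q0 << TX0PB_SHIFT) | (q1 << TX1PB_SHIFT) |
	       (q2 << TX2PB_SHIFT) | (q3 << TX3PB_SHIFT);
}

/* Convert a frame size in bytes to the value written to DTXMXPKT. */
uint32_t dtxmxpkt_from_bytes(uint32_t bytes)
{
	return bytes / 64;
}

/* Build the prefetch delta field the way the patch does for 10 usec. */
uint32_t fetch_time_field(uint32_t usec)
{
	return (usec << 5) << FETCH_TM_SHIFT;
}
```

With the 8/8/4/4 split used in the patch the packed value is 0x104208, and
the 1536 byte QAV frame limit becomes a DTXMXPKT value of 24.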
 drivers/net/ethernet/intel/igb/e1000_defines.h |  21 +++
 drivers/net/ethernet/intel/igb/e1000_regs.h    |   7 +
 drivers/net/ethernet/intel/igb/igb.h           |   5 +
 drivers/net/ethernet/intel/igb/igb_main.c      | 182 ++++++++++++++++++++++++-
 4 files changed, 213 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h b/drivers/net/ethernet/intel/igb/e1000_defines.h
index e9f23ee..c8b10d2 100644
--- a/drivers/net/ethernet/intel/igb/e1000_defines.h
+++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
@@ -360,6 +360,7 @@
 #define MAX_JUMBO_FRAME_SIZE	0x2600
 
 /* PBA constants */
+#define E1000_PBA_32K 0x0020
 #define E1000_PBA_34K 0x0022
 #define E1000_PBA_64K 0x0040    /* 64KB */
 
@@ -1024,4 +1025,24 @@
 #define E1000_RTTBCNRC_RF_INT_MASK	\
 	(E1000_RTTBCNRC_RF_DEC_MASK << E1000_RTTBCNRC_RF_INT_SHIFT)
 
+/* Queue mode, 0=strict, 1=SR mode */
+#define E1000_TQAVCC_QUEUEMODE		0x80000000
+/* Transmit mode, 0=legacy, 1=QAV */
+#define E1000_TQAVCTRL_TXMODE		0x00000001
+/* Report DMA time of tx packets */
+#define E1000_TQAVCTRL_1588_STAT_EN	0x00000004
+#define E1000_TQAVCTRL_DATA_FETCH_ARB	0x00000010 /* Data fetch arbitration */
+#define E1000_TQAVCTRL_DATA_TRAN_ARB	0x00000100 /* Data tx arbitration */
+#define E1000_TQAVCTRL_DATA_TRAN_TIM	0x00000200 /* Data launch time valid */
+/* Stall SP to guarantee SR */
+#define E1000_TQAVCTRL_SP_WAIT_SR	0x00000400
+#define E1000_TQAVCTRL_FETCH_TM_SHIFT	(16)
+
+#define E1000_TXPBSIZE_TX0PB_SHIFT	0
+#define E1000_TXPBSIZE_TX1PB_SHIFT	6
+#define E1000_TXPBSIZE_TX2PB_SHIFT	12
+#define E1000_TXPBSIZE_TX3PB_SHIFT	18
+
+#define E1000_DTXMXPKTSZ_DEFAULT	0x00000098
+
 #endif
diff --git a/drivers/net/ethernet/intel/igb/e1000_regs.h b/drivers/net/ethernet/intel/igb/e1000_regs.h
index 21d9d02..bb5ed14 100644
--- a/drivers/net/ethernet/intel/igb/e1000_regs.h
+++ b/drivers/net/ethernet/intel/igb/e1000_regs.h
@@ -138,6 +138,12 @@
 #define E1000_FCRTC	0x02170 /* Flow Control Rx high watermark */
 #define E1000_PCIEMISC	0x05BB8 /* PCIE misc config register */
 
+/* High credit registers where _n can be 0 or 1. */
+#define E1000_TQAVHC(_n)	(0x300C + 0x40 * (_n))
+/* QAV Tx mode control registers where _n can be 0 or 1. */
+#define E1000_TQAVCC(_n)	(0x3004 + 0x40 * (_n))
+#define E1000_TQAVCTRL	0x3570 /* Tx Qav Control registers */
+
 /* TX Rate Limit Registers */
 #define E1000_RTTDQSEL	0x3604 /* Tx Desc Plane Queue Select - WO */
 #define E1000_RTTBCNRM	0x3690 /* Tx BCN Rate-scheduler MMW */
@@ -204,6 +210,7 @@
 #define E1000_TDFT     0x03418  /* TX Data FIFO Tail - RW */
 #define E1000_TDFHS    0x03420  /* TX Data FIFO Head Saved - RW */
 #define E1000_TDFPC    0x03430  /* TX Data FIFO Packet Count - RW */
+#define E1000_DTXMXPKT 0x0355C  /* DMA TX Maximum Packet Size */
 #define E1000_DTXCTL   0x03590  /* DMA TX Control - RW */
 #define E1000_CRCERRS  0x04000  /* CRC Error Count - R/clr */
 #define E1000_ALGNERRC 0x04004  /* Alignment Error Count - R/clr */
diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index 707ae5c..3ad5517 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -131,6 +131,9 @@ struct vf_data_storage {
 /* this is the size past which hardware will drop packets when setting LPE=0 */
 #define MAXIMUM_ETHERNET_VLAN_SIZE 1522
 
+/* In qav mode, the maximum frame size is 1536 */
+#define IGB_MAX_QAV_FRAME_SIZE	1536
+
 /* Supported Rx Buffer Sizes */
 #define IGB_RXBUFFER_256	256
 #define IGB_RXBUFFER_2048	2048
@@ -464,6 +467,8 @@ struct igb_adapter {
 	int copper_tries;
 	struct e1000_info ei;
 	u16 eee_advert;
+
+	bool qav_mode;
 };
 
 #define IGB_FLAG_HAS_MSI		(1 << 0)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index af46fcf..362d579 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -176,6 +176,17 @@ static int igb_ndo_get_vf_config(struct net_device *netdev, int vf,
 				 struct ifla_vf_info *ivi);
 static void igb_check_vf_rate_limit(struct igb_adapter *);
 
+/* Switch qav mode and legacy mode by sysfs*/
+static void igb_setup_qav_mode(struct igb_adapter *adapter);
+static void igb_setup_normal_mode(struct igb_adapter *adapter);
+static ssize_t igb_get_qav_mode(struct device *dev,
+				struct device_attribute *attr, char *buf);
+static ssize_t igb_set_qav_mode(struct device *dev,
+				struct device_attribute *attr,
+				const char *buf, size_t count);
+static DEVICE_ATTR(qav_mode, S_IRUGO | S_IWUSR,
+		   igb_get_qav_mode, igb_set_qav_mode);
+
 #ifdef CONFIG_PCI_IOV
 static int igb_vf_configure(struct igb_adapter *adapter, int vf);
 static int igb_pci_enable_sriov(struct pci_dev *dev, int num_vfs);
@@ -1606,6 +1617,11 @@ static void igb_configure(struct igb_adapter *adapter)
 
 	igb_restore_vlan(adapter);
 
+	if (adapter->qav_mode)
+		igb_setup_qav_mode(adapter);
+	else
+		igb_setup_normal_mode(adapter);
+
 	igb_setup_tctl(adapter);
 	igb_setup_mrqc(adapter);
 	igb_setup_rctl(adapter);
@@ -1883,8 +1899,10 @@ void igb_reset(struct igb_adapter *adapter)
 		pba = rd32(E1000_RXPBS);
 		pba &= E1000_RXPBS_SIZE_MASK_82576;
 		break;
-	case e1000_82575:
 	case e1000_i210:
+		pba = (adapter->qav_mode) ? E1000_PBA_32K : E1000_PBA_34K;
+		break;
+	case e1000_82575:
 	case e1000_i211:
 	default:
 		pba = E1000_PBA_34K;
@@ -2314,6 +2332,7 @@ static int igb_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	hw = &adapter->hw;
 	hw->back = adapter;
 	adapter->msg_enable = netif_msg_init(debug, DEFAULT_MSG_ENABLE);
+	adapter->qav_mode = false;
 
 	err = -EIO;
 	adapter->io_addr = pci_iomap(pdev, 0, 0);
@@ -2561,6 +2580,15 @@ static int igb_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (err)
 		goto err_register;
 
+	if (hw->mac.type == e1000_i210) {
+		err = sysfs_create_file(&netdev->dev.kobj,
+					&dev_attr_qav_mode.attr);
+		if (err) {
+			netdev_err(netdev, "error creating sysfs file\n");
+			goto err_register;
+		}
+	}
+
 	/* carrier off reporting is important to ethtool even BEFORE open */
 	netif_carrier_off(netdev);
 
@@ -2843,6 +2871,9 @@ static void igb_remove(struct pci_dev *pdev)
 	igb_disable_sriov(pdev);
 #endif
 
+	if (hw->mac.type == e1000_i210)
+		sysfs_remove_file(&netdev->dev.kobj, &dev_attr_qav_mode.attr);
+
 	unregister_netdev(netdev);
 
 	igb_clear_interrupt_scheme(adapter);
@@ -2927,7 +2958,12 @@ static void igb_init_queue_configuration(struct igb_adapter *adapter)
 		break;
 	}
 
-	adapter->rss_queues = min_t(u32, max_rss_queues, num_online_cpus());
+	/* For QAV mode, always enable all queues */
+	if (adapter->qav_mode)
+		adapter->rss_queues = max_rss_queues;
+	else
+		adapter->rss_queues = min_t(u32, max_rss_queues,
+					    num_online_cpus());
 
 	igb_set_flag_queue_pairs(adapter, max_rss_queues);
 }
@@ -5293,6 +5329,10 @@ static int igb_change_mtu(struct net_device *netdev, int new_mtu)
 		return -EINVAL;
 	}
 
+	/* For i210 Qav mode, the max frame is 1536 */
+	if (adapter->qav_mode && max_frame > IGB_MAX_QAV_FRAME_SIZE)
+		return -EINVAL;
+
 #define MAX_STD_JUMBO_FRAME_SIZE 9238
 	if (max_frame > MAX_STD_JUMBO_FRAME_SIZE) {
 		dev_err(&pdev->dev, "MTU > 9216 not supported.\n");
@@ -8192,4 +8232,142 @@ int igb_reinit_queues(struct igb_adapter *adapter)
 
 	return err;
 }
+
+static void igb_setup_qav_mode(struct igb_adapter *adapter)
+{
+	struct e1000_hw *hw = &adapter->hw;
+	u32	tqavctrl;
+	u32	tqavcc0, tqavcc1;
+	u32	tqavhc0, tqavhc1;
+	u32	txpbsize;
+
+	/* reconfigure the tx packet buffer allocation */
+	txpbsize = (8);
+	txpbsize |= (8) << E1000_TXPBSIZE_TX1PB_SHIFT;
+	txpbsize |= (4) << E1000_TXPBSIZE_TX2PB_SHIFT;
+	txpbsize |= (4) << E1000_TXPBSIZE_TX3PB_SHIFT;
+
+	wr32(E1000_TXPBS, txpbsize);
+
+	/* In Qav mode, the maximum sized frames of 1536 bytes */
+	wr32(E1000_DTXMXPKT, IGB_MAX_QAV_FRAME_SIZE / 64);
+
+	/* The I210 implements 4 queues, up to two queues are dedicated
+	 * for stream reservation or priority, strict priority queuing
+	 * while SR queue are subjected to launch time policy
+	 */
+
+	tqavcc0 = E1000_TQAVCC_QUEUEMODE; /* no idle slope */
+	tqavcc1 = E1000_TQAVCC_QUEUEMODE; /* no idle slope */
+	tqavhc0 = 0xFFFFFFFF; /* unlimited credits */
+	tqavhc1 = 0xFFFFFFFF; /* unlimited credits */
+
+	wr32(E1000_TQAVCC(0), tqavcc0);
+	wr32(E1000_TQAVCC(1), tqavcc1);
+	wr32(E1000_TQAVHC(0), tqavhc0);
+	wr32(E1000_TQAVHC(1), tqavhc1);
+
+	tqavctrl = E1000_TQAVCTRL_TXMODE |
+		   E1000_TQAVCTRL_DATA_FETCH_ARB |
+		   E1000_TQAVCTRL_DATA_TRAN_TIM |
+		   E1000_TQAVCTRL_SP_WAIT_SR;
+
+	/* Default to a 10 usec prefetch delta from launch time - time for
+	 * a 1500 byte rx frame to be received over the PCIe Gen1 x1 link.
+	 */
+	tqavctrl |= (10 << 5) << E1000_TQAVCTRL_FETCH_TM_SHIFT;
+
+	wr32(E1000_TQAVCTRL, tqavctrl);
+}
+
+static void igb_setup_normal_mode(struct igb_adapter *adapter)
+{
+	struct e1000_hw *hw = &adapter->hw;
+
+	wr32(E1000_TXPBS, I210_TXPBSIZE_DEFAULT);
+	wr32(E1000_DTXMXPKT, E1000_DTXMXPKTSZ_DEFAULT);
+	wr32(E1000_TQAVCTRL, 0);
+}
+
+static int igb_change_mode(struct igb_adapter *adapter, int request_mode)
+{
+	struct net_device *netdev;
+	int err = 0;
+	int current_mode;
+
+	if (NULL == adapter) {
+		dev_err(&adapter->pdev->dev, "map to unbound device!\n");
+		return -ENOENT;
+	}
+
+	current_mode = adapter->qav_mode;
+
+	if (request_mode == current_mode)
+		return 0;
+
+	netdev = adapter->netdev;
+
+	rtnl_lock();
+
+	if (netif_running(netdev))
+		igb_close(netdev);
+	else
+		igb_reset(adapter);
+
+	igb_clear_interrupt_scheme(adapter);
+
+	adapter->qav_mode = request_mode;
+
+	igb_init_queue_configuration(adapter);
+
+	if (igb_init_interrupt_scheme(adapter, true)) {
+		dev_err(&adapter->pdev->dev,
+			"Unable to allocate memory for queues\n");
+		err = -ENOMEM;
+		goto err_out;
+	}
+
+	if (netif_running(netdev))
+		igb_open(netdev);
+
+	rtnl_unlock();
+
+	return err;
+err_out:
+	rtnl_unlock();
+	return err;
+}
+
+static ssize_t igb_get_qav_mode(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct net_device *netdev = to_net_dev(dev);
+	struct igb_adapter *adapter = netdev_priv(netdev);
+
+	return scnprintf(buf, PAGE_SIZE, "%d\n", adapter->qav_mode);
+}
+
+static ssize_t igb_set_qav_mode(struct device *dev,
+				struct device_attribute *attr,
+				const char *buf, size_t len)
+{
+	struct net_device *netdev = to_net_dev(dev);
+	struct igb_adapter *adapter = netdev_priv(netdev);
+	int request_mode, err;
+
+	if (!capable(CAP_NET_ADMIN))
+		return -EPERM;
+
+	if (0 > kstrtoint(buf, 0, &request_mode))
+		return -EINVAL;
+
+	if (request_mode != 0 && request_mode != 1)
+		return -EINVAL;
+
+	err = igb_change_mode(adapter, request_mode);
+	if (err)
+		return err;
+
+	return len;
+}
 /* igb_main.c */
-- 
2.5.0


* [net-next 07/20] igb: add a character device to support AVB
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
                   ` (5 preceding siblings ...)
  2016-02-24  4:26 ` [net-next 06/20] igb: add function to set I210 transmit mode Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24 20:06   ` Or Gerlitz
  2016-02-24  4:26 ` [net-next 08/20] igb: fix compare_const_fl.cocci warnings Jeff Kirsher
                   ` (12 subsequent siblings)
  19 siblings, 1 reply; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC
  To: davem; +Cc: Gangfeng Huang, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Gangfeng Huang <gangfeng.huang@ni.com>

This patch creates a character device for the Intel I210 Ethernet
controller.  It can be used for developing Audio/Video Bridging
applications, Industrial Ethernet applications which require precise timing
control over frame transmission, or test harnesses for measuring system
latencies and sampling events.

As the AVB queues (0,1) are mapped to a user-space application, typical
LAN traffic must be steered away from these queues.  For transmit, this
driver registers an ndo_select_queue handler to map traffic to queue[3].
For receive, it sets the MRQC register to direct all BE (best-effort)
traffic to Rx queue[3].

This patch references the Intel Open-AVB project:
http://github.com/AVnu/Open-AVB/tree/master/kmod/igb

Signed-off-by: Gangfeng Huang <gangfeng.huang@ni.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
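
The transmit steering described above can be pictured as a queue selection
function.  The following is a user-space sketch only, not the handler added
by this patch: the function name and signature are hypothetical, and both
the QAV-mode condition and the hash-based fallback are assumptions made for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BEST_EFFORT_QUEUE 3	/* queue[3], per the commit message */

/* Hypothetical model of Tx queue selection: keep ordinary LAN traffic
 * off the AVB queues (0,1) by sending it all to queue 3.
 */
uint16_t select_tx_queue(bool qav_mode, uint32_t flow_hash,
			 uint16_t num_tx_queues)
{
	assert(num_tx_queues > 0);

	if (qav_mode) {
		/* Queue 3 must exist for the steering to make sense. */
		assert(num_tx_queues > BEST_EFFORT_QUEUE);
		return BEST_EFFORT_QUEUE;
	}

	/* Assumed fallback: spread flows across all queues by hash. */
	return (uint16_t)(flow_hash % num_tx_queues);
}
```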
 drivers/net/ethernet/intel/igb/Makefile        |   2 +-
 drivers/net/ethernet/intel/igb/e1000_defines.h |   1 +
 drivers/net/ethernet/intel/igb/igb.h           |  14 +-
 drivers/net/ethernet/intel/igb/igb_cdev.c      | 511 +++++++++++++++++++++++++
 drivers/net/ethernet/intel/igb/igb_cdev.h      |  45 +++
 drivers/net/ethernet/intel/igb/igb_main.c      | 103 ++++-
 6 files changed, 663 insertions(+), 13 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/igb/igb_cdev.c
 create mode 100644 drivers/net/ethernet/intel/igb/igb_cdev.h

diff --git a/drivers/net/ethernet/intel/igb/Makefile b/drivers/net/ethernet/intel/igb/Makefile
index 5bcb2de..3fee429 100644
--- a/drivers/net/ethernet/intel/igb/Makefile
+++ b/drivers/net/ethernet/intel/igb/Makefile
@@ -33,4 +33,4 @@ obj-$(CONFIG_IGB) += igb.o
 
 igb-objs := igb_main.o igb_ethtool.o e1000_82575.o \
 	    e1000_mac.o e1000_nvm.o e1000_phy.o e1000_mbx.o \
-	    e1000_i210.o igb_ptp.o igb_hwmon.o
+	    e1000_i210.o igb_ptp.o igb_hwmon.o igb_cdev.o
diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h b/drivers/net/ethernet/intel/igb/e1000_defines.h
index c8b10d2..5686a2c 100644
--- a/drivers/net/ethernet/intel/igb/e1000_defines.h
+++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
@@ -112,6 +112,7 @@
 #define E1000_MRQC_RSS_FIELD_IPV6              0x00100000
 #define E1000_MRQC_RSS_FIELD_IPV6_TCP          0x00200000
 
+#define E1000_MRQC_DEF_QUEUE_OFFSET            0x3
 
 /* Management Control */
 #define E1000_MANC_SMBUS_EN      0x00000001 /* SMBus Enabled - RO */
diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index 3ad5517..3fa3a85 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -38,6 +38,8 @@
 #include <linux/i2c-algo-bit.h>
 #include <linux/pci.h>
 #include <linux/mdio.h>
+#include <linux/types.h>
+#include <linux/cdev.h>
 
 struct igb_adapter;
 
@@ -50,12 +52,12 @@ struct igb_adapter;
 #define IGB_70K_ITR		56
 
 /* TX/RX descriptor defines */
-#define IGB_DEFAULT_TXD		256
+#define IGB_DEFAULT_TXD		1024
 #define IGB_DEFAULT_TX_WORK	128
 #define IGB_MIN_TXD		80
 #define IGB_MAX_TXD		4096
 
-#define IGB_DEFAULT_RXD		256
+#define IGB_DEFAULT_RXD		1024
 #define IGB_MIN_RXD		80
 #define IGB_MAX_RXD		4096
 
@@ -469,6 +471,14 @@ struct igb_adapter {
 	u16 eee_advert;
 
 	bool qav_mode;
+	struct cdev char_dev;
+	struct list_head user_page_list;
+	struct mutex user_page_mutex; /* protect user_page_list */
+	unsigned long tx_uring_init;
+	unsigned long rx_uring_init;
+	struct mutex user_ring_mutex; /* protect tx/rx_uring_init */
+	bool cdev_in_use;
+	struct mutex cdev_mutex; /* protect cdev_in_use */
 };
 
 #define IGB_FLAG_HAS_MSI		(1 << 0)
diff --git a/drivers/net/ethernet/intel/igb/igb_cdev.c b/drivers/net/ethernet/intel/igb/igb_cdev.c
new file mode 100644
index 0000000..df237c6
--- /dev/null
+++ b/drivers/net/ethernet/intel/igb/igb_cdev.c
@@ -0,0 +1,511 @@
+#include "igb.h"
+#include "igb_cdev.h"
+
+#include <linux/pagemap.h>
+#include <linux/bitops.h>
+#include <linux/types.h>
+#include <linux/cdev.h>
+
+/* TSN char dev */
+static DECLARE_BITMAP(cdev_minors, IGB_MAX_DEV_NUM);
+
+static int igb_major;
+static struct class *igb_class;
+static const char * const igb_class_name = "igb_tsn";
+static const char * const igb_dev_name = "igb_tsn_%s";
+
+/* user-mode API forward definitions */
+static int igb_open_file(struct inode *inode, struct file *file);
+static int igb_close_file(struct inode *inode, struct file *file);
+static int igb_mmap(struct file *file, struct vm_area_struct *vma);
+static long igb_ioctl_file(struct file *file, unsigned int cmd,
+			   unsigned long arg);
+
+/* user-mode IO API registrations */
+static const struct file_operations igb_fops = {
+		.owner   = THIS_MODULE,
+		.llseek  = no_llseek,
+		.open	= igb_open_file,
+		.release = igb_close_file,
+		.mmap	= igb_mmap,
+		.unlocked_ioctl = igb_ioctl_file,
+};
+
+int igb_tsn_setup_all_tx_resources(struct igb_adapter *adapter)
+{
+	struct pci_dev *pdev = adapter->pdev;
+	int i, err = 0;
+
+	for (i = 0; i < IGB_USER_TX_QUEUES; i++) {
+		err = igb_setup_tx_resources(adapter->tx_ring[i]);
+		if (err) {
+			dev_err(&pdev->dev,
+				"Allocation for Tx Queue %u failed\n", i);
+			for (i--; i >= 0; i--)
+				igb_free_tx_resources(adapter->tx_ring[i]);
+			break;
+		}
+	}
+
+	return err;
+}
+
+int igb_tsn_setup_all_rx_resources(struct igb_adapter *adapter)
+{
+	struct pci_dev *pdev = adapter->pdev;
+	int i, err = 0;
+
+	for (i = 0; i < IGB_USER_RX_QUEUES; i++) {
+		err = igb_setup_rx_resources(adapter->rx_ring[i]);
+		if (err) {
+			dev_err(&pdev->dev,
+				"Allocation for Rx Queue %u failed\n", i);
+			for (i--; i >= 0; i--)
+				igb_free_rx_resources(adapter->rx_ring[i]);
+			break;
+		}
+	}
+
+	return err;
+}
+
+void igb_tsn_free_all_tx_resources(struct igb_adapter *adapter)
+{
+	int i;
+
+	for (i = 0; i < IGB_USER_TX_QUEUES; i++)
+		igb_free_tx_resources(adapter->tx_ring[i]);
+}
+
+void igb_tsn_free_all_rx_resources(struct igb_adapter *adapter)
+{
+	int i;
+
+	for (i = 0; i < IGB_USER_RX_QUEUES; i++)
+		igb_free_rx_resources(adapter->rx_ring[i]);
+}
+
+static int igb_bind(struct file *file, void __user *argp)
+{
+	struct igb_adapter *adapter;
+	u32 mmap_size;
+
+	adapter = (struct igb_adapter *)file->private_data;
+
+	if (NULL == adapter)
+		return -ENOENT;
+
+	mmap_size = pci_resource_len(adapter->pdev, 0);
+
+	if (copy_to_user(argp, &mmap_size, sizeof(mmap_size)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static long igb_mapring(struct file *file, void __user *arg)
+{
+	struct igb_adapter *adapter;
+	struct igb_buf_cmd req;
+	int queue_size;
+	unsigned long *uring_init;
+	struct igb_ring *ring;
+	int err;
+
+	if (copy_from_user(&req, arg, sizeof(req)))
+		return -EFAULT;
+
+	if (req.flags != 0 && req.flags != 1)
+		return -EINVAL;
+
+	adapter = file->private_data;
+	if (NULL == adapter) {
+		dev_err(&adapter->pdev->dev, "map to unbound device!\n");
+		return -ENOENT;
+	}
+
+	/* Req flags, Tx: 0, Rx: 1 */
+	if (req.flags == 0) {
+		queue_size = IGB_USER_TX_QUEUES;
+		uring_init =  &adapter->tx_uring_init;
+		ring = adapter->tx_ring[req.queue];
+	} else {
+		queue_size = IGB_USER_RX_QUEUES;
+		uring_init =  &adapter->rx_uring_init;
+		ring = adapter->rx_ring[req.queue];
+	}
+
+	mutex_lock(&adapter->user_ring_mutex);
+	if (test_bit(req.queue, uring_init)) {
+		dev_err(&adapter->pdev->dev, "the queue is in using\n");
+		err = -EBUSY;
+		goto failed;
+	}
+
+	if (req.queue >= queue_size) {
+		err = -EINVAL;
+		goto failed;
+	}
+
+	set_pages_uc(virt_to_page(ring->desc), ring->size >> PAGE_SHIFT);
+	set_bit(req.queue, uring_init);
+	mutex_unlock(&adapter->user_ring_mutex);
+
+	req.physaddr = ring->dma;
+	req.mmap_size = ring->size;
+
+	if (copy_to_user(arg, &req, sizeof(req))) {
+		dev_err(&adapter->pdev->dev, "copyout to user failed\n");
+		return -EFAULT;
+	}
+
+	return 0;
+failed:
+	mutex_unlock(&adapter->user_ring_mutex);
+	return err;
+}
+
+static long igb_mapbuf(struct file *file, void __user *arg)
+{
+	struct igb_adapter *adapter;
+	struct igb_buf_cmd req;
+	struct page *page;
+	dma_addr_t page_dma;
+	struct igb_user_page *userpage;
+	int err = 0;
+	int direction;
+
+	if (copy_from_user(&req, arg, sizeof(req)))
+		return -EFAULT;
+
+	if (req.flags != 0 && req.flags != 1)
+		return -EINVAL;
+
+	adapter = file->private_data;
+	if (NULL == adapter) {
+		dev_err(&adapter->pdev->dev, "map to unbound device!\n");
+		return -ENOENT;
+	}
+
+	userpage = kzalloc(sizeof(*userpage), GFP_KERNEL);
+	if (unlikely(!userpage))
+		return -ENOMEM;
+
+	page = alloc_page(GFP_KERNEL | __GFP_COLD);
+	if (unlikely(!page)) {
+		err = -ENOMEM;
+		goto failed;
+	}
+
+	direction = req.flags ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
+	page_dma = dma_map_page(&adapter->pdev->dev, page,
+				0, PAGE_SIZE, direction);
+
+	if (dma_mapping_error(&adapter->pdev->dev, page_dma)) {
+		put_page(page);
+		err = -ENOMEM;
+		goto failed;
+	}
+
+	set_pages_uc(page, 1);
+	userpage->page = page;
+	userpage->page_dma = page_dma;
+	userpage->flags = req.flags;
+
+	mutex_lock(&adapter->user_page_mutex);
+	list_add_tail(&userpage->page_node, &adapter->user_page_list);
+	mutex_unlock(&adapter->user_page_mutex);
+
+	req.physaddr = page_dma;
+	req.mmap_size = PAGE_SIZE;
+
+	if (copy_to_user(arg, &req, sizeof(req))) {
+		dev_err(&adapter->pdev->dev, "copyout to user failed\n");
+		return -EFAULT;
+	}
+	return 0;
+
+failed:
+	kfree(userpage);
+	return err;
+}
+
+static long igb_unmapring(struct file *file, void __user *arg)
+{
+	struct igb_adapter *adapter;
+	struct igb_buf_cmd req;
+	struct igb_ring *ring;
+	int queue_size;
+	unsigned long *uring_init;
+	int err;
+
+	if (copy_from_user(&req, arg, sizeof(req)))
+		return -EFAULT;
+
+	if (req.flags != 0 && req.flags != 1)
+		return -EINVAL;
+
+	adapter = file->private_data;
+	if (NULL == adapter) {
+		dev_err(&adapter->pdev->dev, "map to unbound device!\n");
+		return -ENOENT;
+	}
+
+	if (req.flags == 0) {
+		queue_size = IGB_USER_TX_QUEUES;
+		uring_init =  &adapter->tx_uring_init;
+		ring = adapter->tx_ring[req.queue];
+	} else {
+		queue_size = IGB_USER_RX_QUEUES;
+		uring_init =  &adapter->rx_uring_init;
+		ring = adapter->rx_ring[req.queue];
+	}
+
+	if (req.queue >= queue_size)
+		return -EINVAL;
+
+	mutex_lock(&adapter->user_ring_mutex);
+	if (!test_bit(req.queue, uring_init)) {
+		dev_err(&adapter->pdev->dev,
+			"the ring is already unmap\n");
+		err = -EINVAL;
+		goto failed;
+	}
+
+	set_pages_wb(virt_to_page(ring->desc), ring->size >> PAGE_SHIFT);
+	clear_bit(req.queue, uring_init);
+	mutex_unlock(&adapter->user_ring_mutex);
+
+	return 0;
+failed:
+	mutex_unlock(&adapter->user_ring_mutex);
+	return err;
+}
+
+static void igb_free_page(struct igb_adapter *adapter,
+			  struct igb_user_page *userpage)
+{
+	int direction = userpage->flags ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
+
+	set_pages_wb(userpage->page, 1);
+	dma_unmap_page(&adapter->pdev->dev,
+		       userpage->page_dma,
+		       PAGE_SIZE,
+		       direction);
+
+	put_page(userpage->page);
+	list_del(&userpage->page_node);
+	kfree(userpage);
+	userpage = NULL;
+}
+
+static long igb_unmapbuf(struct file *file, void __user *arg)
+{
+	int err = 0;
+	struct igb_adapter *adapter;
+	struct igb_buf_cmd req;
+	struct igb_user_page *userpage, *tmp;
+
+	if (copy_from_user(&req, arg, sizeof(req)))
+		return -EFAULT;
+
+	adapter = file->private_data;
+	if (NULL == adapter) {
+		dev_err(&adapter->pdev->dev, "map to unbound device!\n");
+		return -ENOENT;
+	}
+
+	mutex_lock(&adapter->user_page_mutex);
+	if (list_empty(&adapter->user_page_list)) {
+		err = -EINVAL;
+		goto failed;
+	}
+
+	list_for_each_entry_safe(userpage, tmp, &adapter->user_page_list,
+				 page_node) {
+		if (req.physaddr == userpage->page_dma) {
+			igb_free_page(adapter, userpage);
+			break;
+		}
+	}
+	mutex_unlock(&adapter->user_page_mutex);
+
+	return 0;
+failed:
+	mutex_unlock(&adapter->user_page_mutex);
+	return err;
+}
+
+static long igb_ioctl_file(struct file *file, unsigned int cmd,
+			   unsigned long arg)
+{
+	void __user *argp = (void __user *)arg;
+	int err;
+
+	switch (cmd) {
+	case IGB_BIND:
+		err = igb_bind(file, argp);
+		break;
+	case IGB_MAPRING:
+		err = igb_mapring(file, argp);
+		break;
+	case IGB_MAPBUF:
+		err = igb_mapbuf(file, argp);
+		break;
+	case IGB_UNMAPRING:
+		err = igb_unmapring(file, argp);
+		break;
+	case IGB_UNMAPBUF:
+		err = igb_unmapbuf(file, argp);
+		break;
+	default:
+		err = -EINVAL;
+		break;
+	};
+
+	return err;
+}
+
+static int igb_open_file(struct inode *inode, struct file *file)
+{
+	struct igb_adapter *adapter;
+	int err = 0;
+
+	adapter = container_of(inode->i_cdev, struct igb_adapter, char_dev);
+	if (!adapter)
+		return -ENOENT;
+
+	if (!adapter->qav_mode)
+		return -EPERM;
+
+	mutex_lock(&adapter->cdev_mutex);
+	if (adapter->cdev_in_use) {
+		err = -EBUSY;
+		goto failed;
+	}
+
+	file->private_data = adapter;
+	adapter->cdev_in_use = true;
+	mutex_unlock(&adapter->cdev_mutex);
+
+	return 0;
+failed:
+	mutex_unlock(&adapter->cdev_mutex);
+	return err;
+}
+
+static int igb_close_file(struct inode *inode, struct file *file)
+{
+	struct igb_adapter *adapter = file->private_data;
+
+	if (NULL == adapter)
+		return 0;
+
+	mutex_lock(&adapter->cdev_mutex);
+	if (!adapter->cdev_in_use)
+		goto out;
+
+	mutex_lock(&adapter->user_page_mutex);
+	if (!list_empty(&adapter->user_page_list)) {
+		struct igb_user_page *userpage, *tmp;
+
+		list_for_each_entry_safe(userpage, tmp,
+					 &adapter->user_page_list, page_node) {
+			if (userpage)
+				igb_free_page(adapter, userpage);
+		}
+	}
+	mutex_unlock(&adapter->user_page_mutex);
+
+	file->private_data = NULL;
+	adapter->cdev_in_use = false;
+	adapter->tx_uring_init = 0;
+	adapter->rx_uring_init = 0;
+
+out:
+	mutex_unlock(&adapter->cdev_mutex);
+	return 0;
+}
+
+static int igb_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct igb_adapter *adapter = file->private_data;
+	unsigned long size  = vma->vm_end - vma->vm_start;
+	dma_addr_t pgoff = vma->vm_pgoff;
+	dma_addr_t physaddr;
+
+	if (NULL == adapter)
+		return -ENODEV;
+
+	if (pgoff == 0)
+		physaddr = pci_resource_start(adapter->pdev, 0) >> PAGE_SHIFT;
+	else
+		physaddr = pgoff;
+
+	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+
+	if (remap_pfn_range(vma, vma->vm_start,
+			    physaddr, size, vma->vm_page_prot))
+		return -EAGAIN;
+
+	return 0;
+}
+
+int igb_add_cdev(struct igb_adapter *adapter)
+{
+	int result = 0;
+	dev_t dev_num;
+	int igb_minor;
+
+	igb_minor = find_first_zero_bit(cdev_minors, IGB_MAX_DEV_NUM);
+	if (igb_minor >= IGB_MAX_DEV_NUM)
+		return -EBUSY;
+	set_bit(igb_minor, cdev_minors);
+
+	dev_num = MKDEV(igb_major, igb_minor);
+	cdev_init(&adapter->char_dev, &igb_fops);
+	adapter->char_dev.owner = THIS_MODULE;
+	adapter->char_dev.ops = &igb_fops;
+	result = cdev_add(&adapter->char_dev, dev_num, 1);
+
+	if (result) {
+		dev_err(&adapter->pdev->dev,
+			"igb_tsn: add character device failed\n");
+		return result;
+	}
+
+	device_create(igb_class, NULL, dev_num, NULL, igb_dev_name,
+		      adapter->netdev->name);
+
+	return 0;
+}
+
+void igb_remove_cdev(struct igb_adapter *adapter)
+{
+	device_destroy(igb_class, adapter->char_dev.dev);
+	cdev_del(&adapter->char_dev);
+}
+
+int igb_cdev_init(char *igb_driver_name)
+{
+	dev_t dev_num;
+	int ret;
+
+	ret = alloc_chrdev_region(&dev_num, 0, IGB_MAX_DEV_NUM,
+				  igb_driver_name);
+	if (ret)
+		return ret;
+	igb_major = MAJOR(dev_num);
+
+	igb_class = class_create(THIS_MODULE, igb_class_name);
+	if (IS_ERR(igb_class))
+		pr_info("igb_tsn: create device class failed\n");
+
+	return ret;
+}
+
+void igb_cdev_destroy(void)
+{
+	class_destroy(igb_class);
+	unregister_chrdev_region(MKDEV(igb_major, 0), IGB_MAX_DEV_NUM);
+}
diff --git a/drivers/net/ethernet/intel/igb/igb_cdev.h b/drivers/net/ethernet/intel/igb/igb_cdev.h
new file mode 100644
index 0000000..a07b208
--- /dev/null
+++ b/drivers/net/ethernet/intel/igb/igb_cdev.h
@@ -0,0 +1,45 @@
+#ifndef _IGB_CDEV_H_
+#define _IGB_CDEV_H_
+
+#include <asm/page.h>
+#include <asm/ioctl.h>
+
+struct igb_adapter;
+/* queues reserved for user mode */
+#define IGB_USER_TX_QUEUES	2
+#define IGB_USER_RX_QUEUES	2
+#define IGB_MAX_DEV_NUM	64
+
+/* TSN char dev ioctls */
+#define IGB_BIND       _IOW('E', 200, int)
+#define IGB_MAPRING    _IOW('E', 201, int)
+#define IGB_UNMAPRING  _IOW('E', 202, int)
+#define IGB_MAPBUF     _IOW('E', 203, int)
+#define IGB_UNMAPBUF   _IOW('E', 204, int)
+
+/* Used with both map/unmap ring & buf ioctls */
+struct igb_buf_cmd {
+	u64		physaddr;
+	u32		queue;
+	u32		mmap_size;
+	u32		flags;
+};
+
+struct igb_user_page {
+	struct list_head page_node;
+	struct page *page;
+	dma_addr_t page_dma;
+	u32 flags;
+};
+
+int igb_tsn_setup_all_tx_resources(struct igb_adapter *);
+int igb_tsn_setup_all_rx_resources(struct igb_adapter *);
+void igb_tsn_free_all_tx_resources(struct igb_adapter *);
+void igb_tsn_free_all_rx_resources(struct igb_adapter *);
+
+int igb_add_cdev(struct igb_adapter *adapter);
+void igb_remove_cdev(struct igb_adapter *adapter);
+int igb_cdev_init(char *igb_driver_name);
+void igb_cdev_destroy(void);
+
+#endif
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 362d579..0d501a8 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -55,6 +55,7 @@
 #endif
 #include <linux/i2c.h>
 #include "igb.h"
+#include "igb_cdev.h"
 
 #define MAJ 5
 #define MIN 3
@@ -690,6 +691,11 @@ static int __init igb_init_module(void)
 #ifdef CONFIG_IGB_DCA
 	dca_register_notify(&dca_notifier);
 #endif
+
+	ret = igb_cdev_init(igb_driver_name);
+	if (ret)
+		return ret;
+
 	ret = pci_register_driver(&igb_driver);
 	return ret;
 }
@@ -708,6 +714,8 @@ static void __exit igb_exit_module(void)
 	dca_unregister_notify(&dca_notifier);
 #endif
 	pci_unregister_driver(&igb_driver);
+
+	igb_cdev_destroy();
 }
 
 module_exit(igb_exit_module);
@@ -1635,7 +1643,8 @@ static void igb_configure(struct igb_adapter *adapter)
 	 * at least 1 descriptor unused to make sure
 	 * next_to_use != next_to_clean
 	 */
-	for (i = 0; i < adapter->num_rx_queues; i++) {
+	i = adapter->qav_mode ? IGB_USER_RX_QUEUES : 0;
+	for (; i < adapter->num_rx_queues; i++) {
 		struct igb_ring *ring = adapter->rx_ring[i];
 		igb_alloc_rx_buffers(ring, igb_desc_unused(ring));
 	}
@@ -2104,10 +2113,24 @@ static int igb_ndo_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 	return ndo_dflt_fdb_add(ndm, tb, dev, addr, vid, flags);
 }
 
+static u16 igb_select_queue(struct net_device *netdev,
+			    struct sk_buff *skb,
+			    void *accel_priv,
+			    select_queue_fallback_t fallback)
+{
+	struct igb_adapter *adapter = netdev_priv(netdev);
+
+	if (adapter->qav_mode)
+		return adapter->num_tx_queues - 1;
+	else
+		return fallback(netdev, skb);
+}
+
 static const struct net_device_ops igb_netdev_ops = {
 	.ndo_open		= igb_open,
 	.ndo_stop		= igb_close,
 	.ndo_start_xmit		= igb_xmit_frame,
+	.ndo_select_queue	= igb_select_queue,
 	.ndo_get_stats64	= igb_get_stats64,
 	.ndo_set_rx_mode	= igb_set_rx_mode,
 	.ndo_set_mac_address	= igb_set_mac,
@@ -2334,6 +2357,10 @@ static int igb_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	adapter->msg_enable = netif_msg_init(debug, DEFAULT_MSG_ENABLE);
 	adapter->qav_mode = false;
 
+	adapter->tx_uring_init = 0;
+	adapter->rx_uring_init = 0;
+	adapter->cdev_in_use = false;
+
 	err = -EIO;
 	adapter->io_addr = pci_iomap(pdev, 0, 0);
 	if (!adapter->io_addr)
@@ -2589,6 +2616,10 @@ static int igb_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		}
 	}
 
+	err = igb_add_cdev(adapter);
+	if (err)
+		goto err_register;
+
 	/* carrier off reporting is important to ethtool even BEFORE open */
 	netif_carrier_off(netdev);
 
@@ -2837,6 +2868,8 @@ static void igb_remove(struct pci_dev *pdev)
 	struct igb_adapter *adapter = netdev_priv(netdev);
 	struct e1000_hw *hw = &adapter->hw;
 
+	igb_remove_cdev(adapter);
+
 	pm_runtime_get_noresume(&pdev->dev);
 #ifdef CONFIG_IGB_HWMON
 	igb_sysfs_exit(adapter);
@@ -3028,6 +3061,12 @@ static int igb_sw_init(struct igb_adapter *adapter)
 	adapter->min_frame_size = ETH_ZLEN + ETH_FCS_LEN;
 
 	spin_lock_init(&adapter->stats64_lock);
+
+	INIT_LIST_HEAD(&adapter->user_page_list);
+	mutex_init(&adapter->user_page_mutex);
+	mutex_init(&adapter->user_ring_mutex);
+	mutex_init(&adapter->cdev_mutex);
+
 #ifdef CONFIG_PCI_IOV
 	switch (hw->mac.type) {
 	case e1000_82576:
@@ -3277,7 +3316,8 @@ static int igb_setup_all_tx_resources(struct igb_adapter *adapter)
 	struct pci_dev *pdev = adapter->pdev;
 	int i, err = 0;
 
-	for (i = 0; i < adapter->num_tx_queues; i++) {
+	i = adapter->qav_mode ? IGB_USER_TX_QUEUES : 0;
+	for (; i < adapter->num_tx_queues; i++) {
 		err = igb_setup_tx_resources(adapter->tx_ring[i]);
 		if (err) {
 			dev_err(&pdev->dev,
@@ -3365,7 +3405,8 @@ static void igb_configure_tx(struct igb_adapter *adapter)
 {
 	int i;
 
-	for (i = 0; i < adapter->num_tx_queues; i++)
+	i = adapter->qav_mode ? IGB_USER_TX_QUEUES : 0;
+	for (; i < adapter->num_tx_queues; i++)
 		igb_configure_tx_ring(adapter, adapter->tx_ring[i]);
 }
 
@@ -3420,7 +3461,8 @@ static int igb_setup_all_rx_resources(struct igb_adapter *adapter)
 	struct pci_dev *pdev = adapter->pdev;
 	int i, err = 0;
 
-	for (i = 0; i < adapter->num_rx_queues; i++) {
+	i = adapter->qav_mode ? IGB_USER_RX_QUEUES : 0;
+	for (; i < adapter->num_rx_queues; i++) {
 		err = igb_setup_rx_resources(adapter->rx_ring[i]);
 		if (err) {
 			dev_err(&pdev->dev,
@@ -3445,6 +3487,15 @@ static void igb_setup_mrqc(struct igb_adapter *adapter)
 	u32 j, num_rx_queues;
 	u32 rss_key[10];
 
+	/* For TSN, kernel driver only create buffer for queue 2 and queue 3,
+	 * by default receive all BE packets from queue 3.
+	 */
+	if (adapter->qav_mode) {
+		wr32(E1000_MRQC, (adapter->num_rx_queues - 1)
+		     << E1000_MRQC_DEF_QUEUE_OFFSET);
+		return;
+	}
+
 	netdev_rss_key_fill(rss_key, sizeof(rss_key));
 	for (j = 0; j < 10; j++)
 		wr32(E1000_RSSRK(j), rss_key[j]);
@@ -3520,6 +3571,7 @@ static void igb_setup_mrqc(struct igb_adapter *adapter)
 		if (hw->mac.type != e1000_i211)
 			mrqc |= E1000_MRQC_ENABLE_RSS_4Q;
 	}
+
 	igb_vmm_control(adapter);
 
 	wr32(E1000_MRQC, mrqc);
@@ -3713,7 +3765,8 @@ static void igb_configure_rx(struct igb_adapter *adapter)
 	/* Setup the HW Rx Head and Tail Descriptor Pointers and
 	 * the Base and Length of the Rx Descriptor Ring
 	 */
-	for (i = 0; i < adapter->num_rx_queues; i++)
+	i = adapter->qav_mode ? IGB_USER_RX_QUEUES : 0;
+	for (; i < adapter->num_rx_queues; i++)
 		igb_configure_rx_ring(adapter, adapter->rx_ring[i]);
 }
 
@@ -3749,8 +3802,8 @@ void igb_free_tx_resources(struct igb_ring *tx_ring)
 static void igb_free_all_tx_resources(struct igb_adapter *adapter)
 {
 	int i;
-
-	for (i = 0; i < adapter->num_tx_queues; i++)
+	i = adapter->qav_mode ? IGB_USER_TX_QUEUES : 0;
+	for (; i < adapter->num_tx_queues; i++)
 		if (adapter->tx_ring[i])
 			igb_free_tx_resources(adapter->tx_ring[i]);
 }
@@ -3816,7 +3869,8 @@ static void igb_clean_all_tx_rings(struct igb_adapter *adapter)
 {
 	int i;
 
-	for (i = 0; i < adapter->num_tx_queues; i++)
+	i = adapter->qav_mode ? IGB_USER_TX_QUEUES : 0;
+	for (; i < adapter->num_tx_queues; i++)
 		if (adapter->tx_ring[i])
 			igb_clean_tx_ring(adapter->tx_ring[i]);
 }
@@ -3854,7 +3908,8 @@ static void igb_free_all_rx_resources(struct igb_adapter *adapter)
 {
 	int i;
 
-	for (i = 0; i < adapter->num_rx_queues; i++)
+	i = adapter->qav_mode ? IGB_USER_RX_QUEUES : 0;
+	for (; i < adapter->num_rx_queues; i++)
 		if (adapter->rx_ring[i])
 			igb_free_rx_resources(adapter->rx_ring[i]);
 }
@@ -3910,7 +3965,8 @@ static void igb_clean_all_rx_rings(struct igb_adapter *adapter)
 {
 	int i;
 
-	for (i = 0; i < adapter->num_rx_queues; i++)
+	i = adapter->qav_mode ? IGB_USER_TX_QUEUES : 0;
+	for (; i < adapter->num_rx_queues; i++)
 		if (adapter->rx_ring[i])
 			igb_clean_rx_ring(adapter->rx_ring[i]);
 }
@@ -7055,6 +7111,11 @@ static int igb_clean_rx_irq(struct igb_q_vector *q_vector, const int budget)
 	struct sk_buff *skb = rx_ring->skb;
 	unsigned int total_bytes = 0, total_packets = 0;
 	u16 cleaned_count = igb_desc_unused(rx_ring);
+	struct igb_adapter *adapter = netdev_priv(rx_ring->netdev);
+
+	/* Don't service user (AVB) queues */
+	if (adapter->qav_mode && rx_ring->queue_index < IGB_USER_RX_QUEUES)
+		return true;
 
 	while (likely(total_packets < budget)) {
 		union e1000_adv_rx_desc *rx_desc;
@@ -7254,6 +7315,9 @@ static int igb_mii_ioctl(struct net_device *netdev, struct ifreq *ifr, int cmd)
 	return 0;
 }
 
+#define SIOSTXQUEUESELECT SIOCDEVPRIVATE
+#define SIOSRXQUEUESELECT (SIOCDEVPRIVATE + 1)
+
 /**
  * igb_ioctl -
  * @netdev:
@@ -8305,6 +8369,9 @@ static int igb_change_mode(struct igb_adapter *adapter, int request_mode)
 	if (request_mode == current_mode)
 		return 0;
 
+	if (adapter->cdev_in_use)
+		return -EBUSY;
+
 	netdev = adapter->netdev;
 
 	rtnl_lock();
@@ -8314,6 +8381,11 @@ static int igb_change_mode(struct igb_adapter *adapter, int request_mode)
 	else
 		igb_reset(adapter);
 
+	if (current_mode) {
+		igb_tsn_free_all_rx_resources(adapter);
+		igb_tsn_free_all_tx_resources(adapter);
+	}
+
 	igb_clear_interrupt_scheme(adapter);
 
 	adapter->qav_mode = request_mode;
@@ -8327,12 +8399,23 @@ static int igb_change_mode(struct igb_adapter *adapter, int request_mode)
 		goto err_out;
 	}
 
+	if (request_mode) {
+		err = igb_tsn_setup_all_tx_resources(adapter);
+		if (err)
+			goto err_out;
+		err = igb_tsn_setup_all_rx_resources(adapter);
+		if (err)
+			goto err_tsn_setup_rx;
+	}
+
 	if (netif_running(netdev))
 		igb_open(netdev);
 
 	rtnl_unlock();
 
 	return err;
+err_tsn_setup_rx:
+	igb_tsn_free_all_tx_resources(adapter);
 err_out:
 	rtnl_unlock();
 	return err;
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [net-next 08/20] igb: fix compare_const_fl.cocci warnings
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
                   ` (6 preceding siblings ...)
  2016-02-24  4:26 ` [net-next 07/20] igb: add a character device to support AVB Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 09/20] igb: fix itnull.cocci warnings Jeff Kirsher
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC (permalink / raw)
  To: davem
  Cc: Julia Lawall, netdev, nhorman, sassmann, jogreene, Gangfeng Huang,
	Fengguang Wu, Jeff Kirsher

From: Julia Lawall <julia.lawall@lip6.fr>

Kernel code typically places the constant on the right-hand side of a
comparison, i.e. adapter == NULL rather than NULL == adapter.

Generated by: scripts/coccinelle/misc/compare_const_fl.cocci

CC: Gangfeng Huang <gangfeng.huang@ni.com>
Signed-off-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_cdev.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_cdev.c b/drivers/net/ethernet/intel/igb/igb_cdev.c
index df237c6..28fafec 100644
--- a/drivers/net/ethernet/intel/igb/igb_cdev.c
+++ b/drivers/net/ethernet/intel/igb/igb_cdev.c
@@ -92,7 +92,7 @@ static int igb_bind(struct file *file, void __user *argp)
 
 	adapter = (struct igb_adapter *)file->private_data;
 
-	if (NULL == adapter)
+	if (adapter == NULL)
 		return -ENOENT;
 
 	mmap_size = pci_resource_len(adapter->pdev, 0);
@@ -119,7 +119,7 @@ static long igb_mapring(struct file *file, void __user *arg)
 		return -EINVAL;
 
 	adapter = file->private_data;
-	if (NULL == adapter) {
+	if (adapter == NULL) {
 		dev_err(&adapter->pdev->dev, "map to unbound device!\n");
 		return -ENOENT;
 	}
@@ -182,7 +182,7 @@ static long igb_mapbuf(struct file *file, void __user *arg)
 		return -EINVAL;
 
 	adapter = file->private_data;
-	if (NULL == adapter) {
+	if (adapter == NULL) {
 		dev_err(&adapter->pdev->dev, "map to unbound device!\n");
 		return -ENOENT;
 	}
@@ -246,7 +246,7 @@ static long igb_unmapring(struct file *file, void __user *arg)
 		return -EINVAL;
 
 	adapter = file->private_data;
-	if (NULL == adapter) {
+	if (adapter == NULL) {
 		dev_err(&adapter->pdev->dev, "map to unbound device!\n");
 		return -ENOENT;
 	}
@@ -310,7 +310,7 @@ static long igb_unmapbuf(struct file *file, void __user *arg)
 		return -EFAULT;
 
 	adapter = file->private_data;
-	if (NULL == adapter) {
+	if (adapter == NULL) {
 		dev_err(&adapter->pdev->dev, "map to unbound device!\n");
 		return -ENOENT;
 	}
@@ -398,7 +398,7 @@ static int igb_close_file(struct inode *inode, struct file *file)
 {
 	struct igb_adapter *adapter = file->private_data;
 
-	if (NULL == adapter)
+	if (adapter == NULL)
 		return 0;
 
 	mutex_lock(&adapter->cdev_mutex);
@@ -434,7 +434,7 @@ static int igb_mmap(struct file *file, struct vm_area_struct *vma)
 	dma_addr_t pgoff = vma->vm_pgoff;
 	dma_addr_t physaddr;
 
-	if (NULL == adapter)
+	if (adapter == NULL)
 		return -ENODEV;
 
 	if (pgoff == 0)
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [net-next 09/20] igb: fix itnull.cocci warnings
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
                   ` (7 preceding siblings ...)
  2016-02-24  4:26 ` [net-next 08/20] igb: fix compare_const_fl.cocci warnings Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 10/20] igb: fix semicolon.cocci warnings Jeff Kirsher
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC (permalink / raw)
  To: davem
  Cc: Julia Lawall, netdev, nhorman, sassmann, jogreene, Gangfeng Huang,
	Fengguang Wu, Jeff Kirsher

From: Julia Lawall <julia.lawall@lip6.fr>

The index variable of list_for_each_entry_safe should never be NULL.

Generated by: scripts/coccinelle/iterators/itnull.cocci

CC: Gangfeng Huang <gangfeng.huang@ni.com>
Signed-off-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
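A small user-space model of why the cursor needs no NULL test. The list
primitives below are simplified stand-ins written for this sketch, not
the kernel's definitions, and struct user_page, free_all_pages() and
demo_free_n() are hypothetical names. The point is that each entry is
derived from a real list node, and the loop stops when the node pointer
wraps around to the list head.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified user-space stand-ins for the kernel's list primitives. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

static void list_del(struct list_head *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
}

/* Hypothetical entry type, loosely modelled on struct igb_user_page. */
struct user_page {
	struct list_head page_node;
	unsigned long page_dma;
};

/*
 * Free every entry on @head and return how many were freed. The entry
 * pointer is always computed from a node that is on the list, so it can
 * never be NULL inside the loop body.
 */
static int free_all_pages(struct list_head *head)
{
	struct list_head *pos, *next;
	int freed = 0;

	for (pos = head->next, next = pos->next; pos != head;
	     pos = next, next = pos->next) {
		struct user_page *up =
			container_of(pos, struct user_page, page_node);

		list_del(pos);
		free(up);
		freed++;
	}
	return freed;
}

/* Hypothetical demo helper: build a list of @n entries, then free it. */
static int demo_free_n(int n)
{
	struct list_head head;
	int i;

	list_init(&head);
	for (i = 0; i < n; i++) {
		struct user_page *up = calloc(1, sizeof(*up));

		if (!up)
			break;
		list_add_tail(&up->page_node, &head);
	}
	return free_all_pages(&head);
}
```

An empty list never enters the loop because head->next is the head
itself, which is also why the list_empty() guard around the iteration is
not strictly required.
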
 drivers/net/ethernet/intel/igb/igb_cdev.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_cdev.c b/drivers/net/ethernet/intel/igb/igb_cdev.c
index 28fafec..5e6579c 100644
--- a/drivers/net/ethernet/intel/igb/igb_cdev.c
+++ b/drivers/net/ethernet/intel/igb/igb_cdev.c
@@ -411,8 +411,7 @@ static int igb_close_file(struct inode *inode, struct file *file)
 
 		list_for_each_entry_safe(userpage, tmp,
 					 &adapter->user_page_list, page_node) {
-			if (userpage)
-				igb_free_page(adapter, userpage);
+			igb_free_page(adapter, userpage);
 		}
 	}
 	mutex_unlock(&adapter->user_page_mutex);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [net-next 10/20] igb: fix semicolon.cocci warnings
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
                   ` (8 preceding siblings ...)
  2016-02-24  4:26 ` [net-next 09/20] igb: fix itnull.cocci warnings Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 11/20] igb: When GbE link up, wait for Remote receiver status condition Jeff Kirsher
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC (permalink / raw)
  To: davem
  Cc: Julia Lawall, netdev, nhorman, sassmann, jogreene, Gangfeng Huang,
	Fengguang Wu, Jeff Kirsher

From: Julia Lawall <julia.lawall@lip6.fr>

Remove unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci

CC: Gangfeng Huang <gangfeng.huang@ni.com>
Signed-off-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_cdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_cdev.c b/drivers/net/ethernet/intel/igb/igb_cdev.c
index 5e6579c..6bb3b66 100644
--- a/drivers/net/ethernet/intel/igb/igb_cdev.c
+++ b/drivers/net/ethernet/intel/igb/igb_cdev.c
@@ -361,7 +361,7 @@ static long igb_ioctl_file(struct file *file, unsigned int cmd,
 	default:
 		err = -EINVAL;
 		break;
-	};
+	}
 
 	return err;
 }
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [net-next 11/20] igb: When GbE link up, wait for Remote receiver status condition
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
                   ` (9 preceding siblings ...)
  2016-02-24  4:26 ` [net-next 10/20] igb: fix semicolon.cocci warnings Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 12/20] igb: constify e1000_phy_operations structure Jeff Kirsher
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC (permalink / raw)
  To: davem; +Cc: Takuma Ueba, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Takuma Ueba <t.ueba11@gmail.com>

I210 device IPv6 autoconf test sometimes fails,
because DAD NS for link-local is not transmitted.
This packet is silently dropped.
This problem is seen only in a GbE environment.

After igb_watchdog_task detects link up, the following is observed:
1. The Remote receiver status bit in the PHY 1000BASE-T Status register
   is not yet OK (it becomes OK after about 200 - 700ms).
2. While the bit is not OK, transmitted packets are silently dropped.

1000BASE-T Status register
[Expected]: 0x3800 or 0x7800
[problem occurred]: 0x2800 or 0x6800
Frequency of occurrence: approx 1/10 - 1/40 observed

In order to avoid this problem, wait until the 1000BASE-T Status
register reports "Remote receiver status OK".

After applying this patch, at least 400 runs succeed with no problems.
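
The expected and failing register values quoted above differ only in
bit 12 (0x1000), which is the bit the patch polls through
SR_1000T_REMOTE_RX_STATUS.  A minimal userspace sketch of that check
follows.  The mask value is an assumption inferred from the values in
this message, and the helper name is hypothetical, not part of the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed mask for "Remote receiver status OK" (bit 12); it matches the
 * difference between the expected (0x3800/0x7800) and failing
 * (0x2800/0x6800) values quoted in the commit message.
 */
#define REMOTE_RX_STATUS_MASK 0x1000u

/* Hypothetical helper: true when the link partner's receiver reports OK. */
static bool remote_rx_status_ok(uint16_t phy_1000t_status)
{
	return (phy_1000t_status & REMOTE_RX_STATUS_MASK) != 0;
}
```

With this helper, 0x3800 and 0x7800 test as OK, while 0x2800 and 0x6800
do not, which is the condition the retry loop in the patch waits on.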

Signed-off-by: Takuma Ueba <t.ueba11@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 0d501a8..0f7805c 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -4449,6 +4449,7 @@ static void igb_watchdog_task(struct work_struct *work)
 	u32 link;
 	int i;
 	u32 connsw;
+	u16 phy_data, retry_count = 20;
 
 	link = igb_has_link(adapter);
 
@@ -4527,6 +4528,25 @@ static void igb_watchdog_task(struct work_struct *work)
 				break;
 			}
 
+			if (adapter->link_speed != SPEED_1000)
+				goto no_wait;
+
+			/* wait for Remote receiver status OK */
+retry_read_status:
+			if (!igb_read_phy_reg(hw, PHY_1000T_STATUS,
+					      &phy_data)) {
+				if (!(phy_data & SR_1000T_REMOTE_RX_STATUS) &&
+				    retry_count) {
+					msleep(100);
+					retry_count--;
+					goto retry_read_status;
+				} else if (!retry_count) {
+					dev_err(&adapter->pdev->dev, "exceed max 2 second\n");
+				}
+			} else {
+				dev_err(&adapter->pdev->dev, "read 1000Base-T Status Reg\n");
+			}
+no_wait:
 			netif_carrier_on(netdev);
 
 			igb_ping_all_vfs(adapter);
-- 
2.5.0

* [net-next 12/20] igb: constify e1000_phy_operations structure
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
                   ` (10 preceding siblings ...)
  2016-02-24  4:26 ` [net-next 11/20] igb: When GbE link up, wait for Remote receiver status condition Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 13/20] igb: enable WoL for OEM devices regardless of EEPROM setting Jeff Kirsher
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC (permalink / raw)
  To: davem
  Cc: Julia Lawall, netdev, nhorman, sassmann, jogreene, Julia Lawall,
	Jeff Kirsher

From: Julia Lawall <julia.lawall@lip6.fr>

This e1000_phy_operations structure is never modified, so declare it as
const.  Other structures of this type are already const.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/e1000_82575.c | 2 +-
 drivers/net/ethernet/intel/igb/e1000_hw.h    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/e1000_82575.c b/drivers/net/ethernet/intel/igb/e1000_82575.c
index 9a1a9c7..a23aa67 100644
--- a/drivers/net/ethernet/intel/igb/e1000_82575.c
+++ b/drivers/net/ethernet/intel/igb/e1000_82575.c
@@ -2920,7 +2920,7 @@ static struct e1000_mac_operations e1000_mac_ops_82575 = {
 #endif
 };
 
-static struct e1000_phy_operations e1000_phy_ops_82575 = {
+static const struct e1000_phy_operations e1000_phy_ops_82575 = {
 	.acquire              = igb_acquire_phy_82575,
 	.get_cfg_done         = igb_get_cfg_done_82575,
 	.release              = igb_release_phy_82575,
diff --git a/drivers/net/ethernet/intel/igb/e1000_hw.h b/drivers/net/ethernet/intel/igb/e1000_hw.h
index f0c416e..2fb2213 100644
--- a/drivers/net/ethernet/intel/igb/e1000_hw.h
+++ b/drivers/net/ethernet/intel/igb/e1000_hw.h
@@ -372,7 +372,7 @@ struct e1000_thermal_sensor_data {
 struct e1000_info {
 	s32 (*get_invariants)(struct e1000_hw *);
 	struct e1000_mac_operations *mac_ops;
-	struct e1000_phy_operations *phy_ops;
+	const struct e1000_phy_operations *phy_ops;
 	struct e1000_nvm_operations *nvm_ops;
 };
 
-- 
2.5.0

* [net-next 13/20] igb: enable WoL for OEM devices regardless of EEPROM setting
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
                   ` (11 preceding siblings ...)
  2016-02-24  4:26 ` [net-next 12/20] igb: constify e1000_phy_operations structure Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 14/20] igb: add conditions for I210 to generate periodic clock output Jeff Kirsher
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC (permalink / raw)
  To: davem; +Cc: Todd Fujinaka, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Todd Fujinaka <todd.fujinaka@intel.com>

Override EEPROM settings for specific OEM devices.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 0f7805c..cd3708b 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2584,6 +2584,26 @@ static int igb_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		adapter->wol = 0;
 	}
 
+	/* Some vendors want the ability to Use the EEPROM setting as
+	 * enable/disable only, and not for capability
+	 */
+	if (((hw->mac.type == e1000_i350) ||
+	     (hw->mac.type == e1000_i354)) &&
+	    (pdev->subsystem_vendor == PCI_VENDOR_ID_DELL)) {
+		adapter->flags |= IGB_FLAG_WOL_SUPPORTED;
+		adapter->wol = 0;
+	}
+	if (hw->mac.type == e1000_i350) {
+		if (((pdev->subsystem_device == 0x5001) ||
+		     (pdev->subsystem_device == 0x5002)) &&
+				(hw->bus.func == 0)) {
+			adapter->flags |= IGB_FLAG_WOL_SUPPORTED;
+			adapter->wol = 0;
+		}
+		if (pdev->subsystem_device == 0x1F52)
+			adapter->flags |= IGB_FLAG_WOL_SUPPORTED;
+	}
+
 	device_set_wakeup_enable(&adapter->pdev->dev,
 				 adapter->flags & IGB_FLAG_WOL_SUPPORTED);
 
-- 
2.5.0

* [net-next 14/20] igb: add conditions for I210 to generate periodic clock output
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
                   ` (12 preceding siblings ...)
  2016-02-24  4:26 ` [net-next 13/20] igb: enable WoL for OEM devices regardless of EEPROM setting Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 15/20] igb: rename igb define to be more generic Jeff Kirsher
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC (permalink / raw)
  To: davem; +Cc: Roland Hii, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Roland Hii <roland.king.guan.hii@intel.com>

In the general case, the maximum supported half cycle time of the
synchronized output clock is 70msec. A half cycle time slower than 70msec
can also be programmed, as long as the output clock is synchronized to
whole seconds; this is useful specifically for generating a 1Hz clock.

Permitted values for the clock half cycle time are: 125,000,000 decimal,
250,000,000 decimal and 500,000,000 decimal (equal to 125msec, 250msec
and 500msec respectively).

Before this patch, only half cycle times of less than or equal to 70msec
used the I210 clock output function. This patch adds conditions so that
half cycle times of exactly 125msec, 250msec or 500msec also use the
clock output function.

Under all other conditions, the interrupt-driven target time output event
method is still used to generate the desired clock output.
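
As a rough illustration of the rule described above, the sketch below
takes a requested period in nanoseconds, halves it, and reports whether
the half cycle time qualifies for the clock output function.  The helper
name is hypothetical, and the sketch leaves out the driver's other checks
(the "on" flag and the rejection of half cycle times below 8ns).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper mirroring the condition added by this patch:
 * the half cycle time must be <= 70msec, or exactly 125, 250 or 500msec.
 */
static bool half_cycle_uses_clock_output(int64_t period_ns)
{
	int64_t half = period_ns >> 1;	/* half cycle time in ns */

	return half <= 70000000LL ||
	       half == 125000000LL ||
	       half == 250000000LL ||
	       half == 500000000LL;
}
```

A 1Hz request (period of 1,000,000,000ns) has a 500msec half cycle and
qualifies; a 200msec period (100msec half cycle) does not and would fall
back to the interrupt-driven method.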

Signed-off-by: Roland Hii <roland.king.guan.hii@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_ptp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_ptp.c b/drivers/net/ethernet/intel/igb/igb_ptp.c
index c44df87..22a8a29 100644
--- a/drivers/net/ethernet/intel/igb/igb_ptp.c
+++ b/drivers/net/ethernet/intel/igb/igb_ptp.c
@@ -525,7 +525,8 @@ static int igb_ptp_feature_enable_i210(struct ptp_clock_info *ptp,
 		ts.tv_nsec = rq->perout.period.nsec;
 		ns = timespec64_to_ns(&ts);
 		ns = ns >> 1;
-		if (on && ns <= 70000000LL) {
+		if (on && ((ns <= 70000000LL) || (ns == 125000000LL) ||
+			   (ns == 250000000LL) || (ns == 500000000LL))) {
 			if (ns < 8LL)
 				return -EINVAL;
 			use_freq = 1;
-- 
2.5.0

* [net-next 15/20] igb: rename igb define to be more generic
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
                   ` (13 preceding siblings ...)
  2016-02-24  4:26 ` [net-next 14/20] igb: add conditions for I210 to generate periodic clock output Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 16/20] igb: Add support for generic Tx checksums Jeff Kirsher
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC (permalink / raw)
  To: davem; +Cc: Todd Fujinaka, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Todd Fujinaka <todd.fujinaka@intel.com>

E1000_MRQC_ENABLE_RSS_4Q enables 4 and 8 queues depending on the part
so rename to be generic.

Similarly, E1000_MRQC_ENABLE_VMDQ_RSS_2Q has no numeric meaning so
rename to be more generic.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/e1000_82575.h | 4 ++--
 drivers/net/ethernet/intel/igb/igb_main.c    | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/e1000_82575.h b/drivers/net/ethernet/intel/igb/e1000_82575.h
index 2154aea..de8805a 100644
--- a/drivers/net/ethernet/intel/igb/e1000_82575.h
+++ b/drivers/net/ethernet/intel/igb/e1000_82575.h
@@ -56,10 +56,10 @@ s32 igb_write_i2c_byte(struct e1000_hw *hw, u8 byte_offset, u8 dev_addr,
 #define E1000_SRRCTL_TIMESTAMP                          0x40000000
 
 
-#define E1000_MRQC_ENABLE_RSS_4Q            0x00000002
+#define E1000_MRQC_ENABLE_RSS_MQ            0x00000002
 #define E1000_MRQC_ENABLE_VMDQ              0x00000003
 #define E1000_MRQC_RSS_FIELD_IPV4_UDP       0x00400000
-#define E1000_MRQC_ENABLE_VMDQ_RSS_2Q       0x00000005
+#define E1000_MRQC_ENABLE_VMDQ_RSS_MQ       0x00000005
 #define E1000_MRQC_RSS_FIELD_IPV6_UDP       0x00800000
 #define E1000_MRQC_RSS_FIELD_IPV6_UDP_EX    0x01000000
 
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index cd3708b..855eebf 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3584,12 +3584,12 @@ static void igb_setup_mrqc(struct igb_adapter *adapter)
 			wr32(E1000_VT_CTL, vtctl);
 		}
 		if (adapter->rss_queues > 1)
-			mrqc |= E1000_MRQC_ENABLE_VMDQ_RSS_2Q;
+			mrqc |= E1000_MRQC_ENABLE_VMDQ_RSS_MQ;
 		else
 			mrqc |= E1000_MRQC_ENABLE_VMDQ;
 	} else {
 		if (hw->mac.type != e1000_i211)
-			mrqc |= E1000_MRQC_ENABLE_RSS_4Q;
+			mrqc |= E1000_MRQC_ENABLE_RSS_MQ;
 	}
 
 	igb_vmm_control(adapter);
-- 
2.5.0

* [net-next 16/20] igb: Add support for generic Tx checksums
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
                   ` (14 preceding siblings ...)
  2016-02-24  4:26 ` [net-next 15/20] igb: rename igb define to be more generic Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 17/20] igbvf: " Jeff Kirsher
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher

From: Alexander Duyck <aduyck@mirantis.com>

This patch adds support for generic Tx checksums to the igb driver.  It
turns out this is actually pretty easy after going over the datasheet as we
were doing a number of steps we didn't need to.

In order to perform a Tx checksum for an L4 header we need to fill in the
following fields in the Tx descriptor:
  MACLEN (maximum of 127), retrieved from:
		skb_network_offset()
  IPLEN  (maximum of 511), retrieved from:
		skb_checksum_start_offset() - skb_network_offset()
  TUCMD.L4T indicates offset and if checksum or crc32c, based on:
		skb->csum_offset

The added advantage to doing this is that we can support inner checksum
offloads for tunnels and MPLS while still being able to transparently
insert VLAN tags.

I also took the opportunity to clean up many of the feature flag
configuration bits to make them a bit more consistent between drivers.
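
To make the MACLEN/IPLEN description concrete, here is a small
standalone sketch of how the two lengths could be packed into one
descriptor word.  The shift of 9 is assumed from
E1000_ADVTXD_MACLEN_SHIFT; the helper name is hypothetical, and the VLAN
tag bits that the driver also ORs in are left out.

```c
#include <stdint.h>

/* Assumed to match E1000_ADVTXD_MACLEN_SHIFT. */
#define MACLEN_SHIFT 9

/* Hypothetical helper: pack MACLEN and IPLEN the way the commit message
 * describes.  network_offset stands in for skb_network_offset() and
 * csum_start_offset for skb_checksum_start_offset().
 */
static uint32_t pack_macip_lens(uint32_t network_offset,
				uint32_t csum_start_offset)
{
	uint32_t maclen = network_offset;		       /* max 127 */
	uint32_t iplen = csum_start_offset - network_offset; /* max 511 */

	return (maclen << MACLEN_SHIFT) | iplen;
}
```

For an untagged Ethernet frame carrying IPv4 without options, the
network header starts at offset 14 and the L4 header at offset 34, giving
MACLEN = 14 and IPLEN = 20.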

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 106 ++++++++++++++----------------
 1 file changed, 48 insertions(+), 58 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 855eebf..96a53bf 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2418,27 +2418,35 @@ static int igb_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	 * assignment.
 	 */
 	netdev->features |= NETIF_F_SG |
-			    NETIF_F_IP_CSUM |
-			    NETIF_F_IPV6_CSUM |
 			    NETIF_F_TSO |
 			    NETIF_F_TSO6 |
 			    NETIF_F_RXHASH |
 			    NETIF_F_RXCSUM |
+			    NETIF_F_HW_CSUM |
 			    NETIF_F_HW_VLAN_CTAG_RX |
 			    NETIF_F_HW_VLAN_CTAG_TX;
 
+	if (hw->mac.type >= e1000_82576)
+		netdev->features |= NETIF_F_SCTP_CRC;
+
 	/* copy netdev features into list of user selectable features */
 	netdev->hw_features |= netdev->features;
 	netdev->hw_features |= NETIF_F_RXALL;
 
+	if (hw->mac.type >= e1000_i350)
+		netdev->hw_features |= NETIF_F_NTUPLE;
+
 	/* set this bit last since it cannot be part of hw_features */
 	netdev->features |= NETIF_F_HW_VLAN_CTAG_FILTER;
 
-	netdev->vlan_features |= NETIF_F_TSO |
+	netdev->vlan_features |= NETIF_F_SG |
+				 NETIF_F_TSO |
 				 NETIF_F_TSO6 |
-				 NETIF_F_IP_CSUM |
-				 NETIF_F_IPV6_CSUM |
-				 NETIF_F_SG;
+				 NETIF_F_HW_CSUM |
+				 NETIF_F_SCTP_CRC;
+
+	netdev->mpls_features |= NETIF_F_HW_CSUM;
+	netdev->hw_enc_features |= NETIF_F_HW_CSUM;
 
 	netdev->priv_flags |= IFF_SUPP_NOFCS;
 
@@ -2447,11 +2455,6 @@ static int igb_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		netdev->vlan_features |= NETIF_F_HIGHDMA;
 	}
 
-	if (hw->mac.type >= e1000_82576) {
-		netdev->hw_features |= NETIF_F_SCTP_CRC;
-		netdev->features |= NETIF_F_SCTP_CRC;
-	}
-
 	netdev->priv_flags |= IFF_UNICAST_FLT;
 
 	adapter->en_mng_pt = igb_enable_mng_pass_thru(hw);
@@ -4975,70 +4978,57 @@ static int igb_tso(struct igb_ring *tx_ring,
 	return 1;
 }
 
+static inline bool igb_ipv6_csum_is_sctp(struct sk_buff *skb)
+{
+	unsigned int offset = 0;
+
+	ipv6_find_hdr(skb, &offset, IPPROTO_SCTP, NULL, NULL);
+
+	return offset == skb_checksum_start_offset(skb);
+}
+
 static void igb_tx_csum(struct igb_ring *tx_ring, struct igb_tx_buffer *first)
 {
 	struct sk_buff *skb = first->skb;
 	u32 vlan_macip_lens = 0;
-	u32 mss_l4len_idx = 0;
 	u32 type_tucmd = 0;
 
 	if (skb->ip_summed != CHECKSUM_PARTIAL) {
+csum_failed:
 		if (!(first->tx_flags & IGB_TX_FLAGS_VLAN))
 			return;
-	} else {
-		u8 l4_hdr = 0;
-
-		switch (first->protocol) {
-		case htons(ETH_P_IP):
-			vlan_macip_lens |= skb_network_header_len(skb);
-			type_tucmd |= E1000_ADVTXD_TUCMD_IPV4;
-			l4_hdr = ip_hdr(skb)->protocol;
-			break;
-		case htons(ETH_P_IPV6):
-			vlan_macip_lens |= skb_network_header_len(skb);
-			l4_hdr = ipv6_hdr(skb)->nexthdr;
-			break;
-		default:
-			if (unlikely(net_ratelimit())) {
-				dev_warn(tx_ring->dev,
-					 "partial checksum but proto=%x!\n",
-					 first->protocol);
-			}
-			break;
-		}
+		goto no_csum;
+	}
 
-		switch (l4_hdr) {
-		case IPPROTO_TCP:
-			type_tucmd |= E1000_ADVTXD_TUCMD_L4T_TCP;
-			mss_l4len_idx = tcp_hdrlen(skb) <<
-					E1000_ADVTXD_L4LEN_SHIFT;
-			break;
-		case IPPROTO_SCTP:
-			type_tucmd |= E1000_ADVTXD_TUCMD_L4T_SCTP;
-			mss_l4len_idx = sizeof(struct sctphdr) <<
-					E1000_ADVTXD_L4LEN_SHIFT;
-			break;
-		case IPPROTO_UDP:
-			mss_l4len_idx = sizeof(struct udphdr) <<
-					E1000_ADVTXD_L4LEN_SHIFT;
-			break;
-		default:
-			if (unlikely(net_ratelimit())) {
-				dev_warn(tx_ring->dev,
-					 "partial checksum but l4 proto=%x!\n",
-					 l4_hdr);
-			}
+	switch (skb->csum_offset) {
+	case offsetof(struct tcphdr, check):
+		type_tucmd = E1000_ADVTXD_TUCMD_L4T_TCP;
+		/* fall through */
+	case offsetof(struct udphdr, check):
+		break;
+	case offsetof(struct sctphdr, checksum):
+		/* validate that this is actually an SCTP request */
+		if (((first->protocol == htons(ETH_P_IP)) &&
+		     (ip_hdr(skb)->protocol == IPPROTO_SCTP)) ||
+		    ((first->protocol == htons(ETH_P_IPV6)) &&
+		     igb_ipv6_csum_is_sctp(skb))) {
+			type_tucmd = E1000_ADVTXD_TUCMD_L4T_SCTP;
 			break;
 		}
-
-		/* update TX checksum flag */
-		first->tx_flags |= IGB_TX_FLAGS_CSUM;
+	default:
+		skb_checksum_help(skb);
+		goto csum_failed;
 	}
 
+	/* update TX checksum flag */
+	first->tx_flags |= IGB_TX_FLAGS_CSUM;
+	vlan_macip_lens = skb_checksum_start_offset(skb) -
+			  skb_network_offset(skb);
+no_csum:
 	vlan_macip_lens |= skb_network_offset(skb) << E1000_ADVTXD_MACLEN_SHIFT;
 	vlan_macip_lens |= first->tx_flags & IGB_TX_FLAGS_VLAN_MASK;
 
-	igb_tx_ctxtdesc(tx_ring, vlan_macip_lens, type_tucmd, mss_l4len_idx);
+	igb_tx_ctxtdesc(tx_ring, vlan_macip_lens, type_tucmd, 0);
 }
 
 #define IGB_SET_FLAG(_input, _flag, _result) \
-- 
2.5.0

* [net-next 17/20] igbvf: Add support for generic Tx checksums
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
                   ` (15 preceding siblings ...)
  2016-02-24  4:26 ` [net-next 16/20] igb: Add support for generic Tx checksums Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 18/20] igbvf: remove "link is Up" message when registering mcast address Jeff Kirsher
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher

From: Alexander Duyck <aduyck@mirantis.com>

This patch adds support for generic Tx checksums to the igbvf driver.  It
turns out this is actually pretty easy after going over the datasheet as we
were doing a number of steps we didn't need to.

In order to perform a Tx checksum for an L4 header we need to fill in the
following fields in the Tx descriptor:
  MACLEN (maximum of 127), retrieved from:
		skb_network_offset()
  IPLEN  (maximum of 511), retrieved from:
		skb_checksum_start_offset() - skb_network_offset()
  TUCMD.L4T indicates offset and if checksum or crc32c, based on:
		skb->csum_offset

The added advantage to doing this is that we can support inner checksum
offloads for tunnels and MPLS while still being able to transparently
insert VLAN tags.

I also took the opportunity to clean up many of the feature flag
configuration bits to make them a bit more consistent between drivers.  In
the case of the VF drivers this meant adding support for SCTP CRCs, and
inner checksum offloads for MPLS and various tunnel types.
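
The switch on skb->csum_offset in this patch identifies the L4 protocol
purely from where the checksum field sits inside the header.  The sketch
below shows why that works, using simplified stand-in structs rather than
the kernel's tcphdr/udphdr/sctphdr; all names here are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel header structs (assumed layouts). */
struct tcp_hdr_sketch {
	uint16_t source, dest;
	uint32_t seq, ack_seq;
	uint16_t flags, window, check, urg_ptr;
};

struct udp_hdr_sketch {
	uint16_t source, dest, len, check;
};

struct sctp_hdr_sketch {
	uint16_t source, dest;
	uint32_t vtag, checksum;
};

enum l4_kind {
	L4_TCP,
	L4_UDP,
	L4_SCTP_CANDIDATE,	/* the driver still validates the protocol */
	L4_SOFTWARE_FALLBACK	/* corresponds to skb_checksum_help() */
};

/* The three checksum offsets (16, 6 and 8) are distinct, so the offset
 * alone selects the case, as in the switch statement in the patch.
 */
static enum l4_kind classify_csum_offset(size_t csum_offset)
{
	switch (csum_offset) {
	case offsetof(struct tcp_hdr_sketch, check):
		return L4_TCP;
	case offsetof(struct udp_hdr_sketch, check):
		return L4_UDP;
	case offsetof(struct sctp_hdr_sketch, checksum):
		return L4_SCTP_CANDIDATE;
	default:
		return L4_SOFTWARE_FALLBACK;
	}
}
```

Any other offset takes the default branch, which in the patch means the
checksum is computed in software instead of being offloaded.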

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igbvf/netdev.c | 142 +++++++++++++++++-------------
 drivers/net/ethernet/intel/igbvf/vf.h     |   1 +
 2 files changed, 82 insertions(+), 61 deletions(-)

diff --git a/drivers/net/ethernet/intel/igbvf/netdev.c b/drivers/net/ethernet/intel/igbvf/netdev.c
index 297af80..aa34865 100644
--- a/drivers/net/ethernet/intel/igbvf/netdev.c
+++ b/drivers/net/ethernet/intel/igbvf/netdev.c
@@ -43,6 +43,7 @@
 #include <linux/ethtool.h>
 #include <linux/if_vlan.h>
 #include <linux/prefetch.h>
+#include <linux/sctp.h>
 
 #include "igbvf.h"
 
@@ -1908,6 +1909,31 @@ static void igbvf_watchdog_task(struct work_struct *work)
 #define IGBVF_TX_FLAGS_VLAN_MASK	0xffff0000
 #define IGBVF_TX_FLAGS_VLAN_SHIFT	16
 
+static void igbvf_tx_ctxtdesc(struct igbvf_ring *tx_ring, u32 vlan_macip_lens,
+			      u32 type_tucmd, u32 mss_l4len_idx)
+{
+	struct e1000_adv_tx_context_desc *context_desc;
+	struct igbvf_buffer *buffer_info;
+	u16 i = tx_ring->next_to_use;
+
+	context_desc = IGBVF_TX_CTXTDESC_ADV(*tx_ring, i);
+	buffer_info = &tx_ring->buffer_info[i];
+
+	i++;
+	tx_ring->next_to_use = (i < tx_ring->count) ? i : 0;
+
+	/* set bits to identify this as an advanced context descriptor */
+	type_tucmd |= E1000_TXD_CMD_DEXT | E1000_ADVTXD_DTYP_CTXT;
+
+	context_desc->vlan_macip_lens	= cpu_to_le32(vlan_macip_lens);
+	context_desc->seqnum_seed	= 0;
+	context_desc->type_tucmd_mlhl	= cpu_to_le32(type_tucmd);
+	context_desc->mss_l4len_idx	= cpu_to_le32(mss_l4len_idx);
+
+	buffer_info->time_stamp = jiffies;
+	buffer_info->dma = 0;
+}
+
 static int igbvf_tso(struct igbvf_adapter *adapter,
 		     struct igbvf_ring *tx_ring,
 		     struct sk_buff *skb, u32 tx_flags, u8 *hdr_len,
@@ -1987,65 +2013,56 @@ static int igbvf_tso(struct igbvf_adapter *adapter,
 	return true;
 }
 
-static inline bool igbvf_tx_csum(struct igbvf_adapter *adapter,
-				 struct igbvf_ring *tx_ring,
-				 struct sk_buff *skb, u32 tx_flags,
-				 __be16 protocol)
+static inline bool igbvf_ipv6_csum_is_sctp(struct sk_buff *skb)
 {
-	struct e1000_adv_tx_context_desc *context_desc;
-	unsigned int i;
-	struct igbvf_buffer *buffer_info;
-	u32 info = 0, tu_cmd = 0;
-
-	if ((skb->ip_summed == CHECKSUM_PARTIAL) ||
-	    (tx_flags & IGBVF_TX_FLAGS_VLAN)) {
-		i = tx_ring->next_to_use;
-		buffer_info = &tx_ring->buffer_info[i];
-		context_desc = IGBVF_TX_CTXTDESC_ADV(*tx_ring, i);
+	unsigned int offset = 0;
 
-		if (tx_flags & IGBVF_TX_FLAGS_VLAN)
-			info |= (tx_flags & IGBVF_TX_FLAGS_VLAN_MASK);
+	ipv6_find_hdr(skb, &offset, IPPROTO_SCTP, NULL, NULL);
 
-		info |= (skb_network_offset(skb) << E1000_ADVTXD_MACLEN_SHIFT);
-		if (skb->ip_summed == CHECKSUM_PARTIAL)
-			info |= (skb_transport_header(skb) -
-				 skb_network_header(skb));
+	return offset == skb_checksum_start_offset(skb);
+}
 
-		context_desc->vlan_macip_lens = cpu_to_le32(info);
+static bool igbvf_tx_csum(struct igbvf_ring *tx_ring, struct sk_buff *skb,
+			  u32 tx_flags, __be16 protocol)
+{
+	u32 vlan_macip_lens = 0;
+	u32 type_tucmd = 0;
 
-		tu_cmd |= (E1000_TXD_CMD_DEXT | E1000_ADVTXD_DTYP_CTXT);
+	if (skb->ip_summed != CHECKSUM_PARTIAL) {
+csum_failed:
+		if (!(tx_flags & IGBVF_TX_FLAGS_VLAN))
+			return false;
+		goto no_csum;
+	}
 
-		if (skb->ip_summed == CHECKSUM_PARTIAL) {
-			switch (protocol) {
-			case htons(ETH_P_IP):
-				tu_cmd |= E1000_ADVTXD_TUCMD_IPV4;
-				if (ip_hdr(skb)->protocol == IPPROTO_TCP)
-					tu_cmd |= E1000_ADVTXD_TUCMD_L4T_TCP;
-				break;
-			case htons(ETH_P_IPV6):
-				if (ipv6_hdr(skb)->nexthdr == IPPROTO_TCP)
-					tu_cmd |= E1000_ADVTXD_TUCMD_L4T_TCP;
-				break;
-			default:
-				break;
-			}
+	switch (skb->csum_offset) {
+	case offsetof(struct tcphdr, check):
+		type_tucmd = E1000_ADVTXD_TUCMD_L4T_TCP;
+		/* fall through */
+	case offsetof(struct udphdr, check):
+		break;
+	case offsetof(struct sctphdr, checksum):
+		/* validate that this is actually an SCTP request */
+		if (((protocol == htons(ETH_P_IP)) &&
+		     (ip_hdr(skb)->protocol == IPPROTO_SCTP)) ||
+		    ((protocol == htons(ETH_P_IPV6)) &&
+		     igbvf_ipv6_csum_is_sctp(skb))) {
+			type_tucmd = E1000_ADVTXD_TUCMD_L4T_SCTP;
+			break;
 		}
-
-		context_desc->type_tucmd_mlhl = cpu_to_le32(tu_cmd);
-		context_desc->seqnum_seed = 0;
-		context_desc->mss_l4len_idx = 0;
-
-		buffer_info->time_stamp = jiffies;
-		buffer_info->dma = 0;
-		i++;
-		if (i == tx_ring->count)
-			i = 0;
-		tx_ring->next_to_use = i;
-
-		return true;
+	default:
+		skb_checksum_help(skb);
+		goto csum_failed;
 	}
 
-	return false;
+	vlan_macip_lens = skb_checksum_start_offset(skb) -
+			  skb_network_offset(skb);
+no_csum:
+	vlan_macip_lens |= skb_network_offset(skb) << E1000_ADVTXD_MACLEN_SHIFT;
+	vlan_macip_lens |= tx_flags & IGBVF_TX_FLAGS_VLAN_MASK;
+
+	igbvf_tx_ctxtdesc(tx_ring, vlan_macip_lens, type_tucmd, 0);
+	return true;
 }
 
 static int igbvf_maybe_stop_tx(struct net_device *netdev, int size)
@@ -2264,7 +2281,7 @@ static netdev_tx_t igbvf_xmit_frame_ring_adv(struct sk_buff *skb,
 
 	if (tso)
 		tx_flags |= IGBVF_TX_FLAGS_TSO;
-	else if (igbvf_tx_csum(adapter, tx_ring, skb, tx_flags, protocol) &&
+	else if (igbvf_tx_csum(tx_ring, skb, tx_flags, protocol) &&
 		 (skb->ip_summed == CHECKSUM_PARTIAL))
 		tx_flags |= IGBVF_TX_FLAGS_CSUM;
 
@@ -2717,11 +2734,11 @@ static int igbvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	adapter->bd_number = cards_found++;
 
 	netdev->hw_features = NETIF_F_SG |
-			   NETIF_F_IP_CSUM |
-			   NETIF_F_IPV6_CSUM |
-			   NETIF_F_TSO |
-			   NETIF_F_TSO6 |
-			   NETIF_F_RXCSUM;
+			      NETIF_F_TSO |
+			      NETIF_F_TSO6 |
+			      NETIF_F_RXCSUM |
+			      NETIF_F_HW_CSUM |
+			      NETIF_F_SCTP_CRC;
 
 	netdev->features = netdev->hw_features |
 			   NETIF_F_HW_VLAN_CTAG_TX |
@@ -2731,11 +2748,14 @@ static int igbvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (pci_using_dac)
 		netdev->features |= NETIF_F_HIGHDMA;
 
-	netdev->vlan_features |= NETIF_F_TSO;
-	netdev->vlan_features |= NETIF_F_TSO6;
-	netdev->vlan_features |= NETIF_F_IP_CSUM;
-	netdev->vlan_features |= NETIF_F_IPV6_CSUM;
-	netdev->vlan_features |= NETIF_F_SG;
+	netdev->vlan_features |= NETIF_F_SG |
+				 NETIF_F_TSO |
+				 NETIF_F_TSO6 |
+				 NETIF_F_HW_CSUM |
+				 NETIF_F_SCTP_CRC;
+
+	netdev->mpls_features |= NETIF_F_HW_CSUM;
+	netdev->hw_enc_features |= NETIF_F_HW_CSUM;
 
 	/*reset the controller to put the device in a known good state */
 	err = hw->mac.ops.reset_hw(hw);
diff --git a/drivers/net/ethernet/intel/igbvf/vf.h b/drivers/net/ethernet/intel/igbvf/vf.h
index 0f1eca6..f00a41d 100644
--- a/drivers/net/ethernet/intel/igbvf/vf.h
+++ b/drivers/net/ethernet/intel/igbvf/vf.h
@@ -126,6 +126,7 @@ struct e1000_adv_tx_context_desc {
 #define E1000_ADVTXD_MACLEN_SHIFT	9  /* Adv ctxt desc mac len shift */
 #define E1000_ADVTXD_TUCMD_IPV4		0x00000400 /* IP Packet Type: 1=IPv4 */
 #define E1000_ADVTXD_TUCMD_L4T_TCP	0x00000800 /* L4 Packet TYPE of TCP */
+#define E1000_ADVTXD_TUCMD_L4T_SCTP	0x00001000 /* L4 packet TYPE of SCTP */
 #define E1000_ADVTXD_L4LEN_SHIFT	8  /* Adv ctxt L4LEN shift */
 #define E1000_ADVTXD_MSS_SHIFT		16 /* Adv ctxt MSS shift */
 
-- 
2.5.0

* [net-next 18/20] igbvf: remove "link is Up" message when registering mcast address
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
                   ` (16 preceding siblings ...)
  2016-02-24  4:26 ` [net-next 17/20] igbvf: " Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 19/20] igb: Fix VLAN tag stripping on Intel i350 Jeff Kirsher
  2016-02-24  4:26 ` [net-next 20/20] igb: call ndo_stop() instead of dev_close() when running offline selftest Jeff Kirsher
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC (permalink / raw)
  To: davem; +Cc: Jon Maxwell, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Jon Maxwell <jmaxwell37@gmail.com>

A similar issue was addressed a few years ago in the following thread:

http://www.spinics.net/lists/netdev/msg245877.html

At that time there were concerns that removing this statement may cause other
side effects. However the submitter addressed those concerns. But the dialogue
went cold. We have a new case where a customer's application is registering and
un-registering multicast addresses every few seconds. This is leading to many
"Link is Up" messages in the logs as a result of the
"netif_carrier_off(netdev)" statement called by igbvf_msix_other(). Also on
some kernels it is interfering with the bonding driver causing it to failover
and subsequently affecting connectivity.

The SourceForge driver does not make this call and is therefore not affected.
If there were any side effects I would expect that driver to also be affected.
I have tested re-loading the igbvf driver and downing the adapter with the PF
entity on the host where the VM has this patch. When I bring it back up again
connectivity is restored as expected. Therefore I request that this patch gets
submitted.

Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igbvf/netdev.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igbvf/netdev.c b/drivers/net/ethernet/intel/igbvf/netdev.c
index aa34865..c124422 100644
--- a/drivers/net/ethernet/intel/igbvf/netdev.c
+++ b/drivers/net/ethernet/intel/igbvf/netdev.c
@@ -877,7 +877,6 @@ static irqreturn_t igbvf_msix_other(int irq, void *data)
 
 	adapter->int_counter1++;
 
-	netif_carrier_off(netdev);
 	hw->mac.get_link_status = 1;
 	if (!test_bit(__IGBVF_DOWN, &adapter->state))
 		mod_timer(&adapter->watchdog_timer, jiffies + 1);
-- 
2.5.0

* [net-next 19/20] igb: Fix VLAN tag stripping on Intel i350
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
                   ` (17 preceding siblings ...)
  2016-02-24  4:26 ` [net-next 18/20] igbvf: remove "link is Up" message when registering mcast address Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  2016-02-24  4:26 ` [net-next 20/20] igb: call ndo_stop() instead of dev_close() when running offline selftest Jeff Kirsher
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC (permalink / raw)
  To: davem; +Cc: Corinna Vinschen, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher

From: Corinna Vinschen <vinschen@redhat.com>

Problem: When switching off VLAN offloading on an i350, the VLAN
interface becomes unusable.  For testing, set up a VLAN on an i350
and some remote machine, e.g.:

  $ ip link add link eth0 name eth0.42 type vlan id 42
  $ ip addr add 192.168.42.1/24 dev eth0.42
  $ ip link set dev eth0.42 up

Offloading is switched on by default:

  $ ethtool -k eth0 | grep vlan-offload
  rx-vlan-offload: on
  tx-vlan-offload: on

  $ ping -c 3 -I eth0.42 192.168.42.2
  [...works as usual...]

Now switch off VLAN offloading and try again:

  $ ethtool -K eth0 rxvlan off
  Actual changes:
  rx-vlan-offload: off
  tx-vlan-offload: off [requested on]
  $ ping -c 3 -I eth0.42 192.168.42.2
  PING 192.168.42.2 (192.168.42.2) from 192.168.42.1 eth0.42: 56(84) bytes of data.

  --- 192.168.42.2 ping statistics ---
  3 packets transmitted, 0 received, 100% packet loss, time 1999ms

I can only reproduce it on an i350; the above works fine on an 82580.

While inspecting the igb source, I came across the code in igb_set_vmolr
which sets the E1000_VMOLR_STRVLAN/E1000_DVMOLR_STRVLAN flags once and
for all, and in all of the igb code there's no other place where the
STRVLAN is set or cleared.  Thus, VLAN stripping is enabled in igb
unconditionally, independently of the offloading setting.

I compared that to the latest Intel igb-5.3.3.5 driver from
http://sourceforge.net/projects/e1000/ which in fact sets and clears the
STRVLAN flag independently from igb_set_vmolr in its own function
igb_set_vf_vlan_strip, depending on the vlan settings.

So I included the STRVLAN handling from the igb-5.3.3.5 driver into our
current igb driver and tested the above scenario again.  This time ping
still works after switching off VLAN offloading.

Tested successfully on i350, with and without additional VFs, as well
as on 82580.
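
As a rough illustration of the read-modify-write pattern described above,
here is a minimal user-space sketch.  It is not the driver code: the
register array, the sketch_* names, the pool count and the bit positions
are all assumptions made for the example, standing in for rd32()/wr32()
on E1000_VMOLR(vfn) or E1000_DVMOLR(vfn).  The point is only that the
strip bit is toggled on its own and the other bits of the register are
left alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch only. */
#define SKETCH_NUM_POOLS 8
#define SKETCH_STRVLAN   (1u << 30) /* bit position assumed for illustration */
#define SKETCH_AUPE      (1u << 24) /* bit position assumed for illustration */

/* One mock offload register per pool, replacing real MMIO access. */
static uint32_t sketch_regs[SKETCH_NUM_POOLS];

uint32_t sketch_reg_read(int vfn)
{
	assert(vfn >= 0 && vfn < SKETCH_NUM_POOLS);
	return sketch_regs[vfn];
}

void sketch_reg_write(int vfn, uint32_t val)
{
	assert(vfn >= 0 && vfn < SKETCH_NUM_POOLS);
	sketch_regs[vfn] = val;
}

/* Set or clear only the strip bit; every other bit is preserved. */
void sketch_set_vlan_strip(int vfn, bool enable)
{
	uint32_t val = sketch_reg_read(vfn);

	if (enable)
		val |= SKETCH_STRVLAN;
	else
		val &= ~SKETCH_STRVLAN;
	sketch_reg_write(vfn, val);
}

bool sketch_vlan_strip_enabled(int vfn)
{
	return (sketch_reg_read(vfn) & SKETCH_STRVLAN) != 0;
}
```

Calling sketch_set_vlan_strip(vfn, false) after an earlier enable clears
the strip bit again, which is the step the unconditional setting in
igb_set_vmolr never performed.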

Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 41 ++++++++++++++++++++++++-------
 1 file changed, 32 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 96a53bf..efbcf77 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3677,6 +3677,28 @@ static inline int igb_set_vf_rlpml(struct igb_adapter *adapter, int size,
 	return 0;
 }
 
+static inline void igb_set_vf_vlan_strip(struct igb_adapter *adapter,
+					 int vfn, bool enable)
+{
+	struct e1000_hw *hw = &adapter->hw;
+	u32 val, reg;
+
+	if (hw->mac.type < e1000_82576)
+		return;
+
+	if (hw->mac.type == e1000_i350)
+		reg = E1000_DVMOLR(vfn);
+	else
+		reg = E1000_VMOLR(vfn);
+
+	val = rd32(reg);
+	if (enable)
+		val |= E1000_VMOLR_STRVLAN;
+	else
+		val &= ~(E1000_VMOLR_STRVLAN);
+	wr32(reg, val);
+}
+
 static inline void igb_set_vmolr(struct igb_adapter *adapter,
 				 int vfn, bool aupe)
 {
@@ -3690,14 +3712,6 @@ static inline void igb_set_vmolr(struct igb_adapter *adapter,
 		return;
 
 	vmolr = rd32(E1000_VMOLR(vfn));
-	vmolr |= E1000_VMOLR_STRVLAN; /* Strip vlan tags */
-	if (hw->mac.type == e1000_i350) {
-		u32 dvmolr;
-
-		dvmolr = rd32(E1000_DVMOLR(vfn));
-		dvmolr |= E1000_DVMOLR_STRVLAN;
-		wr32(E1000_DVMOLR(vfn), dvmolr);
-	}
 	if (aupe)
 		vmolr |= E1000_VMOLR_AUPE; /* Accept untagged packets */
 	else
@@ -6195,6 +6209,7 @@ static int igb_enable_port_vlan(struct igb_adapter *adapter, int vf,
 
 	adapter->vf_data[vf].pf_vlan = vlan;
 	adapter->vf_data[vf].pf_qos = qos;
+	igb_set_vf_vlan_strip(adapter, vf, true);
 	dev_info(&adapter->pdev->dev,
 		 "Setting VLAN %d, QOS 0x%x on VF %d\n", vlan, qos, vf);
 	if (test_bit(__IGB_DOWN, &adapter->state)) {
@@ -6222,6 +6237,7 @@ static int igb_disable_port_vlan(struct igb_adapter *adapter, int vf)
 
 	adapter->vf_data[vf].pf_vlan = 0;
 	adapter->vf_data[vf].pf_qos = 0;
+	igb_set_vf_vlan_strip(adapter, vf, false);
 
 	return 0;
 }
@@ -6242,6 +6258,7 @@ static int igb_set_vf_vlan_msg(struct igb_adapter *adapter, u32 *msgbuf, u32 vf)
 {
 	int add = (msgbuf[0] & E1000_VT_MSGINFO_MASK) >> E1000_VT_MSGINFO_SHIFT;
 	int vid = (msgbuf[1] & E1000_VLVF_VLANID_MASK);
+	int ret;
 
 	if (adapter->vf_data[vf].pf_vlan)
 		return -1;
@@ -6250,7 +6267,10 @@ static int igb_set_vf_vlan_msg(struct igb_adapter *adapter, u32 *msgbuf, u32 vf)
 	if (!vid && !add)
 		return 0;
 
-	return igb_set_vf_vlan(adapter, vid, !!add, vf);
+	ret = igb_set_vf_vlan(adapter, vid, !!add, vf);
+	if (!ret)
+		igb_set_vf_vlan_strip(adapter, vf, !!vid);
+	return ret;
 }
 
 static inline void igb_vf_reset(struct igb_adapter *adapter, u32 vf)
@@ -6267,6 +6287,7 @@ static inline void igb_vf_reset(struct igb_adapter *adapter, u32 vf)
 	igb_set_vmvir(adapter, vf_data->pf_vlan |
 			       (vf_data->pf_qos << VLAN_PRIO_SHIFT), vf);
 	igb_set_vmolr(adapter, vf, !vf_data->pf_vlan);
+	igb_set_vf_vlan_strip(adapter, vf, !!(vf_data->pf_vlan));
 
 	/* reset multicast table array for vf */
 	adapter->vf_data[vf].num_vf_mc_hashes = 0;
@@ -7427,6 +7448,8 @@ static void igb_vlan_mode(struct net_device *netdev, netdev_features_t features)
 		ctrl &= ~E1000_CTRL_VME;
 		wr32(E1000_CTRL, ctrl);
 	}
+
+	igb_set_vf_vlan_strip(adapter, adapter->vfs_allocated_count, enable);
 }
 
 static int igb_vlan_rx_add_vid(struct net_device *netdev,
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [net-next 20/20] igb: call ndo_stop() instead of dev_close() when running offline selftest
  2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
                   ` (18 preceding siblings ...)
  2016-02-24  4:26 ` [net-next 19/20] igb: Fix VLAN tag stripping on Intel i350 Jeff Kirsher
@ 2016-02-24  4:26 ` Jeff Kirsher
  19 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24  4:26 UTC (permalink / raw)
  To: davem; +Cc: Stefan Assmann, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Stefan Assmann <sassmann@kpanic.de>

Calling dev_close() causes IFF_UP to be cleared, which removes the
interface's routes and some addresses. That's probably not what the user
intended when running the offline selftest. Besides, this does not happen
if the interface is brought down before the test, so the current
behaviour is inconsistent.

Instead, call the net_device_ops ndo_stop function directly and avoid
touching IFF_UP at all.
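
The difference can be sketched with a small user-space model.  This is
a simplification under stated assumptions, not kernel code: struct
sketch_netdev, the route counter and all sketch_* functions are
hypothetical, and the real dev_close() path does considerably more
(notifiers, address handling) than the two lines shown here.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_IFF_UP 0x1u

/* Hypothetical, heavily reduced model of a network interface. */
struct sketch_netdev {
	unsigned int flags; /* only SKETCH_IFF_UP is modelled */
	bool hw_running;    /* state toggled by ndo_open/ndo_stop */
	int routes;         /* routes that depend on the interface being up */
};

/* Driver callbacks: touch hardware state only. */
int sketch_ndo_open(struct sketch_netdev *dev)
{
	assert(dev);
	dev->hw_running = true;
	return 0;
}

int sketch_ndo_stop(struct sketch_netdev *dev)
{
	assert(dev);
	dev->hw_running = false;
	return 0;
}

/* Core helpers: also change IFF_UP; closing drops the routes. */
void sketch_dev_close(struct sketch_netdev *dev)
{
	if (!(dev->flags & SKETCH_IFF_UP))
		return;
	sketch_ndo_stop(dev);
	dev->flags &= ~SKETCH_IFF_UP;
	dev->routes = 0; /* modelled side effect of clearing IFF_UP */
}

void sketch_dev_open(struct sketch_netdev *dev)
{
	if (dev->flags & SKETCH_IFF_UP)
		return;
	sketch_ndo_open(dev);
	dev->flags |= SKETCH_IFF_UP; /* routes are not restored */
}

/* Old behaviour: selftest goes through dev_close()/dev_open(). */
void sketch_selftest_old(struct sketch_netdev *dev)
{
	/* simplified: treat IFF_UP as "interface is running" */
	bool if_running = (dev->flags & SKETCH_IFF_UP) != 0;

	if (if_running)
		sketch_dev_close(dev);
	/* ... offline tests would run here ... */
	if (if_running)
		sketch_dev_open(dev);
}

/* New behaviour: selftest calls the driver callbacks directly. */
void sketch_selftest_new(struct sketch_netdev *dev)
{
	bool if_running = (dev->flags & SKETCH_IFF_UP) != 0;

	if (if_running)
		sketch_ndo_stop(dev);
	/* ... offline tests would run here ... */
	if (if_running)
		sketch_ndo_open(dev);
}
```

In this model an interface that starts up with three routes comes out of
sketch_selftest_old() with none, while sketch_selftest_new() leaves both
the flag and the routes untouched.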

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb.h         | 2 ++
 drivers/net/ethernet/intel/igb/igb_ethtool.c | 4 ++--
 drivers/net/ethernet/intel/igb/igb_main.c    | 8 ++++----
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index 3fa3a85..3ada885 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -525,6 +525,8 @@ enum igb_boards {
 extern char igb_driver_name[];
 extern char igb_driver_version[];
 
+int igb_open(struct net_device *netdev);
+int igb_close(struct net_device *netdev);
 int igb_up(struct igb_adapter *);
 void igb_down(struct igb_adapter *);
 void igb_reinit_locked(struct igb_adapter *);
diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c b/drivers/net/ethernet/intel/igb/igb_ethtool.c
index 1d329f1..7982243 100644
--- a/drivers/net/ethernet/intel/igb/igb_ethtool.c
+++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c
@@ -2017,7 +2017,7 @@ static void igb_diag_test(struct net_device *netdev,
 
 		if (if_running)
 			/* indicate we're in test mode */
-			dev_close(netdev);
+			igb_close(netdev);
 		else
 			igb_reset(adapter);
 
@@ -2050,7 +2050,7 @@ static void igb_diag_test(struct net_device *netdev,
 
 		clear_bit(__IGB_TESTING, &adapter->state);
 		if (if_running)
-			dev_open(netdev);
+			igb_open(netdev);
 	} else {
 		dev_info(&adapter->pdev->dev, "online testing starting\n");
 
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index efbcf77..79354f6 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -123,8 +123,8 @@ static void igb_setup_mrqc(struct igb_adapter *);
 static int igb_probe(struct pci_dev *, const struct pci_device_id *);
 static void igb_remove(struct pci_dev *pdev);
 static int igb_sw_init(struct igb_adapter *);
-static int igb_open(struct net_device *);
-static int igb_close(struct net_device *);
+int igb_open(struct net_device *);
+int igb_close(struct net_device *);
 static void igb_configure(struct igb_adapter *);
 static void igb_configure_tx(struct igb_adapter *);
 static void igb_configure_rx(struct igb_adapter *);
@@ -3247,7 +3247,7 @@ err_setup_tx:
 	return err;
 }
 
-static int igb_open(struct net_device *netdev)
+int igb_open(struct net_device *netdev)
 {
 	return __igb_open(netdev, false);
 }
@@ -3284,7 +3284,7 @@ static int __igb_close(struct net_device *netdev, bool suspending)
 	return 0;
 }
 
-static int igb_close(struct net_device *netdev)
+int igb_close(struct net_device *netdev)
 {
 	return __igb_close(netdev, false);
 }
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [net-next 07/20] igb: add a character device to support AVB
  2016-02-24  4:26 ` [net-next 07/20] igb: add a character device to support AVB Jeff Kirsher
@ 2016-02-24 20:06   ` Or Gerlitz
  2016-02-24 21:45     ` David Miller
  0 siblings, 1 reply; 24+ messages in thread
From: Or Gerlitz @ 2016-02-24 20:06 UTC (permalink / raw)
  To: Jeff Kirsher, David Miller
  Cc: Gangfeng Huang, Linux Netdev List, nhorman@redhat.com,
	sassmann@redhat.com, jogreene@redhat.com

On Wed, Feb 24, 2016 at 6:26 AM, Jeff Kirsher
<jeffrey.t.kirsher@intel.com> wrote:
> From: Gangfeng Huang <gangfeng.huang@ni.com>

> This patch create a character device for Intel I210 Ethernet controller,

wait, do we want L2 network driver to create char devices

> it can be used for developing Audio/Video Bridging applications,Industrial
> Ethernet applications which require precise timing control over frame
> transmission, or test harnesses for measuring system latencies and sampling
> events.

for various reasons such as the above?

> As the AVB queues (0,1) are mapped to a  user-space application, typical
> LAN traffic must be steered away from these queues. For transmit, this
> driver implements one method registering an ndo_select_queue handler to
> map traffic to queue[3] and set the register MRQC to receive all BE
> traffic to Rx queue[3].
>
> This patch is reference to the Intel Open-AVB project:
> http://github.com/AVnu/Open-AVB/tree/master/kmod/igb
>
> Signed-off-by: Gangfeng Huang <gangfeng.huang@ni.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [net-next 07/20] igb: add a character device to support AVB
  2016-02-24 20:06   ` Or Gerlitz
@ 2016-02-24 21:45     ` David Miller
  2016-02-24 21:50       ` Jeff Kirsher
  0 siblings, 1 reply; 24+ messages in thread
From: David Miller @ 2016-02-24 21:45 UTC (permalink / raw)
  To: gerlitz.or
  Cc: jeffrey.t.kirsher, gangfeng.huang, netdev, nhorman, sassmann,
	jogreene

From: Or Gerlitz <gerlitz.or@gmail.com>
Date: Wed, 24 Feb 2016 22:06:25 +0200

> On Wed, Feb 24, 2016 at 6:26 AM, Jeff Kirsher
> <jeffrey.t.kirsher@intel.com> wrote:
>> From: Gangfeng Huang <gangfeng.huang@ni.com>
> 
>> This patch create a character device for Intel I210 Ethernet controller,
> 
> wait, do we want L2 network driver to create char devices
> 
>> it can be used for developing Audio/Video Bridging applications,Industrial
>> Ethernet applications which require precise timing control over frame
>> transmission, or test harnesses for measuring system latencies and sampling
>> events.
> 
> for various reasons such as the above?

This is definitely not the direction to go for such a facility.
Character devices make no sense at all, and are an invitation for
ad-hoc user interfaces for what should be a generic and clean
facility.

There is no reason we cannot provide this facility with extensions
of traditional networking APIs such as netlink or recvmsg/sendmsg
over a raw or AF_PACKET socket.

If there has been a lot of work, time and effort put into this
character device solution then that's too bad.  Because anything that
ends up being user facing should have been proposed here on netdev
from the start.

I'm not applying this, no way.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [net-next 07/20] igb: add a character device to support AVB
  2016-02-24 21:45     ` David Miller
@ 2016-02-24 21:50       ` Jeff Kirsher
  0 siblings, 0 replies; 24+ messages in thread
From: Jeff Kirsher @ 2016-02-24 21:50 UTC (permalink / raw)
  To: David Miller, gerlitz.or
  Cc: gangfeng.huang, netdev, nhorman, sassmann, jogreene

On Wed, 2016-02-24 at 16:45 -0500, David Miller wrote:
> From: Or Gerlitz <gerlitz.or@gmail.com>
> Date: Wed, 24 Feb 2016 22:06:25 +0200
> 
> > On Wed, Feb 24, 2016 at 6:26 AM, Jeff Kirsher
> > <jeffrey.t.kirsher@intel.com> wrote:
> >> From: Gangfeng Huang <gangfeng.huang@ni.com>
> > 
> >> This patch create a character device for Intel I210 Ethernet controller,
> > 
> > wait, do we want L2 network driver to create char devices
> > 
> >> it can be used for developing Audio/Video Bridging applications,Industrial
> >> Ethernet applications which require precise timing control over frame
> >> transmission, or test harnesses for measuring system latencies and sampling
> >> events.
> > 
> > for various reasons such as the above?
> 
> This is definitely not the direction to go for such a facility.
> Character devices make no sense at all, and are an invitation for
> ad-hoc user interfaces for what should be a generic and clean
> facility.
> 
> There is no reason we cannot provide this facility with extensions
> of traditional networking APIs such as netlink or recvmsg/sendmsg
> over a raw or AF_PACKET socket.
> 
> If there has been a lot of work, time and effort put into this
> character device solution then that's too bad.  Because anything that
> ends up being user facing should have been proposed here on netdev
> from the start.
> 
> I'm not applying this, no way.

Thanks Dave, I will drop this and the associated patches from the
series.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2016-02-24 21:50 UTC | newest]

Thread overview: 24+ messages
2016-02-24  4:26 [net-next 00/20][pull request] 1GbE Intel Wired LAN Driver Updates 2016-02-23 Jeff Kirsher
2016-02-24  4:26 ` [net-next 01/20] e1000e: Increase ULP timer Jeff Kirsher
2016-02-24  4:26 ` [net-next 02/20] e1000e: Increase PHY PLL clock gate timing Jeff Kirsher
2016-02-24  4:26 ` [net-next 03/20] e1000e: Set HW FIFO minimum pointer gap for non-gig speeds Jeff Kirsher
2016-02-24  4:26 ` [net-next 04/20] e1000e: Clear ULP configuration register on ULP exit Jeff Kirsher
2016-02-24  4:26 ` [net-next 05/20] e1000e: Initial support for KabeLake Jeff Kirsher
2016-02-24  4:26 ` [net-next 06/20] igb: add function to set I210 transmit mode Jeff Kirsher
2016-02-24  4:26 ` [net-next 07/20] igb: add a character device to support AVB Jeff Kirsher
2016-02-24 20:06   ` Or Gerlitz
2016-02-24 21:45     ` David Miller
2016-02-24 21:50       ` Jeff Kirsher
2016-02-24  4:26 ` [net-next 08/20] igb: fix compare_const_fl.cocci warnings Jeff Kirsher
2016-02-24  4:26 ` [net-next 09/20] igb: fix itnull.cocci warnings Jeff Kirsher
2016-02-24  4:26 ` [net-next 10/20] igb: fix semicolon.cocci warnings Jeff Kirsher
2016-02-24  4:26 ` [net-next 11/20] igb: When GbE link up, wait for Remote receiver status condition Jeff Kirsher
2016-02-24  4:26 ` [net-next 12/20] igb: constify e1000_phy_operations structure Jeff Kirsher
2016-02-24  4:26 ` [net-next 13/20] igb: enable WoL for OEM devices regardless of EEPROM setting Jeff Kirsher
2016-02-24  4:26 ` [net-next 14/20] igb: add conditions for I210 to generate periodic clock output Jeff Kirsher
2016-02-24  4:26 ` [net-next 15/20] igb: rename igb define to be more generic Jeff Kirsher
2016-02-24  4:26 ` [net-next 16/20] igb: Add support for generic Tx checksums Jeff Kirsher
2016-02-24  4:26 ` [net-next 17/20] igbvf: " Jeff Kirsher
2016-02-24  4:26 ` [net-next 18/20] igbvf: remove "link is Up" message when registering mcast address Jeff Kirsher
2016-02-24  4:26 ` [net-next 19/20] igb: Fix VLAN tag stripping on Intel i350 Jeff Kirsher
2016-02-24  4:26 ` [net-next 20/20] igb: call ndo_stop() instead of dev_close() when running offline selftest Jeff Kirsher
