Netdev List
* Re: [Patch net] net: make pskb_trim_rcsum_slow() robust
From: David Miller @ 2018-10-31 19:36 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev, edumazet
In-Reply-To: <20181030003515.12075-1-xiyou.wangcong@gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Mon, 29 Oct 2018 17:35:15 -0700

> Most callers of pskb_trim_rcsum() simply drop the skb when
> it fails. However, ip_check_defrag() still continues to pass
> the skb up the stack. In that case, we should restore its previous
> csum if __pskb_trim() fails.
> 
> Found this during code review.
> 
> Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends")
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

I kind of agree with Eric that we should make all callers, including
ip_check_defrag(), fail just as with any memory allocation failure.


* Re: [PATCH 1/2] net: add an identifier name for 'struct sock *'
From: David Miller @ 2018-10-31 19:37 UTC (permalink / raw)
  To: tsu.yubo; +Cc: yuzibode, netdev
In-Reply-To: <e71ca1167e90b50d11c4024e6bbd30c18c9e826b.1540868768.git.tsu.yubo@gmail.com>

From: Bo YU <tsu.yubo@gmail.com>
Date: Mon, 29 Oct 2018 23:42:09 -0400

> Fix a warning from checkpatch:
> function definition argument 'struct sock *' should also have an
> identifier name in include/net/af_unix.h.
> 
> Signed-off-by: Bo YU <tsu.yubo@gmail.com>

Applied.


* Re: [PATCH 2/2] net: drop a space before tabs
From: David Miller @ 2018-10-31 19:38 UTC (permalink / raw)
  To: tsu.yubo; +Cc: yuzibode, netdev
In-Reply-To: <fd9196479a6994755968e57bbe412a962cc77cf3.1540868768.git.tsu.yubo@gmail.com>

From: Bo YU <tsu.yubo@gmail.com>
Date: Mon, 29 Oct 2018 23:42:10 -0400

> Fix a warning from checkpatch.pl:'please no space before tabs'
> in include/net/af_unix.h
> 
> Signed-off-by: Bo YU <tsu.yubo@gmail.com>

Applied.


* Re: [PATCH net] net/mlx5e: fix csum adjustments caused by RXFCS
From: David Miller @ 2018-10-31 19:41 UTC (permalink / raw)
  To: edumazet
  Cc: netdev, eric.dumazet, eranbe, saeedm, dmichail, xiyou.wangcong,
	pstaszewski
In-Reply-To: <20181030075725.195824-1-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Tue, 30 Oct 2018 00:57:25 -0700

> As shown by Dimitris, we need to use csum_block_add() instead of csum_add()
> when adding the FCS contribution to skb csum.
> 
> Before 4.18 (more exactly commit 88078d98d1bb "net: pskb_trim_rcsum()
> and CHECKSUM_COMPLETE are friends"), the whole skb csum was thrown away,
> so RXFCS changes were ignored.
> 
> Then before commit d55bef5059dd ("net: fix pskb_trim_rcsum_slow() with
> odd trim offset") both mlx5 and pskb_trim_rcsum_slow() bugs were canceling
> each other.
> 
> Now that we have fixed pskb_trim_rcsum_slow(), we need to fix mlx5.
> 
> Note that this patch also rewrites mlx5e_get_fcs() to:
> 
> - Use skb_header_pointer() instead of reinventing it.
> - Use __get_unaligned_cpu32() to avoid possible unaligned accesses,
>   as Dimitris pointed out.
> 
> Fixes: 902a545904c7 ("net/mlx5e: When RXFCS is set, add FCS data into checksum calculation")
> Reported-by: Paweł Staszewski <pstaszewski@itcare.pl>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied and queued up for -stable.
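
The difference between csum_add() and csum_block_add() discussed above can be illustrated with a small user-space sketch. It is not the kernel implementation: the helpers below are hypothetical (demo_* and csum_*16 names are made up for illustration) and work on 16-bit folded sums instead of the kernel's 32-bit __wsum. The point it shows is that a partial sum computed over a block starting at an odd offset must be byte-swapped before it is added, which is the case for an FCS that follows an odd-length payload.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones'-complement sum. */
static uint16_t csum_fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones'-complement sum of buf; buf[0] is the high byte of the first word. */
static uint16_t csum_partial16(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum = csum_fold16(sum + (((uint32_t)buf[i] << 8) | buf[i + 1]));
	if (len & 1)	/* trailing odd byte is padded with zero */
		sum = csum_fold16(sum + ((uint32_t)buf[len - 1] << 8));
	return (uint16_t)sum;
}

/* Plain addition: only correct when the second block starts at an even offset. */
static uint16_t demo_csum_add(uint16_t a, uint16_t b)
{
	return csum_fold16((uint32_t)a + b);
}

/* Offset-aware addition: swap the bytes of b when its block starts at an odd offset. */
static uint16_t demo_csum_block_add(uint16_t a, uint16_t b, size_t offset)
{
	if (offset & 1)
		b = (uint16_t)((b << 8) | (b >> 8));
	return demo_csum_add(a, b);
}

/* Made-up frame: a 7-byte payload followed by a 4-byte stand-in for the FCS. */
static const uint8_t demo_frame[] = {
	0x45, 0x00, 0x00, 0x1c, 0xab, 0xcd, 0x01,
	0x12, 0x34, 0x56, 0x78,
};
```

With this frame, combining the payload sum and the FCS sum through demo_csum_block_add() at offset 7 reproduces the sum over all 11 bytes, while demo_csum_add() does not.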


* Re: [Patch V5 net 00/11] Bugfix for the HNS3 driver
From: David Miller @ 2018-10-31 19:42 UTC (permalink / raw)
  To: tanhuazhong
  Cc: sergei.shtylyov, joe, netdev, linuxarm, salil.mehta, yisen.zhuang,
	lipeng321, linyunsheng
In-Reply-To: <1540907453-42276-1-git-send-email-tanhuazhong@huawei.com>

From: Huazhong Tan <tanhuazhong@huawei.com>
Date: Tue, 30 Oct 2018 21:50:42 +0800

> This patch series includes bug fixes for the HNS3 ethernet
> controller driver.
> 
> Change log:
> V4->V5:
> 	Fixes comments from Joe Perches & Sergei Shtylyov
> V3->V4:
> 	Fixes comments from Sergei Shtylyov
> V2->V3:
> 	Fixes comments from Sergei Shtylyov
> V1->V2:
> 	Fixes the compilation break reported by kbuild test robot
> 	http://patchwork.ozlabs.org/patch/989818/

Series applied.


* [net 1/8] igb: shorten maximum PHC timecounter update interval
From: Jeff Kirsher @ 2018-10-31 19:42 UTC (permalink / raw)
  To: davem
  Cc: Miroslav Lichvar, netdev, nhorman, sassmann, Jacob Keller,
	Richard Cochran, Thomas Gleixner, Jeff Kirsher
In-Reply-To: <20181031194254.16417-1-jeffrey.t.kirsher@intel.com>

From: Miroslav Lichvar <mlichvar@redhat.com>

The timecounter needs to be updated at least once per ~550 seconds in
order to prevent a 40-bit SYSTIM timestamp from being misinterpreted as
an old timestamp.

Since commit 500462a9d ("timers: Switch to a non-cascading wheel"),
scheduling of delayed work seems to be less accurate and a requested
delay of 540 seconds may actually be longer than 550 seconds. Shorten
the delay to 480 seconds to be sure the timecounter is updated in time.

This fixes an issue with HW timestamps on 82580/I350/I354 being off by
~1100 seconds for a few seconds every ~9 minutes.

Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_ptp.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_ptp.c b/drivers/net/ethernet/intel/igb/igb_ptp.c
index 9f4d700e09df..29ced6b74d36 100644
--- a/drivers/net/ethernet/intel/igb/igb_ptp.c
+++ b/drivers/net/ethernet/intel/igb/igb_ptp.c
@@ -51,9 +51,15 @@
  *
  * The 40 bit 82580 SYSTIM overflows every
  *   2^40 * 10^-9 /  60  = 18.3 minutes.
+ *
+ * SYSTIM is converted to real time using a timecounter. As
+ * timecounter_cyc2time() allows old timestamps, the timecounter
+ * needs to be updated at least once per half of the SYSTIM interval.
+ * Scheduling of delayed work is not very accurate, so we aim for 8
+ * minutes to be sure the actual interval is shorter than 9.16 minutes.
  */
 
-#define IGB_SYSTIM_OVERFLOW_PERIOD	(HZ * 60 * 9)
+#define IGB_SYSTIM_OVERFLOW_PERIOD	(HZ * 60 * 8)
 #define IGB_PTP_TX_TIMEOUT		(HZ * 15)
 #define INCPERIOD_82576			BIT(E1000_TIMINCA_16NS_SHIFT)
 #define INCVALUE_82576_MASK		GENMASK(E1000_TIMINCA_16NS_SHIFT - 1, 0)
-- 
2.17.2
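
The interval arithmetic in the commit message and the new comment can be checked with a short user-space sketch. It assumes, as the existing comment's 2^40 * 10^-9 formula does, that SYSTIM advances by one unit per nanosecond; the function names are hypothetical and do not exist in the igb driver.

```c
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000ULL

/* Whole seconds until a counter of `bits` bits wraps, at 1 ns per tick (assumed). */
static uint64_t systim_wrap_seconds(unsigned int bits)
{
	return (UINT64_C(1) << bits) / DEMO_NSEC_PER_SEC;
}

/* The timecounter must be updated at least once per half of the wrap period. */
static uint64_t max_update_interval_seconds(unsigned int bits)
{
	return ((UINT64_C(1) << bits) / 2) / DEMO_NSEC_PER_SEC;
}

/* Slack (in seconds) left for late scheduling when `requested_s` is requested. */
static int64_t scheduling_slack_seconds(unsigned int bits, uint64_t requested_s)
{
	return (int64_t)max_update_interval_seconds(bits) - (int64_t)requested_s;
}
```

For the 40-bit counter this gives a wrap period of about 1099 seconds and a limit of about 549 seconds. The old 9-minute request leaves roughly 9 seconds of slack, and the new 8-minute request leaves roughly 69 seconds.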


* [net 0/8][pull request] Intel Wired LAN Driver Updates 2018-10-31
From: Jeff Kirsher @ 2018-10-31 19:42 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann

This series contains a collection of various fixes.

Miroslav Lichvar from Red Hat (or should I say IBM now?) updates the PHC
timecounter interval for igb so that it gets updated at least once
every 550 seconds.

Ngai-Mint provides a fix for fm10k to prevent a soft lockup or system
crash by adding a new condition to determine if the SM mailbox is in the
correct state before proceeding.

Jake provides several fm10k fixes. The first one marks completer aborts
as non-fatal, since on some platforms they trigger machine check errors.
He also adds missing device IDs to the in-kernel driver and, due to the
recent fixes, bumps the driver version.

I (Jeff Kirsher) fixed an XFRM_ALGO dependency for both ixgbe and
ixgbevf.  This fix was based on the original work from Arnd Bergmann,
which only fixed ixgbe.

Mitch provides a fix for i40e/avf to update the status codes, which
resolves a mismatch between i40e and the iavf driver, which also
supports the ice LAN driver.

Radoslaw fixes an ixgbe bug where the driver logs a message about
spoofed packets being detected when a VF is restarted with a different
MAC address.

The following are changes since commit a6b3a3fa042343e29ffaf9169f5ba3c819d4f9a2:
  net: mvpp2: Fix affinity hint allocation
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue 10GbE

Jacob Keller (3):
  fm10k: ensure completer aborts are marked as non-fatal after a resume
  fm10k: add missing device IDs to the upstream driver
  fm10k: bump driver version to match out-of-tree release

Jeff Kirsher (1):
  ixgbe/ixgbevf: fix XFRM_ALGO dependency

Miroslav Lichvar (1):
  igb: shorten maximum PHC timecounter update interval

Mitch Williams (1):
  i40e: Update status codes

Ngai-Mint Kwan (1):
  fm10k: fix SM mailbox full condition

Radoslaw Tyl (1):
  ixgbe: fix MAC anti-spoofing filter after VFLR

 drivers/net/ethernet/intel/Kconfig            | 18 +++++++
 drivers/net/ethernet/intel/fm10k/fm10k_iov.c  | 51 +++++++++++--------
 drivers/net/ethernet/intel/fm10k/fm10k_main.c |  2 +-
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c  |  2 +
 drivers/net/ethernet/intel/fm10k/fm10k_type.h |  2 +
 .../ethernet/intel/i40e/i40e_virtchnl_pf.c    |  2 +-
 drivers/net/ethernet/intel/igb/igb_ptp.c      |  8 ++-
 drivers/net/ethernet/intel/ixgbe/Makefile     |  2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe.h      |  8 +--
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  6 +--
 .../net/ethernet/intel/ixgbe/ixgbe_sriov.c    |  4 +-
 drivers/net/ethernet/intel/ixgbevf/Makefile   |  2 +-
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h  |  4 +-
 .../net/ethernet/intel/ixgbevf/ixgbevf_main.c |  2 +-
 include/linux/avf/virtchnl.h                  | 12 +++--
 net/xfrm/Kconfig                              |  1 -
 16 files changed, 85 insertions(+), 41 deletions(-)

-- 
2.17.2


* [net 2/8] fm10k: fix SM mailbox full condition
From: Jeff Kirsher @ 2018-10-31 19:42 UTC (permalink / raw)
  To: davem; +Cc: Ngai-Mint Kwan, netdev, nhorman, sassmann, Jacob Keller,
	Jeff Kirsher
In-Reply-To: <20181031194254.16417-1-jeffrey.t.kirsher@intel.com>

From: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>

Current condition will always incorrectly report a full SM mailbox if an
IES API application is not running. Due to this, the
"fm10k_service_task" will be infinitely queued into the driver's
workqueue. This, in turn, will cause a "kworker" thread to report 100%
CPU utilization and might cause "soft lockup" events or system crashes.

To fix this issue, a new condition is added to determine if the SM
mailbox is in the correct state of FM10K_STATE_OPEN before proceeding.
In other words, an instance of the IES API must be running. If there is,
the remainder of the flow stays the same which is to determine if the SM
mailbox capacity has been exceeded or not and take appropriate action.

Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_iov.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
index e707d717012f..74160c2095ee 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
@@ -244,7 +244,8 @@ s32 fm10k_iov_mbx(struct fm10k_intfc *interface)
 		}
 
 		/* guarantee we have free space in the SM mailbox */
-		if (!hw->mbx.ops.tx_ready(&hw->mbx, FM10K_VFMBX_MSG_MTU)) {
+		if (hw->mbx.state == FM10K_STATE_OPEN &&
+		    !hw->mbx.ops.tx_ready(&hw->mbx, FM10K_VFMBX_MSG_MTU)) {
 			/* keep track of how many times this occurs */
 			interface->hw_sm_mbx_full++;
 
-- 
2.17.2
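
The effect of the added state check can be sketched in user space. The types and helpers below are hypothetical stand-ins (demo_* names), and the sketch assumes that the driver's tx_ready() operation returns false whenever the mailbox is not open, which is what would make the old condition report a full mailbox when no IES API application is running.

```c
#include <stdbool.h>

enum demo_mbx_state { DEMO_STATE_CLOSED, DEMO_STATE_OPEN };

/* Hypothetical stand-in for the SM mailbox. */
struct demo_mbx {
	enum demo_mbx_state state;
	unsigned int free_space;
};

/* Assumed behaviour: a mailbox that is not open is never ready to transmit. */
static bool demo_tx_ready(const struct demo_mbx *mbx, unsigned int len)
{
	return mbx->state == DEMO_STATE_OPEN && mbx->free_space >= len;
}

/* Old condition: a closed mailbox is indistinguishable from a full one. */
static bool demo_mbx_full_old(const struct demo_mbx *mbx, unsigned int len)
{
	return !demo_tx_ready(mbx, len);
}

/* New condition: only an open mailbox can be reported as full. */
static bool demo_mbx_full_new(const struct demo_mbx *mbx, unsigned int len)
{
	return mbx->state == DEMO_STATE_OPEN && !demo_tx_ready(mbx, len);
}
```

Under these assumptions the two conditions agree for an open mailbox and differ only for a closed one, where the old condition keeps reporting "full" and the new one does not.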


* [net 3/8] fm10k: ensure completer aborts are marked as non-fatal after a resume
From: Jeff Kirsher @ 2018-10-31 19:42 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Jeff Kirsher
In-Reply-To: <20181031194254.16417-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

VF drivers can trigger PCIe completer aborts any time they read a queue
that they don't own. Even in nominal circumstances, it is not possible
to prevent the VF driver from reading queues it doesn't own. VF drivers
may attempt to read queues they previously owned, but which they no
longer own due to a PF reset.

Normally these completer aborts aren't an issue. However, on some
platforms these trigger machine check errors. This is true even if we
lower their severity from fatal to non-fatal. Indeed, we already have
code for lowering the severity.

We could attempt to mask these errors conditionally around resets, which
is the most common time they would occur. However, this would essentially
be a race between the PF and VF drivers, and we may still occasionally
see machine check exceptions on these strictly configured platforms.

Instead, mask the errors entirely any time we resume VFs. By doing so,
we prevent the completer aborts from being sent to the parent PCIe
device, and thus these strict platforms will not upgrade them into
machine check errors.

Additionally, we don't lose any information by masking these errors,
because we'll still report VFs which attempt to access queues via the
FUM_BAD_VF_QACCESS errors.

Without this change, on platforms where completer aborts cause machine
check exceptions, the VF reading queues it doesn't own could crash the
host system. Masking the completer abort prevents this, so we should
mask it for good, and not just around a PCIe reset. Otherwise malicious
or misconfigured VFs could cause the host system to crash.

Because we are masking the error entirely, there is little reason to
also keep setting the severity bit, so that code is also removed.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_iov.c | 48 ++++++++++++--------
 1 file changed, 28 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
index 74160c2095ee..5d4f1761dc0c 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
@@ -303,6 +303,28 @@ void fm10k_iov_suspend(struct pci_dev *pdev)
 	}
 }
 
+static void fm10k_mask_aer_comp_abort(struct pci_dev *pdev)
+{
+	u32 err_mask;
+	int pos;
+
+	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ERR);
+	if (!pos)
+		return;
+
+	/* Mask the completion abort bit in the ERR_UNCOR_MASK register,
+	 * preventing the device from reporting these errors to the upstream
+	 * PCIe root device. This avoids bringing down platforms which upgrade
+	 * non-fatal completer aborts into machine check exceptions. Completer
+	 * aborts can occur whenever a VF reads a queue it doesn't own.
+	 */
+	pci_read_config_dword(pdev, pos + PCI_ERR_UNCOR_MASK, &err_mask);
+	err_mask |= PCI_ERR_UNC_COMP_ABORT;
+	pci_write_config_dword(pdev, pos + PCI_ERR_UNCOR_MASK, err_mask);
+
+	mmiowb();
+}
+
 int fm10k_iov_resume(struct pci_dev *pdev)
 {
 	struct fm10k_intfc *interface = pci_get_drvdata(pdev);
@@ -318,6 +340,12 @@ int fm10k_iov_resume(struct pci_dev *pdev)
 	if (!iov_data)
 		return -ENOMEM;
 
+	/* Lower severity of completer abort error reporting as
+	 * the VFs can trigger this any time they read a queue
+	 * that they don't own.
+	 */
+	fm10k_mask_aer_comp_abort(pdev);
+
 	/* allocate hardware resources for the VFs */
 	hw->iov.ops.assign_resources(hw, num_vfs, num_vfs);
 
@@ -461,20 +489,6 @@ void fm10k_iov_disable(struct pci_dev *pdev)
 	fm10k_iov_free_data(pdev);
 }
 
-static void fm10k_disable_aer_comp_abort(struct pci_dev *pdev)
-{
-	u32 err_sev;
-	int pos;
-
-	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ERR);
-	if (!pos)
-		return;
-
-	pci_read_config_dword(pdev, pos + PCI_ERR_UNCOR_SEVER, &err_sev);
-	err_sev &= ~PCI_ERR_UNC_COMP_ABORT;
-	pci_write_config_dword(pdev, pos + PCI_ERR_UNCOR_SEVER, err_sev);
-}
-
 int fm10k_iov_configure(struct pci_dev *pdev, int num_vfs)
 {
 	int current_vfs = pci_num_vf(pdev);
@@ -496,12 +510,6 @@ int fm10k_iov_configure(struct pci_dev *pdev, int num_vfs)
 
 	/* allocate VFs if not already allocated */
 	if (num_vfs && num_vfs != current_vfs) {
-		/* Disable completer abort error reporting as
-		 * the VFs can trigger this any time they read a queue
-		 * that they don't own.
-		 */
-		fm10k_disable_aer_comp_abort(pdev);
-
 		err = pci_enable_sriov(pdev, num_vfs);
 		if (err) {
 			dev_err(&pdev->dev,
-- 
2.17.2
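
The difference between the removed severity change and the new mask change can be sketched against a simulated AER capability. Everything here is a user-space stand-in with hypothetical demo_* names. The register offsets and the completer-abort bit are believed to match the kernel's PCI_ERR_UNCOR_MASK, PCI_ERR_UNCOR_SEVER and PCI_ERR_UNC_COMP_ABORT definitions, but should be treated as assumptions.

```c
#include <stdint.h>

/* Assumed to mirror the kernel's pci_regs.h values. */
#define DEMO_ERR_UNCOR_MASK	0x08
#define DEMO_ERR_UNCOR_SEVER	0x0c
#define DEMO_ERR_UNC_COMP_ABORT	0x00008000u

/* Simulated AER capability registers, indexed by byte offset / 4. */
struct demo_aer_cap {
	uint32_t regs[8];
};

static uint32_t demo_read(const struct demo_aer_cap *cap, unsigned int off)
{
	return cap->regs[off / 4];
}

static void demo_write(struct demo_aer_cap *cap, unsigned int off, uint32_t val)
{
	cap->regs[off / 4] = val;
}

/* Removed approach: make completer aborts non-fatal, but still report them. */
static void demo_lower_severity(struct demo_aer_cap *cap)
{
	uint32_t sev = demo_read(cap, DEMO_ERR_UNCOR_SEVER);

	sev &= ~DEMO_ERR_UNC_COMP_ABORT;
	demo_write(cap, DEMO_ERR_UNCOR_SEVER, sev);
}

/* New approach: read-modify-write the mask so the error is not reported. */
static void demo_mask_comp_abort(struct demo_aer_cap *cap)
{
	uint32_t mask = demo_read(cap, DEMO_ERR_UNCOR_MASK);

	mask |= DEMO_ERR_UNC_COMP_ABORT;
	demo_write(cap, DEMO_ERR_UNCOR_MASK, mask);
}

/* An unmasked error is still forwarded upstream, whatever its severity. */
static int demo_comp_abort_reported(const struct demo_aer_cap *cap)
{
	return !(demo_read(cap, DEMO_ERR_UNCOR_MASK) & DEMO_ERR_UNC_COMP_ABORT);
}
```

In this model, lowering the severity leaves the error reported, which matches the commit message's observation that strict platforms may still raise machine checks. Setting the mask bit stops the report and leaves the other mask bits untouched.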


* [net 7/8] i40e: Update status codes
From: Jeff Kirsher @ 2018-10-31 19:42 UTC (permalink / raw)
  To: davem; +Cc: Mitch Williams, netdev, nhorman, sassmann, Jeff Kirsher
In-Reply-To: <20181031194254.16417-1-jeffrey.t.kirsher@intel.com>

From: Mitch Williams <mitch.a.williams@intel.com>

Add a few new status codes which will be used by the ice driver, and
rename a few to make them more consistent. Error codes are mapped to
similar values as in i40e_status.h, so as to be compatible with older
VF drivers not using this status enum.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |  2 +-
 include/linux/avf/virtchnl.h                       | 12 +++++++++---
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 81b0e1f8d14b..ac5698ed0b11 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -3674,7 +3674,7 @@ int i40e_vc_process_vf_msg(struct i40e_pf *pf, s16 vf_id, u32 v_opcode,
 		dev_err(&pf->pdev->dev, "Invalid message from VF %d, opcode %d, len %d\n",
 			local_vf_id, v_opcode, msglen);
 		switch (ret) {
-		case VIRTCHNL_ERR_PARAM:
+		case VIRTCHNL_STATUS_ERR_PARAM:
 			return -EPERM;
 		default:
 			return -EINVAL;
diff --git a/include/linux/avf/virtchnl.h b/include/linux/avf/virtchnl.h
index 2c9756bd9c4c..b2488055fd1d 100644
--- a/include/linux/avf/virtchnl.h
+++ b/include/linux/avf/virtchnl.h
@@ -62,13 +62,19 @@
 /* Error Codes */
 enum virtchnl_status_code {
 	VIRTCHNL_STATUS_SUCCESS				= 0,
-	VIRTCHNL_ERR_PARAM				= -5,
+	VIRTCHNL_STATUS_ERR_PARAM			= -5,
+	VIRTCHNL_STATUS_ERR_NO_MEMORY			= -18,
 	VIRTCHNL_STATUS_ERR_OPCODE_MISMATCH		= -38,
 	VIRTCHNL_STATUS_ERR_CQP_COMPL_ERROR		= -39,
 	VIRTCHNL_STATUS_ERR_INVALID_VF_ID		= -40,
-	VIRTCHNL_STATUS_NOT_SUPPORTED			= -64,
+	VIRTCHNL_STATUS_ERR_ADMIN_QUEUE_ERROR		= -53,
+	VIRTCHNL_STATUS_ERR_NOT_SUPPORTED		= -64,
 };
 
+/* Backward compatibility */
+#define VIRTCHNL_ERR_PARAM VIRTCHNL_STATUS_ERR_PARAM
+#define VIRTCHNL_STATUS_NOT_SUPPORTED VIRTCHNL_STATUS_ERR_NOT_SUPPORTED
+
 #define VIRTCHNL_LINK_SPEED_100MB_SHIFT		0x1
 #define VIRTCHNL_LINK_SPEED_1000MB_SHIFT	0x2
 #define VIRTCHNL_LINK_SPEED_10GB_SHIFT		0x3
@@ -831,7 +837,7 @@ virtchnl_vc_validate_vf_msg(struct virtchnl_version_info *ver, u32 v_opcode,
 	case VIRTCHNL_OP_EVENT:
 	case VIRTCHNL_OP_UNKNOWN:
 	default:
-		return VIRTCHNL_ERR_PARAM;
+		return VIRTCHNL_STATUS_ERR_PARAM;
 	}
 	/* few more checks */
 	if (err_msg_format || valid_len != msglen)
-- 
2.17.2
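
The rename-with-alias pattern used in virtchnl.h can be sketched in isolation. The enum below is a hypothetical, trimmed-down stand-in (demo_* names) that reuses only the -5 and -64 values from the patch. The mapping function mirrors the shape of the switch in i40e_vc_process_vf_msg() but is not the driver's code.

```c
#include <errno.h>

/* Trimmed-down stand-in for the renamed status enum. */
enum demo_status_code {
	DEMO_STATUS_SUCCESS		= 0,
	DEMO_STATUS_ERR_PARAM		= -5,
	DEMO_STATUS_ERR_NOT_SUPPORTED	= -64,
};

/* Backward compatibility: the old spellings expand to the new enumerators. */
#define DEMO_ERR_PARAM			DEMO_STATUS_ERR_PARAM
#define DEMO_STATUS_NOT_SUPPORTED	DEMO_STATUS_ERR_NOT_SUPPORTED

/* A caller still written against the old names keeps compiling unchanged. */
static int demo_status_to_errno(enum demo_status_code ret)
{
	switch (ret) {
	case DEMO_STATUS_SUCCESS:
		return 0;
	case DEMO_ERR_PARAM:		/* old spelling, same value as the new one */
		return -EPERM;
	default:
		return -EINVAL;
	}
}
```

Because the aliases are plain macros over the enumerators, the numeric values seen by older VF drivers do not change. Only the source-level names do.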


* [net 4/8] fm10k: add missing device IDs to the upstream driver
From: Jeff Kirsher @ 2018-10-31 19:42 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Jeff Kirsher
In-Reply-To: <20181031194254.16417-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

The device IDs for the Ethernet SDI Adapter devices were never added to
the upstream driver. The IDs are already in the pci.ids database, and
are supported by the out-of-tree driver.

Add the device IDs now, so that the upstream driver can recognize and
load these devices.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c  | 2 ++
 drivers/net/ethernet/intel/fm10k/fm10k_type.h | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 02345d381303..e49fb51d3613 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -23,6 +23,8 @@ static const struct fm10k_info *fm10k_info_tbl[] = {
  */
 static const struct pci_device_id fm10k_pci_tbl[] = {
 	{ PCI_VDEVICE(INTEL, FM10K_DEV_ID_PF), fm10k_device_pf },
+	{ PCI_VDEVICE(INTEL, FM10K_DEV_ID_SDI_FM10420_QDA2), fm10k_device_pf },
+	{ PCI_VDEVICE(INTEL, FM10K_DEV_ID_SDI_FM10420_DA2), fm10k_device_pf },
 	{ PCI_VDEVICE(INTEL, FM10K_DEV_ID_VF), fm10k_device_vf },
 	/* required last entry */
 	{ 0, }
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_type.h b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
index 3e608e493f9d..9fb9fca375e3 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_type.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_type.h
@@ -15,6 +15,8 @@ struct fm10k_hw;
 
 #define FM10K_DEV_ID_PF			0x15A4
 #define FM10K_DEV_ID_VF			0x15A5
+#define FM10K_DEV_ID_SDI_FM10420_QDA2	0x15D0
+#define FM10K_DEV_ID_SDI_FM10420_DA2	0x15D5
 
 #define FM10K_MAX_QUEUES		256
 #define FM10K_MAX_QUEUES_PF		128
-- 
2.17.2


* [net 5/8] fm10k: bump driver version to match out-of-tree release
From: Jeff Kirsher @ 2018-10-31 19:42 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Jeff Kirsher
In-Reply-To: <20181031194254.16417-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

The upstream and out-of-tree drivers are once again at comparable
functionality. It's been a while since we updated the upstream driver
version, so bump it now.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index 503bbc017792..5b2a50e5798f 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -11,7 +11,7 @@
 
 #include "fm10k.h"
 
-#define DRV_VERSION	"0.23.4-k"
+#define DRV_VERSION	"0.26.1-k"
 #define DRV_SUMMARY	"Intel(R) Ethernet Switch Host Interface Driver"
 const char fm10k_driver_version[] = DRV_VERSION;
 char fm10k_driver_name[] = "fm10k";
-- 
2.17.2


* [net 6/8] ixgbe/ixgbevf: fix XFRM_ALGO dependency
From: Jeff Kirsher @ 2018-10-31 19:42 UTC (permalink / raw)
  To: davem
  Cc: Jeff Kirsher, netdev, nhorman, sassmann, Arnd Bergmann,
	Shannon Nelson
In-Reply-To: <20181031194254.16417-1-jeffrey.t.kirsher@intel.com>

Based on the original work from Arnd Bergmann.

When XFRM_ALGO is not enabled, the new ixgbe IPsec code produces a
link error:

drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.o: In function `ixgbe_ipsec_vf_add_sa':
ixgbe_ipsec.c:(.text+0x1266): undefined reference to `xfrm_aead_get_byname'

Simply selecting XFRM_ALGO from here causes circular dependencies, so
to fix it, we probably want this slightly more complex solution that is
similar to what other drivers with XFRM offload do:

A separate Kconfig symbol now controls whether we include the IPsec
offload code. To keep the old behavior, this is left as 'default y'. The
dependency in XFRM_OFFLOAD still causes a circular dependency but is
not actually needed because this symbol is not user visible, so removing
that dependency on top makes it all work.

CC: Arnd Bergmann <arnd@arndb.de>
CC: Shannon Nelson <shannon.nelson@oracle.com>
Fixes: eda0333ac293 ("ixgbe: add VF IPsec management")
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
---
 drivers/net/ethernet/intel/Kconfig             | 18 ++++++++++++++++++
 drivers/net/ethernet/intel/ixgbe/Makefile      |  2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe.h       |  8 ++++----
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  |  6 +++---
 drivers/net/ethernet/intel/ixgbevf/Makefile    |  2 +-
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h   |  4 ++--
 .../net/ethernet/intel/ixgbevf/ixgbevf_main.c  |  2 +-
 net/xfrm/Kconfig                               |  1 -
 8 files changed, 30 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/intel/Kconfig b/drivers/net/ethernet/intel/Kconfig
index fd3373d82a9e..59e1bc0f609e 100644
--- a/drivers/net/ethernet/intel/Kconfig
+++ b/drivers/net/ethernet/intel/Kconfig
@@ -200,6 +200,15 @@ config IXGBE_DCB
 
 	  If unsure, say N.
 
+config IXGBE_IPSEC
+	bool "IPSec XFRM cryptography-offload acceleration"
+	depends on IXGBE
+	depends on XFRM_OFFLOAD
+	default y
+	select XFRM_ALGO
+	---help---
+	  Enable support for IPSec offload in ixgbe.ko
+
 config IXGBEVF
 	tristate "Intel(R) 10GbE PCI Express Virtual Function Ethernet support"
 	depends on PCI_MSI
@@ -217,6 +226,15 @@ config IXGBEVF
 	  will be called ixgbevf.  MSI-X interrupt support is required
 	  for this driver to work correctly.
 
+config IXGBEVF_IPSEC
+	bool "IPSec XFRM cryptography-offload acceleration"
+	depends on IXGBEVF
+	depends on XFRM_OFFLOAD
+	default y
+	select XFRM_ALGO
+	---help---
+	  Enable support for IPSec offload in ixgbevf.ko
+
 config I40E
 	tristate "Intel(R) Ethernet Controller XL710 Family support"
 	imply PTP_1588_CLOCK
diff --git a/drivers/net/ethernet/intel/ixgbe/Makefile b/drivers/net/ethernet/intel/ixgbe/Makefile
index ca6b0c458e4a..4fb0d9e3f2da 100644
--- a/drivers/net/ethernet/intel/ixgbe/Makefile
+++ b/drivers/net/ethernet/intel/ixgbe/Makefile
@@ -17,4 +17,4 @@ ixgbe-$(CONFIG_IXGBE_DCB) +=  ixgbe_dcb.o ixgbe_dcb_82598.o \
 ixgbe-$(CONFIG_IXGBE_HWMON) += ixgbe_sysfs.o
 ixgbe-$(CONFIG_DEBUG_FS) += ixgbe_debugfs.o
 ixgbe-$(CONFIG_FCOE:m=y) += ixgbe_fcoe.o
-ixgbe-$(CONFIG_XFRM_OFFLOAD) += ixgbe_ipsec.o
+ixgbe-$(CONFIG_IXGBE_IPSEC) += ixgbe_ipsec.o
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index ec1b87cc4410..143bdd5ee2a0 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -769,9 +769,9 @@ struct ixgbe_adapter {
 #define IXGBE_RSS_KEY_SIZE     40  /* size of RSS Hash Key in bytes */
 	u32 *rss_key;
 
-#ifdef CONFIG_XFRM_OFFLOAD
+#ifdef CONFIG_IXGBE_IPSEC
 	struct ixgbe_ipsec *ipsec;
-#endif /* CONFIG_XFRM_OFFLOAD */
+#endif /* CONFIG_IXGBE_IPSEC */
 
 	/* AF_XDP zero-copy */
 	struct xdp_umem **xsk_umems;
@@ -1008,7 +1008,7 @@ void ixgbe_store_key(struct ixgbe_adapter *adapter);
 void ixgbe_store_reta(struct ixgbe_adapter *adapter);
 s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
 		       u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm);
-#ifdef CONFIG_XFRM_OFFLOAD
+#ifdef CONFIG_IXGBE_IPSEC
 void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter);
 void ixgbe_stop_ipsec_offload(struct ixgbe_adapter *adapter);
 void ixgbe_ipsec_restore(struct ixgbe_adapter *adapter);
@@ -1036,5 +1036,5 @@ static inline int ixgbe_ipsec_vf_add_sa(struct ixgbe_adapter *adapter,
 					u32 *mbuf, u32 vf) { return -EACCES; }
 static inline int ixgbe_ipsec_vf_del_sa(struct ixgbe_adapter *adapter,
 					u32 *mbuf, u32 vf) { return -EACCES; }
-#endif /* CONFIG_XFRM_OFFLOAD */
+#endif /* CONFIG_IXGBE_IPSEC */
 #endif /* _IXGBE_H_ */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 0049a2becd7e..113b38e0defb 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -8694,7 +8694,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
 
 #endif /* IXGBE_FCOE */
 
-#ifdef CONFIG_XFRM_OFFLOAD
+#ifdef CONFIG_IXGBE_IPSEC
 	if (skb->sp && !ixgbe_ipsec_tx(tx_ring, first, &ipsec_tx))
 		goto out_drop;
 #endif
@@ -10190,7 +10190,7 @@ ixgbe_features_check(struct sk_buff *skb, struct net_device *dev,
 	 * the TSO, so it's the exception.
 	 */
 	if (skb->encapsulation && !(features & NETIF_F_TSO_MANGLEID)) {
-#ifdef CONFIG_XFRM_OFFLOAD
+#ifdef CONFIG_IXGBE_IPSEC
 		if (!skb->sp)
 #endif
 			features &= ~NETIF_F_TSO;
@@ -10883,7 +10883,7 @@ static int ixgbe_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (hw->mac.type >= ixgbe_mac_82599EB)
 		netdev->features |= NETIF_F_SCTP_CRC;
 
-#ifdef CONFIG_XFRM_OFFLOAD
+#ifdef CONFIG_IXGBE_IPSEC
 #define IXGBE_ESP_FEATURES	(NETIF_F_HW_ESP | \
 				 NETIF_F_HW_ESP_TX_CSUM | \
 				 NETIF_F_GSO_ESP)
diff --git a/drivers/net/ethernet/intel/ixgbevf/Makefile b/drivers/net/ethernet/intel/ixgbevf/Makefile
index 297d0f0858b5..186a4bb24fde 100644
--- a/drivers/net/ethernet/intel/ixgbevf/Makefile
+++ b/drivers/net/ethernet/intel/ixgbevf/Makefile
@@ -10,5 +10,5 @@ ixgbevf-objs := vf.o \
                 mbx.o \
                 ethtool.o \
                 ixgbevf_main.o
-ixgbevf-$(CONFIG_XFRM_OFFLOAD) += ipsec.o
+ixgbevf-$(CONFIG_IXGBEVF_IPSEC) += ipsec.o
 
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
index e399e1c0c54a..ecab686574b6 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
@@ -459,7 +459,7 @@ int ethtool_ioctl(struct ifreq *ifr);
 
 extern void ixgbevf_write_eitr(struct ixgbevf_q_vector *q_vector);
 
-#ifdef CONFIG_XFRM_OFFLOAD
+#ifdef CONFIG_IXGBEVF_IPSEC
 void ixgbevf_init_ipsec_offload(struct ixgbevf_adapter *adapter);
 void ixgbevf_stop_ipsec_offload(struct ixgbevf_adapter *adapter);
 void ixgbevf_ipsec_restore(struct ixgbevf_adapter *adapter);
@@ -482,7 +482,7 @@ static inline int ixgbevf_ipsec_tx(struct ixgbevf_ring *tx_ring,
 				   struct ixgbevf_tx_buffer *first,
 				   struct ixgbevf_ipsec_tx_data *itd)
 { return 0; }
-#endif /* CONFIG_XFRM_OFFLOAD */
+#endif /* CONFIG_IXGBEVF_IPSEC */
 
 void ixgbe_napi_add_all(struct ixgbevf_adapter *adapter);
 void ixgbe_napi_del_all(struct ixgbevf_adapter *adapter);
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 98707ee11d72..5e47ede7e832 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -4150,7 +4150,7 @@ static int ixgbevf_xmit_frame_ring(struct sk_buff *skb,
 	first->tx_flags = tx_flags;
 	first->protocol = vlan_get_protocol(skb);
 
-#ifdef CONFIG_XFRM_OFFLOAD
+#ifdef CONFIG_IXGBEVF_IPSEC
 	if (skb->sp && !ixgbevf_ipsec_tx(tx_ring, first, &ipsec_tx))
 		goto out_drop;
 #endif
diff --git a/net/xfrm/Kconfig b/net/xfrm/Kconfig
index 4a9ee2d83158..140270a13d54 100644
--- a/net/xfrm/Kconfig
+++ b/net/xfrm/Kconfig
@@ -8,7 +8,6 @@ config XFRM
 
 config XFRM_OFFLOAD
        bool
-       depends on XFRM
 
 config XFRM_ALGO
 	tristate
-- 
2.17.2


* [net 8/8] ixgbe: fix MAC anti-spoofing filter after VFLR
From: Jeff Kirsher @ 2018-10-31 19:42 UTC (permalink / raw)
  To: davem; +Cc: Radoslaw Tyl, netdev, nhorman, sassmann, Jeff Kirsher
In-Reply-To: <20181031194254.16417-1-jeffrey.t.kirsher@intel.com>

From: Radoslaw Tyl <radoslawx.tyl@intel.com>

This change resolves a driver bug where the driver is logging a
message that says "Spoofed packets detected". This can occur on the PF
(host) when a VF has VLAN+MACVLAN enabled and is re-started with a
different MAC address.

MAC and VLAN anti-spoofing filters are to be enabled together.

Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Acked-by: Piotr Skajewski <piotrx.skajewski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index af25a8fffeb8..5dacfc870259 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -722,8 +722,10 @@ static inline void ixgbe_vf_reset_event(struct ixgbe_adapter *adapter, u32 vf)
 			ixgbe_set_vmvir(adapter, vfinfo->pf_vlan,
 					adapter->default_up, vf);
 
-		if (vfinfo->spoofchk_enabled)
+		if (vfinfo->spoofchk_enabled) {
 			hw->mac.ops.set_vlan_anti_spoofing(hw, true, vf);
+			hw->mac.ops.set_mac_anti_spoofing(hw, true, vf);
+		}
 	}
 
 	/* reset multicast table array for vf */
-- 
2.17.2

^ permalink raw reply related

* Re: [PATCH net] net: dsa: microchip: initialize mutex before use
From: David Miller @ 2018-10-31 19:52 UTC (permalink / raw)
  To: Tristram.Ha; +Cc: andrew, f.fainelli, pavel, UNGLinuxDriver, netdev
In-Reply-To: <1540943149-26832-1-git-send-email-Tristram.Ha@microchip.com>

From: <Tristram.Ha@microchip.com>
Date: Tue, 30 Oct 2018 16:45:49 -0700

> @@ -1206,6 +1201,12 @@ int ksz_switch_register(struct ksz_device *dev)
>  	if (dev->pdata)
>  		dev->chip_id = dev->pdata->chip_id;
>  
> +	/* mutex is used in next function call. */
> +	mutex_init(&dev->reg_mutex);
> +	mutex_init(&dev->stats_mutex);
> +	mutex_init(&dev->alu_mutex);
> +	mutex_init(&dev->vlan_mutex);
> +

Please remove this comment, as per Andrew Lunn's feedback.

^ permalink raw reply

* Re: [PATCH net 0/4] mlxsw: Enable minimum shaper on MC TCs
From: David Miller @ 2018-10-31 19:57 UTC (permalink / raw)
  To: idosch; +Cc: netdev, jiri, petrm, mlxsw
In-Reply-To: <20181031095601.29846-1-idosch@mellanox.com>

From: Ido Schimmel <idosch@mellanox.com>
Date: Wed, 31 Oct 2018 09:56:41 +0000

> Petr says:
> 
> An MC-aware mode was introduced in commit 7b8195306694 ("mlxsw:
> spectrum: Configure MC-aware mode on mlxsw ports"). In MC-aware mode,
> BUM traffic gets a special treatment by being assigned to a separate set
> of traffic classes 8..15. Pairs of TCs 0 and 8, 1 and 9, etc., are then
> configured to strictly prioritize the lower-numbered ones. The intention
> is to prevent BUM traffic from flooding the switch and push out all UC
> traffic, which would otherwise happen, and instead give UC traffic
> precedence.
> 
> However strictly prioritizing UC traffic has the effect that UC overload
> pushes out all BUM traffic, such as legitimate ARP queries. These
> packets are kept in queues for a while, but under sustained UC overload,
> their lifetime eventually expires and these packets are dropped. That is
> detrimental to network performance as well.
> 
> In this patchset, MC TCs (8..15) are configured with minimum shaper of
> 200Mbps (a minimum permitted value) to allow a trickle of necessary
> control traffic to get through.
> 
> First in patch #1, the QEEC register is extended with fields necessary
> to configure the minimum shaper.
> 
> In patch #2, minimum shaper is enabled on TCs 8..15.
> 
> In patches #3 and #4, first the MC-awareness test is tweaked to support
> the minimum shaper, and then a new test is introduced to test that MC
> traffic behaves well under UC overload.

Series applied, thanks.

^ permalink raw reply

* [PATCH bpf] libbpf: Fix compile error in libbpf_attach_type_by_name
From: Andrey Ignatov @ 2018-10-31 19:57 UTC (permalink / raw)
  To: netdev; +Cc: Andrey Ignatov, ast, daniel, kernel-team, acme

Arnaldo Carvalho de Melo reported a build error in libbpf when clang
version 3.8.1-24 (tags/RELEASE_381/final) is used:

libbpf.c:2201:36: error: comparison of constant -22 with expression of
type 'const enum bpf_attach_type' is always false
[-Werror,-Wtautological-constant-out-of-range-compare]
                if (section_names[i].attach_type == -EINVAL)
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^  ~~~~~~~
1 error generated.

Fix the error by keeping the "is_attachable" property of a program in a
separate struct field instead of trying to use attach_type itself.
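
A minimal sketch of the pattern described above, assuming a hypothetical
enum and lookup table (the names attach_kind, sec_def and
attach_kind_by_name are illustrative and are not libbpf's). When every
enumerator is non-negative the compiler may pick an unsigned underlying
type, so a negative sentinel such as -EINVAL cannot be compared against
the enum field reliably; attachability is recorded in its own field
instead:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical attach kinds. All enumerators are non-negative, so the
 * compiler is free to choose an unsigned underlying type.
 */
enum attach_kind { ATTACH_INGRESS, ATTACH_EGRESS };

struct sec_def {
	const char *sec;
	size_t len;
	int is_attachable;	/* separate flag instead of a -EINVAL sentinel */
	enum attach_kind kind;	/* only meaningful when is_attachable != 0 */
};

#define SEC(s, attachable, k) { s, sizeof(s) - 1, attachable, k }

static const struct sec_def sec_defs[] = {
	SEC("socket",			0, 0),
	SEC("cgroup_skb/ingress",	1, ATTACH_INGRESS),
	SEC("cgroup_skb/egress",	1, ATTACH_EGRESS),
};

/* Returns 0 and fills *kind on success, -EINVAL if the section is
 * unknown or names a program type that cannot be attached.
 */
static int attach_kind_by_name(const char *name, enum attach_kind *kind)
{
	size_t i;

	for (i = 0; i < sizeof(sec_defs) / sizeof(sec_defs[0]); i++) {
		if (strncmp(name, sec_defs[i].sec, sec_defs[i].len))
			continue;
		if (!sec_defs[i].is_attachable)
			return -EINVAL;
		*kind = sec_defs[i].kind;
		return 0;
	}
	return -EINVAL;
}
```

The flag costs one int per table entry and keeps the enum field free of
out-of-range values.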

Fixes: 956b620fcf0b ("libbpf: Introduce libbpf_attach_type_by_name")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Andrey Ignatov <rdna@fb.com>
---
 tools/lib/bpf/libbpf.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index b607be7236d3..d6e62e90e8d4 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -2084,19 +2084,19 @@ void bpf_program__set_expected_attach_type(struct bpf_program *prog,
 	prog->expected_attach_type = type;
 }
 
-#define BPF_PROG_SEC_IMPL(string, ptype, eatype, atype) \
-	{ string, sizeof(string) - 1, ptype, eatype, atype }
+#define BPF_PROG_SEC_IMPL(string, ptype, eatype, is_attachable, atype) \
+	{ string, sizeof(string) - 1, ptype, eatype, is_attachable, atype }
 
 /* Programs that can NOT be attached. */
-#define BPF_PROG_SEC(string, ptype) BPF_PROG_SEC_IMPL(string, ptype, 0, -EINVAL)
+#define BPF_PROG_SEC(string, ptype) BPF_PROG_SEC_IMPL(string, ptype, 0, 0, 0)
 
 /* Programs that can be attached. */
 #define BPF_APROG_SEC(string, ptype, atype) \
-	BPF_PROG_SEC_IMPL(string, ptype, 0, atype)
+	BPF_PROG_SEC_IMPL(string, ptype, 0, 1, atype)
 
 /* Programs that must specify expected attach type at load time. */
 #define BPF_EAPROG_SEC(string, ptype, eatype) \
-	BPF_PROG_SEC_IMPL(string, ptype, eatype, eatype)
+	BPF_PROG_SEC_IMPL(string, ptype, eatype, 1, eatype)
 
 /* Programs that can be attached but attach type can't be identified by section
  * name. Kept for backward compatibility.
@@ -2108,6 +2108,7 @@ static const struct {
 	size_t len;
 	enum bpf_prog_type prog_type;
 	enum bpf_attach_type expected_attach_type;
+	int is_attachable;
 	enum bpf_attach_type attach_type;
 } section_names[] = {
 	BPF_PROG_SEC("socket",			BPF_PROG_TYPE_SOCKET_FILTER),
@@ -2198,7 +2199,7 @@ int libbpf_attach_type_by_name(const char *name,
 	for (i = 0; i < ARRAY_SIZE(section_names); i++) {
 		if (strncmp(name, section_names[i].sec, section_names[i].len))
 			continue;
-		if (section_names[i].attach_type == -EINVAL)
+		if (!section_names[i].is_attachable)
 			return -EINVAL;
 		*attach_type = section_names[i].attach_type;
 		return 0;
-- 
2.17.1

^ permalink raw reply related

* Re: [PATCH net] openvswitch: Fix push/pop ethernet validation
From: Gregory Rose @ 2018-10-31 20:13 UTC (permalink / raw)
  To: Jaime Caamaño Ruiz, netdev; +Cc: pshelar
In-Reply-To: <20181031175203.23808-1-jcaamano@suse.com>

On 10/31/2018 10:52 AM, Jaime Caamaño Ruiz wrote:
> When there are both pop and push ethernet header actions among the
> actions to be applied to a packet, an unexpected EINVAL (Invalid
> argument) error is obtained. This is due to mac_proto not being reset
> correctly when those actions are validated.
>
> Reported-at:
> https://mail.openvswitch.org/pipermail/ovs-discuss/2018-October/047554.html
> Fixes: 91820da6ae85 ("openvswitch: add Ethernet push and pop actions")
> Signed-off-by: Jaime Caamaño Ruiz <jcaamano@suse.com>
> ---
>   net/openvswitch/flow_netlink.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> index a70097ecf33c..865ecef68196 100644
> --- a/net/openvswitch/flow_netlink.c
> +++ b/net/openvswitch/flow_netlink.c
> @@ -3030,7 +3030,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
>   			 * is already present */
>   			if (mac_proto != MAC_PROTO_NONE)
>   				return -EINVAL;
> -			mac_proto = MAC_PROTO_NONE;
> +			mac_proto = MAC_PROTO_ETHERNET;
>   			break;
>   
>   		case OVS_ACTION_ATTR_POP_ETH:
> @@ -3038,7 +3038,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
>   				return -EINVAL;
>   			if (vlan_tci & htons(VLAN_TAG_PRESENT))
>   				return -EINVAL;
> -			mac_proto = MAC_PROTO_ETHERNET;
> +			mac_proto = MAC_PROTO_NONE;
>   			break;
>   
>   		case OVS_ACTION_ATTR_PUSH_NSH:

Thanks Jaime!

Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
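
A minimal sketch of the validation logic after the fix, assuming a
simplified action list that contains only push and pop Ethernet actions.
The enum values are redefined locally and validate_actions() is a
hypothetical stand-in for the relevant part of
__ovs_nla_copy_actions(); the VLAN check and all other actions are left
out. The tracked mac_proto has to follow what each action does to the
packet, otherwise a pop followed by a push is rejected:

```c
#include <errno.h>
#include <stddef.h>

/* Local stand-ins for the kernel definitions. */
enum mac_proto { MAC_PROTO_NONE, MAC_PROTO_ETHERNET };
enum action { ACT_PUSH_ETH, ACT_POP_ETH };

/* Walk the action list and track whether an Ethernet header is present. */
static int validate_actions(const enum action *acts, size_t n,
			    enum mac_proto mac_proto)
{
	size_t i;

	for (i = 0; i < n; i++) {
		switch (acts[i]) {
		case ACT_PUSH_ETH:
			/* Pushing is only valid if no header is present. */
			if (mac_proto != MAC_PROTO_NONE)
				return -EINVAL;
			mac_proto = MAC_PROTO_ETHERNET;
			break;
		case ACT_POP_ETH:
			/* Popping is only valid if a header is present. */
			if (mac_proto != MAC_PROTO_ETHERNET)
				return -EINVAL;
			mac_proto = MAC_PROTO_NONE;
			break;
		}
	}
	return 0;
}
```

With the two assignments swapped, as before the patch, the second action
of a pop/push pair would see the stale state and fail with -EINVAL.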

^ permalink raw reply

* Re: [PATCH v2 1/2] kretprobe: produce sane stack traces
From: kbuild test robot @ 2018-10-31 20:18 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: kbuild-all, Naveen N. Rao, Anil S Keshavamurthy, David S. Miller,
	Masami Hiramatsu, Jonathan Corbet, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Steven Rostedt, Shuah Khan, Alexei Starovoitov,
	Daniel Borkmann, Aleksa Sarai, Brendan Gregg, Christian Brauner,
	Aleksa 
In-Reply-To: <20181031152543.12138-2-cyphar@cyphar.com>

[-- Attachment #1: Type: text/plain, Size: 3347 bytes --]

Hi Aleksa,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/perf/core]
[also build test ERROR on v4.19 next-20181031]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Aleksa-Sarai/kretprobe-produce-sane-stack-traces/20181101-034104
config: i386-tinyconfig (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All error/warnings (new ones prefixed by >>):

   kernel/events/callchain.c: In function 'get_perf_callchain':
>> kernel/events/callchain.c:201:35: error: implicit declaration of function 'current_kretprobe_instance'; did you mean 'current_top_of_stack'? [-Werror=implicit-function-declaration]
      struct kretprobe_instance *ri = current_kretprobe_instance();
                                      ^~~~~~~~~~~~~~~~~~~~~~~~~~
                                      current_top_of_stack
>> kernel/events/callchain.c:201:35: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
>> kernel/events/callchain.c:206:4: error: implicit declaration of function 'kretprobe_perf_callchain_kernel'; did you mean 'perf_callchain_kernel'? [-Werror=implicit-function-declaration]
       kretprobe_perf_callchain_kernel(ri, &ctx);
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       perf_callchain_kernel
   cc1: some warnings being treated as errors

vim +201 kernel/events/callchain.c

   178	
   179	struct perf_callchain_entry *
   180	get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
   181			   u32 max_stack, bool crosstask, bool add_mark)
   182	{
   183		struct perf_callchain_entry *entry;
   184		struct perf_callchain_entry_ctx ctx;
   185		int rctx;
   186	
   187		entry = get_callchain_entry(&rctx);
   188		if (rctx == -1)
   189			return NULL;
   190	
   191		if (!entry)
   192			goto exit_put;
   193	
   194		ctx.entry     = entry;
   195		ctx.max_stack = max_stack;
   196		ctx.nr	      = entry->nr = init_nr;
   197		ctx.contexts       = 0;
   198		ctx.contexts_maxed = false;
   199	
   200		if (kernel && !user_mode(regs)) {
 > 201			struct kretprobe_instance *ri = current_kretprobe_instance();
   202	
   203			if (add_mark)
   204				perf_callchain_store_context(&ctx, PERF_CONTEXT_KERNEL);
   205			if (ri)
 > 206				kretprobe_perf_callchain_kernel(ri, &ctx);
   207			else
   208				perf_callchain_kernel(&ctx, regs);
   209		}
   210	
   211		if (user) {
   212			if (!user_mode(regs)) {
   213				if  (current->mm)
   214					regs = task_pt_regs(current);
   215				else
   216					regs = NULL;
   217			}
   218	
   219			if (regs) {
   220				mm_segment_t fs;
   221	
   222				if (crosstask)
   223					goto exit_put;
   224	
   225				if (add_mark)
   226					perf_callchain_store_context(&ctx, PERF_CONTEXT_USER);
   227	
   228				fs = get_fs();
   229				set_fs(USER_DS);
   230				perf_callchain_user(&ctx, regs);
   231				set_fs(fs);
   232			}
   233		}
   234	
   235	exit_put:
   236		put_callchain_entry(rctx);
   237	
   238		return entry;
   239	}
   240	
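
Errors of this kind are commonly avoided with a stub for the
configuration in which the real function is not built. The sketch below
assumes that the declarations are only visible when kretprobe support is
configured, which this report does not confirm, and uses
CONFIG_KRETPROBES as the guard purely for illustration:

```c
#include <stddef.h>

struct kretprobe_instance;

#ifdef CONFIG_KRETPROBES
/* Real implementation lives elsewhere when the feature is enabled. */
struct kretprobe_instance *current_kretprobe_instance(void);
#else
/* Stub: no kretprobe can be active, so callers take the normal path. */
static inline struct kretprobe_instance *current_kretprobe_instance(void)
{
	return NULL;
}
#endif
```

A caller such as get_perf_callchain() then compiles in both
configurations, and the "if (ri)" branch is simply never taken when the
stub is in use.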

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 6493 bytes --]

^ permalink raw reply

* Re: [PATCH bpf] libbpf: Fix compile error in libbpf_attach_type_by_name
From: Arnaldo Carvalho de Melo @ 2018-10-31 20:49 UTC (permalink / raw)
  To: Andrey Ignatov; +Cc: netdev, ast, daniel, kernel-team
In-Reply-To: <20181031195718.307757-1-rdna@fb.com>

On Wed, Oct 31, 2018 at 12:57:18PM -0700, Andrey Ignatov wrote:
> Arnaldo Carvalho de Melo reported a build error in libbpf when clang
> version 3.8.1-24 (tags/RELEASE_381/final) is used:
> 
> libbpf.c:2201:36: error: comparison of constant -22 with expression of
> type 'const enum bpf_attach_type' is always false
> [-Werror,-Wtautological-constant-out-of-range-compare]
>                 if (section_names[i].attach_type == -EINVAL)
>                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^  ~~~~~~~
> 1 error generated.
> 
> Fix the error by keeping the "is_attachable" property of a program in a
> separate struct field instead of trying to use attach_type itself.

Thanks, now it builds on all the previously failing systems:

# export PERF_TARBALL=http://192.168.86.4/perf/perf-4.19.0.tar.xz
# dm debian:9 fedora:25 fedora:26 fedora:27 ubuntu:16.04 ubuntu:17.10
   1 debian:9        : Ok   gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516          clang version 3.8.1-24 (tags/RELEASE_381/final)
   2 fedora:25       : Ok   gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)           clang version 3.9.1 (tags/RELEASE_391/final)
   3 fedora:26       : Ok   gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2)           clang version 4.0.1 (tags/RELEASE_401/final)
   4 fedora:27       : Ok   gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6)           clang version 5.0.2 (tags/RELEASE_502/final)
   5 ubuntu:16.04    : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609  clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
   6 ubuntu:17.10    : Ok   gcc (Ubuntu 7.2.0-8ubuntu3.2) 7.2.0                  clang version 4.0.1-6 (tags/RELEASE_401/final)
#

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>

I also have it tentatively applied to my perf/urgent branch, which I'll
push upstream soon.

- Arnaldo
 
> Fixes: 956b620fcf0b ("libbpf: Introduce libbpf_attach_type_by_name")
> Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
> Signed-off-by: Andrey Ignatov <rdna@fb.com>
> ---
>  tools/lib/bpf/libbpf.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index b607be7236d3..d6e62e90e8d4 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -2084,19 +2084,19 @@ void bpf_program__set_expected_attach_type(struct bpf_program *prog,
>  	prog->expected_attach_type = type;
>  }
>  
> -#define BPF_PROG_SEC_IMPL(string, ptype, eatype, atype) \
> -	{ string, sizeof(string) - 1, ptype, eatype, atype }
> +#define BPF_PROG_SEC_IMPL(string, ptype, eatype, is_attachable, atype) \
> +	{ string, sizeof(string) - 1, ptype, eatype, is_attachable, atype }
>  
>  /* Programs that can NOT be attached. */
> -#define BPF_PROG_SEC(string, ptype) BPF_PROG_SEC_IMPL(string, ptype, 0, -EINVAL)
> +#define BPF_PROG_SEC(string, ptype) BPF_PROG_SEC_IMPL(string, ptype, 0, 0, 0)
>  
>  /* Programs that can be attached. */
>  #define BPF_APROG_SEC(string, ptype, atype) \
> -	BPF_PROG_SEC_IMPL(string, ptype, 0, atype)
> +	BPF_PROG_SEC_IMPL(string, ptype, 0, 1, atype)
>  
>  /* Programs that must specify expected attach type at load time. */
>  #define BPF_EAPROG_SEC(string, ptype, eatype) \
> -	BPF_PROG_SEC_IMPL(string, ptype, eatype, eatype)
> +	BPF_PROG_SEC_IMPL(string, ptype, eatype, 1, eatype)
>  
>  /* Programs that can be attached but attach type can't be identified by section
>   * name. Kept for backward compatibility.
> @@ -2108,6 +2108,7 @@ static const struct {
>  	size_t len;
>  	enum bpf_prog_type prog_type;
>  	enum bpf_attach_type expected_attach_type;
> +	int is_attachable;
>  	enum bpf_attach_type attach_type;
>  } section_names[] = {
>  	BPF_PROG_SEC("socket",			BPF_PROG_TYPE_SOCKET_FILTER),
> @@ -2198,7 +2199,7 @@ int libbpf_attach_type_by_name(const char *name,
>  	for (i = 0; i < ARRAY_SIZE(section_names); i++) {
>  		if (strncmp(name, section_names[i].sec, section_names[i].len))
>  			continue;
> -		if (section_names[i].attach_type == -EINVAL)
> +		if (!section_names[i].is_attachable)
>  			return -EINVAL;
>  		*attach_type = section_names[i].attach_type;
>  		return 0;
> -- 
> 2.17.1

^ permalink raw reply

* Re: Latest net-next kernel 4.19.0+
From: Saeed Mahameed @ 2018-10-31 21:05 UTC (permalink / raw)
  To: eric.dumazet@gmail.com, xiyou.wangcong@gmail.com
  Cc: pstaszewski@itcare.pl, netdev@vger.kernel.org,
	dmichail@google.com
In-Reply-To: <CAM_iQpUKTh51maAzht8M3LuJAYDRMRnsGn_+Db0rGG-scW2SnA@mail.gmail.com>

On Tue, 2018-10-30 at 10:32 -0700, Cong Wang wrote:
> On Tue, Oct 30, 2018 at 7:16 AM Eric Dumazet <eric.dumazet@gmail.com>
> wrote:
> > 
> > 
> > 
> > On 10/30/2018 01:09 AM, Paweł Staszewski wrote:
> > > 
> > > 
> > > On 30.10.2018 at 08:29, Eric Dumazet wrote:
> > > > 
> > > > On 10/29/2018 11:09 PM, Dimitris Michailidis wrote:
> > > > 
> > > > > Indeed this is a bug. I would expect it to produce frequent
> > > > > errors
> > > > > though as many odd-length
> > > > > packets would trigger it. Do you have RXFCS? Regardless, how
> > > > > frequently do you see the problem?
> > > > > 
> > > > 
> > > > Old kernels (before 88078d98d1bb) were simply resetting
> > > > ip_summed to CHECKSUM_NONE
> > > > 
> > > > And before your fix (commit d55bef5059dd057bd), mlx5 bug was
> > > > canceling the bug you fixed.
> > > > 
> > > > So we now need to also fix mlx5.
> > > > 
> > > > And of course use skb_header_pointer() in mlx5e_get_fcs() as I
> > > > mentioned earlier,
> > > > plus __get_unaligned_cpu32() as you hinted.
> > > > 
> > > > 
> > > > 
> > > > 
> > > 
> > > No RXFCS
> 
> 
> Same with Pawel, RXFCS is disabled by default.
> 
> 
> > > 
> > > And this trace is really frequent, like once per 3/4 seconds,
> > > like below:
> > > [28965.776864] vlan1490: hw csum failure
> > 
> > Might be vlan related.
> 

Hi Pawel, is the vlan stripping offload disabled or enabled in your
case?

To verify:
ethtool -k <interface> | grep rx-vlan-offload
rx-vlan-offload: on
To set:
ethtool -K <interface> rxvlan on/off

If the vlan offload is off then it will trigger the mlx5e vlan csum
adjustment code pointed out by Eric.

Anyhow, it should work in both cases, but I am trying to narrow down
the possibilities.

Also, could it be a double-tagged packet?


> Unlike Pawel's case, we don't use vlan at all, maybe this is why we
> see
> it much less frequently than Pawel.
> 
> Also, it is probably not specific to mlx5, as there is another report
> which
> is probably a non-mlx5 driver.
> 

Cong, how often does this happen? Can you somehow verify whether the
problematic packet has extra end padding after the IP payload?
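
One way to check a captured frame for such padding is to compare the
IPv4 total length field with the frame length. A minimal sketch,
assuming an untagged Ethernet frame captured without the FCS;
ipv4_trailing_bytes() is a hypothetical helper, not a kernel or driver
function:

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN	14
#define ETHERTYPE_IPV4	0x0800
#define IPV4_MIN_HDR	20

/* Returns the number of bytes that follow the IPv4 datagram inside the
 * frame (for example padding up to the 60-byte Ethernet minimum), or -1
 * if the frame is not untagged IPv4 or is malformed.
 */
static long ipv4_trailing_bytes(const uint8_t *frame, size_t len)
{
	size_t tot_len;

	if (len < ETH_HDR_LEN + IPV4_MIN_HDR)
		return -1;
	/* EtherType sits at bytes 12-13 of an untagged frame. */
	if (((frame[12] << 8) | frame[13]) != ETHERTYPE_IPV4)
		return -1;
	if ((frame[ETH_HDR_LEN] >> 4) != 4)
		return -1;
	/* IPv4 total length: bytes 2-3 of the IP header, big endian. */
	tot_len = ((size_t)frame[ETH_HDR_LEN + 2] << 8) |
		  frame[ETH_HDR_LEN + 3];
	if (tot_len < IPV4_MIN_HDR || ETH_HDR_LEN + tot_len > len)
		return -1;
	return (long)(len - ETH_HDR_LEN - tot_len);
}
```

A non-zero result means the frame carries bytes beyond the IP payload.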

It would be cool if we had a feature in the kernel to store such an SKB
in memory when such an issue occurs, and let the user dump it later (via
tcpdump) and send the dump to the vendor for debug so we could just
replay and see what happens.

> Thanks.

^ permalink raw reply

* Re: [RFC PATCH 4/4] ixgbe: add support for extended PHC gettime
From: Richard Cochran @ 2018-10-31 21:16 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: Keller, Jacob E, netdev@vger.kernel.org,
	intel-wired-lan@lists.osuosl.org
In-Reply-To: <20181031144935.GR31668@localhost>

On Wed, Oct 31, 2018 at 03:49:35PM +0100, Miroslav Lichvar wrote:
> 
> How about separating the PHC timestamp from the ptp_system_timestamp
> structure and use NULL to indicate we don't want to read the system
> clock? A gettimex64(ptp, ts, NULL) call would be equal to
> gettime64(ptp, ts).

Doesn't sound too bad to me.

Thanks,
Richard
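
A minimal sketch of the proposed calling convention, assuming
hypothetical names throughout (sys_timestamps, read_device_clock,
read_system_clock, example_gettimex and example_gettime are stand-ins,
not the PTP core API, and both clocks are faked so the sketch is
self-contained). Passing NULL for the system timestamps skips the system
clock reads, which makes the plain gettime a thin wrapper:

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical pair of system clock readings taken around the device read. */
struct sys_timestamps {
	struct timespec pre_ts;
	struct timespec post_ts;
};

/* Fake device clock: always returns a fixed time. */
static void read_device_clock(struct timespec *ts)
{
	ts->tv_sec = 1000;
	ts->tv_nsec = 0;
}

/* Fake system clock: a counter that advances on every read. */
static long fake_sys_ns;

static void read_system_clock(struct timespec *ts)
{
	ts->tv_sec = 0;
	ts->tv_nsec = ++fake_sys_ns;
}

/* Extended read: sts may be NULL when the caller does not need the
 * surrounding system timestamps.
 */
static int example_gettimex(struct timespec *ts, struct sys_timestamps *sts)
{
	if (sts)
		read_system_clock(&sts->pre_ts);
	read_device_clock(ts);
	if (sts)
		read_system_clock(&sts->post_ts);
	return 0;
}

/* Plain read expressed in terms of the extended one. */
static int example_gettime(struct timespec *ts)
{
	return example_gettimex(ts, NULL);
}
```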

^ permalink raw reply

* Re: Latest net-next kernel 4.19.0+
From: Cong Wang @ 2018-10-31 21:17 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Eric Dumazet, Paweł Staszewski,
	Linux Kernel Network Developers, dmichail
In-Reply-To: <7f19ab59f1bbfe74cf3d056ccd9adf556cd09f60.camel@mellanox.com>

On Wed, Oct 31, 2018 at 2:05 PM Saeed Mahameed <saeedm@mellanox.com> wrote:
>
> Cong, how often does this happen? Can you somehow verify whether the
> problematic packet has extra end padding after the IP payload?

For us, it takes 10+ hours to get one warning. This is also
why we have never captured the packet that causes this warning.


>
> It would be cool if we had a feature in the kernel to store such an SKB
> in memory when such an issue occurs, and let the user dump it later (via
> tcpdump) and send the dump to the vendor for debug so we could just
> replay and see what happens.
>

Yeah, the warning kinda sucks; it tells us almost nothing. The SKB
should be dumped when this warning fires.

^ permalink raw reply

* Re: [RFC PATCH] lib: Introduce generic __cmpxchg_u64() and use it where needed
From: Trond Myklebust @ 2018-11-01  6:30 UTC (permalink / raw)
  To: linux@roeck-us.net, paul.burton@mips.com
  Cc: linux-kernel@vger.kernel.org, ralf@linux-mips.org,
	jlayton@kernel.org, linuxppc-dev@lists.ozlabs.org,
	bfields@fieldses.org, linux-mips@linux-mips.org,
	linux-nfs@vger.kernel.org, akpm@linux-foundation.org,
	anna.schumaker@netapp.com, jhogan@kernel.org,
	netdev@vger.kernel.org, davem@davemloft.net, arnd@arndb.de,
	paulus@samba.org, mpe@ellerman.id.au,
	"benh@kernel.crashing.org" <benh@
In-Reply-To: <291af20b-820e-e848-cf75-730024612117@roeck-us.net>

On Wed, 2018-10-31 at 18:18 -0700, Guenter Roeck wrote:
> On 10/31/18 4:32 PM, Paul Burton wrote:
> > (Copying SunRPC & net maintainers.)
> > 
> > Hi Guenter,
> > 
> > On Wed, Oct 31, 2018 at 03:02:53PM -0700, Guenter Roeck wrote:
> > > The alternatives I can see are
> > > - Do not use cmpxchg64() outside architecture code (ie drop its
> > > use from
> > >    the offending driver, and keep doing the same whenever the
> > > problem comes
> > >    up again).
> > > or
> > > - Introduce something like ARCH_HAS_CMPXCHG64 and use it to
> > > determine
> > >    if cmpxchg64 is supported or not.
> > > 
> > > Any preference ?
> > 
> > My preference would be option 1 - avoiding cmpxchg64() where
> > possible in
> > generic code. I wouldn't be opposed to the Kconfig option if there
> > are
> > cases where cmpxchg64() can really help performance though.
> > 
> > The last time I'm aware of this coming up the affected driver was
> > modified to avoid cmpxchg64() [1].
> > 
> > In this particular case I have no idea why
> > net/sunrpc/auth_gss/gss_krb5_seal.c is using cmpxchg64() at all.
> > It's
> > essentially reinventing atomic64_fetch_inc() which is already
> > provided
> > everywhere via CONFIG_GENERIC_ATOMIC64 & the spinlock approach. At
> > least
> > for atomic64_* functions the assumption that all access will be
> > performed using those same functions seems somewhat reasonable.
> > 
> > So how does the below look? Trond?
> > 
> 
> For my part I agree that this would be a much better solution. The
> argument
> that it is not always absolutely guaranteed that atomics don't wrap
> doesn't
> really hold for me because it looks like they all do. On top of that,
> there
> is an explicit atomic_dec_if_positive() and
> atomic_fetch_add_unless(),
> which to me strongly suggests that they _are_ supposed to wrap.
> Given the cost of adding a comparison to each atomic operation to
> prevent it from wrapping, anything else would not really make sense
> to me.

That's a hypothesis, not a proven fact. There are architectures out
there that do not wrap signed integers, hence my question.
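
For reference, the equivalence Paul describes can be sketched in
userspace with C11 <stdatomic.h>. This is an analogue only: the function
names are hypothetical, the kernel's atomic64_t API is not used, and the
counter is an unsigned type, for which wraparound is well defined in C.
It therefore says nothing about how signed kernel atomics behave on
every architecture:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Open-coded fetch-and-increment built from a compare-and-exchange loop,
 * in the style of the helpers removed by the patch below.
 */
static uint64_t seq_fetch_inc_cas(_Atomic uint64_t *seq)
{
	uint64_t old = atomic_load(seq);

	/* On failure the current value is written back into 'old'. */
	while (!atomic_compare_exchange_weak(seq, &old, old + 1))
		;
	return old;
}

/* The same operation expressed with the library primitive. */
static uint64_t seq_fetch_inc(_Atomic uint64_t *seq)
{
	return atomic_fetch_add(seq, 1);
}
```

Both functions return the value held before the increment; the second
leaves the retry logic to the implementation.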

> So ... please consider my patch abandoned. Thanks for looking into
> this!
> 
> Guenter
> 
> > Thanks,
> >      Paul
> > 
> > [1] https://patchwork.ozlabs.org/cover/891284/
> > 
> > ---
> > diff --git a/include/linux/sunrpc/gss_krb5.h
> > b/include/linux/sunrpc/gss_krb5.h
> > index 131424cefc6a..02c0412e368c 100644
> > --- a/include/linux/sunrpc/gss_krb5.h
> > +++ b/include/linux/sunrpc/gss_krb5.h
> > @@ -107,8 +107,8 @@ struct krb5_ctx {
> >   	u8			Ksess[GSS_KRB5_MAX_KEYLEN]; /* session key
> > */
> >   	u8			cksum[GSS_KRB5_MAX_KEYLEN];
> >   	s32			endtime;
> > -	u32			seq_send;
> > -	u64			seq_send64;
> > +	atomic_t		seq_send;
> > +	atomic64_t		seq_send64;
> >   	struct xdr_netobj	mech_used;
> >   	u8			initiator_sign[GSS_KRB5_MAX_KEYLEN];
> >   	u8			acceptor_sign[GSS_KRB5_MAX_KEYLEN];
> > @@ -118,9 +118,6 @@ struct krb5_ctx {
> >   	u8			acceptor_integ[GSS_KRB5_MAX_KEYLEN];
> >   };
> >   
> > -extern u32 gss_seq_send_fetch_and_inc(struct krb5_ctx *ctx);
> > -extern u64 gss_seq_send64_fetch_and_inc(struct krb5_ctx *ctx);
> > -
> >   /* The length of the Kerberos GSS token header */
> >   #define GSS_KRB5_TOK_HDR_LEN	(16)
> >   
> > diff --git a/net/sunrpc/auth_gss/gss_krb5_mech.c
> > b/net/sunrpc/auth_gss/gss_krb5_mech.c
> > index 7f0424dfa8f6..eab71fc7af3e 100644
> > --- a/net/sunrpc/auth_gss/gss_krb5_mech.c
> > +++ b/net/sunrpc/auth_gss/gss_krb5_mech.c
> > @@ -274,6 +274,7 @@ get_key(const void *p, const void *end,
> >   static int
> >   gss_import_v1_context(const void *p, const void *end, struct
> > krb5_ctx *ctx)
> >   {
> > +	u32 seq_send;
> >   	int tmp;
> >   
> >   	p = simple_get_bytes(p, end, &ctx->initiate, sizeof(ctx-
> > >initiate));
> > @@ -315,9 +316,10 @@ gss_import_v1_context(const void *p, const
> > void *end, struct krb5_ctx *ctx)
> >   	p = simple_get_bytes(p, end, &ctx->endtime, sizeof(ctx-
> > >endtime));
> >   	if (IS_ERR(p))
> >   		goto out_err;
> > -	p = simple_get_bytes(p, end, &ctx->seq_send, sizeof(ctx-
> > >seq_send));
> > +	p = simple_get_bytes(p, end, &seq_send, sizeof(seq_send));
> >   	if (IS_ERR(p))
> >   		goto out_err;
> > +	atomic_set(&ctx->seq_send, seq_send);
> >   	p = simple_get_netobj(p, end, &ctx->mech_used);
> >   	if (IS_ERR(p))
> >   		goto out_err;
> > @@ -607,6 +609,7 @@ static int
> >   gss_import_v2_context(const void *p, const void *end, struct
> > krb5_ctx *ctx,
> >   		gfp_t gfp_mask)
> >   {
> > +	u64 seq_send64;
> >   	int keylen;
> >   
> >   	p = simple_get_bytes(p, end, &ctx->flags, sizeof(ctx->flags));
> > @@ -617,14 +620,15 @@ gss_import_v2_context(const void *p, const
> > void *end, struct krb5_ctx *ctx,
> >   	p = simple_get_bytes(p, end, &ctx->endtime, sizeof(ctx-
> > >endtime));
> >   	if (IS_ERR(p))
> >   		goto out_err;
> > -	p = simple_get_bytes(p, end, &ctx->seq_send64, sizeof(ctx-
> > >seq_send64));
> > +	p = simple_get_bytes(p, end, &seq_send64, sizeof(seq_send64));
> >   	if (IS_ERR(p))
> >   		goto out_err;
> > +	atomic64_set(&ctx->seq_send64, seq_send64);
> >   	/* set seq_send for use by "older" enctypes */
> > -	ctx->seq_send = ctx->seq_send64;
> > -	if (ctx->seq_send64 != ctx->seq_send) {
> > -		dprintk("%s: seq_send64 %lx, seq_send %x overflow?\n",
> > __func__,
> > -			(unsigned long)ctx->seq_send64, ctx->seq_send);
> > +	atomic_set(&ctx->seq_send, seq_send64);
> > +	if (seq_send64 != atomic_read(&ctx->seq_send)) {
> > +		dprintk("%s: seq_send64 %llx, seq_send %x overflow?\n",
> > __func__,
> > +			seq_send64, atomic_read(&ctx->seq_send));
> >   		p = ERR_PTR(-EINVAL);
> >   		goto out_err;
> >   	}
> > diff --git a/net/sunrpc/auth_gss/gss_krb5_seal.c
> > b/net/sunrpc/auth_gss/gss_krb5_seal.c
> > index b4adeb06660b..48fe4a591b54 100644
> > --- a/net/sunrpc/auth_gss/gss_krb5_seal.c
> > +++ b/net/sunrpc/auth_gss/gss_krb5_seal.c
> > @@ -123,30 +123,6 @@ setup_token_v2(struct krb5_ctx *ctx, struct
> > xdr_netobj *token)
> >   	return krb5_hdr;
> >   }
> >   
> > -u32
> > -gss_seq_send_fetch_and_inc(struct krb5_ctx *ctx)
> > -{
> > -	u32 old, seq_send = READ_ONCE(ctx->seq_send);
> > -
> > -	do {
> > -		old = seq_send;
> > -		seq_send = cmpxchg(&ctx->seq_send, old, old + 1);
> > -	} while (old != seq_send);
> > -	return seq_send;
> > -}
> > -
> > -u64
> > -gss_seq_send64_fetch_and_inc(struct krb5_ctx *ctx)
> > -{
> > -	u64 old, seq_send = READ_ONCE(ctx->seq_send);
> > -
> > -	do {
> > -		old = seq_send;
> > -		seq_send = cmpxchg64(&ctx->seq_send64, old, old + 1);
> > -	} while (old != seq_send);
> > -	return seq_send;
> > -}
> > -
> >   static u32
> >   gss_get_mic_v1(struct krb5_ctx *ctx, struct xdr_buf *text,
> >   		struct xdr_netobj *token)
> > @@ -177,7 +153,7 @@ gss_get_mic_v1(struct krb5_ctx *ctx, struct
> > xdr_buf *text,
> >   
> >   	memcpy(ptr + GSS_KRB5_TOK_HDR_LEN, md5cksum.data,
> > md5cksum.len);
> >   
> > -	seq_send = gss_seq_send_fetch_and_inc(ctx);
> > +	seq_send = atomic_fetch_inc(&ctx->seq_send);
> >   
> >   	if (krb5_make_seq_num(ctx, ctx->seq, ctx->initiate ? 0 : 0xff,
> >   			      seq_send, ptr + GSS_KRB5_TOK_HDR_LEN, ptr
> > + 8))
> > @@ -205,7 +181,7 @@ gss_get_mic_v2(struct krb5_ctx *ctx, struct
> > xdr_buf *text,
> >   
> >   	/* Set up the sequence number. Now 64-bits in clear
> >   	 * text and w/o direction indicator */
> > -	seq_send_be64 = cpu_to_be64(gss_seq_send64_fetch_and_inc(ctx));
> > +	seq_send_be64 = cpu_to_be64(atomic64_fetch_inc(&ctx-
> > >seq_send64));
> >   	memcpy(krb5_hdr + 8, (char *) &seq_send_be64, 8);
> >   
> >   	if (ctx->initiate) {
> > diff --git a/net/sunrpc/auth_gss/gss_krb5_wrap.c
> > b/net/sunrpc/auth_gss/gss_krb5_wrap.c
> > index 962fa84e6db1..5cdde6cb703a 100644
> > --- a/net/sunrpc/auth_gss/gss_krb5_wrap.c
> > +++ b/net/sunrpc/auth_gss/gss_krb5_wrap.c
> > @@ -228,7 +228,7 @@ gss_wrap_kerberos_v1(struct krb5_ctx *kctx, int
> > offset,
> >   
> >   	memcpy(ptr + GSS_KRB5_TOK_HDR_LEN, md5cksum.data,
> > md5cksum.len);
> >   
> > -	seq_send = gss_seq_send_fetch_and_inc(kctx);
> > +	seq_send = atomic_fetch_inc(&kctx->seq_send);
> >   
> >   	/* XXX would probably be more efficient to compute checksum
> >   	 * and encrypt at the same time: */
> > @@ -475,7 +475,7 @@ gss_wrap_kerberos_v2(struct krb5_ctx *kctx, u32
> > offset,
> >   	*be16ptr++ = 0;
> >   
> >   	be64ptr = (__be64 *)be16ptr;
> > -	*be64ptr = cpu_to_be64(gss_seq_send64_fetch_and_inc(kctx));
> > +	*be64ptr = cpu_to_be64(atomic64_fetch_inc(&kctx->seq_send64));
> >   
> >   	err = (*kctx->gk5e->encrypt_v2)(kctx, offset, buf, pages);
> >   	if (err)
> > 
-- 
Trond Myklebust
CTO, Hammerspace Inc
4300 El Camino Real, Suite 105
Los Altos, CA 94022
www.hammer.space



^ permalink raw reply

* Re: Latest net-next kernel 4.19.0+
From: Paweł Staszewski @ 2018-10-31 21:22 UTC (permalink / raw)
  To: Saeed Mahameed, eric.dumazet@gmail.com, xiyou.wangcong@gmail.com
  Cc: netdev@vger.kernel.org, dmichail@google.com
In-Reply-To: <7f19ab59f1bbfe74cf3d056ccd9adf556cd09f60.camel@mellanox.com>



On 31.10.2018 at 22:05, Saeed Mahameed wrote:
> On Tue, 2018-10-30 at 10:32 -0700, Cong Wang wrote:
>> On Tue, Oct 30, 2018 at 7:16 AM Eric Dumazet <eric.dumazet@gmail.com>
>> wrote:
>>>
>>>
>>> On 10/30/2018 01:09 AM, Paweł Staszewski wrote:
>>>>
>>>> On 30.10.2018 at 08:29, Eric Dumazet wrote:
>>>>> On 10/29/2018 11:09 PM, Dimitris Michailidis wrote:
>>>>>
>>>>>> Indeed this is a bug. I would expect it to produce frequent
>>>>>> errors
>>>>>> though as many odd-length
>>>>>> packets would trigger it. Do you have RXFCS? Regardless, how
>>>>>> frequently do you see the problem?
>>>>>>
>>>>> Old kernels (before 88078d98d1bb) were simply resetting
>>>>> ip_summed to CHECKSUM_NONE
>>>>>
>>>>> And before your fix (commit d55bef5059dd057bd), mlx5 bug was
>>>>> canceling the bug you fixed.
>>>>>
>>>>> So we now need to also fix mlx5.
>>>>>
>>>>> And of course use skb_header_pointer() in mlx5e_get_fcs() as I
>>>>> mentioned earlier,
>>>>> plus __get_unaligned_cpu32() as you hinted.
>>>>>
>>>>>
>>>>>
>>>>>
>>>> No RXFCS
>>
>> Same with Pawel, RXFCS is disabled by default.
>>
>>
>>>> And this trace is really frequent, like once per 3/4 seconds,
>>>> like below:
>>>> [28965.776864] vlan1490: hw csum failure
>>> Might be vlan related.
> Hi Pawel, is the vlan stripping offload disabled or enabled in your
> case ?
>
> To verify:
> ethtool -k <interface> | grep rx-vlan-offload
> rx-vlan-offload: on
> To set:
> ethtool -K <interface> rxvlan on/off
Enabled:
ethtool -k enp175s0f0
Features for enp175s0f0:
rx-checksumming: on
tx-checksumming: on
         tx-checksum-ipv4: on
         tx-checksum-ip-generic: off [fixed]
         tx-checksum-ipv6: on
         tx-checksum-fcoe-crc: off [fixed]
         tx-checksum-sctp: off [fixed]
scatter-gather: on
         tx-scatter-gather: on
         tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
         tx-tcp-segmentation: on
         tx-tcp-ecn-segmentation: off [fixed]
         tx-tcp-mangleid-segmentation: off
         tx-tcp6-segmentation: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: on
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off
rx-all: off
tx-vlan-stag-hw-insert: on
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: on [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: on
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: on
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]


>
> if the vlan offload is off then it will trigger the mlx5e vlan csum
> adjustment code pointed out by Eric.
>
> Anyhow, it should work in both cases, but i am trying to narrow down
> the possibilities.
>
> Also could it be a double tagged packet ?
No double-tagged packets there.


>
>
>> Unlike Pawel's case, we don't use vlan at all, maybe this is why we
>> see
>> it much less frequently than Pawel.
>>
>> Also, it is probably not specific to mlx5, as there is another report
>> which
>> is probably a non-mlx5 driver.
>>
> Cong, how often does this happen? Can you somehow verify whether the
> problematic packet has extra end padding after the IP payload?
>
> It would be cool if we had a feature in kernel to store such SKB in
> memory when such issue occurs, and let the user dump it later (via
> tcpdump) and send the dump to the vendor for debug so we could just
> replay and see what happens.
>
>> Thanks.
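
For the padding question above, here is a rough userspace sketch (not kernel or mlx5 code) of how one could check a captured frame for extra bytes after the IPv4 payload. It assumes a raw Ethernet frame without the FCS, handles at most one 802.1Q tag, and ignores IPv6. The names read_be16() and ipv4_trailing_pad() are made up for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN     14      /* dst MAC + src MAC + ethertype */
#define VLAN_TAG_LEN    4       /* one 802.1Q tag */
#define IPV4_MIN_HDR    20
#define ETHERTYPE_IPV4  0x0800
#define ETHERTYPE_VLAN  0x8100

/* Read a big-endian 16-bit field byte by byte, so alignment never matters. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Hypothetical helper: return the number of bytes that follow the IPv4
 * datagram inside the frame (i.e. trailing padding), or -1 if the frame
 * is not IPv4, is truncated, or carries an inconsistent total length.
 *
 * Note: with rx-vlan-offload on, the NIC strips the tag before the
 * packet reaches the stack, so the VLAN branch only matters for frames
 * captured with the tag still in place.
 */
static long ipv4_trailing_pad(const uint8_t *frame, size_t len)
{
    size_t l3_off = ETH_HDR_LEN;
    uint16_t ethertype;
    uint16_t tot_len;

    if (len < ETH_HDR_LEN)
        return -1;

    ethertype = read_be16(frame + 12);
    if (ethertype == ETHERTYPE_VLAN) {
        if (len < ETH_HDR_LEN + VLAN_TAG_LEN)
            return -1;
        /* The encapsulated ethertype follows the 2-byte TCI. */
        ethertype = read_be16(frame + 16);
        l3_off += VLAN_TAG_LEN;
    }

    if (ethertype != ETHERTYPE_IPV4)
        return -1;
    if (len < l3_off + IPV4_MIN_HDR)
        return -1;
    if ((frame[l3_off] >> 4) != 4)      /* IP version nibble */
        return -1;

    /* IPv4 total length covers header plus payload, not the padding. */
    tot_len = read_be16(frame + l3_off + 2);
    if (tot_len < IPV4_MIN_HDR || l3_off + tot_len > len)
        return -1;

    return (long)(len - l3_off - tot_len);
}
```

A short IPv4 packet padded to the 60-byte Ethernet minimum would report a non-zero value here, which is the case where a hardware checksum computed over the whole frame no longer matches the IP payload alone.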

^ permalink raw reply

