Netdev List
 help / color / mirror / Atom feed
* [net-next 14/15] i40evf: cancel workqueue sync for adminq when a VF is removed
From: Jeff Kirsher @ 2018-08-29 22:48 UTC (permalink / raw)
  To: davem; +Cc: Lihong Yang, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20180829224834.3141-1-jeffrey.t.kirsher@intel.com>

From: Lihong Yang <lihong.yang@intel.com>

If a VF is being removed, there is no need to continue with the
workqueue sync for the adminq task, thus cancel it. Without this call,
when VFs are created and removed right away, there might be a chance for
the driver to crash with events stuck in the adminq.

Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index c7048cf484fb..174d1da2857b 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -3910,6 +3910,8 @@ static void i40evf_remove(struct pci_dev *pdev)
 	if (adapter->watchdog_timer.function)
 		del_timer_sync(&adapter->watchdog_timer);
 
+	cancel_work_sync(&adapter->adminq_task);
+
 	i40evf_free_rss(adapter);
 
 	if (hw->aq.asq.count)
-- 
2.17.1

^ permalink raw reply related

* [net-next 10/15] i40e: report correct statistics when XDP is enabled
From: Jeff Kirsher @ 2018-08-29 22:48 UTC (permalink / raw)
  To: davem
  Cc: Björn Töpel, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher
In-Reply-To: <20180829224834.3141-1-jeffrey.t.kirsher@intel.com>

From: Björn Töpel <bjorn.topel@intel.com>

When XDP is enabled, the driver will report incorrect
statistics. Received frames will reported as transmitted frames.

This commits fixes the i40e implementation of ndo_get_stats64 (struct
net_device_ops), so that iproute2 will report correct statistics
(e.g. when running "ip -stats link show dev eth0") even when XDP is
enabled.

Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Fixes: 74608d17fe29 ("i40e: add support for XDP_TX action")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 24 +++++++++++----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 62f2b5bce6bb..6cb4076e8fba 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -420,9 +420,9 @@ static void i40e_get_netdev_stats_struct(struct net_device *netdev,
 				  struct rtnl_link_stats64 *stats)
 {
 	struct i40e_netdev_priv *np = netdev_priv(netdev);
-	struct i40e_ring *tx_ring, *rx_ring;
 	struct i40e_vsi *vsi = np->vsi;
 	struct rtnl_link_stats64 *vsi_stats = i40e_get_vsi_stats_struct(vsi);
+	struct i40e_ring *ring;
 	int i;
 
 	if (test_bit(__I40E_VSI_DOWN, vsi->state))
@@ -436,24 +436,26 @@ static void i40e_get_netdev_stats_struct(struct net_device *netdev,
 		u64 bytes, packets;
 		unsigned int start;
 
-		tx_ring = READ_ONCE(vsi->tx_rings[i]);
-		if (!tx_ring)
+		ring = READ_ONCE(vsi->tx_rings[i]);
+		if (!ring)
 			continue;
-		i40e_get_netdev_stats_struct_tx(tx_ring, stats);
+		i40e_get_netdev_stats_struct_tx(ring, stats);
 
-		rx_ring = &tx_ring[1];
+		if (i40e_enabled_xdp_vsi(vsi)) {
+			ring++;
+			i40e_get_netdev_stats_struct_tx(ring, stats);
+		}
 
+		ring++;
 		do {
-			start = u64_stats_fetch_begin_irq(&rx_ring->syncp);
-			packets = rx_ring->stats.packets;
-			bytes   = rx_ring->stats.bytes;
-		} while (u64_stats_fetch_retry_irq(&rx_ring->syncp, start));
+			start   = u64_stats_fetch_begin_irq(&ring->syncp);
+			packets = ring->stats.packets;
+			bytes   = ring->stats.bytes;
+		} while (u64_stats_fetch_retry_irq(&ring->syncp, start));
 
 		stats->rx_packets += packets;
 		stats->rx_bytes   += bytes;
 
-		if (i40e_enabled_xdp_vsi(vsi))
-			i40e_get_netdev_stats_struct_tx(&rx_ring[1], stats);
 	}
 	rcu_read_unlock();
 
-- 
2.17.1

^ permalink raw reply related

* [net-next 12/15] i40evf: Don't enable vlan stripping when rx offload is turned on
From: Jeff Kirsher @ 2018-08-29 22:48 UTC (permalink / raw)
  To: davem; +Cc: Patryk Małek, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher
In-Reply-To: <20180829224834.3141-1-jeffrey.t.kirsher@intel.com>

From: Patryk Małek <patryk.malek@intel.com>

With current implementation of i40evf_set_features when user sets
any offload via ethtool we set I40EVF_FLAG_AQ_ENABLE_VLAN_STRIPPING
as a required aq which triggers driver to call
i40evf_enable_vlan_stripping. This shouldn't take place.
This patches fixes it by setting the flag only when VLAN offload
is turned on.

Signed-off-by: Patryk Małek <patryk.malek@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 07f187b5bcac..c7048cf484fb 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -3120,18 +3120,19 @@ static int i40evf_set_features(struct net_device *netdev,
 {
 	struct i40evf_adapter *adapter = netdev_priv(netdev);
 
-	/* Don't allow changing VLAN_RX flag when VLAN is set for VF
-	 * and return an error in this case
+	/* Don't allow changing VLAN_RX flag when adapter is not capable
+	 * of VLAN offload
 	 */
-	if (VLAN_ALLOWED(adapter)) {
+	if (!VLAN_ALLOWED(adapter)) {
+		if ((netdev->features ^ features) & NETIF_F_HW_VLAN_CTAG_RX)
+			return -EINVAL;
+	} else if ((netdev->features ^ features) & NETIF_F_HW_VLAN_CTAG_RX) {
 		if (features & NETIF_F_HW_VLAN_CTAG_RX)
 			adapter->aq_required |=
 				I40EVF_FLAG_AQ_ENABLE_VLAN_STRIPPING;
 		else
 			adapter->aq_required |=
 				I40EVF_FLAG_AQ_DISABLE_VLAN_STRIPPING;
-	} else if ((netdev->features ^ features) & NETIF_F_HW_VLAN_CTAG_RX) {
-		return -EINVAL;
 	}
 
 	return 0;
-- 
2.17.1

^ permalink raw reply related

* [net-next 05/15] i40evf: Validate the number of queues a PF sends
From: Jeff Kirsher @ 2018-08-29 22:48 UTC (permalink / raw)
  To: davem
  Cc: Paul M Stillwell Jr, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher
In-Reply-To: <20180829224834.3141-1-jeffrey.t.kirsher@intel.com>

From: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>

A PF can send any number of queues to the VF and the VF may not
be able to support that many. Check to see that the number of
queues is less than or equal to the max number of queues the
VF can have.

Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 .../ethernet/intel/i40evf/i40evf_virtchnl.c   | 32 +++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
index 79b7be83b5c3..6579dabab78c 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
@@ -153,6 +153,32 @@ int i40evf_send_vf_config_msg(struct i40evf_adapter *adapter)
 					  NULL, 0);
 }
 
+/**
+ * i40evf_validate_num_queues
+ * @adapter: adapter structure
+ *
+ * Validate that the number of queues the PF has sent in
+ * VIRTCHNL_OP_GET_VF_RESOURCES is not larger than the VF can handle.
+ **/
+static void i40evf_validate_num_queues(struct i40evf_adapter *adapter)
+{
+	if (adapter->vf_res->num_queue_pairs > I40EVF_MAX_REQ_QUEUES) {
+		struct virtchnl_vsi_resource *vsi_res;
+		int i;
+
+		dev_info(&adapter->pdev->dev, "Received %d queues, but can only have a max of %d\n",
+			 adapter->vf_res->num_queue_pairs,
+			 I40EVF_MAX_REQ_QUEUES);
+		dev_info(&adapter->pdev->dev, "Fixing by reducing queues to %d\n",
+			 I40EVF_MAX_REQ_QUEUES);
+		adapter->vf_res->num_queue_pairs = I40EVF_MAX_REQ_QUEUES;
+		for (i = 0; i < adapter->vf_res->num_vsis; i++) {
+			vsi_res = &adapter->vf_res->vsi_res[i];
+			vsi_res->num_queue_pairs = I40EVF_MAX_REQ_QUEUES;
+		}
+	}
+}
+
 /**
  * i40evf_get_vf_config
  * @adapter: private adapter structure
@@ -195,6 +221,11 @@ int i40evf_get_vf_config(struct i40evf_adapter *adapter)
 	err = (i40e_status)le32_to_cpu(event.desc.cookie_low);
 	memcpy(adapter->vf_res, event.msg_buf, min(event.msg_len, len));
 
+	/* some PFs send more queues than we should have so validate that
+	 * we aren't getting too many queues
+	 */
+	if (!err)
+		i40evf_validate_num_queues(adapter);
 	i40e_vf_parse_hw_config(hw, adapter->vf_res);
 out_alloc:
 	kfree(event.msg_buf);
@@ -1329,6 +1360,7 @@ void i40evf_virtchnl_completion(struct i40evf_adapter *adapter,
 			  I40E_MAX_VF_VSI *
 			  sizeof(struct virtchnl_vsi_resource);
 		memcpy(adapter->vf_res, msg, min(msglen, len));
+		i40evf_validate_num_queues(adapter);
 		i40e_vf_parse_hw_config(&adapter->hw, adapter->vf_res);
 		if (is_zero_ether_addr(adapter->hw.mac.addr)) {
 			/* restore current mac address */
-- 
2.17.1

^ permalink raw reply related

* [net-next 13/15] i40e: hold the rtnl lock on clearing interrupt scheme
From: Jeff Kirsher @ 2018-08-29 22:48 UTC (permalink / raw)
  To: davem; +Cc: Patryk Małek, netdev, nhorman, sassmann, jogreene,
	Jeff Kirsher
In-Reply-To: <20180829224834.3141-1-jeffrey.t.kirsher@intel.com>

From: Patryk Małek <patryk.malek@intel.com>

Hold the rtnl lock when we're clearing interrupt scheme
in i40e_shutdown and in i40e_remove.

Signed-off-by: Patryk Małek <patryk.malek@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index c30feb27e1c0..112245f32d7d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -14182,6 +14182,7 @@ static void i40e_remove(struct pci_dev *pdev)
 	mutex_destroy(&hw->aq.asq_mutex);
 
 	/* Clear all dynamic memory lists of rings, q_vectors, and VSIs */
+	rtnl_lock();
 	i40e_clear_interrupt_scheme(pf);
 	for (i = 0; i < pf->num_alloc_vsi; i++) {
 		if (pf->vsi[i]) {
@@ -14190,6 +14191,7 @@ static void i40e_remove(struct pci_dev *pdev)
 			pf->vsi[i] = NULL;
 		}
 	}
+	rtnl_unlock();
 
 	for (i = 0; i < I40E_MAX_VEB; i++) {
 		kfree(pf->veb[i]);
@@ -14401,7 +14403,13 @@ static void i40e_shutdown(struct pci_dev *pdev)
 	wr32(hw, I40E_PFPM_WUFC,
 	     (pf->wol_en ? I40E_PFPM_WUFC_MAG_MASK : 0));
 
+	/* Since we're going to destroy queues during the
+	 * i40e_clear_interrupt_scheme() we should hold the RTNL lock for this
+	 * whole section
+	 */
+	rtnl_lock();
 	i40e_clear_interrupt_scheme(pf);
+	rtnl_unlock();
 
 	if (system_state == SYSTEM_POWER_OFF) {
 		pci_wake_from_d3(pdev, pf->wol_en);
-- 
2.17.1

^ permalink raw reply related

* [net-next 06/15] i40e: use correct length for strncpy
From: Jeff Kirsher @ 2018-08-29 22:48 UTC (permalink / raw)
  To: davem; +Cc: Mitch Williams, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20180829224834.3141-1-jeffrey.t.kirsher@intel.com>

From: Mitch Williams <mitch.a.williams@intel.com>

Caught by GCC 8. When we provide a length for strncpy, we should not
include the terminating null. So we must tell it one less than the size
of the destination buffer.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_ptp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_ptp.c b/drivers/net/ethernet/intel/i40e/i40e_ptp.c
index 35f2866b38c6..1199f0502d6d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ptp.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ptp.c
@@ -694,7 +694,8 @@ static long i40e_ptp_create_clock(struct i40e_pf *pf)
 	if (!IS_ERR_OR_NULL(pf->ptp_clock))
 		return 0;
 
-	strncpy(pf->ptp_caps.name, i40e_driver_name, sizeof(pf->ptp_caps.name));
+	strncpy(pf->ptp_caps.name, i40e_driver_name,
+		sizeof(pf->ptp_caps.name) - 1);
 	pf->ptp_caps.owner = THIS_MODULE;
 	pf->ptp_caps.max_adj = 999999999;
 	pf->ptp_caps.n_ext_ts = 0;
-- 
2.17.1

^ permalink raw reply related

* [net-next 07/15] i40evf: set IFF_UNICAST_FLT flag for the VF
From: Jeff Kirsher @ 2018-08-29 22:48 UTC (permalink / raw)
  To: davem; +Cc: Lihong Yang, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20180829224834.3141-1-jeffrey.t.kirsher@intel.com>

From: Lihong Yang <lihong.yang@intel.com>

Set IFF_UNICAST_FLT flag for the VF to prevent it from entering
promiscuous mode when macvlan is added to the VF.

Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 5906c1c1d19d..07f187b5bcac 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -3358,6 +3358,8 @@ int i40evf_process_config(struct i40evf_adapter *adapter)
 	if (vfres->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_VLAN)
 		netdev->features |= NETIF_F_HW_VLAN_CTAG_FILTER;
 
+	netdev->priv_flags |= IFF_UNICAST_FLT;
+
 	/* Do not turn on offloads when they are requested to be turned off.
 	 * TSO needs minimum 576 bytes to work correctly.
 	 */
-- 
2.17.1

^ permalink raw reply related

* [net-next 03/15] i40evf: update ethtool stats code and use helper functions
From: Jeff Kirsher @ 2018-08-29 22:48 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20180829224834.3141-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

Fix a bug in the way we handled VF queues, by always showing stats for
the maximum number of queues, even if they aren't allocated. It is not
safe to change the number of strings reported to ethtool, as grabbing
statistics occurs over multiple ethtool ops for which the rtnl_lock()
cannot be held the entire time.

Avoid this by always reporting queue stats for the maximum number of
queues in the netdevice. Share some of the helper functionality for
adding stats with the PF code in i40e_ethtool_stats.h

This should reduce the chance of potential future bugs, and make adding
new statistics easier.

Note for the queue stats, unlike the PF driver we do not keep an array
of queue pointers, but an array of queues, so care must be taken to
avoid accessing queue memory that hasn't yet been allocated.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 .../intel/i40evf/i40e_ethtool_stats.h         | 221 ++++++++++++++++++
 .../ethernet/intel/i40evf/i40evf_ethtool.c    | 137 ++++++-----
 2 files changed, 298 insertions(+), 60 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/i40evf/i40e_ethtool_stats.h

diff --git a/drivers/net/ethernet/intel/i40evf/i40e_ethtool_stats.h b/drivers/net/ethernet/intel/i40evf/i40e_ethtool_stats.h
new file mode 100644
index 000000000000..60b595dd8c39
--- /dev/null
+++ b/drivers/net/ethernet/intel/i40evf/i40e_ethtool_stats.h
@@ -0,0 +1,221 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2013 - 2018 Intel Corporation. */
+
+/* ethtool statistics helpers */
+
+/**
+ * struct i40e_stats - definition for an ethtool statistic
+ * @stat_string: statistic name to display in ethtool -S output
+ * @sizeof_stat: the sizeof() the stat, must be no greater than sizeof(u64)
+ * @stat_offset: offsetof() the stat from a base pointer
+ *
+ * This structure defines a statistic to be added to the ethtool stats buffer.
+ * It defines a statistic as offset from a common base pointer. Stats should
+ * be defined in constant arrays using the I40E_STAT macro, with every element
+ * of the array using the same _type for calculating the sizeof_stat and
+ * stat_offset.
+ *
+ * The @sizeof_stat is expected to be sizeof(u8), sizeof(u16), sizeof(u32) or
+ * sizeof(u64). Other sizes are not expected and will produce a WARN_ONCE from
+ * the i40e_add_ethtool_stat() helper function.
+ *
+ * The @stat_string is interpreted as a format string, allowing formatted
+ * values to be inserted while looping over multiple structures for a given
+ * statistics array. Thus, every statistic string in an array should have the
+ * same type and number of format specifiers, to be formatted by variadic
+ * arguments to the i40e_add_stat_string() helper function.
+ **/
+struct i40e_stats {
+	char stat_string[ETH_GSTRING_LEN];
+	int sizeof_stat;
+	int stat_offset;
+};
+
+/* Helper macro to define an i40e_stat structure with proper size and type.
+ * Use this when defining constant statistics arrays. Note that @_type expects
+ * only a type name and is used multiple times.
+ */
+#define I40E_STAT(_type, _name, _stat) { \
+	.stat_string = _name, \
+	.sizeof_stat = FIELD_SIZEOF(_type, _stat), \
+	.stat_offset = offsetof(_type, _stat) \
+}
+
+/* Helper macro for defining some statistics directly copied from the netdev
+ * stats structure.
+ */
+#define I40E_NETDEV_STAT(_net_stat) \
+	I40E_STAT(struct rtnl_link_stats64, #_net_stat, _net_stat)
+
+/* Helper macro for defining some statistics related to queues */
+#define I40E_QUEUE_STAT(_name, _stat) \
+	I40E_STAT(struct i40e_ring, _name, _stat)
+
+/* Stats associated with a Tx or Rx ring */
+static const struct i40e_stats i40e_gstrings_queue_stats[] = {
+	I40E_QUEUE_STAT("%s-%u.packets", stats.packets),
+	I40E_QUEUE_STAT("%s-%u.bytes", stats.bytes),
+};
+
+/**
+ * i40evf_add_one_ethtool_stat - copy the stat into the supplied buffer
+ * @data: location to store the stat value
+ * @pointer: basis for where to copy from
+ * @stat: the stat definition
+ *
+ * Copies the stat data defined by the pointer and stat structure pair into
+ * the memory supplied as data. Used to implement i40e_add_ethtool_stats and
+ * i40evf_add_queue_stats. If the pointer is null, data will be zero'd.
+ */
+static inline void
+i40evf_add_one_ethtool_stat(u64 *data, void *pointer,
+			    const struct i40e_stats *stat)
+{
+	char *p;
+
+	if (!pointer) {
+		/* ensure that the ethtool data buffer is zero'd for any stats
+		 * which don't have a valid pointer.
+		 */
+		*data = 0;
+		return;
+	}
+
+	p = (char *)pointer + stat->stat_offset;
+	switch (stat->sizeof_stat) {
+	case sizeof(u64):
+		*data = *((u64 *)p);
+		break;
+	case sizeof(u32):
+		*data = *((u32 *)p);
+		break;
+	case sizeof(u16):
+		*data = *((u16 *)p);
+		break;
+	case sizeof(u8):
+		*data = *((u8 *)p);
+		break;
+	default:
+		WARN_ONCE(1, "unexpected stat size for %s",
+			  stat->stat_string);
+		*data = 0;
+	}
+}
+
+/**
+ * __i40evf_add_ethtool_stats - copy stats into the ethtool supplied buffer
+ * @data: ethtool stats buffer
+ * @pointer: location to copy stats from
+ * @stats: array of stats to copy
+ * @size: the size of the stats definition
+ *
+ * Copy the stats defined by the stats array using the pointer as a base into
+ * the data buffer supplied by ethtool. Updates the data pointer to point to
+ * the next empty location for successive calls to __i40evf_add_ethtool_stats.
+ * If pointer is null, set the data values to zero and update the pointer to
+ * skip these stats.
+ **/
+static inline void
+__i40evf_add_ethtool_stats(u64 **data, void *pointer,
+			   const struct i40e_stats stats[],
+			   const unsigned int size)
+{
+	unsigned int i;
+
+	for (i = 0; i < size; i++)
+		i40evf_add_one_ethtool_stat((*data)++, pointer, &stats[i]);
+}
+
+/**
+ * i40e_add_ethtool_stats - copy stats into ethtool supplied buffer
+ * @data: ethtool stats buffer
+ * @pointer: location where stats are stored
+ * @stats: static const array of stat definitions
+ *
+ * Macro to ease the use of __i40evf_add_ethtool_stats by taking a static
+ * constant stats array and passing the ARRAY_SIZE(). This avoids typos by
+ * ensuring that we pass the size associated with the given stats array.
+ *
+ * The parameter @stats is evaluated twice, so parameters with side effects
+ * should be avoided.
+ **/
+#define i40e_add_ethtool_stats(data, pointer, stats) \
+	__i40evf_add_ethtool_stats(data, pointer, stats, ARRAY_SIZE(stats))
+
+/**
+ * i40evf_add_queue_stats - copy queue statistics into supplied buffer
+ * @data: ethtool stats buffer
+ * @ring: the ring to copy
+ *
+ * Queue statistics must be copied while protected by
+ * u64_stats_fetch_begin_irq, so we can't directly use i40e_add_ethtool_stats.
+ * Assumes that queue stats are defined in i40e_gstrings_queue_stats. If the
+ * ring pointer is null, zero out the queue stat values and update the data
+ * pointer. Otherwise safely copy the stats from the ring into the supplied
+ * buffer and update the data pointer when finished.
+ *
+ * This function expects to be called while under rcu_read_lock().
+ **/
+static inline void
+i40evf_add_queue_stats(u64 **data, struct i40e_ring *ring)
+{
+	const unsigned int size = ARRAY_SIZE(i40e_gstrings_queue_stats);
+	const struct i40e_stats *stats = i40e_gstrings_queue_stats;
+	unsigned int start;
+	unsigned int i;
+
+	/* To avoid invalid statistics values, ensure that we keep retrying
+	 * the copy until we get a consistent value according to
+	 * u64_stats_fetch_retry_irq. But first, make sure our ring is
+	 * non-null before attempting to access its syncp.
+	 */
+	do {
+		start = !ring ? 0 : u64_stats_fetch_begin_irq(&ring->syncp);
+		for (i = 0; i < size; i++) {
+			i40evf_add_one_ethtool_stat(&(*data)[i], ring,
+						    &stats[i]);
+		}
+	} while (ring && u64_stats_fetch_retry_irq(&ring->syncp, start));
+
+	/* Once we successfully copy the stats in, update the data pointer */
+	*data += size;
+}
+
+/**
+ * __i40e_add_stat_strings - copy stat strings into ethtool buffer
+ * @p: ethtool supplied buffer
+ * @stats: stat definitions array
+ * @size: size of the stats array
+ *
+ * Format and copy the strings described by stats into the buffer pointed at
+ * by p.
+ **/
+static void __i40e_add_stat_strings(u8 **p, const struct i40e_stats stats[],
+				    const unsigned int size, ...)
+{
+	unsigned int i;
+
+	for (i = 0; i < size; i++) {
+		va_list args;
+
+		va_start(args, size);
+		vsnprintf(*p, ETH_GSTRING_LEN, stats[i].stat_string, args);
+		*p += ETH_GSTRING_LEN;
+		va_end(args);
+	}
+}
+
+/**
+ * 40e_add_stat_strings - copy stat strings into ethtool buffer
+ * @p: ethtool supplied buffer
+ * @stats: stat definitions array
+ *
+ * Format and copy the strings described by the const static stats value into
+ * the buffer pointed at by p.
+ *
+ * The parameter @stats is evaluated twice, so parameters with side effects
+ * should be avoided. Additionally, stats must be an array such that
+ * ARRAY_SIZE can be called on it.
+ **/
+#define i40e_add_stat_strings(p, stats, ...) \
+	__i40e_add_stat_strings(p, stats, ARRAY_SIZE(stats), ## __VA_ARGS__)
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c b/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
index 69efe0aec76a..9319971c5c92 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
@@ -6,18 +6,12 @@
 
 #include <linux/uaccess.h>
 
-struct i40evf_stats {
-	char stat_string[ETH_GSTRING_LEN];
-	int stat_offset;
-};
+#include "i40e_ethtool_stats.h"
 
-#define I40EVF_STAT(_name, _stat) { \
-	.stat_string = _name, \
-	.stat_offset = offsetof(struct i40evf_adapter, _stat) \
-}
+#define I40EVF_STAT(_name, _stat) \
+	I40E_STAT(struct i40evf_adapter, _name, _stat)
 
-/* All stats are u64, so we don't need to track the size of the field. */
-static const struct i40evf_stats i40evf_gstrings_stats[] = {
+static const struct i40e_stats i40evf_gstrings_stats[] = {
 	I40EVF_STAT("rx_bytes", current_stats.rx_bytes),
 	I40EVF_STAT("rx_unicast", current_stats.rx_unicast),
 	I40EVF_STAT("rx_multicast", current_stats.rx_multicast),
@@ -32,13 +26,9 @@ static const struct i40evf_stats i40evf_gstrings_stats[] = {
 	I40EVF_STAT("tx_errors", current_stats.tx_errors),
 };
 
-#define I40EVF_GLOBAL_STATS_LEN ARRAY_SIZE(i40evf_gstrings_stats)
-#define I40EVF_QUEUE_STATS_LEN(_dev) \
-	(((struct i40evf_adapter *)\
-		netdev_priv(_dev))->num_active_queues \
-		  * 2 * (sizeof(struct i40e_queue_stats) / sizeof(u64)))
-#define I40EVF_STATS_LEN(_dev) \
-	(I40EVF_GLOBAL_STATS_LEN + I40EVF_QUEUE_STATS_LEN(_dev))
+#define I40EVF_STATS_LEN	ARRAY_SIZE(i40evf_gstrings_stats)
+
+#define I40EVF_QUEUE_STATS_LEN	ARRAY_SIZE(i40e_gstrings_queue_stats)
 
 /* For now we have one and only one private flag and it is only defined
  * when we have support for the SKIP_CPU_SYNC DMA attribute.  Instead
@@ -117,13 +107,13 @@ static int i40evf_get_link_ksettings(struct net_device *netdev,
  * @netdev: network interface device structure
  * @sset: id of string set
  *
- * Reports size of string table. This driver only supports
- * strings for statistics.
+ * Reports size of various string tables.
  **/
 static int i40evf_get_sset_count(struct net_device *netdev, int sset)
 {
 	if (sset == ETH_SS_STATS)
-		return I40EVF_STATS_LEN(netdev);
+		return I40EVF_STATS_LEN +
+			(I40EVF_QUEUE_STATS_LEN * 2 * I40EVF_MAX_REQ_QUEUES);
 	else if (sset == ETH_SS_PRIV_FLAGS)
 		return I40EVF_PRIV_FLAGS_STR_LEN;
 	else
@@ -142,20 +132,66 @@ static void i40evf_get_ethtool_stats(struct net_device *netdev,
 				     struct ethtool_stats *stats, u64 *data)
 {
 	struct i40evf_adapter *adapter = netdev_priv(netdev);
-	unsigned int i, j;
-	char *p;
+	unsigned int i;
+
+	i40e_add_ethtool_stats(&data, adapter, i40evf_gstrings_stats);
+
+	rcu_read_lock();
+	for (i = 0; i < I40EVF_MAX_REQ_QUEUES; i++) {
+		struct i40e_ring *ring;
 
-	for (i = 0; i < I40EVF_GLOBAL_STATS_LEN; i++) {
-		p = (char *)adapter + i40evf_gstrings_stats[i].stat_offset;
-		data[i] =  *(u64 *)p;
+		/* Avoid accessing un-allocated queues */
+		ring = (i < adapter->num_active_queues ?
+			&adapter->tx_rings[i] : NULL);
+		i40evf_add_queue_stats(&data, ring);
+
+		/* Avoid accessing un-allocated queues */
+		ring = (i < adapter->num_active_queues ?
+			&adapter->rx_rings[i] : NULL);
+		i40evf_add_queue_stats(&data, ring);
 	}
-	for (j = 0; j < adapter->num_active_queues; j++) {
-		data[i++] = adapter->tx_rings[j].stats.packets;
-		data[i++] = adapter->tx_rings[j].stats.bytes;
+	rcu_read_unlock();
+}
+
+/**
+ * i40evf_get_priv_flag_strings - Get private flag strings
+ * @netdev: network interface device structure
+ * @data: buffer for string data
+ *
+ * Builds the private flags string table
+ **/
+static void i40evf_get_priv_flag_strings(struct net_device *netdev, u8 *data)
+{
+	unsigned int i;
+
+	for (i = 0; i < I40EVF_PRIV_FLAGS_STR_LEN; i++) {
+		snprintf(data, ETH_GSTRING_LEN, "%s",
+			 i40evf_gstrings_priv_flags[i].flag_string);
+		data += ETH_GSTRING_LEN;
 	}
-	for (j = 0; j < adapter->num_active_queues; j++) {
-		data[i++] = adapter->rx_rings[j].stats.packets;
-		data[i++] = adapter->rx_rings[j].stats.bytes;
+}
+
+/**
+ * i40evf_get_stat_strings - Get stat strings
+ * @netdev: network interface device structure
+ * @data: buffer for string data
+ *
+ * Builds the statistics string table
+ **/
+static void i40evf_get_stat_strings(struct net_device *netdev, u8 *data)
+{
+	unsigned int i;
+
+	i40e_add_stat_strings(&data, i40evf_gstrings_stats);
+
+	/* Queues are always allocated in pairs, so we just use num_tx_queues
+	 * for both Tx and Rx queues.
+	 */
+	for (i = 0; i < netdev->num_tx_queues; i++) {
+		i40e_add_stat_strings(&data, i40e_gstrings_queue_stats,
+				      "tx", i);
+		i40e_add_stat_strings(&data, i40e_gstrings_queue_stats,
+				      "rx", i);
 	}
 }
 
@@ -165,38 +201,19 @@ static void i40evf_get_ethtool_stats(struct net_device *netdev,
  * @sset: id of string set
  * @data: buffer for string data
  *
- * Builds stats string table.
+ * Builds string tables for various string sets
  **/
 static void i40evf_get_strings(struct net_device *netdev, u32 sset, u8 *data)
 {
-	struct i40evf_adapter *adapter = netdev_priv(netdev);
-	u8 *p = data;
-	int i;
-
-	if (sset == ETH_SS_STATS) {
-		for (i = 0; i < (int)I40EVF_GLOBAL_STATS_LEN; i++) {
-			memcpy(p, i40evf_gstrings_stats[i].stat_string,
-			       ETH_GSTRING_LEN);
-			p += ETH_GSTRING_LEN;
-		}
-		for (i = 0; i < adapter->num_active_queues; i++) {
-			snprintf(p, ETH_GSTRING_LEN, "tx-%u.packets", i);
-			p += ETH_GSTRING_LEN;
-			snprintf(p, ETH_GSTRING_LEN, "tx-%u.bytes", i);
-			p += ETH_GSTRING_LEN;
-		}
-		for (i = 0; i < adapter->num_active_queues; i++) {
-			snprintf(p, ETH_GSTRING_LEN, "rx-%u.packets", i);
-			p += ETH_GSTRING_LEN;
-			snprintf(p, ETH_GSTRING_LEN, "rx-%u.bytes", i);
-			p += ETH_GSTRING_LEN;
-		}
-	} else if (sset == ETH_SS_PRIV_FLAGS) {
-		for (i = 0; i < I40EVF_PRIV_FLAGS_STR_LEN; i++) {
-			snprintf(p, ETH_GSTRING_LEN, "%s",
-				 i40evf_gstrings_priv_flags[i].flag_string);
-			p += ETH_GSTRING_LEN;
-		}
+	switch (sset) {
+	case ETH_SS_STATS:
+		i40evf_get_stat_strings(netdev, data);
+		break;
+	case ETH_SS_PRIV_FLAGS:
+		i40evf_get_priv_flag_strings(netdev, data);
+		break;
+	default:
+		break;
 	}
 }
 
-- 
2.17.1

^ permalink raw reply related

* [net-next 09/15] i40e: static analysis report from community
From: Jeff Kirsher @ 2018-08-29 22:48 UTC (permalink / raw)
  To: davem; +Cc: Martyna Szapar, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20180829224834.3141-1-jeffrey.t.kirsher@intel.com>

From: Martyna Szapar <martyna.szapar@intel.com>

Static analysis tools report a problem from original driver submission.
Removing unnecessary check in condition.

Signed-off-by: Martyna Szapar <martyna.szapar@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index ac685ad4d877..62f2b5bce6bb 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -13033,7 +13033,7 @@ struct i40e_veb *i40e_veb_setup(struct i40e_pf *pf, u16 flags,
 	for (vsi_idx = 0; vsi_idx < pf->num_alloc_vsi; vsi_idx++)
 		if (pf->vsi[vsi_idx] && pf->vsi[vsi_idx]->seid == vsi_seid)
 			break;
-	if (vsi_idx >= pf->num_alloc_vsi && vsi_seid != 0) {
+	if (vsi_idx == pf->num_alloc_vsi && vsi_seid != 0) {
 		dev_info(&pf->pdev->dev, "vsi seid %d not found\n",
 			 vsi_seid);
 		return NULL;
-- 
2.17.1

^ permalink raw reply related

* [net-next 01/15] i40e: convert queue stats to i40e_stats array
From: Jeff Kirsher @ 2018-08-29 22:48 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20180829224834.3141-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

Use an i40e_stats array to handle the queue stats, instead of coding
similar functionality separately. Because of how the queue stats are
accessed on some kernels, we can't easily use i40e_add_ethtool_stats.

Instead, implement a separate helper, i40e_add_queue_stats, which we'll
use instead. This helper will correctly implement the
u64_stats_fetch_begin_irq logic and allow retries until successful. We
share the most complex code by re-using i40e_add_one_ethtool_stat.

This logic additionally easily supports skipping disabled rings by using
a ternary operator before calling the u64_stats_fetch_begin_irq()
function, so that we correctly zero-out the stats values without having
to perform two separate sections of code.

This significantly reduces the boiler plate code in
i40e_get_ethtool_stats, and helps keep the complex logic contained to as
few functions as possible.

With this patch, we've finally converted all the statistics to use the
helpers and the i40e_stats function.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 .../net/ethernet/intel/i40e/i40e_ethtool.c    | 148 +++++++++++-------
 1 file changed, 89 insertions(+), 59 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index 5ff6caa83948..235012b3bd42 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -33,6 +33,8 @@ struct i40e_stats {
 	I40E_STAT(struct i40e_veb, _name, _stat)
 #define I40E_PFC_STAT(_name, _stat) \
 	I40E_STAT(struct i40e_pfc_stats, _name, _stat)
+#define I40E_QUEUE_STAT(_name, _stat) \
+	I40E_STAT(struct i40e_ring, _name, _stat)
 
 static const struct i40e_stats i40e_gstrings_net_stats[] = {
 	I40E_NETDEV_STAT(rx_packets),
@@ -171,20 +173,16 @@ static const struct i40e_stats i40e_gstrings_pfc_stats[] = {
 	I40E_PFC_STAT("port.rx_priority_%u_xon_2_xoff", priority_xon_2_xoff),
 };
 
-/* We use num_tx_queues here as a proxy for the maximum number of queues
- * available because we always allocate queues symmetrically.
- */
-#define I40E_MAX_NUM_QUEUES(n) ((n)->num_tx_queues)
-#define I40E_QUEUE_STATS_LEN(n)                                              \
-	   (I40E_MAX_NUM_QUEUES(n)                                           \
-	    * 2 /* Tx and Rx together */                                     \
-	    * (sizeof(struct i40e_queue_stats) / sizeof(u64)))
-#define I40E_GLOBAL_STATS_LEN	ARRAY_SIZE(i40e_gstrings_stats)
+static const struct i40e_stats i40e_gstrings_queue_stats[] = {
+	I40E_QUEUE_STAT("%s-%u.packets", stats.packets),
+	I40E_QUEUE_STAT("%s-%u.bytes", stats.bytes),
+};
+
 #define I40E_NETDEV_STATS_LEN	ARRAY_SIZE(i40e_gstrings_net_stats)
+
 #define I40E_MISC_STATS_LEN	ARRAY_SIZE(i40e_gstrings_misc_stats)
-#define I40E_VSI_STATS_LEN(n)	(I40E_NETDEV_STATS_LEN + \
-				 I40E_MISC_STATS_LEN + \
-				 I40E_QUEUE_STATS_LEN((n)))
+
+#define I40E_VSI_STATS_LEN	(I40E_NETDEV_STATS_LEN + I40E_MISC_STATS_LEN)
 
 #define I40E_PFC_STATS_LEN	(ARRAY_SIZE(i40e_gstrings_pfc_stats) * \
 				 I40E_MAX_USER_PRIORITY)
@@ -193,10 +191,15 @@ static const struct i40e_stats i40e_gstrings_pfc_stats[] = {
 				 (ARRAY_SIZE(i40e_gstrings_veb_tc_stats) * \
 				  I40E_MAX_TRAFFIC_CLASS))
 
-#define I40E_PF_STATS_LEN(n)	(I40E_GLOBAL_STATS_LEN + \
+#define I40E_GLOBAL_STATS_LEN	ARRAY_SIZE(i40e_gstrings_stats)
+
+#define I40E_PF_STATS_LEN	(I40E_GLOBAL_STATS_LEN + \
 				 I40E_PFC_STATS_LEN + \
 				 I40E_VEB_STATS_LEN + \
-				 I40E_VSI_STATS_LEN((n)))
+				 I40E_VSI_STATS_LEN)
+
+/* Length of stats for a single queue */
+#define I40E_QUEUE_STATS_LEN	ARRAY_SIZE(i40e_gstrings_queue_stats)
 
 enum i40e_ethtool_test_id {
 	I40E_ETH_TEST_REG = 0,
@@ -1701,11 +1704,30 @@ static int i40e_get_stats_count(struct net_device *netdev)
 	struct i40e_netdev_priv *np = netdev_priv(netdev);
 	struct i40e_vsi *vsi = np->vsi;
 	struct i40e_pf *pf = vsi->back;
+	int stats_len;
 
 	if (vsi == pf->vsi[pf->lan_vsi] && pf->hw.partition_id == 1)
-		return I40E_PF_STATS_LEN(netdev);
+		stats_len = I40E_PF_STATS_LEN;
 	else
-		return I40E_VSI_STATS_LEN(netdev);
+		stats_len = I40E_VSI_STATS_LEN;
+
+	/* The number of stats reported for a given net_device must remain
+	 * constant throughout the life of that device.
+	 *
+	 * This is because the API for obtaining the size, strings, and stats
+	 * is spread out over three separate ethtool ioctls. There is no safe
+	 * way to lock the number of stats across these calls, so we must
+	 * assume that they will never change.
+	 *
+	 * Due to this, we report the maximum number of queues, even if not
+	 * every queue is currently configured. Since we always allocate
+	 * queues in pairs, we'll just use netdev->num_tx_queues * 2. This
+	 * works because the num_tx_queues is set at device creation and never
+	 * changes.
+	 */
+	stats_len += I40E_QUEUE_STATS_LEN * 2 * netdev->num_tx_queues;
+
+	return stats_len;
 }
 
 static int i40e_get_sset_count(struct net_device *netdev, int sset)
@@ -1734,8 +1756,8 @@ static int i40e_get_sset_count(struct net_device *netdev, int sset)
  * @stat: the stat definition
  *
  * Copies the stat data defined by the pointer and stat structure pair into
- * the memory supplied as data. Used to implement i40e_add_ethtool_stats.
- * If the pointer is null, data will be zero'd.
+ * the memory supplied as data. Used to implement i40e_add_ethtool_stats and
+ * i40e_add_queue_stats. If the pointer is null, data will be zero'd.
  */
 static inline void
 i40e_add_one_ethtool_stat(u64 *data, void *pointer,
@@ -1810,6 +1832,45 @@ __i40e_add_ethtool_stats(u64 **data, void *pointer,
 #define i40e_add_ethtool_stats(data, pointer, stats) \
 	__i40e_add_ethtool_stats(data, pointer, stats, ARRAY_SIZE(stats))
 
+/**
+ * i40e_add_queue_stats - copy queue statistics into supplied buffer
+ * @data: ethtool stats buffer
+ * @ring: the ring to copy
+ *
+ * Queue statistics must be copied while protected by
+ * u64_stats_fetch_begin_irq, so we can't directly use i40e_add_ethtool_stats.
+ * Assumes that queue stats are defined in i40e_gstrings_queue_stats. If the
+ * ring pointer is null, zero out the queue stat values and update the data
+ * pointer. Otherwise safely copy the stats from the ring into the supplied
+ * buffer and update the data pointer when finished.
+ *
+ * This function expects to be called while under rcu_read_lock().
+ **/
+static inline void
+i40e_add_queue_stats(u64 **data, struct i40e_ring *ring)
+{
+	const unsigned int size = ARRAY_SIZE(i40e_gstrings_queue_stats);
+	const struct i40e_stats *stats = i40e_gstrings_queue_stats;
+	unsigned int start;
+	unsigned int i;
+
+	/* To avoid invalid statistics values, ensure that we keep retrying
+	 * the copy until we get a consistent value according to
+	 * u64_stats_fetch_retry_irq. But first, make sure our ring is
+	 * non-null before attempting to access its syncp.
+	 */
+	do {
+		start = !ring ? 0 : u64_stats_fetch_begin_irq(&ring->syncp);
+		for (i = 0; i < size; i++) {
+			i40e_add_one_ethtool_stat(&(*data)[i], ring,
+						  &stats[i]);
+		}
+	} while (ring && u64_stats_fetch_retry_irq(&ring->syncp, start));
+
+	/* Once we successfully copy the stats in, update the data pointer */
+	data += size;
+}
+
 /**
  * i40e_get_pfc_stats - copy HW PFC statistics to formatted structure
  * @pf: the PF device structure
@@ -1853,12 +1914,10 @@ static void i40e_get_ethtool_stats(struct net_device *netdev,
 				   struct ethtool_stats *stats, u64 *data)
 {
 	struct i40e_netdev_priv *np = netdev_priv(netdev);
-	struct i40e_ring *tx_ring, *rx_ring;
 	struct i40e_vsi *vsi = np->vsi;
 	struct i40e_pf *pf = vsi->back;
 	struct i40e_veb *veb = pf->veb[pf->lan_veb];
 	unsigned int i;
-	unsigned int start;
 	bool veb_stats;
 	u64 *p = data;
 
@@ -1870,38 +1929,12 @@ static void i40e_get_ethtool_stats(struct net_device *netdev,
 	i40e_add_ethtool_stats(&data, vsi, i40e_gstrings_misc_stats);
 
 	rcu_read_lock();
-	for (i = 0; i < I40E_MAX_NUM_QUEUES(netdev) ; i++) {
-		tx_ring = READ_ONCE(vsi->tx_rings[i]);
-
-		if (!tx_ring) {
-			/* Bump the stat counter to skip these stats, and make
-			 * sure the memory is zero'd
-			 */
-			*(data++) = 0;
-			*(data++) = 0;
-			*(data++) = 0;
-			*(data++) = 0;
-			continue;
-		}
-
-		/* process Tx ring statistics */
-		do {
-			start = u64_stats_fetch_begin_irq(&tx_ring->syncp);
-			data[0] = tx_ring->stats.packets;
-			data[1] = tx_ring->stats.bytes;
-		} while (u64_stats_fetch_retry_irq(&tx_ring->syncp, start));
-		data += 2;
-
-		/* Rx ring is the 2nd half of the queue pair */
-		rx_ring = &tx_ring[1];
-		do {
-			start = u64_stats_fetch_begin_irq(&rx_ring->syncp);
-			data[0] = rx_ring->stats.packets;
-			data[1] = rx_ring->stats.bytes;
-		} while (u64_stats_fetch_retry_irq(&rx_ring->syncp, start));
-		data += 2;
+	for (i = 0; i < netdev->num_tx_queues; i++) {
+		i40e_add_queue_stats(&data, READ_ONCE(vsi->tx_rings[i]));
+		i40e_add_queue_stats(&data, READ_ONCE(vsi->rx_rings[i]));
 	}
 	rcu_read_unlock();
+
 	if (vsi != pf->vsi[pf->lan_vsi] || pf->hw.partition_id != 1)
 		goto check_data_pointer;
 
@@ -1990,16 +2023,13 @@ static void i40e_get_stat_strings(struct net_device *netdev, u8 *data)
 
 	i40e_add_stat_strings(&data, i40e_gstrings_misc_stats);
 
-	for (i = 0; i < I40E_MAX_NUM_QUEUES(netdev); i++) {
-		snprintf(data, ETH_GSTRING_LEN, "tx-%u.tx_packets", i);
-		data += ETH_GSTRING_LEN;
-		snprintf(data, ETH_GSTRING_LEN, "tx-%u.tx_bytes", i);
-		data += ETH_GSTRING_LEN;
-		snprintf(data, ETH_GSTRING_LEN, "rx-%u.rx_packets", i);
-		data += ETH_GSTRING_LEN;
-		snprintf(data, ETH_GSTRING_LEN, "rx-%u.rx_bytes", i);
-		data += ETH_GSTRING_LEN;
+	for (i = 0; i < netdev->num_tx_queues; i++) {
+		i40e_add_stat_strings(&data, i40e_gstrings_queue_stats,
+				      "tx", i);
+		i40e_add_stat_strings(&data, i40e_gstrings_queue_stats,
+				      "rx", i);
 	}
+
 	if (vsi != pf->vsi[pf->lan_vsi] || pf->hw.partition_id != 1)
 		return;
 
-- 
2.17.1

^ permalink raw reply related

* [net-next 02/15] i40e: move ethtool stats boiler plate code to i40e_ethtool_stats.h
From: Jeff Kirsher @ 2018-08-29 22:48 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20180829224834.3141-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

Move the boiler plate structures and helper functions we recently
added into their own header file, so that the complete collection is
located together.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 .../net/ethernet/intel/i40e/i40e_ethtool.c    | 182 +--------------
 .../ethernet/intel/i40e/i40e_ethtool_stats.h  | 221 ++++++++++++++++++
 2 files changed, 222 insertions(+), 181 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h

diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index 235012b3bd42..d7d3974beca2 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -6,25 +6,8 @@
 #include "i40e.h"
 #include "i40e_diag.h"
 
-struct i40e_stats {
-	/* The stat_string is expected to be a format string formatted using
-	 * vsnprintf by i40e_add_stat_strings. Every member of a stats array
-	 * should use the same format specifiers as they will be formatted
-	 * using the same variadic arguments.
-	 */
-	char stat_string[ETH_GSTRING_LEN];
-	int sizeof_stat;
-	int stat_offset;
-};
-
-#define I40E_STAT(_type, _name, _stat) { \
-	.stat_string = _name, \
-	.sizeof_stat = FIELD_SIZEOF(_type, _stat), \
-	.stat_offset = offsetof(_type, _stat) \
-}
+#include "i40e_ethtool_stats.h"
 
-#define I40E_NETDEV_STAT(_net_stat) \
-	I40E_STAT(struct rtnl_link_stats64, #_net_stat, _net_stat)
 #define I40E_PF_STAT(_name, _stat) \
 	I40E_STAT(struct i40e_pf, _name, _stat)
 #define I40E_VSI_STAT(_name, _stat) \
@@ -173,11 +156,6 @@ static const struct i40e_stats i40e_gstrings_pfc_stats[] = {
 	I40E_PFC_STAT("port.rx_priority_%u_xon_2_xoff", priority_xon_2_xoff),
 };
 
-static const struct i40e_stats i40e_gstrings_queue_stats[] = {
-	I40E_QUEUE_STAT("%s-%u.packets", stats.packets),
-	I40E_QUEUE_STAT("%s-%u.bytes", stats.bytes),
-};
-
 #define I40E_NETDEV_STATS_LEN	ARRAY_SIZE(i40e_gstrings_net_stats)
 
 #define I40E_MISC_STATS_LEN	ARRAY_SIZE(i40e_gstrings_misc_stats)
@@ -1749,128 +1727,6 @@ static int i40e_get_sset_count(struct net_device *netdev, int sset)
 	}
 }
 
-/**
- * i40e_add_one_ethtool_stat - copy the stat into the supplied buffer
- * @data: location to store the stat value
- * @pointer: basis for where to copy from
- * @stat: the stat definition
- *
- * Copies the stat data defined by the pointer and stat structure pair into
- * the memory supplied as data. Used to implement i40e_add_ethtool_stats and
- * i40e_add_queue_stats. If the pointer is null, data will be zero'd.
- */
-static inline void
-i40e_add_one_ethtool_stat(u64 *data, void *pointer,
-			  const struct i40e_stats *stat)
-{
-	char *p;
-
-	if (!pointer) {
-		/* ensure that the ethtool data buffer is zero'd for any stats
-		 * which don't have a valid pointer.
-		 */
-		*data = 0;
-		return;
-	}
-
-	p = (char *)pointer + stat->stat_offset;
-	switch (stat->sizeof_stat) {
-	case sizeof(u64):
-		*data = *((u64 *)p);
-		break;
-	case sizeof(u32):
-		*data = *((u32 *)p);
-		break;
-	case sizeof(u16):
-		*data = *((u16 *)p);
-		break;
-	case sizeof(u8):
-		*data = *((u8 *)p);
-		break;
-	default:
-		WARN_ONCE(1, "unexpected stat size for %s",
-			  stat->stat_string);
-		*data = 0;
-	}
-}
-
-/**
- * __i40e_add_ethtool_stats - copy stats into the ethtool supplied buffer
- * @data: ethtool stats buffer
- * @pointer: location to copy stats from
- * @stats: array of stats to copy
- * @size: the size of the stats definition
- *
- * Copy the stats defined by the stats array using the pointer as a base into
- * the data buffer supplied by ethtool. Updates the data pointer to point to
- * the next empty location for successive calls to __i40e_add_ethtool_stats.
- * If pointer is null, set the data values to zero and update the pointer to
- * skip these stats.
- **/
-static inline void
-__i40e_add_ethtool_stats(u64 **data, void *pointer,
-			 const struct i40e_stats stats[],
-			 const unsigned int size)
-{
-	unsigned int i;
-
-	for (i = 0; i < size; i++)
-		i40e_add_one_ethtool_stat((*data)++, pointer, &stats[i]);
-}
-
-/**
- * i40e_add_ethtool_stats - copy stats into ethtool supplied buffer
- * @data: ethtool stats buffer
- * @pointer: location where stats are stored
- * @stats: static const array of stat definitions
- *
- * Macro to ease the use of __i40e_add_ethtool_stats by taking a static
- * constant stats array and passing the ARRAY_SIZE(). This avoids typos by
- * ensuring that we pass the size associated with the given stats array.
- * Assumes that stats is an array.
- **/
-#define i40e_add_ethtool_stats(data, pointer, stats) \
-	__i40e_add_ethtool_stats(data, pointer, stats, ARRAY_SIZE(stats))
-
-/**
- * i40e_add_queue_stats - copy queue statistics into supplied buffer
- * @data: ethtool stats buffer
- * @ring: the ring to copy
- *
- * Queue statistics must be copied while protected by
- * u64_stats_fetch_begin_irq, so we can't directly use i40e_add_ethtool_stats.
- * Assumes that queue stats are defined in i40e_gstrings_queue_stats. If the
- * ring pointer is null, zero out the queue stat values and update the data
- * pointer. Otherwise safely copy the stats from the ring into the supplied
- * buffer and update the data pointer when finished.
- *
- * This function expects to be called while under rcu_read_lock().
- **/
-static inline void
-i40e_add_queue_stats(u64 **data, struct i40e_ring *ring)
-{
-	const unsigned int size = ARRAY_SIZE(i40e_gstrings_queue_stats);
-	const struct i40e_stats *stats = i40e_gstrings_queue_stats;
-	unsigned int start;
-	unsigned int i;
-
-	/* To avoid invalid statistics values, ensure that we keep retrying
-	 * the copy until we get a consistent value according to
-	 * u64_stats_fetch_retry_irq. But first, make sure our ring is
-	 * non-null before attempting to access its syncp.
-	 */
-	do {
-		start = !ring ? 0 : u64_stats_fetch_begin_irq(&ring->syncp);
-		for (i = 0; i < size; i++) {
-			i40e_add_one_ethtool_stat(&(*data)[i], ring,
-						  &stats[i]);
-		}
-	} while (ring && u64_stats_fetch_retry_irq(&ring->syncp, start));
-
-	/* Once we successfully copy the stats in, update the data pointer */
-	data += size;
-}
-
 /**
  * i40e_get_pfc_stats - copy HW PFC statistics to formatted structure
  * @pf: the PF device structure
@@ -1965,42 +1821,6 @@ static void i40e_get_ethtool_stats(struct net_device *netdev,
 		  "ethtool stats count mismatch!");
 }
 
-/**
- * __i40e_add_stat_strings - copy stat strings into ethtool buffer
- * @p: ethtool supplied buffer
- * @stats: stat definitions array
- * @size: size of the stats array
- *
- * Format and copy the strings described by stats into the buffer pointed at
- * by p.
- **/
-static void __i40e_add_stat_strings(u8 **p, const struct i40e_stats stats[],
-				    const unsigned int size, ...)
-{
-	unsigned int i;
-
-	for (i = 0; i < size; i++) {
-		va_list args;
-
-		va_start(args, size);
-		vsnprintf(*p, ETH_GSTRING_LEN, stats[i].stat_string, args);
-		*p += ETH_GSTRING_LEN;
-		va_end(args);
-	}
-}
-
-/**
- * 40e_add_stat_strings - copy stat strings into ethtool buffer
- * @p: ethtool supplied buffer
- * @stats: stat definitions array
- *
- * Format and copy the strings described by the const static stats value into
- * the buffer pointed at by p. Assumes that stats can have ARRAY_SIZE called
- * for it.
- **/
-#define i40e_add_stat_strings(p, stats, ...) \
-	__i40e_add_stat_strings(p, stats, ARRAY_SIZE(stats), ## __VA_ARGS__)
-
 /**
  * i40e_get_stat_strings - copy stat strings into supplied buffer
  * @netdev: the netdev to collect strings for
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h b/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h
new file mode 100644
index 000000000000..0290ade7494b
--- /dev/null
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h
@@ -0,0 +1,221 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2013 - 2018 Intel Corporation. */
+
+/* ethtool statistics helpers */
+
+/**
+ * struct i40e_stats - definition for an ethtool statistic
+ * @stat_string: statistic name to display in ethtool -S output
+ * @sizeof_stat: the sizeof() the stat, must be no greater than sizeof(u64)
+ * @stat_offset: offsetof() the stat from a base pointer
+ *
+ * This structure defines a statistic to be added to the ethtool stats buffer.
+ * It defines a statistic as offset from a common base pointer. Stats should
+ * be defined in constant arrays using the I40E_STAT macro, with every element
+ * of the array using the same _type for calculating the sizeof_stat and
+ * stat_offset.
+ *
+ * The @sizeof_stat is expected to be sizeof(u8), sizeof(u16), sizeof(u32) or
+ * sizeof(u64). Other sizes are not expected and will produce a WARN_ONCE from
+ * the i40e_add_ethtool_stat() helper function.
+ *
+ * The @stat_string is interpreted as a format string, allowing formatted
+ * values to be inserted while looping over multiple structures for a given
+ * statistics array. Thus, every statistic string in an array should have the
+ * same type and number of format specifiers, to be formatted by variadic
+ * arguments to the i40e_add_stat_string() helper function.
+ **/
+struct i40e_stats {
+	char stat_string[ETH_GSTRING_LEN];
+	int sizeof_stat;
+	int stat_offset;
+};
+
+/* Helper macro to define an i40e_stat structure with proper size and type.
+ * Use this when defining constant statistics arrays. Note that @_type expects
+ * only a type name and is used multiple times.
+ */
+#define I40E_STAT(_type, _name, _stat) { \
+	.stat_string = _name, \
+	.sizeof_stat = FIELD_SIZEOF(_type, _stat), \
+	.stat_offset = offsetof(_type, _stat) \
+}
+
+/* Helper macro for defining some statistics directly copied from the netdev
+ * stats structure.
+ */
+#define I40E_NETDEV_STAT(_net_stat) \
+	I40E_STAT(struct rtnl_link_stats64, #_net_stat, _net_stat)
+
+/* Helper macro for defining some statistics related to queues */
+#define I40E_QUEUE_STAT(_name, _stat) \
+	I40E_STAT(struct i40e_ring, _name, _stat)
+
+/* Stats associated with a Tx or Rx ring */
+static const struct i40e_stats i40e_gstrings_queue_stats[] = {
+	I40E_QUEUE_STAT("%s-%u.packets", stats.packets),
+	I40E_QUEUE_STAT("%s-%u.bytes", stats.bytes),
+};
+
+/**
+ * i40e_add_one_ethtool_stat - copy the stat into the supplied buffer
+ * @data: location to store the stat value
+ * @pointer: basis for where to copy from
+ * @stat: the stat definition
+ *
+ * Copies the stat data defined by the pointer and stat structure pair into
+ * the memory supplied as data. Used to implement i40e_add_ethtool_stats and
+ * i40e_add_queue_stats. If the pointer is null, data will be zero'd.
+ */
+static inline void
+i40e_add_one_ethtool_stat(u64 *data, void *pointer,
+			  const struct i40e_stats *stat)
+{
+	char *p;
+
+	if (!pointer) {
+		/* ensure that the ethtool data buffer is zero'd for any stats
+		 * which don't have a valid pointer.
+		 */
+		*data = 0;
+		return;
+	}
+
+	p = (char *)pointer + stat->stat_offset;
+	switch (stat->sizeof_stat) {
+	case sizeof(u64):
+		*data = *((u64 *)p);
+		break;
+	case sizeof(u32):
+		*data = *((u32 *)p);
+		break;
+	case sizeof(u16):
+		*data = *((u16 *)p);
+		break;
+	case sizeof(u8):
+		*data = *((u8 *)p);
+		break;
+	default:
+		WARN_ONCE(1, "unexpected stat size for %s",
+			  stat->stat_string);
+		*data = 0;
+	}
+}
+
+/**
+ * __i40e_add_ethtool_stats - copy stats into the ethtool supplied buffer
+ * @data: ethtool stats buffer
+ * @pointer: location to copy stats from
+ * @stats: array of stats to copy
+ * @size: the size of the stats definition
+ *
+ * Copy the stats defined by the stats array using the pointer as a base into
+ * the data buffer supplied by ethtool. Updates the data pointer to point to
+ * the next empty location for successive calls to __i40e_add_ethtool_stats.
+ * If pointer is null, set the data values to zero and update the pointer to
+ * skip these stats.
+ **/
+static inline void
+__i40e_add_ethtool_stats(u64 **data, void *pointer,
+			 const struct i40e_stats stats[],
+			 const unsigned int size)
+{
+	unsigned int i;
+
+	for (i = 0; i < size; i++)
+		i40e_add_one_ethtool_stat((*data)++, pointer, &stats[i]);
+}
+
+/**
+ * i40e_add_ethtool_stats - copy stats into ethtool supplied buffer
+ * @data: ethtool stats buffer
+ * @pointer: location where stats are stored
+ * @stats: static const array of stat definitions
+ *
+ * Macro to ease the use of __i40e_add_ethtool_stats by taking a static
+ * constant stats array and passing the ARRAY_SIZE(). This avoids typos by
+ * ensuring that we pass the size associated with the given stats array.
+ *
+ * The parameter @stats is evaluated twice, so parameters with side effects
+ * should be avoided.
+ **/
+#define i40e_add_ethtool_stats(data, pointer, stats) \
+	__i40e_add_ethtool_stats(data, pointer, stats, ARRAY_SIZE(stats))
+
+/**
+ * i40e_add_queue_stats - copy queue statistics into supplied buffer
+ * @data: ethtool stats buffer
+ * @ring: the ring to copy
+ *
+ * Queue statistics must be copied while protected by
+ * u64_stats_fetch_begin_irq, so we can't directly use i40e_add_ethtool_stats.
+ * Assumes that queue stats are defined in i40e_gstrings_queue_stats. If the
+ * ring pointer is null, zero out the queue stat values and update the data
+ * pointer. Otherwise safely copy the stats from the ring into the supplied
+ * buffer and update the data pointer when finished.
+ *
+ * This function expects to be called while under rcu_read_lock().
+ **/
+static inline void
+i40e_add_queue_stats(u64 **data, struct i40e_ring *ring)
+{
+	const unsigned int size = ARRAY_SIZE(i40e_gstrings_queue_stats);
+	const struct i40e_stats *stats = i40e_gstrings_queue_stats;
+	unsigned int start;
+	unsigned int i;
+
+	/* To avoid invalid statistics values, ensure that we keep retrying
+	 * the copy until we get a consistent value according to
+	 * u64_stats_fetch_retry_irq. But first, make sure our ring is
+	 * non-null before attempting to access its syncp.
+	 */
+	do {
+		start = !ring ? 0 : u64_stats_fetch_begin_irq(&ring->syncp);
+		for (i = 0; i < size; i++) {
+			i40e_add_one_ethtool_stat(&(*data)[i], ring,
+						  &stats[i]);
+		}
+	} while (ring && u64_stats_fetch_retry_irq(&ring->syncp, start));
+
+	/* Once we successfully copy the stats in, update the data pointer */
+	*data += size;
+}
+
+/**
+ * __i40e_add_stat_strings - copy stat strings into ethtool buffer
+ * @p: ethtool supplied buffer
+ * @stats: stat definitions array
+ * @size: size of the stats array
+ *
+ * Format and copy the strings described by stats into the buffer pointed at
+ * by p.
+ **/
+static void __i40e_add_stat_strings(u8 **p, const struct i40e_stats stats[],
+				    const unsigned int size, ...)
+{
+	unsigned int i;
+
+	for (i = 0; i < size; i++) {
+		va_list args;
+
+		va_start(args, size);
+		vsnprintf(*p, ETH_GSTRING_LEN, stats[i].stat_string, args);
+		*p += ETH_GSTRING_LEN;
+		va_end(args);
+	}
+}
+
+/**
+ * 40e_add_stat_strings - copy stat strings into ethtool buffer
+ * @p: ethtool supplied buffer
+ * @stats: stat definitions array
+ *
+ * Format and copy the strings described by the const static stats value into
+ * the buffer pointed at by p.
+ *
+ * The parameter @stats is evaluated twice, so parameters with side effects
+ * should be avoided. Additionally, stats must be an array such that
+ * ARRAY_SIZE can be called on it.
+ **/
+#define i40e_add_stat_strings(p, stats, ...) \
+	__i40e_add_stat_strings(p, stats, ARRAY_SIZE(stats), ## __VA_ARGS__)
-- 
2.17.1

^ permalink raw reply related

* [net-next 04/15] i40evf: Change a VF mac without reloading the VF driver
From: Jeff Kirsher @ 2018-08-29 22:48 UTC (permalink / raw)
  To: davem
  Cc: Paweł Jabłoński, netdev, nhorman, sassmann,
	jogreene, Jeff Kirsher
In-Reply-To: <20180829224834.3141-1-jeffrey.t.kirsher@intel.com>

From: Paweł Jabłoński <pawel.jablonski@intel.com>

Add possibility to change a VF mac address from host side
without reloading the VF driver on the guest side. Without
this patch it is not possible to change the VF mac because
executing i40evf_virtchnl_completion function with
VIRTCHNL_OP_GET_VF_RESOURCES opcode resets the VF mac
address to previous value.

Signed-off-by: Paweł Jabłoński <pawel.jablonski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c  |  8 +++++---
 drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c | 11 +++++++++--
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index c6d24eaede18..f56c645588f3 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2458,7 +2458,7 @@ static inline int i40e_check_vf_permission(struct i40e_vf *vf,
 		    !is_multicast_ether_addr(addr) && vf->pf_set_mac &&
 		    !ether_addr_equal(addr, vf->default_lan_addr.addr)) {
 			dev_err(&pf->pdev->dev,
-				"VF attempting to override administratively set MAC address, reload the VF driver to resume normal operation\n");
+				"VF attempting to override administratively set MAC address, bring down and up the VF interface to resume normal operation\n");
 			return -EPERM;
 		}
 	}
@@ -3873,9 +3873,11 @@ int i40e_ndo_set_vf_mac(struct net_device *netdev, int vf_id, u8 *mac)
 			 mac, vf_id);
 	}
 
-	/* Force the VF driver stop so it has to reload with new MAC address */
+	/* Force the VF interface down so it has to bring up with new MAC
+	 * address
+	 */
 	i40e_vc_disable_vf(vf);
-	dev_info(&pf->pdev->dev, "Reload the VF driver to make this change effective.\n");
+	dev_info(&pf->pdev->dev, "Bring down and up the VF interface to make this change effective.\n");
 
 error_param:
 	return ret;
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
index 565677de5ba3..79b7be83b5c3 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
@@ -1330,8 +1330,15 @@ void i40evf_virtchnl_completion(struct i40evf_adapter *adapter,
 			  sizeof(struct virtchnl_vsi_resource);
 		memcpy(adapter->vf_res, msg, min(msglen, len));
 		i40e_vf_parse_hw_config(&adapter->hw, adapter->vf_res);
-		/* restore current mac address */
-		ether_addr_copy(adapter->hw.mac.addr, netdev->dev_addr);
+		if (is_zero_ether_addr(adapter->hw.mac.addr)) {
+			/* restore current mac address */
+			ether_addr_copy(adapter->hw.mac.addr, netdev->dev_addr);
+		} else {
+			/* refresh current mac address if changed */
+			ether_addr_copy(netdev->dev_addr, adapter->hw.mac.addr);
+			ether_addr_copy(netdev->perm_addr,
+					adapter->hw.mac.addr);
+		}
 		i40evf_process_config(adapter);
 		}
 		break;
-- 
2.17.1

^ permalink raw reply related

* [net-next 00/15][pull request] 40GbE Intel Wired LAN Driver Updates 2018-08-29
From: Jeff Kirsher @ 2018-08-29 22:48 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann, jogreene

This series contains updates to i40e, i40evf and virtchnl.

Jake implements helper functions to use an array to handle the queue
stats which reduces the boiler plate code as well as keep the complexity
localized to a few functions.

Paweł adds the ability to change a VF's MAC address from the host side
without having to reload the VF driver on the guest side.

Paul adds a check to ensure that the number of queues that the PF sends
to the VF is equal to or less than the maximum number of queues the VF
can support.

Mitch fixes an issue caught by GCC 8, where we need to not include the
terminating null in the length of the string for strncpy().

Lihong fixes a VF issue to ensure that it does not enter into
promiscuous mode when macvlan is added to the VF.  Fixed a potential
crash after a VF is removed, since the workqueue sync for the adminq
task was not being cancelled.

Harshitha fixes the type for field_flags in the virtchnl_filter struct.

Martyna removes an unnecessary check in a conditional if statement.

Björn fixes an issue reported by Jesper Dangaard Brouer, where the
driver was reporting incorrect statistics when XDP was enabled.

Jan fixes the potential reporting of incorrect speed settings.

Patryk fixed an issue where the flag
I40EVF_FLAG_AQ_ENABLE_VLAN_STRIPPING was getting set when any offload is
set via ethtool.  Resolved by only setting this flag when VLAN offload
is enabled.  Also ensure we hold the rtnl lock when we are clearing the
interrupt scheme.  Added a check when deleting the MAC address from the
VF to ensure that the MAC address was not set by the PF and if it was,
do not delete it. 

The following are changes since commit 817e60a7a2bb1f22052f18562990d675cb3a3762:
  Merge branch 'nfp-add-NFP5000-support'
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 40GbE

Björn Töpel (1):
  i40e: report correct statistics when XDP is enabled

Harshitha Ramamurthy (1):
  virtchnl: use u8 type for a field in the virtchnl_filter struct

Jacob Keller (3):
  i40e: convert queue stats to i40e_stats array
  i40e: move ethtool stats boiler plate code to i40e_ethtool_stats.h
  i40evf: update ethtool stats code and use helper functions

Jan Sokolowski (1):
  i40e: Check and correct speed values for link on open

Lihong Yang (2):
  i40evf: set IFF_UNICAST_FLT flag for the VF
  i40evf: cancel workqueue sync for adminq when a VF is removed

Martyna Szapar (1):
  i40e: static analysis report from community

Mitch Williams (1):
  i40e: use correct length for strncpy

Patryk Małek (3):
  i40evf: Don't enable vlan stripping when rx offload is turned on
  i40e: hold the rtnl lock on clearing interrupt scheme
  i40e: Prevent deleting MAC address from VF when set by PF

Paul M Stillwell Jr (1):
  i40evf: Validate the number of queues a PF sends

Paweł Jabłoński (1):
  i40evf: Change a VF mac without reloading the VF driver

 .../net/ethernet/intel/i40e/i40e_ethtool.c    | 238 ++++--------------
 .../ethernet/intel/i40e/i40e_ethtool_stats.h  | 221 ++++++++++++++++
 drivers/net/ethernet/intel/i40e/i40e_main.c   |  61 +++--
 drivers/net/ethernet/intel/i40e/i40e_ptp.c    |   3 +-
 .../ethernet/intel/i40e/i40e_virtchnl_pf.c    |  18 +-
 .../intel/i40evf/i40e_ethtool_stats.h         | 221 ++++++++++++++++
 .../ethernet/intel/i40evf/i40evf_ethtool.c    | 137 +++++-----
 .../net/ethernet/intel/i40evf/i40evf_main.c   |  15 +-
 .../ethernet/intel/i40evf/i40evf_virtchnl.c   |  43 +++-
 include/linux/avf/virtchnl.h                  |   2 +-
 10 files changed, 678 insertions(+), 281 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h
 create mode 100644 drivers/net/ethernet/intel/i40evf/i40e_ethtool_stats.h

-- 
2.17.1

^ permalink raw reply

* Re: [PATCH net-next 0/2] ethtool: drop get_settings and set_settings ops
From: David Miller @ 2018-08-30  2:46 UTC (permalink / raw)
  To: mkubecek
  Cc: netdev, andrew, f.fainelli, linux, linux-arm-kernel, linux-kernel
In-Reply-To: <cover.1535477409.git.mkubecek@suse.cz>

From: Michal Kubecek <mkubecek@suse.cz>
Date: Tue, 28 Aug 2018 19:56:48 +0200 (CEST)

> As Andrew Lunn pointed out in recent discussion, there is only one in tree
> driver left which still defines deprecated callbacks get_settings() and
> set_settings() in ethtool_ops. First patch converts this driver to
> get_link_ksettings() and set_link_ksettings(). Second patch then removes
> the deprecated callbacks from struct ethtool_ops and ethtool code which
> falls back to them.
> 
> This doesn't break old versions of ethtool or any other userspace code
> using ETHTOOL_{G,S}SET. We still implement both (old) ETHTOOL_{G,S}SET and
> (new) ETHTOOL_{G,S}LINKSETTINGS ioctl commands but after this series both
> will be implemented only using {g,s}et_link_ksettings(). The only affected
> code would be out of tree NIC drivers which have not been converted yet.

Nice, thanks for following up on this.

Series applied.

^ permalink raw reply

* Re: [PATCH net-next] genetlink: constify genl_err_attr() argument
From: David Miller @ 2018-08-30  2:43 UTC (permalink / raw)
  To: mkubecek; +Cc: netdev, linux-kernel
In-Reply-To: <20180828165158.B13DCA0C37@unicorn.suse.cz>

From: Michal Kubecek <mkubecek@suse.cz>
Date: Tue, 28 Aug 2018 18:51:58 +0200 (CEST)

> genl_err_attr() sets netlink_ext_ack::bad_attr which is a pointer to const
> struct nlattr so make the attr argument also const.
> 
> Signed-off-by: Michal Kubecek <mkubecek@suse.cz>

Applied.

^ permalink raw reply

* Re: [PATCH net] net: stmmac: build the dwmac-socfpga platform driver for Stratix10
From: David Miller @ 2018-08-30  2:40 UTC (permalink / raw)
  To: dinguyen; +Cc: netdev, linux-kernel
In-Reply-To: <1535469560-2304-1-git-send-email-dinguyen@kernel.org>

From: Dinh Nguyen <dinguyen@kernel.org>
Date: Tue, 28 Aug 2018 10:19:20 -0500

> The Stratix10 SoC is an AARCH64 based platform that shares the same ethernet
> controller that is on other SoCFPGA platforms. Build the platform driver.
> 
> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH v4 0/3] KASLR feature to randomize each loadable module
From: Alexei Starovoitov @ 2018-08-30  2:27 UTC (permalink / raw)
  To: Rick Edgecombe
  Cc: tglx, mingo, hpa, x86, linux-kernel, linux-mm, kernel-hardening,
	daniel, jannh, keescook, kristen, dave.hansen, arjan, netdev
In-Reply-To: <1535583579-6138-1-git-send-email-rick.p.edgecombe@intel.com>

On Wed, Aug 29, 2018 at 03:59:36PM -0700, Rick Edgecombe wrote:
> Hi,
> 
> This is v4 of the "KASLR feature to randomize each loadable module" patchset.
> The purpose is to increase the randomization and also to make the modules
> randomized in relation to each other instead of just the base, so that if one
> module leaks the location of the others can't be inferred. It is enabled for
> x86_64 for now.
> 
> V4 is a few small fixes. I humbly think this is in pretty good shape at this
> point, unless anyone has any comments. The only other big change I was
> considering was moving the new randomization algorithm into vmalloc so it could
> be re-used for other architectures or possibly other vmalloc usages.
> 
> A few words on how this was tested - As previously mentioned, the entropy
> estimates were done using extracted module text sizes from the in-tree modules.
> These were also used to run 100,000's of simulated module allocations by calling
> module_alloc from a test module, including testing until allocation failure. The
> simulations kept track of every allocation address to make sure there were no
> collisions, and verified memory was actually mapped.
> 
> In addition the __vmalloc_node_try_addr function has a suite of unit tests that
> verify for a bunch of edge cases that it:
>  - Allows for allocations when it should
>  - Reports the right error code if it collides with a lazy-free area or real
>    allocation
>  - Verifies it frees a lazy free area when it should
> 
> These synthetic tests were also how the performance metrics were gathered.
> 
> Changes for V4:
>  - Fix issue caused by KASAN, kmemleak being provided different allocation
>    lengths (padding).
>  - Avoid kmalloc until sure its needed in __vmalloc_node_try_addr.
>  - Fix for debug file hang when the last VA is a lazy purge area
>  - Fixed issues reported by 0-day build system.
> 
> Changes for V3:
>  - Code cleanup based on internal feedback. (thanks to Dave Hansen and Andriy
>    Shevchenko)
>  - Slight refactor of existing algorithm to more cleanly live along side new
>    one.
>  - BPF synthetic benchmark

I don't see this benchmark in this patch set.
Could you prepare it as a test in tools/testing/selftests/bpf/ ?
so we can double check what is being tested and run it regularly
like we do for all other tests in there.

^ permalink raw reply

* RE: [PATCH] hv_netvsc: Fix a deadlock by getting rtnl_lock earlier in netvsc_probe()
From: Dexuan Cui @ 2018-08-30  2:10 UTC (permalink / raw)
  To: David Miller
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	netdev@vger.kernel.org, devel@linuxdriverproject.org,
	linux-kernel@vger.kernel.org, olaf@aepfle.de, apw@canonical.com,
	jasowang@redhat.com, vkuznets, marcelo.cerri@canonical.com,
	Josh Poulson
In-Reply-To: <20180829.190839.1082082323658260994.davem@davemloft.net>

> From: David Miller <davem@davemloft.net>
> Sent: Wednesday, August 29, 2018 19:09
> > Hi David,
> > I was afraid the call-traces are too detailed. :-)
> >
> > Can you please move the info to before the --- line?
> >
> > Or, should I resend the patch with the commit log updated?
> 
> Please resend.

OK. Will do.

-- Dexuan

^ permalink raw reply

* Re: [PATCH] hv_netvsc: Fix a deadlock by getting rtnl_lock earlier in netvsc_probe()
From: David Miller @ 2018-08-30  2:08 UTC (permalink / raw)
  To: decui
  Cc: kys, haiyangz, sthemmin, netdev, devel, linux-kernel, olaf, apw,
	jasowang, vkuznets, marcelo.cerri, jopoulso
In-Reply-To: <PU1P153MB016903725D5E8EF56C07FF69BF080@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>

From: Dexuan Cui <decui@microsoft.com>
Date: Thu, 30 Aug 2018 00:58:48 +0000

>> From: David Miller <davem@davemloft.net>
>> Sent: Wednesday, August 29, 2018 17:49
>> 
>> From: Dexuan Cui <decui@microsoft.com>
>> Date: Wed, 22 Aug 2018 21:20:03 +0000
>> 
>> > ---
>> >  drivers/net/hyperv/netvsc_drv.c | 11 ++++++++++-
>> >  1 file changed, 10 insertions(+), 1 deletion(-)
>> >
>> >
>> > FYI: these are the related 3 paths which show the deadlock:
>> 
>> This incredibly useful information belongs in the commit log
>> message, and therefore before the --- and signoffs.
> 
> Hi David,
> I was afraid the call-traces are too detailed. :-)
> 
> Can you please move the info to before the --- line?
> 
> Or, should I resend the patch with the commit log updated?

Please resend.

^ permalink raw reply

* [PATCH net-next] tcp: change IPv6 flow-label upon receiving spurious retransmission
From: Yuchung Cheng @ 2018-08-29 21:53 UTC (permalink / raw)
  To: davem, edumazet; +Cc: netdev, ncardwell, Yuchung Cheng

Currently a Linux IPv6 TCP sender will change the flow label upon
timeouts to potentially steer away from a data path that has gone
bad. However this does not help if the problem is on the ACK path
and the data path is healthy. In this case the receiver is likely
to receive repeated spurious retransmission because the sender
couldn't get the ACKs in time and has recurring timeouts.

This patch adds another feature to mitigate this problem. It
leverages the DSACK states in the receiver to change the flow
label of the ACKs to speculatively re-route the ACK packets.
In order to allow triggering on the second consecutive spurious
RTO, the receiver changes the flow label upon sending a second
consecutive DSACK for a sequence number below RCV.NXT.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp.c       |  2 ++
 net/ipv4/tcp_input.c | 13 +++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index b8af2fec5ad5..8c4235c098fd 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2595,6 +2595,8 @@ int tcp_disconnect(struct sock *sk, int flags)
 	tp->compressed_ack = 0;
 	tp->bytes_sent = 0;
 	tp->bytes_retrans = 0;
+	tp->duplicate_sack[0].start_seq = 0;
+	tp->duplicate_sack[0].end_seq = 0;
 	tp->dsack_dups = 0;
 	tp->reord_seen = 0;
 
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4c2dd9f863f7..62508a2f9b21 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4199,6 +4199,17 @@ static void tcp_dsack_extend(struct sock *sk, u32 seq, u32 end_seq)
 		tcp_sack_extend(tp->duplicate_sack, seq, end_seq);
 }
 
+static void tcp_rcv_spurious_retrans(struct sock *sk, const struct sk_buff *skb)
+{
+	/* When the ACK path fails or drops most ACKs, the sender would
+	 * timeout and spuriously retransmit the same segment repeatedly.
+	 * The receiver remembers and reflects via DSACKs. Leverage the
+	 * DSACK state and change the txhash to re-route speculatively.
+	 */
+	if (TCP_SKB_CB(skb)->seq == tcp_sk(sk)->duplicate_sack[0].start_seq)
+		sk_rethink_txhash(sk);
+}
+
 static void tcp_send_dupack(struct sock *sk, const struct sk_buff *skb)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
@@ -4211,6 +4222,7 @@ static void tcp_send_dupack(struct sock *sk, const struct sk_buff *skb)
 		if (tcp_is_sack(tp) && sock_net(sk)->ipv4.sysctl_tcp_dsack) {
 			u32 end_seq = TCP_SKB_CB(skb)->end_seq;
 
+			tcp_rcv_spurious_retrans(sk, skb);
 			if (after(TCP_SKB_CB(skb)->end_seq, tp->rcv_nxt))
 				end_seq = tp->rcv_nxt;
 			tcp_dsack_set(sk, TCP_SKB_CB(skb)->seq, end_seq);
@@ -4755,6 +4767,7 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 	}
 
 	if (!after(TCP_SKB_CB(skb)->end_seq, tp->rcv_nxt)) {
+		tcp_rcv_spurious_retrans(sk, skb);
 		/* A retransmit, 2nd most common case.  Force an immediate ack. */
 		NET_INC_STATS(sock_net(sk), LINUX_MIB_DELAYEDACKLOST);
 		tcp_dsack_set(sk, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq);
-- 
2.19.0.rc0.228.g281dcd1b4d0-goog

^ permalink raw reply related

* [PATCH bpf-next 3/3] tools/bpf: bpftool: add btf percpu map formated dump
From: Yonghong Song @ 2018-08-29 21:43 UTC (permalink / raw)
  To: ast, kafai, daniel, netdev; +Cc: kernel-team
In-Reply-To: <20180829214315.4065753-1-yhs@fb.com>

The btf pretty print is added to percpu arraymap,
percpu hashmap and percpu lru hashmap.
For each <key, value> pair, the following will be
added to plain/json output:

   {
       "key": <pretty_print_key>,
       "values": [{
             "cpu": 0,
             "value": <pretty_print_value_on_cpu0>
          },{
             "cpu": 1,
             "value": <pretty_print_value_on_cpu1>
          },{
          ....
          },{
             "cpu": n,
             "value": <pretty_print_value_on_cpun>
          }
       ]
   }

For example, the following could be part of plain or json formatted
output:
    {
        "key": 0,
        "values": [{
                "cpu": 0,
                "value": {
                    "ui32": 0,
                    "ui16": 0,
                }
            },{
                "cpu": 1,
                "value": {
                    "ui32": 1,
                    "ui16": 0,
                }
            },{
                "cpu": 2,
                "value": {
                    "ui32": 2,
                    "ui16": 0,
                }
            },{
                "cpu": 3,
                "value": {
                    "ui32": 3,
                    "ui16": 0,
                }
            }
        ]
    }

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/bpf/bpftool/map.c | 33 +++++++++++++++++++++++++++++++--
 1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
index b2ec20e562bd..df175bc33c5d 100644
--- a/tools/bpf/bpftool/map.c
+++ b/tools/bpf/bpftool/map.c
@@ -169,9 +169,28 @@ static int do_dump_btf(const struct btf_dumper *d,
 	if (ret)
 		goto err_end_obj;
 
-	jsonw_name(d->jw, "value");
+	if (!map_is_per_cpu(map_info->type)) {
+		jsonw_name(d->jw, "value");
+		ret = btf_dumper_type(d, map_info->btf_value_type_id, value);
+	} else {
+		unsigned int i, n, step;
 
-	ret = btf_dumper_type(d, map_info->btf_value_type_id, value);
+		jsonw_name(d->jw, "values");
+		jsonw_start_array(d->jw);
+		n = get_possible_cpus();
+		step = round_up(map_info->value_size, 8);
+		for (i = 0; i < n; i++) {
+			jsonw_start_object(d->jw);
+			jsonw_int_field(d->jw, "cpu", i);
+			jsonw_name(d->jw, "value");
+			ret = btf_dumper_type(d, map_info->btf_value_type_id,
+					      value + i * step);
+			jsonw_end_object(d->jw);
+			if (ret)
+				break;
+		}
+		jsonw_end_array(d->jw);
+	}
 
 err_end_obj:
 	/* end of key-value pair */
@@ -298,6 +317,16 @@ static void print_entry_json(struct bpf_map_info *info, unsigned char *key,
 			jsonw_end_object(json_wtr);
 		}
 		jsonw_end_array(json_wtr);
+		if (btf) {
+			struct btf_dumper d = {
+				.btf = btf,
+				.jw = json_wtr,
+				.is_plain_text = false,
+			};
+
+			jsonw_name(json_wtr, "formatted");
+			do_dump_btf(&d, info, key, value);
+		}
 	}
 
 	jsonw_end_object(json_wtr);
-- 
2.14.3

^ permalink raw reply related

* [PATCH bpf-next 1/3] bpf: add bpffs pretty print for percpu arraymap/hash/lru_hash
From: Yonghong Song @ 2018-08-29 21:43 UTC (permalink / raw)
  To: ast, kafai, daniel, netdev; +Cc: kernel-team
In-Reply-To: <20180829214315.4065753-1-yhs@fb.com>

Added bpffs pretty print for percpu arraymap, percpu hashmap
and percpu lru hashmap.

For each map <key, value> pair, the format is:
   <key_value>: {
	cpu0: <value_on_cpu0>
	cpu1: <value_on_cpu1>
	...
	cpun: <value_on_cpun>
   }

For example, on my VM, there are 4 cpus, and
for test_btf test in the next patch:
   cat /sys/fs/bpf/pprint_test_percpu_hash

You may get:
   ...
   43602: {
	cpu0: {43602,0,-43602,0x3,0xaa52,0x3,{43602|[82,170,0,0,0,0,0,0]},ENUM_TWO}
	cpu1: {43602,0,-43602,0x3,0xaa52,0x3,{43602|[82,170,0,0,0,0,0,0]},ENUM_TWO}
	cpu2: {43602,0,-43602,0x3,0xaa52,0x3,{43602|[82,170,0,0,0,0,0,0]},ENUM_TWO}
	cpu3: {43602,0,-43602,0x3,0xaa52,0x3,{43602|[82,170,0,0,0,0,0,0]},ENUM_TWO}
   }
   72847: {
	cpu0: {72847,0,-72847,0x3,0x11c8f,0x3,{72847|[143,28,1,0,0,0,0,0]},ENUM_THREE}
	cpu1: {72847,0,-72847,0x3,0x11c8f,0x3,{72847|[143,28,1,0,0,0,0,0]},ENUM_THREE}
	cpu2: {72847,0,-72847,0x3,0x11c8f,0x3,{72847|[143,28,1,0,0,0,0,0]},ENUM_THREE}
	cpu3: {72847,0,-72847,0x3,0x11c8f,0x3,{72847|[143,28,1,0,0,0,0,0]},ENUM_THREE}
   }
   ...

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 kernel/bpf/arraymap.c | 24 ++++++++++++++++++++++++
 kernel/bpf/hashtab.c  | 31 +++++++++++++++++++++++++++++++
 2 files changed, 55 insertions(+)

diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 0c17aab3ce5f..f9d24121be99 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -358,6 +358,29 @@ static void array_map_seq_show_elem(struct bpf_map *map, void *key,
 	rcu_read_unlock();
 }
 
+static void percpu_array_map_seq_show_elem(struct bpf_map *map, void *key,
+					   struct seq_file *m)
+{
+	struct bpf_array *array = container_of(map, struct bpf_array, map);
+	u32 index = *(u32 *)key;
+	void __percpu *pptr;
+	int cpu;
+
+	rcu_read_lock();
+
+	seq_printf(m, "%u: {\n", *(u32 *)key);
+	pptr = array->pptrs[index & array->index_mask];
+	for_each_possible_cpu(cpu) {
+		seq_printf(m, "\tcpu%d: ", cpu);
+		btf_type_seq_show(map->btf, map->btf_value_type_id,
+				  per_cpu_ptr(pptr, cpu), m);
+		seq_puts(m, "\n");
+	}
+	seq_puts(m, "}\n");
+
+	rcu_read_unlock();
+}
+
 static int array_map_check_btf(const struct bpf_map *map,
 			       const struct btf_type *key_type,
 			       const struct btf_type *value_type)
@@ -398,6 +421,7 @@ const struct bpf_map_ops percpu_array_map_ops = {
 	.map_lookup_elem = percpu_array_map_lookup_elem,
 	.map_update_elem = array_map_update_elem,
 	.map_delete_elem = array_map_delete_elem,
+	.map_seq_show_elem = percpu_array_map_seq_show_elem,
 	.map_check_btf = array_map_check_btf,
 };
 
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 03cc59ee9c95..2c1790288138 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -1285,6 +1285,35 @@ int bpf_percpu_hash_update(struct bpf_map *map, void *key, void *value,
 	return ret;
 }
 
+static void htab_percpu_map_seq_show_elem(struct bpf_map *map, void *key,
+					  struct seq_file *m)
+{
+	struct htab_elem *l;
+	void __percpu *pptr;
+	int cpu;
+
+	rcu_read_lock();
+
+	l = __htab_map_lookup_elem(map, key);
+	if (!l) {
+		rcu_read_unlock();
+		return;
+	}
+
+	btf_type_seq_show(map->btf, map->btf_key_type_id, key, m);
+	seq_puts(m, ": {\n");
+	pptr = htab_elem_get_ptr(l, map->key_size);
+	for_each_possible_cpu(cpu) {
+		seq_printf(m, "\tcpu%d: ", cpu);
+		btf_type_seq_show(map->btf, map->btf_value_type_id,
+				  per_cpu_ptr(pptr, cpu), m);
+		seq_puts(m, "\n");
+	}
+	seq_puts(m, "}\n");
+
+	rcu_read_unlock();
+}
+
 const struct bpf_map_ops htab_percpu_map_ops = {
 	.map_alloc_check = htab_map_alloc_check,
 	.map_alloc = htab_map_alloc,
@@ -1293,6 +1322,7 @@ const struct bpf_map_ops htab_percpu_map_ops = {
 	.map_lookup_elem = htab_percpu_map_lookup_elem,
 	.map_update_elem = htab_percpu_map_update_elem,
 	.map_delete_elem = htab_map_delete_elem,
+	.map_seq_show_elem = htab_percpu_map_seq_show_elem,
 };
 
 const struct bpf_map_ops htab_lru_percpu_map_ops = {
@@ -1303,6 +1333,7 @@ const struct bpf_map_ops htab_lru_percpu_map_ops = {
 	.map_lookup_elem = htab_lru_percpu_map_lookup_elem,
 	.map_update_elem = htab_lru_percpu_map_update_elem,
 	.map_delete_elem = htab_lru_map_delete_elem,
+	.map_seq_show_elem = htab_percpu_map_seq_show_elem,
 };
 
 static int fd_htab_map_alloc_check(union bpf_attr *attr)
-- 
2.14.3

^ permalink raw reply related

* [PATCH bpf-next 2/3] tools/bpf: add bpffs percpu map pretty print tests in test_btf
From: Yonghong Song @ 2018-08-29 21:43 UTC (permalink / raw)
  To: ast, kafai, daniel, netdev; +Cc: kernel-team
In-Reply-To: <20180829214315.4065753-1-yhs@fb.com>

The bpf selftest test_btf is extended to test bpffs
percpu map pretty print for percpu array, percpu hash and
percpu lru hash.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/testing/selftests/bpf/test_btf.c | 179 ++++++++++++++++++++++++++-------
 1 file changed, 144 insertions(+), 35 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_btf.c b/tools/testing/selftests/bpf/test_btf.c
index 6b5cfeb7a9cc..f42b3396d622 100644
--- a/tools/testing/selftests/bpf/test_btf.c
+++ b/tools/testing/selftests/bpf/test_btf.c
@@ -4,6 +4,7 @@
 #include <linux/bpf.h>
 #include <linux/btf.h>
 #include <linux/err.h>
+#include <linux/kernel.h>
 #include <bpf/bpf.h>
 #include <sys/resource.h>
 #include <libelf.h>
@@ -45,7 +46,6 @@ static int count_result(int err)
 	return err;
 }
 
-#define min(a, b) ((a) < (b) ? (a) : (b))
 #define __printf(a, b)	__attribute__((format(printf, a, b)))
 
 __printf(1, 2)
@@ -130,6 +130,7 @@ struct btf_raw_test {
 	bool map_create_err;
 	bool ordered_map;
 	bool lossless_map;
+	bool percpu_map;
 	int hdr_len_delta;
 	int type_off_delta;
 	int str_off_delta;
@@ -2157,6 +2158,7 @@ static struct btf_pprint_test_meta {
 	const char *map_name;
 	bool ordered_map;
 	bool lossless_map;
+	bool percpu_map;
 } pprint_tests_meta[] = {
 {
 	.descr = "BTF pretty print array",
@@ -2164,6 +2166,7 @@ static struct btf_pprint_test_meta {
 	.map_name = "pprint_test_array",
 	.ordered_map = true,
 	.lossless_map = true,
+	.percpu_map = false,
 },
 
 {
@@ -2172,6 +2175,7 @@ static struct btf_pprint_test_meta {
 	.map_name = "pprint_test_hash",
 	.ordered_map = false,
 	.lossless_map = true,
+	.percpu_map = false,
 },
 
 {
@@ -2180,30 +2184,83 @@ static struct btf_pprint_test_meta {
 	.map_name = "pprint_test_lru_hash",
 	.ordered_map = false,
 	.lossless_map = false,
+	.percpu_map = false,
+},
+
+{
+	.descr = "BTF pretty print percpu array",
+	.map_type = BPF_MAP_TYPE_PERCPU_ARRAY,
+	.map_name = "pprint_test_percpu_array",
+	.ordered_map = true,
+	.lossless_map = true,
+	.percpu_map = true,
+},
+
+{
+	.descr = "BTF pretty print percpu hash",
+	.map_type = BPF_MAP_TYPE_PERCPU_HASH,
+	.map_name = "pprint_test_percpu_hash",
+	.ordered_map = false,
+	.lossless_map = true,
+	.percpu_map = true,
+},
+
+{
+	.descr = "BTF pretty print lru percpu hash",
+	.map_type = BPF_MAP_TYPE_LRU_PERCPU_HASH,
+	.map_name = "pprint_test_lru_percpu_hash",
+	.ordered_map = false,
+	.lossless_map = false,
+	.percpu_map = true,
 },
 
 };
 
 
-static void set_pprint_mapv(struct pprint_mapv *v, uint32_t i)
+static void set_pprint_mapv(struct pprint_mapv *v, uint32_t i,
+			    int num_cpus, int rounded_value_size)
 {
-	v->ui32 = i;
-	v->si32 = -i;
-	v->unused_bits2a = 3;
-	v->bits28 = i;
-	v->unused_bits2b = 3;
-	v->ui64 = i;
-	v->aenum = i & 0x03;
+	int cpu;
+
+	for (cpu = 0; cpu < num_cpus; cpu++) {
+		v->ui32 = i + cpu;
+		v->si32 = -i;
+		v->unused_bits2a = 3;
+		v->bits28 = i;
+		v->unused_bits2b = 3;
+		v->ui64 = i;
+		v->aenum = i & 0x03;
+		v = (void *)v + rounded_value_size;
+	}
 }
 
+static int check_line(const char *expected_line, int nexpected_line,
+		      int expected_line_len, const char *line)
+{
+	if (CHECK(nexpected_line == expected_line_len,
+		  "expected_line is too long"))
+		return -1;
+
+	if (strcmp(expected_line, line)) {
+		fprintf(stderr, "unexpected pprint output\n");
+		fprintf(stderr, "expected: %s", expected_line);
+		fprintf(stderr, "    read: %s", line);
+		return -1;
+	}
+
+	return 0;
+}
+
+
 static int do_test_pprint(void)
 {
 	const struct btf_raw_test *test = &pprint_test_template;
 	struct bpf_create_map_attr create_attr = {};
+	bool ordered_map, lossless_map, percpu_map;
+	int err, ret, num_cpus, rounded_value_size;
+	struct pprint_mapv *mapv = NULL;
 	unsigned int key, nr_read_elems;
-	bool ordered_map, lossless_map;
 	int map_fd = -1, btf_fd = -1;
-	struct pprint_mapv mapv = {};
 	unsigned int raw_btf_size;
 	char expected_line[255];
 	FILE *pin_file = NULL;
@@ -2212,7 +2269,6 @@ static int do_test_pprint(void)
 	char *line = NULL;
 	uint8_t *raw_btf;
 	ssize_t nread;
-	int err, ret;
 
 	fprintf(stderr, "%s......", test->descr);
 	raw_btf = btf_raw_create(&hdr_tmpl, test->raw_types,
@@ -2261,9 +2317,18 @@ static int do_test_pprint(void)
 	if (CHECK(err, "bpf_obj_pin(%s): errno:%d.", pin_path, errno))
 		goto done;
 
+	percpu_map = test->percpu_map;
+	num_cpus = percpu_map ? bpf_num_possible_cpus() : 1;
+	rounded_value_size = round_up(sizeof(struct pprint_mapv), 8);
+	mapv = calloc(num_cpus, rounded_value_size);
+	if (CHECK(!mapv, "mapv allocation failure")) {
+		err = -1;
+		goto done;
+	}
+
 	for (key = 0; key < test->max_entries; key++) {
-		set_pprint_mapv(&mapv, key);
-		bpf_map_update_elem(map_fd, &key, &mapv, 0);
+		set_pprint_mapv(mapv, key, num_cpus, rounded_value_size);
+		bpf_map_update_elem(map_fd, &key, mapv, 0);
 	}
 
 	pin_file = fopen(pin_path, "r");
@@ -2286,33 +2351,74 @@ static int do_test_pprint(void)
 	ordered_map = test->ordered_map;
 	lossless_map = test->lossless_map;
 	do {
+		struct pprint_mapv *cmapv;
 		ssize_t nexpected_line;
 		unsigned int next_key;
+		int cpu;
 
 		next_key = ordered_map ? nr_read_elems : atoi(line);
-		set_pprint_mapv(&mapv, next_key);
-		nexpected_line = snprintf(expected_line, sizeof(expected_line),
-					  "%u: {%u,0,%d,0x%x,0x%x,0x%x,{%lu|[%u,%u,%u,%u,%u,%u,%u,%u]},%s}\n",
-					  next_key,
-					  mapv.ui32, mapv.si32,
-					  mapv.unused_bits2a, mapv.bits28, mapv.unused_bits2b,
-					  mapv.ui64,
-					  mapv.ui8a[0], mapv.ui8a[1], mapv.ui8a[2], mapv.ui8a[3],
-					  mapv.ui8a[4], mapv.ui8a[5], mapv.ui8a[6], mapv.ui8a[7],
-					  pprint_enum_str[mapv.aenum]);
-
-		if (CHECK(nexpected_line == sizeof(expected_line),
-			  "expected_line is too long")) {
-			err = -1;
-			goto done;
+		set_pprint_mapv(mapv, next_key, num_cpus, rounded_value_size);
+		cmapv = mapv;
+
+		for (cpu = 0; cpu < num_cpus; cpu++) {
+			if (percpu_map) {
+				/* for percpu map, the format looks like:
+				 * <key>: {
+				 *	cpu0: <value_on_cpu0>
+				 *	cpu1: <value_on_cpu1>
+				 *	...
+				 *	cpun: <value_on_cpun>
+				 * }
+				 *
+				 * let us verify the line containing the key here.
+				 */
+				if (cpu == 0) {
+					nexpected_line = snprintf(expected_line,
+								  sizeof(expected_line),
+								  "%u: {\n",
+								  next_key);
+
+					err = check_line(expected_line, nexpected_line,
+							 sizeof(expected_line), line);
+					if (err == -1)
+						goto done;
+				}
+
+				/* read value@cpu */
+				nread = getline(&line, &line_len, pin_file);
+				if (nread < 0)
+					break;
+			}
+
+			nexpected_line = snprintf(expected_line, sizeof(expected_line),
+						  "%s%u: {%u,0,%d,0x%x,0x%x,0x%x,"
+						  "{%lu|[%u,%u,%u,%u,%u,%u,%u,%u]},%s}\n",
+						  percpu_map ? "\tcpu" : "",
+						  percpu_map ? cpu : next_key,
+						  cmapv->ui32, cmapv->si32,
+						  cmapv->unused_bits2a,
+						  cmapv->bits28,
+						  cmapv->unused_bits2b,
+						  cmapv->ui64,
+						  cmapv->ui8a[0], cmapv->ui8a[1],
+						  cmapv->ui8a[2], cmapv->ui8a[3],
+						  cmapv->ui8a[4], cmapv->ui8a[5],
+						  cmapv->ui8a[6], cmapv->ui8a[7],
+						  pprint_enum_str[cmapv->aenum]);
+
+			err = check_line(expected_line, nexpected_line,
+					 sizeof(expected_line), line);
+			if (err == -1)
+				goto done;
+
+			cmapv = (void *)cmapv + rounded_value_size;
 		}
 
-		if (strcmp(expected_line, line)) {
-			err = -1;
-			fprintf(stderr, "unexpected pprint output\n");
-			fprintf(stderr, "expected: %s", expected_line);
-			fprintf(stderr, "    read: %s", line);
-			goto done;
+		if (percpu_map) {
+			/* skip the last bracket for the percpu map */
+			nread = getline(&line, &line_len, pin_file);
+			if (nread < 0)
+				break;
 		}
 
 		nread = getline(&line, &line_len, pin_file);
@@ -2334,6 +2440,8 @@ static int do_test_pprint(void)
 	err = 0;
 
 done:
+	if (mapv)
+		free(mapv);
 	if (!err)
 		fprintf(stderr, "OK");
 	if (*btf_log_buf && (err || args.always_log))
@@ -2361,6 +2469,7 @@ static int test_pprint(void)
 		pprint_test_template.map_name = pprint_tests_meta[i].map_name;
 		pprint_test_template.ordered_map = pprint_tests_meta[i].ordered_map;
 		pprint_test_template.lossless_map = pprint_tests_meta[i].lossless_map;
+		pprint_test_template.percpu_map = pprint_tests_meta[i].percpu_map;
 
 		err |= count_result(do_test_pprint());
 	}
-- 
2.14.3

^ permalink raw reply related

* [PATCH bpf-next 0/3] bpf: implement percpu map pretty print for bpffs and bpftool
From: Yonghong Song @ 2018-08-29 21:43 UTC (permalink / raw)
  To: ast, kafai, daniel, netdev; +Cc: kernel-team

Commit a26ca7c982cb ("bpf: btf: Add pretty print support to the
basic arraymap") and Commit 699c86d6ec21 ("bpf: btf: add pretty print
for hash/lru_hash maps") added bpffs pretty print for array, hash and
lru hash maps. The pretty print gives users a structurally formatted
dump for keys/values which much easy to understand than raw bytes.

This patch set implemented bpffs pretty print support for
percpu arraymap, percpu hashmap and percpu lru hashmap.
For complex key/value types, the pretty print here is even more useful
due to
  . large volumne of data making it even harder to correlate bytes
    to a particular field in a particular cpu.
  . kernel rounds the value size for each cpu to multiple of 8.
    User has to be aware of this otherwise wrong value may be
    derived from cpu 1/2/...

For example, we may have a bpffs pretty print like below:
   43602: {
        cpu0: {43602,0,-43602,0x3,0xaa52,0x3,{43602|[82,170,0,0,0,0,0,0]},ENUM_TWO}
        cpu1: {43602,0,-43602,0x3,0xaa52,0x3,{43602|[82,170,0,0,0,0,0,0]},ENUM_TWO}
        cpu2: {43602,0,-43602,0x3,0xaa52,0x3,{43602|[82,170,0,0,0,0,0,0]},ENUM_TWO}
        cpu3: {43602,0,-43602,0x3,0xaa52,0x3,{43602|[82,170,0,0,0,0,0,0]},ENUM_TWO}
   }
for a percpu map.

This patch also added percpu formatted print on bpftool. For example,
bpftool may print like below:
    {
        "key": 0,
        "values": [{
                "cpu": 0,
                "value": {
                    "ui32": 0,
                    "ui16": 0,
                }
            },{
                "cpu": 1,
                "value": {
                    "ui32": 1,
                    "ui16": 0,
                }
            },{
                "cpu": 2,
                "value": {
                    "ui32": 2,
                    "ui16": 0,
                }
            },{
                "cpu": 3,
                "value": {
                    "ui32": 3,
                    "ui16": 0,
                }
            }
        ]
    }

Patch #1 implemented bpffs pretty print for percpu arraymap/hash/lru_hash
in kernel. Patch #2 added the test case in tools bpf selftest test_btf.
Patch #3 added percpu map btf based dump.

Yonghong Song (3):
  bpf: add bpffs pretty print for percpu arraymap/hash/lru_hash
  tools/bpf: add bpffs percpu map pretty print tests in test_btf
  tools/bpf: bpftool: add btf percpu map formated dump

 kernel/bpf/arraymap.c                  |  24 +++++
 kernel/bpf/hashtab.c                   |  31 ++++++
 tools/bpf/bpftool/map.c                |  33 +++++-
 tools/testing/selftests/bpf/test_btf.c | 179 ++++++++++++++++++++++++++-------
 4 files changed, 230 insertions(+), 37 deletions(-)

-- 
2.14.3

^ permalink raw reply

* Re: [PATCH] net: sched: Fix memory exposure from short TCA_U32_SEL
From: Al Viro @ 2018-08-29 21:33 UTC (permalink / raw)
  To: Cong Wang
  Cc: Jamal Hadi Salim, Kees Cook, LKML, Jiri Pirko, David Miller,
	Linux Kernel Network Developers
In-Reply-To: <CAM_iQpXk48ip4dAKpc-HMwxf+wC6eYAK-atpsCSv7T7u7sLFMw@mail.gmail.com>

On Wed, Aug 29, 2018 at 12:07:09PM -0700, Cong Wang wrote:
> On Mon, Aug 27, 2018 at 5:03 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Mon, Aug 27, 2018 at 02:31:41PM -0700, Cong Wang wrote:
> > > > I cant think of any challenges. Cong/Jiri? Would it require development
> > > > time classifiers/actions/qdiscs to sit in that directory (I suspect you
> > > > dont want them in include/net).
> > > > BTW, the idea of improving grep-ability of the code by prefixing the
> > > > ops appropriately makes sense. i.e we should have ops->cls_init,
> > > > ops->act_init etc.
> > >
> > > Hmm? Isn't struct tcf_proto_ops used and must be provided
> > > by each tc filter module? How does it work if you move it into
> > > net/sched/* for out-of-tree modules? Are they supposed to
> > > include "..../net/sched/tcf_proto.h"?? Or something else?
> >
> > If you care about out-of-tree modules, that could easily live in
> > include/net/tcf_proto.h, provided that it's not pulled by indirect
> > includes into hell knows how many places.  Try
> > make allmodconfig
> > make >/dev/null 2>&1
> > find -name '.*.cmd'|xargs grep sch_generic.h
> >
> > That finds 2977 files here, most of them having nothing to do with
> > net/sched.
> 
> 
> Moving it to include/net/tcf_proto.h is fine, as out-of-tree modules
> can still compile by modifying the included header path.
> 
> include/net/pkt_cls.h might be a choice here too.

Nowhere near as massive exposure (123 files vs. 2977), but still - do
ethernet drivers need that?  Because that's 77 out of 123...

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox