Netdev List
 help / color / mirror / Atom feed
* [net-next 11/15] i40e: Remove function i40e_update_dcb_config()
From: Jeff Kirsher @ 2019-08-21 20:16 UTC (permalink / raw)
  To: davem
  Cc: Grzegorz Siwik, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190821201623.5506-1-jeffrey.t.kirsher@intel.com>

From: Grzegorz Siwik <grzegorz.siwik@intel.com>

This patch removes function i40e_update_dcb_config(). Instead of
i40e_update_dcb_config() we use i40e_init_dcb(), which implements the
correct NVM read.

Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 46 +--------------------
 1 file changed, 1 insertion(+), 45 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 5c280c025085..8d6b9515b595 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -6419,50 +6419,6 @@ static int i40e_resume_port_tx(struct i40e_pf *pf)
 	return ret;
 }
 
-/**
- * i40e_update_dcb_config
- * @hw: pointer to the HW struct
- * @enable_mib_change: enable MIB change event
- *
- * Update DCB configuration from the firmware
- **/
-static enum i40e_status_code
-i40e_update_dcb_config(struct i40e_hw *hw, bool enable_mib_change)
-{
-	struct i40e_lldp_variables lldp_cfg;
-	i40e_status ret;
-
-	if (!hw->func_caps.dcb)
-		return I40E_NOT_SUPPORTED;
-
-	/* Read LLDP NVM area */
-	ret = i40e_read_lldp_cfg(hw, &lldp_cfg);
-	if (ret)
-		return I40E_ERR_NOT_READY;
-
-	/* Get DCBX status */
-	ret = i40e_get_dcbx_status(hw, &hw->dcbx_status);
-	if (ret)
-		return ret;
-
-	/* Check the DCBX Status */
-	if (hw->dcbx_status == I40E_DCBX_STATUS_DONE ||
-	    hw->dcbx_status == I40E_DCBX_STATUS_IN_PROGRESS) {
-		/* Get current DCBX configuration */
-		ret = i40e_get_dcb_config(hw);
-		if (ret)
-			return ret;
-	} else if (hw->dcbx_status == I40E_DCBX_STATUS_DISABLED) {
-		return I40E_ERR_NOT_READY;
-	}
-
-	/* Configure the LLDP MIB change event */
-	if (enable_mib_change)
-		ret = i40e_aq_cfg_lldp_mib_change_event(hw, true, NULL);
-
-	return ret;
-}
-
 /**
  * i40e_init_pf_dcb - Initialize DCB configuration
  * @pf: PF being configured
@@ -6485,7 +6441,7 @@ static int i40e_init_pf_dcb(struct i40e_pf *pf)
 		goto out;
 	}
 
-	err = i40e_update_dcb_config(hw, true);
+	err = i40e_init_dcb(hw, true);
 	if (!err) {
 		/* Device/Function is not DCBX capable */
 		if ((!hw->func_caps.dcb) ||
-- 
2.21.0


^ permalink raw reply related

* [net-next 12/15] i40e: make i40e_set_mac_type() public
From: Jeff Kirsher @ 2019-08-21 20:16 UTC (permalink / raw)
  To: davem
  Cc: Piotr Kwapulinski, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190821201623.5506-1-jeffrey.t.kirsher@intel.com>

From: Piotr Kwapulinski <piotr.kwapulinski@intel.com>

Make i40e_set_mac_type() public. i40e driver requires i40e_set_mac_type()
to be public. It is required for recovery mode handling. Without this patch
recovery mode could not be detected in i40e_probe().

Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_common.c    | 2 +-
 drivers/net/ethernet/intel/i40e/i40e_prototype.h | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c
index 7af1b7477140..de996a80013e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
@@ -13,7 +13,7 @@
  * This function sets the mac type of the adapter based on the
  * vendor ID and device ID stored in the hw structure.
  **/
-static i40e_status i40e_set_mac_type(struct i40e_hw *hw)
+i40e_status i40e_set_mac_type(struct i40e_hw *hw)
 {
 	i40e_status status = 0;
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_prototype.h b/drivers/net/ethernet/intel/i40e/i40e_prototype.h
index eac88bcc6c06..9c810d54df1c 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_prototype.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_prototype.h
@@ -326,6 +326,8 @@ void i40e_nvmupd_check_wait_event(struct i40e_hw *hw, u16 opcode,
 void i40e_nvmupd_clear_wait_state(struct i40e_hw *hw);
 void i40e_set_pci_config_data(struct i40e_hw *hw, u16 link_status);
 
+i40e_status i40e_set_mac_type(struct i40e_hw *hw);
+
 extern struct i40e_rx_ptype_decoded i40e_ptype_lookup[];
 
 static inline struct i40e_rx_ptype_decoded decode_rx_desc_ptype(u8 ptype)
-- 
2.21.0


^ permalink raw reply related

* [net-next 01/15] i40e: reduce stack usage in i40e_set_fc
From: Jeff Kirsher @ 2019-08-21 20:16 UTC (permalink / raw)
  To: davem; +Cc: Arnd Bergmann, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190821201623.5506-1-jeffrey.t.kirsher@intel.com>

From: Arnd Bergmann <arnd@arndb.de>

The functions i40e_aq_get_phy_abilities_resp() and i40e_set_fc() both
have giant structure on the stack, which makes each one use stack frames
larger than 500 bytes.

As clang decides one function into the other, we get a warning for
exceeding the frame size limit on 32-bit architectures:

drivers/net/ethernet/intel/i40e/i40e_common.c:1654:23: error: stack frame size of 1116 bytes in function 'i40e_set_fc' [-Werror,-Wframe-larger-than=]

When building with gcc, the inlining does not happen, but i40e_set_fc()
calls i40e_aq_get_phy_abilities_resp() anyway, so they add up on the
kernel stack just as much.

The parts that actually use large stacks don't overlap, so make sure
each one is a separate function, and mark them as noinline_for_stack to
prevent the compilers from combining them again.

Fixes: 0a862b43acc6 ("i40e/i40evf: Add module_types and update_link_info")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_common.c | 91 +++++++++++--------
 1 file changed, 51 insertions(+), 40 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c
index 906cf68d3453..7af1b7477140 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
@@ -1643,25 +1643,15 @@ enum i40e_status_code i40e_aq_set_phy_config(struct i40e_hw *hw,
 	return status;
 }
 
-/**
- * i40e_set_fc
- * @hw: pointer to the hw struct
- * @aq_failures: buffer to return AdminQ failure information
- * @atomic_restart: whether to enable atomic link restart
- *
- * Set the requested flow control mode using set_phy_config.
- **/
-enum i40e_status_code i40e_set_fc(struct i40e_hw *hw, u8 *aq_failures,
-				  bool atomic_restart)
+static noinline_for_stack enum i40e_status_code
+i40e_set_fc_status(struct i40e_hw *hw,
+		   struct i40e_aq_get_phy_abilities_resp *abilities,
+		   bool atomic_restart)
 {
-	enum i40e_fc_mode fc_mode = hw->fc.requested_mode;
-	struct i40e_aq_get_phy_abilities_resp abilities;
 	struct i40e_aq_set_phy_config config;
-	enum i40e_status_code status;
+	enum i40e_fc_mode fc_mode = hw->fc.requested_mode;
 	u8 pause_mask = 0x0;
 
-	*aq_failures = 0x0;
-
 	switch (fc_mode) {
 	case I40E_FC_FULL:
 		pause_mask |= I40E_AQ_PHY_FLAG_PAUSE_TX;
@@ -1677,6 +1667,48 @@ enum i40e_status_code i40e_set_fc(struct i40e_hw *hw, u8 *aq_failures,
 		break;
 	}
 
+	memset(&config, 0, sizeof(struct i40e_aq_set_phy_config));
+	/* clear the old pause settings */
+	config.abilities = abilities->abilities & ~(I40E_AQ_PHY_FLAG_PAUSE_TX) &
+			   ~(I40E_AQ_PHY_FLAG_PAUSE_RX);
+	/* set the new abilities */
+	config.abilities |= pause_mask;
+	/* If the abilities have changed, then set the new config */
+	if (config.abilities == abilities->abilities)
+		return 0;
+
+	/* Auto restart link so settings take effect */
+	if (atomic_restart)
+		config.abilities |= I40E_AQ_PHY_ENABLE_ATOMIC_LINK;
+	/* Copy over all the old settings */
+	config.phy_type = abilities->phy_type;
+	config.phy_type_ext = abilities->phy_type_ext;
+	config.link_speed = abilities->link_speed;
+	config.eee_capability = abilities->eee_capability;
+	config.eeer = abilities->eeer_val;
+	config.low_power_ctrl = abilities->d3_lpan;
+	config.fec_config = abilities->fec_cfg_curr_mod_ext_info &
+			    I40E_AQ_PHY_FEC_CONFIG_MASK;
+
+	return i40e_aq_set_phy_config(hw, &config, NULL);
+}
+
+/**
+ * i40e_set_fc
+ * @hw: pointer to the hw struct
+ * @aq_failures: buffer to return AdminQ failure information
+ * @atomic_restart: whether to enable atomic link restart
+ *
+ * Set the requested flow control mode using set_phy_config.
+ **/
+enum i40e_status_code i40e_set_fc(struct i40e_hw *hw, u8 *aq_failures,
+				  bool atomic_restart)
+{
+	struct i40e_aq_get_phy_abilities_resp abilities;
+	enum i40e_status_code status;
+
+	*aq_failures = 0x0;
+
 	/* Get the current phy config */
 	status = i40e_aq_get_phy_capabilities(hw, false, false, &abilities,
 					      NULL);
@@ -1685,31 +1717,10 @@ enum i40e_status_code i40e_set_fc(struct i40e_hw *hw, u8 *aq_failures,
 		return status;
 	}
 
-	memset(&config, 0, sizeof(struct i40e_aq_set_phy_config));
-	/* clear the old pause settings */
-	config.abilities = abilities.abilities & ~(I40E_AQ_PHY_FLAG_PAUSE_TX) &
-			   ~(I40E_AQ_PHY_FLAG_PAUSE_RX);
-	/* set the new abilities */
-	config.abilities |= pause_mask;
-	/* If the abilities have changed, then set the new config */
-	if (config.abilities != abilities.abilities) {
-		/* Auto restart link so settings take effect */
-		if (atomic_restart)
-			config.abilities |= I40E_AQ_PHY_ENABLE_ATOMIC_LINK;
-		/* Copy over all the old settings */
-		config.phy_type = abilities.phy_type;
-		config.phy_type_ext = abilities.phy_type_ext;
-		config.link_speed = abilities.link_speed;
-		config.eee_capability = abilities.eee_capability;
-		config.eeer = abilities.eeer_val;
-		config.low_power_ctrl = abilities.d3_lpan;
-		config.fec_config = abilities.fec_cfg_curr_mod_ext_info &
-				    I40E_AQ_PHY_FEC_CONFIG_MASK;
-		status = i40e_aq_set_phy_config(hw, &config, NULL);
+	status = i40e_set_fc_status(hw, &abilities, atomic_restart);
+	if (status)
+		*aq_failures |= I40E_SET_FC_AQ_FAIL_SET;
 
-		if (status)
-			*aq_failures |= I40E_SET_FC_AQ_FAIL_SET;
-	}
 	/* Update the link info */
 	status = i40e_update_link_info(hw);
 	if (status) {
@@ -2537,7 +2548,7 @@ i40e_status i40e_get_link_status(struct i40e_hw *hw, bool *link_up)
  * i40e_updatelink_status - update status of the HW network link
  * @hw: pointer to the hw struct
  **/
-i40e_status i40e_update_link_info(struct i40e_hw *hw)
+noinline_for_stack i40e_status i40e_update_link_info(struct i40e_hw *hw)
 {
 	struct i40e_aq_get_phy_abilities_resp abilities;
 	i40e_status status = 0;
-- 
2.21.0


^ permalink raw reply related

* [net-next 02/15] i40e: Check if transceiver implements DDM before access
From: Jeff Kirsher @ 2019-08-21 20:16 UTC (permalink / raw)
  To: davem
  Cc: Mauro S. M. Rodrigues, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190821201623.5506-1-jeffrey.t.kirsher@intel.com>

From: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com>

Similar to the ixgbe issue fixed in:
655c91414579 ("ixgbe: Check DDM existence in transceiver before access)

i40e has the same issue when reading eeprom from SFP's module that comply
with SFF-8472 but not implement the Digital Diagnostic Monitoring (DDM)
interface described in it. The existence of such area is specified by bit
6 of byte 92, set to 1 if implemented.

Without this patch, due to not checking this bit i40e fails to read SFP
module's eeprom with the follow message:

ethtool -m enP51p1s0f0
Cannot get Module EEPROM data: Input/output error

Because it fails to read the additional 256 bytes in which it was assumed
to exist the DDM data.

Signed-off-by: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 6 ++++++
 drivers/net/ethernet/intel/i40e/i40e_type.h    | 1 +
 2 files changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index 01e4615b1b4b..41e1240acaea 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -5137,6 +5137,12 @@ static int i40e_get_module_info(struct net_device *netdev,
 			/* Module is not SFF-8472 compliant */
 			modinfo->type = ETH_MODULE_SFF_8079;
 			modinfo->eeprom_len = ETH_MODULE_SFF_8079_LEN;
+		} else if (!(sff8472_swap & I40E_MODULE_SFF_DDM_IMPLEMENTED)) {
+			/* Module is SFF-8472 compliant but doesn't implement
+			 * Digital Diagnostic Monitoring (DDM).
+			 */
+			modinfo->type = ETH_MODULE_SFF_8079;
+			modinfo->eeprom_len = ETH_MODULE_SFF_8079_LEN;
 		} else {
 			modinfo->type = ETH_MODULE_SFF_8472;
 			modinfo->eeprom_len = ETH_MODULE_SFF_8472_LEN;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_type.h b/drivers/net/ethernet/intel/i40e/i40e_type.h
index 8f43aa47c263..2a6219d66771 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_type.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_type.h
@@ -443,6 +443,7 @@ struct i40e_nvm_access {
 #define I40E_MODULE_SFF_8472_COMP	0x5E
 #define I40E_MODULE_SFF_8472_SWAP	0x5C
 #define I40E_MODULE_SFF_ADDR_MODE	0x04
+#define I40E_MODULE_SFF_DDM_IMPLEMENTED 0x40
 #define I40E_MODULE_TYPE_QSFP_PLUS	0x0D
 #define I40E_MODULE_TYPE_QSFP28		0x11
 #define I40E_MODULE_QSFP_MAX_LEN	640
-- 
2.21.0


^ permalink raw reply related

* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
From: Vladimir Oltean @ 2019-08-21 20:17 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Mark Brown, Hubert Feurstein, Miroslav Lichvar, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev
In-Reply-To: <20190821140815.GA1447@localhost>

Hi Richard,

On Wed, 21 Aug 2019 at 17:08, Richard Cochran <richardcochran@gmail.com> wrote:
>
> On Tue, Aug 20, 2019 at 09:38:45PM -0700, Richard Cochran wrote:
> > Overall, the PTP switch use case is well supported by Linux.  The
> > synchronization of the management CPU to the PTP, while nice to have,
> > is not required to implement a Transparent Clock.  Your specific
> > application might require it, but honestly, if the management CPU
> > needs good synchronization, then you really aught to feed a PPS from
> > the switch into a gpio (for example) on the CPU.
>
> Another way to achieve this is to have a second MAC interface on the
> management CPU connected to a spare port on the switch.  Then time
> stamping, PHC, ptp4l, and phc2sys work as expected.
>
> Thanks,
> Richard

Of course PPS with a dedicated hardware receiver that can take input
compare timestamps is always preferable. However non-Ethernet
synchronization in the field looks to me like "make do with whatever
you can". I'm not sure a plain GPIO that raises an interrupt is better
than an interrupt-driven serial protocol controller - it's (mostly)
the interrupts that throw off the precision of the software timestamp.
And use Miroslav's pps-gpio-poll module and you're back from where you
started (try to make a sw timestamp as precise as possible).
As for dedicating a second interface pair in (basically) loopback just
for sync, that's how I'm testing PTP when I don't have a second board
and hence how the idea occurred to me. I can imagine this even getting
deployed and I can also probably name an example, but it certainly
wouldn't be my first choice. But DSA could have that built-in, and
with the added latency benefit of a MAC-to-MAC connection.
Too bad the mv88e6xxx driver can't do loopback timestamping, that's
already 50% of the DSA drivers that support PTP at all. An embedded
solution for this is less compelling now.

Regards,
-Vladimir

^ permalink raw reply

* Re: VRF notes when using ipv6 and flushing tables.
From: Ben Greear @ 2019-08-21 20:20 UTC (permalink / raw)
  To: David Ahern, netdev
In-Reply-To: <2a8914bb-56ec-e585-bd76-36b77ca2517d@gmail.com>

On 08/20/2019 08:02 PM, David Ahern wrote:
> On 8/20/19 2:27 PM, Ben Greear wrote:
>> I recently spend a few days debugging what in the end was user error on
>> my part.
>>
>> Here are my notes in hope they help someone else.
>>
>> First, 'ip -6 route show vrf vrfX' will not show some of the
>> routes (like local routes) that will show up with
>> 'ip -6 route show table X', where X == vrfX's table-id
>>
>> If you run 'ip -6 route flush table X', then you will loose all of the auto
>> generated routes, including anycast, ff00::/8, and local routes.
>>
>> ff00::/8 is needed for neigh discovery to work (probably among other
>> things)
>>
>> local route is needed or packets won't actually be accepted up the stack
>> (I think that is the symptom at least)
>>
>> Not sure exactly what anycast does, but I'm guessing it is required for
>> something useful.
>>
>> You must manually re-add those to the table unless you for certain know
>> that
>> you do not need them for whatever reason.
>>
>
> sorry you went through such a long and painful debugging session.

No problem.  I learned some details of IPv6 I never realized before,
sure to come in useful some day!

Thanks,
Ben

> yes, the kernel doc for VRF needs to be updated that 'ip route show vrf
> X' and 'ip route show table X' are different ('show vrf' mimics the main
> table in not showing local, broadcast, anycast; 'table vrf' shows all).
>
> A suggestion for others: the documentation and selftests directory have
> a lot of VRF examples now. If something basic is not working (e.g., arp
> or neigh discovery), see if it works there and if so compare the outputs
> of the route table along the way.



-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply

* [PATCH] net: usb: Delete unnecessary checks before the macro call “dev_kfree_skb”
From: Markus Elfring @ 2019-08-21 20:24 UTC (permalink / raw)
  To: linux-usb, netdev, David S. Miller, Greg Kroah-Hartman,
	Kate Stewart, Petko Manolov, Steve Winslow, Thomas Gleixner
  Cc: LKML, kernel-janitors

From: Markus Elfring <elfring@users.sourceforge.net>
Date: Wed, 21 Aug 2019 22:16:02 +0200

The dev_kfree_skb() function performs also input parameter validation.
Thus the test around the shown calls is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
---
 drivers/net/usb/lg-vl600.c | 4 +---
 drivers/net/usb/rtl8150.c  | 6 ++----
 2 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/net/usb/lg-vl600.c b/drivers/net/usb/lg-vl600.c
index 6c2b3e368efe..217a2d8fa47b 100644
--- a/drivers/net/usb/lg-vl600.c
+++ b/drivers/net/usb/lg-vl600.c
@@ -87,9 +87,7 @@ static void vl600_unbind(struct usbnet *dev, struct usb_interface *intf)
 {
 	struct vl600_state *s = dev->driver_priv;

-	if (s->current_rx_buf)
-		dev_kfree_skb(s->current_rx_buf);
-
+	dev_kfree_skb(s->current_rx_buf);
 	kfree(s);

 	return usbnet_cdc_unbind(dev, intf);
diff --git a/drivers/net/usb/rtl8150.c b/drivers/net/usb/rtl8150.c
index 98f33e270af1..13e51ccf0214 100644
--- a/drivers/net/usb/rtl8150.c
+++ b/drivers/net/usb/rtl8150.c
@@ -586,8 +586,7 @@ static void free_skb_pool(rtl8150_t *dev)
 	int i;

 	for (i = 0; i < RX_SKB_POOL_SIZE; i++)
-		if (dev->rx_skb_pool[i])
-			dev_kfree_skb(dev->rx_skb_pool[i]);
+		dev_kfree_skb(dev->rx_skb_pool[i]);
 }

 static void rx_fixup(unsigned long data)
@@ -946,8 +945,7 @@ static void rtl8150_disconnect(struct usb_interface *intf)
 		unlink_all_urbs(dev);
 		free_all_urbs(dev);
 		free_skb_pool(dev);
-		if (dev->rx_skb)
-			dev_kfree_skb(dev->rx_skb);
+		dev_kfree_skb(dev->rx_skb);
 		kfree(dev->intr_buff);
 		free_netdev(dev->netdev);
 	}
--
2.23.0


^ permalink raw reply related

* Re: [RFC bpf-next 0/5] Convert iproute2 to use libbpf (WIP)
From: Andrii Nakryiko @ 2019-08-21 20:30 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf
In-Reply-To: <20190820114706.18546-1-toke@redhat.com>

On Tue, Aug 20, 2019 at 4:47 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> iproute2 uses its own bpf loader to load eBPF programs, which has
> evolved separately from libbpf. Since we are now standardising on
> libbpf, this becomes a problem as iproute2 is slowly accumulating
> feature incompatibilities with libbpf-based loaders. In particular,
> iproute2 has its own (expanded) version of the map definition struct,
> which makes it difficult to write programs that can be loaded with both
> custom loaders and iproute2.
>
> This series seeks to address this by converting iproute2 to using libbpf
> for all its bpf needs. This version is an early proof-of-concept RFC, to
> get some feedback on whether people think this is the right direction.
>
> What this series does is the following:
>
> - Updates the libbpf map definition struct to match that of iproute2
>   (patch 1).


Hi Toke,

Thanks for taking a stab at unifying libbpf and iproute2 loaders. I'm
totally in support of making iproute2 use libbpf to load/initialize
BPF programs. But I'm against adding iproute2-specific fields to
libbpf's bpf_map_def definitions to support this.

I've proposed the plan of extending libbpf's supported features so
that it can be used to load iproute2-style BPF programs earlier,
please see discussions in [0] and [1]. I think instead of emulating
iproute2 way of matching everything based on user-specified internal
IDs, which doesn't provide good user experience and is quite easy to
get wrong, we should support same scenarios with better declarative
syntax and in a less error-prone way. I believe we can do that by
relying on BTF more heavily (again, please check some of my proposals
in [0], [1], and discussion with Daniel in those threads). It will
feel more natural and be more straightforward to follow. It would be
great if you can lend a hand in implementing pieces of that plan!

I'm currently on vacation, so my availability is very sparse, but I'd
be happy to discuss this further, if need be.

  [0] https://lore.kernel.org/bpf/CAEf4BzbfdG2ub7gCi0OYqBrUoChVHWsmOntWAkJt47=FE+km+A@xxxxxxxxxxxxxx/
  [1] https://www.spinics.net/lists/bpf/msg03976.html

> - Adds functionality to libbpf to support automatic pinning of maps when
>   loading an eBPF program, while re-using pinned maps if they already
>   exist (patches 2-3).
> - Modifies iproute2 to make it possible to compile it against libbpf
>   without affecting any existing functionality (patch 4).
> - Changes the iproute2 eBPF loader to use libbpf for loading XDP
>   programs (patch 5).
>
>
> As this is an early PoC, there are still a few missing pieces before
> this can be merged. Including (but probably not limited to):
>
> - Consolidate the map definition struct in the bpf_helpers.h file in the
>   kernel tree. This contains a different, and incompatible, update to
>   the struct. Since the iproute2 version has actually been released for
>   use outside the kernel tree (and thus is subject to API stability
>   constraints), I think it makes the most sense to keep that, and port
>   the selftests to use it.
>
> - The iproute2 loader supports automatically populating map-in-map
>   definitions on load. This needs to be added to libbpf as well. There
>   is an implementation of this in the selftests in the kernel tree,
>   which will have to be ported (related to the previous point).
>
> - The iproute2 port needs to be completed; this means at least
>   supporting TC eBPF programs as well, figuring out how to deal with
>   cBPF programs, and getting the verbose output back to the same state
>   as before the port. Also, I guess the iproute2 maintainers need to ACK
>   that they are good with adding a dependency on libbpf.
>
> - Some of the code added to libbpf in patch 2 in this series include
>   code derived from iproute2, which is GPLv2+. So it will need to be
>   re-licensed to be usable in libbpf. Since `git blame` indicated that
>   the original code was written by Daniel, I figure he can ACK that
>   relicensing before applying the patches :)
>
>
> Please take a look at this series and let me know if you agree
> that this is the right direction to go. Assuming there's consensus that
> it is, I'll focus on getting the rest of the libbpf patches ready for
> merging. I'll send those as a separate series, and hold off on the
> iproute2 patches until they are merged; but for this version I'm
> including both in one series so it's easier to see the context.
>
> -Toke
>
>
> libbpf:
> Toke Høiland-Jørgensen (3):
>   libbpf: Add map definition struct fields from iproute2
>   libbpf: Add support for auto-pinning of maps with reuse on program
>     load
>   libbpf: Add support for specifying map pinning path via callback
>
>  tools/lib/bpf/libbpf.c | 205 +++++++++++++++++++++++++++++++++++++++--
>  tools/lib/bpf/libbpf.h |  18 ++++
>  2 files changed, 214 insertions(+), 9 deletions(-)
>
> iproute2:
> Toke Høiland-Jørgensen (2):
>   iproute2: Allow compiling against libbpf
>   iproute2: Support loading XDP programs with libbpf
>
>  configure          |  16 +++++
>  include/bpf_util.h |   6 +-
>  ip/ipvrf.c         |   4 +-
>  lib/bpf.c          | 148 ++++++++++++++++++++++++++++++++++++---------
>  4 files changed, 142 insertions(+), 32 deletions(-)
>
>
> --
> 2.22.1
>

^ permalink raw reply

* Re: [PATCH 33/38] act_api: Convert action_idr to XArray
From: Matthew Wilcox @ 2019-08-21 20:35 UTC (permalink / raw)
  To: Vlad Buslov; +Cc: netdev@vger.kernel.org
In-Reply-To: <vbfpnky4884.fsf@mellanox.com>

On Wed, Aug 21, 2019 at 07:41:19PM +0000, Vlad Buslov wrote:
> > @@ -301,18 +292,18 @@ static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
> >  	if (nla_put_string(skb, TCA_KIND, ops->kind))
> >  		goto nla_put_failure;
> >  
> > -	mutex_lock(&idrinfo->lock);
> > -	idr_for_each_entry_ul(idr, p, tmp, id) {
> > +	xa_lock(&idrinfo->actions);
> > +	xa_for_each(&idrinfo->actions, index, p) {
> >  		ret = tcf_idr_release_unsafe(p);
> 
> I like the simplification of reusing builtin xarray spinlock for this,
> but the reason we are using mutex here is because the following call
> chain: tcf_idr_release_unsafe() -> tcf_action_cleanup() -> free_tcf() ->
> tcf_chain_put_by_act() -> __tcf_chain_put() -> mutex_lock(&block->lock);

Ah, this one is buggy anyway because we call xa_erase() inside the
loop, so it'll deadlock.

We could just drop the xa_lock() / xa_unlock() around the loop.  There's
no need to hold the lock unless we're trying to prevent other simultaneous
additions/deletions, which shouldn't be happening?  ie this:

@@ -268,8 +268,10 @@ static int tcf_idr_release_unsafe(struct tc_action *p)
        if (atomic_read(&p->tcfa_bindcnt) > 0)
                return -EPERM;
 
-       if (refcount_dec_and_test(&p->tcfa_refcnt)) {
-               xa_erase(&p->idrinfo->actions, p->tcfa_index);
+       if (refcount_dec_and_lock(&p->tcfa_refcnt,
+                                       &p->idrinfo->actions.xa_lock)) {
+               __xa_erase(&p->idrinfo->actions, p->tcfa_index);
+               xa_unlock(&p->idrinfo->actions);
                tcf_action_cleanup(p);
                return ACT_P_DELETED;
        }
@@ -292,18 +294,15 @@ static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
        if (nla_put_string(skb, TCA_KIND, ops->kind))
                goto nla_put_failure;
 
-       xa_lock(&idrinfo->actions);
        xa_for_each(&idrinfo->actions, index, p) {
                ret = tcf_idr_release_unsafe(p);
                if (ret == ACT_P_DELETED) {
                        module_put(ops->owner);
                        n_i++;
                } else if (ret < 0) {
-                       xa_unlock(&idrinfo->actions);
                        goto nla_put_failure;
                }
        }
-       xa_unlock(&idrinfo->actions);
 
        if (nla_put_u32(skb, TCA_FCNT, n_i))
                goto nla_put_failure;


^ permalink raw reply

* Re: [PATCH v5 10/17] net: sgi: ioc3-eth: rework skb rx handling
From: Jakub Kicinski @ 2019-08-21 20:37 UTC (permalink / raw)
  To: Thomas Bogendoerfer
  Cc: Jonathan Corbet, Ralf Baechle, Paul Burton, James Hogan,
	Dmitry Torokhov, Lee Jones, David S. Miller, Srinivas Kandagatla,
	Alessandro Zummo, Alexandre Belloni, Greg Kroah-Hartman,
	Jiri Slaby, Evgeniy Polyakov, linux-doc, linux-kernel, linux-mips,
	linux-input, netdev, linux-rtc, linux-serial,
	Jesper Dangaard Brouer
In-Reply-To: <20190821162847.479c9967d4dc8026fe65fa0e@suse.de>

On Wed, 21 Aug 2019 16:28:47 +0200, Thomas Bogendoerfer wrote:
> > This looks like a DMA engine alignment requirement, more than an
> > optimization.  
> 
> that true, there are two constraints for the rx buffers, start must be aligned
> to 128 bytes and a buffer must not cross a 16kbyte boundary. I was already
> thinking of allocating pages and chop them up. Is there a Linux API available,
> which could help for implementing this ?
> 
> I'll probably drop this patch or only change the skb_put stuff plus RX_BUF_SIZE
> define.

Sounds a little like frag allocator (napi_alloc_frag()/
netdev_alloc_frag()), but I'm not sure you'd have sufficient control
to skip over the 16k boundary.. Perhaps others have better suggestions.

^ permalink raw reply

* Re: [PATCH net-next] amd-xgbe: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-21 20:42 UTC (permalink / raw)
  To: yuehaibing; +Cc: thomas.lendacky, linux-kernel, netdev
In-Reply-To: <20190821123203.71404-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Wed, 21 Aug 2019 20:32:03 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] net: ethernet: ti: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-21 20:42 UTC (permalink / raw)
  To: yuehaibing
  Cc: grygorii.strashko, ivan.khoronzhuk, andrew, ynezz, linux-kernel,
	netdev, linux-omap
In-Reply-To: <20190821124850.9592-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Wed, 21 Aug 2019 20:48:50 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] via-rhine: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-21 20:42 UTC (permalink / raw)
  To: yuehaibing; +Cc: will, linux-kernel, netdev
In-Reply-To: <20190821125050.67652-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Wed, 21 Aug 2019 20:50:50 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] net: socionext: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-21 20:42 UTC (permalink / raw)
  To: yuehaibing; +Cc: hayashi.kunihiko, linux-kernel, netdev
In-Reply-To: <20190821125357.66676-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Wed, 21 Aug 2019 20:53:57 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] net: ks8851-ml: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-21 20:42 UTC (permalink / raw)
  To: yuehaibing; +Cc: ynezz, allison, lukas, linux-kernel, netdev
In-Reply-To: <20190821125811.70524-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Wed, 21 Aug 2019 20:58:11 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] net: sxgbe: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-21 20:43 UTC (permalink / raw)
  To: yuehaibing; +Cc: bh74.an, ks.giri, vipul.pandya, linux-kernel, netdev
In-Reply-To: <20190821125959.17728-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Wed, 21 Aug 2019 20:59:59 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] cirrus: cs89x0: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-21 20:43 UTC (permalink / raw)
  To: yuehaibing; +Cc: linux-kernel, netdev
In-Reply-To: <20190821130241.58276-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Wed, 21 Aug 2019 21:02:41 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] ezchip: nps_enet: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-21 20:43 UTC (permalink / raw)
  To: yuehaibing; +Cc: ynezz, tglx, gregkh, linux-kernel, netdev
In-Reply-To: <20190821130509.71916-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Wed, 21 Aug 2019 21:05:09 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH 0/1] pull request for net: batman-adv 2019-08-21
From: David Miller @ 2019-08-21 20:50 UTC (permalink / raw)
  To: sw; +Cc: netdev, b.a.t.m.a.n
In-Reply-To: <20190821133015.12778-1-sw@simonwunderlich.de>

From: Simon Wunderlich <sw@simonwunderlich.de>
Date: Wed, 21 Aug 2019 15:30:14 +0200

> here is a pull request with Erics bugfix from last week which we would
> like to have integrated into net. We didn't get anything else, so it's
> a short one this time. :)
> 
> Please pull or let me know of any problem!

Pulled, thanks Simon.

^ permalink raw reply

* Re: [PATCH net-next] net: fec: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-21 20:50 UTC (permalink / raw)
  To: yuehaibing; +Cc: fugang.duan, linux-kernel, netdev
In-Reply-To: <20190821132945.19648-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Wed, 21 Aug 2019 21:29:45 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] net: mvneta: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-21 20:52 UTC (permalink / raw)
  To: yuehaibing; +Cc: bigeasy, linux-kernel, netdev
In-Reply-To: <20190821133302.72880-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Wed, 21 Aug 2019 21:33:02 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] pxa168_eth: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-21 20:52 UTC (permalink / raw)
  To: yuehaibing; +Cc: andrew, mcgrof, tglx, ynezz, linux-kernel, netdev
In-Reply-To: <20190821133854.4308-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Wed, 21 Aug 2019 21:38:54 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] net: bcmgenet: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-21 20:52 UTC (permalink / raw)
  To: yuehaibing
  Cc: opendmb, f.fainelli, bcm-kernel-feedback-list, linux-kernel,
	netdev
In-Reply-To: <20190821134131.57780-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Wed, 21 Aug 2019 21:41:31 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] net: systemport: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-21 20:53 UTC (permalink / raw)
  To: yuehaibing
  Cc: opendmb, f.fainelli, bcm-kernel-feedback-list, linux-kernel,
	netdev
In-Reply-To: <20190821134613.23276-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Wed, 21 Aug 2019 21:46:13 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] net: stmmac: dwmac-meson8b: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-21 20:53 UTC (permalink / raw)
  To: yuehaibing
  Cc: peppe.cavallaro, alexandre.torgue, joabreu, khilman,
	mcoquelin.stm32, linux-kernel, netdev, linux-arm-kernel,
	linux-amlogic, linux-stm32
In-Reply-To: <20190821135130.68636-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Wed, 21 Aug 2019 21:51:30 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox