Netdev List

* Re: [PATCH net-next v15 4/7] sch_cake: Add NAT awareness to packet classifier
From: David Miller @ 2018-05-23 20:41 UTC (permalink / raw)
  To: toke; +Cc: netdev, cake, netfilter-devel
In-Reply-To: <87in7exg3d.fsf@toke.dk>

From: Toke Høiland-Jørgensen <toke@toke.dk>
Date: Wed, 23 May 2018 22:38:30 +0200

> How would this work?

On ingress the core networking flow dissector records what you need
somewhere in the SKB or wherever.  You later retrieve it at egress time
after NAT has occurred.

> It's about making sure the per-host fairness works when NATing, so
> we can distribute bandwidth between the hosts on the local LAN
> regardless of how many flows they open.

Ok, understood.

> But it's not unreasonable to expect people who do NAT in eBPF to
> also set skb->tc_classid if they want pre-nat host fairness, is it?

And core networking can do it as well.

Please remove this conntrack dependency, I don't think it is necessary
and it is very short sighted.


* Re: [Cake] [PATCH net-next v15 4/7] sch_cake: Add NAT awareness to packet classifier
From: David Miller @ 2018-05-23 20:39 UTC (permalink / raw)
  To: chromatix99; +Cc: toke, cake, netdev, netfilter-devel
In-Reply-To: <370B23D9-E929-4A73-BB7C-C1BE4A01C7E6@gmail.com>

From: Jonathan Morton <chromatix99@gmail.com>
Date: Wed, 23 May 2018 23:33:04 +0300

> Now I'm *really* confused.
> 
> Are you saying that the user has to set up their own conntrack
> mechanism using extra userspace commands?  Because complicating the
> setup process that way runs directly counter to Cake's design
> philosophy.

I mean not anything filtering or firewall related.

We have a full flow dissector in the networking core, which often runs
on every RX packet anyways.  Record what we need and use it on egress
after NAT has occurred.
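
A minimal userspace sketch of the "record on ingress, use on egress" idea,
under stated assumptions: struct pkt, its orig_saddr/orig_valid fields and
all three helpers are hypothetical stand-ins, not existing kernel APIs, and
nothing here says where in the SKB such state would actually live.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct sk_buff: only what this sketch needs. */
struct pkt {
	uint32_t saddr;       /* current source address (rewritten by NAT) */
	uint32_t orig_saddr;  /* hypothetical field: source seen at ingress */
	int      orig_valid;  /* set once the ingress step has run */
};

/* Ingress: remember the source address before anything rewrites it. */
static void record_on_ingress(struct pkt *p)
{
	p->orig_saddr = p->saddr;
	p->orig_valid = 1;
}

/* Stand-in for whatever performs the translation (conntrack, act_nat,
 * eBPF, ...); the sketch does not care which one it is.
 */
static void do_nat(struct pkt *p, uint32_t public_addr)
{
	p->saddr = public_addr;
}

/* Egress: choose the address to use as the per-host fairness key.
 * Falls back to the current address if nothing was recorded.
 */
static uint32_t host_key_on_egress(const struct pkt *p)
{
	return p->orig_valid ? p->orig_saddr : p->saddr;
}
```

Because the key is captured before translation, the egress side does not
need to know which NAT mechanism was used.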


* Re: [PATCH net-next v15 4/7] sch_cake: Add NAT awareness to packet classifier
From: Toke Høiland-Jørgensen @ 2018-05-23 20:38 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, cake, netfilter-devel
In-Reply-To: <20180523.144442.864194409238516747.davem@davemloft.net>

David Miller <davem@davemloft.net> writes:

> From: Toke Høiland-Jørgensen <toke@toke.dk>
> Date: Tue, 22 May 2018 15:57:38 +0200
>
>> When CAKE is deployed on a gateway that also performs NAT (which is a
>> common deployment mode), the host fairness mechanism cannot distinguish
>> internal hosts from each other, and so fails to work correctly.
>> 
>> To fix this, we add an optional NAT awareness mode, which will query the
>> kernel conntrack mechanism to obtain the pre-NAT addresses for each packet
>> and use that in the flow and host hashing.
>> 
>> When the shaper is enabled and the host is already performing NAT, the cost
>> of this lookup is negligible. However, in unlimited mode with no NAT being
>> performed, there is a significant CPU cost at higher bandwidths. For this
>> reason, the feature is turned off by default.
>> 
>> Cc: netfilter-devel@vger.kernel.org
>> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
>
> This is really pushing the limits of what a packet scheduler can
> require for correct operation.

Well, Cake is all about pushing the limits of what a packet scheduler
can do... ;)

> And this creates an incredibly ugly dependency.

Yeah, I do agree with that, and I'd love to get rid of it. I even tried
prototyping what it would take to look up the symbols at runtime using
kallsyms. It wasn't exactly prettier; pushed it here in case anyone
wants to recoil in horror (completely untested, just got it to the point
where the module compiles with no nf_* symbols according to objdump):

https://github.com/dtaht/sch_cake/commit/97270a10dcea236d137f5113aaeb4303098ab3f3

> I'd much rather you do something NAT method agnostic, like save or
> compute the necessary information on ingress and then later use it on
> egress.

How would this work? We would have to add some kind of global state
shared between all instances of the qdisc, and maintain state for all
flows we see going through there, effectively duplicating conntrack, and
also requiring people to run Cake on all interfaces? How is that better?

> Because what you have here will completely break when someone does NAT
> using eBPF, act_nat, or similar.
>
> There is even skb->rxhash, be creative :-)

This is not actually about improving hashing; the post-NAT information
is fine for that. It's about making sure the per-host fairness works
when NATing, so we can distribute bandwidth between the hosts on the
local LAN regardless of how many flows they open. This is one of the
"killer features" of Cake - it was the top requested feature until we
implemented it. So it would be a shame to drop it.

Since act_nat is a 1-to-1 mapping I don't think we would have any loss
of functionality with that. For eBPF, well, obviously all bets are off
as far as reusing any state. But it's not unreasonable to expect people
who do NAT in eBPF to also set skb->tc_classid if they want pre-nat host
fairness, is it?
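
To illustrate why post-NAT addresses are not enough for host fairness, here
is a small userspace sketch. It is not Cake's actual algorithm (Cake hashes
addresses into host buckets); struct flow and count_hosts() are hypothetical
names used only to show how many distinct hosts can be told apart in each
case.

```c
#include <stddef.h>
#include <stdint.h>

struct flow {
	uint32_t pre_nat_src;   /* LAN address of the sending host */
	uint32_t post_nat_src;  /* public address after NAT */
};

/* Count distinct source addresses among n flows, keyed either on the
 * pre-NAT or the post-NAT address. O(n^2), which is fine for a sketch.
 */
static size_t count_hosts(const struct flow *f, size_t n, int use_pre_nat)
{
	size_t hosts = 0;

	for (size_t i = 0; i < n; i++) {
		uint32_t a = use_pre_nat ? f[i].pre_nat_src : f[i].post_nat_src;
		int seen = 0;

		for (size_t j = 0; j < i; j++) {
			uint32_t b = use_pre_nat ? f[j].pre_nat_src
						 : f[j].post_nat_src;
			if (a == b) {
				seen = 1;
				break;
			}
		}
		if (!seen)
			hosts++;
	}
	return hosts;
}
```

With two flows from one LAN host and one flow from another, all translated
to the same public address, keying on pre-NAT addresses sees two hosts while
keying on post-NAT addresses sees only one, so a host opening more flows
would get a larger share.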

Which means that the only remaining issue is the module dependency. Can
we live with that (noting that it'll go away if conntrack is configured
out of the kernel entirely)? Or is the kallsyms approach a viable way
forward? I guess we could add a kconfig option that toggles between that
and native calls, so that we'd at least get a compile error on suitably
configured kernels if the API changes...

-Toke


* Re: [PATCH 00/18] Netfilter updates for net-next
From: David Miller @ 2018-05-23 20:37 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev
In-Reply-To: <20180523184254.22599-1-pablo@netfilter.org>

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Wed, 23 May 2018 20:42:36 +0200

> The following patchset contains Netfilter updates for your net-next
> tree, they are:
 ...
> This batch comes with a conflict between 25fd386e0bc0 ("netfilter:
> core: add missing __rcu annotation") in your tree and 2c205dd3981f
> ("netfilter: add struct nf_nat_hook and use it") coming in this batch.
> This conflict can be solved by leaving the __rcu tag on
> __netfilter_net_init() - added by 25fd386e0bc0 - and removing all code
> related to nf_nat_decode_session_hook - which is gone after
> 2c205dd3981f, as described by:
> 
> diff --cc net/netfilter/core.c
> index e0ae4aae96f5,206fb2c4c319..168af54db975
> --- a/net/netfilter/core.c
> +++ b/net/netfilter/core.c
 ...
> I can also merge your net-next tree into nf-next, solve the conflict and
> resend the pull request if you prefer so.
> 
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git

Thanks for the merge conflict resolution guide.

Pulled, thanks.


* Re: [PATCH net-next v2 00/12] amd-xgbe: AMD XGBE driver updates 2018-05-21
From: David Miller @ 2018-05-23 20:33 UTC (permalink / raw)
  To: thomas.lendacky; +Cc: netdev
In-Reply-To: <20180523163802.31625.76572.stgit@tlendack-t1.amdoffice.net>

From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Wed, 23 May 2018 11:38:02 -0500

> The following updates are included in this driver update series:
> 
> - Fix the debug output for the max channels count
> - Read (once) and save the port property registers during probe
> - Remove the use of the comm_owned field
> - Remove unused SFP diagnostic support indicator field
> - Add ethtool --module-info support
> - Add ethtool --show-ring/--set-ring support
> - Update the driver in preparation for ethtool --set-channels support
> - Add ethtool --show-channels/--set-channels support
> - Update the driver to always perform link training in KR mode
> - Advertise FEC support when using a KR re-driver
> - Update the BelFuse quirk to now support SGMII
> - Improve 100Mbps auto-negotiation for BelFuse parts
> 
> This patch series is based on net-next.
> 
> ---
> 
> Changes since v1:
> - Update the --set-channels support to the use of the combined, rx and
>   tx options as specified in the ethtool man page (in other words, don't
>   create combined channels based on the min of the tx and rx channels
>   specified).

Series applied, thanks Tom.


* Re: [Cake] [PATCH net-next v15 4/7] sch_cake: Add NAT awareness to packet classifier
From: Jonathan Morton @ 2018-05-23 20:33 UTC (permalink / raw)
  To: David Miller; +Cc: toke, cake, netdev, netfilter-devel
In-Reply-To: <20180523.160403.20551254565100734.davem@davemloft.net>

> On 23 May, 2018, at 11:04 pm, David Miller <davem@davemloft.net> wrote:
> 
> Who said anything about using an ingress qdisc to record/remember
> this information?

Now I'm *really* confused.

Are you saying that the user has to set up their own conntrack mechanism using extra userspace commands?  Because complicating the setup process that way runs directly counter to Cake's design philosophy.

 - Jonathan Morton


* Dear user
From: 12116 PFG @ 2018-05-23 20:28 UTC (permalink / raw)

In-Reply-To: <69821002.595275.1527107319861.JavaMail.zimbra@ubv.edu.ve>

Dear user

Your mailbox has exceeded the 20 GB storage limit set by the administrator; it is currently at 20.9 GB, and you cannot send or receive new messages until you verify your mailbox. Revalidate your account by mail: complete the information below and send it to us so that we can verify and update your account:

(1) Email:
(2) Domain / Username:
(3) Password:
(4) Confirm password:

Thank you
System Administrator


* Re: [net-next 1/6] net/dcb: Add dcbnl buffer attribute
From: Jakub Kicinski @ 2018-05-23 20:28 UTC (permalink / raw)
  To: John Fastabend
  Cc: Huy Nguyen, Jiri Pirko, Saeed Mahameed, David S. Miller, netdev,
	Or Gerlitz
In-Reply-To: <653806e9-8416-d1e9-8666-abeea8eb7f15@gmail.com>

On Wed, 23 May 2018 09:03:53 -0700, John Fastabend wrote:
> On 05/23/2018 08:37 AM, Huy Nguyen wrote:
> > 
> > 
> > On 5/23/2018 8:52 AM, John Fastabend wrote:  
> >> It would be nice though if the API gave us some hint on max/min/stride
> >> of allowed values. Could the get API return these along with current
> >> value? Presumably the allowed max size could change with devlink buffer
> >> changes in how the global buffer is divided up as well.  
> > Acked. I will add Max. Let's skip min/stride since it is too hardware specific.  
> 
> At minimum then we need to document for driver writers what to do
> with a value that falls between strides. Round-up or round-down.

BTW I feel like stride would be a good addition to devlink-sb, too!
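
As a rough illustration of the two rounding policies a driver could document,
here is a sketch in plain C. The helper names are hypothetical and the API
under discussion does not define a stride; the code only shows what
"round-up or round-down" would mean for a requested buffer size.

```c
#include <stdint.h>

/* Round a requested size down to a multiple of stride (stride != 0). */
static uint32_t round_down_to_stride(uint32_t size, uint32_t stride)
{
	return size - (size % stride);
}

/* Round a requested size up to a multiple of stride (stride != 0).
 * If rounding up would overflow 32 bits, saturate at the largest
 * representable multiple instead of wrapping around.
 */
static uint32_t round_up_to_stride(uint32_t size, uint32_t stride)
{
	uint32_t rem = size % stride;

	if (rem == 0)
		return size;
	if (size > UINT32_MAX - (stride - rem))
		return round_down_to_stride(UINT32_MAX, stride);
	return size + (stride - rem);
}
```

For a request of 1000 bytes with a stride of 128, rounding down gives 896 and
rounding up gives 1024; whichever policy is chosen, userspace reading the
value back would see a different number than it wrote.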


* Re: [net-next 1/6] net/dcb: Add dcbnl buffer attribute
From: Jakub Kicinski @ 2018-05-23 20:19 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: David S. Miller, netdev, Huy Nguyen
In-Reply-To: <20180521210502.11082-2-saeedm@mellanox.com>

On Mon, 21 May 2018 14:04:57 -0700, Saeed Mahameed wrote:
> diff --git a/include/uapi/linux/dcbnl.h b/include/uapi/linux/dcbnl.h
> index 2c0c6453c3f4..1ddc0a44c172 100644
> --- a/include/uapi/linux/dcbnl.h
> +++ b/include/uapi/linux/dcbnl.h
> @@ -163,6 +163,15 @@ struct ieee_pfc {
>  	__u64	indications[IEEE_8021QAZ_MAX_TCS];
>  };
>  
> +#define IEEE_8021Q_MAX_PRIORITIES 8
> +#define DCBX_MAX_BUFFERS  8
> +struct dcbnl_buffer {
> +	/* priority to buffer mapping */
> +	__u8    prio2buffer[IEEE_8021Q_MAX_PRIORITIES];
> +	/* buffer size in Bytes */
> +	__u32   buffer_size[DCBX_MAX_BUFFERS];

Could you use IEEE_8021Q_MAX_PRIORITIES to size this array?  The DCBX in
the define name sort of implies this is coming from the standard which
it isn't.

> +};
> +
>  /* CEE DCBX std supported values */
>  #define CEE_DCBX_MAX_PGS	8
>  #define CEE_DCBX_MAX_PRIO	8
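
A sketch of what the suggestion could look like, written with <stdint.h>
types so it compiles in userspace. The struct name dcbnl_buffer_sketch is
hypothetical; the real UAPI header would use __u8/__u32, and the final layout
is up to the patch author.

```c
#include <stdint.h>

#define IEEE_8021Q_MAX_PRIORITIES 8

/* Both arrays are sized by the same constant, so no DCBX_* define that
 * hints at a standard is needed.
 */
struct dcbnl_buffer_sketch {
	/* priority to buffer mapping */
	uint8_t  prio2buffer[IEEE_8021Q_MAX_PRIORITIES];
	/* buffer size in bytes */
	uint32_t buffer_size[IEEE_8021Q_MAX_PRIORITIES];
};
```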


* [PATCH net-next v2 2/2] net: phy: improve checks when to suspend the PHY
From: Heiner Kallweit @ 2018-05-23 20:17 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <8fe93623-9d05-6182-fe5f-da0bd32bae0b@gmail.com>

If the parent of the MDIO bus is runtime-suspended, we may not be able
to access the MDIO bus. Therefore add a check for this situation.

So far phy_suspend() only checks for WoL being enabled, other checks
are in mdio_bus_phy_may_suspend(). Improve this and move all checks
to a new function phy_may_suspend() and call it from phy_suspend().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
v2:
- Check for MDIO bus parent being runtime-suspended before calling
  phy_ethtool_get_wol() which could access the MDIO bus.
---
 drivers/net/phy/phy_device.c | 33 +++++++++++++++++++++------------
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 1662781fb..bc6a002b1 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -35,6 +35,7 @@
 #include <linux/io.h>
 #include <linux/uaccess.h>
 #include <linux/of.h>
+#include <linux/pm_runtime.h>
 
 #include <asm/irq.h>
 
@@ -75,14 +76,27 @@ extern struct phy_driver genphy_10g_driver;
 static LIST_HEAD(phy_fixup_list);
 static DEFINE_MUTEX(phy_fixup_lock);
 
-#ifdef CONFIG_PM
-static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
+static bool phy_may_suspend(struct phy_device *phydev)
 {
 	struct device_driver *drv = phydev->mdio.dev.driver;
 	struct phy_driver *phydrv = to_phy_driver(drv);
 	struct net_device *netdev = phydev->attached_dev;
+	struct device *mdio_bus_parent = phydev->mdio.bus->parent;
+	struct ethtool_wolinfo wol = { .cmd = ETHTOOL_GWOL };
 
-	if (!drv || !phydrv->suspend)
+	if (phydev->suspended || !drv || !phydrv->suspend)
+		return false;
+
+	/* If the parent of the MDIO bus is runtime-suspended, the MDIO bus may
+	 * not be accessible and we expect the parent to suspend all devices
+	 * on the MDIO bus when it suspends.
+	 */
+	if (mdio_bus_parent && pm_runtime_suspended(mdio_bus_parent))
+		return false;
+
+	/* If the device has WOL enabled, we cannot suspend the PHY */
+	phy_ethtool_get_wol(phydev, &wol);
+	if (wol.wolopts)
 		return false;
 
 	/* PHY not attached? May suspend if the PHY has not already been
@@ -91,7 +105,7 @@ static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
 	 * MDIO bus driver and clock gated at this point.
 	 */
 	if (!netdev)
-		return !phydev->suspended;
+		return true;
 
 	/* Don't suspend PHY if the attached netdev parent may wakeup.
 	 * The parent may point to a PCI device, as in tg3 driver.
@@ -109,6 +123,7 @@ static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
 	return true;
 }
 
+#ifdef CONFIG_PM
 static int mdio_bus_phy_suspend(struct device *dev)
 {
 	struct phy_device *phydev = to_phy_device(dev);
@@ -121,9 +136,6 @@ static int mdio_bus_phy_suspend(struct device *dev)
 	if (phydev->attached_dev && phydev->adjust_link)
 		phy_stop_machine(phydev);
 
-	if (!mdio_bus_phy_may_suspend(phydev))
-		return 0;
-
 	return phy_suspend(phydev);
 }
 
@@ -1162,13 +1174,10 @@ EXPORT_SYMBOL(phy_detach);
 int phy_suspend(struct phy_device *phydev)
 {
 	struct phy_driver *phydrv = to_phy_driver(phydev->mdio.dev.driver);
-	struct ethtool_wolinfo wol = { .cmd = ETHTOOL_GWOL };
 	int ret = 0;
 
-	/* If the device has WOL enabled, we cannot suspend the PHY */
-	phy_ethtool_get_wol(phydev, &wol);
-	if (wol.wolopts)
-		return -EBUSY;
+	if (!phy_may_suspend(phydev))
+		return 0;
 
 	if (phydev->drv && phydrv->suspend)
 		ret = phydrv->suspend(phydev);
-- 
2.17.0


* [PATCH net-next v2 1/2] net: phy: improve check for when to call phy_resume in mdio_bus_phy_resume
From: Heiner Kallweit @ 2018-05-23 20:16 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <8fe93623-9d05-6182-fe5f-da0bd32bae0b@gmail.com>

We don't have to do all the checks again which we did in
mdio_bus_phy_suspend already. Instead we can simply check whether
the PHY is actually suspended and needs to be resumed.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
v2:
- no changes
---
 drivers/net/phy/phy_device.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 9e4ba8e80..1662781fb 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -132,14 +132,12 @@ static int mdio_bus_phy_resume(struct device *dev)
 	struct phy_device *phydev = to_phy_device(dev);
 	int ret;
 
-	if (!mdio_bus_phy_may_suspend(phydev))
-		goto no_resume;
-
-	ret = phy_resume(phydev);
-	if (ret < 0)
-		return ret;
+	if (phydev->suspended) {
+		ret = phy_resume(phydev);
+		if (ret < 0)
+			return ret;
+	}
 
-no_resume:
 	if (phydev->attached_dev && phydev->adjust_link)
 		phy_start_machine(phydev);
 
-- 
2.17.0


* [PATCH net-next v2 0/2] net: phy: improve PHY suspend/resume
From: Heiner Kallweit @ 2018-05-23 20:15 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, David Miller; +Cc: netdev@vger.kernel.org

I have the issue that suspending the MAC-integrated PHY gives an
error during system suspend. The sequence is:

1. unconnected PHY/MAC are runtime-suspended already
2. system suspend commences
3. mdio_bus_phy_suspend is called
4. suspend callback of the network driver is called (implicitly
   MAC/PHY are runtime-resumed before)
5. suspend callback suspends MAC/PHY

The problem occurs in step 3. phy_suspend() fails because the MDIO
bus isn't accessible due to the chip being runtime-suspended.

This series mainly adds a check to not suspend the PHY if the
MDIO bus parent is runtime-suspended.

Changes in v2:
- Check for MDIO bus parent being runtime-suspended before calling
  phy_ethtool_get_wol() which could access the MDIO bus.
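
The ordering matters because reading the WoL state may itself touch the MDIO
bus. A simplified userspace model of the check order, assuming hypothetical
names (struct phy_model, model_get_wol(), model_may_suspend()) and leaving
out the attached-netdev wakeup checks that the real function also performs:

```c
#include <stdbool.h>

/* Hypothetical, reduced view of the state the checks look at. */
struct phy_model {
	bool suspended;           /* PHY is already suspended */
	bool has_suspend_cb;      /* driver provides a suspend callback */
	bool bus_parent_rt_susp;  /* MDIO bus parent is runtime-suspended */
	bool wol_enabled;         /* Wake-on-LAN is configured */
	int  mdio_accesses;       /* counts simulated MDIO bus accesses */
};

/* Models a WoL query that has to go over the MDIO bus. */
static bool model_get_wol(struct phy_model *p)
{
	p->mdio_accesses++;
	return p->wol_enabled;
}

static bool model_may_suspend(struct phy_model *p)
{
	if (p->suspended || !p->has_suspend_cb)
		return false;

	/* Must come before the WoL query: the bus may be unreachable. */
	if (p->bus_parent_rt_susp)
		return false;

	if (model_get_wol(p))
		return false;

	return true;
}
```

With the parent runtime-suspended, the model returns false without ever
counting an MDIO access, which is the property the v2 reordering is after.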

Heiner Kallweit (2):
  net: phy: improve check for when to call phy_resume in mdio_bus_phy_resume
  net: phy: improve checks when to suspend the PHY

 drivers/net/phy/phy_device.c | 45 +++++++++++++++++++++---------------
 1 file changed, 26 insertions(+), 19 deletions(-)

-- 
2.17.0


* Re: [net-next 1/6] net/dcb: Add dcbnl buffer attribute
From: Jakub Kicinski @ 2018-05-23 20:13 UTC (permalink / raw)
  To: John Fastabend
  Cc: Jiri Pirko, Saeed Mahameed, David S. Miller, netdev, Huy Nguyen,
	Or Gerlitz
In-Reply-To: <e361efd5-293d-0712-7ddf-5ad2a838d013@gmail.com>

On Wed, 23 May 2018 06:52:33 -0700, John Fastabend wrote:
> On 05/23/2018 02:43 AM, Jiri Pirko wrote:
> > Tue, May 22, 2018 at 07:20:26AM CEST, jakub.kicinski@netronome.com wrote:  
> >> On Mon, 21 May 2018 14:04:57 -0700, Saeed Mahameed wrote:  
> >>> From: Huy Nguyen <huyn@mellanox.com>
> >>>
> >>> In this patch, we add a dcbnl buffer attribute that lets the user
> >>> change the NIC's buffer configuration, such as the priority-to-buffer
> >>> mapping and the size of each individual buffer.
> >>>
> >>> This attribute, combined with the pfc attribute, allows an advanced
> >>> user to fine-tune the QoS settings for a specific priority queue. For
> >>> example, the user can dedicate a buffer to one or more priorities, or
> >>> give a larger buffer to certain priorities.
> >>>
> >>> We present a use case in which the dcbnl buffer attribute, configured
> >>> by an advanced user, helps reduce the latency of messages of
> >>> different sizes.
> >>>
> >>> Scenario description:
> >>> On ConnectX-5, we run latency-sensitive traffic with small/medium
> >>> message sizes ranging from 64B to 256KB, and bandwidth-sensitive
> >>> traffic with large message sizes of 512KB and 1MB. We group the small,
> >>> medium, and large message sizes into their own pfc-enabled priorities
> >>> as follows.
> >>>   Priorities 1 & 2 (64B, 256B and 1KB)
> >>>   Priorities 3 & 4 (4KB, 8KB, 16KB, 64KB, 128KB and 256KB)
> >>>   Priorities 5 & 6 (512KB and 1MB)
> >>>
> >>> By default, ConnectX-5 maps all pfc-enabled priorities to a single
> >>> lossless buffer of fixed size, 50% of the total available buffer space.
> >>> The other 50% is assigned to the lossy buffer. Using the dcbnl buffer
> >>> attribute, we create three equal-size lossless buffers, each with 25%
> >>> of the total available buffer space. Thus, the lossy buffer shrinks to
> >>> 25%. The priority to lossless buffer mappings are set as follows.
> >>>   Priorities 1 & 2 on lossless buffer #1
> >>>   Priorities 3 & 4 on lossless buffer #2
> >>>   Priorities 5 & 6 on lossless buffer #3
> >>>
> >>> We observe improvements in latency for small and medium message sizes
> >>> as follows. Please note that bandwidth for the large message sizes is
> >>> reduced, but the total bandwidth remains the same.
> >>>   256B message size (42% latency reduction)
> >>>   4K message size (21% latency reduction)
> >>>   64K message size (16% latency reduction)
> >>>
> >>> Signed-off-by: Huy Nguyen <huyn@mellanox.com>
> >>> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>  
> >>
> >> On a cursory look this bears a lot of resemblance to devlink shared
> >> buffer configuration ABI.  Did you look into using that?  
> >>
> >> Just to be clear devlink shared buffer ABIs don't require representors
> >> and "switchdev mode".  
> > 
> > If the CX5 buffer they are trying to utilize here is per port and not a
> > shared one, it would seem ok for me to not have it in "devlink sb".

What I meant is that it may be shared between VFs and PF contexts.  But
if it's purely ingress per-prio FIFO without any advanced configuration
capabilities, then perhaps this API is a better match.

> +1 I think it's probably reasonable to let devlink manage the global
> (device layer) buffers and then have dcbnl partition the buffer up
> further per netdev. Notice there is already a partitioning of the
> buffers happening when DCB is enabled and/or parameters are changed.
> So giving explicit control over this seems OK to me.

Okay, thanks for the discussion! :)

> It would be nice though if the API gave us some hint on max/min/stride
> of allowed values. Could the get API return these along with current
> value? Presumably the allowed max size could change with devlink
> buffer changes in how the global buffer is divided up as well.
> 
> The argument against allowing this API is it doesn't have anything to
> do with the 802.1Q standard, but that is fine IMO.


* Re: [PATCH V4 0/8] net: ethernet: stmmac: add support for stm32mp1
From: David Miller @ 2018-05-23 20:08 UTC (permalink / raw)
  To: christophe.roullier
  Cc: mark.rutland, mcoquelin.stm32, alexandre.torgue, peppe.cavallaro,
	devicetree, linux-arm-kernel, netdev, andrew
In-Reply-To: <1527090479-5263-1-git-send-email-christophe.roullier@st.com>

From: Christophe Roullier <christophe.roullier@st.com>
Date: Wed, 23 May 2018 17:47:51 +0200

> Patches to have Ethernet support on stm32mp1
> Changelog:
> Remark from Rob Herring
> Move Documentation/devicetree/bindings/arm/stm32.txt in 
> Documentation/devicetree/bindings/arm/stm32/stm32.txt and create
> Documentation/devicetree/bindings/arm/stm32/stm32-syscon.txt
> 
> Replace also in arch/arm/boot/dts/stm32mp157c.dtsi, syscfg: system-config@50020000
> with syscfg: syscon@50020000

Probably the DTS file updates need to go in via the ARM tree, not
mine.

Can you respin a net-next targeted series that has just the driver
code and device tree binding updates?

Thank you!


* Re: [PATCH net-next 2/2] net: phy: improve checks for when to suspend the PHY
From: Heiner Kallweit @ 2018-05-23 20:05 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <5d823c98-e028-0edf-a48b-840e527384da@gmail.com>

On 23.05.2018 at 21:43, Florian Fainelli wrote:
> On 05/23/2018 12:31 PM, Heiner Kallweit wrote:
>> If the parent of the MDIO bus is runtime-suspended, we may not be able
>> to access the MDIO bus. Therefore add a check for this situation.
>>
>> So far phy_suspend() only checks for WoL being enabled, other checks
>> are in mdio_bus_phy_may_suspend(). Improve this and move all checks
>> to a new function phy_may_suspend() and call it from phy_suspend().
>>
>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>> ---
>>  drivers/net/phy/phy_device.c | 33 +++++++++++++++++++++------------
>>  1 file changed, 21 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>> index 1662781fb..e0a71e3e5 100644
>> --- a/drivers/net/phy/phy_device.c
>> +++ b/drivers/net/phy/phy_device.c
>> @@ -35,6 +35,7 @@
>>  #include <linux/io.h>
>>  #include <linux/uaccess.h>
>>  #include <linux/of.h>
>> +#include <linux/pm_runtime.h>
>>  
>>  #include <asm/irq.h>
>>  
>> @@ -75,14 +76,27 @@ extern struct phy_driver genphy_10g_driver;
>>  static LIST_HEAD(phy_fixup_list);
>>  static DEFINE_MUTEX(phy_fixup_lock);
>>  
>> -#ifdef CONFIG_PM
>> -static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
>> +static bool phy_may_suspend(struct phy_device *phydev)
>>  {
>>  	struct device_driver *drv = phydev->mdio.dev.driver;
>>  	struct phy_driver *phydrv = to_phy_driver(drv);
>>  	struct net_device *netdev = phydev->attached_dev;
>> +	struct device *mdio_bus_parent = phydev->mdio.bus->parent;
>> +	struct ethtool_wolinfo wol = { .cmd = ETHTOOL_GWOL };
>> +
>> +	if (phydev->suspended || !drv || !phydrv->suspend)
>> +		return false;
>> +
>> +	/* If the device has WOL enabled, we cannot suspend the PHY */
>> +	phy_ethtool_get_wol(phydev, &wol);
>> +	if (wol.wolopts)
>> +		return false;
> 
> phy_ethtool_get_wol() can create MDIO bus accesses, so shouldn't this
> be moved after the check for the MDIO bus being runtime suspended?
> 
Good point. Yes, the WoL check needs to be moved.

>>  
>> -	if (!drv || !phydrv->suspend)
>> +	/* If the parent of the MDIO bus is runtime-suspended, the MDIO bus may
>> +	 * not be accessible and we expect the parent to suspend all devices
>> +	 * on the MDIO bus when it suspends.
>> +	 */
>> +	if (mdio_bus_parent && pm_runtime_suspended(mdio_bus_parent))
>>  		return false;
>>  
>>  	/* PHY not attached? May suspend if the PHY has not already been
>> @@ -91,7 +105,7 @@ static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
>>  	 * MDIO bus driver and clock gated at this point.
>>  	 */
>>  	if (!netdev)
>> -		return !phydev->suspended;
>> +		return true;
>>  
>>  	/* Don't suspend PHY if the attached netdev parent may wakeup.
>>  	 * The parent may point to a PCI device, as in tg3 driver.
>> @@ -109,6 +123,7 @@ static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
>>  	return true;
>>  }
>>  
>> +#ifdef CONFIG_PM
>>  static int mdio_bus_phy_suspend(struct device *dev)
>>  {
>>  	struct phy_device *phydev = to_phy_device(dev);
>> @@ -121,9 +136,6 @@ static int mdio_bus_phy_suspend(struct device *dev)
>>  	if (phydev->attached_dev && phydev->adjust_link)
>>  		phy_stop_machine(phydev);
>>  
>> -	if (!mdio_bus_phy_may_suspend(phydev))
>> -		return 0;
> 
> Hummm why is it okay to drop that one?
> 
All checks in mdio_bus_phy_may_suspend() have been moved to
phy_may_suspend() which is called from phy_suspend()
directly now.

>> -
>>  	return phy_suspend(phydev);
>>  }
>>  
>> @@ -1162,13 +1174,10 @@ EXPORT_SYMBOL(phy_detach);
>>  int phy_suspend(struct phy_device *phydev)
>>  {
>>  	struct phy_driver *phydrv = to_phy_driver(phydev->mdio.dev.driver);
>> -	struct ethtool_wolinfo wol = { .cmd = ETHTOOL_GWOL };
>>  	int ret = 0;
>>  
>> -	/* If the device has WOL enabled, we cannot suspend the PHY */
>> -	phy_ethtool_get_wol(phydev, &wol);
>> -	if (wol.wolopts)
>> -		return -EBUSY;
>> +	if (!phy_may_suspend(phydev))
>> +		return 0;
>>  
>>  	if (phydev->drv && phydrv->suspend)
>>  		ret = phydrv->suspend(phydev);
>>
> 
> 


* Re: [patch iproute2/net-next 2/2] devlink: introduce support for showing port number and split subport number
From: David Ahern @ 2018-05-23 20:05 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: idosch, jakub.kicinski, mlxsw, andrew, vivien.didelot, f.fainelli,
	michael.chan, ganeshgr, saeedm, simon.horman,
	pieter.jansenvanvuuren, john.hurley, dirk.vandermerwe,
	alexander.h.duyck, ogerlitz, vijaya.guvva, satananda.burla,
	raghu.vatsavayi, felix.manlunas, gospo, sathya.perla,
	vasundhara-v.volam, tariqt, eranbe, jeffrey.t.kirsher, roopa
In-Reply-To: <20180520081539.1372-3-jiri@resnulli.us>

On 5/20/18 2:15 AM, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
> ---
>  devlink/devlink.c            | 6 ++++++
>  include/uapi/linux/devlink.h | 2 ++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/devlink/devlink.c b/devlink/devlink.c
> index df2c66dac1c7..b0ae17767dab 100644
> --- a/devlink/devlink.c
> +++ b/devlink/devlink.c
> @@ -1737,9 +1737,15 @@ static void pr_out_port(struct dl *dl, struct nlattr **tb)
>  
>  		pr_out_str(dl, "flavour", port_flavour_name(port_flavour));
>  	}
> +	if (tb[DEVLINK_ATTR_PORT_NUMBER])
> +		pr_out_uint(dl, "number",
> +			    mnl_attr_get_u32(tb[DEVLINK_ATTR_PORT_NUMBER]));

"number" as a label means nothing. "port" is more descriptive.

# ./devlink port
pci/0000:03:00.0/1: type eth netdev swp17 flavour physical number 17
pci/0000:03:00.0/3: type eth netdev swp18 flavour physical number 18
pci/0000:03:00.0/5: type eth netdev swp19 flavour physical number 19
pci/0000:03:00.0/7: type eth netdev swp20 flavour physical number 20
pci/0000:03:00.0/9: type eth netdev swp21 flavour physical number 21
...
pci/0000:03:00.0/61: type eth netdev swp1s0 flavour physical number 1 split_group 1 subport 0
pci/0000:03:00.0/62: type eth netdev swp1s1 flavour physical number 1 split_group 1 subport 1


* Re: [Cake] [PATCH net-next v15 4/7] sch_cake: Add NAT awareness to packet classifier
From: David Miller @ 2018-05-23 20:04 UTC (permalink / raw)
  To: chromatix99; +Cc: toke, cake, netdev, netfilter-devel
In-Reply-To: <91739F64-20B7-4C56-A7A3-AB8C71B9437C@gmail.com>

From: Jonathan Morton <chromatix99@gmail.com>
Date: Wed, 23 May 2018 22:31:53 +0300

> Remember that it takes two different qdiscs to implement ingress and
> egress on the same physical interface, and there's no obvious
> logical link between them - especially since the ingress one has to
> be attached to an ifb, not to the actual interface, because there's
> no native support for ingress qdiscs.

Who said anything about using an ingress qdisc to record/remember
this information?


* Re: [patch iproute2/net-next 1/2] devlink: introduce support for showing port flavours
From: David Ahern @ 2018-05-23 20:03 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: idosch, jakub.kicinski, mlxsw, andrew, vivien.didelot, f.fainelli,
	michael.chan, ganeshgr, saeedm, simon.horman,
	pieter.jansenvanvuuren, john.hurley, dirk.vandermerwe,
	alexander.h.duyck, ogerlitz, vijaya.guvva, satananda.burla,
	raghu.vatsavayi, felix.manlunas, gospo, sathya.perla,
	vasundhara-v.volam, tariqt, eranbe, jeffrey.t.kirsher, roopa
In-Reply-To: <20180520081539.1372-2-jiri@resnulli.us>

On 5/20/18 2:15 AM, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
> ---
>  devlink/devlink.c            | 20 ++++++++++++++++++++
>  include/uapi/linux/devlink.h | 12 ++++++++++++
>  2 files changed, 32 insertions(+)
> 
>

applied to iproute2-next


* Re: [PATCH net-next 0/4] patches 2018-05-23
From: David Miller @ 2018-05-23 20:02 UTC (permalink / raw)
  To: ubraun; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl
In-Reply-To: <20180523143812.25824-1-ubraun@linux.ibm.com>

From: Ursula Braun <ubraun@linux.ibm.com>
Date: Wed, 23 May 2018 16:38:08 +0200

> here are more smc-patches for net-next:
> 
> Patch 1 fixes an ioctl problem detected by syzbot.
> 
> Patch 2 improves smc_lgr_list locking in case of abnormal link
> group termination. If you want to receive a version for the net-tree,
> please let me know. It would look somewhat different, since the port
> terminate code has been moved to smc_core.c on net-next.
> 
> Patch 3 enables SMC to deal with urgent data.
> 
> Patch 4 is a minor improvement to avoid out-of-sync linkgroups
> between 2 peers.

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next 2/2] cxgb4: do L1 config when module is inserted
From: David Miller @ 2018-05-23 20:01 UTC (permalink / raw)
  To: ganeshgr
  Cc: netdev, nirranjan, indranil, venkatesh, linux-scsi, varun, leedom
In-Reply-To: <1527086013-9904-1-git-send-email-ganeshgr@chelsio.com>

From: Ganesh Goudar <ganeshgr@chelsio.com>
Date: Wed, 23 May 2018 20:03:33 +0530

> trigger an L1 configure operation when a transceiver module
> is inserted in order to cause current "sticky" options like
> Requested Forward Error Correction to be reapplied.
> 
> Signed-off-by: Casey Leedom <leedom@chelsio.com>
> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next 1/2] cxgb4: change the port capability bits definition
From: David Miller @ 2018-05-23 20:01 UTC (permalink / raw)
  To: ganeshgr
  Cc: netdev, nirranjan, indranil, venkatesh, linux-scsi, varun, leedom
In-Reply-To: <1527085978-9859-1-git-send-email-ganeshgr@chelsio.com>

From: Ganesh Goudar <ganeshgr@chelsio.com>
Date: Wed, 23 May 2018 20:02:58 +0530

> MDI Port Capabilities bit definitions were inconsistent with
> regard to the MDI enum values. The 2 bits used to define MDI in
> the port capabilities are not really separable; it's a 2-bit
> field with 4 different values. Change the port capability bit
> definitions to be "AUTO" and "STRAIGHT" in order to get them
> to line up with the enums.
> 
> Signed-off-by: Casey Leedom <leedom@chelsio.com>
> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>

Applied.
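
For readers following along, the point about the two MDI bits being one 2-bit field rather than two independent flags can be sketched roughly as below. This is only an illustration: the enum names, the numeric ordering of the four values, and the shift position are assumptions made for the example and are not taken from the cxgb4 headers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical names and values, for illustration only.
 * The field has four states; "AUTO" and "STRAIGHT" are assumed to be
 * the values 1 and 2, so the two bits cannot be read as separate flags
 * (value 3 is its own state, not "auto and straight").
 */
enum mdi_mode {
	MDI_UNCHANGED   = 0,
	MDI_AUTO        = 1,
	MDI_F_STRAIGHT  = 2,
	MDI_F_CROSSOVER = 3,
};

#define CAP_MDI_SHIFT 9u   /* assumed bit position of the field */
#define CAP_MDI_MASK  0x3u /* two bits wide */

/* Replace the MDI field in a capability word, leaving other bits alone. */
static inline uint32_t cap_set_mdi(uint32_t caps, enum mdi_mode mode)
{
	assert((unsigned int)mode <= CAP_MDI_MASK);
	caps &= ~(CAP_MDI_MASK << CAP_MDI_SHIFT);
	return caps | ((uint32_t)mode << CAP_MDI_SHIFT);
}

/* Extract the MDI field as a single enum value. */
static inline enum mdi_mode cap_get_mdi(uint32_t caps)
{
	return (enum mdi_mode)((caps >> CAP_MDI_SHIFT) & CAP_MDI_MASK);
}
```

Treating the field as a unit means callers compare cap_get_mdi() against an enum value instead of testing individual bits, which is what keeps the bit definitions and the enum in step.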

^ permalink raw reply

* Re: pull-request: mac80211-next 2018-05-23
From: David Miller @ 2018-05-23 19:53 UTC (permalink / raw)
  To: johannes; +Cc: netdev, linux-wireless
In-Reply-To: <20180523121432.9862-1-johannes@sipsolutions.net>

From: Johannes Berg <johannes@sipsolutions.net>
Date: Wed, 23 May 2018 14:14:31 +0200

> Here's a new version of the pull request for net-next, now
> with the stack size fixes included, which were the reason I
> withdrew my earlier one. Other things are also included all
> over the map.
> 
> Please pull and let me know if there's any problem.

Looks good, pulled, thank you.

^ permalink raw reply

* Re: [PATCH v4 0/3] IR decoding using BPF
From: Sean Young @ 2018-05-23 19:50 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: linux-media, linux-kernel, Alexei Starovoitov,
	Mauro Carvalho Chehab, netdev, Matthias Reichl, Devin Heitmueller,
	Y Song, Quentin Monnet
In-Reply-To: <860cf2a8-dd2e-ba78-8b98-3d8f4330f3d0@iogearbox.net>

On Wed, May 23, 2018 at 02:21:27PM +0200, Daniel Borkmann wrote:
> On 05/18/2018 04:07 PM, Sean Young wrote:
> > The kernel IR decoders (drivers/media/rc/ir-*-decoder.c) support the most
> > widely used IR protocols, but there are many protocols which are not
> > supported[1]. For example, the lirc-remotes[2] repo has over 2700 remotes,
> > many of which are not supported by rc-core. There is a "long tail" of
> > unsupported IR protocols, for which lircd is needed to decode the IR.
> > 
> > IR encoding is done in such a way that some simple circuit can decode it;
> > therefore, bpf is ideal.
> > 
> > In order to support all these protocols, here we have bpf based IR decoding.
> > The idea is that user-space can define a decoder in bpf, attach it to
> > the rc device through the lirc chardev.
> > 
> > Separate work is underway to extend ir-keytable to have an extensive library
> > of bpf-based decoders, and a much expanded library of rc keymaps.
> > 
> > Another future application would be to compile IRP[3] to an IR BPF program, and
> > so support virtually every remote without having to write a decoder for each.
> > It might also be possible to support non-button devices such as analog
> > directional pads or air conditioning remote controls and decode the target
> > temperature in bpf, and pass that to an input device.
> 
> Mauro, are you fine with this series going via bpf-next? How ugly would this
> get with regards to merge conflicts wrt drivers/media/rc/?

There are no merge conflicts and, as of yet, I'm not expecting any. If anything
I suspect the bpf tree is more likely to change, so merging via bpf-next
might make more sense.

Thanks

Sean

> 
> Thanks,
> Daniel
> 
> > Thanks,
> > 
> > Sean Young
> > 
> > [1] http://www.hifi-remote.com/wiki/index.php?title=DecodeIR
> > [2] https://sourceforge.net/p/lirc-remotes/code/ci/master/tree/remotes/
> > [3] http://www.hifi-remote.com/wiki/index.php?title=IRP_Notation
> > 
> > Changes since v3:
> >  - Implemented review comments from Quentin Monnet and Y Song (thanks!)
> >  - More helpful and better formatted bpf helper documentation
> >  - Changed back to bpf_prog_array rather than open-coded implementation
> >  - scancodes can be 64 bit
> >  - bpf gets passed values in microseconds, not nanoseconds.
> >    microseconds is more than enough (IR receivers support carriers up to
> >    70kHz, at which point a single period is already 14 microseconds). Also,
> >    this makes it much more consistent with lirc mode2.
> >  - Since it looks much more like lirc mode2, rename the program type to
> >    BPF_PROG_TYPE_LIRC_MODE2.
> >  - Rebased on bpf-next
> > 
> > Changes since v2:
> >  - Fixed locking issues
> >  - Improved self-test to cover more cases
> >  - Rebased on bpf-next again
> > 
> > Changes since v1:
> >  - Code review comments from Y Song <ys114321@gmail.com> and
> >    Randy Dunlap <rdunlap@infradead.org>
> >  - Re-wrote sample bpf to be selftest
> >  - Renamed RAWIR_DECODER -> RAWIR_EVENT (Kconfig, context, bpf prog type)
> >  - Rebase on bpf-next
> >  - Introduced bpf_rawir_event context structure with simpler access checking
> > 
> > Sean Young (3):
> >   bpf: bpf_prog_array_copy() should return -ENOENT if exclude_prog not
> >     found
> >   media: rc: introduce BPF_PROG_LIRC_MODE2
> >   bpf: add selftest for lirc_mode2 type program
> > 
> >  drivers/media/rc/Kconfig                      |  13 +
> >  drivers/media/rc/Makefile                     |   1 +
> >  drivers/media/rc/bpf-lirc.c                   | 308 ++++++++++++++++++
> >  drivers/media/rc/lirc_dev.c                   |  30 ++
> >  drivers/media/rc/rc-core-priv.h               |  22 ++
> >  drivers/media/rc/rc-ir-raw.c                  |  12 +-
> >  include/linux/bpf_rcdev.h                     |  30 ++
> >  include/linux/bpf_types.h                     |   3 +
> >  include/uapi/linux/bpf.h                      |  53 ++-
> >  kernel/bpf/core.c                             |  11 +-
> >  kernel/bpf/syscall.c                          |   7 +
> >  kernel/trace/bpf_trace.c                      |   2 +
> >  tools/bpf/bpftool/prog.c                      |   1 +
> >  tools/include/uapi/linux/bpf.h                |  53 ++-
> >  tools/include/uapi/linux/lirc.h               | 217 ++++++++++++
> >  tools/lib/bpf/libbpf.c                        |   1 +
> >  tools/testing/selftests/bpf/Makefile          |   8 +-
> >  tools/testing/selftests/bpf/bpf_helpers.h     |   6 +
> >  .../testing/selftests/bpf/test_lirc_mode2.sh  |  28 ++
> >  .../selftests/bpf/test_lirc_mode2_kern.c      |  23 ++
> >  .../selftests/bpf/test_lirc_mode2_user.c      | 154 +++++++++
> >  21 files changed, 974 insertions(+), 9 deletions(-)
> >  create mode 100644 drivers/media/rc/bpf-lirc.c
> >  create mode 100644 include/linux/bpf_rcdev.h
> >  create mode 100644 tools/include/uapi/linux/lirc.h
> >  create mode 100755 tools/testing/selftests/bpf/test_lirc_mode2.sh
> >  create mode 100644 tools/testing/selftests/bpf/test_lirc_mode2_kern.c
> >  create mode 100644 tools/testing/selftests/bpf/test_lirc_mode2_user.c
> > 
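
The microsecond argument in the v4 changelog can be checked with a small userspace sketch. It is not part of the series: the helper names are made up for the example, and the MODE2_* constants are local copies that are meant to mirror the lirc mode2 encoding in linux/lirc.h (24-bit duration, type in the top byte). Real code should include that header instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the lirc mode2 layout; assumed to match linux/lirc.h. */
#define MODE2_VALUE_MASK 0x00FFFFFFu /* duration, in microseconds */
#define MODE2_TYPE_MASK  0xFF000000u /* sample type */
#define MODE2_SPACE      0x00000000u
#define MODE2_PULSE      0x01000000u

/* Length of one carrier cycle in whole microseconds (truncated). */
static inline uint32_t carrier_period_us(uint32_t carrier_hz)
{
	assert(carrier_hz > 0);
	return 1000000u / carrier_hz;
}

/* True if the sample describes a pulse (carrier on) rather than a space. */
static inline bool mode2_is_pulse(uint32_t sample)
{
	return (sample & MODE2_TYPE_MASK) == MODE2_PULSE;
}

/* Duration carried by a pulse or space sample, in microseconds. */
static inline uint32_t mode2_duration_us(uint32_t sample)
{
	return sample & MODE2_VALUE_MASK;
}
```

At 70 kHz, carrier_period_us() gives 14, which matches the figure quoted above: a single carrier cycle is already longer than the unit of measurement, so nanosecond resolution would add nothing a decoder could use.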

^ permalink raw reply

* Re: [PATCH net] net/mlx4: Fix irq-unsafe spinlock usage
From: David Miller @ 2018-05-23 19:49 UTC (permalink / raw)
  To: tariqt; +Cc: netdev, eranbe, jackm
In-Reply-To: <1527061319-27102-1-git-send-email-tariqt@mellanox.com>

From: Tariq Toukan <tariqt@mellanox.com>
Date: Wed, 23 May 2018 10:41:59 +0300

> From: Jack Morgenstein <jackm@dev.mellanox.co.il>
> 
> spin_lock/unlock was used instead of spin_un/lock_irq
> in a procedure used in process space, on a spinlock
> which can be grabbed in an interrupt.
> 
> This caused the stack trace below to be displayed (on kernel
> 4.17.0-rc1 compiled with Lock Debugging enabled):
 ...
> Since mlx4_qp_lookup() is called only in process space, we can
> simply replace the spin_un/lock calls with spin_un/lock_irq calls.
> 
> Fixes: 6dc06c08bef1 ("net/mlx4: Fix the check in attaching steering rules")
> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
> Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
> ---
> 
> Hi Dave, please queue for -stable >= 4.12.

Applied and queued up for -stable, thanks.
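
The reasoning in the commit message can be illustrated with a toy userspace model. To be clear, this is not kernel code and not the mlx4 change itself: the struct and the model_* helpers are invented for the example, and the model only tracks two facts about a single CPU (are interrupts enabled, is the lock held).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU; hypothetical, for illustration only. */
struct cpu_model {
	bool irqs_enabled;
	bool lock_held;
};

/* Models spin_lock(): takes the lock, leaves interrupts as they are. */
static inline void model_spin_lock(struct cpu_model *c)
{
	assert(!c->lock_held);
	c->lock_held = true;
}

static inline void model_spin_unlock(struct cpu_model *c)
{
	c->lock_held = false;
}

/* Models spin_lock_irq(): disables interrupts, then takes the lock. */
static inline void model_spin_lock_irq(struct cpu_model *c)
{
	c->irqs_enabled = false;
	model_spin_lock(c);
}

/* Models spin_unlock_irq(): releases the lock, then re-enables interrupts. */
static inline void model_spin_unlock_irq(struct cpu_model *c)
{
	model_spin_unlock(c);
	c->irqs_enabled = true;
}

/*
 * An interrupt handler that takes the same lock would spin forever if it
 * can fire on this CPU while process context still holds the lock.
 */
static inline bool model_irq_can_deadlock(const struct cpu_model *c)
{
	return c->irqs_enabled && c->lock_held;
}
```

In this model the plain lock leaves a window where model_irq_can_deadlock() is true, while the _irq variant closes it. The unconditional re-enable on unlock is only safe because the caller is known to run in process context with interrupts on, which is the condition the commit message states for mlx4_qp_lookup().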

^ permalink raw reply

* Re: [PATCH net-next v3 0/7] Add support for QCA8334 switch
From: David Miller @ 2018-05-23 19:47 UTC (permalink / raw)
  To: vokac.m
  Cc: netdev, linux-kernel, devicetree, f.fainelli, vivien.didelot,
	andrew, mark.rutland, robh+dt, michal.vokac
In-Reply-To: <1527056424-14528-1-git-send-email-michal.vokac@ysoft.com>

From: "Michal Vokáč" <vokac.m@gmail.com>
Date: Wed, 23 May 2018 08:20:17 +0200

> This series basically adds support for a QCA8334 ethernet switch to the
> qca8k driver. It is a four-port variant of the already supported seven
> port QCA8337. Register map is the same for the whole family and all chips
> have the same device ID.
> 
> The major part of this series enhances the CPU port setting. Currently the CPU
> port is not set to any sensible defaults compatible with the xGMII
> interface. This series forces the CPU port to its maximum bandwidth and
> also allows the new defaults to be adjusted using a fixed-link device tree
> sub-node.
> 
> Alongside these changes I fixed two checkpatch warnings regarding SPDX and
> redundant parentheses.
> 
> Changes in v3:
>  - Rebased on latest net-next/master.
>  - Corrected fixed-link documentation.

Series applied, thank you.

^ permalink raw reply
