Netdev List


* Re: switchdev offload & ecmp
From: Nicolas Dichtel @ 2017-05-16 20:22 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: Jiri Pirko, Nikolay Aleksandrov, Roopa Prabhu, netdev
In-Reply-To: <20170516141149.GA1874@splinter.mtl.com>

On 16/05/2017 at 16:11, Ido Schimmel wrote:
> On Tue, May 16, 2017 at 02:57:47PM +0200, Nicolas Dichtel wrote:
>>>> I suspect that there can be scenarios where some packets of a flow are forwarded
>>>> by the driver and some others are forwarded by the kernel.
>>>
>>> Can you elaborate? The kernel only sees specific packets, which were
>>> trapped to the CPU. See:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum.c#n2996
>> Ok, this part was not clear to me, thank you for the pointer.
>>
>> So, when an arp resolution is needed, the packets are not trapped to the CPU,
>> the device manages the queue itself?
> 
> There are two cases here. If you need an ARP resolution following a hit
> of a directly connected route and this neighbour isn't in the device's
> table, then the packet is trapped (HOST_MISS_IPV4 in above list) to the CPU
> and triggers ARP resolution in the kernel. Eventually a NETEVENT will be
> sent and the neighbour will be programmed to the device.
> 
> If you need an ARP resolution of a nexthop, then this is a bit
> different. If you have an ECMP group with several nexthops, then once
> one of them is resolved, packets will be forwarded using it. To make
> sure other nexthops will also be resolved we try to periodically refresh
> them. Otherwise packets will always be forwarded using a single nexthop,
> as the kernel won't have motivation to resolve the others.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n987
> 
> In case no nexthops can be resolved, then packets will be trapped to the
> CPU (RTR_INGRESS0 in above list) and forwarded by the kernel.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n1896
> 
Ok, thank you for the details.

Regards,
Nicolas
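
The nexthop cases quoted above can be illustrated with a small sketch. It is
not the mlxsw code: struct nexthop, ecmp_select() and TRAP_TO_CPU are
hypothetical names, and the hash-based choice among resolved nexthops is an
assumption made only for illustration. It shows packets using whichever
nexthops are already resolved, and being trapped to the CPU when none is.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one nexthop in an ECMP group. */
struct nexthop {
	bool resolved;	/* neighbour entry is programmed in the device */
};

/* Hypothetical return value meaning "trap the packet to the CPU". */
#define TRAP_TO_CPU (-1)

/*
 * Pick a nexthop index for a flow hash. Only resolved nexthops are
 * candidates; if none is resolved, the packet is trapped so that the
 * kernel forwards it (and triggers neighbour resolution).
 */
static int ecmp_select(const struct nexthop *nh, size_t count, uint32_t hash)
{
	size_t resolved = 0;
	size_t target;
	size_t i;

	for (i = 0; i < count; i++)
		if (nh[i].resolved)
			resolved++;

	if (resolved == 0)
		return TRAP_TO_CPU;

	/* Walk to the (hash % resolved)-th resolved entry. */
	target = hash % resolved;
	for (i = 0; i < count; i++) {
		if (!nh[i].resolved)
			continue;
		if (target == 0)
			return (int)i;
		target--;
	}

	return TRAP_TO_CPU;	/* not reached */
}
```

With a group of three nexthops where only the second and third are resolved,
every flow lands on one of those two, which is why the unresolved ones have to
be refreshed separately to ever carry traffic.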

^ permalink raw reply

* Re: linux-next: Tree for May 16 (net/core)
From: Eric Dumazet @ 2017-05-16 20:12 UTC (permalink / raw)
  To: Paul Gortmaker
  Cc: Randy Dunlap, Stephen Rothwell, Linux-Next Mailing List,
	Linux Kernel Mailing List, netdev@vger.kernel.org
In-Reply-To: <CAP=VYLoYkz1HE4yOHVGQvBanDtpkWhtHBMvbeuyd8PtNFCivRQ@mail.gmail.com>

On Tue, May 16, 2017 at 12:44 PM, Paul Gortmaker
<paul.gortmaker@windriver.com> wrote:
> On Tue, May 16, 2017 at 12:28 PM, Randy Dunlap <rdunlap@infradead.org> wrote:
>> On 05/15/17 18:21, Stephen Rothwell wrote:
>>> Hi all,
>>>
>>> Changes since 20170515:
>>>
>>
>> on i386 or x86_64:
>>
>> when CONFIG_INET is not enabled:
>>
>> ../net/core/sock.c: In function 'skb_orphan_partial':
>> ../net/core/sock.c:1810:2: error: implicit declaration of function 'skb_is_tcp_pure_ack' [-Werror=implicit-function-declaration]
>>   if (skb_is_tcp_pure_ack(skb))
>
> Automated bisect on an ARM build with the same issue reveals:
>
> f6ba8d33cfbb46df569972e64dbb5bb7e929bfd9 is the first bad commit
> commit f6ba8d33cfbb46df569972e64dbb5bb7e929bfd9
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Thu May 11 15:24:41 2017 -0700
>
>     netem: fix skb_orphan_partial()
>
>     I should have known that lowering skb->truesize was dangerous :/
>
>     In case packets are not leaving the host via a standard Ethernet device,
>     but looped back to local sockets, bad things can happen, as reported
>     by Michael Madsen ( https://bugzilla.kernel.org/show_bug.cgi?id=195713 )
>
>     So instead of tweaking skb->truesize, lets change skb->destructor
>     and keep a reference on the owner socket via its sk_refcnt.
>
>     Fixes: f2f872f9272a ("netem: Introduce skb_orphan_partial() helper")
>     Signed-off-by: Eric Dumazet <edumazet@google.com>
>     Reported-by: Michael Madsen <mkm@nabto.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
>
> :040000 040000 7bfb7a6f5e12373b1c50ede2455b6ddd6d79cee0
> b45b7255322f1dff5e3ab8d3d707cf38a91c76ce M      net
> bisect run success
>
> http://kisskb.ellerman.id.au/kisskb/buildresult/13033081/
>
> I'm guessing Eric already knows about this but I've Cc'd him just in case.

I was not aware of this, I will submit a fix, thanks.

^ permalink raw reply

* Re: [PATCH net-next] geneve: add rtnl changelink support
From: Girish Moodalbail @ 2017-05-16 20:09 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, pshelar, joe, jbenc
In-Reply-To: <20170516.153115.2084299272043901950.davem@davemloft.net>

On 5/16/17 12:31 PM, David Miller wrote:
> From: Girish Moodalbail <girish.moodalbail@oracle.com>
> Date: Mon, 15 May 2017 10:47:04 -0700
>
>>  	if (data[IFLA_GENEVE_REMOTE]) {
>> -		info.key.u.ipv4.dst =
>> +		info->key.u.ipv4.dst =
>>  			nla_get_in_addr(data[IFLA_GENEVE_REMOTE]);
>>
>> -		if (IN_MULTICAST(ntohl(info.key.u.ipv4.dst))) {
>> +		if (IN_MULTICAST(ntohl(info->key.u.ipv4.dst))) {
>>  			netdev_dbg(dev, "multicast remote is unsupported\n");
>>  			return -EINVAL;
>>  		}
>> +		if (changelink &&
>> +		    ip_tunnel_info_af(&geneve->info) == AF_INET6) {
>> +			info->mode &= ~IP_TUNNEL_INFO_IPV6;
>> +			info->key.tun_flags &= ~TUNNEL_CSUM;
>> +			*use_udp6_rx_checksums = false;
>> +		}
>>  	}
>
> I don't understand this "changelink" guarded code, why do you need to
> clear all of this state out if the existing tunnel type is AF_INET6
> and only when doing a changelink?
>
> In any event, I think you need to add a comment explaining it.
>

If the geneve link was overlaid on an IPv6 network and the user now modifies the
link to run over an IPv4 network by doing

# ip link set gen0 type geneve id 100 remote 192.168.13.2

Then we will need to

  - reset info->mode so that it is no longer of IPv6 type,
  - clear the UDP checksum flag, since the default for UDP checksum over IPv4
    is 'no', and
  - set use_udp6_rx_checksums to its default value, which is false.

I will capture the above information concisely in a comment around that 
'changelink' guard.
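
As a rough, self-contained illustration of the three resets listed above (not
the driver code itself): the struct, the flag values and the function name
below are stand-ins invented for this sketch, and only the three operations
mirror the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in flag values for illustration only; not the kernel's definitions. */
#define SKETCH_INFO_IPV6	0x02u
#define SKETCH_TUNNEL_CSUM	0x01u

/* Hypothetical, reduced view of the tunnel info touched by the guard. */
struct sketch_tunnel_info {
	uint8_t mode;		/* address-family bits */
	uint16_t tun_flags;	/* per-tunnel flags such as UDP checksum */
};

/* Reset IPv6-specific state when an IPv6 tunnel is given an IPv4 remote. */
static void sketch_switch_to_ipv4(struct sketch_tunnel_info *info,
				  bool *use_udp6_rx_checksums)
{
	/* The tunnel is no longer of IPv6 type. */
	info->mode &= (uint8_t)~SKETCH_INFO_IPV6;
	/* The default for UDP checksum over IPv4 is 'no'. */
	info->tun_flags &= (uint16_t)~SKETCH_TUNNEL_CSUM;
	/* Back to its default value. */
	*use_udp6_rx_checksums = false;
}
```

Unrelated bits in mode and tun_flags are left untouched, so only the state
that was implied by the old IPv6 remote is dropped.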

thanks,
~Girish

^ permalink raw reply

* Re: [PATCH net-next] geneve: add rtnl changelink support
From: Pravin Shelar @ 2017-05-16 20:00 UTC (permalink / raw)
  To: Girish Moodalbail
  Cc: Linux Kernel Network Developers, David S. Miller, Joe Stringer,
	Jiri Benc
In-Reply-To: <1494870424-22699-1-git-send-email-girish.moodalbail@oracle.com>

On Mon, May 15, 2017 at 10:47 AM, Girish Moodalbail
<girish.moodalbail@oracle.com> wrote:
> This patch adds changelink rtnl operation support for geneve devices.
> Code changes involve:
>   - refactor geneve_newlink into geneve_nl2info to be used by both
>     geneve_newlink and geneve_changelink
>   - geneve_nl2info takes a changelink boolean argument to isolate
>     changelink checks and updates.
>   - Allow changing only a few attributes:
>     - return -EOPNOTSUPP for attributes that cannot be changed for
>       now. Incremental patches can make the non-supported one
>       available in the future if needed.
>
Thanks for working on this.

> Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com>
> ---
>  drivers/net/geneve.c | 149 ++++++++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 117 insertions(+), 32 deletions(-)
>
...
> @@ -1169,45 +1181,58 @@ static void init_tnl_info(struct ip_tunnel_info *info, __u16 dst_port)
>         info->key.tp_dst = htons(dst_port);
>  }
>
> -static int geneve_newlink(struct net *net, struct net_device *dev,
> -                         struct nlattr *tb[], struct nlattr *data[])
> +static int geneve_nl2info(struct net_device *dev, struct nlattr *tb[],
> +                         struct nlattr *data[], struct ip_tunnel_info *info,
> +                         bool *metadata, bool *use_udp6_rx_checksums,
> +                         bool changelink)
>  {
> -       bool use_udp6_rx_checksums = false;
> -       struct ip_tunnel_info info;
> -       bool metadata = false;
> +       struct geneve_dev *geneve = netdev_priv(dev);
>
> -       init_tnl_info(&info, GENEVE_UDP_PORT);
> +       if (changelink) {
> +               /* if changelink operation, start with old existing info */
> +               memcpy(info, &geneve->info, sizeof(*info));
> +               *metadata = geneve->collect_md;
> +               *use_udp6_rx_checksums = geneve->use_udp6_rx_checksums;
> +       } else {
> +               init_tnl_info(info, GENEVE_UDP_PORT);
> +       }
>
>         if (data[IFLA_GENEVE_REMOTE] && data[IFLA_GENEVE_REMOTE6])
>                 return -EINVAL;
>
>         if (data[IFLA_GENEVE_REMOTE]) {
> -               info.key.u.ipv4.dst =
> +               info->key.u.ipv4.dst =
>                         nla_get_in_addr(data[IFLA_GENEVE_REMOTE]);
>
> -               if (IN_MULTICAST(ntohl(info.key.u.ipv4.dst))) {
> +               if (IN_MULTICAST(ntohl(info->key.u.ipv4.dst))) {
>                         netdev_dbg(dev, "multicast remote is unsupported\n");
>                         return -EINVAL;
>                 }
> +               if (changelink &&
> +                   ip_tunnel_info_af(&geneve->info) == AF_INET6) {
> +                       info->mode &= ~IP_TUNNEL_INFO_IPV6;
> +                       info->key.tun_flags &= ~TUNNEL_CSUM;
> +                       *use_udp6_rx_checksums = false;
> +               }
This allows changelink to change the IPv4 address, but no changes are
made to the geneve tunnel port hash table after this update. We also
need to check whether there are any conflicts with existing ports.

What is the barrier between the rx/tx threads and changelink process?

>         }
>
>         if (data[IFLA_GENEVE_REMOTE6]) {
>   #if IS_ENABLED(CONFIG_IPV6)
> -               info.mode = IP_TUNNEL_INFO_IPV6;
> -               info.key.u.ipv6.dst =
> +               info->mode = IP_TUNNEL_INFO_IPV6;
> +               info->key.u.ipv6.dst =
>                         nla_get_in6_addr(data[IFLA_GENEVE_REMOTE6]);
>
> -               if (ipv6_addr_type(&info.key.u.ipv6.dst) &
> +               if (ipv6_addr_type(&info->key.u.ipv6.dst) &
>                     IPV6_ADDR_LINKLOCAL) {
>                         netdev_dbg(dev, "link-local remote is unsupported\n");
>                         return -EINVAL;
>                 }
> -               if (ipv6_addr_is_multicast(&info.key.u.ipv6.dst)) {
> +               if (ipv6_addr_is_multicast(&info->key.u.ipv6.dst)) {
>                         netdev_dbg(dev, "multicast remote is unsupported\n");
>                         return -EINVAL;
>                 }
> -               info.key.tun_flags |= TUNNEL_CSUM;
> -               use_udp6_rx_checksums = true;
> +               info->key.tun_flags |= TUNNEL_CSUM;
> +               *use_udp6_rx_checksums = true;
Same here. We need to check/fix the geneve tunnel hash table according
to new IP address.

>  #else
>                 return -EPFNOSUPPORT;
>  #endif
> @@ -1216,48 +1241,107 @@ static int geneve_newlink(struct net *net, struct net_device *dev,
...
>
> -       if (data[IFLA_GENEVE_PORT])
> -               info.key.tp_dst = nla_get_be16(data[IFLA_GENEVE_PORT]);
> +       if (data[IFLA_GENEVE_PORT]) {
> +               if (changelink)
> +                       return -EOPNOTSUPP;
> +               info->key.tp_dst = nla_get_be16(data[IFLA_GENEVE_PORT]);
> +       }
> +
> +       if (data[IFLA_GENEVE_COLLECT_METADATA]) {
> +               if (changelink)
> +                       return -EOPNOTSUPP;
Rather than blindly returning an error here, it should check whether the
changelink is actually changing the existing configuration.

> +               *metadata = true;
> +       }
> +
> +       if (data[IFLA_GENEVE_UDP_CSUM]) {
> +               if (changelink)
> +                       return -EOPNOTSUPP;
> +               if (nla_get_u8(data[IFLA_GENEVE_UDP_CSUM]))
> +                       info->key.tun_flags |= TUNNEL_CSUM;
> +       }
> +
Same here.

> +       if (data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX]) {
> +               if (changelink)
> +                       return -EOPNOTSUPP;
> +               if (nla_get_u8(data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX]))
> +                       info->key.tun_flags &= ~TUNNEL_CSUM;
Same here.

> +       }
>
> -       if (data[IFLA_GENEVE_COLLECT_METADATA])
> -               metadata = true;
> +       if (data[IFLA_GENEVE_UDP_ZERO_CSUM6_RX]) {
> +               if (changelink)
> +                       return -EOPNOTSUPP;
> +               if (nla_get_u8(data[IFLA_GENEVE_UDP_ZERO_CSUM6_RX]))
> +                       *use_udp6_rx_checksums = false;
> +       }
>
> -       if (data[IFLA_GENEVE_UDP_CSUM] &&
> -           nla_get_u8(data[IFLA_GENEVE_UDP_CSUM]))
> -               info.key.tun_flags |= TUNNEL_CSUM;
> +       return 0;
> +}
>
> -       if (data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX] &&
> -           nla_get_u8(data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX]))
> -               info.key.tun_flags &= ~TUNNEL_CSUM;
> +static int geneve_newlink(struct net *net, struct net_device *dev,
> +                         struct nlattr *tb[], struct nlattr *data[])
> +{
> +       bool use_udp6_rx_checksums = false;
> +       struct ip_tunnel_info info;
> +       bool metadata = false;
> +       int err;
>
> -       if (data[IFLA_GENEVE_UDP_ZERO_CSUM6_RX] &&
> -           nla_get_u8(data[IFLA_GENEVE_UDP_ZERO_CSUM6_RX]))
> -               use_udp6_rx_checksums = false;
> +       err = geneve_nl2info(dev, tb, data, &info, &metadata,
> +                            &use_udp6_rx_checksums, false);
> +       if (err)
> +               return err;
>
>         return geneve_configure(net, dev, &info, metadata, use_udp6_rx_checksums);
>  }
>
> +static int geneve_changelink(struct net_device *dev, struct nlattr *tb[],
> +                            struct nlattr *data[])
> +{
> +       struct geneve_dev *geneve = netdev_priv(dev);
> +       struct ip_tunnel_info info;
> +       bool metadata = false;
> +       bool use_udp6_rx_checksums = false;
> +       int err;
> +
> +       err = geneve_nl2info(dev, tb, data, &info, &metadata,
> +                            &use_udp6_rx_checksums, true);
> +       if (err)
> +               return err;
> +
> +       if (!geneve_dst_addr_equal(&geneve->info, &info))
> +               dst_cache_reset(&info.dst_cache);
> +       geneve->info = info;
This would just overwrite dst-cache, which could leak the percpu
cached dst-entry objects.

> +       geneve->collect_md = metadata;
> +       geneve->use_udp6_rx_checksums = use_udp6_rx_checksums;
> +
> +       return 0;
> +}
> +
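
The review comments on the -EOPNOTSUPP checks above ask that a changelink
request be rejected only when it would really change the configuration. A
minimal sketch of that idea for a boolean attribute follows; the helper name
and its parameters are hypothetical and are not part of the patch or of the
geneve driver.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether a changelink request carrying a
 * boolean attribute should be rejected. 'present' says whether the
 * attribute was supplied, 'requested' is its value, and 'current' is
 * the device's existing setting.
 */
static int sketch_check_unchangeable(bool present, bool requested,
				     bool current)
{
	if (!present)
		return 0;		/* attribute absent: nothing to verify */
	if (requested == current)
		return 0;		/* restating the existing value is harmless */
	return -EOPNOTSUPP;		/* a real change is not supported yet */
}
```

With a check of this shape, a request that repeats the device's current
attributes would succeed instead of failing outright.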

^ permalink raw reply

* Re: [PATCH] liquidio: use pcie_flr instead of duplicating it
From: David Miller @ 2017-05-16 20:00 UTC (permalink / raw)
  To: hch
  Cc: felix.manlunas, raghu.vatsavayi, satananda.burla, derek.chickles,
	netdev
In-Reply-To: <20170516142146.21739-1-hch@lst.de>

From: Christoph Hellwig <hch@lst.de>
Date: Tue, 16 May 2017 16:21:46 +0200

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Tested-by: Felix Manlunas <felix.manlunas@cavium.com>

Applied to net-next, thanks.

^ permalink raw reply

* Re: [PATCH] net: phy: Remove residual magic from PHY drivers
From: David Miller @ 2017-05-16 19:58 UTC (permalink / raw)
  To: andrew; +Cc: f.fainelli, netdev
In-Reply-To: <1494952151-3040-1-git-send-email-andrew@lunn.ch>

From: Andrew Lunn <andrew@lunn.ch>
Date: Tue, 16 May 2017 18:29:11 +0200

> commit fa8cddaf903c ("net phylib: Remove unnecessary condition check in phy")
> removed the only place where the PHY flag PHY_HAS_MAGICANEG was
> checked. But it left the flag being set in the drivers. Remove the flag.
> 
> Signed-off-by: Andrew Lunn <andrew@lunn.ch>

Applied to net-next, thanks.

^ permalink raw reply

* Re: [PATCH 1/1] dt-binding: net: wireless: fix node name in the BCM43xx example
From: Martin Blumenstingl @ 2017-05-16 19:56 UTC (permalink / raw)
  To: Arend Van Spriel, robh+dt-DgEjT+Ai2ygdnm+yROfE0A
  Cc: kvalo-sgV2jX0FEOL9JmXXK+q4OQ, mark.rutland-5wv7dgnIgG8,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <8155d0f1-aed3-fc47-4524-635067f9ee7b-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

Hi Arend,

On Tue, May 16, 2017 at 12:05 AM, Arend Van Spriel
<arend.vanspriel-dY08KVG/lbpWk0Htik3J/w@public.gmane.org> wrote:
> On 15-5-2017 22:13, Martin Blumenstingl wrote:
>> The example in the BCM43xx documentation uses "brcmf" as node name.
>> However, wireless devices should be named "wifi" instead. Fix this to
>
> Hi Martin,
>
> Since when is that a rule? I never got the memo and the DTC did not ever
> complain to me about the naming. That being said I do not really care
> and I suppose it is for the sake of consistency only.
I'm not sure if it's actually a rule or (as you already noted) just
for consistency. Back when I added devicetree support to ath9k, Rob
pointed out that the node should be named "wifi" (instead of "ath9k"),
see [0].

>> make sure that .dts authors can simply use the documentation as
>> reference (or simply copy the node from the documentation and then
>> adjust only the board specific bits).
>
> Please feel free to add my...
>
> Acked-by: Arend van Spriel <arend.vanspriel-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Thank you!

@Rob: maybe you can ACK this as well if you're fine with this patch?

>> Signed-off-by: Martin Blumenstingl <martin.blumenstingl-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org>
>> ---
>>  Documentation/devicetree/bindings/net/wireless/brcm,bcm43xx-fmac.txt | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/Documentation/devicetree/bindings/net/wireless/brcm,bcm43xx-fmac.txt b/Documentation/devicetree/bindings/net/wireless/brcm,bcm43xx-fmac.txt
>> index 5dbf169cd81c..590f622188de 100644
>> --- a/Documentation/devicetree/bindings/net/wireless/brcm,bcm43xx-fmac.txt
>> +++ b/Documentation/devicetree/bindings/net/wireless/brcm,bcm43xx-fmac.txt
>> @@ -31,7 +31,7 @@ mmc3: mmc@01c12000 {
>>       non-removable;
>>       status = "okay";
>>
>> -     brcmf: bcrmf@1 {
>> +     brcmf: wifi@1 {
>>               reg = <1>;
>>               compatible = "brcm,bcm4329-fmac";
>>               interrupt-parent = <&pio>;
>>

[0] http://www.mail-archive.com/ath9k-devel-xDcbHBWguxHbcTqmT+pZeQ@public.gmane.org/msg14678.html

^ permalink raw reply

* Re: [PATCH 4.4-only] openvswitch: clear sender cpu before forwarding packets
From: Joe Stringer @ 2017-05-16 19:55 UTC (permalink / raw)
  To: Anoob Soman
  Cc: stable, Pravin B Shelar, David S. Miller, netdev, ovs dev, LKML
In-Reply-To: <1494944710-7901-1-git-send-email-anoob.soman@citrix.com>

On 16 May 2017 at 07:25, Anoob Soman <anoob.soman@citrix.com> wrote:
> Similar to commit c29390c6dfee ("xps: must clear sender_cpu before
> forwarding") the skb->sender_cpu needs to be cleared before forwarding
> packets.
>
> Fixes: 2bd82484bb4c ("xps: fix xps for stacked devices")
> Signed-off-by: Anoob Soman <anoob.soman@citrix.com>

Is this needed for 4.1 too?

^ permalink raw reply

* Re: [PATCH net-next] cxgb4: update latest firmware version supported
From: David Miller @ 2017-05-16 19:53 UTC (permalink / raw)
  To: ganeshgr; +Cc: netdev, nirranjan, indranil
In-Reply-To: <1494948412-7818-1-git-send-email-ganeshgr@chelsio.com>

From: Ganesh Goudar <ganeshgr@chelsio.com>
Date: Tue, 16 May 2017 20:56:52 +0530

> Change t4fw_version.h to update latest firmware version
> number to 1.16.43.0.
> 
> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>

People are hitting regressions in 'net' due to using firmware allowed
by the current defines in combination with the FEC disabling commit.

So it doesn't make sense to "fix" this in 'net-next'.

^ permalink raw reply

* Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
From: David Miller @ 2017-05-16 19:52 UTC (permalink / raw)
  To: ecree; +Cc: daniel, ast, alexei.starovoitov, netdev
In-Reply-To: <754f2c39-fdb0-2407-c2f2-aa36d506d202@solarflare.com>

From: Edward Cree <ecree@solarflare.com>
Date: Tue, 16 May 2017 13:37:42 +0100

> On 15/05/17 17:04, David Miller wrote:
>> If we use 1<<31, then sequences like:
>>
>> 	R1 = 0
>> 	R1 <<= 2
>>
>> do silly things.
> Hmm.  It might be a bit late for this, but I wonder if, instead of handling
>  alignments as (1 << align), you could store them as -(1 << align), i.e.
>  leading 1s followed by 'align' 0s.
> Now the alignment of 0 is 0 (really 1 << 32), which doesn't change when
>  left-shifted some more.  Shifts of other numbers' alignments also do the
>  right thing, e.g. align(6) << 2 = (-2) << 2 = -8 = align(6 << 2).  Of
>  course you do all this in unsigned, to make sure right shifts work.
> This also makes other arithmetic simple to track; for instance, align(a + b)
>  is at worst align(a) | align(b).  (Of course, this bound isn't tight.)
> A number is 2^(n+1)-aligned if the 2^n bit of its alignment is cleared.
> Considered as unsigned numbers, smaller values are stricter alignments.

Thanks for the bit twiddling suggestion, I'll take a look!
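
The representation suggested in the quoted text can be tried out with a few
lines of C. This is only a sketch of the arithmetic described there, not
verifier code; the function names are made up for illustration, and 32-bit
unsigned values are assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Alignment mask of a known constant: leading 1s followed by one 0 for
 * each guaranteed-zero low bit. The mask of 0 is 0, the strictest value.
 */
static uint32_t align_mask(uint32_t x)
{
	uint32_t lowest_set = x & (0u - x);	/* isolate lowest set bit */

	return 0u - lowest_set;			/* -(1 << align), unsigned */
}

/* Upper bound on the mask of a sum; as noted, this bound is not tight. */
static uint32_t align_mask_add(uint32_t a, uint32_t b)
{
	return a | b;
}

/* Mask after a left shift by 'shift' bits (shift must be below 32). */
static uint32_t align_mask_shl(uint32_t mask, unsigned int shift)
{
	return mask << shift;
}

/* True if the value is 2^(n+1)-aligned, i.e. bit n of the mask is clear. */
static bool is_aligned_past_bit(uint32_t mask, unsigned int n)
{
	return (mask & (UINT32_C(1) << n)) == 0;
}
```

For example, align_mask(6) is 0xFFFFFFFE (that is, -2), shifting it left by
two gives 0xFFFFFFF8 (-8), which equals align_mask(6 << 2), and the mask of 0
stays 0 under any left shift.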

^ permalink raw reply

* Re: [PATCH net-next] bnx2x: Remove open coded carrier check
From: David Miller @ 2017-05-16 19:49 UTC (permalink / raw)
  To: leon; +Cc: Yuval.Mintz, netdev, leonro
In-Reply-To: <20170516122056.1534-1-leon@kernel.org>

From: Leon Romanovsky <leon@kernel.org>
Date: Tue, 16 May 2017 15:20:56 +0300

> From: Leon Romanovsky <leonro@mellanox.com>
> 
> There is an inline function to test whether the carrier is present,
> which makes the open-coded solution redundant.
> 
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>

Applied.

^ permalink raw reply

* Re: [PATCH] [net, 4.12] mlx5e: add CONFIG_INET dependency
From: David Miller @ 2017-05-16 19:48 UTC (permalink / raw)
  To: arnd-r2nGTMty4D4
  Cc: saeedm-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	leonro-VPRAkNaXOzVWk0Htik3J/w, erezsh-VPRAkNaXOzVWk0Htik3J/w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20170516112807.4107417-1-arnd-r2nGTMty4D4@public.gmane.org>

From: Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>
Date: Tue, 16 May 2017 13:27:49 +0200

> We now reference the arp_tbl, which requires IPv4 support to be
> enabled in the kernel, otherwise we get a link error:
> 
> drivers/net/built-in.o: In function `mlx5e_tc_update_neigh_used_value':
> (.text+0x16afec): undefined reference to `arp_tbl'
> drivers/net/built-in.o: In function `mlx5e_rep_neigh_init':
> en_rep.c:(.text+0x16c16d): undefined reference to `arp_tbl'
> drivers/net/built-in.o: In function `mlx5e_rep_netevent_event':
> en_rep.c:(.text+0x16cbb5): undefined reference to `arp_tbl'
> 
> This adds a Kconfig dependency for it.
> 
> Fixes: 232c001398ae ("net/mlx5e: Add support to neighbour update flow")
> Signed-off-by: Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>

Applied, thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [patch iproute2 v2 repost 1/3] tc_filter: add support for chain index
From: Jiri Pirko @ 2017-05-16 19:47 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: netdev, davem, jhs, xiyou.wangcong, dsa, edumazet, daniel,
	alexander.h.duyck, simon.horman, mlxsw
In-Reply-To: <20170516111658.42b9191c@xeon-e3>

Tue, May 16, 2017 at 08:16:58PM CEST, stephen@networkplumber.org wrote:
>On Tue, 16 May 2017 19:29:35 +0200
>Jiri Pirko <jiri@resnulli.us> wrote:
>
>> From: Jiri Pirko <jiri@mellanox.com>
>> 
>> Allow user to put filter to a specific chain identified by index.
>> 
>> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>
>This will have to wait for the chain bits to show up upstream in net-next.
>

Sure. I just like to send the kernel patches alongside the related
iproute2 patches. Thanks.

^ permalink raw reply

* Re: [PATCH v2 net-next] tcp: internal implementation for pacing
From: David Miller @ 2017-05-16 19:46 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet, ncardwell, ycheng, vanj, hkchu
In-Reply-To: <20170516112436.10189-1-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Tue, 16 May 2017 04:24:36 -0700

> BBR congestion control depends on pacing, and pacing is
> currently handled by sch_fq packet scheduler for performance reasons,
> and also because implementing pacing with FQ was convenient to truly
> avoid bursts.
> 
> However there are many cases where this packet scheduler constraint
> is not practical.
> - Many linux hosts are not focusing on handling thousands of TCP
>   flows in the most efficient way.
> - Some routers use fq_codel or other AQM, but still would like
>   to use BBR for the few TCP flows they initiate/terminate.
> 
> This patch implements an automatic fallback to internal pacing.
 ...

Looks great, applied, thanks!
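
As general background on what pacing means here, and not a description of
this patch (whose details are elided above): pacing spaces packets out so
that a flow does not exceed a target rate. A minimal sketch of the
arithmetic, with a hypothetical function name and the assumption that the
rate is expressed in bytes per second:

```c
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC UINT64_C(1000000000)

/*
 * Nanoseconds to wait after sending 'len' bytes so that the flow does
 * not exceed 'rate' bytes per second. Returns 0 when no rate is set,
 * which this sketch treats as "do not pace".
 */
static uint64_t sketch_pacing_delay_ns(uint32_t len, uint64_t rate)
{
	if (rate == 0)
		return 0;
	return (uint64_t)len * SKETCH_NSEC_PER_SEC / rate;
}
```

A 1500-byte packet at 1.5 MB/s would thus be followed by a 1 ms gap.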

^ permalink raw reply

* Re: linux-next: Tree for May 16 (net/core)
From: Paul Gortmaker @ 2017-05-16 19:44 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Stephen Rothwell, Linux-Next Mailing List,
	Linux Kernel Mailing List, netdev@vger.kernel.org, Eric Dumazet
In-Reply-To: <f63533e7-edd3-4c0d-ddff-e3409d15abc6@infradead.org>

On Tue, May 16, 2017 at 12:28 PM, Randy Dunlap <rdunlap@infradead.org> wrote:
> On 05/15/17 18:21, Stephen Rothwell wrote:
>> Hi all,
>>
>> Changes since 20170515:
>>
>
> on i386 or x86_64:
>
> when CONFIG_INET is not enabled:
>
> ../net/core/sock.c: In function 'skb_orphan_partial':
> ../net/core/sock.c:1810:2: error: implicit declaration of function 'skb_is_tcp_pure_ack' [-Werror=implicit-function-declaration]
>   if (skb_is_tcp_pure_ack(skb))

Automated bisect on an ARM build with the same issue reveals:

f6ba8d33cfbb46df569972e64dbb5bb7e929bfd9 is the first bad commit
commit f6ba8d33cfbb46df569972e64dbb5bb7e929bfd9
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu May 11 15:24:41 2017 -0700

    netem: fix skb_orphan_partial()

    I should have known that lowering skb->truesize was dangerous :/

    In case packets are not leaving the host via a standard Ethernet device,
    but looped back to local sockets, bad things can happen, as reported
    by Michael Madsen ( https://bugzilla.kernel.org/show_bug.cgi?id=195713 )

    So instead of tweaking skb->truesize, lets change skb->destructor
    and keep a reference on the owner socket via its sk_refcnt.

    Fixes: f2f872f9272a ("netem: Introduce skb_orphan_partial() helper")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Michael Madsen <mkm@nabto.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

:040000 040000 7bfb7a6f5e12373b1c50ede2455b6ddd6d79cee0
b45b7255322f1dff5e3ab8d3d707cf38a91c76ce M      net
bisect run success

http://kisskb.ellerman.id.au/kisskb/buildresult/13033081/

I'm guessing Eric already knows about this but I've Cc'd him just in case.

P.
--

>
>
> --
> ~Randy

^ permalink raw reply

* Re: [PATCH net-next v2 0/3] udp: scalability improvements
From: David Miller @ 2017-05-16 19:41 UTC (permalink / raw)
  To: pabeni; +Cc: netdev, edumazet
In-Reply-To: <cover.1494881617.git.pabeni@redhat.com>

From: Paolo Abeni <pabeni@redhat.com>
Date: Tue, 16 May 2017 11:20:12 +0200

> This patch series implements an idea suggested by Eric Dumazet to
> reduce the contention of the udp sk_receive_queue lock when the socket is
> under flood.

Series applied, thanks a lot.

^ permalink raw reply

* Re: [PATCH net-next] geneve: add rtnl changelink support
From: David Miller @ 2017-05-16 19:31 UTC (permalink / raw)
  To: girish.moodalbail; +Cc: netdev, pshelar, joe, jbenc
In-Reply-To: <1494870424-22699-1-git-send-email-girish.moodalbail@oracle.com>

From: Girish Moodalbail <girish.moodalbail@oracle.com>
Date: Mon, 15 May 2017 10:47:04 -0700

>  	if (data[IFLA_GENEVE_REMOTE]) {
> -		info.key.u.ipv4.dst =
> +		info->key.u.ipv4.dst =
>  			nla_get_in_addr(data[IFLA_GENEVE_REMOTE]);
>  
> -		if (IN_MULTICAST(ntohl(info.key.u.ipv4.dst))) {
> +		if (IN_MULTICAST(ntohl(info->key.u.ipv4.dst))) {
>  			netdev_dbg(dev, "multicast remote is unsupported\n");
>  			return -EINVAL;
>  		}
> +		if (changelink &&
> +		    ip_tunnel_info_af(&geneve->info) == AF_INET6) {
> +			info->mode &= ~IP_TUNNEL_INFO_IPV6;
> +			info->key.tun_flags &= ~TUNNEL_CSUM;
> +			*use_udp6_rx_checksums = false;
> +		}
>  	}

I don't understand this "changelink" guarded code, why do you need to
clear all of this state out if the existing tunnel type is AF_INET6
and only when doing a changelink?

In any event, I think you need to add a comment explaining it.

^ permalink raw reply

* Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure
From: Doug Ledford @ 2017-05-16 19:28 UTC (permalink / raw)
  To: David Miller
  Cc: Bart.VanAssche-XdAiOPVOjttBDgjK7y7TUQ,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, hch-jcswGhMUV9g,
	netdev-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	stable-u79uwXL29TY76Z2rM5mHXA,
	ubraun-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8
In-Reply-To: <20170516.145249.871010194359061722.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

On Tue, 2017-05-16 at 14:52 -0400, David Miller wrote:
> From: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Date: Tue, 16 May 2017 14:03:22 -0400
> 
> > On Tue, 2017-05-16 at 13:36 -0400, David Miller wrote:
> >> From: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> >> Date: Tue, 16 May 2017 13:20:44 -0400
> >> 
> >> > Anyway, we're just talking out what happened, when what we really need
> >> > to focus on is moving forward.  Again, your thoughts on marking SMC
> >> > EXPERIMENTAL until it's fixed up and unfreezing the API in case we need
> >> > to adjust it to work on different link layers?
> >> 
> >> Something like:
> >> 
> >>         http://patchwork.ozlabs.org/patch/762803/
> >> 
> >> with the addition of the EXPERIMENTAL dependency?
> >> 
> >> Sure.
> > 
> > Perfect.  I assume you'll submit it since it's in your patchworks?
> 
> Ok I applied the patch referenced above, but we don't actually have
> an EXPERIMENTAL symbol.  The closest thing we have is BROKEN and
> even in this situation that's a bit harsh.

I hadn't realized EXPERIMENTAL was gone.  Which is too bad, because
that's entirely appropriate in this case, and would have had the
desired side effect of keeping it out of any non-cutting edge distros
and warning people of possible API changes.  With EXPERIMENTAL gone,
the closest thing we have is drivers/staging, since that tends to imply
some of the same consequences.  I know you think BROKEN is overly
harsh, but I'm not sure we should just do nothing.  How about we take a
few days to let some of the RDMA people closely review the 143 page
(egads!) RFC (http://www.rfc-editor.org/info/rfc7609) to see if we
think it can be fixed to use multiple link layers with the existing API
in place or if it will require something other than AF_SMC.  If we need
to break API, then I think we should either fix it ASAP and send that
fix to the 4.11 stable series (which probably violates the normative
stable patch size/scope) or if the fix will take longer than this
kernel cycle, then move it to staging both here and in 4.11 stable, and
fix it there and then move it back.  Something like that would prevent
the kind of API flappage we ought not do....

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG KeyID: B826A3330E572FDD

Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD


^ permalink raw reply

* Re: [PATCH net] selftests/bpf: fix broken build due to types.h
From: David Miller @ 2017-05-16 19:18 UTC (permalink / raw)
  To: yhs; +Cc: daniel, netdev, kernel-team
In-Reply-To: <20170516184634.2803675-1-yhs@fb.com>

Please correct the address of the netdev list (it is just plain
'netdev' not 'linux-netdev').

Secondly, __always_inline should not be defined by types.h

That has to come from linux/compiler.h which we have no reason
to define a private version of for eBPF clang compilation.

The problem is that via several layers of indirection, linux/types.h
eventually includes linux/compiler.h and that is probably the more
appropriate thing for you to do.

^ permalink raw reply

* Re: [PATCH 2/3] bpf: Track alignment of MAP pointers in verifier.
From: David Miller @ 2017-05-16 19:08 UTC (permalink / raw)
  To: daniel; +Cc: ast, alexei.starovoitov, netdev
In-Reply-To: <591A23E3.2050105@iogearbox.net>

From: Daniel Borkmann <daniel@iogearbox.net>
Date: Mon, 15 May 2017 23:55:47 +0200

> I'm actually wondering about the min_align/aux_off/aux_off_align and
> given this is not really related to varlen_map_access and we currently
> just skip this.
> 
> We should make sure that when env->strict_alignment is false that we
> ignore any difference in min_align/aux_off/aux_off_align, afaik, the
> min_align would also be set on regs other than ptr_to_pkt.

Ok I see what you are saying, alignment related register state has to
be taken into consideration during pruning but only when
env->strict_alignment is true.

->min_align is set on any register upon which a calculation is
performed.

> What about compare_ptrs_to_packet() for when env->strict_alignment is
> true in ptr_to_pkt case?

Yes we need to do something there, and yes we do need testcases.

You also remind me that I was thinking about whether we should
propagate alignment state through branches.  For example, on
the taken path of a JEQ we can set both arms of the test to
have the larger of the two arms' alignments.
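
The two ideas above (alignment state only matters for pruning when
env->strict_alignment is true, and the taken path of a JEQ can share the
better alignment) could be sketched roughly as follows. This is a minimal
illustration under stated assumptions, not the verifier's actual code:
struct reg_sketch, regs_equal_for_pruning() and jeq_taken_propagate_align()
are hypothetical names, and plain equality is used as a conservative
stand-in for the real state comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the alignment-related part of a
 * verifier register state; this is NOT the real struct bpf_reg_state. */
struct reg_sketch {
	int64_t  imm;		/* stands in for all non-alignment state */
	uint32_t min_align;
	uint32_t aux_off;
	uint32_t aux_off_align;
};

/* Decide whether two register states may be treated as equivalent when
 * pruning. Alignment fields are compared only in strict mode. */
bool regs_equal_for_pruning(const struct reg_sketch *old,
			    const struct reg_sketch *cur,
			    bool strict_alignment)
{
	if (old->imm != cur->imm)
		return false;
	if (!strict_alignment)
		return true;	/* ignore any alignment difference */
	return old->min_align == cur->min_align &&
	       old->aux_off == cur->aux_off &&
	       old->aux_off_align == cur->aux_off_align;
}

/* On the taken path of a JEQ both operands hold the same value, so each
 * may assume the larger of the two known alignments. */
void jeq_taken_propagate_align(struct reg_sketch *a, struct reg_sketch *b)
{
	uint32_t best = a->min_align > b->min_align ? a->min_align
						    : b->min_align;

	a->min_align = best;
	b->min_align = best;
}
```

With strict_alignment false, two states that differ only in min_align
compare equal and the path can be pruned; with it true they do not.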

^ permalink raw reply

* Re: ipsec doesn't route TCP with 4.11 kernel
From: Don Bowman @ 2017-05-16 19:05 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Cong Wang, linux-kernel@vger.kernel.org, Herbert Xu,
	Linux Kernel Network Developers
In-Reply-To: <20170503081421.GL2649@secunet.com>

On 3 May 2017 at 04:14, Steffen Klassert <steffen.klassert@secunet.com> wrote:
> On Sat, Apr 29, 2017 at 08:39:34PM -0400, Don Bowman wrote:
>> On 28 April 2017 at 03:13, Steffen Klassert
>> <steffen.klassert@secunet.com> wrote:
>> > On Thu, Apr 27, 2017 at 06:13:38PM -0400, Don Bowman wrote:
>> >> On 27 April 2017 at 04:42, Steffen Klassert <steffen.klassert@secunet.com>
>> >> wrote:
>> >> > On Wed, Apr 26, 2017 at 10:01:34PM -0700, Cong Wang wrote:
>> >> >> (Cc'ing netdev and IPSec maintainers)
>> >> >>
>> >> >> On Tue, Apr 25, 2017 at 6:08 PM, Don Bowman <db@donbowman.ca> wrote:
>> >>
>>
>> <snip>
>>
>> Confirmed: with this patch in place, TCP functions properly.
>
> Thanks for testing!
>
> I'll make sure to get this fix into the mainline soon.

Thanks. Let me know if there is any more assistance I can provide.
I've been running the patch for 2 weeks now on 3 machines.

^ permalink raw reply

* Re: [PATCH 1/2] net-next: stmmac: add adjust_link function
From: Corentin Labbe @ 2017-05-16 19:04 UTC (permalink / raw)
  To: Florian Fainelli; +Cc: peppe.cavallaro, alexandre.torgue, netdev, linux-kernel
In-Reply-To: <b54c0535-f3b2-6184-1422-ec9eed22e67f@gmail.com>

On Tue, May 16, 2017 at 09:43:04AM -0700, Florian Fainelli wrote:
> On 05/15/2017 04:41 AM, Corentin Labbe wrote:
> > My dwmac-sun8i series will add some if (has_sun8i) to
> > stmmac_adjust_link().
> > Since the current stmmac_adjust_link() already has lots of if (has_gmac/gmac4),
> > it is now better to create an adjust_link() function for each dwmac.
> 
> Is it really? The diffstat seems to indicate otherwise,
> and by looking at the code, I am definitely not convinced this is an
> improvement over the current code.
> 
> > 
> > So this patch adds an adjust_link() function pointer, and moves code out
> > of stmmac_adjust_link to it.
> 
> Can't we keep the existing adjust_link() function and just have a
> different one for dwmac-sun8i that either re-uses portions of the
> existing one, or duplicates just what it needs?
> 

I did this work because the maintainer thought it was a good idea (and so do I).
See https://www.mail-archive.com/netdev@vger.kernel.org/msg167484.html

You can also see the necessary modification of adjust_link for dwmac-sun8i here: http://lists.infradead.org/pipermail/linux-arm-kernel/2017-May/504034.html
I agree that without dwmac-sun8i this patch is nearly unnecessary, but with dwmac-sun8i I would otherwise add 4 "if (sun8i)" checks.

Anyway, I will fix all the problems you reported below.

Thanks
Regards

> > 
> > Removing in the process stmmac_mac_flow_ctrl/stmmac_hw_fix_mac_speed
> > since they are not used anymore.
> > 
> > Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
> > ---
> >  drivers/net/ethernet/stmicro/stmmac/common.h       |  3 +
> >  .../net/ethernet/stmicro/stmmac/dwmac1000_core.c   | 54 ++++++++++++++
> >  .../net/ethernet/stmicro/stmmac/dwmac100_core.c    | 46 ++++++++++++
> >  drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c  | 54 ++++++++++++++
> >  drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  | 83 +---------------------
> >  5 files changed, 158 insertions(+), 82 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
> > index b7ce3fbb5375..451c231006fe 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/common.h
> > +++ b/drivers/net/ethernet/stmicro/stmmac/common.h
> > @@ -469,11 +469,14 @@ struct stmmac_dma_ops {
> >  };
> >  
> >  struct mac_device_info;
> > +struct stmmac_priv;
> >  
> >  /* Helpers to program the MAC core */
> >  struct stmmac_ops {
> >  	/* MAC core initialization */
> >  	void (*core_init)(struct mac_device_info *hw, int mtu);
> > +	/* adjust link */
> > +	int (*adjust_link)(struct stmmac_priv *priv);
> >  	/* Enable the MAC RX/TX */
> >  	void (*set_mac)(void __iomem *ioaddr, bool enable);
> >  	/* Enable and verify that the IPC module is supported */
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
> > index f3d9305e5f70..5f3aace46c41 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
> > +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
> > @@ -26,6 +26,7 @@
> >  #include <linux/slab.h>
> >  #include <linux/ethtool.h>
> >  #include <asm/io.h>
> > +#include "stmmac.h"
> >  #include "stmmac_pcs.h"
> >  #include "dwmac1000.h"
> >  
> > @@ -75,6 +76,58 @@ static void dwmac1000_core_init(struct mac_device_info *hw, int mtu)
> >  #endif
> >  }
> >  
> > +static int dwmac1000_adjust_link(struct stmmac_priv *priv)
> > +{
> > +	struct net_device *ndev = priv->dev;
> > +	struct phy_device *phydev = ndev->phydev;
> > +	int new_state = 0;
> > +	u32 tx_cnt = priv->plat->tx_queues_to_use;
> > +	u32 ctrl;
> 
> Reverse christmas tree declaration please.
> 
> > +
> > +	ctrl = readl(priv->ioaddr + GMAC_CONTROL);
> > +
> > +	if (phydev->duplex != priv->oldduplex) {
> > +		new_state = 1;
> 
> bool new_state
> 
> > +		if (!(phydev->duplex))
> 
> Parentheses not needed (this is an integer, not a bitmask)
> 
> > +			ctrl &= ~GMAC_CONTROL_DM;
> > +		else
> > +			ctrl |= GMAC_CONTROL_DM;
> > +		priv->oldduplex = phydev->duplex;
> > +	}
> > +
> > +	if (phydev->pause)
> > +		priv->hw->mac->flow_ctrl(priv->hw, phydev->duplex, priv->flow_ctrl,
> > +					 priv->pause, tx_cnt);
> > +
> > +	if (phydev->speed != priv->speed) {
> > +		new_state = 1;
> > +		switch (phydev->speed) {
> > +		case 1000:
> 
> case SPEED_1000 and so on?
> 
> > +			ctrl &= ~GMAC_CONTROL_PS;
> > +			break;
> > +		case 100:
> > +			ctrl |= GMAC_CONTROL_PS;
> > +			ctrl |= GMAC_CONTROL_FES;
> > +			break;
> > +		case 10:
> > +			ctrl |= GMAC_CONTROL_PS;
> > +			ctrl &= ~GMAC_CONTROL_FES;
> > +			break;
> > +		default:
> > +			netif_warn(priv, link, priv->dev,
> > +				   "broken speed: %d\n", phydev->speed);
> > +			phydev->speed = SPEED_UNKNOWN;
> > +			break;
> > +		}
> > +		if (phydev->speed != SPEED_UNKNOWN && likely(priv->plat->fix_mac_speed))
> > +			priv->plat->fix_mac_speed(priv->plat->bsp_priv, phydev->speed);
> > +		priv->speed = phydev->speed;
> > +	}
> > +
> > +	writel(ctrl, priv->ioaddr + GMAC_CONTROL);
> > +	return new_state;
> > +}
> > +
> >  static int dwmac1000_rx_ipc_enable(struct mac_device_info *hw)
> >  {
> >  	void __iomem *ioaddr = hw->pcsr;
> > @@ -490,6 +543,7 @@ static void dwmac1000_debug(void __iomem *ioaddr, struct stmmac_extra_stats *x,
> >  
> >  static const struct stmmac_ops dwmac1000_ops = {
> >  	.core_init = dwmac1000_core_init,
> > +	.adjust_link = dwmac1000_adjust_link,
> >  	.set_mac = stmmac_set_mac,
> >  	.rx_ipc = dwmac1000_rx_ipc_enable,
> >  	.dump_regs = dwmac1000_dump_regs,
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c
> > index 1b3609105484..ba3d46e65e1a 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c
> > +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c
> > @@ -27,6 +27,7 @@
> >  #include <linux/crc32.h>
> >  #include <asm/io.h>
> >  #include "dwmac100.h"
> > +#include "stmmac.h"
> >  
> >  static void dwmac100_core_init(struct mac_device_info *hw, int mtu)
> >  {
> > @@ -40,6 +41,50 @@ static void dwmac100_core_init(struct mac_device_info *hw, int mtu)
> >  #endif
> >  }
> >  
> > +static int dwmac100_adjust_link(struct stmmac_priv *priv)
> > +{
> > +	struct net_device *ndev = priv->dev;
> > +	struct phy_device *phydev = ndev->phydev;
> > +	int new_state = 0;
> > +	u32 tx_cnt = priv->plat->tx_queues_to_use;
> > +	u32 ctrl;
> > +
> > +	ctrl = readl(priv->ioaddr + MAC_CTRL_REG);
> > +	if (phydev->duplex != priv->oldduplex) {
> > +		new_state = 1;
> > +		if (!(phydev->duplex))
> > +			ctrl &= ~MAC_CONTROL_F;
> > +		else
> > +			ctrl |= MAC_CONTROL_F;
> > +		priv->oldduplex = phydev->duplex;
> > +	}
> > +
> > +	if (phydev->pause)
> > +		priv->hw->mac->flow_ctrl(priv->hw, phydev->duplex, priv->flow_ctrl,
> > +				priv->pause, tx_cnt);
> > +
> > +	if (phydev->speed != priv->speed) {
> > +		new_state = 1;
> > +		switch (phydev->speed) {
> > +		case 100:
> > +		case 10:
> > +			ctrl &= ~MAC_CONTROL_PS;
> > +			break;
> > +		default:
> > +			netif_warn(priv, link, priv->dev,
> > +					"broken speed: %d\n", phydev->speed);
> > +			phydev->speed = SPEED_UNKNOWN;
> > +			break;
> > +		}
> > +		if (phydev->speed != SPEED_UNKNOWN && likely(priv->plat->fix_mac_speed))
> > +			priv->plat->fix_mac_speed(priv->plat->bsp_priv, phydev->speed);
> > +		priv->speed = phydev->speed;
> > +	}
> > +
> > +	writel(ctrl, priv->ioaddr + MAC_CTRL_REG);
> > +	return new_state;
> > +}
> > +
> >  static void dwmac100_dump_mac_regs(struct mac_device_info *hw, u32 *reg_space)
> >  {
> >  	void __iomem *ioaddr = hw->pcsr;
> > @@ -150,6 +195,7 @@ static void dwmac100_pmt(struct mac_device_info *hw, unsigned long mode)
> >  
> >  static const struct stmmac_ops dwmac100_ops = {
> >  	.core_init = dwmac100_core_init,
> > +	.adjust_link = dwmac100_adjust_link,
> >  	.set_mac = stmmac_set_mac,
> >  	.rx_ipc = dwmac100_rx_ipc_enable,
> >  	.dump_regs = dwmac100_dump_mac_regs,
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
> > index 48793f2e9307..133b6bcd7b61 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
> > +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
> > @@ -19,6 +19,7 @@
> >  #include <linux/io.h>
> >  #include "stmmac_pcs.h"
> >  #include "dwmac4.h"
> > +#include "stmmac.h"
> >  
> >  static void dwmac4_core_init(struct mac_device_info *hw, int mtu)
> >  {
> > @@ -59,6 +60,58 @@ static void dwmac4_core_init(struct mac_device_info *hw, int mtu)
> >  	writel(value, ioaddr + GMAC_INT_EN);
> >  }
> >  
> > +static int dwmac4_adjust_link(struct stmmac_priv *priv)
> > +{
> > +	struct net_device *ndev = priv->dev;
> > +	struct phy_device *phydev = ndev->phydev;
> > +	int new_state = 0;
> > +	u32 tx_cnt = priv->plat->tx_queues_to_use;
> > +	u32 ctrl;
> > +
> > +	ctrl = readl(priv->ioaddr + MAC_CTRL_REG);
> > +
> > +	if (phydev->duplex != priv->oldduplex) {
> > +		new_state = 1;
> > +		if (!(phydev->duplex))
> > +			ctrl &= ~GMAC_CONFIG_DM;
> > +		else
> > +			ctrl |= GMAC_CONFIG_DM;
> > +		priv->oldduplex = phydev->duplex;
> > +	}
> > +
> > +	if (phydev->pause)
> > +		priv->hw->mac->flow_ctrl(priv->hw, phydev->duplex, priv->flow_ctrl,
> > +					 priv->pause, tx_cnt);
> > +
> > +	if (phydev->speed != priv->speed) {
> > +		new_state = 1;
> > +		switch (phydev->speed) {
> > +		case 1000:
> > +			ctrl &= ~GMAC_CONFIG_PS;
> > +			break;
> > +		case 100:
> > +			ctrl |= GMAC_CONFIG_PS;
> > +			ctrl |= GMAC_CONFIG_FES;
> > +			break;
> > +		case 10:
> > +			ctrl |= GMAC_CONFIG_PS;
> > +			ctrl &= ~GMAC_CONFIG_FES;
> > +			break;
> > +		default:
> > +			netif_warn(priv, link, priv->dev,
> > +				   "broken speed: %d\n", phydev->speed);
> > +			phydev->speed = SPEED_UNKNOWN;
> > +			break;
> > +		}
> > +		if (phydev->speed != SPEED_UNKNOWN && likely(priv->plat->fix_mac_speed))
> > +			priv->plat->fix_mac_speed(priv->plat->bsp_priv, phydev->speed);
> > +		priv->speed = phydev->speed;
> > +	}
> > +
> > +	writel(ctrl, priv->ioaddr + MAC_CTRL_REG);
> > +	return new_state;
> > +}
> > +
> >  static void dwmac4_rx_queue_enable(struct mac_device_info *hw,
> >  				   u8 mode, u32 queue)
> >  {
> > @@ -669,6 +722,7 @@ static void dwmac4_debug(void __iomem *ioaddr, struct stmmac_extra_stats *x,
> >  
> >  static const struct stmmac_ops dwmac4_ops = {
> >  	.core_init = dwmac4_core_init,
> > +	.adjust_link = dwmac4_adjust_link,
> >  	.set_mac = stmmac_set_mac,
> >  	.rx_ipc = dwmac4_rx_ipc_enable,
> >  	.rx_queue_enable = dwmac4_rx_queue_enable,
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > index b05a042cf2c6..fb3e2ddaa7c9 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > @@ -286,21 +286,6 @@ static inline u32 stmmac_rx_dirty(struct stmmac_priv *priv, u32 queue)
> >  }
> >  
> >  /**
> > - * stmmac_hw_fix_mac_speed - callback for speed selection
> > - * @priv: driver private structure
> > - * Description: on some platforms (e.g. ST), some HW system configuration
> > - * registers have to be set according to the link speed negotiated.
> > - */
> > -static inline void stmmac_hw_fix_mac_speed(struct stmmac_priv *priv)
> > -{
> > -	struct net_device *ndev = priv->dev;
> > -	struct phy_device *phydev = ndev->phydev;
> > -
> > -	if (likely(priv->plat->fix_mac_speed))
> > -		priv->plat->fix_mac_speed(priv->plat->bsp_priv, phydev->speed);
> > -}
> > -
> > -/**
> >   * stmmac_enable_eee_mode - check and enter in LPI mode
> >   * @priv: driver private structure
> >   * Description: this function is to verify and enter in LPI mode in case of
> > @@ -759,19 +744,6 @@ static void stmmac_release_ptp(struct stmmac_priv *priv)
> >  }
> >  
> >  /**
> > - *  stmmac_mac_flow_ctrl - Configure flow control in all queues
> > - *  @priv: driver private structure
> > - *  Description: It is used for configuring the flow control in all queues
> > - */
> > -static void stmmac_mac_flow_ctrl(struct stmmac_priv *priv, u32 duplex)
> > -{
> > -	u32 tx_cnt = priv->plat->tx_queues_to_use;
> > -
> > -	priv->hw->mac->flow_ctrl(priv->hw, duplex, priv->flow_ctrl,
> > -				 priv->pause, tx_cnt);
> > -}
> > -
> > -/**
> >   * stmmac_adjust_link - adjusts the link parameters
> >   * @dev: net device structure
> >   * Description: this is the helper called by the physical abstraction layer
> > @@ -793,60 +765,7 @@ static void stmmac_adjust_link(struct net_device *dev)
> >  	spin_lock_irqsave(&priv->lock, flags);
> >  
> >  	if (phydev->link) {
> > -		u32 ctrl = readl(priv->ioaddr + MAC_CTRL_REG);
> > -
> > -		/* Now we make sure that we can be in full duplex mode.
> > -		 * If not, we operate in half-duplex mode. */
> > -		if (phydev->duplex != priv->oldduplex) {
> > -			new_state = 1;
> > -			if (!(phydev->duplex))
> > -				ctrl &= ~priv->hw->link.duplex;
> > -			else
> > -				ctrl |= priv->hw->link.duplex;
> > -			priv->oldduplex = phydev->duplex;
> > -		}
> > -		/* Flow Control operation */
> > -		if (phydev->pause)
> > -			stmmac_mac_flow_ctrl(priv, phydev->duplex);
> > -
> > -		if (phydev->speed != priv->speed) {
> > -			new_state = 1;
> > -			switch (phydev->speed) {
> > -			case 1000:
> > -				if (priv->plat->has_gmac ||
> > -				    priv->plat->has_gmac4)
> > -					ctrl &= ~priv->hw->link.port;
> > -				break;
> > -			case 100:
> > -				if (priv->plat->has_gmac ||
> > -				    priv->plat->has_gmac4) {
> > -					ctrl |= priv->hw->link.port;
> > -					ctrl |= priv->hw->link.speed;
> > -				} else {
> > -					ctrl &= ~priv->hw->link.port;
> > -				}
> > -				break;
> > -			case 10:
> > -				if (priv->plat->has_gmac ||
> > -				    priv->plat->has_gmac4) {
> > -					ctrl |= priv->hw->link.port;
> > -					ctrl &= ~(priv->hw->link.speed);
> > -				} else {
> > -					ctrl &= ~priv->hw->link.port;
> > -				}
> > -				break;
> > -			default:
> > -				netif_warn(priv, link, priv->dev,
> > -					   "broken speed: %d\n", phydev->speed);
> > -				phydev->speed = SPEED_UNKNOWN;
> > -				break;
> > -			}
> > -			if (phydev->speed != SPEED_UNKNOWN)
> > -				stmmac_hw_fix_mac_speed(priv);
> > -			priv->speed = phydev->speed;
> > -		}
> > -
> > -		writel(ctrl, priv->ioaddr + MAC_CTRL_REG);
> > +		new_state = priv->hw->mac->adjust_link(priv);
> >  
> >  		if (!priv->oldlink) {
> >  			new_state = 1;
> > 
> 
> 
> -- 
> Florian
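
For reference, the shape being discussed in this thread (a per-core
adjust_link hook reached through an ops table, with the review comments
applied: bool new_state, SPEED_* constants, reverse christmas tree
declarations) might look roughly like the sketch below. It is a
self-contained illustration only: every sketch_* name and the SKETCH_CTRL_*
bit values are hypothetical, and a plain struct member stands in for the
MAC control register instead of readl()/writel().

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; the driver takes the real ones from its own
 * headers and from <linux/ethtool.h>. */
#define SPEED_10	10
#define SPEED_100	100
#define SPEED_1000	1000
#define SKETCH_CTRL_PS	(1u << 15)	/* port select (hypothetical bit) */
#define SKETCH_CTRL_FES	(1u << 14)	/* 100 Mb/s select (hypothetical bit) */

struct sketch_priv;

/* Per-core hook, mirroring the adjust_link member added to stmmac_ops. */
struct sketch_ops {
	bool (*adjust_link)(struct sketch_priv *priv, int phy_speed);
};

struct sketch_priv {
	const struct sketch_ops *ops;
	uint32_t ctrl;	/* stands in for the MAC control register */
	int speed;	/* last speed that was programmed */
};

/* GMAC-style speed handling; returns true when the link state changed. */
bool sketch_gmac_adjust_link(struct sketch_priv *priv, int phy_speed)
{
	uint32_t ctrl = priv->ctrl;
	bool new_state = false;

	if (phy_speed != priv->speed) {
		new_state = true;
		switch (phy_speed) {
		case SPEED_1000:
			ctrl &= ~SKETCH_CTRL_PS;
			break;
		case SPEED_100:
			ctrl |= SKETCH_CTRL_PS | SKETCH_CTRL_FES;
			break;
		case SPEED_10:
			ctrl |= SKETCH_CTRL_PS;
			ctrl &= ~SKETCH_CTRL_FES;	/* clear the bit */
			break;
		default:
			break;	/* unknown speed: leave the bits alone */
		}
		priv->speed = phy_speed;
	}

	priv->ctrl = ctrl;
	return new_state;
}

const struct sketch_ops sketch_gmac_ops = {
	.adjust_link = sketch_gmac_adjust_link,
};
```

The generic code then only needs priv->ops->adjust_link(priv, speed), and a
core such as dwmac-sun8i can supply its own hook without adding more
"if (has_...)" branches to the common path.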

^ permalink raw reply

* Re: [PATCH] net: Improve handling of failures on link and route dumps
From: David Miller @ 2017-05-16 18:54 UTC (permalink / raw)
  To: dsahern; +Cc: netdev, mq
In-Reply-To: <20170516061917.8128-1-dsahern@gmail.com>

From: David Ahern <dsahern@gmail.com>
Date: Mon, 15 May 2017 23:19:17 -0700

> In general, rtnetlink dumps do not anticipate failure to dump a single
> object (e.g., link or route) on a single pass. As both route and link
> objects have grown via more attributes, that is no longer a given.
> 
> netlink dumps can handle a failure if the dump function returns an
> error; specifically, netlink_dump adds the return code to the response
> if it is <= 0 so userspace is notified of the failure. The missing
> piece is the rtnetlink dump functions returning the error.
> 
> Fix route and link dump functions to return the errors if no object is
> added to an skb (detected by skb->len != 0). IPv6 route dumps
> (rt6_dump_route) already return the error; this patch updates IPv4 and
> link dumps. Other dump functions may need to be adjusted as well.
> 
> Reported-by: Jan Moskyto Matejka <mq@ucw.cz>
> Signed-off-by: David Ahern <dsahern@gmail.com>
> ---
> The recent IPv6 multipath change brought this to light because of the
> ease with which ipv6 route appends can exceed a buffer size, but it seems
> to be a day 1 problem.

Applied and queued up for -stable, thanks David.
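
The rule described in the commit message (return the error only when not
even one object made it into the buffer, otherwise return the length so
the dump continues on the next pass) can be sketched in plain C as below.
This is a simplified model under stated assumptions, not the rtnetlink
code: struct sketch_buf stands in for the skb, and sketch_fill_one() and
sketch_dump() are hypothetical names.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb: tracks only bytes used and capacity. */
struct sketch_buf {
	size_t len;
	size_t cap;
};

/* Try to append one object; fails with -EMSGSIZE if it does not fit. */
int sketch_fill_one(struct sketch_buf *buf, size_t obj_size)
{
	if (buf->len + obj_size > buf->cap)
		return -EMSGSIZE;
	buf->len += obj_size;
	return 0;
}

/* One dump pass starting at *idx.
 * Returns the number of bytes used when at least one object was added
 * (the caller is expected to call again with a fresh buffer), or a
 * negative errno when not even a single object fits. */
int sketch_dump(struct sketch_buf *buf, const size_t *sizes, size_t n,
		size_t *idx)
{
	int err = 0;

	while (*idx < n) {
		err = sketch_fill_one(buf, sizes[*idx]);
		if (err < 0)
			break;
		(*idx)++;
	}

	/* No progress at all: propagate the error so userspace is told,
	 * instead of retrying the same oversized object forever. */
	if (err < 0 && buf->len == 0)
		return err;

	return (int)buf->len;
}
```

With a 100-byte buffer and objects of 40, 40 and 200 bytes, the first pass
returns 80 and the second pass, on an empty buffer, returns -EMSGSIZE.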

^ permalink raw reply

* Re: [PATCH net v1] net/smc: Add warning about remote memory exposure
From: David Miller @ 2017-05-16 18:53 UTC (permalink / raw)
  To: leon; +Cc: ubraun, netdev, iinux-rdma, hch
In-Reply-To: <20170516065138.24789-1-leon@kernel.org>

From: Leon Romanovsky <leon@kernel.org>
Date: Tue, 16 May 2017 09:51:38 +0300

> From: Christoph Hellwig <hch@lst.de>
> 
> The driver explicitly bypasses APIs to register all memory once a
> connection is made, and thus allows remote access to memory.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Leon Romanovsky <leon@kernel.org>

Applied.

^ permalink raw reply

* Re: [PATCH V3 net 1/1] smc: switch to usage of IB_PD_UNSAFE_GLOBAL_RKEY
From: David Miller @ 2017-05-16 18:53 UTC (permalink / raw)
  To: ubraun-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8
  Cc: hch-jcswGhMUV9g, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-s390-u79uwXL29TY76Z2rM5mHXA,
	jwi-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	schwidefsky-tA70FqPdS9bQT0dZR+AlfA,
	heiko.carstens-tA70FqPdS9bQT0dZR+AlfA,
	raspl-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8
In-Reply-To: <20170515153337.31262-2-ubraun-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>

From: Ursula Braun <ubraun-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Date: Mon, 15 May 2017 17:33:37 +0200

> Currently, SMC enables remote access to physical memory when a user
> has successfully configured and established an SMC-connection until ten
> minutes after the last SMC connection is closed. Because this is considered
> a security risk, drivers are supposed to use IB_PD_UNSAFE_GLOBAL_RKEY in
> such a case.
> 
> This patch changes the current SMC code to use IB_PD_UNSAFE_GLOBAL_RKEY.
> This improves user awareness, but does not remove the security risk itself.
> 
> Signed-off-by: Ursula Braun <ubraun-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>

Applied.

^ permalink raw reply
