* Re: [PATCH net-next 01/10] net: stmmac: Let descriptor code set skbuff address
From: David Miller @ 2018-05-10 19:06 UTC (permalink / raw)
To: Jose.Abreu
Cc: netdev, Joao.Pinto, Vitor.Soares, peppe.cavallaro,
alexandre.torgue
In-Reply-To: <cfc227a5bce0caf2058a8efe1bf5071b9dec48e0.1525683833.git.joabreu@synopsys.com>
From: Jose Abreu <Jose.Abreu@synopsys.com>
Date: Tue, 8 May 2018 15:45:24 +0100
> Stop using if conditions depending on the GMAC version for setting the
> descriptor skbuff address and use instead a helper implemented in
> the descriptor files.
>
> Signed-off-by: Jose Abreu <joabreu@synopsys.com>
With Spectre mitigations, indirect calls are extremely expensive. Much
more expensive than conditional checks.
And since this is the descriptor setup in the fast paths of the driver,
I advise that you keep these conditionals.
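
For illustration only, and not code from the patch: a minimal C sketch of
the two dispatch styles being compared above. The four-word descriptor,
the field chosen per version, and every name below are hypothetical
stand-ins, not the stmmac structures. Variant A goes through a function
pointer (the kind of call that becomes a retpoline under Spectre v2
mitigations); variant B keeps the version check as a plain branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: which word holds the buffer address is
 * assumed to depend on the hardware version. */
struct desc {
	uint32_t des0, des1, des2, des3;
};

/* Hypothetical per-version ops table. */
struct desc_ops {
	void (*set_addr)(struct desc *d, uint32_t addr);
};

static void old_set_addr(struct desc *d, uint32_t addr) { d->des2 = addr; }
static void new_set_addr(struct desc *d, uint32_t addr) { d->des0 = addr; }

static const struct desc_ops old_ops = { .set_addr = old_set_addr };
static const struct desc_ops new_ops = { .set_addr = new_set_addr };

/* Variant A: indirect call through the ops table. */
static void set_addr_indirect(const struct desc_ops *ops, struct desc *d,
			      uint32_t addr)
{
	assert(ops != NULL && ops->set_addr != NULL);
	ops->set_addr(d, addr);
}

/* Variant B: conditional on the version, no indirect call. */
static void set_addr_branch(int is_new, struct desc *d, uint32_t addr)
{
	if (is_new)
		d->des0 = addr;
	else
		d->des2 = addr;
}
```

Both variants leave the descriptor in the same state; the difference is
only in how the per-version code is reached on the fast path.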
* Re: [PATCH net-next] net:sched: add gkprio scheduler
From: Michel Machado @ 2018-05-10 19:06 UTC (permalink / raw)
To: Cong Wang
Cc: Nishanth Devarajan, Jiri Pirko, Jamal Hadi Salim, David Miller,
Linux Kernel Network Developers, Cody Doucette
In-Reply-To: <CAM_iQpX5iwRN5C7oQ3X9F0viG8Ya15C0gcYFR_CJDxqhXCPYqQ@mail.gmail.com>
On 05/10/2018 01:38 PM, Cong Wang wrote:
> On Wed, May 9, 2018 at 7:09 AM, Michel Machado <michel@digirati.com.br> wrote:
>> On 05/08/2018 10:24 PM, Cong Wang wrote:
>>>
>>> On Tue, May 8, 2018 at 5:59 AM, Michel Machado <michel@digirati.com.br>
>>> wrote:
>>>>>>
>>>>>> Overall it looks good to me, just one thing below:
>>>>>>
>>>>>>> +struct Qdisc_ops gkprio_qdisc_ops __read_mostly = {
>>>>>>> + .id = "gkprio",
>>>>>>> + .priv_size = sizeof(struct gkprio_sched_data),
>>>>>>> + .enqueue = gkprio_enqueue,
>>>>>>> + .dequeue = gkprio_dequeue,
>>>>>>> + .peek = qdisc_peek_dequeued,
>>>>>>> + .init = gkprio_init,
>>>>>>> + .reset = gkprio_reset,
>>>>>>> + .change = gkprio_change,
>>>>>>> + .dump = gkprio_dump,
>>>>>>> + .destroy = gkprio_destroy,
>>>>>>> + .owner = THIS_MODULE,
>>>>>>> +};
>>>>>>
>>>>>>
>>>>>>
>>>>>> You probably want to add Qdisc_class_ops here so that you can
>>>>>> dump the stats of each internal queue.
>>>>
>>>>
>>>>
>>>> Hi Cong,
>>>>
>>>> In the production scenario we are targeting, this priority queue must
>>>> be
>>>> classless; being classful would only bloat the code for us. I don't see
>>>> making this queue classful as a problem per se, but I suggest leaving it
>>>> as
>>>> a future improvement for when someone can come up with a useful scenario
>>>> for
>>>> it.
>>>
>>>
>>>
>>> Take a look at sch_prio, it is fairly simple since your internal
>>> queues are just an array... Per-queue stats are quite useful
>>> in production, we definitely want to observe which queues are
>>> full which are not.
>>>
>>
>> DSprio cannot add Qdisc_class_ops without a rewrite of other queue
>> disciplines, which doesn't seem desirable. Since the method cops->leaf is
>> required (see register_qdisc()), we would need to replace the array struct
>> sk_buff_head qdiscs[GKPRIO_MAX_PRIORITY] in struct gkprio_sched_data with
>> the array struct Qdisc *queues[GKPRIO_MAX_PRIORITY] to be able to return a
>> Qdisc in dsprio_leaf(). The problem with this change is that Qdisc does not
>> have a method to dequeue from its tail. This new method may not even make
>> sense in other queue disciplines. But without this method, gkprio_enqueue()
>> cannot drop the lowest priority packet when the queue is full and an
>> incoming packet has higher priority.
>
> Sorry for giving you a bad example. Take a look at sch_fq_codel instead,
> it returns NULL for ->leaf() and maps its internal flows to classes.
>
> I thought sch_prio uses internal qdiscs, but I was wrong, as you noticed
> it actually exposes them to user via classes.
>
> My point is never to make it classful, just want to expose the useful stats,
> like how fq_codel dumps its internal flows.
>
>
>>
>> Nevertheless, I see your point on being able to observe the distribution of
>> queued packets per priority. A solution for that would be to add the array
>> __u32 qlen[GKPRIO_MAX_PRIORITY] in struct tc_gkprio_qopt. This solution even
>> avoids adding overhead in the critical paths of DSprio. Do you see a better
>> solution?
>
> I believe you can return NULL for ->leaf() and don't need to worry about
> ->graft() either. ;)
Thank you for pointing sch_fq_codel out. We'll follow its example.
[ ]'s
Michel Machado
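
For illustration only, and not code from the gkprio patch: a minimal C
sketch of the two ideas discussed in this thread, namely keeping a
per-priority queue length for stats and, when the queue is full, dropping
a lowest-priority packet to admit a higher-priority one. It tracks counts
only (no packet buffers), assumes a larger number means higher priority,
and MAX_PRIO, QUEUE_LIMIT and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PRIO    4	/* hypothetical; stands in for GKPRIO_MAX_PRIORITY */
#define QUEUE_LIMIT 3	/* hypothetical total packet limit */

struct prio_sched {
	uint32_t qlen[MAX_PRIO];	/* packets queued per priority */
	uint32_t total;			/* sum of qlen[] */
};

enum enq_result { ENQ_OK, ENQ_REPLACED, ENQ_DROPPED };

/* Lowest priority that currently has packets, or MAX_PRIO if empty. */
static unsigned int lowest_busy(const struct prio_sched *s)
{
	unsigned int p;

	for (p = 0; p < MAX_PRIO; p++)
		if (s->qlen[p])
			return p;
	return MAX_PRIO;
}

static enum enq_result prio_enqueue(struct prio_sched *s, unsigned int prio)
{
	unsigned int low;

	assert(prio < MAX_PRIO);

	if (s->total < QUEUE_LIMIT) {
		s->qlen[prio]++;
		s->total++;
		return ENQ_OK;
	}

	/* Queue is full: evict one lowest-priority packet, but only if
	 * the incoming packet has strictly higher priority. */
	low = lowest_busy(s);
	if (low < prio) {
		s->qlen[low]--;
		s->qlen[prio]++;
		return ENQ_REPLACED;
	}
	return ENQ_DROPPED;
}
```

The qlen[] array is what a dump routine could report per priority; how
that is exposed to user space is the open question in the thread and is
not modelled here.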
* Re: [PATCH net-next 00/10] net: stmmac: Be less dependent on Synopsys ID
From: David Miller @ 2018-05-10 19:06 UTC (permalink / raw)
To: Jose.Abreu
Cc: netdev, Joao.Pinto, Vitor.Soares, peppe.cavallaro,
alexandre.torgue
In-Reply-To: <cover.1525683832.git.joabreu@synopsys.com>
From: Jose Abreu <Jose.Abreu@synopsys.com>
Date: Tue, 8 May 2018 15:45:23 +0100
> This series was based on top of [1]. Please see my reply on [1] and let me
> know if I need to rebase against -next without this dependency.
...
> [1] https://patchwork.ozlabs.org/patch/908618/
V1 of that patch is what ended up in my tree before you even posted
V2. So if you have to update to what's in V2 you must send in a
relative patch against net-next to fix it up.
Thank you.
* Re: [PATCH -next 0/6] net: Update static keys to modern api
From: David Miller @ 2018-05-10 19:14 UTC (permalink / raw)
To: dave; +Cc: netdev, linux-kernel
In-Reply-To: <20180508160703.8125-1-dave@stgolabs.net>
From: Davidlohr Bueso <dave@stgolabs.net>
Date: Tue, 8 May 2018 09:06:57 -0700
> The following patches update pretty much all core net static key users
> to the modern api. Changes are mostly trivial conversion without affecting
> any semantics. The motivation is a resend of patches 1 and 2 from a while[1]
> back, and the rest are added patches, specific for -net.
>
> Applies against today's linux-next. Compile tested only.
>
> [1] lkml.kernel.org/r/20180326210929.5244-1-dave@stgolabs.net
Series applied, thank you.
* Re: [PATCH net-next] tun: Do SIOCGSKNS out of rtnl_lock()
From: David Miller @ 2018-05-10 19:17 UTC (permalink / raw)
To: ktkhai; +Cc: jasowang, edumazet, mst, brouer, peterpenkov96, sd, netdev
In-Reply-To: <152579647246.21100.10461408116587658568.stgit@localhost.localdomain>
From: Kirill Tkhai <ktkhai@virtuozzo.com>
Date: Tue, 08 May 2018 19:21:34 +0300
> Since net ns of tun device is assigned on the device creation,
> and it never changes, we do not need to use any lock to get it
> from alive tun.
>
> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Applied, thank you.
* Re: [PATCH net 0/2] qed*: Rdma fixes
From: David Miller @ 2018-05-10 19:22 UTC (permalink / raw)
To: Michal.Kalderon; +Cc: netdev, linux-rdma, chad.dupuis, Sudarsana.Kalluru
In-Reply-To: <20180508182919.23590-1-Michal.Kalderon@cavium.com>
From: Michal Kalderon <Michal.Kalderon@cavium.com>
Date: Tue, 8 May 2018 21:29:17 +0300
> This patch series includes two fixes for bugs related to rdma.
> The first has to do with loading the driver over an iWARP
> device.
> The second fixes a previous commit that added proper link
> indication for iWARP / RoCE.
>
> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
> Signed-off-by: Sudarsana Kalluru <Sudarsana.Kalluru@cavium.com>
Series applied, thanks.
* Re: [PATCH] [ATM] firestream: fix spelling mistake: "reseverd" -> "reserved"
From: David Miller @ 2018-05-10 19:24 UTC (permalink / raw)
To: colin.king
Cc: 3chas3, linux-atm-general, netdev, kernel-janitors, linux-kernel
In-Reply-To: <20180508220151.20725-1-colin.king@canonical.com>
From: Colin King <colin.king@canonical.com>
Date: Tue, 8 May 2018 23:01:51 +0100
> From: Colin Ian King <colin.king@canonical.com>
>
> Trivial fix to spelling mistake in res_strings string array
>
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
Applied.
* Re: [PATCH] sctp: fix spelling mistake: "max_retans" -> "max_retrans"
From: David Miller @ 2018-05-10 19:24 UTC (permalink / raw)
To: colin.king
Cc: vyasevich, nhorman, marcelo.leitner, linux-sctp, netdev,
kernel-janitors, linux-kernel
In-Reply-To: <20180508222428.24874-1-colin.king@canonical.com>
From: Colin King <colin.king@canonical.com>
Date: Tue, 8 May 2018 23:24:28 +0100
> From: Colin Ian King <colin.king@canonical.com>
>
> Trivial fix to spelling mistake in error string
>
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
Applied.
* Re: [PATCH] net/9p: fix spelling mistake: "suspsend" -> "suspend"
From: David Miller @ 2018-05-10 19:24 UTC (permalink / raw)
To: colin.king
Cc: ericvh, rminnich, lucho, v9fs-developer, netdev, kernel-janitors,
linux-kernel
In-Reply-To: <20180509094833.15871-1-colin.king@canonical.com>
From: Colin King <colin.king@canonical.com>
Date: Wed, 9 May 2018 10:48:33 +0100
> From: Colin Ian King <colin.king@canonical.com>
>
> Trivial fix to spelling mistake in dev_warn message text
>
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
Applied.
* Re: [PATCH net-next] net: dsa: fix added_by_user switchdev notification
From: David Miller @ 2018-05-10 19:27 UTC (permalink / raw)
To: vivien.didelot
Cc: netdev, linux-kernel, kernel, petrm, jiri, idosch, ivecera,
stephen, andrew, f.fainelli, nikolay, bridge
In-Reply-To: <20180509030312.29548-1-vivien.didelot@savoirfairelinux.com>
From: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Date: Tue, 8 May 2018 23:03:12 -0400
> Commit 161d82de1ff8 ("net: bridge: Notify about !added_by_user FDB
> entries") causes the below oops when bringing up a slave interface,
> because dsa_port_fdb_add is still scheduled, but with a NULL address.
>
> To fix this, keep the dsa_slave_switchdev_event function agnostic of the
> notified info structure and handle the added_by_user flag in the
> specific dsa_slave_switchdev_event_work function.
...
> Fixes: 816a3bed9549 ("switchdev: Add fdb.added_by_user to switchdev notifications")
> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Applied, thanks Vivien.
* Re: [bpf-next v3 8/9] bpf: Provide helper to do forwarding lookups in kernel FIB table
From: Mathieu Xhonneux @ 2018-05-10 19:27 UTC (permalink / raw)
To: David Ahern
Cc: netdev, borkmann, ast, David Miller, shm, roopa, brouer, toke,
john.fastabend
In-Reply-To: <20180510033427.20756-9-dsahern@gmail.com>
I'm quite interested in this helper to implement OAM features (through
other hooks, e.g. the BPF LWT hook). Do you have an idea of how it
behaves with ECMP routes (with IPv4 and/or IPv6)? In IPv6, I'm
guessing that the returned gateway address is always a link-local
address?
Thanks.
2018-05-10 5:34 GMT+02:00 David Ahern <dsahern@gmail.com>:
>
> Provide a helper for doing a FIB and neighbor lookup in the kernel
> tables from an XDP program. The helper provides a fastpath for forwarding
> packets. If the packet is a local delivery or for any reason is not a
> simple lookup and forward, the packet continues up the stack.
>
> If it is to be forwarded, the forwarding can be done directly if the
> neighbor is already known. If the neighbor does not exist, the first
> few packets go up the stack for neighbor resolution. Once resolved, the
> xdp program provides the fast path.
>
> On successful lookup the nexthop dmac, current device smac and egress
> device index are returned.
>
> The API supports IPv4, IPv6 and MPLS protocols, but only IPv4 and IPv6
> are implemented in this patch. The API includes layer 4 parameters if
> the XDP program chooses to do deep packet inspection to allow compare
> against ACLs implemented as FIB rules.
>
> Header rewrite is left to the XDP program.
>
> The lookup takes 2 flags:
> - BPF_FIB_LOOKUP_DIRECT to do a lookup that bypasses FIB rules and goes
> straight to the table associated with the device (expert setting for
> those looking to maximize throughput)
>
> - BPF_FIB_LOOKUP_OUTPUT to do a lookup from the egress perspective.
> Default is an ingress lookup.
>
> Initial performance numbers collected by Jesper, forwarded packets/sec:
>
> Full stack XDP FIB lookup XDP Direct lookup
> IPv4 1,947,969 7,074,156 7,415,333
> IPv6 1,728,000 6,165,504 7,262,720
>
> These numbers are single CPU core forwarding on a Broadwell
> E5-1650 v4 @ 3.60GHz.
>
> Signed-off-by: David Ahern <dsahern@gmail.com>
> ---
> include/uapi/linux/bpf.h | 81 +++++++++++++-
> net/core/filter.c | 267 +++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 347 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index d615c777b573..02e4112510f8 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1828,6 +1828,33 @@ union bpf_attr {
> * Return
> * 0 on success, or a negative error in case of failure.
> *
> + *
> + * int bpf_fib_lookup(void *ctx, struct bpf_fib_lookup *params, int plen, u32 flags)
> + * Description
> + * Do FIB lookup in kernel tables using parameters in *params*.
> + * If lookup is successful and result shows packet is to be
> + * forwarded, the neighbor tables are searched for the nexthop.
> + * If successful (ie., FIB lookup shows forwarding and nexthop
> + * is resolved), the nexthop address is returned in ipv4_dst,
> + * ipv6_dst or mpls_out based on family, smac is set to mac
> + * address of egress device, dmac is set to nexthop mac address,
> + * rt_metric is set to metric from route.
> + *
> + * *plen* argument is the size of the passed in struct.
> + * *flags* argument can be one or more BPF_FIB_LOOKUP_ flags:
> + *
> + * **BPF_FIB_LOOKUP_DIRECT** means do a direct table lookup vs
> + * full lookup using FIB rules
> + * **BPF_FIB_LOOKUP_OUTPUT** means do lookup from an egress
> + * perspective (default is ingress)
> + *
> + * *ctx* is either **struct xdp_md** for XDP programs or
> + * **struct sk_buff** tc cls_act programs.
> + *
> + * Return
> + * Egress device index on success, 0 if packet needs to continue
> + * up the stack for further processing or a negative error in case
> + * of failure.
> */
> #define __BPF_FUNC_MAPPER(FN) \
> FN(unspec), \
> @@ -1898,7 +1925,8 @@ union bpf_attr {
> FN(xdp_adjust_tail), \
> FN(skb_get_xfrm_state), \
> FN(get_stack), \
> - FN(skb_load_bytes_relative),
> + FN(skb_load_bytes_relative), \
> + FN(fib_lookup),
>
> /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> * function eBPF program intends to call
> @@ -2321,4 +2349,55 @@ struct bpf_raw_tracepoint_args {
> __u64 args[0];
> };
>
> +/* DIRECT: Skip the FIB rules and go to FIB table associated with device
> + * OUTPUT: Do lookup from egress perspective; default is ingress
> + */
> +#define BPF_FIB_LOOKUP_DIRECT BIT(0)
> +#define BPF_FIB_LOOKUP_OUTPUT BIT(1)
> +
> +struct bpf_fib_lookup {
> + /* input */
> + __u8 family; /* network family, AF_INET, AF_INET6, AF_MPLS */
> +
> + /* set if lookup is to consider L4 data - e.g., FIB rules */
> + __u8 l4_protocol;
> + __be16 sport;
> + __be16 dport;
> +
> + /* total length of packet from network header - used for MTU check */
> + __u16 tot_len;
> + __u32 ifindex; /* L3 device index for lookup */
> +
> + union {
> + /* inputs to lookup */
> + __u8 tos; /* AF_INET */
> + __be32 flowlabel; /* AF_INET6 */
> +
> + /* output: metric of fib result */
> + __u32 rt_metric;
> + };
> +
> + union {
> + __be32 mpls_in;
> + __be32 ipv4_src;
> + __u32 ipv6_src[4]; /* in6_addr; network order */
> + };
> +
> + /* input to bpf_fib_lookup, *dst is destination address.
> + * output: bpf_fib_lookup sets to gateway address
> + */
> + union {
> + /* return for MPLS lookups */
> + __be32 mpls_out[4]; /* support up to 4 labels */
> + __be32 ipv4_dst;
> + __u32 ipv6_dst[4]; /* in6_addr; network order */
> + };
> +
> + /* output */
> + __be16 h_vlan_proto;
> + __be16 h_vlan_TCI;
> + __u8 smac[6]; /* ETH_ALEN */
> + __u8 dmac[6]; /* ETH_ALEN */
> +};
> +
> #endif /* _UAPI__LINUX_BPF_H__ */
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 0baa715e4699..ca60d2872da4 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -60,6 +60,10 @@
> #include <net/xfrm.h>
> #include <linux/bpf_trace.h>
> #include <net/xdp_sock.h>
> +#include <linux/inetdevice.h>
> +#include <net/ip_fib.h>
> +#include <net/flow.h>
> +#include <net/arp.h>
>
> /**
> * sk_filter_trim_cap - run a packet through a socket filter
> @@ -4032,6 +4036,265 @@ static const struct bpf_func_proto bpf_skb_get_xfrm_state_proto = {
> };
> #endif
>
> +#if IS_ENABLED(CONFIG_INET) || IS_ENABLED(CONFIG_IPV6)
> +static int bpf_fib_set_fwd_params(struct bpf_fib_lookup *params,
> + const struct neighbour *neigh,
> + const struct net_device *dev)
> +{
> + memcpy(params->dmac, neigh->ha, ETH_ALEN);
> + memcpy(params->smac, dev->dev_addr, ETH_ALEN);
> + params->h_vlan_TCI = 0;
> + params->h_vlan_proto = 0;
> +
> + return dev->ifindex;
> +}
> +#endif
> +
> +#if IS_ENABLED(CONFIG_INET)
> +static int bpf_ipv4_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
> + u32 flags)
> +{
> + struct in_device *in_dev;
> + struct neighbour *neigh;
> + struct net_device *dev;
> + struct fib_result res;
> + struct fib_nh *nh;
> + struct flowi4 fl4;
> + int err;
> +
> + dev = dev_get_by_index_rcu(net, params->ifindex);
> + if (unlikely(!dev))
> + return -ENODEV;
> +
> + /* verify forwarding is enabled on this interface */
> + in_dev = __in_dev_get_rcu(dev);
> + if (unlikely(!in_dev || !IN_DEV_FORWARD(in_dev)))
> + return 0;
> +
> + if (flags & BPF_FIB_LOOKUP_OUTPUT) {
> + fl4.flowi4_iif = 1;
> + fl4.flowi4_oif = params->ifindex;
> + } else {
> + fl4.flowi4_iif = params->ifindex;
> + fl4.flowi4_oif = 0;
> + }
> + fl4.flowi4_tos = params->tos & IPTOS_RT_MASK;
> + fl4.flowi4_scope = RT_SCOPE_UNIVERSE;
> + fl4.flowi4_flags = 0;
> +
> + fl4.flowi4_proto = params->l4_protocol;
> + fl4.daddr = params->ipv4_dst;
> + fl4.saddr = params->ipv4_src;
> + fl4.fl4_sport = params->sport;
> + fl4.fl4_dport = params->dport;
> +
> + if (flags & BPF_FIB_LOOKUP_DIRECT) {
> + u32 tbid = l3mdev_fib_table_rcu(dev) ? : RT_TABLE_MAIN;
> + struct fib_table *tb;
> +
> + tb = fib_get_table(net, tbid);
> + if (unlikely(!tb))
> + return 0;
> +
> + err = fib_table_lookup(tb, &fl4, &res, FIB_LOOKUP_NOREF);
> + } else {
> + fl4.flowi4_mark = 0;
> + fl4.flowi4_secid = 0;
> + fl4.flowi4_tun_key.tun_id = 0;
> + fl4.flowi4_uid = sock_net_uid(net, NULL);
> +
> + err = fib_lookup(net, &fl4, &res, FIB_LOOKUP_NOREF);
> + }
> +
> + if (err || res.type != RTN_UNICAST)
> + return 0;
> +
> + if (res.fi->fib_nhs > 1)
> + fib_select_path(net, &res, &fl4, NULL);
> +
> + nh = &res.fi->fib_nh[res.nh_sel];
> +
> + /* do not handle lwt encaps right now */
> + if (nh->nh_lwtstate)
> + return 0;
> +
> + dev = nh->nh_dev;
> + if (unlikely(!dev))
> + return 0;
> +
> + if (nh->nh_gw)
> + params->ipv4_dst = nh->nh_gw;
> +
> + params->rt_metric = res.fi->fib_priority;
> +
> + /* xdp and cls_bpf programs are run in RCU-bh so
> + * rcu_read_lock_bh is not needed here
> + */
> + neigh = __ipv4_neigh_lookup_noref(dev, (__force u32)params->ipv4_dst);
> + if (neigh)
> + return bpf_fib_set_fwd_params(params, neigh, dev);
> +
> + return 0;
> +}
> +#endif
> +
> +#if IS_ENABLED(CONFIG_IPV6)
> +static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
> + u32 flags)
> +{
> + struct in6_addr *src = (struct in6_addr *) params->ipv6_src;
> + struct in6_addr *dst = (struct in6_addr *) params->ipv6_dst;
> + struct neighbour *neigh;
> + struct net_device *dev;
> + struct inet6_dev *idev;
> + struct fib6_info *f6i;
> + struct flowi6 fl6;
> + int strict = 0;
> + int oif;
> +
> + /* link local addresses are never forwarded */
> + if (rt6_need_strict(dst) || rt6_need_strict(src))
> + return 0;
> +
> + dev = dev_get_by_index_rcu(net, params->ifindex);
> + if (unlikely(!dev))
> + return -ENODEV;
> +
> + idev = __in6_dev_get_safely(dev);
> + if (unlikely(!idev || !net->ipv6.devconf_all->forwarding))
> + return 0;
> +
> + if (flags & BPF_FIB_LOOKUP_OUTPUT) {
> + fl6.flowi6_iif = 1;
> + oif = fl6.flowi6_oif = params->ifindex;
> + } else {
> + oif = fl6.flowi6_iif = params->ifindex;
> + fl6.flowi6_oif = 0;
> + strict = RT6_LOOKUP_F_HAS_SADDR;
> + }
> + fl6.flowlabel = params->flowlabel;
> + fl6.flowi6_scope = 0;
> + fl6.flowi6_flags = 0;
> + fl6.mp_hash = 0;
> +
> + fl6.flowi6_proto = params->l4_protocol;
> + fl6.daddr = *dst;
> + fl6.saddr = *src;
> + fl6.fl6_sport = params->sport;
> + fl6.fl6_dport = params->dport;
> +
> + if (flags & BPF_FIB_LOOKUP_DIRECT) {
> + u32 tbid = l3mdev_fib_table_rcu(dev) ? : RT_TABLE_MAIN;
> + struct fib6_table *tb;
> +
> + tb = ipv6_stub->fib6_get_table(net, tbid);
> + if (unlikely(!tb))
> + return 0;
> +
> + f6i = ipv6_stub->fib6_table_lookup(net, tb, oif, &fl6, strict);
> + } else {
> + fl6.flowi6_mark = 0;
> + fl6.flowi6_secid = 0;
> + fl6.flowi6_tun_key.tun_id = 0;
> + fl6.flowi6_uid = sock_net_uid(net, NULL);
> +
> + f6i = ipv6_stub->fib6_lookup(net, oif, &fl6, strict);
> + }
> +
> + if (unlikely(IS_ERR_OR_NULL(f6i) || f6i == net->ipv6.fib6_null_entry))
> + return 0;
> +
> + if (unlikely(f6i->fib6_flags & RTF_REJECT ||
> + f6i->fib6_type != RTN_UNICAST))
> + return 0;
> +
> + if (f6i->fib6_nsiblings && fl6.flowi6_oif == 0)
> + f6i = ipv6_stub->fib6_multipath_select(net, f6i, &fl6,
> + fl6.flowi6_oif, NULL,
> + strict);
> +
> + if (f6i->fib6_nh.nh_lwtstate)
> + return 0;
> +
> + if (f6i->fib6_flags & RTF_GATEWAY)
> + *dst = f6i->fib6_nh.nh_gw;
> +
> + dev = f6i->fib6_nh.nh_dev;
> + params->rt_metric = f6i->fib6_metric;
> +
> + /* xdp and cls_bpf programs are run in RCU-bh so rcu_read_lock_bh is
> + * not needed here. Can not use __ipv6_neigh_lookup_noref here
> + * because we need to get nd_tbl via the stub
> + */
> + neigh = ___neigh_lookup_noref(ipv6_stub->nd_tbl, neigh_key_eq128,
> + ndisc_hashfn, dst, dev);
> + if (neigh)
> + return bpf_fib_set_fwd_params(params, neigh, dev);
> +
> + return 0;
> +}
> +#endif
> +
> +BPF_CALL_4(bpf_xdp_fib_lookup, struct xdp_buff *, ctx,
> + struct bpf_fib_lookup *, params, int, plen, u32, flags)
> +{
> + if (plen < sizeof(*params))
> + return -EINVAL;
> +
> + switch (params->family) {
> +#if IS_ENABLED(CONFIG_INET)
> + case AF_INET:
> + return bpf_ipv4_fib_lookup(dev_net(ctx->rxq->dev), params,
> + flags);
> +#endif
> +#if IS_ENABLED(CONFIG_IPV6)
> + case AF_INET6:
> + return bpf_ipv6_fib_lookup(dev_net(ctx->rxq->dev), params,
> + flags);
> +#endif
> + }
> + return 0;
> +}
> +
> +static const struct bpf_func_proto bpf_xdp_fib_lookup_proto = {
> + .func = bpf_xdp_fib_lookup,
> + .gpl_only = true,
> + .ret_type = RET_INTEGER,
> + .arg1_type = ARG_PTR_TO_CTX,
> + .arg2_type = ARG_PTR_TO_MEM,
> + .arg3_type = ARG_CONST_SIZE,
> + .arg4_type = ARG_ANYTHING,
> +};
> +
> +BPF_CALL_4(bpf_skb_fib_lookup, struct sk_buff *, skb,
> + struct bpf_fib_lookup *, params, int, plen, u32, flags)
> +{
> + if (plen < sizeof(*params))
> + return -EINVAL;
> +
> + switch (params->family) {
> +#if IS_ENABLED(CONFIG_INET)
> + case AF_INET:
> + return bpf_ipv4_fib_lookup(dev_net(skb->dev), params, flags);
> +#endif
> +#if IS_ENABLED(CONFIG_IPV6)
> + case AF_INET6:
> + return bpf_ipv6_fib_lookup(dev_net(skb->dev), params, flags);
> +#endif
> + }
> + return -ENOTSUPP;
> +}
> +
> +static const struct bpf_func_proto bpf_skb_fib_lookup_proto = {
> + .func = bpf_skb_fib_lookup,
> + .gpl_only = true,
> + .ret_type = RET_INTEGER,
> + .arg1_type = ARG_PTR_TO_CTX,
> + .arg2_type = ARG_PTR_TO_MEM,
> + .arg3_type = ARG_CONST_SIZE,
> + .arg4_type = ARG_ANYTHING,
> +};
> +
> static const struct bpf_func_proto *
> bpf_base_func_proto(enum bpf_func_id func_id)
> {
> @@ -4181,6 +4444,8 @@ tc_cls_act_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> case BPF_FUNC_skb_get_xfrm_state:
> return &bpf_skb_get_xfrm_state_proto;
> #endif
> + case BPF_FUNC_fib_lookup:
> + return &bpf_skb_fib_lookup_proto;
> default:
> return bpf_base_func_proto(func_id);
> }
> @@ -4206,6 +4471,8 @@ xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> return &bpf_xdp_redirect_map_proto;
> case BPF_FUNC_xdp_adjust_tail:
> return &bpf_xdp_adjust_tail_proto;
> + case BPF_FUNC_fib_lookup:
> + return &bpf_xdp_fib_lookup_proto;
> default:
> return bpf_base_func_proto(func_id);
> }
> --
> 2.11.0
>
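
For illustration only, and not part of the quoted patch: a minimal
user-space C sketch of how a caller might interpret the return value, as
documented in the helper description in this version of the patch
(positive: egress device index; 0: packet continues up the stack;
negative: error). The enum and function names are hypothetical, and the
sketch does not call the real helper.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical action a caller could derive from the lookup result. */
enum fwd_action {
	FWD_REDIRECT,		/* forward out of *egress_ifindex */
	FWD_PASS_TO_STACK,	/* let the regular stack handle it */
	FWD_ERROR,		/* lookup failed, e.g. -ENODEV or -EINVAL */
};

/* Map a return value following the documented convention to an action.
 * *egress_ifindex is written only when FWD_REDIRECT is returned. */
static enum fwd_action classify_fib_result(int ret, int *egress_ifindex)
{
	assert(egress_ifindex != NULL);

	if (ret > 0) {
		*egress_ifindex = ret;
		return FWD_REDIRECT;
	}
	if (ret == 0)
		return FWD_PASS_TO_STACK;
	return FWD_ERROR;
}
```

In an XDP program the three outcomes would typically correspond to a
redirect, a pass and a drop or pass decision respectively; which one to
use on error is a policy choice left to the program.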
* Re: [PATCH net] nfp: flower: remove headroom from max MTU calculation
From: David Miller @ 2018-05-10 19:28 UTC (permalink / raw)
To: jakub.kicinski; +Cc: oss-drivers, netdev, pieter.jansenvanvuuren
In-Reply-To: <20180509071858.11124-1-jakub.kicinski@netronome.com>
From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Wed, 9 May 2018 00:18:58 -0700
> From: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
>
> Since commit 29a5dcae2790 ("nfp: flower: offload phys port MTU change") we
> take encapsulation headroom into account when calculating the max allowed
> MTU. This is unnecessary as the max MTU advertised by firmware should have
> already accounted for encap headroom.
>
> Subtracting headroom twice brings the max MTU below what's necessary for
> some deployments.
>
> Fixes: 29a5dcae2790 ("nfp: flower: offload phys port MTU change")
> Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
> Reviewed-by: John Hurley <john.hurley@netronome.com>
> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Applied.
* Re: [PATCH 09/18] ptp: Remove pr_fmt duplicate logging prefixes
From: Richard Cochran @ 2018-05-10 19:35 UTC (permalink / raw)
To: Joe Perches; +Cc: netdev, linux-kernel
In-Reply-To: <5b315a623c63fd82df0540e27a130aaabd8c3b49.1525964385.git.joe@perches.com>
On Thu, May 10, 2018 at 08:45:35AM -0700, Joe Perches wrote:
> Converting pr_fmt from a simple define to use KBUILD_MODNAME added
> some duplicate logging prefixes to existing uses.
>
> Remove them.
>
> Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
* Re: [PATCH v2] hv_netvsc: Fix net device attach on older Windows hosts
From: David Miller @ 2018-05-10 19:37 UTC (permalink / raw)
To: mgamal; +Cc: sthemmin, netdev, haiyangz, linux-kernel, devel, vkuznets
In-Reply-To: <1525853854-8277-1-git-send-email-mgamal@redhat.com>
From: Mohammed Gamal <mgamal@redhat.com>
Date: Wed, 9 May 2018 10:17:34 +0200
> On older windows hosts the net_device instance is returned to
> the caller of rndis_filter_device_add() without having the presence
> bit set first. This would cause any subsequent calls to network device
> operations (e.g. MTU change, channel change) to fail after the device
> is detached once, returning -ENODEV.
>
> Instead of returning the device instance, we take the exit path where
> we call netif_device_attach().
>
> Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic")
>
> Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Applied.
* Re: [PATCH net] ipv4: reset fnhe_mtu_locked after cache route flushed
From: David Miller @ 2018-05-10 19:43 UTC (permalink / raw)
To: liuhangbin; +Cc: netdev, sd, sbrivio
In-Reply-To: <1525860404-15012-1-git-send-email-liuhangbin@gmail.com>
From: Hangbin Liu <liuhangbin@gmail.com>
Date: Wed, 9 May 2018 18:06:44 +0800
> After route cache is flushed via ipv4_sysctl_rtcache_flush(), we forget
> to reset fnhe_mtu_locked in rt_bind_exception(). When pmtu is updated
> in __ip_rt_update_pmtu(), it will return directly since the pmtu is
> still locked. e.g.
>
> + ip netns exec client ping 10.10.1.1 -c 1 -s 1400 -M do
> PING 10.10.1.1 (10.10.1.1) 1400(1428) bytes of data.
> From 10.10.0.254 icmp_seq=1 Frag needed and DF set (mtu = 0)
>
> --- 10.10.1.1 ping statistics ---
> 1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
>
> + ip netns exec client ip route get 10.10.1.1
> 10.10.1.1 via 10.10.0.254 dev veth0_c src 10.10.0.1 uid 0
> cache expires 599sec mtu lock 552
> + ip netns exec client ip route flush cache
> + ip netns exec client ip route get 10.10.1.1
> 10.10.1.1 via 10.10.0.254 dev veth0_c src 10.10.0.1 uid 0
> cache
> + ip netns exec client ping 10.10.1.1 -c 1 -s 1400 -M do
> PING 10.10.1.1 (10.10.1.1) 1400(1428) bytes of data.
> ping: local error: Message too long, mtu=576
>
> --- 10.10.1.1 ping statistics ---
> 1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
>
> + ip netns exec client ip route get 10.10.1.1
> 10.10.1.1 via 10.10.0.254 dev veth0_c src 10.10.0.1 uid 0
> cache
>
> Fixes: d52e5a7e7ca49 ("ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu")
> Reported-by: Jianlin Shi <jishi@redhat.com>
> Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Applied.
* Re: [PATCH net] udp: fix SO_BINDTODEVICE
From: David Miller @ 2018-05-10 19:43 UTC (permalink / raw)
To: pabeni; +Cc: netdev, dnman, dsahern
In-Reply-To: <9445dd5d149af16463df4d0502b2667ee2b6f4e8.1525862461.git.pabeni@redhat.com>
From: Paolo Abeni <pabeni@redhat.com>
Date: Wed, 9 May 2018 12:42:34 +0200
> Damir reported a breakage of SO_BINDTODEVICE for UDP sockets.
> In absence of VRF devices, after commit fb74c27735f0 ("net:
> ipv4: add second dif to udp socket lookups") the dif mismatch
> isn't fatal anymore for UDP socket lookup with non null
> sk_bound_dev_if, breaking SO_BINDTODEVICE semantics.
>
> This changeset addresses the issue making the dif match mandatory
> again in the above scenario.
>
> Reported-by: Damir Mansurov <dnman@oktetlabs.ru>
> Fixes: fb74c27735f0 ("net: ipv4: add second dif to udp socket lookups")
> Fixes: 1801b570dd2a ("net: ipv6: add second dif to udp socket lookups")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Applied.
* [PATCH net-next] tcp: switch pacing timer to softirq based hrtimer
From: Eric Dumazet @ 2018-05-10 19:49 UTC (permalink / raw)
To: David S . Miller; +Cc: netdev, Eric Dumazet, Eric Dumazet
linux-4.16 got support for softirq based hrtimers.
TCP can switch its pacing hrtimer to this variant, since this
avoids going through a tasklet and some atomic operations.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/ipv4/tcp_output.c | 69 ++++++++++++++++---------------------------
net/ipv4/tcp_timer.c | 2 +-
2 files changed, 26 insertions(+), 45 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index d07c0dcc99aaa55c4da963599c8286c8baa1f783..0d8f950a9006598c70dbf51e281a3fe32dfaa234 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -772,7 +772,7 @@ struct tsq_tasklet {
};
static DEFINE_PER_CPU(struct tsq_tasklet, tsq_tasklet);
-static void tcp_tsq_handler(struct sock *sk)
+static void tcp_tsq_write(struct sock *sk)
{
if ((1 << sk->sk_state) &
(TCPF_ESTABLISHED | TCPF_FIN_WAIT1 | TCPF_CLOSING |
@@ -789,6 +789,16 @@ static void tcp_tsq_handler(struct sock *sk)
0, GFP_ATOMIC);
}
}
+
+static void tcp_tsq_handler(struct sock *sk)
+{
+ bh_lock_sock(sk);
+ if (!sock_owned_by_user(sk))
+ tcp_tsq_write(sk);
+ else if (!test_and_set_bit(TCP_TSQ_DEFERRED, &sk->sk_tsq_flags))
+ sock_hold(sk);
+ bh_unlock_sock(sk);
+}
/*
* One tasklet per cpu tries to send more skbs.
* We run in tasklet context but need to disable irqs when
@@ -816,16 +826,7 @@ static void tcp_tasklet_func(unsigned long data)
smp_mb__before_atomic();
clear_bit(TSQ_QUEUED, &sk->sk_tsq_flags);
- if (!sk->sk_lock.owned &&
- test_bit(TCP_TSQ_DEFERRED, &sk->sk_tsq_flags)) {
- bh_lock_sock(sk);
- if (!sock_owned_by_user(sk)) {
- clear_bit(TCP_TSQ_DEFERRED, &sk->sk_tsq_flags);
- tcp_tsq_handler(sk);
- }
- bh_unlock_sock(sk);
- }
-
+ tcp_tsq_handler(sk);
sk_free(sk);
}
}
@@ -853,9 +854,10 @@ void tcp_release_cb(struct sock *sk)
nflags = flags & ~TCP_DEFERRED_ALL;
} while (cmpxchg(&sk->sk_tsq_flags, flags, nflags) != flags);
- if (flags & TCPF_TSQ_DEFERRED)
- tcp_tsq_handler(sk);
-
+ if (flags & TCPF_TSQ_DEFERRED) {
+ tcp_tsq_write(sk);
+ __sock_put(sk);
+ }
/* Here begins the tricky part :
* We are called from release_sock() with :
* 1) BH disabled
@@ -929,7 +931,7 @@ void tcp_wfree(struct sk_buff *skb)
if (!(oval & TSQF_THROTTLED) || (oval & TSQF_QUEUED))
goto out;
- nval = (oval & ~TSQF_THROTTLED) | TSQF_QUEUED | TCPF_TSQ_DEFERRED;
+ nval = (oval & ~TSQF_THROTTLED) | TSQF_QUEUED;
nval = cmpxchg(&sk->sk_tsq_flags, oval, nval);
if (nval != oval)
continue;
@@ -948,37 +950,17 @@ void tcp_wfree(struct sk_buff *skb)
sk_free(sk);
}
-/* Note: Called under hard irq.
- * We can not call TCP stack right away.
+/* Note: Called under soft irq.
+ * We can call TCP stack right away, unless socket is owned by user.
*/
enum hrtimer_restart tcp_pace_kick(struct hrtimer *timer)
{
struct tcp_sock *tp = container_of(timer, struct tcp_sock, pacing_timer);
struct sock *sk = (struct sock *)tp;
- unsigned long nval, oval;
- for (oval = READ_ONCE(sk->sk_tsq_flags);; oval = nval) {
- struct tsq_tasklet *tsq;
- bool empty;
+ tcp_tsq_handler(sk);
+ sock_put(sk);
- if (oval & TSQF_QUEUED)
- break;
-
- nval = (oval & ~TSQF_THROTTLED) | TSQF_QUEUED | TCPF_TSQ_DEFERRED;
- nval = cmpxchg(&sk->sk_tsq_flags, oval, nval);
- if (nval != oval)
- continue;
-
- if (!refcount_inc_not_zero(&sk->sk_wmem_alloc))
- break;
- /* queue this socket to tasklet queue */
- tsq = this_cpu_ptr(&tsq_tasklet);
- empty = list_empty(&tsq->head);
- list_add(&tp->tsq_node, &tsq->head);
- if (empty)
- tasklet_schedule(&tsq->tasklet);
- break;
- }
return HRTIMER_NORESTART;
}
@@ -1011,7 +993,8 @@ static void tcp_internal_pacing(struct sock *sk, const struct sk_buff *skb)
do_div(len_ns, rate);
hrtimer_start(&tcp_sk(sk)->pacing_timer,
ktime_add_ns(ktime_get(), len_ns),
- HRTIMER_MODE_ABS_PINNED);
+ HRTIMER_MODE_ABS_PINNED_SOFT);
+ sock_hold(sk);
}
static void tcp_update_skb_after_send(struct tcp_sock *tp, struct sk_buff *skb)
@@ -1078,7 +1061,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
/* if no packet is in qdisc/device queue, then allow XPS to select
* another queue. We can be called from tcp_tsq_handler()
- * which holds one reference to sk_wmem_alloc.
+ * which holds one reference to sk.
*
* TODO: Ideally, in-flight pure ACK packets should not matter here.
* One way to get this would be to set skb->truesize = 2 on them.
@@ -2185,7 +2168,7 @@ static int tcp_mtu_probe(struct sock *sk)
static bool tcp_pacing_check(const struct sock *sk)
{
return tcp_needs_internal_pacing(sk) &&
- hrtimer_active(&tcp_sk(sk)->pacing_timer);
+ hrtimer_is_queued(&tcp_sk(sk)->pacing_timer);
}
/* TCP Small Queues :
@@ -2365,8 +2348,6 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
skb, limit, mss_now, gfp)))
break;
- if (test_bit(TCP_TSQ_DEFERRED, &sk->sk_tsq_flags))
- clear_bit(TCP_TSQ_DEFERRED, &sk->sk_tsq_flags);
if (tcp_small_queue_check(sk, skb, 0))
break;
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index f7d944855f8ebd0a312fe73a53a56ab8d451ee44..92bdf64fffae3a5be291ca419eb21276b4c8cbae 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -713,6 +713,6 @@ void tcp_init_xmit_timers(struct sock *sk)
inet_csk_init_xmit_timers(sk, &tcp_write_timer, &tcp_delack_timer,
&tcp_keepalive_timer);
hrtimer_init(&tcp_sk(sk)->pacing_timer, CLOCK_MONOTONIC,
- HRTIMER_MODE_ABS_PINNED);
+ HRTIMER_MODE_ABS_PINNED_SOFT);
tcp_sk(sk)->pacing_timer.function = tcp_pace_kick;
}
--
2.17.0.441.gb46fe60e1d-goog
^ permalink raw reply related
* Re: [PATCH net] cxgb4: zero the HMA memory
From: David Miller @ 2018-05-10 20:04 UTC (permalink / raw)
To: ganeshgr; +Cc: netdev, nirranjan, indranil, venkatesh
In-Reply-To: <1525871409-32074-1-git-send-email-ganeshgr@chelsio.com>
From: Ganesh Goudar <ganeshgr@chelsio.com>
Date: Wed, 9 May 2018 18:40:09 +0530
> firmware expects HMA memory to be zeroed, use __GFP_ZERO
> for HMA memory allocation.
>
> Fixes: 8b4e6b3ca2ed ("cxgb4: Add HMA support")
> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Applied.
^ permalink raw reply
* Re: [PATCH net] cxgb4: copy mbox log size to PF0-3 adap instances
From: David Miller @ 2018-05-10 20:04 UTC (permalink / raw)
To: ganeshgr; +Cc: netdev, nirranjan, indranil, venkatesh, leedom
In-Reply-To: <1525872635-1342-1-git-send-email-ganeshgr@chelsio.com>
From: Ganesh Goudar <ganeshgr@chelsio.com>
Date: Wed, 9 May 2018 19:00:35 +0530
> copy mbox size to adapter instances of PF0-3 to avoid
> mbox log overflow. This fixes the possible protection
> fault.
>
> Fixes: baf5086840ab ("cxgb4: restructure VF mgmt code")
> Signed-off-by: Casey Leedom <leedom@chelsio.com>
> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Applied.
^ permalink raw reply
* Re: [PATCH v2] net: phy: DP83TC811: Introduce support for the DP83TC811 phy
From: David Miller @ 2018-05-10 20:06 UTC (permalink / raw)
To: dmurphy; +Cc: andrew, f.fainelli, netdev, linux-kernel
In-Reply-To: <20180509140902.16636-1-dmurphy@ti.com>
From: Dan Murphy <dmurphy@ti.com>
Date: Wed, 9 May 2018 09:09:02 -0500
> Add support for the DP83811 phy.
>
> The DP83811 supports both rgmii and sgmii interfaces.
> There are 2 part numbers for this: the DP83TC811R does not
> reliably support the SGMII interface, but the DP83TC811S does.
>
> There is not a way to differentiate these parts from the
> hardware or register set. So this is controlled via the DT
> to indicate which phy mode is required. Or the part can be
> strapped to a certain interface.
>
> Data sheet can be found here:
> http://www.ti.com/product/DP83TC811S-Q1/description
> http://www.ti.com/product/DP83TC811R-Q1/description
>
> Signed-off-by: Dan Murphy <dmurphy@ti.com>
> ---
>
> v2 - Remove extra config_init in reset, update config_init call back function
> fix a checkpatch alignment issue, add SGMII check in autoneg api - https://patchwork.kernel.org/patch/10389323/
Hello Dan, just a few coding style fixes:
> +static int dp83811_set_wol(struct phy_device *phydev,
> + struct ethtool_wolinfo *wol)
> +{
> + struct net_device *ndev = phydev->attached_dev;
> + u16 value;
> + const u8 *mac;
Please order local variables longest to shortest line.
> +static void dp83811_get_wol(struct phy_device *phydev,
> + struct ethtool_wolinfo *wol)
> +{
> + int value;
> + u16 sopass_val;
Likewise.
> +static int dp83811_config_aneg(struct phy_device *phydev)
> +{
> + int err;
> + int value;
Likewise.
However, in these functions that have two integer local variables,
I would recommend that you just declare them both on the same
line, like:
int err, value;
> +static int dp83811_config_init(struct phy_device *phydev)
> +{
> + int err;
> + int value;
Likewise.
Thank you.
^ permalink raw reply
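The declaration ordering requested in the review above (longest line first, shortest last, with two integers sharing one line) can be illustrated with a small userspace sketch. Everything in it is hypothetical: `struct fake_dev` and `sum_mac_bytes()` are made up for illustration and are not part of the DP83TC811 driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device that carries a MAC address. */
struct fake_dev {
	const uint8_t *dev_addr;
};

/* Hypothetical helper: sums the first len bytes of the MAC address,
 * or returns -1 if no address is set.
 */
static int sum_mac_bytes(const struct fake_dev *dev, size_t len)
{
	/* Declarations ordered from longest line to shortest. */
	const uint8_t *mac = dev->dev_addr;
	int err = 0, value = 0;
	size_t i;

	if (!mac)
		err = -1;

	for (i = 0; !err && i < len; i++)
		value += mac[i];

	return err ? err : value;
}
```

Only the layout of the declaration block matters here; the function body exists just to make the sketch compile and use every variable.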
* Re: [PATCH net-next 0/3] mlx4_core misc for 4.18
From: David Miller @ 2018-05-10 20:09 UTC (permalink / raw)
To: tariqt; +Cc: netdev, eranbe
In-Reply-To: <1525879744-1858-1-git-send-email-tariqt@mellanox.com>
From: Tariq Toukan <tariqt@mellanox.com>
Date: Wed, 9 May 2018 18:29:01 +0300
> This patchset contains misc enhancements from the team
> to the mlx4 Core driver.
>
> Patch 1 by Eran adds driver version report in FW.
> Patch 2 by Yishai implements suspend/resume PCI callbacks.
> Patch 3 extends the range of an existing module param from boolean to numerical.
>
> Series generated against net-next commit:
> 53a7bdfb2a27 dt-bindings: dsa: Remove unnecessary #address/#size-cells
Series applied.
^ permalink raw reply
* Re: [PATCH net] net/mlx4_en: Verify coalescing parameters are in range
From: David Miller @ 2018-05-10 20:09 UTC (permalink / raw)
To: tariqt; +Cc: netdev, eranbe, moshe
In-Reply-To: <1525880113-2757-1-git-send-email-tariqt@mellanox.com>
From: Tariq Toukan <tariqt@mellanox.com>
Date: Wed, 9 May 2018 18:35:13 +0300
> From: Moshe Shemesh <moshe@mellanox.com>
>
> Add a check that coalescing parameters received through ethtool are
> within the range of values supported by the HW.
> The driver gets the coalescing rx/tx-usecs and rx/tx-frames as set by
> the users through ethtool. ethtool supports up to a 32-bit value for each.
> However, mlx4 modify cq limits the coalescing time parameter and
> coalescing frames parameters to 16 bits.
> Return out of range error if user tries to set these parameters to
> higher values.
> Change type of sample-interval and adaptive_rx_coal parameters in mlx4
> driver to u32 as the ethtool holds them as u32 and these parameters are
> not limited due to mlx4 HW.
>
> Fixes: c27a02cd94d6 ('mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC')
> Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
> Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Applied and queued up for -stable.
^ permalink raw reply
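The range check described in the commit message above can be sketched in plain C. This is a minimal illustration under stated assumptions, not the mlx4 code: the macro `COAL_MAX_16BIT`, the helper `check_coalesce_range()`, and the choice of `-ERANGE` as the error value are all hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed limit: the hardware field is 16 bits wide. */
#define COAL_MAX_16BIT 0xffffu

/* Hypothetical helper: the caller hands in 32-bit values, as ethtool
 * does. Returns 0 if both fit in 16 bits, -ERANGE otherwise.
 */
static int check_coalesce_range(uint32_t usecs, uint32_t frames)
{
	if (usecs > COAL_MAX_16BIT || frames > COAL_MAX_16BIT)
		return -ERANGE;

	return 0;
}
```

Rejecting an out-of-range value up front, rather than silently truncating it to 16 bits, lets the user see that the requested setting was not applied.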
* Re: [PATCH net-next 0/3] net: dsa: mv88e6xxx: cleanup Global Control 2 register
From: David Miller @ 2018-05-10 20:13 UTC (permalink / raw)
To: vivien.didelot; +Cc: netdev, linux-kernel, kernel, andrew, f.fainelli
In-Reply-To: <20180509153851.10207-1-vivien.didelot@savoirfairelinux.com>
From: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Date: Wed, 9 May 2018 11:38:48 -0400
> The mv88e6xxx driver still writes arbitrary values in the Global 1
> Control 2 register at setup, whose layout differs a lot between chips.
> This results in an inconsistent configuration, for example with the
> Remote Management Unit (RMU).
>
> The first patch adds an operation for the Cascade Port bits, the second
> patch sets the device number in the device mapping function and the
> third patch adds an operation to correctly disable the RMU.
Series applied.
^ permalink raw reply
* [PATCH net-next v2 0/9] net: dsa: Plug in PHYLINK support
From: Florian Fainelli @ 2018-05-10 20:17 UTC (permalink / raw)
To: netdev
Cc: Florian Fainelli, privat, andrew, vivien.didelot, davem,
rmk+kernel, sean.wang, Woojung.Huh, john, cphealy
Hi all,
This patch series adds PHYLINK support to DSA which is necessary to support more
complex PHY and pluggable module setups.
Patch series can be found here:
https://github.com/ffainelli/linux/commits/dsa-phylink-v2
This was tested on:
- dsa-loop
- bcm_sf2
- mv88e6xxx
- b53
With a variety of test cases:
- internal & external MDIO PHYs
- MoCA with link notification through interrupt/MMIO register
- built-in PHYs
- ifconfig up/down for several cycles works
- bind/unbind of the drivers
Changes in v2:
- fixed link configuration for mv88e6xxx (Andrew) after introducing polling
This is technically v2 of what was posted back in March 2018, changes from last
time:
- fixed probe/remove of drivers
- fixed missing gpiod_put() for link GPIOs
- fixed polling of link GPIOs (Russell I would need your SoB on the patch you
provided offline initially, added some modifications to it)
- tested across a wider set of platforms
And everything should still work as expected. Please be aware of the following:
- switch drivers (like bcm_sf2) which may have user-facing network ports using
fixed links would need to implement phylink_mac_ops to remain functional.
PHYLINK does not create a phy_device for fixed links, therefore our
call to adjust_link() from phylink_mac_link_{up,down} would not be calling
into the driver. This *should not* affect CPU/DSA ports which are configured
through adjust_link() but have no network devices
- support for SFP/SFF is now possible, but switch drivers will still need some
modifications to properly support those, including, but not limited to using
the correct binding information. This will be submitted on top of this series
Please do test on your respective platforms/switches and let me know if you
find any issues, hopefully everything still works like before.
Thank you!
Florian Fainelli (7):
net: phy: phylink: Use gpiod_get_value_cansleep()
net: phy: phylink: Release link GPIO
net: dsa: Add PHYLINK switch operations
net: dsa: bcm_sf2: Implement phylink_mac_ops
net: dsa: Eliminate dsa_slave_get_link()
net: dsa: Plug in PHYLINK support
net: dsa: bcm_sf2: Get rid of PHYLIB functions
Russell King (2):
net: phy: phylink: Poll link GPIOs
net: dsa: mv88e6xxx: add PHYLINK support
drivers/net/dsa/bcm_sf2.c | 191 +++++++++++++++----------
drivers/net/dsa/mv88e6xxx/chip.c | 83 +++++++++++
drivers/net/dsa/mv88e6xxx/port.c | 39 ++++++
drivers/net/dsa/mv88e6xxx/port.h | 3 +
drivers/net/phy/phylink.c | 20 ++-
include/net/dsa.h | 25 ++++
net/dsa/Kconfig | 2 +-
net/dsa/dsa_priv.h | 9 --
net/dsa/slave.c | 293 ++++++++++++++++++++++-----------------
9 files changed, 458 insertions(+), 207 deletions(-)
--
2.14.1
^ permalink raw reply
* [PATCH net-next v2 1/9] net: phy: phylink: Use gpiod_get_value_cansleep()
From: Florian Fainelli @ 2018-05-10 20:17 UTC (permalink / raw)
To: netdev
Cc: Florian Fainelli, privat, andrew, vivien.didelot, davem,
rmk+kernel, sean.wang, Woojung.Huh, john, cphealy
In-Reply-To: <20180510201737.13887-1-f.fainelli@gmail.com>
The GPIO provider for the link GPIO line might require the use of the
_cansleep() API, utilize that. This is safe to do since we run in workqueue
context.
Fixes: 9525ae83959b ("phylink: add phylink infrastructure")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/phy/phylink.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index c582b2d7546c..412d1cf4fa66 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -360,7 +360,7 @@ static void phylink_get_fixed_state(struct phylink *pl, struct phylink_link_stat
if (pl->get_fixed_state)
pl->get_fixed_state(pl->netdev, state);
else if (pl->link_gpio)
- state->link = !!gpiod_get_value(pl->link_gpio);
+ state->link = !!gpiod_get_value_cansleep(pl->link_gpio);
}
/* Flow control is resolved according to our and the link partners
--
2.14.1
^ permalink raw reply related