Netdev List
* Re: [PATCH v5 net] stmmac: 802.1ad tag stripping fix
From: David Miller @ 2018-06-03 14:33 UTC (permalink / raw)
  To: eladv6
  Cc: makita.toshiaki, Jose.Abreu, f.fainelli, netdev, peppe.cavallaro,
	alexandre.torgue
In-Reply-To: <113191f7-ad35-151f-3414-a2342ff0e13c@gmail.com>

From: Elad Nachman <eladv6@gmail.com>
Date: Wed, 30 May 2018 08:48:25 +0300

>  static void stmmac_rx_vlan(struct net_device *dev, struct sk_buff *skb)
>  {
> -	struct ethhdr *ehdr;
> +	struct vlan_ethhdr *veth;
>  	u16 vlanid;
> +	__be16 vlan_proto;

Please order local variables from longest to shortest line.

>  
> -	if ((dev->features & NETIF_F_HW_VLAN_CTAG_RX) ==
> -	    NETIF_F_HW_VLAN_CTAG_RX &&
> -	    !__vlan_get_tag(skb, &vlanid)) {
> +	if (!__vlan_get_tag(skb, &vlanid)) {
>  		/* pop the vlan tag */
> -		ehdr = (struct ethhdr *)skb->data;
> -		memmove(skb->data + VLAN_HLEN, ehdr, ETH_ALEN * 2);
> +		veth = (struct vlan_ethhdr *)skb->data;
> +		vlan_proto = veth->h_vlan_proto;
> +		memmove(skb->data + VLAN_HLEN, veth, ETH_ALEN * 2);
>  		skb_pull(skb, VLAN_HLEN);
> -		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlanid);
> +		__vlan_hwaccel_put_tag(skb, vlan_proto, vlanid);
>  	}
>  }

I can't see how it is valid to do an unconditional software VLAN
untagging even when VLAN is disabled in the kernel config or the
NETIF_F_* feature bits are not set.

At a minimum that feature test has to stay there, and when it's clear
we let the generic VLAN code untag the packet.
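
For readers following along, here is a minimal userspace sketch of the
point above: the tag is only popped when the RX offload feature bit is
set, otherwise the frame is left alone for the generic code. Everything
prefixed demo_/DEMO_ is hypothetical and only models the kernel
structures; it is not the stmmac code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_ALEN	6	/* bytes in one MAC address */
#define DEMO_ETH_HLEN	14	/* dst MAC + src MAC + ethertype */
#define DEMO_VLAN_HLEN	4	/* TPID + TCI */

/* Hypothetical feature bit standing in for the NETIF_F_HW_VLAN_*_RX test. */
#define DEMO_F_VLAN_RX	(1u << 0)

/* Hypothetical stand-in for the parts of an sk_buff used here. */
struct demo_frame {
	uint8_t *data;		/* start of the Ethernet header */
	size_t len;
	bool tag_present;	/* true once the tag is kept out of band */
	uint16_t vlan_proto;	/* host byte order */
	uint16_t vlan_tci;	/* host byte order */
};

static uint16_t demo_get_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns true if a tag was popped, false if the frame was left untouched. */
bool demo_rx_vlan(struct demo_frame *f, uint32_t features)
{
	uint16_t tpid;

	assert(f != NULL && f->data != NULL);

	/* Feature bit clear: do nothing, generic code untags later. */
	if (!(features & DEMO_F_VLAN_RX))
		return false;
	if (f->len < DEMO_ETH_HLEN + DEMO_VLAN_HLEN)
		return false;

	tpid = demo_get_be16(f->data + 2 * DEMO_ETH_ALEN);
	if (tpid != 0x8100 && tpid != 0x88A8)	/* 802.1Q or 802.1ad */
		return false;

	/* Remember the tag, then slide the MAC addresses over it. */
	f->vlan_proto = tpid;
	f->vlan_tci = demo_get_be16(f->data + 2 * DEMO_ETH_ALEN + 2);
	f->tag_present = true;

	memmove(f->data + DEMO_VLAN_HLEN, f->data, 2 * DEMO_ETH_ALEN);
	f->data += DEMO_VLAN_HLEN;
	f->len -= DEMO_VLAN_HLEN;
	return true;
}
```

Calling demo_rx_vlan() with features == 0 leaves the frame unchanged,
which is the behaviour the feature test is meant to preserve.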

^ permalink raw reply

* Re: [PATCH net-next] net: phy: consider PHY_IGNORE_INTERRUPT in state machine PHY_NOLINK handling
From: David Miller @ 2018-06-03 14:33 UTC (permalink / raw)
  To: hkallweit1; +Cc: f.fainelli, andrew, netdev
In-Reply-To: <0a4e472d-cb7f-ef1f-420c-1327fa41e8cd@gmail.com>

From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Wed, 30 May 2018 22:13:20 +0200

> We can bail out immediately also in case of PHY_IGNORE_INTERRUPT because
> phy_mac_interrupt() informs us once the link is up.
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH net] vrf: check the original netdevice for generating redirect
From: David Miller @ 2018-06-03 14:34 UTC (permalink / raw)
  To: ssuryaextr; +Cc: netdev, dsa
In-Reply-To: <1527825921-17677-1-git-send-email-ssuryaextr@gmail.com>

From: Stephen Suryaputra <ssuryaextr@gmail.com>
Date: Fri,  1 Jun 2018 00:05:21 -0400

> Use the right device to determine if redirect should be sent especially
> when using vrf. The same applies when sending the redirect.
> 
> Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>

David A., please review.

^ permalink raw reply

* Re: [PATCH net-next] hv_netvsc: fix error return code in netvsc_probe()
From: David Miller @ 2018-06-03 14:35 UTC (permalink / raw)
  To: weiyongjun1
  Cc: kys, haiyangz, sthemmin, sridhar.samudrala, devel, netdev,
	kernel-janitors
In-Reply-To: <1527732283-145530-1-git-send-email-weiyongjun1@huawei.com>

From: Wei Yongjun <weiyongjun1@huawei.com>
Date: Thu, 31 May 2018 02:04:43 +0000

> Fix to return a negative error code from the failover register fail
> error handling case instead of 0, as done elsewhere in this function.
> 
> Fixes: 1ff78076d8dd ("netvsc: refactor notifier/event handling code to use the failover framework")
> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH net-next] net/mlx5: Make function mlx5_fpga_tls_send_teardown_cmd() static
From: David Miller @ 2018-06-03 14:36 UTC (permalink / raw)
  To: weiyongjun1
  Cc: borisp, saeedm, leon, ilyal, netdev, linux-rdma, kernel-janitors
In-Reply-To: <1527733872-149077-1-git-send-email-weiyongjun1@huawei.com>

From: Wei Yongjun <weiyongjun1@huawei.com>
Date: Thu, 31 May 2018 02:31:12 +0000

> Fixes the following sparse warning:
> 
> drivers/net/ethernet/mellanox/mlx5/core/fpga/tls.c:199:6: warning:
>  symbol 'mlx5_fpga_tls_send_teardown_cmd' was not declared. Should it be static?
> 
> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] net/smc: fix error return code in smc_setsockopt()
From: David Miller @ 2018-06-03 14:39 UTC (permalink / raw)
  To: weiyongjun1; +Cc: ubraun, linux-s390, netdev, kernel-janitors
In-Reply-To: <1527733882-149144-1-git-send-email-weiyongjun1@huawei.com>

From: Wei Yongjun <weiyongjun1@huawei.com>
Date: Thu, 31 May 2018 02:31:22 +0000

> Fix to return error code -EINVAL instead of 0 if optlen is invalid.
> 
> Fixes: 01d2f7e2cdd3 ("net/smc: sockopts TCP_NODELAY and TCP_CORK")
> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>

Although the TCP code should be checking this in the previous lines,
it's not good practice to depend so tightly upon that.

And it makes this code easier to audit if the check exists here
explicitly too.

So I'll apply this, thanks.
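
As an illustration of the explicit check discussed above, a minimal
userspace sketch follows. demo_sock and demo_setsockopt() are
hypothetical names, not the SMC code; the point is only that the handler
validates optlen itself and returns -EINVAL instead of relying on a
caller having done so.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical socket state; only one integer option is modelled. */
struct demo_sock {
	int nodelay;
};

/* Returns 0 on success or -EINVAL when the option buffer is too short. */
int demo_setsockopt(struct demo_sock *sk, const void *optval, size_t optlen)
{
	int val;

	assert(sk != NULL);

	/* Explicit local check, easy to audit in isolation. */
	if (optval == NULL || optlen < sizeof(int))
		return -EINVAL;

	memcpy(&val, optval, sizeof(val));
	sk->nodelay = (val != 0);
	return 0;
}
```

A one-byte option buffer is rejected and leaves the socket state
untouched; a full int is accepted.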

^ permalink raw reply

* Re: [PATCH net-next] net: netcp: ethss: remove unnecessary pointer set to NULL
From: David Miller @ 2018-06-03 14:40 UTC (permalink / raw)
  To: yuehaibing; +Cc: w-kwok2, m-karicheri2, netdev, linux-kernel
In-Reply-To: <20180531034848.23080-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Thu, 31 May 2018 11:48:48 +0800

> The if statement has already made sure that 'slave->phy' is NULL
> 
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Looks good, applied.

^ permalink raw reply

* Re: [PATCH net] net: ipv6: prevent use after free in ip6_route_mpath_notify()
From: David Ahern @ 2018-06-03 14:40 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller; +Cc: netdev, Eric Dumazet
In-Reply-To: <4b46d531-904b-6e5f-67ce-a275f0826d47@cumulusnetworks.com>

On 6/3/18 8:01 AM, David Ahern wrote:
> Is there a reproducer for the syzbot case?

One reproducer is to insert a route and then add a multipath route that
has a duplicate nexthop, e.g.:

ip -6 ro add vrf red 2001:db8:101::/64 nexthop via 2001:db8:1::2

ip -6 ro append vrf red 2001:db8:101::/64 nexthop via 2001:db8:1::4 \
	nexthop via 2001:db8:1::2

Current net and net-next generate the trace; with the fix I proposed I
don't see it on either branch and I do see the expected notifications to
userspace.

^ permalink raw reply

* Re: [PATCH net-next] net/ncsi: Avoid GFP_KERNEL in response handler
From: David Miller @ 2018-06-03 14:42 UTC (permalink / raw)
  To: sam; +Cc: netdev, linux-kernel, openbmc
In-Reply-To: <20180531070254.28878-1-sam@mendozajonas.com>

From: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Date: Thu, 31 May 2018 17:02:54 +1000

> ncsi_rsp_handler_gc() allocates the filter arrays using GFP_KERNEL in
> softirq context, causing the below backtrace. This allocation is only a
> few dozen bytes during probing so allocate with GFP_ATOMIC instead.
 ...
> Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>

Applied with Fixes: tag added, thanks.

^ permalink raw reply

* Re: [PATCH net] net: ipv6: prevent use after free in ip6_route_mpath_notify()
From: David Ahern @ 2018-06-03 14:46 UTC (permalink / raw)
  To: Eric Dumazet, Eric Dumazet, David S . Miller; +Cc: netdev
In-Reply-To: <4dfbdd4b-947b-bbf7-27f3-abbd48a817b4@gmail.com>

On 6/3/18 8:31 AM, Eric Dumazet wrote:
> 
> 
> On 06/03/2018 07:01 AM, David Ahern wrote:
>> On 6/3/18 7:35 AM, Eric Dumazet wrote:
>>> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
>>> index f4d61736c41abe8cd7f439c4a37100e90c1eacca..830eefdbdb6734eb81ea0322fb6077ee20be1889 100644
>>> --- a/net/ipv6/route.c
>>> +++ b/net/ipv6/route.c
>>> @@ -4263,7 +4263,9 @@ static int ip6_route_multipath_add(struct fib6_config *cfg,
>>>  
>>>  	err_nh = NULL;
>>>  	list_for_each_entry(nh, &rt6_nh_list, next) {
>>> +		dst_release(&rt_last->dst);
>>>  		rt_last = nh->rt6_info;
>>> +		dst_hold(&rt_last->dst);
>>>  		err = __ip6_ins_rt(nh->rt6_info, info, &nh->mxc, extack);
>>>  		/* save reference to first route for notification */
>>>  		if (!rt_notif && !err)
>>> @@ -4317,7 +4319,7 @@ static int ip6_route_multipath_add(struct fib6_config *cfg,
>>>  		list_del(&nh->next);
>>>  		kfree(nh);
>>>  	}
>>> -
>>> +	dst_release(&rt_last->dst);
>>>  	return err;
>>>  }
>>
>> Since the rtnl lock is held, a successfully inserted route can not be
>> removed until ip6_route_multipath_add finishes. This is a simpler change
>> that works with net-next as well:
> 
> Your patch changes the intent of your original commit.
> 
> It seems you wanted rt_last to point to the last attempted insertion,
> not the last successful one ?

The note in ip6_route_mpath_notify explains it:

        /* if this is an APPEND route, then rt points to the first route
         * inserted and rt_last points to last route inserted. Userspace

> 
> Or have I misunderstood, and not only we had a use-after-free, but also
> a semantic error ?

It was a mistake to set rt_last before checking err. So the
use-after-free exposed the semantic error.
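
A minimal userspace sketch of that ordering, for illustration only: the
"last" pointer is updated after the error check, so it always refers to
the last route that was actually inserted. The demo_* types and the
fail flag are hypothetical and merely simulate an insertion that can
fail on a duplicate nexthop.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical route; 'fail' simulates a rejected insertion. */
struct demo_route {
	int id;
	int fail;
};

struct demo_result {
	const struct demo_route *first;	/* first route inserted */
	const struct demo_route *last;	/* last route inserted */
	int err;
};

static int demo_insert(const struct demo_route *rt)
{
	return rt->fail ? -EEXIST : 0;
}

struct demo_result demo_multipath_add(const struct demo_route *routes, size_t n)
{
	struct demo_result res = { NULL, NULL, 0 };
	size_t i;

	assert(routes != NULL || n == 0);

	for (i = 0; i < n; i++) {
		int err = demo_insert(&routes[i]);

		if (err) {
			res.err = err;
			break;
		}
		/* Only record routes once the insertion has succeeded. */
		res.last = &routes[i];
		if (!res.first)
			res.first = &routes[i];
	}
	return res;
}
```

With three routes where the third fails, first points at route 1 and
last at route 2, never at the failed entry.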

^ permalink raw reply

* Re: [PATCH net-next] net: axienet: remove stale comment of axienet_open
From: David Miller @ 2018-06-03 14:59 UTC (permalink / raw)
  To: yuehaibing; +Cc: anirudh, John.Linn, netdev, linux-kernel, michal.simek
In-Reply-To: <20180531115115.11920-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Thu, 31 May 2018 19:51:15 +0800

> axienet_open no longer returns -ENODEV when PHY cannot be connected to
> since commit d7cc3163e026 ("net: axienet: Support phy-less mode of operation")
> 
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net v2] ipv6: omit traffic class when calculating flow hash
From: David Ahern @ 2018-06-03 15:00 UTC (permalink / raw)
  To: Michal Kubecek, David S. Miller
  Cc: netdev, linux-kernel, Nicolas Dichtel, Tom Herbert, Ido Schimmel
In-Reply-To: <20180602080528.54B27A0C48@unicorn.suse.cz>

On 6/2/18 1:40 AM, Michal Kubecek wrote:
> diff --git a/include/net/ipv6.h b/include/net/ipv6.h
> index 836f31af1369..7fbdc3e9e25d 100644
> --- a/include/net/ipv6.h
> +++ b/include/net/ipv6.h
> @@ -906,6 +906,11 @@ static inline __be32 ip6_make_flowinfo(unsigned int tclass, __be32 flowlabel)
>  	return htonl(tclass << IPV6_TCLASS_SHIFT) | flowlabel;
>  }
>  
> +static inline u32 flowi6_get_flowlabel(const struct flowi6 *fl6)
> +{
> +	return (__force u32)(fl6->flowlabel & IPV6_FLOWLABEL_MASK);
> +}
> +
>  /*
>   *	Prototypes exported by ipv6
>   */

While discussing the fix for net-next and making the label vs info
consistent, Michal noted a few places where this helper is needed as a
__be32, so the typecast should be outside of this helper.
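
Roughly what that could look like, as a userspace sketch under stated
assumptions: be32_t, DEMO_FLOWLABEL_MASK and the demo_* functions are
hypothetical stand-ins for __be32, IPV6_FLOWLABEL_MASK and the helper in
the patch. The helper keeps the value in network byte order and the one
caller that wants a plain integer does the cast itself.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t be32_t;	/* value held in network byte order */

/* Low 20 bits of the first IPv6 header word, in network byte order. */
#define DEMO_FLOWLABEL_MASK	htonl(0x000FFFFFu)

struct demo_flowi6 {
	be32_t flowlabel;	/* really flowinfo: traffic class + flow label */
};

/* Returns the flow label only, still in network byte order. */
static inline be32_t demo_get_flowlabel(const struct demo_flowi6 *fl6)
{
	assert(fl6 != NULL);
	return fl6->flowlabel & DEMO_FLOWLABEL_MASK;
}

/* A hash-style caller casts at the call site, not inside the helper. */
static inline uint32_t demo_hash_input(const struct demo_flowi6 *fl6)
{
	return (uint32_t)demo_get_flowlabel(fl6);
}
```

Callers that compare against or store another network-order value can
use demo_get_flowlabel() directly without any cast.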

^ permalink raw reply

* Re: [PATCH] vlan: use non-archaic spelling of failes
From: David Miller @ 2018-06-03 15:02 UTC (permalink / raw)
  To: cascardo; +Cc: netdev
In-Reply-To: <20180531122020.9225-1-cascardo@canonical.com>

From: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Date: Thu, 31 May 2018 09:20:20 -0300

> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

Applied.

^ permalink raw reply

* Re: [PATCH v2 net] mlx4_core: restore optimal ICM memory allocation
From: David Miller @ 2018-06-03 15:02 UTC (permalink / raw)
  To: edumazet
  Cc: netdev, eric.dumazet, jsperbeck, tarick, qing.huang, danielj,
	yanjun.zhu
In-Reply-To: <20180531125224.97098-1-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Thu, 31 May 2018 05:52:24 -0700

> Commit 1383cb8103bb ("mlx4_core: allocate ICM memory in page size chunks")
> brought two regressions caught in our regression suite.
> 
> The big one is an additional cost of 256 bytes of overhead per 4096 bytes,
> or 6.25 % which is unacceptable since ICM can be pretty large.
> 
> This comes from having to allocate one struct mlx4_icm_chunk (256 bytes)
> per MLX4_TABLE_CHUNK, which the buggy commit shrank to 4KB
> (instead of prior 256KB)
> 
> Note that mlx4_alloc_icm() is already able to try high order allocations
> and fall back to low-order allocations under high memory pressure.
> 
> Most of these allocations happen right after boot time, when we get
> plenty of non fragmented memory, there is really no point being so
> pessimistic and break huge pages into order-0 ones just for fun.
> 
> We only have to tweak gfp_mask a bit, to help falling back faster,
> without risking OOM killings.
> 
> Second regression is a KASAN fault, that will need further investigations.
> 
> Fixes: 1383cb8103bb ("mlx4_core: allocate ICM memory in page size chunks")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Acked-by: Tariq Toukan <tariqt@mellanox.com>

Applied, thanks Eric.
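
The overhead figures quoted in the commit message can be checked with a
few lines of C. This is only arithmetic on the numbers given above (256
bytes of bookkeeping per chunk, chunk sizes of 4KB and 256KB);
demo_overhead_pct() is a hypothetical helper, not driver code.

```c
#include <assert.h>
#include <stddef.h>

/* Per-chunk bookkeeping cost as a percentage of the chunk payload. */
double demo_overhead_pct(size_t meta_bytes, size_t chunk_bytes)
{
	assert(chunk_bytes > 0);
	return 100.0 * (double)meta_bytes / (double)chunk_bytes;
}
```

demo_overhead_pct(256, 4096) gives 6.25, matching the commit message,
while demo_overhead_pct(256, 256 * 1024) gives roughly 0.098.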

^ permalink raw reply

* Re: pull-request: wireless-drivers-next 2018-05-31
From: David Miller @ 2018-06-03 15:03 UTC (permalink / raw)
  To: kvalo; +Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <877enj29x4.fsf@kamboji.qca.qualcomm.com>

From: Kalle Valo <kvalo@codeaurora.org>
Date: Thu, 31 May 2018 17:10:15 +0300

> here's a pull request to net-next tree for 4.18. More info below and
> please let me know if there are any problems.

Pulled, thanks Kalle.

^ permalink raw reply

* Re: [PATCH bpf-next v3 00/11] Misc BPF improvements
From: Alexei Starovoitov @ 2018-06-03 15:08 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: netdev
In-Reply-To: <20180602210641.6163-1-daniel@iogearbox.net>

On Sat, Jun 02, 2018 at 11:06:30PM +0200, Daniel Borkmann wrote:
> This set adds various patches I still had in my queue, first two
> are test cases to provide coverage for the recent two fixes that
> went to bpf tree, then a small improvement on the error message
> for gpl helpers. Next, we expose prog and map id into fdinfo in
> order to allow for inspection of these objects currently used
> in applications. Patch after that removes a retpoline call for
> map lookup/update/delete helpers. A new helper is added in the
> subsequent patch to lookup the skb's socket's cgroup v2 id which
> can be used in an efficient way for e.g. lookups on egress side.
> Next one is a fix to fully clear state info in tunnel/xfrm helpers.
> Given this is full cap_sys_admin from init ns and has same priv
> requirements like tracing, bpf-next should be okay. A small bug
> fix for bpf_asm follows, and next a fix for context access in
> tracing which was recently reported. Lastly, a small update in
> the maintainer's file to add patchwork url and missing files.
> 
> Thanks!
> 
> v2 -> v3:
>   - Noticed a merge artefact inside uapi header comment, sigh,
>     fixed now.
> v1 -> v2:
>   - minor fix in getting context access to work on 32 bit for tracing
>   - add paragraph to uapi helper doc to better describe kernel
>     build deps for cgroup helper

Applied, Thanks Daniel.
fixed up commit log s/bpftool p d x i/bpftool prog dump xlated id/
while applying, since it was indeed a bit cryptic.

^ permalink raw reply

* [PATCH bpf-next] bpf: flowlabel in bpf_fib_lookup should be flowinfo
From: dsahern @ 2018-06-03 15:15 UTC (permalink / raw)
  To: netdev, borkmann, ast; +Cc: David Ahern, Michal Kubecek

From: David Ahern <dsahern@gmail.com>

As Michal noted the flow struct takes both the flow label and priority.
Update the bpf_fib_lookup API to note that it is flowinfo and not just
the flow label.

Cc: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David Ahern <dsahern@gmail.com>
---
 include/uapi/linux/bpf.h   | 2 +-
 net/core/filter.c          | 2 +-
 samples/bpf/xdp_fwd_kern.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index f0b6608b1f1c..5ef032bc4746 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2623,7 +2623,7 @@ struct bpf_fib_lookup {
 	union {
 		/* inputs to lookup */
 		__u8	tos;		/* AF_INET  */
-		__be32	flowlabel;	/* AF_INET6 */
+		__be32	flowinfo;	/* AF_INET6, flow_label + priority */
 
 		/* output: metric of fib result (IPv4/IPv6 only) */
 		__u32	rt_metric;
diff --git a/net/core/filter.c b/net/core/filter.c
index 28e864777c0f..704d515de2df 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4222,7 +4222,7 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
 		fl6.flowi6_oif = 0;
 		strict = RT6_LOOKUP_F_HAS_SADDR;
 	}
-	fl6.flowlabel = params->flowlabel;
+	fl6.flowlabel = params->flowinfo;
 	fl6.flowi6_scope = 0;
 	fl6.flowi6_flags = 0;
 	fl6.mp_hash = 0;
diff --git a/samples/bpf/xdp_fwd_kern.c b/samples/bpf/xdp_fwd_kern.c
index 4a6be0f87505..6673cdb9f55c 100644
--- a/samples/bpf/xdp_fwd_kern.c
+++ b/samples/bpf/xdp_fwd_kern.c
@@ -88,7 +88,7 @@ static __always_inline int xdp_fwd_flags(struct xdp_md *ctx, u32 flags)
 			return XDP_PASS;
 
 		fib_params.family	= AF_INET6;
-		fib_params.flowlabel	= *(__be32 *)ip6h & IPV6_FLOWINFO_MASK;
+		fib_params.flowinfo	= *(__be32 *)ip6h & IPV6_FLOWINFO_MASK;
 		fib_params.l4_protocol	= ip6h->nexthdr;
 		fib_params.sport	= 0;
 		fib_params.dport	= 0;
-- 
2.11.0

^ permalink raw reply related

* Re: [bpf-next V2 PATCH 0/8] bpf/xdp: add flags argument to ndo_xdp_xmit and flag flush operation
From: Alexei Starovoitov @ 2018-06-03 15:17 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: netdev, Daniel Borkmann, liu.song.a23, songliubraving,
	John Fastabend
In-Reply-To: <152775714013.24817.5067576840614810786.stgit@firesoul>

On Thu, May 31, 2018 at 10:59:42AM +0200, Jesper Dangaard Brouer wrote:
> As I mentioned in merge commit 10f678683e4 ("Merge branch 'xdp_xmit-bulking'")
> I plan to change the API for ndo_xdp_xmit once more, by adding a flags
> argument, which is done in this patchset.
> 
> I know it is late in the cycle (currently at rc7), but it would be
> nice to avoid changing NDOs over several kernel releases, as it is
> annoying to vendors and distro backporters, but it is not strictly
> UAPI so it is allowed (according to Alexei).
> 
> The end-goal is getting rid of the ndo_xdp_flush operation, as it will
> make it possible for drivers to implement a TXQ synchronization mechanism
> that is not necessarily derived from the CPU id (smp_processor_id).
> 
> This patchset removes all callers of the ndo_xdp_flush operation, but
> it doesn't take the last step of removing it from all drivers.  This
> can be done later, or I can update the patchset on request.
> 
> Micro-benchmarks only show a very small performance improvement, for
> map-redirect around ~2 ns, and for non-map redirect ~7 ns.  I've not
> benchmarked this with CONFIG_RETPOLINE, but the performance benefit
> should be more visible given we end-up removing an indirect call.
> 
> ---
> V2: Updated based on feedback from Song Liu <songliubraving@fb.com>

Applied, but please send a follow up patch to remove ndo_xdp_flush().
Otherwise this patch set is just code churn that does the opposite
of what you're trying to achieve and creates more backport pain.

^ permalink raw reply

* Re: [PATCH net] vrf: check the original netdevice for generating redirect
From: David Ahern @ 2018-06-03 15:31 UTC (permalink / raw)
  To: Stephen Suryaputra, netdev
In-Reply-To: <1527825921-17677-1-git-send-email-ssuryaextr@gmail.com>

On 5/31/18 10:05 PM, Stephen Suryaputra wrote:
> Use the right device to determine if redirect should be sent especially
> when using vrf. The same applies when sending the redirect.
> 
> Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
> ---
>  net/ipv6/ip6_output.c | 3 ++-
>  net/ipv6/ndisc.c      | 6 ++++++
>  2 files changed, 8 insertions(+), 1 deletion(-)

skb->dev in this path is set to the vrf device if applicable, so yes the
change is needed. Thanks for the fix.

Acked-by: David Ahern <dsahern@gmail.com>

^ permalink raw reply

* Re: [RFC V5 PATCH 8/8] vhost: event suppression for packed ring
From: Wei Xu @ 2018-06-03 15:40 UTC (permalink / raw)
  To: Jason Wang
  Cc: mst, kvm, virtualization, netdev, linux-kernel, jfreimann,
	tiwei.bie
In-Reply-To: <12f2c455-5868-3b07-0eba-d49dcafd10f2@redhat.com>

On Thu, May 31, 2018 at 11:09:07AM +0800, Jason Wang wrote:
> 
> 
> On 2018年05月30日 19:42, Wei Xu wrote:
> >>  /* This actually signals the guest, using eventfd. */
> >>  void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq)
> >>  {
> >>@@ -2802,10 +2930,34 @@ static bool vhost_enable_notify_packed(struct vhost_dev *dev,
> >>  				       struct vhost_virtqueue *vq)
> >>  {
> >>  	struct vring_desc_packed *d = vq->desc_packed + vq->avail_idx;
> >>-	__virtio16 flags;
> >>+	__virtio16 flags = RING_EVENT_FLAGS_ENABLE;
> >>  	int ret;
> >>-	/* FIXME: disable notification through device area */
> >>+	if (!(vq->used_flags & VRING_USED_F_NO_NOTIFY))
> >>+		return false;
> >>+	vq->used_flags &= ~VRING_USED_F_NO_NOTIFY;
> >'used_flags' was originally designed for 1.0, why should we pay attention to it here?
> >
> >Wei
> 
> It was used to record whether or not we've disabled notification. Then we
> can avoid unnecessary userspace writes or memory barriers.

OK, thanks.

> 
> Thanks
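
For illustration, a small userspace sketch of the caching idea Jason
describes: a local flag remembers whether notification is currently
disabled, so repeated enable/disable calls skip the expensive write.
The demo_* names and the guest_writes counter are hypothetical; the
counter merely stands in for the userspace write and memory barrier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_USED_F_NO_NOTIFY	1u

struct demo_vq {
	uint16_t used_flags;		/* cached notification state */
	unsigned int guest_writes;	/* counts simulated expensive writes */
};

/* Returns true if state changed and a write to the guest was needed. */
bool demo_enable_notify(struct demo_vq *vq)
{
	assert(vq != NULL);
	if (!(vq->used_flags & DEMO_USED_F_NO_NOTIFY))
		return false;	/* already enabled, nothing to write */
	vq->used_flags &= (uint16_t)~DEMO_USED_F_NO_NOTIFY;
	vq->guest_writes++;
	return true;
}

/* Returns true if state changed and a write to the guest was needed. */
bool demo_disable_notify(struct demo_vq *vq)
{
	assert(vq != NULL);
	if (vq->used_flags & DEMO_USED_F_NO_NOTIFY)
		return false;	/* already disabled, nothing to write */
	vq->used_flags |= (uint16_t)DEMO_USED_F_NO_NOTIFY;
	vq->guest_writes++;
	return true;
}
```

Calling either function twice in a row performs only one simulated
write. Note that the return value here reports whether a write happened,
which is a simplification and not the return convention of the vhost
functions.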

^ permalink raw reply

* Re: [PATCH 3/6] ravb: remove custom .set_link_ksettings from ethtool ops
From: Sergei Shtylyov @ 2018-06-03 15:42 UTC (permalink / raw)
  To: Vladimir Zapolskiy, David S. Miller; +Cc: netdev, linux-renesas-soc
In-Reply-To: <6f908ff0-254b-4378-27d3-5ff973328d88@mentor.com>

Hello!

   Sorry for the delay replying, the management keeps me busy... :-(

On 05/28/2018 12:51 PM, Vladimir Zapolskiy wrote:

>>> The change replaces a custom implementation of .set_link_ksettings
>>> callback with a shared phy_ethtool_set_link_ksettings(), this fixes
>>> sleep in atomic context bug, which is encountered every time when link
>>> settings are changed by ethtool.
>>
>>    Seeing it now...

   And to say that this is *fixed* by removing the custom method is err...
simply misleading. The sleep in atomic context is fixed solely by the removal
of the spinlock grabbing before the phylib call.

>>> Now duplex mode setting is enforced in ravb_adjust_link() only, also
>>> now TX/RX is disabled when link is put down or modifications to E-MAC
>>> registers ECMR and GECMR are expected for both cases of checked and
>>> ignored link status pin state from E-MAC interrupt handler.
>>>
>>> Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
>>> ---
>>>  drivers/net/ethernet/renesas/ravb_main.c | 58 +++++++++-----------------------
>>>  1 file changed, 15 insertions(+), 43 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
>>> index 3d91caa44176..0d811c02ff34 100644
>>> --- a/drivers/net/ethernet/renesas/ravb_main.c
>>> +++ b/drivers/net/ethernet/renesas/ravb_main.c
>>> @@ -980,6 +980,13 @@ static void ravb_adjust_link(struct net_device *ndev)
>>>  	struct ravb_private *priv = netdev_priv(ndev);
>>>  	struct phy_device *phydev = ndev->phydev;
>>>  	bool new_state = false;
>>> +	unsigned long flags;
>>> +
>>> +	spin_lock_irqsave(&priv->lock, flags);
>>> +
>>> +	/* Disable TX and RX right over here, if E-MAC change is ignored */
>>> +	if (priv->no_avb_link)
>>> +		ravb_rcv_snd_disable(ndev);
>>>  
>>>  	if (phydev->link) {
>>>  		if (phydev->duplex != priv->duplex) {
>>> @@ -997,18 +1004,21 @@ static void ravb_adjust_link(struct net_device *ndev)
>>>  			ravb_modify(ndev, ECMR, ECMR_TXF, 0);
>>>  			new_state = true;
>>>  			priv->link = phydev->link;
>>> -			if (priv->no_avb_link)
>>> -				ravb_rcv_snd_enable(ndev);
>>>  		}
>>>  	} else if (priv->link) {
>>>  		new_state = true;
>>>  		priv->link = 0;
>>>  		priv->speed = 0;
>>>  		priv->duplex = -1;
>>> -		if (priv->no_avb_link)
>>> -			ravb_rcv_snd_disable(ndev);
>>>  	}
>>>  
>>> +	/* Enable TX and RX right over here, if E-MAC change is ignored */
>>> +	if (priv->no_avb_link && phydev->link)
>>> +		ravb_rcv_snd_enable(ndev);
>>> +
>>> +	mmiowb();
>>> +	spin_unlock_irqrestore(&priv->lock, flags);
>>> +
>>
>>    I like this part. :-)
>>
> 
> A weight off my mind :) And I hope that this change will remain the less
> questionable one, other ones from the series are trivial.
> 
> Anyway I hope it is understandable that this part of the change cannot
> be simply extracted from the rest below, otherwise bugs of
> another type will be introduced.

   I never said I'd like to apply this part alone, my idea was more like removing
the spinlock grabbing and the duplex handling down below.

[...]
>>> @@ -1096,44 +1106,6 @@ static int ravb_phy_start(struct net_device *ndev)
>>>  	return 0;
>>>  }
>>>  
>>> -static int ravb_set_link_ksettings(struct net_device *ndev,
>>> -				   const struct ethtool_link_ksettings *cmd)
>>> -{
>>> -	struct ravb_private *priv = netdev_priv(ndev);
>>> -	unsigned long flags;
>>> -	int error;
>>> -
>>> -	if (!ndev->phydev)
>>> -		return -ENODEV;
>>> -
>>> -	spin_lock_irqsave(&priv->lock, flags);
>>> -
>>> -	/* Disable TX and RX */
>>> -	ravb_rcv_snd_disable(ndev);
>>> -
>>> -	error = phy_ethtool_ksettings_set(ndev->phydev, cmd);
>>> -	if (error)
>>> -		goto error_exit;
>>> -
>>> -	if (cmd->base.duplex == DUPLEX_FULL)
>>> -		priv->duplex = 1;
>>> -	else
>>> -		priv->duplex = 0;
>>> -
>>> -	ravb_set_duplex(ndev);
>>> -
>>> -error_exit:
>>> -	mdelay(1);
>>> -
>>> -	/* Enable TX and RX */
>>> -	ravb_rcv_snd_enable(ndev);
>>> -
>>> -	mmiowb();
>>> -	spin_unlock_irqrestore(&priv->lock, flags);
>>> -
>>> -	return error;
>>> -}
>>> -
>>
>>    But this part is clearly lumping it all together... 
> 
> Please elaborate.

   My point is still that complete removal of the custom method was somewhat
premature and completely unnecessary for fixing the issues we have.

>> [...]
>>> @@ -1357,7 +1329,7 @@ static const struct ethtool_ops ravb_ethtool_ops = {
>>>  	.set_ringparam		= ravb_set_ringparam,
>>>  	.get_ts_info		= ravb_get_ts_info,
>>>  	.get_link_ksettings	= phy_ethtool_get_link_ksettings,
>>> -	.set_link_ksettings	= ravb_set_link_ksettings,
>>> +	.set_link_ksettings	= phy_ethtool_set_link_ksettings,
>>
>>    Should have been a part of the final patch in the fix/enhancement chain...
> 
> Please elaborate.
> 
> Do you mean that firstly I have to make erroneous ravb_set_link_ksettings()
> to look similar to phy_ethtool_set_link_ksettings() and then remove it?

   Yes.

> As I see it in the current context (removal of ravb_set_duplex() call and
> so on), the problem with this approach is that the actual fix change will
> be done on top of a number of enhancement changes, thus it contradicts

   Now I have to ask you to elaborate. I have no idea what you mean. :-(

   And of course, sometimes things are broken in so subtle a way that
only a pile of "cleanups" fixes them; we had that situation in e.g. the
R-Car I2C driver -- *none* of AFAIR 9 patches was good as a -stable patch...

> the accepted development/maintenance model "fixes first", and most probably
> it won't be possible to backport the real fix, however this sole change can
> be backported.

   My idea was to move the [G]ECMR writes to the adjust_link() callback and
to stop grabbing the spinlock where it *was* grabbed in the same fix patch.
Then just a single clean up, to start using the new phylib method.

[...]
> --
> With best wishes,
> Vladimir

MBR, Sergei

^ permalink raw reply

* RE: [PATCH net-next] qed: Add srq core support for RoCE and iWARP
From: Bason, Yuval @ 2018-06-03 16:10 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: davem@davemloft.net, netdev@vger.kernel.org, jgg@mellanox.com,
	dledford@redhat.com, linux-rdma@vger.kernel.org, Kalderon, Michal,
	Elior, Ariel
In-Reply-To: <20180531173301.GV3697@mtr-leonro.mtl.com>

From: Leon Romanovsky [mailto:leon@kernel.org]
Sent: Thursday, May 31, 2018 8:33 PM
> On Wed, May 30, 2018 at 04:11:37PM +0300, Yuval Bason wrote:
> > This patch adds support for configuring SRQ and provides the necessary
> > APIs for rdma upper layer driver (qedr) to enable the SRQ feature.
> >
> > Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
> > Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
> > Signed-off-by: Yuval Bason <yuval.bason@cavium.com>
> > ---
> >  drivers/net/ethernet/qlogic/qed/qed_cxt.c   |   5 +-
> >  drivers/net/ethernet/qlogic/qed/qed_cxt.h   |   1 +
> >  drivers/net/ethernet/qlogic/qed/qed_hsi.h   |   2 +
> >  drivers/net/ethernet/qlogic/qed/qed_iwarp.c |  23 ++++
> >  drivers/net/ethernet/qlogic/qed/qed_main.c  |   2 +
> >  drivers/net/ethernet/qlogic/qed/qed_rdma.c  | 179 +++++++++++++++++++++++++++-
> >  drivers/net/ethernet/qlogic/qed/qed_rdma.h  |   2 +
> >  drivers/net/ethernet/qlogic/qed/qed_roce.c  |  17 ++-
> >  include/linux/qed/qed_rdma_if.h             |  12 +-
> >  9 files changed, 235 insertions(+), 8 deletions(-)
> >
> 
> ...
> 
> > +	struct qed_sp_init_data init_data;
> 
> ...
> 
> > +	memset(&init_data, 0, sizeof(init_data));
> 
> This pattern is so common in this patch, why?
> 
> "struct qed_sp_init_data init_data = {};" will do the trick.
> 
Thanks for pointing this out, will be fixed in v2.

> Thanks
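
A minimal sketch of the suggestion, for illustration: the struct is
zeroed by its initializer, so no separate memset() is needed.
demo_init_data and its fields are hypothetical, not the real
qed_sp_init_data layout. The sketch uses the portable "= {0}" spelling;
the empty-brace "= {}" form from the review is a GNU C extension (also
standard in C23).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct qed_sp_init_data. */
struct demo_init_data {
	int cid;
	int opaque_fid;
	void *comp_data;
};

struct demo_init_data demo_make_init_data(int cid)
{
	/* Every member starts as zero/NULL; no memset() call needed. */
	struct demo_init_data init_data = {0};

	assert(cid >= 0);
	init_data.cid = cid;
	return init_data;
}
```

Members not named after the initializer keep their zero value, which is
what the memset() in v1 was establishing.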

^ permalink raw reply

* Re: [PATCH bpf-next v3 05/11] bpf: avoid retpoline for lookup/update/delete calls on maps
From: Daniel Borkmann @ 2018-06-03 16:11 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: alexei.starovoitov, netdev
In-Reply-To: <20180603085651.73c76704@redhat.com>

On 06/03/2018 08:56 AM, Jesper Dangaard Brouer wrote:
> On Sat,  2 Jun 2018 23:06:35 +0200
> Daniel Borkmann <daniel@iogearbox.net> wrote:
> 
>> Before:
>>
>>   # bpftool p d x i 1
> 
> Could this please be changed to:
> 
>  # bpftool prog dump xlated id 1
> 
> I requested this before, but you seem to have missed my feedback...
> This makes the command "self-documenting" and searchable by Google.

I recently wrote a howto here, but there's also excellent documentation
in terms of man pages for bpftool.

http://cilium.readthedocs.io/en/latest/bpf/#bpftool

My original thinking was that it might be okay to also show usage of
short option matching, like in iproute2 where probably only a few people write
'ip address' but the majority uses 'ip a' instead. But I'm fine either way
if there are strong opinions ... thanks Alexei for fixing up!

^ permalink raw reply

* [PATCH net-next v2] qed: Add srq core support for RoCE and iWARP
From: Yuval Bason @ 2018-06-03 16:13 UTC (permalink / raw)
  To: yuval.bason, davem
  Cc: netdev, jgg, dledford, linux-rdma, Michal Kalderon, Ariel Elior

This patch adds support for configuring SRQ and provides the necessary
APIs for rdma upper layer driver (qedr) to enable the SRQ feature.

Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Yuval Bason <yuval.bason@cavium.com>
---
Changes from v1:
	- sparse warnings
	- replace memset with ={}
---
 drivers/net/ethernet/qlogic/qed/qed_cxt.c   |   5 +-
 drivers/net/ethernet/qlogic/qed/qed_cxt.h   |   1 +
 drivers/net/ethernet/qlogic/qed/qed_hsi.h   |   2 +
 drivers/net/ethernet/qlogic/qed/qed_iwarp.c |  23 ++++
 drivers/net/ethernet/qlogic/qed/qed_main.c  |   2 +
 drivers/net/ethernet/qlogic/qed/qed_rdma.c  | 178 +++++++++++++++++++++++++++-
 drivers/net/ethernet/qlogic/qed/qed_rdma.h  |   2 +
 drivers/net/ethernet/qlogic/qed/qed_roce.c  |  17 ++-
 include/linux/qed/qed_rdma_if.h             |  12 +-
 9 files changed, 234 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_cxt.c b/drivers/net/ethernet/qlogic/qed/qed_cxt.c
index 820b226..7ed6aa0 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_cxt.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_cxt.c
@@ -47,6 +47,7 @@
 #include "qed_hsi.h"
 #include "qed_hw.h"
 #include "qed_init_ops.h"
+#include "qed_rdma.h"
 #include "qed_reg_addr.h"
 #include "qed_sriov.h"
 
@@ -426,7 +427,7 @@ static void qed_cxt_set_srq_count(struct qed_hwfn *p_hwfn, u32 num_srqs)
 	p_mgr->srq_count = num_srqs;
 }
 
-static u32 qed_cxt_get_srq_count(struct qed_hwfn *p_hwfn)
+u32 qed_cxt_get_srq_count(struct qed_hwfn *p_hwfn)
 {
 	struct qed_cxt_mngr *p_mgr = p_hwfn->p_cxt_mngr;
 
@@ -2071,7 +2072,7 @@ static void qed_rdma_set_pf_params(struct qed_hwfn *p_hwfn,
 	u32 num_cons, num_qps, num_srqs;
 	enum protocol_type proto;
 
-	num_srqs = min_t(u32, 32 * 1024, p_params->num_srqs);
+	num_srqs = min_t(u32, QED_RDMA_MAX_SRQS, p_params->num_srqs);
 
 	if (p_hwfn->mcp_info->func_info.protocol == QED_PCI_ETH_RDMA) {
 		DP_NOTICE(p_hwfn,
diff --git a/drivers/net/ethernet/qlogic/qed/qed_cxt.h b/drivers/net/ethernet/qlogic/qed/qed_cxt.h
index a4e9586..758a8b4 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_cxt.h
+++ b/drivers/net/ethernet/qlogic/qed/qed_cxt.h
@@ -235,6 +235,7 @@ u32 qed_cxt_get_proto_tid_count(struct qed_hwfn *p_hwfn,
 				enum protocol_type type);
 u32 qed_cxt_get_proto_cid_start(struct qed_hwfn *p_hwfn,
 				enum protocol_type type);
+u32 qed_cxt_get_srq_count(struct qed_hwfn *p_hwfn);
 int qed_cxt_free_proto_ilt(struct qed_hwfn *p_hwfn, enum protocol_type proto);
 
 #define QED_CTX_WORKING_MEM 0
diff --git a/drivers/net/ethernet/qlogic/qed/qed_hsi.h b/drivers/net/ethernet/qlogic/qed/qed_hsi.h
index 8e1e6e1..82ce401 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_hsi.h
+++ b/drivers/net/ethernet/qlogic/qed/qed_hsi.h
@@ -9725,6 +9725,8 @@ enum iwarp_eqe_async_opcode {
 	IWARP_EVENT_TYPE_ASYNC_EXCEPTION_DETECTED,
 	IWARP_EVENT_TYPE_ASYNC_QP_IN_ERROR_STATE,
 	IWARP_EVENT_TYPE_ASYNC_CQ_OVERFLOW,
+	IWARP_EVENT_TYPE_ASYNC_SRQ_EMPTY,
+	IWARP_EVENT_TYPE_ASYNC_SRQ_LIMIT,
 	MAX_IWARP_EQE_ASYNC_OPCODE
 };
 
diff --git a/drivers/net/ethernet/qlogic/qed/qed_iwarp.c b/drivers/net/ethernet/qlogic/qed/qed_iwarp.c
index 2a2b101..474e6cf 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_iwarp.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_iwarp.c
@@ -271,6 +271,8 @@ int qed_iwarp_create_qp(struct qed_hwfn *p_hwfn,
 	p_ramrod->sq_num_pages = qp->sq_num_pages;
 	p_ramrod->rq_num_pages = qp->rq_num_pages;
 
+	p_ramrod->srq_id.srq_idx = cpu_to_le16(qp->srq_id);
+	p_ramrod->srq_id.opaque_fid = cpu_to_le16(p_hwfn->hw_info.opaque_fid);
 	p_ramrod->qp_handle_for_cqe.hi = cpu_to_le32(qp->qp_handle.hi);
 	p_ramrod->qp_handle_for_cqe.lo = cpu_to_le32(qp->qp_handle.lo);
 
@@ -3004,8 +3006,11 @@ static int qed_iwarp_async_event(struct qed_hwfn *p_hwfn,
 				 union event_ring_data *data,
 				 u8 fw_return_code)
 {
+	struct qed_rdma_events events = p_hwfn->p_rdma_info->events;
 	struct regpair *fw_handle = &data->rdma_data.async_handle;
 	struct qed_iwarp_ep *ep = NULL;
+	u16 srq_offset;
+	u16 srq_id;
 	u16 cid;
 
 	ep = (struct qed_iwarp_ep *)(uintptr_t)HILO_64(fw_handle->hi,
@@ -3067,6 +3072,24 @@ static int qed_iwarp_async_event(struct qed_hwfn *p_hwfn,
 		qed_iwarp_cid_cleaned(p_hwfn, cid);
 
 		break;
+	case IWARP_EVENT_TYPE_ASYNC_SRQ_EMPTY:
+		DP_NOTICE(p_hwfn, "IWARP_EVENT_TYPE_ASYNC_SRQ_EMPTY\n");
+		srq_offset = p_hwfn->p_rdma_info->srq_id_offset;
+		/* FW assigns value that is no greater than u16 */
+		srq_id = ((u16)le32_to_cpu(fw_handle->lo)) - srq_offset;
+		events.affiliated_event(events.context,
+					QED_IWARP_EVENT_SRQ_EMPTY,
+					&srq_id);
+		break;
+	case IWARP_EVENT_TYPE_ASYNC_SRQ_LIMIT:
+		DP_NOTICE(p_hwfn, "IWARP_EVENT_TYPE_ASYNC_SRQ_LIMIT\n");
+		srq_offset = p_hwfn->p_rdma_info->srq_id_offset;
+		/* FW assigns value that is no greater than u16 */
+		srq_id = ((u16)le32_to_cpu(fw_handle->lo)) - srq_offset;
+		events.affiliated_event(events.context,
+					QED_IWARP_EVENT_SRQ_LIMIT,
+					&srq_id);
+		break;
 	case IWARP_EVENT_TYPE_ASYNC_CQ_OVERFLOW:
 		DP_NOTICE(p_hwfn, "IWARP_EVENT_TYPE_ASYNC_CQ_OVERFLOW\n");
 
diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c b/drivers/net/ethernet/qlogic/qed/qed_main.c
index 68c4399..b04d57c 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_main.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
@@ -64,6 +64,7 @@
 
 #define QED_ROCE_QPS			(8192)
 #define QED_ROCE_DPIS			(8)
+#define QED_RDMA_SRQS                   QED_ROCE_QPS
 
 static char version[] =
 	"QLogic FastLinQ 4xxxx Core Module qed " DRV_MODULE_VERSION "\n";
@@ -922,6 +923,7 @@ static void qed_update_pf_params(struct qed_dev *cdev,
 	if (IS_ENABLED(CONFIG_QED_RDMA)) {
 		params->rdma_pf_params.num_qps = QED_ROCE_QPS;
 		params->rdma_pf_params.min_dpis = QED_ROCE_DPIS;
+		params->rdma_pf_params.num_srqs = QED_RDMA_SRQS;
 		/* divide by 3 the MRs to avoid MF ILT overflow */
 		params->rdma_pf_params.gl_pi = QED_ROCE_PROTOCOL_INDEX;
 	}
diff --git a/drivers/net/ethernet/qlogic/qed/qed_rdma.c b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
index a411f9c..b870510 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_rdma.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
@@ -259,15 +259,29 @@ static int qed_rdma_alloc(struct qed_hwfn *p_hwfn,
 		goto free_cid_map;
 	}
 
+	/* Allocate bitmap for srqs */
+	p_rdma_info->num_srqs = qed_cxt_get_srq_count(p_hwfn);
+	rc = qed_rdma_bmap_alloc(p_hwfn, &p_rdma_info->srq_map,
+				 p_rdma_info->num_srqs, "SRQ");
+	if (rc) {
+		DP_VERBOSE(p_hwfn, QED_MSG_RDMA,
+			   "Failed to allocate srq bitmap, rc = %d\n", rc);
+		goto free_real_cid_map;
+	}
+
 	if (QED_IS_IWARP_PERSONALITY(p_hwfn))
 		rc = qed_iwarp_alloc(p_hwfn);
 
 	if (rc)
-		goto free_cid_map;
+		goto free_srq_map;
 
 	DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "Allocation successful\n");
 	return 0;
 
+free_srq_map:
+	kfree(p_rdma_info->srq_map.bitmap);
+free_real_cid_map:
+	kfree(p_rdma_info->real_cid_map.bitmap);
 free_cid_map:
 	kfree(p_rdma_info->cid_map.bitmap);
 free_tid_map:
@@ -351,6 +365,8 @@ static void qed_rdma_resc_free(struct qed_hwfn *p_hwfn)
 	qed_rdma_bmap_free(p_hwfn, &p_hwfn->p_rdma_info->cq_map, 1);
 	qed_rdma_bmap_free(p_hwfn, &p_hwfn->p_rdma_info->toggle_bits, 0);
 	qed_rdma_bmap_free(p_hwfn, &p_hwfn->p_rdma_info->tid_map, 1);
+	qed_rdma_bmap_free(p_hwfn, &p_hwfn->p_rdma_info->srq_map, 1);
+	qed_rdma_bmap_free(p_hwfn, &p_hwfn->p_rdma_info->real_cid_map, 1);
 
 	kfree(p_rdma_info->port);
 	kfree(p_rdma_info->dev);
@@ -431,6 +447,12 @@ static void qed_rdma_init_devinfo(struct qed_hwfn *p_hwfn,
 	if (cdev->rdma_max_sge)
 		dev->max_sge = min_t(u32, cdev->rdma_max_sge, dev->max_sge);
 
+	dev->max_srq_sge = QED_RDMA_MAX_SGE_PER_SRQ_WQE;
+	if (p_hwfn->cdev->rdma_max_srq_sge) {
+		dev->max_srq_sge = min_t(u32,
+					 p_hwfn->cdev->rdma_max_srq_sge,
+					 dev->max_srq_sge);
+	}
 	dev->max_inline = ROCE_REQ_MAX_INLINE_DATA_SIZE;
 
 	dev->max_inline = (cdev->rdma_max_inline) ?
@@ -474,6 +496,8 @@ static void qed_rdma_init_devinfo(struct qed_hwfn *p_hwfn,
 	dev->max_mr_mw_fmr_size = dev->max_mr_mw_fmr_pbl * PAGE_SIZE;
 	dev->max_pkey = QED_RDMA_MAX_P_KEY;
 
+	dev->max_srq = p_hwfn->p_rdma_info->num_srqs;
+	dev->max_srq_wr = QED_RDMA_MAX_SRQ_WQE_ELEM;
 	dev->max_qp_resp_rd_atomic_resc = RDMA_RING_PAGE_SIZE /
 					  (RDMA_RESP_RD_ATOMIC_ELM_SIZE * 2);
 	dev->max_qp_req_rd_atomic_resc = RDMA_RING_PAGE_SIZE /
@@ -1628,6 +1652,155 @@ static void *qed_rdma_get_rdma_ctx(struct qed_dev *cdev)
 	return QED_LEADING_HWFN(cdev);
 }
 
+static int qed_rdma_modify_srq(void *rdma_cxt,
+			       struct qed_rdma_modify_srq_in_params *in_params)
+{
+	struct rdma_srq_modify_ramrod_data *p_ramrod;
+	struct qed_sp_init_data init_data = {};
+	struct qed_hwfn *p_hwfn = rdma_cxt;
+	struct qed_spq_entry *p_ent;
+	u16 opaque_fid;
+	int rc;
+
+	init_data.opaque_fid = p_hwfn->hw_info.opaque_fid;
+	init_data.comp_mode = QED_SPQ_MODE_EBLOCK;
+
+	rc = qed_sp_init_request(p_hwfn, &p_ent,
+				 RDMA_RAMROD_MODIFY_SRQ,
+				 p_hwfn->p_rdma_info->proto, &init_data);
+	if (rc)
+		return rc;
+
+	p_ramrod = &p_ent->ramrod.rdma_modify_srq;
+	p_ramrod->srq_id.srq_idx = cpu_to_le16(in_params->srq_id);
+	opaque_fid = p_hwfn->hw_info.opaque_fid;
+	p_ramrod->srq_id.opaque_fid = cpu_to_le16(opaque_fid);
+	p_ramrod->wqe_limit = cpu_to_le32(in_params->wqe_limit);
+
+	rc = qed_spq_post(p_hwfn, p_ent, NULL);
+	if (rc)
+		return rc;
+
+	DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "modified SRQ id = %x\n",
+		   in_params->srq_id);
+
+	return rc;
+}
+
+static int
+qed_rdma_destroy_srq(void *rdma_cxt,
+		     struct qed_rdma_destroy_srq_in_params *in_params)
+{
+	struct rdma_srq_destroy_ramrod_data *p_ramrod;
+	struct qed_sp_init_data init_data = {};
+	struct qed_hwfn *p_hwfn = rdma_cxt;
+	struct qed_spq_entry *p_ent;
+	struct qed_bmap *bmap;
+	u16 opaque_fid;
+	int rc;
+
+	opaque_fid = p_hwfn->hw_info.opaque_fid;
+
+	init_data.opaque_fid = opaque_fid;
+	init_data.comp_mode = QED_SPQ_MODE_EBLOCK;
+
+	rc = qed_sp_init_request(p_hwfn, &p_ent,
+				 RDMA_RAMROD_DESTROY_SRQ,
+				 p_hwfn->p_rdma_info->proto, &init_data);
+	if (rc)
+		return rc;
+
+	p_ramrod = &p_ent->ramrod.rdma_destroy_srq;
+	p_ramrod->srq_id.srq_idx = cpu_to_le16(in_params->srq_id);
+	p_ramrod->srq_id.opaque_fid = cpu_to_le16(opaque_fid);
+
+	rc = qed_spq_post(p_hwfn, p_ent, NULL);
+	if (rc)
+		return rc;
+
+	bmap = &p_hwfn->p_rdma_info->srq_map;
+
+	spin_lock_bh(&p_hwfn->p_rdma_info->lock);
+	qed_bmap_release_id(p_hwfn, bmap, in_params->srq_id);
+	spin_unlock_bh(&p_hwfn->p_rdma_info->lock);
+
+	DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "SRQ destroyed Id = %x\n",
+		   in_params->srq_id);
+
+	return rc;
+}
+
+static int
+qed_rdma_create_srq(void *rdma_cxt,
+		    struct qed_rdma_create_srq_in_params *in_params,
+		    struct qed_rdma_create_srq_out_params *out_params)
+{
+	struct rdma_srq_create_ramrod_data *p_ramrod;
+	struct qed_sp_init_data init_data = {};
+	struct qed_hwfn *p_hwfn = rdma_cxt;
+	enum qed_cxt_elem_type elem_type;
+	struct qed_spq_entry *p_ent;
+	u16 opaque_fid, srq_id;
+	struct qed_bmap *bmap;
+	u32 returned_id;
+	int rc;
+
+	bmap = &p_hwfn->p_rdma_info->srq_map;
+	spin_lock_bh(&p_hwfn->p_rdma_info->lock);
+	rc = qed_rdma_bmap_alloc_id(p_hwfn, bmap, &returned_id);
+	spin_unlock_bh(&p_hwfn->p_rdma_info->lock);
+
+	if (rc) {
+		DP_NOTICE(p_hwfn, "failed to allocate srq id\n");
+		return rc;
+	}
+
+	elem_type = QED_ELEM_SRQ;
+	rc = qed_cxt_dynamic_ilt_alloc(p_hwfn, elem_type, returned_id);
+	if (rc)
+		goto err;
+	/* returned id is no greater than u16 */
+	srq_id = (u16)returned_id;
+	opaque_fid = p_hwfn->hw_info.opaque_fid;
+
+	opaque_fid = p_hwfn->hw_info.opaque_fid;
+	init_data.opaque_fid = opaque_fid;
+	init_data.comp_mode = QED_SPQ_MODE_EBLOCK;
+
+	rc = qed_sp_init_request(p_hwfn, &p_ent,
+				 RDMA_RAMROD_CREATE_SRQ,
+				 p_hwfn->p_rdma_info->proto, &init_data);
+	if (rc)
+		goto err;
+
+	p_ramrod = &p_ent->ramrod.rdma_create_srq;
+	DMA_REGPAIR_LE(p_ramrod->pbl_base_addr, in_params->pbl_base_addr);
+	p_ramrod->pages_in_srq_pbl = cpu_to_le16(in_params->num_pages);
+	p_ramrod->pd_id = cpu_to_le16(in_params->pd_id);
+	p_ramrod->srq_id.srq_idx = cpu_to_le16(srq_id);
+	p_ramrod->srq_id.opaque_fid = cpu_to_le16(opaque_fid);
+	p_ramrod->page_size = cpu_to_le16(in_params->page_size);
+	DMA_REGPAIR_LE(p_ramrod->producers_addr, in_params->prod_pair_addr);
+
+	rc = qed_spq_post(p_hwfn, p_ent, NULL);
+	if (rc)
+		goto err;
+
+	out_params->srq_id = srq_id;
+
+	DP_VERBOSE(p_hwfn, QED_MSG_RDMA,
+		   "SRQ created Id = %x\n", out_params->srq_id);
+
+	return rc;
+
+err:
+	spin_lock_bh(&p_hwfn->p_rdma_info->lock);
+	qed_bmap_release_id(p_hwfn, bmap, returned_id);
+	spin_unlock_bh(&p_hwfn->p_rdma_info->lock);
+
+	return rc;
+}
+
 bool qed_rdma_allocated_qps(struct qed_hwfn *p_hwfn)
 {
 	bool result;
@@ -1773,6 +1946,9 @@ static int qed_roce_ll2_set_mac_filter(struct qed_dev *cdev,
 	.rdma_free_tid = &qed_rdma_free_tid,
 	.rdma_register_tid = &qed_rdma_register_tid,
 	.rdma_deregister_tid = &qed_rdma_deregister_tid,
+	.rdma_create_srq = &qed_rdma_create_srq,
+	.rdma_modify_srq = &qed_rdma_modify_srq,
+	.rdma_destroy_srq = &qed_rdma_destroy_srq,
 	.ll2_acquire_connection = &qed_ll2_acquire_connection,
 	.ll2_establish_connection = &qed_ll2_establish_connection,
 	.ll2_terminate_connection = &qed_ll2_terminate_connection,
diff --git a/drivers/net/ethernet/qlogic/qed/qed_rdma.h b/drivers/net/ethernet/qlogic/qed/qed_rdma.h
index 18ec9cb..6f722ee 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_rdma.h
+++ b/drivers/net/ethernet/qlogic/qed/qed_rdma.h
@@ -96,6 +96,8 @@ struct qed_rdma_info {
 	u8 num_cnqs;
 	u32 num_qps;
 	u32 num_mrs;
+	u32 num_srqs;
+	u16 srq_id_offset;
 	u16 queue_zone_base;
 	u16 max_queue_zones;
 	enum protocol_type proto;
diff --git a/drivers/net/ethernet/qlogic/qed/qed_roce.c b/drivers/net/ethernet/qlogic/qed/qed_roce.c
index 6acfd43..ee57fcd 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_roce.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_roce.c
@@ -65,6 +65,8 @@
 		     u8 fw_event_code,
 		     u16 echo, union event_ring_data *data, u8 fw_return_code)
 {
+	struct qed_rdma_events events = p_hwfn->p_rdma_info->events;
+
 	if (fw_event_code == ROCE_ASYNC_EVENT_DESTROY_QP_DONE) {
 		u16 icid =
 		    (u16)le32_to_cpu(data->rdma_data.rdma_destroy_qp_data.cid);
@@ -75,11 +77,18 @@
 		 */
 		qed_roce_free_real_icid(p_hwfn, icid);
 	} else {
-		struct qed_rdma_events *events = &p_hwfn->p_rdma_info->events;
+		if (fw_event_code == ROCE_ASYNC_EVENT_SRQ_EMPTY ||
+		    fw_event_code == ROCE_ASYNC_EVENT_SRQ_LIMIT) {
+			u16 srq_id = (u16)data->rdma_data.async_handle.lo;
+
+			events.affiliated_event(events.context, fw_event_code,
+						&srq_id);
+		} else {
+			union rdma_eqe_data rdata = data->rdma_data;
 
-		events->affiliated_event(p_hwfn->p_rdma_info->events.context,
-					 fw_event_code,
-				     (void *)&data->rdma_data.async_handle);
+			events.affiliated_event(events.context, fw_event_code,
+						(void *)&rdata.async_handle);
+		}
 	}
 
 	return 0;
diff --git a/include/linux/qed/qed_rdma_if.h b/include/linux/qed/qed_rdma_if.h
index 4dd72ba..e05e320 100644
--- a/include/linux/qed/qed_rdma_if.h
+++ b/include/linux/qed/qed_rdma_if.h
@@ -485,7 +485,9 @@ enum qed_iwarp_event_type {
 	QED_IWARP_EVENT_ACTIVE_MPA_REPLY,
 	QED_IWARP_EVENT_LOCAL_ACCESS_ERROR,
 	QED_IWARP_EVENT_REMOTE_OPERATION_ERROR,
-	QED_IWARP_EVENT_TERMINATE_RECEIVED
+	QED_IWARP_EVENT_TERMINATE_RECEIVED,
+	QED_IWARP_EVENT_SRQ_LIMIT,
+	QED_IWARP_EVENT_SRQ_EMPTY,
 };
 
 enum qed_tcp_ip_version {
@@ -646,6 +648,14 @@ struct qed_rdma_ops {
 	int (*rdma_alloc_tid)(void *rdma_cxt, u32 *itid);
 	void (*rdma_free_tid)(void *rdma_cxt, u32 itid);
 
+	int (*rdma_create_srq)(void *rdma_cxt,
+			       struct qed_rdma_create_srq_in_params *iparams,
+			       struct qed_rdma_create_srq_out_params *oparams);
+	int (*rdma_destroy_srq)(void *rdma_cxt,
+				struct qed_rdma_destroy_srq_in_params *iparams);
+	int (*rdma_modify_srq)(void *rdma_cxt,
+			       struct qed_rdma_modify_srq_in_params *iparams);
+
 	int (*ll2_acquire_connection)(void *rdma_cxt,
 				      struct qed_ll2_acquire_data *data);
 
-- 
1.8.3.1


* Re: [PATCH bpf-next v3 05/11] bpf: avoid retpoline for lookup/update/delete calls on maps
From: Jesper Dangaard Brouer @ 2018-06-03 17:08 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: brouer, alexei.starovoitov, netdev, Phil Sutter, Jakub Kicinski,
	Jakub Kicinski, Quentin Monnet
In-Reply-To: <d05e733b-7f54-9fd9-e80a-67e704197d14@iogearbox.net>

On Sun, 3 Jun 2018 18:11:45 +0200
Daniel Borkmann <daniel@iogearbox.net> wrote:

> On 06/03/2018 08:56 AM, Jesper Dangaard Brouer wrote:
> > On Sat,  2 Jun 2018 23:06:35 +0200
> > Daniel Borkmann <daniel@iogearbox.net> wrote:
> >   
> >> Before:
> >>
> >>   # bpftool p d x i 1  
> > 
> > Could this please be changed to:
> > 
> >  # bpftool prog dump xlated id 1
> > 
> > I requested this before, but you seem to have missed my feedback...
> > This makes the command "self-documenting" and searchable by Google.  
> 
> I recently wrote a howto here, but there's also excellent documentation
> in terms of man pages for bpftool.
> 
> http://cilium.readthedocs.io/en/latest/bpf/#bpftool
> 
> My original thinking was that it might be okay to also show usage of
> short option matching, like in iproute2 probably few people only write
> 'ip address' but majority uses 'ip a' instead. But I'm fine either way
> if there are strong opinions ... thanks Alexei for fixing up!

First of all, I love your documentation effort.

Secondly, I personally *hate* how 'ip' does its short-option parsing,
and especially the order/precedence ambiguity.  Phil Sutter
(Fedora/RHEL iproute2 maintainer) has a funny quiz illustrating the
ambiguity issues.

Quiz: https://youtu.be/cymH9pcFGa0?t=7m10s
Code problem: https://youtu.be/cymH9pcFGa0?t=9m8s

I hope the maintainers and developers of bpftool make sure we don't end
up in an ambiguity mess like we have with 'ip', pretty please.
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
