Netdev List
* Re: [PATCH bpf] xsk: re-add queue id check for XDP_SKB path
From: Daniel Borkmann @ 2018-06-12 10:21 UTC
  To: Björn Töpel, magnus.karlsson, magnus.karlsson, ast,
	netdev
  Cc: Björn Töpel, qi.z.zhang
In-Reply-To: <20180612100256.21300-1-bjorn.topel@gmail.com>

On 06/12/2018 12:02 PM, Björn Töpel wrote:
> From: Björn Töpel <bjorn.topel@intel.com>
> 
> Commit 173d3adb6f43 ("xsk: add zero-copy support for Rx") introduced a
> regression on the XDP_SKB receive path, when the queue id checks were
> removed. Now, they are back again.
> 
> Fixes: 173d3adb6f43 ("xsk: add zero-copy support for Rx")
> Reported-by: Qi Zhang <qi.z.zhang@intel.com>
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Applied to bpf, thanks Björn!


* Re: [PATCH net] tls: fix NULL pointer dereference on poll
From: Daniel Borkmann @ 2018-06-12 10:43 UTC
  To: Christoph Hellwig; +Cc: davem, davejwatson, netdev, ast
In-Reply-To: <20180612053749.GA16853@lst.de>

On 06/12/2018 07:37 AM, Christoph Hellwig wrote:
>> Looks like the recent conversion from poll to the poll_mask callback, started
>> in 152524231023 ("net: add support for ->poll_mask in proto_ops"), missed
>> converting kTLS as well: TCP's ->poll was converted over to ->poll_mask
>> in commit 2c7d3dacebd4 ("net/tcp: convert to ->poll_mask"), and therefore
>> kTLS wrongly saved the old ->poll callback, which is now NULL.
> 
> Looks like this TLS code was added in the same cycle. 
> 
>> diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
>> index 301f224..a127d61 100644
>> --- a/net/tls/tls_main.c
>> +++ b/net/tls/tls_main.c
>> @@ -712,7 +712,7 @@ static int __init tls_register(void)
>>  	build_protos(tls_prots[TLSV4], &tcp_prot);
>>  
>>  	tls_sw_proto_ops = inet_stream_ops;
>> -	tls_sw_proto_ops.poll = tls_sw_poll;
>> +	tls_sw_proto_ops.poll_mask = tls_sw_poll_mask;
>>  	tls_sw_proto_ops.splice_read = tls_sw_splice_read;
> 
> Not new in this patch, but copying ops vectors is a very bad idea, not
> only because your new instance can't be marked const and you thus open
> up exploit vectors. I would suggest to clean this up eventually.

Generally, I agree with you. It could at minimum also be a __ro_after_init
candidate, at least the TLSV4 ops, which wouldn't change. In the v6 case,
though, it could be loaded as a module after TLS was initialized.
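
To illustrate the copy-and-patch pattern being criticized here, the following
is a minimal userspace sketch, not the kTLS code. The struct demo_ops type and
all function names in it are hypothetical stand-ins for an ops vector such as
inet_stream_ops. It shows why the derived table cannot be const: it is filled
in at init time.

```c
#include <stddef.h>

/* Hypothetical stand-in for a kernel ops vector such as struct proto_ops. */
struct demo_ops {
	int (*poll_mask)(int events);
	int (*splice_read)(int len);
};

static int base_poll_mask(int events) { return events; }
static int base_splice_read(int len) { return len; }

/* Override used by the derived table: clears bit 0 of the event mask. */
static int demo_tls_poll_mask(int events) { return events & ~1; }

/* The shared base table is const, so it can live in read-only memory. */
static const struct demo_ops base_ops = {
	.poll_mask = base_poll_mask,
	.splice_read = base_splice_read,
};

/*
 * The derived table is built by copying the base and patching one member.
 * Because it is written at init time it cannot be declared const; in the
 * kernel, __ro_after_init would at least make it read-only after boot.
 */
static struct demo_ops derived_ops;

void derived_ops_init(void)
{
	derived_ops = base_ops;			/* copy the whole vector */
	derived_ops.poll_mask = demo_tls_poll_mask;	/* patch one entry */
}

const struct demo_ops *derived_ops_get(void)
{
	return &derived_ops;
}
```

After derived_ops_init(), callers reaching the table through
derived_ops_get() see the overridden poll_mask and the inherited splice_read,
while the table itself stays writable for the lifetime of the program.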

>> +__poll_t tls_sw_poll_mask(struct socket *sock, __poll_t events)
>>  {
>>  	struct sock *sk = sock->sk;
>>  	struct tls_context *tls_ctx = tls_get_ctx(sk);
>>  	struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
>> +	__poll_t mask;
>>  
>> +	/* Grab EPOLLOUT and EPOLLHUP from the underlying socket */
>> +	mask = ctx->sk_poll_mask(sock, events);
>>  
>> +	/* Clear EPOLLIN bits, and set based on recv_pkt */
>> +	mask &= ~(EPOLLIN | EPOLLRDNORM);
>>  	if (ctx->recv_pkt)
>> +		mask |= EPOLLIN | EPOLLRDNORM;
>>  
>> +	return mask;
> 
> So you call the underlying protocol method on the struct sock of
> the TLS code?  Again not really new in this patch, but how is this
> even supposed to work?

Yeah, the patch doesn't change it, but the reason is that TLS relies on the
kernel's stream parser to determine TLS message boundaries on ingress, so only
once a full message has been received do we want to signal this to the user
space application. That skb is then held in ctx->recv_pkt via the stream parser.
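
The mask adjustment can be sketched in userspace as follows. This is only an
illustration of the logic in the quoted hunk: the function name and the
have_record parameter (standing in for ctx->recv_pkt being non-NULL) are
assumptions, and lower_mask stands in for the value returned by the
underlying socket's poll_mask callback.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/epoll.h>

/*
 * Userspace sketch of the readiness logic: keep whatever the lower socket
 * reported (EPOLLOUT, EPOLLHUP, ...), but report readability only when a
 * complete TLS record is available.
 */
uint32_t tls_like_poll_mask(uint32_t lower_mask, bool have_record)
{
	uint32_t mask = lower_mask;

	/* Raw TCP bytes being readable does not mean a full record is. */
	mask &= ~(uint32_t)(EPOLLIN | EPOLLRDNORM);
	if (have_record)
		mask |= (uint32_t)(EPOLLIN | EPOLLRDNORM);

	return mask;
}
```

With this, a socket whose lower layer reports EPOLLIN | EPOLLOUT but has no
parsed record yet is reported as writable only.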

Thanks,
Daniel


* [PATCH] selftests: bpf: config: add config fragments
From: Anders Roxell @ 2018-06-12 11:05 UTC
  To: ast, daniel, shuah; +Cc: netdev, linux-kernel, linux-kselftest, Anders Roxell

The test_tunnel.sh test fails because the required config fragments aren't enabled.

Fixes: 933a741e3b82 ("selftests/bpf: bpf tunnel test.")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
---

All tests pass except ip6gretap, which still fails. I'm unsure why.
Ideas?

Cheers,
Anders

 tools/testing/selftests/bpf/config | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config
index 1eefe211a4a8..7eb613ffef55 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -7,3 +7,13 @@ CONFIG_CGROUP_BPF=y
 CONFIG_NETDEVSIM=m
 CONFIG_NET_CLS_ACT=y
 CONFIG_NET_SCH_INGRESS=y
+CONFIG_NET_IPIP=y
+CONFIG_IPV6=y
+CONFIG_NET_IPGRE_DEMUX=y
+CONFIG_NET_IPGRE=y
+CONFIG_IPV6_GRE=y
+CONFIG_CRYPTO_USER_API_HASH=m
+CONFIG_CRYPTO_HMAC=m
+CONFIG_CRYPTO_SHA256=m
+CONFIG_VXLAN=y
+CONFIG_GENEVE=y
-- 
2.17.1


* Re: [PULL] vhost: cleanups and fixes
From: Wei Wang @ 2018-06-12 11:05 UTC
  To: Linus Torvalds, Michael S. Tsirkin
  Cc: KVM list, Network Development, Linux Kernel Mailing List,
	Bjorn Andersson, Andrew Morton, virtualization
In-Reply-To: <CA+55aFyNhEzzufw0XP9DcqZNS1CH+jDGdN4CVnazb3ssFxFbzQ@mail.gmail.com>

On 06/12/2018 09:59 AM, Linus Torvalds wrote:
> On Mon, Jun 11, 2018 at 6:36 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>> Maybe it will help to have GFP_NONE which will make any allocation
>> fail if attempted. Linus, would this address your comment?
> It would definitely have helped me initially overlook that call chain.
>
> But then when I started looking at the whole dma_map_page() thing, it
> just raised my hackles again.
>
> I would seriously suggest having a much simpler version for the "no
> allocation, no dma mapping" case, so that it's *obvious* that that
> never happens.
>
> So instead of having virtio_balloon_send_free_pages() call a really
> generic complex chain of functions that in _some_ cases can do memory
> allocation, why isn't there a short-circuited "virtqueue_add_datum()"
> that is guaranteed to never do anything like that?
>
> Honestly, I look at "add_one_sg()" and it really doesn't make me
> happy. It looks hacky as hell. If I read the code right, you're really
> trying to just queue up a simple tuple of <pfn,len>, except you encode
> it as a page pointer in order to play games with the SG logic, and
> then you map that to the ring, except in this case it's all a fake
> ring that just adds the cpu-physical address instead.
>
> And to figure that out, it's like five layers of indirection through
> different helper functions that *can* do more generic things but in
> this case don't.
>
> And you do all of this from a core VM callback function with some
> _really_ core VM locks held.
>
> That makes no sense to me.
>
> How about this:
>
>   - get rid of all that code
>
>   - make the core VM callback save the "these are the free memory
> regions" in a fixed and limited array. One that DOES JUST THAT. No
> crazy "SG IO dma-mapping function crap". Just a plain array of a fixed
> size, pre-allocated for that virtio instance.
>
>   - make it obvious that what you do in that sequence is ten
> instructions and no allocations ("Look ma, I wrote a value to an array
> and incremented the array index, and I'M DONE")
>
>   - then in that workqueue entry that you start *anyway*, you empty the
> array and do all the crazy virtio stuff.
>
> In fact, while at it, just simplify the VM interface too. Instead of
> traversing a random number of buddy lists, just traverse *one* - the
> top-level one. Are you seriously ever going to shrink or mark
> read-only anything *but* something big enough to be in the maximum
> order?
>
> MAX_ORDER is what, 11? So we're talking 8MB blocks. Do you *really*
> want the balloon code to work on smaller things, particularly since
> the whole interface is fundamentally racy and opportunistic to begin
> with?

OK, I will implement a new version based on the suggestions. Thanks.

Best,
Wei
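
The suggested split between a trivial hot path and a deferred drain might look
roughly like the sketch below. It is a userspace illustration only: the type
and function names, the capacity of 64 slots, and the copy-out drain interface
are all assumptions, not anything taken from the virtio-balloon code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FREE_REGION_SLOTS 64	/* hypothetical fixed capacity */

/* One <pfn,len> tuple, as described in the suggestion. */
struct free_region {
	uint64_t pfn;
	uint64_t len;
};

/* Pre-allocated once per device instance; never grows. */
struct free_region_log {
	struct free_region slots[FREE_REGION_SLOTS];
	size_t count;
};

/*
 * Hot path (the core VM callback): store a value, bump the index, done.
 * No allocation and no DMA mapping. Returns false when the array is full,
 * which is acceptable because the interface is opportunistic anyway.
 */
bool free_region_record(struct free_region_log *log, uint64_t pfn, uint64_t len)
{
	if (log->count >= FREE_REGION_SLOTS)
		return false;

	log->slots[log->count].pfn = pfn;
	log->slots[log->count].len = len;
	log->count++;
	return true;
}

/*
 * Deferred path (the workqueue): copy out up to max entries and keep any
 * remainder at the front of the array. The expensive virtio work would be
 * done on the copied entries, outside the core VM locks.
 */
size_t free_region_drain(struct free_region_log *log,
			 struct free_region *out, size_t max)
{
	size_t n = log->count < max ? log->count : max;

	memcpy(out, log->slots, n * sizeof(*out));
	memmove(log->slots, log->slots + n, (log->count - n) * sizeof(*out));
	log->count -= n;
	return n;
}
```

The point of the design is that free_region_record() is short enough to audit
at a glance, while everything that may allocate or map memory is confined to
whoever calls free_region_drain().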


* Re: [Qemu-devel] [PATCH] qemu: Introduce VIRTIO_NET_F_STANDBY feature bit to virtio_net
From: Michael S. Tsirkin @ 2018-06-12 11:34 UTC
  To: Samudrala, Sridhar
  Cc: alexander.h.duyck, virtio-dev, aaron.f.brown, jiri, kubakici,
	netdev, qemu-devel, loseweigh, virtualization
In-Reply-To: <23fc4aa4-ec41-d6e2-3354-10cbfc13b7ec@intel.com>

On Mon, Jun 11, 2018 at 10:02:45PM -0700, Samudrala, Sridhar wrote:
> On 6/11/2018 7:17 PM, Michael S. Tsirkin wrote:
> > On Tue, Jun 12, 2018 at 09:54:44AM +0800, Jason Wang wrote:
> > > 
> > > On 2018-06-12 01:26, Michael S. Tsirkin wrote:
> > > > On Mon, May 07, 2018 at 04:09:54PM -0700, Sridhar Samudrala wrote:
> > > > > This feature bit can be used by hypervisor to indicate virtio_net device to
> > > > > act as a standby for another device with the same MAC address.
> > > > > 
> > > > > I tested this with a small change to the patch to mark the STANDBY feature 'true'
> > > > > > by default as I am using libvirt to start the VMs.
> > > > > Is there a way to pass the newly added feature bit 'standby' to qemu via libvirt
> > > > > XML file?
> > > > > 
> > > > > Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
> > > > So I do not think we can commit to this interface: we
> > > > really need to control visibility of the primary device.
> > > The problem is legacy guest won't use primary device at all if we do this.
> > And that's by design - I think it's the only way to ensure the
> > legacy guest isn't confused.
> 
> Yes. I think so. But I am not sure if Qemu is the right place to control the visibility
> of the primary device. The primary device may not be specified as an argument to Qemu. It
> may be plugged in later.
> The cloud service provider is providing a feature that enables low latency datapath and live
> migration capability.
> A tenant can use this feature only if he is running a VM that has virtio-net with failover support.

Well, live migration is there already. The new feature is the low latency
data path.

And it's the guest that needs failover support, not the VM.


> I think Qemu should check if guest virtio-net supports this feature and provide a mechanism for
> an upper layer indicating if the STANDBY feature is successfully negotiated or not.
> The upper layer can then decide if it should hot plug a VF with the same MAC and manage the 2 links.
> If VF is successfully hot plugged, virtio-net link should be disabled.

Did you even talk to upper layer management about it?
Just list the steps they need to do and you will see
that's a lot of machinery to manage by the upper layer.

What do we gain in flexibility? As far as I can see the
only gain is some resources saved for legacy VMs.

That's not a lot, as the tenant of the upper layer probably already has
at least a hunch that it's a new guest; otherwise
why bother specifying the feature at all - you
save even more resources without it.
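
For reference, the negotiation check being discussed boils down to a bitwise
intersection of what the device offers and what the guest driver accepts. The
sketch below is a standalone illustration, not QEMU code: the function names
are hypothetical, and only the bit number 62 comes from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_F_STANDBY 62	/* bit number taken from the patch */

/* A feature is negotiated only if both sides set the corresponding bit. */
uint64_t negotiate_features(uint64_t host_features, uint64_t guest_features)
{
	return host_features & guest_features;
}

/* True if STANDBY survived negotiation, i.e. the guest supports failover. */
bool standby_negotiated(uint64_t host_features, uint64_t guest_features)
{
	uint64_t negotiated = negotiate_features(host_features, guest_features);

	return (negotiated & (1ULL << VIRTIO_NET_F_STANDBY)) != 0;
}
```

A legacy guest that does not acknowledge bit 62 makes standby_negotiated()
return false even when the host offers the feature.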




> 
> > 
> > > How about controlling the visibility of the standby device?
> > > 
> > > Thanks
> > The standby is always there to guarantee no downtime.
> > 
> > > > However just for testing purposes, we could add a non-stable
> > > > interface "x-standby" with the understanding that as any
> > > > x- prefix it's unstable and will be changed down the road,
> > > > likely in the next release.
> > > > 
> > > > 
> > > > > ---
> > > > >    hw/net/virtio-net.c                         | 2 ++
> > > > >    include/standard-headers/linux/virtio_net.h | 3 +++
> > > > >    2 files changed, 5 insertions(+)
> > > > > 
> > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > index 90502fca7c..38b3140670 100644
> > > > > --- a/hw/net/virtio-net.c
> > > > > +++ b/hw/net/virtio-net.c
> > > > > @@ -2198,6 +2198,8 @@ static Property virtio_net_properties[] = {
> > > > >                         true),
> > > > >        DEFINE_PROP_INT32("speed", VirtIONet, net_conf.speed, SPEED_UNKNOWN),
> > > > >        DEFINE_PROP_STRING("duplex", VirtIONet, net_conf.duplex_str),
> > > > > +    DEFINE_PROP_BIT64("standby", VirtIONet, host_features, VIRTIO_NET_F_STANDBY,
> > > > > +                      false),
> > > > >        DEFINE_PROP_END_OF_LIST(),
> > > > >    };
> > > > > diff --git a/include/standard-headers/linux/virtio_net.h b/include/standard-headers/linux/virtio_net.h
> > > > > index e9f255ea3f..01ec09684c 100644
> > > > > --- a/include/standard-headers/linux/virtio_net.h
> > > > > +++ b/include/standard-headers/linux/virtio_net.h
> > > > > @@ -57,6 +57,9 @@
> > > > >    					 * Steering */
> > > > >    #define VIRTIO_NET_F_CTRL_MAC_ADDR 23	/* Set MAC address */
> > > > > +#define VIRTIO_NET_F_STANDBY      62    /* Act as standby for another device
> > > > > +                                         * with the same MAC.
> > > > > +                                         */
> > > > >    #define VIRTIO_NET_F_SPEED_DUPLEX 63	/* Device set linkspeed and duplex */
> > > > >    #ifndef VIRTIO_NET_NO_LEGACY
> > > > > -- 
> > > > > 2.14.3


* [PATCH 1/2] ath10k: do not mix spaces and tabs in Kconfig
From: Niklas Cassel @ 2018-06-12 11:39 UTC
  To: Kalle Valo, David S. Miller
  Cc: Niklas Cassel, ath10k, linux-wireless, netdev, linux-kernel

Do not mix spaces and tabs in Kconfig.

Signed-off-by: Niklas Cassel <niklas.cassel@linaro.org>
---
 drivers/net/wireless/ath/ath10k/Kconfig | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/Kconfig b/drivers/net/wireless/ath/ath10k/Kconfig
index 84f071ac0d84..54ff5930126c 100644
--- a/drivers/net/wireless/ath/ath10k/Kconfig
+++ b/drivers/net/wireless/ath/ath10k/Kconfig
@@ -1,15 +1,15 @@
 config ATH10K
-        tristate "Atheros 802.11ac wireless cards support"
-        depends on MAC80211 && HAS_DMA
+	tristate "Atheros 802.11ac wireless cards support"
+	depends on MAC80211 && HAS_DMA
 	select ATH_COMMON
 	select CRC32
 	select WANT_DEV_COREDUMP
 	select ATH10K_CE
-        ---help---
-          This module adds support for wireless adapters based on
-          Atheros IEEE 802.11ac family of chipsets.
+	---help---
+	  This module adds support for wireless adapters based on
+	  Atheros IEEE 802.11ac family of chipsets.
 
-          If you choose to build a module, it'll be called ath10k.
+	  If you choose to build a module, it'll be called ath10k.
 
 config ATH10K_CE
 	bool
@@ -41,12 +41,12 @@ config ATH10K_USB
 	  work in progress and will not fully work.
 
 config ATH10K_SNOC
-        tristate "Qualcomm ath10k SNOC support (EXPERIMENTAL)"
-        depends on ATH10K && ARCH_QCOM
-        ---help---
-          This module adds support for integrated WCN3990 chip connected
-          to system NOC(SNOC). Currently work in progress and will not
-          fully work.
+	tristate "Qualcomm ath10k SNOC support (EXPERIMENTAL)"
+	depends on ATH10K && ARCH_QCOM
+	---help---
+	  This module adds support for integrated WCN3990 chip connected
+	  to system NOC(SNOC). Currently work in progress and will not
+	  fully work.
 
 config ATH10K_DEBUG
 	bool "Atheros ath10k debugging"
-- 
2.17.1


* [PATCH 2/2] ath10k: allow ATH10K_SNOC with COMPILE_TEST
From: Niklas Cassel @ 2018-06-12 11:39 UTC
  To: Kalle Valo, David S. Miller
  Cc: Niklas Cassel, ath10k, linux-wireless, netdev, linux-kernel
In-Reply-To: <20180612113907.15043-1-niklas.cassel@linaro.org>

ATH10K_SNOC builds just fine with COMPILE_TEST, so make that possible.

Signed-off-by: Niklas Cassel <niklas.cassel@linaro.org>
---
 drivers/net/wireless/ath/ath10k/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath10k/Kconfig b/drivers/net/wireless/ath/ath10k/Kconfig
index 54ff5930126c..6572a43590a8 100644
--- a/drivers/net/wireless/ath/ath10k/Kconfig
+++ b/drivers/net/wireless/ath/ath10k/Kconfig
@@ -42,7 +42,8 @@ config ATH10K_USB
 
 config ATH10K_SNOC
 	tristate "Qualcomm ath10k SNOC support (EXPERIMENTAL)"
-	depends on ATH10K && ARCH_QCOM
+	depends on ATH10K
+	depends on ARCH_QCOM || COMPILE_TEST
 	---help---
 	  This module adds support for integrated WCN3990 chip connected
 	  to system NOC(SNOC). Currently work in progress and will not
-- 
2.17.1


* Re: WARNING: kmalloc bug in xdp_umem_create
From: Daniel Borkmann @ 2018-06-12 12:08 UTC
  To: Björn Töpel, penguin-kernel
  Cc: dvyukov, syzbot+4abadc5d69117b346506, Björn Töpel,
	Karlsson, Magnus, David Miller, LKML, Netdev, syzkaller-bugs
In-Reply-To: <CAJ+HfNh9pRGcd9EO7BEfPPEdCmP5EDdu_rNgLR7r4oDrcLgvQQ@mail.gmail.com>

On 06/10/2018 03:03 PM, Björn Töpel wrote:
> On Sun, 10 June 2018 at 14:53, Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>> On 2018/06/10 20:52, Dmitry Vyukov wrote:
>>> On Sun, Jun 10, 2018 at 11:31 AM, Björn Töpel <bjorn.topel@gmail.com> wrote:
>>>> On Sun, 10 June 2018 at 04:53, Tetsuo Handa
>>>> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>>>> On 2018/06/10 7:47, syzbot wrote:
>>>>>> Hello,
>>>>>>
>>>>>> syzbot found the following crash on:
>>>>>>
>>>>>> HEAD commit:    7d3bf613e99a Merge tag 'libnvdimm-for-4.18' of git://git.k..
>>>>>> git tree:       upstream
>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1073f68f800000
>>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=f04d8d0a2afb789a
>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=4abadc5d69117b346506
>>>>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>>>>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=13c9756f800000
>>>>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16366f9f800000
>>>>>>
>>>>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>>>>> Reported-by: syzbot+4abadc5d69117b346506@syzkaller.appspotmail.com
>>>>>>
>>>>>> random: sshd: uninitialized urandom read (32 bytes read)
>>>>>> random: sshd: uninitialized urandom read (32 bytes read)
>>>>>> random: sshd: uninitialized urandom read (32 bytes read)
>>>>>> random: sshd: uninitialized urandom read (32 bytes read)
>>>>>> random: sshd: uninitialized urandom read (32 bytes read)
>>>>>> WARNING: CPU: 1 PID: 4537 at mm/slab_common.c:996 kmalloc_slab+0x56/0x70 mm/slab_common.c:996
>>>>>> Kernel panic - not syncing: panic_on_warn set ...
>>>>>
>>>>> syzbot gave up upon kmalloc(), but actually error handling path has
>>>>> NULL pointer dereference bug.
>>>>
>>>> Thanks Tetsuo! This crash has been fixed by Daniel Borkmann in commit
>>>> c09290c56376 ("bpf, xdp: fix crash in xdp_umem_unaccount_pages").
>>>
>>> Let's tell syzbot about this:
>>>
>>> #syz fix: bpf, xdp: fix crash in xdp_umem_unaccount_pages
>>>
>> Excuse me, but that patch fixes NULL pointer dereference which occurs after kmalloc()'s
>> "WARNING: CPU: 1 PID: 4537 at mm/slab_common.c:996 kmalloc_slab+0x56/0x70 mm/slab_common.c:996"
>> message. That is, "Too large memory allocation" itself is not yet fixed.
> 
> The code relies on the sl{u,a,o}b layer saying no, after which the
> setsockopt bails out. The warning could be suppressed using
> __GFP_NOWARN. Is there another preferred way? Two get_user_pages
> calls, where the first call would set pages to NULL just to fault the
> region? Walk the process' VMAs? Something else?

(Now resolved as well.)

#syz fix: xsk: silence warning on memory allocation failure


* Re: mainline: x86_64: kernel panic: RIP: 0010:__xfrm_policy_check+0xcb/0x690
From: Anders Roxell @ 2018-06-12 12:09 UTC
  To: Steffen Klassert
  Cc: Naresh Kamboju, Networking, David S. Miller, herbert,
	open list:KERNEL SELFTEST FRAMEWORK, open list
In-Reply-To: <20180612083435.7f7k4exergraaa2u@gauss3.secunet.de>

On 12 June 2018 at 10:34, Steffen Klassert <steffen.klassert@secunet.com> wrote:
> On Mon, Jun 11, 2018 at 10:11:46PM +0530, Naresh Kamboju wrote:
>> A kernel panic occurs on an x86_64 machine running the mainline 4.17.0
>> kernel while running the selftests bpf test_tunnel.sh test.
>> I first noticed this kernel panic on 4.17.0-rc7-next-20180529, and it
>> still happens on 4.17.0-next-20180608.
>>
>> [  213.638287] BUG: unable to handle kernel NULL pointer dereference
>> at 0000000000000008
>> ++[ ip xfrm poli  213.674036] PGD 0 P4D 0
>> [  213.674118] audit: type=1327 audit(1528917683.623:7):
>> proctitle=6970007866726D00706F6C69637900616464007372630031302E312E312E3130302F3332006473740031302E312E312E3230302F33320064697200696E00746D706C00737263003137322E31362E312E31303000647374003137322E31362E312E3230300070726F746F006573700072657169640031006D6F64650074756E6E
>> [  213.677950] Oops: 0000 [#1] SMP PTI
>> cy[ add src 10.1.  213.677952] CPU: 2 PID: 0 Comm: swapper/2 Tainted:
>> G        W         4.17.0-next-20180608 #1
>> [  213.677953] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
>> 2.0b 07/27/2017
>> [  213.726998] RIP: 0010:__xfrm_policy_check+0xcb/0x690
>> [  213.731962] Code: 80 3d 0a d8 f1 00 00 0f 84 c1 02 00 00 4c 8b 25
>> 2b af f4 00 e8 66 a6 6a ff 85 c0 74 0d 80 3d eb d7 f1 00 00 0f 84 d5
>> 02 00 00 <49> 8b 44 24 08 48 85 c0 74 0c 48 8d b5 78 ff ff ff 4c 89 ff
>> ff d0
>
> This looks like a bug that I've seen already. If it is what I think,
> then commit 2c205dd3981f ("netfilter: add struct nf_nat_hook and use
> it") introduced this bug.
>
> There was already a fix for this on the netdev list, but
> I don't know the current status of that patch:
>
> https://patchwork.ozlabs.org/patch/921387/

Hi, I applied the patch and ran bpf/test_tunnel.sh, and I couldn't
see any crash.
However, the script never returned (I had to Ctrl+C to get back). Any ideas?
See the log from the test below.

Cheers,
Anders

Testing IPSec tunnel...
[  269.060050] IPv6: ADDRCONF(NETDEV_UP): veth0: link is not ready
[  269.090000] IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
          <idle>-0     [000] ..s3   190.987095: 0: key 2 remote ip 0xac100164
            ping-3043  [000] ..s3   190.988715: 0: key 2 remote ip 0xac100164
     ksoftirqd/0-9     [000] ..s2   190.988986: 0: key 2 remote ip 0xac100164
 systemd-resolve-2664  [003] ..s3   191.083771: 0: ERROR line:77 ret:-22
 systemd-resolve-2664  [003] ..s3   191.333763: 0: ERROR line:77 ret:-22
     kworker/0:1-33    [000] ..s4   191.419445: 0: key 2 remote ip 0xac100164
            ping-3043  [000] ..s3   191.989437: 0: key 2 remote ip 0xac100164
     kworker/0:1-33    [000] ..s4   192.443460: 0: key 2 remote ip 0xac100164
     kworker/0:1-33    [000] ..s4   192.443508: 0: key 2 remote ip 0xac100164
          <idle>-0     [000] ..s3   192.446318: 0: key 2 remote ip 0xac100164
 systemd-resolve-2664  [000] ..s3   192.768767: 0: ERROR line:77 ret:-22
            ping-3043  [000] ..s3   192.989902: 0: key 2 remote ip 0xac100164
            ping-3044  [000] ..s3   193.025776: 0: key 2 remote ip 0xac100164
 systemd-resolve-2664  [000] ..s3   193.083650: 0: ERROR line:77 ret:-22
 systemd-resolve-2664  [000] ..s3   193.333865: 0: ERROR line:77 ret:-22
            ping-3044  [000] ..s3   194.026240: 0: key 2 remote ip 0xac100164
            ping-3044  [000] ..s3   195.026707: 0: key 2 remote ip 0xac100164
     ksoftirqd/2-21    [002] ..s2   198.075583: 0: key 2 remote ip6
::11000000 label bcdef
     ksoftirqd/2-21    [002] ..s2   198.075597: 0: key 2 remot[
269.270883] audit: type=1415 audit(1532018021.150:6): op=SAD-add
auid=0 ses=2 subj=kerne
l src=172.16.1.100 dst=172.16.1.200 spi=1(0x1) res=1
[  269.284308] audit: type=1300 audit(1532018021.150:6): arch=c000003e
syscall=46 success=yes exit=424 a0=4 a1=7ffff18d1ba0 a2=0 a3=5e9
items=0 ppid=2924
 pid=4333 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0
fsgid=0 tty=pts0 ses=2 comm=\"ip\" exe=\"/sbin/ip.iproute2\"
subj=kernel key=(null)
[  269.310249] audit: type=1327 audit(1532018021.150:6):
proctitle=6970007866726D0073746174650061646400737263003137322E31362E312E31303000647374003137322E
31362E312E3230300070726F746F0065737000737069003078310072657169640031006D6F64650074756E6E656C00617574682D7472756E6300686D616328736861312900307831313131313
13131313131313131
e ip6 ::11000000 label bcdef
            ping-3164  [003] ..s3   198.113160: 0: key 2 remote ip6
::11000000 label bcdef
            ping-3164  [003] ..s3   199.113661: 0: key 2 remote ip6
::11000000 label bcdef
          <idle>-0     [000] ..s3   199.931430: 0: key 2 remote ip6
::11000000 label bcdef
            ping-3164  [003] ..s3   200.114432: 0: key 2 remote ip6
::11000000 label bcdef
            ping-3[  269.374987] audit: type=1415
audit(1532018021.373:7): op=SPD-add auid=0 ses=2 subj=kernel res=1
src=10.1.1.100 dst=10.1.1.200
165  [002] ..s3 [  269.386787] audit: type=1300
audit(1532018021.373:7): arch=c000003e syscall=46 success=yes exit=252
a0=4 a1=7ffe1e400ff0 a2=0 a3=5e9 i
tems=0 ppid=2924 pid=4354 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0
egid=0 sgid=0 fsgid=0 tty=pts0 ses=2 comm=\"ip\"
exe=\"/sbin/ip.iproute2\" subj=kernel
 key=(null)
[  269.414124] audit: type=1327 audit(1532018021.373:7):
proctitle=6970007866726D00706F6C69637900616464007372630031302E312E312E3130302F333200647374003130
2E312E312E3230302F333200646972006F757400746D706C00737263003137322E31362E312E31303000647374003137322E31362E312E3230300070726F746F0065737000726571696400310
06D6F64650074756E
  200.133573: 0: key 2 remote ip6 ::11000000 label bcdef
            ping-3165  [002] ..s3   201.134091: 0: key 2 remote ip6
::11000000 label bcdef
            ping-3165  [002] ..s3   202.134600: 0: key 2 remote ip6
::11000000 label bcdef
           ping6-3167  [000] ..s3   202.172808: 0: key 2 remote ip6
::11000000 label bcdef
  [  269.471306] audit: type=1415 audit(1532018021.470:8): op=SAD-add
auid=0 ses=2 subj=kernel src=172.16.1.200 dst=172.16.1.100 spi=2(0x2)
res=1
[  269.484439] audit: type=1300 audit(1532018021.470:8): arch=c000003e
syscall=46 success=yes exit=424 a0=4 a1=7ffc79931450 a2=0 a3=5e9
items=0 ppid=2924
 pid=4355 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0
fsgid=0 tty=pts0 ses=2 comm=\"ip\" exe=\"/sbin/ip.iproute2\"
subj=kernel key=(null)
[  269.510375] audit: type=1327 audit(1532018021.470:8):
proctitle=6970007866726D0073746174650061646400737263003137322E31362E312E32303000647374003137322E
31362E312E3130300070726F746F0065737000737069003078320072657169640032006D6F64650074756E6E656C00617574682D7472756E6300686D616328736861312900307831313131313
13131313131313131
         ping6-3167  [000] ..s3   203.173251: 0: key 2 remote ip6
::11000000 label bcdef
           ping6-3167  [000] ..s3   204.173741: 0: key 2 remote ip6
::11000000 label bcdef
 systemd-resolve-2664  [001] ..s3   205.333801: 0: ERROR line:119 ret:-22
 systemd-resolve-2664  [001] ..s3   205.583819: 0: ERROR line:119 ret:-22
 systemd-resolve-2664  [001] ..s3   205.833437: 0: E[  269.572155]
audit: type=1415 audit(1532018021.571:9): op=SPD-add auid=0 ses=2
subj=kernel res=1 sr
c=10.1.1.200 dst=10.1.1.100
RROR line:119 ret:-22
 systemd-resolve-2664  [001] ..s3   206.583819: 0: ERROR line:119 ret:-22
 systemd-resolve-2664  [003] ..s3   206.782769: 0: ERROR line:119 ret:-22
     kworker/3:2-1537  [003] ..s4   207.035785: 0: key 2 remote ip6
::11000000 label bcdef
     kworker/3:2-1537  [003] ..s4   207.035796: 0: key 2 remote ip6
::11000000 label bcdef
     kworker/3:2-1537  [003] ..s4   207.035890: 0: key 2 remote ip6
::11000000 label bcdef
     kworker/3:2-1537  [003] ..s4   207.035926: 0: key 2 remote ip6
::11000000 label bcdef
 systemd-resolve-2664  [000] ..s3   207.083608: 0: ERROR line:119 ret:-22
     ksoftirqd/3-26    [003] ..s2   207.739454: 0: key 2 remote ip6
::11000000 label bcdef
            ping-3298  [000] ..s3   208.263120: 0: key 2 remote ip6
::11000000 label bcdef
            ping-3298  [000] ..s3   208.263224: 0: key 2 remote ip6
::11000000 label bcdef
            ping-3298  [000] ..s3   209.263703: 0: key 2 remote ip6
::11000000 label bcdef
            ping-3298  [000] ..s3   210.264203: 0: key 2 remote ip6
::11000000 label bcdef
            ping-3299  [002] ..s3   210.279710: 0: key 2 remote ip6
::11000000 label bcdef
          <idle>-0     [003] ..s3   210.875420: 0: key 2 remote ip6
::11000000 label bcdef
            ping-3299  [002] ..s3   211.280241: 0: key 2 remote ip6
::11000000 label bcdef
     ksoftirqd/3-26    [003] ..s2   212.219559: 0: key 2 remote ip6
::11000000 label bcdef
            ping-3299  [002] ..s3   212.280741: 0: key 2 remote ip6
::11000000 label bcdef
           ping6-3300  [002] ..s3   212.315807: 0: key 2 remote ip6
::11000000 label bcdef
           ping6-3300  [002] ..s3   213.316255: 0: key 2 remote ip6
::11000000 label bcdef
          <idle>-0     [000] ..s3   213.755462: 0: key 2 remote ip6
::11000000 label bcdef
           ping6-3300  [002] ..s3   214.316506: 0: key 2 remote ip6
::11000000 label bcdef
           ping6-3300  [002] ..s3   215.316951: 0: key 2 remote ip6
::11000000 label bcdef
           ping6-3300  [002] ..s3   216.317320: 0: key 2 remote ip6
::11000000 label bcdef
           ping6-3300  [002] ..s3   217.317765: 0: key 2 remote ip6
::11000000 label bcdef
     ksoftirqd/3-26    [003] ..s2   217.339464: 0: key 2 remote ip6
::11000000 label bcdef
           ping6-3300  [002] ..s3   218.318227: 0: key 2 remote ip6
::11000000 label bcdef
     ksoftirqd/3-26    [003] ..s2   218.875455: 0: key 2 remote ip6
::11000000 label bcdef
           ping6-3300  [002] ..s3   219.318671: 0: key 2 remote ip6
::11000000 label bcdef
           ping6-3300  [002] ..s3   220.319141: 0: key 2 remote ip6
::11000000 label bcdef
           ping6-3300  [002] ..s3   221.319585: 0: key 2 remote ip6
::11000000 label bcdef
            ping-3419  [002] ..s3   223.036234: 0: key 2 remote ip
0xac100164 erspan version 2
            ping-3419  [002] ..s3   223.036236: 0:      direction 1
hwid 3 timestamp 31402699
            ping-3419  [002] ..s3   223.036256: 0: key 2 remote ip
0xac100164 erspan version 2
            ping-3419  [002] ..s3   223.036256: 0:      direction 1
hwid 3 timestamp 31402699
          <idle>-0     [000] ..s3   223.067438: 0: key 2 remote ip
0xac100164 erspan version 2
          <idle>-0     [000] ..s3   223.067443: 0:      direction 1
hwid 3 timestamp 31403010
 systemd-resolve-2664  [000] ..s3   223.083837: 0: ERROR line:183 ret:-22
     kworker/0:1-33    [000] ..s4   223.283447: 0: key 2 remote ip
0xac100164 erspan version 2
     kworker/0:1-33    [000] ..s4   223.283452: 0:      direction 1
hwid 3 timestamp 31405171
 systemd-resolve-2664  [000] ..s3   223.333807: 0: ERROR line:183 ret:-22
 systemd-resolve-2664  [000] ..s3   223.583816: 0: ERROR line:183 ret:-22
            ping-3419  [002] ..s3   224.036713: 0: key 2 remote ip
0xac100164 erspan version 2
            ping-3419  [002] ..s3   224.036718: 0:      direction 1
hwid 3 timestamp 31412704
     kworker/0:1-33    [000] ..s4   224.315459: 0: key 2 remote ip
0xac100164 erspan version 2
     kworker/0:1-33    [000] ..s4   224.315464: 0:      direction 1
hwid 3 timestamp 31415491
     kworker/0:1-33    [000] ..s4   224.315514: 0: key 2 remote ip
0xac100164 erspan version 2
     kworker/0:1-33    [000] ..s4   224.315516: 0:      direction 1
hwid 3 timestamp 31415492
          <idle>-0     [000] ..s3   224.675428: 0: key 2 remote ip
0xac100164 erspan version 2
          <idle>-0     [000] ..s3   224.675433: 0:      direction 1
hwid 3 timestamp 31419090
            ping-3419  [002] ..s3   225.036920: 0: key 2 remote ip
0xac100164 erspan version 2
            ping-3419  [002] ..s3   225.036925: 0:      direction 1
hwid 3 timestamp 31422706
            ping-3420  [003] ..s3   225.064293: 0: key 2 remote ip
0xac100164 erspan version 2
            ping-3420  [003] ..s3   225.064298: 0:      direction 1
hwid 3 timestamp 31422979
 systemd-resolve-2664  [000] ..s3   225.083664: 0: ERROR line:183 ret:-22
 systemd-resolve-2664  [000] ..s3   225.333742: 0: ERROR line:183 ret:-22
 systemd-resolve-2664  [000] ..s3   225.583778: 0: ERROR line:183 ret:-22
            ping-3420  [003] ..s3   226.064761: 0: key 2 remote ip
0xac100164 erspan version 2
            ping-3420  [003] ..s3   226.064765: 0:      direction 1
hwid 3 timestamp 31432984
            ping-3420  [003] ..s3   227.065237: 0: key 2 remote ip
0xac100164 erspan version 2
            ping-3420  [003] ..s3   227.065242: 0:      direction 1
hwid 3 timestamp 31442989
 systemd-resolve-2664  [003] ..s3   228.186369: 0: ERROR line:269 ret:-22
 systemd-resolve-2664  [003] ..s3   228.333716: 0: ERROR line:269 ret:-22
 systemd-resolve-2664  [003] ..s3   228.583747: 0: ERROR line:269 ret:-22
 systemd-resolve-2664  [001] ..s3   229.833537: 0: ERROR line:269 ret:-22
     kworker/0:1-33    [000] ..s4   229.947659: 0: ip6erspan get key 2
remote ip6 ::0 erspan version 2
     kworker/0:1-33    [000] ..s4   229.947664: 0:      direction 1
hwid 7 timestamp 31471811
     kworker/0:1-33    [000] ..s4   229.947708: 0: ip6erspan get key 2
remote ip6 ::0 erspan version 2
     kworker/0:1-33    [000] ..s4   229.947711: 0:      direction 1
hwid 7 timestamp 31471814
 systemd-resolve-2664  [001] ..s3   230.083782: 0: ERROR line:269 ret:-22
 systemd-resolve-2664  [001] ..s3   230.333523: 0: ERROR line:269 ret:-22
          <idle>-0     [000] ..s3   230.779419: 0: ip6erspan get key 2
remote ip6 ::0 erspan version 2
          <idle>-0     [000] ..s3   230.779424: 0:      direction 1
hwid 7 timestamp 31480130
            ping-3540  [003] ..s3   231.091633: 0: ip6erspan get key 2
remote ip6 ::0 erspan version 2
            ping-3540  [003] ..s3   231.091638: 0:      direction 1
hwid 7 timestamp 31483253
            ping-3540  [003] ..s3   231.091737: 0: ip6erspan get key 2
remote ip6 ::0 erspan version 2
            ping-3540  [003] ..s3   231.091739: 0:      direction 1
hwid 7 timestamp 31483254
            ping-3540  [003] ..s3   232.092227: 0: ip6erspan get key 2
remote ip6 ::0 erspan version 2
            ping-3540  [003] ..s3   232.092232: 0:      direction 1
hwid 7 timestamp 31493259
            ping-3540  [003] ..s3   233.092737: 0: ip6erspan get key 2
remote ip6 ::0 erspan version 2
            ping-3540  [003] ..s3   233.092742: 0:      direction 1
hwid 7 timestamp 31503264
            ping-3655  [000] ..s3   233.957186: 0: key 2 remote ip
0xac100164 vxlan gbp 0x800ff
 systemd-resolve-2664  [001] ..s3   234.065463: 0: ERROR line:339 ret:-22
          <idle>-0     [000] ..s3   234.171427: 0: key 2 remote ip
0xac100164 vxlan gbp 0x94
     kworker/0:1-33    [000] ..s4   234.235502: 0: ERROR line:345 ret:-2
 systemd-resolve-2664  [001] ..s3   234.333813: 0: ERROR line:339 ret:-22
 systemd-resolve-2664  [001] ..s3   234.583761: 0: ERROR line:339 ret:-22
            ping-3655  [000] ..s3   234.957640: 0: key 2 remote ip
0xac100164 vxlan gbp 0x800ff
     kworker/0:1-33    [000] ..s4   235.259475: 0: key 2 remote ip
0xac100164 vxlan gbp 0x94
     kworker/0:1-33    [000] ..s4   235.259528: 0: ERROR line:345 ret:-2
 systemd-resolve-2664  [001] ..s3   235.752142: 0: ERROR line:339 ret:-22
            ping-3655  [000] ..s3   235.958144: 0: key 2 remote ip
0xac100164 vxlan gbp 0x800ff
          <idle>-0     [000] ..s3   235.963159: 0: key 2 remote ip
0xac100164 vxlan gbp 0x94
            ping-3663  [002] ..s3   235.990422: 0: key 2 remote ip
0xac100164 vxlan gbp 0x800ff
 systemd-resolve-2664  [001] ..s3   236.083782: 0: ERROR line:339 ret:-22
 systemd-resolve-2664  [001] ..s3   236.333591: 0: ERROR line:339 ret:-22
            ping-3663  [002] ..s3   236.990900: 0: key 2 remote ip
0xac100164 vxlan gbp 0x800ff
            ping-3663  [002] ..s3   237.991370: 0: key 2 remote ip
0xac100164 vxlan gbp 0x800ff
 systemd-resolve-2664  [002] ..s3   239.083778: 0: ERROR line:387 ret:-22
 systemd-resolve-2664  [002] ..s3   239.333765: 0: ERROR line:387 ret:-22
 systemd-resolve-2664  [002] ..s3   239.583765: 0: ERROR line:387 ret:-22
 systemd-resolve-2664  [001] ..s3   240.194218: 0: ERROR line:387 ret:-22
 systemd-resolve-2664  [001] ..s3   240.333718: 0: ERROR line:387 ret:-22
 systemd-resolve-2664  [001] ..s3   240.583708: 0: ERROR line:387 ret:-22
          <idle>-0     [002] ..s3   240.955492: 0: key 22 remote ip6
::11000000 label 0
          <idle>-0     [002] ..s3   240.955505: 0: key 22 remote ip6
::11000000 label 0
          <idle>-0     [002] ..s3   240.955512: 0: key 22 remote ip6
::11000000 label 0
          <idle>-0     [002] ..s3   240.955518: 0: key 22 remote ip6
::11000000 label 0
          <idle>-0     [002] ..s3   240.955524: 0: key 22 remote ip6
::11000000 label 0
          <idle>-0     [002] ..s3   241.211432: 0: key 22 remote ip6
::11000000 label 0
     ksoftirqd/1-16    [001] ..s2   242.012579: 0: key 22 remote ip6
::11000000 label 0
     ksoftirqd/1-16    [001] ..s2   242.026381: 0: key 22 remote ip6
::11000000 label 0
     ksoftirqd/3-26    [003] ..s2   243.028099: 0: key 22 remote ip6
::11000000 label 0
     ksoftirqd/3-26    [003] ..s2   244.040809: 0: key 22 remote ip6
::11000000 label 0
            ping-3774  [001] ..s3   244.062314: 0: key 22 remote ip6
::11000000 label 0
          <idle>-0     [002] ..s3   244.219413: 0: key 22 remote ip6
::11000000 label 0
            ping-3774  [001] ..s3   245.075208: 0: key 22 remote ip6
::11000000 label 0
            ping-3774  [001] ..s3   246.088218: 0: key 22 remote ip6
::11000000 label 0
            ping-3887  [002] ..s3   246.831153: 0: key 2 remote ip
0xac100164 geneve class 0x0
            ping-3887  [002] ..s3   246.831226: 0: key 2 remote ip
0xac100164 geneve class 0x0
 systemd-resolve-2664  [003] ..s3   246.963398: 0: ERROR line:445 ret:-22
 systemd-resolve-2664  [003] ..s3   247.083859: 0: ERROR line:445 ret:-22
 systemd-resolve-2664  [003] ..s3   247.333815: 0: ERROR line:445 ret:-22
     kworker/1:2-1480  [001] ..s4   247.355451: 0: key 2 remote ip
0xac100164 geneve class 0x0
          <idle>-0     [001] ..s3   247.419431: 0: key 2 remote ip
0xac100164 geneve class 0x0
            ping-3887  [002] ..s3   247.831692: 0: key 2 remote ip
0xac100164 geneve class 0x0
     kworker/1:2-1480  [001] ..s4   248.379561: 0: key 2 remote ip
0xac100164 geneve class 0x0
     kworker/1:2-1480  [001] ..s4   248.379605: 0: key 2 remote ip
0xac100164 geneve class 0x0
 systemd-resolve-2664  [001] ..s3   248.440767: 0: ERROR line:445 ret:-22
          <idle>-0     [001] ..s3   248.459429: 0: key 2 remote ip
0xac100164 geneve class 0x0
 systemd-resolve-2664  [001] ..s3   248.583784: 0: ERROR line:445 ret:-22
            ping-3887  [002] ..s3   248.832187: 0: key 2 remote ip
0xac100164 geneve class 0x0
 systemd-resolve-2664  [001] ..s3   248.833517: 0: ERROR line:445 ret:-22
            ping-3888  [000] ..s3   248.874871: 0: key 2 remote ip
0xac100164 geneve class 0x0
            ping-3888  [000] ..s3   249.875339: 0: key 2 remote ip
0xac100164 geneve class 0x0
            ping-3888  [000] ..s3   250.875805: 0: key 2 remote ip
0xac100164 geneve class 0x0
 systemd-resolve-2664  [003] ..s3   252.032344: 0: ERROR line:509 ret:-22
 systemd-resolve-2664  [003] ..s3   252.333767: 0: ERROR line:509 ret:-22
 systemd-resolve-2664  [003] ..s3   252.583766: 0: ERROR line:509 ret:-22
     kworker/0:1-33    [000] ..s4   253.627615: 0: key 22 remote ip
0x0 geneve class 0x0
     kworker/0:1-33    [000] ..s4   253.627634: 0: key 22 remote ip
0x0 geneve class 0x0
     kworker/0:1-33    [000] ..s4   253.627766: 0: key 22 remote ip
0x0 geneve class 0x0
     kworker/0:1-33    [000] ..s4   253.627797: 0: key 22 remote ip
0x0 geneve class 0x0
     kworker/0:1-33    [000] ..s4   253.627833: 0: key 22 remote ip
0x0 geneve class 0x0
            ping-3997  [001] ..s3   253.915594: 0: key 22 remote ip
0x0 geneve class 0x0
            ping-3998  [002] ..s3   253.946604: 0: key 22 remote ip
0x0 geneve class 0x0
 systemd-resolve-2664  [002] ..s3   253.956160: 0: ERROR line:509 ret:-22
 systemd-resolve-2664  [002] ..s3   254.083777: 0: ERROR line:509 ret:-22
          <idle>-0     [000] ..s3   254.331437: 0: key 22 remote ip
0x0 geneve class 0x0
 systemd-resolve-2664  [002] ..s3   254.333628: 0: ERROR line:509 ret:-22
            ping-3998  [002] ..s3   254.947116: 0: key 22 remote ip
0x0 geneve class 0x0
            ping-3998  [001] ..s3   255.947392: 0: key 22 remote ip
0x0 geneve class 0x0
            ping-4118  [002] ..s3   256.654930: 0: remote ip 0xac100164
            ping-4118  [002] ..s3   257.655393: 0: remote ip 0xac100164
            ping-4118  [002] ..s3   258.655851: 0: remote ip 0xac100164
            ping-4119  [000] ..s3   258.695970: 0: remote ip 0xac100164
            ping-4119  [000] ..s3   259.696426: 0: remote ip 0xac100164
            ping-4119  [000] ..s3   260.696886: 0: remote ip 0xac100164
            ping-4231  [002] ..s3   264.713834: 0: remote ip6 0::11
            ping-4231  [002] ..s3   265.714327: 0: remote ip6 0::11
          <idle>-0     [002] ..s3   265.979423: 0: remote ip6 0::11
            ping-4231  [002] ..s3   266.714822: 0: remote ip6 0::11
            ping-4239  [001] ..s3   266.748758: 0: remote ip6 0::11
            ping-4239  [001] ..s3   267.749244: 0: remote ip6 0::11
            ping-4239  [001] ..s3   268.749737: 0: remote ip6 0::11
            ping-4367  [003] ..s3   270.047776: 0: reqid 1 spi 0x1
remote ip 0xac100164
     ksoftirqd/3-26    [003] ..s2   271.048427: 0: reqid 1 spi 0x1
remote ip 0xac100164
PING 10.1.1.200 (10.1.1.200): 56 data bytes
--- 10.1.1.200 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.263/0.370/0.579 ms
            ping-4367  [003] ..s3   272.048870: 0: reqid 1 spi 0x1
remote ip 0xac100164
            ping-4367  [003] ..s3   270.047776: 0: reqid 1 spi 0x1
remote ip 0xac100164
     ksoftirqd/3-26    [003] ..s2   271.048427: 0: reqid 1 spi 0x1
remote ip 0xac100164
            ping-4367  [003] ..s3   272.048870: 0: reqid 1 spi 0x1
remote ip 0xac100164
            ping-4367  [003] ..s3   270.047776: 0: reqid 1 spi 0x1
remote ip 0xac100164
     ksoftirqd/3-26    [003] ..s2   271.048427: 0: reqid 1 spi 0x1
remote ip 0xac100164
            ping-4367  [003] ..s3   272.048870: 0: reqid 1 spi 0x1
remote ip 0xac100164
          <idle>-0     [000] ..s3   190.987095: 0: key 2 remote ip 0xac100164
            ping-3043  [000] ..s3   190.988715: 0: key 2 remote ip 0xac100164
     ksoftirqd/0-9     [000] ..s2   190.988986: 0: key 2 remote ip 0xac100164
     kworker/0:1-33    [000] ..s4   191.419445: 0: key 2 remote ip 0xac100164
            ping-3043  [000] ..s3   191.989437: 0: key 2 remote ip 0xac100164
     kworker/0:1-33    [000] ..s4   192.443460: 0: key 2 remote ip 0xac100164
     kworker/0:1-33    [000] ..s4   192.443508: 0: key 2 remote ip 0xac100164
          <idle>-0     [000] ..s3   192.446318: 0: key 2 remote ip 0xac100164
            ping-3043  [000] ..s3   192.989902: 0: key 2 remote ip 0xac100164
            ping-3044  [000] ..s3   193.025776: 0: key 2 remote ip 0xac100164
            ping-3044  [000] ..s3   194.026240: 0: key 2 remote ip 0xac100164
            ping-3044  [000] ..s3   195.026707: 0: key 2 remote ip 0xac100164
            ping-3419  [002] ..s3   223.036234: 0: key 2 remote ip
0xac100164 erspan version 2
            ping-3419  [002] ..s3   223.036256: 0: key 2 remote ip
0xac100164 erspan version 2
          <idle>-0     [000] ..s3   223.067438: 0: key 2 remote ip
0xac100164 erspan version 2
     kworker/0:1-33    [000] ..s4   223.283447: 0: key 2 remote ip
0xac100164 erspan version 2
            ping-3419  [002] ..s3   224.036713: 0: key 2 remote ip
0xac100164 erspan version 2
     kworker/0:1-33    [000] ..s4   224.315459: 0: key 2 remote ip
0xac100164 erspan version 2
     kworker/0:1-33    [000] ..s4   224.315514: 0: key 2 remote ip
0xac100164 erspan version 2
          <idle>-0     [000] ..s3   224.675428: 0: key 2 remote ip
0xac100164 erspan version 2
            ping-3419  [002] ..s3   225.036920: 0: key 2 remote ip
0xac100164 erspan version 2
            ping-3420  [003] ..s3   225.064293: 0: key 2 remote ip
0xac100164 erspan version 2
            ping-3420  [003] ..s3   226.064761: 0: key 2 remote ip
0xac100164 erspan version 2
            ping-3420  [003] ..s3   227.065237: 0: key 2 remote ip
0xac100164 erspan version 2
            ping-3655  [000] ..s3   233.957186: 0: key 2 remote ip
0xac100164 vxlan gbp 0x800ff
          <idle>-0     [000] ..s3   234.171427: 0: key 2 remote ip
0xac100164 vxlan gbp 0x94
            ping-3655  [000] ..s3   234.957640: 0: key 2 remote ip
0xac100164 vxlan gbp 0x800ff
     kworker/0:1-33    [000] ..s4   235.259475: 0: key 2 remote ip
0xac100164 vxlan gbp 0x94
            ping-3655  [000] ..s3   235.958144: 0: key 2 remote ip
0xac100164 vxlan gbp 0x800ff
          <idle>-0     [000] ..s3   235.963159: 0: key 2 remote ip
0xac100164 vxlan gbp 0x94
            ping-3663  [002] ..s3   235.990422: 0: key 2 remote ip
0xac100164 vxlan gbp 0x800ff
            ping-3663  [002] ..s3   236.990900: 0: key 2 remote ip
0xac100164 vxlan gbp 0x800ff
            ping-3663  [002] ..s3   237.991370: 0: key 2 remote ip
0xac100164 vxlan gbp 0x800ff
            ping-3887  [002] ..s3   246.831153: 0: key 2 remote ip
0xac100164 geneve class 0x0
            ping-3887  [002] ..s3   246.831226: 0: key 2 remote ip
0xac100164 geneve class 0x0
     kworker/1:2-1480  [001] ..s4   247.355451: 0: key 2 remote ip
0xac100164 geneve class 0x0
          <idle>-0     [001] ..s3   247.419431: 0: key 2 remote ip
0xac100164 geneve class 0x0
            ping-3887  [002] ..s3   247.831692: 0: key 2 remote ip
0xac100164 geneve class 0x0
     kworker/1:2-1480  [001] ..s4   248.379561: 0: key 2 remote ip
0xac100164 geneve class 0x0
     kworker/1:2-1480  [001] ..s4   248.379605: 0: key 2 remote ip
0xac100164 geneve class 0x0
          <idle>-0     [001] ..s3   248.459429: 0: key 2 remote ip
0xac100164 geneve class 0x0
            ping-3887  [002] ..s3   248.832187: 0: key 2 remote ip
0xac100164 geneve class 0x0
            ping-3888  [000] ..s3   248.874871: 0: key 2 remote ip
0xac100164 geneve class 0x0
            ping-3888  [000] ..s3   249.875339: 0: key 2 remote ip
0xac100164 geneve class 0x0
            ping-3888  [000] ..s3   250.875805: 0: key 2 remote ip
0xac100164 geneve class 0x0
            ping-4118  [002] ..s3   256.654930: 0: remote ip 0xac100164
            ping-4118  [002] ..s3   257.655393: 0: remote ip 0xac100164
            ping-4118  [002] ..s3   258.655851: 0: remote ip 0xac100164
            ping-4119  [000] ..s3   258.695970: 0: remote ip 0xac100164
            ping-4119  [000] ..s3   259.696426: 0: remote ip 0xac100164
            ping-4119  [000] ..s3   260.696886: 0: remote ip 0xac100164
            ping-4367  [003] ..s3   270.047776: 0: reqid 1 spi 0x1
remote ip 0xac100164
     ksoftirqd/3-26    [003] ..s2   271.048427: 0: reqid 1 spi 0x1
remote ip 0xac100164
            ping-4367  [003] ..s3   272.048870: 0: reqid 1 spi 0x1
remote ip 0xac100164
PASS: xfrm tunnel

^ permalink raw reply

* Re: FW: [PATCH 2/2] ath10k: allow ATH10K_SNOC with COMPILE_TEST
From: Govind Singh @ 2018-06-12 12:32 UTC (permalink / raw)
  To: niklas.cassel, kvalo, davem; +Cc: netdev, linux-wireless, linux-kernel
In-Reply-To: <7058492257914633b55fcd423e4c0b59@aphydexm01b.ap.qualcomm.com>

On 2018-06-12 17:45, Govind Singh wrote:
> -----Original Message-----
> From: ath10k <ath10k-bounces@lists.infradead.org> On Behalf Of Niklas 
> Cassel
> Sent: Tuesday, June 12, 2018 5:09 PM
> To: Kalle Valo <kvalo@codeaurora.org>; David S. Miller 
> <davem@davemloft.net>
> Cc: Niklas Cassel <niklas.cassel@linaro.org>; netdev@vger.kernel.org;
> linux-wireless@vger.kernel.org; linux-kernel@vger.kernel.org;
> ath10k@lists.infradead.org
> Subject: [PATCH 2/2] ath10k: allow ATH10K_SNOC with COMPILE_TEST
> 
> ATH10K_SNOC builds just fine with COMPILE_TEST, so make that possible.
> 
> Signed-off-by: Niklas Cassel <niklas.cassel@linaro.org>
> ---
>  drivers/net/wireless/ath/ath10k/Kconfig | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/ath/ath10k/Kconfig
> b/drivers/net/wireless/ath/ath10k/Kconfig
> index 54ff5930126c..6572a43590a8 100644
> --- a/drivers/net/wireless/ath/ath10k/Kconfig
> +++ b/drivers/net/wireless/ath/ath10k/Kconfig
> @@ -42,7 +42,8 @@ config ATH10K_USB
> 
>  config ATH10K_SNOC
>  	tristate "Qualcomm ath10k SNOC support (EXPERIMENTAL)"
> -	depends on ATH10K && ARCH_QCOM
> +	depends on ATH10K
> +	depends on ARCH_QCOM || COMPILE_TEST
>  	---help---
>  	  This module adds support for integrated WCN3990 chip connected
>  	  to system NOC(SNOC). Currently work in progress and will not

Thanks Niklas for enabling COMPILE_TEST. With the QMI set of
changes (https://patchwork.kernel.org/patch/10448183/), we need to enable
COMPILE_TEST for QCOM_SCM/QMI_HELPERS, which seems broken today. Are you
planning to fix that as well?

  config QCOM_SCM
  	bool
-	depends on ARM || ARM64
+	depends on ARM || ARM64 || COMPILE_TEST
  	select RESET_CONTROLLER


  config QCOM_SCM_64
  	def_bool y
-	depends on QCOM_SCM && ARM64
+	depends on QCOM_SCM && ARM64 || COMPILE_TEST

  config QCOM_QMI_HELPERS
  	tristate
-	depends on ARCH_QCOM && NET
+	depends on (ARCH_QCOM || COMPILE_TEST) && NET

-obj-$(CONFIG_ARCH_QCOM)		+= qcom/
+obj-y				+= qcom/

A __qcom_scm_init/qcom_scm_call wrapper is also needed to support COMPILE_TEST.
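
A side note on the QCOM_SCM_64 line above: in Kconfig expressions `&&` binds
more tightly than `||`, so `QCOM_SCM && ARM64 || COMPILE_TEST` is read as
`(QCOM_SCM && ARM64) || COMPILE_TEST`. Combined with `def_bool y`, that would
presumably switch the symbol on whenever COMPILE_TEST is set, even without
QCOM_SCM. The sketch below models both groupings as plain C booleans. The
function names are made up for illustration, and whether the regrouped form is
what was intended is only a guess.

```c
#include <stdbool.h>

/* As written: "QCOM_SCM && ARM64 || COMPILE_TEST" groups like this. */
bool dep_as_written(bool qcom_scm, bool arm64, bool compile_test)
{
	return (qcom_scm && arm64) || compile_test;
}

/* One possible alternative: QCOM_SCM stays a hard requirement. */
bool dep_regrouped(bool qcom_scm, bool arm64, bool compile_test)
{
	return qcom_scm && (arm64 || compile_test);
}
```

The two only differ when QCOM_SCM is off and COMPILE_TEST is on: the first
form is true there, the second is false.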



BR,
Govind

^ permalink raw reply

* Re: FW: [PATCH 2/2] ath10k: allow ATH10K_SNOC with COMPILE_TEST
From: Niklas Cassel @ 2018-06-12 12:44 UTC (permalink / raw)
  To: Govind Singh, bjorn.andersson
  Cc: kvalo, davem, netdev, linux-wireless, linux-kernel
In-Reply-To: <47cf8ec1b11d727ea928a307a431d4a2@codeaurora.org>

On Tue, Jun 12, 2018 at 06:02:48PM +0530, Govind Singh wrote:
> On 2018-06-12 17:45, Govind Singh wrote:
> > -----Original Message-----
> > From: ath10k <ath10k-bounces@lists.infradead.org> On Behalf Of Niklas
> > Cassel
> > Sent: Tuesday, June 12, 2018 5:09 PM
> > To: Kalle Valo <kvalo@codeaurora.org>; David S. Miller
> > <davem@davemloft.net>
> > Cc: Niklas Cassel <niklas.cassel@linaro.org>; netdev@vger.kernel.org;
> > linux-wireless@vger.kernel.org; linux-kernel@vger.kernel.org;
> > ath10k@lists.infradead.org
> > Subject: [PATCH 2/2] ath10k: allow ATH10K_SNOC with COMPILE_TEST
> > 
> > ATH10K_SNOC builds just fine with COMPILE_TEST, so make that possible.
> > 
> > Signed-off-by: Niklas Cassel <niklas.cassel@linaro.org>
> > ---
> >  drivers/net/wireless/ath/ath10k/Kconfig | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/net/wireless/ath/ath10k/Kconfig
> > b/drivers/net/wireless/ath/ath10k/Kconfig
> > index 54ff5930126c..6572a43590a8 100644
> > --- a/drivers/net/wireless/ath/ath10k/Kconfig
> > +++ b/drivers/net/wireless/ath/ath10k/Kconfig
> > @@ -42,7 +42,8 @@ config ATH10K_USB
> > 
> >  config ATH10K_SNOC
> >  	tristate "Qualcomm ath10k SNOC support (EXPERIMENTAL)"
> > -	depends on ATH10K && ARCH_QCOM
> > +	depends on ATH10K
> > +	depends on ARCH_QCOM || COMPILE_TEST
> >  	---help---
> >  	  This module adds support for integrated WCN3990 chip connected
> >  	  to system NOC(SNOC). Currently work in progress and will not
> 
> Thanks Niklas for enabling COMPILE_TEST. With QMI set of
> changes(https://patchwork.kernel.org/patch/10448183/), we need to enable
> COMPILE_TEST for
> QCOM_SCM/QMI_HELPERS which seems broken today. Are you planning to fix the
> same.


Argh..

qcom_scm seems fine, it is just missing a single definition in the
#else clause of include/linux/qcom_scm.h.

+++ b/include/linux/qcom_scm.h
@@ -89,6 +89,10 @@ static inline int qcom_scm_pas_mem_setup(u32 peripheral, phys_addr_t addr,                                     
 static inline int
 qcom_scm_pas_auth_and_reset(u32 peripheral) { return -ENODEV; }
 static inline int qcom_scm_pas_shutdown(u32 peripheral) { return -ENODEV; }                                                      
+static inline int qcom_scm_assign_mem(phys_addr_t mem_addr, size_t mem_sz,                                                       
+                                     unsigned int *src,
+                                     struct qcom_scm_vmperm *newvm,                                                              
+                                     int dest_cnt) { return -ENODEV; }                                                           
 static inline void qcom_scm_cpu_power_down(u32 flags) {}
 static inline u32 qcom_scm_get_version(void) { return 0; }
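
The pattern at work in the hunk above, sketched outside the kernel under
assumed names: when a subsystem is compiled out, its header supplies static
inline stubs with the same signature that fail with -ENODEV, so callers still
build. `HAVE_DEMO_SCM`, `demo_scm_assign_mem()`, `demo_probe()` and
`struct demo_vmperm` are hypothetical stand-ins, not the real qcom_scm API.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct qcom_scm_vmperm. */
struct demo_vmperm {
	int vmid;
	int perm;
};

#ifdef HAVE_DEMO_SCM
/* With the driver built in, the real implementation lives elsewhere. */
int demo_scm_assign_mem(unsigned long addr, size_t size, unsigned int *src,
			struct demo_vmperm *newvm, int dest_cnt);
#else
/* Without the driver: same signature, always reports "no such device". */
static inline int demo_scm_assign_mem(unsigned long addr, size_t size,
				      unsigned int *src,
				      struct demo_vmperm *newvm, int dest_cnt)
{
	(void)addr; (void)size; (void)src; (void)newvm; (void)dest_cnt;
	return -ENODEV;
}
#endif

/* A caller compiles unchanged in either configuration. */
int demo_probe(void)
{
	unsigned int src = 1;
	struct demo_vmperm vm = { .vmid = 2, .perm = 3 };

	return demo_scm_assign_mem(0x1000, 4096, &src, &vm, 1);
}
```

Built without HAVE_DEMO_SCM, demo_probe() simply returns -ENODEV instead of
producing a compile or link error.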



include/linux/soc/qcom/qmi.h, on the other hand, doesn't have any
dummy definitions at all.
I think it makes sense to be able to compile-test
the QMI helpers on other archs as well.

Bjorn, any opinion?


> 
>  config QCOM_SCM
>  	bool
> -	depends on ARM || ARM64
> +	depends on ARM || ARM64 || COMPILE_TEST
>  	select RESET_CONTROLLER
> 
> 
>  config QCOM_SCM_64
>  	def_bool y
> -	depends on QCOM_SCM && ARM64
> +	depends on QCOM_SCM && ARM64 || COMPILE_TEST
> 
>  config QCOM_QMI_HELPERS
>  	tristate
> -	depends on ARCH_QCOM && NET
> +	depends on (ARCH_QCOM || COMPILE_TEST) && NET
> 
> -obj-$(CONFIG_ARCH_QCOM)		+= qcom/
> +obj-y				+= qcom/
> 
> __qcom_scm_init/qcom_scm_call wrapper to support COMPILE_TEST.
> 
> 
> 
> BR,
> Govind

^ permalink raw reply

* Re: netdevice notifier and device private data
From: Alexander Aring @ 2018-06-12 13:22 UTC (permalink / raw)
  To: Michael Richardson; +Cc: netdev, linux-wpan, linux-bluetooth
In-Reply-To: <1326.1528682979@localhost>

Hi,

On Sun, Jun 10, 2018 at 10:09:39PM -0400, Michael Richardson wrote:
> 
> Alexander Aring <aring@mojatatu.com> wrote:
>     >> It totally seems like broken behaviour.  Maybe it's not even
>     >> intentional.  Maybe they are just foobar.
> 
>     > They simple don't know what they doing... somebody thought 6LoWPAN need
>     > to be 6LoWPAN, but they actually don't use the 6LoWPAN handling inside
>     > the kernel. _Except_ they doing out of tree stuff which I don't
>     > believe.
> 
> So, it seems like this ioctl() should be disabled, or restricted to cases
> that actually work.  hate to break their code, but if it's broken anyway, at
> least the kernel won't crash under them.
> 

Before we break their software, I will gently ask them why they are
doing that; maybe I will get a good reason. Then we can look further at
how we deal with an illegal read/dereference in dev->priv.

I will figure out how I can do that over GitHub.

- Alex

^ permalink raw reply

* Re: [PATCH 1/3] m68k: coldfire: Normalize clk API
From: Greg Ungerer @ 2018-06-12 13:31 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Ralf Baechle, James Hogan, Giuseppe Cavallaro, Alexandre Torgue,
	Jose Abreu, Corentin Labbe, David S. Miller, Arnd Bergmann,
	linux-m68k, Linux MIPS Mailing List, netdev,
	Linux Kernel Mailing List
In-Reply-To: <CAMuHMdUyD8d2yoe6v8TEinEH3hhS7Znv99pPxDCkr_uEFS0Fzg@mail.gmail.com>

Hi Geert,

On 12/06/18 17:31, Geert Uytterhoeven wrote:
> On Tue, Jun 12, 2018 at 9:27 AM Greg Ungerer <gerg@linux-m68k.org> wrote:
>> On 11/06/18 18:44, Geert Uytterhoeven wrote:
>>> Coldfire still provides its own variant of the clk API rather than using
>>> the generic COMMON_CLK API.  This generally works, but it causes some
>>> link errors with drivers using the clk_round_rate(), clk_set_rate(),
>>> clk_set_parent(), or clk_get_parent() functions when a platform lacks
>>> those interfaces.
>>>
>>> This adds empty stub implementations for each of them, and I don't even
>>> try to do something useful here but instead just print a WARN() message
>>> to make it obvious what is going on if they ever end up being called.
>>>
>>> The drivers that call these won't be used on these platforms (otherwise
>>> we'd get a link error today), so the added code is harmless bloat and
>>> will warn about accidental use.
>>>
>>> Based on commit bd7fefe1f06ca6cc ("ARM: w90x900: normalize clk API").
>>>
>>> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
>>
>> I am fine with this for ColdFire, so
>>
>> Acked-by: Greg Ungerer <gerg@linux-m68k.org>
> 
> Thanks!
> 
>> Are you going to take this/these via your m68k git tree?
> 
> I'm fine delegating this to you.

No problem. I'll add it to the m68knommu git tree (for-next branch)
when the merge window closes.

Thanks
Greg
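
For readers following the thread, the stub approach described in the quoted
commit message can be sketched in userspace C roughly as below. Everything
here is an assumption for illustration: the `demo_` names, the warning helper
standing in for the kernel's WARN(), and the return values are not taken from
the actual patch.

```c
#include <stdio.h>

struct demo_clk {
	const char *name;
};

/* Number of times a stub was reached; lets a caller notice accidental use. */
static int demo_warn_count;

/* Userspace stand-in for WARN(): log the call and count it. */
static void demo_warn(const char *func)
{
	demo_warn_count++;
	fprintf(stderr, "WARNING: %s() is not implemented on this platform\n",
		func);
}

/* Stubs exist only so that drivers link; they do nothing useful. */
long demo_clk_round_rate(struct demo_clk *clk, unsigned long rate)
{
	(void)clk; (void)rate;
	demo_warn(__func__);
	return 0;
}

int demo_clk_set_rate(struct demo_clk *clk, unsigned long rate)
{
	(void)clk; (void)rate;
	demo_warn(__func__);
	return 0;
}

struct demo_clk *demo_clk_get_parent(struct demo_clk *clk)
{
	(void)clk;
	demo_warn(__func__);
	return NULL;
}
```

Each stub makes noise when it is reached, which matches the stated goal of
making accidental use obvious.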

^ permalink raw reply

* Re: [bpf PATCH v2 1/2] bpf: sockmap, fix crash when ipv6 sock is added
From: John Fastabend @ 2018-06-12 13:57 UTC (permalink / raw)
  To: Daniel Borkmann, edumazet, weiwan, ast; +Cc: netdev
In-Reply-To: <e96c1dcf-c811-3252-d7fb-0aedd8c70921@iogearbox.net>

On 06/11/2018 04:14 PM, Daniel Borkmann wrote:
> Hi John,
> 
> On 06/08/2018 05:06 PM, John Fastabend wrote:
>> This fixes a crash where we assign tcp_prot to IPv6 sockets instead
>> of tcpv6_prot.
>>
>> Previously we overwrote the sk->prot field with tcp_prot even in the
>> AF_INET6 case. This patch ensures the correct tcp_prot and tcpv6_prot
>> are used. Further, only allow ESTABLISHED connections to join the
>> map per note in TLS ULP,
>>
>>    /* The TLS ulp is currently supported only for TCP sockets
>>     * in ESTABLISHED state.
>>     * Supporting sockets in LISTEN state will require us
>>     * to modify the accept implementation to clone rather then
>>     * share the ulp context.
>>     */
>>
>> Also tested with 'netserver -6' and 'netperf -H [IPv6]' as well as
>> 'netperf -H [IPv4]'. The ESTABLISHED check resolves the previously
>> crashing case here.
>>
>> Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
>> Reported-by: syzbot+5c063698bdbfac19f363@syzkaller.appspotmail.com
>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>> Signed-off-by: Wei Wang <weiwan@google.com>
> [...]
> 
> Still one question for some more clarification below that popped up while
> review:
> 
>> @@ -162,6 +164,8 @@ static bool bpf_tcp_stream_read(const struct sock *sk)
>>  }
>>  
>>  static struct proto tcp_bpf_proto;
>> +static struct proto tcpv6_bpf_proto;
> 
> These two are global, w/o locking.
> 
>>  static int bpf_tcp_init(struct sock *sk)
>>  {
>>  	struct smap_psock *psock;
>> @@ -181,14 +185,30 @@ static int bpf_tcp_init(struct sock *sk)
>>  	psock->save_close = sk->sk_prot->close;
>>  	psock->sk_proto = sk->sk_prot;
>>  
>> +	if (sk->sk_family == AF_INET6) {
>> +		tcpv6_bpf_proto = *sk->sk_prot;
>> +		tcpv6_bpf_proto.close = bpf_tcp_close;
>> +	} else {
>> +		tcp_bpf_proto = *sk->sk_prot;
>> +		tcp_bpf_proto.close = bpf_tcp_close;
>> +	}
> 
> And each time we add a BPF ULP to a v4/v6 socket, we override tcp{,v6}_bpf_proto
> from scratch.
> 
>>  	if (psock->bpf_tx_msg) {
>> +		tcpv6_bpf_proto.sendmsg = bpf_tcp_sendmsg;
>> +		tcpv6_bpf_proto.sendpage = bpf_tcp_sendpage;
>> +		tcpv6_bpf_proto.recvmsg = bpf_tcp_recvmsg;
>> +		tcpv6_bpf_proto.stream_memory_read = bpf_tcp_stream_read;
>>  		tcp_bpf_proto.sendmsg = bpf_tcp_sendmsg;
>>  		tcp_bpf_proto.sendpage = bpf_tcp_sendpage;
>>  		tcp_bpf_proto.recvmsg = bpf_tcp_recvmsg;
>>  		tcp_bpf_proto.stream_memory_read = bpf_tcp_stream_read;
>>  	}
>>  
>> -	sk->sk_prot = &tcp_bpf_proto;
>> +	if (sk->sk_family == AF_INET6)
>> +		sk->sk_prot = &tcpv6_bpf_proto;
>> +	else
>> +		sk->sk_prot = &tcp_bpf_proto;
> 
> Where every active socket would be affected from it as well. Isn't that
> generally racy? E.g. existing ones where tcpv6_bpf_proto.sendmsg points
> to bpf_tcp_sendmsg would get overridden with earlier assignment on the
> tcpv6_bpf_proto = *sk->sk_prot during their lifetime after bpf_tcp_init().
> 

In general yes. At best it does feel fragile.

> In the kTLS case, the v4 protos are built up in module init via tls_register()
> and never change from there. The v6 ones are only reloaded when their addr
> changes e.g. module reload would come to mind, which should only be possible
> once no active v6 socket is present. What speaks against adapting similar
> scheme resp. what am I missing that the above would work? (Would be nice if
> there was some discussion in commit log related to it on 'why' this approach
> was done differently.)

I think it's best to use the same scheme. Will post a new version. Also,
it would be nice to fix the selftests in the same series. Finally, I set
these pointers lazily, adding a sendmsg hook for example even if it is
not needed. It's harmless but does create an extra call through bpf for
no reason on some socks. To be complete we should avoid that.
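
A rough userspace sketch of the scheme discussed above, assuming simplified
stand-in types: the per-family, per-configuration templates are filled in once
and afterwards only pointed to, so attaching another socket never rewrites a
table that live sockets already use, and the sendmsg hook is installed only
for sockets that need it. `struct demo_proto`, `demo_build_protos()` and
`demo_attach()` are hypothetical names, not the sockmap or kTLS code, and no
locking is shown (the build step is assumed to run from a single init path).

```c
#include <stdbool.h>
#include <stddef.h>

enum { DEMO_V4, DEMO_V6, DEMO_NUM_FAMILIES };
enum { DEMO_BASE, DEMO_TX, DEMO_NUM_CONFIGS };

/* Minimal stand-in for struct proto: just two hooks. */
struct demo_proto {
	int (*close)(void);
	int (*sendmsg)(void);
};

struct demo_sock {
	int family;
	const struct demo_proto *prot;
};

static int base_close(void)   { return 0; }
static int base_sendmsg(void) { return 0; }
static int bpf_close(void)    { return 1; }
static int bpf_sendmsg(void)  { return 1; }

static const struct demo_proto base_prots[DEMO_NUM_FAMILIES] = {
	[DEMO_V4] = { base_close, base_sendmsg },
	[DEMO_V6] = { base_close, base_sendmsg },
};

static struct demo_proto bpf_prots[DEMO_NUM_FAMILIES][DEMO_NUM_CONFIGS];
static bool bpf_prots_built;

/* Build every template exactly once; later calls are no-ops. */
void demo_build_protos(void)
{
	if (bpf_prots_built)
		return;

	for (int f = 0; f < DEMO_NUM_FAMILIES; f++) {
		bpf_prots[f][DEMO_BASE] = base_prots[f];
		bpf_prots[f][DEMO_BASE].close = bpf_close;

		bpf_prots[f][DEMO_TX] = bpf_prots[f][DEMO_BASE];
		bpf_prots[f][DEMO_TX].sendmsg = bpf_sendmsg;
	}
	bpf_prots_built = true;
}

/* Attaching only swaps the socket's pointer; templates stay untouched. */
void demo_attach(struct demo_sock *sk, bool has_tx_prog)
{
	sk->prot = &bpf_prots[sk->family][has_tx_prog ? DEMO_TX : DEMO_BASE];
}
```

A socket attached without a TX program keeps the base sendmsg, which avoids
the extra call through bpf mentioned above.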

> 
> Thanks,
> Daniel
> 
>>  	rcu_read_unlock();
>>  	return 0;
>>  }
>> @@ -1111,8 +1131,6 @@ static void bpf_tcp_msg_add(struct smap_psock *psock,
>>  
>>  static int bpf_tcp_ulp_register(void)
>>  {
>> -	tcp_bpf_proto = tcp_prot;
>> -	tcp_bpf_proto.close = bpf_tcp_close;
>>  	/* Once BPF TX ULP is registered it is never unregistered. It
>>  	 * will be in the ULP list for the lifetime of the system. Doing
>>  	 * duplicate registers is not a problem.
>>
> 

^ permalink raw reply

* [PATCH 1/1] ip: add rmnet initial support
From: Daniele Palmas @ 2018-06-12 14:12 UTC (permalink / raw)
  To: netdev, Stephen Hemminger; +Cc: Subash Abhinov Kasiviswanathan, Daniele Palmas

This patch adds basic support for Qualcomm rmnet devices.

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
---
 ip/Makefile       |  2 +-
 ip/iplink.c       |  2 +-
 ip/iplink_rmnet.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 72 insertions(+), 2 deletions(-)
 create mode 100644 ip/iplink_rmnet.c

diff --git a/ip/Makefile b/ip/Makefile
index 77fadee..a88f936 100644
--- a/ip/Makefile
+++ b/ip/Makefile
@@ -10,7 +10,7 @@ IPOBJ=ip.o ipaddress.o ipaddrlabel.o iproute.o iprule.o ipnetns.o \
     link_iptnl.o link_gre6.o iplink_bond.o iplink_bond_slave.o iplink_hsr.o \
     iplink_bridge.o iplink_bridge_slave.o ipfou.o iplink_ipvlan.o \
     iplink_geneve.o iplink_vrf.o iproute_lwtunnel.o ipmacsec.o ipila.o \
-    ipvrf.o iplink_xstats.o ipseg6.o iplink_netdevsim.o
+    ipvrf.o iplink_xstats.o ipseg6.o iplink_netdevsim.o iplink_rmnet.o
 
 RTMONOBJ=rtmon.o
 
diff --git a/ip/iplink.c b/ip/iplink.c
index e4d4da9..f0f8fb8 100644
--- a/ip/iplink.c
+++ b/ip/iplink.c
@@ -121,7 +121,7 @@ void iplink_usage(void)
 			"          bridge | bond | team | ipoib | ip6tnl | ipip | sit | vxlan |\n"
 			"          gre | gretap | erspan | ip6gre | ip6gretap | ip6erspan |\n"
 			"          vti | nlmon | team_slave | bond_slave | ipvlan | geneve |\n"
-			"          bridge_slave | vrf | macsec | netdevsim }\n");
+			"          bridge_slave | vrf | macsec | netdevsim | rmnet}\n");
 	}
 	exit(-1);
 }
diff --git a/ip/iplink_rmnet.c b/ip/iplink_rmnet.c
new file mode 100644
index 0000000..2367754
--- /dev/null
+++ b/ip/iplink_rmnet.c
@@ -0,0 +1,70 @@
+/*
+ * iplink_rmnet.c	RMNET device support
+ *
+ *              This program is free software; you can redistribute it and/or
+ *              modify it under the terms of the GNU General Public License
+ *              as published by the Free Software Foundation; either version
+ *              2 of the License, or (at your option) any later version.
+ *
+ * Authors:     Daniele Palmas <dnlplm@gmail.com>
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "rt_names.h"
+#include "utils.h"
+#include "ip_common.h"
+
+static void print_explain(FILE *f)
+{
+	fprintf(f,
+		"Usage: ... rmnet mux_id MUXID\n"
+		"\n"
+		"MUXID := 1-127\n"
+	);
+}
+
+static void explain(void)
+{
+	print_explain(stderr);
+}
+
+static int rmnet_parse_opt(struct link_util *lu, int argc, char **argv,
+			   struct nlmsghdr *n)
+{
+	__u16 mux_id;
+
+	while (argc > 0) {
+		if (matches(*argv, "mux_id") == 0) {
+			NEXT_ARG();
+			if (get_u16(&mux_id, *argv, 0))
+				invarg("mux_id is invalid", *argv);
+			addattr_l(n, 1024, IFLA_RMNET_MUX_ID, &mux_id, 2);
+		} else if (matches(*argv, "help") == 0) {
+			explain();
+			return -1;
+		} else {
+			fprintf(stderr, "rmnet: unknown command \"%s\"?\n", *argv);
+			explain();
+			return -1;
+		}
+		argc--, argv++;
+	}
+
+	return 0;
+}
+
+static void rmnet_print_help(struct link_util *lu, int argc, char **argv,
+			     FILE *f)
+{
+	print_explain(f);
+}
+
+struct link_util rmnet_link_util = {
+	.id		= "rmnet",
+	.maxattr	= IFLA_RMNET_MAX,
+	.parse_opt	= rmnet_parse_opt,
+	.print_help	= rmnet_print_help,
+};
-- 
2.7.4
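
The usage text in the patch advertises MUXID := 1-127, while the parser as
posted accepts any value that fits in a u16. A minimal sketch of a
range-checked parse using only the C library follows; `parse_mux_id()` is a
hypothetical helper, not part of iproute2, and the 1-127 range is taken from
the help text alone rather than from the kernel side.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a decimal mux id in [1, 127]; return 0 on success, -1 on bad input. */
int parse_mux_id(const char *arg, uint16_t *mux_id)
{
	char *end;
	unsigned long val;

	if (!arg || !*arg)
		return -1;

	errno = 0;
	val = strtoul(arg, &end, 10);
	if (errno || *end != '\0')
		return -1;	/* overflow or trailing garbage */
	if (val < 1 || val > 127)
		return -1;	/* outside the advertised range */

	*mux_id = (uint16_t)val;
	return 0;
}
```

In rmnet_parse_opt() such a check would sit where get_u16() is called today,
with invarg() reporting the failure.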

^ permalink raw reply related

* RE: [v3, 00/10] Support DPAA PTP clock and timestamping
From: Madalin-cristian Bucur @ 2018-06-12 14:27 UTC (permalink / raw)
  To: Y.b. Lu, netdev@vger.kernel.org, Richard Cochran, Rob Herring,
	Shawn Guo, David S . Miller
  Cc: devicetree@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Y.b. Lu
In-Reply-To: <20180607092050.46128-1-yangbo.lu@nxp.com>

> -----Original Message-----
> From: Yangbo Lu [mailto:yangbo.lu@nxp.com]
> Sent: Thursday, June 7, 2018 12:21 PM
> To: netdev@vger.kernel.org; Madalin-cristian Bucur
> <madalin.bucur@nxp.com>; Richard Cochran <richardcochran@gmail.com>;
> Rob Herring <robh+dt@kernel.org>; Shawn Guo <shawnguo@kernel.org>;
> David S . Miller <davem@davemloft.net>
> Cc: devicetree@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-arm-
> kernel@lists.infradead.org; linux-kernel@vger.kernel.org; Y.b. Lu
> <yangbo.lu@nxp.com>
> Subject: [v3, 00/10] Support DPAA PTP clock and timestamping
> 
> This patchset is to support DPAA FMAN PTP clock and HW timestamping.
> It had been verified on both ARM platform and PPC platform.
> - The patch #1 to patch #5 are to support DPAA FMAN 1588 timer in
>   ptp_qoriq driver.
> - The patch #6 to patch #10 are to add HW timestamping support in
>   DPAA ethernet driver.
> 
> Yangbo Lu (10):
>   fsl/fman: share the event interrupt
>   ptp: support DPAA FMan 1588 timer in ptp_qoriq
>   dt-binding: ptp_qoriq: add DPAA FMan support
>   powerpc/mpc85xx: move ptp timer out of fman in dts
>   arm64: dts: fsl: move ptp timer out of fman
>   fsl/fman: add set_tstamp interface
>   fsl/fman_port: support getting timestamp
>   fsl/fman: define frame description command UPD
>   dpaa_eth: add support for hardware timestamping
>   dpaa_eth: add the get_ts_info interface for ethtool
> 
>  Documentation/devicetree/bindings/net/fsl-fman.txt |   25 +-----
>  .../devicetree/bindings/ptp/ptp-qoriq.txt          |   15 +++-
>  arch/arm64/boot/dts/freescale/qoriq-fman3-0.dtsi   |   14 ++-
>  arch/powerpc/boot/dts/fsl/qoriq-fman-0.dtsi        |   14 ++-
>  arch/powerpc/boot/dts/fsl/qoriq-fman-1.dtsi        |   14 ++-
>  arch/powerpc/boot/dts/fsl/qoriq-fman3-0.dtsi       |   14 ++-
>  arch/powerpc/boot/dts/fsl/qoriq-fman3-1.dtsi       |   14 ++-
>  arch/powerpc/boot/dts/fsl/qoriq-fman3l-0.dtsi      |   14 ++-
>  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c     |   88
> ++++++++++++++++-
>  drivers/net/ethernet/freescale/dpaa/dpaa_eth.h     |    3 +
>  drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c |   39 ++++++++
>  drivers/net/ethernet/freescale/fman/fman.c         |    3 +-
>  drivers/net/ethernet/freescale/fman/fman.h         |    1 +
>  drivers/net/ethernet/freescale/fman/fman_dtsec.c   |   27 +++++
>  drivers/net/ethernet/freescale/fman/fman_dtsec.h   |    1 +
>  drivers/net/ethernet/freescale/fman/fman_memac.c   |    5 +
>  drivers/net/ethernet/freescale/fman/fman_memac.h   |    1 +
>  drivers/net/ethernet/freescale/fman/fman_port.c    |   12 +++
>  drivers/net/ethernet/freescale/fman/fman_port.h    |    2 +
>  drivers/net/ethernet/freescale/fman/fman_tgec.c    |   21 ++++
>  drivers/net/ethernet/freescale/fman/fman_tgec.h    |    1 +
>  drivers/net/ethernet/freescale/fman/mac.c          |    3 +
>  drivers/net/ethernet/freescale/fman/mac.h          |    1 +
>  drivers/ptp/Kconfig                                |    2 +-
>  drivers/ptp/ptp_qoriq.c                            |  104 ++++++++++++-------
>  include/linux/fsl/ptp_qoriq.h                      |   38 ++++++--
>  26 files changed, 361 insertions(+), 115 deletions(-)

Acked-by: Madalin Bucur <madalin.bucur@nxp.com>

^ permalink raw reply

* Re: [PATCH 2/2] ath10k: allow ATH10K_SNOC with COMPILE_TEST
From: kbuild test robot @ 2018-06-12 14:50 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: kbuild-all, Kalle Valo, David S. Miller, Niklas Cassel, ath10k,
	linux-wireless, netdev, linux-kernel
In-Reply-To: <20180612113907.15043-2-niklas.cassel@linaro.org>

Hi Niklas,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on ath6kl/ath-next]
[also build test WARNING on next-20180612]
[cannot apply to v4.17]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Niklas-Cassel/ath10k-do-not-mix-spaces-and-tabs-in-Kconfig/20180612-194241
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git ath-next
reproduce:
        # apt-get install sparse
        make ARCH=x86_64 allmodconfig
        make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

>> drivers/net/wireless/ath/ath10k/snoc.c:823:5: sparse: symbol 'ath10k_snoc_get_ce_id_from_irq' was not declared. Should it be static?
>> drivers/net/wireless/ath/ath10k/snoc.c:871:6: sparse: symbol 'ath10k_snoc_init_napi' was not declared. Should it be static?

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply

* [RFC PATCH] ath10k: ath10k_snoc_get_ce_id_from_irq() can be static
From: kbuild test robot @ 2018-06-12 14:50 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: kbuild-all, Kalle Valo, David S. Miller, Niklas Cassel, ath10k,
	linux-wireless, netdev, linux-kernel
In-Reply-To: <20180612113907.15043-2-niklas.cassel@linaro.org>


Fixes: aecf55e7df3a ("ath10k: allow ATH10K_SNOC with COMPILE_TEST")
Signed-off-by: kbuild test robot <fengguang.wu@intel.com>
---
 snoc.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c
index a3a7042..92ddb1c 100644
--- a/drivers/net/wireless/ath/ath10k/snoc.c
+++ b/drivers/net/wireless/ath/ath10k/snoc.c
@@ -820,7 +820,7 @@ static const struct ath10k_bus_ops ath10k_snoc_bus_ops = {
 	.write32	= ath10k_snoc_write32,
 };
 
-int ath10k_snoc_get_ce_id_from_irq(struct ath10k *ar, int irq)
+static int ath10k_snoc_get_ce_id_from_irq(struct ath10k *ar, int irq)
 {
 	struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);
 	int i;
@@ -868,7 +868,7 @@ static int ath10k_snoc_napi_poll(struct napi_struct *ctx, int budget)
 	return done;
 }
 
-void ath10k_snoc_init_napi(struct ath10k *ar)
+static void ath10k_snoc_init_napi(struct ath10k *ar)
 {
 	netif_napi_add(&ar->napi_dev, &ar->napi, ath10k_snoc_napi_poll,
 		       ATH10K_NAPI_BUDGET);

^ permalink raw reply related

* Re: [PATCH 3/3 RFC] Revert "net: stmmac: fix build failure due to missing COMMON_CLK dependency"
From: Jose Abreu @ 2018-06-12 14:51 UTC (permalink / raw)
  To: Geert Uytterhoeven, Greg Ungerer, Ralf Baechle, James Hogan,
	Giuseppe Cavallaro, Alexandre Torgue, Jose Abreu, Corentin Labbe,
	David S . Miller
  Cc: Arnd Bergmann, linux-m68k, linux-mips, netdev, linux-kernel
In-Reply-To: <1528706663-20670-4-git-send-email-geert@linux-m68k.org>

Hi,

On 11-06-2018 09:44, Geert Uytterhoeven wrote:
> This reverts commit bde4975310eb1982bd0bbff673989052d92fd481.
>
> All legacy clock implementations now implement clk_set_rate() (Some
> implementations may be dummies, though).
>
> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
>

This seems okay to me. You can send a non-rfc patch with my ack
once the other 2 patches get accepted:

Acked-by: Jose Abreu <joabreu@synopsys.com>

Thanks and Best Regards,
Jose Miguel Abreu

^ permalink raw reply

* Re: [PATCH net 2/3] hv_netvsc: fix network namespace issues with VF support
From: Stephen Hemminger @ 2018-06-12 15:10 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: kys, haiyangz, sthemmin, devel, netdev
In-Reply-To: <20180612095128.kslkt6ban6otanbs@mwanda>

On Tue, 12 Jun 2018 12:51:28 +0300
Dan Carpenter <dan.carpenter@oracle.com> wrote:

> On Mon, Jun 11, 2018 at 12:44:55PM -0700, Stephen Hemminger wrote:
> > When finding the parent netvsc device, the search needs to be across
> > all netvsc device instances (independent of network namespace).
> > 
> > Find parent device of VF using upper_dev_get routine which
> > searches only adjacent list.
> > 
> > Fixes: e8ff40d4bff1 ("hv_netvsc: improve VF device matching")
> > Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> > 
> > netns aware byref  
> 
> What?  Presumably that wasn't supposed to be part of the commit message.

That was left over from an earlier commit message.

^ permalink raw reply

* Re: [PATCH 1/2] Convert target drivers to use sbitmap
From: Bart Van Assche @ 2018-06-12 15:22 UTC (permalink / raw)
  To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-usb@vger.kernel.org, willy@infradead.org,
	virtualization@lists.linux-foundation.org,
	kent.overstreet@gmail.com, linux1394-devel@lists.sourceforge.net,
	jgross@suse.com, axboe@kernel.dk, linux-scsi@vger.kernel.org,
	qla2xxx-upstream@qlogic.com, target-devel@vger.kernel.org,
	netdev@vger.kernel.org
  Cc: mawilcox@microsoft.com
In-Reply-To: <20180515160043.27044-2-willy@infradead.org>

On Tue, 2018-05-15 at 09:00 -0700, Matthew Wilcox wrote:
> diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c
> index 025dc2d3f3de..cdf671c2af61 100644
> --- a/drivers/scsi/qla2xxx/qla_target.c
> +++ b/drivers/scsi/qla2xxx/qla_target.c
> @@ -3719,7 +3719,8 @@ void qlt_free_cmd(struct qla_tgt_cmd *cmd)
>  		return;
>  	}
>  	cmd->jiffies_at_free = get_jiffies_64();
> -	percpu_ida_free(&sess->se_sess->sess_tag_pool, cmd->se_cmd.map_tag);
> +	sbitmap_queue_clear(&sess->se_sess->sess_tag_pool, cmd->se_cmd.map_tag,
> +			cmd->se_cmd.map_cpu);
>  }
>  EXPORT_SYMBOL(qlt_free_cmd);

Please introduce functions in the target core for allocating and freeing a tag
instead of spreading the knowledge of how to allocate and free tags over all
target drivers.
 
> +int iscsit_wait_for_tag(struct se_session *se_sess, int state, int *cpup)
> +{
> +	int tag = -1;
> +	DEFINE_WAIT(wait);
> +	struct sbq_wait_state *ws;
> +
> +	if (state == TASK_RUNNING)
> +		return tag;
> +
> +	ws = &se_sess->sess_tag_pool.ws[0];
> +	for (;;) {
> +		prepare_to_wait_exclusive(&ws->wait, &wait, state);
> +		if (signal_pending_state(state, current))
> +			break;

This looks weird to me. Shouldn't target code ignore signals instead of causing
tag allocation to fail if a signal is received?

> +		schedule();
> +		tag = sbitmap_queue_get(&se_sess->sess_tag_pool, cpup);
> +	}
> +
> +	finish_wait(&ws->wait, &wait);
> +	return tag;
> +}

Thanks,

Bart.

^ permalink raw reply

* [jkirsher/next-queue PATCH v2 0/7] Add support for L2 Fwd Offload w/o ndo_select_queue
From: Alexander Duyck @ 2018-06-12 15:18 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher, netdev

This patch series is meant to allow support for the L2 forward offload, aka
MACVLAN offload without the need for using ndo_select_queue.

The existing solution currently requires that we use ndo_select_queue in
the transmit path if we want to associate specific Tx queues with a given
MACVLAN interface. In order to get away from this we need to repurpose the
tc_to_txq array and XPS pointer for the MACVLAN interface and use those as
a means of accessing the queues on the lower device. As a result we cannot
offload a device that is configured as multiqueue, however it doesn't
really make sense to configure a macvlan interface as being multiqueue
anyway since it doesn't really have a qdisc of its own in the first place.

I am submitting this as an RFC for the netdev mailing list, and officially
submitting it for testing to Jeff Kirsher's next-queue in order to validate
the ixgbe specific bits.

The big changes in this set are:
  Allow lower device to update tc_to_txq and XPS map of offloaded MACVLAN
  Disable XPS for single queue devices
  Replace accel_priv with sb_dev in ndo_select_queue
  Add sb_dev parameter to fallback function for ndo_select_queue
  Consolidated ndo_select_queue functions that appeared to be duplicates

v2: Implement generic "select_queue" functions instead of "fallback" functions.
    Tweak last two patches to account for changes in dev_pick_tx_xxx functions.

---

Alexander Duyck (7):
      net-sysfs: Drop support for XPS and traffic_class on single queue device
      net: Add support for subordinate device traffic classes
      ixgbe: Add code to populate and use macvlan tc to Tx queue map
      net: Add support for subordinate traffic classes to netdev_pick_tx
      net: Add generic ndo_select_queue functions
      net: allow ndo_select_queue to pass netdev
      net: allow fallback function to pass netdev


 drivers/infiniband/hw/hfi1/vnic_main.c            |    2 
 drivers/infiniband/ulp/opa_vnic/opa_vnic_netdev.c |    4 -
 drivers/net/bonding/bond_main.c                   |    3 
 drivers/net/ethernet/amazon/ena/ena_netdev.c      |    5 -
 drivers/net/ethernet/broadcom/bcmsysport.c        |    6 -
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c   |    6 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h   |    3 
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c   |    5 -
 drivers/net/ethernet/hisilicon/hns/hns_enet.c     |    5 -
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c     |   62 ++++++--
 drivers/net/ethernet/lantiq_etop.c                |   10 -
 drivers/net/ethernet/mellanox/mlx4/en_tx.c        |    7 +
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h      |    3 
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |    3 
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   |    5 -
 drivers/net/ethernet/renesas/ravb_main.c          |    3 
 drivers/net/ethernet/sun/ldmvsw.c                 |    3 
 drivers/net/ethernet/sun/sunvnet.c                |    3 
 drivers/net/ethernet/ti/netcp_core.c              |    9 -
 drivers/net/hyperv/netvsc_drv.c                   |    6 -
 drivers/net/macvlan.c                             |   10 -
 drivers/net/net_failover.c                        |    7 +
 drivers/net/team/team.c                           |    3 
 drivers/net/tun.c                                 |    3 
 drivers/net/wireless/marvell/mwifiex/main.c       |    3 
 drivers/net/xen-netback/interface.c               |    4 -
 drivers/net/xen-netfront.c                        |    3 
 drivers/staging/netlogic/xlr_net.c                |    9 -
 drivers/staging/rtl8188eu/os_dep/os_intfs.c       |    3 
 drivers/staging/rtl8723bs/os_dep/os_intfs.c       |    7 -
 include/linux/netdevice.h                         |   34 ++++-
 net/core/dev.c                                    |  156 ++++++++++++++++++---
 net/core/net-sysfs.c                              |   36 ++++-
 net/mac80211/iface.c                              |    4 -
 net/packet/af_packet.c                            |    7 +
 35 files changed, 312 insertions(+), 130 deletions(-)

^ permalink raw reply

* [jkirsher/next-queue PATCH v2 1/7] net-sysfs: Drop support for XPS and traffic_class on single queue device
From: Alexander Duyck @ 2018-06-12 15:18 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher, netdev
In-Reply-To: <20180612151322.86792.97587.stgit@ahduyck-green-test.jf.intel.com>

This patch makes it so that we do not report the traffic class or allow XPS
configuration on single queue devices. This is mostly to avoid unnecessary
complexity with changes I have planned that will allow us to reuse
the unused tc_to_txq and XPS configuration on a single queue device to
allow it to make use of a subset of queues on an underlying device.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 net/core/net-sysfs.c |   15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index bb7e80f..335c6a4 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1047,9 +1047,14 @@ static ssize_t traffic_class_show(struct netdev_queue *queue,
 				  char *buf)
 {
 	struct net_device *dev = queue->dev;
-	int index = get_netdev_queue_index(queue);
-	int tc = netdev_txq_to_tc(dev, index);
+	int index;
+	int tc;
 
+	if (!netif_is_multiqueue(dev))
+		return -ENOENT;
+
+	index = get_netdev_queue_index(queue);
+	tc = netdev_txq_to_tc(dev, index);
 	if (tc < 0)
 		return -EINVAL;
 
@@ -1214,6 +1219,9 @@ static ssize_t xps_cpus_show(struct netdev_queue *queue,
 	cpumask_var_t mask;
 	unsigned long index;
 
+	if (!netif_is_multiqueue(dev))
+		return -ENOENT;
+
 	index = get_netdev_queue_index(queue);
 
 	if (dev->num_tc) {
@@ -1260,6 +1268,9 @@ static ssize_t xps_cpus_store(struct netdev_queue *queue,
 	cpumask_var_t mask;
 	int err;
 
+	if (!netif_is_multiqueue(dev))
+		return -ENOENT;
+
 	if (!capable(CAP_NET_ADMIN))
 		return -EPERM;
 

^ permalink raw reply related

* [jkirsher/next-queue PATCH v2 2/7] net: Add support for subordinate device traffic classes
From: Alexander Duyck @ 2018-06-12 15:18 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher, netdev
In-Reply-To: <20180612151322.86792.97587.stgit@ahduyck-green-test.jf.intel.com>

This patch is meant to provide the basic tools needed to allow us to create
subordinate device traffic classes. The general idea here is to allow
subdividing the queues of a device into queue groups accessible through an
upper device such as a macvlan.

The intent is to enforce that an upper device has to be a
single queue device, ideally with IFF_NO_QUEUE set. With that being the
case we can pretty much guarantee that the tc_to_txq mappings and XPS maps
for the upper device are unused. As such we could reuse those in order to
support subdividing the lower device and distributing those queues between
the subordinate devices.

In order to distinguish between a device carrying a regular set of traffic
classes and one carrying subordinate traffic classes, I changed num_tc from
a u8 to an s16 value and use the negative values to represent the subordinate
pool values. So starting at -1 and running to -32768 we can encode those as
pool values, and the existing values of 0 to 15 can be maintained.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
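As an illustration only (not part of the patch), the sign-based encoding
described above can be sketched in plain userspace C. The sketch_* names and
the one-field struct are hypothetical stand-ins; only the arithmetic follows
netdev_set_sb_channel(), netdev_get_sb_channel() and traffic_class_show() as
they appear in this patch. Note that the setter caps the channel at S16_MAX,
so -32768 is representable in an s16 but not reachable through it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the one net_device field this sketch needs. */
struct sketch_dev {
	int16_t num_tc;	/* > 0: number of TCs, < 0: negated subordinate channel */
};

/* Same idea as netdev_set_sb_channel(): store the channel negated. */
static int sketch_set_sb_channel(struct sketch_dev *dev, uint16_t channel)
{
	if (channel > INT16_MAX)
		return -1;	/* the patch returns -EINVAL here */

	dev->num_tc = (int16_t)-(int)channel;
	return 0;
}

/* Same idea as netdev_get_sb_channel(): max(-num_tc, 0). */
static int sketch_get_sb_channel(const struct sketch_dev *dev)
{
	int channel = -(int)dev->num_tc;

	return channel > 0 ? channel : 0;
}

/* Format a traffic class the way traffic_class_show() does in this patch:
 * a negative num_tc printed with %d supplies the '-' separator itself.
 */
static int sketch_format_tc(char *buf, size_t len, unsigned int tc,
			    const struct sketch_dev *dev)
{
	assert(len > 0);
	return dev->num_tc < 0 ?
		snprintf(buf, len, "%u%d\n", tc, dev->num_tc) :
		snprintf(buf, len, "%u\n", tc);
}
```

With channel 2 set, sketch_get_sb_channel() returns 2 and TC 0 is formatted
as "0-2"; a device with a positive num_tc reports channel 0 and just the
traffic class.
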
 include/linux/netdevice.h |   16 ++++++++
 net/core/dev.c            |   89 +++++++++++++++++++++++++++++++++++++++++++++
 net/core/net-sysfs.c      |   21 ++++++++++-
 3 files changed, 124 insertions(+), 2 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3ec9850..41b4660 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -569,6 +569,9 @@ struct netdev_queue {
 	 * (/sys/class/net/DEV/Q/trans_timeout)
 	 */
 	unsigned long		trans_timeout;
+
+	/* Subordinate device that the queue has been assigned to */
+	struct net_device	*sb_dev;
 /*
  * write-mostly part
  */
@@ -1978,7 +1981,7 @@ struct net_device {
 #ifdef CONFIG_DCB
 	const struct dcbnl_rtnl_ops *dcbnl_ops;
 #endif
-	u8			num_tc;
+	s16			num_tc;
 	struct netdev_tc_txq	tc_to_txq[TC_MAX_QUEUE];
 	u8			prio_tc_map[TC_BITMASK + 1];
 
@@ -2032,6 +2035,17 @@ int netdev_get_num_tc(struct net_device *dev)
 	return dev->num_tc;
 }
 
+void netdev_unbind_sb_channel(struct net_device *dev,
+			      struct net_device *sb_dev);
+int netdev_bind_sb_channel_queue(struct net_device *dev,
+				 struct net_device *sb_dev,
+				 u8 tc, u16 count, u16 offset);
+int netdev_set_sb_channel(struct net_device *dev, u16 channel);
+static inline int netdev_get_sb_channel(struct net_device *dev)
+{
+	return max_t(int, -dev->num_tc, 0);
+}
+
 static inline
 struct netdev_queue *netdev_get_tx_queue(const struct net_device *dev,
 					 unsigned int index)
diff --git a/net/core/dev.c b/net/core/dev.c
index 6e18242..27fe4f2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2068,11 +2068,13 @@ int netdev_txq_to_tc(struct net_device *dev, unsigned int txq)
 		struct netdev_tc_txq *tc = &dev->tc_to_txq[0];
 		int i;
 
+		/* walk through the TCs and see if it falls into any of them */
 		for (i = 0; i < TC_MAX_QUEUE; i++, tc++) {
 			if ((txq - tc->offset) < tc->count)
 				return i;
 		}
 
+		/* didn't find it, just return -1 to indicate no match */
 		return -1;
 	}
 
@@ -2215,7 +2217,14 @@ int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
 	bool active = false;
 
 	if (dev->num_tc) {
+		/* Do not allow XPS on subordinate device directly */
 		num_tc = dev->num_tc;
+		if (num_tc < 0)
+			return -EINVAL;
+
+		/* If queue belongs to subordinate dev use its map */
+		dev = netdev_get_tx_queue(dev, index)->sb_dev ? : dev;
+
 		tc = netdev_txq_to_tc(dev, index);
 		if (tc < 0)
 			return -EINVAL;
@@ -2366,11 +2375,25 @@ int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
 EXPORT_SYMBOL(netif_set_xps_queue);
 
 #endif
+static void netdev_unbind_all_sb_channels(struct net_device *dev)
+{
+	struct netdev_queue *txq = &dev->_tx[dev->num_tx_queues];
+
+	/* Unbind any subordinate channels */
+	while (txq-- != &dev->_tx[0]) {
+		if (txq->sb_dev)
+			netdev_unbind_sb_channel(dev, txq->sb_dev);
+	}
+}
+
 void netdev_reset_tc(struct net_device *dev)
 {
 #ifdef CONFIG_XPS
 	netif_reset_xps_queues_gt(dev, 0);
 #endif
+	netdev_unbind_all_sb_channels(dev);
+
+	/* Reset TC configuration of device */
 	dev->num_tc = 0;
 	memset(dev->tc_to_txq, 0, sizeof(dev->tc_to_txq));
 	memset(dev->prio_tc_map, 0, sizeof(dev->prio_tc_map));
@@ -2399,11 +2422,77 @@ int netdev_set_num_tc(struct net_device *dev, u8 num_tc)
 #ifdef CONFIG_XPS
 	netif_reset_xps_queues_gt(dev, 0);
 #endif
+	netdev_unbind_all_sb_channels(dev);
+
 	dev->num_tc = num_tc;
 	return 0;
 }
 EXPORT_SYMBOL(netdev_set_num_tc);
 
+void netdev_unbind_sb_channel(struct net_device *dev,
+			      struct net_device *sb_dev)
+{
+	struct netdev_queue *txq = &dev->_tx[dev->num_tx_queues];
+
+#ifdef CONFIG_XPS
+	netif_reset_xps_queues_gt(sb_dev, 0);
+#endif
+	memset(sb_dev->tc_to_txq, 0, sizeof(sb_dev->tc_to_txq));
+	memset(sb_dev->prio_tc_map, 0, sizeof(sb_dev->prio_tc_map));
+
+	while (txq-- != &dev->_tx[0]) {
+		if (txq->sb_dev == sb_dev)
+			txq->sb_dev = NULL;
+	}
+}
+EXPORT_SYMBOL(netdev_unbind_sb_channel);
+
+int netdev_bind_sb_channel_queue(struct net_device *dev,
+				 struct net_device *sb_dev,
+				 u8 tc, u16 count, u16 offset)
+{
+	/* Make certain the sb_dev and dev are already configured */
+	if (sb_dev->num_tc >= 0 || tc >= dev->num_tc)
+		return -EINVAL;
+
+	/* We cannot hand out queues we don't have */
+	if ((offset + count) > dev->real_num_tx_queues)
+		return -EINVAL;
+
+	/* Record the mapping */
+	sb_dev->tc_to_txq[tc].count = count;
+	sb_dev->tc_to_txq[tc].offset = offset;
+
+	/* Provide a way for Tx queue to find the tc_to_txq map or
+	 * XPS map for itself.
+	 */
+	while (count--)
+		netdev_get_tx_queue(dev, count + offset)->sb_dev = sb_dev;
+
+	return 0;
+}
+EXPORT_SYMBOL(netdev_bind_sb_channel_queue);
+
+int netdev_set_sb_channel(struct net_device *dev, u16 channel)
+{
+	/* Do not use a multiqueue device to represent a subordinate channel */
+	if (netif_is_multiqueue(dev))
+		return -ENODEV;
+
+	/* We allow channels 1 - 32767 to be used for subordinate channels.
+	 * Channel 0 is meant to be "native" mode and used only to represent
+	 * the main root device. We allow writing 0 to reset the device back
+	 * to normal mode after being used as a subordinate channel.
+	 */
+	if (channel > S16_MAX)
+		return -EINVAL;
+
+	dev->num_tc = -channel;
+
+	return 0;
+}
+EXPORT_SYMBOL(netdev_set_sb_channel);
+
 /*
  * Routine to help set real_num_tx_queues. To avoid skbs mapped to queues
  * greater than real_num_tx_queues stale skbs on the qdisc must be flushed.
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 335c6a4..bd067b1 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1054,11 +1054,23 @@ static ssize_t traffic_class_show(struct netdev_queue *queue,
 		return -ENOENT;
 
 	index = get_netdev_queue_index(queue);
+
+	/* If queue belongs to subordinate dev use its tc mapping */
+	dev = netdev_get_tx_queue(dev, index)->sb_dev ? : dev;
+
 	tc = netdev_txq_to_tc(dev, index);
 	if (tc < 0)
 		return -EINVAL;
 
-	return sprintf(buf, "%u\n", tc);
+	/* We can report the traffic class one of two ways:
+	 * Subordinate device traffic classes are reported with the traffic
+	 * class first, and then the subordinate class so for example TC0 on
+	 * subordinate device 2 will be reported as "0-2". If the queue
+	 * belongs to the root device it will be reported with just the
+	 * traffic class, so just "0" for TC 0 for example.
+	 */
+	return dev->num_tc < 0 ? sprintf(buf, "%u%d\n", tc, dev->num_tc) :
+				 sprintf(buf, "%u\n", tc);
 }
 
 #ifdef CONFIG_XPS
@@ -1225,7 +1237,14 @@ static ssize_t xps_cpus_show(struct netdev_queue *queue,
 	index = get_netdev_queue_index(queue);
 
 	if (dev->num_tc) {
+		/* Do not allow XPS on subordinate device directly */
 		num_tc = dev->num_tc;
+		if (num_tc < 0)
+			return -EINVAL;
+
+		/* If queue belongs to subordinate dev use its map */
+		dev = netdev_get_tx_queue(dev, index)->sb_dev ? : dev;
+
 		tc = netdev_txq_to_tc(dev, index);
 		if (tc < 0)
 			return -EINVAL;

^ permalink raw reply related

* [jkirsher/next-queue PATCH v2 4/7] net: Add support for subordinate traffic classes to netdev_pick_tx
From: Alexander Duyck @ 2018-06-12 15:18 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher, netdev
In-Reply-To: <20180612151322.86792.97587.stgit@ahduyck-green-test.jf.intel.com>

This change adds support for the concept of subordinate device traffic
classes to the core networking code. In doing this we can start pulling
out the driver specific bits needed to support selecting a queue based on
an upper device.

The solution as it currently stands is only partially implemented. I have
the start of some XPS bits in here, but I would still need to allow for
configuration of the XPS maps on the queues reserved for the subordinate
devices. For now I am using the reference to the sb_dev XPS map just as a
way to skip the lookup of the lower device XPS map, as that would
result in the wrong queue being picked.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
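As an illustration only (not part of the patch), the queue selection that
skb_tx_hash() performs against a tc_to_txq entry can be sketched in userspace
C. The sketch_* names are hypothetical; the scaling helper uses the same
multiply-and-shift arithmetic as the kernel's reciprocal_scale(), and the
fold helper mirrors the recorded-Rx-queue path, which after this change also
adds the queue offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of one tc_to_txq entry: a window of Tx queues. */
struct sketch_tc_txq {
	uint16_t count;
	uint16_t offset;
};

/* Same arithmetic as reciprocal_scale(): map a 32-bit value into
 * [0, ep_ro) with a multiply and a shift instead of a division.
 */
static uint32_t sketch_reciprocal_scale(uint32_t val, uint32_t ep_ro)
{
	return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
}

/* Hash path: scale the flow hash into the window, then add the offset. */
static uint16_t sketch_pick_by_hash(const struct sketch_tc_txq *map,
				    uint32_t hash)
{
	assert(map->count > 0);
	return (uint16_t)(sketch_reciprocal_scale(hash, map->count) +
			  map->offset);
}

/* Recorded Rx queue path: fold the Rx queue into the window by repeated
 * subtraction, then add the offset.
 */
static uint16_t sketch_pick_by_rx_queue(const struct sketch_tc_txq *map,
					uint16_t rx_queue)
{
	uint32_t q = rx_queue;

	assert(map->count > 0);
	while (q >= map->count)
		q -= map->count;

	return (uint16_t)(q + map->offset);
}
```

For a window of 4 queues starting at offset 8, every result lands in queues
8 through 11, which is what keeps traffic from a subordinate device inside
the queues that were bound to it.
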
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   19 +++-----
 drivers/net/macvlan.c                         |   10 +---
 include/linux/netdevice.h                     |    4 +-
 net/core/dev.c                                |   57 +++++++++++++++----------
 4 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 6e27848..053a54c 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -8219,20 +8219,17 @@ static void ixgbe_atr(struct ixgbe_ring *ring,
 					      input, common, ring->queue_index);
 }
 
+#ifdef IXGBE_FCOE
 static u16 ixgbe_select_queue(struct net_device *dev, struct sk_buff *skb,
 			      void *accel_priv, select_queue_fallback_t fallback)
 {
-	struct ixgbe_fwd_adapter *fwd_adapter = accel_priv;
-#ifdef IXGBE_FCOE
 	struct ixgbe_adapter *adapter;
 	struct ixgbe_ring_feature *f;
-#endif
 	int txq;
 
-	if (fwd_adapter) {
-		u8 tc = netdev_get_num_tc(dev) ?
-			netdev_get_prio_tc_map(dev, skb->priority) : 0;
-		struct net_device *vdev = fwd_adapter->netdev;
+	if (accel_priv) {
+		u8 tc = netdev_get_prio_tc_map(dev, skb->priority);
+		struct net_device *vdev = accel_priv;
 
 		txq = vdev->tc_to_txq[tc].offset;
 		txq += reciprocal_scale(skb_get_hash(skb),
@@ -8241,8 +8238,6 @@ static u16 ixgbe_select_queue(struct net_device *dev, struct sk_buff *skb,
 		return txq;
 	}
 
-#ifdef IXGBE_FCOE
-
 	/*
 	 * only execute the code below if protocol is FCoE
 	 * or FIP and we have FCoE enabled on the adapter
@@ -8268,11 +8263,9 @@ static u16 ixgbe_select_queue(struct net_device *dev, struct sk_buff *skb,
 		txq -= f->indices;
 
 	return txq + f->offset;
-#else
-	return fallback(dev, skb);
-#endif
 }
 
+#endif
 static int ixgbe_xmit_xdp_ring(struct ixgbe_adapter *adapter,
 			       struct xdp_frame *xdpf)
 {
@@ -10076,7 +10069,6 @@ static int ixgbe_xdp_xmit(struct net_device *dev, int n,
 	.ndo_open		= ixgbe_open,
 	.ndo_stop		= ixgbe_close,
 	.ndo_start_xmit		= ixgbe_xmit_frame,
-	.ndo_select_queue	= ixgbe_select_queue,
 	.ndo_set_rx_mode	= ixgbe_set_rx_mode,
 	.ndo_validate_addr	= eth_validate_addr,
 	.ndo_set_mac_address	= ixgbe_set_mac,
@@ -10099,6 +10091,7 @@ static int ixgbe_xdp_xmit(struct net_device *dev, int n,
 	.ndo_poll_controller	= ixgbe_netpoll,
 #endif
 #ifdef IXGBE_FCOE
+	.ndo_select_queue	= ixgbe_select_queue,
 	.ndo_fcoe_ddp_setup = ixgbe_fcoe_ddp_get,
 	.ndo_fcoe_ddp_target = ixgbe_fcoe_ddp_target,
 	.ndo_fcoe_ddp_done = ixgbe_fcoe_ddp_put,
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index adde8fc..401e1d1 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -514,7 +514,6 @@ static int macvlan_queue_xmit(struct sk_buff *skb, struct net_device *dev)
 	const struct macvlan_dev *vlan = netdev_priv(dev);
 	const struct macvlan_port *port = vlan->port;
 	const struct macvlan_dev *dest;
-	void *accel_priv = NULL;
 
 	if (vlan->mode == MACVLAN_MODE_BRIDGE) {
 		const struct ethhdr *eth = (void *)skb->data;
@@ -533,15 +532,10 @@ static int macvlan_queue_xmit(struct sk_buff *skb, struct net_device *dev)
 			return NET_XMIT_SUCCESS;
 		}
 	}
-
-	/* For packets that are non-multicast and not bridged we will pass
-	 * the necessary information so that the lowerdev can distinguish
-	 * the source of the packets via the accel_priv value.
-	 */
-	accel_priv = vlan->accel_priv;
 xmit_world:
 	skb->dev = vlan->lowerdev;
-	return dev_queue_xmit_accel(skb, accel_priv);
+	return dev_queue_xmit_accel(skb,
+				    netdev_get_sb_channel(dev) ? dev : NULL);
 }
 
 static inline netdev_tx_t macvlan_netpoll_send_skb(struct macvlan_dev *vlan, struct sk_buff *skb)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 41b4660..91b3ca9 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2090,7 +2090,7 @@ static inline void netdev_for_each_tx_queue(struct net_device *dev,
 
 struct netdev_queue *netdev_pick_tx(struct net_device *dev,
 				    struct sk_buff *skb,
-				    void *accel_priv);
+				    struct net_device *sb_dev);
 
 /* returns the headroom that the master device needs to take in account
  * when forwarding to this dev
@@ -2552,7 +2552,7 @@ struct net_device *__dev_get_by_flags(struct net *net, unsigned short flags,
 void dev_disable_lro(struct net_device *dev);
 int dev_loopback_xmit(struct net *net, struct sock *sk, struct sk_buff *newskb);
 int dev_queue_xmit(struct sk_buff *skb);
-int dev_queue_xmit_accel(struct sk_buff *skb, void *accel_priv);
+int dev_queue_xmit_accel(struct sk_buff *skb, struct net_device *sb_dev);
 int dev_direct_xmit(struct sk_buff *skb, u16 queue_id);
 int register_netdevice(struct net_device *dev);
 void unregister_netdevice_queue(struct net_device *dev, struct list_head *head);
diff --git a/net/core/dev.c b/net/core/dev.c
index 27fe4f2..2249294 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2704,24 +2704,26 @@ void netif_device_attach(struct net_device *dev)
  * Returns a Tx hash based on the given packet descriptor a Tx queues' number
  * to be used as a distribution range.
  */
-static u16 skb_tx_hash(const struct net_device *dev, struct sk_buff *skb)
+static u16 skb_tx_hash(const struct net_device *dev,
+		       const struct net_device *sb_dev,
+		       struct sk_buff *skb)
 {
 	u32 hash;
 	u16 qoffset = 0;
 	u16 qcount = dev->real_num_tx_queues;
 
+	if (dev->num_tc) {
+		u8 tc = netdev_get_prio_tc_map(dev, skb->priority);
+
+		qoffset = sb_dev->tc_to_txq[tc].offset;
+		qcount = sb_dev->tc_to_txq[tc].count;
+	}
+
 	if (skb_rx_queue_recorded(skb)) {
 		hash = skb_get_rx_queue(skb);
 		while (unlikely(hash >= qcount))
 			hash -= qcount;
-		return hash;
-	}
-
-	if (dev->num_tc) {
-		u8 tc = netdev_get_prio_tc_map(dev, skb->priority);
-
-		qoffset = dev->tc_to_txq[tc].offset;
-		qcount = dev->tc_to_txq[tc].count;
+		return hash + qoffset;
 	}
 
 	return (u16) reciprocal_scale(skb_get_hash(skb), qcount) + qoffset;
@@ -3465,7 +3467,9 @@ int dev_loopback_xmit(struct net *net, struct sock *sk, struct sk_buff *skb)
 }
 #endif /* CONFIG_NET_EGRESS */
 
-static inline int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
+static inline int get_xps_queue(struct net_device *dev,
+				struct net_device *sb_dev,
+				struct sk_buff *skb)
 {
 #ifdef CONFIG_XPS
 	struct xps_dev_maps *dev_maps;
@@ -3473,7 +3477,7 @@ static inline int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
 	int queue_index = -1;
 
 	rcu_read_lock();
-	dev_maps = rcu_dereference(dev->xps_maps);
+	dev_maps = rcu_dereference(sb_dev->xps_maps);
 	if (dev_maps) {
 		unsigned int tci = skb->sender_cpu - 1;
 
@@ -3501,17 +3505,20 @@ static inline int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
 #endif
 }
 
-static u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb)
+static u16 ___netdev_pick_tx(struct net_device *dev, struct sk_buff *skb,
+			     struct net_device *sb_dev)
 {
 	struct sock *sk = skb->sk;
 	int queue_index = sk_tx_queue_get(sk);
 
+	sb_dev = sb_dev ? : dev;
+
 	if (queue_index < 0 || skb->ooo_okay ||
 	    queue_index >= dev->real_num_tx_queues) {
-		int new_index = get_xps_queue(dev, skb);
+		int new_index = get_xps_queue(dev, sb_dev, skb);
 
 		if (new_index < 0)
-			new_index = skb_tx_hash(dev, skb);
+			new_index = skb_tx_hash(dev, sb_dev, skb);
 
 		if (queue_index != new_index && sk &&
 		    sk_fullsock(sk) &&
@@ -3524,9 +3531,15 @@ static u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb)
 	return queue_index;
 }
 
+static u16 __netdev_pick_tx(struct net_device *dev,
+			    struct sk_buff *skb)
+{
+	return ___netdev_pick_tx(dev, skb, NULL);
+}
+
 struct netdev_queue *netdev_pick_tx(struct net_device *dev,
 				    struct sk_buff *skb,
-				    void *accel_priv)
+				    struct net_device *sb_dev)
 {
 	int queue_index = 0;
 
@@ -3541,10 +3554,10 @@ struct netdev_queue *netdev_pick_tx(struct net_device *dev,
 		const struct net_device_ops *ops = dev->netdev_ops;
 
 		if (ops->ndo_select_queue)
-			queue_index = ops->ndo_select_queue(dev, skb, accel_priv,
+			queue_index = ops->ndo_select_queue(dev, skb, sb_dev,
 							    __netdev_pick_tx);
 		else
-			queue_index = __netdev_pick_tx(dev, skb);
+			queue_index = ___netdev_pick_tx(dev, skb, sb_dev);
 
 		queue_index = netdev_cap_txqueue(dev, queue_index);
 	}
@@ -3556,7 +3569,7 @@ struct netdev_queue *netdev_pick_tx(struct net_device *dev,
 /**
  *	__dev_queue_xmit - transmit a buffer
  *	@skb: buffer to transmit
- *	@accel_priv: private data used for L2 forwarding offload
+ *	@sb_dev: subordinate device used for L2 forwarding offload
  *
  *	Queue a buffer for transmission to a network device. The caller must
  *	have set the device and priority and built the buffer before calling
@@ -3579,7 +3592,7 @@ struct netdev_queue *netdev_pick_tx(struct net_device *dev,
  *      the BH enable code must have IRQs enabled so that it will not deadlock.
  *          --BLG
  */
-static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
+static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
 {
 	struct net_device *dev = skb->dev;
 	struct netdev_queue *txq;
@@ -3618,7 +3631,7 @@ static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
 	else
 		skb_dst_force(skb);
 
-	txq = netdev_pick_tx(dev, skb, accel_priv);
+	txq = netdev_pick_tx(dev, skb, sb_dev);
 	q = rcu_dereference_bh(txq->qdisc);
 
 	trace_net_dev_queue(skb);
@@ -3692,9 +3705,9 @@ int dev_queue_xmit(struct sk_buff *skb)
 }
 EXPORT_SYMBOL(dev_queue_xmit);
 
-int dev_queue_xmit_accel(struct sk_buff *skb, void *accel_priv)
+int dev_queue_xmit_accel(struct sk_buff *skb, struct net_device *sb_dev)
 {
-	return __dev_queue_xmit(skb, accel_priv);
+	return __dev_queue_xmit(skb, sb_dev);
 }
 EXPORT_SYMBOL(dev_queue_xmit_accel);
 

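For readers following the skb_tx_hash() hunk above, here is a minimal userspace sketch of the queue arithmetic as it looks after the change. It is not the kernel code: struct txq_range, scale_to_range() and pick_tx_queue() are hypothetical names made up for illustration, and a negative rx_queue stands in for "no Rx queue recorded". scale_to_range() assumes the usual multiply-and-shift form of the kernel's reciprocal_scale(). The point it tries to show is that the recorded-Rx-queue path now wraps within the traffic class's queue count and adds the class offset, instead of returning early with an index relative to queue 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one tc_to_txq[] entry of the subordinate device. */
struct txq_range {
	uint16_t offset;	/* first Tx queue belonging to the traffic class */
	uint16_t count;		/* number of Tx queues in the traffic class */
};

/* Map a 32-bit value into [0, ep_ro) with a multiply and shift (assumed
 * to mirror the kernel's reciprocal_scale()).
 */
static uint32_t scale_to_range(uint32_t val, uint32_t ep_ro)
{
	return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
}

/* Pick a Tx queue inside the given range.
 * rx_queue < 0 means "no Rx queue recorded" (an assumption of this sketch).
 */
static uint16_t pick_tx_queue(struct txq_range range, int rx_queue,
			      uint32_t flow_hash)
{
	assert(range.count > 0);

	if (rx_queue >= 0) {
		uint32_t hash = (uint32_t)rx_queue;

		/* Wrap the recorded Rx queue into the class's queue count. */
		while (hash >= range.count)
			hash -= range.count;

		/* After the patch the class offset is applied here as well. */
		return (uint16_t)(hash + range.offset);
	}

	/* No recorded Rx queue: spread flows over the class by flow hash. */
	return (uint16_t)(scale_to_range(flow_hash, range.count) + range.offset);
}
```

With a class covering queues 8 to 11 (offset 8, count 4), a recorded Rx queue of 5 wraps to 1 and lands on Tx queue 9. A flow hash of 0 lands on queue 8, the start of the range.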
^ permalink raw reply related

