Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next] net/tls: prevent skb_orphan() from leaking TLS plain text with offload
From: David Miller @ 2019-08-03  0:24 UTC (permalink / raw)
  To: jakub.kicinski
  Cc: netdev, oss-drivers, edumazet, borisp, aviadye, davejwatson,
	john.fastabend, daniel, willemb, xiyou.wangcong, fw,
	alexei.starovoitov
In-Reply-To: <20190730211258.13748-1-jakub.kicinski@netronome.com>

From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Tue, 30 Jul 2019 14:12:58 -0700

> I'm sending this for net-next because of lack of confidence
> in my own abilities. It should apply cleanly to net... :)

It looks like there will be changes to this.

^ permalink raw reply

* Re: [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately
From: Jakub Kicinski @ 2019-08-03  0:21 UTC (permalink / raw)
  To: wenxu; +Cc: jiri, pablo, fw, netfilter-devel, netdev, John Hurley
In-Reply-To: <45660f1e-b6a8-1bcb-0d57-7c1790d3fbaf@ucloud.cn>

On Sat, 3 Aug 2019 07:19:31 +0800, wenxu wrote:
> > Or:
> >
> > device unregister:
> >   - nft block destroy
> >     - UNBIND cb
> >       - free driver's block state
> >   - driver notifier callback
> >     - free driver's state
> >
> > No?  
> 
> For the second case maybe can't unbind cb? because the nft block is
> destroied. There is no way to find the block(chain) in nft.

But before the block is destroyed doesn't nft send an UNBIND event to
the drivers, always?

^ permalink raw reply

* RE: [PATCH net 1/2] net/tls: partially revert fix transition through disconnect with close
From: John Fastabend @ 2019-08-02 23:57 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, oss-drivers, edumazet, davejwatson, borisp, aviadye,
	john.fastabend, daniel, Jakub Kicinski
In-Reply-To: <20190801213602.19634-1-jakub.kicinski@netronome.com>

Jakub Kicinski wrote:
> Looks like we were slightly overzealous with the shutdown()
> cleanup. Even though the sock->sk_state can reach CLOSED again,
> socket->state will not got back to SS_UNCONNECTED once
> connections is ESTABLISHED. Meaning we will see EISCONN if
> we try to reconnect, and EINVAL if we try to listen.
> 
> Only listen sockets can be shutdown() and reused, but since
> ESTABLISHED sockets can never be re-connected() or used for
> listen() we don't need to try to clean up the ULP state early.
> 
> Fixes: 32857cf57f92 ("net/tls: fix transition through disconnect with close")
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> ---

Thanks Jakub this looks reasonable to me I'm going to run some tests
with sockmap + ktls on it this weekend to be sure the two work
together with some of our applications. I believe the original
series should be enough so that BPF can be safely unloaded out
from underneath ktls now.

I guess we could do something similar on the sockmap side but I
do want to loosen the restrictions there at some point so might
be best just to keep it as is.

Thanks,
John

^ permalink raw reply

* [PATCH bpf-next 2/2] selftests/bpf: add loop test 5
From: Alexei Starovoitov @ 2019-08-02 23:33 UTC (permalink / raw)
  To: davem; +Cc: daniel, netdev, bpf, kernel-team
In-Reply-To: <20190802233344.863418-1-ast@kernel.org>

Add a test with multiple exit conditions.
It's not an infinite loop only when the verifier can properly track
all math on variable 'i' through all possible ways of executing this loop.

barrier()s are needed to disable llvm optimization that combines multiple
branches into fewer branches.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 .../bpf/prog_tests/bpf_verif_scale.c          |  1 +
 tools/testing/selftests/bpf/progs/loop5.c     | 37 +++++++++++++++++++
 2 files changed, 38 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/loop5.c

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
index 757e39540eda..29615a4a9362 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
@@ -72,6 +72,7 @@ void test_bpf_verif_scale(void)
 		{ "loop1.o", BPF_PROG_TYPE_RAW_TRACEPOINT },
 		{ "loop2.o", BPF_PROG_TYPE_RAW_TRACEPOINT },
 		{ "loop4.o", BPF_PROG_TYPE_RAW_TRACEPOINT },
+		{ "loop5.o", BPF_PROG_TYPE_RAW_TRACEPOINT },
 
 		/* partial unroll. 19k insn in a loop.
 		 * Total program size 20.8k insn.
diff --git a/tools/testing/selftests/bpf/progs/loop5.c b/tools/testing/selftests/bpf/progs/loop5.c
new file mode 100644
index 000000000000..9d9817efe208
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/loop5.c
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2019 Facebook
+#include <linux/sched.h>
+#include <linux/ptrace.h>
+#include <stdint.h>
+#include <stddef.h>
+#include <stdbool.h>
+#include <linux/bpf.h>
+#include "bpf_helpers.h"
+#define barrier() __asm__ __volatile__("": : :"memory")
+
+char _license[] SEC("license") = "GPL";
+
+SEC("socket")
+int while_true(volatile struct __sk_buff* skb)
+{
+	int i = 0;
+
+	while (true) {
+		if (skb->len)
+			i += 3;
+		else
+			i += 7;
+		if (i == 9)
+			break;
+		barrier();
+		if (i == 10)
+			break;
+		barrier();
+		if (i == 13)
+			break;
+		barrier();
+		if (i == 14)
+			break;
+	}
+	return i;
+}
-- 
2.20.0


^ permalink raw reply related

* [PATCH bpf-next 0/2] selftests/bpf: more loop tests
From: Alexei Starovoitov @ 2019-08-02 23:33 UTC (permalink / raw)
  To: davem; +Cc: daniel, netdev, bpf, kernel-team

Add two bounded loop tests.

Alexei Starovoitov (2):
  selftests/bpf: add loop test 4
  selftests/bpf: add loop test 5

 .../bpf/prog_tests/bpf_verif_scale.c          |  2 +
 tools/testing/selftests/bpf/progs/loop4.c     | 23 ++++++++++++
 tools/testing/selftests/bpf/progs/loop5.c     | 37 +++++++++++++++++++
 3 files changed, 62 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/loop4.c
 create mode 100644 tools/testing/selftests/bpf/progs/loop5.c

-- 
2.20.0


^ permalink raw reply

* [PATCH bpf-next 1/2] selftests/bpf: add loop test 4
From: Alexei Starovoitov @ 2019-08-02 23:33 UTC (permalink / raw)
  To: davem; +Cc: daniel, netdev, bpf, kernel-team
In-Reply-To: <20190802233344.863418-1-ast@kernel.org>

Add a test that returns a 'random' number between [0, 2^20)
If state pruning is not working correctly for loop body the number of
processed insns will be 2^20 * num_of_insns_in_loop_body and the program
will be rejected.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 .../bpf/prog_tests/bpf_verif_scale.c          |  1 +
 tools/testing/selftests/bpf/progs/loop4.c     | 23 +++++++++++++++++++
 2 files changed, 24 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/loop4.c

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
index b4be96162ff4..757e39540eda 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
@@ -71,6 +71,7 @@ void test_bpf_verif_scale(void)
 
 		{ "loop1.o", BPF_PROG_TYPE_RAW_TRACEPOINT },
 		{ "loop2.o", BPF_PROG_TYPE_RAW_TRACEPOINT },
+		{ "loop4.o", BPF_PROG_TYPE_RAW_TRACEPOINT },
 
 		/* partial unroll. 19k insn in a loop.
 		 * Total program size 20.8k insn.
diff --git a/tools/testing/selftests/bpf/progs/loop4.c b/tools/testing/selftests/bpf/progs/loop4.c
new file mode 100644
index 000000000000..3e7ee14fddbd
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/loop4.c
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2019 Facebook
+#include <linux/sched.h>
+#include <linux/ptrace.h>
+#include <stdint.h>
+#include <stddef.h>
+#include <stdbool.h>
+#include <linux/bpf.h>
+#include "bpf_helpers.h"
+
+char _license[] SEC("license") = "GPL";
+
+SEC("socket")
+int combinations(volatile struct __sk_buff* skb)
+{
+	int ret = 0, i;
+
+#pragma nounroll
+	for (i = 0; i < 20; i++)
+		if (skb->len)
+			ret |= 1 << i;
+	return ret;
+}
-- 
2.20.0


^ permalink raw reply related

* Re: [PATCH] net: sctp: Rename fallthrough label to unhandled
From: Joe Perches @ 2019-08-02 23:26 UTC (permalink / raw)
  To: David Miller
  Cc: nhorman, vyasevich, marcelo.leitner, linux-sctp, netdev,
	linux-kernel
In-Reply-To: <20190802.161932.1776993765494484851.davem@davemloft.net>

On Fri, 2019-08-02 at 16:19 -0700, David Miller wrote:
> From: Joe Perches <joe@perches.com>
> Date: Fri, 02 Aug 2019 10:47:34 -0700
> 
> > On Wed, 2019-07-31 at 08:16 -0400, Neil Horman wrote:
> >> On Wed, Jul 31, 2019 at 04:32:43AM -0700, Joe Perches wrote:
> >> > On Wed, 2019-07-31 at 07:19 -0400, Neil Horman wrote:
> >> > > On Tue, Jul 30, 2019 at 10:04:37PM -0700, Joe Perches wrote:
> >> > > > fallthrough may become a pseudo reserved keyword so this only use of
> >> > > > fallthrough is better renamed to allow it.
> > 
> > Can you or any other maintainer apply this patch
> > or ack it so David Miller can apply it?
> 
> I, like others, don't like the lack of __ in the keyword.  It's kind of
> rediculous the problems it creates to pollute the global namespace like
> that and yes also inconsistent with other shorthands for builtins.

Please add that to the conversation on the RFC thread.
https://lore.kernel.org/patchwork/patch/1108577/

In any case, fallthrough is not really a good label
name for this use.



^ permalink raw reply

* Re: [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately
From: wenxu @ 2019-08-02 23:19 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: jiri, pablo, fw, netfilter-devel, netdev, John Hurley
In-Reply-To: <20190802110216.5e1fd938@cakuba.netronome.com>


在 2019/8/3 2:02, Jakub Kicinski 写道:
> On Fri, 2 Aug 2019 21:09:03 +0800, wenxu wrote:
>>>> We'd have something like the loop in flow_get_default_block():
>>>>
>>>> 	for each (subsystem)
>>>> 		subsystem->handle_new_indir_cb(indr_dev, cb);
>>>>
>>>> And then per-subsystem logic would actually call the cb. Or:
>>>>
>>>> 	for each (subsystem)
>>>> 		block = get_default_block(indir_dev)
>>>> 		indr_dev->ing_cmd_cb(...)  
>>>             nft dev chian is also based on register_netdevice_notifier, So for unregister case,
>>>
>>> the basechian(block) of nft maybe delete before the __tc_indr_block_cb_unregister. is right?
>>>
>>> So maybe we can cache the block as a list of all the subsystem in  indr_dev ?  
>>
>> when the device is unregister the nft netdev chain related to this device will also be delete through netdevice_notifier
>>
>> . So for unregister case,the basechian(block) of nft maybe delete before the __tc_indr_block_cb_unregister.
> Hm, but I don't think that should be an issue. The ordering should be
> like one of the following two:
>
> device unregister:
>   - driver notifier callback
>     - unregister flow cb
>       - UNBIND cb
>         - free driver's block state
>     - free driver's device state
>   - nft block destroy
>     # doesn't see driver's CB any more
>
> Or:
>
> device unregister:
>   - nft block destroy
>     - UNBIND cb
>       - free driver's block state
>   - driver notifier callback
>     - free driver's state
>
> No?

For the second case maybe can't unbind cb? because the nft block is destroied. There is no way

to find the block(chain) in nft.


>
>> cache for the block is not work because the chain already be delete and free. Maybe it improve the prio of
>>
>> rep_netdev_event can help this?
> In theory the cache should work in a similar way as drivers, because
> once the indr_dev is created and the initial block is found, the cached
> value is just recorded in BIND/UNBIND calls. So if BIND/UNBIND works for
> drivers it will also put the right info in the cache.
>

^ permalink raw reply

* Re: [PATCH] net: sctp: Rename fallthrough label to unhandled
From: David Miller @ 2019-08-02 23:19 UTC (permalink / raw)
  To: joe; +Cc: nhorman, vyasevich, marcelo.leitner, linux-sctp, netdev,
	linux-kernel
In-Reply-To: <a03a23728d3b468942a20b55f70babceaec587ee.camel@perches.com>

From: Joe Perches <joe@perches.com>
Date: Fri, 02 Aug 2019 10:47:34 -0700

> On Wed, 2019-07-31 at 08:16 -0400, Neil Horman wrote:
>> On Wed, Jul 31, 2019 at 04:32:43AM -0700, Joe Perches wrote:
>> > On Wed, 2019-07-31 at 07:19 -0400, Neil Horman wrote:
>> > > On Tue, Jul 30, 2019 at 10:04:37PM -0700, Joe Perches wrote:
>> > > > fallthrough may become a pseudo reserved keyword so this only use of
>> > > > fallthrough is better renamed to allow it.
> 
> Can you or any other maintainer apply this patch
> or ack it so David Miller can apply it?

I, like others, don't like the lack of __ in the keyword.  It's kind of
rediculous the problems it creates to pollute the global namespace like
that and yes also inconsistent with other shorthands for builtins.

^ permalink raw reply

* Re: [PATCH 03/34] net/ceph: convert put_page() to put_user_page*()
From: Jeff Layton @ 2019-08-02 22:32 UTC (permalink / raw)
  To: john.hubbard, Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Ilya Dryomov, Sage Weil,
	David S . Miller
In-Reply-To: <20190802022005.5117-4-jhubbard@nvidia.com>

On Thu, 2019-08-01 at 19:19 -0700, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
> 
> For pages that were retained via get_user_pages*(), release those pages
> via the new put_user_page*() routines, instead of via put_page() or
> release_pages().
> 
> This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
> ("mm: introduce put_user_page*(), placeholder versions").
> 
> Cc: Ilya Dryomov <idryomov@gmail.com>
> Cc: Sage Weil <sage@redhat.com>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: ceph-devel@vger.kernel.org
> Cc: netdev@vger.kernel.org
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  net/ceph/pagevec.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/net/ceph/pagevec.c b/net/ceph/pagevec.c
> index 64305e7056a1..c88fff2ab9bd 100644
> --- a/net/ceph/pagevec.c
> +++ b/net/ceph/pagevec.c
> @@ -12,13 +12,7 @@
>  
>  void ceph_put_page_vector(struct page **pages, int num_pages, bool dirty)
>  {
> -	int i;
> -
> -	for (i = 0; i < num_pages; i++) {
> -		if (dirty)
> -			set_page_dirty_lock(pages[i]);
> -		put_page(pages[i]);
> -	}
> +	put_user_pages_dirty_lock(pages, num_pages, dirty);
>  	kvfree(pages);
>  }
>  EXPORT_SYMBOL(ceph_put_page_vector);

This patch looks sane enough. Assuming that the earlier patches are OK:

Acked-by: Jeff Layton <jlayton@kernel.org>


^ permalink raw reply

* Re: [PATCH net 0/2] flow_offload hardware priority fixes
From: Jakub Kicinski @ 2019-08-02 22:25 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: netfilter-devel, davem, netdev, marcelo.leitner, jiri, wenxu,
	saeedm, paulb, gerlitz.or
In-Reply-To: <20190802220409.klwdkcvjgegz6hj2@salvia>

On Sat, 3 Aug 2019 00:04:09 +0200, Pablo Neira Ayuso wrote:
> That patch removed the reference to tcf_auto_prio() already, please
> let me know if you have any more specific update you would like to see
> on that patch.

Please explain why the artificial priorities are needed at all.
Hardware should order tables based on table type - ethtool, TC, nft.
As I mentioned in the first email, and unless you can make a strong 
case against that.
Within those tables we should follow the same ordering rules as we 
do in software (modulo ethtool but ordering is pretty clear there).

^ permalink raw reply

* Re: [PATCH] isdn: hysdn: Fix error spaces around '*'
From: Joe Perches @ 2019-08-02 22:05 UTC (permalink / raw)
  To: Stephen Hemminger, Jose Carlos Cazarin Filho
  Cc: isdn, gregkh, devel, netdev, linux-kernel
In-Reply-To: <20190802145506.168b576b@hermes.lan>

On Fri, 2019-08-02 at 14:55 -0700, Stephen Hemminger wrote:
> On Fri,  2 Aug 2019 19:56:02 +0000
> Jose Carlos Cazarin Filho <joseespiriki@gmail.com> wrote:
> 
> > Fix checkpath error:
> > CHECK: spaces preferred around that '*' (ctx:WxV)
> > +extern hysdn_card *card_root;        /* pointer to first card */
> > 
> > Signed-off-by: Jose Carlos Cazarin Filho <joseespiriki@gmail.com>
> 
> Read the TODO, these drivers are scheduled for removal, so changes
> are not helpful at this time.

Maybe better to mark the MAINTAINERS entry obsolete so
checkpatch bleats a message about unnecessary changes.
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 30bf852e6d6b..b5df91032574 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8628,7 +8628,7 @@ M:	Karsten Keil <isdn@linux-pingi.de>
 L:	isdn4linux@listserv.isdn4linux.de (subscribers-only)
 L:	netdev@vger.kernel.org
 W:	http://www.isdn4linux.de
-S:	Odd Fixes
+S:	Obsolete
 F:	Documentation/isdn/
 F:	drivers/isdn/capi/
 F:	drivers/staging/isdn/


^ permalink raw reply related

* Re: [PATCH net 0/2] flow_offload hardware priority fixes
From: Pablo Neira Ayuso @ 2019-08-02 22:04 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netfilter-devel, davem, netdev, marcelo.leitner, jiri, wenxu,
	saeedm, paulb, gerlitz.or
In-Reply-To: <20190802134738.328691b4@cakuba.netronome.com>

Hi Jakub,

On Fri, Aug 02, 2019 at 01:47:38PM -0700, Jakub Kicinski wrote:
> On Fri, 2 Aug 2019 13:00:23 +0200, Pablo Neira Ayuso wrote:
> > Hi Jakub,
> > 
> > If the user specifies 'pref' in the new rule, then tc checks if there
> > is a tcf_proto object that matches this priority. If the tcf_proto
> > object does not exist, tc creates a tcf_proto object and it adds the
> > new rule to this tcf_proto.
> > 
> > In cls_flower, each tcf_proto only stores one single rule, so if the
> > user tries to add another rule with the same 'pref', cls_flower
> > returns EEXIST.
> 
> So you're saying this doesn't work?
> 
> ip link add type dummy
> tc qdisc add dev dummy0 clsact
> tc filter add dev dummy0 ingress protocol ipv6 prio 123 flower src_ip 1111::1 action drop
> tc filter add dev dummy0 ingress protocol ipv6 prio 123 flower src_ip 1111::2 action drop

This works indeed as you describe.

I'll go back to the original netfilter basechain priority patch then:

https://patchwork.ozlabs.org/patch/1140412/

That patch removed the reference to tcf_auto_prio() already, please
let me know if you have any more specific update you would like to see
on that patch.

Thanks.

^ permalink raw reply

* Re: [PATCH v3 bpf-next 02/12] libbpf: implement BPF CO-RE offset relocation algorithm
From: Alexei Starovoitov @ 2019-08-02 21:56 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Yonghong Song, Song Liu, Kernel Team
In-Reply-To: <CAEf4BzarjODxo5c-UKtCL_dGGNb1m-3QPAGGR0eq_0tcZVMt8g@mail.gmail.com>

On Fri, Aug 02, 2019 at 12:16:52AM -0700, Andrii Nakryiko wrote:
> On Thu, Aug 1, 2019 at 4:50 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Wed, Jul 31, 2019 at 11:47:53PM -0700, Andrii Nakryiko wrote:
> > > This patch implements the core logic for BPF CO-RE offsets relocations.
> > > Every instruction that needs to be relocated has corresponding
> > > bpf_offset_reloc as part of BTF.ext. Relocations are performed by trying
> > > to match recorded "local" relocation spec against potentially many
> > > compatible "target" types, creating corresponding spec. Details of the
> > > algorithm are noted in corresponding comments in the code.
> > >
> > > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > > Acked-by: Song Liu <songliubraving@fb.com>
> > ...
> > > +             if (btf_is_composite(t)) {
> > > +                     const struct btf_member *m = (void *)(t + 1);
> > > +                     __u32 offset;
> > > +
> > > +                     if (access_idx >= BTF_INFO_VLEN(t->info))
> > > +                             return -EINVAL;
> > > +
> > > +                     m = &m[access_idx];
> > > +
> > > +                     if (BTF_INFO_KFLAG(t->info)) {
> > > +                             if (BTF_MEMBER_BITFIELD_SIZE(m->offset))
> > > +                                     return -EINVAL;
> > > +                             offset = BTF_MEMBER_BIT_OFFSET(m->offset);
> > > +                     } else {
> > > +                             offset = m->offset;
> > > +                     }
> >
> > very similar logic exists in btf_dump.c
> > probably makes sense to make a common helper at some point.
> 
> Will add btf_member_bit_offset(type, member) and
> btf_member_bit_size(type, member).
> 
> >
> > > +static size_t bpf_core_essential_name_len(const char *name)
> > > +{
> > > +     size_t n = strlen(name);
> > > +     int i = n - 3;
> > > +
> > > +     while (i > 0) {
> > > +             if (name[i] == '_' && name[i + 1] == '_' && name[i + 2] == '_')
> > > +                     return i;
> > > +             i--;
> > > +     }
> > > +     return n;
> > > +}
> >
> > that's a bit of an eye irritant. How about?
> >         size_t n = strlen(name);
> >         int i, cnt = 0;
> >
> >         for (i = n - 1; i >= 0; i--) {
> >                 if (name[i] == '_') {
> >                     cnt++;
> >                 } else {
> >                    if (cnt == 3)
> >                       return i + 1;
> >                    cnt = 0;
> >                 }
> >         }
> >         return n;
> 
> I find this one much harder to read and understand. What's
> eye-irritating about that loop?
> 
> Your loop will also handle `a____b` differently. My version will
> return "a_" as essential name, yours "a____b". Was this intentional on
> your part?

hmm. I think both will return sizeof("a") == 1

> I'd rather use this instead, if you hate the first one:
> 
> size_t n = strlen(name);
> int i;
> 
> for (i = n - 3; i > 0; i--) {
>     if (strncmp(name + i, "___", 3) == 0)
>         return i;
> }
> 
> Is this better?

that is worse.
What I don't like about it is that every byte is
compared N=sizeof(string-to-found) times.
I guess it's not such a big performance criticial path,
but libbpf has to keep the bar high.

> >
> > > +     case BTF_KIND_ARRAY: {
> > > +             const struct btf_array *loc_a, *targ_a;
> > > +
> > > +             loc_a = (void *)(local_type + 1);
> > > +             targ_a = (void *)(targ_type + 1);
> > > +             local_id = loc_a->type;
> > > +             targ_id = targ_a->type;
> >
> > can we add a helper like:
> 
> Yes, we can. I was thinking about that, but decided to not expand
> patch set. But we do need to extract all those small, but nice
> helpers. I'll put them in libbpf_internal.h for now, but I think it
> might be good idea to expose them as part of btf.h. Thoughts?

part of btf.h make sense to me.

> 
> > const struct btf_array *btf_array(cosnt struct btf_type *t)
> > {
> >         return (const struct btf_array *)(t + 1);
> > }
> >
> > then above will be:
> >         case BTF_KIND_ARRAY: {
> >                 local_id = btf_array(local_type)->type;
> >                 targ_id = btf_array(targ_type)->type;
> >
> > and a bunch of code in btf.c and btf_dump.c would be cleaner as well?
> 
> Yep, some of those are already scattered around btf.c and btf_dump.c,
> will clean up and add patch to this patch set.
> 
> >
> > > +             goto recur;
> > > +     }
> > > +     default:
> > > +             pr_warning("unexpected kind %d relocated, local [%d], target [%d]\n",
> > > +                        kind, local_id, targ_id);
> > > +             return 0;
> > > +     }
> > > +}
> > > +
> > > +/*
> > > + * Given single high-level named field accessor in local type, find
> > > + * corresponding high-level accessor for a target type. Along the way,
> > > + * maintain low-level spec for target as well. Also keep updating target
> > > + * offset.
> > > + *
> > > + * Searching is performed through recursive exhaustive enumeration of all
> > > + * fields of a struct/union. If there are any anonymous (embedded)
> > > + * structs/unions, they are recursively searched as well. If field with
> > > + * desired name is found, check compatibility between local and target types,
> > > + * before returning result.
> > > + *
> > > + * 1 is returned, if field is found.
> > > + * 0 is returned if no compatible field is found.
> > > + * <0 is returned on error.
> > > + */
> > > +static int bpf_core_match_member(const struct btf *local_btf,
> > > +                              const struct bpf_core_accessor *local_acc,
> > > +                              const struct btf *targ_btf,
> > > +                              __u32 targ_id,
> > > +                              struct bpf_core_spec *spec,
> > > +                              __u32 *next_targ_id)
> > > +{
> > > +     const struct btf_type *local_type, *targ_type;
> > > +     const struct btf_member *local_member, *m;
> > > +     const char *local_name, *targ_name;
> > > +     __u32 local_id;
> > > +     int i, n, found;
> > > +
> > > +     targ_type = skip_mods_and_typedefs(targ_btf, targ_id, &targ_id);
> > > +     if (!targ_type)
> > > +             return -EINVAL;
> > > +     if (!btf_is_composite(targ_type))
> > > +             return 0;
> > > +
> > > +     local_id = local_acc->type_id;
> > > +     local_type = btf__type_by_id(local_btf, local_id);
> > > +     local_member = (void *)(local_type + 1);
> > > +     local_member += local_acc->idx;
> > > +     local_name = btf__name_by_offset(local_btf, local_member->name_off);
> > > +
> > > +     n = BTF_INFO_VLEN(targ_type->info);
> > > +     m = (void *)(targ_type + 1);
> >
> > new btf_member() helper?
> >
> > > +     for (i = 0; i < n; i++, m++) {
> > > +             __u32 offset;
> > > +
> > > +             /* bitfield relocations not supported */
> > > +             if (BTF_INFO_KFLAG(targ_type->info)) {
> > > +                     if (BTF_MEMBER_BITFIELD_SIZE(m->offset))
> > > +                             continue;
> > > +                     offset = BTF_MEMBER_BIT_OFFSET(m->offset);
> > > +             } else {
> > > +                     offset = m->offset;
> > > +             }
> > > +             if (offset % 8)
> > > +                     continue;
> >
> > same bit of code again?
> > definitely could use a helper.
> 
> Different handling (continue here, return error above), but can use
> those helpers I mentioned above.
> 
> >
> > > +     for (i = 0; i < local_spec->len; i++, local_acc++, targ_acc++) {
> > > +             targ_type = skip_mods_and_typedefs(targ_spec->btf, targ_id,
> > > +                                                &targ_id);
> > > +             if (!targ_type)
> > > +                     return -EINVAL;
> > > +
> > > +             if (local_acc->name) {
> > > +                     matched = bpf_core_match_member(local_spec->btf,
> > > +                                                     local_acc,
> > > +                                                     targ_btf, targ_id,
> > > +                                                     targ_spec, &targ_id);
> > > +                     if (matched <= 0)
> > > +                             return matched;
> > > +             } else {
> > > +                     /* for i=0, targ_id is already treated as array element
> > > +                      * type (because it's the original struct), for others
> > > +                      * we should find array element type first
> > > +                      */
> > > +                     if (i > 0) {
> > > +                             const struct btf_array *a;
> > > +
> > > +                             if (!btf_is_array(targ_type))
> > > +                                     return 0;
> > > +
> > > +                             a = (void *)(targ_type + 1);
> > > +                             if (local_acc->idx >= a->nelems)
> > > +                                     return 0;
> >
> > am I reading it correctly that the local spec requested out-of-bounds
> > index in the target array type?
> > Why this is 'ignore' instead of -EINVAL?
> 
> Similar to any other mismatch (e.g., int in local type vs int64 in
> target type). It just makes candidate not matching. Why would that be
> error that will stop the whole relocation and subsequently object
> loading process?

Did the field name match or this is for anon types?
I've read it as names matched and type miscompared.

> 
> >
> > > +/*
> > > + * Probe few well-known locations for vmlinux kernel image and try to load BTF
> > > + * data out of it to use for target BTF.
> > > + */
> > > +static struct btf *bpf_core_find_kernel_btf(void)
> > > +{
> > > +     const char *locations[] = {
> > > +             "/lib/modules/%1$s/vmlinux-%1$s",
> > > +             "/usr/lib/modules/%1$s/kernel/vmlinux",
> > > +     };
> > > +     char path[PATH_MAX + 1];
> > > +     struct utsname buf;
> > > +     struct btf *btf;
> > > +     int i, err;
> > > +
> > > +     err = uname(&buf);
> > > +     if (err) {
> > > +             pr_warning("failed to uname(): %d\n", err);
> >
> > defensive programming ?
> > I think uname() can fail only if &buf points to non-existing page like null.
> 
> I haven't checked source for this syscall, but man page specified that
> it might return -1 on error.

man page says that it can only return EFAULT.

> 
> >
> > > +             return ERR_PTR(err);
> > > +     }
> > > +
> > > +     for (i = 0; i < ARRAY_SIZE(locations); i++) {
> > > +             snprintf(path, PATH_MAX, locations[i], buf.release);
> > > +             pr_debug("attempting to load kernel BTF from '%s'\n", path);
> >
> > I think this debug message would have been more useful after access().
> 
> Sure, will move.
> 
> >
> > > +
> > > +             if (access(path, R_OK))
> > > +                     continue;
> > > +
> > > +             btf = btf__parse_elf(path, NULL);
> > > +             if (IS_ERR(btf))
> > > +                     continue;
> > > +
> > > +             pr_debug("successfully loaded kernel BTF from '%s'\n", path);
> > > +             return btf;
> > > +     }
> > > +
> > > +     pr_warning("failed to find valid kernel BTF\n");
> > > +     return ERR_PTR(-ESRCH);
> > > +}
> > > +
> > > +/* Output spec definition in the format:
> > > + * [<type-id>] (<type-name>) + <raw-spec> => <offset>@<spec>,
> > > + * where <spec> is a C-syntax view of recorded field access, e.g.: x.a[3].b
> > > + */
> > > +static void bpf_core_dump_spec(int level, const struct bpf_core_spec *spec)
> > > +{
> > > +     const struct btf_type *t;
> > > +     const char *s;
> > > +     __u32 type_id;
> > > +     int i;
> > > +
> > > +     type_id = spec->spec[0].type_id;
> > > +     t = btf__type_by_id(spec->btf, type_id);
> > > +     s = btf__name_by_offset(spec->btf, t->name_off);
> > > +     libbpf_print(level, "[%u] (%s) + ", type_id, s);
> >
> > imo extra []() don't improve readability of the dump.
> 
> [<num>] is the general convention I've been using throughout libbpf to
> specify type ID, so I'd rather keep it for consistency. I can drop
> parens, though, no problem.
> 
> >
> > > +
> > > +     for (i = 0; i < spec->raw_len; i++)
> > > +             libbpf_print(level, "%d%s", spec->raw_spec[i],
> > > +                          i == spec->raw_len - 1 ? " => " : ":");
> > > +
> > > +     libbpf_print(level, "%u @ &x", spec->offset);
> > > +
> > > +     for (i = 0; i < spec->len; i++) {
> > > +             if (spec->spec[i].name)
> > > +                     libbpf_print(level, ".%s", spec->spec[i].name);
> > > +             else
> > > +                     libbpf_print(level, "[%u]", spec->spec[i].idx);
> > > +     }
> > > +
> > > +}
> > > +
> > > +static size_t bpf_core_hash_fn(const void *key, void *ctx)
> > > +{
> > > +     return (size_t)key;
> > > +}
> > > +
> > > +static bool bpf_core_equal_fn(const void *k1, const void *k2, void *ctx)
> > > +{
> > > +     return k1 == k2;
> > > +}
> > > +
> > > +static void *u32_to_ptr(__u32 x)
> > > +{
> > > +     return (void *)(uintptr_t)x;
> > > +}
> >
> > u32 to pointer on 64-bit arch?!
> > That surely needs a comment.
> 
> I should probably call it u32_to_hash_key() to make it obvious it's
> conversion to hashmap's generic `void *` key type.
> 
> >
> > > +
> > > +/*
> > > + * CO-RE relocate single instruction.
> > > + *
> > > + * The outline and important points of the algorithm:
> > > + * 1. For given local type, find corresponding candidate target types.
> > > + *    Candidate type is a type with the same "essential" name, ignoring
> > > + *    everything after last triple underscore (___). E.g., `sample`,
> > > + *    `sample___flavor_one`, `sample___flavor_another_one`, are all candidates
> > > + *    for each other. Names with triple underscore are referred to as
> > > + *    "flavors" and are useful, among other things, to allow to
> > > + *    specify/support incompatible variations of the same kernel struct, which
> > > + *    might differ between different kernel versions and/or build
> > > + *    configurations.
> > > + *
> > > + *    N.B. Struct "flavors" could be generated by bpftool's BTF-to-C
> > > + *    converter, when deduplicated BTF of a kernel still contains more than
> > > + *    one different types with the same name. In that case, ___2, ___3, etc
> > > + *    are appended starting from second name conflict. But start flavors are
> > > + *    also useful to be defined "locally", in BPF program, to extract same
> > > + *    data from incompatible changes between different kernel
> > > + *    versions/configurations. For instance, to handle field renames between
> > > + *    kernel versions, one can use two flavors of the struct name with the
> > > + *    same common name and use conditional relocations to extract that field,
> > > + *    depending on target kernel version.
> >
> > there are actual kernel types that have ___ in the name.
> > Ex: struct lmc___media
> > We probably need to revisit this 'flavor' convention.
> 
> There are only these:
> - lmc___softc
> - lmc___media
> - lmc___ctl (all three in drivers/net/wan/lmc/lmc_var.h)
> - ____ftrace_##name set of structs
> 
> I couldn't come up with anything cleaner-looking. I think we can still
> keep ___ convention, but:
> 
> 1. Match only exactly 3 underscores, delimited by non-underscore from
> both sides (so similar to your proposed loop above);
> 2. We can also try matching candidates assuming full name without
> ___xxx part removed, in addition to current logic. This seems like an
> overkill at this point and unlikely to be useful in practice, so I'd
> postpone implementing this until we really need it.
> 
> What do you think? Which other convention did you have in mind?

may be match ___[0-9]+ instead for now?
Not as flexible, but user supplied "flavors" is not an immediate task.

> 
> >
> > > +     for (i = 0, j = 0; i < cand_ids->len; i++) {
> > > +             cand_id = cand_ids->data[i];
> > > +             cand_type = btf__type_by_id(targ_btf, cand_id);
> > > +             cand_name = btf__name_by_offset(targ_btf, cand_type->name_off);
> > > +
> > > +             err = bpf_core_spec_match(&local_spec, targ_btf,
> > > +                                       cand_id, &cand_spec);
> > > +             if (err < 0) {
> > > +                     pr_warning("prog '%s': relo #%d: failed to match spec ",
> > > +                                prog_name, relo_idx);
> > > +                     bpf_core_dump_spec(LIBBPF_WARN, &local_spec);
> > > +                     libbpf_print(LIBBPF_WARN,
> > > +                                  " to candidate #%d [%d] (%s): %d\n",
> > > +                                  i, cand_id, cand_name, err);
> > > +                     return err;
> > > +             }
> > > +             if (err == 0) {
> > > +                     pr_debug("prog '%s': relo #%d: candidate #%d [%d] (%s) doesn't match spec ",
> > > +                              prog_name, relo_idx, i, cand_id, cand_name);
> > > +                     bpf_core_dump_spec(LIBBPF_DEBUG, &local_spec);
> > > +                     libbpf_print(LIBBPF_DEBUG, "\n");
> > > +                     continue;
> > > +             }
> > > +
> > > +             pr_debug("prog '%s': relo #%d: candidate #%d matched as spec ",
> > > +                      prog_name, relo_idx, i);
> >
> > did you mention that you're going to make a helper for this debug dumps?
> 
> yeah, I added bpf_core_dump_spec(), but I don't know how to shorten
> this further... This output is extremely useful to understand what's
> happening and will be invaluable when users will inevitably report
> confusing behavior in some cases, so I still want to keep it.

not sure yet. Just pointing out that this function has more debug printfs
than actual code which doesn't look right.
We have complex algorithms in the kernel (like verifier).
Yet we don't sprinkle printfs in there to this degree.


^ permalink raw reply

* Re: [PATCH] isdn: hysdn: Fix error spaces around '*'
From: Stephen Hemminger @ 2019-08-02 21:55 UTC (permalink / raw)
  To: Jose Carlos Cazarin Filho; +Cc: isdn, gregkh, devel, netdev, linux-kernel
In-Reply-To: <20190802195602.28414-1-joseespiriki@gmail.com>

On Fri,  2 Aug 2019 19:56:02 +0000
Jose Carlos Cazarin Filho <joseespiriki@gmail.com> wrote:

> Fix checkpath error:
> CHECK: spaces preferred around that '*' (ctx:WxV)
> +extern hysdn_card *card_root;        /* pointer to first card */
> 
> Signed-off-by: Jose Carlos Cazarin Filho <joseespiriki@gmail.com>


Read the TODO, these drivers are scheduled for removal, so changes
are not helpful at this time.

^ permalink raw reply

* Re: [PATCH] isdn: hysdn: fix code style error from checkpatch
From: Stephen Hemminger @ 2019-08-02 21:54 UTC (permalink / raw)
  To: Ricardo Bruno Lopes da Silva
  Cc: isdn, gregkh, netdev, devel, linux-kernel, lkcamp
In-Reply-To: <20190802195017.27845-1-ricardo6142@gmail.com>

On Fri,  2 Aug 2019 19:50:17 +0000
Ricardo Bruno Lopes da Silva <ricardo6142@gmail.com> wrote:

> Fix error bellow from checkpatch.
> 
> WARNING: Block comments use * on subsequent lines
> +/***********************************************************
> +
> 
> Signed-off-by: Ricardo Bruno Lopes da Silva <ricardo6142@gmail.com>

Read the TODO, these drivers are scheduled for removal, so changes
are not helpful at this time.

^ permalink raw reply

* Re: [PATCH v4 net-next 18/19] ionic: Add coalesce and other features
From: Shannon Nelson @ 2019-08-02 21:48 UTC (permalink / raw)
  To: Saeed Mahameed, netdev@vger.kernel.org, davem@davemloft.net
In-Reply-To: <84f9a5438585a2274df162f6554504138e276d71.camel@mellanox.com>

On 7/25/19 5:13 PM, Saeed Mahameed wrote:
> On Mon, 2019-07-22 at 14:40 -0700, Shannon Nelson wrote:

>> +static void ionic_tx_timeout_work(struct work_struct *ws)
>> +{
>> +	struct lif *lif = container_of(ws, struct lif,
>> tx_timeout_work);
>> +
>> +	netdev_info(lif->netdev, "Tx Timeout recovery\n");
>> +	ionic_reset_queues(lif);
> missing rtnl_lock ?
>
>> +}
>> +
>>   static void ionic_tx_timeout(struct net_device *netdev)
>>   {
>> -	netdev_info(netdev, "%s: stubbed\n", __func__);
>> +	struct lif *lif = netdev_priv(netdev);
>> +
>> +	schedule_work(&lif->tx_timeout_work);
>>   }
> missing cancel work ? be careful when combined with the rtnl_lockthough ..
>
>

Yep, good catch on both.  I'll take care of those.

sln


^ permalink raw reply

* [PATCH bpf-next v2 2/3] selftests/bpf: test_progs: test__printf -> printf
From: Stanislav Fomichev @ 2019-08-02 21:11 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko
In-Reply-To: <20190802211108.90739-1-sdf@google.com>

Now that test__printf is a simple wraper around printf, let's drop it
(and test__vprintf as well).

Cc: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../selftests/bpf/prog_tests/bpf_verif_scale.c   |  4 ++--
 .../testing/selftests/bpf/prog_tests/l4lb_all.c  |  2 +-
 .../testing/selftests/bpf/prog_tests/map_lock.c  | 10 +++++-----
 .../selftests/bpf/prog_tests/send_signal.c       |  4 ++--
 .../testing/selftests/bpf/prog_tests/spinlock.c  |  2 +-
 .../bpf/prog_tests/stacktrace_build_id.c         |  4 ++--
 .../bpf/prog_tests/stacktrace_build_id_nmi.c     |  4 ++--
 .../selftests/bpf/prog_tests/xdp_noinline.c      |  4 ++--
 tools/testing/selftests/bpf/test_progs.c         | 16 +---------------
 tools/testing/selftests/bpf/test_progs.h         | 10 ++++------
 10 files changed, 22 insertions(+), 38 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
index b4be96162ff4..3548ba2f24a8 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
@@ -5,13 +5,13 @@ static int libbpf_debug_print(enum libbpf_print_level level,
 			      const char *format, va_list args)
 {
 	if (level != LIBBPF_DEBUG) {
-		test__vprintf(format, args);
+		vprintf(format, args);
 		return 0;
 	}
 
 	if (!strstr(format, "verifier log"))
 		return 0;
-	test__vprintf("%s", args);
+	vprintf("%s", args);
 	return 0;
 }
 
diff --git a/tools/testing/selftests/bpf/prog_tests/l4lb_all.c b/tools/testing/selftests/bpf/prog_tests/l4lb_all.c
index 5ce572c03a5f..20ddca830e68 100644
--- a/tools/testing/selftests/bpf/prog_tests/l4lb_all.c
+++ b/tools/testing/selftests/bpf/prog_tests/l4lb_all.c
@@ -74,7 +74,7 @@ static void test_l4lb(const char *file)
 	}
 	if (bytes != MAGIC_BYTES * NUM_ITER * 2 || pkts != NUM_ITER * 2) {
 		error_cnt++;
-		test__printf("test_l4lb:FAIL:stats %lld %lld\n", bytes, pkts);
+		printf("test_l4lb:FAIL:stats %lld %lld\n", bytes, pkts);
 	}
 out:
 	bpf_object__close(obj);
diff --git a/tools/testing/selftests/bpf/prog_tests/map_lock.c b/tools/testing/selftests/bpf/prog_tests/map_lock.c
index 2e78217ed3fd..ee99368c595c 100644
--- a/tools/testing/selftests/bpf/prog_tests/map_lock.c
+++ b/tools/testing/selftests/bpf/prog_tests/map_lock.c
@@ -9,12 +9,12 @@ static void *parallel_map_access(void *arg)
 	for (i = 0; i < 10000; i++) {
 		err = bpf_map_lookup_elem_flags(map_fd, &key, vars, BPF_F_LOCK);
 		if (err) {
-			test__printf("lookup failed\n");
+			printf("lookup failed\n");
 			error_cnt++;
 			goto out;
 		}
 		if (vars[0] != 0) {
-			test__printf("lookup #%d var[0]=%d\n", i, vars[0]);
+			printf("lookup #%d var[0]=%d\n", i, vars[0]);
 			error_cnt++;
 			goto out;
 		}
@@ -22,8 +22,8 @@ static void *parallel_map_access(void *arg)
 		for (j = 2; j < 17; j++) {
 			if (vars[j] == rnd)
 				continue;
-			test__printf("lookup #%d var[1]=%d var[%d]=%d\n",
-				     i, rnd, j, vars[j]);
+			printf("lookup #%d var[1]=%d var[%d]=%d\n",
+			       i, rnd, j, vars[j]);
 			error_cnt++;
 			goto out;
 		}
@@ -43,7 +43,7 @@ void test_map_lock(void)
 
 	err = bpf_prog_load(file, BPF_PROG_TYPE_CGROUP_SKB, &obj, &prog_fd);
 	if (err) {
-		test__printf("test_map_lock:bpf_prog_load errno %d\n", errno);
+		printf("test_map_lock:bpf_prog_load errno %d\n", errno);
 		goto close_prog;
 	}
 	map_fd[0] = bpf_find_map(__func__, obj, "hash_map");
diff --git a/tools/testing/selftests/bpf/prog_tests/send_signal.c b/tools/testing/selftests/bpf/prog_tests/send_signal.c
index 461b423d0584..1575f0a1f586 100644
--- a/tools/testing/selftests/bpf/prog_tests/send_signal.c
+++ b/tools/testing/selftests/bpf/prog_tests/send_signal.c
@@ -202,8 +202,8 @@ static int test_send_signal_nmi(void)
 			 -1 /* cpu */, -1 /* group_fd */, 0 /* flags */);
 	if (pmu_fd == -1) {
 		if (errno == ENOENT) {
-			test__printf("%s:SKIP:no PERF_COUNT_HW_CPU_CYCLES\n",
-				     __func__);
+			printf("%s:SKIP:no PERF_COUNT_HW_CPU_CYCLES\n",
+			       __func__);
 			return 0;
 		}
 		/* Let the test fail with a more informative message */
diff --git a/tools/testing/selftests/bpf/prog_tests/spinlock.c b/tools/testing/selftests/bpf/prog_tests/spinlock.c
index deb2db5b85b0..114ebe6a438e 100644
--- a/tools/testing/selftests/bpf/prog_tests/spinlock.c
+++ b/tools/testing/selftests/bpf/prog_tests/spinlock.c
@@ -12,7 +12,7 @@ void test_spinlock(void)
 
 	err = bpf_prog_load(file, BPF_PROG_TYPE_CGROUP_SKB, &obj, &prog_fd);
 	if (err) {
-		test__printf("test_spin_lock:bpf_prog_load errno %d\n", errno);
+		printf("test_spin_lock:bpf_prog_load errno %d\n", errno);
 		goto close_prog;
 	}
 	for (i = 0; i < 4; i++)
diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c
index 356d2c017a9c..ac44fda84833 100644
--- a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c
+++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c
@@ -109,8 +109,8 @@ void test_stacktrace_build_id(void)
 	if (build_id_matches < 1 && retry--) {
 		bpf_link__destroy(link);
 		bpf_object__close(obj);
-		test__printf("%s:WARN:Didn't find expected build ID from the map, retrying\n",
-			     __func__);
+		printf("%s:WARN:Didn't find expected build ID from the map, retrying\n",
+		       __func__);
 		goto retry;
 	}
 
diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c
index f44f2c159714..9557b7dfb782 100644
--- a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c
+++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c
@@ -140,8 +140,8 @@ void test_stacktrace_build_id_nmi(void)
 	if (build_id_matches < 1 && retry--) {
 		bpf_link__destroy(link);
 		bpf_object__close(obj);
-		test__printf("%s:WARN:Didn't find expected build ID from the map, retrying\n",
-			     __func__);
+		printf("%s:WARN:Didn't find expected build ID from the map, retrying\n",
+		       __func__);
 		goto retry;
 	}
 
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_noinline.c b/tools/testing/selftests/bpf/prog_tests/xdp_noinline.c
index b5404494b8aa..15f7c272edb0 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_noinline.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_noinline.c
@@ -75,8 +75,8 @@ void test_xdp_noinline(void)
 	}
 	if (bytes != MAGIC_BYTES * NUM_ITER * 2 || pkts != NUM_ITER * 2) {
 		error_cnt++;
-		test__printf("test_xdp_noinline:FAIL:stats %lld %lld\n",
-			     bytes, pkts);
+		printf("test_xdp_noinline:FAIL:stats %lld %lld\n",
+		       bytes, pkts);
 	}
 out:
 	bpf_object__close(obj);
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index eb8743302a00..1827ce5114f4 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -108,20 +108,6 @@ void test__force_log() {
 	env.test->force_log = true;
 }
 
-void test__vprintf(const char *fmt, va_list args)
-{
-	vprintf(fmt, args);
-}
-
-void test__printf(const char *fmt, ...)
-{
-	va_list args;
-
-	va_start(args, fmt);
-	test__vprintf(fmt, args);
-	va_end(args);
-}
-
 struct ipv4_packet pkt_v4 = {
 	.eth.h_proto = __bpf_constant_htons(ETH_P_IP),
 	.iph.ihl = 5,
@@ -313,7 +299,7 @@ static int libbpf_print_fn(enum libbpf_print_level level,
 {
 	if (!env.very_verbose && level == LIBBPF_DEBUG)
 		return 0;
-	test__vprintf(format, args);
+	vprintf(format, args);
 	return 0;
 }
 
diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h
index 54f30d7731c6..1231b30cbbda 100644
--- a/tools/testing/selftests/bpf/test_progs.h
+++ b/tools/testing/selftests/bpf/test_progs.h
@@ -69,8 +69,6 @@ extern int error_cnt;
 extern int pass_cnt;
 extern struct test_env env;
 
-extern void test__printf(const char *fmt, ...);
-extern void test__vprintf(const char *fmt, va_list args);
 extern void test__force_log();
 extern bool test__start_subtest(const char *name);
 
@@ -96,12 +94,12 @@ extern struct ipv6_packet pkt_v6;
 	int __ret = !!(condition);					\
 	if (__ret) {							\
 		error_cnt++;						\
-		test__printf("%s:FAIL:%s ", __func__, tag);		\
-		test__printf(format);					\
+		printf("%s:FAIL:%s ", __func__, tag);			\
+		printf(format);						\
 	} else {							\
 		pass_cnt++;						\
-		test__printf("%s:PASS:%s %d nsec\n",			\
-			      __func__, tag, duration);			\
+		printf("%s:PASS:%s %d nsec\n",				\
+		       __func__, tag, duration);			\
 	}								\
 	__ret;								\
 })
-- 
2.22.0.770.g0f2c4a37fd-goog


^ permalink raw reply related

* [PATCH bpf-next v2 3/3] selftests/bpf: test_progs: drop extra trailing tab
From: Stanislav Fomichev @ 2019-08-02 21:11 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko
In-Reply-To: <20190802211108.90739-1-sdf@google.com>

Small (un)related cleanup.

Cc: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/testing/selftests/bpf/test_progs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index 1827ce5114f4..ccc77782ca7e 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -281,7 +281,7 @@ enum ARG_KEYS {
 	ARG_VERIFIER_STATS = 's',
 	ARG_VERBOSE = 'v',
 };
-	
+
 static const struct argp_option opts[] = {
 	{ "num", ARG_TEST_NUM, "NUM", 0,
 	  "Run test number NUM only " },
-- 
2.22.0.770.g0f2c4a37fd-goog


^ permalink raw reply related

* [PATCH bpf-next v2 1/3] selftests/bpf: test_progs: switch to open_memstream
From: Stanislav Fomichev @ 2019-08-02 21:11 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko
In-Reply-To: <20190802211108.90739-1-sdf@google.com>

Use open_memstream to override stdout during test execution.
The copy of the original stdout is held in env.stdout and used
to print subtest info and dump failed log.

test_{v,}printf are now simple wrappers around stdout and will be
removed in the next patch.

v2:
* add ifdef __GLIBC__ around open_memstream (maybe pointless since
  we already depend on glibc for argp_parse)
* hijack stderr as well (Andrii Nakryiko)
* don't hijack for every test, do it once (Andrii Nakryiko)
* log_cap -> log_size (Andrii Nakryiko)
* do fseeko in a proper place (Andrii Nakryiko)
* check open_memstream returned value (Andrii Nakryiko)

Cc: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/testing/selftests/bpf/test_progs.c | 124 ++++++++++++-----------
 tools/testing/selftests/bpf/test_progs.h |   4 +-
 2 files changed, 68 insertions(+), 60 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index db00196c8315..eb8743302a00 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -40,14 +40,23 @@ static bool should_run(struct test_selector *sel, int num, const char *name)
 
 static void dump_test_log(const struct prog_test_def *test, bool failed)
 {
-	if (env.verbose || test->force_log || failed) {
-		if (env.log_cnt) {
-			fprintf(stdout, "%s", env.log_buf);
-			if (env.log_buf[env.log_cnt - 1] != '\n')
-				fprintf(stdout, "\n");
+	if (stdout == env.stdout)
+		return;
+
+	fflush(stdout); /* exports env.log_buf & env.log_size */
+
+	if (env.log_size && (env.verbose || test->force_log || failed)) {
+		int len = strlen(env.log_buf);
+
+		if (len) {
+			fprintf(env.stdout, "%s", env.log_buf);
+			if (env.log_buf[len - 1] != '\n')
+				fprintf(env.stdout, "\n");
+
 		}
 	}
-	env.log_cnt = 0;
+
+	fseeko(stdout, 0, SEEK_SET); /* rewind */
 }
 
 void test__end_subtest()
@@ -62,7 +71,7 @@ void test__end_subtest()
 
 	dump_test_log(test, sub_error_cnt);
 
-	printf("#%d/%d %s:%s\n",
+	fprintf(env.stdout, "#%d/%d %s:%s\n",
 	       test->test_num, test->subtest_num,
 	       test->subtest_name, sub_error_cnt ? "FAIL" : "OK");
 }
@@ -79,7 +88,8 @@ bool test__start_subtest(const char *name)
 	test->subtest_num++;
 
 	if (!name || !name[0]) {
-		fprintf(stderr, "Subtest #%d didn't provide sub-test name!\n",
+		fprintf(env.stderr,
+			"Subtest #%d didn't provide sub-test name!\n",
 			test->subtest_num);
 		return false;
 	}
@@ -100,53 +110,7 @@ void test__force_log() {
 
 void test__vprintf(const char *fmt, va_list args)
 {
-	size_t rem_sz;
-	int ret = 0;
-
-	if (env.verbose || (env.test && env.test->force_log)) {
-		vfprintf(stderr, fmt, args);
-		return;
-	}
-
-try_again:
-	rem_sz = env.log_cap - env.log_cnt;
-	if (rem_sz) {
-		va_list ap;
-
-		va_copy(ap, args);
-		/* we reserved extra byte for \0 at the end */
-		ret = vsnprintf(env.log_buf + env.log_cnt, rem_sz + 1, fmt, ap);
-		va_end(ap);
-
-		if (ret < 0) {
-			env.log_buf[env.log_cnt] = '\0';
-			fprintf(stderr, "failed to log w/ fmt '%s'\n", fmt);
-			return;
-		}
-	}
-
-	if (!rem_sz || ret > rem_sz) {
-		size_t new_sz = env.log_cap * 3 / 2;
-		char *new_buf;
-
-		if (new_sz < 4096)
-			new_sz = 4096;
-		if (new_sz < ret + env.log_cnt)
-			new_sz = ret + env.log_cnt;
-
-		/* +1 for guaranteed space for terminating \0 */
-		new_buf = realloc(env.log_buf, new_sz + 1);
-		if (!new_buf) {
-			fprintf(stderr, "failed to realloc log buffer: %d\n",
-				errno);
-			return;
-		}
-		env.log_buf = new_buf;
-		env.log_cap = new_sz;
-		goto try_again;
-	}
-
-	env.log_cnt += ret;
+	vprintf(fmt, args);
 }
 
 void test__printf(const char *fmt, ...)
@@ -477,6 +441,48 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state)
 	return 0;
 }
 
+static void stdio_hijack(void)
+{
+#ifdef __GLIBC__
+	if (env.verbose || (env.test && env.test->force_log)) {
+		/* nothing to do, output to stdout by default */
+		return;
+	}
+
+	/* stdout and stderr -> buffer */
+	fflush(stdout);
+
+	env.stdout = stdout;
+	env.stderr = stderr;
+
+	stdout = open_memstream(&env.log_buf, &env.log_size);
+	if (!stdout) {
+		stdout = env.stdout;
+		perror("open_memstream");
+		return;
+	}
+
+	stderr = stdout;
+#endif
+}
+
+static void stdio_restore(void)
+{
+#ifdef __GLIBC__
+	if (stdout == env.stdout)
+		return;
+
+	fclose(stdout);
+	free(env.log_buf);
+
+	env.log_buf = NULL;
+	env.log_size = 0;
+
+	stdout = env.stdout;
+	stderr = env.stderr;
+#endif
+}
+
 int main(int argc, char **argv)
 {
 	static const struct argp argp = {
@@ -496,6 +502,7 @@ int main(int argc, char **argv)
 
 	env.jit_enabled = is_jit_enabled();
 
+	stdio_hijack();
 	for (i = 0; i < prog_test_cnt; i++) {
 		struct prog_test_def *test = &prog_test_defs[i];
 		int old_pass_cnt = pass_cnt;
@@ -523,13 +530,14 @@ int main(int argc, char **argv)
 
 		dump_test_log(test, test->error_cnt);
 
-		printf("#%d %s:%s\n", test->test_num, test->test_name,
-		       test->error_cnt ? "FAIL" : "OK");
+		fprintf(env.stdout, "#%d %s:%s\n",
+			test->test_num, test->test_name,
+			test->error_cnt ? "FAIL" : "OK");
 	}
+	stdio_restore();
 	printf("Summary: %d/%d PASSED, %d FAILED\n",
 	       env.succ_cnt, env.sub_succ_cnt, env.fail_cnt);
 
-	free(env.log_buf);
 	free(env.test_selector.num_set);
 	free(env.subtest_selector.num_set);
 
diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h
index afd14962456f..54f30d7731c6 100644
--- a/tools/testing/selftests/bpf/test_progs.h
+++ b/tools/testing/selftests/bpf/test_progs.h
@@ -56,9 +56,9 @@ struct test_env {
 	bool jit_enabled;
 
 	struct prog_test_def *test;
+	FILE *stdout, *stderr;
 	char *log_buf;
-	size_t log_cnt;
-	size_t log_cap;
+	size_t log_size;
 
 	int succ_cnt; /* successful tests */
 	int sub_succ_cnt; /* successful sub-tests */
-- 
2.22.0.770.g0f2c4a37fd-goog


^ permalink raw reply related

* [PATCH bpf-next v2 0/3] selftests/bpf: switch test_progs back to stdio
From: Stanislav Fomichev @ 2019-08-02 21:11 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko

I was looking into converting test_sockops* to test_progs framework
and that requires using cgroup_helpers.c which rely on stdio/stderr.
Let's use open_memstream to override stdout into buffer during
subtests instead of custom test_{v,}printf wrappers. That lets
us continue to use stdio in the subtests and dump it on failure
if required.

That would also fix bpf_find_map which currently uses printf to
signal failure (missed during test_printf conversion).

Cc: Andrii Nakryiko <andriin@fb.com>

Stanislav Fomichev (3):
  selftests/bpf: test_progs: switch to open_memstream
  selftests/bpf: test_progs: test__printf -> printf
  selftests/bpf: test_progs: drop extra trailing tab

 .../bpf/prog_tests/bpf_verif_scale.c          |   4 +-
 .../selftests/bpf/prog_tests/l4lb_all.c       |   2 +-
 .../selftests/bpf/prog_tests/map_lock.c       |  10 +-
 .../selftests/bpf/prog_tests/send_signal.c    |   4 +-
 .../selftests/bpf/prog_tests/spinlock.c       |   2 +-
 .../bpf/prog_tests/stacktrace_build_id.c      |   4 +-
 .../bpf/prog_tests/stacktrace_build_id_nmi.c  |   4 +-
 .../selftests/bpf/prog_tests/xdp_noinline.c   |   4 +-
 tools/testing/selftests/bpf/test_progs.c      | 140 +++++++++---------
 tools/testing/selftests/bpf/test_progs.h      |  14 +-
 10 files changed, 90 insertions(+), 98 deletions(-)

-- 
2.22.0.770.g0f2c4a37fd-goog

^ permalink raw reply

* Re: [net-next 01/12] net/mlx5: E-Switch, add ingress rate support
From: Saeed Mahameed @ 2019-08-02 20:48 UTC (permalink / raw)
  To: alexei.starovoitov@gmail.com
  Cc: Eli Cohen, davem@davemloft.net, netdev@vger.kernel.org,
	Paul Blakey
In-Reply-To: <20190802194429.m34bpvf5hzgkop4g@ast-mbp.dhcp.thefacebook.com>

On Fri, 2019-08-02 at 12:44 -0700, Alexei Starovoitov wrote:
> On Fri, Aug 02, 2019 at 07:22:21PM +0000, Saeed Mahameed wrote:
> > On Fri, 2019-08-02 at 10:37 -0700, Alexei Starovoitov wrote:
> > > On Thu, Aug 1, 2019 at 6:30 PM Saeed Mahameed <
> > > saeedm@mellanox.com>
> > > wrote:
> > > > From: Eli Cohen <eli@mellanox.com>
> > > > 
> > > > Use the scheduling elements to implement ingress rate limiter
> > > > on an
> > > > eswitch ports ingress traffic. Since the ingress of eswitch
> > > > port is
> > > > the
> > > > egress of VF port, we control eswitch ingress by controlling VF
> > > > egress.
> > > 
> > > Looks like the patch is only passing args to firmware which is
> > > doing
> > > the magic.
> > > Can you please describe what is the algorithm there?
> > > Is it configurable?
> > 
> > Hi Alexei, 
> > 
> > I am not sure how much details you are looking for, but let me
> > share
> > some of what i know:
> > 
> > From a previous submission for legacy mode sriov vf bw limit, where
> > we 
> > introduced the FW configuration API and the legacy sriov use case: 
> > https://patchwork.kernel.org/patch/9404655/
> > 
> > So basically the algorithm is Deficit Weighted Round Robin (DWRR)
> > between the agents, we can control BW allocation/weight of each
> > agent
> > (vf vport).
> 
> commit log of this patch says nothing about DWRR.

it says it uses the "scheduling elements ingress rate limiter" which
automatically translates to DWRR, this is basic knowledge for mlx5
developers, and it is clear in the commit message when the FW API was
introduced.

> It is also not using any of the api that were provided by that
> earlier patch.

It is using the api: 
mlx5e_tc_configure_matchall => apply_police_params =>
mlx5_esw_modify_vport_rate => mlx5_modify_scheduling_element_cmd()..

> what is going on?
> 

can you please clarify what is bothering you ? we can fix it if
necessary.

Thanks,
Saeed.



^ permalink raw reply

* Re: [PATCH net-next 0/3,v2] flow_offload hardware priority fixes
From: Jakub Kicinski @ 2019-08-02 20:48 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: netfilter-devel, davem, netdev, jiri, marcelo.leitner, saeedm,
	wenxu, gerlitz.or, paulb
In-Reply-To: <20190802132846.3067-1-pablo@netfilter.org>

On Fri,  2 Aug 2019 15:28:43 +0200, Pablo Neira Ayuso wrote:
> v2: address Jakub comments to not use the netfilter basechain
>     priority for this mapping.

Hardly.

^ permalink raw reply

* Re: [PATCH net 0/2] flow_offload hardware priority fixes
From: Jakub Kicinski @ 2019-08-02 20:47 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: netfilter-devel, davem, netdev, marcelo.leitner, jiri, wenxu,
	saeedm, paulb, gerlitz.or
In-Reply-To: <20190802110023.udfcxowe3vmihduq@salvia>

On Fri, 2 Aug 2019 13:00:23 +0200, Pablo Neira Ayuso wrote:
> Hi Jakub,
> 
> If the user specifies 'pref' in the new rule, then tc checks if there
> is a tcf_proto object that matches this priority. If the tcf_proto
> object does not exist, tc creates a tcf_proto object and it adds the
> new rule to this tcf_proto.
> 
> In cls_flower, each tcf_proto only stores one single rule, so if the
> user tries to add another rule with the same 'pref', cls_flower
> returns EEXIST.

😳 

So you're saying this doesn't work?

ip link add type dummy
tc qdisc add dev dummy0 clsact
tc filter add dev dummy0 ingress protocol ipv6 prio 123 flower src_ip 1111::1 action drop
tc filter add dev dummy0 ingress protocol ipv6 prio 123 flower src_ip 1111::2 action drop
tc filter add dev dummy0 ingress protocol ipv6 prio 123 flower src_ip 1111::3 action drop
tc filter add dev dummy0 ingress protocol ipv6 prio 123 flower src_ip 1111::4 action drop
tc filter add dev dummy0 ingress protocol ipv6 prio 123 flower src_ip 1111::5 action drop

tc filter show dev dummy0 ingress

filter protocol ipv6 pref 123 flower chain 0 
filter protocol ipv6 pref 123 flower chain 0 handle 0x1 
  eth_type ipv6
  src_ip 1111::1
  not_in_hw
	action order 1: gact action drop
	 random type none pass val 0
	 index 1 ref 1 bind 1

filter protocol ipv6 pref 123 flower chain 0 handle 0x2 
  eth_type ipv6
  src_ip 1111::2
  not_in_hw
	action order 1: gact action drop
	 random type none pass val 0
	 index 2 ref 1 bind 1

filter protocol ipv6 pref 123 flower chain 0 handle 0x3 
  eth_type ipv6
  src_ip 1111::3
  not_in_hw
	action order 1: gact action drop
	 random type none pass val 0
	 index 3 ref 1 bind 1

filter protocol ipv6 pref 123 flower chain 0 handle 0x4 
  eth_type ipv6
  src_ip 1111::4
  not_in_hw
	action order 1: gact action drop
	 random type none pass val 0
	 index 4 ref 1 bind 1

filter protocol ipv6 pref 123 flower chain 0 handle 0x5 
  eth_type ipv6
  src_ip 1111::5
  not_in_hw
	action order 1: gact action drop
	 random type none pass val 0
	 index 5 ref 1 bind 1


> I'll prepare a new patchset not to map the priority to the netfilter
> basechain priority, instead the rule priority will be internally
> allocated for each new rule.

In which you're adding fake priorities to rules, AFAICT, 
and continue to baffle me.

^ permalink raw reply

* Re: [PATCH net] ipv4/route: do not check saddr dev if iif is LOOPBACK_IFINDEX
From: Stefano Brivio @ 2019-08-02 20:35 UTC (permalink / raw)
  To: David Ahern
  Cc: Hangbin Liu, netdev, Marcelo Ricardo Leitner, David S . Miller
In-Reply-To: <f44d9f26-046d-38a2-13aa-d25b92419d11@gmail.com>

David,

On Thu, 1 Aug 2019 13:51:25 -0600
David Ahern <dsahern@gmail.com> wrote:

> On 8/1/19 2:29 AM, Hangbin Liu wrote:
> > Jianlin reported a bug that for IPv4, ip route get from src_addr would fail
> > if src_addr is not an address on local system.
> > 
> > \# ip route get 1.1.1.1 from 2.2.2.2
> > RTNETLINK answers: Invalid argument  
> 
> so this is a forwarding lookup in which case iif should be set.

On actual forwarding, yes, it will be set.

But if we are just doing a lookup for a route (iif is
LOOPBACK_IFINDEX), I think this should still give us the matching route,
which is what IPv6 already does and what this patch fixes for IPv4.

Otherwise, we have no way to fetch that route, no matter if source
routing is configured. So I think this patch is correct and to some
extent necessary.

-- 
Stefano

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox