Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH bpf-next v4 1/3] selftests/bpf: test_progs: switch to open_memstream
From: Andrii Nakryiko @ 2019-08-06 17:55 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Stanislav Fomichev, Networking, bpf, David S. Miller,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
In-Reply-To: <20190806175411.GC23939@mini-arch>

On Tue, Aug 6, 2019 at 10:54 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
>
> On 08/06, Andrii Nakryiko wrote:
> > On Tue, Aug 6, 2019 at 10:40 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
> > >
> > > On 08/06, Andrii Nakryiko wrote:
> > > > On Tue, Aug 6, 2019 at 10:19 AM Stanislav Fomichev <sdf@google.com> wrote:
> > > > >
> > > > > Use open_memstream to override stdout during test execution.
> > > > > The copy of the original stdout is held in env.stdout and used
> > > > > to print subtest info and dump failed log.
> > > > >
> > > > > test_{v,}printf are now simple wrappers around stdout and will be
> > > > > removed in the next patch.
> > > > >
> > > > > v4:
> > > > > * one field per line for stdout/stderr (Andrii Nakryiko)
> > > > >
> > > > > v3:
> > > > > * don't do strlen over log_buf, log_cnt has it already (Andrii Nakryiko)
> > > > >
> > > > > v2:
> > > > > * add ifdef __GLIBC__ around open_memstream (maybe pointless since
> > > > >   we already depend on glibc for argp_parse)
> > > > > * hijack stderr as well (Andrii Nakryiko)
> > > > > * don't hijack for every test, do it once (Andrii Nakryiko)
> > > > > * log_cap -> log_size (Andrii Nakryiko)
> > > > > * do fseeko in a proper place (Andrii Nakryiko)
> > > > > * check open_memstream returned value (Andrii Nakryiko)
> > > > >
> > > > > Cc: Andrii Nakryiko <andriin@fb.com>
> > > > > Acked-by: Andrii Nakryiko <andriin@fb.com>
> > > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > > ---
> > > > >  tools/testing/selftests/bpf/test_progs.c | 115 ++++++++++++-----------
> > > > >  tools/testing/selftests/bpf/test_progs.h |   3 +-
> > > > >  2 files changed, 62 insertions(+), 56 deletions(-)
> > > > >
> >
> > [...]
> >
> > > > >  void test__printf(const char *fmt, ...)
> > > > > @@ -477,6 +438,48 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state)
> > > > >         return 0;
> > > > >  }
> > > > >
> > > > > +static void stdio_hijack(void)
> > > > > +{
> > > > > +#ifdef __GLIBC__
> > > > > +       if (env.verbose || (env.test && env.test->force_log)) {
> > > >
> > > > I just also realized that you don't need `(env.test &&
> > > > env.test->force_log)` test. We hijack stdout/stderr before env.test is
> > > > even set, so this does nothing anyways. Plus, force_log can be set in
> > > > the middle of test/sub-test, yet we hijack stdout just once (or even
> > > > if per-test), so it's still going to be "racy". Let's buffer output
> > > > (unless it's env.verbose, which is important to not buffer because
> > > > some tests will have huge output, when failing, so this allows to
> > > > bypass using tons of memory for those, when debugging) and dump at the
> > > > end.
> > > Makes sense, will drop this test and resubmit along with a fix for '-v'
> > > that Alexei discovered. Thanks!
> >
> > Oh, is it because env.stdout and env.stderr is not set in verbose
> > mode? I'd always set env.stdout/env.stderr at the beginning, next to
> > env.jit_enabled, to never care about "logging modes".
> Yeah, I moved it to the beginning of stdio_hijack() to mirror the 'undo'
> in stdio_restore(). Let me know if you feel strongly about it and want
> it to be near env.jit_enabled instead.

I'm fine with it, no big deal, I wrote email before I saw your v5.

>
> > > > > +               /* nothing to do, output to stdout by default */
> > > > > +               return;
> > > > > +       }
> > > > > +
> >
> > [...]

^ permalink raw reply

* Re: [patch net-next v2 1/3] net: devlink: allow to change namespaces
From: David Ahern @ 2019-08-06 17:55 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Jakub Kicinski, netdev, davem, sthemmin, mlxsw
In-Reply-To: <20190806175323.GB2332@nanopsycho.orion>

On 8/6/19 11:53 AM, Jiri Pirko wrote:
> Let's figure out the devlink-controlling-kernel-resources thread first.
> What you describe here is exactly that.

as I mentioned on the phone, any outcome of that thread will be in 5.4
at best. Meanwhile this breakage in 5.2 and 5.3 needs to be fixed.

^ permalink raw reply

* Re: [PATCH bpf-next v4 1/3] selftests/bpf: test_progs: switch to open_memstream
From: Stanislav Fomichev @ 2019-08-06 17:54 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Stanislav Fomichev, Networking, bpf, David S. Miller,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
In-Reply-To: <CAEf4Bzbt_6Y3bEpsPiHi59KnWoHsk9gQa3XpfRo+gG7-rKqN4w@mail.gmail.com>

On 08/06, Andrii Nakryiko wrote:
> On Tue, Aug 6, 2019 at 10:40 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
> >
> > On 08/06, Andrii Nakryiko wrote:
> > > On Tue, Aug 6, 2019 at 10:19 AM Stanislav Fomichev <sdf@google.com> wrote:
> > > >
> > > > Use open_memstream to override stdout during test execution.
> > > > The copy of the original stdout is held in env.stdout and used
> > > > to print subtest info and dump failed log.
> > > >
> > > > test_{v,}printf are now simple wrappers around stdout and will be
> > > > removed in the next patch.
> > > >
> > > > v4:
> > > > * one field per line for stdout/stderr (Andrii Nakryiko)
> > > >
> > > > v3:
> > > > * don't do strlen over log_buf, log_cnt has it already (Andrii Nakryiko)
> > > >
> > > > v2:
> > > > * add ifdef __GLIBC__ around open_memstream (maybe pointless since
> > > >   we already depend on glibc for argp_parse)
> > > > * hijack stderr as well (Andrii Nakryiko)
> > > > * don't hijack for every test, do it once (Andrii Nakryiko)
> > > > * log_cap -> log_size (Andrii Nakryiko)
> > > > * do fseeko in a proper place (Andrii Nakryiko)
> > > > * check open_memstream returned value (Andrii Nakryiko)
> > > >
> > > > Cc: Andrii Nakryiko <andriin@fb.com>
> > > > Acked-by: Andrii Nakryiko <andriin@fb.com>
> > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > ---
> > > >  tools/testing/selftests/bpf/test_progs.c | 115 ++++++++++++-----------
> > > >  tools/testing/selftests/bpf/test_progs.h |   3 +-
> > > >  2 files changed, 62 insertions(+), 56 deletions(-)
> > > >
> 
> [...]
> 
> > > >  void test__printf(const char *fmt, ...)
> > > > @@ -477,6 +438,48 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state)
> > > >         return 0;
> > > >  }
> > > >
> > > > +static void stdio_hijack(void)
> > > > +{
> > > > +#ifdef __GLIBC__
> > > > +       if (env.verbose || (env.test && env.test->force_log)) {
> > >
> > > I just also realized that you don't need `(env.test &&
> > > env.test->force_log)` test. We hijack stdout/stderr before env.test is
> > > even set, so this does nothing anyways. Plus, force_log can be set in
> > > the middle of test/sub-test, yet we hijack stdout just once (or even
> > > if per-test), so it's still going to be "racy". Let's buffer output
> > > (unless it's env.verbose, which is important to not buffer because
> > > some tests will have huge output, when failing, so this allows to
> > > bypass using tons of memory for those, when debugging) and dump at the
> > > end.
> > Makes sense, will drop this test and resubmit along with a fix for '-v'
> > that Alexei discovered. Thanks!
> 
> Oh, is it because env.stdout and env.stderr is not set in verbose
> mode? I'd always set env.stdout/env.stderr at the beginning, next to
> env.jit_enabled, to never care about "logging modes".
Yeah, I moved it to the beginning of stdio_hijack() to mirror the 'undo'
in stdio_restore(). Let me know if you feel strongly about it and want
it to be near env.jit_enabled instead.

> > > > +               /* nothing to do, output to stdout by default */
> > > > +               return;
> > > > +       }
> > > > +
> 
> [...]

^ permalink raw reply

* Re: [patch net-next v2 1/3] net: devlink: allow to change namespaces
From: Jiri Pirko @ 2019-08-06 17:53 UTC (permalink / raw)
  To: David Ahern; +Cc: Jakub Kicinski, netdev, davem, sthemmin, mlxsw
In-Reply-To: <7200bdbb-2a02-92c6-0251-1c59b159dde7@gmail.com>

Tue, Aug 06, 2019 at 07:34:59PM CEST, dsahern@gmail.com wrote:
>On 8/5/19 9:20 AM, Jiri Pirko wrote:
>> Mon, Aug 05, 2019 at 04:51:22PM CEST, dsahern@gmail.com wrote:
>>> On 8/5/19 8:49 AM, Jiri Pirko wrote:
>>>>> Your commit 5fc494225c1eb81309cc4c91f183cd30e4edb674 changed that from a
>>>>> per-namepace accounting to all namespaces managed by a single devlink
>>>>> instance in init_net - which is completely wrong.
>>>> No. Not "all namespaces". Only the one where the devlink is. And that is
>>>> always init_net, until this patchset.
>>>>
>>>>
>>>
>>> Jiri: your change to fib.c does not take into account namespace when
>>> doing rules and routes accounting. you broke it. fix it.
>> 
>> What do you mean by "account namespace"? It's a device resource, why to
>> tight it with namespace? What if you have 2 netdevsim-devlink instances
>> in one namespace? Why the setting should be per-namespace?
>> 
>
>Jiri:
>
>Here's an example of how your 5.2 change to netdevsim broke the resource
>controller:
>
>Create a netdevsim device:
>$ modprobe netdevsim
>$  echo "0 1" > /sys/bus/netdevsim/new_device
>
>Get the current number of IPv4 routes:
>$ n=$(ip -4 ro ls table all | wc -l)
>$ echo $n
>13
>
>Prevent any more from being added. This limit should apply solely to
>this namespace, init_net:
>
>$ devlink resource set netdevsim/netdevsim0 path /IPv4/fib size $n
>$ devlink dev reload netdevsim/netdevsim0
>Error: netdevsim: New size is less than current occupancy.
>devlink answers: Invalid argument
>
>So that is the first breakage: accounting is off - maybe. Note there are
>no other visible namespaces, but who knows what systemd or other
>processes are doing with namespaces. Perhaps this accounting is another
>example of your changes not properly handling namespaces:
>
>$ devlink resource show netdevsim/netdevsim0
>netdevsim/netdevsim0:
>  name IPv4 size unlimited unit entry size_min 0 size_max unlimited
>size_gran 1 dpipe_tables none
>    resources:
>      name fib size 13 occ 17 unit entry size_min 0 size_max unlimited
>size_gran 1 dpipe_tables none
>      name fib-rules size unlimited occ 6 unit entry size_min 0 size_max
>unlimited size_gran 1 dpipe_tables none
>  name IPv6 size unlimited unit entry size_min 0 size_max unlimited
>size_gran 1 dpipe_tables none
>    resources:
>      name fib size unlimited occ 10 unit entry size_min 0 size_max
>unlimited size_gran 1 dpipe_tables none
>      name fib-rules size unlimited occ 4 unit entry size_min 0 size_max
>unlimited size_gran 1 dpipe_tables none
>
>So the occupancy does not match the tables for init_net.
>
>Reset the max to 17, the current occupancy:
>$ devlink resource set netdevsim/netdevsim0 path /IPv4/fib size 17
>$ devlink dev reload netdevsim/netdevsim0
>$ devlink resource show netdevsim/netdevsim0
>netdevsim/netdevsim0:
>  name IPv4 size unlimited unit entry size_min 0 size_max unlimited
>size_gran 1 dpipe_tables none
>    resources:
>      name fib size 17 occ 17 unit entry size_min 0 size_max unlimited
>size_gran 1 dpipe_tables none
>      name fib-rules size unlimited occ 6 unit entry size_min 0 size_max
>unlimited size_gran 1 dpipe_tables none
>  name IPv6 size unlimited unit entry size_min 0 size_max unlimited
>size_gran 1 dpipe_tables none
>    resources:
>      name fib size unlimited occ 10 unit entry size_min 0 size_max
>unlimited size_gran 1 dpipe_tables none
>      name fib-rules size unlimited occ 4 unit entry size_min 0 size_max
>unlimited size_gran 1 dpipe_tables none
>
>Create a new namespace, bring up lo which attempts to add more route
>entries:
>$ unshare -n
>$ ip li set lo up
>
>If you list routes you see the lo routes failed to installed because of
>the limits, but it is a silent failure. Try to add a new route and you
>see the cross namespace accounting now:
>$ ip ro add 192.168.1.0/24 dev lo
>Error: netdevsim: Exceeded number of supported fib entries.
>
>
>Contrast that behavior with 5.1 and you see the new namespaces have no
>bearing on accounting in init_net and limits in init_net do not affect
>other namespaces.
>
>That behavior needs to be restored in 5.2 and 5.3.

Let's figure out the devlink-controlling-kernel-resources thread first.
What you describe here is exactly that.

^ permalink raw reply

* [sashal@kernel.org: Re: Back-porting request]
From: Stephen Suryaputra @ 2019-08-06 17:51 UTC (permalink / raw)
  To: netdev

Hi,

How do I request back-porting of commit 5b18f1289808 ("ipv4: reset
rt_iif for recirculated mcast/bcast out pkts") quoted below? I think for
4.19, the following diff is needed in addition to the commit:

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index cd84f7f68032..120ef1f284fa 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1652,11 +1652,7 @@ struct rtable *rt_dst_clone(struct net_device *dev, struct rtable *rt)
 		new_rt->rt_iif = rt->rt_iif;
 		new_rt->rt_pmtu = rt->rt_pmtu;
 		new_rt->rt_mtu_locked = rt->rt_mtu_locked;
-		new_rt->rt_gw_family = rt->rt_gw_family;
-		if (rt->rt_gw_family == AF_INET)
-			new_rt->rt_gw4 = rt->rt_gw4;
-		else if (rt->rt_gw_family == AF_INET6)
-			new_rt->rt_gw6 = rt->rt_gw6;
+		new_rt->rt_gateway = rt->rt_gateway;
 		INIT_LIST_HEAD(&new_rt->rt_uncached);
 
 		new_rt->dst.flags |= DST_HOST;

Thanks.

----- Forwarded message from Sasha Levin <sashal@kernel.org> -----

On Mon, Jul 29, 2019 at 05:56:27PM +0200, Greg KH wrote:
> On Mon, Jul 29, 2019 at 11:42:34AM -0400, Stephen Suryaputra wrote:
> > Hello,
> > 
> > I'm requesting this commit to be back-ported to v4.14:
> > ---
> > commit 5b18f1289808fee5d04a7e6ecf200189f41a4db6
> > Author: Stephen Suryaputra <ssuryaextr@gmail.com>
> > Date:   Wed Jun 26 02:21:16 2019 -0400
> > 
> >     ipv4: reset rt_iif for recirculated mcast/bcast out pkts
> > 
> >     Multicast or broadcast egress packets have rt_iif set to the oif. These
> >     packets might be recirculated back as input and lookup to the raw
> >     sockets may fail because they are bound to the incoming interface
> >     (skb_iif). If rt_iif is not zero, during the lookup, inet_iif() function
> >     returns rt_iif instead of skb_iif. Hence, the lookup fails.
> > 
> >     v2: Make it non vrf specific (David Ahern). Reword the changelog to
> >         reflect it.
> >     Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
> >     Reviewed-by: David Ahern <dsahern@gmail.com>
> >     Signed-off-by: David S. Miller <davem@davemloft.net>
> > ---
> > 

[deleted]
> 
> For networking patches to be applied to the stable kernel tree(s),
> please read:
>    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> for how to do this properly.
> 
> There is a section for how to do this for networking patches as they are
> accepted a bit differently from other patches.

To clarify a bit more here: technically you're asking for a patch to be
included in 4.14, which isn't one of the "last two stable releases", so
that document will tell you to send that patch directly to us.

However, this patch isn't in 4.19 either, which is still Dave's domain,
and we can't take it in 4.14 if it's not in 4.19 (we don't want to
introduce regressions for people who are upgrading their kernels), so
this will still need to go through Dave.

--
Thanks,
Sasha

----- End forwarded message -----

^ permalink raw reply related

* Re: [PATCH net v2] net: dsa: Check existence of .port_mdb_add callback before calling it
From: Chen-Yu Tsai @ 2019-08-06 17:49 UTC (permalink / raw)
  To: Vivien Didelot
  Cc: Chen-Yu Tsai, Andrew Lunn, Florian Fainelli, David S. Miller,
	netdev, linux-kernel
In-Reply-To: <20190806131513.GB2822@t480s.localdomain>

On Wed, Aug 7, 2019 at 1:15 AM Vivien Didelot <vivien.didelot@gmail.com> wrote:
>
> Hi Chen-Yu,
>
> On Tue,  6 Aug 2019 15:53:25 +0800, Chen-Yu Tsai <wens@kernel.org> wrote:
> > From: Chen-Yu Tsai <wens@csie.org>
> >
> > With the recent addition of commit 75dad2520fc3 ("net: dsa: b53: Disable
> > all ports on setup"), users of b53 (BCM53125 on Lamobo R1 in my case)
> > are forced to use the dsa subsystem to enable the switch, instead of
> > having it in the default transparent "forward-to-all" mode.
> >
> > The b53 driver does not support mdb bitmap functions. However the dsa
> > layer does not check for the existence of the .port_mdb_add callback
> > before actually using it. This results in a NULL pointer dereference,
> > as shown in the kernel oops below.
> >
> > The other functions seem to be properly guarded. Do the same for
> > .port_mdb_add in dsa_switch_mdb_add_bitmap() as well.
> >
> > b53 is not the only driver that doesn't support mdb bitmap functions.
> > Others include bcm_sf2, dsa_loop, lantiq_gswip, mt7530, mv88e6060,
> > qca8k, realtek-smi, and vitesse-vsc73xx.
>
> I don't know what you mean by that, there's no "mdb bitmap function"
> support for drivers, only the port_mdb_{prepare,add,del} callbacks...

The term was coined from commit e6db98db8a95 ("net: dsa: add switch mdb
bitmap functions"). But yeah, .port_mdb_* ops/callbacks would be more
appropriate.

> >     8<--- cut here ---
> >     Unable to handle kernel NULL pointer dereference at virtual address 00000000
> >     pgd = (ptrval)
> >     [00000000] *pgd=00000000
> >     Internal error: Oops: 80000005 [#1] SMP ARM
> >     Modules linked in: rtl8xxxu rtl8192cu rtl_usb rtl8192c_common rtlwifi mac80211 cfg80211
> >     CPU: 1 PID: 134 Comm: kworker/1:2 Not tainted 5.3.0-rc1-00247-gd3519030752a #1
> >     Hardware name: Allwinner sun7i (A20) Family
> >     Workqueue: events switchdev_deferred_process_work
> >     PC is at 0x0
> >     LR is at dsa_switch_event+0x570/0x620
> >     pc : [<00000000>]    lr : [<c08533ec>]    psr: 80070013
> >     sp : ee871db8  ip : 00000000  fp : ee98d0a4
> >     r10: 0000000c  r9 : 00000008  r8 : ee89f710
> >     r7 : ee98d040  r6 : ee98d088  r5 : c0f04c48  r4 : ee98d04c
> >     r3 : 00000000  r2 : ee89f710  r1 : 00000008  r0 : ee98d040
> >     Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> >     Control: 10c5387d  Table: 6deb406a  DAC: 00000051
> >     Process kworker/1:2 (pid: 134, stack limit = 0x(ptrval))
> >     Stack: (0xee871db8 to 0xee872000)
> >     1da0:                                                       ee871e14 103ace2d
> >     1dc0: 00000000 ffffffff 00000000 ee871e14 00000005 00000000 c08524a0 00000000
> >     1de0: ffffe000 c014bdfc c0f04c48 ee871e98 c0f04c48 ee9e5000 c0851120 c014bef0
> >     1e00: 00000000 b643aea2 ee9b4068 c08509a8 ee2bf940 ee89f710 ee871ecb 00000000
> >     1e20: 00000008 103ace2d 00000000 c087e248 ee29c868 103ace2d 00000001 ffffffff
> >     1e40: 00000000 ee871e98 00000006 00000000 c0fb2a50 c087e2d0 ffffffff c08523c4
> >     1e60: ffffffff c014bdfc 00000006 c0fad2d0 ee871e98 ee89f710 00000000 c014c500
> >     1e80: 00000000 ee89f3c0 c0f04c48 00000000 ee9e5000 c087dfb4 ee9e5000 00000000
> >     1ea0: ee89f710 ee871ecb 00000001 103ace2d 00000000 c0f04c48 00000000 c087e0a8
> >     1ec0: 00000000 efd9a3e0 0089f3c0 103ace2d ee89f700 ee89f710 ee9e5000 00000122
> >     1ee0: 00000100 c087e130 ee89f700 c0fad2c8 c1003ef0 c087de4c 2e928000 c0fad2ec
> >     1f00: c0fad2ec ee839580 ef7a62c0 ef7a9400 00000000 c087def8 c0fad2ec c01447dc
> >     1f20: ef315640 ef7a62c0 00000008 ee839580 ee839594 ef7a62c0 00000008 c0f03d00
> >     1f40: ef7a62d8 ef7a62c0 ffffe000 c0145b84 ffffe000 c0fb2420 c0bfaa8c 00000000
> >     1f60: ffffe000 ee84b600 ee84b5c0 00000000 ee870000 ee839580 c0145b40 ef0e5ea4
> >     1f80: ee84b61c c014a6f8 00000001 ee84b5c0 c014a5b0 00000000 00000000 00000000
> >     1fa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000
> >     1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> >     1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
> >     [<c08533ec>] (dsa_switch_event) from [<c014bdfc>] (notifier_call_chain+0x48/0x84)
> >     [<c014bdfc>] (notifier_call_chain) from [<c014bef0>] (raw_notifier_call_chain+0x18/0x20)
> >     [<c014bef0>] (raw_notifier_call_chain) from [<c08509a8>] (dsa_port_mdb_add+0x48/0x74)
> >     [<c08509a8>] (dsa_port_mdb_add) from [<c087e248>] (__switchdev_handle_port_obj_add+0x54/0xd4)
> >     [<c087e248>] (__switchdev_handle_port_obj_add) from [<c087e2d0>] (switchdev_handle_port_obj_add+0x8/0x14)
> >     [<c087e2d0>] (switchdev_handle_port_obj_add) from [<c08523c4>] (dsa_slave_switchdev_blocking_event+0x94/0xa4)
> >     [<c08523c4>] (dsa_slave_switchdev_blocking_event) from [<c014bdfc>] (notifier_call_chain+0x48/0x84)
> >     [<c014bdfc>] (notifier_call_chain) from [<c014c500>] (blocking_notifier_call_chain+0x50/0x68)
> >     [<c014c500>] (blocking_notifier_call_chain) from [<c087dfb4>] (switchdev_port_obj_notify+0x44/0xa8)
> >     [<c087dfb4>] (switchdev_port_obj_notify) from [<c087e0a8>] (switchdev_port_obj_add_now+0x90/0x104)
> >     [<c087e0a8>] (switchdev_port_obj_add_now) from [<c087e130>] (switchdev_port_obj_add_deferred+0x14/0x5c)
> >     [<c087e130>] (switchdev_port_obj_add_deferred) from [<c087de4c>] (switchdev_deferred_process+0x64/0x104)
> >     [<c087de4c>] (switchdev_deferred_process) from [<c087def8>] (switchdev_deferred_process_work+0xc/0x14)
> >     [<c087def8>] (switchdev_deferred_process_work) from [<c01447dc>] (process_one_work+0x218/0x50c)
> >     [<c01447dc>] (process_one_work) from [<c0145b84>] (worker_thread+0x44/0x5bc)
> >     [<c0145b84>] (worker_thread) from [<c014a6f8>] (kthread+0x148/0x150)
> >     [<c014a6f8>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
> >     Exception stack(0xee871fb0 to 0xee871ff8)
> >     1fa0:                                     00000000 00000000 00000000 00000000
> >     1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> >     1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> >     Code: bad PC value
> >     ---[ end trace 1292c61abd17b130 ]---
> >
> >     [<c08533ec>] (dsa_switch_event) from [<c014bdfc>] (notifier_call_chain+0x48/0x84)
> >     corresponds to
> >
> >       $ arm-linux-gnueabihf-addr2line -C -i -e vmlinux c08533ec
> >
> >       linux/net/dsa/switch.c:156
> >       linux/net/dsa/switch.c:178
> >       linux/net/dsa/switch.c:328
> >
> > Fixes: e6db98db8a95 ("net: dsa: add switch mdb bitmap functions")
> > Signed-off-by: Chen-Yu Tsai <wens@csie.org>
> > ---
> > Changes since v1:
> >
> >   - Moved the check to the beginning of dsa_switch_mdb_add()
> >
> > Looks like we could also move the ops check out of
> > dsa_switch_mdb_prepare_bitmap(), though I suppose keeping the code the
> > way it is now is clearer.
> >
> > ---
> >  net/dsa/switch.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/net/dsa/switch.c b/net/dsa/switch.c
> > index 4ec5b7f85d51..231af5268656 100644
> > --- a/net/dsa/switch.c
> > +++ b/net/dsa/switch.c
> > @@ -164,6 +164,9 @@ static int dsa_switch_mdb_add(struct dsa_switch *ds,
> >       struct switchdev_trans *trans = info->trans;
> >       int port;
> >
> > +     if (!ds->ops->port_mdb_add)
> > +             return -EOPNOTSUPP;
> > +
> >       /* Build a mask of Multicast group members */
> >       bitmap_zero(ds->bitmap, ds->num_ports);
> >       if (ds->index == info->sw_index)
> > --
> > 2.20.1
> >
>
> I don't understand the crash here, nor the fix. dsa_switch_mdb_add()
> is supposed to be called through switchdev with a prepare phase,
> which checks for ds->ops->port_mdb_add. Do you mean that a switchdev
> MDB object is added somewhere without a prepare phase? If that's
> the case, this is what the commit message must say. Then the

I had pretty much zero understanding of how switchdev and dsa work.
The symptom is a NULL pointer reference, resulting from an unsupported
callback that was not checked before being called, as described above.
And that is what I mean. A NULL pointer reference happened when it
should not have.

Based on what you just mentioned, yes it does look like an object was
added without a prepare phase. Randomly looking through the net/dsa
code, it seems only dsa_port_vid_add() does a prepare phase, judging
by .ph_prepare being set. dsa_port_{vlan,mbd,fdb}_add directly call
the add phase, without the prepare phase. So I'm guessing "supposed
to be called with a prepare phase" is not quite accurate. This also
exceeds the scope of the simple fix I had in mind.

> ds->ops->port_mdb_add check must go where it is used, that is to say
> at the beginning of dsa_switch_mdb_add_bitmap() (similarly to what
> dsa_switch_mdb_prepare_bitmap() does), not in dsa_switch_mdb_add.

Andrew asked me to move it to where it is now. Please take a look at
v1 [2] if it's what you would like.

I'm ok either way.

ChenYu

[1] https://lkml.org/lkml/2019/7/25/706
[2] https://lkml.org/lkml/2019/7/23/1129

^ permalink raw reply

* Re: [PATCH bpf-next v4 1/3] selftests/bpf: test_progs: switch to open_memstream
From: Andrii Nakryiko @ 2019-08-06 17:49 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Stanislav Fomichev, Networking, bpf, David S. Miller,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
In-Reply-To: <20190806174028.GB23939@mini-arch>

On Tue, Aug 6, 2019 at 10:40 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
>
> On 08/06, Andrii Nakryiko wrote:
> > On Tue, Aug 6, 2019 at 10:19 AM Stanislav Fomichev <sdf@google.com> wrote:
> > >
> > > Use open_memstream to override stdout during test execution.
> > > The copy of the original stdout is held in env.stdout and used
> > > to print subtest info and dump failed log.
> > >
> > > test_{v,}printf are now simple wrappers around stdout and will be
> > > removed in the next patch.
> > >
> > > v4:
> > > * one field per line for stdout/stderr (Andrii Nakryiko)
> > >
> > > v3:
> > > * don't do strlen over log_buf, log_cnt has it already (Andrii Nakryiko)
> > >
> > > v2:
> > > * add ifdef __GLIBC__ around open_memstream (maybe pointless since
> > >   we already depend on glibc for argp_parse)
> > > * hijack stderr as well (Andrii Nakryiko)
> > > * don't hijack for every test, do it once (Andrii Nakryiko)
> > > * log_cap -> log_size (Andrii Nakryiko)
> > > * do fseeko in a proper place (Andrii Nakryiko)
> > > * check open_memstream returned value (Andrii Nakryiko)
> > >
> > > Cc: Andrii Nakryiko <andriin@fb.com>
> > > Acked-by: Andrii Nakryiko <andriin@fb.com>
> > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > ---
> > >  tools/testing/selftests/bpf/test_progs.c | 115 ++++++++++++-----------
> > >  tools/testing/selftests/bpf/test_progs.h |   3 +-
> > >  2 files changed, 62 insertions(+), 56 deletions(-)
> > >

[...]

> > >  void test__printf(const char *fmt, ...)
> > > @@ -477,6 +438,48 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state)
> > >         return 0;
> > >  }
> > >
> > > +static void stdio_hijack(void)
> > > +{
> > > +#ifdef __GLIBC__
> > > +       if (env.verbose || (env.test && env.test->force_log)) {
> >
> > I just also realized that you don't need `(env.test &&
> > env.test->force_log)` test. We hijack stdout/stderr before env.test is
> > even set, so this does nothing anyways. Plus, force_log can be set in
> > the middle of test/sub-test, yet we hijack stdout just once (or even
> > if per-test), so it's still going to be "racy". Let's buffer output
> > (unless it's env.verbose, which is important to not buffer because
> > some tests will have huge output, when failing, so this allows to
> > bypass using tons of memory for those, when debugging) and dump at the
> > end.
> Makes sense, will drop this test and resubmit along with a fix for '-v'
> that Alexei discovered. Thanks!

Oh, is it because env.stdout and env.stderr is not set in verbose
mode? I'd always set env.stdout/env.stderr at the beginning, next to
env.jit_enabled, to never care about "logging modes".

>
> > > +               /* nothing to do, output to stdout by default */
> > > +               return;
> > > +       }
> > > +

[...]

^ permalink raw reply

* [PATCH bpf-next v5 3/3] selftests/bpf: test_progs: drop extra trailing tab
From: Stanislav Fomichev @ 2019-08-06 17:45 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko
In-Reply-To: <20190806174529.8341-1-sdf@google.com>

Small (un)related cleanup.

Cc: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/testing/selftests/bpf/test_progs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index 6c11c796ea1f..12895d03d58b 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -278,7 +278,7 @@ enum ARG_KEYS {
 	ARG_VERIFIER_STATS = 's',
 	ARG_VERBOSE = 'v',
 };
-	
+
 static const struct argp_option opts[] = {
 	{ "num", ARG_TEST_NUM, "NUM", 0,
 	  "Run test number NUM only " },
-- 
2.22.0.770.g0f2c4a37fd-goog


^ permalink raw reply related

* [PATCH bpf-next v5 2/3] selftests/bpf: test_progs: test__printf -> printf
From: Stanislav Fomichev @ 2019-08-06 17:45 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko
In-Reply-To: <20190806174529.8341-1-sdf@google.com>

Now that test__printf is a simple wraper around printf, let's drop it
(and test__vprintf as well).

Cc: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../selftests/bpf/prog_tests/bpf_verif_scale.c   |  4 ++--
 .../testing/selftests/bpf/prog_tests/l4lb_all.c  |  2 +-
 .../testing/selftests/bpf/prog_tests/map_lock.c  | 10 +++++-----
 .../selftests/bpf/prog_tests/send_signal.c       |  4 ++--
 .../testing/selftests/bpf/prog_tests/spinlock.c  |  2 +-
 .../bpf/prog_tests/stacktrace_build_id.c         |  4 ++--
 .../bpf/prog_tests/stacktrace_build_id_nmi.c     |  4 ++--
 .../selftests/bpf/prog_tests/xdp_noinline.c      |  4 ++--
 tools/testing/selftests/bpf/test_progs.c         | 16 +---------------
 tools/testing/selftests/bpf/test_progs.h         | 10 ++++------
 10 files changed, 22 insertions(+), 38 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
index b4be96162ff4..3548ba2f24a8 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
@@ -5,13 +5,13 @@ static int libbpf_debug_print(enum libbpf_print_level level,
 			      const char *format, va_list args)
 {
 	if (level != LIBBPF_DEBUG) {
-		test__vprintf(format, args);
+		vprintf(format, args);
 		return 0;
 	}
 
 	if (!strstr(format, "verifier log"))
 		return 0;
-	test__vprintf("%s", args);
+	vprintf("%s", args);
 	return 0;
 }
 
diff --git a/tools/testing/selftests/bpf/prog_tests/l4lb_all.c b/tools/testing/selftests/bpf/prog_tests/l4lb_all.c
index 5ce572c03a5f..20ddca830e68 100644
--- a/tools/testing/selftests/bpf/prog_tests/l4lb_all.c
+++ b/tools/testing/selftests/bpf/prog_tests/l4lb_all.c
@@ -74,7 +74,7 @@ static void test_l4lb(const char *file)
 	}
 	if (bytes != MAGIC_BYTES * NUM_ITER * 2 || pkts != NUM_ITER * 2) {
 		error_cnt++;
-		test__printf("test_l4lb:FAIL:stats %lld %lld\n", bytes, pkts);
+		printf("test_l4lb:FAIL:stats %lld %lld\n", bytes, pkts);
 	}
 out:
 	bpf_object__close(obj);
diff --git a/tools/testing/selftests/bpf/prog_tests/map_lock.c b/tools/testing/selftests/bpf/prog_tests/map_lock.c
index 2e78217ed3fd..ee99368c595c 100644
--- a/tools/testing/selftests/bpf/prog_tests/map_lock.c
+++ b/tools/testing/selftests/bpf/prog_tests/map_lock.c
@@ -9,12 +9,12 @@ static void *parallel_map_access(void *arg)
 	for (i = 0; i < 10000; i++) {
 		err = bpf_map_lookup_elem_flags(map_fd, &key, vars, BPF_F_LOCK);
 		if (err) {
-			test__printf("lookup failed\n");
+			printf("lookup failed\n");
 			error_cnt++;
 			goto out;
 		}
 		if (vars[0] != 0) {
-			test__printf("lookup #%d var[0]=%d\n", i, vars[0]);
+			printf("lookup #%d var[0]=%d\n", i, vars[0]);
 			error_cnt++;
 			goto out;
 		}
@@ -22,8 +22,8 @@ static void *parallel_map_access(void *arg)
 		for (j = 2; j < 17; j++) {
 			if (vars[j] == rnd)
 				continue;
-			test__printf("lookup #%d var[1]=%d var[%d]=%d\n",
-				     i, rnd, j, vars[j]);
+			printf("lookup #%d var[1]=%d var[%d]=%d\n",
+			       i, rnd, j, vars[j]);
 			error_cnt++;
 			goto out;
 		}
@@ -43,7 +43,7 @@ void test_map_lock(void)
 
 	err = bpf_prog_load(file, BPF_PROG_TYPE_CGROUP_SKB, &obj, &prog_fd);
 	if (err) {
-		test__printf("test_map_lock:bpf_prog_load errno %d\n", errno);
+		printf("test_map_lock:bpf_prog_load errno %d\n", errno);
 		goto close_prog;
 	}
 	map_fd[0] = bpf_find_map(__func__, obj, "hash_map");
diff --git a/tools/testing/selftests/bpf/prog_tests/send_signal.c b/tools/testing/selftests/bpf/prog_tests/send_signal.c
index 461b423d0584..1575f0a1f586 100644
--- a/tools/testing/selftests/bpf/prog_tests/send_signal.c
+++ b/tools/testing/selftests/bpf/prog_tests/send_signal.c
@@ -202,8 +202,8 @@ static int test_send_signal_nmi(void)
 			 -1 /* cpu */, -1 /* group_fd */, 0 /* flags */);
 	if (pmu_fd == -1) {
 		if (errno == ENOENT) {
-			test__printf("%s:SKIP:no PERF_COUNT_HW_CPU_CYCLES\n",
-				     __func__);
+			printf("%s:SKIP:no PERF_COUNT_HW_CPU_CYCLES\n",
+			       __func__);
 			return 0;
 		}
 		/* Let the test fail with a more informative message */
diff --git a/tools/testing/selftests/bpf/prog_tests/spinlock.c b/tools/testing/selftests/bpf/prog_tests/spinlock.c
index deb2db5b85b0..114ebe6a438e 100644
--- a/tools/testing/selftests/bpf/prog_tests/spinlock.c
+++ b/tools/testing/selftests/bpf/prog_tests/spinlock.c
@@ -12,7 +12,7 @@ void test_spinlock(void)
 
 	err = bpf_prog_load(file, BPF_PROG_TYPE_CGROUP_SKB, &obj, &prog_fd);
 	if (err) {
-		test__printf("test_spin_lock:bpf_prog_load errno %d\n", errno);
+		printf("test_spin_lock:bpf_prog_load errno %d\n", errno);
 		goto close_prog;
 	}
 	for (i = 0; i < 4; i++)
diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c
index 356d2c017a9c..ac44fda84833 100644
--- a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c
+++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c
@@ -109,8 +109,8 @@ void test_stacktrace_build_id(void)
 	if (build_id_matches < 1 && retry--) {
 		bpf_link__destroy(link);
 		bpf_object__close(obj);
-		test__printf("%s:WARN:Didn't find expected build ID from the map, retrying\n",
-			     __func__);
+		printf("%s:WARN:Didn't find expected build ID from the map, retrying\n",
+		       __func__);
 		goto retry;
 	}
 
diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c
index f44f2c159714..9557b7dfb782 100644
--- a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c
+++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c
@@ -140,8 +140,8 @@ void test_stacktrace_build_id_nmi(void)
 	if (build_id_matches < 1 && retry--) {
 		bpf_link__destroy(link);
 		bpf_object__close(obj);
-		test__printf("%s:WARN:Didn't find expected build ID from the map, retrying\n",
-			     __func__);
+		printf("%s:WARN:Didn't find expected build ID from the map, retrying\n",
+		       __func__);
 		goto retry;
 	}
 
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_noinline.c b/tools/testing/selftests/bpf/prog_tests/xdp_noinline.c
index b5404494b8aa..15f7c272edb0 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_noinline.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_noinline.c
@@ -75,8 +75,8 @@ void test_xdp_noinline(void)
 	}
 	if (bytes != MAGIC_BYTES * NUM_ITER * 2 || pkts != NUM_ITER * 2) {
 		error_cnt++;
-		test__printf("test_xdp_noinline:FAIL:stats %lld %lld\n",
-			     bytes, pkts);
+		printf("test_xdp_noinline:FAIL:stats %lld %lld\n",
+		       bytes, pkts);
 	}
 out:
 	bpf_object__close(obj);
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index 6ea289ba307b..6c11c796ea1f 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -105,20 +105,6 @@ void test__force_log() {
 	env.test->force_log = true;
 }
 
-void test__vprintf(const char *fmt, va_list args)
-{
-	vprintf(fmt, args);
-}
-
-void test__printf(const char *fmt, ...)
-{
-	va_list args;
-
-	va_start(args, fmt);
-	test__vprintf(fmt, args);
-	va_end(args);
-}
-
 struct ipv4_packet pkt_v4 = {
 	.eth.h_proto = __bpf_constant_htons(ETH_P_IP),
 	.iph.ihl = 5,
@@ -310,7 +296,7 @@ static int libbpf_print_fn(enum libbpf_print_level level,
 {
 	if (!env.very_verbose && level == LIBBPF_DEBUG)
 		return 0;
-	test__vprintf(format, args);
+	vprintf(format, args);
 	return 0;
 }
 
diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h
index 541f9eab5eed..37d427f5a1e5 100644
--- a/tools/testing/selftests/bpf/test_progs.h
+++ b/tools/testing/selftests/bpf/test_progs.h
@@ -70,8 +70,6 @@ extern int error_cnt;
 extern int pass_cnt;
 extern struct test_env env;
 
-extern void test__printf(const char *fmt, ...);
-extern void test__vprintf(const char *fmt, va_list args);
 extern void test__force_log();
 extern bool test__start_subtest(const char *name);
 
@@ -97,12 +95,12 @@ extern struct ipv6_packet pkt_v6;
 	int __ret = !!(condition);					\
 	if (__ret) {							\
 		error_cnt++;						\
-		test__printf("%s:FAIL:%s ", __func__, tag);		\
-		test__printf(format);					\
+		printf("%s:FAIL:%s ", __func__, tag);			\
+		printf(format);						\
 	} else {							\
 		pass_cnt++;						\
-		test__printf("%s:PASS:%s %d nsec\n",			\
-			      __func__, tag, duration);			\
+		printf("%s:PASS:%s %d nsec\n",				\
+		       __func__, tag, duration);			\
 	}								\
 	__ret;								\
 })
-- 
2.22.0.770.g0f2c4a37fd-goog


^ permalink raw reply related

* [PATCH bpf-next v5 1/3] selftests/bpf: test_progs: switch to open_memstream
From: Stanislav Fomichev @ 2019-08-06 17:45 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko
In-Reply-To: <20190806174529.8341-1-sdf@google.com>

Use open_memstream to override stdout during test execution.
The copy of the original stdout is held in env.stdout and used
to print subtest info and dump failed log.

test_{v,}printf are now simple wrappers around stdout and will be
removed in the next patch.

v5:
* fix -v crash by always setting env.std{in,err} (Alexei Starovoitov)
* drop force_log check from stdio_hijack (Andrii Nakryiko)

v4:
* one field per line for stdout/stderr (Andrii Nakryiko)

v3:
* don't do strlen over log_buf, log_cnt has it already (Andrii Nakryiko)

v2:
* add ifdef __GLIBC__ around open_memstream (maybe pointless since
  we already depend on glibc for argp_parse)
* hijack stderr as well (Andrii Nakryiko)
* don't hijack for every test, do it once (Andrii Nakryiko)
* log_cap -> log_size (Andrii Nakryiko)
* do fseeko in a proper place (Andrii Nakryiko)
* check open_memstream returned value (Andrii Nakryiko)

Cc: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/testing/selftests/bpf/test_progs.c | 115 ++++++++++++-----------
 tools/testing/selftests/bpf/test_progs.h |   3 +-
 2 files changed, 62 insertions(+), 56 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index db00196c8315..6ea289ba307b 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -40,14 +40,20 @@ static bool should_run(struct test_selector *sel, int num, const char *name)
 
 static void dump_test_log(const struct prog_test_def *test, bool failed)
 {
+	if (stdout == env.stdout)
+		return;
+
+	fflush(stdout); /* exports env.log_buf & env.log_cnt */
+
 	if (env.verbose || test->force_log || failed) {
 		if (env.log_cnt) {
-			fprintf(stdout, "%s", env.log_buf);
+			fprintf(env.stdout, "%s", env.log_buf);
 			if (env.log_buf[env.log_cnt - 1] != '\n')
-				fprintf(stdout, "\n");
+				fprintf(env.stdout, "\n");
 		}
 	}
-	env.log_cnt = 0;
+
+	fseeko(stdout, 0, SEEK_SET); /* rewind */
 }
 
 void test__end_subtest()
@@ -62,7 +68,7 @@ void test__end_subtest()
 
 	dump_test_log(test, sub_error_cnt);
 
-	printf("#%d/%d %s:%s\n",
+	fprintf(env.stdout, "#%d/%d %s:%s\n",
 	       test->test_num, test->subtest_num,
 	       test->subtest_name, sub_error_cnt ? "FAIL" : "OK");
 }
@@ -79,7 +85,8 @@ bool test__start_subtest(const char *name)
 	test->subtest_num++;
 
 	if (!name || !name[0]) {
-		fprintf(stderr, "Subtest #%d didn't provide sub-test name!\n",
+		fprintf(env.stderr,
+			"Subtest #%d didn't provide sub-test name!\n",
 			test->subtest_num);
 		return false;
 	}
@@ -100,53 +107,7 @@ void test__force_log() {
 
 void test__vprintf(const char *fmt, va_list args)
 {
-	size_t rem_sz;
-	int ret = 0;
-
-	if (env.verbose || (env.test && env.test->force_log)) {
-		vfprintf(stderr, fmt, args);
-		return;
-	}
-
-try_again:
-	rem_sz = env.log_cap - env.log_cnt;
-	if (rem_sz) {
-		va_list ap;
-
-		va_copy(ap, args);
-		/* we reserved extra byte for \0 at the end */
-		ret = vsnprintf(env.log_buf + env.log_cnt, rem_sz + 1, fmt, ap);
-		va_end(ap);
-
-		if (ret < 0) {
-			env.log_buf[env.log_cnt] = '\0';
-			fprintf(stderr, "failed to log w/ fmt '%s'\n", fmt);
-			return;
-		}
-	}
-
-	if (!rem_sz || ret > rem_sz) {
-		size_t new_sz = env.log_cap * 3 / 2;
-		char *new_buf;
-
-		if (new_sz < 4096)
-			new_sz = 4096;
-		if (new_sz < ret + env.log_cnt)
-			new_sz = ret + env.log_cnt;
-
-		/* +1 for guaranteed space for terminating \0 */
-		new_buf = realloc(env.log_buf, new_sz + 1);
-		if (!new_buf) {
-			fprintf(stderr, "failed to realloc log buffer: %d\n",
-				errno);
-			return;
-		}
-		env.log_buf = new_buf;
-		env.log_cap = new_sz;
-		goto try_again;
-	}
-
-	env.log_cnt += ret;
+	vprintf(fmt, args);
 }
 
 void test__printf(const char *fmt, ...)
@@ -477,6 +438,48 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state)
 	return 0;
 }
 
+static void stdio_hijack(void)
+{
+#ifdef __GLIBC__
+	env.stdout = stdout;
+	env.stderr = stderr;
+
+	if (env.verbose) {
+		/* nothing to do, output to stdout by default */
+		return;
+	}
+
+	/* stdout and stderr -> buffer */
+	fflush(stdout);
+
+	stdout = open_memstream(&env.log_buf, &env.log_cnt);
+	if (!stdout) {
+		stdout = env.stdout;
+		perror("open_memstream");
+		return;
+	}
+
+	stderr = stdout;
+#endif
+}
+
+static void stdio_restore(void)
+{
+#ifdef __GLIBC__
+	if (stdout == env.stdout)
+		return;
+
+	fclose(stdout);
+	free(env.log_buf);
+
+	env.log_buf = NULL;
+	env.log_cnt = 0;
+
+	stdout = env.stdout;
+	stderr = env.stderr;
+#endif
+}
+
 int main(int argc, char **argv)
 {
 	static const struct argp argp = {
@@ -496,6 +499,7 @@ int main(int argc, char **argv)
 
 	env.jit_enabled = is_jit_enabled();
 
+	stdio_hijack();
 	for (i = 0; i < prog_test_cnt; i++) {
 		struct prog_test_def *test = &prog_test_defs[i];
 		int old_pass_cnt = pass_cnt;
@@ -523,13 +527,14 @@ int main(int argc, char **argv)
 
 		dump_test_log(test, test->error_cnt);
 
-		printf("#%d %s:%s\n", test->test_num, test->test_name,
-		       test->error_cnt ? "FAIL" : "OK");
+		fprintf(env.stdout, "#%d %s:%s\n",
+			test->test_num, test->test_name,
+			test->error_cnt ? "FAIL" : "OK");
 	}
+	stdio_restore();
 	printf("Summary: %d/%d PASSED, %d FAILED\n",
 	       env.succ_cnt, env.sub_succ_cnt, env.fail_cnt);
 
-	free(env.log_buf);
 	free(env.test_selector.num_set);
 	free(env.subtest_selector.num_set);
 
diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h
index afd14962456f..541f9eab5eed 100644
--- a/tools/testing/selftests/bpf/test_progs.h
+++ b/tools/testing/selftests/bpf/test_progs.h
@@ -56,9 +56,10 @@ struct test_env {
 	bool jit_enabled;
 
 	struct prog_test_def *test;
+	FILE *stdout;
+	FILE *stderr;
 	char *log_buf;
 	size_t log_cnt;
-	size_t log_cap;
 
 	int succ_cnt; /* successful tests */
 	int sub_succ_cnt; /* successful sub-tests */
-- 
2.22.0.770.g0f2c4a37fd-goog


^ permalink raw reply related

* [PATCH bpf-next v5 0/3] selftests/bpf: switch test_progs back to stdio
From: Stanislav Fomichev @ 2019-08-06 17:45 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko

I was looking into converting test_sockops* to test_progs framework
and that requires using cgroup_helpers.c which rely on stdio/stderr.
Let's use open_memstream to override stdout into buffer during
subtests instead of custom test_{v,}printf wrappers. That lets
us continue to use stdio in the subtests and dump it on failure
if required.

That would also fix bpf_find_map which currently uses printf to
signal failure (missed during test_printf conversion).

Cc: Andrii Nakryiko <andriin@fb.com>

Stanislav Fomichev (3):
  selftests/bpf: test_progs: switch to open_memstream
  selftests/bpf: test_progs: test__printf -> printf
  selftests/bpf: test_progs: drop extra trailing tab

 .../bpf/prog_tests/bpf_verif_scale.c          |   4 +-
 .../selftests/bpf/prog_tests/l4lb_all.c       |   2 +-
 .../selftests/bpf/prog_tests/map_lock.c       |  10 +-
 .../selftests/bpf/prog_tests/send_signal.c    |   4 +-
 .../selftests/bpf/prog_tests/spinlock.c       |   2 +-
 .../bpf/prog_tests/stacktrace_build_id.c      |   4 +-
 .../bpf/prog_tests/stacktrace_build_id_nmi.c  |   4 +-
 .../selftests/bpf/prog_tests/xdp_noinline.c   |   4 +-
 tools/testing/selftests/bpf/test_progs.c      | 131 ++++++++----------
 tools/testing/selftests/bpf/test_progs.h      |  13 +-
 10 files changed, 84 insertions(+), 94 deletions(-)

-- 
2.22.0.770.g0f2c4a37fd-goog

^ permalink raw reply

* Re: [PATCH v6 1/3] mm/gup: add make_dirty arg to put_user_pages_dirty_lock()
From: Ira Weiny @ 2019-08-06 17:40 UTC (permalink / raw)
  To: john.hubbard
  Cc: Andrew Morton, Alexander Viro, Björn Töpel,
	Boaz Harrosh, Christoph Hellwig, Daniel Vetter, Dan Williams,
	Dave Chinner, David Airlie, David S . Miller, Ilya Dryomov,
	Jan Kara, Jason Gunthorpe, Jens Axboe, Jérôme Glisse,
	Johannes Thumshirn, Magnus Karlsson, Matthew Wilcox,
	Miklos Szeredi, Ming Lei, Sage Weil, Santosh Shilimkar, Yan Zheng,
	netdev, dri-devel, linux-mm, linux-rdma, bpf, LKML, John Hubbard
In-Reply-To: <20190804214042.4564-2-jhubbard@nvidia.com>

On Sun, Aug 04, 2019 at 02:40:40PM -0700, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
> 
> Provide a more capable variation of put_user_pages_dirty_lock(),
> and delete put_user_pages_dirty(). This is based on the
> following:
> 
> 1. Lots of call sites become simpler if a bool is passed
> into put_user_page*(), instead of making the call site
> choose which put_user_page*() variant to call.
> 
> 2. Christoph Hellwig's observation that set_page_dirty_lock()
> is usually correct, and set_page_dirty() is usually a
> bug, or at least questionable, within a put_user_page*()
> calling chain.
> 
> This leads to the following API choices:
> 
>     * put_user_pages_dirty_lock(page, npages, make_dirty)
> 
>     * There is no put_user_pages_dirty(). You have to
>       hand code that, in the rare case that it's
>       required.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>

I assume this is superseded by the patch in the large series?

Ira

> ---
>  drivers/infiniband/core/umem.c             |   5 +-
>  drivers/infiniband/hw/hfi1/user_pages.c    |   5 +-
>  drivers/infiniband/hw/qib/qib_user_pages.c |  13 +--
>  drivers/infiniband/hw/usnic/usnic_uiom.c   |   5 +-
>  drivers/infiniband/sw/siw/siw_mem.c        |  19 +---
>  include/linux/mm.h                         |   5 +-
>  mm/gup.c                                   | 115 +++++++++------------
>  7 files changed, 61 insertions(+), 106 deletions(-)
> 
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> index 08da840ed7ee..965cf9dea71a 100644
> --- a/drivers/infiniband/core/umem.c
> +++ b/drivers/infiniband/core/umem.c
> @@ -54,10 +54,7 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d
>  
>  	for_each_sg_page(umem->sg_head.sgl, &sg_iter, umem->sg_nents, 0) {
>  		page = sg_page_iter_page(&sg_iter);
> -		if (umem->writable && dirty)
> -			put_user_pages_dirty_lock(&page, 1);
> -		else
> -			put_user_page(page);
> +		put_user_pages_dirty_lock(&page, 1, umem->writable && dirty);
>  	}
>  
>  	sg_free_table(&umem->sg_head);
> diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c
> index b89a9b9aef7a..469acb961fbd 100644
> --- a/drivers/infiniband/hw/hfi1/user_pages.c
> +++ b/drivers/infiniband/hw/hfi1/user_pages.c
> @@ -118,10 +118,7 @@ int hfi1_acquire_user_pages(struct mm_struct *mm, unsigned long vaddr, size_t np
>  void hfi1_release_user_pages(struct mm_struct *mm, struct page **p,
>  			     size_t npages, bool dirty)
>  {
> -	if (dirty)
> -		put_user_pages_dirty_lock(p, npages);
> -	else
> -		put_user_pages(p, npages);
> +	put_user_pages_dirty_lock(p, npages, dirty);
>  
>  	if (mm) { /* during close after signal, mm can be NULL */
>  		atomic64_sub(npages, &mm->pinned_vm);
> diff --git a/drivers/infiniband/hw/qib/qib_user_pages.c b/drivers/infiniband/hw/qib/qib_user_pages.c
> index bfbfbb7e0ff4..26c1fb8d45cc 100644
> --- a/drivers/infiniband/hw/qib/qib_user_pages.c
> +++ b/drivers/infiniband/hw/qib/qib_user_pages.c
> @@ -37,15 +37,6 @@
>  
>  #include "qib.h"
>  
> -static void __qib_release_user_pages(struct page **p, size_t num_pages,
> -				     int dirty)
> -{
> -	if (dirty)
> -		put_user_pages_dirty_lock(p, num_pages);
> -	else
> -		put_user_pages(p, num_pages);
> -}
> -
>  /**
>   * qib_map_page - a safety wrapper around pci_map_page()
>   *
> @@ -124,7 +115,7 @@ int qib_get_user_pages(unsigned long start_page, size_t num_pages,
>  
>  	return 0;
>  bail_release:
> -	__qib_release_user_pages(p, got, 0);
> +	put_user_pages_dirty_lock(p, got, false);
>  bail:
>  	atomic64_sub(num_pages, &current->mm->pinned_vm);
>  	return ret;
> @@ -132,7 +123,7 @@ int qib_get_user_pages(unsigned long start_page, size_t num_pages,
>  
>  void qib_release_user_pages(struct page **p, size_t num_pages)
>  {
> -	__qib_release_user_pages(p, num_pages, 1);
> +	put_user_pages_dirty_lock(p, num_pages, true);
>  
>  	/* during close after signal, mm can be NULL */
>  	if (current->mm)
> diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
> index 0b0237d41613..62e6ffa9ad78 100644
> --- a/drivers/infiniband/hw/usnic/usnic_uiom.c
> +++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
> @@ -75,10 +75,7 @@ static void usnic_uiom_put_pages(struct list_head *chunk_list, int dirty)
>  		for_each_sg(chunk->page_list, sg, chunk->nents, i) {
>  			page = sg_page(sg);
>  			pa = sg_phys(sg);
> -			if (dirty)
> -				put_user_pages_dirty_lock(&page, 1);
> -			else
> -				put_user_page(page);
> +			put_user_pages_dirty_lock(&page, 1, dirty);
>  			usnic_dbg("pa: %pa\n", &pa);
>  		}
>  		kfree(chunk);
> diff --git a/drivers/infiniband/sw/siw/siw_mem.c b/drivers/infiniband/sw/siw/siw_mem.c
> index 67171c82b0c4..1e197753bf2f 100644
> --- a/drivers/infiniband/sw/siw/siw_mem.c
> +++ b/drivers/infiniband/sw/siw/siw_mem.c
> @@ -60,20 +60,6 @@ struct siw_mem *siw_mem_id2obj(struct siw_device *sdev, int stag_index)
>  	return NULL;
>  }
>  
> -static void siw_free_plist(struct siw_page_chunk *chunk, int num_pages,
> -			   bool dirty)
> -{
> -	struct page **p = chunk->plist;
> -
> -	while (num_pages--) {
> -		if (!PageDirty(*p) && dirty)
> -			put_user_pages_dirty_lock(p, 1);
> -		else
> -			put_user_page(*p);
> -		p++;
> -	}
> -}
> -
>  void siw_umem_release(struct siw_umem *umem, bool dirty)
>  {
>  	struct mm_struct *mm_s = umem->owning_mm;
> @@ -82,8 +68,9 @@ void siw_umem_release(struct siw_umem *umem, bool dirty)
>  	for (i = 0; num_pages; i++) {
>  		int to_free = min_t(int, PAGES_PER_CHUNK, num_pages);
>  
> -		siw_free_plist(&umem->page_chunk[i], to_free,
> -			       umem->writable && dirty);
> +		put_user_pages_dirty_lock(umem->page_chunk[i].plist,
> +					  to_free,
> +					  umem->writable && dirty);
>  		kfree(umem->page_chunk[i].plist);
>  		num_pages -= to_free;
>  	}
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0334ca97c584..9759b6a24420 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1057,8 +1057,9 @@ static inline void put_user_page(struct page *page)
>  	put_page(page);
>  }
>  
> -void put_user_pages_dirty(struct page **pages, unsigned long npages);
> -void put_user_pages_dirty_lock(struct page **pages, unsigned long npages);
> +void put_user_pages_dirty_lock(struct page **pages, unsigned long npages,
> +			       bool make_dirty);
> +
>  void put_user_pages(struct page **pages, unsigned long npages);
>  
>  #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
> diff --git a/mm/gup.c b/mm/gup.c
> index 98f13ab37bac..7fefd7ab02c4 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -29,85 +29,70 @@ struct follow_page_context {
>  	unsigned int page_mask;
>  };
>  
> -typedef int (*set_dirty_func_t)(struct page *page);
> -
> -static void __put_user_pages_dirty(struct page **pages,
> -				   unsigned long npages,
> -				   set_dirty_func_t sdf)
> -{
> -	unsigned long index;
> -
> -	for (index = 0; index < npages; index++) {
> -		struct page *page = compound_head(pages[index]);
> -
> -		/*
> -		 * Checking PageDirty at this point may race with
> -		 * clear_page_dirty_for_io(), but that's OK. Two key cases:
> -		 *
> -		 * 1) This code sees the page as already dirty, so it skips
> -		 * the call to sdf(). That could happen because
> -		 * clear_page_dirty_for_io() called page_mkclean(),
> -		 * followed by set_page_dirty(). However, now the page is
> -		 * going to get written back, which meets the original
> -		 * intention of setting it dirty, so all is well:
> -		 * clear_page_dirty_for_io() goes on to call
> -		 * TestClearPageDirty(), and write the page back.
> -		 *
> -		 * 2) This code sees the page as clean, so it calls sdf().
> -		 * The page stays dirty, despite being written back, so it
> -		 * gets written back again in the next writeback cycle.
> -		 * This is harmless.
> -		 */
> -		if (!PageDirty(page))
> -			sdf(page);
> -
> -		put_user_page(page);
> -	}
> -}
> -
>  /**
> - * put_user_pages_dirty() - release and dirty an array of gup-pinned pages
> - * @pages:  array of pages to be marked dirty and released.
> + * put_user_pages_dirty_lock() - release and optionally dirty gup-pinned pages
> + * @pages:  array of pages to be maybe marked dirty, and definitely released.
>   * @npages: number of pages in the @pages array.
> + * @make_dirty: whether to mark the pages dirty
>   *
>   * "gup-pinned page" refers to a page that has had one of the get_user_pages()
>   * variants called on that page.
>   *
>   * For each page in the @pages array, make that page (or its head page, if a
> - * compound page) dirty, if it was previously listed as clean. Then, release
> - * the page using put_user_page().
> + * compound page) dirty, if @make_dirty is true, and if the page was previously
> + * listed as clean. In any case, releases all pages using put_user_page(),
> + * possibly via put_user_pages(), for the non-dirty case.
>   *
>   * Please see the put_user_page() documentation for details.
>   *
> - * set_page_dirty(), which does not lock the page, is used here.
> - * Therefore, it is the caller's responsibility to ensure that this is
> - * safe. If not, then put_user_pages_dirty_lock() should be called instead.
> + * set_page_dirty_lock() is used internally. If instead, set_page_dirty() is
> + * required, then the caller should a) verify that this is really correct,
> + * because _lock() is usually required, and b) hand code it:
> + * set_page_dirty_lock(), put_user_page().
>   *
>   */
> -void put_user_pages_dirty(struct page **pages, unsigned long npages)
> +void put_user_pages_dirty_lock(struct page **pages, unsigned long npages,
> +			       bool make_dirty)
>  {
> -	__put_user_pages_dirty(pages, npages, set_page_dirty);
> -}
> -EXPORT_SYMBOL(put_user_pages_dirty);
> +	unsigned long index;
>  
> -/**
> - * put_user_pages_dirty_lock() - release and dirty an array of gup-pinned pages
> - * @pages:  array of pages to be marked dirty and released.
> - * @npages: number of pages in the @pages array.
> - *
> - * For each page in the @pages array, make that page (or its head page, if a
> - * compound page) dirty, if it was previously listed as clean. Then, release
> - * the page using put_user_page().
> - *
> - * Please see the put_user_page() documentation for details.
> - *
> - * This is just like put_user_pages_dirty(), except that it invokes
> - * set_page_dirty_lock(), instead of set_page_dirty().
> - *
> - */
> -void put_user_pages_dirty_lock(struct page **pages, unsigned long npages)
> -{
> -	__put_user_pages_dirty(pages, npages, set_page_dirty_lock);
> +	/*
> +	 * TODO: this can be optimized for huge pages: if a series of pages is
> +	 * physically contiguous and part of the same compound page, then a
> +	 * single operation to the head page should suffice.
> +	 */
> +
> +	if (!make_dirty) {
> +		put_user_pages(pages, npages);
> +		return;
> +	}
> +
> +	for (index = 0; index < npages; index++) {
> +		struct page *page = compound_head(pages[index]);
> +		/*
> +		 * Checking PageDirty at this point may race with
> +		 * clear_page_dirty_for_io(), but that's OK. Two key
> +		 * cases:
> +		 *
> +		 * 1) This code sees the page as already dirty, so it
> +		 * skips the call to set_page_dirty(). That could happen
> +		 * because clear_page_dirty_for_io() called
> +		 * page_mkclean(), followed by set_page_dirty().
> +		 * However, now the page is going to get written back,
> +		 * which meets the original intention of setting it
> +		 * dirty, so all is well: clear_page_dirty_for_io() goes
> +		 * on to call TestClearPageDirty(), and write the page
> +		 * back.
> +		 *
> +		 * 2) This code sees the page as clean, so it calls
> +		 * set_page_dirty(). The page stays dirty, despite being
> +		 * written back, so it gets written back again in the
> +		 * next writeback cycle. This is harmless.
> +		 */
> +		if (!PageDirty(page))
> +			set_page_dirty_lock(page);
> +		put_user_page(page);
> +	}
>  }
>  EXPORT_SYMBOL(put_user_pages_dirty_lock);
>  
> -- 
> 2.22.0
> 

^ permalink raw reply

* Re: [PATCH 12/17] skge: no need to check return value of debugfs_create functions
From: Stephen Hemminger @ 2019-08-06 17:44 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: netdev, Mirko Lindner, David S. Miller
In-Reply-To: <20190806161128.31232-13-gregkh@linuxfoundation.org>

On Tue,  6 Aug 2019 18:11:23 +0200
Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:

> When calling debugfs functions, there is no need to ever check the
> return value.  The function can work or not, but the code logic should
> never do something different based on this.
> 
> Cc: Mirko Lindner <mlindner@marvell.com>
> Cc: Stephen Hemminger <stephen@networkplumber.org>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: netdev@vger.kernel.org
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Acked-by: Stephen Hemminger <stephen@networkplumber.org>

Did not pull card out of dust bin to test it though

^ permalink raw reply

* Re: [PATCH bpf-next v4 1/3] selftests/bpf: test_progs: switch to open_memstream
From: Stanislav Fomichev @ 2019-08-06 17:40 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Stanislav Fomichev, Networking, bpf, David S. Miller,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
In-Reply-To: <CAEf4BzYU6xfcPrHzz0p6dWL3_VM2mD9pKy3T-NfnuDUrd4RMDQ@mail.gmail.com>

On 08/06, Andrii Nakryiko wrote:
> On Tue, Aug 6, 2019 at 10:19 AM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > Use open_memstream to override stdout during test execution.
> > The copy of the original stdout is held in env.stdout and used
> > to print subtest info and dump failed log.
> >
> > test_{v,}printf are now simple wrappers around stdout and will be
> > removed in the next patch.
> >
> > v4:
> > * one field per line for stdout/stderr (Andrii Nakryiko)
> >
> > v3:
> > * don't do strlen over log_buf, log_cnt has it already (Andrii Nakryiko)
> >
> > v2:
> > * add ifdef __GLIBC__ around open_memstream (maybe pointless since
> >   we already depend on glibc for argp_parse)
> > * hijack stderr as well (Andrii Nakryiko)
> > * don't hijack for every test, do it once (Andrii Nakryiko)
> > * log_cap -> log_size (Andrii Nakryiko)
> > * do fseeko in a proper place (Andrii Nakryiko)
> > * check open_memstream returned value (Andrii Nakryiko)
> >
> > Cc: Andrii Nakryiko <andriin@fb.com>
> > Acked-by: Andrii Nakryiko <andriin@fb.com>
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  tools/testing/selftests/bpf/test_progs.c | 115 ++++++++++++-----------
> >  tools/testing/selftests/bpf/test_progs.h |   3 +-
> >  2 files changed, 62 insertions(+), 56 deletions(-)
> >
> > diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
> > index db00196c8315..9556439c607c 100644
> > --- a/tools/testing/selftests/bpf/test_progs.c
> > +++ b/tools/testing/selftests/bpf/test_progs.c
> > @@ -40,14 +40,20 @@ static bool should_run(struct test_selector *sel, int num, const char *name)
> >
> >  static void dump_test_log(const struct prog_test_def *test, bool failed)
> >  {
> > +       if (stdout == env.stdout)
> > +               return;
> > +
> > +       fflush(stdout); /* exports env.log_buf & env.log_cnt */
> > +
> >         if (env.verbose || test->force_log || failed) {
> >                 if (env.log_cnt) {
> > -                       fprintf(stdout, "%s", env.log_buf);
> > +                       fprintf(env.stdout, "%s", env.log_buf);
> >                         if (env.log_buf[env.log_cnt - 1] != '\n')
> > -                               fprintf(stdout, "\n");
> > +                               fprintf(env.stdout, "\n");
> >                 }
> >         }
> > -       env.log_cnt = 0;
> > +
> > +       fseeko(stdout, 0, SEEK_SET); /* rewind */
> >  }
> >
> >  void test__end_subtest()
> > @@ -62,7 +68,7 @@ void test__end_subtest()
> >
> >         dump_test_log(test, sub_error_cnt);
> >
> > -       printf("#%d/%d %s:%s\n",
> > +       fprintf(env.stdout, "#%d/%d %s:%s\n",
> >                test->test_num, test->subtest_num,
> >                test->subtest_name, sub_error_cnt ? "FAIL" : "OK");
> >  }
> > @@ -79,7 +85,8 @@ bool test__start_subtest(const char *name)
> >         test->subtest_num++;
> >
> >         if (!name || !name[0]) {
> > -               fprintf(stderr, "Subtest #%d didn't provide sub-test name!\n",
> > +               fprintf(env.stderr,
> > +                       "Subtest #%d didn't provide sub-test name!\n",
> >                         test->subtest_num);
> >                 return false;
> >         }
> > @@ -100,53 +107,7 @@ void test__force_log() {
> >
> >  void test__vprintf(const char *fmt, va_list args)
> >  {
> > -       size_t rem_sz;
> > -       int ret = 0;
> > -
> > -       if (env.verbose || (env.test && env.test->force_log)) {
> > -               vfprintf(stderr, fmt, args);
> > -               return;
> > -       }
> > -
> > -try_again:
> > -       rem_sz = env.log_cap - env.log_cnt;
> > -       if (rem_sz) {
> > -               va_list ap;
> > -
> > -               va_copy(ap, args);
> > -               /* we reserved extra byte for \0 at the end */
> > -               ret = vsnprintf(env.log_buf + env.log_cnt, rem_sz + 1, fmt, ap);
> > -               va_end(ap);
> > -
> > -               if (ret < 0) {
> > -                       env.log_buf[env.log_cnt] = '\0';
> > -                       fprintf(stderr, "failed to log w/ fmt '%s'\n", fmt);
> > -                       return;
> > -               }
> > -       }
> > -
> > -       if (!rem_sz || ret > rem_sz) {
> > -               size_t new_sz = env.log_cap * 3 / 2;
> > -               char *new_buf;
> > -
> > -               if (new_sz < 4096)
> > -                       new_sz = 4096;
> > -               if (new_sz < ret + env.log_cnt)
> > -                       new_sz = ret + env.log_cnt;
> > -
> > -               /* +1 for guaranteed space for terminating \0 */
> > -               new_buf = realloc(env.log_buf, new_sz + 1);
> > -               if (!new_buf) {
> > -                       fprintf(stderr, "failed to realloc log buffer: %d\n",
> > -                               errno);
> > -                       return;
> > -               }
> > -               env.log_buf = new_buf;
> > -               env.log_cap = new_sz;
> > -               goto try_again;
> > -       }
> > -
> > -       env.log_cnt += ret;
> > +       vprintf(fmt, args);
> >  }
> >
> >  void test__printf(const char *fmt, ...)
> > @@ -477,6 +438,48 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state)
> >         return 0;
> >  }
> >
> > +static void stdio_hijack(void)
> > +{
> > +#ifdef __GLIBC__
> > +       if (env.verbose || (env.test && env.test->force_log)) {
> 
> I just also realized that you don't need `(env.test &&
> env.test->force_log)` test. We hijack stdout/stderr before env.test is
> even set, so this does nothing anyways. Plus, force_log can be set in
> the middle of test/sub-test, yet we hijack stdout just once (or even
> if per-test), so it's still going to be "racy". Let's buffer output
> (unless it's env.verbose, which is important to not buffer because
> some tests will have huge output, when failing, so this allows to
> bypass using tons of memory for those, when debugging) and dump at the
> end.
Makes sense, will drop this test and resubmit along with a fix for '-v'
that Alexei discovered. Thanks!

> > +               /* nothing to do, output to stdout by default */
> > +               return;
> > +       }
> > +
> > +       /* stdout and stderr -> buffer */
> > +       fflush(stdout);
> > +
> > +       env.stdout = stdout;
> > +       env.stderr = stderr;
> > +
> > +       stdout = open_memstream(&env.log_buf, &env.log_cnt);
> > +       if (!stdout) {
> > +               stdout = env.stdout;
> > +               perror("open_memstream");
> > +               return;
> > +       }
> > +
> > +       stderr = stdout;
> > +#endif
> > +}
> > +
> > +static void stdio_restore(void)
> > +{
> > +#ifdef __GLIBC__
> > +       if (stdout == env.stdout)
> > +               return;
> > +
> > +       fclose(stdout);
> > +       free(env.log_buf);
> > +
> > +       env.log_buf = NULL;
> > +       env.log_cnt = 0;
> > +
> > +       stdout = env.stdout;
> > +       stderr = env.stderr;
> > +#endif
> > +}
> > +
> >  int main(int argc, char **argv)
> >  {
> >         static const struct argp argp = {
> > @@ -496,6 +499,7 @@ int main(int argc, char **argv)
> >
> >         env.jit_enabled = is_jit_enabled();
> >
> > +       stdio_hijack();
> >         for (i = 0; i < prog_test_cnt; i++) {
> >                 struct prog_test_def *test = &prog_test_defs[i];
> >                 int old_pass_cnt = pass_cnt;
> > @@ -523,13 +527,14 @@ int main(int argc, char **argv)
> >
> >                 dump_test_log(test, test->error_cnt);
> >
> > -               printf("#%d %s:%s\n", test->test_num, test->test_name,
> > -                      test->error_cnt ? "FAIL" : "OK");
> > +               fprintf(env.stdout, "#%d %s:%s\n",
> > +                       test->test_num, test->test_name,
> > +                       test->error_cnt ? "FAIL" : "OK");
> >         }
> > +       stdio_restore();
> >         printf("Summary: %d/%d PASSED, %d FAILED\n",
> >                env.succ_cnt, env.sub_succ_cnt, env.fail_cnt);
> >
> > -       free(env.log_buf);
> >         free(env.test_selector.num_set);
> >         free(env.subtest_selector.num_set);
> >
> > diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h
> > index afd14962456f..541f9eab5eed 100644
> > --- a/tools/testing/selftests/bpf/test_progs.h
> > +++ b/tools/testing/selftests/bpf/test_progs.h
> > @@ -56,9 +56,10 @@ struct test_env {
> >         bool jit_enabled;
> >
> >         struct prog_test_def *test;
> > +       FILE *stdout;
> > +       FILE *stderr;
> >         char *log_buf;
> >         size_t log_cnt;
> > -       size_t log_cap;
> >
> >         int succ_cnt; /* successful tests */
> >         int sub_succ_cnt; /* successful sub-tests */
> > --
> > 2.22.0.770.g0f2c4a37fd-goog
> >

^ permalink raw reply

* Re: [PATCH v2 01/34] mm/gup: add make_dirty arg to put_user_pages_dirty_lock()
From: Ira Weiny @ 2019-08-06 17:39 UTC (permalink / raw)
  To: john.hubbard
  Cc: Andrew Morton, Christoph Hellwig, Dan Williams, Dave Chinner,
	Dave Hansen, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Christoph Hellwig, Matthew Wilcox
In-Reply-To: <20190804224915.28669-2-jhubbard@nvidia.com>

On Sun, Aug 04, 2019 at 03:48:42PM -0700, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
> 
> Provide a more capable variation of put_user_pages_dirty_lock(),
> and delete put_user_pages_dirty(). This is based on the
> following:
> 
> 1. Lots of call sites become simpler if a bool is passed
> into put_user_page*(), instead of making the call site
> choose which put_user_page*() variant to call.
> 
> 2. Christoph Hellwig's observation that set_page_dirty_lock()
> is usually correct, and set_page_dirty() is usually a
> bug, or at least questionable, within a put_user_page*()
> calling chain.
> 
> This leads to the following API choices:
> 
>     * put_user_pages_dirty_lock(page, npages, make_dirty)
> 
>     * There is no put_user_pages_dirty(). You have to
>       hand code that, in the rare case that it's
>       required.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/infiniband/core/umem.c             |   5 +-
>  drivers/infiniband/hw/hfi1/user_pages.c    |   5 +-
>  drivers/infiniband/hw/qib/qib_user_pages.c |  13 +--
>  drivers/infiniband/hw/usnic/usnic_uiom.c   |   5 +-
>  drivers/infiniband/sw/siw/siw_mem.c        |  19 +---
>  include/linux/mm.h                         |   5 +-
>  mm/gup.c                                   | 115 +++++++++------------
>  7 files changed, 61 insertions(+), 106 deletions(-)
> 
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> index 08da840ed7ee..965cf9dea71a 100644
> --- a/drivers/infiniband/core/umem.c
> +++ b/drivers/infiniband/core/umem.c
> @@ -54,10 +54,7 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d
>  
>  	for_each_sg_page(umem->sg_head.sgl, &sg_iter, umem->sg_nents, 0) {
>  		page = sg_page_iter_page(&sg_iter);
> -		if (umem->writable && dirty)
> -			put_user_pages_dirty_lock(&page, 1);
> -		else
> -			put_user_page(page);
> +		put_user_pages_dirty_lock(&page, 1, umem->writable && dirty);
>  	}
>  
>  	sg_free_table(&umem->sg_head);
> diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c
> index b89a9b9aef7a..469acb961fbd 100644
> --- a/drivers/infiniband/hw/hfi1/user_pages.c
> +++ b/drivers/infiniband/hw/hfi1/user_pages.c
> @@ -118,10 +118,7 @@ int hfi1_acquire_user_pages(struct mm_struct *mm, unsigned long vaddr, size_t np
>  void hfi1_release_user_pages(struct mm_struct *mm, struct page **p,
>  			     size_t npages, bool dirty)
>  {
> -	if (dirty)
> -		put_user_pages_dirty_lock(p, npages);
> -	else
> -		put_user_pages(p, npages);
> +	put_user_pages_dirty_lock(p, npages, dirty);
>  
>  	if (mm) { /* during close after signal, mm can be NULL */
>  		atomic64_sub(npages, &mm->pinned_vm);
> diff --git a/drivers/infiniband/hw/qib/qib_user_pages.c b/drivers/infiniband/hw/qib/qib_user_pages.c
> index bfbfbb7e0ff4..26c1fb8d45cc 100644
> --- a/drivers/infiniband/hw/qib/qib_user_pages.c
> +++ b/drivers/infiniband/hw/qib/qib_user_pages.c
> @@ -37,15 +37,6 @@
>  
>  #include "qib.h"
>  
> -static void __qib_release_user_pages(struct page **p, size_t num_pages,
> -				     int dirty)
> -{
> -	if (dirty)
> -		put_user_pages_dirty_lock(p, num_pages);
> -	else
> -		put_user_pages(p, num_pages);
> -}
> -
>  /**
>   * qib_map_page - a safety wrapper around pci_map_page()
>   *
> @@ -124,7 +115,7 @@ int qib_get_user_pages(unsigned long start_page, size_t num_pages,
>  
>  	return 0;
>  bail_release:
> -	__qib_release_user_pages(p, got, 0);
> +	put_user_pages_dirty_lock(p, got, false);
>  bail:
>  	atomic64_sub(num_pages, &current->mm->pinned_vm);
>  	return ret;
> @@ -132,7 +123,7 @@ int qib_get_user_pages(unsigned long start_page, size_t num_pages,
>  
>  void qib_release_user_pages(struct page **p, size_t num_pages)
>  {
> -	__qib_release_user_pages(p, num_pages, 1);
> +	put_user_pages_dirty_lock(p, num_pages, true);
>  
>  	/* during close after signal, mm can be NULL */
>  	if (current->mm)
> diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
> index 0b0237d41613..62e6ffa9ad78 100644
> --- a/drivers/infiniband/hw/usnic/usnic_uiom.c
> +++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
> @@ -75,10 +75,7 @@ static void usnic_uiom_put_pages(struct list_head *chunk_list, int dirty)
>  		for_each_sg(chunk->page_list, sg, chunk->nents, i) {
>  			page = sg_page(sg);
>  			pa = sg_phys(sg);
> -			if (dirty)
> -				put_user_pages_dirty_lock(&page, 1);
> -			else
> -				put_user_page(page);
> +			put_user_pages_dirty_lock(&page, 1, dirty);
>  			usnic_dbg("pa: %pa\n", &pa);
>  		}
>  		kfree(chunk);
> diff --git a/drivers/infiniband/sw/siw/siw_mem.c b/drivers/infiniband/sw/siw/siw_mem.c
> index 67171c82b0c4..1e197753bf2f 100644
> --- a/drivers/infiniband/sw/siw/siw_mem.c
> +++ b/drivers/infiniband/sw/siw/siw_mem.c
> @@ -60,20 +60,6 @@ struct siw_mem *siw_mem_id2obj(struct siw_device *sdev, int stag_index)
>  	return NULL;
>  }
>  
> -static void siw_free_plist(struct siw_page_chunk *chunk, int num_pages,
> -			   bool dirty)
> -{
> -	struct page **p = chunk->plist;
> -
> -	while (num_pages--) {
> -		if (!PageDirty(*p) && dirty)
> -			put_user_pages_dirty_lock(p, 1);
> -		else
> -			put_user_page(*p);
> -		p++;
> -	}
> -}
> -
>  void siw_umem_release(struct siw_umem *umem, bool dirty)
>  {
>  	struct mm_struct *mm_s = umem->owning_mm;
> @@ -82,8 +68,9 @@ void siw_umem_release(struct siw_umem *umem, bool dirty)
>  	for (i = 0; num_pages; i++) {
>  		int to_free = min_t(int, PAGES_PER_CHUNK, num_pages);
>  
> -		siw_free_plist(&umem->page_chunk[i], to_free,
> -			       umem->writable && dirty);
> +		put_user_pages_dirty_lock(umem->page_chunk[i].plist,
> +					  to_free,
> +					  umem->writable && dirty);
>  		kfree(umem->page_chunk[i].plist);
>  		num_pages -= to_free;
>  	}
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0334ca97c584..9759b6a24420 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1057,8 +1057,9 @@ static inline void put_user_page(struct page *page)
>  	put_page(page);
>  }
>  
> -void put_user_pages_dirty(struct page **pages, unsigned long npages);
> -void put_user_pages_dirty_lock(struct page **pages, unsigned long npages);
> +void put_user_pages_dirty_lock(struct page **pages, unsigned long npages,
> +			       bool make_dirty);
> +
>  void put_user_pages(struct page **pages, unsigned long npages);
>  
>  #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
> diff --git a/mm/gup.c b/mm/gup.c
> index 98f13ab37bac..7fefd7ab02c4 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -29,85 +29,70 @@ struct follow_page_context {
>  	unsigned int page_mask;
>  };
>  
> -typedef int (*set_dirty_func_t)(struct page *page);
> -
> -static void __put_user_pages_dirty(struct page **pages,
> -				   unsigned long npages,
> -				   set_dirty_func_t sdf)
> -{
> -	unsigned long index;
> -
> -	for (index = 0; index < npages; index++) {
> -		struct page *page = compound_head(pages[index]);
> -
> -		/*
> -		 * Checking PageDirty at this point may race with
> -		 * clear_page_dirty_for_io(), but that's OK. Two key cases:
> -		 *
> -		 * 1) This code sees the page as already dirty, so it skips
> -		 * the call to sdf(). That could happen because
> -		 * clear_page_dirty_for_io() called page_mkclean(),
> -		 * followed by set_page_dirty(). However, now the page is
> -		 * going to get written back, which meets the original
> -		 * intention of setting it dirty, so all is well:
> -		 * clear_page_dirty_for_io() goes on to call
> -		 * TestClearPageDirty(), and write the page back.
> -		 *
> -		 * 2) This code sees the page as clean, so it calls sdf().
> -		 * The page stays dirty, despite being written back, so it
> -		 * gets written back again in the next writeback cycle.
> -		 * This is harmless.
> -		 */
> -		if (!PageDirty(page))
> -			sdf(page);
> -
> -		put_user_page(page);
> -	}
> -}
> -
>  /**
> - * put_user_pages_dirty() - release and dirty an array of gup-pinned pages
> - * @pages:  array of pages to be marked dirty and released.
> + * put_user_pages_dirty_lock() - release and optionally dirty gup-pinned pages
> + * @pages:  array of pages to be maybe marked dirty, and definitely released.

Better would be.

@pages:  array of pages to be put

>   * @npages: number of pages in the @pages array.
> + * @make_dirty: whether to mark the pages dirty
>   *
>   * "gup-pinned page" refers to a page that has had one of the get_user_pages()
>   * variants called on that page.
>   *
>   * For each page in the @pages array, make that page (or its head page, if a
> - * compound page) dirty, if it was previously listed as clean. Then, release
> - * the page using put_user_page().
> + * compound page) dirty, if @make_dirty is true, and if the page was previously
> + * listed as clean. In any case, releases all pages using put_user_page(),
> + * possibly via put_user_pages(), for the non-dirty case.

I don't think users of this interface need this level of detail.  I think
something like.

 * For each page in the @pages array, release the page.  If @make_dirty is
 * true, mark the page dirty prior to release.


>   *
>   * Please see the put_user_page() documentation for details.
>   *
> - * set_page_dirty(), which does not lock the page, is used here.
> - * Therefore, it is the caller's responsibility to ensure that this is
> - * safe. If not, then put_user_pages_dirty_lock() should be called instead.
> + * set_page_dirty_lock() is used internally. If instead, set_page_dirty() is
> + * required, then the caller should a) verify that this is really correct,
> + * because _lock() is usually required, and b) hand code it:
> + * set_page_dirty_lock(), put_user_page().
>   *
>   */
> -void put_user_pages_dirty(struct page **pages, unsigned long npages)
> +void put_user_pages_dirty_lock(struct page **pages, unsigned long npages,
> +			       bool make_dirty)
>  {
> -	__put_user_pages_dirty(pages, npages, set_page_dirty);
> -}
> -EXPORT_SYMBOL(put_user_pages_dirty);
> +	unsigned long index;
>  
> -/**
> - * put_user_pages_dirty_lock() - release and dirty an array of gup-pinned pages
> - * @pages:  array of pages to be marked dirty and released.
> - * @npages: number of pages in the @pages array.
> - *
> - * For each page in the @pages array, make that page (or its head page, if a
> - * compound page) dirty, if it was previously listed as clean. Then, release
> - * the page using put_user_page().
> - *
> - * Please see the put_user_page() documentation for details.
> - *
> - * This is just like put_user_pages_dirty(), except that it invokes
> - * set_page_dirty_lock(), instead of set_page_dirty().
> - *
> - */
> -void put_user_pages_dirty_lock(struct page **pages, unsigned long npages)
> -{
> -	__put_user_pages_dirty(pages, npages, set_page_dirty_lock);
> +	/*
> +	 * TODO: this can be optimized for huge pages: if a series of pages is
> +	 * physically contiguous and part of the same compound page, then a
> +	 * single operation to the head page should suffice.
> +	 */

I think this comment belongs to the for loop below...  or just something about
how to make this and put_user_pages() more efficient.  It is odd, that this is
the same comment as in put_user_pages()...

The code is good.  So... Other than the comments.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

Ira

> +
> +	if (!make_dirty) {
> +		put_user_pages(pages, npages);
> +		return;
> +	}
> +
> +	for (index = 0; index < npages; index++) {
> +		struct page *page = compound_head(pages[index]);
> +		/*
> +		 * Checking PageDirty at this point may race with
> +		 * clear_page_dirty_for_io(), but that's OK. Two key
> +		 * cases:
> +		 *
> +		 * 1) This code sees the page as already dirty, so it
> +		 * skips the call to set_page_dirty(). That could happen
> +		 * because clear_page_dirty_for_io() called
> +		 * page_mkclean(), followed by set_page_dirty().
> +		 * However, now the page is going to get written back,
> +		 * which meets the original intention of setting it
> +		 * dirty, so all is well: clear_page_dirty_for_io() goes
> +		 * on to call TestClearPageDirty(), and write the page
> +		 * back.
> +		 *
> +		 * 2) This code sees the page as clean, so it calls
> +		 * set_page_dirty(). The page stays dirty, despite being
> +		 * written back, so it gets written back again in the
> +		 * next writeback cycle. This is harmless.
> +		 */
> +		if (!PageDirty(page))
> +			set_page_dirty_lock(page);
> +		put_user_page(page);
> +	}
>  }
>  EXPORT_SYMBOL(put_user_pages_dirty_lock);
>  
> -- 
> 2.22.0
> 

^ permalink raw reply

* Re: [RFC] implicit per-namespace devlink instance to set kernel resource limitations
From: David Ahern @ 2019-08-06 17:38 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: davem, mlxsw, jakub.kicinski, andrew, f.fainelli, vivien.didelot,
	mkubecek, stephen, daniel, brouer, eric.dumazet
In-Reply-To: <20190806164036.GA2332@nanopsycho.orion>

On 8/6/19 10:40 AM, Jiri Pirko wrote:
> Hi all.
> 
> I just discussed this with DavidA and I would like to bring this to
> broader audience. David wants to limit kernel resources in network
> namespaces, for example fibs, fib rules, etc.
> 
> He claims that devlink api is rich enough to program this limitations
> as it already does for mlxsw hw resources for example. If we have this
> api for hardware, why don't to reuse it for the kernel and it's
> resources too?

The analogy is that a kernel is 'programmed' just like hardware, it has
resources just like hardware (e.g., memory) and those resources are
limited as well. So the resources consumed by fib entries, rules,
nexthops, etc should be controllable.

> 
> So the proposal is to have some new device, say "kernelnet", that would
> implicitly create per-namespace devlink instance. This devlink
> instance would be used to setup resource limits. Like:
> 
> devlink resource set kernelnet path /IPv4/fib size 96
> devlink -N ns1name resource set kernelnet path /IPv6/fib size 100
> devlink -N ns2name resource set kernelnet path /IPv4/fib-rules size 8
> 
> To me it sounds a bit odd for kernel namespace to act as a device, but
> thinking about it more, it makes sense. Probably better than to define
> a new api. User would use the same tool to work with kernel and hw.
> 
> Also we can implement other devlink functionality, like dpipe.
> User would then have visibility of network pipeline, tables,
> utilization, etc. It is related to the resources too.
> 
> What do you think?
> 
> Jiri
> 


^ permalink raw reply

* Re: [patch net-next v2 1/3] net: devlink: allow to change namespaces
From: David Ahern @ 2019-08-06 17:34 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Jakub Kicinski, netdev, davem, sthemmin, mlxsw
In-Reply-To: <20190805152019.GE2349@nanopsycho.orion>

On 8/5/19 9:20 AM, Jiri Pirko wrote:
> Mon, Aug 05, 2019 at 04:51:22PM CEST, dsahern@gmail.com wrote:
>> On 8/5/19 8:49 AM, Jiri Pirko wrote:
>>>> Your commit 5fc494225c1eb81309cc4c91f183cd30e4edb674 changed that from a
>>>> per-namepace accounting to all namespaces managed by a single devlink
>>>> instance in init_net - which is completely wrong.
>>> No. Not "all namespaces". Only the one where the devlink is. And that is
>>> always init_net, until this patchset.
>>>
>>>
>>
>> Jiri: your change to fib.c does not take into account namespace when
>> doing rules and routes accounting. you broke it. fix it.
> 
> What do you mean by "account namespace"? It's a device resource, why to
> tight it with namespace? What if you have 2 netdevsim-devlink instances
> in one namespace? Why the setting should be per-namespace?
> 

Jiri:

Here's an example of how your 5.2 change to netdevsim broke the resource
controller:

Create a netdevsim device:
$ modprobe netdevsim
$  echo "0 1" > /sys/bus/netdevsim/new_device

Get the current number of IPv4 routes:
$ n=$(ip -4 ro ls table all | wc -l)
$ echo $n
13

Prevent any more from being added. This limit should apply solely to
this namespace, init_net:

$ devlink resource set netdevsim/netdevsim0 path /IPv4/fib size $n
$ devlink dev reload netdevsim/netdevsim0
Error: netdevsim: New size is less than current occupancy.
devlink answers: Invalid argument

So that is the first breakage: accounting is off - maybe. Note there are
no other visible namespaces, but who knows what systemd or other
processes are doing with namespaces. Perhaps this accounting is another
example of your changes not properly handling namespaces:

$ devlink resource show netdevsim/netdevsim0
netdevsim/netdevsim0:
  name IPv4 size unlimited unit entry size_min 0 size_max unlimited
size_gran 1 dpipe_tables none
    resources:
      name fib size 13 occ 17 unit entry size_min 0 size_max unlimited
size_gran 1 dpipe_tables none
      name fib-rules size unlimited occ 6 unit entry size_min 0 size_max
unlimited size_gran 1 dpipe_tables none
  name IPv6 size unlimited unit entry size_min 0 size_max unlimited
size_gran 1 dpipe_tables none
    resources:
      name fib size unlimited occ 10 unit entry size_min 0 size_max
unlimited size_gran 1 dpipe_tables none
      name fib-rules size unlimited occ 4 unit entry size_min 0 size_max
unlimited size_gran 1 dpipe_tables none

So the occupancy does not match the tables for init_net.

Reset the max to 17, the current occupancy:
$ devlink resource set netdevsim/netdevsim0 path /IPv4/fib size 17
$ devlink dev reload netdevsim/netdevsim0
$ devlink resource show netdevsim/netdevsim0
netdevsim/netdevsim0:
  name IPv4 size unlimited unit entry size_min 0 size_max unlimited
size_gran 1 dpipe_tables none
    resources:
      name fib size 17 occ 17 unit entry size_min 0 size_max unlimited
size_gran 1 dpipe_tables none
      name fib-rules size unlimited occ 6 unit entry size_min 0 size_max
unlimited size_gran 1 dpipe_tables none
  name IPv6 size unlimited unit entry size_min 0 size_max unlimited
size_gran 1 dpipe_tables none
    resources:
      name fib size unlimited occ 10 unit entry size_min 0 size_max
unlimited size_gran 1 dpipe_tables none
      name fib-rules size unlimited occ 4 unit entry size_min 0 size_max
unlimited size_gran 1 dpipe_tables none

Create a new namespace, bring up lo which attempts to add more route
entries:
$ unshare -n
$ ip li set lo up

If you list routes you see the lo routes failed to installed because of
the limits, but it is a silent failure. Try to add a new route and you
see the cross namespace accounting now:
$ ip ro add 192.168.1.0/24 dev lo
Error: netdevsim: Exceeded number of supported fib entries.


Contrast that behavior with 5.1 and you see the new namespaces have no
bearing on accounting in init_net and limits in init_net do not affect
other namespaces.

That behavior needs to be restored in 5.2 and 5.3.

^ permalink raw reply

* Re: [PATCH 08/17] nfp: no need to check return value of debugfs_create functions
From: Jakub Kicinski @ 2019-08-06 17:30 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: netdev, David S. Miller, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Edwin Peer, Yangtao Li,
	Simon Horman, oss-drivers
In-Reply-To: <20190806170049.GA12269@kroah.com>

On Tue, 6 Aug 2019 19:00:49 +0200, Greg Kroah-Hartman wrote:
> On Tue, Aug 06, 2019 at 09:50:08AM -0700, Jakub Kicinski wrote:
> > On Tue,  6 Aug 2019 18:11:19 +0200, Greg Kroah-Hartman wrote:  
> > > When calling debugfs functions, there is no need to ever check the
> > > return value.  The function can work or not, but the code logic should
> > > never do something different based on this.
> > > 
> > > Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
> > > Cc: "David S. Miller" <davem@davemloft.net>
> > > Cc: Alexei Starovoitov <ast@kernel.org>
> > > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > > Cc: Jesper Dangaard Brouer <hawk@kernel.org>
> > > Cc: John Fastabend <john.fastabend@gmail.com>
> > > Cc: Edwin Peer <edwin.peer@netronome.com>
> > > Cc: Yangtao Li <tiny.windzz@gmail.com>
> > > Cc: Simon Horman <simon.horman@netronome.com>
> > > Cc: oss-drivers@netronome.com
> > > Cc: netdev@vger.kernel.org
> > > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>  
> > 
> > Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> > 
> > I take it this is the case since commit ff9fb72bc077 ("debugfs: return
> > error values, not NULL")? I.e. v5.0? It'd be useful to know for backport
> > purposes.  
> 
> You were always safe to ignore debugfs calls before that, but in 5.0 and
> then 5.2 we got a bit more "robust" with some internal debugfs logic to
> make it even easier.  These can be backported to 2.6.11+ if you really
> want to, no functionality should change.

Oh sorry! I meant vendor out-of-tree driver backport. We all maintain 
a tarball version of the drivers that compile on old kernels, I was
mostly wondering from that perspective.
 
> But why would you want to backport them?  This really isn't a "bugfix"
> for a stable kernel.  No one should ever noticed the difference except
> for less memory being used.

Right, it wouldn't really help to do an upstream backport.

^ permalink raw reply

* Re: [PATCH bpf-next v4 1/3] selftests/bpf: test_progs: switch to open_memstream
From: Andrii Nakryiko @ 2019-08-06 17:29 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, David S. Miller, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko
In-Reply-To: <20190806170901.142264-2-sdf@google.com>

On Tue, Aug 6, 2019 at 10:19 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> Use open_memstream to override stdout during test execution.
> The copy of the original stdout is held in env.stdout and used
> to print subtest info and dump failed log.
>
> test_{v,}printf are now simple wrappers around stdout and will be
> removed in the next patch.
>
> v4:
> * one field per line for stdout/stderr (Andrii Nakryiko)
>
> v3:
> * don't do strlen over log_buf, log_cnt has it already (Andrii Nakryiko)
>
> v2:
> * add ifdef __GLIBC__ around open_memstream (maybe pointless since
>   we already depend on glibc for argp_parse)
> * hijack stderr as well (Andrii Nakryiko)
> * don't hijack for every test, do it once (Andrii Nakryiko)
> * log_cap -> log_size (Andrii Nakryiko)
> * do fseeko in a proper place (Andrii Nakryiko)
> * check open_memstream returned value (Andrii Nakryiko)
>
> Cc: Andrii Nakryiko <andriin@fb.com>
> Acked-by: Andrii Nakryiko <andriin@fb.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  tools/testing/selftests/bpf/test_progs.c | 115 ++++++++++++-----------
>  tools/testing/selftests/bpf/test_progs.h |   3 +-
>  2 files changed, 62 insertions(+), 56 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
> index db00196c8315..9556439c607c 100644
> --- a/tools/testing/selftests/bpf/test_progs.c
> +++ b/tools/testing/selftests/bpf/test_progs.c
> @@ -40,14 +40,20 @@ static bool should_run(struct test_selector *sel, int num, const char *name)
>
>  static void dump_test_log(const struct prog_test_def *test, bool failed)
>  {
> +       if (stdout == env.stdout)
> +               return;
> +
> +       fflush(stdout); /* exports env.log_buf & env.log_cnt */
> +
>         if (env.verbose || test->force_log || failed) {
>                 if (env.log_cnt) {
> -                       fprintf(stdout, "%s", env.log_buf);
> +                       fprintf(env.stdout, "%s", env.log_buf);
>                         if (env.log_buf[env.log_cnt - 1] != '\n')
> -                               fprintf(stdout, "\n");
> +                               fprintf(env.stdout, "\n");
>                 }
>         }
> -       env.log_cnt = 0;
> +
> +       fseeko(stdout, 0, SEEK_SET); /* rewind */
>  }
>
>  void test__end_subtest()
> @@ -62,7 +68,7 @@ void test__end_subtest()
>
>         dump_test_log(test, sub_error_cnt);
>
> -       printf("#%d/%d %s:%s\n",
> +       fprintf(env.stdout, "#%d/%d %s:%s\n",
>                test->test_num, test->subtest_num,
>                test->subtest_name, sub_error_cnt ? "FAIL" : "OK");
>  }
> @@ -79,7 +85,8 @@ bool test__start_subtest(const char *name)
>         test->subtest_num++;
>
>         if (!name || !name[0]) {
> -               fprintf(stderr, "Subtest #%d didn't provide sub-test name!\n",
> +               fprintf(env.stderr,
> +                       "Subtest #%d didn't provide sub-test name!\n",
>                         test->subtest_num);
>                 return false;
>         }
> @@ -100,53 +107,7 @@ void test__force_log() {
>
>  void test__vprintf(const char *fmt, va_list args)
>  {
> -       size_t rem_sz;
> -       int ret = 0;
> -
> -       if (env.verbose || (env.test && env.test->force_log)) {
> -               vfprintf(stderr, fmt, args);
> -               return;
> -       }
> -
> -try_again:
> -       rem_sz = env.log_cap - env.log_cnt;
> -       if (rem_sz) {
> -               va_list ap;
> -
> -               va_copy(ap, args);
> -               /* we reserved extra byte for \0 at the end */
> -               ret = vsnprintf(env.log_buf + env.log_cnt, rem_sz + 1, fmt, ap);
> -               va_end(ap);
> -
> -               if (ret < 0) {
> -                       env.log_buf[env.log_cnt] = '\0';
> -                       fprintf(stderr, "failed to log w/ fmt '%s'\n", fmt);
> -                       return;
> -               }
> -       }
> -
> -       if (!rem_sz || ret > rem_sz) {
> -               size_t new_sz = env.log_cap * 3 / 2;
> -               char *new_buf;
> -
> -               if (new_sz < 4096)
> -                       new_sz = 4096;
> -               if (new_sz < ret + env.log_cnt)
> -                       new_sz = ret + env.log_cnt;
> -
> -               /* +1 for guaranteed space for terminating \0 */
> -               new_buf = realloc(env.log_buf, new_sz + 1);
> -               if (!new_buf) {
> -                       fprintf(stderr, "failed to realloc log buffer: %d\n",
> -                               errno);
> -                       return;
> -               }
> -               env.log_buf = new_buf;
> -               env.log_cap = new_sz;
> -               goto try_again;
> -       }
> -
> -       env.log_cnt += ret;
> +       vprintf(fmt, args);
>  }
>
>  void test__printf(const char *fmt, ...)
> @@ -477,6 +438,48 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state)
>         return 0;
>  }
>
> +static void stdio_hijack(void)
> +{
> +#ifdef __GLIBC__
> +       if (env.verbose || (env.test && env.test->force_log)) {

I just also realized that you don't need `(env.test &&
env.test->force_log)` test. We hijack stdout/stderr before env.test is
even set, so this does nothing anyways. Plus, force_log can be set in
the middle of test/sub-test, yet we hijack stdout just once (or even
if per-test), so it's still going to be "racy". Let's buffer output
(unless it's env.verbose, which is important to not buffer because
some tests will have huge output, when failing, so this allows to
bypass using tons of memory for those, when debugging) and dump at the
end.

> +               /* nothing to do, output to stdout by default */
> +               return;
> +       }
> +
> +       /* stdout and stderr -> buffer */
> +       fflush(stdout);
> +
> +       env.stdout = stdout;
> +       env.stderr = stderr;
> +
> +       stdout = open_memstream(&env.log_buf, &env.log_cnt);
> +       if (!stdout) {
> +               stdout = env.stdout;
> +               perror("open_memstream");
> +               return;
> +       }
> +
> +       stderr = stdout;
> +#endif
> +}
> +
> +static void stdio_restore(void)
> +{
> +#ifdef __GLIBC__
> +       if (stdout == env.stdout)
> +               return;
> +
> +       fclose(stdout);
> +       free(env.log_buf);
> +
> +       env.log_buf = NULL;
> +       env.log_cnt = 0;
> +
> +       stdout = env.stdout;
> +       stderr = env.stderr;
> +#endif
> +}
> +
>  int main(int argc, char **argv)
>  {
>         static const struct argp argp = {
> @@ -496,6 +499,7 @@ int main(int argc, char **argv)
>
>         env.jit_enabled = is_jit_enabled();
>
> +       stdio_hijack();
>         for (i = 0; i < prog_test_cnt; i++) {
>                 struct prog_test_def *test = &prog_test_defs[i];
>                 int old_pass_cnt = pass_cnt;
> @@ -523,13 +527,14 @@ int main(int argc, char **argv)
>
>                 dump_test_log(test, test->error_cnt);
>
> -               printf("#%d %s:%s\n", test->test_num, test->test_name,
> -                      test->error_cnt ? "FAIL" : "OK");
> +               fprintf(env.stdout, "#%d %s:%s\n",
> +                       test->test_num, test->test_name,
> +                       test->error_cnt ? "FAIL" : "OK");
>         }
> +       stdio_restore();
>         printf("Summary: %d/%d PASSED, %d FAILED\n",
>                env.succ_cnt, env.sub_succ_cnt, env.fail_cnt);
>
> -       free(env.log_buf);
>         free(env.test_selector.num_set);
>         free(env.subtest_selector.num_set);
>
> diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h
> index afd14962456f..541f9eab5eed 100644
> --- a/tools/testing/selftests/bpf/test_progs.h
> +++ b/tools/testing/selftests/bpf/test_progs.h
> @@ -56,9 +56,10 @@ struct test_env {
>         bool jit_enabled;
>
>         struct prog_test_def *test;
> +       FILE *stdout;
> +       FILE *stderr;
>         char *log_buf;
>         size_t log_cnt;
> -       size_t log_cap;
>
>         int succ_cnt; /* successful tests */
>         int sub_succ_cnt; /* successful sub-tests */
> --
> 2.22.0.770.g0f2c4a37fd-goog
>

^ permalink raw reply

* Re: [PATCH ethtool] ethtool: dump nested registers
From: Vivien Didelot @ 2019-08-06 17:24 UTC (permalink / raw)
  To: Michal Kubecek; +Cc: netdev, f.fainelli, andrew, davem, linville, cphealy
In-Reply-To: <20190806052002.GD31971@unicorn.suse.cz>

Hi Michal,

On Tue, 6 Aug 2019 07:20:02 +0200, Michal Kubecek <mkubecek@suse.cz> wrote:
> On Mon, Aug 05, 2019 at 10:52:16AM -0400, Vivien Didelot wrote:
> > Hi Michal!
> > 
> > On Mon, 5 Aug 2019 10:04:48 +0200, Michal Kubecek <mkubecek@suse.cz> wrote:
> > > On Fri, Aug 02, 2019 at 03:34:54PM -0400, Vivien Didelot wrote:
> > > > Usually kernel drivers set the regs->len value to the same length as
> > > > info->regdump_len, which was used for the allocation. In case where
> > > > regs->len is smaller than the allocated info->regdump_len length,
> > > > we may assume that the dump contains a nested set of registers.
> > > > 
> > > > This becomes handy for kernel drivers to expose registers of an
> > > > underlying network conduit unfortunately not exposed to userspace,
> > > > as found in network switching equipment for example.
> > > > 
> > > > This patch adds support for recursing into the dump operation if there
> > > > is enough room for a nested ethtool_drvinfo structure containing a
> > > > valid driver name, followed by a ethtool_regs structure like this:
> > > > 
> > > >     0      regs->len                        info->regdump_len
> > > >     v              v                                        v
> > > >     +--------------+-----------------+--------------+-- - --+
> > > >     | ethtool_regs | ethtool_drvinfo | ethtool_regs |       |
> > > >     +--------------+-----------------+--------------+-- - --+
> > > > 
> > > > Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
> > > > ---
> > > 
> > > I'm not sure about this approach. If these additional objects with their
> > > own registers are represented by a network device, we can query their
> > > registers directly. If they are not (which, IIUC, is the case in your
> > > use case), we should use an appropriate interface. AFAIK the CPU ports
> > > are already represented in devlink, shouldn't devlink be also used to
> > > query their registers?
> > 
> > Yet another interface wasn't that much appropriate for DSA, making the
> > stack unnecessarily complex.
> 
> AFAICS, there is already devlink support in dsa and CPU ports are
> presented as devlink ports. So it wouldn't be a completely new
> interface.
> 
> > In fact we are already glueing the statistics of the CPU port into the
> > master's ethtool ops (both physical ports are wired together).
> 
> The ethtool statistics (in general) are one big mess, I don't think it's
> an example worth following; rather one showing us what to avoid.
> 
> > Adding support for nested registers dump in ethtool makes it simple to
> > (pretty) dump CPU port's registers without too much userspace
> > addition.
> 
> It is indeed convenient for pretty print - but very hard to use for any
> automated processing. My point is that CPU port is not represented by
> a network device but it is already represented by a devlink port so that
> it makes much more sense to tie its register dump to that object than to
> add add it to register dump of port's master.

How would that be presented to userspace? The pretty printing for some
DSA switch ports (e.g. mv88e6xxx) are already added in ethtool. Is there
a devlink-ethtool glue of some kind, or should the pretty printing be
duplicated in yet another tool? I'd prefer to avoid the latter...

> In the future, I would like to transform the ethtool register dump from
> current opaque block of data to an actual dump of registers. It is
> unfortunate that drivers are already mixing information specific to
> a network device with information common for the whole physical device.
> Adding more data which is actually related to a different object would
> only make things worse.

I totally understand and I'm interested to follow this work.


Thanks,

	Vivien

^ permalink raw reply

* Re: [PATCH bpf-next v4 0/3] selftests/bpf: switch test_progs back to stdio
From: Alexei Starovoitov @ 2019-08-06 17:20 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Network Development, bpf, David S. Miller, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko
In-Reply-To: <20190806170901.142264-1-sdf@google.com>

On Tue, Aug 6, 2019 at 10:09 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> I was looking into converting test_sockops* to test_progs framework
> and that requires using cgroup_helpers.c which rely on stdio/stderr.
> Let's use open_memstream to override stdout into buffer during
> subtests instead of custom test_{v,}printf wrappers. That lets
> us continue to use stdio in the subtests and dump it on failure
> if required.
>
> That would also fix bpf_find_map which currently uses printf to
> signal failure (missed during test_printf conversion).
>
> Cc: Andrii Nakryiko <andriin@fb.com>

sadly test_progs -v now segfaults.
pls fix and resubmit.

^ permalink raw reply

* Re: [PATCH net v2] net: dsa: Check existence of .port_mdb_add callback before calling it
From: Vivien Didelot @ 2019-08-06 17:15 UTC (permalink / raw)
  To: Chen-Yu Tsai
  Cc: Andrew Lunn, Florian Fainelli, David S. Miller, Chen-Yu Tsai,
	netdev, linux-kernel
In-Reply-To: <20190806075325.9011-1-wens@kernel.org>

Hi Chen-Yu,

On Tue,  6 Aug 2019 15:53:25 +0800, Chen-Yu Tsai <wens@kernel.org> wrote:
> From: Chen-Yu Tsai <wens@csie.org>
> 
> With the recent addition of commit 75dad2520fc3 ("net: dsa: b53: Disable
> all ports on setup"), users of b53 (BCM53125 on Lamobo R1 in my case)
> are forced to use the dsa subsystem to enable the switch, instead of
> having it in the default transparent "forward-to-all" mode.
> 
> The b53 driver does not support mdb bitmap functions. However the dsa
> layer does not check for the existence of the .port_mdb_add callback
> before actually using it. This results in a NULL pointer dereference,
> as shown in the kernel oops below.
> 
> The other functions seem to be properly guarded. Do the same for
> .port_mdb_add in dsa_switch_mdb_add_bitmap() as well.
> 
> b53 is not the only driver that doesn't support mdb bitmap functions.
> Others include bcm_sf2, dsa_loop, lantiq_gswip, mt7530, mv88e6060,
> qca8k, realtek-smi, and vitesse-vsc73xx.

I don't know what you mean by that, there's no "mdb bitmap function"
support for drivers, only the port_mdb_{prepare,add,del} callbacks...
> 
>     8<--- cut here ---
>     Unable to handle kernel NULL pointer dereference at virtual address 00000000
>     pgd = (ptrval)
>     [00000000] *pgd=00000000
>     Internal error: Oops: 80000005 [#1] SMP ARM
>     Modules linked in: rtl8xxxu rtl8192cu rtl_usb rtl8192c_common rtlwifi mac80211 cfg80211
>     CPU: 1 PID: 134 Comm: kworker/1:2 Not tainted 5.3.0-rc1-00247-gd3519030752a #1
>     Hardware name: Allwinner sun7i (A20) Family
>     Workqueue: events switchdev_deferred_process_work
>     PC is at 0x0
>     LR is at dsa_switch_event+0x570/0x620
>     pc : [<00000000>]    lr : [<c08533ec>]    psr: 80070013
>     sp : ee871db8  ip : 00000000  fp : ee98d0a4
>     r10: 0000000c  r9 : 00000008  r8 : ee89f710
>     r7 : ee98d040  r6 : ee98d088  r5 : c0f04c48  r4 : ee98d04c
>     r3 : 00000000  r2 : ee89f710  r1 : 00000008  r0 : ee98d040
>     Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
>     Control: 10c5387d  Table: 6deb406a  DAC: 00000051
>     Process kworker/1:2 (pid: 134, stack limit = 0x(ptrval))
>     Stack: (0xee871db8 to 0xee872000)
>     1da0:                                                       ee871e14 103ace2d
>     1dc0: 00000000 ffffffff 00000000 ee871e14 00000005 00000000 c08524a0 00000000
>     1de0: ffffe000 c014bdfc c0f04c48 ee871e98 c0f04c48 ee9e5000 c0851120 c014bef0
>     1e00: 00000000 b643aea2 ee9b4068 c08509a8 ee2bf940 ee89f710 ee871ecb 00000000
>     1e20: 00000008 103ace2d 00000000 c087e248 ee29c868 103ace2d 00000001 ffffffff
>     1e40: 00000000 ee871e98 00000006 00000000 c0fb2a50 c087e2d0 ffffffff c08523c4
>     1e60: ffffffff c014bdfc 00000006 c0fad2d0 ee871e98 ee89f710 00000000 c014c500
>     1e80: 00000000 ee89f3c0 c0f04c48 00000000 ee9e5000 c087dfb4 ee9e5000 00000000
>     1ea0: ee89f710 ee871ecb 00000001 103ace2d 00000000 c0f04c48 00000000 c087e0a8
>     1ec0: 00000000 efd9a3e0 0089f3c0 103ace2d ee89f700 ee89f710 ee9e5000 00000122
>     1ee0: 00000100 c087e130 ee89f700 c0fad2c8 c1003ef0 c087de4c 2e928000 c0fad2ec
>     1f00: c0fad2ec ee839580 ef7a62c0 ef7a9400 00000000 c087def8 c0fad2ec c01447dc
>     1f20: ef315640 ef7a62c0 00000008 ee839580 ee839594 ef7a62c0 00000008 c0f03d00
>     1f40: ef7a62d8 ef7a62c0 ffffe000 c0145b84 ffffe000 c0fb2420 c0bfaa8c 00000000
>     1f60: ffffe000 ee84b600 ee84b5c0 00000000 ee870000 ee839580 c0145b40 ef0e5ea4
>     1f80: ee84b61c c014a6f8 00000001 ee84b5c0 c014a5b0 00000000 00000000 00000000
>     1fa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000
>     1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>     1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
>     [<c08533ec>] (dsa_switch_event) from [<c014bdfc>] (notifier_call_chain+0x48/0x84)
>     [<c014bdfc>] (notifier_call_chain) from [<c014bef0>] (raw_notifier_call_chain+0x18/0x20)
>     [<c014bef0>] (raw_notifier_call_chain) from [<c08509a8>] (dsa_port_mdb_add+0x48/0x74)
>     [<c08509a8>] (dsa_port_mdb_add) from [<c087e248>] (__switchdev_handle_port_obj_add+0x54/0xd4)
>     [<c087e248>] (__switchdev_handle_port_obj_add) from [<c087e2d0>] (switchdev_handle_port_obj_add+0x8/0x14)
>     [<c087e2d0>] (switchdev_handle_port_obj_add) from [<c08523c4>] (dsa_slave_switchdev_blocking_event+0x94/0xa4)
>     [<c08523c4>] (dsa_slave_switchdev_blocking_event) from [<c014bdfc>] (notifier_call_chain+0x48/0x84)
>     [<c014bdfc>] (notifier_call_chain) from [<c014c500>] (blocking_notifier_call_chain+0x50/0x68)
>     [<c014c500>] (blocking_notifier_call_chain) from [<c087dfb4>] (switchdev_port_obj_notify+0x44/0xa8)
>     [<c087dfb4>] (switchdev_port_obj_notify) from [<c087e0a8>] (switchdev_port_obj_add_now+0x90/0x104)
>     [<c087e0a8>] (switchdev_port_obj_add_now) from [<c087e130>] (switchdev_port_obj_add_deferred+0x14/0x5c)
>     [<c087e130>] (switchdev_port_obj_add_deferred) from [<c087de4c>] (switchdev_deferred_process+0x64/0x104)
>     [<c087de4c>] (switchdev_deferred_process) from [<c087def8>] (switchdev_deferred_process_work+0xc/0x14)
>     [<c087def8>] (switchdev_deferred_process_work) from [<c01447dc>] (process_one_work+0x218/0x50c)
>     [<c01447dc>] (process_one_work) from [<c0145b84>] (worker_thread+0x44/0x5bc)
>     [<c0145b84>] (worker_thread) from [<c014a6f8>] (kthread+0x148/0x150)
>     [<c014a6f8>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
>     Exception stack(0xee871fb0 to 0xee871ff8)
>     1fa0:                                     00000000 00000000 00000000 00000000
>     1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>     1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
>     Code: bad PC value
>     ---[ end trace 1292c61abd17b130 ]---
> 
>     [<c08533ec>] (dsa_switch_event) from [<c014bdfc>] (notifier_call_chain+0x48/0x84)
>     corresponds to
> 
> 	$ arm-linux-gnueabihf-addr2line -C -i -e vmlinux c08533ec
> 
> 	linux/net/dsa/switch.c:156
> 	linux/net/dsa/switch.c:178
> 	linux/net/dsa/switch.c:328
> 
> Fixes: e6db98db8a95 ("net: dsa: add switch mdb bitmap functions")
> Signed-off-by: Chen-Yu Tsai <wens@csie.org>
> ---
> Changes since v1:
> 
>   - Moved the check to the beginning of dsa_switch_mdb_add()
> 
> Looks like we could also move the ops check out of
> dsa_switch_mdb_prepare_bitmap(), though I suppose keeping the code the
> way it is now is clearer.
> 
> ---
>  net/dsa/switch.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/net/dsa/switch.c b/net/dsa/switch.c
> index 4ec5b7f85d51..231af5268656 100644
> --- a/net/dsa/switch.c
> +++ b/net/dsa/switch.c
> @@ -164,6 +164,9 @@ static int dsa_switch_mdb_add(struct dsa_switch *ds,
>  	struct switchdev_trans *trans = info->trans;
>  	int port;
>  
> +	if (!ds->ops->port_mdb_add)
> +		return -EOPNOTSUPP;
> +
>  	/* Build a mask of Multicast group members */
>  	bitmap_zero(ds->bitmap, ds->num_ports);
>  	if (ds->index == info->sw_index)
> -- 
> 2.20.1
> 

I don't understand the crash here, nor the fix. dsa_switch_mdb_add()
is supposed to be called through switchdev with a prepare phase,
which checks for ds->ops->port_mdb_add. Do you mean that a switchdev
MDB object is added somewhere without a prepare phase? If that's
the case, this is what the commit message must say. Then the
ds->ops->port_mdb_add check must go where it is used, that is to say
at the beginning of dsa_switch_mdb_add_bitmap() (similarly to what
dsa_switch_mdb_prepare_bitmap() does), not in dsa_switch_mdb_add.


Thanks,

	Vivien

^ permalink raw reply

* [PATCH bpf-next v4 3/3] selftests/bpf: test_progs: drop extra trailing tab
From: Stanislav Fomichev @ 2019-08-06 17:09 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko
In-Reply-To: <20190806170901.142264-1-sdf@google.com>

Small (un)related cleanup.

Cc: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/testing/selftests/bpf/test_progs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index 963912008042..beed74043933 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -278,7 +278,7 @@ enum ARG_KEYS {
 	ARG_VERIFIER_STATS = 's',
 	ARG_VERBOSE = 'v',
 };
-	
+
 static const struct argp_option opts[] = {
 	{ "num", ARG_TEST_NUM, "NUM", 0,
 	  "Run test number NUM only " },
-- 
2.22.0.770.g0f2c4a37fd-goog


^ permalink raw reply related

* [PATCH bpf-next v4 2/3] selftests/bpf: test_progs: test__printf -> printf
From: Stanislav Fomichev @ 2019-08-06 17:09 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko
In-Reply-To: <20190806170901.142264-1-sdf@google.com>

Now that test__printf is a simple wraper around printf, let's drop it
(and test__vprintf as well).

Cc: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../selftests/bpf/prog_tests/bpf_verif_scale.c   |  4 ++--
 .../testing/selftests/bpf/prog_tests/l4lb_all.c  |  2 +-
 .../testing/selftests/bpf/prog_tests/map_lock.c  | 10 +++++-----
 .../selftests/bpf/prog_tests/send_signal.c       |  4 ++--
 .../testing/selftests/bpf/prog_tests/spinlock.c  |  2 +-
 .../bpf/prog_tests/stacktrace_build_id.c         |  4 ++--
 .../bpf/prog_tests/stacktrace_build_id_nmi.c     |  4 ++--
 .../selftests/bpf/prog_tests/xdp_noinline.c      |  4 ++--
 tools/testing/selftests/bpf/test_progs.c         | 16 +---------------
 tools/testing/selftests/bpf/test_progs.h         | 10 ++++------
 10 files changed, 22 insertions(+), 38 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
index b4be96162ff4..3548ba2f24a8 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
@@ -5,13 +5,13 @@ static int libbpf_debug_print(enum libbpf_print_level level,
 			      const char *format, va_list args)
 {
 	if (level != LIBBPF_DEBUG) {
-		test__vprintf(format, args);
+		vprintf(format, args);
 		return 0;
 	}
 
 	if (!strstr(format, "verifier log"))
 		return 0;
-	test__vprintf("%s", args);
+	vprintf("%s", args);
 	return 0;
 }
 
diff --git a/tools/testing/selftests/bpf/prog_tests/l4lb_all.c b/tools/testing/selftests/bpf/prog_tests/l4lb_all.c
index 5ce572c03a5f..20ddca830e68 100644
--- a/tools/testing/selftests/bpf/prog_tests/l4lb_all.c
+++ b/tools/testing/selftests/bpf/prog_tests/l4lb_all.c
@@ -74,7 +74,7 @@ static void test_l4lb(const char *file)
 	}
 	if (bytes != MAGIC_BYTES * NUM_ITER * 2 || pkts != NUM_ITER * 2) {
 		error_cnt++;
-		test__printf("test_l4lb:FAIL:stats %lld %lld\n", bytes, pkts);
+		printf("test_l4lb:FAIL:stats %lld %lld\n", bytes, pkts);
 	}
 out:
 	bpf_object__close(obj);
diff --git a/tools/testing/selftests/bpf/prog_tests/map_lock.c b/tools/testing/selftests/bpf/prog_tests/map_lock.c
index 2e78217ed3fd..ee99368c595c 100644
--- a/tools/testing/selftests/bpf/prog_tests/map_lock.c
+++ b/tools/testing/selftests/bpf/prog_tests/map_lock.c
@@ -9,12 +9,12 @@ static void *parallel_map_access(void *arg)
 	for (i = 0; i < 10000; i++) {
 		err = bpf_map_lookup_elem_flags(map_fd, &key, vars, BPF_F_LOCK);
 		if (err) {
-			test__printf("lookup failed\n");
+			printf("lookup failed\n");
 			error_cnt++;
 			goto out;
 		}
 		if (vars[0] != 0) {
-			test__printf("lookup #%d var[0]=%d\n", i, vars[0]);
+			printf("lookup #%d var[0]=%d\n", i, vars[0]);
 			error_cnt++;
 			goto out;
 		}
@@ -22,8 +22,8 @@ static void *parallel_map_access(void *arg)
 		for (j = 2; j < 17; j++) {
 			if (vars[j] == rnd)
 				continue;
-			test__printf("lookup #%d var[1]=%d var[%d]=%d\n",
-				     i, rnd, j, vars[j]);
+			printf("lookup #%d var[1]=%d var[%d]=%d\n",
+			       i, rnd, j, vars[j]);
 			error_cnt++;
 			goto out;
 		}
@@ -43,7 +43,7 @@ void test_map_lock(void)
 
 	err = bpf_prog_load(file, BPF_PROG_TYPE_CGROUP_SKB, &obj, &prog_fd);
 	if (err) {
-		test__printf("test_map_lock:bpf_prog_load errno %d\n", errno);
+		printf("test_map_lock:bpf_prog_load errno %d\n", errno);
 		goto close_prog;
 	}
 	map_fd[0] = bpf_find_map(__func__, obj, "hash_map");
diff --git a/tools/testing/selftests/bpf/prog_tests/send_signal.c b/tools/testing/selftests/bpf/prog_tests/send_signal.c
index 461b423d0584..1575f0a1f586 100644
--- a/tools/testing/selftests/bpf/prog_tests/send_signal.c
+++ b/tools/testing/selftests/bpf/prog_tests/send_signal.c
@@ -202,8 +202,8 @@ static int test_send_signal_nmi(void)
 			 -1 /* cpu */, -1 /* group_fd */, 0 /* flags */);
 	if (pmu_fd == -1) {
 		if (errno == ENOENT) {
-			test__printf("%s:SKIP:no PERF_COUNT_HW_CPU_CYCLES\n",
-				     __func__);
+			printf("%s:SKIP:no PERF_COUNT_HW_CPU_CYCLES\n",
+			       __func__);
 			return 0;
 		}
 		/* Let the test fail with a more informative message */
diff --git a/tools/testing/selftests/bpf/prog_tests/spinlock.c b/tools/testing/selftests/bpf/prog_tests/spinlock.c
index deb2db5b85b0..114ebe6a438e 100644
--- a/tools/testing/selftests/bpf/prog_tests/spinlock.c
+++ b/tools/testing/selftests/bpf/prog_tests/spinlock.c
@@ -12,7 +12,7 @@ void test_spinlock(void)
 
 	err = bpf_prog_load(file, BPF_PROG_TYPE_CGROUP_SKB, &obj, &prog_fd);
 	if (err) {
-		test__printf("test_spin_lock:bpf_prog_load errno %d\n", errno);
+		printf("test_spin_lock:bpf_prog_load errno %d\n", errno);
 		goto close_prog;
 	}
 	for (i = 0; i < 4; i++)
diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c
index 356d2c017a9c..ac44fda84833 100644
--- a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c
+++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id.c
@@ -109,8 +109,8 @@ void test_stacktrace_build_id(void)
 	if (build_id_matches < 1 && retry--) {
 		bpf_link__destroy(link);
 		bpf_object__close(obj);
-		test__printf("%s:WARN:Didn't find expected build ID from the map, retrying\n",
-			     __func__);
+		printf("%s:WARN:Didn't find expected build ID from the map, retrying\n",
+		       __func__);
 		goto retry;
 	}
 
diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c
index f44f2c159714..9557b7dfb782 100644
--- a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c
+++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c
@@ -140,8 +140,8 @@ void test_stacktrace_build_id_nmi(void)
 	if (build_id_matches < 1 && retry--) {
 		bpf_link__destroy(link);
 		bpf_object__close(obj);
-		test__printf("%s:WARN:Didn't find expected build ID from the map, retrying\n",
-			     __func__);
+		printf("%s:WARN:Didn't find expected build ID from the map, retrying\n",
+		       __func__);
 		goto retry;
 	}
 
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_noinline.c b/tools/testing/selftests/bpf/prog_tests/xdp_noinline.c
index b5404494b8aa..15f7c272edb0 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_noinline.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_noinline.c
@@ -75,8 +75,8 @@ void test_xdp_noinline(void)
 	}
 	if (bytes != MAGIC_BYTES * NUM_ITER * 2 || pkts != NUM_ITER * 2) {
 		error_cnt++;
-		test__printf("test_xdp_noinline:FAIL:stats %lld %lld\n",
-			     bytes, pkts);
+		printf("test_xdp_noinline:FAIL:stats %lld %lld\n",
+		       bytes, pkts);
 	}
 out:
 	bpf_object__close(obj);
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index 9556439c607c..963912008042 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -105,20 +105,6 @@ void test__force_log() {
 	env.test->force_log = true;
 }
 
-void test__vprintf(const char *fmt, va_list args)
-{
-	vprintf(fmt, args);
-}
-
-void test__printf(const char *fmt, ...)
-{
-	va_list args;
-
-	va_start(args, fmt);
-	test__vprintf(fmt, args);
-	va_end(args);
-}
-
 struct ipv4_packet pkt_v4 = {
 	.eth.h_proto = __bpf_constant_htons(ETH_P_IP),
 	.iph.ihl = 5,
@@ -310,7 +296,7 @@ static int libbpf_print_fn(enum libbpf_print_level level,
 {
 	if (!env.very_verbose && level == LIBBPF_DEBUG)
 		return 0;
-	test__vprintf(format, args);
+	vprintf(format, args);
 	return 0;
 }
 
diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h
index 541f9eab5eed..37d427f5a1e5 100644
--- a/tools/testing/selftests/bpf/test_progs.h
+++ b/tools/testing/selftests/bpf/test_progs.h
@@ -70,8 +70,6 @@ extern int error_cnt;
 extern int pass_cnt;
 extern struct test_env env;
 
-extern void test__printf(const char *fmt, ...);
-extern void test__vprintf(const char *fmt, va_list args);
 extern void test__force_log();
 extern bool test__start_subtest(const char *name);
 
@@ -97,12 +95,12 @@ extern struct ipv6_packet pkt_v6;
 	int __ret = !!(condition);					\
 	if (__ret) {							\
 		error_cnt++;						\
-		test__printf("%s:FAIL:%s ", __func__, tag);		\
-		test__printf(format);					\
+		printf("%s:FAIL:%s ", __func__, tag);			\
+		printf(format);						\
 	} else {							\
 		pass_cnt++;						\
-		test__printf("%s:PASS:%s %d nsec\n",			\
-			      __func__, tag, duration);			\
+		printf("%s:PASS:%s %d nsec\n",				\
+		       __func__, tag, duration);			\
 	}								\
 	__ret;								\
 })
-- 
2.22.0.770.g0f2c4a37fd-goog


^ permalink raw reply related

* [PATCH bpf-next v4 1/3] selftests/bpf: test_progs: switch to open_memstream
From: Stanislav Fomichev @ 2019-08-06 17:08 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev, Andrii Nakryiko
In-Reply-To: <20190806170901.142264-1-sdf@google.com>

Use open_memstream to override stdout during test execution.
The copy of the original stdout is held in env.stdout and used
to print subtest info and dump failed log.

test_{v,}printf are now simple wrappers around stdout and will be
removed in the next patch.

v4:
* one field per line for stdout/stderr (Andrii Nakryiko)

v3:
* don't do strlen over log_buf, log_cnt has it already (Andrii Nakryiko)

v2:
* add ifdef __GLIBC__ around open_memstream (maybe pointless since
  we already depend on glibc for argp_parse)
* hijack stderr as well (Andrii Nakryiko)
* don't hijack for every test, do it once (Andrii Nakryiko)
* log_cap -> log_size (Andrii Nakryiko)
* do fseeko in a proper place (Andrii Nakryiko)
* check open_memstream returned value (Andrii Nakryiko)

Cc: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/testing/selftests/bpf/test_progs.c | 115 ++++++++++++-----------
 tools/testing/selftests/bpf/test_progs.h |   3 +-
 2 files changed, 62 insertions(+), 56 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index db00196c8315..9556439c607c 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -40,14 +40,20 @@ static bool should_run(struct test_selector *sel, int num, const char *name)
 
 static void dump_test_log(const struct prog_test_def *test, bool failed)
 {
+	if (stdout == env.stdout)
+		return;
+
+	fflush(stdout); /* exports env.log_buf & env.log_cnt */
+
 	if (env.verbose || test->force_log || failed) {
 		if (env.log_cnt) {
-			fprintf(stdout, "%s", env.log_buf);
+			fprintf(env.stdout, "%s", env.log_buf);
 			if (env.log_buf[env.log_cnt - 1] != '\n')
-				fprintf(stdout, "\n");
+				fprintf(env.stdout, "\n");
 		}
 	}
-	env.log_cnt = 0;
+
+	fseeko(stdout, 0, SEEK_SET); /* rewind */
 }
 
 void test__end_subtest()
@@ -62,7 +68,7 @@ void test__end_subtest()
 
 	dump_test_log(test, sub_error_cnt);
 
-	printf("#%d/%d %s:%s\n",
+	fprintf(env.stdout, "#%d/%d %s:%s\n",
 	       test->test_num, test->subtest_num,
 	       test->subtest_name, sub_error_cnt ? "FAIL" : "OK");
 }
@@ -79,7 +85,8 @@ bool test__start_subtest(const char *name)
 	test->subtest_num++;
 
 	if (!name || !name[0]) {
-		fprintf(stderr, "Subtest #%d didn't provide sub-test name!\n",
+		fprintf(env.stderr,
+			"Subtest #%d didn't provide sub-test name!\n",
 			test->subtest_num);
 		return false;
 	}
@@ -100,53 +107,7 @@ void test__force_log() {
 
 void test__vprintf(const char *fmt, va_list args)
 {
-	size_t rem_sz;
-	int ret = 0;
-
-	if (env.verbose || (env.test && env.test->force_log)) {
-		vfprintf(stderr, fmt, args);
-		return;
-	}
-
-try_again:
-	rem_sz = env.log_cap - env.log_cnt;
-	if (rem_sz) {
-		va_list ap;
-
-		va_copy(ap, args);
-		/* we reserved extra byte for \0 at the end */
-		ret = vsnprintf(env.log_buf + env.log_cnt, rem_sz + 1, fmt, ap);
-		va_end(ap);
-
-		if (ret < 0) {
-			env.log_buf[env.log_cnt] = '\0';
-			fprintf(stderr, "failed to log w/ fmt '%s'\n", fmt);
-			return;
-		}
-	}
-
-	if (!rem_sz || ret > rem_sz) {
-		size_t new_sz = env.log_cap * 3 / 2;
-		char *new_buf;
-
-		if (new_sz < 4096)
-			new_sz = 4096;
-		if (new_sz < ret + env.log_cnt)
-			new_sz = ret + env.log_cnt;
-
-		/* +1 for guaranteed space for terminating \0 */
-		new_buf = realloc(env.log_buf, new_sz + 1);
-		if (!new_buf) {
-			fprintf(stderr, "failed to realloc log buffer: %d\n",
-				errno);
-			return;
-		}
-		env.log_buf = new_buf;
-		env.log_cap = new_sz;
-		goto try_again;
-	}
-
-	env.log_cnt += ret;
+	vprintf(fmt, args);
 }
 
 void test__printf(const char *fmt, ...)
@@ -477,6 +438,48 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state)
 	return 0;
 }
 
+static void stdio_hijack(void)
+{
+#ifdef __GLIBC__
+	if (env.verbose || (env.test && env.test->force_log)) {
+		/* nothing to do, output to stdout by default */
+		return;
+	}
+
+	/* stdout and stderr -> buffer */
+	fflush(stdout);
+
+	env.stdout = stdout;
+	env.stderr = stderr;
+
+	stdout = open_memstream(&env.log_buf, &env.log_cnt);
+	if (!stdout) {
+		stdout = env.stdout;
+		perror("open_memstream");
+		return;
+	}
+
+	stderr = stdout;
+#endif
+}
+
+static void stdio_restore(void)
+{
+#ifdef __GLIBC__
+	if (stdout == env.stdout)
+		return;
+
+	fclose(stdout);
+	free(env.log_buf);
+
+	env.log_buf = NULL;
+	env.log_cnt = 0;
+
+	stdout = env.stdout;
+	stderr = env.stderr;
+#endif
+}
+
 int main(int argc, char **argv)
 {
 	static const struct argp argp = {
@@ -496,6 +499,7 @@ int main(int argc, char **argv)
 
 	env.jit_enabled = is_jit_enabled();
 
+	stdio_hijack();
 	for (i = 0; i < prog_test_cnt; i++) {
 		struct prog_test_def *test = &prog_test_defs[i];
 		int old_pass_cnt = pass_cnt;
@@ -523,13 +527,14 @@ int main(int argc, char **argv)
 
 		dump_test_log(test, test->error_cnt);
 
-		printf("#%d %s:%s\n", test->test_num, test->test_name,
-		       test->error_cnt ? "FAIL" : "OK");
+		fprintf(env.stdout, "#%d %s:%s\n",
+			test->test_num, test->test_name,
+			test->error_cnt ? "FAIL" : "OK");
 	}
+	stdio_restore();
 	printf("Summary: %d/%d PASSED, %d FAILED\n",
 	       env.succ_cnt, env.sub_succ_cnt, env.fail_cnt);
 
-	free(env.log_buf);
 	free(env.test_selector.num_set);
 	free(env.subtest_selector.num_set);
 
diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h
index afd14962456f..541f9eab5eed 100644
--- a/tools/testing/selftests/bpf/test_progs.h
+++ b/tools/testing/selftests/bpf/test_progs.h
@@ -56,9 +56,10 @@ struct test_env {
 	bool jit_enabled;
 
 	struct prog_test_def *test;
+	FILE *stdout;
+	FILE *stderr;
 	char *log_buf;
 	size_t log_cnt;
-	size_t log_cap;
 
 	int succ_cnt; /* successful tests */
 	int sub_succ_cnt; /* successful sub-tests */
-- 
2.22.0.770.g0f2c4a37fd-goog


^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox