Netdev List
* RE: [PATCH net-next 1/4] dpaa2-eth: Use a single page per Rx buffer
From: Ioana Ciornei @ 2019-02-08 18:19 UTC
  To: Jesper Dangaard Brouer, Ioana Ciocoi Radulescu
  Cc: Ilias Apalodimas, netdev@vger.kernel.org, davem@davemloft.net
In-Reply-To: <20190206204956.533326c2@carbon>

> Subject: Re: [PATCH net-next 1/4] dpaa2-eth: Use a single page per Rx buffer
> 
> 
> On Wed, 6 Feb 2019 15:36:33 +0000 Ioana Ciocoi Radulescu
> <ruxandra.radulescu@nxp.com> wrote:
> 
> > > From: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> > >
> > > Can you share any results on XDP (XDP_DROP is usually useful for the
> > > hardware capabilities).
> >
> > XDP numbers are pretty much the same as before this patch:
> >
> > On a LS2088A with A72 cores @2GHz (numbers in Mpps):
> > 				1core		8cores
> > -------------------------------------------------------------------------
> > XDP_DROP (no touching data)	5.37		29.6 (linerate)
> > XDP_DROP (xdp1 sample)	3.14		24.22
> 
> It is interesting/problematic to see that the cost of touching the data is so high:
> 5.37Mpps -> 3.14Mpps.  Intel CPUs have solved this in hardware with DDIO,
> which delivers frames into the L3 cache. I have some ideas on how to improve this on
> ARM (or CPUs without DDIO).  I previously implemented this as an RFC on mlx4,
> tested on a CPU without DDIO, with great success: 10Mpps -> 20Mpps (but it was
> shut down, as newer Intel HW solved the issue).  The basic idea is to have an
> array of frames that you start an L2/L3 prefetch on, before going "back" to
> process them for XDP or the netstack. (P.S. this is the same thing DPDK does.)

Thanks for the hint. We are currently investigating what our options are.
We'll come back with a more detailed answer once we figure it out.

Ioana C
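
The "prefetch a batch, then go back and process it" idea quoted above can be sketched in plain C. This is only a minimal illustration under stated assumptions: `struct frame`, `process_frame()` and `process_batch()` are hypothetical names, and `__builtin_prefetch` is the GCC/Clang builtin rather than anything taken from dpaa2-eth, mlx4 or DPDK.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame descriptor; a real driver would use its own type. */
struct frame {
	const uint8_t *data;
	size_t len;
};

/* Hypothetical per-frame work: touches the first byte of the payload. */
static unsigned int process_frame(const struct frame *f)
{
	return f->len ? f->data[0] : 0;
}

/*
 * Two-pass batch processing: first issue a prefetch for every frame in
 * the batch, then go back and process the frames, so that the memory
 * latency of later frames overlaps with the work done on earlier ones.
 */
static unsigned int process_batch(const struct frame *frames, size_t n)
{
	unsigned int sum = 0;
	size_t i;

	/* Pass 1: start pulling each frame's first cache line in. */
	for (i = 0; i < n; i++)
		__builtin_prefetch(frames[i].data, 0 /* read */, 3 /* high locality */);

	/* Pass 2: touch the data, which is hopefully in cache by now. */
	for (i = 0; i < n; i++)
		sum += process_frame(&frames[i]);

	return sum;
}
```

Whether this helps depends on the batch size and on how far ahead the prefetches run; both would have to be measured on the target hardware.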


* Re: [PATCH net-next] net: phy: disregard "Clause 22 registers present" bit in get_phy_c45_devs_in_pkg
From: Heiner Kallweit @ 2019-02-08 18:16 UTC
  To: Andrew Lunn; +Cc: Florian Fainelli, David Miller, netdev@vger.kernel.org
In-Reply-To: <20190208140206.GE26594@lunn.ch>

On 08.02.2019 15:02, Andrew Lunn wrote:
> On Fri, Feb 08, 2019 at 08:12:47AM +0100, Heiner Kallweit wrote:
>> Bit 0 in register 1.5 doesn't represent a device but is a flag that
>> Clause 22 registers are present. Therefore disregard this bit when
>> populating the device list. If code needs this information it
>> should read register 1.5 directly instead of accessing the device
>> list. Because this bit doesn't represent a device I didn't add a
>> MDIO_MMD_C22PRESENT constant or similar.
>>
>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>> ---
>>  drivers/net/phy/phy_device.c | 3 ++-
>>  include/uapi/linux/mdio.h    | 1 +
>>  2 files changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>> index 9369e1323..c2316a117 100644
>> --- a/drivers/net/phy/phy_device.c
>> +++ b/drivers/net/phy/phy_device.c
>> @@ -682,7 +682,8 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
>>  	phy_reg = mdiobus_read(bus, addr, reg_addr);
>>  	if (phy_reg < 0)
>>  		return -EIO;
>> -	*devices_in_package |= (phy_reg & 0xffff);
>> +	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
>> +	*devices_in_package |= (phy_reg & 0xfffe);
> 
> Hi Heiner
> 
> Just for readability, can we use BIT(0) in there somehow?
> 
You think 0xfffe together with the comment is still not clear enough?
But sure, I can make it more explicit.

>>  
>>  	return 0;
>>  }
>> diff --git a/include/uapi/linux/mdio.h b/include/uapi/linux/mdio.h
>> index 2e6e309f0..0e012b168 100644
>> --- a/include/uapi/linux/mdio.h
>> +++ b/include/uapi/linux/mdio.h
>> @@ -115,6 +115,7 @@
>>  
>>  /* Device present registers. */
>>  #define MDIO_DEVS_PRESENT(devad)	(1 << (devad))
>> +#define MDIO_DEVS_C22PRESENT		MDIO_DEVS_PRESENT(0)
> 
> Err. The commit message says you did not add this...
> 
Maybe I'm not clear enough in the commit message. Typically we have two
constants for a device:

MDIO_MMD_XXX (for the device)
MDIO_DEVS_XXX (for the bit of the device in the device list bitmap)

For the C22PRESENT flag I don't define the first one (because it's
not a device) but the second one (because it uses a bit in the device
list bitmap).

>      Andrew
> 
Heiner
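
For readers following the BIT(0) suggestion, a more explicit form of the mask could look like the sketch below. It is an illustration only: `BIT()` is redefined locally so the snippet builds outside the kernel, and `devs_from_reg()` is a hypothetical helper, not a function from phy_device.c.

```c
#include <stdint.h>

/* Local stand-in for the kernel's BIT() macro, so the sketch builds alone. */
#define BIT(n)	(1U << (n))

/*
 * Hypothetical helper: keep the 16 bits of a "devices in package"
 * register, minus bit 0, which flags Clause 22 register presence
 * rather than a device.
 */
static uint32_t devs_from_reg(uint32_t phy_reg)
{
	return phy_reg & 0xffff & ~BIT(0);
}
```

The result is the same as masking with 0xfffe; the only difference is that the excluded bit is named in the expression.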


* Re: Waiting for vrf to become free on rmmod of bridge...
From: Ben Greear @ 2019-02-08 18:12 UTC
  To: David Ahern, netdev
In-Reply-To: <bb4c7049-9f5a-6f2b-840f-3eb42b9307f5@gmail.com>

On 2/6/19 5:50 PM, David Ahern wrote:
> On 2/6/19 3:20 PM, Ben Greear wrote:
>> Hello,
>>
>> I just saw this warning on a system running a hacked 4.20.2+ kernel.
>> Any known bugs
>> of this nature in this (upstream) kernel?  The command that is blocked is:
>> 'rmmod bridge llc'
>>
>> [17069.299135] unregister_netdevice: waiting for _vrf13 to become free.
>> Usage count = 1
>> [17079.306438] unregister_netdevice: waiting for _vrf13 to become free.
>> Usage count = 1
>> [17089.314656] unregister_netdevice: waiting for _vrf13 to become free.
>> Usage count = 1
>> [17099.322870] unregister_netdevice: waiting for _vrf13 to become free.
>> Usage count = 1
>>
>> Thanks,
>> Ben
>>
> 
> No known refcount issues with vrf.
> 
> I use namespaces for testing which creates devices, adds routes, runs
> traffic and deletes the device and namespace. That series in the tests
> has been known to trigger refcount problems in the past.

I'm not using namespaces in my test, but it is fairly convoluted.  If I
figure out how to reproduce the issue I'll let you know.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: [PATCH v3 bpf-next 4/4] tools/bpf: remove btf__get_strings() superseded by raw data API
From: Andrii Nakryiko @ 2019-02-08 18:11 UTC
  To: Song Liu
  Cc: Andrii Nakryiko, Alexei Starovoitov, Yonghong Song,
	Alexei Starovoitov, Martin Lau, netdev@vger.kernel.org,
	Kernel Team, daniel@iogearbox.net
In-Reply-To: <0E831E04-241E-44F8-800B-63346080E9F3@fb.com>

On Fri, Feb 8, 2019 at 9:31 AM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Feb 7, 2019, at 6:55 PM, Andrii Nakryiko <andriin@fb.com> wrote:
> >
> > Now that we have btf__get_raw_data() it's trivial for tests to iterate
> > over all strings for testing purposes, which eliminates the need for
> > btf__get_strings() API.
> >
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> > tools/lib/bpf/btf.c                    |  7 -----
> > tools/lib/bpf/btf.h                    |  2 --
> > tools/testing/selftests/bpf/test_btf.c | 39 +++++++++++++++++---------
> > 3 files changed, 26 insertions(+), 22 deletions(-)
> >
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index c87cc3d71b9f..a986dc28f17d 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -447,13 +447,6 @@ const void *btf__get_raw_data(const struct btf *btf, __u32 *size)
> >       return btf->data;
> > }
> >
> > -void btf__get_strings(const struct btf *btf, const char **strings,
> > -                   __u32 *str_len)
> > -{
> > -     *strings = btf->strings;
> > -     *str_len = btf->hdr->str_len;
> > -}
> > -
> > const char *btf__name_by_offset(const struct btf *btf, __u32 offset)
> > {
> >       if (offset < btf->hdr->str_len)
> > diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> > index ad9f648260c2..6179291f2cec 100644
> > --- a/tools/lib/bpf/btf.h
> > +++ b/tools/lib/bpf/btf.h
> > @@ -67,8 +67,6 @@ LIBBPF_API __s64 btf__resolve_size(const struct btf *btf, __u32 type_id);
> > LIBBPF_API int btf__resolve_type(const struct btf *btf, __u32 type_id);
> > LIBBPF_API int btf__fd(const struct btf *btf);
> > LIBBPF_API const void *btf__get_raw_data(const struct btf *btf, __u32 *size);
> > -LIBBPF_API void btf__get_strings(const struct btf *btf, const char **strings,
> > -                              __u32 *str_len);
>
> I guess we need to update libbpf.map with this?

Definitely! I must have lost it during rebase, fixing.

>
> > LIBBPF_API const char *btf__name_by_offset(const struct btf *btf, __u32 offset);
> > LIBBPF_API int btf__get_from_id(__u32 id, struct btf **btf);
> > LIBBPF_API int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
> > diff --git a/tools/testing/selftests/bpf/test_btf.c b/tools/testing/selftests/bpf/test_btf.c
> > index 447acc34db94..bbcacba39590 100644
> > --- a/tools/testing/selftests/bpf/test_btf.c
> > +++ b/tools/testing/selftests/bpf/test_btf.c
> > @@ -5882,15 +5882,17 @@ static void dump_btf_strings(const char *strs, __u32 len)
> > static int do_test_dedup(unsigned int test_num)
> > {
> >       const struct btf_dedup_test *test = &dedup_tests[test_num - 1];
> > -     int err = 0, i;
> > -     __u32 test_nr_types, expect_nr_types, test_str_len, expect_str_len;
> > -     void *raw_btf;
> > -     unsigned int raw_btf_size;
> > +     __u32 test_nr_types, expect_nr_types, test_btf_size, expect_btf_size;
> > +     const struct btf_header *test_hdr, *expect_hdr;
> >       struct btf *test_btf = NULL, *expect_btf = NULL;
> > +     const void *test_btf_data, *expect_btf_data;
> >       const char *ret_test_next_str, *ret_expect_next_str;
> >       const char *test_strs, *expect_strs;
> >       const char *test_str_cur, *test_str_end;
> >       const char *expect_str_cur, *expect_str_end;
> > +     unsigned int raw_btf_size;
> > +     void *raw_btf;
> > +     int err = 0, i;
> >
> >       fprintf(stderr, "BTF dedup test[%u] (%s):", test_num, test->descr);
> >
> > @@ -5927,23 +5929,34 @@ static int do_test_dedup(unsigned int test_num)
> >               goto done;
> >       }
> >
> > -     btf__get_strings(test_btf, &test_strs, &test_str_len);
> > -     btf__get_strings(expect_btf, &expect_strs, &expect_str_len);
> > -     if (CHECK(test_str_len != expect_str_len,
> > -               "test_str_len:%u != expect_str_len:%u",
> > -               test_str_len, expect_str_len)) {
> > +     test_btf_data = btf__get_raw_data(test_btf, &test_btf_size);
> > +     expect_btf_data = btf__get_raw_data(expect_btf, &expect_btf_size);
> > +     if (CHECK(test_btf_size != expect_btf_size,
> > +               "test_btf_size:%u != expect_btf_size:%u",
> > +               test_btf_size, expect_btf_size)) {
> > +             err = -1;
> > +             goto done;
> > +     }
> > +
> > +     test_hdr = test_btf_data;
> > +     test_strs = test_btf_data + test_hdr->str_off;
> > +     expect_hdr = expect_btf_data;
> > +     expect_strs = expect_btf_data + expect_hdr->str_off;
> > +     if (CHECK(test_hdr->str_len != expect_hdr->str_len,
> > +               "test_hdr->str_len:%u != expect_hdr->str_len:%u",
> > +               test_hdr->str_len, expect_hdr->str_len)) {
> >               fprintf(stderr, "\ntest strings:\n");
> > -             dump_btf_strings(test_strs, test_str_len);
> > +             dump_btf_strings(test_strs, test_hdr->str_len);
> >               fprintf(stderr, "\nexpected strings:\n");
> > -             dump_btf_strings(expect_strs, expect_str_len);
> > +             dump_btf_strings(expect_strs, expect_hdr->str_len);
> >               err = -1;
> >               goto done;
> >       }
> >
> >       test_str_cur = test_strs;
> > -     test_str_end = test_strs + test_str_len;
> > +     test_str_end = test_strs + test_hdr->str_len;
> >       expect_str_cur = expect_strs;
> > -     expect_str_end = expect_strs + expect_str_len;
> > +     expect_str_end = expect_strs + expect_hdr->str_len;
> >       while (test_str_cur < test_str_end && expect_str_cur < expect_str_end) {
> >               size_t test_len, expect_len;
> >
> > --
> > 2.17.1
> >
>


* Re: [iproute PATCH] ip-link: Fix listing of alias interfaces
From: Phil Sutter @ 2019-02-08 18:04 UTC
  To: Stephen Hemminger; +Cc: netdev, Roopa Prabhu
In-Reply-To: <20190208095647.047abca3@hermes.lan>

Hi Stephen,

On Fri, Feb 08, 2019 at 09:56:47AM -0800, Stephen Hemminger wrote:
> On Thu,  7 Feb 2019 14:05:27 +0100
> Phil Sutter <phil@nwl.cc> wrote:
> 
> > Commit 50b9950dd9011 ("link dump filter") accidentally broke listing of
> > links in the old alias interface notation:
> > 
> > | % ip link show eth0:1
> > | RTNETLINK answers: No such device
> > | Cannot send link get request: No such device
> > 
> > Prior to the above commit, link lookup was performed via ifindex
> > returned by if_nametoindex(). The latter uses SIOCGIFINDEX ioctl call
> > which on kernel side causes the colon-suffix to be dropped before doing
> > the interface lookup. Netlink API though doesn't care about that at all.
> > To keep things backward compatible, mimic ioctl API behaviour and drop
> > the colon-suffix prior to sending the RTM_GETLINK request.
> > 
> > Fixes: 50b9950dd9011 ("link dump filter")
> > Signed-off-by: Phil Sutter <phil@nwl.cc>
> 
> It is a regression, but the original code was kind of broken.
> iproute2 doesn't need or want the old style alias interface notation.

Thanks for your clarification!

Cheers, Phil
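
For reference, dropping the colon-suffix from an interface name, as the quoted commit message describes, amounts to something like the sketch below. `strip_alias_suffix()` is a hypothetical helper written for illustration; it is not the code from the patch or from the kernel's ioctl path.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy an interface name into dst, dropping an
 * old-style alias suffix such as ":1". Returns 0 on success, or -1 if
 * the stripped name does not fit into dst.
 */
static int strip_alias_suffix(const char *name, char *dst, size_t dst_size)
{
	/* Length of the part before the first ':' (or the whole name). */
	size_t len = strcspn(name, ":");

	if (len >= dst_size)
		return -1;
	memcpy(dst, name, len);
	dst[len] = '\0';
	return 0;
}
```

With this, "eth0:1" and "eth0" both resolve to "eth0" before any lookup is made.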


* Re: [PATCH iproute2] tc: use bits not mbits/sec in rate percent
From: Stephen Hemminger @ 2019-02-08 18:01 UTC
  To: Marcos Antonio Moraes; +Cc: netdev
In-Reply-To: <20190207152954.11036-1-marcos.antonio@digirati.com.br>

On Thu,  7 Feb 2019 13:29:54 -0200
Marcos Antonio Moraes <marcos.antonio@digirati.com.br> wrote:

> As /sys/class/net/<iface>/speed indicates a value in Mbits/sec, the
> conversion is necessary to create the correct limits.
> 
> This guarantees the same result for the following commands in an
> 1000Mbit/sec device:
> 
> tc class add ... htb rate 500Mbit
> tc class add ... htb rate 50%
> 
> Fixes: 927e3cfb52b5 ("tc: B.W limits can now be specified in %.")
> Signed-off-by: Marcos Antonio Moraes <marcos.antonio@digirati.com.br>

Sure applied.
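
The unit conversion behind the fix can be written out as a small sketch. This is not the tc implementation: `percent_rate_bits()` is a hypothetical helper, and it assumes an integer percentage and the decimal meaning of Mbit (10^6 bits).

```c
#include <stdint.h>

/*
 * Hypothetical helper: turn a link speed in Mbit/s (the unit reported
 * by /sys/class/net/<iface>/speed) and a percentage into a rate in
 * bit/s.
 */
static uint64_t percent_rate_bits(uint64_t speed_mbit, unsigned int percent)
{
	return speed_mbit * 1000000ULL * percent / 100;
}
```

On a 1000 Mbit/s device, 50% comes out as 500,000,000 bit/s, which matches a rate of 500Mbit.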




* RE: [PATCH net-next] devlink: Add WARN_ON to catch errors of not cleaning devlink objects
From: Parav Pandit @ 2019-02-08 18:01 UTC
  To: David Ahern, netdev@vger.kernel.org, davem@davemloft.net
In-Reply-To: <dcd9b51a-1916-a947-7384-aa24f3d25cf3@gmail.com>



> -----Original Message-----
> From: David Ahern <dsahern@gmail.com>
> Sent: Friday, February 8, 2019 11:30 AM
> To: Parav Pandit <parav@mellanox.com>; netdev@vger.kernel.org;
> davem@davemloft.net
> Subject: Re: [PATCH net-next] devlink: Add WARN_ON to catch errors of not
> cleaning devlink objects
> 
> On 2/8/19 8:22 AM, Parav Pandit wrote:
> > Add WARN_ON to make sure that all sub objects of a devlink device are
> > > cleaned up before freeing the devlink device.
> > This helps to catch any driver bugs.
> >
> > Signed-off-by: Parav Pandit <parav@mellanox.com>
> > Acked-by: Jiri Pirko <jiri@mellanox.com>
> > ---
> >  net/core/devlink.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > > diff --git a/net/core/devlink.c b/net/core/devlink.c
> > > index cd0d393..5e2ef5a 100644
> > --- a/net/core/devlink.c
> > +++ b/net/core/devlink.c
> > @@ -4229,6 +4229,13 @@ void devlink_unregister(struct devlink *devlink)
> >   */
> > >  void devlink_free(struct devlink *devlink)
> > >  {
> > +	WARN_ON(!list_empty(&devlink->port_list));
> > +	WARN_ON(!list_empty(&devlink->sb_list));
> > +	WARN_ON(!list_empty(&devlink->dpipe_table_list));
> > +	WARN_ON(!list_empty(&devlink->resource_list));
> > +	WARN_ON(!list_empty(&devlink->param_list));
> > +	WARN_ON(!list_empty(&devlink->region_list));
> > +
> >  	kfree(devlink);
> >  }
> >  EXPORT_SYMBOL_GPL(devlink_free);
> >
> 
> reporter_list was just added which brings up the maintenance question:
> If you are going to do this you might want a comment in
> include/net/devlink.h to remind folks to update this function as relevant.
I see. Makes sense. I'll add reporter_list and also add the comment to devlink.h in v1.
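
The pattern under discussion, warning when an object is freed while its sub-object lists are still populated, can be sketched in user space as follows. Everything here is hypothetical and simplified: `struct parent`, `list_init()`, `list_is_empty()` and `parent_check_clean()` only model the kernel's list_head, list_empty() and WARN_ON(), and the two lists stand in for the longer set in the patch.

```c
#include <stdio.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static int list_is_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Hypothetical parent object owning two lists of sub-objects. */
struct parent {
	struct list_head port_list;
	struct list_head param_list;
};

/*
 * Warn about every list that still has entries. Returns the number of
 * such lists so the caller can tell whether cleanup was complete.
 * Any list added to struct parent later needs a check here as well.
 */
static int parent_check_clean(const struct parent *p)
{
	int dirty = 0;

	if (!list_is_empty(&p->port_list)) {
		fprintf(stderr, "warning: port_list not empty\n");
		dirty++;
	}
	if (!list_is_empty(&p->param_list)) {
		fprintf(stderr, "warning: param_list not empty\n");
		dirty++;
	}
	return dirty;
}
```

The maintenance concern raised above shows up directly: the check only covers the lists someone remembered to add to it.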


* Re: [iproute PATCH] ip-link: Fix listing of alias interfaces
From: Stephen Hemminger @ 2019-02-08 17:56 UTC
  To: Phil Sutter; +Cc: netdev, Roopa Prabhu
In-Reply-To: <20190207130527.9439-1-phil@nwl.cc>

On Thu,  7 Feb 2019 14:05:27 +0100
Phil Sutter <phil@nwl.cc> wrote:

> Commit 50b9950dd9011 ("link dump filter") accidentally broke listing of
> links in the old alias interface notation:
> 
> | % ip link show eth0:1
> | RTNETLINK answers: No such device
> | Cannot send link get request: No such device
> 
> Prior to the above commit, link lookup was performed via ifindex
> returned by if_nametoindex(). The latter uses SIOCGIFINDEX ioctl call
> which on kernel side causes the colon-suffix to be dropped before doing
> the interface lookup. Netlink API though doesn't care about that at all.
> To keep things backward compatible, mimic ioctl API behaviour and drop
> the colon-suffix prior to sending the RTM_GETLINK request.
> 
> Fixes: 50b9950dd9011 ("link dump filter")
> Signed-off-by: Phil Sutter <phil@nwl.cc>

It is a regression, but the original code was kind of broken.
iproute2 doesn't need or want the old style alias interface notation.


* Re: [iproute PATCH] ip-link: Fix listing of alias interfaces
From: David Ahern @ 2019-02-08 17:50 UTC
  To: Michal Kubecek, netdev; +Cc: Phil Sutter, Stephen Hemminger, Roopa Prabhu
In-Reply-To: <20190208120903.GC7035@unicorn.suse.cz>

On 2/8/19 4:09 AM, Michal Kubecek wrote:
> Not only that, other ip link subcommands also use ioctl for interface
> lookup so that e.g. "ip link del dummy1:x" deletes dummy1 without any
> complaint.
> 
> But as I mentioned earlier in http://patchwork.ozlabs.org/patch/1037934/
> I'm not sure this behaviour is really desirable.

I sent a patch some time ago to convert ll_name_to_index to use a
netlink message first and only fall back to if_nametoindex if it fails.
That should fix the problem you mentioned.
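
The "netlink first, if_nametoindex as fallback" order can be sketched as below. This is an assumption-laden illustration, not the iproute2 patch: `name_to_index()` is a hypothetical wrapper, the callback type stands in for a real RTM_GETLINK request, and `fake_netlink_lookup()` is a stub that exists only to make the example self-contained. `if_nametoindex()` is the standard libc function from <net/if.h>.

```c
#include <net/if.h>
#include <string.h>

/*
 * Hypothetical lookup callback standing in for an RTM_GETLINK request.
 * Returns an ifindex, or 0 when the lookup fails.
 */
typedef unsigned int (*name_lookup_fn)(const char *name);

/* Stub lookup for illustration: knows a single made-up interface. */
static unsigned int fake_netlink_lookup(const char *name)
{
	return strcmp(name, "dummy1") == 0 ? 42 : 0;
}

/*
 * Try the netlink-style lookup first; only if it finds nothing, fall
 * back to if_nametoindex().
 */
static unsigned int name_to_index(const char *name, name_lookup_fn netlink_lookup)
{
	unsigned int idx = netlink_lookup ? netlink_lookup(name) : 0;

	if (idx == 0)
		idx = if_nametoindex(name);	/* returns 0 if no such interface */
	return idx;
}
```

Because the netlink path matches names exactly, a name such as "dummy1:x" would reach the fallback, so the fallback's handling of colon-suffixes still decides the outcome in that case.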


* [PATCH bpf-next 3/3] selftests: bpf: centre kernel bpf objects under new subdir "kern_progs"
From: Jiong Wang @ 2019-02-08 17:41 UTC
  To: alexei.starovoitov, daniel; +Cc: netdev, oss-drivers, Jiong Wang
In-Reply-To: <1549647681-13818-1-git-send-email-jiong.wang@netronome.com>

At the moment, all kernel bpf objects are listed under BPF_OBJ_FILES.
Listing them manually sometimes causes patch conflicts when people are
adding new testcases simultaneously.

It is better to centre all the related source files under a subdir
"kern_progs", then auto-generate the object file list.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 tools/testing/selftests/bpf/Makefile               | 26 +++++-----------------
 .../selftests/bpf/{ => kern_progs}/bpf_flow.c      |  0
 .../selftests/bpf/{ => kern_progs}/connect4_prog.c |  0
 .../selftests/bpf/{ => kern_progs}/connect6_prog.c |  0
 .../selftests/bpf/{ => kern_progs}/dev_cgroup.c    |  0
 .../bpf/{ => kern_progs}/get_cgroup_id_kern.c      |  0
 .../selftests/bpf/{ => kern_progs}/netcnt_prog.c   |  0
 .../bpf/{ => kern_progs}/sample_map_ret0.c         |  0
 .../selftests/bpf/{ => kern_progs}/sample_ret0.c   |  0
 .../selftests/bpf/{ => kern_progs}/sendmsg4_prog.c |  0
 .../selftests/bpf/{ => kern_progs}/sendmsg6_prog.c |  0
 .../bpf/{ => kern_progs}/socket_cookie_prog.c      |  0
 .../bpf/{ => kern_progs}/sockmap_parse_prog.c      |  0
 .../bpf/{ => kern_progs}/sockmap_tcp_msg_prog.c    |  0
 .../bpf/{ => kern_progs}/sockmap_verdict_prog.c    |  0
 .../bpf/{ => kern_progs}/test_adjust_tail.c        |  0
 .../bpf/{ => kern_progs}/test_btf_haskv.c          |  0
 .../selftests/bpf/{ => kern_progs}/test_btf_nokv.c |  0
 .../bpf/{ => kern_progs}/test_get_stack_rawtp.c    |  0
 .../selftests/bpf/{ => kern_progs}/test_l4lb.c     |  0
 .../bpf/{ => kern_progs}/test_l4lb_noinline.c      |  0
 .../bpf/{ => kern_progs}/test_lirc_mode2_kern.c    |  0
 .../bpf/{ => kern_progs}/test_lwt_seg6local.c      |  0
 .../bpf/{ => kern_progs}/test_map_in_map.c         |  0
 .../selftests/bpf/{ => kern_progs}/test_map_lock.c |  0
 .../selftests/bpf/{ => kern_progs}/test_obj_id.c   |  0
 .../bpf/{ => kern_progs}/test_pkt_access.c         |  0
 .../bpf/{ => kern_progs}/test_pkt_md_access.c      |  0
 .../bpf/{ => kern_progs}/test_queue_map.c          |  0
 .../{ => kern_progs}/test_select_reuseport_kern.c  |  0
 .../bpf/{ => kern_progs}/test_sk_lookup_kern.c     |  0
 .../bpf/{ => kern_progs}/test_skb_cgroup_id_kern.c |  0
 .../bpf/{ => kern_progs}/test_sockhash_kern.c      |  0
 .../bpf/{ => kern_progs}/test_sockmap_kern.c       |  0
 .../bpf/{ => kern_progs}/test_spin_lock.c          |  0
 .../bpf/{ => kern_progs}/test_stack_map.c          |  0
 .../{ => kern_progs}/test_stacktrace_build_id.c    |  0
 .../bpf/{ => kern_progs}/test_stacktrace_map.c     |  0
 .../bpf/{ => kern_progs}/test_tcp_estats.c         |  0
 .../bpf/{ => kern_progs}/test_tcpbpf_kern.c        |  0
 .../bpf/{ => kern_progs}/test_tcpnotify_kern.c     |  0
 .../bpf/{ => kern_progs}/test_tracepoint.c         |  0
 .../bpf/{ => kern_progs}/test_tunnel_kern.c        |  0
 .../selftests/bpf/{ => kern_progs}/test_xdp.c      |  0
 .../selftests/bpf/{ => kern_progs}/test_xdp_meta.c |  0
 .../bpf/{ => kern_progs}/test_xdp_noinline.c       |  0
 .../bpf/{ => kern_progs}/test_xdp_redirect.c       |  0
 .../selftests/bpf/{ => kern_progs}/test_xdp_vlan.c |  0
 .../selftests/bpf/{ => kern_progs}/xdp_dummy.c     |  0
 49 files changed, 5 insertions(+), 21 deletions(-)
 rename tools/testing/selftests/bpf/{ => kern_progs}/bpf_flow.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/connect4_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/connect6_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/dev_cgroup.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/get_cgroup_id_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/netcnt_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/sample_map_ret0.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/sample_ret0.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/sendmsg4_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/sendmsg6_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/socket_cookie_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/sockmap_parse_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/sockmap_tcp_msg_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/sockmap_verdict_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_adjust_tail.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_btf_haskv.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_btf_nokv.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_get_stack_rawtp.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_l4lb.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_l4lb_noinline.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_lirc_mode2_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_lwt_seg6local.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_map_in_map.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_map_lock.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_obj_id.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_pkt_access.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_pkt_md_access.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_queue_map.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_select_reuseport_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_sk_lookup_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_skb_cgroup_id_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_sockhash_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_sockmap_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_spin_lock.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_stack_map.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_stacktrace_build_id.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_stacktrace_map.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_tcp_estats.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_tcpbpf_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_tcpnotify_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_tracepoint.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_tunnel_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_xdp.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_xdp_meta.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_xdp_noinline.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_xdp_redirect.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_xdp_vlan.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/xdp_dummy.c (100%)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 70b2570..2965855 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -25,24 +25,7 @@ TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test
 	test_socket_cookie test_cgroup_storage test_select_reuseport test_section_names \
 	test_netcnt test_tcpnotify_user
 
-BPF_OBJ_FILES = \
-	test_xdp_redirect.o test_xdp_meta.o sockmap_parse_prog.o \
-	sockmap_verdict_prog.o dev_cgroup.o sample_ret0.o \
-	test_tcpnotify_kern.o sample_map_ret0.o test_tcpbpf_kern.o \
-	sockmap_tcp_msg_prog.o connect4_prog.o connect6_prog.o \
-	test_btf_haskv.o test_btf_nokv.o test_sockmap_kern.o \
-	test_tunnel_kern.o test_sockhash_kern.o test_lwt_seg6local.o \
-	sendmsg4_prog.o sendmsg6_prog.o test_lirc_mode2_kern.o \
-	get_cgroup_id_kern.o socket_cookie_prog.o test_select_reuseport_kern.o \
-	test_skb_cgroup_id_kern.o bpf_flow.o netcnt_prog.o test_xdp_vlan.o \
-	xdp_dummy.o test_map_in_map.o test_spin_lock.o test_map_lock.o \
-	test_pkt_access.o test_xdp.o test_adjust_tail.o test_l4lb.o \
-	test_l4lb_noinline.o test_xdp_noinline.o test_tcp_estats.o \
-	test_obj_id.o test_pkt_md_access.o test_tracepoint.o \
-	test_stacktrace_map.o test_stacktrace_build_id.o \
-	test_get_stack_rawtp.o test_sk_lookup_kern.o test_queue_map.o \
-	test_stack_map.o
-
+BPF_OBJ_FILES = $(patsubst %.c,%.o, $(notdir $(wildcard kern_progs/*.c)))
 TEST_GEN_FILES = $(BPF_OBJ_FILES)
 
 # Also test sub-register code-gen if LLVM + kernel both has eBPF v3 processor
@@ -183,7 +166,8 @@ $(ALU32_BUILD_DIR)/test_progs_32: test_progs.c $(ALU32_BUILD_DIR) \
 	$(CC) $(CFLAGS) -o $(ALU32_BUILD_DIR)/test_progs_32 $< \
 		trace_helpers.c $(OUTPUT)/libbpf.a $(LDLIBS)
 
-$(ALU32_BUILD_DIR)/%.o: %.c $(ALU32_BUILD_DIR) $(ALU32_BUILD_DIR)/test_progs_32
+$(ALU32_BUILD_DIR)/%.o: kern_progs/%.c $(ALU32_BUILD_DIR) \
+					$(ALU32_BUILD_DIR)/test_progs_32
 	$(CLANG) $(CLANG_FLAGS) \
 		 -O2 -target bpf -emit-llvm -c $< -o - |      \
 	$(LLC) -march=bpf -mattr=+alu32 -mcpu=$(CPU) $(LLC_FLAGS) \
@@ -195,7 +179,7 @@ endif
 
 # Have one program compiled without "-target bpf" to test whether libbpf loads
 # it successfully
-$(OUTPUT)/test_xdp.o: test_xdp.c
+$(OUTPUT)/test_xdp.o: kern_progs/test_xdp.c
 	$(CLANG) $(CLANG_FLAGS) \
 		-O2 -emit-llvm -c $< -o - | \
 	$(LLC) -march=bpf -mcpu=$(CPU) $(LLC_FLAGS) -filetype=obj -o $@
@@ -203,7 +187,7 @@ ifeq ($(DWARF2BTF),y)
 	$(BTF_PAHOLE) -J $@
 endif
 
-$(OUTPUT)/%.o: %.c
+$(OUTPUT)/%.o: kern_progs/%.c
 	$(CLANG) $(CLANG_FLAGS) \
 		 -O2 -target bpf -emit-llvm -c $< -o - |      \
 	$(LLC) -march=bpf -mcpu=$(CPU) $(LLC_FLAGS) -filetype=obj -o $@
diff --git a/tools/testing/selftests/bpf/bpf_flow.c b/tools/testing/selftests/bpf/kern_progs/bpf_flow.c
similarity index 100%
rename from tools/testing/selftests/bpf/bpf_flow.c
rename to tools/testing/selftests/bpf/kern_progs/bpf_flow.c
diff --git a/tools/testing/selftests/bpf/connect4_prog.c b/tools/testing/selftests/bpf/kern_progs/connect4_prog.c
similarity index 100%
rename from tools/testing/selftests/bpf/connect4_prog.c
rename to tools/testing/selftests/bpf/kern_progs/connect4_prog.c
diff --git a/tools/testing/selftests/bpf/connect6_prog.c b/tools/testing/selftests/bpf/kern_progs/connect6_prog.c
similarity index 100%
rename from tools/testing/selftests/bpf/connect6_prog.c
rename to tools/testing/selftests/bpf/kern_progs/connect6_prog.c
diff --git a/tools/testing/selftests/bpf/dev_cgroup.c b/tools/testing/selftests/bpf/kern_progs/dev_cgroup.c
similarity index 100%
rename from tools/testing/selftests/bpf/dev_cgroup.c
rename to tools/testing/selftests/bpf/kern_progs/dev_cgroup.c
diff --git a/tools/testing/selftests/bpf/get_cgroup_id_kern.c b/tools/testing/selftests/bpf/kern_progs/get_cgroup_id_kern.c
similarity index 100%
rename from tools/testing/selftests/bpf/get_cgroup_id_kern.c
rename to tools/testing/selftests/bpf/kern_progs/get_cgroup_id_kern.c
diff --git a/tools/testing/selftests/bpf/netcnt_prog.c b/tools/testing/selftests/bpf/kern_progs/netcnt_prog.c
similarity index 100%
rename from tools/testing/selftests/bpf/netcnt_prog.c
rename to tools/testing/selftests/bpf/kern_progs/netcnt_prog.c
diff --git a/tools/testing/selftests/bpf/sample_map_ret0.c b/tools/testing/selftests/bpf/kern_progs/sample_map_ret0.c
similarity index 100%
rename from tools/testing/selftests/bpf/sample_map_ret0.c
rename to tools/testing/selftests/bpf/kern_progs/sample_map_ret0.c
diff --git a/tools/testing/selftests/bpf/sample_ret0.c b/tools/testing/selftests/bpf/kern_progs/sample_ret0.c
similarity index 100%
rename from tools/testing/selftests/bpf/sample_ret0.c
rename to tools/testing/selftests/bpf/kern_progs/sample_ret0.c
diff --git a/tools/testing/selftests/bpf/sendmsg4_prog.c b/tools/testing/selftests/bpf/kern_progs/sendmsg4_prog.c
similarity index 100%
rename from tools/testing/selftests/bpf/sendmsg4_prog.c
rename to tools/testing/selftests/bpf/kern_progs/sendmsg4_prog.c
diff --git a/tools/testing/selftests/bpf/sendmsg6_prog.c b/tools/testing/selftests/bpf/kern_progs/sendmsg6_prog.c
similarity index 100%
rename from tools/testing/selftests/bpf/sendmsg6_prog.c
rename to tools/testing/selftests/bpf/kern_progs/sendmsg6_prog.c
diff --git a/tools/testing/selftests/bpf/socket_cookie_prog.c b/tools/testing/selftests/bpf/kern_progs/socket_cookie_prog.c
similarity index 100%
rename from tools/testing/selftests/bpf/socket_cookie_prog.c
rename to tools/testing/selftests/bpf/kern_progs/socket_cookie_prog.c
diff --git a/tools/testing/selftests/bpf/sockmap_parse_prog.c b/tools/testing/selftests/bpf/kern_progs/sockmap_parse_prog.c
similarity index 100%
rename from tools/testing/selftests/bpf/sockmap_parse_prog.c
rename to tools/testing/selftests/bpf/kern_progs/sockmap_parse_prog.c
diff --git a/tools/testing/selftests/bpf/sockmap_tcp_msg_prog.c b/tools/testing/selftests/bpf/kern_progs/sockmap_tcp_msg_prog.c
similarity index 100%
rename from tools/testing/selftests/bpf/sockmap_tcp_msg_prog.c
rename to tools/testing/selftests/bpf/kern_progs/sockmap_tcp_msg_prog.c
diff --git a/tools/testing/selftests/bpf/sockmap_verdict_prog.c b/tools/testing/selftests/bpf/kern_progs/sockmap_verdict_prog.c
similarity index 100%
rename from tools/testing/selftests/bpf/sockmap_verdict_prog.c
rename to tools/testing/selftests/bpf/kern_progs/sockmap_verdict_prog.c
diff --git a/tools/testing/selftests/bpf/test_adjust_tail.c b/tools/testing/selftests/bpf/kern_progs/test_adjust_tail.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_adjust_tail.c
rename to tools/testing/selftests/bpf/kern_progs/test_adjust_tail.c
diff --git a/tools/testing/selftests/bpf/test_btf_haskv.c b/tools/testing/selftests/bpf/kern_progs/test_btf_haskv.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_btf_haskv.c
rename to tools/testing/selftests/bpf/kern_progs/test_btf_haskv.c
diff --git a/tools/testing/selftests/bpf/test_btf_nokv.c b/tools/testing/selftests/bpf/kern_progs/test_btf_nokv.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_btf_nokv.c
rename to tools/testing/selftests/bpf/kern_progs/test_btf_nokv.c
diff --git a/tools/testing/selftests/bpf/test_get_stack_rawtp.c b/tools/testing/selftests/bpf/kern_progs/test_get_stack_rawtp.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_get_stack_rawtp.c
rename to tools/testing/selftests/bpf/kern_progs/test_get_stack_rawtp.c
diff --git a/tools/testing/selftests/bpf/test_l4lb.c b/tools/testing/selftests/bpf/kern_progs/test_l4lb.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_l4lb.c
rename to tools/testing/selftests/bpf/kern_progs/test_l4lb.c
diff --git a/tools/testing/selftests/bpf/test_l4lb_noinline.c b/tools/testing/selftests/bpf/kern_progs/test_l4lb_noinline.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_l4lb_noinline.c
rename to tools/testing/selftests/bpf/kern_progs/test_l4lb_noinline.c
diff --git a/tools/testing/selftests/bpf/test_lirc_mode2_kern.c b/tools/testing/selftests/bpf/kern_progs/test_lirc_mode2_kern.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_lirc_mode2_kern.c
rename to tools/testing/selftests/bpf/kern_progs/test_lirc_mode2_kern.c
diff --git a/tools/testing/selftests/bpf/test_lwt_seg6local.c b/tools/testing/selftests/bpf/kern_progs/test_lwt_seg6local.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_lwt_seg6local.c
rename to tools/testing/selftests/bpf/kern_progs/test_lwt_seg6local.c
diff --git a/tools/testing/selftests/bpf/test_map_in_map.c b/tools/testing/selftests/bpf/kern_progs/test_map_in_map.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_map_in_map.c
rename to tools/testing/selftests/bpf/kern_progs/test_map_in_map.c
diff --git a/tools/testing/selftests/bpf/test_map_lock.c b/tools/testing/selftests/bpf/kern_progs/test_map_lock.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_map_lock.c
rename to tools/testing/selftests/bpf/kern_progs/test_map_lock.c
diff --git a/tools/testing/selftests/bpf/test_obj_id.c b/tools/testing/selftests/bpf/kern_progs/test_obj_id.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_obj_id.c
rename to tools/testing/selftests/bpf/kern_progs/test_obj_id.c
diff --git a/tools/testing/selftests/bpf/test_pkt_access.c b/tools/testing/selftests/bpf/kern_progs/test_pkt_access.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_pkt_access.c
rename to tools/testing/selftests/bpf/kern_progs/test_pkt_access.c
diff --git a/tools/testing/selftests/bpf/test_pkt_md_access.c b/tools/testing/selftests/bpf/kern_progs/test_pkt_md_access.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_pkt_md_access.c
rename to tools/testing/selftests/bpf/kern_progs/test_pkt_md_access.c
diff --git a/tools/testing/selftests/bpf/test_queue_map.c b/tools/testing/selftests/bpf/kern_progs/test_queue_map.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_queue_map.c
rename to tools/testing/selftests/bpf/kern_progs/test_queue_map.c
diff --git a/tools/testing/selftests/bpf/test_select_reuseport_kern.c b/tools/testing/selftests/bpf/kern_progs/test_select_reuseport_kern.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_select_reuseport_kern.c
rename to tools/testing/selftests/bpf/kern_progs/test_select_reuseport_kern.c
diff --git a/tools/testing/selftests/bpf/test_sk_lookup_kern.c b/tools/testing/selftests/bpf/kern_progs/test_sk_lookup_kern.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_sk_lookup_kern.c
rename to tools/testing/selftests/bpf/kern_progs/test_sk_lookup_kern.c
diff --git a/tools/testing/selftests/bpf/test_skb_cgroup_id_kern.c b/tools/testing/selftests/bpf/kern_progs/test_skb_cgroup_id_kern.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_skb_cgroup_id_kern.c
rename to tools/testing/selftests/bpf/kern_progs/test_skb_cgroup_id_kern.c
diff --git a/tools/testing/selftests/bpf/test_sockhash_kern.c b/tools/testing/selftests/bpf/kern_progs/test_sockhash_kern.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_sockhash_kern.c
rename to tools/testing/selftests/bpf/kern_progs/test_sockhash_kern.c
diff --git a/tools/testing/selftests/bpf/test_sockmap_kern.c b/tools/testing/selftests/bpf/kern_progs/test_sockmap_kern.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_sockmap_kern.c
rename to tools/testing/selftests/bpf/kern_progs/test_sockmap_kern.c
diff --git a/tools/testing/selftests/bpf/test_spin_lock.c b/tools/testing/selftests/bpf/kern_progs/test_spin_lock.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_spin_lock.c
rename to tools/testing/selftests/bpf/kern_progs/test_spin_lock.c
diff --git a/tools/testing/selftests/bpf/test_stack_map.c b/tools/testing/selftests/bpf/kern_progs/test_stack_map.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_stack_map.c
rename to tools/testing/selftests/bpf/kern_progs/test_stack_map.c
diff --git a/tools/testing/selftests/bpf/test_stacktrace_build_id.c b/tools/testing/selftests/bpf/kern_progs/test_stacktrace_build_id.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_stacktrace_build_id.c
rename to tools/testing/selftests/bpf/kern_progs/test_stacktrace_build_id.c
diff --git a/tools/testing/selftests/bpf/test_stacktrace_map.c b/tools/testing/selftests/bpf/kern_progs/test_stacktrace_map.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_stacktrace_map.c
rename to tools/testing/selftests/bpf/kern_progs/test_stacktrace_map.c
diff --git a/tools/testing/selftests/bpf/test_tcp_estats.c b/tools/testing/selftests/bpf/kern_progs/test_tcp_estats.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_tcp_estats.c
rename to tools/testing/selftests/bpf/kern_progs/test_tcp_estats.c
diff --git a/tools/testing/selftests/bpf/test_tcpbpf_kern.c b/tools/testing/selftests/bpf/kern_progs/test_tcpbpf_kern.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_tcpbpf_kern.c
rename to tools/testing/selftests/bpf/kern_progs/test_tcpbpf_kern.c
diff --git a/tools/testing/selftests/bpf/test_tcpnotify_kern.c b/tools/testing/selftests/bpf/kern_progs/test_tcpnotify_kern.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_tcpnotify_kern.c
rename to tools/testing/selftests/bpf/kern_progs/test_tcpnotify_kern.c
diff --git a/tools/testing/selftests/bpf/test_tracepoint.c b/tools/testing/selftests/bpf/kern_progs/test_tracepoint.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_tracepoint.c
rename to tools/testing/selftests/bpf/kern_progs/test_tracepoint.c
diff --git a/tools/testing/selftests/bpf/test_tunnel_kern.c b/tools/testing/selftests/bpf/kern_progs/test_tunnel_kern.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_tunnel_kern.c
rename to tools/testing/selftests/bpf/kern_progs/test_tunnel_kern.c
diff --git a/tools/testing/selftests/bpf/test_xdp.c b/tools/testing/selftests/bpf/kern_progs/test_xdp.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_xdp.c
rename to tools/testing/selftests/bpf/kern_progs/test_xdp.c
diff --git a/tools/testing/selftests/bpf/test_xdp_meta.c b/tools/testing/selftests/bpf/kern_progs/test_xdp_meta.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_xdp_meta.c
rename to tools/testing/selftests/bpf/kern_progs/test_xdp_meta.c
diff --git a/tools/testing/selftests/bpf/test_xdp_noinline.c b/tools/testing/selftests/bpf/kern_progs/test_xdp_noinline.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_xdp_noinline.c
rename to tools/testing/selftests/bpf/kern_progs/test_xdp_noinline.c
diff --git a/tools/testing/selftests/bpf/test_xdp_redirect.c b/tools/testing/selftests/bpf/kern_progs/test_xdp_redirect.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_xdp_redirect.c
rename to tools/testing/selftests/bpf/kern_progs/test_xdp_redirect.c
diff --git a/tools/testing/selftests/bpf/test_xdp_vlan.c b/tools/testing/selftests/bpf/kern_progs/test_xdp_vlan.c
similarity index 100%
rename from tools/testing/selftests/bpf/test_xdp_vlan.c
rename to tools/testing/selftests/bpf/kern_progs/test_xdp_vlan.c
diff --git a/tools/testing/selftests/bpf/xdp_dummy.c b/tools/testing/selftests/bpf/kern_progs/xdp_dummy.c
similarity index 100%
rename from tools/testing/selftests/bpf/xdp_dummy.c
rename to tools/testing/selftests/bpf/kern_progs/xdp_dummy.c
-- 
2.7.4


^ permalink raw reply related

* [PATCH bpf-next 0/3] selftests: bpf: improve bpf object file rules
From: Jiong Wang @ 2019-02-08 17:41 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: netdev, oss-drivers, Jiong Wang

This set improves the bpf object file rules in the selftests Makefile:
  - tell git to ignore the build dir "alu32".
  - extend sub-register mode compilation to all bpf object files to give
    the LLVM bpf back-end more exercise.
  - auto-generate the list of bpf kernel object files.

Jiong Wang (3):
  selftests: bpf: add "alu32" to .gitignore
  selftests: bpf: extend sub-register mode compilation to all bpf object
    files
  selftests: bpf: centre kernel bpf objects under new subdir
    "kern_progs"

 tools/testing/selftests/bpf/.gitignore             |  1 +
 tools/testing/selftests/bpf/Makefile               | 35 +++++-----------------
 .../selftests/bpf/{ => kern_progs}/bpf_flow.c      |  0
 .../selftests/bpf/{ => kern_progs}/connect4_prog.c |  0
 .../selftests/bpf/{ => kern_progs}/connect6_prog.c |  0
 .../selftests/bpf/{ => kern_progs}/dev_cgroup.c    |  0
 .../bpf/{ => kern_progs}/get_cgroup_id_kern.c      |  0
 .../selftests/bpf/{ => kern_progs}/netcnt_prog.c   |  0
 .../bpf/{ => kern_progs}/sample_map_ret0.c         |  0
 .../selftests/bpf/{ => kern_progs}/sample_ret0.c   |  0
 .../selftests/bpf/{ => kern_progs}/sendmsg4_prog.c |  0
 .../selftests/bpf/{ => kern_progs}/sendmsg6_prog.c |  0
 .../bpf/{ => kern_progs}/socket_cookie_prog.c      |  0
 .../bpf/{ => kern_progs}/sockmap_parse_prog.c      |  0
 .../bpf/{ => kern_progs}/sockmap_tcp_msg_prog.c    |  0
 .../bpf/{ => kern_progs}/sockmap_verdict_prog.c    |  0
 .../bpf/{ => kern_progs}/test_adjust_tail.c        |  0
 .../bpf/{ => kern_progs}/test_btf_haskv.c          |  0
 .../selftests/bpf/{ => kern_progs}/test_btf_nokv.c |  0
 .../bpf/{ => kern_progs}/test_get_stack_rawtp.c    |  0
 .../selftests/bpf/{ => kern_progs}/test_l4lb.c     |  0
 .../bpf/{ => kern_progs}/test_l4lb_noinline.c      |  0
 .../bpf/{ => kern_progs}/test_lirc_mode2_kern.c    |  0
 .../bpf/{ => kern_progs}/test_lwt_seg6local.c      |  0
 .../bpf/{ => kern_progs}/test_map_in_map.c         |  0
 .../selftests/bpf/{ => kern_progs}/test_map_lock.c |  0
 .../selftests/bpf/{ => kern_progs}/test_obj_id.c   |  0
 .../bpf/{ => kern_progs}/test_pkt_access.c         |  0
 .../bpf/{ => kern_progs}/test_pkt_md_access.c      |  0
 .../bpf/{ => kern_progs}/test_queue_map.c          |  0
 .../{ => kern_progs}/test_select_reuseport_kern.c  |  0
 .../bpf/{ => kern_progs}/test_sk_lookup_kern.c     |  0
 .../bpf/{ => kern_progs}/test_skb_cgroup_id_kern.c |  0
 .../bpf/{ => kern_progs}/test_sockhash_kern.c      |  0
 .../bpf/{ => kern_progs}/test_sockmap_kern.c       |  0
 .../bpf/{ => kern_progs}/test_spin_lock.c          |  0
 .../bpf/{ => kern_progs}/test_stack_map.c          |  0
 .../{ => kern_progs}/test_stacktrace_build_id.c    |  0
 .../bpf/{ => kern_progs}/test_stacktrace_map.c     |  0
 .../bpf/{ => kern_progs}/test_tcp_estats.c         |  0
 .../bpf/{ => kern_progs}/test_tcpbpf_kern.c        |  0
 .../bpf/{ => kern_progs}/test_tcpnotify_kern.c     |  0
 .../bpf/{ => kern_progs}/test_tracepoint.c         |  0
 .../bpf/{ => kern_progs}/test_tunnel_kern.c        |  0
 .../selftests/bpf/{ => kern_progs}/test_xdp.c      |  0
 .../selftests/bpf/{ => kern_progs}/test_xdp_meta.c |  0
 .../bpf/{ => kern_progs}/test_xdp_noinline.c       |  0
 .../bpf/{ => kern_progs}/test_xdp_redirect.c       |  0
 .../selftests/bpf/{ => kern_progs}/test_xdp_vlan.c |  0
 .../selftests/bpf/{ => kern_progs}/xdp_dummy.c     |  0
 50 files changed, 8 insertions(+), 28 deletions(-)
 rename tools/testing/selftests/bpf/{ => kern_progs}/bpf_flow.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/connect4_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/connect6_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/dev_cgroup.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/get_cgroup_id_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/netcnt_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/sample_map_ret0.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/sample_ret0.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/sendmsg4_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/sendmsg6_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/socket_cookie_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/sockmap_parse_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/sockmap_tcp_msg_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/sockmap_verdict_prog.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_adjust_tail.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_btf_haskv.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_btf_nokv.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_get_stack_rawtp.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_l4lb.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_l4lb_noinline.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_lirc_mode2_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_lwt_seg6local.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_map_in_map.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_map_lock.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_obj_id.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_pkt_access.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_pkt_md_access.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_queue_map.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_select_reuseport_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_sk_lookup_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_skb_cgroup_id_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_sockhash_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_sockmap_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_spin_lock.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_stack_map.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_stacktrace_build_id.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_stacktrace_map.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_tcp_estats.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_tcpbpf_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_tcpnotify_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_tracepoint.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_tunnel_kern.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_xdp.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_xdp_meta.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_xdp_noinline.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_xdp_redirect.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/test_xdp_vlan.c (100%)
 rename tools/testing/selftests/bpf/{ => kern_progs}/xdp_dummy.c (100%)

-- 
2.7.4


^ permalink raw reply

* [PATCH bpf-next 2/3] selftests: bpf: extend sub-register mode compilation to all bpf object files
From: Jiong Wang @ 2019-02-08 17:41 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: netdev, oss-drivers, Jiong Wang
In-Reply-To: <1549647681-13818-1-git-send-email-jiong.wang@netronome.com>

At the moment, we only do extra sub-register mode compilation on the bpf
object files used by "test_progs". These object files are actually loaded
and executed.

This patch extends sub-register mode compilation to all bpf object files,
even those without corresponding runtime tests. This helps test LLVM
sub-register code-gen, because the kernel bpf selftests have many more C
testcases of reasonable size and complexity than the LLVM testsuite, which
only contains unit tests.

There was some file duplication inside BPF_OBJ_FILES_DUAL_COMPILE, which
is now removed.
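
For readers unfamiliar with the SUBREG_CODEGEN check in this Makefile, the
sketch below shows the C function that the probe echoes to the compiler.
The Makefile then greps the output of $(LLC) -mattr=+alu32 -mcpu=probe for
'if w', which is taken here to mean a jump on a 32-bit sub-register. The
second function, cal64, is a hypothetical contrast case that does not
appear in the patch.

```c
/* The exact one-line function the Makefile probe echoes to the compiler. */
int cal(int a) { return a > 0; }

/*
 * Hypothetical 64-bit counterpart, not part of the patch: the comparison
 * works on a full-width value, so it is not expected to need a
 * sub-register ("w") jump.
 */
long long cal64(long long a) { return a > 0; }
```

Both functions behave identically on a host compiler; the difference only
matters for the instructions the bpf back-end selects, which is why the
Makefile inspects the generated assembly rather than running anything.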

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 tools/testing/selftests/bpf/Makefile | 21 ++++++++-------------
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 383d2ff..70b2570 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -35,20 +35,15 @@ BPF_OBJ_FILES = \
 	sendmsg4_prog.o sendmsg6_prog.o test_lirc_mode2_kern.o \
 	get_cgroup_id_kern.o socket_cookie_prog.o test_select_reuseport_kern.o \
 	test_skb_cgroup_id_kern.o bpf_flow.o netcnt_prog.o test_xdp_vlan.o \
-	xdp_dummy.o test_map_in_map.o test_spin_lock.o test_map_lock.o
-
-# Objects are built with default compilation flags and with sub-register
-# code-gen enabled.
-BPF_OBJ_FILES_DUAL_COMPILE = \
-	test_pkt_access.o test_pkt_access.o test_xdp.o test_adjust_tail.o \
-	test_l4lb.o test_l4lb_noinline.o test_xdp_noinline.o test_tcp_estats.o \
+	xdp_dummy.o test_map_in_map.o test_spin_lock.o test_map_lock.o \
+	test_pkt_access.o test_xdp.o test_adjust_tail.o test_l4lb.o \
+	test_l4lb_noinline.o test_xdp_noinline.o test_tcp_estats.o \
 	test_obj_id.o test_pkt_md_access.o test_tracepoint.o \
-	test_stacktrace_map.o test_stacktrace_map.o test_stacktrace_build_id.o \
-	test_stacktrace_build_id.o test_get_stack_rawtp.o \
-	test_get_stack_rawtp.o test_tracepoint.o test_sk_lookup_kern.o \
-	test_queue_map.o test_stack_map.o
+	test_stacktrace_map.o test_stacktrace_build_id.o \
+	test_get_stack_rawtp.o test_sk_lookup_kern.o test_queue_map.o \
+	test_stack_map.o
 
-TEST_GEN_FILES = $(BPF_OBJ_FILES) $(BPF_OBJ_FILES_DUAL_COMPILE)
+TEST_GEN_FILES = $(BPF_OBJ_FILES)
 
 # Also test sub-register code-gen if LLVM + kernel both has eBPF v3 processor
 # support which is the first version to contain both ALU32 and JMP32
@@ -58,7 +53,7 @@ SUBREG_CODEGEN := $(shell echo "int cal(int a) { return a > 0; }" | \
 			$(LLC) -mattr=+alu32 -mcpu=probe 2>&1 | \
 			grep 'if w')
 ifneq ($(SUBREG_CODEGEN),)
-TEST_GEN_FILES += $(patsubst %.o,alu32/%.o, $(BPF_OBJ_FILES_DUAL_COMPILE))
+TEST_GEN_FILES += $(patsubst %.o,alu32/%.o, $(BPF_OBJ_FILES))
 endif
 
 # Order correspond to 'make run_tests' order
-- 
2.7.4


^ permalink raw reply related

* [PATCH bpf-next 1/3] selftests: bpf: add "alu32" to .gitignore
From: Jiong Wang @ 2019-02-08 17:41 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: netdev, oss-drivers, Jiong Wang
In-Reply-To: <1549647681-13818-1-git-send-email-jiong.wang@netronome.com>

"alu32" is a build dir and contains various files for BPF sub-register
code-gen testing.

This patch tells git to ignore it.

Suggested-by: Yonghong Song <yhs@fb.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
---
 tools/testing/selftests/bpf/.gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore
index dd093bd..e47168d 100644
--- a/tools/testing/selftests/bpf/.gitignore
+++ b/tools/testing/selftests/bpf/.gitignore
@@ -29,3 +29,4 @@ test_netcnt
 test_section_names
 test_tcpnotify_user
 test_libbpf
+alu32
-- 
2.7.4


^ permalink raw reply related

* [PATCH 13/19] net: split out functions related to registering inflight socket files
From: Jens Axboe @ 2019-02-08 17:34 UTC (permalink / raw)
  To: linux-aio, linux-block, linux-api
  Cc: hch, jmoyer, avi, jannh, viro, Jens Axboe, netdev,
	David S . Miller
In-Reply-To: <20190208173423.27014-1-axboe@kernel.dk>

We need this functionality for the io_uring file registration, but we
cannot rely on it being available since CONFIG_UNIX can be modular. Move
the helpers to a separate file that is always built into the kernel when
CONFIG_UNIX is m or y.

No functional changes in this patch, just moving code around.
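
As an illustration of what the moved helpers account for, here is a
minimal userspace model of the inflight bookkeeping done by
unix_inflight() and unix_notinflight(). It is a sketch only: the model_*
types and functions are hypothetical stand-ins, the gc list is reduced to
a boolean, and the unix_gc_lock spinlock is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_sock {
	long inflight;   /* models unix_sock->inflight */
	bool on_gc_list; /* models membership in gc_inflight_list */
};

struct model_user {
	long unix_inflight; /* models user_struct->unix_inflight */
};

static unsigned int model_tot_inflight; /* models unix_tot_inflight */

/* Same shape as unix_inflight(): sk is NULL for files that are not sockets. */
static void model_inflight(struct model_user *user, struct model_sock *sk)
{
	if (sk) {
		/* The first reference puts the socket on the gc list. */
		if (++sk->inflight == 1)
			sk->on_gc_list = true;
		model_tot_inflight++;
	}
	/* The per-user count is bumped for every file, socket or not. */
	user->unix_inflight++;
}

/* Same shape as unix_notinflight(): undo one model_inflight() call. */
static void model_notinflight(struct model_user *user, struct model_sock *sk)
{
	if (sk) {
		assert(sk->inflight > 0 && sk->on_gc_list);
		/* Dropping the last reference takes it off the gc list. */
		if (--sk->inflight == 0)
			sk->on_gc_list = false;
		model_tot_inflight--;
	}
	user->unix_inflight--;
}
```

In the real code unix_attach_fds() calls unix_inflight() once per passed
file and unix_detach_fds() calls unix_notinflight() once per file, so the
two counters always move in matched pairs.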

Cc: netdev@vger.kernel.org
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 include/net/af_unix.h |   1 +
 net/unix/Kconfig      |   5 ++
 net/unix/Makefile     |   2 +
 net/unix/af_unix.c    |  63 +-----------------
 net/unix/garbage.c    |  71 +-------------------
 net/unix/scm.c        | 146 ++++++++++++++++++++++++++++++++++++++++++
 net/unix/scm.h        |  10 +++
 7 files changed, 168 insertions(+), 130 deletions(-)
 create mode 100644 net/unix/scm.c
 create mode 100644 net/unix/scm.h

diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index ddbba838d048..3426d6dacc45 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -10,6 +10,7 @@
 
 void unix_inflight(struct user_struct *user, struct file *fp);
 void unix_notinflight(struct user_struct *user, struct file *fp);
+void unix_destruct_scm(struct sk_buff *skb);
 void unix_gc(void);
 void wait_for_unix_gc(void);
 struct sock *unix_get_socket(struct file *filp);
diff --git a/net/unix/Kconfig b/net/unix/Kconfig
index 8b31ab85d050..3b9e450656a4 100644
--- a/net/unix/Kconfig
+++ b/net/unix/Kconfig
@@ -19,6 +19,11 @@ config UNIX
 
 	  Say Y unless you know what you are doing.
 
+config UNIX_SCM
+	bool
+	depends on UNIX
+	default y
+
 config UNIX_DIAG
 	tristate "UNIX: socket monitoring interface"
 	depends on UNIX
diff --git a/net/unix/Makefile b/net/unix/Makefile
index ffd0a275c3a7..54e58cc4f945 100644
--- a/net/unix/Makefile
+++ b/net/unix/Makefile
@@ -10,3 +10,5 @@ unix-$(CONFIG_SYSCTL)	+= sysctl_net_unix.o
 
 obj-$(CONFIG_UNIX_DIAG)	+= unix_diag.o
 unix_diag-y		:= diag.o
+
+obj-$(CONFIG_UNIX_SCM)	+= scm.o
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 74d1eed7cbd4..2ce32dbb2feb 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -119,6 +119,8 @@
 #include <linux/freezer.h>
 #include <linux/file.h>
 
+#include "scm.h"
+
 struct hlist_head unix_socket_table[2 * UNIX_HASH_SIZE];
 EXPORT_SYMBOL_GPL(unix_socket_table);
 DEFINE_SPINLOCK(unix_table_lock);
@@ -1486,67 +1488,6 @@ static int unix_getname(struct socket *sock, struct sockaddr *uaddr, int peer)
 	return err;
 }
 
-static void unix_detach_fds(struct scm_cookie *scm, struct sk_buff *skb)
-{
-	int i;
-
-	scm->fp = UNIXCB(skb).fp;
-	UNIXCB(skb).fp = NULL;
-
-	for (i = scm->fp->count-1; i >= 0; i--)
-		unix_notinflight(scm->fp->user, scm->fp->fp[i]);
-}
-
-static void unix_destruct_scm(struct sk_buff *skb)
-{
-	struct scm_cookie scm;
-	memset(&scm, 0, sizeof(scm));
-	scm.pid  = UNIXCB(skb).pid;
-	if (UNIXCB(skb).fp)
-		unix_detach_fds(&scm, skb);
-
-	/* Alas, it calls VFS */
-	/* So fscking what? fput() had been SMP-safe since the last Summer */
-	scm_destroy(&scm);
-	sock_wfree(skb);
-}
-
-/*
- * The "user->unix_inflight" variable is protected by the garbage
- * collection lock, and we just read it locklessly here. If you go
- * over the limit, there might be a tiny race in actually noticing
- * it across threads. Tough.
- */
-static inline bool too_many_unix_fds(struct task_struct *p)
-{
-	struct user_struct *user = current_user();
-
-	if (unlikely(user->unix_inflight > task_rlimit(p, RLIMIT_NOFILE)))
-		return !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN);
-	return false;
-}
-
-static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb)
-{
-	int i;
-
-	if (too_many_unix_fds(current))
-		return -ETOOMANYREFS;
-
-	/*
-	 * Need to duplicate file references for the sake of garbage
-	 * collection.  Otherwise a socket in the fps might become a
-	 * candidate for GC while the skb is not yet queued.
-	 */
-	UNIXCB(skb).fp = scm_fp_dup(scm->fp);
-	if (!UNIXCB(skb).fp)
-		return -ENOMEM;
-
-	for (i = scm->fp->count - 1; i >= 0; i--)
-		unix_inflight(scm->fp->user, scm->fp->fp[i]);
-	return 0;
-}
-
 static int unix_scm_to_skb(struct scm_cookie *scm, struct sk_buff *skb, bool send_fds)
 {
 	int err = 0;
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index f81854d74c7d..8bbe1b8e4ff7 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -86,80 +86,13 @@
 #include <net/scm.h>
 #include <net/tcp_states.h>
 
+#include "scm.h"
+
 /* Internal data structures and random procedures: */
 
-static LIST_HEAD(gc_inflight_list);
 static LIST_HEAD(gc_candidates);
-static DEFINE_SPINLOCK(unix_gc_lock);
 static DECLARE_WAIT_QUEUE_HEAD(unix_gc_wait);
 
-unsigned int unix_tot_inflight;
-
-struct sock *unix_get_socket(struct file *filp)
-{
-	struct sock *u_sock = NULL;
-	struct inode *inode = file_inode(filp);
-
-	/* Socket ? */
-	if (S_ISSOCK(inode->i_mode) && !(filp->f_mode & FMODE_PATH)) {
-		struct socket *sock = SOCKET_I(inode);
-		struct sock *s = sock->sk;
-
-		/* PF_UNIX ? */
-		if (s && sock->ops && sock->ops->family == PF_UNIX)
-			u_sock = s;
-	} else {
-		/* Could be an io_uring instance */
-		u_sock = io_uring_get_socket(filp);
-	}
-	return u_sock;
-}
-
-/* Keep the number of times in flight count for the file
- * descriptor if it is for an AF_UNIX socket.
- */
-
-void unix_inflight(struct user_struct *user, struct file *fp)
-{
-	struct sock *s = unix_get_socket(fp);
-
-	spin_lock(&unix_gc_lock);
-
-	if (s) {
-		struct unix_sock *u = unix_sk(s);
-
-		if (atomic_long_inc_return(&u->inflight) == 1) {
-			BUG_ON(!list_empty(&u->link));
-			list_add_tail(&u->link, &gc_inflight_list);
-		} else {
-			BUG_ON(list_empty(&u->link));
-		}
-		unix_tot_inflight++;
-	}
-	user->unix_inflight++;
-	spin_unlock(&unix_gc_lock);
-}
-
-void unix_notinflight(struct user_struct *user, struct file *fp)
-{
-	struct sock *s = unix_get_socket(fp);
-
-	spin_lock(&unix_gc_lock);
-
-	if (s) {
-		struct unix_sock *u = unix_sk(s);
-
-		BUG_ON(!atomic_long_read(&u->inflight));
-		BUG_ON(list_empty(&u->link));
-
-		if (atomic_long_dec_and_test(&u->inflight))
-			list_del_init(&u->link);
-		unix_tot_inflight--;
-	}
-	user->unix_inflight--;
-	spin_unlock(&unix_gc_lock);
-}
-
 static void scan_inflight(struct sock *x, void (*func)(struct unix_sock *),
 			  struct sk_buff_head *hitlist)
 {
diff --git a/net/unix/scm.c b/net/unix/scm.c
new file mode 100644
index 000000000000..ed1624588934
--- /dev/null
+++ b/net/unix/scm.c
@@ -0,0 +1,146 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/socket.h>
+#include <linux/net.h>
+#include <linux/fs.h>
+#include <net/af_unix.h>
+#include <net/scm.h>
+#include <linux/init.h>
+
+#include "scm.h"
+
+unsigned int unix_tot_inflight;
+
+LIST_HEAD(gc_inflight_list);
+DEFINE_SPINLOCK(unix_gc_lock);
+
+struct sock *unix_get_socket(struct file *filp)
+{
+	struct sock *u_sock = NULL;
+	struct inode *inode = file_inode(filp);
+
+	/* Socket ? */
+	if (S_ISSOCK(inode->i_mode) && !(filp->f_mode & FMODE_PATH)) {
+		struct socket *sock = SOCKET_I(inode);
+		struct sock *s = sock->sk;
+
+		/* PF_UNIX ? */
+		if (s && sock->ops && sock->ops->family == PF_UNIX)
+			u_sock = s;
+	} else {
+		/* Could be an io_uring instance */
+		u_sock = io_uring_get_socket(filp);
+	}
+	return u_sock;
+}
+EXPORT_SYMBOL(unix_get_socket);
+
+/* Keep the number of times in flight count for the file
+ * descriptor if it is for an AF_UNIX socket.
+ */
+void unix_inflight(struct user_struct *user, struct file *fp)
+{
+	struct sock *s = unix_get_socket(fp);
+
+	spin_lock(&unix_gc_lock);
+
+	if (s) {
+		struct unix_sock *u = unix_sk(s);
+
+		if (atomic_long_inc_return(&u->inflight) == 1) {
+			BUG_ON(!list_empty(&u->link));
+			list_add_tail(&u->link, &gc_inflight_list);
+		} else {
+			BUG_ON(list_empty(&u->link));
+		}
+		unix_tot_inflight++;
+	}
+	user->unix_inflight++;
+	spin_unlock(&unix_gc_lock);
+}
+
+void unix_notinflight(struct user_struct *user, struct file *fp)
+{
+	struct sock *s = unix_get_socket(fp);
+
+	spin_lock(&unix_gc_lock);
+
+	if (s) {
+		struct unix_sock *u = unix_sk(s);
+
+		BUG_ON(!atomic_long_read(&u->inflight));
+		BUG_ON(list_empty(&u->link));
+
+		if (atomic_long_dec_and_test(&u->inflight))
+			list_del_init(&u->link);
+		unix_tot_inflight--;
+	}
+	user->unix_inflight--;
+	spin_unlock(&unix_gc_lock);
+}
+
+/*
+ * The "user->unix_inflight" variable is protected by the garbage
+ * collection lock, and we just read it locklessly here. If you go
+ * over the limit, there might be a tiny race in actually noticing
+ * it across threads. Tough.
+ */
+static inline bool too_many_unix_fds(struct task_struct *p)
+{
+	struct user_struct *user = current_user();
+
+	if (unlikely(user->unix_inflight > task_rlimit(p, RLIMIT_NOFILE)))
+		return !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN);
+	return false;
+}
+
+int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb)
+{
+	int i;
+
+	if (too_many_unix_fds(current))
+		return -ETOOMANYREFS;
+
+	/*
+	 * Need to duplicate file references for the sake of garbage
+	 * collection.  Otherwise a socket in the fps might become a
+	 * candidate for GC while the skb is not yet queued.
+	 */
+	UNIXCB(skb).fp = scm_fp_dup(scm->fp);
+	if (!UNIXCB(skb).fp)
+		return -ENOMEM;
+
+	for (i = scm->fp->count - 1; i >= 0; i--)
+		unix_inflight(scm->fp->user, scm->fp->fp[i]);
+	return 0;
+}
+EXPORT_SYMBOL(unix_attach_fds);
+
+void unix_detach_fds(struct scm_cookie *scm, struct sk_buff *skb)
+{
+	int i;
+
+	scm->fp = UNIXCB(skb).fp;
+	UNIXCB(skb).fp = NULL;
+
+	for (i = scm->fp->count-1; i >= 0; i--)
+		unix_notinflight(scm->fp->user, scm->fp->fp[i]);
+}
+EXPORT_SYMBOL(unix_detach_fds);
+
+void unix_destruct_scm(struct sk_buff *skb)
+{
+	struct scm_cookie scm;
+
+	memset(&scm, 0, sizeof(scm));
+	scm.pid  = UNIXCB(skb).pid;
+	if (UNIXCB(skb).fp)
+		unix_detach_fds(&scm, skb);
+
+	/* Alas, it calls VFS */
+	/* So fscking what? fput() had been SMP-safe since the last Summer */
+	scm_destroy(&scm);
+	sock_wfree(skb);
+}
diff --git a/net/unix/scm.h b/net/unix/scm.h
new file mode 100644
index 000000000000..5a255a477f16
--- /dev/null
+++ b/net/unix/scm.h
@@ -0,0 +1,10 @@
+#ifndef NET_UNIX_SCM_H
+#define NET_UNIX_SCM_H
+
+extern struct list_head gc_inflight_list;
+extern spinlock_t unix_gc_lock;
+
+int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb);
+void unix_detach_fds(struct scm_cookie *scm, struct sk_buff *skb);
+
+#endif
-- 
2.17.1


^ permalink raw reply related

* Re: [PATCH v3 bpf-next 4/4] tools/bpf: remove btf__get_strings() superseded by raw data API
From: Song Liu @ 2019-02-08 17:31 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrii Nakryiko, Yonghong Song,
	Alexei Starovoitov, Martin Lau, netdev@vger.kernel.org,
	Kernel Team, daniel@iogearbox.net
In-Reply-To: <20190208025555.4027769-5-andriin@fb.com>



> On Feb 7, 2019, at 6:55 PM, Andrii Nakryiko <andriin@fb.com> wrote:
> 
> Now that we have btf__get_raw_data() it's trivial for tests to iterate
> over all strings for testing purposes, which eliminates the need for
> btf__get_strings() API.
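
A rough sketch of what such an iteration could look like, given only the
pointer and size that btf__get_raw_data() returns. The header struct is a
local mirror of the kernel's struct btf_header (an assumption, declared
here to keep the example self-contained), and for_each_btf_string() is a
hypothetical helper, not a libbpf function. The magic and version fields
are not validated.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct btf_header; assumed to match the uapi layout. */
struct raw_btf_header {
	uint16_t magic;
	uint8_t  version;
	uint8_t  flags;
	uint32_t hdr_len;
	uint32_t type_off;	/* offsets are relative to the end of the header */
	uint32_t type_len;
	uint32_t str_off;
	uint32_t str_len;
};

typedef void (*btf_str_cb)(const char *s, void *ctx);

/*
 * Hypothetical helper: call cb (if non-NULL) for each NUL-terminated string
 * in the string section. Returns the number of strings, or -1 if the
 * section does not fit inside the buffer or is not NUL-terminated.
 */
static int for_each_btf_string(const void *raw, uint32_t size,
			       btf_str_cb cb, void *ctx)
{
	struct raw_btf_header hdr;
	const char *strs, *end, *cur;
	int n = 0;

	if (size < sizeof(hdr))
		return -1;
	memcpy(&hdr, raw, sizeof(hdr));

	/* Bounds checks written to avoid unsigned overflow. */
	if (hdr.hdr_len > size ||
	    hdr.str_off > size - hdr.hdr_len ||
	    hdr.str_len > size - hdr.hdr_len - hdr.str_off)
		return -1;
	if (hdr.str_len == 0)
		return 0;

	strs = (const char *)raw + hdr.hdr_len + hdr.str_off;
	end = strs + hdr.str_len;
	if (end[-1] != '\0')
		return -1;

	for (cur = strs; cur < end; cur += strlen(cur) + 1) {
		if (cb)
			cb(cur, ctx);
		n++;
	}
	return n;
}
```

A caller would pass the buffer and size obtained from btf__get_raw_data()
and a callback that prints or compares each string.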
> 
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---
> tools/lib/bpf/btf.c                    |  7 -----
> tools/lib/bpf/btf.h                    |  2 --
> tools/testing/selftests/bpf/test_btf.c | 39 +++++++++++++++++---------
> 3 files changed, 26 insertions(+), 22 deletions(-)
> 
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index c87cc3d71b9f..a986dc28f17d 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -447,13 +447,6 @@ const void *btf__get_raw_data(const struct btf *btf, __u32 *size)
> 	return btf->data;
> }
> 
> -void btf__get_strings(const struct btf *btf, const char **strings,
> -		      __u32 *str_len)
> -{
> -	*strings = btf->strings;
> -	*str_len = btf->hdr->str_len;
> -}
> -
> const char *btf__name_by_offset(const struct btf *btf, __u32 offset)
> {
> 	if (offset < btf->hdr->str_len)
> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> index ad9f648260c2..6179291f2cec 100644
> --- a/tools/lib/bpf/btf.h
> +++ b/tools/lib/bpf/btf.h
> @@ -67,8 +67,6 @@ LIBBPF_API __s64 btf__resolve_size(const struct btf *btf, __u32 type_id);
> LIBBPF_API int btf__resolve_type(const struct btf *btf, __u32 type_id);
> LIBBPF_API int btf__fd(const struct btf *btf);
> LIBBPF_API const void *btf__get_raw_data(const struct btf *btf, __u32 *size);
> -LIBBPF_API void btf__get_strings(const struct btf *btf, const char **strings,
> -				 __u32 *str_len);

I guess we need to update libbpf.map with this? 

> LIBBPF_API const char *btf__name_by_offset(const struct btf *btf, __u32 offset);
> LIBBPF_API int btf__get_from_id(__u32 id, struct btf **btf);
> LIBBPF_API int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
> diff --git a/tools/testing/selftests/bpf/test_btf.c b/tools/testing/selftests/bpf/test_btf.c
> index 447acc34db94..bbcacba39590 100644
> --- a/tools/testing/selftests/bpf/test_btf.c
> +++ b/tools/testing/selftests/bpf/test_btf.c
> @@ -5882,15 +5882,17 @@ static void dump_btf_strings(const char *strs, __u32 len)
> static int do_test_dedup(unsigned int test_num)
> {
> 	const struct btf_dedup_test *test = &dedup_tests[test_num - 1];
> -	int err = 0, i;
> -	__u32 test_nr_types, expect_nr_types, test_str_len, expect_str_len;
> -	void *raw_btf;
> -	unsigned int raw_btf_size;
> +	__u32 test_nr_types, expect_nr_types, test_btf_size, expect_btf_size;
> +	const struct btf_header *test_hdr, *expect_hdr;
> 	struct btf *test_btf = NULL, *expect_btf = NULL;
> +	const void *test_btf_data, *expect_btf_data;
> 	const char *ret_test_next_str, *ret_expect_next_str;
> 	const char *test_strs, *expect_strs;
> 	const char *test_str_cur, *test_str_end;
> 	const char *expect_str_cur, *expect_str_end;
> +	unsigned int raw_btf_size;
> +	void *raw_btf;
> +	int err = 0, i;
> 
> 	fprintf(stderr, "BTF dedup test[%u] (%s):", test_num, test->descr);
> 
> @@ -5927,23 +5929,34 @@ static int do_test_dedup(unsigned int test_num)
> 		goto done;
> 	}
> 
> -	btf__get_strings(test_btf, &test_strs, &test_str_len);
> -	btf__get_strings(expect_btf, &expect_strs, &expect_str_len);
> -	if (CHECK(test_str_len != expect_str_len,
> -		  "test_str_len:%u != expect_str_len:%u",
> -		  test_str_len, expect_str_len)) {
> +	test_btf_data = btf__get_raw_data(test_btf, &test_btf_size);
> +	expect_btf_data = btf__get_raw_data(expect_btf, &expect_btf_size);
> +	if (CHECK(test_btf_size != expect_btf_size,
> +		  "test_btf_size:%u != expect_btf_size:%u",
> +		  test_btf_size, expect_btf_size)) {
> +		err = -1;
> +		goto done;
> +	}
> +
> +	test_hdr = test_btf_data;
> +	test_strs = test_btf_data + sizeof(*test_hdr) + test_hdr->str_off;
> +	expect_hdr = expect_btf_data;
> +	expect_strs = expect_btf_data + sizeof(*expect_hdr) + expect_hdr->str_off;
> +	if (CHECK(test_hdr->str_len != expect_hdr->str_len,
> +		  "test_hdr->str_len:%u != expect_hdr->str_len:%u",
> +		  test_hdr->str_len, expect_hdr->str_len)) {
> 		fprintf(stderr, "\ntest strings:\n");
> -		dump_btf_strings(test_strs, test_str_len);
> +		dump_btf_strings(test_strs, test_hdr->str_len);
> 		fprintf(stderr, "\nexpected strings:\n");
> -		dump_btf_strings(expect_strs, expect_str_len);
> +		dump_btf_strings(expect_strs, expect_hdr->str_len);
> 		err = -1;
> 		goto done;
> 	}
> 
> 	test_str_cur = test_strs;
> -	test_str_end = test_strs + test_str_len;
> +	test_str_end = test_strs + test_hdr->str_len;
> 	expect_str_cur = expect_strs;
> -	expect_str_end = expect_strs + expect_str_len;
> +	expect_str_end = expect_strs + expect_hdr->str_len;
> 	while (test_str_cur < test_str_end && expect_str_cur < expect_str_end) {
> 		size_t test_len, expect_len;
> 
> -- 
> 2.17.1
> 


^ permalink raw reply

* Re: [PATCH net-next] devlink: Add WARN_ON to catch errors of not cleaning devlink objects
From: David Ahern @ 2019-02-08 17:30 UTC (permalink / raw)
  To: Parav Pandit, netdev, davem
In-Reply-To: <1549642921-11196-1-git-send-email-parav@mellanox.com>

On 2/8/19 8:22 AM, Parav Pandit wrote:
> Add WARN_ON to make sure that all sub objects of a devlink device are
> cleaned up before freeing the devlink device.
> This helps to catch any driver bugs.
> 
> Signed-off-by: Parav Pandit <parav@mellanox.com>
> Acked-by: Jiri Pirko <jiri@mellanox.com>
> ---
>  net/core/devlink.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/net/core/devlink.c b/net/core/devlink.c
> index cd0d393..5e2ef5a 100644
> --- a/net/core/devlink.c
> +++ b/net/core/devlink.c
> @@ -4229,6 +4229,13 @@ void devlink_unregister(struct devlink *devlink)
>   */
>  void devlink_free(struct devlink *devlink)
>  {
> +	WARN_ON(!list_empty(&devlink->port_list));
> +	WARN_ON(!list_empty(&devlink->sb_list));
> +	WARN_ON(!list_empty(&devlink->dpipe_table_list));
> +	WARN_ON(!list_empty(&devlink->resource_list));
> +	WARN_ON(!list_empty(&devlink->param_list));
> +	WARN_ON(!list_empty(&devlink->region_list));
> +
>  	kfree(devlink);
>  }
>  EXPORT_SYMBOL_GPL(devlink_free);
> 

reporter_list was just added which brings up the maintenance question:
If you are going to do this you might want a comment in
include/net/devlink.h to remind folks to update this function as relevant.
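
To make the maintenance concern concrete, here is a minimal userspace
sketch, not the devlink code itself. The demo_* names and the tiny
list implementation are hypothetical stand-ins; the point is only that
the reminder comment sits next to the struct, and the check has to
list every member by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_node(struct list_head *n, struct list_head *h)
{
	n->next = h->next;
	n->prev = h;
	h->next->prev = n;
	h->next = n;
}

/*
 * NOTE: every list added to this struct must also be checked in
 * demo_devlink_leaks(), otherwise leaked objects go unnoticed.
 */
struct demo_devlink {
	struct list_head port_list;
	struct list_head sb_list;
	struct list_head reporter_list;
};

static void demo_devlink_init(struct demo_devlink *d)
{
	assert(d != NULL);
	init_list_head(&d->port_list);
	init_list_head(&d->sb_list);
	init_list_head(&d->reporter_list);
}

/* Returns how many lists still hold objects at free time. */
static int demo_devlink_leaks(const struct demo_devlink *d)
{
	int leaks = 0;

	assert(d != NULL);
	leaks += !list_is_empty(&d->port_list);
	leaks += !list_is_empty(&d->sb_list);
	leaks += !list_is_empty(&d->reporter_list);
	return leaks;
}
```

A list that is added to the struct but forgotten in the check is
silently skipped, which is why the comment belongs next to the struct
definition rather than next to the free function.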

^ permalink raw reply

* Re: [PATCH v3 bpf-next 3/4] btf: expose API to work with raw btf_ext data
From: Andrii Nakryiko @ 2019-02-08 17:15 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Andrii Nakryiko, alexei.starovoitov@gmail.com, Song Liu,
	Alexei Starovoitov, Martin Lau, netdev@vger.kernel.org,
	Kernel Team, daniel@iogearbox.net
In-Reply-To: <e4c11465-31a9-5dd3-7408-b0d5665bc391@fb.com>

On Fri, Feb 8, 2019 at 9:13 AM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 2/7/19 6:55 PM, Andrii Nakryiko wrote:
> > This patch changes struct btf_ext to retain original data in sequential
> > block of memory, which makes it possible to expose
> > btf_ext__get_raw_data() interface, that's similar to
> > btf__get_raw_data(), allowing users of libbpf to get access to raw
> > representation of .BTF.ext section.
> >
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> >   tools/lib/bpf/btf.c      | 85 +++++++++++++++++++++-------------------
> >   tools/lib/bpf/btf.h      |  2 +
> >   tools/lib/bpf/libbpf.map |  1 +
> >   3 files changed, 47 insertions(+), 41 deletions(-)
> >
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index 8730f6c3be9e..c87cc3d71b9f 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -41,9 +41,8 @@ struct btf {
> >
> >   struct btf_ext_info {
> >       /*
> > -      * info points to a deep copy of the individual info section
> > -      * (e.g. func_info and line_info) from the .BTF.ext.
> > -      * It does not include the __u32 rec_size.
> > +      * info points to the individual info section (e.g. func_info and
> > +      * line_info) from the .BTF.ext. It does not include the __u32 rec_size.
> >        */
> >       void *info;
> >       __u32 rec_size;
> > @@ -51,8 +50,13 @@ struct btf_ext_info {
> >   };
> >
> >   struct btf_ext {
> > +     union {
> > +             struct btf_ext_header *hdr;
> > +             void *data;
> > +     };
> >       struct btf_ext_info func_info;
> >       struct btf_ext_info line_info;
> > +     __u32 data_size;
> >   };
> >
> >   struct btf_ext_info_sec {
> > @@ -603,19 +607,13 @@ struct btf_ext_sec_copy_param {
> >   };
> >
> >   static int btf_ext_copy_info(struct btf_ext *btf_ext,
> > -                          __u8 *data, __u32 data_size,
> >                            struct btf_ext_sec_copy_param *ext_sec)
>
> Overall looks good. Since we do not really "copy" info any more,
> rather we try to "setup" info based on btf_ext. Maybe changing
> all function and structure names with "_copy_" to "_setup_"?

Makes sense, will update.

>
> >   {
> > -     const struct btf_ext_header *hdr = (struct btf_ext_header *)data;
> >       const struct btf_ext_info_sec *sinfo;
> >       struct btf_ext_info *ext_info;
> >       __u32 info_left, record_size;
> >       /* The start of the info sec (including the __u32 record_size). */
> > -     const void *info;
> > -
> > -     /* data and data_size do not include btf_ext_header from now on */
> > -     data = data + hdr->hdr_len;
> > -     data_size -= hdr->hdr_len;
> > +     void *info;
> [...]

^ permalink raw reply

* Re: [PATCH v3 bpf-next 4/4] tools/bpf: remove btf__get_strings superseded() by raw data API
From: Yonghong Song @ 2019-02-08 17:13 UTC (permalink / raw)
  To: Andrii Nakryiko, alexei.starovoitov@gmail.com,
	andrii.nakryiko@gmail.com, Song Liu, Alexei Starovoitov,
	Martin Lau, netdev@vger.kernel.org, Kernel Team,
	daniel@iogearbox.net
In-Reply-To: <20190208025555.4027769-5-andriin@fb.com>



On 2/7/19 6:55 PM, Andrii Nakryiko wrote:
> Now that we have btf__get_raw_data() it's trivial for tests to iterate
> over all strings for testing purposes, which eliminates the need for
> btf__get_strings() API.
> 
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

Acked-by: Yonghong Song <yhs@fb.com>
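
For reference, a rough sketch of what iterating over all strings from
a raw BTF blob could look like. This is not the selftest code:
demo_btf_header and demo_count_strings() are hypothetical names, the
struct only mirrors the field layout of struct btf_header, and the
sketch assumes str_off is relative to the end of the header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the field layout of struct btf_header (hypothetical copy). */
struct demo_btf_header {
	uint16_t magic;
	uint8_t  version;
	uint8_t  flags;
	uint32_t hdr_len;
	uint32_t type_off;
	uint32_t type_len;
	uint32_t str_off;
	uint32_t str_len;
};

/*
 * Count the NUL-terminated strings in the string section of a raw
 * blob. Returns -1 if the section does not fit in the blob or the
 * last string is not terminated.
 */
static int demo_count_strings(const void *data, uint32_t size)
{
	struct demo_btf_header hdr;
	const char *cur, *end;
	int n = 0;

	assert(data != NULL);
	if (size < sizeof(hdr))
		return -1;
	memcpy(&hdr, data, sizeof(hdr));	/* no alignment assumptions */

	if (hdr.hdr_len > size)
		return -1;
	if (hdr.str_off > size - hdr.hdr_len)
		return -1;
	if (hdr.str_len > size - hdr.hdr_len - hdr.str_off)
		return -1;

	/* Assumption: offsets are relative to the end of the header. */
	cur = (const char *)data + hdr.hdr_len + hdr.str_off;
	end = cur + hdr.str_len;
	while (cur < end) {
		const char *nul = memchr(cur, '\0', (size_t)(end - cur));

		if (!nul)
			return -1;
		n++;
		cur = nul + 1;
	}
	return n;
}
```

With the real API, the pointer and size would come from
btf__get_raw_data(btf, &size) as in the patch.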

^ permalink raw reply

* Re: [PATCH v3 bpf-next 3/4] btf: expose API to work with raw btf_ext data
From: Yonghong Song @ 2019-02-08 17:12 UTC (permalink / raw)
  To: Andrii Nakryiko, alexei.starovoitov@gmail.com,
	andrii.nakryiko@gmail.com, Song Liu, Alexei Starovoitov,
	Martin Lau, netdev@vger.kernel.org, Kernel Team,
	daniel@iogearbox.net
In-Reply-To: <20190208025555.4027769-4-andriin@fb.com>



On 2/7/19 6:55 PM, Andrii Nakryiko wrote:
> This patch changes struct btf_ext to retain original data in sequential
> block of memory, which makes it possible to expose
> btf_ext__get_raw_data() interface, that's similar to
> btf__get_raw_data(), allowing users of libbpf to get access to raw
> representation of .BTF.ext section.
> 
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---
>   tools/lib/bpf/btf.c      | 85 +++++++++++++++++++++-------------------
>   tools/lib/bpf/btf.h      |  2 +
>   tools/lib/bpf/libbpf.map |  1 +
>   3 files changed, 47 insertions(+), 41 deletions(-)
> 
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 8730f6c3be9e..c87cc3d71b9f 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -41,9 +41,8 @@ struct btf {
>   
>   struct btf_ext_info {
>   	/*
> -	 * info points to a deep copy of the individual info section
> -	 * (e.g. func_info and line_info) from the .BTF.ext.
> -	 * It does not include the __u32 rec_size.
> +	 * info points to the individual info section (e.g. func_info and
> +	 * line_info) from the .BTF.ext. It does not include the __u32 rec_size.
>   	 */
>   	void *info;
>   	__u32 rec_size;
> @@ -51,8 +50,13 @@ struct btf_ext_info {
>   };
>   
>   struct btf_ext {
> +	union {
> +		struct btf_ext_header *hdr;
> +		void *data;
> +	};
>   	struct btf_ext_info func_info;
>   	struct btf_ext_info line_info;
> +	__u32 data_size;
>   };
>   
>   struct btf_ext_info_sec {
> @@ -603,19 +607,13 @@ struct btf_ext_sec_copy_param {
>   };
>   
>   static int btf_ext_copy_info(struct btf_ext *btf_ext,
> -			     __u8 *data, __u32 data_size,
>   			     struct btf_ext_sec_copy_param *ext_sec)

Overall looks good. Since we do not really "copy" info any more,
rather we try to "setup" info based on btf_ext. Maybe changing
all function and structure names with "_copy_" to "_setup_"?

>   {
> -	const struct btf_ext_header *hdr = (struct btf_ext_header *)data;
>   	const struct btf_ext_info_sec *sinfo;
>   	struct btf_ext_info *ext_info;
>   	__u32 info_left, record_size;
>   	/* The start of the info sec (including the __u32 record_size). */
> -	const void *info;
> -
> -	/* data and data_size do not include btf_ext_header from now on */
> -	data = data + hdr->hdr_len;
> -	data_size -= hdr->hdr_len;
> +	void *info;
[...]
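
To illustrate the "copy" vs. "setup" distinction, a simplified sketch
under stated assumptions; it is not the libbpf code. The demo_* types
are hypothetical, and the info section is assumed to start with a
__u32 record size, as the comment in the patch describes. The raw
section is copied once, and each info descriptor only points into that
retained block.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct demo_ext_info {
	const void *info;	/* points into demo_ext.data, not a copy */
	uint32_t rec_size;
	uint32_t len;
};

struct demo_ext {
	void *data;		/* single retained copy of the raw section */
	uint32_t data_size;
};

static int demo_ext_init(struct demo_ext *ext, const void *raw, uint32_t size)
{
	assert(ext != NULL && raw != NULL);
	ext->data = malloc(size);
	if (!ext->data)
		return -1;
	memcpy(ext->data, raw, size);
	ext->data_size = size;
	return 0;
}

/*
 * "Set up" an info descriptor for the range [off, off + len) of the
 * retained block. The first 4 bytes of the range hold the record size.
 */
static int demo_ext_setup_info(const struct demo_ext *ext, uint32_t off,
			       uint32_t len, struct demo_ext_info *out)
{
	const char *sec;

	if (off > ext->data_size || len > ext->data_size - off ||
	    len < sizeof(uint32_t))
		return -1;

	sec = (const char *)ext->data + off;
	memcpy(&out->rec_size, sec, sizeof(uint32_t));
	out->info = sec + sizeof(uint32_t);
	out->len = len - (uint32_t)sizeof(uint32_t);
	return 0;
}

static void demo_ext_free(struct demo_ext *ext)
{
	free(ext->data);
	ext->data = NULL;
	ext->data_size = 0;
}
```

Since nothing is duplicated per section, the whole block can be handed
back to callers as-is, which is what a raw-data getter needs.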

^ permalink raw reply

* Re: [PATCH net-next 1/2] mlxsw: spectrum_router: Offload blackhole routes
From: Ido Schimmel @ 2019-02-08 17:05 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev@vger.kernel.org, davem@davemloft.net, Jiri Pirko,
	Alexander Petrovskiy, mlxsw
In-Reply-To: <bfc7715f-a762-0ad7-9cb2-3c248eefad9e@gmail.com>

On Fri, Feb 08, 2019 at 08:24:49AM -0800, David Ahern wrote:
> On 2/7/19 11:34 PM, Ido Schimmel wrote:
> > I assume that user can't put blackhole and normal nexthops in the same
> > group?
> > 
> 
> I allow a nexthop group to reference a nexthop that is a blackhole, but
> the group can only contain the one entry. That allows multipath routes
> to toggle between a blackhole and a real spec.

Good, thanks. I was afraid users would be able to configure a nexthop
group that randomly drops flows :)
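
The rule described above (a blackhole nexthop is allowed in a group
only if it is the sole member) could be checked roughly like this.
This is a hypothetical sketch, not the nexthop code; demo_nexthop and
demo_nh_group_valid() are made-up names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_nexthop {
	bool blackhole;
};

/*
 * A group is valid if it is non-empty and either contains no
 * blackhole nexthop or consists of exactly one (blackhole) entry.
 */
static bool demo_nh_group_valid(const struct demo_nexthop *nh, size_t count)
{
	size_t i;

	assert(nh != NULL || count == 0);
	if (count == 0)
		return false;
	for (i = 0; i < count; i++) {
		if (nh[i].blackhole && count > 1)
			return false;
	}
	return true;
}
```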

^ permalink raw reply

* Re: Resource management for ndo_xdp_xmit (Was: [PATCH net] virtio_net: Account for tx bytes and packets on sending xdp_frames)
From: Toke Høiland-Jørgensen @ 2019-02-08 16:55 UTC (permalink / raw)
  To: Saeed Mahameed, brouer@redhat.com
  Cc: hawk@kernel.org, virtualization@lists.linux-foundation.org,
	borkmann@iogearbox.net, Tariq Toukan, john.fastabend@gmail.com,
	mst@redhat.com, jakub.kicinski@netronome.com, dsahern@gmail.com,
	netdev@vger.kernel.org, jasowang@redhat.com, davem@davemloft.net,
	makita.toshiaki@lab.ntt.co.jp
In-Reply-To: <9e5e6882566ac67276209b35ec112a824b256bff.camel@mellanox.com>

Saeed Mahameed <saeedm@mellanox.com> writes:

> But:
> 2) this won't totally solve our problem, since sometimes the driver can
> decide to recreate (change of configuration) hw resources on the fly
> while redirect/devmap is already happening, so we need some kind of a
> dev_map_notification or a flag with rcu synch, for when the driver wants
> to make the xdp redirect resources unavailable.

Good point, I'll make a note of this. Do you have a pointer to where the
mlx5 driver does this kind of change currently?

-Toke

^ permalink raw reply

* Re: [PATCH net-next] nfp: flower: remove unused index from nfp_fl_pedit()
From: Jakub Kicinski @ 2019-02-08 16:47 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netdev, davem, oss-drivers
In-Reply-To: <20190208164113.8935-1-pablo@netfilter.org>

On Fri,  8 Feb 2019 17:41:13 +0100, Pablo Neira Ayuso wrote:
> A static checker complains about an uninitialized variable:
> 
>         drivers/net/ethernet/netronome/nfp/flower/action.c:618 nfp_fl_pedit()
>         error: uninitialized symbol 'idx'.
> 
> It is actually never used by the functions that take it as a
> parameter. Remove it.
> 
> Fixes: 738678817573 ("drivers: net: use flow action infrastructure")

Hardly a fix.  It's completely unused.

> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Ah, I was hoping you wouldn't notice :)  Now the backport of this code
got almost as excruciatingly painful as the match part :)  But I guess
nothing we can do about it:

Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>

^ permalink raw reply

* [PATCH net-next] nfp: flower: remove unused index from nfp_fl_pedit()
From: Pablo Neira Ayuso @ 2019-02-08 16:41 UTC (permalink / raw)
  To: netdev; +Cc: davem, jakub.kicinski, oss-drivers

A static checker complains about an uninitialized variable:

        drivers/net/ethernet/netronome/nfp/flower/action.c:618 nfp_fl_pedit()
        error: uninitialized symbol 'idx'.

It is actually never used by the functions that take it as a
parameter. Remove it.

Fixes: 738678817573 ("drivers: net: use flow action infrastructure")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 drivers/net/ethernet/netronome/nfp/flower/action.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/flower/action.c b/drivers/net/ethernet/netronome/nfp/flower/action.c
index 583e97c99e68..eeda4ed98333 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/action.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/action.c
@@ -345,7 +345,7 @@ static void nfp_fl_set_helper32(u32 value, u32 mask, u8 *p_exact, u8 *p_mask)
 }
 
 static int
-nfp_fl_set_eth(const struct flow_action_entry *act, int idx, u32 off,
+nfp_fl_set_eth(const struct flow_action_entry *act, u32 off,
 	       struct nfp_fl_set_eth *set_eth)
 {
 	u32 exact, mask;
@@ -376,7 +376,7 @@ struct ipv4_ttl_word {
 };
 
 static int
-nfp_fl_set_ip4(const struct flow_action_entry *act, int idx, u32 off,
+nfp_fl_set_ip4(const struct flow_action_entry *act, u32 off,
 	       struct nfp_fl_set_ip4_addrs *set_ip_addr,
 	       struct nfp_fl_set_ip4_ttl_tos *set_ip_ttl_tos)
 {
@@ -505,7 +505,7 @@ nfp_fl_set_ip6_hop_limit_flow_label(u32 off, __be32 exact, __be32 mask,
 }
 
 static int
-nfp_fl_set_ip6(const struct flow_action_entry *act, int idx, u32 off,
+nfp_fl_set_ip6(const struct flow_action_entry *act, u32 off,
 	       struct nfp_fl_set_ipv6_addr *ip_dst,
 	       struct nfp_fl_set_ipv6_addr *ip_src,
 	       struct nfp_fl_set_ipv6_tc_hl_fl *ip_hl_fl)
@@ -541,7 +541,7 @@ nfp_fl_set_ip6(const struct flow_action_entry *act, int idx, u32 off,
 }
 
 static int
-nfp_fl_set_tport(const struct flow_action_entry *act, int idx, u32 off,
+nfp_fl_set_tport(const struct flow_action_entry *act, u32 off,
 		 struct nfp_fl_set_tport *set_tport, int opcode)
 {
 	u32 exact, mask;
@@ -598,8 +598,8 @@ nfp_fl_pedit(const struct flow_action_entry *act,
 	struct nfp_fl_set_eth set_eth;
 	size_t act_size = 0;
 	u8 ip_proto = 0;
-	int idx, err;
 	u32 offset;
+	int err;
 
 	memset(&set_ip6_tc_hl_fl, 0, sizeof(set_ip6_tc_hl_fl));
 	memset(&set_ip_ttl_tos, 0, sizeof(set_ip_ttl_tos));
@@ -614,22 +614,22 @@ nfp_fl_pedit(const struct flow_action_entry *act,
 
 	switch (htype) {
 	case TCA_PEDIT_KEY_EX_HDR_TYPE_ETH:
-		err = nfp_fl_set_eth(act, idx, offset, &set_eth);
+		err = nfp_fl_set_eth(act, offset, &set_eth);
 		break;
 	case TCA_PEDIT_KEY_EX_HDR_TYPE_IP4:
-		err = nfp_fl_set_ip4(act, idx, offset, &set_ip_addr,
+		err = nfp_fl_set_ip4(act, offset, &set_ip_addr,
 				     &set_ip_ttl_tos);
 		break;
 	case TCA_PEDIT_KEY_EX_HDR_TYPE_IP6:
-		err = nfp_fl_set_ip6(act, idx, offset, &set_ip6_dst,
+		err = nfp_fl_set_ip6(act, offset, &set_ip6_dst,
 				     &set_ip6_src, &set_ip6_tc_hl_fl);
 		break;
 	case TCA_PEDIT_KEY_EX_HDR_TYPE_TCP:
-		err = nfp_fl_set_tport(act, idx, offset, &set_tport,
+		err = nfp_fl_set_tport(act, offset, &set_tport,
 				       NFP_FL_ACTION_OPCODE_SET_TCP);
 		break;
 	case TCA_PEDIT_KEY_EX_HDR_TYPE_UDP:
-		err = nfp_fl_set_tport(act, idx, offset, &set_tport,
+		err = nfp_fl_set_tport(act, offset, &set_tport,
 				       NFP_FL_ACTION_OPCODE_SET_UDP);
 		break;
 	default:
-- 
2.11.0


^ permalink raw reply related

* [PATCH bpf-next v8 6/6] selftests: bpf: add test_lwt_ip_encap selftest
From: Peter Oskolkov @ 2019-02-08 16:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, netdev
  Cc: Peter Oskolkov, David Ahern, Willem de Bruijn, Peter Oskolkov
In-Reply-To: <20190208163849.151626-1-posk@google.com>

This patch adds a bpf self-test to cover BPF_LWT_ENCAP_IP mode
in bpf_lwt_push_encap.

Covered:
- encapping in LWT_IN and LWT_XMIT
- IPv4 and IPv6

A follow-up patch will add GSO and VRF-enabled tests.
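
As a side note on the hardcoded addresses in the encap_gre program in
this patch: the little-endian and big-endian constants describe the
same bytes on the wire. A small userspace sketch (not part of the
selftest; demo_addr_be() is a hypothetical helper) shows how the value
can be cross-checked with inet_pton() and htonl().

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the 32-bit value, in network byte order, that an IPv4
 * header address field must hold for the given dotted-quad string,
 * or 0 if the string cannot be parsed.
 */
static uint32_t demo_addr_be(const char *dotted)
{
	struct in_addr a;

	assert(dotted != NULL);
	if (inet_pton(AF_INET, dotted, &a) != 1)
		return 0;
	return a.s_addr;
}
```

On any host, demo_addr_be("172.16.1.100") equals htonl(0xac100164);
on a little-endian host that value reads as 0x640110ac, matching the
constant used in the little-endian branch.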

Signed-off-by: Peter Oskolkov <posk@google.com>
---
 tools/testing/selftests/bpf/Makefile          |   6 +-
 .../testing/selftests/bpf/test_lwt_ip_encap.c |  85 +++++
 .../selftests/bpf/test_lwt_ip_encap.sh        | 311 ++++++++++++++++++
 3 files changed, 400 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/test_lwt_ip_encap.c
 create mode 100755 tools/testing/selftests/bpf/test_lwt_ip_encap.sh

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 383d2ff13fc7..d56c74727b6c 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -35,7 +35,8 @@ BPF_OBJ_FILES = \
 	sendmsg4_prog.o sendmsg6_prog.o test_lirc_mode2_kern.o \
 	get_cgroup_id_kern.o socket_cookie_prog.o test_select_reuseport_kern.o \
 	test_skb_cgroup_id_kern.o bpf_flow.o netcnt_prog.o test_xdp_vlan.o \
-	xdp_dummy.o test_map_in_map.o test_spin_lock.o test_map_lock.o
+	xdp_dummy.o test_map_in_map.o test_spin_lock.o test_map_lock.o \
+	test_lwt_ip_encap.o
 
 # Objects are built with default compilation flags and with sub-register
 # code-gen enabled.
@@ -73,7 +74,8 @@ TEST_PROGS := test_kmod.sh \
 	test_lirc_mode2.sh \
 	test_skb_cgroup_id.sh \
 	test_flow_dissector.sh \
-	test_xdp_vlan.sh
+	test_xdp_vlan.sh \
+	test_lwt_ip_encap.sh
 
 TEST_PROGS_EXTENDED := with_addr.sh \
 	with_tunnels.sh \
diff --git a/tools/testing/selftests/bpf/test_lwt_ip_encap.c b/tools/testing/selftests/bpf/test_lwt_ip_encap.c
new file mode 100644
index 000000000000..c957d6dfe6d7
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_lwt_ip_encap.c
@@ -0,0 +1,85 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stddef.h>
+#include <string.h>
+#include <linux/bpf.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include "bpf_helpers.h"
+#include "bpf_endian.h"
+
+struct grehdr {
+	__be16 flags;
+	__be16 protocol;
+};
+
+SEC("encap_gre")
+int bpf_lwt_encap_gre(struct __sk_buff *skb)
+{
+	struct encap_hdr {
+		struct iphdr iph;
+		struct grehdr greh;
+	} hdr;
+	int err;
+
+	memset(&hdr, 0, sizeof(struct encap_hdr));
+
+	hdr.iph.ihl = 5;
+	hdr.iph.version = 4;
+	hdr.iph.ttl = 0x40;
+	hdr.iph.protocol = 47;  /* IPPROTO_GRE */
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+	hdr.iph.saddr = 0x640110ac;  /* 172.16.1.100 */
+	hdr.iph.daddr = 0x641010ac;  /* 172.16.16.100 */
+#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+	hdr.iph.saddr = 0xac100164;  /* 172.16.1.100 */
+	hdr.iph.daddr = 0xac101064;  /* 172.16.16.100 */
+#else
+#error "Fix your compiler's __BYTE_ORDER__?!"
+#endif
+	hdr.iph.tot_len = bpf_htons(skb->len + sizeof(struct encap_hdr));
+
+	hdr.greh.protocol = skb->protocol;
+
+	err = bpf_lwt_push_encap(skb, BPF_LWT_ENCAP_IP, &hdr,
+				 sizeof(struct encap_hdr));
+	if (err)
+		return BPF_DROP;
+
+	return BPF_LWT_REROUTE;
+}
+
+SEC("encap_gre6")
+int bpf_lwt_encap_gre6(struct __sk_buff *skb)
+{
+	struct encap_hdr {
+		struct ipv6hdr ip6hdr;
+		struct grehdr greh;
+	} hdr;
+	int err;
+
+	memset(&hdr, 0, sizeof(struct encap_hdr));
+
+	hdr.ip6hdr.version = 6;
+	hdr.ip6hdr.payload_len = bpf_htons(skb->len + sizeof(struct grehdr));
+	hdr.ip6hdr.nexthdr = 47;  /* IPPROTO_GRE */
+	hdr.ip6hdr.hop_limit = 0x40;
+	/* fb01::1 */
+	hdr.ip6hdr.saddr.s6_addr[0] = 0xfb;
+	hdr.ip6hdr.saddr.s6_addr[1] = 1;
+	hdr.ip6hdr.saddr.s6_addr[15] = 1;
+	/* fb10::1 */
+	hdr.ip6hdr.daddr.s6_addr[0] = 0xfb;
+	hdr.ip6hdr.daddr.s6_addr[1] = 0x10;
+	hdr.ip6hdr.daddr.s6_addr[15] = 1;
+
+	hdr.greh.protocol = skb->protocol;
+
+	err = bpf_lwt_push_encap(skb, BPF_LWT_ENCAP_IP, &hdr,
+				 sizeof(struct encap_hdr));
+	if (err)
+		return BPF_DROP;
+
+	return BPF_LWT_REROUTE;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_lwt_ip_encap.sh b/tools/testing/selftests/bpf/test_lwt_ip_encap.sh
new file mode 100755
index 000000000000..4ca714e23ab0
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_lwt_ip_encap.sh
@@ -0,0 +1,311 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Setup/topology:
+#
+#    NS1             NS2             NS3
+#   veth1 <---> veth2   veth3 <---> veth4 (the top route)
+#   veth5 <---> veth6   veth7 <---> veth8 (the bottom route)
+#
+#   each vethN gets IPv[4|6]_N address
+#
+#   IPv*_SRC = IPv*_1
+#   IPv*_DST = IPv*_4
+#
+#   all tests test pings from IPv*_SRC to IPv*_DST
+#
+#   by default, routes are configured to allow packets to go
+#   IP*_1 <=> IP*_2 <=> IP*_3 <=> IP*_4 (the top route)
+#
+#   a GRE device is installed in NS3 with IPv*_GRE, and
+#   NS1/NS2 are configured to route packets to IPv*_GRE via IP*_8
+#   (the bottom route)
+#
+# Tests:
+#
+#   1. routes NS2->IPv*_DST are brought down, so the only way a ping
+#      from IP*_SRC to IP*_DST can work is via IPv*_GRE
+#
+#   2a. in an egress test, a bpf LWT_XMIT program is installed on veth1
+#       that encaps the packets with an IP/GRE header to route to IPv*_GRE
+#
+#       ping: SRC->[encap at veth1:egress]->GRE:decap->DST
+#       ping replies go DST->SRC directly
+#
+#   2b. in an ingress test, a bpf LWT_IN program is installed on veth2
+#       that encaps the packets with an IP/GRE header to route to IPv*_GRE
+#
+#       ping: SRC->[encap at veth2:ingress]->GRE:decap->DST
+#       ping replies go DST->SRC directly
+
+set -e  # exit on error
+
+if [[ $EUID -ne 0 ]]; then
+	echo "This script must be run as root"
+	echo "FAIL"
+	exit 1
+fi
+
+readonly NS1="ns1-$(mktemp -u XXXXXX)"
+readonly NS2="ns2-$(mktemp -u XXXXXX)"
+readonly NS3="ns3-$(mktemp -u XXXXXX)"
+
+readonly IPv4_1="172.16.1.100"
+readonly IPv4_2="172.16.2.100"
+readonly IPv4_3="172.16.3.100"
+readonly IPv4_4="172.16.4.100"
+readonly IPv4_5="172.16.5.100"
+readonly IPv4_6="172.16.6.100"
+readonly IPv4_7="172.16.7.100"
+readonly IPv4_8="172.16.8.100"
+readonly IPv4_GRE="172.16.16.100"
+
+readonly IPv4_SRC=$IPv4_1
+readonly IPv4_DST=$IPv4_4
+
+readonly IPv6_1="fb01::1"
+readonly IPv6_2="fb02::1"
+readonly IPv6_3="fb03::1"
+readonly IPv6_4="fb04::1"
+readonly IPv6_5="fb05::1"
+readonly IPv6_6="fb06::1"
+readonly IPv6_7="fb07::1"
+readonly IPv6_8="fb08::1"
+readonly IPv6_GRE="fb10::1"
+
+readonly IPv6_SRC=$IPv6_1
+readonly IPv6_DST=$IPv6_4
+
+setup() {
+	set -e  # exit on error
+	# create devices and namespaces
+	ip netns add "${NS1}"
+	ip netns add "${NS2}"
+	ip netns add "${NS3}"
+
+	ip link add veth1 type veth peer name veth2
+	ip link add veth3 type veth peer name veth4
+	ip link add veth5 type veth peer name veth6
+	ip link add veth7 type veth peer name veth8
+
+	ip netns exec ${NS2} sysctl -wq net.ipv4.ip_forward=1
+	ip netns exec ${NS2} sysctl -wq net.ipv6.conf.all.forwarding=1
+
+	ip link set veth1 netns ${NS1}
+	ip link set veth2 netns ${NS2}
+	ip link set veth3 netns ${NS2}
+	ip link set veth4 netns ${NS3}
+	ip link set veth5 netns ${NS1}
+	ip link set veth6 netns ${NS2}
+	ip link set veth7 netns ${NS2}
+	ip link set veth8 netns ${NS3}
+
+	# configure addresses: the top route (1-2-3-4)
+	ip -netns ${NS1}    addr add ${IPv4_1}/24  dev veth1
+	ip -netns ${NS2}    addr add ${IPv4_2}/24  dev veth2
+	ip -netns ${NS2}    addr add ${IPv4_3}/24  dev veth3
+	ip -netns ${NS3}    addr add ${IPv4_4}/24  dev veth4
+	ip -netns ${NS1} -6 addr add ${IPv6_1}/128 nodad dev veth1
+	ip -netns ${NS2} -6 addr add ${IPv6_2}/128 nodad dev veth2
+	ip -netns ${NS2} -6 addr add ${IPv6_3}/128 nodad dev veth3
+	ip -netns ${NS3} -6 addr add ${IPv6_4}/128 nodad dev veth4
+
+	# configure addresses: the bottom route (5-6-7-8)
+	ip -netns ${NS1}    addr add ${IPv4_5}/24  dev veth5
+	ip -netns ${NS2}    addr add ${IPv4_6}/24  dev veth6
+	ip -netns ${NS2}    addr add ${IPv4_7}/24  dev veth7
+	ip -netns ${NS3}    addr add ${IPv4_8}/24  dev veth8
+	ip -netns ${NS1} -6 addr add ${IPv6_5}/128 nodad dev veth5
+	ip -netns ${NS2} -6 addr add ${IPv6_6}/128 nodad dev veth6
+	ip -netns ${NS2} -6 addr add ${IPv6_7}/128 nodad dev veth7
+	ip -netns ${NS3} -6 addr add ${IPv6_8}/128 nodad dev veth8
+
+
+	ip -netns ${NS1} link set dev veth1 up
+	ip -netns ${NS2} link set dev veth2 up
+	ip -netns ${NS2} link set dev veth3 up
+	ip -netns ${NS3} link set dev veth4 up
+	ip -netns ${NS1} link set dev veth5 up
+	ip -netns ${NS2} link set dev veth6 up
+	ip -netns ${NS2} link set dev veth7 up
+	ip -netns ${NS3} link set dev veth8 up
+
+	# configure routes: IP*_SRC -> veth1/IP*_2 (= top route) default;
+	# the bottom route to specific bottom addresses
+
+	# NS1
+	# top route
+	ip -netns ${NS1}    route add ${IPv4_2}/32  dev veth1
+	ip -netns ${NS1}    route add default dev veth1 via ${IPv4_2}  # go top by default
+	ip -netns ${NS1} -6 route add ${IPv6_2}/128 dev veth1
+	ip -netns ${NS1} -6 route add default dev veth1 via ${IPv6_2}  # go top by default
+	# bottom route
+	ip -netns ${NS1}    route add ${IPv4_6}/32  dev veth5
+	ip -netns ${NS1}    route add ${IPv4_7}/32  dev veth5 via ${IPv4_6}
+	ip -netns ${NS1}    route add ${IPv4_8}/32  dev veth5 via ${IPv4_6}
+	ip -netns ${NS1} -6 route add ${IPv6_6}/128 dev veth5
+	ip -netns ${NS1} -6 route add ${IPv6_7}/128 dev veth5 via ${IPv6_6}
+	ip -netns ${NS1} -6 route add ${IPv6_8}/128 dev veth5 via ${IPv6_6}
+
+	# NS2
+	# top route
+	ip -netns ${NS2}    route add ${IPv4_1}/32  dev veth2
+	ip -netns ${NS2}    route add ${IPv4_4}/32  dev veth3
+	ip -netns ${NS2} -6 route add ${IPv6_1}/128 dev veth2
+	ip -netns ${NS2} -6 route add ${IPv6_4}/128 dev veth3
+	# bottom route
+	ip -netns ${NS2}    route add ${IPv4_5}/32  dev veth6
+	ip -netns ${NS2}    route add ${IPv4_8}/32  dev veth7
+	ip -netns ${NS2} -6 route add ${IPv6_5}/128 dev veth6
+	ip -netns ${NS2} -6 route add ${IPv6_8}/128 dev veth7
+
+	# NS3
+	# top route
+	ip -netns ${NS3}    route add ${IPv4_3}/32  dev veth4
+	ip -netns ${NS3}    route add ${IPv4_1}/32  dev veth4 via ${IPv4_3}
+	ip -netns ${NS3}    route add ${IPv4_2}/32  dev veth4 via ${IPv4_3}
+	ip -netns ${NS3} -6 route add ${IPv6_3}/128 dev veth4
+	ip -netns ${NS3} -6 route add ${IPv6_1}/128 dev veth4 via ${IPv6_3}
+	ip -netns ${NS3} -6 route add ${IPv6_2}/128 dev veth4 via ${IPv6_3}
+	# bottom route
+	ip -netns ${NS3}    route add ${IPv4_7}/32  dev veth8
+	ip -netns ${NS3}    route add ${IPv4_5}/32  dev veth8 via ${IPv4_7}
+	ip -netns ${NS3}    route add ${IPv4_6}/32  dev veth8 via ${IPv4_7}
+	ip -netns ${NS3} -6 route add ${IPv6_7}/128 dev veth8
+	ip -netns ${NS3} -6 route add ${IPv6_5}/128 dev veth8 via ${IPv6_7}
+	ip -netns ${NS3} -6 route add ${IPv6_6}/128 dev veth8 via ${IPv6_7}
+
+	# configure IPv4 GRE device in NS3, and a route to it via the "bottom" route
+	ip -netns ${NS3} tunnel add gre_dev mode gre remote ${IPv4_1} local ${IPv4_GRE} ttl 255
+	ip -netns ${NS3} link set gre_dev up
+	ip -netns ${NS3} addr add ${IPv4_GRE} dev gre_dev
+	ip -netns ${NS1} route add ${IPv4_GRE}/32 dev veth5 via ${IPv4_6}
+	ip -netns ${NS2} route add ${IPv4_GRE}/32 dev veth7 via ${IPv4_8}
+
+
+	# configure IPv6 GRE device in NS3, and a route to it via the "bottom" route
+	ip -netns ${NS3} -6 tunnel add name gre6_dev mode ip6gre remote ${IPv6_1} local ${IPv6_GRE} ttl 255
+	ip -netns ${NS3} link set gre6_dev up
+	ip -netns ${NS3} -6 addr add ${IPv6_GRE} nodad dev gre6_dev
+	ip -netns ${NS1} -6 route add ${IPv6_GRE}/128 dev veth5 via ${IPv6_6}
+	ip -netns ${NS2} -6 route add ${IPv6_GRE}/128 dev veth7 via ${IPv6_8}
+
+	# rp_filter gets confused by what these tests are doing, so disable it
+	ip netns exec ${NS1} sysctl -wq net.ipv4.conf.all.rp_filter=0
+	ip netns exec ${NS2} sysctl -wq net.ipv4.conf.all.rp_filter=0
+	ip netns exec ${NS3} sysctl -wq net.ipv4.conf.all.rp_filter=0
+}
+
+cleanup() {
+	ip netns del ${NS1} 2> /dev/null
+	ip netns del ${NS2} 2> /dev/null
+	ip netns del ${NS3} 2> /dev/null
+}
+
+trap cleanup EXIT
+
+test_ping() {
+	local -r PROTO=$1
+	local -r EXPECTED=$2
+	local RET=0
+
+	set +e
+	if [ "${PROTO}" == "IPv4" ] ; then
+		ip netns exec ${NS1} ping  -c 1 -W 1 -I ${IPv4_SRC} ${IPv4_DST} 2>&1 > /dev/null
+		RET=$?
+	elif [ "${PROTO}" == "IPv6" ] ; then
+		ip netns exec ${NS1} ping6 -c 1 -W 6 -I ${IPv6_SRC} ${IPv6_DST} 2>&1 > /dev/null
+		RET=$?
+	else
+		echo "test_ping: unknown PROTO: ${PROTO}"
+		exit 1
+	fi
+	set -e
+
+	if [ "0" != "${RET}" ]; then
+		RET=1
+	fi
+
+	if [ "${EXPECTED}" != "${RET}" ] ; then
+		echo "FAIL: test_ping: ${RET}"
+		exit 1
+	fi
+}
+
+test_egress() {
+	local -r ENCAP=$1
+	echo "starting egress ${ENCAP} encap test"
+	setup
+
+	# need to wait a bit for IPv6 to autoconf, otherwise
+	# ping6 sometimes fails with "unable to bind to address"
+
+	# by default, pings work
+	test_ping IPv4 0
+	test_ping IPv6 0
+
+	# remove NS2->DST routes, ping fails
+	ip -netns ${NS2}    route del ${IPv4_DST}/32  dev veth3
+	ip -netns ${NS2} -6 route del ${IPv6_DST}/128 dev veth3
+	test_ping IPv4 1
+	test_ping IPv6 1
+
+	# install replacement routes (LWT/eBPF), pings succeed
+	if [ "${ENCAP}" == "IPv4" ] ; then
+		ip -netns ${NS1} route add ${IPv4_DST} encap bpf xmit obj test_lwt_ip_encap.o sec encap_gre dev veth1
+		ip -netns ${NS1} -6 route add ${IPv6_DST} encap bpf xmit obj test_lwt_ip_encap.o sec encap_gre dev veth1
+	elif [ "${ENCAP}" == "IPv6" ] ; then
+		ip -netns ${NS1} route add ${IPv4_DST} encap bpf xmit obj test_lwt_ip_encap.o sec encap_gre6 dev veth1
+		ip -netns ${NS1} -6 route add ${IPv6_DST} encap bpf xmit obj test_lwt_ip_encap.o sec encap_gre6 dev veth1
+	else
+		echo "FAIL: unknown encap ${ENCAP}"
+		exit 1
+	fi
+	test_ping IPv4 0
+	test_ping IPv6 0
+
+	cleanup
+	echo "PASS"
+}
+
+test_ingress() {
+	local -r ENCAP=$1
+	echo "starting ingress ${ENCAP} encap test"
+	setup
+
+	# need to wait a bit for IPv6 to autoconf, otherwise
+	# ping6 sometimes fails with "unable to bind to address"
+
+	# by default, pings work
+	test_ping IPv4 0
+	test_ping IPv6 0
+
+	# remove NS2->DST routes, pings fail
+	ip -netns ${NS2}    route del ${IPv4_DST}/32  dev veth3
+	ip -netns ${NS2} -6 route del ${IPv6_DST}/128 dev veth3
+	test_ping IPv4 1
+	test_ping IPv6 1
+
+	# install replacement routes (LWT/eBPF), pings succeed
+	if [ "${ENCAP}" == "IPv4" ] ; then
+		ip -netns ${NS2} route add ${IPv4_DST} encap bpf in obj test_lwt_ip_encap.o sec encap_gre dev veth2
+		ip -netns ${NS2} -6 route add ${IPv6_DST} encap bpf in obj test_lwt_ip_encap.o sec encap_gre dev veth2
+	elif [ "${ENCAP}" == "IPv6" ] ; then
+		ip -netns ${NS2} route add ${IPv4_DST} encap bpf in obj test_lwt_ip_encap.o sec encap_gre6 dev veth2
+		ip -netns ${NS2} -6 route add ${IPv6_DST} encap bpf in obj test_lwt_ip_encap.o sec encap_gre6 dev veth2
+	else
+		echo "FAIL: unknown encap ${ENCAP}"
+		exit 1
+	fi
+	test_ping IPv4 0
+	test_ping IPv6 0
+
+	cleanup
+	echo "PASS"
+}
+
+test_egress IPv4
+test_egress IPv6
+
+test_ingress IPv4
+test_ingress IPv6
+
+echo "all tests passed"
-- 
2.20.1.791.gb4d0f1c61a-goog
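
The order of redirections matters when silencing a command the way
test_ping does: "> /dev/null 2>&1" discards both streams, whereas
"2>&1 > /dev/null" points stderr at the original stdout first, so error
output still gets through. A minimal sketch of the quiet-run and
status-normalizing pattern follows; run_quiet and normalize_status are
hypothetical helper names for illustration, not part of the patch:

```shell
# run_quiet CMD [ARGS...]
# Hypothetical helper: run a command with stdout and stderr discarded.
# stdout is redirected first, then stderr is duplicated onto it.
run_quiet() {
	"$@" > /dev/null 2>&1
}

# normalize_status STATUS
# Hypothetical helper: collapse any non-zero exit status to 1, so it can
# be compared against an expected value of 0 or 1.
normalize_status() {
	if [ "$1" -ne 0 ]; then
		echo 1
	else
		echo 0
	fi
}
```

Usage would look like "run_quiet some_command; RET=$(normalize_status $?)",
with set +e in effect around the call if the script runs under set -e.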



* KASAN: wild-memory-access Write in fib6_purge_rt
From: syzbot @ 2019-02-08 16:39 UTC (permalink / raw)
  To: davem, kuznet, linux-kernel, netdev, syzkaller-bugs, yoshfuji

Hello,

syzbot found the following crash on:

HEAD commit:    cc7335786f72 socket: fix for Add SO_TIMESTAMP[NS]_NEW
git tree:       net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=16539260c00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=33ad02b9305759c3
dashboard link: https://syzkaller.appspot.com/bug?extid=3dbea54db3674c0d57d6
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+3dbea54db3674c0d57d6@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: wild-memory-access in atomic_dec_and_test include/asm-generic/atomic-instrumented.h:259 [inline]
BUG: KASAN: wild-memory-access in fib6_info_release include/net/ip6_fib.h:294 [inline]
BUG: KASAN: wild-memory-access in fib6_info_release include/net/ip6_fib.h:292 [inline]
BUG: KASAN: wild-memory-access in fib6_drop_pcpu_from net/ipv6/ip6_fib.c:927 [inline]
BUG: KASAN: wild-memory-access in fib6_purge_rt+0x4f6/0x670 net/ipv6/ip6_fib.c:960
Write of size 4 at addr ff88809b5430992b by task syz-executor5/21222

CPU: 1 PID: 21222 Comm: syz-executor5 Not tainted 5.0.0-rc4+ #45
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
  kasan_report.cold+0x5/0x40 mm/kasan/report.c:321
  check_memory_region_inline mm/kasan/generic.c:185 [inline]
  check_memory_region+0x123/0x190 mm/kasan/generic.c:191
  kasan_check_write+0x14/0x20 mm/kasan/common.c:106
  atomic_dec_and_test include/asm-generic/atomic-instrumented.h:259 [inline]
  fib6_info_release include/net/ip6_fib.h:294 [inline]
  fib6_info_release include/net/ip6_fib.h:292 [inline]
  fib6_drop_pcpu_from net/ipv6/ip6_fib.c:927 [inline]
  fib6_purge_rt+0x4f6/0x670 net/ipv6/ip6_fib.c:960
  fib6_del_route net/ipv6/ip6_fib.c:1813 [inline]
  fib6_del+0xac2/0x10a0 net/ipv6/ip6_fib.c:1844
  fib6_clean_node+0x3a8/0x590 net/ipv6/ip6_fib.c:2006
  fib6_walk_continue+0x4b3/0x8e0 net/ipv6/ip6_fib.c:1928
  fib6_walk+0x9d/0x100 net/ipv6/ip6_fib.c:1976
  fib6_clean_tree+0xe0/0x120 net/ipv6/ip6_fib.c:2055
  __fib6_clean_all+0x118/0x2a0 net/ipv6/ip6_fib.c:2071
  fib6_clean_all+0x2b/0x40 net/ipv6/ip6_fib.c:2082
  rt6_sync_down_dev+0x134/0x150 net/ipv6/route.c:4041
  rt6_disable_ip+0x27/0x5f0 net/ipv6/route.c:4046
  addrconf_ifdown+0xa2/0x1220 net/ipv6/addrconf.c:3704
  addrconf_notify+0x5e1/0x2280 net/ipv6/addrconf.c:3629
  notifier_call_chain+0xc7/0x240 kernel/notifier.c:93
  __raw_notifier_call_chain kernel/notifier.c:394 [inline]
  raw_notifier_call_chain+0x2e/0x40 kernel/notifier.c:401
  call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1739
  call_netdevice_notifiers_extack net/core/dev.c:1751 [inline]
  call_netdevice_notifiers net/core/dev.c:1765 [inline]
  __dev_notify_flags+0x1e9/0x2c0 net/core/dev.c:7607
  dev_change_flags+0x10d/0x170 net/core/dev.c:7643
  devinet_ioctl+0x1396/0x1c60 net/ipv4/devinet.c:1104
  inet_ioctl+0x1fc/0x370 net/ipv4/af_inet.c:954
  sock_do_ioctl+0xe2/0x320 net/socket.c:972
  sock_ioctl+0x331/0x620 net/socket.c:1096
  vfs_ioctl fs/ioctl.c:46 [inline]
  file_ioctl fs/ioctl.c:509 [inline]
  do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696
  ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
  __do_sys_ioctl fs/ioctl.c:720 [inline]
  __se_sys_ioctl fs/ioctl.c:718 [inline]
  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
  do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457e39
Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fa4949bfc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457e39
RDX: 0000000020000040 RSI: 0000000000008914 RDI: 0000000000000004
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fa4949c06d4
R13: 00000000004c34ff R14: 00000000004d61e8 R15: 00000000ffffffff
==================================================================
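
For triage, one quick sanity check on the reported address
(ff88809b5430992b) is whether it is canonical at all. The sketch below
assumes 48-bit virtual addresses (4-level paging), which the report does
not state; is_canonical48 is a hypothetical helper name, not an existing
tool:

```shell
# is_canonical48 ADDR
# Hypothetical helper: succeeds if ADDR (hex, optional 0x prefix) is a
# canonical x86-64 address assuming 48-bit virtual addresses, i.e.
# bits 63..47 are all zeros or all ones.
is_canonical48() {
	local addr top16 nibble bit47
	addr=${1#0x}
	# Left-pad to 16 hex digits and lowercase so the slices line up.
	addr=$(printf '%16s' "$addr" | tr ' A-F' '0a-f')
	top16=$(printf '%s' "$addr" | cut -c1-4)   # bits 63..48
	nibble=$(printf '%s' "$addr" | cut -c5)    # bits 47..44
	bit47=$(( 0x$nibble >> 3 ))                # top bit of that nibble
	case "$top16:$bit47" in
		ffff:1|0000:0) return 0 ;;
		*) return 1 ;;
	esac
}

if is_canonical48 ff88809b5430992b; then
	echo "canonical"
else
	echo "non-canonical"
fi
```

Under that assumption the address from the report is non-canonical, since
its top 16 bits are ff88 rather than ffff.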


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.


