Netdev List

* Re: linux-next: build failure after merge of the net-next tree
From: Stephen Rothwell @ 2019-07-11  3:27 UTC
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Bernard Metzler, Doug Ledford, David Miller,
	Networking, Linux Next Mailing List, Linux Kernel Mailing List
In-Reply-To: <20190711131603.6b11b831@canb.auug.org.au>

Hi all,

On Thu, 11 Jul 2019 13:16:03 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> 
> On Thu, 11 Jul 2019 13:13:44 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> >
> > On Thu, 11 Jul 2019 02:26:27 +0000 Jason Gunthorpe <jgg@mellanox.com> wrote:  
> > >
> > > On Thu, Jul 11, 2019 at 11:50:54AM +1000, Stephen Rothwell wrote:
> > >     
> > > > So today this failed to build after I merged the rdma tree (previously
> > > > > it didn't until after the net-next tree was merged; I assume a
> > > > dependency changed).  It failed because in_dev_for_each_ifa_rcu (and
> > > > in_dev_for_each_ifa_rtnl) is only defined in a commit in the net-next
> > > > tree :-(      
> > > 
> > > ? I'm confused.. 
> > > 
> > > rdma.git builds fine stand alone (I hope!)    
> > 
> > I have "Fixup to build SIW issue" from Leon (which switches to using
> > in_dev_for_each_ifa_rcu) included in the rdma tree merge commit because
> > without that the rdma tree would not build for me.  Are you saying that
> > I don't need that at all, now?  
> 
> Actually, I get it now, "Fixup to build SIW issue" is really just a
> fixup for the net-next and rdma trees merge ... OK, I will fix that up
> tomorrow.  Sorry for my confusion.

Actually, I have rewound my tree and am starting from the merge of the
rdma tree again, so hopefully it should all be good today.

-- 
Cheers,
Stephen Rothwell


* Re: linux-next: build failure after merge of the net-next tree
From: Stephen Rothwell @ 2019-07-11  3:16 UTC
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Bernard Metzler, Doug Ledford, David Miller,
	Networking, Linux Next Mailing List, Linux Kernel Mailing List
In-Reply-To: <20190711131344.452fc064@canb.auug.org.au>

Hi all,

On Thu, 11 Jul 2019 13:13:44 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> On Thu, 11 Jul 2019 02:26:27 +0000 Jason Gunthorpe <jgg@mellanox.com> wrote:
> >
> > On Thu, Jul 11, 2019 at 11:50:54AM +1000, Stephen Rothwell wrote:
> >   
> > > So today this failed to build after I merged the rdma tree (previously
> > > it didn't until after the net-next tree was merged; I assume a
> > > dependency changed).  It failed because in_dev_for_each_ifa_rcu (and
> > > in_dev_for_each_ifa_rtnl) is only defined in a commit in the net-next
> > > tree :-(    
> > 
> > ? I'm confused.. 
> > 
> > rdma.git builds fine stand alone (I hope!)  
> 
> I have "Fixup to build SIW issue" from Leon (which switches to using
> in_dev_for_each_ifa_rcu) included in the rdma tree merge commit because
> without that the rdma tree would not build for me.  Are you saying that
> I don't need that at all, now?

Actually, I get it now, "Fixup to build SIW issue" is really just a
fixup for the net-next and rdma trees merge ... OK, I will fix that up
tomorrow.  Sorry for my confusion.

-- 
Cheers,
Stephen Rothwell


* Re: linux-next: build failure after merge of the net-next tree
From: Stephen Rothwell @ 2019-07-11  3:13 UTC
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Bernard Metzler, Doug Ledford, David Miller,
	Networking, Linux Next Mailing List, Linux Kernel Mailing List
In-Reply-To: <20190711015854.GC22409@mellanox.com>

Hi Jason,

On Thu, 11 Jul 2019 02:26:27 +0000 Jason Gunthorpe <jgg@mellanox.com> wrote:
>
> On Thu, Jul 11, 2019 at 11:50:54AM +1000, Stephen Rothwell wrote:
> 
> > So today this failed to build after I merged the rdma tree (previously
> > it didn't until after the net-next tree was merged; I assume a
> > dependency changed).  It failed because in_dev_for_each_ifa_rcu (and
> > in_dev_for_each_ifa_rtnl) is only defined in a commit in the net-next
> > tree :-(  
> 
> ? I'm confused.. 
> 
> rdma.git builds fine stand alone (I hope!)

I have "Fixup to build SIW issue" from Leon (which switches to using
in_dev_for_each_ifa_rcu) included in the rdma tree merge commit because
without that the rdma tree would not build for me.  Are you saying that
I don't need that at all, now?

-- 
Cheers,
Stephen Rothwell


* Re: [GIT PULL] Keys: Set 4 - Key ACLs for 5.3
From: Mimi Zohar @ 2019-07-11  3:07 UTC
  To: Linus Torvalds, David Howells, James Morris, keyrings, Netdev,
	linux-nfs, CIFS, linux-afs, linux-fsdevel, linux-integrity,
	LSM List, Linux List Kernel Mailing
In-Reply-To: <CAHk-=wiFti6=K2fyAYhx-PSX9ovQPJUNp0FMdV0pDaO_pSx9MQ@mail.gmail.com>

Hi Linus,

On Wed, 2019-07-10 at 18:59 -0700, Linus Torvalds wrote:
> Anyway, since it does seem like David is offline, I've just reverted
> this from my tree, and will be continuing my normal merge window pull
> requests (the other issues I have seen have fixes in their respective
> trees).

Sorry for the delay.  An exception is needed for loading built-in keys
(KEY_ALLOC_BUILT_IN) onto a keyring that is not writable by userspace.
The following works, but is probably not how David would handle the
exception.

diff --git a/security/keys/key.c b/security/keys/key.c
index 519211a996e7..a99332c1e014 100644
--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -896,7 +896,7 @@ key_ref_t key_create_or_update(key_ref_t keyring_ref,
        /* if we're going to allocate a new key, we're going to have
         * to modify the keyring */
        ret = key_permission(keyring_ref, KEY_NEED_WRITE);
-       if (ret < 0) {
+       if (ret < 0 && !(flags & KEY_ALLOC_BUILT_IN)) {
                key_ref = ERR_PTR(ret);
                goto error_link_end;
        }

Mimi



* Re: linux-next: build failure after merge of the net-next tree
From: Jason Gunthorpe @ 2019-07-11  2:26 UTC
  To: Stephen Rothwell
  Cc: Leon Romanovsky, Bernard Metzler, Doug Ledford, David Miller,
	Networking, Linux Next Mailing List, Linux Kernel Mailing List
In-Reply-To: <20190711115054.7d7f468c@canb.auug.org.au>

On Thu, Jul 11, 2019 at 11:50:54AM +1000, Stephen Rothwell wrote:

> So today this failed to build after I merged the rdma tree (previously
> it didn't until after the net-next tree was merged; I assume a
> dependency changed).  It failed because in_dev_for_each_ifa_rcu (and
> in_dev_for_each_ifa_rtnl) is only defined in a commit in the net-next
> tree :-(

? I'm confused.. 

rdma.git builds fine stand alone (I hope!)

If you merge it with netdev then the above patch is needed after the
merge as netdev changed to ifa_rcu
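
A note on why the merge breaks: as far as I can tell, the old
for_ifa()/endfor_ifa() pair declared its own hidden iterator, while the
in_dev_for_each_ifa_*() macros from net-next expect the caller to declare
one and pass it in. The following is only a user-space sketch of the two
macro shapes. All toy_* names are made up for illustration, and the
rcu_dereference()/rtnl_dereference() wrappers and the locking that the real
macros rely on are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for struct in_ifaddr / struct in_device (hypothetical). */
struct toy_ifaddr {
	unsigned int address;
	struct toy_ifaddr *next;
};

struct toy_in_device {
	struct toy_ifaddr *ifa_list;
};

/* Old shape: the macro opens a block and declares a hidden 'ifa'. */
#define toy_for_ifa(in_dev) { struct toy_ifaddr *ifa; \
	for (ifa = (in_dev)->ifa_list; ifa; ifa = ifa->next)
#define toy_endfor_ifa(in_dev) }

/* New shape: the caller declares the iterator and passes it in. */
#define toy_in_dev_for_each_ifa(ifa, in_dev) \
	for (ifa = (in_dev)->ifa_list; ifa; ifa = ifa->next)

/* Count addresses with the old macro pair. */
static int toy_count_old(struct toy_in_device *in_dev)
{
	int n = 0;

	assert(in_dev != NULL);
	toy_for_ifa(in_dev) {
		n++;
	}
	toy_endfor_ifa(in_dev);
	return n;
}

/* Count addresses with the new macro; note the explicit iterator. */
static int toy_count_new(struct toy_in_device *in_dev)
{
	const struct toy_ifaddr *ifa;
	int n = 0;

	assert(in_dev != NULL);
	toy_in_dev_for_each_ifa(ifa, in_dev)
		n++;
	return n;
}
```

Code written against one shape does not compile against a tree that only
provides the other, which is the kind of failure reported in this thread.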

I just did this a few hours ago to make and test the patch I sent
above..

Jason


* Re: [GIT PULL] Keys: Set 4 - Key ACLs for 5.3
From: Linus Torvalds @ 2019-07-11  1:59 UTC
  To: Linus Torvalds, David Howells, James Morris, keyrings, Netdev,
	linux-nfs, CIFS, linux-afs, linux-fsdevel, linux-integrity,
	LSM List, Linux List Kernel Mailing
In-Reply-To: <20190710201552.GB83443@gmail.com>

On Wed, Jul 10, 2019 at 1:15 PM Eric Biggers <ebiggers@kernel.org> wrote:
>
> Also worth noting that the key ACL patches were only in linux-next for 9 days
> before the pull request was sent.

Yes. I was not entirely happy with the whole key subsystem situation.
See my concerns in

  https://lore.kernel.org/lkml/CAHk-=wjEowdfG7v_4ttu3xhf9gqopj1+q1nGG86+mGfGDTEBBg@mail.gmail.com/

for more. That was before I realized it was buggy.

So it really would be good to have more people involved, and more
structure to the keys development (and, I suspect, much else under
security/)

Anyway, since it does seem like David is offline, I've just reverted
this from my tree, and will be continuing my normal merge window pull
requests (the other issues I have seen have fixes in their respective
trees).

                 Linus


* RE: [PATCH v3 0/3] kernel/notifier.c: avoid duplicate registration
From: Nixiaoming @ 2019-07-11  1:55 UTC
  To: Vasily Averin, adobriyan@gmail.com, akpm@linux-foundation.org,
	anna.schumaker@netapp.com, arjan@linux.intel.com,
	bfields@fieldses.org, chuck.lever@oracle.com, davem@davemloft.net,
	gregkh@linuxfoundation.org, jlayton@kernel.org, luto@kernel.org,
	mingo@kernel.org, Nadia.Derbey@bull.net,
	paulmck@linux.vnet.ibm.com, semen.protsenko@linaro.org,
	stable@kernel.org, stern@rowland.harvard.edu, tglx@linutronix.de,
	torvalds@linux-foundation.org, trond.myklebust@hammerspace.com,
	viresh.kumar@linaro.org
  Cc: Huangjianhui (Alex), Dailei, linux-kernel@vger.kernel.org,
	linux-nfs@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <f628ff03-eb47-62f3-465b-fe4ed046b30c@virtuozzo.com>

On Wed, July 10, 2019 1:49 PM Vasily Averin wrote:
>On 7/10/19 6:09 AM, Xiaoming Ni wrote:
>> Registering the same notifier to a hook repeatedly can cause the hook
>> list to form a ring or lose other members of the list.
>
>I think is not enough to _prevent_ 2nd register attempt,
>it's enough to detect just attempt and generate warning to mark host in bad state.
>

My patch does prevent duplicate registration; it does not just "mark host in bad state".

notifier_chain_cond_register() already checks for duplicate registration and
returns without linking the notifier again.

notifier_chain_register() also checks for it, but only triggers a warning and
then carries on with the registration. That check was added by commit
831246570d34692e ("kernel/notifier.c: double register detection").

My patch combines 831246570d34692e with the behaviour of
notifier_chain_cond_register(): it triggers a warning and returns when a
duplicate registration is detected.

>Unexpected 2nd register of the same hook most likely will lead to 2nd unregister,
>and it can lead to host crash in any time: 
>you can unregister notifier on first attempt it can be too early, it can be still in use.
>on the other hand you can never call 2nd unregister at all.

Since the notifier is not added to the linked list on the second registration,
no ring is formed in the list.
The notifier is removed on the first unregistration, and the second
unregistration returns -ENOENT.
With the patch applied, the fault is mitigated.
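
To illustrate the register/unregister behaviour described above, here is a
minimal user-space sketch. It is not the kernel code: the toy_* names are
hypothetical, locking is omitted, and the warning the patch emits on a
duplicate is only marked by a comment.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for struct notifier_block (hypothetical). */
struct toy_notifier {
	int priority;
	struct toy_notifier *next;
};

/*
 * Insert n in descending priority order. If n is already on the list,
 * leave the list untouched and return 0, so no ring can form.
 */
static int toy_chain_register(struct toy_notifier **nl, struct toy_notifier *n)
{
	assert(nl != NULL && n != NULL);
	while (*nl != NULL) {
		if (*nl == n)
			return 0;	/* duplicate: a warning would go here */
		if (n->priority > (*nl)->priority)
			break;
		nl = &(*nl)->next;
	}
	n->next = *nl;
	*nl = n;
	return 0;
}

/* Unlink n; return -ENOENT if it is not on the list. */
static int toy_chain_unregister(struct toy_notifier **nl, struct toy_notifier *n)
{
	assert(nl != NULL && n != NULL);
	while (*nl != NULL) {
		if (*nl == n) {
			*nl = n->next;
			return 0;
		}
		nl = &(*nl)->next;
	}
	return -ENOENT;
}

/* Count entries; return -1 if more than limit are seen (possible ring). */
static int toy_chain_length(const struct toy_notifier *head, int limit)
{
	int n = 0;

	for (; head != NULL; head = head->next)
		if (++n > limit)
			return -1;
	return n;
}
```

With this sketch, registering test1 twice and then test2 (case1 from the
cover letter) leaves a two-entry list, the first unregistration of test1
succeeds, and the second one returns -ENOENT.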

It may be more helpful to return an error code when someone tries to register
the same notifier a second time.
But I noticed that notifier_chain_cond_register() returns 0 when it detects a
duplicate registration. Also, the comments on all the existing exported
notifier registration functions say "Currently always returns zero".

I am a bit confused: which is better?

>
>Unfortunately I do not see any ways to handle such cases properly,
>and it seems for me your patches does not resolve this problem.
>
>Am I missed something probably?
> 
>> case1: An infinite loop in notifier_chain_register() can cause soft lockup
>>         atomic_notifier_chain_register(&test_notifier_list, &test1);
>>         atomic_notifier_chain_register(&test_notifier_list, &test1);
>>         atomic_notifier_chain_register(&test_notifier_list, &test2);

Thanks

Xiaoming Ni


* Re: [PATCH bpf] bpf: fix BTF verifier size resolution logic
From: Yonghong Song @ 2019-07-11  1:53 UTC
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, Alexei Starovoitov, daniel@iogearbox.net,
	bpf@vger.kernel.org, netdev@vger.kernel.org, Kernel Team,
	Martin Lau
In-Reply-To: <CAEf4Bza6Y87C2_Fobj9CwU-2YRTU32S61f8_8CQdhMPenJiJZQ@mail.gmail.com>



On 7/10/19 6:45 PM, Andrii Nakryiko wrote:
> On Wed, Jul 10, 2019 at 5:36 PM Yonghong Song <yhs@fb.com> wrote:
>>
>>
>>
>> On 7/10/19 5:29 PM, Andrii Nakryiko wrote:
>>> On Wed, Jul 10, 2019 at 5:16 PM Yonghong Song <yhs@fb.com> wrote:
>>>>
>>>>
>>>>
>>>> On 7/10/19 1:08 AM, Andrii Nakryiko wrote:
>>>>> BTF verifier has different logic depending on whether we are following
>>>>> a PTR or STRUCT/ARRAY (or something else). This is an optimization to
>>>>> stop early in DFS traversal while resolving BTF types. But it also
>>>>> results in a size resolution bug, when there is a chain, e.g., of PTR ->
>>>>> TYPEDEF -> ARRAY, in which case due to being in pointer context ARRAY
>>>>> size won't be resolved, as it is considered to be a sink for pointer,
>>>>> leading to TYPEDEF being in RESOLVED state with zero size, which is
>>>>> completely wrong.
>>>>>
>>>>> Optimization is doubtful, though, as btf_check_all_types() will iterate
>>>>> over all BTF types anyways, so the only saving is a potentially slightly
>>>>> shorter stack. But correctness is more important than tiny savings.
>>>>>
>>>>> This bug manifests itself in rejecting BTF-defined maps that use array
>>>>> typedef as a value type:
>>>>>
>>>>> typedef int array_t[16];
>>>>>
>>>>> struct {
>>>>>         __uint(type, BPF_MAP_TYPE_ARRAY);
>>>>>         __type(value, array_t); /* i.e., array_t *value; */
>>>>> } test_map SEC(".maps");
>>>>>
>>>>> Fixes: eb3f595dab40 ("bpf: btf: Validate type reference")
>>>>> Cc: Martin KaFai Lau <kafai@fb.com>
>>>>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>>>>
>>>> The change seems okay to me. Currently, looks like intermediate
>>>> modifier type will carry size = 0 (in the internal data structure).
>>>
>>> Yes, which is totally wrong, especially that we use that size in some
>>> cases to reject map with specified BTF.
>>>
>>>>
>>>> If we remove RESOLVE logic, we probably want to double check
>>>> whether we handle circular types correctly or not. Maybe we will
>>>> be okay if all self tests pass.
>>>
>>> I checked, it does. We'll attempt to add referenced type unless it's a
>>> "resolve sink" (where size is immediately known) or is already
>>> resolved (its state is RESOLVED). In other cases, we'll attempt to
>>> env_stack_push(), which checks that the state of that type is
>>> NOT_VISITED. If it's RESOLVED or VISITED, it returns -EEXIST. When
>>> type is added into the stack, its resolve state goes from NOT_VISITED
>>> to VISITED.
>>>
>>> So, if there is a loop, then we'll detect it as soon as we'll attempt
>>> to add the same type onto the stack second time.
>>>
>>>>
>>>> It may still be worthwhile to qualify the RESOLVE optimization benefit
>>>> before removing it.
>>>
>>> I don't think there is any, because every type will be visited exactly
>>> once, due to DFS nature of algorithm. The only difference is that if
>>> we have a long chain of modifiers, we can technically reach the max
>>> limit and fail. But at 32 I think it's pretty unrealistic to have such
>>> a long chain of PTR/TYPEDEF/CONST/VOLATILE/RESTRICTs :)
>>>
>>>>
>>>> Another possible change is, for external usage, removing
>>>> modifiers, before checking the size, something like below.
>>>> Note that I am not strongly advocating my below patch as
>>>> it has the same shortcoming that maintained modifier type
>>>> size may not be correct.
>>>
>>> I don't think your patch helps, it can actually confuse things even
>>> more. It skips modifiers until underlying type is found, but you still
>>> don't guarantee that at that time that underlying type will have its
>>> size resolved.
>>
>> It actually does help. It does not change the internal btf type
>> traversal algorithms. It only changes the implementation of
>> an external API btf_type_id_size(). Previously, this function
>> is used by externals and internal btf.c. I broke it into two,
>> one internal __btf_type_id_size(), and another external
>> btf_type_id_size(). The external one removes modifier before
>> finding type size. The external one is typically used only
>> after btf is validated.
> 
> Sure, for external callers yes, it solves the problem. But there is a
> deeper problem: we mark modifier types RESOLVED before types they
> ultimately point to are resolved. Then in all those btf_xxx_resolve()
> functions we have a check:
> 
> if (!env_type_is_resolve_sink && !env_type_is_resolved)
>    return env_stack_push();
> else {
>    /*
>     * Here we assume that we can calculate the size of the type.
>     * So even if we traverse through all the modifiers and find the
>     * underlying type, that type will have resolved_size = 0, because
>     * we haven't processed it yet, but we will just incorrectly assume
>     * that zero is the *final* size.
>     */
> }
> 
> So I think that your patch is still just hiding the problem, not solving it.

That is why I am not advocating it.

The really long modifier chain (const volatile restrict ...) is rare.
So I agree removing this RESOLVE logic is okay.

> 
> BTW, I've also identified part of btf_ptr_resolve() logic that can be
> now safely removed (it's a special case that "restarts" DFS traversal
> for modifiers, because they could have been prematurely marked
> resolved). This is another sign that there is something wrong in the
> algorithm.
> 
> I'd rather remove unnecessary complexity and fix the underlying problem,
> especially given that there is no performance or correctness penalty.
> 
> I'll post v2 soon.

Sounds good.

> 
>>
>> Will go through your other comments later.
>>
>>>
>>>>
>>>> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
>>>> index 546ebee39e2a..6f927c3e0a89 100644
>>>> --- a/kernel/bpf/btf.c
>>>> +++ b/kernel/bpf/btf.c
>>>> @@ -620,6 +620,54 @@ static bool btf_type_int_is_regular(const struct
>>>> btf_type *t)
>>>>            return true;
>>>>     }
>>>>
>>>> +static const struct btf_type *__btf_type_id_size(const struct btf *btf,
>>>> +                                                u32 *type_id, u32
>>>> *ret_size,
>>>> +                                                bool skip_modifier)
>>>> +{
>>>> +       const struct btf_type *size_type;
>>>> +       u32 size_type_id = *type_id;
>>>> +       u32 size = 0;
>>>> +
>>>> +       size_type = btf_type_by_id(btf, size_type_id);
>>>> +       if (size_type && skip_modifier) {
>>>> +               while (btf_type_is_modifier(size_type))
>>>> +                       size_type = btf_type_by_id(btf, size_type->type);
>>>> +       }
>>>> +
>>>> +       if (btf_type_nosize_or_null(size_type))
>>>> +               return NULL;
>>>> +
>>>> +       if (btf_type_has_size(size_type)) {
>>>> +               size = size_type->size;
>>>> +       } else if (btf_type_is_array(size_type)) {
>>>> +               size = btf->resolved_sizes[size_type_id];
>>>> +       } else if (btf_type_is_ptr(size_type)) {
>>>> +               size = sizeof(void *);
>>>> +       } else {
>>>> +               if (WARN_ON_ONCE(!btf_type_is_modifier(size_type) &&
>>>> +                                !btf_type_is_var(size_type)))
>>>> +                       return NULL;
>>>> +
>>>> +               size = btf->resolved_sizes[size_type_id];
>>>> +               size_type_id = btf->resolved_ids[size_type_id];
>>>> +               size_type = btf_type_by_id(btf, size_type_id);
>>>> +               if (btf_type_nosize_or_null(size_type))
>>>> +                       return NULL;
>>>> +       }
>>>> +
>>>> +       *type_id = size_type_id;
>>>> +       if (ret_size)
>>>> +               *ret_size = size;
>>>> +
>>>> +       return size_type;
>>>> +}
>>>> +
>> [...]


* Re: linux-next: build failure after merge of the net-next tree
From: Stephen Rothwell @ 2019-07-11  1:50 UTC
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Bernard Metzler, Doug Ledford, David Miller,
	Networking, Linux Next Mailing List, Linux Kernel Mailing List
In-Reply-To: <20190710175212.GM2887@mellanox.com>

Hi all,

On Wed, 10 Jul 2019 17:52:17 +0000 Jason Gunthorpe <jgg@mellanox.com> wrote:
>
> On Tue, Jul 09, 2019 at 09:43:46AM +0300, Leon Romanovsky wrote:
> > On Tue, Jul 09, 2019 at 01:56:36PM +1000, Stephen Rothwell wrote:  
> > > Hi all,
> > >
> > > After merging the net-next tree, today's linux-next build (x86_64
> > > allmodconfig) failed like this:
> > >
> > > drivers/infiniband/sw/siw/siw_cm.c: In function 'siw_create_listen':
> > > drivers/infiniband/sw/siw/siw_cm.c:1978:3: error: implicit declaration of function 'for_ifa'; did you mean 'fork_idle'? [-Werror=implicit-function-declaration]
> > >    for_ifa(in_dev)
> > >    ^~~~~~~
> > >    fork_idle
> > > drivers/infiniband/sw/siw/siw_cm.c:1978:18: error: expected ';' before '{' token
> > >    for_ifa(in_dev)
> > >                   ^
> > >                   ;
> > >    {
> > >    ~
> > >
> > > Caused by commit
> > >
> > >   6c52fdc244b5 ("rdma/siw: connection management")
> > >
> > > from the rdma tree.  I don't know why this didn't fail after I merged
> > > that tree.  
> > 
> > I had the same question, because I have this fix for a couple of days already.
> > 
> > From 56c9e15ec670af580daa8c3ffde9503af3042d67 Mon Sep 17 00:00:00 2001
> > From: Leon Romanovsky <leonro@mellanox.com>
> > Date: Sun, 7 Jul 2019 10:43:42 +0300
> > Subject: [PATCH] Fixup to build SIW issue
> > 
> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> >  drivers/infiniband/sw/siw/siw_cm.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c
> > index 8e618cb7261f..c883bf514341 100644
> > +++ b/drivers/infiniband/sw/siw/siw_cm.c
> > @@ -1954,6 +1954,7 @@ static void siw_drop_listeners(struct iw_cm_id *id)
> >  int siw_create_listen(struct iw_cm_id *id, int backlog)
> >  {
> >  	struct net_device *dev = to_siw_dev(id->device)->netdev;
> > +	const struct in_ifaddr *ifa;
> >  	int rv = 0, listeners = 0;
> > 
> >  	siw_dbg(id->device, "id 0x%p: backlog %d\n", id, backlog);
> > @@ -1975,8 +1976,7 @@ int siw_create_listen(struct iw_cm_id *id, int backlog)
> >  			id, &s_laddr.sin_addr, ntohs(s_laddr.sin_port),
> >  			&s_raddr->sin_addr, ntohs(s_raddr->sin_port));
> > 
> > -		for_ifa(in_dev)
> > -		{
> > +		in_dev_for_each_ifa_rcu(ifa, in_dev) {
> >  			if (ipv4_is_zeronet(s_laddr.sin_addr.s_addr) ||  
> 
> Hum. There is no rcu lock held here and we can't use RCU anyhow as
> siw_listen_address will sleep.
> 
> I think this needs to use rtnl, as below. Bernard, please urgently
> confirm. Thanks
> 
> diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c
> index 8e618cb7261f62..ee98e96a5bfaba 100644
> --- a/drivers/infiniband/sw/siw/siw_cm.c
> +++ b/drivers/infiniband/sw/siw/siw_cm.c
> @@ -1965,6 +1965,7 @@ int siw_create_listen(struct iw_cm_id *id, int backlog)
>  	 */
>  	if (id->local_addr.ss_family == AF_INET) {
>  		struct in_device *in_dev = in_dev_get(dev);
> +		const struct in_ifaddr *ifa;
>  		struct sockaddr_in s_laddr, *s_raddr;
>  
>  		memcpy(&s_laddr, &id->local_addr, sizeof(s_laddr));
> @@ -1975,8 +1976,8 @@ int siw_create_listen(struct iw_cm_id *id, int backlog)
>  			id, &s_laddr.sin_addr, ntohs(s_laddr.sin_port),
>  			&s_raddr->sin_addr, ntohs(s_raddr->sin_port));
>  
> -		for_ifa(in_dev)
> -		{
> +		rtnl_lock();
> +		in_dev_for_each_ifa_rtnl(ifa, in_dev) {
>  			if (ipv4_is_zeronet(s_laddr.sin_addr.s_addr) ||
>  			    s_laddr.sin_addr.s_addr == ifa->ifa_address) {
>  				s_laddr.sin_addr.s_addr = ifa->ifa_address;
> @@ -1988,7 +1989,7 @@ int siw_create_listen(struct iw_cm_id *id, int backlog)
>  					listeners++;
>  			}
>  		}
> -		endfor_ifa(in_dev);
> +		rtnl_unlock();
>  		in_dev_put(in_dev);
>  	} else if (id->local_addr.ss_family == AF_INET6) {
>  		struct inet6_dev *in6_dev = in6_dev_get(dev);

So today this failed to build after I merged the rdma tree (previously
it didn't until after the net-next tree was merged; I assume a
dependency changed).  It failed because in_dev_for_each_ifa_rcu (and
in_dev_for_each_ifa_rtnl) is only defined in a commit in the net-next
tree :-(

I have disabled the driver again.

-- 
Cheers,
Stephen Rothwell


* Re: [PATCH bpf] bpf: fix BTF verifier size resolution logic
From: Andrii Nakryiko @ 2019-07-11  1:45 UTC
  To: Yonghong Song
  Cc: Andrii Nakryiko, Alexei Starovoitov, daniel@iogearbox.net,
	bpf@vger.kernel.org, netdev@vger.kernel.org, Kernel Team,
	Martin Lau
In-Reply-To: <304d8535-5043-836d-2933-1a5efb7aec72@fb.com>

On Wed, Jul 10, 2019 at 5:36 PM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 7/10/19 5:29 PM, Andrii Nakryiko wrote:
> > On Wed, Jul 10, 2019 at 5:16 PM Yonghong Song <yhs@fb.com> wrote:
> >>
> >>
> >>
> >> On 7/10/19 1:08 AM, Andrii Nakryiko wrote:
> >>> BTF verifier has different logic depending on whether we are following
> >>> a PTR or STRUCT/ARRAY (or something else). This is an optimization to
> >>> stop early in DFS traversal while resolving BTF types. But it also
> >>> results in a size resolution bug, when there is a chain, e.g., of PTR ->
> >>> TYPEDEF -> ARRAY, in which case due to being in pointer context ARRAY
> >>> size won't be resolved, as it is considered to be a sink for pointer,
> >>> leading to TYPEDEF being in RESOLVED state with zero size, which is
> >>> completely wrong.
> >>>
> >>> Optimization is doubtful, though, as btf_check_all_types() will iterate
> >>> over all BTF types anyways, so the only saving is a potentially slightly
> >>> shorter stack. But correctness is more important than tiny savings.
> >>>
> >>> This bug manifests itself in rejecting BTF-defined maps that use array
> >>> typedef as a value type:
> >>>
> >>> typedef int array_t[16];
> >>>
> >>> struct {
> >>>        __uint(type, BPF_MAP_TYPE_ARRAY);
> >>>        __type(value, array_t); /* i.e., array_t *value; */
> >>> } test_map SEC(".maps");
> >>>
> >>> Fixes: eb3f595dab40 ("bpf: btf: Validate type reference")
> >>> Cc: Martin KaFai Lau <kafai@fb.com>
> >>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> >>
> >> The change seems okay to me. Currently, looks like intermediate
> >> modifier type will carry size = 0 (in the internal data structure).
> >
> > Yes, which is totally wrong, especially that we use that size in some
> > cases to reject map with specified BTF.
> >
> >>
> >> If we remove RESOLVE logic, we probably want to double check
> >> whether we handle circular types correctly or not. Maybe we will
> >> be okay if all self tests pass.
> >
> > I checked, it does. We'll attempt to add referenced type unless it's a
> > "resolve sink" (where size is immediately known) or is already
> > resolved (its state is RESOLVED). In other cases, we'll attempt to
> > env_stack_push(), which checks that the state of that type is
> > NOT_VISITED. If it's RESOLVED or VISITED, it returns -EEXIST. When
> > type is added into the stack, its resolve state goes from NOT_VISITED
> > to VISITED.
> >
> > So, if there is a loop, then we'll detect it as soon as we'll attempt
> > to add the same type onto the stack second time.
> >
> >>
> >> It may still be worthwhile to qualify the RESOLVE optimization benefit
> >> before removing it.
> >
> > I don't think there is any, because every type will be visited exactly
> > once, due to DFS nature of algorithm. The only difference is that if
> > we have a long chain of modifiers, we can technically reach the max
> > limit and fail. But at 32 I think it's pretty unrealistic to have such
> > a long chain of PTR/TYPEDEF/CONST/VOLATILE/RESTRICTs :)
> >
> >>
> >> Another possible change is, for external usage, removing
> >> modifiers, before checking the size, something like below.
> >> Note that I am not strongly advocating my below patch as
> >> it has the same shortcoming that maintained modifier type
> >> size may not be correct.
> >
> > I don't think your patch helps, it can actually confuse things even
> > more. It skips modifiers until underlying type is found, but you still
> > don't guarantee that at that time that underlying type will have its
> > size resolved.
>
> It actually does help. It does not change the internal btf type
> traversal algorithms. It only changes the implementation of
> an external API btf_type_id_size(). Previously, this function
> is used by externals and internal btf.c. I broke it into two,
> one internal __btf_type_id_size(), and another external
> btf_type_id_size(). The external one removes modifier before
> finding type size. The external one is typically used only
> after btf is validated.

Sure, for external callers yes, it solves the problem. But there is a
deeper problem: we mark modifier types RESOLVED before types they
ultimately point to are resolved. Then in all those btf_xxx_resolve()
functions we have a check:

if (!env_type_is_resolve_sink && !env_type_is_resolved)
  return env_stack_push();
else {
  /*
   * Here we assume that we can calculate the size of the type.
   * So even if we traverse through all the modifiers and find the
   * underlying type, that type will have resolved_size = 0, because
   * we haven't processed it yet, but we will just incorrectly assume
   * that zero is the *final* size.
   */
}

So I think that your patch is still just hiding the problem, not solving it.
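
To make the ordering argument concrete, here is a small user-space sketch
of the three-state DFS (NOT_VISITED -> VISITED -> RESOLVED) in which a type
is only marked RESOLVED after the type it refers to has been resolved. It
is a toy model, not the code in kernel/bpf/btf.c: it recurses instead of
using the verifier's explicit stack, it knows only four kinds, and every
toy_* name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum toy_kind { TOY_INT, TOY_PTR, TOY_TYPEDEF, TOY_ARRAY };
enum toy_state { TOY_NOT_VISITED, TOY_VISITED, TOY_RESOLVED };

struct toy_type {
	enum toy_kind kind;
	unsigned int ref;	/* index of the referenced type, unused for INT */
	unsigned int nelems;	/* element count, ARRAY only */
	unsigned int size;	/* preset for INT, computed for everything else */
	enum toy_state state;
};

/* Resolve types[id]; returns 0 on success or -EEXIST if a loop is found. */
static int toy_resolve(struct toy_type *types, unsigned int id)
{
	struct toy_type *t;
	int err;

	assert(types != NULL);
	t = &types[id];
	if (t->state == TOY_RESOLVED)
		return 0;
	if (t->state == TOY_VISITED)
		return -EEXIST;		/* already on the DFS path: a loop */

	t->state = TOY_VISITED;
	if (t->kind != TOY_INT) {
		/* Always resolve the referenced type first, even under a PTR. */
		err = toy_resolve(types, t->ref);
		if (err)
			return err;
	}

	switch (t->kind) {
	case TOY_INT:
		break;			/* size was given up front */
	case TOY_PTR:
		t->size = (unsigned int)sizeof(void *);
		break;
	case TOY_TYPEDEF:
		t->size = types[t->ref].size;
		break;
	case TOY_ARRAY:
		t->size = t->nelems * types[t->ref].size;
		break;
	}
	t->state = TOY_RESOLVED;
	return 0;
}
```

For a PTR -> TYPEDEF -> ARRAY chain over a 4-byte int with 16 elements, the
typedef ends up RESOLVED with size 64 rather than zero, and two typedefs
that refer to each other are rejected with -EEXIST.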

BTW, I've also identified part of btf_ptr_resolve() logic that can be
now safely removed (it's a special case that "restarts" DFS traversal
for modifiers, because they could have been prematurely marked
resolved). This is another sign that there is something wrong in the
algorithm.

I'd rather remove unnecessary complexity and fix the underlying problem,
especially given that there is no performance or correctness penalty.

I'll post v2 soon.

>
> Will go through your other comments later.
>
> >
> >>
> >> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> >> index 546ebee39e2a..6f927c3e0a89 100644
> >> --- a/kernel/bpf/btf.c
> >> +++ b/kernel/bpf/btf.c
> >> @@ -620,6 +620,54 @@ static bool btf_type_int_is_regular(const struct
> >> btf_type *t)
> >>           return true;
> >>    }
> >>
> >> +static const struct btf_type *__btf_type_id_size(const struct btf *btf,
> >> +                                                u32 *type_id, u32
> >> *ret_size,
> >> +                                                bool skip_modifier)
> >> +{
> >> +       const struct btf_type *size_type;
> >> +       u32 size_type_id = *type_id;
> >> +       u32 size = 0;
> >> +
> >> +       size_type = btf_type_by_id(btf, size_type_id);
> >> +       if (size_type && skip_modifier) {
> >> +               while (btf_type_is_modifier(size_type))
> >> +                       size_type = btf_type_by_id(btf, size_type->type);
> >> +       }
> >> +
> >> +       if (btf_type_nosize_or_null(size_type))
> >> +               return NULL;
> >> +
> >> +       if (btf_type_has_size(size_type)) {
> >> +               size = size_type->size;
> >> +       } else if (btf_type_is_array(size_type)) {
> >> +               size = btf->resolved_sizes[size_type_id];
> >> +       } else if (btf_type_is_ptr(size_type)) {
> >> +               size = sizeof(void *);
> >> +       } else {
> >> +               if (WARN_ON_ONCE(!btf_type_is_modifier(size_type) &&
> >> +                                !btf_type_is_var(size_type)))
> >> +                       return NULL;
> >> +
> >> +               size = btf->resolved_sizes[size_type_id];
> >> +               size_type_id = btf->resolved_ids[size_type_id];
> >> +               size_type = btf_type_by_id(btf, size_type_id);
> >> +               if (btf_type_nosize_or_null(size_type))
> >> +                       return NULL;
> >> +       }
> >> +
> >> +       *type_id = size_type_id;
> >> +       if (ret_size)
> >> +               *ret_size = size;
> >> +
> >> +       return size_type;
> >> +}
> >> +
> [...]


* RE: [PATCH v3 0/3] kernel/notifier.c: avoid duplicate registration
From: Nixiaoming @ 2019-07-11  1:32 UTC (permalink / raw)
  To: Greg KH
  Cc: adobriyan@gmail.com, akpm@linux-foundation.org,
	anna.schumaker@netapp.com, arjan@linux.intel.com,
	bfields@fieldses.org, chuck.lever@oracle.com, davem@davemloft.net,
	jlayton@kernel.org, luto@kernel.org, mingo@kernel.org,
	Nadia.Derbey@bull.net, paulmck@linux.vnet.ibm.com,
	semen.protsenko@linaro.org, stable@kernel.org,
	stern@rowland.harvard.edu, tglx@linutronix.de,
	torvalds@linux-foundation.org, trond.myklebust@hammerspace.com,
	viresh.kumar@linaro.org, vvs@virtuozzo.com, Huangjianhui (Alex),
	Dailei, linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org,
	netdev@vger.kernel.org
In-Reply-To: <20190710055628.GB5778@kroah.com>

On Wed, July 10, 2019 1:56 PM Greg KH wrote:
>On Wed, Jul 10, 2019 at 11:09:07AM +0800, Xiaoming Ni wrote:
>> Registering the same notifier to a hook repeatedly can cause the hook
>> list to form a ring or lose other members of the list.
>
>Then don't do that :)
>

notifier_chain_cond_register() already checks for duplicate registration and
returns without adding the entry again.

notifier_chain_register() also checks for it, but only triggers a warning and
then carries on with the registration. That check was added by commit
831246570d34692e ("kernel/notifier.c: double register detection").

This patch follows commit 831246570d34692e and notifier_chain_cond_register(),
but actually prevents the duplicate registration, which I think is necessary
to avoid turning the linked list into a ring.
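
To illustrate the point above, here is a minimal userspace sketch of a
priority-ordered, singly linked chain that refuses a duplicate entry. It is
not the kernel code: struct nb and chain_register() are hypothetical names
made up for this example, and locking is left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a notifier block. */
struct nb {
	struct nb *next;
	int priority;
};

/*
 * Insert n into the priority-ordered list at *nl.
 * Returns 0 on success, or -EEXIST if n is already on the list,
 * in which case the list is left untouched.
 */
static int chain_register(struct nb **nl, struct nb *n)
{
	assert(nl != NULL && n != NULL);

	while (*nl != NULL) {
		if (*nl == n)
			return -EEXIST;	/* duplicate: do not relink */
		if (n->priority > (*nl)->priority)
			break;		/* found the insertion point */
		nl = &(*nl)->next;
	}
	n->next = *nl;
	*nl = n;
	return 0;
}
```

Without the (*nl == n) test, registering the only entry of a list a second
time walks past it and then sets its next pointer to itself, which is the
ring mentioned above.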

>Is there any in-kernel users that do do this?  If so, please just fix
>them.
>
notifier_chain_register() is not on a hot path, so adding a check there
costs little and can make the kernel more stable.

Thanks

Xiaoming Ni


>thanks,
>
>greg k-h
>

^ permalink raw reply

* Re: [bpf-next v3 05/12] selftests/bpf: Allow passing more information to BPF prog test run
From: Andrii Nakryiko @ 2019-07-11  1:17 UTC (permalink / raw)
  To: Krzesimir Nowak
  Cc: open list, Alban Crequy, Iago López Galeiras,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, David S. Miller, Jakub Kicinski,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Networking, bpf, xdp-newbies
In-Reply-To: <20190708163121.18477-6-krzesimir@kinvolk.io>

On Mon, Jul 8, 2019 at 3:42 PM Krzesimir Nowak <krzesimir@kinvolk.io> wrote:
>
> The test case can now specify a custom length of the data member,
> context data and its length, which will be passed to
> bpf_prog_test_run_xattr. For backward compatibility, if the data
> length is 0 (which is what will happen when the field is left
> unspecified in the designated initializer of a struct), then the
> length passed to the bpf_prog_test_run_xattr is TEST_DATA_LEN.
>
> Also for backward compatibility, if context data length is 0, NULL is
> passed as a context to bpf_prog_test_run_xattr. This is to avoid
> breaking other tests, where context data being NULL and context data
> length being 0 is handled differently from the case where context data
> is not NULL and context data length is 0.
>
> Custom lengths still can't be greater than hardcoded 64 bytes for data
> and 192 for context data.
>
> 192 for context data was picked to allow passing struct
> bpf_perf_event_data as a context for perf event programs. The struct
> is quite large, because it contains struct pt_regs.
>
> Test runs for perf event programs will not allow copying the data
> back to data_out buffer, so they require data_out_size to be zero and
> data_out to be NULL. Since test_verifier hardcodes it, make it
> possible to override the size. Overriding the size to zero will cause
> the buffer to be NULL.
>
> Changes since v2:
> - Allow overriding the data out size and buffer.
>
> Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
> ---
>  tools/testing/selftests/bpf/test_verifier.c | 105 +++++++++++++++++---
>  1 file changed, 93 insertions(+), 12 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
> index 1640ba9f12c1..6f124cc4ee34 100644
> --- a/tools/testing/selftests/bpf/test_verifier.c
> +++ b/tools/testing/selftests/bpf/test_verifier.c
> @@ -54,6 +54,7 @@
>  #define MAX_TEST_RUNS  8
>  #define POINTER_VALUE  0xcafe4all
>  #define TEST_DATA_LEN  64
> +#define TEST_CTX_LEN   192
>
>  #define F_NEEDS_EFFICIENT_UNALIGNED_ACCESS     (1 << 0)
>  #define F_LOAD_WITH_STRICT_ALIGNMENT           (1 << 1)
> @@ -96,7 +97,12 @@ struct bpf_test {
>         enum bpf_prog_type prog_type;
>         uint8_t flags;
>         __u8 data[TEST_DATA_LEN];
> +       __u32 data_len;
> +       __u8 ctx[TEST_CTX_LEN];
> +       __u32 ctx_len;
>         void (*fill_helper)(struct bpf_test *self);
> +       bool override_data_out_len;
> +       __u32 overridden_data_out_len;
>         uint8_t runs;
>         struct {
>                 uint32_t retval, retval_unpriv;
> @@ -104,6 +110,9 @@ struct bpf_test {
>                         __u8 data[TEST_DATA_LEN];
>                         __u64 data64[TEST_DATA_LEN / 8];
>                 };
> +               __u32 data_len;
> +               __u8 ctx[TEST_CTX_LEN];
> +               __u32 ctx_len;
>         } retvals[MAX_TEST_RUNS];
>  };
>
> @@ -818,21 +827,35 @@ static int set_admin(bool admin)
>  }
>
>  static int do_prog_test_run(int fd_prog, bool unpriv, uint32_t expected_val,
> -                           void *data, size_t size_data)
> +                           void *data, size_t size_data, void *ctx,
> +                           size_t size_ctx, u32 *overridden_data_out_size)
>  {
> -       __u8 tmp[TEST_DATA_LEN << 2];
> -       __u32 size_tmp = sizeof(tmp);
> -       int saved_errno;
> -       int err;
>         struct bpf_prog_test_run_attr attr = {
>                 .prog_fd = fd_prog,
>                 .repeat = 1,
>                 .data_in = data,
>                 .data_size_in = size_data,
> -               .data_out = tmp,
> -               .data_size_out = size_tmp,
> +               .ctx_in = ctx,
> +               .ctx_size_in = size_ctx,
>         };
> +       __u8 tmp[TEST_DATA_LEN << 2];
> +       __u32 size_tmp = sizeof(tmp);
> +       __u32 size_buf = size_tmp;
> +       __u8 *buf = tmp;
> +       int saved_errno;
> +       int err;
>
> +       if (overridden_data_out_size)
> +               size_buf = *overridden_data_out_size;
> +       if (size_buf > size_tmp) {
> +               printf("FAIL: out data size (%d) greater than a buffer size (%d) ",
> +                      size_buf, size_tmp);
> +               return -EINVAL;
> +       }
> +       if (!size_buf)
> +               buf = NULL;
> +       attr.data_size_out = size_buf;
> +       attr.data_out = buf;
>         if (unpriv)
>                 set_admin(true);
>         err = bpf_prog_test_run_xattr(&attr);
> @@ -956,13 +979,45 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
>         if (!alignment_prevented_execution && fd_prog >= 0) {
>                 uint32_t expected_val;
>                 int i;
> +               __u32 size_data;
> +               __u32 size_ctx;
> +               bool bad_size;
> +               void *ctx;
> +               __u32 *overridden_data_out_size;
>
>                 if (!test->runs) {
> +                       if (test->data_len > 0)
> +                               size_data = test->data_len;
> +                       else
> +                               size_data = sizeof(test->data);
> +                       if (test->override_data_out_len)
> +                               overridden_data_out_size = &test->overridden_data_out_len;
> +                       else
> +                               overridden_data_out_size = NULL;
> +                       size_ctx = test->ctx_len;
> +                       bad_size = false;

I hated all this duplication of logic, which with this patch becomes
even more extensive, so I removed it. Please see [0]. Can you please
apply that patch and add all this new logic only once?

  [0] https://patchwork.ozlabs.org/patch/1130601/
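
For reference, the idea in [0] is to treat a single-run test as a test whose
first retvals entry is the single-run spec. A minimal standalone sketch of
that layout is below. It is only an illustration: struct run_spec,
struct test, effective_runs() and expected_retval() are hypothetical names,
not the ones used in test_verifier.c, and the data buffers are left out.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TEST_RUNS 8

/* Hypothetical per-run expectation. */
struct run_spec {
	uint32_t retval, retval_unpriv;
};

struct test {
	uint8_t runs;
	union {
		/* Single-run spelling: overlays retvals[0]. */
		struct {
			uint32_t retval, retval_unpriv;
		};
		/* Multi-run spelling. */
		struct run_spec retvals[MAX_TEST_RUNS];
	};
};

/* A test that leaves runs at 0 is treated as one run. */
static unsigned int effective_runs(const struct test *t)
{
	return t->runs ? t->runs : 1;
}

/* Pick the expected return value for run i, preferring the unpriv one. */
static uint32_t expected_retval(const struct test *t, unsigned int i,
				int unpriv)
{
	assert(i < MAX_TEST_RUNS);
	if (unpriv && t->retvals[i].retval_unpriv)
		return t->retvals[i].retval_unpriv;
	return t->retvals[i].retval;
}
```

With this layout the run loop only ever reads retvals[i], so any new
per-run field has to be handled in one place. Anonymous structs and unions
need C11 or the GNU extension.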

>                         expected_val = unpriv && test->retval_unpriv ?
>                                 test->retval_unpriv : test->retval;
>
> -                       err = do_prog_test_run(fd_prog, unpriv, expected_val,
> -                                              test->data, sizeof(test->data));
> +                       if (size_data > sizeof(test->data)) {
> +                               printf("FAIL: data size (%u) greater than TEST_DATA_LEN (%lu) ", size_data, sizeof(test->data));
> +                               bad_size = true;
> +                       }
> +                       if (size_ctx > sizeof(test->ctx)) {
> +                               printf("FAIL: ctx size (%u) greater than TEST_CTX_LEN (%lu) ", size_ctx, sizeof(test->ctx));

These look like way too long lines, wrap them?

> +                               bad_size = true;
> +                       }
> +                       if (size_ctx)
> +                               ctx = test->ctx;
> +                       else
> +                               ctx = NULL;

nit: single line:

ctx = size_ctx ? test->ctx : NULL;

> +                       if (bad_size)
> +                               err = 1;
> +                       else
> +                               err = do_prog_test_run(fd_prog, unpriv, expected_val,
> +                                                      test->data, size_data,
> +                                                      ctx, size_ctx,
> +                                                      overridden_data_out_size);
>                         if (err)
>                                 run_errs++;
>                         else
> @@ -970,14 +1025,40 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
>                 }
>
>                 for (i = 0; i < test->runs; i++) {
> +                       if (test->retvals[i].data_len > 0)
> +                               size_data = test->retvals[i].data_len;
> +                       else
> +                               size_data = sizeof(test->retvals[i].data);
> +                       if (test->override_data_out_len)
> +                               overridden_data_out_size = &test->overridden_data_out_len;
> +                       else
> +                               overridden_data_out_size = NULL;
> +                       size_ctx = test->retvals[i].ctx_len;
> +                       bad_size = false;
>                         if (unpriv && test->retvals[i].retval_unpriv)
>                                 expected_val = test->retvals[i].retval_unpriv;
>                         else
>                                 expected_val = test->retvals[i].retval;
>
> -                       err = do_prog_test_run(fd_prog, unpriv, expected_val,
> -                                              test->retvals[i].data,
> -                                              sizeof(test->retvals[i].data));
> +                       if (size_data > sizeof(test->retvals[i].data)) {
> +                               printf("FAIL: data size (%u) at run %i greater than TEST_DATA_LEN (%lu) ", size_data, i + 1, sizeof(test->retvals[i].data));
> +                               bad_size = true;
> +                       }
> +                       if (size_ctx > sizeof(test->retvals[i].ctx)) {
> +                               printf("FAIL: ctx size (%u) at run %i greater than TEST_CTX_LEN (%lu) ", size_ctx, i + 1, sizeof(test->retvals[i].ctx));
> +                               bad_size = true;
> +                       }
> +                       if (size_ctx)
> +                               ctx = test->retvals[i].ctx;
> +                       else
> +                               ctx = NULL;
> +                       if (bad_size)
> +                               err = 1;
> +                       else
> +                               err = do_prog_test_run(fd_prog, unpriv, expected_val,
> +                                                      test->retvals[i].data, size_data,
> +                                                      ctx, size_ctx,
> +                                                      overridden_data_out_size);
>                         if (err) {
>                                 printf("(run %d/%d) ", i + 1, test->runs);
>                                 run_errs++;
> --
> 2.20.1
>

^ permalink raw reply

* [PATCH bpf-next] selftests/bpf: remove logic duplication in test_verifier.c
From: Andrii Nakryiko @ 2019-07-11  1:08 UTC (permalink / raw)
  To: andrii.nakryiko, kernel-team, ast, daniel, bpf, netdev
  Cc: Andrii Nakryiko, Krzesimir Nowak

test_verifier can specify single-run and multi-run tests. Internally, the
logic for handling them is duplicated. Get rid of the duplication by making
the single-run retval specification the first retvals spec.

Cc: Krzesimir Nowak <krzesimir@kinvolk.io>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/testing/selftests/bpf/test_verifier.c | 37 ++++++++++-----------
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index b0773291012a..120ecdf4a7db 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -86,7 +86,7 @@ struct bpf_test {
 	int fixup_sk_storage_map[MAX_FIXUPS];
 	const char *errstr;
 	const char *errstr_unpriv;
-	uint32_t retval, retval_unpriv, insn_processed;
+	uint32_t insn_processed;
 	int prog_len;
 	enum {
 		UNDEF,
@@ -95,16 +95,24 @@ struct bpf_test {
 	} result, result_unpriv;
 	enum bpf_prog_type prog_type;
 	uint8_t flags;
-	__u8 data[TEST_DATA_LEN];
 	void (*fill_helper)(struct bpf_test *self);
 	uint8_t runs;
-	struct {
-		uint32_t retval, retval_unpriv;
-		union {
-			__u8 data[TEST_DATA_LEN];
-			__u64 data64[TEST_DATA_LEN / 8];
+	union {
+		struct {
+			uint32_t retval, retval_unpriv;
+			union {
+				__u8 data[TEST_DATA_LEN];
+				__u64 data64[TEST_DATA_LEN / 8];
+			};
 		};
-	} retvals[MAX_TEST_RUNS];
+		struct {
+			uint32_t retval, retval_unpriv;
+			union {
+				__u8 data[TEST_DATA_LEN];
+				__u64 data64[TEST_DATA_LEN / 8];
+			};
+		} retvals[MAX_TEST_RUNS];
+	};
 	enum bpf_attach_type expected_attach_type;
 };
 
@@ -949,17 +957,8 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
 		uint32_t expected_val;
 		int i;
 
-		if (!test->runs) {
-			expected_val = unpriv && test->retval_unpriv ?
-				test->retval_unpriv : test->retval;
-
-			err = do_prog_test_run(fd_prog, unpriv, expected_val,
-					       test->data, sizeof(test->data));
-			if (err)
-				run_errs++;
-			else
-				run_successes++;
-		}
+		if (!test->runs)
+			test->runs = 1;
 
 		for (i = 0; i < test->runs; i++) {
 			if (unpriv && test->retvals[i].retval_unpriv)
-- 
2.17.1


^ permalink raw reply related

* Re: [PATCH bpf] bpf: fix BTF verifier size resolution logic
From: Yonghong Song @ 2019-07-11  0:36 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, Alexei Starovoitov, daniel@iogearbox.net,
	bpf@vger.kernel.org, netdev@vger.kernel.org, Kernel Team,
	Martin Lau
In-Reply-To: <CAEf4BzaVouFd=3whC1EjhQ9mit62b-C+NhQuW4RiXW02Rq_1Ug@mail.gmail.com>



On 7/10/19 5:29 PM, Andrii Nakryiko wrote:
> On Wed, Jul 10, 2019 at 5:16 PM Yonghong Song <yhs@fb.com> wrote:
>>
>>
>>
>> On 7/10/19 1:08 AM, Andrii Nakryiko wrote:
>>> BTF verifier has different logic depending on whether we are following
>>> a PTR or STRUCT/ARRAY (or something else). This is an optimization to
>>> stop early in DFS traversal while resolving BTF types. But it also
>>> results in a size resolution bug, when there is a chain, e.g., of PTR ->
>>> TYPEDEF -> ARRAY, in which case due to being in pointer context ARRAY
>>> size won't be resolved, as it is considered to be a sink for pointer,
>>> leading to TYPEDEF being in RESOLVED state with zero size, which is
>>> completely wrong.
>>>
>>> Optimization is doubtful, though, as btf_check_all_types() will iterate
>>> over all BTF types anyways, so the only saving is a potentially slightly
>>> shorter stack. But correctness is more important than tiny savings.
>>>
>>> This bug manifests itself in rejecting BTF-defined maps that use array
>>> typedef as a value type:
>>>
>>> typedef int array_t[16];
>>>
>>> struct {
>>>        __uint(type, BPF_MAP_TYPE_ARRAY);
>>>        __type(value, array_t); /* i.e., array_t *value; */
>>> } test_map SEC(".maps");
>>>
>>> Fixes: eb3f595dab40 ("bpf: btf: Validate type reference")
>>> Cc: Martin KaFai Lau <kafai@fb.com>
>>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>>
>> The change seems okay to me. Currently, looks like intermediate
>> modifier type will carry size = 0 (in the internal data structure).
> 
> Yes, which is totally wrong, especially since we use that size in some
> cases to reject a map with the specified BTF.
> 
>>
>> If we remove RESOLVE logic, we probably want to double check
>> whether we handle circular types correctly or not. Maybe we will
>> be okay if all self tests pass.
> 
> I checked, it does. We'll attempt to add the referenced type unless it's a
> "resolve sink" (where the size is immediately known) or is already
> resolved (its state is RESOLVED). In other cases, we'll attempt to
> env_stack_push(), which checks that the state of that type is
> NOT_VISITED. If it's RESOLVED or VISITED, it returns -EEXIST. When a
> type is added to the stack, its resolve state goes from NOT_VISITED
> to VISITED.
> 
> So, if there is a loop, we'll detect it as soon as we attempt
> to add the same type onto the stack a second time.
> 
>>
>> It may still be worthwhile to qualify the RESOLVE optimization benefit
>> before removing it.
> 
> I don't think there is any, because every type will be visited exactly
> once, due to the DFS nature of the algorithm. The only difference is that if
> we have a long chain of modifiers, we can technically reach the max
> limit and fail. But at 32 I think it's pretty unrealistic to have such
> a long chain of PTR/TYPEDEF/CONST/VOLATILE/RESTRICTs :)
> 
>>
>> Another possible change is, for external usage, removing
>> modifiers, before checking the size, something like below.
>> Note that I am not strongly advocating my below patch as
>> it has the same shortcoming that maintained modifier type
>> size may not be correct.
> 
> I don't think your patch helps; it can actually confuse things even
> more. It skips modifiers until the underlying type is found, but you still
> don't guarantee that the underlying type will have its size resolved
> at that point.

It actually does help. It does not change the internal btf type
traversal algorithms. It only changes the implementation of
an external API, btf_type_id_size(). Previously, this function
was used both by external callers and internally by btf.c. I split
it into two: an internal __btf_type_id_size() and an external
btf_type_id_size(). The external one skips modifiers before
finding the type size, and is typically used only
after the btf is validated.

Will go through your other comments later.

> 
>>
>> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
>> index 546ebee39e2a..6f927c3e0a89 100644
>> --- a/kernel/bpf/btf.c
>> +++ b/kernel/bpf/btf.c
>> @@ -620,6 +620,54 @@ static bool btf_type_int_is_regular(const struct
>> btf_type *t)
>>           return true;
>>    }
>>
>> +static const struct btf_type *__btf_type_id_size(const struct btf *btf,
>> +                                                u32 *type_id, u32
>> *ret_size,
>> +                                                bool skip_modifier)
>> +{
>> +       const struct btf_type *size_type;
>> +       u32 size_type_id = *type_id;
>> +       u32 size = 0;
>> +
>> +       size_type = btf_type_by_id(btf, size_type_id);
>> +       if (size_type && skip_modifier) {
>> +               while (btf_type_is_modifier(size_type))
>> +                       size_type = btf_type_by_id(btf, size_type->type);
>> +       }
>> +
>> +       if (btf_type_nosize_or_null(size_type))
>> +               return NULL;
>> +
>> +       if (btf_type_has_size(size_type)) {
>> +               size = size_type->size;
>> +       } else if (btf_type_is_array(size_type)) {
>> +               size = btf->resolved_sizes[size_type_id];
>> +       } else if (btf_type_is_ptr(size_type)) {
>> +               size = sizeof(void *);
>> +       } else {
>> +               if (WARN_ON_ONCE(!btf_type_is_modifier(size_type) &&
>> +                                !btf_type_is_var(size_type)))
>> +                       return NULL;
>> +
>> +               size = btf->resolved_sizes[size_type_id];
>> +               size_type_id = btf->resolved_ids[size_type_id];
>> +               size_type = btf_type_by_id(btf, size_type_id);
>> +               if (btf_type_nosize_or_null(size_type))
>> +                       return NULL;
>> +       }
>> +
>> +       *type_id = size_type_id;
>> +       if (ret_size)
>> +               *ret_size = size;
>> +
>> +       return size_type;
>> +}
>> +
[...]

^ permalink raw reply

* Re: [PATCH net-next] net/mlx5e: Provide cb_list pointer when setting up tc block on rep
From: wenxu @ 2019-07-11  0:35 UTC (permalink / raw)
  To: Vlad Buslov, netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, pablo, saeedm
In-Reply-To: <20190710182554.2988-1-vladbu@mellanox.com>


On 2019/7/11 2:25, Vlad Buslov wrote:
> Recent refactoring of tc block offloads infrastructure introduced new
> flow_block_cb_setup_simple() method intended to be used as unified way for
> all drivers to register offload callbacks. However, commit that actually
> extended all users (drivers) with block cb list and provided it to
> flow_block infra missed mlx5 en_rep. This leads to following NULL-pointer
> dereference when creating Qdisc:
>
> [  278.385175] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [  278.393233] #PF: supervisor read access in kernel mode
> [  278.399446] #PF: error_code(0x0000) - not-present page
> [  278.405847] PGD 8000000850e73067 P4D 8000000850e73067 PUD 8620cd067 PMD 0
> [  278.414141] Oops: 0000 [#1] SMP PTI
> [  278.419019] CPU: 7 PID: 3369 Comm: tc Not tainted 5.2.0-rc6+ #492
> [  278.426580] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
> [  278.435853] RIP: 0010:flow_block_cb_setup_simple+0xc4/0x190
> [  278.442953] Code: 10 48 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 49 89 00 48 05 00 01 00 00 49 89 40 08 31 c0 c3 b8 a1 ff ff ff c3 f3 c3 <48> 8b 06 48 39 c6 75 0a eb 1a 48 8b 00 48 39 c6 74 12 48 3b 50 28
> [  278.464829] RSP: 0018:ffffaf07c3f97990 EFLAGS: 00010246
> [  278.471648] RAX: 0000000000000000 RBX: ffff9b43ed4c7680 RCX: ffff9b43d5f80840
> [  278.480408] RDX: ffffffffc0491650 RSI: 0000000000000000 RDI: ffffaf07c3f97998
> [  278.489110] RBP: ffff9b43ddff9000 R08: ffff9b43d5f80840 R09: 0000000000000001
> [  278.497838] R10: 0000000000000009 R11: 00000000000003ad R12: ffffaf07c3f97c08
> [  278.506595] R13: ffff9b43d5f80000 R14: ffff9b43ed4c7680 R15: ffff9b43dfa20b40
> [  278.515374] FS:  00007f796be1b400(0000) GS:ffff9b43ef840000(0000) knlGS:0000000000000000
> [  278.525099] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  278.532453] CR2: 0000000000000000 CR3: 0000000840398002 CR4: 00000000001606e0
> [  278.541197] Call Trace:
> [  278.545252]  tcf_block_offload_cmd.isra.52+0x7e/0xb0
> [  278.551871]  tcf_block_get_ext+0x365/0x3e0
> [  278.557569]  qdisc_create+0x15c/0x4e0
> [  278.562859]  ? kmem_cache_alloc_trace+0x1a2/0x1c0
> [  278.569235]  tc_modify_qdisc+0x1c8/0x780
> [  278.574761]  rtnetlink_rcv_msg+0x291/0x340
> [  278.580518]  ? _cond_resched+0x15/0x40
> [  278.585856]  ? rtnl_calcit.isra.29+0x120/0x120
> [  278.591868]  netlink_rcv_skb+0x4a/0x110
> [  278.597198]  netlink_unicast+0x1a0/0x250
> [  278.602601]  netlink_sendmsg+0x2c1/0x3c0
> [  278.608022]  sock_sendmsg+0x5b/0x60
> [  278.612969]  ___sys_sendmsg+0x289/0x310
> [  278.618231]  ? do_wp_page+0x99/0x730
> [  278.623216]  ? page_add_new_anon_rmap+0xbe/0x140
> [  278.629298]  ? __handle_mm_fault+0xc84/0x1360
> [  278.635113]  ? __sys_sendmsg+0x5e/0xa0
> [  278.640285]  __sys_sendmsg+0x5e/0xa0
> [  278.645239]  do_syscall_64+0x5b/0x1b0
> [  278.650274]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [  278.656697] RIP: 0033:0x7f796abdeb87
> [  278.661628] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 80 00 00 00 00 8b 05 6a 2b 2c 00 48 63 d2 48 63 ff 85 c0 75 18 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 f3 c3 0f 1f 80 00 00 00 00 53 48 89 f3 48
> [  278.683248] RSP: 002b:00007ffde213ba48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> [  278.692245] RAX: ffffffffffffffda RBX: 000000005d261e6f RCX: 00007f796abdeb87
> [  278.700862] RDX: 0000000000000000 RSI: 00007ffde213bab0 RDI: 0000000000000003
> [  278.709527] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000006
> [  278.718167] R10: 000000000000000c R11: 0000000000000246 R12: 0000000000000001
> [  278.726743] R13: 000000000067b580 R14: 0000000000000000 R15: 0000000000000000
> [  278.735302] Modules linked in: dummy vxlan ip6_udp_tunnel udp_tunnel sch_ingress nfsv3 nfs_acl nfs lockd grace fscache bridge stp llc sunrpc mlx5_ib ib_uverbs intel_rapl ib_core sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm mlx5_core irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel igb ghash_clmulni_intel ses mei_me enclosure mlxfw ipmi_ssif intel_cstate iTCO_wdt ptp mei pps_core iTCO_vendor_support pcspkr joydev intel_uncore i2c_i801 ipmi_si lpc_ich intel_rapl_perf ioatdma wmi dca pcc_cpufreq ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad ast i2c_algo_bit drm_kms_helper ttm drm mpt3sas raid_class scsi_transport_sas
> [  278.802263] CR2: 0000000000000000
> [  278.807170] ---[ end trace b1f0a442a279e66f ]---
>
> Extend en_rep with new static mlx5e_rep_block_cb_list list and pass it to
> flow_block_cb_setup_simple() function instead of hardcoded NULL pointer.
>
> Fixes: 955bcb6ea0df ("drivers: net: use flow block API")
> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
> index 10ef90a7bddd..7245d287633d 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
> @@ -1175,6 +1175,8 @@ static int mlx5e_rep_setup_tc_cb(enum tc_setup_type type, void *type_data,
>  	}
>  }
>  
> +static LIST_HEAD(mlx5e_rep_block_cb_list);
> +

I don't think an extra LIST_HEAD is necessary; the earlier mlx5e_block_cb_list is fine.

The earlier patch http://patchwork.ozlabs.org/patch/1130439/ is enough.

>  static int mlx5e_rep_setup_tc(struct net_device *dev, enum tc_setup_type type,
>  			      void *type_data)
>  {
> @@ -1182,7 +1184,8 @@ static int mlx5e_rep_setup_tc(struct net_device *dev, enum tc_setup_type type,
>  
>  	switch (type) {
>  	case TC_SETUP_BLOCK:
> -		return flow_block_cb_setup_simple(type_data, NULL,
> +		return flow_block_cb_setup_simple(type_data,
> +						  &mlx5e_rep_block_cb_list,
>  						  mlx5e_rep_setup_tc_cb,
>  						  priv, priv, true);
>  	default:

^ permalink raw reply

* Re: [PATCH bpf] bpf: fix BTF verifier size resolution logic
From: Andrii Nakryiko @ 2019-07-11  0:29 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Andrii Nakryiko, Alexei Starovoitov, daniel@iogearbox.net,
	bpf@vger.kernel.org, netdev@vger.kernel.org, Kernel Team,
	Martin Lau
In-Reply-To: <f6bc7a95-e8e1-eec4-9728-3b9e36b434fa@fb.com>

On Wed, Jul 10, 2019 at 5:16 PM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 7/10/19 1:08 AM, Andrii Nakryiko wrote:
> > BTF verifier has different logic depending on whether we are following
> > a PTR or STRUCT/ARRAY (or something else). This is an optimization to
> > stop early in DFS traversal while resolving BTF types. But it also
> > results in a size resolution bug, when there is a chain, e.g., of PTR ->
> > TYPEDEF -> ARRAY, in which case due to being in pointer context ARRAY
> > size won't be resolved, as it is considered to be a sink for pointer,
> > leading to TYPEDEF being in RESOLVED state with zero size, which is
> > completely wrong.
> >
> > Optimization is doubtful, though, as btf_check_all_types() will iterate
> > over all BTF types anyways, so the only saving is a potentially slightly
> > shorter stack. But correctness is more important than tiny savings.
> >
> > This bug manifests itself in rejecting BTF-defined maps that use array
> > typedef as a value type:
> >
> > typedef int array_t[16];
> >
> > struct {
> >       __uint(type, BPF_MAP_TYPE_ARRAY);
> >       __type(value, array_t); /* i.e., array_t *value; */
> > } test_map SEC(".maps");
> >
> > Fixes: eb3f595dab40 ("bpf: btf: Validate type reference")
> > Cc: Martin KaFai Lau <kafai@fb.com>
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>
> The change seems okay to me. Currently, looks like intermediate
> modifier type will carry size = 0 (in the internal data structure).

Yes, which is totally wrong, especially since we use that size in some
cases to reject a map with the specified BTF.

>
> If we remove RESOLVE logic, we probably want to double check
> whether we handle circular types correctly or not. Maybe we will
> be okay if all self tests pass.

I checked, it does. We'll attempt to add the referenced type unless it's a
"resolve sink" (where the size is immediately known) or is already
resolved (its state is RESOLVED). In other cases, we'll attempt to
env_stack_push(), which checks that the state of that type is
NOT_VISITED. If it's RESOLVED or VISITED, it returns -EEXIST. When a
type is added to the stack, its resolve state goes from NOT_VISITED
to VISITED.

So, if there is a loop, we'll detect it as soon as we attempt
to add the same type onto the stack a second time.
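
To make the state transitions above concrete, here is a small standalone
sketch of the same idea. It is not the btf.c code: struct env, stack_push(),
resolve(), NR_TYPES and NO_REF are hypothetical names for this example,
every type references at most one other type, and a type with no reference
plays the role of a "resolve sink". Type ids are assumed to be below
NR_TYPES.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

enum visit_state { NOT_VISITED, VISITED, RESOLVED };

#define NR_TYPES  8
#define MAX_DEPTH 32
#define NO_REF    UINT32_MAX	/* type references nothing: a sink */

struct env {
	uint32_t ref[NR_TYPES];		/* id of the referenced type */
	enum visit_state state[NR_TYPES];
	uint32_t stack[MAX_DEPTH];
	int top;
};

/* Push id; fails if the stack is full or id was already seen. */
static int stack_push(struct env *e, uint32_t id)
{
	if (e->top == MAX_DEPTH)
		return -E2BIG;
	if (e->state[id] != NOT_VISITED)
		return -EEXIST;		/* already on the stack: a loop */
	e->state[id] = VISITED;
	e->stack[e->top++] = id;
	return 0;
}

/* Iterative DFS from id; returns 0, -EEXIST on a loop, -E2BIG if too deep. */
static int resolve(struct env *e, uint32_t id)
{
	int err;

	assert(id < NR_TYPES);
	if (e->state[id] == RESOLVED)
		return 0;

	e->top = 0;
	err = stack_push(e, id);
	if (err)
		return err;

	while (e->top) {
		uint32_t cur = e->stack[e->top - 1];
		uint32_t next = e->ref[cur];

		if (next != NO_REF && e->state[next] != RESOLVED) {
			err = stack_push(e, next);
			if (err)
				return err;
			continue;
		}
		/* Referenced type (if any) is done, so cur is done too. */
		e->state[cur] = RESOLVED;
		e->top--;
	}
	return 0;
}
```

A chain such as 0 -> 1 -> 2 resolves bottom-up, while 3 -> 4 -> 3 fails on
the second push of type 3, because it is still VISITED at that point.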

>
> It may still be worthwhile to qualify the RESOLVE optimization benefit
> before removing it.

I don't think there is any, because every type will be visited exactly
once, due to the DFS nature of the algorithm. The only difference is that if
we have a long chain of modifiers, we can technically reach the max
limit and fail. But at 32 I think it's pretty unrealistic to have such
a long chain of PTR/TYPEDEF/CONST/VOLATILE/RESTRICTs :)

>
> Another possible change is, for external usage, removing
> modifiers, before checking the size, something like below.
> Note that I am not strongly advocating my below patch as
> it has the same shortcoming that maintained modifier type
> size may not be correct.

I don't think your patch helps; it can actually confuse things even
more. It skips modifiers until the underlying type is found, but you still
don't guarantee that the underlying type will have its size resolved
at that point.

>
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 546ebee39e2a..6f927c3e0a89 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -620,6 +620,54 @@ static bool btf_type_int_is_regular(const struct
> btf_type *t)
>          return true;
>   }
>
> +static const struct btf_type *__btf_type_id_size(const struct btf *btf,
> +                                                u32 *type_id, u32
> *ret_size,
> +                                                bool skip_modifier)
> +{
> +       const struct btf_type *size_type;
> +       u32 size_type_id = *type_id;
> +       u32 size = 0;
> +
> +       size_type = btf_type_by_id(btf, size_type_id);
> +       if (size_type && skip_modifier) {
> +               while (btf_type_is_modifier(size_type))
> +                       size_type = btf_type_by_id(btf, size_type->type);
> +       }
> +
> +       if (btf_type_nosize_or_null(size_type))
> +               return NULL;
> +
> +       if (btf_type_has_size(size_type)) {
> +               size = size_type->size;
> +       } else if (btf_type_is_array(size_type)) {
> +               size = btf->resolved_sizes[size_type_id];
> +       } else if (btf_type_is_ptr(size_type)) {
> +               size = sizeof(void *);
> +       } else {
> +               if (WARN_ON_ONCE(!btf_type_is_modifier(size_type) &&
> +                                !btf_type_is_var(size_type)))
> +                       return NULL;
> +
> +               size = btf->resolved_sizes[size_type_id];
> +               size_type_id = btf->resolved_ids[size_type_id];
> +               size_type = btf_type_by_id(btf, size_type_id);
> +               if (btf_type_nosize_or_null(size_type))
> +                       return NULL;
> +       }
> +
> +       *type_id = size_type_id;
> +       if (ret_size)
> +               *ret_size = size;
> +
> +       return size_type;
> +}
> +
> +const struct btf_type *btf_type_id_size(const struct btf *btf,
> +                                       u32 *type_id, u32 *ret_size)
> +{
> +       return __btf_type_id_size(btf, type_id, ret_size, true);
> +}
> +
>   /*
>    * Check that given struct member is a regular int with expected
>    * offset and size.
> @@ -633,7 +681,7 @@ bool btf_member_is_reg_int(const struct btf *btf,
> const struct btf_type *s,
>          u8 nr_bits;
>
>          id = m->type;
> -       t = btf_type_id_size(btf, &id, NULL);
> +       t = __btf_type_id_size(btf, &id, NULL, false);
>          if (!t || !btf_type_is_int(t))
>                  return false;
>
> @@ -1051,42 +1099,6 @@ static const struct btf_type
> *btf_type_id_resolve(const struct btf *btf,
>          return btf_type_by_id(btf, *type_id);
>   }
>
> -const struct btf_type *btf_type_id_size(const struct btf *btf,
> -                                       u32 *type_id, u32 *ret_size)
> -{
> -       const struct btf_type *size_type;
> -       u32 size_type_id = *type_id;
> -       u32 size = 0;
> -
> -       size_type = btf_type_by_id(btf, size_type_id);
> -       if (btf_type_nosize_or_null(size_type))
> -               return NULL;
> -
> -       if (btf_type_has_size(size_type)) {
> -               size = size_type->size;
> -       } else if (btf_type_is_array(size_type)) {
> -               size = btf->resolved_sizes[size_type_id];
> -       } else if (btf_type_is_ptr(size_type)) {
> -               size = sizeof(void *);
> -       } else {
> -               if (WARN_ON_ONCE(!btf_type_is_modifier(size_type) &&
> -                                !btf_type_is_var(size_type)))
> -                       return NULL;
> -
> -               size = btf->resolved_sizes[size_type_id];
> -               size_type_id = btf->resolved_ids[size_type_id];
> -               size_type = btf_type_by_id(btf, size_type_id);
> -               if (btf_type_nosize_or_null(size_type))
> -                       return NULL;
> -       }
> -
> -       *type_id = size_type_id;
> -       if (ret_size)
> -               *ret_size = size;
> -
> -       return size_type;
> -}
> -
>   static int btf_df_check_member(struct btf_verifier_env *env,
>                                 const struct btf_type *struct_type,
>                                 const struct btf_member *member,
> @@ -1489,7 +1501,7 @@ static int btf_modifier_check_member(struct
> btf_verifier_env *env,
>          struct btf_member resolved_member;
>          struct btf *btf = env->btf;
>
> -       resolved_type = btf_type_id_size(btf, &resolved_type_id, NULL);
> +       resolved_type = __btf_type_id_size(btf, &resolved_type_id, NULL,
> false);
>          if (!resolved_type) {
>                  btf_verifier_log_member(env, struct_type, member,
>                                          "Invalid member");
> @@ -1514,7 +1526,7 @@ static int btf_modifier_check_kflag_member(struct
> btf_verifier_env *env,
>          struct btf_member resolved_member;
>          struct btf *btf = env->btf;
>
> -       resolved_type = btf_type_id_size(btf, &resolved_type_id, NULL);
> +       resolved_type = __btf_type_id_size(btf, &resolved_type_id, NULL,
> false);
>          if (!resolved_type) {
>                  btf_verifier_log_member(env, struct_type, member,
>                                          "Invalid member");
> @@ -1620,7 +1632,7 @@ static int btf_modifier_resolve(struct
> btf_verifier_env *env,
>           * save us a few type-following when we use it later (e.g. in
>           * pretty print).
>           */
> -       if (!btf_type_id_size(btf, &next_type_id, &next_type_size)) {
> +       if (!__btf_type_id_size(btf, &next_type_id, &next_type_size,
> false)) {
>                  if (env_type_is_resolved(env, next_type_id))
>                          next_type = btf_type_id_resolve(btf,
> &next_type_id);
>
> @@ -1675,7 +1687,7 @@ static int btf_var_resolve(struct btf_verifier_env
> *env,
>           * forward types or similar that would resolve to size of
>           * zero is allowed.
>           */
> -       if (!btf_type_id_size(btf, &next_type_id, &next_type_size)) {
> +       if (!__btf_type_id_size(btf, &next_type_id, &next_type_size,
> false)) {
>                  btf_verifier_log_type(env, v->t, "Invalid type_id");
>                  return -EINVAL;
>          }
> @@ -1725,7 +1737,7 @@ static int btf_ptr_resolve(struct btf_verifier_env
> *env,
>                                                resolved_type_id);
>          }
>
> -       if (!btf_type_id_size(btf, &next_type_id, NULL)) {
> +       if (!__btf_type_id_size(btf, &next_type_id, NULL, false)) {
>                  if (env_type_is_resolved(env, next_type_id))
>                          next_type = btf_type_id_resolve(btf,
> &next_type_id);
>
> @@ -1851,7 +1863,7 @@ static int btf_array_check_member(struct
> btf_verifier_env *env,
>          }
>
>          array_type_id = member->type;
> -       btf_type_id_size(btf, &array_type_id, &array_size);
> +       __btf_type_id_size(btf, &array_type_id, &array_size, false);
>          struct_size = struct_type->size;
>          bytes_offset = BITS_ROUNDDOWN_BYTES(struct_bits_off);
>          if (struct_size - bytes_offset < array_size) {
> @@ -1938,7 +1950,7 @@ static int btf_array_resolve(struct
> btf_verifier_env *env,
>              !env_type_is_resolved(env, index_type_id))
>                  return env_stack_push(env, index_type, index_type_id);
>
> -       index_type = btf_type_id_size(btf, &index_type_id, NULL);
> +       index_type = __btf_type_id_size(btf, &index_type_id, NULL, false);
>          if (!index_type || !btf_type_is_int(index_type) ||
>              !btf_type_int_is_regular(index_type)) {
>                  btf_verifier_log_type(env, v->t, "Invalid index");
> @@ -1959,7 +1971,7 @@ static int btf_array_resolve(struct
> btf_verifier_env *env,
>              !env_type_is_resolved(env, elem_type_id))
>                  return env_stack_push(env, elem_type, elem_type_id);
>
> -       elem_type = btf_type_id_size(btf, &elem_type_id, &elem_size);
> +       elem_type = __btf_type_id_size(btf, &elem_type_id, &elem_size,
> false);
>          if (!elem_type) {
>                  btf_verifier_log_type(env, v->t, "Invalid elem");
>                  return -EINVAL;
> @@ -2000,7 +2012,7 @@ static void btf_array_seq_show(const struct btf
> *btf, const struct btf_type *t,
>          u32 i, elem_size, elem_type_id;
>
>          elem_type_id = array->type;
> -       elem_type = btf_type_id_size(btf, &elem_type_id, &elem_size);
> +       elem_type = __btf_type_id_size(btf, &elem_type_id, &elem_size,
> false);
>          elem_ops = btf_type_ops(elem_type);
>          seq_puts(m, "[");
>          for (i = 0; i < array->nelems; i++) {
> @@ -2732,7 +2744,7 @@ static int btf_datasec_resolve(struct
> btf_verifier_env *env,
>                  }
>
>                  type_id = var_type->type;
> -               if (!btf_type_id_size(btf, &type_id, &type_size)) {
> +               if (!__btf_type_id_size(btf, &type_id, &type_size, false)) {
>                          btf_verifier_log_vsi(env, v->t, vsi, "Invalid
> type");
>                          return -EINVAL;
>                  }
> @@ -2813,7 +2825,7 @@ static int btf_func_proto_check(struct
> btf_verifier_env *env,
>                  }
>
>                  /* Ensure the return type is a type that has a size */
> -               if (!btf_type_id_size(btf, &ret_type_id, NULL)) {
> +               if (!__btf_type_id_size(btf, &ret_type_id, NULL, false)) {
>                          btf_verifier_log_type(env, t, "Invalid return
> type");
>                          return -EINVAL;
>                  }
> @@ -2861,7 +2873,7 @@ static int btf_func_proto_check(struct
> btf_verifier_env *env,
>                                  break;
>                  }
>
> -               if (!btf_type_id_size(btf, &arg_type_id, NULL)) {
> +               if (!__btf_type_id_size(btf, &arg_type_id, NULL, false)) {
>                          btf_verifier_log_type(env, t, "Invalid arg#%u",
> i + 1);
>                          err = -EINVAL;
>                          break;
> @@ -3014,7 +3026,7 @@ static bool btf_resolve_valid(struct
> btf_verifier_env *env,
>                  u32 elem_type_id = array->type;
>                  u32 elem_size;
>
> -               elem_type = btf_type_id_size(btf, &elem_type_id,
> &elem_size);
> +               elem_type = __btf_type_id_size(btf, &elem_type_id,
> &elem_size, false);
>                  return elem_type && !btf_type_is_modifier(elem_type) &&
>                          (array->nelems * elem_size ==
>                           btf->resolved_sizes[type_id]);
>
>
> > ---
> >   kernel/bpf/btf.c | 42 +++---------------------------------------
> >   1 file changed, 3 insertions(+), 39 deletions(-)
> >
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index cad09858a5f2..c68c7e73b0d1 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -231,14 +231,6 @@ enum visit_state {
> >       RESOLVED,
> >   };
> >
> > -enum resolve_mode {
> > -     RESOLVE_TBD,    /* To Be Determined */
> > -     RESOLVE_PTR,    /* Resolving for Pointer */
> > -     RESOLVE_STRUCT_OR_ARRAY,        /* Resolving for struct/union
> > -                                      * or array
> > -                                      */
> > -};
> > -
> >   #define MAX_RESOLVE_DEPTH 32
> >
> >   struct btf_sec_info {
> > @@ -254,7 +246,6 @@ struct btf_verifier_env {
> >       u32 log_type_id;
> >       u32 top_stack;
> >       enum verifier_phase phase;
> > -     enum resolve_mode resolve_mode;
> >   };
> >
> >   static const char * const btf_kind_str[NR_BTF_KINDS] = {
> > @@ -964,26 +955,7 @@ static void btf_verifier_env_free(struct btf_verifier_env *env)
> >   static bool env_type_is_resolve_sink(const struct btf_verifier_env *env,
> >                                    const struct btf_type *next_type)
> >   {
> > -     switch (env->resolve_mode) {
> > -     case RESOLVE_TBD:
> > -             /* int, enum or void is a sink */
> > -             return !btf_type_needs_resolve(next_type);
> > -     case RESOLVE_PTR:
> > -             /* int, enum, void, struct, array, func or func_proto is a sink
> > -              * for ptr
> > -              */
> > -             return !btf_type_is_modifier(next_type) &&
> > -                     !btf_type_is_ptr(next_type);
> > -     case RESOLVE_STRUCT_OR_ARRAY:
> > -             /* int, enum, void, ptr, func or func_proto is a sink
> > -              * for struct and array
> > -              */
> > -             return !btf_type_is_modifier(next_type) &&
> > -                     !btf_type_is_array(next_type) &&
> > -                     !btf_type_is_struct(next_type);
> > -     default:
> > -             BUG();
> > -     }
> > +     return !btf_type_needs_resolve(next_type);
> >   }
> >
> >   static bool env_type_is_resolved(const struct btf_verifier_env *env,
> > @@ -1010,13 +982,6 @@ static int env_stack_push(struct btf_verifier_env *env,
> >       v->type_id = type_id;
> >       v->next_member = 0;
> >
> > -     if (env->resolve_mode == RESOLVE_TBD) {
> > -             if (btf_type_is_ptr(t))
> > -                     env->resolve_mode = RESOLVE_PTR;
> > -             else if (btf_type_is_struct(t) || btf_type_is_array(t))
> > -                     env->resolve_mode = RESOLVE_STRUCT_OR_ARRAY;
> > -     }
> > -
> >       return 0;
> >   }
> >
> > @@ -1038,7 +1003,7 @@ static void env_stack_pop_resolved(struct btf_verifier_env *env,
> >       env->visit_states[type_id] = RESOLVED;
> >   }
> >
> > -static const struct resolve_vertex *env_stack_peak(struct btf_verifier_env *env)
> > +static const struct resolve_vertex *env_stack_peek(struct btf_verifier_env *env)
> >   {
> >       return env->top_stack ? &env->stack[env->top_stack - 1] : NULL;
> >   }
> > @@ -3030,9 +2995,8 @@ static int btf_resolve(struct btf_verifier_env *env,
> >       const struct resolve_vertex *v;
> >       int err = 0;
> >
> > -     env->resolve_mode = RESOLVE_TBD;
> >       env_stack_push(env, t, type_id);
> > -     while (!err && (v = env_stack_peak(env))) {
> > +     while (!err && (v = env_stack_peek(env))) {
> >               env->log_type_id = v->type_id;
> >               err = btf_type_ops(v->t)->resolve(env, v);
> >       }
> >

^ permalink raw reply

* Re: [PATCH] ipvs: remove unnecessary space
From: Joe Perches @ 2019-07-11  0:22 UTC (permalink / raw)
  To: Simon Horman, yangxingwu, Pablo Neira Ayuso
  Cc: wensong, ja, kadlec, fw, davem, netdev, lvs-devel,
	netfilter-devel, coreteam, linux-kernel
In-Reply-To: <20190710080609.smxjqe2d5jyro4hv@verge.net.au>

On Wed, 2019-07-10 at 10:06 +0200, Simon Horman wrote:
> On Wed, Jul 10, 2019 at 03:45:52PM +0800, yangxingwu wrote:
> > this patch removes the extra space.
> > 
> > Signed-off-by: yangxingwu <xingwu.yang@gmail.com>
> 
> Thanks, this looks good to me.
> 
> Acked-by: Simon Horman <horms@verge.net.au>
> 
> Pablo, please consider including this in nf-next.
> 
> 
> > ---
> >  net/netfilter/ipvs/ip_vs_mh.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/net/netfilter/ipvs/ip_vs_mh.c b/net/netfilter/ipvs/ip_vs_mh.c
> > index 94d9d34..98e358e 100644
> > --- a/net/netfilter/ipvs/ip_vs_mh.c
> > +++ b/net/netfilter/ipvs/ip_vs_mh.c
> > @@ -174,8 +174,8 @@ static int ip_vs_mh_populate(struct ip_vs_mh_state *s,
> >  		return 0;
> >  	}
> >  
> > -	table =  kcalloc(BITS_TO_LONGS(IP_VS_MH_TAB_SIZE),
> > -			 sizeof(unsigned long), GFP_KERNEL);
> > +	table =	kcalloc(BITS_TO_LONGS(IP_VS_MH_TAB_SIZE),
> > +			sizeof(unsigned long), GFP_KERNEL);

bitmap_alloc?
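
Since kcalloc() returns zeroed memory, the closest match among the
kernel bitmap helpers would probably be bitmap_zalloc() rather than
plain bitmap_alloc(). As a rough userspace sketch of what such a helper
wraps (an illustration only: all *_sketch names are hypothetical and
calloc() stands in for kcalloc()):

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_LONG_SKETCH	(sizeof(unsigned long) * CHAR_BIT)
/* Number of longs needed to hold nbits bits, rounded up. */
#define BITS_TO_LONGS_SKETCH(nbits) \
	(((nbits) + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH)

/* Allocate a zeroed bitmap of nbits bits; release it with free(). */
static unsigned long *bitmap_zalloc_sketch(size_t nbits)
{
	return calloc(BITS_TO_LONGS_SKETCH(nbits), sizeof(unsigned long));
}

static void set_bit_sketch(unsigned long *map, size_t bit)
{
	assert(map != NULL);
	map[bit / BITS_PER_LONG_SKETCH] |= 1UL << (bit % BITS_PER_LONG_SKETCH);
}

static int test_bit_sketch(const unsigned long *map, size_t bit)
{
	assert(map != NULL);
	return (map[bit / BITS_PER_LONG_SKETCH] >>
		(bit % BITS_PER_LONG_SKETCH)) & 1UL;
}
```

The caller then only passes the number of bits and no longer spells out
BITS_TO_LONGS() and sizeof(unsigned long) by hand.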

> >  	if (!table)
> >  		return -ENOMEM;
> >  
> > -- 
> > 1.8.3.1
> > 


^ permalink raw reply

* Re: [PATCH net-next,v4 12/12] netfilter: nf_tables: add hardware offload support
From: Pablo Neira Ayuso @ 2019-07-11  0:18 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, thomas.lendacky, f.fainelli, ariel.elior,
	michael.chan, madalin.bucur, yisen.zhuang, salil.mehta,
	jeffrey.t.kirsher, tariqt, saeedm, jiri, idosch, jakub.kicinski,
	peppe.cavallaro, grygorii.strashko, andrew, vivien.didelot,
	alexandre.torgue, joabreu, linux-net-drivers, ogerlitz,
	Manish.Chopra, marcelo.leitner, mkubecek, venkatkumar.duvvuru,
	maxime.chevallier, cphealy, phil, netfilter-devel
In-Reply-To: <20190710075227.GA4362@nanopsycho>

On Wed, Jul 10, 2019 at 09:52:27AM +0200, Jiri Pirko wrote:
> Tue, Jul 09, 2019 at 10:55:50PM CEST, pablo@netfilter.org wrote:
> 
> [...]
> 
> >+	if (!dev || !dev->netdev_ops->ndo_setup_tc)
> 
> Why didn't you rename ndo_setup_tc? I put a comment about it in the
> previous version thread. I expect that you can at least write why it is
> a wrong idea.

This is a good idea. It so happened that I read this email only once my
new patch series was ready. I will follow up with a patch to address this
rename as soon as the bug fixes are sorted out.

Thanks.

^ permalink raw reply

* Re: [PATCH bpf] bpf: fix BTF verifier size resolution logic
From: Yonghong Song @ 2019-07-11  0:16 UTC (permalink / raw)
  To: Andrii Nakryiko, andrii.nakryiko@gmail.com, Alexei Starovoitov,
	daniel@iogearbox.net, bpf@vger.kernel.org, netdev@vger.kernel.org,
	Kernel Team
  Cc: Martin Lau
In-Reply-To: <20190710080840.2613160-1-andriin@fb.com>



On 7/10/19 1:08 AM, Andrii Nakryiko wrote:
> BTF verifier has different logic depending on whether we are following
> a PTR or STRUCT/ARRAY (or something else). This is an optimization to
> stop early in DFS traversal while resolving BTF types. But it also
> results in a size resolution bug, when there is a chain, e.g., of PTR ->
> TYPEDEF -> ARRAY, in which case due to being in pointer context ARRAY
> size won't be resolved, as it is considered to be a sink for pointer,
> leading to TYPEDEF being in RESOLVED state with zero size, which is
> completely wrong.
> 
> Optimization is doubtful, though, as btf_check_all_types() will iterate
> over all BTF types anyways, so the only saving is a potentially slightly
> shorter stack. But correctness is more important than tiny savings.
> 
> This bug manifests itself in rejecting BTF-defined maps that use array
> typedef as a value type:
> 
> typedef int array_t[16];
> 
> struct {
> 	__uint(type, BPF_MAP_TYPE_ARRAY);
> 	__type(value, array_t); /* i.e., array_t *value; */
> } test_map SEC(".maps");
> 
> Fixes: eb3f595dab40 ("bpf: btf: Validate type reference")
> Cc: Martin KaFai Lau <kafai@fb.com>
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

The change seems okay to me. Currently, it looks like an intermediate
modifier type will carry size = 0 (in the internal data structure).
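
To make the size = 0 case concrete, here is a simplified userspace
model of the PTR -> TYPEDEF -> ARRAY chain from the commit message. It
is only a sketch under assumptions: every *_sketch name is invented, the
resolver is recursive rather than stack based, only the pointer-context
shortcut is modeled, and the type order (PTR first) is chosen so that
the pointer is resolved before the array.

```c
#include <assert.h>
#include <stddef.h>

enum kind_sketch { K_INT, K_PTR, K_TYPEDEF, K_ARRAY };

struct btype_sketch {
	enum kind_sketch kind;
	int ref;		/* referenced type index, -1 for K_INT */
	unsigned int nelems;	/* only meaningful for K_ARRAY */
	unsigned int size;	/* preset for K_INT, computed otherwise */
	int resolved;
};

/* With ptr_shortcut set, an array under a pointer counts as a sink. */
static int is_sink_sketch(const struct btype_sketch *t, int ptr_shortcut,
			  int in_ptr)
{
	if (t->kind == K_INT)
		return 1;
	if (ptr_shortcut && in_ptr)
		return t->kind != K_TYPEDEF && t->kind != K_PTR;
	return 0;
}

static void resolve_sketch(struct btype_sketch *types, int id,
			   int ptr_shortcut, int in_ptr)
{
	struct btype_sketch *t = &types[id];
	struct btype_sketch *next;

	assert(types != NULL);
	if (t->resolved)
		return;
	if (t->kind == K_INT) {
		t->resolved = 1;
		return;
	}
	if (t->kind == K_PTR)
		in_ptr = 1;

	next = &types[t->ref];
	if (!next->resolved && !is_sink_sketch(next, ptr_shortcut, in_ptr))
		resolve_sketch(types, t->ref, ptr_shortcut, in_ptr);

	if (t->kind == K_PTR)
		t->size = sizeof(void *);
	else if (t->kind == K_TYPEDEF)
		t->size = next->size;	/* whatever is recorded so far */
	else
		t->size = t->nelems * next->size;
	t->resolved = 1;
}

/* Resolve "typedef int array_t[16]; array_t *p;" and report the
 * size recorded for the typedef.
 */
static unsigned int typedef_size_sketch(int ptr_shortcut)
{
	struct btype_sketch types[4] = {
		{ K_PTR,     1,  0, 0, 0 },
		{ K_TYPEDEF, 2,  0, 0, 0 },
		{ K_ARRAY,   3, 16, 0, 0 },
		{ K_INT,    -1,  0, 4, 0 },
	};

	for (int i = 0; i < 4; i++)
		resolve_sketch(types, i, ptr_shortcut, 0);
	return types[1].size;
}
```

In this model the typedef ends up with size 0 when the shortcut is on,
because it is marked resolved before the array is, and with size 64
when every non-leaf type is resolved first.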

If we remove RESOLVE logic, we probably want to double check
whether we handle circular types correctly or not. Maybe we will
be okay if all self tests pass.

It may still be worthwhile to qualify the RESOLVE optimization benefit
before removing it.

Another possible change is, for external usage, removing
modifiers, before checking the size, something like below.
Note that I am not strongly advocating my below patch as
it has the same shortcoming that maintained modifier type
size may not be correct.

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 546ebee39e2a..6f927c3e0a89 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -620,6 +620,54 @@ static bool btf_type_int_is_regular(const struct btf_type *t)
         return true;
  }

+static const struct btf_type *__btf_type_id_size(const struct btf *btf,
+                                                u32 *type_id, u32 *ret_size,
+                                                bool skip_modifier)
+{
+       const struct btf_type *size_type;
+       u32 size_type_id = *type_id;
+       u32 size = 0;
+
+       size_type = btf_type_by_id(btf, size_type_id);
+       if (size_type && skip_modifier) {
+               while (btf_type_is_modifier(size_type))
+                       size_type = btf_type_by_id(btf, size_type->type);
+       }
+
+       if (btf_type_nosize_or_null(size_type))
+               return NULL;
+
+       if (btf_type_has_size(size_type)) {
+               size = size_type->size;
+       } else if (btf_type_is_array(size_type)) {
+               size = btf->resolved_sizes[size_type_id];
+       } else if (btf_type_is_ptr(size_type)) {
+               size = sizeof(void *);
+       } else {
+               if (WARN_ON_ONCE(!btf_type_is_modifier(size_type) &&
+                                !btf_type_is_var(size_type)))
+                       return NULL;
+
+               size = btf->resolved_sizes[size_type_id];
+               size_type_id = btf->resolved_ids[size_type_id];
+               size_type = btf_type_by_id(btf, size_type_id);
+               if (btf_type_nosize_or_null(size_type))
+                       return NULL;
+       }
+
+       *type_id = size_type_id;
+       if (ret_size)
+               *ret_size = size;
+
+       return size_type;
+}
+
+const struct btf_type *btf_type_id_size(const struct btf *btf,
+                                       u32 *type_id, u32 *ret_size)
+{
+       return __btf_type_id_size(btf, type_id, ret_size, true);
+}
+
  /*
   * Check that given struct member is a regular int with expected
   * offset and size.
@@ -633,7 +681,7 @@ bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s,
         u8 nr_bits;

         id = m->type;
-       t = btf_type_id_size(btf, &id, NULL);
+       t = __btf_type_id_size(btf, &id, NULL, false);
         if (!t || !btf_type_is_int(t))
                 return false;

@@ -1051,42 +1099,6 @@ static const struct btf_type *btf_type_id_resolve(const struct btf *btf,
         return btf_type_by_id(btf, *type_id);
  }

-const struct btf_type *btf_type_id_size(const struct btf *btf,
-                                       u32 *type_id, u32 *ret_size)
-{
-       const struct btf_type *size_type;
-       u32 size_type_id = *type_id;
-       u32 size = 0;
-
-       size_type = btf_type_by_id(btf, size_type_id);
-       if (btf_type_nosize_or_null(size_type))
-               return NULL;
-
-       if (btf_type_has_size(size_type)) {
-               size = size_type->size;
-       } else if (btf_type_is_array(size_type)) {
-               size = btf->resolved_sizes[size_type_id];
-       } else if (btf_type_is_ptr(size_type)) {
-               size = sizeof(void *);
-       } else {
-               if (WARN_ON_ONCE(!btf_type_is_modifier(size_type) &&
-                                !btf_type_is_var(size_type)))
-                       return NULL;
-
-               size = btf->resolved_sizes[size_type_id];
-               size_type_id = btf->resolved_ids[size_type_id];
-               size_type = btf_type_by_id(btf, size_type_id);
-               if (btf_type_nosize_or_null(size_type))
-                       return NULL;
-       }
-
-       *type_id = size_type_id;
-       if (ret_size)
-               *ret_size = size;
-
-       return size_type;
-}
-
  static int btf_df_check_member(struct btf_verifier_env *env,
                                const struct btf_type *struct_type,
                                const struct btf_member *member,
@@ -1489,7 +1501,7 @@ static int btf_modifier_check_member(struct btf_verifier_env *env,
         struct btf_member resolved_member;
         struct btf *btf = env->btf;

-       resolved_type = btf_type_id_size(btf, &resolved_type_id, NULL);
+       resolved_type = __btf_type_id_size(btf, &resolved_type_id, NULL, false);
         if (!resolved_type) {
                 btf_verifier_log_member(env, struct_type, member,
                                         "Invalid member");
@@ -1514,7 +1526,7 @@ static int btf_modifier_check_kflag_member(struct btf_verifier_env *env,
         struct btf_member resolved_member;
         struct btf *btf = env->btf;

-       resolved_type = btf_type_id_size(btf, &resolved_type_id, NULL);
+       resolved_type = __btf_type_id_size(btf, &resolved_type_id, NULL, false);
         if (!resolved_type) {
                 btf_verifier_log_member(env, struct_type, member,
                                         "Invalid member");
@@ -1620,7 +1632,7 @@ static int btf_modifier_resolve(struct btf_verifier_env *env,
          * save us a few type-following when we use it later (e.g. in
          * pretty print).
          */
-       if (!btf_type_id_size(btf, &next_type_id, &next_type_size)) {
+       if (!__btf_type_id_size(btf, &next_type_id, &next_type_size, false)) {
                 if (env_type_is_resolved(env, next_type_id))
                         next_type = btf_type_id_resolve(btf, &next_type_id);

@@ -1675,7 +1687,7 @@ static int btf_var_resolve(struct btf_verifier_env *env,
          * forward types or similar that would resolve to size of
          * zero is allowed.
          */
-       if (!btf_type_id_size(btf, &next_type_id, &next_type_size)) {
+       if (!__btf_type_id_size(btf, &next_type_id, &next_type_size, false)) {
                 btf_verifier_log_type(env, v->t, "Invalid type_id");
                 return -EINVAL;
         }
@@ -1725,7 +1737,7 @@ static int btf_ptr_resolve(struct btf_verifier_env *env,
                                               resolved_type_id);
         }

-       if (!btf_type_id_size(btf, &next_type_id, NULL)) {
+       if (!__btf_type_id_size(btf, &next_type_id, NULL, false)) {
                 if (env_type_is_resolved(env, next_type_id))
                         next_type = btf_type_id_resolve(btf, &next_type_id);

@@ -1851,7 +1863,7 @@ static int btf_array_check_member(struct btf_verifier_env *env,
         }

         array_type_id = member->type;
-       btf_type_id_size(btf, &array_type_id, &array_size);
+       __btf_type_id_size(btf, &array_type_id, &array_size, false);
         struct_size = struct_type->size;
         bytes_offset = BITS_ROUNDDOWN_BYTES(struct_bits_off);
         if (struct_size - bytes_offset < array_size) {
@@ -1938,7 +1950,7 @@ static int btf_array_resolve(struct btf_verifier_env *env,
             !env_type_is_resolved(env, index_type_id))
                 return env_stack_push(env, index_type, index_type_id);

-       index_type = btf_type_id_size(btf, &index_type_id, NULL);
+       index_type = __btf_type_id_size(btf, &index_type_id, NULL, false);
         if (!index_type || !btf_type_is_int(index_type) ||
             !btf_type_int_is_regular(index_type)) {
                 btf_verifier_log_type(env, v->t, "Invalid index");
@@ -1959,7 +1971,7 @@ static int btf_array_resolve(struct btf_verifier_env *env,
             !env_type_is_resolved(env, elem_type_id))
                 return env_stack_push(env, elem_type, elem_type_id);

-       elem_type = btf_type_id_size(btf, &elem_type_id, &elem_size);
+       elem_type = __btf_type_id_size(btf, &elem_type_id, &elem_size, false);
         if (!elem_type) {
                 btf_verifier_log_type(env, v->t, "Invalid elem");
                 return -EINVAL;
@@ -2000,7 +2012,7 @@ static void btf_array_seq_show(const struct btf *btf, const struct btf_type *t,
         u32 i, elem_size, elem_type_id;

         elem_type_id = array->type;
-       elem_type = btf_type_id_size(btf, &elem_type_id, &elem_size);
+       elem_type = __btf_type_id_size(btf, &elem_type_id, &elem_size, false);
         elem_ops = btf_type_ops(elem_type);
         seq_puts(m, "[");
         for (i = 0; i < array->nelems; i++) {
@@ -2732,7 +2744,7 @@ static int btf_datasec_resolve(struct btf_verifier_env *env,
                 }

                 type_id = var_type->type;
-               if (!btf_type_id_size(btf, &type_id, &type_size)) {
+               if (!__btf_type_id_size(btf, &type_id, &type_size, false)) {
                         btf_verifier_log_vsi(env, v->t, vsi, "Invalid type");
                         return -EINVAL;
                 }
@@ -2813,7 +2825,7 @@ static int btf_func_proto_check(struct btf_verifier_env *env,
                 }

                 /* Ensure the return type is a type that has a size */
-               if (!btf_type_id_size(btf, &ret_type_id, NULL)) {
+               if (!__btf_type_id_size(btf, &ret_type_id, NULL, false)) {
                         btf_verifier_log_type(env, t, "Invalid return type");
                         return -EINVAL;
                 }
@@ -2861,7 +2873,7 @@ static int btf_func_proto_check(struct btf_verifier_env *env,
                                 break;
                 }

-               if (!btf_type_id_size(btf, &arg_type_id, NULL)) {
+               if (!__btf_type_id_size(btf, &arg_type_id, NULL, false)) {
                         btf_verifier_log_type(env, t, "Invalid arg#%u", i + 1);
                         err = -EINVAL;
                         break;
@@ -3014,7 +3026,7 @@ static bool btf_resolve_valid(struct btf_verifier_env *env,
                 u32 elem_type_id = array->type;
                 u32 elem_size;

-               elem_type = btf_type_id_size(btf, &elem_type_id, &elem_size);
+               elem_type = __btf_type_id_size(btf, &elem_type_id, &elem_size, false);
                 return elem_type && !btf_type_is_modifier(elem_type) &&
                         (array->nelems * elem_size ==
                          btf->resolved_sizes[type_id]);


> ---
>   kernel/bpf/btf.c | 42 +++---------------------------------------
>   1 file changed, 3 insertions(+), 39 deletions(-)
> 
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index cad09858a5f2..c68c7e73b0d1 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -231,14 +231,6 @@ enum visit_state {
>   	RESOLVED,
>   };
>   
> -enum resolve_mode {
> -	RESOLVE_TBD,	/* To Be Determined */
> -	RESOLVE_PTR,	/* Resolving for Pointer */
> -	RESOLVE_STRUCT_OR_ARRAY,	/* Resolving for struct/union
> -					 * or array
> -					 */
> -};
> -
>   #define MAX_RESOLVE_DEPTH 32
>   
>   struct btf_sec_info {
> @@ -254,7 +246,6 @@ struct btf_verifier_env {
>   	u32 log_type_id;
>   	u32 top_stack;
>   	enum verifier_phase phase;
> -	enum resolve_mode resolve_mode;
>   };
>   
>   static const char * const btf_kind_str[NR_BTF_KINDS] = {
> @@ -964,26 +955,7 @@ static void btf_verifier_env_free(struct btf_verifier_env *env)
>   static bool env_type_is_resolve_sink(const struct btf_verifier_env *env,
>   				     const struct btf_type *next_type)
>   {
> -	switch (env->resolve_mode) {
> -	case RESOLVE_TBD:
> -		/* int, enum or void is a sink */
> -		return !btf_type_needs_resolve(next_type);
> -	case RESOLVE_PTR:
> -		/* int, enum, void, struct, array, func or func_proto is a sink
> -		 * for ptr
> -		 */
> -		return !btf_type_is_modifier(next_type) &&
> -			!btf_type_is_ptr(next_type);
> -	case RESOLVE_STRUCT_OR_ARRAY:
> -		/* int, enum, void, ptr, func or func_proto is a sink
> -		 * for struct and array
> -		 */
> -		return !btf_type_is_modifier(next_type) &&
> -			!btf_type_is_array(next_type) &&
> -			!btf_type_is_struct(next_type);
> -	default:
> -		BUG();
> -	}
> +	return !btf_type_needs_resolve(next_type);
>   }
>   
>   static bool env_type_is_resolved(const struct btf_verifier_env *env,
> @@ -1010,13 +982,6 @@ static int env_stack_push(struct btf_verifier_env *env,
>   	v->type_id = type_id;
>   	v->next_member = 0;
>   
> -	if (env->resolve_mode == RESOLVE_TBD) {
> -		if (btf_type_is_ptr(t))
> -			env->resolve_mode = RESOLVE_PTR;
> -		else if (btf_type_is_struct(t) || btf_type_is_array(t))
> -			env->resolve_mode = RESOLVE_STRUCT_OR_ARRAY;
> -	}
> -
>   	return 0;
>   }
>   
> @@ -1038,7 +1003,7 @@ static void env_stack_pop_resolved(struct btf_verifier_env *env,
>   	env->visit_states[type_id] = RESOLVED;
>   }
>   
> -static const struct resolve_vertex *env_stack_peak(struct btf_verifier_env *env)
> +static const struct resolve_vertex *env_stack_peek(struct btf_verifier_env *env)
>   {
>   	return env->top_stack ? &env->stack[env->top_stack - 1] : NULL;
>   }
> @@ -3030,9 +2995,8 @@ static int btf_resolve(struct btf_verifier_env *env,
>   	const struct resolve_vertex *v;
>   	int err = 0;
>   
> -	env->resolve_mode = RESOLVE_TBD;
>   	env_stack_push(env, t, type_id);
> -	while (!err && (v = env_stack_peak(env))) {
> +	while (!err && (v = env_stack_peek(env))) {
>   		env->log_type_id = v->type_id;
>   		err = btf_type_ops(v->t)->resolve(env, v);
>   	}
> 

^ permalink raw reply related

* [PATCH net-next 2/3] net: flow_offload: rename tc_setup_cb_t to flow_setup_cb_t
From: Pablo Neira Ayuso @ 2019-07-11  0:12 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, jiri, jakub.kicinski
In-Reply-To: <20190711001235.20686-1-pablo@netfilter.org>

Rename this type definition and adapt users.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
This patch is a dependency for patch 3/3, so that include/net/flow_offload.h
does not need to include include/net/sch_generic.h, which avoids a
circular inclusion.
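
As a rough illustration of the callback type this rename is about, here
is a self-contained userspace sketch. All *_sketch names are
hypothetical stand-ins for the kernel types. The patch itself only
forward-declares enum tc_setup_type, which as far as I know relies on a
GNU C extension, so the sketch defines its enum in full to stay
portable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's enum tc_setup_type. */
enum setup_type_sketch { SETUP_BLOCK_SKETCH, SETUP_CLSFLOWER_SKETCH };

/* A function type (not a pointer type), same shape as flow_setup_cb_t. */
typedef int setup_cb_sketch_t(enum setup_type_sketch type, void *type_data,
			      void *cb_priv);

/* Reduced stand-in for struct flow_block_cb. */
struct block_cb_sketch {
	setup_cb_sketch_t *cb;	/* pointer to a function of that type */
	void *cb_priv;
};

/* Example callback: counts invocations through its private pointer. */
static int count_calls_sketch(enum setup_type_sketch type, void *type_data,
			      void *cb_priv)
{
	(void)type;
	(void)type_data;
	assert(cb_priv != NULL);
	*(int *)cb_priv += 1;
	return 0;
}

/* Invoke the callback if one is set; -1 stands for "nothing to call". */
static int block_cb_run_sketch(const struct block_cb_sketch *b,
			       enum setup_type_sketch type, void *type_data)
{
	return b->cb ? b->cb(type, type_data, b->cb_priv) : -1;
}
```

Because the typedef names a function type, users declare
"setup_cb_sketch_t *cb", which matches the "flow_setup_cb_t *cb"
declarations in the hunks below.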

 drivers/net/ethernet/mellanox/mlxsw/spectrum.c |  2 +-
 drivers/net/ethernet/mscc/ocelot_tc.c          |  2 +-
 include/net/flow_offload.h                     | 16 ++++++++++------
 include/net/pkt_cls.h                          |  4 ++--
 include/net/sch_generic.h                      |  6 ++----
 net/core/flow_offload.c                        |  9 +++++----
 net/dsa/slave.c                                |  2 +-
 net/sched/cls_api.c                            |  2 +-
 net/sched/cls_bpf.c                            |  2 +-
 net/sched/cls_flower.c                         |  2 +-
 net/sched/cls_matchall.c                       |  2 +-
 net/sched/cls_u32.c                            |  6 +++---
 12 files changed, 29 insertions(+), 26 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index a469035400cf..51cd0b6f1f3e 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -1679,7 +1679,7 @@ static int mlxsw_sp_setup_tc_block(struct mlxsw_sp_port *mlxsw_sp_port,
 				   struct flow_block_offload *f)
 {
 	struct flow_block_cb *block_cb;
-	tc_setup_cb_t *cb;
+	flow_setup_cb_t *cb;
 	bool ingress;
 	int err;
 
diff --git a/drivers/net/ethernet/mscc/ocelot_tc.c b/drivers/net/ethernet/mscc/ocelot_tc.c
index abbcb66bf5ac..fba9512e9ca6 100644
--- a/drivers/net/ethernet/mscc/ocelot_tc.c
+++ b/drivers/net/ethernet/mscc/ocelot_tc.c
@@ -134,7 +134,7 @@ static int ocelot_setup_tc_block(struct ocelot_port *port,
 				 struct flow_block_offload *f)
 {
 	struct flow_block_cb *block_cb;
-	tc_setup_cb_t *cb;
+	flow_setup_cb_t *cb;
 	int err;
 
 	netdev_dbg(port->dev, "tc_block command %d, binder_type %d\n",
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index aa9b5287b231..98bf3af5c84d 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -3,7 +3,6 @@
 
 #include <linux/kernel.h>
 #include <net/flow_dissector.h>
-#include <net/sch_generic.h>
 
 struct flow_match {
 	struct flow_dissector	*dissector;
@@ -261,23 +260,27 @@ struct flow_block_offload {
 	struct netlink_ext_ack *extack;
 };
 
+enum tc_setup_type;
+typedef int flow_setup_cb_t(enum tc_setup_type type, void *type_data,
+			    void *cb_priv);
+
 struct flow_block_cb {
 	struct list_head	driver_list;
 	struct list_head	list;
-	tc_setup_cb_t		*cb;
+	flow_setup_cb_t		*cb;
 	void			*cb_ident;
 	void			*cb_priv;
 	void			(*release)(void *cb_priv);
 	unsigned int		refcnt;
 };
 
-struct flow_block_cb *flow_block_cb_alloc(tc_setup_cb_t *cb,
+struct flow_block_cb *flow_block_cb_alloc(flow_setup_cb_t *cb,
 					  void *cb_ident, void *cb_priv,
 					  void (*release)(void *cb_priv));
 void flow_block_cb_free(struct flow_block_cb *block_cb);
 
 struct flow_block_cb *flow_block_cb_lookup(struct flow_block_offload *offload,
-					   tc_setup_cb_t *cb, void *cb_ident);
+					   flow_setup_cb_t *cb, void *cb_ident);
 
 void *flow_block_cb_priv(struct flow_block_cb *block_cb);
 void flow_block_cb_incref(struct flow_block_cb *block_cb);
@@ -295,11 +298,12 @@ static inline void flow_block_cb_remove(struct flow_block_cb *block_cb,
 	list_move(&block_cb->list, &offload->cb_list);
 }
 
-bool flow_block_cb_is_busy(tc_setup_cb_t *cb, void *cb_ident,
+bool flow_block_cb_is_busy(flow_setup_cb_t *cb, void *cb_ident,
 			   struct list_head *driver_block_list);
 
 int flow_block_cb_setup_simple(struct flow_block_offload *f,
-			       struct list_head *driver_list, tc_setup_cb_t *cb,
+			       struct list_head *driver_list,
+			       flow_setup_cb_t *cb,
 			       void *cb_ident, void *cb_priv, bool ingress_only);
 
 enum flow_cls_command {
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 841faadceb6e..cee651b76a1f 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -126,14 +126,14 @@ static inline struct Qdisc *tcf_block_q(struct tcf_block *block)
 }
 
 static inline
-int tc_setup_cb_block_register(struct tcf_block *block, tc_setup_cb_t *cb,
+int tc_setup_cb_block_register(struct tcf_block *block, flow_setup_cb_t *cb,
 			       void *cb_priv)
 {
 	return 0;
 }
 
 static inline
-void tc_setup_cb_block_unregister(struct tcf_block *block, tc_setup_cb_t *cb,
+void tc_setup_cb_block_unregister(struct tcf_block *block, flow_setup_cb_t *cb,
 				  void *cb_priv)
 {
 }
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 855167bbc372..9482e060483b 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -15,6 +15,7 @@
 #include <linux/mutex.h>
 #include <net/gen_stats.h>
 #include <net/rtnetlink.h>
+#include <net/flow_offload.h>
 
 struct Qdisc_ops;
 struct qdisc_walker;
@@ -22,9 +23,6 @@ struct tcf_walker;
 struct module;
 struct bpf_flow_keys;
 
-typedef int tc_setup_cb_t(enum tc_setup_type type,
-			  void *type_data, void *cb_priv);
-
 typedef int tc_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
 				    enum tc_setup_type type, void *type_data);
 
@@ -313,7 +311,7 @@ struct tcf_proto_ops {
 	void			(*walk)(struct tcf_proto *tp,
 					struct tcf_walker *arg, bool rtnl_held);
 	int			(*reoffload)(struct tcf_proto *tp, bool add,
-					     tc_setup_cb_t *cb, void *cb_priv,
+					     flow_setup_cb_t *cb, void *cb_priv,
 					     struct netlink_ext_ack *extack);
 	void			(*bind_class)(void *, u32, unsigned long);
 	void *			(*tmplt_create)(struct net *net,
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index 507de4b48815..a800fa78d96c 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -165,7 +165,7 @@ void flow_rule_match_enc_opts(const struct flow_rule *rule,
 }
 EXPORT_SYMBOL(flow_rule_match_enc_opts);
 
-struct flow_block_cb *flow_block_cb_alloc(tc_setup_cb_t *cb,
+struct flow_block_cb *flow_block_cb_alloc(flow_setup_cb_t *cb,
 					  void *cb_ident, void *cb_priv,
 					  void (*release)(void *cb_priv))
 {
@@ -194,7 +194,7 @@ void flow_block_cb_free(struct flow_block_cb *block_cb)
 EXPORT_SYMBOL(flow_block_cb_free);
 
 struct flow_block_cb *flow_block_cb_lookup(struct flow_block_offload *f,
-					   tc_setup_cb_t *cb, void *cb_ident)
+					   flow_setup_cb_t *cb, void *cb_ident)
 {
 	struct flow_block_cb *block_cb;
 
@@ -226,7 +226,7 @@ unsigned int flow_block_cb_decref(struct flow_block_cb *block_cb)
 }
 EXPORT_SYMBOL(flow_block_cb_decref);
 
-bool flow_block_cb_is_busy(tc_setup_cb_t *cb, void *cb_ident,
+bool flow_block_cb_is_busy(flow_setup_cb_t *cb, void *cb_ident,
 			   struct list_head *driver_block_list)
 {
 	struct flow_block_cb *block_cb;
@@ -243,7 +243,8 @@ EXPORT_SYMBOL(flow_block_cb_is_busy);
 
 int flow_block_cb_setup_simple(struct flow_block_offload *f,
 			       struct list_head *driver_block_list,
-			       tc_setup_cb_t *cb, void *cb_ident, void *cb_priv,
+			       flow_setup_cb_t *cb,
+			       void *cb_ident, void *cb_priv,
 			       bool ingress_only)
 {
 	struct flow_block_cb *block_cb;
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 6ca9ec58f881..d697a64fb564 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -951,7 +951,7 @@ static int dsa_slave_setup_tc_block(struct net_device *dev,
 				    struct flow_block_offload *f)
 {
 	struct flow_block_cb *block_cb;
-	tc_setup_cb_t *cb;
+	flow_setup_cb_t *cb;
 
 	if (f->binder_type == FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS)
 		cb = dsa_slave_setup_tc_block_cb_ig;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 278014e26aec..51fbe6e95a92 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -1514,7 +1514,7 @@ void tcf_block_put(struct tcf_block *block)
 EXPORT_SYMBOL(tcf_block_put);
 
 static int
-tcf_block_playback_offloads(struct tcf_block *block, tc_setup_cb_t *cb,
+tcf_block_playback_offloads(struct tcf_block *block, flow_setup_cb_t *cb,
 			    void *cb_priv, bool add, bool offload_in_use,
 			    struct netlink_ext_ack *extack)
 {
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 691f71830134..3f7a9c02b70c 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -651,7 +651,7 @@ static void cls_bpf_walk(struct tcf_proto *tp, struct tcf_walker *arg,
 	}
 }
 
-static int cls_bpf_reoffload(struct tcf_proto *tp, bool add, tc_setup_cb_t *cb,
+static int cls_bpf_reoffload(struct tcf_proto *tp, bool add, flow_setup_cb_t *cb,
 			     void *cb_priv, struct netlink_ext_ack *extack)
 {
 	struct cls_bpf_head *head = rtnl_dereference(tp->root);
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 38d6e85693fc..054123742e32 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -1800,7 +1800,7 @@ fl_get_next_hw_filter(struct tcf_proto *tp, struct cls_fl_filter *f, bool add)
 	return NULL;
 }
 
-static int fl_reoffload(struct tcf_proto *tp, bool add, tc_setup_cb_t *cb,
+static int fl_reoffload(struct tcf_proto *tp, bool add, flow_setup_cb_t *cb,
 			void *cb_priv, struct netlink_ext_ack *extack)
 {
 	struct tcf_block *block = tp->chain->block;
diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
index a30d2f8feb32..455ea2793f9b 100644
--- a/net/sched/cls_matchall.c
+++ b/net/sched/cls_matchall.c
@@ -282,7 +282,7 @@ static void mall_walk(struct tcf_proto *tp, struct tcf_walker *arg,
 	arg->count++;
 }
 
-static int mall_reoffload(struct tcf_proto *tp, bool add, tc_setup_cb_t *cb,
+static int mall_reoffload(struct tcf_proto *tp, bool add, flow_setup_cb_t *cb,
 			  void *cb_priv, struct netlink_ext_ack *extack)
 {
 	struct cls_mall_head *head = rtnl_dereference(tp->root);
diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index be9e46c77e8b..8614088edd1b 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -1152,7 +1152,7 @@ static void u32_walk(struct tcf_proto *tp, struct tcf_walker *arg,
 }
 
 static int u32_reoffload_hnode(struct tcf_proto *tp, struct tc_u_hnode *ht,
-			       bool add, tc_setup_cb_t *cb, void *cb_priv,
+			       bool add, flow_setup_cb_t *cb, void *cb_priv,
 			       struct netlink_ext_ack *extack)
 {
 	struct tc_cls_u32_offload cls_u32 = {};
@@ -1172,7 +1172,7 @@ static int u32_reoffload_hnode(struct tcf_proto *tp, struct tc_u_hnode *ht,
 }
 
 static int u32_reoffload_knode(struct tcf_proto *tp, struct tc_u_knode *n,
-			       bool add, tc_setup_cb_t *cb, void *cb_priv,
+			       bool add, flow_setup_cb_t *cb, void *cb_priv,
 			       struct netlink_ext_ack *extack)
 {
 	struct tc_u_hnode *ht = rtnl_dereference(n->ht_down);
@@ -1213,7 +1213,7 @@ static int u32_reoffload_knode(struct tcf_proto *tp, struct tc_u_knode *n,
 	return 0;
 }
 
-static int u32_reoffload(struct tcf_proto *tp, bool add, tc_setup_cb_t *cb,
+static int u32_reoffload(struct tcf_proto *tp, bool add, flow_setup_cb_t *cb,
 			 void *cb_priv, struct netlink_ext_ack *extack)
 {
 	struct tc_u_common *tp_c = tp->data;
-- 
2.11.0



^ permalink raw reply related

* [PATCH net-next 3/3] net: flow_offload: add flow_block structure and use it
From: Pablo Neira Ayuso @ 2019-07-11  0:12 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, jiri, jakub.kicinski
In-Reply-To: <20190711001235.20686-1-pablo@netfilter.org>

This object stores the flow block callbacks that are attached to this
block. This patch restores block sharing.

Fixes: da3eeb904ff4 ("net: flow_offload: add list handling functions")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
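For illustration only (not part of the patch), a minimal sketch of how I
read the change: each block owns its callback list, the offload request
carries a pointer to that block, and the lookup walks only that block's
list. All demo_* names are hypothetical, and a singly linked list stands
in for the kernel's struct list_head.

```c
#include <assert.h>
#include <stddef.h>

typedef int demo_cb_t(void *type_data, void *cb_priv);

struct demo_block_cb {
	struct demo_block_cb	*next;	/* simplified singly linked list */
	demo_cb_t		*cb;
	void			*cb_ident;
};

/* Mirrors the idea of struct flow_block: one callback list per block. */
struct demo_flow_block {
	struct demo_block_cb *cb_list;
};

/* Mirrors the idea of the offload request carrying a block pointer. */
struct demo_block_offload {
	struct demo_flow_block *block;
};

/* Bind a callback to a block by pushing it onto the block's list. */
static void demo_block_cb_add(struct demo_flow_block *block,
			      struct demo_block_cb *block_cb)
{
	block_cb->next = block->cb_list;
	block->cb_list = block_cb;
}

/* Search only the callbacks bound to the block named in the request. */
static struct demo_block_cb *
demo_block_cb_lookup(struct demo_block_offload *f, demo_cb_t *cb,
		     void *cb_ident)
{
	struct demo_block_cb *block_cb;

	assert(f->block != NULL);
	for (block_cb = f->block->cb_list; block_cb; block_cb = block_cb->next) {
		if (block_cb->cb == cb && block_cb->cb_ident == cb_ident)
			return block_cb;
	}
	return NULL;
}

/* Callback that does nothing; only its address matters for the lookup. */
static int demo_noop_cb(void *type_data, void *cb_priv)
{
	(void)type_data;
	(void)cb_priv;
	return 0;
}
```

A request that points at a block with a bound callback finds it; a request
that points at a different block does not, even for the same (cb, cb_ident)
pair.
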
 include/net/flow_offload.h        |  5 +++++
 include/net/netfilter/nf_tables.h |  5 +++--
 include/net/sch_generic.h         |  2 +-
 net/core/flow_offload.c           |  2 +-
 net/netfilter/nf_tables_api.c     |  2 +-
 net/netfilter/nf_tables_offload.c |  5 +++--
 net/sched/cls_api.c               | 10 +++++++---
 7 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index 98bf3af5c84d..e50d94736829 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -248,6 +248,10 @@ enum flow_block_binder_type {
 	FLOW_BLOCK_BINDER_TYPE_CLSACT_EGRESS,
 };
 
+struct flow_block {
+	struct list_head cb_list;
+};
+
 struct netlink_ext_ack;
 
 struct flow_block_offload {
@@ -255,6 +259,7 @@ struct flow_block_offload {
 	enum flow_block_binder_type binder_type;
 	bool block_shared;
 	struct net *net;
+	struct flow_block *block;
 	struct list_head cb_list;
 	struct list_head *driver_block_list;
 	struct netlink_ext_ack *extack;
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 35dfdd9f69b3..00658462f89b 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -11,6 +11,7 @@
 #include <linux/rhashtable.h>
 #include <net/netfilter/nf_flow_table.h>
 #include <net/netlink.h>
+#include <net/flow_offload.h>
 
 struct module;
 
@@ -951,7 +952,7 @@ struct nft_stats {
  *	@stats: per-cpu chain stats
  *	@chain: the chain
  *	@dev_name: device name that this base chain is attached to (if any)
- *	@cb_list: list of flow block callbacks (for hardware offload)
+ *	@block: flow block (for hardware offload)
  */
 struct nft_base_chain {
 	struct nf_hook_ops		ops;
@@ -961,7 +962,7 @@ struct nft_base_chain {
 	struct nft_stats __percpu	*stats;
 	struct nft_chain		chain;
 	char 				dev_name[IFNAMSIZ];
-	struct list_head		cb_list;
+	struct flow_block		block;
 };
 
 static inline struct nft_base_chain *nft_base_chain(const struct nft_chain *chain)
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 9482e060483b..58041cb0ce15 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -399,7 +399,7 @@ struct tcf_block {
 	refcount_t refcnt;
 	struct net *net;
 	struct Qdisc *q;
-	struct list_head cb_list;
+	struct flow_block flow;
 	struct list_head owner_list;
 	bool keep_dst;
 	unsigned int offloadcnt; /* Number of oddloaded filters */
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index a800fa78d96c..935c7f81a9ef 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -198,7 +198,7 @@ struct flow_block_cb *flow_block_cb_lookup(struct flow_block_offload *f,
 {
 	struct flow_block_cb *block_cb;
 
-	list_for_each_entry(block_cb, f->driver_block_list, driver_list) {
+	list_for_each_entry(block_cb, &f->block->cb_list, list) {
 		if (block_cb->cb == cb &&
 		    block_cb->cb_ident == cb_ident)
 			return block_cb;
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index ed17a7c29b86..c565f146435b 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1662,7 +1662,7 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
 
 		chain->flags |= NFT_BASE_CHAIN | flags;
 		basechain->policy = NF_ACCEPT;
-		INIT_LIST_HEAD(&basechain->cb_list);
+		INIT_LIST_HEAD(&basechain->block.cb_list);
 	} else {
 		chain = kzalloc(sizeof(*chain), GFP_KERNEL);
 		if (chain == NULL)
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 2c3302845f67..2a184277ee58 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -116,7 +116,7 @@ static int nft_setup_cb_call(struct nft_base_chain *basechain,
 	struct flow_block_cb *block_cb;
 	int err;
 
-	list_for_each_entry(block_cb, &basechain->cb_list, list) {
+	list_for_each_entry(block_cb, &basechain->block.cb_list, list) {
 		err = block_cb->cb(type, type_data, block_cb->cb_priv);
 		if (err < 0)
 			return err;
@@ -154,7 +154,7 @@ static int nft_flow_offload_rule(struct nft_trans *trans,
 static int nft_flow_offload_bind(struct flow_block_offload *bo,
 				 struct nft_base_chain *basechain)
 {
-	list_splice(&bo->cb_list, &basechain->cb_list);
+	list_splice(&bo->cb_list, &basechain->block.cb_list);
 	return 0;
 }
 
@@ -198,6 +198,7 @@ static int nft_flow_offload_chain(struct nft_trans *trans,
 		return -EOPNOTSUPP;
 
 	bo.command = cmd;
+	bo.block = &basechain->block;
 	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
 	bo.extack = &extack;
 	INIT_LIST_HEAD(&bo.cb_list);
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 51fbe6e95a92..66181961ad6f 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -691,6 +691,8 @@ static void tc_indr_block_ing_cmd(struct tc_indr_block_dev *indr_dev,
 	if (!indr_dev->block)
 		return;
 
+	bo.block = &indr_dev->block->flow;
+
 	indr_block_cb->cb(indr_dev->dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
 			  &bo);
 	tcf_block_setup(indr_dev->block, &bo);
@@ -775,6 +777,7 @@ static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
 		.command	= command,
 		.binder_type	= ei->binder_type,
 		.net		= dev_net(dev),
+		.block		= &block->flow,
 		.block_shared	= tcf_block_shared(block),
 		.extack		= extack,
 	};
@@ -810,6 +813,7 @@ static int tcf_block_offload_cmd(struct tcf_block *block,
 	bo.net = dev_net(dev);
 	bo.command = command;
 	bo.binder_type = ei->binder_type;
+	bo.block = &block->flow;
 	bo.block_shared = tcf_block_shared(block);
 	bo.extack = extack;
 	INIT_LIST_HEAD(&bo.cb_list);
@@ -988,7 +992,7 @@ static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q,
 	}
 	mutex_init(&block->lock);
 	INIT_LIST_HEAD(&block->chain_list);
-	INIT_LIST_HEAD(&block->cb_list);
+	INIT_LIST_HEAD(&block->flow.cb_list);
 	INIT_LIST_HEAD(&block->owner_list);
 	INIT_LIST_HEAD(&block->chain0.filter_chain_list);
 
@@ -1570,7 +1574,7 @@ static int tcf_block_bind(struct tcf_block *block,
 
 		i++;
 	}
-	list_splice(&bo->cb_list, &block->cb_list);
+	list_splice(&bo->cb_list, &block->flow.cb_list);
 
 	return 0;
 
@@ -3155,7 +3159,7 @@ int tc_setup_cb_call(struct tcf_block *block, enum tc_setup_type type,
 	if (block->nooffloaddevcnt && err_stop)
 		return -EOPNOTSUPP;
 
-	list_for_each_entry(block_cb, &block->cb_list, list) {
+	list_for_each_entry(block_cb, &block->flow.cb_list, list) {
 		err = block_cb->cb(type, type_data, block_cb->cb_priv);
 		if (err) {
 			if (err_stop)
-- 
2.11.0



^ permalink raw reply related

* [PATCH net-next 1/3] net: flow_offload: remove netns parameter from flow_block_cb_alloc()
From: Pablo Neira Ayuso @ 2019-07-11  0:12 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, jiri, jakub.kicinski

There is no need to annotate the netns on the flow block callback object;
flow_block_cb_is_busy() already checks for used blocks.

Fixes: d63db30c8537 ("net: flow_offload: add flow_block_cb_alloc() and flow_block_cb_free()")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c    | 3 +--
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c      | 5 ++---
 drivers/net/ethernet/mscc/ocelot_flower.c           | 3 +--
 drivers/net/ethernet/mscc/ocelot_tc.c               | 2 +-
 drivers/net/ethernet/netronome/nfp/flower/offload.c | 6 ++----
 include/net/flow_offload.h                          | 3 +--
 net/core/flow_offload.c                             | 9 +++------
 net/dsa/slave.c                                     | 2 +-
 8 files changed, 12 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 7245d287633d..2162412073c5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -735,8 +735,7 @@ mlx5e_rep_indr_setup_tc_block(struct net_device *netdev,
 		list_add(&indr_priv->list,
 			 &rpriv->uplink_priv.tc_indr_block_priv_list);
 
-		block_cb = flow_block_cb_alloc(f->net,
-					       mlx5e_rep_indr_setup_block_cb,
+		block_cb = flow_block_cb_alloc(mlx5e_rep_indr_setup_block_cb,
 					       indr_priv, indr_priv,
 					       mlx5e_rep_indr_tc_block_unbind);
 		if (IS_ERR(block_cb)) {
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 4d34d42b3b0e..a469035400cf 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -1610,8 +1610,7 @@ mlxsw_sp_setup_tc_block_flower_bind(struct mlxsw_sp_port *mlxsw_sp_port,
 		acl_block = mlxsw_sp_acl_block_create(mlxsw_sp, f->net);
 		if (!acl_block)
 			return -ENOMEM;
-		block_cb = flow_block_cb_alloc(f->net,
-					       mlxsw_sp_setup_tc_block_cb_flower,
+		block_cb = flow_block_cb_alloc(mlxsw_sp_setup_tc_block_cb_flower,
 					       mlxsw_sp, acl_block,
 					       mlxsw_sp_tc_block_flower_release);
 		if (IS_ERR(block_cb)) {
@@ -1702,7 +1701,7 @@ static int mlxsw_sp_setup_tc_block(struct mlxsw_sp_port *mlxsw_sp_port,
 					  &mlxsw_sp_block_cb_list))
 			return -EBUSY;
 
-		block_cb = flow_block_cb_alloc(f->net, cb, mlxsw_sp_port,
+		block_cb = flow_block_cb_alloc(cb, mlxsw_sp_port,
 					       mlxsw_sp_port, NULL);
 		if (IS_ERR(block_cb))
 			return PTR_ERR(block_cb);
diff --git a/drivers/net/ethernet/mscc/ocelot_flower.c b/drivers/net/ethernet/mscc/ocelot_flower.c
index 7aaddc09c185..6a11aea8b186 100644
--- a/drivers/net/ethernet/mscc/ocelot_flower.c
+++ b/drivers/net/ethernet/mscc/ocelot_flower.c
@@ -323,8 +323,7 @@ int ocelot_setup_tc_block_flower_bind(struct ocelot_port *port,
 		if (!port_block)
 			return -ENOMEM;
 
-		block_cb = flow_block_cb_alloc(f->net,
-					       ocelot_setup_tc_block_cb_flower,
+		block_cb = flow_block_cb_alloc(ocelot_setup_tc_block_cb_flower,
 					       port, port_block,
 					       ocelot_tc_block_unbind);
 		if (IS_ERR(block_cb)) {
diff --git a/drivers/net/ethernet/mscc/ocelot_tc.c b/drivers/net/ethernet/mscc/ocelot_tc.c
index 9e6464ffae5d..abbcb66bf5ac 100644
--- a/drivers/net/ethernet/mscc/ocelot_tc.c
+++ b/drivers/net/ethernet/mscc/ocelot_tc.c
@@ -156,7 +156,7 @@ static int ocelot_setup_tc_block(struct ocelot_port *port,
 		if (flow_block_cb_is_busy(cb, port, &ocelot_block_cb_list))
 			return -EBUSY;
 
-		block_cb = flow_block_cb_alloc(f->net, cb, port, port, NULL);
+		block_cb = flow_block_cb_alloc(cb, port, port, NULL);
 		if (IS_ERR(block_cb))
 			return PTR_ERR(block_cb);
 
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index 7e725fa60347..a0f8892bb4b5 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -1324,8 +1324,7 @@ static int nfp_flower_setup_tc_block(struct net_device *netdev,
 					  &nfp_block_cb_list))
 			return -EBUSY;
 
-		block_cb = flow_block_cb_alloc(f->net,
-					       nfp_flower_setup_tc_block_cb,
+		block_cb = flow_block_cb_alloc(nfp_flower_setup_tc_block_cb,
 					       repr, repr, NULL);
 		if (IS_ERR(block_cb))
 			return PTR_ERR(block_cb);
@@ -1430,8 +1429,7 @@ nfp_flower_setup_indr_tc_block(struct net_device *netdev, struct nfp_app *app,
 		cb_priv->app = app;
 		list_add(&cb_priv->list, &priv->indr_block_cb_priv);
 
-		block_cb = flow_block_cb_alloc(f->net,
-					       nfp_flower_setup_indr_block_cb,
+		block_cb = flow_block_cb_alloc(nfp_flower_setup_indr_block_cb,
 					       cb_priv, cb_priv,
 					       nfp_flower_setup_indr_tc_release);
 		if (IS_ERR(block_cb)) {
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index db337299e81e..aa9b5287b231 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -264,7 +264,6 @@ struct flow_block_offload {
 struct flow_block_cb {
 	struct list_head	driver_list;
 	struct list_head	list;
-	struct net		*net;
 	tc_setup_cb_t		*cb;
 	void			*cb_ident;
 	void			*cb_priv;
@@ -272,7 +271,7 @@ struct flow_block_cb {
 	unsigned int		refcnt;
 };
 
-struct flow_block_cb *flow_block_cb_alloc(struct net *net, tc_setup_cb_t *cb,
+struct flow_block_cb *flow_block_cb_alloc(tc_setup_cb_t *cb,
 					  void *cb_ident, void *cb_priv,
 					  void (*release)(void *cb_priv));
 void flow_block_cb_free(struct flow_block_cb *block_cb);
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index 76f8db3841d7..507de4b48815 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -165,7 +165,7 @@ void flow_rule_match_enc_opts(const struct flow_rule *rule,
 }
 EXPORT_SYMBOL(flow_rule_match_enc_opts);
 
-struct flow_block_cb *flow_block_cb_alloc(struct net *net, tc_setup_cb_t *cb,
+struct flow_block_cb *flow_block_cb_alloc(tc_setup_cb_t *cb,
 					  void *cb_ident, void *cb_priv,
 					  void (*release)(void *cb_priv))
 {
@@ -175,7 +175,6 @@ struct flow_block_cb *flow_block_cb_alloc(struct net *net, tc_setup_cb_t *cb,
 	if (!block_cb)
 		return ERR_PTR(-ENOMEM);
 
-	block_cb->net = net;
 	block_cb->cb = cb;
 	block_cb->cb_ident = cb_ident;
 	block_cb->cb_priv = cb_priv;
@@ -200,8 +199,7 @@ struct flow_block_cb *flow_block_cb_lookup(struct flow_block_offload *f,
 	struct flow_block_cb *block_cb;
 
 	list_for_each_entry(block_cb, f->driver_block_list, driver_list) {
-		if (block_cb->net == f->net &&
-		    block_cb->cb == cb &&
+		if (block_cb->cb == cb &&
 		    block_cb->cb_ident == cb_ident)
 			return block_cb;
 	}
@@ -261,8 +259,7 @@ int flow_block_cb_setup_simple(struct flow_block_offload *f,
 		if (flow_block_cb_is_busy(cb, cb_ident, driver_block_list))
 			return -EBUSY;
 
-		block_cb = flow_block_cb_alloc(f->net, cb, cb_ident,
-					       cb_priv, NULL);
+		block_cb = flow_block_cb_alloc(cb, cb_ident, cb_priv, NULL);
 		if (IS_ERR(block_cb))
 			return PTR_ERR(block_cb);
 
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 614c38ece104..6ca9ec58f881 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -967,7 +967,7 @@ static int dsa_slave_setup_tc_block(struct net_device *dev,
 		if (flow_block_cb_is_busy(cb, dev, &dsa_slave_block_cb_list))
 			return -EBUSY;
 
-		block_cb = flow_block_cb_alloc(f->net, cb, dev, dev, NULL);
+		block_cb = flow_block_cb_alloc(cb, dev, dev, NULL);
 		if (IS_ERR(block_cb))
 			return PTR_ERR(block_cb);
 
-- 
2.11.0



^ permalink raw reply related

* Re: [bpf-next v3 04/12] selftests/bpf: Use bpf_prog_test_run_xattr
From: Andrii Nakryiko @ 2019-07-11  0:03 UTC (permalink / raw)
  To: Krzesimir Nowak
  Cc: open list, Alban Crequy, Iago López Galeiras,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, David S. Miller, Jakub Kicinski,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Networking, bpf, xdp-newbies
In-Reply-To: <20190708163121.18477-5-krzesimir@kinvolk.io>

On Mon, Jul 8, 2019 at 3:43 PM Krzesimir Nowak <krzesimir@kinvolk.io> wrote:
>
> The bpf_prog_test_run_xattr function gives more options to set up a
> test run of a BPF program than the bpf_prog_test_run function.
>
> We will need this extra flexibility to pass ctx data later.
>
> Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
> ---

lgtm, with some nits below

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  tools/testing/selftests/bpf/test_verifier.c | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
> index c7541f572932..1640ba9f12c1 100644
> --- a/tools/testing/selftests/bpf/test_verifier.c
> +++ b/tools/testing/selftests/bpf/test_verifier.c
> @@ -822,14 +822,20 @@ static int do_prog_test_run(int fd_prog, bool unpriv, uint32_t expected_val,
>  {
>         __u8 tmp[TEST_DATA_LEN << 2];
>         __u32 size_tmp = sizeof(tmp);

nit: this is no longer needed as a separate local variable; inline it?
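
A rough sketch of what I mean, using a hypothetical stand-in struct
(demo_test_run_attr, with field names borrowed from the hunk below) rather
than the real libbpf type, and an arbitrary buffer size:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the attr struct; not the libbpf definition. */
struct demo_test_run_attr {
	int		prog_fd;
	int		repeat;
	const void	*data_in;
	uint32_t	data_size_in;
	void		*data_out;
	uint32_t	data_size_out;
	uint32_t	retval;
};

/* Builds the attr with the buffer size taken inline from sizeof(). */
static uint32_t demo_out_capacity(void)
{
	uint8_t tmp[64 << 2];
	struct demo_test_run_attr attr = {
		.prog_fd = -1,			/* placeholder, no real program */
		.repeat = 1,
		.data_out = tmp,
		.data_size_out = sizeof(tmp),	/* no separate size_tmp local */
	};

	return attr.data_size_out;
}
```

Fields left out of the designated initializer are zero-initialized, so
nothing else needs to be set explicitly.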

> -       uint32_t retval;
>         int saved_errno;
>         int err;
> +       struct bpf_prog_test_run_attr attr = {
> +               .prog_fd = fd_prog,
> +               .repeat = 1,
> +               .data_in = data,
> +               .data_size_in = size_data,
> +               .data_out = tmp,
> +               .data_size_out = size_tmp,
> +       };
>
>         if (unpriv)
>                 set_admin(true);
> -       err = bpf_prog_test_run(fd_prog, 1, data, size_data,
> -                               tmp, &size_tmp, &retval, NULL);
> +       err = bpf_prog_test_run_xattr(&attr);
>         saved_errno = errno;
>         if (unpriv)
>                 set_admin(false);
> @@ -846,9 +852,9 @@ static int do_prog_test_run(int fd_prog, bool unpriv, uint32_t expected_val,
>                         return err;
>                 }
>         }
> -       if (retval != expected_val &&
> +       if (attr.retval != expected_val &&
>             expected_val != POINTER_VALUE) {

this if condition now fits on one line; can you please combine? thanks!

> -               printf("FAIL retval %d != %d ", retval, expected_val);
> +               printf("FAIL retval %d != %d ", attr.retval, expected_val);
>                 return 1;
>         }
>
> --
> 2.20.1
>

^ permalink raw reply

* Re: [bpf-next v3 03/12] selftests/bpf: Avoid another case of errno clobbering
From: Andrii Nakryiko @ 2019-07-10 23:57 UTC (permalink / raw)
  To: Krzesimir Nowak
  Cc: open list, Alban Crequy, Iago López Galeiras,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, David S. Miller, Jakub Kicinski,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Networking, bpf, xdp-newbies
In-Reply-To: <20190708163121.18477-4-krzesimir@kinvolk.io>

On Mon, Jul 8, 2019 at 3:43 PM Krzesimir Nowak <krzesimir@kinvolk.io> wrote:
>
> Commit 8184d44c9a57 ("selftests/bpf: skip verifier tests for
> unsupported program types") added a check for an unsupported program
> type. The function doing it changes errno, so test_verifier should
> save it before calling it if test_verifier wants to print a reason why
> verifying a BPF program of a supported type failed.
>
> Changes since v2:
> - Move the declaration to fit the reverse christmas tree style.
>
> Fixes: 8184d44c9a57 ("selftests/bpf: skip verifier tests for unsupported program types")
> Cc: Stanislav Fomichev <sdf@google.com>
> Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
> ---

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  tools/testing/selftests/bpf/test_verifier.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
> index 3fe126e0083b..c7541f572932 100644
> --- a/tools/testing/selftests/bpf/test_verifier.c
> +++ b/tools/testing/selftests/bpf/test_verifier.c
> @@ -864,6 +864,7 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
>         int run_errs, run_successes;
>         int map_fds[MAX_NR_MAPS];
>         const char *expected_err;
> +       int saved_errno;
>         int fixup_skips;

nit: combine those ints? or even with i and err below as well?

>         __u32 pflags;
>         int i, err;
> @@ -894,6 +895,7 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
>                 pflags |= BPF_F_ANY_ALIGNMENT;
>         fd_prog = bpf_verify_program(prog_type, prog, prog_len, pflags,
>                                      "GPL", 0, bpf_vlog, sizeof(bpf_vlog), 4);
> +       saved_errno = errno;
>         if (fd_prog < 0 && !bpf_probe_prog_type(prog_type, 0)) {
>                 printf("SKIP (unsupported program type %d)\n", prog_type);
>                 skips++;
> @@ -910,7 +912,7 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
>         if (expected_ret == ACCEPT) {
>                 if (fd_prog < 0) {
>                         printf("FAIL\nFailed to load prog '%s'!\n",
> -                              strerror(errno));
> +                              strerror(saved_errno));
>                         goto fail_log;
>                 }
>  #ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> --
> 2.20.1
>

^ permalink raw reply

* Re: [bpf-next v3 02/12] selftests/bpf: Avoid a clobbering of errno
From: Andrii Nakryiko @ 2019-07-10 23:51 UTC (permalink / raw)
  To: Krzesimir Nowak
  Cc: open list, Alban Crequy, Iago López Galeiras,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, David S. Miller, Jakub Kicinski,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Networking, bpf, xdp-newbies
In-Reply-To: <20190708163121.18477-3-krzesimir@kinvolk.io>

On Mon, Jul 8, 2019 at 3:42 PM Krzesimir Nowak <krzesimir@kinvolk.io> wrote:
>
> Save errno right after bpf_prog_test_run returns, so we later check
> the error code actually set by bpf_prog_test_run, not by some libcap
> function.
>
> Changes since v1:
> - Fix the "Fixes:" tag to mention actual commit that introduced the
>   bug
>
> Changes since v2:
> - Move the declaration so it fits the reverse christmas tree style.
>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Fixes: 832c6f2c29ec ("bpf: test make sure to run unpriv test cases in test_verifier")
> Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
> ---
>  tools/testing/selftests/bpf/test_verifier.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
> index b8d065623ead..3fe126e0083b 100644
> --- a/tools/testing/selftests/bpf/test_verifier.c
> +++ b/tools/testing/selftests/bpf/test_verifier.c
> @@ -823,16 +823,18 @@ static int do_prog_test_run(int fd_prog, bool unpriv, uint32_t expected_val,
>         __u8 tmp[TEST_DATA_LEN << 2];
>         __u32 size_tmp = sizeof(tmp);
>         uint32_t retval;
> +       int saved_errno;
>         int err;
>
>         if (unpriv)
>                 set_admin(true);
>         err = bpf_prog_test_run(fd_prog, 1, data, size_data,
>                                 tmp, &size_tmp, &retval, NULL);

Given err is either 0 or -1, how about making err useful right here
instead, without an extra variable?

if (bpf_prog_test_run(...))
        err = errno;
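
Spelled out as a self-contained sketch, with hypothetical stand-ins
(demo_test_run() for the call under test, demo_cleanup() for a helper in
the set_admin() role that overwrites errno as a side effect):

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the call under test: fails with EINVAL. */
static int demo_test_run(void)
{
	errno = EINVAL;
	return -1;
}

/* Hypothetical stand-in for a cleanup helper that clobbers errno. */
static void demo_cleanup(void)
{
	errno = EPERM;
}

/* Returns 0 on success, or the errno value set by demo_test_run(). */
static int demo_run_and_capture(void)
{
	int err = 0;

	if (demo_test_run())
		err = errno;	/* capture before anything else can change it */
	demo_cleanup();
	return err;
}
```

The caller then switches on err directly and does not care what errno
holds after the cleanup call.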

> +       saved_errno = errno;
>         if (unpriv)
>                 set_admin(false);
>         if (err) {
> -               switch (errno) {
> +               switch (saved_errno) {
>                 case 524/*ENOTSUPP*/:

ENOTSUPP is defined in include/linux/errno.h, is there any problem
with using this in selftests?

>                         printf("Did not run the program (not supported) ");
>                         return 0;
> --
> 2.20.1
>

^ permalink raw reply
