Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [bpf-next v3 05/12] selftests/bpf: Allow passing more information to BPF prog test run
From: Krzesimir Nowak @ 2019-07-11 12:17 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: open list, Alban Crequy, Iago López Galeiras,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, David S. Miller, Jakub Kicinski,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Networking, bpf, xdp-newbies
In-Reply-To: <CAEf4BzYYdrcwJKg271ZL7kPJNYyZEGdxQeuUNbfPk=EjewuHeQ@mail.gmail.com>

On Thu, Jul 11, 2019 at 3:17 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Mon, Jul 8, 2019 at 3:42 PM Krzesimir Nowak <krzesimir@kinvolk.io> wrote:
> >
> > The test case can now specify a custom length of the data member,
> > context data and its length, which will be passed to
> > bpf_prog_test_run_xattr. For backward compatilibity, if the data
> > length is 0 (which is what will happen when the field is left
> > unspecified in the designated initializer of a struct), then the
> > length passed to the bpf_prog_test_run_xattr is TEST_DATA_LEN.
> >
> > Also for backward compatilibity, if context data length is 0, NULL is
> > passed as a context to bpf_prog_test_run_xattr. This is to avoid
> > breaking other tests, where context data being NULL and context data
> > length being 0 is handled differently from the case where context data
> > is not NULL and context data length is 0.
> >
> > Custom lengths still can't be greater than hardcoded 64 bytes for data
> > and 192 for context data.
> >
> > 192 for context data was picked to allow passing struct
> > bpf_perf_event_data as a context for perf event programs. The struct
> > is quite large, because it contains struct pt_regs.
> >
> > Test runs for perf event programs will not allow the copying the data
> > back to data_out buffer, so they require data_out_size to be zero and
> > data_out to be NULL. Since test_verifier hardcodes it, make it
> > possible to override the size. Overriding the size to zero will cause
> > the buffer to be NULL.
> >
> > Changes since v2:
> > - Allow overriding the data out size and buffer.
> >
> > Signed-off-by: Krzesimir Nowak <krzesimir@kinvolk.io>
> > ---
> >  tools/testing/selftests/bpf/test_verifier.c | 105 +++++++++++++++++---
> >  1 file changed, 93 insertions(+), 12 deletions(-)
> >
> > diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
> > index 1640ba9f12c1..6f124cc4ee34 100644
> > --- a/tools/testing/selftests/bpf/test_verifier.c
> > +++ b/tools/testing/selftests/bpf/test_verifier.c
> > @@ -54,6 +54,7 @@
> >  #define MAX_TEST_RUNS  8
> >  #define POINTER_VALUE  0xcafe4all
> >  #define TEST_DATA_LEN  64
> > +#define TEST_CTX_LEN   192
> >
> >  #define F_NEEDS_EFFICIENT_UNALIGNED_ACCESS     (1 << 0)
> >  #define F_LOAD_WITH_STRICT_ALIGNMENT           (1 << 1)
> > @@ -96,7 +97,12 @@ struct bpf_test {
> >         enum bpf_prog_type prog_type;
> >         uint8_t flags;
> >         __u8 data[TEST_DATA_LEN];
> > +       __u32 data_len;
> > +       __u8 ctx[TEST_CTX_LEN];
> > +       __u32 ctx_len;
> >         void (*fill_helper)(struct bpf_test *self);
> > +       bool override_data_out_len;
> > +       __u32 overridden_data_out_len;
> >         uint8_t runs;
> >         struct {
> >                 uint32_t retval, retval_unpriv;
> > @@ -104,6 +110,9 @@ struct bpf_test {
> >                         __u8 data[TEST_DATA_LEN];
> >                         __u64 data64[TEST_DATA_LEN / 8];
> >                 };
> > +               __u32 data_len;
> > +               __u8 ctx[TEST_CTX_LEN];
> > +               __u32 ctx_len;
> >         } retvals[MAX_TEST_RUNS];
> >  };
> >
> > @@ -818,21 +827,35 @@ static int set_admin(bool admin)
> >  }
> >
> >  static int do_prog_test_run(int fd_prog, bool unpriv, uint32_t expected_val,
> > -                           void *data, size_t size_data)
> > +                           void *data, size_t size_data, void *ctx,
> > +                           size_t size_ctx, u32 *overridden_data_out_size)
> >  {
> > -       __u8 tmp[TEST_DATA_LEN << 2];
> > -       __u32 size_tmp = sizeof(tmp);
> > -       int saved_errno;
> > -       int err;
> >         struct bpf_prog_test_run_attr attr = {
> >                 .prog_fd = fd_prog,
> >                 .repeat = 1,
> >                 .data_in = data,
> >                 .data_size_in = size_data,
> > -               .data_out = tmp,
> > -               .data_size_out = size_tmp,
> > +               .ctx_in = ctx,
> > +               .ctx_size_in = size_ctx,
> >         };
> > +       __u8 tmp[TEST_DATA_LEN << 2];
> > +       __u32 size_tmp = sizeof(tmp);
> > +       __u32 size_buf = size_tmp;
> > +       __u8 *buf = tmp;
> > +       int saved_errno;
> > +       int err;
> >
> > +       if (overridden_data_out_size)
> > +               size_buf = *overridden_data_out_size;
> > +       if (size_buf > size_tmp) {
> > +               printf("FAIL: out data size (%d) greater than a buffer size (%d) ",
> > +                      size_buf, size_tmp);
> > +               return -EINVAL;
> > +       }
> > +       if (!size_buf)
> > +               buf = NULL;
> > +       attr.data_size_out = size_buf;
> > +       attr.data_out = buf;
> >         if (unpriv)
> >                 set_admin(true);
> >         err = bpf_prog_test_run_xattr(&attr);
> > @@ -956,13 +979,45 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
> >         if (!alignment_prevented_execution && fd_prog >= 0) {
> >                 uint32_t expected_val;
> >                 int i;
> > +               __u32 size_data;
> > +               __u32 size_ctx;
> > +               bool bad_size;
> > +               void *ctx;
> > +               __u32 *overridden_data_out_size;
> >
> >                 if (!test->runs) {
> > +                       if (test->data_len > 0)
> > +                               size_data = test->data_len;
> > +                       else
> > +                               size_data = sizeof(test->data);
> > +                       if (test->override_data_out_len)
> > +                               overridden_data_out_size = &test->overridden_data_out_len;
> > +                       else
> > +                               overridden_data_out_size = NULL;
> > +                       size_ctx = test->ctx_len;
> > +                       bad_size = false;
>
> I hated all this duplication of logic, which with this patch becomes
> even more expansive, so I removed it. Please see [0]. Can you please
> apply that patch and add all this new logic only once?
>
>   [0] https://patchwork.ozlabs.org/patch/1130601/

Will do.

>
> >                         expected_val = unpriv && test->retval_unpriv ?
> >                                 test->retval_unpriv : test->retval;
> >
> > -                       err = do_prog_test_run(fd_prog, unpriv, expected_val,
> > -                                              test->data, sizeof(test->data));
> > +                       if (size_data > sizeof(test->data)) {
> > +                               printf("FAIL: data size (%u) greater than TEST_DATA_LEN (%lu) ", size_data, sizeof(test->data));
> > +                               bad_size = true;
> > +                       }
> > +                       if (size_ctx > sizeof(test->ctx)) {
> > +                               printf("FAIL: ctx size (%u) greater than TEST_CTX_LEN (%lu) ", size_ctx, sizeof(test->ctx));
>
> These look like way too long lines, wrap them?

Ah, yeah, these can be wrapped easily. Will do.

>
> > +                               bad_size = true;
> > +                       }
> > +                       if (size_ctx)
> > +                               ctx = test->ctx;
> > +                       else
> > +                               ctx = NULL;
>
> nit: single line:
>
> ctx = size_ctx ? test->ctx : NULL;
>
> > +                       if (bad_size)
> > +                               err = 1;
> > +                       else
> > +                               err = do_prog_test_run(fd_prog, unpriv, expected_val,
> > +                                                      test->data, size_data,
> > +                                                      ctx, size_ctx,
> > +                                                      overridden_data_out_size);
> >                         if (err)
> >                                 run_errs++;
> >                         else
> > @@ -970,14 +1025,40 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
> >                 }
> >
> >                 for (i = 0; i < test->runs; i++) {
> > +                       if (test->retvals[i].data_len > 0)
> > +                               size_data = test->retvals[i].data_len;
> > +                       else
> > +                               size_data = sizeof(test->retvals[i].data);
> > +                       if (test->override_data_out_len)
> > +                               overridden_data_out_size = &test->overridden_data_out_len;
> > +                       else
> > +                               overridden_data_out_size = NULL;
> > +                       size_ctx = test->retvals[i].ctx_len;
> > +                       bad_size = false;
> >                         if (unpriv && test->retvals[i].retval_unpriv)
> >                                 expected_val = test->retvals[i].retval_unpriv;
> >                         else
> >                                 expected_val = test->retvals[i].retval;
> >
> > -                       err = do_prog_test_run(fd_prog, unpriv, expected_val,
> > -                                              test->retvals[i].data,
> > -                                              sizeof(test->retvals[i].data));
> > +                       if (size_data > sizeof(test->retvals[i].data)) {
> > +                               printf("FAIL: data size (%u) at run %i greater than TEST_DATA_LEN (%lu) ", size_data, i + 1, sizeof(test->retvals[i].data));
> > +                               bad_size = true;
> > +                       }
> > +                       if (size_ctx > sizeof(test->retvals[i].ctx)) {
> > +                               printf("FAIL: ctx size (%u) at run %i greater than TEST_CTX_LEN (%lu) ", size_ctx, i + 1, sizeof(test->retvals[i].ctx));
> > +                               bad_size = true;
> > +                       }
> > +                       if (size_ctx)
> > +                               ctx = test->retvals[i].ctx;
> > +                       else
> > +                               ctx = NULL;
> > +                       if (bad_size)
> > +                               err = 1;
> > +                       else
> > +                               err = do_prog_test_run(fd_prog, unpriv, expected_val,
> > +                                                      test->retvals[i].data, size_data,
> > +                                                      ctx, size_ctx,
> > +                                                      overridden_data_out_size);
> >                         if (err) {
> >                                 printf("(run %d/%d) ", i + 1, test->runs);
> >                                 run_errs++;
> > --
> > 2.20.1
> >



-- 
Kinvolk GmbH | Adalbertstr.6a, 10999 Berlin | tel: +491755589364
Geschäftsführer/Directors: Alban Crequy, Chris Kühl, Iago López Galeiras
Registergericht/Court of registration: Amtsgericht Charlottenburg
Registernummer/Registration number: HRB 171414 B
Ust-ID-Nummer/VAT ID number: DE302207000

^ permalink raw reply

* Re: [oss-drivers] Re: [RFC bpf-next 2/8] bpf: extend list based insn patching infra to verification layer
From: Jiong Wang @ 2019-07-11 12:20 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann, Edward Cree,
	Naveen N. Rao, Andrii Nakryiko, Jakub Kicinski, bpf, Networking,
	oss-drivers
In-Reply-To: <87o920235d.fsf@netronome.com>


Jiong Wang writes:

> Andrii Nakryiko writes:
>
>> On Thu, Jul 4, 2019 at 2:32 PM Jiong Wang <jiong.wang@netronome.com> wrote:
>>>
>>> Verification layer also needs to handle auxiliar info as well as adjusting
>>> subprog start.
>>>
>>> At this layer, insns inside patch buffer could be jump, but they should
>>> have been resolved, meaning they shouldn't jump to insn outside of the
>>> patch buffer. Lineration function for this layer won't touch insns inside
>>> patch buffer.
>>>
>>> Adjusting subprog is finished along with adjusting jump target when the
>>> input will cover bpf to bpf call insn, re-register subprog start is cheap.
>>> But adjustment when there is insn deleteion is not considered yet.
>>>
>>> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
>>> ---
>>>  kernel/bpf/verifier.c | 150 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 150 insertions(+)
>>>
>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>> index a2e7637..2026d64 100644
>>> --- a/kernel/bpf/verifier.c
>>> +++ b/kernel/bpf/verifier.c
>>> @@ -8350,6 +8350,156 @@ static void opt_hard_wire_dead_code_branches(struct bpf_verifier_env *env)
>>>         }
>>>  }
>>>
>>> +/* Linearize bpf list insn to array (verifier layer). */
>>> +static struct bpf_verifier_env *
>>> +verifier_linearize_list_insn(struct bpf_verifier_env *env,
>>> +                            struct bpf_list_insn *list)
>>
>> It's unclear why this returns env back? It's not allocating a new env,
>> so it's weird and unnecessary. Just return error code.
>
> The reason is I was thinking we have two layers in BPF, the core and the
> verifier.
>
> For core layer (the relevant file is core.c), when doing patching, the
> input is insn list and bpf_prog, the linearization should linearize the
> insn list into insn array, and also whatever others affect inside bpf_prog
> due to changing on insns, for example line info inside prog->aux. So the
> return value is bpf_prog for core layer linearization hook. 
>
> For verifier layer, it is similar, but the context if bpf_verifier_env, the
> linearization hook should linearize the insn list, and also those affected
> inside env, for example bpf_insn_aux_data, so the return value is
> bpf_verifier_env, meaning returning an updated verifier context
> (bpf_verifier_env) after insn list linearization.

Realized your point is no new env is allocated, so just return error
code. Yes, the env pointer is not changed, just internal data is
updated. Return bpf_verifier_env mostly is trying to make the hook more
clear that it returns an updated "context" where the linearization happens,
for verifier layer, it is bpf_verifier_env, and for core layer, it is
bpf_prog, so return value was designed to return these two types.

>
> Make sense?
>
> Regards,
> Jiong
>
>>
>>> +{
>>> +       u32 *idx_map, idx, orig_cnt, fini_cnt = 0;
>>> +       struct bpf_subprog_info *new_subinfo;
>>> +       struct bpf_insn_aux_data *new_data;
>>> +       struct bpf_prog *prog = env->prog;
>>> +       struct bpf_verifier_env *ret_env;
>>> +       struct bpf_insn *insns, *insn;
>>> +       struct bpf_list_insn *elem;
>>> +       int ret;
>>> +
>>> +       /* Calculate final size. */
>>> +       for (elem = list; elem; elem = elem->next)
>>> +               if (!(elem->flag & LIST_INSN_FLAG_REMOVED))
>>> +                       fini_cnt++;
>>> +
>>> +       orig_cnt = prog->len;
>>> +       insns = prog->insnsi;
>>> +       /* If prog length remains same, nothing else to do. */
>>> +       if (fini_cnt == orig_cnt) {
>>> +               for (insn = insns, elem = list; elem; elem = elem->next, insn++)
>>> +                       *insn = elem->insn;
>>> +               return env;
>>> +       }
>>> +       /* Realloc insn buffer when necessary. */
>>> +       if (fini_cnt > orig_cnt)
>>> +               prog = bpf_prog_realloc(prog, bpf_prog_size(fini_cnt),
>>> +                                       GFP_USER);
>>> +       if (!prog)
>>> +               return ERR_PTR(-ENOMEM);
>>> +       insns = prog->insnsi;
>>> +       prog->len = fini_cnt;
>>> +       ret_env = env;
>>> +
>>> +       /* idx_map[OLD_IDX] = NEW_IDX */
>>> +       idx_map = kvmalloc(orig_cnt * sizeof(u32), GFP_KERNEL);
>>> +       if (!idx_map)
>>> +               return ERR_PTR(-ENOMEM);
>>> +       memset(idx_map, 0xff, orig_cnt * sizeof(u32));
>>> +
>>> +       /* Use the same alloc method used when allocating env->insn_aux_data. */
>>> +       new_data = vzalloc(array_size(sizeof(*new_data), fini_cnt));
>>> +       if (!new_data) {
>>> +               kvfree(idx_map);
>>> +               return ERR_PTR(-ENOMEM);
>>> +       }
>>> +
>>> +       /* Copy over insn + calculate idx_map. */
>>> +       for (idx = 0, elem = list; elem; elem = elem->next) {
>>> +               int orig_idx = elem->orig_idx - 1;
>>> +
>>> +               if (orig_idx >= 0) {
>>> +                       idx_map[orig_idx] = idx;
>>> +
>>> +                       if (elem->flag & LIST_INSN_FLAG_REMOVED)
>>> +                               continue;
>>> +
>>> +                       new_data[idx] = env->insn_aux_data[orig_idx];
>>> +
>>> +                       if (elem->flag & LIST_INSN_FLAG_PATCHED)
>>> +                               new_data[idx].zext_dst =
>>> +                                       insn_has_def32(env, &elem->insn);
>>> +               } else {
>>> +                       new_data[idx].seen = true;
>>> +                       new_data[idx].zext_dst = insn_has_def32(env,
>>> +                                                               &elem->insn);
>>> +               }
>>> +               insns[idx++] = elem->insn;
>>> +       }
>>> +
>>> +       new_subinfo = kvzalloc(sizeof(env->subprog_info), GFP_KERNEL);
>>> +       if (!new_subinfo) {
>>> +               kvfree(idx_map);
>>> +               vfree(new_data);
>>> +               return ERR_PTR(-ENOMEM);
>>> +       }
>>> +       memcpy(new_subinfo, env->subprog_info, sizeof(env->subprog_info));
>>> +       memset(env->subprog_info, 0, sizeof(env->subprog_info));
>>> +       env->subprog_cnt = 0;
>>> +       env->prog = prog;
>>> +       ret = add_subprog(env, 0);
>>> +       if (ret < 0) {
>>> +               ret_env = ERR_PTR(ret);
>>> +               goto free_all_ret;
>>> +       }
>>> +       /* Relocate jumps using idx_map.
>>> +        *   old_dst = jmp_insn.old_target + old_pc + 1;
>>> +        *   new_dst = idx_map[old_dst] = jmp_insn.new_target + new_pc + 1;
>>> +        *   jmp_insn.new_target = new_dst - new_pc - 1;
>>> +        */
>>> +       for (idx = 0, elem = list; elem; elem = elem->next) {
>>> +               int orig_idx = elem->orig_idx;
>>> +
>>> +               if (elem->flag & LIST_INSN_FLAG_REMOVED)
>>> +                       continue;
>>> +               if ((elem->flag & LIST_INSN_FLAG_PATCHED) || !orig_idx) {
>>> +                       idx++;
>>> +                       continue;
>>> +               }
>>> +
>>> +               ret = bpf_jit_adj_imm_off(&insns[idx], orig_idx - 1, idx,
>>> +                                         idx_map);
>>> +               if (ret < 0) {
>>> +                       ret_env = ERR_PTR(ret);
>>> +                       goto free_all_ret;
>>> +               }
>>> +               /* Recalculate subprog start as we are at bpf2bpf call insn. */
>>> +               if (ret > 0) {
>>> +                       ret = add_subprog(env, idx + insns[idx].imm + 1);
>>> +                       if (ret < 0) {
>>> +                               ret_env = ERR_PTR(ret);
>>> +                               goto free_all_ret;
>>> +                       }
>>> +               }
>>> +               idx++;
>>> +       }
>>> +       if (ret < 0) {
>>> +               ret_env = ERR_PTR(ret);
>>> +               goto free_all_ret;
>>> +       }
>>> +
>>> +       env->subprog_info[env->subprog_cnt].start = fini_cnt;
>>> +       for (idx = 0; idx <= env->subprog_cnt; idx++)
>>> +               new_subinfo[idx].start = env->subprog_info[idx].start;
>>> +       memcpy(env->subprog_info, new_subinfo, sizeof(env->subprog_info));
>>> +
>>> +       /* Adjust linfo.
>>> +        * FIXME: no support for insn removal at the moment.
>>> +        */
>>> +       if (prog->aux->nr_linfo) {
>>> +               struct bpf_line_info *linfo = prog->aux->linfo;
>>> +               u32 nr_linfo = prog->aux->nr_linfo;
>>> +
>>> +               for (idx = 0; idx < nr_linfo; idx++)
>>> +                       linfo[idx].insn_off = idx_map[linfo[idx].insn_off];
>>> +       }
>>> +       vfree(env->insn_aux_data);
>>> +       env->insn_aux_data = new_data;
>>> +       goto free_mem_list_ret;
>>> +free_all_ret:
>>> +       vfree(new_data);
>>> +free_mem_list_ret:
>>> +       kvfree(new_subinfo);
>>> +       kvfree(idx_map);
>>> +       return ret_env;
>>> +}
>>> +
>>>  static int opt_remove_dead_code(struct bpf_verifier_env *env)
>>>  {
>>>         struct bpf_insn_aux_data *aux_data = env->insn_aux_data;
>>> --
>>> 2.7.4
>>>


^ permalink raw reply

* Re:  Re: Re: linux-next: build failure after merge of the net-next tree
From: Bernard Metzler @ 2019-07-11 12:29 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Stephen Rothwell, Doug Ledford, David Miller,
	Networking, Linux Next Mailing List, Linux Kernel Mailing List
In-Reply-To: <20190711115235.GA25821@mellanox.com>

-----"Jason Gunthorpe" <jgg@mellanox.com> wrote: -----

>To: "Bernard Metzler" <BMT@zurich.ibm.com>
>From: "Jason Gunthorpe" <jgg@mellanox.com>
>Date: 07/11/2019 01:53PM
>Cc: "Leon Romanovsky" <leon@kernel.org>, "Stephen Rothwell"
><sfr@canb.auug.org.au>, "Doug Ledford" <dledford@redhat.com>, "David
>Miller" <davem@davemloft.net>, "Networking" <netdev@vger.kernel.org>,
>"Linux Next Mailing List" <linux-next@vger.kernel.org>, "Linux Kernel
>Mailing List" <linux-kernel@vger.kernel.org>
>Subject: [EXTERNAL] Re: Re: linux-next: build failure after merge of
>the net-next tree
>
>On Thu, Jul 11, 2019 at 08:00:49AM +0000, Bernard Metzler wrote:
>
>> That listen will not sleep. The socket is just marked
>> listening. 
>
>Eh? siw_listen_address() calls siw_cep_alloc() which does:
>
>	struct siw_cep *cep = kzalloc(sizeof(*cep), GFP_KERNEL);
>
>Which is sleeping. Many other cases too.
>
>Jason
>
>
Ah, true! I was after really deep sleeps like user level
socket accept() calls ;) So you are correct of course.

Thanks!
Bernard


^ permalink raw reply

* Re: [PATCH net-next 00/11] Add drop monitor for offloaded data paths
From: Ido Schimmel @ 2019-07-11 12:39 UTC (permalink / raw)
  To: David Miller, nhorman
  Cc: netdev, jiri, mlxsw, dsahern, roopa, nikolay, andy, pablo,
	jakub.kicinski, pieter.jansenvanvuuren, andrew, f.fainelli,
	vivien.didelot, idosch
In-Reply-To: <20190707.124541.451040901050013496.davem@davemloft.net>

On Sun, Jul 07, 2019 at 12:45:41PM -0700, David Miller wrote:
> From: Ido Schimmel <idosch@idosch.org>
> Date: Sun,  7 Jul 2019 10:58:17 +0300
> 
> > Users have several ways to debug the kernel and understand why a packet
> > was dropped. For example, using "drop monitor" and "perf". Both
> > utilities trace kfree_skb(), which is the function called when a packet
> > is freed as part of a failure. The information provided by these tools
> > is invaluable when trying to understand the cause of a packet loss.
> > 
> > In recent years, large portions of the kernel data path were offloaded
> > to capable devices. Today, it is possible to perform L2 and L3
> > forwarding in hardware, as well as tunneling (IP-in-IP and VXLAN).
> > Different TC classifiers and actions are also offloaded to capable
> > devices, at both ingress and egress.
> > 
> > However, when the data path is offloaded it is not possible to achieve
> > the same level of introspection as tools such "perf" and "drop monitor"
> > become irrelevant.
> > 
> > This patchset aims to solve this by allowing users to monitor packets
> > that the underlying device decided to drop along with relevant metadata
> > such as the drop reason and ingress port.
> 
> We are now going to have 5 or so ways to capture packets passing through
> the system, this is nonsense.
> 
> AF_PACKET, kfree_skb drop monitor, perf, XDP perf events, and now this
> devlink thing.
> 
> This is insanity, too many ways to do the same thing and therefore the
> worst possible user experience.
> 
> Pick _ONE_ method to trap packets and forward normal kfree_skb events,
> XDP perf events, and these taps there too.
> 
> I mean really, think about it from the average user's perspective.  To
> see all drops/pkts I have to attach a kfree_skb tracepoint, and not just
> listen on devlink but configure a special tap thing beforehand and then
> if someone is using XDP I gotta setup another perf event buffer capture
> thing too.

Dave,

Before I start working on v2, I would like to get your feedback on the
high level plan. Also adding Neil who is the maintainer of drop_monitor
(and counterpart DropWatch tool [1]).

IIUC, the problem you point out is that users need to use different
tools to monitor packet drops based on where these drops occur
(SW/HW/XDP).

Therefore, my plan is to extend the existing drop_monitor netlink
channel to also cover HW drops. I will add a new message type and a new
multicast group for HW drops and encode in the message what is currently
encoded in the devlink events.

I would like to emphasize that the configuration of whether these
dropped packets are even sent to the CPU from the device still needs to
reside in devlink given this is the go-to tool for device-specific
configuration. In addition, these drop traps are a small subset of the
entire packet traps devices support and all have similar needs such as
HW policer configuration and statistics.

In the future we might also want to report events that indicate the
formation of possible problems. For example, in case packets are queued
above a certain threshold or for long periods of time. I hope we could
re-use drop_monitor for this as well, thereby making it the go-to
channel for diagnosing current and to-be problems in the data path.

Thanks

[1] https://github.com/nhorman/dropwatch

^ permalink raw reply

* RE: [PATCH] tipc: ensure skb->lock is initialised
From: Jon Maloy @ 2019-07-11 12:55 UTC (permalink / raw)
  To: Chris Packham, Eric Dumazet, ying.xue@windriver.com,
	davem@davemloft.net
  Cc: netdev@vger.kernel.org, tipc-discussion@lists.sourceforge.net,
	linux-kernel@vger.kernel.org
In-Reply-To: <4d2ac0ce7f974184ac43b71f19aee7a3@svr-chch-ex1.atlnz.lc>



> -----Original Message-----
> From: Chris Packham <Chris.Packham@alliedtelesis.co.nz>
> Sent: 10-Jul-19 16:58
> To: Jon Maloy <jon.maloy@ericsson.com>; Eric Dumazet
> <eric.dumazet@gmail.com>; ying.xue@windriver.com;
> davem@davemloft.net
> Cc: netdev@vger.kernel.org; tipc-discussion@lists.sourceforge.net; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH] tipc: ensure skb->lock is initialised
> 
> 
> On 11/07/19 1:10 AM, Jon Maloy wrote:
> >> -----Original Message-----
> >> From: Eric Dumazet <eric.dumazet@gmail.com>
> >> Sent: 10-Jul-19 04:00
> >> To: Jon Maloy <jon.maloy@ericsson.com>; Eric Dumazet
> >> <eric.dumazet@gmail.com>; Chris Packham
> >> <Chris.Packham@alliedtelesis.co.nz>; ying.xue@windriver.com;
> >> davem@davemloft.net
> >> Cc: netdev@vger.kernel.org; tipc-discussion@lists.sourceforge.net;
> >> linux- kernel@vger.kernel.org
> >> Subject: Re: [PATCH] tipc: ensure skb->lock is initialised
> >>
> >>
> >>
> >> On 7/9/19 10:15 PM, Jon Maloy wrote:
> >>>
> >>> It is not only for lockdep purposes, -it is essential.  But please
> >>> provide details
> >> about where you see that more fixes are needed.
> >>>
> >>
> >> Simple fact that you detect a problem only when skb_queue_purge() is
> >> called should talk by itself.
> >>
> >> As I stated, there are many places where the list is manipulated
> >> _without_ its spinlock being held.
> >
> > Yes, and that is the way it should be on the send path.
> >
> >>
> >> You want consistency, then
> >>
> >> - grab the spinlock all the time.
> >> - Or do not ever use it.
> >
> > That is exactly what we are doing.
> > - The send path doesn't need the spinlock, and never grabs it.
> > - The receive path does need it, and always grabs it.
> >
> > However, since we don't know from the beginning which path a created
> > message will follow, we initialize the queue spinlock "just in case"
> > when it is created, even though it may never be used later.
> > You can see this as a violation of the principle you are stating
> > above, but it is a prize that is worth paying, given savings in code
> > volume, complexity and performance.
> >
> >>
> >> Do not initialize the spinlock just in case a path will use
> >> skb_queue_purge() (instead of using __skb_queue_purge())
> >
> > I am ok with that. I think we can agree that Chris goes for that
> > solution, so we can get this bug fixed.
> 
> So would you like a v2 with an improved commit message? I note that I said
> skb->lock instead of head->lock in the subject line so at least that should be
> corrected.

Yes, unless Eric has any more objections. 
I addition, I have realized that we can change from skb_queue_head_init()  to __skb_queue_head_init() at all the disputed locations in the socket code.
Then, we do a separate call to spin_lock_init(&list->lock) at the moment we find that the message will follow the receive path, i.e., in tipc_node_xmit().
That should make everybody happy. 
I will post a patch when net-next re-opens.

BR
///jon



^ permalink raw reply

* [PATCH net-next,v2 1/3] net: flow_offload: remove netns parameter from flow_block_cb_alloc()
From: Pablo Neira Ayuso @ 2019-07-11 13:09 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, jiri, jakub.kicinski

No need to annotate the netns on the flow block callback object,
flow_block_cb_is_busy() already checks for used blocks.

Fixes: d63db30c8537 ("net: flow_offload: add flow_block_cb_alloc() and flow_block_cb_free()")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
v2: no changes

 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c    | 3 +--
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c      | 5 ++---
 drivers/net/ethernet/mscc/ocelot_flower.c           | 3 +--
 drivers/net/ethernet/mscc/ocelot_tc.c               | 2 +-
 drivers/net/ethernet/netronome/nfp/flower/offload.c | 6 ++----
 include/net/flow_offload.h                          | 3 +--
 net/core/flow_offload.c                             | 9 +++------
 net/dsa/slave.c                                     | 2 +-
 8 files changed, 12 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 7245d287633d..2162412073c5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -735,8 +735,7 @@ mlx5e_rep_indr_setup_tc_block(struct net_device *netdev,
 		list_add(&indr_priv->list,
 			 &rpriv->uplink_priv.tc_indr_block_priv_list);
 
-		block_cb = flow_block_cb_alloc(f->net,
-					       mlx5e_rep_indr_setup_block_cb,
+		block_cb = flow_block_cb_alloc(mlx5e_rep_indr_setup_block_cb,
 					       indr_priv, indr_priv,
 					       mlx5e_rep_indr_tc_block_unbind);
 		if (IS_ERR(block_cb)) {
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 4d34d42b3b0e..a469035400cf 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -1610,8 +1610,7 @@ mlxsw_sp_setup_tc_block_flower_bind(struct mlxsw_sp_port *mlxsw_sp_port,
 		acl_block = mlxsw_sp_acl_block_create(mlxsw_sp, f->net);
 		if (!acl_block)
 			return -ENOMEM;
-		block_cb = flow_block_cb_alloc(f->net,
-					       mlxsw_sp_setup_tc_block_cb_flower,
+		block_cb = flow_block_cb_alloc(mlxsw_sp_setup_tc_block_cb_flower,
 					       mlxsw_sp, acl_block,
 					       mlxsw_sp_tc_block_flower_release);
 		if (IS_ERR(block_cb)) {
@@ -1702,7 +1701,7 @@ static int mlxsw_sp_setup_tc_block(struct mlxsw_sp_port *mlxsw_sp_port,
 					  &mlxsw_sp_block_cb_list))
 			return -EBUSY;
 
-		block_cb = flow_block_cb_alloc(f->net, cb, mlxsw_sp_port,
+		block_cb = flow_block_cb_alloc(cb, mlxsw_sp_port,
 					       mlxsw_sp_port, NULL);
 		if (IS_ERR(block_cb))
 			return PTR_ERR(block_cb);
diff --git a/drivers/net/ethernet/mscc/ocelot_flower.c b/drivers/net/ethernet/mscc/ocelot_flower.c
index 7aaddc09c185..6a11aea8b186 100644
--- a/drivers/net/ethernet/mscc/ocelot_flower.c
+++ b/drivers/net/ethernet/mscc/ocelot_flower.c
@@ -323,8 +323,7 @@ int ocelot_setup_tc_block_flower_bind(struct ocelot_port *port,
 		if (!port_block)
 			return -ENOMEM;

-		block_cb = flow_block_cb_alloc(f->net,
-					       ocelot_setup_tc_block_cb_flower,
+		block_cb = flow_block_cb_alloc(ocelot_setup_tc_block_cb_flower,
 					       port, port_block,
 					       ocelot_tc_block_unbind);
 		if (IS_ERR(block_cb)) {
diff --git a/drivers/net/ethernet/mscc/ocelot_tc.c b/drivers/net/ethernet/mscc/ocelot_tc.c
index 9e6464ffae5d..abbcb66bf5ac 100644
--- a/drivers/net/ethernet/mscc/ocelot_tc.c
+++ b/drivers/net/ethernet/mscc/ocelot_tc.c
@@ -156,7 +156,7 @@ static int ocelot_setup_tc_block(struct ocelot_port *port,
 		if (flow_block_cb_is_busy(cb, port, &ocelot_block_cb_list))
 			return -EBUSY;
 
-		block_cb = flow_block_cb_alloc(f->net, cb, port, port, NULL);
+		block_cb = flow_block_cb_alloc(cb, port, port, NULL);
 		if (IS_ERR(block_cb))
 			return PTR_ERR(block_cb);
 
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index 7e725fa60347..a0f8892bb4b5 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -1324,8 +1324,7 @@ static int nfp_flower_setup_tc_block(struct net_device *netdev,
 					  &nfp_block_cb_list))
 			return -EBUSY;
 
-		block_cb = flow_block_cb_alloc(f->net,
-					       nfp_flower_setup_tc_block_cb,
+		block_cb = flow_block_cb_alloc(nfp_flower_setup_tc_block_cb,
 					       repr, repr, NULL);
 		if (IS_ERR(block_cb))
 			return PTR_ERR(block_cb);
@@ -1430,8 +1429,7 @@ nfp_flower_setup_indr_tc_block(struct net_device *netdev, struct nfp_app *app,
 		cb_priv->app = app;
 		list_add(&cb_priv->list, &priv->indr_block_cb_priv);
 
-		block_cb = flow_block_cb_alloc(f->net,
-					       nfp_flower_setup_indr_block_cb,
+		block_cb = flow_block_cb_alloc(nfp_flower_setup_indr_block_cb,
 					       cb_priv, cb_priv,
 					       nfp_flower_setup_indr_tc_release);
 		if (IS_ERR(block_cb)) {
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index db337299e81e..aa9b5287b231 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -264,7 +264,6 @@ struct flow_block_offload {
 struct flow_block_cb {
 	struct list_head	driver_list;
 	struct list_head	list;
-	struct net		*net;
 	tc_setup_cb_t		*cb;
 	void			*cb_ident;
 	void			*cb_priv;
@@ -272,7 +271,7 @@ struct flow_block_cb {
 	unsigned int		refcnt;
 };
 
-struct flow_block_cb *flow_block_cb_alloc(struct net *net, tc_setup_cb_t *cb,
+struct flow_block_cb *flow_block_cb_alloc(tc_setup_cb_t *cb,
 					  void *cb_ident, void *cb_priv,
 					  void (*release)(void *cb_priv));
 void flow_block_cb_free(struct flow_block_cb *block_cb);
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index 76f8db3841d7..507de4b48815 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -165,7 +165,7 @@ void flow_rule_match_enc_opts(const struct flow_rule *rule,
 }
 EXPORT_SYMBOL(flow_rule_match_enc_opts);
 
-struct flow_block_cb *flow_block_cb_alloc(struct net *net, tc_setup_cb_t *cb,
+struct flow_block_cb *flow_block_cb_alloc(tc_setup_cb_t *cb,
 					  void *cb_ident, void *cb_priv,
 					  void (*release)(void *cb_priv))
 {
@@ -175,7 +175,6 @@ struct flow_block_cb *flow_block_cb_alloc(struct net *net, tc_setup_cb_t *cb,
 	if (!block_cb)
 		return ERR_PTR(-ENOMEM);
 
-	block_cb->net = net;
 	block_cb->cb = cb;
 	block_cb->cb_ident = cb_ident;
 	block_cb->cb_priv = cb_priv;
@@ -200,8 +199,7 @@ struct flow_block_cb *flow_block_cb_lookup(struct flow_block_offload *f,
 	struct flow_block_cb *block_cb;
 
 	list_for_each_entry(block_cb, f->driver_block_list, driver_list) {
-		if (block_cb->net == f->net &&
-		    block_cb->cb == cb &&
+		if (block_cb->cb == cb &&
 		    block_cb->cb_ident == cb_ident)
 			return block_cb;
 	}
@@ -261,8 +259,7 @@ int flow_block_cb_setup_simple(struct flow_block_offload *f,
 		if (flow_block_cb_is_busy(cb, cb_ident, driver_block_list))
 			return -EBUSY;
 
-		block_cb = flow_block_cb_alloc(f->net, cb, cb_ident,
-					       cb_priv, NULL);
+		block_cb = flow_block_cb_alloc(cb, cb_ident, cb_priv, NULL);
 		if (IS_ERR(block_cb))
 			return PTR_ERR(block_cb);
 
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 614c38ece104..6ca9ec58f881 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -967,7 +967,7 @@ static int dsa_slave_setup_tc_block(struct net_device *dev,
 		if (flow_block_cb_is_busy(cb, dev, &dsa_slave_block_cb_list))
 			return -EBUSY;
 
-		block_cb = flow_block_cb_alloc(f->net, cb, dev, dev, NULL);
+		block_cb = flow_block_cb_alloc(cb, dev, dev, NULL);
 		if (IS_ERR(block_cb))
 			return PTR_ERR(block_cb);
 
-- 
2.11.0


^ permalink raw reply related

* [PATCH net-next,v2 3/3] net: flow_offload: add flow_block structure and use it
From: Pablo Neira Ayuso @ 2019-07-11 13:09 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, jiri, jakub.kicinski
In-Reply-To: <20190711130923.2483-1-pablo@netfilter.org>

This object stores the flow block callbacks that are attached to this
block. This patch restores block sharing.

Fixes: da3eeb904ff4 ("net: flow_offload: add list handling functions")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
v3: add flow_block_init() - Jiri Pirko.

 include/net/flow_offload.h        | 10 ++++++++++
 include/net/netfilter/nf_tables.h |  5 +++--
 include/net/sch_generic.h         |  2 +-
 net/core/flow_offload.c           |  2 +-
 net/netfilter/nf_tables_api.c     |  2 +-
 net/netfilter/nf_tables_offload.c |  5 +++--
 net/sched/cls_api.c               | 10 +++++++---
 7 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index 98bf3af5c84d..8d90d96ace82 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -248,6 +248,10 @@ enum flow_block_binder_type {
 	FLOW_BLOCK_BINDER_TYPE_CLSACT_EGRESS,
 };
 
+struct flow_block {
+	struct list_head cb_list;
+};
+
 struct netlink_ext_ack;
 
 struct flow_block_offload {
@@ -255,6 +259,7 @@ struct flow_block_offload {
 	enum flow_block_binder_type binder_type;
 	bool block_shared;
 	struct net *net;
+	struct flow_block *block;
 	struct list_head cb_list;
 	struct list_head *driver_block_list;
 	struct netlink_ext_ack *extack;
@@ -336,4 +341,9 @@ flow_cls_offload_flow_rule(struct flow_cls_offload *flow_cmd)
 	return flow_cmd->rule;
 }
 
+static inline void flow_block_init(struct flow_block *flow_block)
+{
+	INIT_LIST_HEAD(&flow_block->cb_list);
+}
+
 #endif /* _NET_FLOW_OFFLOAD_H */
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 35dfdd9f69b3..b02d8fa80b95 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -11,6 +11,7 @@
 #include <linux/rhashtable.h>
 #include <net/netfilter/nf_flow_table.h>
 #include <net/netlink.h>
+#include <net/flow_offload.h>
 
 struct module;
 
@@ -951,7 +952,7 @@ struct nft_stats {
  *	@stats: per-cpu chain stats
  *	@chain: the chain
  *	@dev_name: device name that this base chain is attached to (if any)
- *	@cb_list: list of flow block callbacks (for hardware offload)
+ *	@flow: flow block (for hardware offload)
  */
 struct nft_base_chain {
 	struct nf_hook_ops		ops;
@@ -961,7 +962,7 @@ struct nft_base_chain {
 	struct nft_stats __percpu	*stats;
 	struct nft_chain		chain;
 	char 				dev_name[IFNAMSIZ];
-	struct list_head		cb_list;
+	struct flow_block		flow_block;
 };
 
 static inline struct nft_base_chain *nft_base_chain(const struct nft_chain *chain)
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 9482e060483b..6b6b01234dd9 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -399,7 +399,7 @@ struct tcf_block {
 	refcount_t refcnt;
 	struct net *net;
 	struct Qdisc *q;
-	struct list_head cb_list;
+	struct flow_block flow_block;
 	struct list_head owner_list;
 	bool keep_dst;
 	unsigned int offloadcnt; /* Number of oddloaded filters */
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index a800fa78d96c..935c7f81a9ef 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -198,7 +198,7 @@ struct flow_block_cb *flow_block_cb_lookup(struct flow_block_offload *f,
 {
 	struct flow_block_cb *block_cb;
 
-	list_for_each_entry(block_cb, f->driver_block_list, driver_list) {
+	list_for_each_entry(block_cb, &f->block->cb_list, list) {
 		if (block_cb->cb == cb &&
 		    block_cb->cb_ident == cb_ident)
 			return block_cb;
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index ed17a7c29b86..11acfc3ee37a 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1662,7 +1662,7 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
 
 		chain->flags |= NFT_BASE_CHAIN | flags;
 		basechain->policy = NF_ACCEPT;
-		INIT_LIST_HEAD(&basechain->cb_list);
+		flow_block_init(&basechain->flow_block);
 	} else {
 		chain = kzalloc(sizeof(*chain), GFP_KERNEL);
 		if (chain == NULL)
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 2c3302845f67..64f5fd5f240e 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -116,7 +116,7 @@ static int nft_setup_cb_call(struct nft_base_chain *basechain,
 	struct flow_block_cb *block_cb;
 	int err;
 
-	list_for_each_entry(block_cb, &basechain->cb_list, list) {
+	list_for_each_entry(block_cb, &basechain->flow_block.cb_list, list) {
 		err = block_cb->cb(type, type_data, block_cb->cb_priv);
 		if (err < 0)
 			return err;
@@ -154,7 +154,7 @@ static int nft_flow_offload_rule(struct nft_trans *trans,
 static int nft_flow_offload_bind(struct flow_block_offload *bo,
 				 struct nft_base_chain *basechain)
 {
-	list_splice(&bo->cb_list, &basechain->cb_list);
+	list_splice(&bo->cb_list, &basechain->flow_block.cb_list);
 	return 0;
 }
 
@@ -198,6 +198,7 @@ static int nft_flow_offload_chain(struct nft_trans *trans,
 		return -EOPNOTSUPP;
 
 	bo.command = cmd;
+	bo.block = &basechain->flow_block;
 	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
 	bo.extack = &extack;
 	INIT_LIST_HEAD(&bo.cb_list);
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 51fbe6e95a92..3b9720fe1e60 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -691,6 +691,8 @@ static void tc_indr_block_ing_cmd(struct tc_indr_block_dev *indr_dev,
 	if (!indr_dev->block)
 		return;
 
+	bo.block = &indr_dev->block->flow_block;
+
 	indr_block_cb->cb(indr_dev->dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
 			  &bo);
 	tcf_block_setup(indr_dev->block, &bo);
@@ -775,6 +777,7 @@ static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
 		.command	= command,
 		.binder_type	= ei->binder_type,
 		.net		= dev_net(dev),
+		.block		= &block->flow_block,
 		.block_shared	= tcf_block_shared(block),
 		.extack		= extack,
 	};
@@ -810,6 +813,7 @@ static int tcf_block_offload_cmd(struct tcf_block *block,
 	bo.net = dev_net(dev);
 	bo.command = command;
 	bo.binder_type = ei->binder_type;
+	bo.block = &block->flow_block;
 	bo.block_shared = tcf_block_shared(block);
 	bo.extack = extack;
 	INIT_LIST_HEAD(&bo.cb_list);
@@ -987,8 +991,8 @@ static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q,
 		return ERR_PTR(-ENOMEM);
 	}
 	mutex_init(&block->lock);
+	flow_block_init(&block->flow_block);
 	INIT_LIST_HEAD(&block->chain_list);
-	INIT_LIST_HEAD(&block->cb_list);
 	INIT_LIST_HEAD(&block->owner_list);
 	INIT_LIST_HEAD(&block->chain0.filter_chain_list);
 
@@ -1570,7 +1574,7 @@ static int tcf_block_bind(struct tcf_block *block,
 
 		i++;
 	}
-	list_splice(&bo->cb_list, &block->cb_list);
+	list_splice(&bo->cb_list, &block->flow_block.cb_list);
 
 	return 0;
 
@@ -3155,7 +3159,7 @@ int tc_setup_cb_call(struct tcf_block *block, enum tc_setup_type type,
 	if (block->nooffloaddevcnt && err_stop)
 		return -EOPNOTSUPP;
 
-	list_for_each_entry(block_cb, &block->cb_list, list) {
+	list_for_each_entry(block_cb, &block->flow_block.cb_list, list) {
 		err = block_cb->cb(type, type_data, block_cb->cb_priv);
 		if (err) {
 			if (err_stop)
-- 
2.11.0


^ permalink raw reply related

* [PATCH net-next,v2 2/3] net: flow_offload: rename tc_setup_cb_t to flow_setup_cb_t
From: Pablo Neira Ayuso @ 2019-07-11 13:09 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, jiri, jakub.kicinski
In-Reply-To: <20190711130923.2483-1-pablo@netfilter.org>

Rename this type definition and adapt users.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c |  2 +-
 drivers/net/ethernet/mscc/ocelot_tc.c          |  2 +-
 include/net/flow_offload.h                     | 16 ++++++++++------
 include/net/pkt_cls.h                          |  4 ++--
 include/net/sch_generic.h                      |  6 ++----
 net/core/flow_offload.c                        |  9 +++++----
 net/dsa/slave.c                                |  2 +-
 net/sched/cls_api.c                            |  2 +-
 net/sched/cls_bpf.c                            |  2 +-
 net/sched/cls_flower.c                         |  2 +-
 net/sched/cls_matchall.c                       |  2 +-
 net/sched/cls_u32.c                            |  6 +++---
 12 files changed, 29 insertions(+), 26 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index a469035400cf..51cd0b6f1f3e 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -1679,7 +1679,7 @@ static int mlxsw_sp_setup_tc_block(struct mlxsw_sp_port *mlxsw_sp_port,
 				   struct flow_block_offload *f)
 {
 	struct flow_block_cb *block_cb;
-	tc_setup_cb_t *cb;
+	flow_setup_cb_t *cb;
 	bool ingress;
 	int err;
 
diff --git a/drivers/net/ethernet/mscc/ocelot_tc.c b/drivers/net/ethernet/mscc/ocelot_tc.c
index abbcb66bf5ac..fba9512e9ca6 100644
--- a/drivers/net/ethernet/mscc/ocelot_tc.c
+++ b/drivers/net/ethernet/mscc/ocelot_tc.c
@@ -134,7 +134,7 @@ static int ocelot_setup_tc_block(struct ocelot_port *port,
 				 struct flow_block_offload *f)
 {
 	struct flow_block_cb *block_cb;
-	tc_setup_cb_t *cb;
+	flow_setup_cb_t *cb;
 	int err;
 
 	netdev_dbg(port->dev, "tc_block command %d, binder_type %d\n",
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index aa9b5287b231..98bf3af5c84d 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -3,7 +3,6 @@
 
 #include <linux/kernel.h>
 #include <net/flow_dissector.h>
-#include <net/sch_generic.h>
 
 struct flow_match {
 	struct flow_dissector	*dissector;
@@ -261,23 +260,27 @@ struct flow_block_offload {
 	struct netlink_ext_ack *extack;
 };
 
+enum tc_setup_type;
+typedef int flow_setup_cb_t(enum tc_setup_type type, void *type_data,
+			    void *cb_priv);
+
 struct flow_block_cb {
 	struct list_head	driver_list;
 	struct list_head	list;
-	tc_setup_cb_t		*cb;
+	flow_setup_cb_t		*cb;
 	void			*cb_ident;
 	void			*cb_priv;
 	void			(*release)(void *cb_priv);
 	unsigned int		refcnt;
 };
 
-struct flow_block_cb *flow_block_cb_alloc(tc_setup_cb_t *cb,
+struct flow_block_cb *flow_block_cb_alloc(flow_setup_cb_t *cb,
 					  void *cb_ident, void *cb_priv,
 					  void (*release)(void *cb_priv));
 void flow_block_cb_free(struct flow_block_cb *block_cb);
 
 struct flow_block_cb *flow_block_cb_lookup(struct flow_block_offload *offload,
-					   tc_setup_cb_t *cb, void *cb_ident);
+					   flow_setup_cb_t *cb, void *cb_ident);
 
 void *flow_block_cb_priv(struct flow_block_cb *block_cb);
 void flow_block_cb_incref(struct flow_block_cb *block_cb);
@@ -295,11 +298,12 @@ static inline void flow_block_cb_remove(struct flow_block_cb *block_cb,
 	list_move(&block_cb->list, &offload->cb_list);
 }
 
-bool flow_block_cb_is_busy(tc_setup_cb_t *cb, void *cb_ident,
+bool flow_block_cb_is_busy(flow_setup_cb_t *cb, void *cb_ident,
 			   struct list_head *driver_block_list);
 
 int flow_block_cb_setup_simple(struct flow_block_offload *f,
-			       struct list_head *driver_list, tc_setup_cb_t *cb,
+			       struct list_head *driver_list,
+			       flow_setup_cb_t *cb,
 			       void *cb_ident, void *cb_priv, bool ingress_only);
 
 enum flow_cls_command {
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 841faadceb6e..cee651b76a1f 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -126,14 +126,14 @@ static inline struct Qdisc *tcf_block_q(struct tcf_block *block)
 }
 
 static inline
-int tc_setup_cb_block_register(struct tcf_block *block, tc_setup_cb_t *cb,
+int tc_setup_cb_block_register(struct tcf_block *block, flow_setup_cb_t *cb,
 			       void *cb_priv)
 {
 	return 0;
 }
 
 static inline
-void tc_setup_cb_block_unregister(struct tcf_block *block, tc_setup_cb_t *cb,
+void tc_setup_cb_block_unregister(struct tcf_block *block, flow_setup_cb_t *cb,
 				  void *cb_priv)
 {
 }
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 855167bbc372..9482e060483b 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -15,6 +15,7 @@
 #include <linux/mutex.h>
 #include <net/gen_stats.h>
 #include <net/rtnetlink.h>
+#include <net/flow_offload.h>
 
 struct Qdisc_ops;
 struct qdisc_walker;
@@ -22,9 +23,6 @@ struct tcf_walker;
 struct module;
 struct bpf_flow_keys;
 
-typedef int tc_setup_cb_t(enum tc_setup_type type,
-			  void *type_data, void *cb_priv);
-
 typedef int tc_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
 				    enum tc_setup_type type, void *type_data);
 
@@ -313,7 +311,7 @@ struct tcf_proto_ops {
 	void			(*walk)(struct tcf_proto *tp,
 					struct tcf_walker *arg, bool rtnl_held);
 	int			(*reoffload)(struct tcf_proto *tp, bool add,
-					     tc_setup_cb_t *cb, void *cb_priv,
+					     flow_setup_cb_t *cb, void *cb_priv,
 					     struct netlink_ext_ack *extack);
 	void			(*bind_class)(void *, u32, unsigned long);
 	void *			(*tmplt_create)(struct net *net,
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index 507de4b48815..a800fa78d96c 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -165,7 +165,7 @@ void flow_rule_match_enc_opts(const struct flow_rule *rule,
 }
 EXPORT_SYMBOL(flow_rule_match_enc_opts);
 
-struct flow_block_cb *flow_block_cb_alloc(tc_setup_cb_t *cb,
+struct flow_block_cb *flow_block_cb_alloc(flow_setup_cb_t *cb,
 					  void *cb_ident, void *cb_priv,
 					  void (*release)(void *cb_priv))
 {
@@ -194,7 +194,7 @@ void flow_block_cb_free(struct flow_block_cb *block_cb)
 EXPORT_SYMBOL(flow_block_cb_free);
 
 struct flow_block_cb *flow_block_cb_lookup(struct flow_block_offload *f,
-					   tc_setup_cb_t *cb, void *cb_ident)
+					   flow_setup_cb_t *cb, void *cb_ident)
 {
 	struct flow_block_cb *block_cb;
 
@@ -226,7 +226,7 @@ unsigned int flow_block_cb_decref(struct flow_block_cb *block_cb)
 }
 EXPORT_SYMBOL(flow_block_cb_decref);
 
-bool flow_block_cb_is_busy(tc_setup_cb_t *cb, void *cb_ident,
+bool flow_block_cb_is_busy(flow_setup_cb_t *cb, void *cb_ident,
 			   struct list_head *driver_block_list)
 {
 	struct flow_block_cb *block_cb;
@@ -243,7 +243,8 @@ EXPORT_SYMBOL(flow_block_cb_is_busy);
 
 int flow_block_cb_setup_simple(struct flow_block_offload *f,
 			       struct list_head *driver_block_list,
-			       tc_setup_cb_t *cb, void *cb_ident, void *cb_priv,
+			       flow_setup_cb_t *cb,
+			       void *cb_ident, void *cb_priv,
 			       bool ingress_only)
 {
 	struct flow_block_cb *block_cb;
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 6ca9ec58f881..d697a64fb564 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -951,7 +951,7 @@ static int dsa_slave_setup_tc_block(struct net_device *dev,
 				    struct flow_block_offload *f)
 {
 	struct flow_block_cb *block_cb;
-	tc_setup_cb_t *cb;
+	flow_setup_cb_t *cb;
 
 	if (f->binder_type == FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS)
 		cb = dsa_slave_setup_tc_block_cb_ig;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 278014e26aec..51fbe6e95a92 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -1514,7 +1514,7 @@ void tcf_block_put(struct tcf_block *block)
 EXPORT_SYMBOL(tcf_block_put);
 
 static int
-tcf_block_playback_offloads(struct tcf_block *block, tc_setup_cb_t *cb,
+tcf_block_playback_offloads(struct tcf_block *block, flow_setup_cb_t *cb,
 			    void *cb_priv, bool add, bool offload_in_use,
 			    struct netlink_ext_ack *extack)
 {
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 691f71830134..3f7a9c02b70c 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -651,7 +651,7 @@ static void cls_bpf_walk(struct tcf_proto *tp, struct tcf_walker *arg,
 	}
 }
 
-static int cls_bpf_reoffload(struct tcf_proto *tp, bool add, tc_setup_cb_t *cb,
+static int cls_bpf_reoffload(struct tcf_proto *tp, bool add, flow_setup_cb_t *cb,
 			     void *cb_priv, struct netlink_ext_ack *extack)
 {
 	struct cls_bpf_head *head = rtnl_dereference(tp->root);
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 38d6e85693fc..054123742e32 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -1800,7 +1800,7 @@ fl_get_next_hw_filter(struct tcf_proto *tp, struct cls_fl_filter *f, bool add)
 	return NULL;
 }
 
-static int fl_reoffload(struct tcf_proto *tp, bool add, tc_setup_cb_t *cb,
+static int fl_reoffload(struct tcf_proto *tp, bool add, flow_setup_cb_t *cb,
 			void *cb_priv, struct netlink_ext_ack *extack)
 {
 	struct tcf_block *block = tp->chain->block;
diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
index a30d2f8feb32..455ea2793f9b 100644
--- a/net/sched/cls_matchall.c
+++ b/net/sched/cls_matchall.c
@@ -282,7 +282,7 @@ static void mall_walk(struct tcf_proto *tp, struct tcf_walker *arg,
 	arg->count++;
 }
 
-static int mall_reoffload(struct tcf_proto *tp, bool add, tc_setup_cb_t *cb,
+static int mall_reoffload(struct tcf_proto *tp, bool add, flow_setup_cb_t *cb,
 			  void *cb_priv, struct netlink_ext_ack *extack)
 {
 	struct cls_mall_head *head = rtnl_dereference(tp->root);
diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index be9e46c77e8b..8614088edd1b 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -1152,7 +1152,7 @@ static void u32_walk(struct tcf_proto *tp, struct tcf_walker *arg,
 }
 
 static int u32_reoffload_hnode(struct tcf_proto *tp, struct tc_u_hnode *ht,
-			       bool add, tc_setup_cb_t *cb, void *cb_priv,
+			       bool add, flow_setup_cb_t *cb, void *cb_priv,
 			       struct netlink_ext_ack *extack)
 {
 	struct tc_cls_u32_offload cls_u32 = {};
@@ -1172,7 +1172,7 @@ static int u32_reoffload_hnode(struct tcf_proto *tp, struct tc_u_hnode *ht,
 }
 
 static int u32_reoffload_knode(struct tcf_proto *tp, struct tc_u_knode *n,
-			       bool add, tc_setup_cb_t *cb, void *cb_priv,
+			       bool add, flow_setup_cb_t *cb, void *cb_priv,
 			       struct netlink_ext_ack *extack)
 {
 	struct tc_u_hnode *ht = rtnl_dereference(n->ht_down);
@@ -1213,7 +1213,7 @@ static int u32_reoffload_knode(struct tcf_proto *tp, struct tc_u_knode *n,
 	return 0;
 }
 
-static int u32_reoffload(struct tcf_proto *tp, bool add, tc_setup_cb_t *cb,
+static int u32_reoffload(struct tcf_proto *tp, bool add, flow_setup_cb_t *cb,
 			 void *cb_priv, struct netlink_ext_ack *extack)
 {
 	struct tc_u_common *tp_c = tp->data;
-- 
2.11.0


^ permalink raw reply related

* Re: [PATCH net-next 3/3] net: flow_offload: add flow_block structure and use it
From: Pablo Neira Ayuso @ 2019-07-11 13:13 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netfilter-devel, davem, netdev, jakub.kicinski
In-Reply-To: <20190711080609.GF2291@nanopsycho>

On Thu, Jul 11, 2019 at 10:06:09AM +0200, Jiri Pirko wrote:
> Thu, Jul 11, 2019 at 02:12:35AM CEST, pablo@netfilter.org wrote:
> >This object stores the flow block callbacks that are attached to this
> >block. This patch restores block sharing.
> >
> >Fixes: da3eeb904ff4 ("net: flow_offload: add list handling functions")
> >Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> >---
> > include/net/flow_offload.h        |  5 +++++
> > include/net/netfilter/nf_tables.h |  5 +++--
> > include/net/sch_generic.h         |  2 +-
> > net/core/flow_offload.c           |  2 +-
> > net/netfilter/nf_tables_api.c     |  2 +-
> > net/netfilter/nf_tables_offload.c |  5 +++--
> > net/sched/cls_api.c               | 10 +++++++---
> > 7 files changed, 21 insertions(+), 10 deletions(-)
> >
> >diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
> >index 98bf3af5c84d..e50d94736829 100644
> >--- a/include/net/flow_offload.h
> >+++ b/include/net/flow_offload.h
> >@@ -248,6 +248,10 @@ enum flow_block_binder_type {
> > 	FLOW_BLOCK_BINDER_TYPE_CLSACT_EGRESS,
> > };
> > 
> >+struct flow_block {
> >+	struct list_head cb_list;
> >+};
> >+
> > struct netlink_ext_ack;
> > 
> > struct flow_block_offload {
> >@@ -255,6 +259,7 @@ struct flow_block_offload {
> > 	enum flow_block_binder_type binder_type;
> > 	bool block_shared;
> > 	struct net *net;
> >+	struct flow_block *block;
> > 	struct list_head cb_list;
> > 	struct list_head *driver_block_list;
> > 	struct netlink_ext_ack *extack;
> >diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
> >index 35dfdd9f69b3..00658462f89b 100644
> >--- a/include/net/netfilter/nf_tables.h
> >+++ b/include/net/netfilter/nf_tables.h
> >@@ -11,6 +11,7 @@
> > #include <linux/rhashtable.h>
> > #include <net/netfilter/nf_flow_table.h>
> > #include <net/netlink.h>
> >+#include <net/flow_offload.h>
> > 
> > struct module;
> > 
> >@@ -951,7 +952,7 @@ struct nft_stats {
> >  *	@stats: per-cpu chain stats
> >  *	@chain: the chain
> >  *	@dev_name: device name that this base chain is attached to (if any)
> >- *	@cb_list: list of flow block callbacks (for hardware offload)
> >+ *	@block: flow block (for hardware offload)
> >  */
> > struct nft_base_chain {
> > 	struct nf_hook_ops		ops;
> >@@ -961,7 +962,7 @@ struct nft_base_chain {
> > 	struct nft_stats __percpu	*stats;
> > 	struct nft_chain		chain;
> > 	char 				dev_name[IFNAMSIZ];
> >-	struct list_head		cb_list;
> >+	struct flow_block		block;
> > };
> > 
> > static inline struct nft_base_chain *nft_base_chain(const struct nft_chain *chain)
> >diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> >index 9482e060483b..58041cb0ce15 100644
> >--- a/include/net/sch_generic.h
> >+++ b/include/net/sch_generic.h
> >@@ -399,7 +399,7 @@ struct tcf_block {
> > 	refcount_t refcnt;
> > 	struct net *net;
> > 	struct Qdisc *q;
> >-	struct list_head cb_list;
> >+	struct flow_block flow;
> 
> It is not a "flow", that is confusing. It should be named "flow_block".

Done.

> > 	struct list_head owner_list;
> > 	bool keep_dst;
> > 	unsigned int offloadcnt; /* Number of oddloaded filters */
> >diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
> >index a800fa78d96c..935c7f81a9ef 100644
> >--- a/net/core/flow_offload.c
> >+++ b/net/core/flow_offload.c
> >@@ -198,7 +198,7 @@ struct flow_block_cb *flow_block_cb_lookup(struct flow_block_offload *f,
> > {
> > 	struct flow_block_cb *block_cb;
> > 
> >-	list_for_each_entry(block_cb, f->driver_block_list, driver_list) {
> >+	list_for_each_entry(block_cb, &f->block->cb_list, list) {
> 
> Please made struct flow_block *block and argument of cb_lookup instead
> of struct flow_block_offload *f (as it was previously).

I can do so if you insist, this will make this fix larger.

> > 		if (block_cb->cb == cb &&
> > 		    block_cb->cb_ident == cb_ident)
> > 			return block_cb;
> >diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
> >index ed17a7c29b86..c565f146435b 100644
> >--- a/net/netfilter/nf_tables_api.c
> >+++ b/net/netfilter/nf_tables_api.c
> >@@ -1662,7 +1662,7 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
> > 
> > 		chain->flags |= NFT_BASE_CHAIN | flags;
> > 		basechain->policy = NF_ACCEPT;
> >-		INIT_LIST_HEAD(&basechain->cb_list);
> >+		INIT_LIST_HEAD(&basechain->block.cb_list);
> > 	} else {
> > 		chain = kzalloc(sizeof(*chain), GFP_KERNEL);
> > 		if (chain == NULL)
> >diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
> >index 2c3302845f67..2a184277ee58 100644
> >--- a/net/netfilter/nf_tables_offload.c
> >+++ b/net/netfilter/nf_tables_offload.c
> >@@ -116,7 +116,7 @@ static int nft_setup_cb_call(struct nft_base_chain *basechain,
> > 	struct flow_block_cb *block_cb;
> > 	int err;
> > 
> >-	list_for_each_entry(block_cb, &basechain->cb_list, list) {
> >+	list_for_each_entry(block_cb, &basechain->block.cb_list, list) {
> > 		err = block_cb->cb(type, type_data, block_cb->cb_priv);
> > 		if (err < 0)
> > 			return err;
> >@@ -154,7 +154,7 @@ static int nft_flow_offload_rule(struct nft_trans *trans,
> > static int nft_flow_offload_bind(struct flow_block_offload *bo,
> > 				 struct nft_base_chain *basechain)
> > {
> >-	list_splice(&bo->cb_list, &basechain->cb_list);
> >+	list_splice(&bo->cb_list, &basechain->block.cb_list);
> > 	return 0;
> > }
> > 
> >@@ -198,6 +198,7 @@ static int nft_flow_offload_chain(struct nft_trans *trans,
> > 		return -EOPNOTSUPP;
> > 
> > 	bo.command = cmd;
> >+	bo.block = &basechain->block;
> > 	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
> > 	bo.extack = &extack;
> > 	INIT_LIST_HEAD(&bo.cb_list);
> >diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> >index 51fbe6e95a92..66181961ad6f 100644
> >--- a/net/sched/cls_api.c
> >+++ b/net/sched/cls_api.c
> >@@ -691,6 +691,8 @@ static void tc_indr_block_ing_cmd(struct tc_indr_block_dev *indr_dev,
> > 	if (!indr_dev->block)
> > 		return;
> > 
> >+	bo.block = &indr_dev->block->flow;
> >+
> > 	indr_block_cb->cb(indr_dev->dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
> > 			  &bo);
> > 	tcf_block_setup(indr_dev->block, &bo);
> >@@ -775,6 +777,7 @@ static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
> > 		.command	= command,
> > 		.binder_type	= ei->binder_type,
> > 		.net		= dev_net(dev),
> >+		.block		= &block->flow,
> > 		.block_shared	= tcf_block_shared(block),
> > 		.extack		= extack,
> > 	};
> >@@ -810,6 +813,7 @@ static int tcf_block_offload_cmd(struct tcf_block *block,
> > 	bo.net = dev_net(dev);
> > 	bo.command = command;
> > 	bo.binder_type = ei->binder_type;
> >+	bo.block = &block->flow;
> > 	bo.block_shared = tcf_block_shared(block);
> > 	bo.extack = extack;
> > 	INIT_LIST_HEAD(&bo.cb_list);
> >@@ -988,7 +992,7 @@ static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q,
> > 	}
> > 	mutex_init(&block->lock);
> > 	INIT_LIST_HEAD(&block->chain_list);
> >-	INIT_LIST_HEAD(&block->cb_list);
> >+	INIT_LIST_HEAD(&block->flow.cb_list);
> 
> With introduction of struct flow_block, please introduce also a helper
> to init this struct. Does not look right to init it from user codes
> (tc/nft).

Done.

^ permalink raw reply

* Re: [PATCH net-next 3/3] net: flow_offload: add flow_block structure and use it
From: Jiri Pirko @ 2019-07-11 13:18 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, davem, netdev, jakub.kicinski
In-Reply-To: <20190711131300.ojbo5hxzvv6wi44t@salvia>

Thu, Jul 11, 2019 at 03:13:00PM CEST, pablo@netfilter.org wrote:
>On Thu, Jul 11, 2019 at 10:06:09AM +0200, Jiri Pirko wrote:
>> Thu, Jul 11, 2019 at 02:12:35AM CEST, pablo@netfilter.org wrote:
>> >This object stores the flow block callbacks that are attached to this
>> >block. This patch restores block sharing.
>> >
>> >Fixes: da3eeb904ff4 ("net: flow_offload: add list handling functions")
>> >Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
>> >---
>> > include/net/flow_offload.h        |  5 +++++
>> > include/net/netfilter/nf_tables.h |  5 +++--
>> > include/net/sch_generic.h         |  2 +-
>> > net/core/flow_offload.c           |  2 +-
>> > net/netfilter/nf_tables_api.c     |  2 +-
>> > net/netfilter/nf_tables_offload.c |  5 +++--
>> > net/sched/cls_api.c               | 10 +++++++---
>> > 7 files changed, 21 insertions(+), 10 deletions(-)
>> >
>> >diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
>> >index 98bf3af5c84d..e50d94736829 100644
>> >--- a/include/net/flow_offload.h
>> >+++ b/include/net/flow_offload.h
>> >@@ -248,6 +248,10 @@ enum flow_block_binder_type {
>> > 	FLOW_BLOCK_BINDER_TYPE_CLSACT_EGRESS,
>> > };
>> > 
>> >+struct flow_block {
>> >+	struct list_head cb_list;
>> >+};
>> >+
>> > struct netlink_ext_ack;
>> > 
>> > struct flow_block_offload {
>> >@@ -255,6 +259,7 @@ struct flow_block_offload {
>> > 	enum flow_block_binder_type binder_type;
>> > 	bool block_shared;
>> > 	struct net *net;
>> >+	struct flow_block *block;
>> > 	struct list_head cb_list;
>> > 	struct list_head *driver_block_list;
>> > 	struct netlink_ext_ack *extack;
>> >diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
>> >index 35dfdd9f69b3..00658462f89b 100644
>> >--- a/include/net/netfilter/nf_tables.h
>> >+++ b/include/net/netfilter/nf_tables.h
>> >@@ -11,6 +11,7 @@
>> > #include <linux/rhashtable.h>
>> > #include <net/netfilter/nf_flow_table.h>
>> > #include <net/netlink.h>
>> >+#include <net/flow_offload.h>
>> > 
>> > struct module;
>> > 
>> >@@ -951,7 +952,7 @@ struct nft_stats {
>> >  *	@stats: per-cpu chain stats
>> >  *	@chain: the chain
>> >  *	@dev_name: device name that this base chain is attached to (if any)
>> >- *	@cb_list: list of flow block callbacks (for hardware offload)
>> >+ *	@block: flow block (for hardware offload)
>> >  */
>> > struct nft_base_chain {
>> > 	struct nf_hook_ops		ops;
>> >@@ -961,7 +962,7 @@ struct nft_base_chain {
>> > 	struct nft_stats __percpu	*stats;
>> > 	struct nft_chain		chain;
>> > 	char 				dev_name[IFNAMSIZ];
>> >-	struct list_head		cb_list;
>> >+	struct flow_block		block;
>> > };
>> > 
>> > static inline struct nft_base_chain *nft_base_chain(const struct nft_chain *chain)
>> >diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
>> >index 9482e060483b..58041cb0ce15 100644
>> >--- a/include/net/sch_generic.h
>> >+++ b/include/net/sch_generic.h
>> >@@ -399,7 +399,7 @@ struct tcf_block {
>> > 	refcount_t refcnt;
>> > 	struct net *net;
>> > 	struct Qdisc *q;
>> >-	struct list_head cb_list;
>> >+	struct flow_block flow;
>> 
>> It is not a "flow", that is confusing. It should be named "flow_block".
>
>Done.
>
>> > 	struct list_head owner_list;
>> > 	bool keep_dst;
>> > 	unsigned int offloadcnt; /* Number of oddloaded filters */
>> >diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
>> >index a800fa78d96c..935c7f81a9ef 100644
>> >--- a/net/core/flow_offload.c
>> >+++ b/net/core/flow_offload.c
>> >@@ -198,7 +198,7 @@ struct flow_block_cb *flow_block_cb_lookup(struct flow_block_offload *f,
>> > {
>> > 	struct flow_block_cb *block_cb;
>> > 
>> >-	list_for_each_entry(block_cb, f->driver_block_list, driver_list) {
>> >+	list_for_each_entry(block_cb, &f->block->cb_list, list) {
>> 
>> Please made struct flow_block *block and argument of cb_lookup instead
>> of struct flow_block_offload *f (as it was previously).
>
>I can do so if you insist, this will make this fix larger.

Yes please. Thanks!


>
>> > 		if (block_cb->cb == cb &&
>> > 		    block_cb->cb_ident == cb_ident)
>> > 			return block_cb;
>> >diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
>> >index ed17a7c29b86..c565f146435b 100644
>> >--- a/net/netfilter/nf_tables_api.c
>> >+++ b/net/netfilter/nf_tables_api.c
>> >@@ -1662,7 +1662,7 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
>> > 
>> > 		chain->flags |= NFT_BASE_CHAIN | flags;
>> > 		basechain->policy = NF_ACCEPT;
>> >-		INIT_LIST_HEAD(&basechain->cb_list);
>> >+		INIT_LIST_HEAD(&basechain->block.cb_list);
>> > 	} else {
>> > 		chain = kzalloc(sizeof(*chain), GFP_KERNEL);
>> > 		if (chain == NULL)
>> >diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
>> >index 2c3302845f67..2a184277ee58 100644
>> >--- a/net/netfilter/nf_tables_offload.c
>> >+++ b/net/netfilter/nf_tables_offload.c
>> >@@ -116,7 +116,7 @@ static int nft_setup_cb_call(struct nft_base_chain *basechain,
>> > 	struct flow_block_cb *block_cb;
>> > 	int err;
>> > 
>> >-	list_for_each_entry(block_cb, &basechain->cb_list, list) {
>> >+	list_for_each_entry(block_cb, &basechain->block.cb_list, list) {
>> > 		err = block_cb->cb(type, type_data, block_cb->cb_priv);
>> > 		if (err < 0)
>> > 			return err;
>> >@@ -154,7 +154,7 @@ static int nft_flow_offload_rule(struct nft_trans *trans,
>> > static int nft_flow_offload_bind(struct flow_block_offload *bo,
>> > 				 struct nft_base_chain *basechain)
>> > {
>> >-	list_splice(&bo->cb_list, &basechain->cb_list);
>> >+	list_splice(&bo->cb_list, &basechain->block.cb_list);
>> > 	return 0;
>> > }
>> > 
>> >@@ -198,6 +198,7 @@ static int nft_flow_offload_chain(struct nft_trans *trans,
>> > 		return -EOPNOTSUPP;
>> > 
>> > 	bo.command = cmd;
>> >+	bo.block = &basechain->block;
>> > 	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
>> > 	bo.extack = &extack;
>> > 	INIT_LIST_HEAD(&bo.cb_list);
>> >diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>> >index 51fbe6e95a92..66181961ad6f 100644
>> >--- a/net/sched/cls_api.c
>> >+++ b/net/sched/cls_api.c
>> >@@ -691,6 +691,8 @@ static void tc_indr_block_ing_cmd(struct tc_indr_block_dev *indr_dev,
>> > 	if (!indr_dev->block)
>> > 		return;
>> > 
>> >+	bo.block = &indr_dev->block->flow;
>> >+
>> > 	indr_block_cb->cb(indr_dev->dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
>> > 			  &bo);
>> > 	tcf_block_setup(indr_dev->block, &bo);
>> >@@ -775,6 +777,7 @@ static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
>> > 		.command	= command,
>> > 		.binder_type	= ei->binder_type,
>> > 		.net		= dev_net(dev),
>> >+		.block		= &block->flow,
>> > 		.block_shared	= tcf_block_shared(block),
>> > 		.extack		= extack,
>> > 	};
>> >@@ -810,6 +813,7 @@ static int tcf_block_offload_cmd(struct tcf_block *block,
>> > 	bo.net = dev_net(dev);
>> > 	bo.command = command;
>> > 	bo.binder_type = ei->binder_type;
>> >+	bo.block = &block->flow;
>> > 	bo.block_shared = tcf_block_shared(block);
>> > 	bo.extack = extack;
>> > 	INIT_LIST_HEAD(&bo.cb_list);
>> >@@ -988,7 +992,7 @@ static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q,
>> > 	}
>> > 	mutex_init(&block->lock);
>> > 	INIT_LIST_HEAD(&block->chain_list);
>> >-	INIT_LIST_HEAD(&block->cb_list);
>> >+	INIT_LIST_HEAD(&block->flow.cb_list);
>> 
>> With introduction of struct flow_block, please introduce also a helper
>> to init this struct. Does not look right to init it from user codes
>> (tc/nft).
>
>Done.

^ permalink raw reply

* Re: [PATCH net-next,v2 3/3] net: flow_offload: add flow_block structure and use it
From: Jiri Pirko @ 2019-07-11 13:23 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, davem, netdev, jakub.kicinski
In-Reply-To: <20190711130923.2483-3-pablo@netfilter.org>

Thu, Jul 11, 2019 at 03:09:23PM CEST, pablo@netfilter.org wrote:
>This object stores the flow block callbacks that are attached to this
>block. This patch restores block sharing.
>
>Fixes: da3eeb904ff4 ("net: flow_offload: add list handling functions")
>Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
>---
>v3: add flow_block_init() - Jiri Pirko.

and rename flow/flow_block/

[...]


>@@ -951,7 +952,7 @@ struct nft_stats {
>  *	@stats: per-cpu chain stats
>  *	@chain: the chain
>  *	@dev_name: device name that this base chain is attached to (if any)
>- *	@cb_list: list of flow block callbacks (for hardware offload)
>+ *	@flow: flow block (for hardware offload)

You missed rename here: s/flow:/flow_block:/


>  */
> struct nft_base_chain {
> 	struct nf_hook_ops		ops;
>@@ -961,7 +962,7 @@ struct nft_base_chain {
> 	struct nft_stats __percpu	*stats;
> 	struct nft_chain		chain;
> 	char 				dev_name[IFNAMSIZ];
>-	struct list_head		cb_list;
>+	struct flow_block		flow_block;
> };
> 
> static inline struct nft_base_chain *nft_base_chain(const struct nft_chain *chain)
>diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
>index 9482e060483b..6b6b01234dd9 100644
>--- a/include/net/sch_generic.h
>+++ b/include/net/sch_generic.h
>@@ -399,7 +399,7 @@ struct tcf_block {
> 	refcount_t refcnt;
> 	struct net *net;
> 	struct Qdisc *q;
>-	struct list_head cb_list;
>+	struct flow_block flow_block;
> 	struct list_head owner_list;
> 	bool keep_dst;
> 	unsigned int offloadcnt; /* Number of oddloaded filters */
>diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
>index a800fa78d96c..935c7f81a9ef 100644
>--- a/net/core/flow_offload.c
>+++ b/net/core/flow_offload.c
>@@ -198,7 +198,7 @@ struct flow_block_cb *flow_block_cb_lookup(struct flow_block_offload *f,

Reminding the block arg here.


> {
> 	struct flow_block_cb *block_cb;
> 
>-	list_for_each_entry(block_cb, f->driver_block_list, driver_list) {
>+	list_for_each_entry(block_cb, &f->block->cb_list, list) {
> 		if (block_cb->cb == cb &&
> 		    block_cb->cb_ident == cb_ident)
> 			return block_cb;

[...]


^ permalink raw reply

* Re: [PATCH v3 0/3] kernel/notifier.c: avoid duplicate registration
From: Vasily Averin @ 2019-07-11 13:57 UTC (permalink / raw)
  To: Nixiaoming, adobriyan@gmail.com, akpm@linux-foundation.org,
	anna.schumaker@netapp.com, arjan@linux.intel.com,
	bfields@fieldses.org, chuck.lever@oracle.com, davem@davemloft.net,
	gregkh@linuxfoundation.org, jlayton@kernel.org, luto@kernel.org,
	mingo@kernel.org, Nadia.Derbey@bull.net,
	paulmck@linux.vnet.ibm.com, semen.protsenko@linaro.org,
	stable@kernel.org, stern@rowland.harvard.edu, tglx@linutronix.de,
	torvalds@linux-foundation.org, trond.myklebust@hammerspace.com,
	viresh.kumar@linaro.org
  Cc: Huangjianhui (Alex), Dailei, linux-kernel@vger.kernel.org,
	linux-nfs@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <E490CD805F7529488761C40FD9D26EF12AC9D068@dggemm507-mbx.china.huawei.com>

On 7/11/19 4:55 AM, Nixiaoming wrote:
> On Wed, July 10, 2019 1:49 PM Vasily Averin wrote:
>> On 7/10/19 6:09 AM, Xiaoming Ni wrote:
>>> Registering the same notifier to a hook repeatedly can cause the hook
>>> list to form a ring or lose other members of the list.
>>
>> I think is not enough to _prevent_ 2nd register attempt,
>> it's enough to detect just attempt and generate warning to mark host in bad state.
>>
> 
> Duplicate registration is prevented in my patch, not just "mark host in bad state"
> 
> Duplicate registration is checked and exited in notifier_chain_cond_register()
> 
> Duplicate registration was checked in notifier_chain_register() but only 
> the alarm was triggered without exiting. added by commit 831246570d34692e 
> ("kernel/notifier.c: double register detection")
> 
> My patch is like a combination of 831246570d34692e and notifier_chain_cond_register(),
>  which triggers an alarm and exits when a duplicate registration is detected.
> 
>> Unexpected 2nd register of the same hook most likely will lead to 2nd unregister,
>> and it can lead to host crash in any time: 
>> you can unregister notifier on first attempt it can be too early, it can be still in use.
>> on the other hand you can never call 2nd unregister at all.
> 
> Since the member was not added to the linked list at the time of the second registration, 
> no linked list ring was formed. 
> The member is released on the first unregistration and -ENOENT on the second unregistration.
> After patching, the fault has been alleviated

You are wrong here.
2nd notifier's registration is a pure bug, this should never happen.
If you know the way to reproduce this situation -- you need to fix it. 

2nd registration can happen in 2 cases:
1) missed rollback, when someone forget to call unregister after successfull registration, 
and then tried to call register again. It can lead to crash for example when according module will be unloaded.
2) some subsystem is registered twice, for example from  different namespaces.
in this case unregister called during sybsystem cleanup in first namespace will incorrectly remove notifier used 
in second namespace, it also can lead to unexpacted behaviour.

> It may be more helpful to return an error code when someone tries to register the same
> notification program a second time.

You are wrong again here, it is senseless.
If you have detected 2nd register -- your node is already in bad state.

> But I noticed that notifier_chain_cond_register() returns 0 when duplicate registration 
> is detected. At the same time, in all the existing export function comments of notify,
> "Currently always returns zero"
> 
> I am a bit confused: which is better?
> 
>>
>> Unfortunately I do not see any ways to handle such cases properly,
>> and it seems for me your patches does not resolve this problem.
>>
>> Am I missed something probably?
>>
>>> case1: An infinite loop in notifier_chain_register() can cause soft lockup
>>>         atomic_notifier_chain_register(&test_notifier_list, &test1);
>>>         atomic_notifier_chain_register(&test_notifier_list, &test1);
>>>         atomic_notifier_chain_register(&test_notifier_list, &test2);
> 
> Thanks
> 
> Xiaoming Ni
> 

^ permalink raw reply

* [PATCH] libertas: add terminating entry to fw_table
From: Oliver Neukum @ 2019-07-11 14:27 UTC (permalink / raw)
  To: davem, netdev; +Cc: Oliver Neukum

In case no firmware was found, the system would happily read
and try to load garbage. Terminate the table properly.

Signed-off-by: Oliver Neukum <oneukum@suse.com>
Fixes: ce84bb69f50e6 ("libertas USB: convert to asynchronous firmware loading")
Reported-by: syzbot+8a8f48672560c8ca59dd@syzkaller.appspotmail.com
---
 drivers/net/wireless/marvell/libertas/if_usb.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/marvell/libertas/if_usb.c b/drivers/net/wireless/marvell/libertas/if_usb.c
index f1622f0ff8c9..b79c65547f4c 100644
--- a/drivers/net/wireless/marvell/libertas/if_usb.c
+++ b/drivers/net/wireless/marvell/libertas/if_usb.c
@@ -50,7 +50,10 @@ static const struct lbs_fw_table fw_table[] = {
 	{ MODEL_8388, "libertas/usb8388_v5.bin", NULL },
 	{ MODEL_8388, "libertas/usb8388.bin", NULL },
 	{ MODEL_8388, "usb8388.bin", NULL },
-	{ MODEL_8682, "libertas/usb8682.bin", NULL }
+	{ MODEL_8682, "libertas/usb8682.bin", NULL },
+
+	/*terminating entry - keep at end */
+	{ MODEL_8388, NULL, NULL }
 };
 
 static const struct usb_device_id if_usb_table[] = {
-- 
2.16.4


^ permalink raw reply related

* [PATCH v4 bpf-next 0/4] selftests/bpf: fix compiling loop{1,2,3}.c on s390
From: Ilya Leoshkevich @ 2019-07-11 14:29 UTC (permalink / raw)
  To: bpf, netdev; +Cc: ys114321, daniel, sdf, davem, ast, Ilya Leoshkevich

Use PT_REGS_RC(ctx) instead of ctx->rax, which is not present on s390.

This patch series consists of three preparatory commits, which make it
possible to use PT_REGS_RC in BPF selftests, followed by the actual fix.

> > Will this also work for 32-bit x86?
> Thanks, this is a good catch: this builds, but makes 64-bit accesses, as
> if it used the 64-bit variant of pt_regs. I will fix this.
I found four problems in this area:

1. Selftest tracing progs are built with -target bpf, leading to struct
   pt_regs and friends being interpreted incorrectly.
2. When the Makefile is adjusted to build them without -target bpf, it
   still lacks -m32/-m64, leading to a similar issue.
3. There is no __i386__ define, leading to incorrect userspace struct
   pt_regs variant being chosen for x86.
4. Finally, there is an issue in my patch: when 1-3 are fixed, it fails
   to build, since i386 defines yet another set of field names.

I will send fixes for problems 1-3 separately, I believe for this patch
series to be correct, it's enough to fix #4 (which I did by adding
another #ifdef).

I've also changed ARCH to SRCARCH in patch #1, since while ARCH can be
e.g. "i386", SRCARCH always corresponds to directory names under arch/.

v1->v2: Split into multiple patches.
v2->v3: Added arm64 support.
v3->v4: Added i386 support, use SRCARCH instead of ARCH.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>

^ permalink raw reply

* [PATCH v4 bpf-next 1/4] selftests/bpf: compile progs with -D__TARGET_ARCH_$(SRCARCH)
From: Ilya Leoshkevich @ 2019-07-11 14:29 UTC (permalink / raw)
  To: bpf, netdev; +Cc: ys114321, daniel, sdf, davem, ast, Ilya Leoshkevich
In-Reply-To: <20190711142930.68809-1-iii@linux.ibm.com>

This opens up the possibility of accessing registers in an
arch-independent way.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/Makefile | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 2620406a53ec..ad84450e4ab8 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -1,4 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
+include ../../../scripts/Makefile.arch
 
 LIBDIR := ../../../lib
 BPFDIR := $(LIBDIR)/bpf
@@ -138,7 +139,8 @@ CLANG_SYS_INCLUDES := $(shell $(CLANG) -v -E - </dev/null 2>&1 \
 
 CLANG_FLAGS = -I. -I./include/uapi -I../../../include/uapi \
 	      $(CLANG_SYS_INCLUDES) \
-	      -Wno-compare-distinct-pointer-types
+	      -Wno-compare-distinct-pointer-types \
+	      -D__TARGET_ARCH_$(SRCARCH)
 
 $(OUTPUT)/test_l4lb_noinline.o: CLANG_FLAGS += -fno-inline
 $(OUTPUT)/test_xdp_noinline.o: CLANG_FLAGS += -fno-inline
-- 
2.21.0


^ permalink raw reply related

* [PATCH v4 bpf-next 2/4] selftests/bpf: fix s930 -> s390 typo
From: Ilya Leoshkevich @ 2019-07-11 14:29 UTC (permalink / raw)
  To: bpf, netdev; +Cc: ys114321, daniel, sdf, davem, ast, Ilya Leoshkevich
In-Reply-To: <20190711142930.68809-1-iii@linux.ibm.com>

Also check for __s390__ instead of __s390x__, just in case bpf_helpers.h
is ever used by 32-bit userspace.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/bpf_helpers.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
index 5a3d92c8bec8..73071a94769a 100644
--- a/tools/testing/selftests/bpf/bpf_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_helpers.h
@@ -315,8 +315,8 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 #if defined(__TARGET_ARCH_x86)
 	#define bpf_target_x86
 	#define bpf_target_defined
-#elif defined(__TARGET_ARCH_s930x)
-	#define bpf_target_s930x
+#elif defined(__TARGET_ARCH_s390)
+	#define bpf_target_s390
 	#define bpf_target_defined
 #elif defined(__TARGET_ARCH_arm)
 	#define bpf_target_arm
@@ -341,8 +341,8 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 #ifndef bpf_target_defined
 #if defined(__x86_64__)
 	#define bpf_target_x86
-#elif defined(__s390x__)
-	#define bpf_target_s930x
+#elif defined(__s390__)
+	#define bpf_target_s390
 #elif defined(__arm__)
 	#define bpf_target_arm
 #elif defined(__aarch64__)
@@ -369,7 +369,7 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->ip)
 
-#elif defined(bpf_target_s390x)
+#elif defined(bpf_target_s390)
 
 #define PT_REGS_PARM1(x) ((x)->gprs[2])
 #define PT_REGS_PARM2(x) ((x)->gprs[3])
-- 
2.21.0


^ permalink raw reply related

* [PATCH v4 bpf-next 3/4] selftests/bpf: make PT_REGS_* work in userspace
From: Ilya Leoshkevich @ 2019-07-11 14:29 UTC (permalink / raw)
  To: bpf, netdev; +Cc: ys114321, daniel, sdf, davem, ast, Ilya Leoshkevich
In-Reply-To: <20190711142930.68809-1-iii@linux.ibm.com>

Right now, on certain architectures, these macros are usable only with
kernel headers. This patch makes it possible to use them with userspace
headers and, as a consequence, not only in BPF samples, but also in BPF
selftests.

On s390, provide the forward declaration of struct pt_regs and cast it
to user_pt_regs in PT_REGS_* macros. This is necessary, because instead
of the full struct pt_regs, s390 exposes only its first member
user_pt_regs to userspace, and bpf_helpers.h is used with both userspace
(in selftests) and kernel (in samples) headers. It was added in commit
466698e654e8 ("s390/bpf: correct broken uapi for
BPF_PROG_TYPE_PERF_EVENT program type").

Ditto on arm64.

On x86, provide userspace versions of PT_REGS_* macros. Unlike s390 and
arm64, x86 provides struct pt_regs to both userspace and kernel, however,
with different member names.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/bpf_helpers.h | 75 +++++++++++++++++------
 1 file changed, 55 insertions(+), 20 deletions(-)

diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
index 73071a94769a..27090d94afb6 100644
--- a/tools/testing/selftests/bpf/bpf_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_helpers.h
@@ -358,6 +358,7 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 
 #if defined(bpf_target_x86)
 
+#ifdef __KERNEL__
 #define PT_REGS_PARM1(x) ((x)->di)
 #define PT_REGS_PARM2(x) ((x)->si)
 #define PT_REGS_PARM3(x) ((x)->dx)
@@ -368,19 +369,49 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 #define PT_REGS_RC(x) ((x)->ax)
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->ip)
+#else
+#ifdef __i386__
+/* i386 kernel is built with -mregparm=3 */
+#define PT_REGS_PARM1(x) ((x)->eax)
+#define PT_REGS_PARM2(x) ((x)->edx)
+#define PT_REGS_PARM3(x) ((x)->ecx)
+#define PT_REGS_PARM4(x) 0
+#define PT_REGS_PARM5(x) 0
+#define PT_REGS_RET(x) ((x)->esp)
+#define PT_REGS_FP(x) ((x)->ebp)
+#define PT_REGS_RC(x) ((x)->eax)
+#define PT_REGS_SP(x) ((x)->esp)
+#define PT_REGS_IP(x) ((x)->eip)
+#else
+#define PT_REGS_PARM1(x) ((x)->rdi)
+#define PT_REGS_PARM2(x) ((x)->rsi)
+#define PT_REGS_PARM3(x) ((x)->rdx)
+#define PT_REGS_PARM4(x) ((x)->rcx)
+#define PT_REGS_PARM5(x) ((x)->r8)
+#define PT_REGS_RET(x) ((x)->rsp)
+#define PT_REGS_FP(x) ((x)->rbp)
+#define PT_REGS_RC(x) ((x)->rax)
+#define PT_REGS_SP(x) ((x)->rsp)
+#define PT_REGS_IP(x) ((x)->rip)
+#endif
+#endif
 
 #elif defined(bpf_target_s390)
 
-#define PT_REGS_PARM1(x) ((x)->gprs[2])
-#define PT_REGS_PARM2(x) ((x)->gprs[3])
-#define PT_REGS_PARM3(x) ((x)->gprs[4])
-#define PT_REGS_PARM4(x) ((x)->gprs[5])
-#define PT_REGS_PARM5(x) ((x)->gprs[6])
-#define PT_REGS_RET(x) ((x)->gprs[14])
-#define PT_REGS_FP(x) ((x)->gprs[11]) /* Works only with CONFIG_FRAME_POINTER */
-#define PT_REGS_RC(x) ((x)->gprs[2])
-#define PT_REGS_SP(x) ((x)->gprs[15])
-#define PT_REGS_IP(x) ((x)->psw.addr)
+/* s390 provides user_pt_regs instead of struct pt_regs to userspace */
+struct pt_regs;
+#define PT_REGS_S390 const volatile user_pt_regs
+#define PT_REGS_PARM1(x) (((PT_REGS_S390 *)(x))->gprs[2])
+#define PT_REGS_PARM2(x) (((PT_REGS_S390 *)(x))->gprs[3])
+#define PT_REGS_PARM3(x) (((PT_REGS_S390 *)(x))->gprs[4])
+#define PT_REGS_PARM4(x) (((PT_REGS_S390 *)(x))->gprs[5])
+#define PT_REGS_PARM5(x) (((PT_REGS_S390 *)(x))->gprs[6])
+#define PT_REGS_RET(x) (((PT_REGS_S390 *)(x))->gprs[14])
+/* Works only with CONFIG_FRAME_POINTER */
+#define PT_REGS_FP(x) (((PT_REGS_S390 *)(x))->gprs[11])
+#define PT_REGS_RC(x) (((PT_REGS_S390 *)(x))->gprs[2])
+#define PT_REGS_SP(x) (((PT_REGS_S390 *)(x))->gprs[15])
+#define PT_REGS_IP(x) (((PT_REGS_S390 *)(x))->psw.addr)
 
 #elif defined(bpf_target_arm)
 
@@ -397,16 +428,20 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 
 #elif defined(bpf_target_arm64)
 
-#define PT_REGS_PARM1(x) ((x)->regs[0])
-#define PT_REGS_PARM2(x) ((x)->regs[1])
-#define PT_REGS_PARM3(x) ((x)->regs[2])
-#define PT_REGS_PARM4(x) ((x)->regs[3])
-#define PT_REGS_PARM5(x) ((x)->regs[4])
-#define PT_REGS_RET(x) ((x)->regs[30])
-#define PT_REGS_FP(x) ((x)->regs[29]) /* Works only with CONFIG_FRAME_POINTER */
-#define PT_REGS_RC(x) ((x)->regs[0])
-#define PT_REGS_SP(x) ((x)->sp)
-#define PT_REGS_IP(x) ((x)->pc)
+/* arm64 provides struct user_pt_regs instead of struct pt_regs to userspace */
+struct pt_regs;
+#define PT_REGS_ARM64 const volatile struct user_pt_regs
+#define PT_REGS_PARM1(x) (((PT_REGS_ARM64 *)(x))->regs[0])
+#define PT_REGS_PARM2(x) (((PT_REGS_ARM64 *)(x))->regs[1])
+#define PT_REGS_PARM3(x) (((PT_REGS_ARM64 *)(x))->regs[2])
+#define PT_REGS_PARM4(x) (((PT_REGS_ARM64 *)(x))->regs[3])
+#define PT_REGS_PARM5(x) (((PT_REGS_ARM64 *)(x))->regs[4])
+#define PT_REGS_RET(x) (((PT_REGS_ARM64 *)(x))->regs[30])
+/* Works only with CONFIG_FRAME_POINTER */
+#define PT_REGS_FP(x) (((PT_REGS_ARM64 *)(x))->regs[29])
+#define PT_REGS_RC(x) (((PT_REGS_ARM64 *)(x))->regs[0])
+#define PT_REGS_SP(x) (((PT_REGS_ARM64 *)(x))->sp)
+#define PT_REGS_IP(x) (((PT_REGS_ARM64 *)(x))->pc)
 
 #elif defined(bpf_target_mips)
 
-- 
2.21.0


^ permalink raw reply related

* [PATCH v4 bpf-next 4/4] selftests/bpf: fix compiling loop{1,2,3}.c on s390
From: Ilya Leoshkevich @ 2019-07-11 14:29 UTC (permalink / raw)
  To: bpf, netdev; +Cc: ys114321, daniel, sdf, davem, ast, Ilya Leoshkevich
In-Reply-To: <20190711142930.68809-1-iii@linux.ibm.com>

Use PT_REGS_RC(ctx) instead of ctx->rax, which is not present on s390.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/progs/loop1.c | 2 +-
 tools/testing/selftests/bpf/progs/loop2.c | 2 +-
 tools/testing/selftests/bpf/progs/loop3.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/loop1.c b/tools/testing/selftests/bpf/progs/loop1.c
index dea395af9ea9..7cdb7f878310 100644
--- a/tools/testing/selftests/bpf/progs/loop1.c
+++ b/tools/testing/selftests/bpf/progs/loop1.c
@@ -18,7 +18,7 @@ int nested_loops(volatile struct pt_regs* ctx)
 	for (j = 0; j < 300; j++)
 		for (i = 0; i < j; i++) {
 			if (j & 1)
-				m = ctx->rax;
+				m = PT_REGS_RC(ctx);
 			else
 				m = j;
 			sum += i * m;
diff --git a/tools/testing/selftests/bpf/progs/loop2.c b/tools/testing/selftests/bpf/progs/loop2.c
index 0637bd8e8bcf..9b2f808a2863 100644
--- a/tools/testing/selftests/bpf/progs/loop2.c
+++ b/tools/testing/selftests/bpf/progs/loop2.c
@@ -16,7 +16,7 @@ int while_true(volatile struct pt_regs* ctx)
 	int i = 0;
 
 	while (true) {
-		if (ctx->rax & 1)
+		if (PT_REGS_RC(ctx) & 1)
 			i += 3;
 		else
 			i += 7;
diff --git a/tools/testing/selftests/bpf/progs/loop3.c b/tools/testing/selftests/bpf/progs/loop3.c
index 30a0f6cba080..d727657d51e2 100644
--- a/tools/testing/selftests/bpf/progs/loop3.c
+++ b/tools/testing/selftests/bpf/progs/loop3.c
@@ -16,7 +16,7 @@ int while_true(volatile struct pt_regs* ctx)
 	__u64 i = 0, sum = 0;
 	do {
 		i++;
-		sum += ctx->rax;
+		sum += PT_REGS_RC(ctx);
 	} while (i < 0x100000000ULL);
 	return sum;
 }
-- 
2.21.0


^ permalink raw reply related

* Re: Re: Re: linux-next: build failure after merge of the net-next tree
From: Jason Gunthorpe @ 2019-07-11 14:33 UTC (permalink / raw)
  To: Bernard Metzler
  Cc: Leon Romanovsky, Stephen Rothwell, Doug Ledford, David Miller,
	Networking, Linux Next Mailing List, Linux Kernel Mailing List
In-Reply-To: <OF9A485648.9C7A28A3-ON00258434.00449B07-00258434.00449B14@notes.na.collabserv.com>

On Thu, Jul 11, 2019 at 12:29:21PM +0000, Bernard Metzler wrote:
> 
> >To: "Bernard Metzler" <BMT@zurich.ibm.com>
> >From: "Jason Gunthorpe" <jgg@mellanox.com>
> >Date: 07/11/2019 01:53PM
> >Cc: "Leon Romanovsky" <leon@kernel.org>, "Stephen Rothwell"
> ><sfr@canb.auug.org.au>, "Doug Ledford" <dledford@redhat.com>, "David
> >Miller" <davem@davemloft.net>, "Networking" <netdev@vger.kernel.org>,
> >"Linux Next Mailing List" <linux-next@vger.kernel.org>, "Linux Kernel
> >Mailing List" <linux-kernel@vger.kernel.org>
> >Subject: [EXTERNAL] Re: Re: linux-next: build failure after merge of
> >the net-next tree
> >
> >On Thu, Jul 11, 2019 at 08:00:49AM +0000, Bernard Metzler wrote:
> >
> >> That listen will not sleep. The socket is just marked
> >> listening. 
> >
> >Eh? siw_listen_address() calls siw_cep_alloc() which does:
> >
> >	struct siw_cep *cep = kzalloc(sizeof(*cep), GFP_KERNEL);
> >
> >Which is sleeping. Many other cases too.
> >
> >Jason
> >
> >
> Ah, true! I was after really deep sleeps like user level
> socket accept() calls ;) So you are correct of course.

I've added this patch to the rdma tree to fix the missing locking.

The merge resolution will be simply swapping
for_ifa to in_dev_for_each_ifa_rtnl.

Jason

From c421651fa2295d1219c36674c7eb8c574542ceea Mon Sep 17 00:00:00 2001
From: Jason Gunthorpe <jgg@mellanox.com>
Date: Thu, 11 Jul 2019 11:29:42 -0300
Subject: [PATCH] RDMA/siw: Add missing rtnl_lock around access to ifa

ifa is protected by rcu or rtnl, add the missing locking. In this case we
have to use rtnl since siw_listen_address() is sleeping.

Fixes: 6c52fdc244b5 ("rdma/siw: connection management")
Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 drivers/infiniband/sw/siw/siw_cm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c
index 8e618cb7261f62..c25be723c15b64 100644
--- a/drivers/infiniband/sw/siw/siw_cm.c
+++ b/drivers/infiniband/sw/siw/siw_cm.c
@@ -1975,6 +1975,7 @@ int siw_create_listen(struct iw_cm_id *id, int backlog)
 			id, &s_laddr.sin_addr, ntohs(s_laddr.sin_port),
 			&s_raddr->sin_addr, ntohs(s_raddr->sin_port));
 
+		rtnl_lock();
 		for_ifa(in_dev)
 		{
 			if (ipv4_is_zeronet(s_laddr.sin_addr.s_addr) ||
@@ -1989,6 +1990,7 @@ int siw_create_listen(struct iw_cm_id *id, int backlog)
 			}
 		}
 		endfor_ifa(in_dev);
+		rtnl_unlock();
 		in_dev_put(in_dev);
 	} else if (id->local_addr.ss_family == AF_INET6) {
 		struct inet6_dev *in6_dev = in6_dev_get(dev);
-- 
2.21.0


^ permalink raw reply related

* Re: [PATCH bpf-next] selftests/bpf: remove logic duplication in test_verifier.c
From: Andrii Nakryiko @ 2019-07-11 14:43 UTC (permalink / raw)
  To: Krzesimir Nowak
  Cc: Andrii Nakryiko, Kernel Team, Alexei Starovoitov, Daniel Borkmann,
	bpf, Networking
In-Reply-To: <CAGGp+cETuvWUwET=6Mq5sWTJhi5+Rs2bw8xNP2NYZXAAuc6-Og@mail.gmail.com>

On Thu, Jul 11, 2019 at 5:13 AM Krzesimir Nowak <krzesimir@kinvolk.io> wrote:
>
> On Thu, Jul 11, 2019 at 3:08 AM Andrii Nakryiko <andriin@fb.com> wrote:
> >
> > test_verifier tests can specify single- and multi-runs tests. Internally
> > logic of handling them is duplicated. Get rid of it by making single run
> > retval specification to be a first retvals spec.
> >
> > Cc: Krzesimir Nowak <krzesimir@kinvolk.io>
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>
> Looks good, one nit below.
>
> Acked-by: Krzesimir Nowak <krzesimir@kinvolk.io>
>
> > ---
> >  tools/testing/selftests/bpf/test_verifier.c | 37 ++++++++++-----------
> >  1 file changed, 18 insertions(+), 19 deletions(-)
> >
> > diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
> > index b0773291012a..120ecdf4a7db 100644
> > --- a/tools/testing/selftests/bpf/test_verifier.c
> > +++ b/tools/testing/selftests/bpf/test_verifier.c
> > @@ -86,7 +86,7 @@ struct bpf_test {
> >         int fixup_sk_storage_map[MAX_FIXUPS];
> >         const char *errstr;
> >         const char *errstr_unpriv;
> > -       uint32_t retval, retval_unpriv, insn_processed;
> > +       uint32_t insn_processed;
> >         int prog_len;
> >         enum {
> >                 UNDEF,
> > @@ -95,16 +95,24 @@ struct bpf_test {
> >         } result, result_unpriv;
> >         enum bpf_prog_type prog_type;
> >         uint8_t flags;
> > -       __u8 data[TEST_DATA_LEN];
> >         void (*fill_helper)(struct bpf_test *self);
> >         uint8_t runs;
> > -       struct {
> > -               uint32_t retval, retval_unpriv;
> > -               union {
> > -                       __u8 data[TEST_DATA_LEN];
> > -                       __u64 data64[TEST_DATA_LEN / 8];
> > +       union {
> > +               struct {
>
> Maybe consider moving the struct definition outside to further the
> removal of the duplication?

Can't do that because then retval/retval_unpriv/data won't be
accessible as a normal field of struct bpf_test. It has to be in
anonymous structs/unions, unfortunately.

I tried the following, but that also didn't work:

union {
    struct bpf_test_retval {
        uint32_t retval, retval_unpriv;
        union {
            __u8 data[TEST_DATA_LEN];
            __u64 data64[TEST_DATA_LEN / 8];
        };
    };
    struct bpf_test_retval retvals[MAX_TEST_RUNS];
};

This also made retval/retval_unpriv to not behave as normal fields of
struct bpf_test.


>
> > +                       uint32_t retval, retval_unpriv;
> > +                       union {
> > +                               __u8 data[TEST_DATA_LEN];
> > +                               __u64 data64[TEST_DATA_LEN / 8];
> > +                       };
> >                 };
> > -       } retvals[MAX_TEST_RUNS];
> > +               struct {
> > +                       uint32_t retval, retval_unpriv;
> > +                       union {
> > +                               __u8 data[TEST_DATA_LEN];
> > +                               __u64 data64[TEST_DATA_LEN / 8];
> > +                       };
> > +               } retvals[MAX_TEST_RUNS];
> > +       };
> >         enum bpf_attach_type expected_attach_type;
> >  };
> >
> > @@ -949,17 +957,8 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
> >                 uint32_t expected_val;
> >                 int i;
> >
> > -               if (!test->runs) {
> > -                       expected_val = unpriv && test->retval_unpriv ?
> > -                               test->retval_unpriv : test->retval;
> > -
> > -                       err = do_prog_test_run(fd_prog, unpriv, expected_val,
> > -                                              test->data, sizeof(test->data));
> > -                       if (err)
> > -                               run_errs++;
> > -                       else
> > -                               run_successes++;
> > -               }
> > +               if (!test->runs)
> > +                       test->runs = 1;
> >
> >                 for (i = 0; i < test->runs; i++) {
> >                         if (unpriv && test->retvals[i].retval_unpriv)
> > --
> > 2.17.1
> >
>
>
> --
> Kinvolk GmbH | Adalbertstr.6a, 10999 Berlin | tel: +491755589364
> Geschäftsführer/Directors: Alban Crequy, Chris Kühl, Iago López Galeiras
> Registergericht/Court of registration: Amtsgericht Charlottenburg
> Registernummer/Registration number: HRB 171414 B
> Ust-ID-Nummer/VAT ID number: DE302207000

^ permalink raw reply

* Re: [PATCH v3 bpf] selftests/bpf: do not ignore clang failures
From: Andrii Nakryiko @ 2019-07-11 14:55 UTC (permalink / raw)
  To: Ilya Leoshkevich; +Cc: bpf, Networking, Daniel Borkmann, Song Liu
In-Reply-To: <20190711091249.59865-1-iii@linux.ibm.com>

On Thu, Jul 11, 2019 at 2:14 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> When compiling an eBPF prog fails, make still returns 0, because
> failing clang command's output is piped to llc and therefore its
> exit status is ignored.
>
> When clang fails, pipe the string "clang failed" to llc. This will make
> llc fail with an informative error message. This solution was chosen
> over using pipefail, having separate targets or getting rid of llc
> invocation due to its simplicity.
>
> In addition, pull Kbuild.include in order to get .DELETE_ON_ERROR target,

In your original patch you explicitly declared .DELETE_ON_ERROR, but
in this one you just include Kbuild.include.
Is it enough to just include that file to get desired behavior or your
forgot to add .DELETE_ON_ERROR?

> which would cause partial .o files to be removed.
>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> ---

Thanks!

Acked-by: Andrii Nakryiko <andriin@fb.com>

> v1->v2: use intermediate targets instead of pipefail
> v2->v3: pipe "clang failed" instead of using intermediate targets
>
> tools/testing/selftests/bpf/Makefile | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index e36356e2377e..e375f399b7a6 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -1,4 +1,5 @@
>  # SPDX-License-Identifier: GPL-2.0
> +include ../../../../scripts/Kbuild.include
>
>  LIBDIR := ../../../lib
>  BPFDIR := $(LIBDIR)/bpf
> @@ -185,8 +186,8 @@ $(ALU32_BUILD_DIR)/test_progs_32: prog_tests/*.c
>
>  $(ALU32_BUILD_DIR)/%.o: progs/%.c $(ALU32_BUILD_DIR) \
>                                         $(ALU32_BUILD_DIR)/test_progs_32
> -       $(CLANG) $(CLANG_FLAGS) \
> -                -O2 -target bpf -emit-llvm -c $< -o - |      \
> +       ($(CLANG) $(CLANG_FLAGS) -O2 -target bpf -emit-llvm -c $< -o - || \
> +               echo "clang failed") | \
>         $(LLC) -march=bpf -mattr=+alu32 -mcpu=$(CPU) $(LLC_FLAGS) \
>                 -filetype=obj -o $@
>  ifeq ($(DWARF2BTF),y)
> @@ -197,16 +198,16 @@ endif
>  # Have one program compiled without "-target bpf" to test whether libbpf loads
>  # it successfully
>  $(OUTPUT)/test_xdp.o: progs/test_xdp.c
> -       $(CLANG) $(CLANG_FLAGS) \
> -               -O2 -emit-llvm -c $< -o - | \
> +       ($(CLANG) $(CLANG_FLAGS) -O2 -emit-llvm -c $< -o - || \
> +               echo "clang failed") | \
>         $(LLC) -march=bpf -mcpu=$(CPU) $(LLC_FLAGS) -filetype=obj -o $@
>  ifeq ($(DWARF2BTF),y)
>         $(BTF_PAHOLE) -J $@
>  endif
>
>  $(OUTPUT)/%.o: progs/%.c
> -       $(CLANG) $(CLANG_FLAGS) \
> -                -O2 -target bpf -emit-llvm -c $< -o - |      \
> +       ($(CLANG) $(CLANG_FLAGS) -O2 -target bpf -emit-llvm -c $< -o - || \
> +               echo "clang failed") | \
>         $(LLC) -march=bpf -mcpu=$(CPU) $(LLC_FLAGS) -filetype=obj -o $@
>  ifeq ($(DWARF2BTF),y)
>         $(BTF_PAHOLE) -J $@
> --
> 2.21.0
>

^ permalink raw reply

* Re: [PATCH v3 bpf] selftests/bpf: do not ignore clang failures
From: Ilya Leoshkevich @ 2019-07-11 15:00 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, Networking, Daniel Borkmann, Song Liu
In-Reply-To: <CAEf4Bzb6mY-F-wUNNimS+hMSRbJetTKXNcGDQbsJXhXDywA+tg@mail.gmail.com>

> Am 11.07.2019 um 16:55 schrieb Andrii Nakryiko <andrii.nakryiko@gmail.com>:
> 
> On Thu, Jul 11, 2019 at 2:14 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>> 
>> 
>> In addition, pull Kbuild.include in order to get .DELETE_ON_ERROR target,
> 
> In your original patch you explicitly declared .DELETE_ON_ERROR, but
> in this one you just include Kbuild.include.
> Is it enough to just include that file to get desired behavior or your
> forgot to add .DELETE_ON_ERROR?

It’s enough to just include Kbuild.include. I grepped the source tree
and found that no one else declares .DELETE_ON_ERROR explicitly, so I've
decided to avoid doing this as well.

^ permalink raw reply

* Re: [PATCH] libertas: add terminating entry to fw_table
From: Sergei Shtylyov @ 2019-07-11 15:54 UTC (permalink / raw)
  To: Oliver Neukum, davem, netdev
In-Reply-To: <20190711142744.31956-1-oneukum@suse.com>

Hello!

On 07/11/2019 05:27 PM, Oliver Neukum wrote:

> In case no firmware was found, the system would happily read
> and try to load garbage. Terminate the table properly.
> 
> Signed-off-by: Oliver Neukum <oneukum@suse.com>
> Fixes: ce84bb69f50e6 ("libertas USB: convert to asynchronous firmware loading")

   The Fixed: tag should precede the sign-offs, according to DaveM...

> Reported-by: syzbot+8a8f48672560c8ca59dd@syzkaller.appspotmail.com

   That should be the 1st tag, I think...

> ---
>  drivers/net/wireless/marvell/libertas/if_usb.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/marvell/libertas/if_usb.c b/drivers/net/wireless/marvell/libertas/if_usb.c
> index f1622f0ff8c9..b79c65547f4c 100644
> --- a/drivers/net/wireless/marvell/libertas/if_usb.c
> +++ b/drivers/net/wireless/marvell/libertas/if_usb.c
> @@ -50,7 +50,10 @@ static const struct lbs_fw_table fw_table[] = {
>  	{ MODEL_8388, "libertas/usb8388_v5.bin", NULL },
>  	{ MODEL_8388, "libertas/usb8388.bin", NULL },
>  	{ MODEL_8388, "usb8388.bin", NULL },
> -	{ MODEL_8682, "libertas/usb8682.bin", NULL }
> +	{ MODEL_8682, "libertas/usb8682.bin", NULL },
> +
> +	/*terminating entry - keep at end */

   Why no space after /* ?

> +	{ MODEL_8388, NULL, NULL }
>  };
>  
>  static const struct usb_device_id if_usb_table[] = {

MBR, Sergei

^ permalink raw reply

* [PATCH net-next 1/1] tc-tests: updated skbedit tests
From: Roman Mashak @ 2019-07-11 16:29 UTC (permalink / raw)
  To: davem; +Cc: netdev, kernel, jhs, xiyou.wangcong, jiri, Roman Mashak

- Added mask upper bound test case
- Added mask validation test case
- Added mask replacement case

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
---
 .../tc-testing/tc-tests/actions/skbedit.json       | 117 +++++++++++++++++++++
 1 file changed, 117 insertions(+)

diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/skbedit.json b/tools/testing/selftests/tc-testing/tc-tests/actions/skbedit.json
index 45e7e89928a5..bf5ebf59c2d4 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/actions/skbedit.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/actions/skbedit.json
@@ -70,6 +70,123 @@
         "teardown": []
     },
     {
+        "id": "d4cd",
+        "name": "Add skbedit action with valid mark and mask",
+        "category": [
+            "actions",
+            "skbedit"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action skbedit",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action skbedit mark 1/0xaabb",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions list action skbedit",
+        "matchPattern": "action order [0-9]*: skbedit  mark 1/0xaabb",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action skbedit"
+        ]
+    },
+    {
+        "id": "baa7",
+        "name": "Add skbedit action with valid mark and 32-bit maximum mask",
+        "category": [
+            "actions",
+            "skbedit"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action skbedit",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action skbedit mark 1/0xffffffff",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions list action skbedit",
+        "matchPattern": "action order [0-9]*: skbedit  mark 1/0xffffffff",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action skbedit"
+        ]
+    },
+    {
+        "id": "62a5",
+        "name": "Add skbedit action with valid mark and mask exceeding 32-bit maximum",
+        "category": [
+            "actions",
+            "skbedit"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action skbedit",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action skbedit mark 1/0xaabbccddeeff112233",
+        "expExitCode": "255",
+        "verifyCmd": "$TC actions list action skbedit",
+        "matchPattern": "action order [0-9]*: skbedit  mark 1/0xaabbccddeeff112233",
+        "matchCount": "0",
+        "teardown": []
+    },
+    {
+        "id": "bc15",
+        "name": "Add skbedit action with valid mark and mask with invalid format",
+        "category": [
+            "actions",
+            "skbedit"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action skbedit",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action skbedit mark 1/-1234",
+        "expExitCode": "255",
+        "verifyCmd": "$TC actions list action skbedit",
+        "matchPattern": "action order [0-9]*: skbedit  mark 1/-1234",
+        "matchCount": "0",
+        "teardown": []
+    },
+    {
+        "id": "57c2",
+        "name": "Replace skbedit action with new mask",
+        "category": [
+            "actions",
+            "skbedit"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action skbedit",
+                0,
+                1,
+                255
+            ],
+            "$TC actions add action skbedit mark 1/0x11223344 index 1"
+        ],
+        "cmdUnderTest": "$TC actions replace action skbedit mark 1/0xaabb index 1",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions list action skbedit",
+        "matchPattern": "action order [0-9]*: skbedit  mark 1/0xaabb",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action skbedit"
+        ]
+    },
+    {
         "id": "081d",
         "name": "Add skbedit action with priority",
         "category": [
-- 
2.7.4


^ permalink raw reply related

* Re: [bpf PATCH v2 2/6] bpf: tls fix transition through disconnect with close
From: John Fastabend @ 2019-07-11 16:35 UTC (permalink / raw)
  To: Jakub Kicinski, John Fastabend; +Cc: ast, daniel, netdev, edumazet, bpf
In-Reply-To: <20190710123417.2157a459@cakuba.netronome.com>

Jakub Kicinski wrote:
> On Tue, 09 Jul 2019 20:39:24 -0700, John Fastabend wrote:
> > Jakub Kicinski wrote:
> > > On Mon, 08 Jul 2019 19:14:05 +0000, John Fastabend wrote:  
> > > > @@ -287,6 +313,27 @@ static void tls_sk_proto_cleanup(struct sock *sk,
> > > >  #endif
> > > >  }
> > > >  
> > > > +static void tls_sk_proto_unhash(struct sock *sk)
> > > > +{
> > > > +	struct inet_connection_sock *icsk = inet_csk(sk);
> > > > +	long timeo = sock_sndtimeo(sk, 0);
> > > > +	struct tls_context *ctx;
> > > > +
> > > > +	if (unlikely(!icsk->icsk_ulp_data)) {  
> > > 
> > > Is this for when sockmap is stacked on top of TLS and TLS got removed
> > > without letting sockmap know?  
> > 
> > Right its a pattern I used on the sockmap side and put here. But
> > I dropped the patch to let sockmap stack on top of TLS because
> > it was more than a fix IMO. We could probably drop this check on
> > the other hand its harmless.
> 
> I feel like this code is pretty complex I struggle to follow all the
> paths, so perhaps it'd be better to drop stuff that's not necessary 
> to have a clearer picture.
> 

Sure I can drop it and add it later when its necessary.

> > > > +		if (sk->sk_prot->unhash)
> > > > +			sk->sk_prot->unhash(sk);
> > > > +	}
> > > > +
> > > > +	ctx = tls_get_ctx(sk);
> > > > +	if (ctx->tx_conf == TLS_SW || ctx->rx_conf == TLS_SW)
> > > > +		tls_sk_proto_cleanup(sk, ctx, timeo);
> > > > +	icsk->icsk_ulp_data = NULL;  
> > > 
> > > I think close only starts checking if ctx is NULL in patch 6.
> > > Looks like some chunks of ctx checking/clearing got spread to
> > > patch 1 and some to patch 6.  
> > 
> > Yeah, I thought the patches were easier to read this way but
> > maybe not. Could add something in the commit log.
> 
> Ack! Let me try to get a full grip of patches 2 and 6 and come back 
> to this.
> 
> > > > +	tls_ctx_free_wq(ctx);
> > > > +
> > > > +	if (ctx->unhash)
> > > > +		ctx->unhash(sk);
> > > > +}
> > > > +
> > > >  static void tls_sk_proto_close(struct sock *sk, long timeout)
> > > >  {
> > > >  	struct tls_context *ctx = tls_get_ctx(sk);  

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox