public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
From: Yonghong Song <yhs@fb.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Andrii Nakryiko <andriin@fb.com>, bpf <bpf@vger.kernel.org>,
	Martin KaFai Lau <kafai@fb.com>,
	Networking <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH bpf-next v2 13/20] bpf: add bpf_seq_printf and bpf_seq_write helpers
Date: Wed, 6 May 2020 14:42:15 -0700	[thread overview]
Message-ID: <71cff8d8-05b9-87ef-8a12-1da3e38c4b55@fb.com> (raw)
In-Reply-To: <CAEf4Bzb-COkgcLB=HK4ahtnEFD7QGY0s=Qb-kWTBKK56319JAg@mail.gmail.com>



On 5/6/20 10:37 AM, Andrii Nakryiko wrote:
> On Sun, May 3, 2020 at 11:26 PM Yonghong Song <yhs@fb.com> wrote:
>>
>> Two helpers bpf_seq_printf and bpf_seq_write, are added for
>> writing data to the seq_file buffer.
>>
>> bpf_seq_printf supports common format string flag/width/type
>> fields so at least I can get identical results for
>> netlink and ipv6_route targets.
>>
> 
> Does seq_printf() has its own format string specification? Is there
> any documentation explaining? I was confused by few different checks
> below...

Not really. Similar to bpf_trace_printk(), since we need to
parse format string, so we may only support a subset of
what seq_printf() does. But we should not invent new
formats.

> 
>> For bpf_seq_printf and bpf_seq_write, return value -EOVERFLOW
>> specifically indicates a write failure due to overflow, which
>> means the object will be repeated in the next bpf invocation
>> if object collection stays the same. Note that if the object
>> collection is changed, depending how collection traversal is
>> done, even if the object still in the collection, it may not
>> be visited.
>>
>> bpf_seq_printf may return -EBUSY meaning that internal percpu
>> buffer for memory copy of strings or other pointees is
>> not available. Bpf program can return 1 to indicate it
>> wants the same object to be repeated. Right now, this should not
>> happen on no-RT kernels since migrate_enable(), which guards
>> bpf prog call, calls preempt_enable().
> 
> You probably meant migrate_disable()/preempt_disable(), right? But

Yes, sorry for typo.

> could it still happen, at least due to NMI? E.g., perf_event BPF
> program gets triggered during bpf_iter program execution? I think for
> perf_event_output function, we have 3 levels, for one of each possible
> "contexts"? Should we do something like that here as well?

Currently bpf_seq_printf() and bpf_seq_write() helpers can
only be called by iter bpf programs. The iter bpf program can only
be run on process context as it is triggered by a read() syscall.
So one level should be enough for non-RT kernel.

For RT kernel, migrate_disable does not prevent preemption,
so it is possible task in the middle of bpf_seq_printf() might
be preempted, so I implemented the logic to return -EBUSY.
I think this case should be extremely rare so I only implemented
one level nesting.

> 
>>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>> ---
>>   include/uapi/linux/bpf.h       |  32 +++++-
>>   kernel/trace/bpf_trace.c       | 195 +++++++++++++++++++++++++++++++++
>>   scripts/bpf_helpers_doc.py     |   2 +
>>   tools/include/uapi/linux/bpf.h |  32 +++++-
>>   4 files changed, 259 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 97ceb0f2e539..e440a9d5cca2 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -3076,6 +3076,34 @@ union bpf_attr {
>>    *             See: clock_gettime(CLOCK_BOOTTIME)
>>    *     Return
>>    *             Current *ktime*.
>> + *
> 
> [...]
> 
>> +BPF_CALL_5(bpf_seq_printf, struct seq_file *, m, char *, fmt, u32, fmt_size,
>> +          const void *, data, u32, data_len)
>> +{
>> +       int err = -EINVAL, fmt_cnt = 0, memcpy_cnt = 0;
>> +       int i, buf_used, copy_size, num_args;
>> +       u64 params[MAX_SEQ_PRINTF_VARARGS];
>> +       struct bpf_seq_printf_buf *bufs;
>> +       const u64 *args = data;
>> +
>> +       buf_used = this_cpu_inc_return(bpf_seq_printf_buf_used);
>> +       if (WARN_ON_ONCE(buf_used > 1)) {
>> +               err = -EBUSY;
>> +               goto out;
>> +       }
>> +
>> +       bufs = this_cpu_ptr(&bpf_seq_printf_buf);
>> +
>> +       /*
>> +        * bpf_check()->check_func_arg()->check_stack_boundary()
>> +        * guarantees that fmt points to bpf program stack,
>> +        * fmt_size bytes of it were initialized and fmt_size > 0
>> +        */
>> +       if (fmt[--fmt_size] != 0)
> 
> If we allow fmt_size == 0, this will need to be changed.

Currently, we do not support fmt_size == 0. Yes, if we allow, this
needs change.

> 
>> +               goto out;
>> +
>> +       if (data_len & 7)
>> +               goto out;
>> +
>> +       for (i = 0; i < fmt_size; i++) {
>> +               if (fmt[i] == '%' && (!data || !data_len))
> 
> So %% escaping is not supported?

Yes, have not seen a need yet my ipv6_route/netlink example.
Can certain add if there is a use case.

> 
>> +                       goto out;
>> +       }
>> +
>> +       num_args = data_len / 8;
>> +
>> +       /* check format string for allowed specifiers */
>> +       for (i = 0; i < fmt_size; i++) {
>> +               if ((!isprint(fmt[i]) && !isspace(fmt[i])) || !isascii(fmt[i]))
> 
> why these restrictions? are they essential?

This is the same restriction in bpf_trace_printk(). I guess the purpose
is to avoid weird print. To promote bpf_iter to dump beyond asscii, I 
guess we can remove this restriction.

> 
>> +                       goto out;
>> +
>> +               if (fmt[i] != '%')
>> +                       continue;
>> +
>> +               if (fmt_cnt >= MAX_SEQ_PRINTF_VARARGS) {
>> +                       err = -E2BIG;
>> +                       goto out;
>> +               }
>> +
>> +               if (fmt_cnt >= num_args)
>> +                       goto out;
>> +
>> +               /* fmt[i] != 0 && fmt[last] == 0, so we can access fmt[i + 1] */
>> +               i++;
>> +
>> +               /* skip optional "[0+-][num]" width formating field */
>> +               while (fmt[i] == '0' || fmt[i] == '+'  || fmt[i] == '-')
> 
> There could be space as well, as an alternative to 0.

We can allow space. But '0' is used more common, right?

> 
>> +                       i++;
>> +               if (fmt[i] >= '1' && fmt[i] <= '9') {
>> +                       i++;
>> +                       while (fmt[i] >= '0' && fmt[i] <= '9')
>> +                               i++;
>> +               }
>> +
>> +               if (fmt[i] == 's') {
>> +                       /* disallow any further format extensions */
>> +                       if (fmt[i + 1] != 0 &&
>> +                           !isspace(fmt[i + 1]) &&
>> +                           !ispunct(fmt[i + 1]))
>> +                               goto out;
> 
> I'm not sure I follow this check either. printf("%sbla", "whatever")
> is a perfectly fine format string. Unless seq_printf has some
> additional restrictions?

Yes, just some restriction inherited from bpf_trace_printk().
Will remove.

> 
>> +
>> +                       /* try our best to copy */
>> +                       if (memcpy_cnt >= MAX_SEQ_PRINTF_MAX_MEMCPY) {
>> +                               err = -E2BIG;
>> +                               goto out;
>> +                       }
>> +
> 
> [...]
> 
>> +
>> +static int bpf_seq_printf_btf_ids[5];
>> +static const struct bpf_func_proto bpf_seq_printf_proto = {
>> +       .func           = bpf_seq_printf,
>> +       .gpl_only       = true,
>> +       .ret_type       = RET_INTEGER,
>> +       .arg1_type      = ARG_PTR_TO_BTF_ID,
>> +       .arg2_type      = ARG_PTR_TO_MEM,
>> +       .arg3_type      = ARG_CONST_SIZE,
> 
> It feels like allowing zero shouldn't hurt too much?

This is the format string, I would prefer to keep it non-zero.

> 
>> +       .arg4_type      = ARG_PTR_TO_MEM_OR_NULL,
>> +       .arg5_type      = ARG_CONST_SIZE_OR_ZERO,
>> +       .btf_id         = bpf_seq_printf_btf_ids,
>> +};
>> +
>> +BPF_CALL_3(bpf_seq_write, struct seq_file *, m, const void *, data, u32, len)
>> +{
>> +       return seq_write(m, data, len) ? -EOVERFLOW : 0;
>> +}
>> +
>> +static int bpf_seq_write_btf_ids[5];
>> +static const struct bpf_func_proto bpf_seq_write_proto = {
>> +       .func           = bpf_seq_write,
>> +       .gpl_only       = true,
>> +       .ret_type       = RET_INTEGER,
>> +       .arg1_type      = ARG_PTR_TO_BTF_ID,
>> +       .arg2_type      = ARG_PTR_TO_MEM,
>> +       .arg3_type      = ARG_CONST_SIZE,
> 
> Same, ARG_CONST_SIZE_OR_ZERO?

This one, possible. Let me check.

> 
>> +       .btf_id         = bpf_seq_write_btf_ids,
>> +};
>> +
> 
> [...]
> 

  reply	other threads:[~2020-05-06 21:42 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-04  6:25 [PATCH bpf-next v2 00/20] bpf: implement bpf iterator for kernel data Yonghong Song
2020-05-04  6:25 ` [PATCH bpf-next v2 01/20] bpf: implement an interface to register bpf_iter targets Yonghong Song
2020-05-05 21:19   ` Andrii Nakryiko
2020-05-04  6:25 ` [PATCH bpf-next v2 02/20] bpf: allow loading of a bpf_iter program Yonghong Song
2020-05-05 21:29   ` Andrii Nakryiko
2020-05-06  0:07     ` Yonghong Song
2020-05-04  6:25 ` [PATCH bpf-next v2 03/20] bpf: support bpf tracing/iter programs for BPF_LINK_CREATE Yonghong Song
2020-05-05 21:30   ` Andrii Nakryiko
2020-05-06  0:14     ` Yonghong Song
2020-05-06  0:54       ` Alexei Starovoitov
2020-05-06  3:09         ` Andrii Nakryiko
2020-05-06 18:08           ` Alexei Starovoitov
2020-05-04  6:25 ` [PATCH bpf-next v2 04/20] bpf: support bpf tracing/iter programs for BPF_LINK_UPDATE Yonghong Song
2020-05-05 21:32   ` Andrii Nakryiko
2020-05-04  6:25 ` [PATCH bpf-next v2 05/20] bpf: implement bpf_seq_read() for bpf iterator Yonghong Song
2020-05-05 19:56   ` Andrii Nakryiko
2020-05-05 19:57     ` Alexei Starovoitov
2020-05-05 20:25     ` Yonghong Song
2020-05-05 21:08       ` Andrii Nakryiko
2020-05-04  6:25 ` [PATCH bpf-next v2 06/20] bpf: create anonymous " Yonghong Song
2020-05-05 20:11   ` Andrii Nakryiko
2020-05-05 20:28     ` Yonghong Song
2020-05-04  6:25 ` [PATCH bpf-next v2 07/20] bpf: create file " Yonghong Song
2020-05-05 20:15   ` Andrii Nakryiko
2020-05-04  6:25 ` [PATCH bpf-next v2 08/20] bpf: implement common macros/helpers for target iterators Yonghong Song
2020-05-05 20:25   ` Andrii Nakryiko
2020-05-05 20:30     ` Yonghong Song
2020-05-05 21:10       ` Andrii Nakryiko
2020-05-04  6:25 ` [PATCH bpf-next v2 09/20] bpf: add bpf_map iterator Yonghong Song
2020-05-06  5:11   ` Andrii Nakryiko
2020-05-04  6:25 ` [PATCH bpf-next v2 10/20] net: bpf: add netlink and ipv6_route bpf_iter targets Yonghong Song
2020-05-06  5:21   ` Andrii Nakryiko
2020-05-06 17:32     ` Yonghong Song
2020-05-04  6:25 ` [PATCH bpf-next v2 11/20] bpf: add task and task/file iterator targets Yonghong Song
2020-05-06  7:30   ` Andrii Nakryiko
2020-05-06 18:24     ` Yonghong Song
2020-05-06 20:51       ` Andrii Nakryiko
2020-05-06 21:20         ` Yonghong Song
2020-05-04  6:26 ` [PATCH bpf-next v2 12/20] bpf: add PTR_TO_BTF_ID_OR_NULL support Yonghong Song
2020-05-05 20:27   ` Andrii Nakryiko
2020-05-04  6:26 ` [PATCH bpf-next v2 13/20] bpf: add bpf_seq_printf and bpf_seq_write helpers Yonghong Song
2020-05-06 17:37   ` Andrii Nakryiko
2020-05-06 21:42     ` Yonghong Song [this message]
2020-05-08 18:15       ` Andrii Nakryiko
2020-05-04  6:26 ` [PATCH bpf-next v2 14/20] bpf: handle spilled PTR_TO_BTF_ID properly when checking stack_boundary Yonghong Song
2020-05-06 17:38   ` Andrii Nakryiko
2020-05-06 21:47     ` Yonghong Song
2020-05-04  6:26 ` [PATCH bpf-next v2 15/20] bpf: support variable length array in tracing programs Yonghong Song
2020-05-06 17:40   ` Andrii Nakryiko
2020-05-04  6:26 ` [PATCH bpf-next v2 16/20] tools/libbpf: add bpf_iter support Yonghong Song
2020-05-06  5:44   ` Andrii Nakryiko
2020-05-04  6:26 ` [PATCH bpf-next v2 17/20] tools/bpftool: add bpf_iter support for bptool Yonghong Song
2020-05-04  6:26 ` [PATCH bpf-next v2 18/20] tools/bpf: selftests: add iterator programs for ipv6_route and netlink Yonghong Song
2020-05-06  6:01   ` Andrii Nakryiko
2020-05-07  1:09     ` Yonghong Song
2020-05-08 18:17       ` Andrii Nakryiko
2020-05-06  6:04   ` Andrii Nakryiko
2020-05-06 23:07     ` Yonghong Song
2020-05-04  6:26 ` [PATCH bpf-next v2 19/20] tools/bpf: selftests: add iter progs for bpf_map/task/task_file Yonghong Song
2020-05-06  6:14   ` Andrii Nakryiko
2020-05-04  6:26 ` [PATCH bpf-next v2 20/20] tools/bpf: selftests: add bpf_iter selftests Yonghong Song
2020-05-06  6:39   ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=71cff8d8-05b9-87ef-8a12-1da3e38c4b55@fb.com \
    --to=yhs@fb.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andriin@fb.com \
    --cc=ast@fb.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kafai@fb.com \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox