From: Yonghong Song <yhs@fb.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Alexei Starovoitov <ast@fb.com>
Cc: Martin KaFai Lau <kafai@fb.com>, Andrii Nakryiko <andriin@fb.com>,
	bpf <bpf@vger.kernel.org>, Networking <netdev@vger.kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH bpf-next v1 03/19] bpf: add bpf_map iterator
Date: Wed, 29 Apr 2020 13:15:02 -0700
Message-ID: <cc802671-76e6-e911-0e4e-53a4e99c69ff@fb.com>
In-Reply-To: <CAEf4BzZgZ7h_asHNGk_34vJv_yvLtWGcTGwdTO4fgLPySaG-Eg@mail.gmail.com>



On 4/29/20 12:19 PM, Andrii Nakryiko wrote:
> On Wed, Apr 29, 2020 at 8:34 AM Alexei Starovoitov <ast@fb.com> wrote:
>>
>> On 4/28/20 11:44 PM, Yonghong Song wrote:
>>>
>>>
>>> On 4/28/20 11:40 PM, Andrii Nakryiko wrote:
>>>> On Tue, Apr 28, 2020 at 11:30 PM Alexei Starovoitov <ast@fb.com> wrote:
>>>>>
>>>>> On 4/28/20 11:20 PM, Yonghong Song wrote:
>>>>>>
>>>>>>
>>>>>> On 4/28/20 11:08 PM, Andrii Nakryiko wrote:
>>>>>>> On Tue, Apr 28, 2020 at 10:10 PM Yonghong Song <yhs@fb.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 4/28/20 7:44 PM, Alexei Starovoitov wrote:
>>>>>>>>> On 4/28/20 6:15 PM, Yonghong Song wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 4/28/20 5:48 PM, Alexei Starovoitov wrote:
>>>>>>>>>>> On 4/28/20 5:37 PM, Martin KaFai Lau wrote:
>>>>>>>>>>>>> +    prog = bpf_iter_get_prog(seq, sizeof(struct
>>>>>>>>>>>>> bpf_iter_seq_map_info),
>>>>>>>>>>>>> +                 &meta.session_id, &meta.seq_num,
>>>>>>>>>>>>> +                 v == (void *)0);
>>>>>>>>>>>>     From looking at seq_file.c, when will show() be called with
>>>>>>>>>>>> "v ==
>>>>>>>>>>>> NULL"?
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> that v == NULL here and the whole verifier change just to allow
>>>>>>>>>>> NULL...
>>>>>>>>>>> may be use seq_num as an indicator of the last elem instead?
>>>>>>>>>>> Like seq_num with upper bit set to indicate that it's last?
>>>>>>>>>>
>>>>>>>>>> We could. But then verifier won't have an easy way to verify that.
>>>>>>>>>> For example, the following is expected:
>>>>>>>>>>
>>>>>>>>>>          int prog(struct bpf_map *map, u64 seq_num) {
>>>>>>>>>>             if (seq_num >> 63)
>>>>>>>>>>               return 0;
>>>>>>>>>>             ... map->id ...
>>>>>>>>>>             ... map->user_cnt ...
>>>>>>>>>>          }
>>>>>>>>>>
>>>>>>>>>> But if user writes
>>>>>>>>>>
>>>>>>>>>>          int prog(struct bpf_map *map, u64 seq_num) {
>>>>>>>>>>              ... map->id ...
>>>>>>>>>>              ... map->user_cnt ...
>>>>>>>>>>          }
>>>>>>>>>>
>>>>>>>>>> it won't be easy for the verifier to detect the improper map
>>>>>>>>>> pointer use here, and the map->id, map->user_cnt accesses above
>>>>>>>>>> will cause exceptions and will silently get value 0.
>>>>>>>>>
>>>>>>>>> I mean always pass valid object pointer into the prog.
>>>>>>>>> In above case 'map' will always be valid.
>>>>>>>>> Consider prog that iterating all map elements.
>>>>>>>>> It's weird that the prog would always need to do
>>>>>>>>> if (map == 0)
>>>>>>>>>       goto out;
>>>>>>>>> even if it doesn't care about finding last.
>>>>>>>>> All progs would have to have such extra 'if'.
>>>>>>>>> If we always pass valid object then there is no need
>>>>>>>>> for such extra checks inside the prog.
>>>>>>>>> First and last element can be indicated via seq_num
>>>>>>>>> or via another flag or via helper call like is_this_last_elem()
>>>>>>>>> or something.
>>>>>>>>
>>>>>>>> Okay, I see what you mean now. Basically this means
>>>>>>>> seq_ops->next() should try to get/maintain next two elements,
>>>>>>>
>>>>>>> What about the case when there are no elements to iterate to begin
>>>>>>> with? In that case, we still need to call bpf_prog for (empty)
>>>>>>> post-aggregation, but we have no valid element... For bpf_map
>>>>>>> iteration we could have fake empty bpf_map that would be passed, but
>>>>>>> I'm not sure it's applicable for any type of object (e.g., having a
>>>>>>> fake task_struct is probably quite a bit more problematic?)...
>>>>>>
>>>>>> Oh, yes, thanks for reminding me of this. I put a call to
>>>>>> bpf_prog in seq_ops->stop() especially to handle no object
>>>>>> case. In that case, seq_ops->start() will return NULL,
>>>>>> seq_ops->next() won't be called, and then seq_ops->stop()
>>>>>> is called. My earlier attempt tries to hook with next()
>>>>>> and then find it not working in all cases.
>>>>>
>>>>> wait a sec. seq_ops->stop() is not the end.
>>>>> With lseek of seq_file it can be called multiple times.
>>>
>>> Yes, I have taken care of this. when the object is NULL,
>>> bpf program will be called. When the object is NULL again,
>>> it won't be called. The private data remembers it has
>>> been called with NULL.
>>
>> Even without lseek stop() will be called multiple times.
>> If I read seq_file.c correctly it will be called before
>> every copy_to_user(). Which means that for a lot of text
>> (or if read() is done with small buffer) there will be
>> plenty of start,show,show,stop sequences.
> 
> 
> Right start/stop can be called multiple times, but seems like there
> are clear indicators of beginning of iteration and end of iteration:
> - start() with seq_num == 0 is start of iteration (can be called
> multiple times, if first element overflows buffer);
> - stop() with p == NULL is end of iteration (seems like can be called
> multiple times as well, if user keeps read()'ing after iteration
> completed).
> 
> There is another problem with stop(), though. If BPF program will
> attempt to output anything during stop(), that output will be just
> discarded. Not great. Especially if that output overflows and we need

The stop() output will not be discarded in the following cases:
    - regular show() output overflows, so the stop() BPF program is
      not called.
    - regular show() output does not overflow, which means iteration
      is done, and the stop() BPF program output does not overflow
      either.

The stop() seq_file output will be discarded if
    - regular show() output does not overflow but the stop() BPF
      program output overflows.
    - there are no objects to iterate: the BPF program gets called,
      but its seq_file write/printf output is discarded.
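
As a rough illustration only (this is not kernel code and not part of
the patch), the four cases above can be restated as a small decision
function. The names stop_result and classify_stop() and the three
boolean inputs are hypothetical, chosen just for this sketch:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of what happens to the stop() BPF program output. */
enum stop_result {
    STOP_NOT_CALLED,       /* stop() BPF program does not run at all */
    STOP_OUTPUT_KEPT,      /* program runs and its output reaches the reader */
    STOP_OUTPUT_DISCARDED  /* program runs but its output is lost */
};

/*
 * have_objects:   there is at least one object to iterate
 * show_overflows: the regular show() output overflows the buffer
 * stop_overflows: the stop() BPF program output overflows the buffer
 */
enum stop_result classify_stop(bool have_objects, bool show_overflows,
                               bool stop_overflows)
{
    if (!have_objects)
        return STOP_OUTPUT_DISCARDED;   /* program called, output dropped */
    if (show_overflows)
        return STOP_NOT_CALLED;         /* nothing emitted, nothing to lose */
    return stop_overflows ? STOP_OUTPUT_DISCARDED : STOP_OUTPUT_KEPT;
}
```

Only the last two outcomes of classify_stop() matter for the problem
being discussed: both STOP_OUTPUT_DISCARDED paths are the ones that
either option below would have to fix.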

Two options here:
   - implement Alexei's suggestion to look ahead two elements, so
     that we always have a valid object and can indicate the last
     element with a special flag.
   - per Andrii's suggestion below, implement a new way or tweak
     seq_file a little bit to resolve the above cases where the
     stop() seq_file output is discarded.

Will try to experiment with both of the above options...
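
For the first option, a minimal userspace sketch of the look-ahead
idea, assuming a plain int array stands in for the kernel objects and
a callback stands in for the BPF program. Everything here (visit_fn,
record(), iterate_lookahead(), struct summary) is hypothetical and
only shows the shape of the loop, not how seq_ops->next() would
actually do it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping so the example callback has something to do. */
struct summary {
    size_t seen;       /* number of objects visited */
    size_t last_seen;  /* how many times is_last was true */
    int last_value;    /* value of the object flagged as last */
};

/* Stand-in for the BPF program: always gets a valid object pointer. */
typedef void (*visit_fn)(const int *obj, bool is_last, void *ctx);

void record(const int *obj, bool is_last, void *ctx)
{
    struct summary *s = ctx;

    s->seen++;
    if (is_last) {
        s->last_seen++;
        s->last_value = *obj;
    }
}

/*
 * Keep the current and the next element in hand, so the callback can be
 * told whether the object it receives is the last one. Returns the number
 * of callback invocations.
 */
size_t iterate_lookahead(const int *items, size_t n, visit_fn fn, void *ctx)
{
    assert(fn != NULL);

    if (n == 0)
        return 0;   /* no valid object exists, so the callback never runs */

    const int *cur = &items[0];
    size_t calls = 0;

    for (size_t i = 1; i <= n; i++) {
        const int *next = (i < n) ? &items[i] : NULL;

        fn(cur, next == NULL, ctx);
        calls++;
        cur = next;
    }
    return calls;
}
```

Calling iterate_lookahead(items, 3, record, &s) on a three-element
array invokes record() three times with is_last set only on the third
call. With n == 0 the callback is never invoked, which is the empty
case Andrii raised earlier in the thread and which this approach alone
does not cover.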


> to re-allocate buffer.
> 
> We are trying to use seq_file just to reuse 140 lines of code in
> seq_read(), which is no magic, just a simple double buffer and retry
> piece of logic. We don't need lseek and traverse, we don't need all
> the escaping stuff. I think bpf_iter implementation would be much
> simpler if bpf_iter had better control over iteration. Then this whole
> "end of iteration" behavior would be crystal clear. Should we maybe
> reconsider again?
> 
> I understand we want to re-use networking iteration code, but we can
> still do that with custom implementation of seq_read, because we are
> still using struct seq_file and follow its semantics. The change would
> be to allow stop(NULL) (or any stop() call for that matter) to perform
> output (and handle retry and buffer re-allocation). Or, alternatively,
> coupled with seq_operations intercept proposal in patch #7 discussion,
> we can add extra method (e.g., finish()) that would be called after
> all elements are traversed and will allow to emit extra stuff. We can
> do that (implement finish()) in seq_read, as well, if that's going to
> fly ok with seq_file maintainers, of course.
> 
