From: Quentin Monnet <quentin.monnet@netronome.com>
To: Stanislav Fomichev <sdf@fomichev.me>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>,
Stanislav Fomichev <sdf@google.com>,
netdev@vger.kernel.org, ast@kernel.org, vladum@google.com
Subject: Re: [PATCH bpf-next] bpf: libbpf: retry program creation without the name
Date: Wed, 21 Nov 2018 02:49:18 +0000 [thread overview]
Message-ID: <8bcd97bc-fffa-4cb7-2fcd-8433f56ea04d@netronome.com> (raw)
In-Reply-To: <20181120232614.kgzuoh6aj7bmo36u@mini-arch.hsd1.ca.comcast.net>
2018-11-20 15:26 UTC-0800 ~ Stanislav Fomichev <sdf@fomichev.me>
> On 11/20, Alexei Starovoitov wrote:
>> On Wed, Nov 21, 2018 at 12:18:57AM +0100, Daniel Borkmann wrote:
>>> On 11/21/2018 12:04 AM, Alexei Starovoitov wrote:
>>>> On Tue, Nov 20, 2018 at 01:19:05PM -0800, Stanislav Fomichev wrote:
>>>>> On 11/20, Alexei Starovoitov wrote:
>>>>>> On Mon, Nov 19, 2018 at 04:46:25PM -0800, Stanislav Fomichev wrote:
>>>>>>> [Recent commit 23499442c319 ("bpf: libbpf: retry map creation without
>>>>>>> the name") fixed this issue for maps, let's do the same for programs.]
>>>>>>>
>>>>>>> Since commit 88cda1c9da02 ("bpf: libbpf: Provide basic API support
>>>>>>> to specify BPF obj name"), libbpf unconditionally sets bpf_attr->name
>>>>>>> for programs. Pre v4.14 kernels don't know about programs names and
>>>>>>> return an error about unexpected non-zero data. Retry sys_bpf without
>>>>>>> a program name to cover older kernels.
>>>>>>>
>>>>>>> Signed-off-by: Stanislav Fomichev <sdf@google.com>
>>>>>>> ---
>>>>>>> tools/lib/bpf/bpf.c | 10 ++++++++++
>>>>>>> 1 file changed, 10 insertions(+)
>>>>>>>
>>>>>>> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
>>>>>>> index 961e1b9fc592..cbe9d757c646 100644
>>>>>>> --- a/tools/lib/bpf/bpf.c
>>>>>>> +++ b/tools/lib/bpf/bpf.c
>>>>>>> @@ -212,6 +212,16 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
>>>>>>> if (fd >= 0 || !log_buf || !log_buf_sz)
>>>>>>> return fd;
>>>>>>>
>>>>>>> + if (fd < 0 && errno == E2BIG && load_attr->name) {
>>>>>>> + /* Retry the same syscall, but without the name.
>>>>>>> + * Pre v4.14 kernels don't support prog names.
>>>>>>> + */
>>>>>>
>>>>>> I'm afraid that will put unnecessary stress on the kernel.
>>>>>> This check needs to be tighter.
>>>>>> Like E2BIG and anything in the log_buf probably means that
>>>>>> E2BIG came from the verifier and nothing to do with prog_name.
>>>>>> Asking kernel to repeat is an unnecessary work.
>>>>>>
>>>>>> In general we need to think beyond this single prog_name field.
>>>>>> There are bunch of other fields in bpf_load_program_xattr() and older kernels
>>>>>> won't support them. Are we going to zero them out one by one
>>>>>> and retry? I don't think that would be practical.
>>>>> I general, we don't want to zero anything out. However,
>>>>> for this particular problem the rationale is the following:
>>>>> In commit 88cda1c9da02 we started unconditionally setting {prog,map}->name
>>>>> from the 'higher' libbpfc layer which breaks users on the older kernels.
>>>>>
>>>>>> Also libbpf silently ignoring prog_name is not great for debugging.
>>>>>> A warning is needed.
>>>>>> But it cannot be done out of lib/bpf/bpf.c, since it's a set of syscall
>>>>>> wrappers.
>>>>>> Imo such "old kernel -> lets retry" feature should probably be done
>>>>>> at lib/bpf/libbpf.c level. inside load_program().
>>>>> For maps bpftools calls bpf_create_map_xattr directly, that's why
>>>>> for maps I did the retry on the lower level (and why for programs I initially
>>>>> thought about doing the same). However, in this case maybe asking
>>>>> user to omit 'name' argument might be a better option.
>>>>>
>>>>> For program names, I agree, we might think about doing it on the higher
>>>>> level (although I'm not sure whether we want to have different API
>>>>> expectations, i.e. bpf_create_map_xattr ignoring the name and
>>>>> bpf_load_program_xattr not ignoring the name).
>>>>>
>>>>> So given that rationale above, what do you think is the best way to
>>>>> move forward?
>>>>> 1. Same patch, but tighten the retry check inside bpf_load_program_xattr ?
>>>>> 2. Move this retry logic into load_program and have different handling
>>>>> for bpf_create_map_xattr vs bpf_load_program_xattr ?
>>>>> 3. Do 2 and move the retry check for maps from bpf_create_map_xattr
>>>>> into bpf_object__create_maps ?
>>>>>
>>>>> (I'm slightly leaning towards #3)
>>>>
>>>> me too. I think it's cleaner for maps to do it in
>>>> bpf_object__create_maps().
>>>> Originally bpf.c was envisioned to be a thin layer on top of bpf syscall.
>>>> Whereas 'smart bits' would go into libbpf.c
>>>
>>> Can't we create in bpf_object__load() a small helper bpf_object__probe_caps()
>>> which would figure this out _once_ upon start with a few things to probe for
>>> availability in the underlying kernel for maps and programs? E.g. programs
>>> it could try to inject a tiny 'r0 = 0; exit' snippet where we figure out
>>> things like prog name support etc. Given underlying kernel doesn't change, we
>>> would only try this once and it doesn't require fallback every time.
>>
>> +1. great idea!
> Sounds good, let me try to do it.
>
> It sounds more like a recent LPC proposal/idea to have some sys_bpf option
> to query BPF features. This new bpf_object__probe_caps can probably query
> that in the future if we eventually add support for it.
>
Hi,
LPC proposal indeed. I've been working on implementing this kind of
probes in bpftool. I don't probe name support for now (but I can
certainly add it), but I detect supported program types, map types,
header functions, and a couple of other parameters. The idea (initially
from Daniel) was to dump "#define" declarations that could later be
included in a header file and used for a BPF project (or alternatively,
JSON output).
I felt like bpftool was maybe a better place to do it, as the set of
probes may grow large (all types, helpers, etc). It might have
consequences on the running system: for example, if I don't raise the
rlimit in bpftool before starting the probes, a good half of them fail
because of consecutive program creation and reclaim delay for locked
memory usage.
So should we start adding probes to libbpf and should I move mine to the
lib as well, or should the one in the v3 of this series be directed to
something like bpftool instead?
Thanks,
Quentin
next prev parent reply other threads:[~2018-11-21 13:21 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-20 0:46 [PATCH bpf-next] bpf: libbpf: retry program creation without the name Stanislav Fomichev
2018-11-20 1:24 ` Y Song
2018-11-20 19:51 ` Alexei Starovoitov
2018-11-20 21:19 ` Stanislav Fomichev
2018-11-20 23:04 ` Alexei Starovoitov
2018-11-20 23:15 ` [PATCH bpf-next v2 1/2] " Stanislav Fomichev
2018-11-20 23:15 ` [PATCH bpf-next v2 2/2] bpf: libbpf: move map name retry into libbpf.c Stanislav Fomichev
2018-11-20 23:18 ` [PATCH bpf-next] bpf: libbpf: retry program creation without the name Daniel Borkmann
2018-11-20 23:20 ` Alexei Starovoitov
2018-11-20 23:26 ` Stanislav Fomichev
2018-11-21 2:49 ` Quentin Monnet [this message]
2018-11-21 17:28 ` Stanislav Fomichev
2018-11-23 10:51 ` Quentin Monnet
2018-11-26 19:08 ` Vlad Dumitrescu
2018-11-26 19:45 ` Quentin Monnet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8bcd97bc-fffa-4cb7-2fcd-8433f56ea04d@netronome.com \
--to=quentin.monnet@netronome.com \
--cc=alexei.starovoitov@gmail.com \
--cc=ast@kernel.org \
--cc=daniel@iogearbox.net \
--cc=netdev@vger.kernel.org \
--cc=sdf@fomichev.me \
--cc=sdf@google.com \
--cc=vladum@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).