netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Quentin Monnet <quentin.monnet@netronome.com>
To: Stanislav Fomichev <sdf@fomichev.me>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>,
	Stanislav Fomichev <sdf@google.com>,
	netdev@vger.kernel.org, ast@kernel.org, vladum@google.com
Subject: Re: [PATCH bpf-next] bpf: libbpf: retry program creation without the name
Date: Wed, 21 Nov 2018 02:49:18 +0000	[thread overview]
Message-ID: <8bcd97bc-fffa-4cb7-2fcd-8433f56ea04d@netronome.com> (raw)
In-Reply-To: <20181120232614.kgzuoh6aj7bmo36u@mini-arch.hsd1.ca.comcast.net>

2018-11-20 15:26 UTC-0800 ~ Stanislav Fomichev <sdf@fomichev.me>
> On 11/20, Alexei Starovoitov wrote:
>> On Wed, Nov 21, 2018 at 12:18:57AM +0100, Daniel Borkmann wrote:
>>> On 11/21/2018 12:04 AM, Alexei Starovoitov wrote:
>>>> On Tue, Nov 20, 2018 at 01:19:05PM -0800, Stanislav Fomichev wrote:
>>>>> On 11/20, Alexei Starovoitov wrote:
>>>>>> On Mon, Nov 19, 2018 at 04:46:25PM -0800, Stanislav Fomichev wrote:
>>>>>>> [Recent commit 23499442c319 ("bpf: libbpf: retry map creation without
>>>>>>> the name") fixed this issue for maps, let's do the same for programs.]
>>>>>>>
>>>>>>> Since commit 88cda1c9da02 ("bpf: libbpf: Provide basic API support
>>>>>>> to specify BPF obj name"), libbpf unconditionally sets bpf_attr->name
>>>>>>> for programs. Pre v4.14 kernels don't know about programs names and
>>>>>>> return an error about unexpected non-zero data. Retry sys_bpf without
>>>>>>> a program name to cover older kernels.
>>>>>>>
>>>>>>> Signed-off-by: Stanislav Fomichev <sdf@google.com>
>>>>>>> ---
>>>>>>>  tools/lib/bpf/bpf.c | 10 ++++++++++
>>>>>>>  1 file changed, 10 insertions(+)
>>>>>>>
>>>>>>> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
>>>>>>> index 961e1b9fc592..cbe9d757c646 100644
>>>>>>> --- a/tools/lib/bpf/bpf.c
>>>>>>> +++ b/tools/lib/bpf/bpf.c
>>>>>>> @@ -212,6 +212,16 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
>>>>>>>  	if (fd >= 0 || !log_buf || !log_buf_sz)
>>>>>>>  		return fd;
>>>>>>>  
>>>>>>> +	if (fd < 0 && errno == E2BIG && load_attr->name) {
>>>>>>> +		/* Retry the same syscall, but without the name.
>>>>>>> +		 * Pre v4.14 kernels don't support prog names.
>>>>>>> +		 */
>>>>>>
>>>>>> I'm afraid that will put unnecessary stress on the kernel.
>>>>>> This check needs to be tighter.
>>>>>> Like E2BIG and anything in the log_buf probably means that
>>>>>> E2BIG came from the verifier and nothing to do with prog_name.
>>>>>> Asking kernel to repeat is an unnecessary work.
>>>>>>
>>>>>> In general we need to think beyond this single prog_name field.
>>>>>> There are bunch of other fields in bpf_load_program_xattr() and older kernels
>>>>>> won't support them. Are we going to zero them out one by one
>>>>>> and retry? I don't think that would be practical.
>>>>> I general, we don't want to zero anything out. However,
>>>>> for this particular problem the rationale is the following:
>>>>> In commit 88cda1c9da02 we started unconditionally setting {prog,map}->name
>>>>> from the 'higher' libbpfc layer which breaks users on the older kernels.
>>>>>
>>>>>> Also libbpf silently ignoring prog_name is not great for debugging.
>>>>>> A warning is needed.
>>>>>> But it cannot be done out of lib/bpf/bpf.c, since it's a set of syscall
>>>>>> wrappers.
>>>>>> Imo such "old kernel -> lets retry" feature should probably be done
>>>>>> at lib/bpf/libbpf.c level. inside load_program().
>>>>> For maps bpftools calls bpf_create_map_xattr directly, that's why
>>>>> for maps I did the retry on the lower level (and why for programs I initially
>>>>> thought about doing the same). However, in this case maybe asking
>>>>> user to omit 'name' argument might be a better option.
>>>>>
>>>>> For program names, I agree, we might think about doing it on the higher
>>>>> level (although I'm not sure whether we want to have different API
>>>>> expectations, i.e. bpf_create_map_xattr ignoring the name and
>>>>> bpf_load_program_xattr not ignoring the name).
>>>>>
>>>>> So given that rationale above, what do you think is the best way to
>>>>> move forward?
>>>>> 1. Same patch, but tighten the retry check inside bpf_load_program_xattr ?
>>>>> 2. Move this retry logic into load_program and have different handling
>>>>>    for bpf_create_map_xattr vs bpf_load_program_xattr ?
>>>>> 3. Do 2 and move the retry check for maps from bpf_create_map_xattr
>>>>>    into bpf_object__create_maps ?
>>>>>
>>>>> (I'm slightly leaning towards #3)
>>>>
>>>> me too. I think it's cleaner for maps to do it in
>>>> bpf_object__create_maps().
>>>> Originally bpf.c was envisioned to be a thin layer on top of bpf syscall.
>>>> Whereas 'smart bits' would go into libbpf.c
>>>
>>> Can't we create in bpf_object__load() a small helper bpf_object__probe_caps()
>>> which would figure this out _once_ upon start with a few things to probe for
>>> availability in the underlying kernel for maps and programs? E.g. programs
>>> it could try to inject a tiny 'r0 = 0; exit' snippet where we figure out
>>> things like prog name support etc. Given underlying kernel doesn't change, we
>>> would only try this once and it doesn't require fallback every time.
>>
>> +1. great idea!
> Sounds good, let me try to do it.
> 
> It sounds more like a recent LPC proposal/idea to have some sys_bpf option
> to query BPF features. This new bpf_object__probe_caps can probably query
> that in the future if we eventually add support for it.
> 

Hi,

LPC proposal indeed. I've been working on implementing this kind of
probes in bpftool. I don't probe name support for now (but I can
certainly add it), but I detect supported program types, map types,
header functions, and a couple of other parameters. The idea (initially
from Daniel) was to dump "#define" declarations that could later be
included in a header file and used for a BPF project (or alternatively,
JSON output).

I felt like bpftool was maybe a better place to do it, as the set of
probes may grow large (all types, helpers, etc). It might have
consequences on the running system: for example, if I don't raise the
rlimit in bpftool before starting the probes, a good half of them fail
because of consecutive program creation and reclaim delay for locked
memory usage.

So should we start adding probes to libbpf and should I move mine to the
lib as well, or should the one in the v3 of this series be directed to
something like bpftool instead?

Thanks,
Quentin

  reply	other threads:[~2018-11-21 13:21 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-20  0:46 [PATCH bpf-next] bpf: libbpf: retry program creation without the name Stanislav Fomichev
2018-11-20  1:24 ` Y Song
2018-11-20 19:51 ` Alexei Starovoitov
2018-11-20 21:19   ` Stanislav Fomichev
2018-11-20 23:04     ` Alexei Starovoitov
2018-11-20 23:15       ` [PATCH bpf-next v2 1/2] " Stanislav Fomichev
2018-11-20 23:15         ` [PATCH bpf-next v2 2/2] bpf: libbpf: move map name retry into libbpf.c Stanislav Fomichev
2018-11-20 23:18       ` [PATCH bpf-next] bpf: libbpf: retry program creation without the name Daniel Borkmann
2018-11-20 23:20         ` Alexei Starovoitov
2018-11-20 23:26           ` Stanislav Fomichev
2018-11-21  2:49             ` Quentin Monnet [this message]
2018-11-21 17:28               ` Stanislav Fomichev
2018-11-23 10:51                 ` Quentin Monnet
2018-11-26 19:08                   ` Vlad Dumitrescu
2018-11-26 19:45                     ` Quentin Monnet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8bcd97bc-fffa-4cb7-2fcd-8433f56ea04d@netronome.com \
    --to=quentin.monnet@netronome.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=ast@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=netdev@vger.kernel.org \
    --cc=sdf@fomichev.me \
    --cc=sdf@google.com \
    --cc=vladum@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).