Re: prog ID and next steps. Was: [RFC net-next 0/2] Introduce bpf_prog ID and iteration

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Daniel Borkmann <daniel@iogearbox.net>
To: Hannes Frederic Sowa <hannes@stressinduktion.org>,
	Alexei Starovoitov <ast@fb.com>, Martin KaFai Lau <kafai@fb.com>,
	netdev@vger.kernel.org
Cc: kernel-team@fb.com, "David S. Miller" <davem@davemloft.net>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Thomas Graf <tgraf@suug.ch>
Subject: Re: prog ID and next steps. Was: [RFC net-next 0/2] Introduce bpf_prog ID and iteration
Date: Fri, 28 Apr 2017 20:24:02 +0200	[thread overview]
Message-ID: <590388C2.7080000@iogearbox.net> (raw)
In-Reply-To: <44cdb2d2-9f5c-5d28-2966-3e43e6d2a2ef@stressinduktion.org>

On 04/28/2017 01:50 PM, Hannes Frederic Sowa wrote:
> On 28.04.2017 03:11, Alexei Starovoitov wrote:
[...]
>> i disagree re: kallsyms. The goal of prog_tag is to let program writers
>> understand which program is running in a stable way.
>
> But exactly it doesn't let program writers do that, it just confuses them:
>
> ---
>
> jit on:
>
> perf record -e bpf_redirect -agR
>
> The unwinder walks the stack, extracts address of upper function and
> sends it to user space (perf) or handles it inside the kernel/kallsyms
> (ftrace).
>
> User takes tag of bpf program and wants to inspect related maps to the
> program. Unfortunately the tag is not unique and thus we need to expand
> the tag back to all possible programs with the same tag and expand that
> to the union of all possible maps that those programs reference again.
>
> That is what we present to the application developer. I would seriously
> be very confused.
>
> If application developer doesn't trust perf and uses instruction pointer
> value from the stack directly he can't find out which program there is,
> because fdinfo e.g. doesn't show the actual address of where the program
> is allocated. I would use /dev/kmem now.

I don't think it would be reasonable to let fdinfo unconditionally
dump the address of the program including unprivileged progs. We
probably could add a run-time check into bpf_prog_show_fdinfo() and
show it dynamically when user has cap_sys_admin.

> ---
>
> jit off:
>
> perf probe -a '__bpf_prog_run ctx insn'
> perf probe -a 'bpf_redirect flags ifindex'
> perf record -e bpf_redirect -agR
>
> Situation doesn't change. We do get the insn pointer thus have a unique
> id for the program. That's it, no further introspection. I can read
> /dev/kmem now.
>
> ---
>
> Personally I wouldn't rely on such infrastructure.
>
> My proposal would be to maybe hash a map id into the program, so instead
> of replacing the user space file descriptor with zero, take a map id
> (like discussed below) or an inode number of the map into the register
> and hash with that, so that those program have unique identifiers.

I don't think that proposal would work, f.e. placing dev + inode number
(inode itself wouldn't be sufficient either; map would also have to be
pinned as anonymous inode from fd wouldn't work) or map id into insn
won't give you out of a sudden a unique prog id, since maps can be shared
among multiple progs, but also the same prog can be attached to, say,
multiple attachment points.

> Otherwise construct kallsym entries with prog id instead of tag.
>
> I think that the hash should try to reassemble some kind of identity
> function and mapping two programs to the same tag, that do something
> completely differently is not good (based on we don't include the map).
>
> Also I do think in future the difference between non-jit and jit
> operation in regards to tracing should also be lifted. We could add a
> manual tracing point into the interpreter for reporting the same event
> as if the program was jitted.
>
> Debugging should not be that different based on the sysctl flags.

With regards to tracing it's quite useful to see whether a program was
JITed or not JITed (aka __bpf_prog_run()), so I don't think it makes
sense to e.g. have everything named __bpf_prog_run(), at least the other
way around wouldn't work for interpreter as far as I see.

But lets assume JIT is off for a moment, and you only see __bpf_prog_run().
Then, in the stack trace you'll also see related functions that call this
in the first place, for example, mlx4_en_poll_rx_cq() / mlx4_en_process_rx_cq()
in case of XDP, meaning, you get the call path context as well, for which
you later on (with the proposed infrastructure for getting fds from
attachment points + dumping them) can return the attached prog fd and
with that also dump the code or map data.

next prev parent reply	other threads:[~2017-04-28 18:24 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-27  6:24 [RFC net-next 0/2] Introduce bpf_prog ID and iteration Martin KaFai Lau
2017-04-27  6:24 ` [RFC net-next 1/2] bpf: Introduce bpf_prog ID Martin KaFai Lau
2017-05-15  8:14   ` [lkp-robot] [bpf] de05014aba: BUG:sleeping_function_called_from_invalid_context_at_mm/slab.h kernel test robot
2017-05-15  8:14     ` kernel test robot
2017-05-15 18:57     ` David Miller
2017-05-15 18:57       ` David Miller
2017-04-27  6:24 ` [RFC net-next 2/2] bpf: Test for bpf_prog ID and BPF_PROG_GET_NEXT_ID Martin KaFai Lau
2017-04-27  7:23   ` Alexander Alemayhu
2017-04-27 21:10     ` Martin KaFai Lau
2017-04-27 13:36 ` [RFC net-next 0/2] Introduce bpf_prog ID and iteration Hannes Frederic Sowa
2017-04-27 21:14   ` Martin KaFai Lau
2017-04-28  1:11   ` prog ID and next steps. Was: " Alexei Starovoitov
2017-04-28 11:50     ` Hannes Frederic Sowa
2017-04-28 18:24       ` Daniel Borkmann [this message]
2017-04-28 18:51         ` Hannes Frederic Sowa
2017-04-28 18:57           ` Hannes Frederic Sowa
2017-04-28 19:31       ` Alexei Starovoitov
2017-04-28 21:13         ` Hannes Frederic Sowa
2017-04-30  2:06           ` Alexei Starovoitov
2017-04-30  2:36             ` Hannes Frederic Sowa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=590388C2.7080000@iogearbox.net \
    --to=daniel@iogearbox.net \
    --cc=ast@fb.com \
    --cc=brouer@redhat.com \
    --cc=davem@davemloft.net \
    --cc=hannes@stressinduktion.org \
    --cc=john.fastabend@gmail.com \
    --cc=kafai@fb.com \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    --cc=tgraf@suug.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.