Re: [PATCH v6 net-next 4/6] bpf: enable bpf syscall on x64 and i386

From: Ingo Molnar <mingo@kernel.org>
To: Alexei Starovoitov <ast@plumgrid.com>
Cc: David Miller <davem@davemloft.net>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andy Lutomirski <luto@amacapital.net>,
	Steven Rostedt <rostedt@goodmis.org>,
	Daniel Borkmann <dborkman@redhat.com>,
	Chema Gonzalez <chema@google.com>,
	Eric Dumazet <edumazet@google.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Brendan Gregg <brendan.d.gregg@gmail.com>,
	Namhyung Kim <namhyung@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Kees Cook <keescook@chromium.org>,
	Linux API <linux-api@vger.kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v6 net-next 4/6] bpf: enable bpf syscall on x64 and i386
Date: Tue, 26 Aug 2014 09:45:34 +0200
Message-ID: <20140826074534.GA19799@gmail.com>
In-Reply-To: <CAMEtUuy1DXFMABAg2Uup5HtqmiJHw0WR=or-z9CfpVMscrVcVg@mail.gmail.com>

* Alexei Starovoitov <ast@plumgrid.com> wrote:

> On Mon, Aug 25, 2014 at 6:07 PM, David Miller <davem@davemloft.net> wrote:
> > From: Alexei Starovoitov <ast@plumgrid.com>
> > Date: Mon, 25 Aug 2014 18:00:56 -0700
> >
> >> -
> >> +asmlinkage long sys_bpf(int cmd, unsigned long arg2, unsigned long arg3,
> >> +                     unsigned long arg4, unsigned long arg5);
> >
> > Please do not add interfaces with opaque types as arguments.
> >
> > It is impossible for the compiler to type check the args at
> > compile time when userspace tries to use this stuff.
> 
> I share this concern. I went with single BPF syscall, because
> alternative is 6 syscalls for every command and more
> syscalls in the future when we'd need to add another command.

We had a similar problem growing the perf syscall - and we were
able to hold to a single syscall, which I think has served us
well. Had we gone with a per-functionality syscall we'd have
something like a dozen syscalls today, scattered
non-contiguously around the syscall space on most platforms.

But note that 'opaque or non-opaque' is a false dichotomy, as
there are 3 options in reality: what we used instead of an opaque
type was an extensible data type, an extensible C structure,
with the expected structure size stored in the structure itself.

See 'struct perf_event_attr':

SYSCALL_DEFINE5(perf_event_open,
                struct perf_event_attr __user *, attr_uptr,
                pid_t, pid, int, cpu, int, group_fd, unsigned long, flags)

That way new versions of the data type are immediately obvious to
the kernel, and compatibility can be handled well. Smaller,
previous versions received from old user-space are padded out
transparently to the kernel's size of the structure, with zeroes
filled in.

See perf_copy_attr() in kernel/events/core.c. Instead of
versioning the structure, we in essence use its size as a
fine-grained and robust version indicator.
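
A minimal user-space sketch of the zero-padding idea described
above. It is not the actual perf_copy_attr() code: the structure
names, fields and the demo_copy_attr_padded() helper are all
hypothetical, and plain memcpy() stands in for copy_from_user().

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical first layout: what an old user-space was built against. */
struct demo_attr_v1 {
	uint32_t size;	/* sizeof() of the structure as seen by the caller */
	uint32_t type;
};

/* Hypothetical current layout: v1 plus a field appended later. */
struct demo_attr {
	uint32_t size;
	uint32_t type;
	uint64_t flags;	/* new field; zero means "old behaviour" */
};

#define DEMO_ATTR_SIZE_VER0 ((uint32_t)sizeof(struct demo_attr_v1))

/*
 * Copy a caller-supplied attr of the current or an older size into
 * the current layout, zero-filling the fields the caller did not know.
 * Returns 0 on success, -EINVAL if the size is smaller than the first
 * layout, -E2BIG if it is larger than this sketch understands.
 */
int demo_copy_attr_padded(const void *uattr, struct demo_attr *attr)
{
	uint32_t size;

	/* The size field comes first, so it can be read before the rest. */
	memcpy(&size, uattr, sizeof(size));
	if (size < DEMO_ATTR_SIZE_VER0)
		return -EINVAL;
	if (size > sizeof(*attr))
		return -E2BIG;	/* newer caller: out of scope here */

	memset(attr, 0, sizeof(*attr));	/* unknown tail reads as zero */
	memcpy(attr, uattr, size);	/* copy only what the caller has */
	return 0;
}
```

The only rules this relies on are that new fields are appended at
the end and that an all-zero value of a new field means "behave as
before".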

That way it's both forwards and backwards compatible, as much as
possible technically: a new kernel can run old user-space, and new
user-space will be able to take advantage of as much of an old
kernel's capabilities as possible, and in the typical case of
version match there's no extra overhead worth speaking of.
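
The other direction, new user-space on an old kernel, can be
sketched as below. Again this is only an illustration under
assumptions: struct old_kernel_attr, DEMO_MAX_ATTR_SIZE and
old_kernel_copy_attr() are hypothetical names, and direct memory
access stands in for copy_from_user()/put_user(). A larger
structure is accepted as long as the fields the kernel does not
know about are all zero; otherwise the call fails with -E2BIG and
the size the kernel does understand is written back.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout known to the (older) kernel in this sketch. */
struct old_kernel_attr {
	uint32_t size;
	uint32_t type;
};

/* Assumed upper bound on what a caller may claim as its size. */
#define DEMO_MAX_ATTR_SIZE 4096u

/*
 * ubuf points at the caller's structure; its first u32 is its size.
 * Returns 0 on success, -EINVAL for a too-small size, -E2BIG if the
 * caller uses fields beyond what this layout knows about.
 */
int old_kernel_copy_attr(void *ubuf, struct old_kernel_attr *attr)
{
	const unsigned char *bytes = ubuf;
	uint32_t known = sizeof(*attr);
	uint32_t size, i;

	memcpy(&size, ubuf, sizeof(size));
	if (size < known)
		return -EINVAL;	/* older layouts are out of scope here */
	if (size > DEMO_MAX_ATTR_SIZE)
		goto too_big;

	/* Unknown trailing fields are acceptable only if all zero. */
	for (i = known; i < size; i++) {
		if (bytes[i] != 0)
			goto too_big;
	}

	memcpy(attr, ubuf, known);
	return 0;

too_big:
	/* Tell the caller which size we do understand. */
	memcpy(ubuf, &known, sizeof(known));
	return -E2BIG;
}
```

With the size written back, new user-space can retry with a
smaller structure and fall back to the features the old kernel
actually has.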

This way we were able to gradually grow to the sophisticated ABI 
you can find in include/uapi/linux/perf_event.h, without having 
to touch the syscall interface. (It's not the only method: we 
also have a handful of ioctls, where that's the most natural 
interface for a perf event fd.)

Thanks,

	Ingo
