netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Daniel Borkmann <daniel@iogearbox.net>
To: Hannes Frederic Sowa <hannes@stressinduktion.org>,
	Alexei Starovoitov <ast@plumgrid.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>
Cc: davem@davemloft.net, viro@ZenIV.linux.org.uk, tgraf@suug.ch,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Alexei Starovoitov <ast@kernel.org>
Subject: Re: [PATCH net-next 3/4] bpf: add support for persistent maps/progs
Date: Mon, 19 Oct 2015 16:23:27 +0200	[thread overview]
Message-ID: <5624FCDF.3090601@iogearbox.net> (raw)
In-Reply-To: <5624BD0C.3070404@iogearbox.net>

On 10/19/2015 11:51 AM, Daniel Borkmann wrote:
> On 10/19/2015 09:36 AM, Hannes Frederic Sowa wrote:
>> On Sun, Oct 18, 2015, at 22:59, Alexei Starovoitov wrote:
>>> On 10/18/15 9:49 AM, Daniel Borkmann wrote:
>>>> Okay, I have pushed some rough working proof of concept here:
>>>>
>>>> https://git.breakpoint.cc/cgit/dborkman/net-next.git/log/?h=ebpf-fds-final5
>>>>
>>>> So the idea eventually had to be slightly modified after giving this
>>>> further
>>>> thoughts and is the following:
>>>>
>>>> We have 3 commands (BPF_DEV_CREATE, BPF_DEV_DESTROY, BPF_DEV_CONNECT), and
>>>> related to that a bpf_attr extension with only a single __u32 fd member
>>>> in it.
>>> ...
>>>> The nice thing about it is that you can create/unlink as many as you
>>>> want, but
>>>> when you remove the real device from an application via
>>>> bpf_dev_destroy(fd),
>>>> then all links disappear with it. Just like in the case of a normal
>>>> device driver.
>>>
>>> interesting idea!
>>> What happens if user app creates a dev via bpf_dev_create(), exits and
>>> then admin does rm of that dev ?
>>> Looks like map/prog will leak ?
>>> So the only proper way to delete such cdevs is via bpf_dev_destroy ?
>>
>> The mknod is not the holder but rather the kobject which should be
>> represented in sysfs will be. So you can still get the map major:minor
>> by looking up the /dev file in the correspdonding sysfs directory or I
>> think we should provide a 'unbind' file, which will drop the kobject if
>> the user writes a '1' to it.
>
> I agree, this could still be done.
>
>>>> On device creation, the kernel will return the minor number via bpf(2),
>>>> so you
>>>> can access the file easily, f.e. /dev/bpf/bpf_map<minor> resp.
>>>> /dev/bpf/bpf_prog<minor>,
>>>> and then move on with mknod(2) or symlink(2) from there if wished.
>>>
>>> what if admin mknod in that dir with some arbitrary minor ?
>>
>> Basically, -EIO. :)

If an admin does a mknod that has the major of a map or prog cdev, but a
not yet used minor, then connecting to that fails. And at the time when a
real device has been created with that assigned minor, then connecting to
it succeeds.

It's nothing different than with other devices in the system, f.e. ...

   # ls -la /dev/urandom
   crw-rw-rw-. 1 root root 1, 9 Oct 19 15:18 /dev/urandom
   # mknod ./foobar c 1 9

... will make random driver available under ./foobar as well.

If your question is rather on what happens when an admin does an ``mknod
/dev/bpf/bpf_map9 c 249 11'' and the device created has a minor of 9 and
/dev/bpf/bpf_map9 already exists in the system, then udev won't auto-create
or overwrite the node pointing to the major:minor there. The device itself
is being created nevertheless and visible under /sys/class/bpf/, but I think
this is a non-issue and nothing different from any other device drivers.

As Hannes said, under /sys/class/bpf/ an admin can see all held nodes, so
visibility is there for free at all times. The device management (creation/
deletion) itself and the mknod's pointing to it are simply decoupled.

This whole approach looks sound to me, also integrates nicely into the
existing Linux facilities, and works on top of every fs supporting special
files. Much cleaner than an extra file-system that would be required by a
syscall in order to make the syscall work.

>>> mknod will succeed, but it won't hold anything?
>>
>> That is right now true for basically all mknod operations, which udev
>> creates.
>>
>>> looks like bpf_dev_connect will handle it gracefully.
>>> So these cdevs should only be created and destroyed via bpf syscall
>>> and only sensible operations on them is open() to get fd and pass
>>> to bpf_dev_connect and symlink. Anything else admin should be
>>> careful not to do. Right?
>>
>> Besides maybe some statistics and other stuff in sysfs directory, no,
>> that is all.
>>
>> Bye,
>> Hannes

  reply	other threads:[~2015-10-19 14:23 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-16  1:09 [PATCH net-next 0/4] BPF updates Daniel Borkmann
2015-10-16  1:09 ` [PATCH net-next 1/4] bpf: abstract anon_inode_getfd invocations Daniel Borkmann
2015-10-16  1:09 ` [PATCH net-next 2/4] bpf: align and clean bpf_{map,prog}_get helpers Daniel Borkmann
2015-10-16  1:09 ` [PATCH net-next 3/4] bpf: add support for persistent maps/progs Daniel Borkmann
2015-10-16 10:25   ` Hannes Frederic Sowa
2015-10-16 13:36     ` Daniel Borkmann
2015-10-16 16:36       ` Hannes Frederic Sowa
2015-10-16 17:27         ` Daniel Borkmann
2015-10-16 17:37           ` Alexei Starovoitov
2015-10-16 16:18     ` Alexei Starovoitov
2015-10-16 16:43       ` Hannes Frederic Sowa
2015-10-16 17:32         ` Alexei Starovoitov
2015-10-16 17:37           ` Thomas Graf
2015-10-16 17:21   ` Hannes Frederic Sowa
2015-10-16 17:42     ` Alexei Starovoitov
2015-10-16 17:56       ` Daniel Borkmann
2015-10-16 18:41         ` Eric W. Biederman
2015-10-16 19:27           ` Alexei Starovoitov
2015-10-16 19:53             ` Eric W. Biederman
2015-10-16 20:56               ` Alexei Starovoitov
2015-10-16 23:44                 ` Eric W. Biederman
2015-10-17  2:43                   ` Alexei Starovoitov
2015-10-17 12:28                     ` Daniel Borkmann
2015-10-18  2:20                       ` Alexei Starovoitov
2015-10-18 15:03                         ` Daniel Borkmann
2015-10-18 16:49                           ` Daniel Borkmann
2015-10-18 20:59                             ` Alexei Starovoitov
2015-10-19  7:36                               ` Hannes Frederic Sowa
2015-10-19  9:51                                 ` Daniel Borkmann
2015-10-19 14:23                                   ` Daniel Borkmann [this message]
2015-10-19 16:22                                     ` Alexei Starovoitov
2015-10-19 17:37                                       ` Daniel Borkmann
2015-10-19 18:15                                         ` Alexei Starovoitov
2015-10-19 18:46                                           ` Hannes Frederic Sowa
2015-10-19 19:34                                             ` Alexei Starovoitov
2015-10-19 20:03                                               ` Hannes Frederic Sowa
2015-10-19 20:48                                                 ` Alexei Starovoitov
2015-10-19 22:17                                                   ` Daniel Borkmann
2015-10-20  0:30                                                     ` Alexei Starovoitov
2015-10-20  8:46                                                       ` Daniel Borkmann
2015-10-20 17:53                                                         ` Alexei Starovoitov
2015-10-20 18:56                                                           ` Eric W. Biederman
2015-10-21 15:17                                                             ` Daniel Borkmann
2015-10-21 18:34                                                               ` Thomas Graf
2015-10-21 22:44                                                                 ` Alexei Starovoitov
2015-10-22 13:22                                                                   ` Daniel Borkmann
2015-10-22 19:35                                                               ` Eric W. Biederman
2015-10-23 13:47                                                                 ` Daniel Borkmann
2015-10-20  9:43                                                       ` Hannes Frederic Sowa
2015-10-19 23:02                                                   ` Hannes Frederic Sowa
2015-10-20  1:09                                                     ` Alexei Starovoitov
2015-10-20 10:07                                                       ` Hannes Frederic Sowa
2015-10-20 18:44                                                         ` Alexei Starovoitov
2015-10-16 19:54             ` Daniel Borkmann
2015-10-16  1:09 ` [PATCH net-next 4/4] bpf: add sample usages " Daniel Borkmann
2015-10-19  2:53 ` [PATCH net-next 0/4] BPF updates David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5624FCDF.3090601@iogearbox.net \
    --to=daniel@iogearbox.net \
    --cc=ast@kernel.org \
    --cc=ast@plumgrid.com \
    --cc=davem@davemloft.net \
    --cc=ebiederm@xmission.com \
    --cc=hannes@stressinduktion.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=tgraf@suug.ch \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).