From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753338AbbJSJvu (ORCPT ); Mon, 19 Oct 2015 05:51:50 -0400 Received: from www62.your-server.de ([213.133.104.62]:56782 "EHLO www62.your-server.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750759AbbJSJvs (ORCPT ); Mon, 19 Oct 2015 05:51:48 -0400 Message-ID: <5624BD0C.3070404@iogearbox.net> Date: Mon, 19 Oct 2015 11:51:08 +0200 From: Daniel Borkmann User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Hannes Frederic Sowa , Alexei Starovoitov , "Eric W. Biederman" CC: davem@davemloft.net, viro@ZenIV.linux.org.uk, tgraf@suug.ch, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Alexei Starovoitov Subject: Re: [PATCH net-next 3/4] bpf: add support for persistent maps/progs References: <1445016105.1251655.412231129.6574D430@webmail.messagingengine.com> <5621371C.2000507@plumgrid.com> <56213A61.40509@iogearbox.net> <87d1welkp8.fsf@x220.int.ebiederm.org> <56214FAC.5060704@plumgrid.com> <87y4f2io9l.fsf@x220.int.ebiederm.org> <5621649A.80403@plumgrid.com> <87mvviidku.fsf@x220.int.ebiederm.org> <5621B5BC.8020204@plumgrid.com> <56223F00.5030203@iogearbox.net> <562301F9.1030702@plumgrid.com> <5623B4B4.2010703@iogearbox.net> <5623CD8D.7000500@iogearbox.net> <56240814.8020105@plumgrid.com> <1445240171.3728424.413797809.230D716F@webmail.messagingengine.com> In-Reply-To: <1445240171.3728424.413797809.230D716F@webmail.messagingengine.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Authenticated-Sender: daniel@iogearbox.net Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 10/19/2015 09:36 AM, Hannes Frederic Sowa wrote: > Hi, > > On Sun, Oct 18, 2015, at 22:59, Alexei Starovoitov wrote: >> On 10/18/15 9:49 AM, Daniel Borkmann wrote: >>> Okay, I have pushed some rough working proof of concept here: >>> >>> https://git.breakpoint.cc/cgit/dborkman/net-next.git/log/?h=ebpf-fds-final5 >>> >>> So the idea eventually had to be slightly modified after giving this >>> further >>> thoughts and is the following: >>> >>> We have 3 commands (BPF_DEV_CREATE, BPF_DEV_DESTROY, BPF_DEV_CONNECT), and >>> related to that a bpf_attr extension with only a single __u32 fd member >>> in it. >> ... >>> The nice thing about it is that you can create/unlink as many as you >>> want, but >>> when you remove the real device from an application via >>> bpf_dev_destroy(fd), >>> then all links disappear with it. Just like in the case of a normal >>> device driver. >> >> interesting idea! >> What happens if user app creates a dev via bpf_dev_create(), exits and >> then admin does rm of that dev ? >> Looks like map/prog will leak ? >> So the only proper way to delete such cdevs is via bpf_dev_destroy ? > > The mknod is not the holder but rather the kobject which should be > represented in sysfs will be. So you can still get the map major:minor > by looking up the /dev file in the correspdonding sysfs directory or I > think we should provide a 'unbind' file, which will drop the kobject if > the user writes a '1' to it. I agree, this could still be done. >>> On device creation, the kernel will return the minor number via bpf(2), >>> so you >>> can access the file easily, f.e. /dev/bpf/bpf_map resp. >>> /dev/bpf/bpf_prog, >>> and then move on with mknod(2) or symlink(2) from there if wished. >> >> what if admin mknod in that dir with some arbitrary minor ? > > Basically, -EIO. :) > >> mknod will succeed, but it won't hold anything? > > That is right now true for basically all mknod operations, which udev > creates. > >> looks like bpf_dev_connect will handle it gracefully. >> So these cdevs should only be created and destroyed via bpf syscall >> and only sensible operations on them is open() to get fd and pass >> to bpf_dev_connect and symlink. Anything else admin should be >> careful not to do. Right? > > Besides maybe some statistics and other stuff in sysfs directory, no, > that is all. > > Bye, > Hannes