From mboxrd@z Thu Jan  1 00:00:00 1970
From: Daniel Borkmann <daniel-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org>
Subject: Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well
Date: Fri, 11 Sep 2015 18:03:59 +0200
Message-ID: <55F2FB6F.7050708@iogearbox.net>
References: <1441930862-14347-1-git-send-email-tycho.andersen@canonical.com> <1441930862-14347-3-git-send-email-tycho.andersen@canonical.com> <55F2D0EC.9090004@iogearbox.net> <20150911144400.GI27574@smitten>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Return-path: <linux-api-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
In-Reply-To: <20150911144400.GI27574@smitten>
Sender: linux-api-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Tycho Andersen <tycho.andersen-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Cc: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>, Alexei Starovoitov <ast-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>, "David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>, Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>, Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>, Pavel Emelyanov <xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>, "Serge E. Hallyn" <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
List-Id: linux-api@vger.kernel.org

On 09/11/2015 04:44 PM, Tycho Andersen wrote:
> On Fri, Sep 11, 2015 at 03:02:36PM +0200, Daniel Borkmann wrote:
>> On 09/11/2015 02:20 AM, Tycho Andersen wrote:
>>> In the next patch, we're going to add a way to access the underlying
>>> filters via bpf fds. This means that we need to ref-count both the
>>> struct seccomp_filter objects and the struct bpf_prog objects separately,
>>> in case a process dies but a filter is still referred to by another
>>> process.
>>>
>>> Additionally, we mark classic converted seccomp filters as seccomp eBPF
>>> programs, since they are a subset of what is supported in seccomp eBPF.
>>>
>>> Signed-off-by: Tycho Andersen <tycho.andersen-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
>>> CC: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
>>> CC: Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
>>> CC: Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>>> CC: Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
>>> CC: Pavel Emelyanov <xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
>>> CC: Serge E. Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>
>>> CC: Alexei Starovoitov <ast-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
>>> CC: Daniel Borkmann <daniel-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org>
>>> ---
>>>   kernel/seccomp.c | 4 +++-
>>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>>> index 245df6b..afaeddf 100644
>>> --- a/kernel/seccomp.c
>>> +++ b/kernel/seccomp.c
>>> @@ -378,6 +378,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
>>>   	}
>>>
>>>   	atomic_set(&sfilter->usage, 1);
>>> +	atomic_set(&sfilter->prog->aux->refcnt, 1);
>>> +	sfilter->prog->type = BPF_PROG_TYPE_SECCOMP;
>>
>> So, if you do this, then this breaks the assumption of eBPF JITs
>> that, currently, all classic converted BPF programs always have a
>> prog->type of BPF_PROG_TYPE_UNSPEC (see: bpf_prog_was_classic()).
>>
>> Currently, JITs make use of this information to determine whether
>> A and X mappings for such programs should or should not be cleared
>> in the prologue (s390 currently).
>>
>> In the seccomp_prepare_filter() stage, we're already past that, so
>> it will not cause an issue, but we certainly would need to be very
>> careful in future, if bpf_prog_was_classic() is then used at a later
>> stage when we already have a generated bpf_prog somewhere, as then
>> this assumption will break.
>
> The only reason we need to do this is to allow BPF_DUMP_PROG to work,
> since we were restricting it to only allow dumping of seccomp
> programs, since those don't have maps. Instead, perhaps we could allow
> dumping of BPF_PROG_TYPE_SECCOMP and BPF_PROG_TYPE_UNSPEC?

BPF_PROG_TYPE_UNSPEC programs can already call helpers today, at least
in the networking case, though not in seccomp. So, since you want to
export [classic -> eBPF] only for seccomp, put fds on them and dump
these via bpf(2), you could allow that (with a big comment stating why
it's safe). Mid-term, though, we really need to sanitize all this
properly, as it is needed for other program types, too.
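
To make the suggestion concrete, here is a rough userspace sketch of the
kind of check I have in mind. It is not kernel code: every model_* name
and the owned_by_seccomp flag are made up for illustration, and how the
kernel would actually know that a program came from seccomp is left
open. It only shows that UNSPEC alone must not be enough to allow a dump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative model of the program types discussed; values are arbitrary. */
enum model_prog_type {
	MODEL_PROG_TYPE_UNSPEC = 0,	/* classic BPF converted to eBPF */
	MODEL_PROG_TYPE_SOCKET_FILTER,	/* native eBPF, may use maps/helpers */
	MODEL_PROG_TYPE_SECCOMP,	/* type proposed in this series */
};

struct model_prog {
	enum model_prog_type type;
	/* Hypothetical flag: program was installed through seccomp. */
	bool owned_by_seccomp;
};

/* Same test bpf_prog_was_classic() relies on: type is still UNSPEC. */
static inline bool model_prog_was_classic(const struct model_prog *prog)
{
	assert(prog != NULL);
	return prog->type == MODEL_PROG_TYPE_UNSPEC;
}

/*
 * Dumping is only considered safe for seccomp programs here.
 * A classic-converted (UNSPEC) program is allowed solely because it
 * is known to come from seccomp, where no helpers are called; an
 * UNSPEC program from networking may call helpers and is rejected.
 */
static inline bool model_prog_dump_allowed(const struct model_prog *prog)
{
	assert(prog != NULL);
	if (prog->type == MODEL_PROG_TYPE_SECCOMP)
		return true;
	return model_prog_was_classic(prog) && prog->owned_by_seccomp;
}
```

With this shape, prog->type can stay BPF_PROG_TYPE_UNSPEC for converted
filters, so the JIT assumption around bpf_prog_was_classic() is left
untouched.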