From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752025AbdLABdT (ORCPT );
        Thu, 30 Nov 2017 20:33:19 -0500
Received: from zeniv.linux.org.uk ([195.92.253.2]:52490 "EHLO
        ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751500AbdLABdS (ORCPT );
        Thu, 30 Nov 2017 20:33:18 -0500
Date: Fri, 1 Dec 2017 01:33:04 +0000
From: Al Viro
To: Kees Cook
Cc: Shmulik Ladkani, Willem de Bruijn, Daniel Borkmann,
        Pablo Neira Ayuso, Linus Torvalds, David Miller, LKML,
        Network Development, Christoph Hellwig, Thomas Garnier, Jann Horn
Subject: Re: netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'
Message-ID: <20171201013304.GM21978@ZenIV.linux.org.uk>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.9.0 (2017-09-02)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Nov 30, 2017 at 04:57:30PM -0800, Kees Cook wrote:
> On Mon, Oct 9, 2017 at 4:10 PM, David Miller wrote:
> > Shmulik Ladkani (1):
> >       netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'
>
> This adds a new user of set_fs(), which we're trying to eliminate (or
> at least not expand):
>
> +       set_fs(KERNEL_DS);
> +       fd = bpf_obj_get_user(path);
> +       set_fs(oldfs);
>
> Can you please adjust this to not make set_fs() changes?

That's not the worst problem there. Messing with descriptor table is much
worse. It can be shared between threads; by the time you get to fdget()
the damn thing might have nothing to do with what bpf_obj_get_user() has
put there, ditto for sys_close().

Use of file descriptors should be limited to "got a number from userland,
convert to struct file *" on the way in and "install struct file * into
descriptor table and return the descriptor to userland" on the way out.
And the latter - *ONLY* after the last possible point of failure. Once
a file reference is inserted into descriptor table, that's it - you can't
undo that.

The only way to use bpf_obj_get_user() is to pass its return value to
userland. As return value of syscall - not even put_user() (for that
you'd need to reserve the descriptor, copy it to userland and only then
attach struct file * to it).

The whole approach stinks - what it needs is something that would take
struct filename * and return struct bpf_prog * or struct file * reference.
With bpf_obj_get_user() and this thing implemented via that.

I'm looking into that thing...
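
For illustration only, a minimal userland sketch of why a descriptor
number can't be trusted between the point where something is put into
the table and the point where it is looked up again. This is not code
from the patch or from the kernel; fd_number_was_reused() and
interfere() are hypothetical names, the two device paths are arbitrary
choices, and interfere() is called inline as a stand-in for what
another thread sharing the descriptor table could do in that window.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a second thread that shares the descriptor
 * table: close the number we were handed and open something else.
 * open() returns the lowest free descriptor, so the same number comes
 * back, now referring to a different file.
 */
static void interfere(int fd)
{
	int newfd;

	close(fd);
	newfd = open("/dev/zero", O_RDONLY);
	assert(newfd == fd);	/* lowest-free-number rule */
	(void)newfd;
}

/*
 * Returns 1 if the descriptor number obtained from open() refers to a
 * different file when it is looked up again, 0 if it still refers to
 * the same one, -1 on error.
 */
int fd_number_was_reused(void)
{
	struct stat before, after;
	int fd = open("/dev/null", O_RDONLY);

	if (fd < 0 || fstat(fd, &before) < 0)
		return -1;

	/* The window between "installed in the table" and "looked up". */
	interfere(fd);

	if (fstat(fd, &after) < 0)
		return -1;
	close(fd);

	/* Both are character devices; different st_rdev, different file. */
	return before.st_rdev != after.st_rdev;
}
```

Calling fd_number_was_reused() on a system where /dev/null and
/dev/zero are the usual character devices should report that the number
ended up naming a different file, which is the reason to hold on to the
struct file * reference rather than going back through the number.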