From: ebiederm@xmission.com (Eric W. Biederman)
To: Al Viro
Cc: Linus Torvalds, Linux Kernel Mailing List, linux-fsdevel,
    Christian Brauner, Oleg Nesterov, Jann Horn
Subject: Re: [PATCH] files: rcu free files_struct
Date: Thu, 10 Dec 2020 13:29:01 -0600
Message-ID: <87h7oto3ya.fsf@x220.int.ebiederm.org>
In-Reply-To: <20201210061304.GS3579531@ZenIV.linux.org.uk> (Al Viro's message of "Thu, 10 Dec 2020 06:13:04 +0000")
References: <20201120231441.29911-15-ebiederm@xmission.com>
    <20201207232900.GD4115853@ZenIV.linux.org.uk>
    <877dprvs8e.fsf@x220.int.ebiederm.org>
    <20201209040731.GK3579531@ZenIV.linux.org.uk>
    <877dprtxly.fsf@x220.int.ebiederm.org>
    <20201209142359.GN3579531@ZenIV.linux.org.uk>
    <87o8j2svnt.fsf_-_@x220.int.ebiederm.org>
    <20201209195033.GP3579531@ZenIV.linux.org.uk>
    <87sg8er7gp.fsf@x220.int.ebiederm.org>
    <20201210061304.GS3579531@ZenIV.linux.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

Al Viro writes:

> On Wed, Dec 09, 2020 at 03:32:38PM -0600, Eric W. Biederman wrote:
>> Al Viro writes:
>>
>> > On Wed, Dec 09, 2020 at 11:13:38AM -0800, Linus Torvalds wrote:
>> >> On Wed, Dec 9, 2020 at 10:05 AM Eric W. Biederman wrote:
>> >> >
>> >> > -               struct file * file = xchg(&fdt->fd[i], NULL);
>> >> > +               struct file * file = fdt->fd[i];
>> >> >                 if (file) {
>> >> > +                       rcu_assign_pointer(fdt->fd[i], NULL);
>> >>
>> >> This makes me nervous. Why did we use to do that xchg() there? That
>> >> has atomicity guarantees that now are gone.
>> >>
>> >> Now, this whole thing should be called for just the last ref of the fd
>> >> table, so presumably that atomicity was never needed in the first
>> >> place. But the fact that we did that very expensive xchg() then makes
>> >> me go "there's some reason for it".
>> >>
>> >> Is this xchg() just bogus historical leftover? It kind of looks that
>> >> way. But maybe that change should be done separately?
>> >
>> > I'm still not convinced that exposing close_files() to parallel
>> > 3rd-party accesses is safe in all cases, so this patch still needs
>> > more analysis.
>>
>> That is fine. I just wanted to post the latest version so we could
>> continue the discussion. Especially with comments etc.
>
> It's probably safe. I've spent today digging through the mess in
> fs/notify and kernel/bpf, and while I'm disgusted with both, at
> that point I believe that close_files() exposure is not going to
> create problems with either. And xchg() in there _is_ useless.

Then I will work on a cleaned up version.

> Said that, BPF "file iterator" stuff is potentially very unpleasant -
> it allows to pin a struct file found in any process' descriptor table
> indefinitely long. Temporary struct file references grabbed by procfs
> code, while unfortunate, are at least short-lived; with this stuff sky's
> the limit.
>
> I'm not happy about having that available, especially if it's a user-visible
> primitive we can't withdraw at zero notice ;-/
>
> What are the users of that thing and is there any chance to replace it
> with something saner? IOW, what *is* realistically called for each
> struct file by the users of that iterator?

The bpf guys are no longer Cc'd and they can probably answer better than
I.

In a previous conversation it was mentioned that task_iter was supposed
to be a high performance interface for getting proc like data out of the
kernel using bpf.

If so, I think that handles the lifetime issues, as bpf programs are
supposed to be short-lived and cannot pass references anywhere.

On the flip side, it raises the question: did the BPF guys just make the
current layout of task_struct and struct file part of the Linux kernel
user space ABI?

Eric
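
For readers following the xchg() question quoted above, here is a minimal
userspace sketch of the two teardown styles from the diff. It is not kernel
code. It assumes C11 <stdatomic.h> as a loose stand-in for xchg() and for
the plain load plus rcu_assign_pointer(..., NULL) pair, and every name in it
(fake_file, fake_fdtable, close_all_xchg, close_all_plain, fdtable_init,
fdtable_install) is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_SLOTS 4

/* Hypothetical stand-in for struct file: only counts releases. */
struct fake_file {
    int released;
};

/* Hypothetical stand-in for an fd table: an array of atomic pointers. */
struct fake_fdtable {
    _Atomic(struct fake_file *) fd[NR_SLOTS];
};

/* Set every slot to NULL. */
static void fdtable_init(struct fake_fdtable *fdt)
{
    assert(fdt != NULL);
    for (size_t i = 0; i < NR_SLOTS; i++)
        atomic_init(&fdt->fd[i], (struct fake_file *)NULL);
}

/* Publish a file pointer in one slot. */
static void fdtable_install(struct fake_fdtable *fdt, size_t slot,
                            struct fake_file *file)
{
    assert(fdt != NULL && slot < NR_SLOTS);
    atomic_store(&fdt->fd[slot], file);
}

/* Old style: atomically swap each slot with NULL (models xchg()). */
static int close_all_xchg(struct fake_fdtable *fdt)
{
    int closed = 0;

    for (size_t i = 0; i < NR_SLOTS; i++) {
        struct fake_file *file =
            atomic_exchange(&fdt->fd[i], (struct fake_file *)NULL);
        if (file) {
            file->released++;
            closed++;
        }
    }
    return closed;
}

/* New style: load the slot, then clear it with a separate store. */
static int close_all_plain(struct fake_fdtable *fdt)
{
    int closed = 0;

    for (size_t i = 0; i < NR_SLOTS; i++) {
        struct fake_file *file =
            atomic_load_explicit(&fdt->fd[i], memory_order_relaxed);
        if (file) {
            atomic_store_explicit(&fdt->fd[i], (struct fake_file *)NULL,
                                  memory_order_release);
            file->released++;
            closed++;
        }
    }
    return closed;
}
```

With a single owner tearing the table down, both loops release each file
exactly once and leave every slot NULL. They only differ if another thread
could clear the same slot concurrently: the exchange guarantees that exactly
one caller sees the non-NULL pointer, while the load-then-store pair does
not. That is the atomicity being asked about in the quoted review.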