Subject: Re: [PATCH v1 0/7] Remove in-tree usage of MAP_DENYWRITE
To: "Eric W. Biederman"
Cc: Andy Lutomirski, Linus Torvalds, David Laight,
 Linux Kernel Mailing List, Andrew Morton, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, "H. Peter Anvin", Al Viro, Alexey Dobriyan,
 Steven Rostedt, "Peter Zijlstra (Intel)", Arnaldo Carvalho de Melo,
 Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Petr Mladek,
 Sergey Senozhatsky, Andy Shevchenko, Rasmus Villemoes, Kees Cook,
 Greg Ungerer, Geert Uytterhoeven, Mike Rapoport, Vlastimil Babka,
 Vincenzo Frascino, Chinwen Chang, Michel Lespinasse, Catalin Marinas,
 "Matthew Wilcox (Oracle)", Huang Ying, Jann Horn, Feng Tang,
 Kevin Brodsky, Michael Ellerman, Shawn Anastasio, Steven Price,
 Nicholas Piggin, Christian Brauner, Jens Axboe, Gabriel Krisman Bertazi,
 Peter Xu, Suren Baghdasaryan, Shakeel Butt, Marco Elver, Daniel Jordan,
 Nicolas Viennot, Thomas Cedeno, Collin Fijalkovich, Michal Hocko,
 Miklos Szeredi, Chengguang Xu, Christian König,
 "linux-unionfs@vger.kernel.org", Linux API, the arch/x86 maintainers,
 linux-fsdevel@vger.kernel.org, Linux-MM, Florian Weimer, Michael Kerrisk
References: <20210812084348.6521-1-david@redhat.com>
 <87o8a2d0wf.fsf@disp2133>
 <60db2e61-6b00-44fa-b718-e4361fcc238c@www.fastmail.com>
 <87lf56bllc.fsf@disp2133>
 <87eeay8pqx.fsf@disp2133>
 <5b0d7c1e73ca43ef9ce6665fec6c4d7e@AcuMS.aculab.com>
 <87h7ft2j68.fsf@disp2133>
 <0ed69079-9e13-a0f4-776c-1f24faa9daec@redhat.com>
 <87mtp3g8gv.fsf@disp2133>
From: David Hildenbrand
Organization: Red Hat
Date: Wed, 1 Sep 2021 10:28:00 +0200
In-Reply-To: <87mtp3g8gv.fsf@disp2133>

On 27.08.21 00:13, Eric W. Biederman wrote:
> David Hildenbrand writes:
>
>> On 26.08.21 19:48, Andy Lutomirski wrote:
>>> On Fri, Aug 13, 2021, at 5:54 PM, Linus Torvalds wrote:
>>>> On Fri, Aug 13, 2021 at 2:49 PM Andy Lutomirski wrote:
>>>>>
>>>>> I’ll bite.  How about we attack this in the opposite direction: remove the deny write mechanism entirely.
>>>>
>>>> I think that would be ok, except I can see somebody relying on it.
>>>>
>>>> It's broken, it's stupid, but we've done that ETXTBUSY for a _loong_ time.
>>>
>>> Someone off-list just pointed something out to me, and I think we should push harder to remove ETXTBSY.  Specifically, we've all been focused on open() failing with ETXTBSY, and it's easy to make fun of anyone opening a running program for write when they should be unlinking and replacing it.
>>>
>>> Alas, Linux's implementation of deny_write_access() is correct^Wabsurd, and deny_write_access() *also* returns ETXTBSY if the file is open for write.
>>> So, in a multithreaded program, one thread does:
>>>
>>>     fd = open("some exefile", O_RDWR | O_CREAT | O_CLOEXEC);
>>>     write(fd, some stuff);
>>>
>>>     <--- problem is here
>>>
>>>     close(fd);
>>>     execve("some exefile");
>>>
>>> Another thread does:
>>>
>>>     fork();
>>>     execve("something else");
>>>
>>> In between fork and execve, there's another copy of the open file description, and i_writecount is held, and the execve() fails.  Whoops.  See, for example:
>>>
>>> https://github.com/golang/go/issues/22315
>>>
>>> I propose we get rid of deny_write_access() completely to solve this.
>>>
>>> Getting rid of i_writecount itself seems a bit harder, since a handful of filesystems use it for clever reasons.
>>>
>>> (OFD locks seem like they might have the same problem.  Maybe we should have a clone() flag to unshare the file table and close close-on-exec things?)
>>>
>>
>> It's not like this issue is new (^2017) or relevant in practice. So no
>> need to hurry IMHO. One step at a time: it might make perfect sense to
>> remove ETXTBSY, but we have to be careful to not break other user
>> space that actually cares about the current behavior in practice.
>
> It is an old enough issue that I agree there is no need to hurry.
>
> I also ran into this issue not too long ago when I refactored the
> usermode_driver code.  My challenge was not being in userspace
> the delayed fput was not happening in my kernel thread.  Which meant
> that writing the file, then closing the file, then execing the file
> consistently reported -ETXTBSY.
>
> The kernel code wound up doing:
>     /* Flush delayed fput so exec can open the file read-only */
>     flush_delayed_fput();
>     task_work_run();
>
> As I read the code the delay for userspace file descriptors is
> always done with task_work_add, so userspace should not hit
> that kind of silliness, and should be able to actually close
> the file descriptor before the exec.
>
>
> On the flip side, I don't know how anything can depend upon getting an
> -ETXTBSY.  So I don't think there is any real risk of breaking userspace
> if we remove it.

At least in LTP, we have two test cases testing exactly that behavior:

testcases/kernel/syscalls/creat/creat07.c
testcases/kernel/syscalls/execve/execve04.c

-- 
Thanks,

David / dhildenb
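
For readers without an LTP checkout at hand, the exec side of that behavior can be sketched roughly as below. This is only an illustration and not the LTP code: the helper names (exec_in_child, exec_fresh_script), the /tmp path, and the use of a small /bin/sh script as the "executable" are assumptions made for the example. It also assumes a kernel that still refuses to exec a file that is open for writing.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from LTP): fork a child that execve()s `path`.
 * Returns 0 if the program ran and exited 0, the errno of a failed
 * execve(), or -1 on any other failure.
 */
static int exec_in_child(const char *path)
{
	int pfd[2], err = 0, status = 0;
	ssize_t n;
	pid_t pid;

	assert(path != NULL);
	/* O_CLOEXEC: a successful exec closes the write end, so read() sees EOF. */
	if (pipe2(pfd, O_CLOEXEC) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(pfd[0]);
		close(pfd[1]);
		return -1;
	}
	if (pid == 0) {
		char *const argv[] = { (char *)path, NULL };
		char *const envp[] = { NULL };

		execve(path, argv, envp);
		err = errno;	/* only reached if execve() failed */
		if (write(pfd[1], &err, sizeof(err)) != (ssize_t)sizeof(err))
			_exit(126);
		_exit(127);
	}

	close(pfd[1]);
	n = read(pfd[0], &err, sizeof(err));
	close(pfd[0]);
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	if (n == (ssize_t)sizeof(err))
		return err;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/*
 * Write a trivial shell script to a temporary file (the path is an
 * assumption) and exec it, optionally while the writer's descriptor
 * is still open.
 */
static int exec_fresh_script(int keep_open_for_write)
{
	static const char script[] = "#!/bin/sh\nexit 0\n";
	char path[] = "/tmp/etxtbsy-demo-XXXXXX";
	int fd, ret;

	fd = mkstemp(path);	/* opens the new file O_RDWR */
	if (fd < 0)
		return -1;
	if (write(fd, script, sizeof(script) - 1) != (ssize_t)(sizeof(script) - 1) ||
	    fchmod(fd, 0700) < 0) {
		close(fd);
		unlink(path);
		return -1;
	}
	if (!keep_open_for_write) {
		close(fd);
		fd = -1;
	}

	ret = exec_in_child(path);

	if (fd >= 0)
		close(fd);
	unlink(path);
	return ret;
}
```

Calling exec_fresh_script(0) closes the writer before the exec and should return 0. Calling exec_fresh_script(1) keeps the descriptor open and is expected to return ETXTBSY on kernels that still enforce the check; on a kernel without it, the script would simply run and the call would return 0.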