From: Jeff Xu
Date: Thu, 28 Aug 2025 13:17:42 -0700
Subject: Re: [RFC PATCH v1 1/2] fs: Add O_DENY_WRITE
To: Mickaël Salaün
Cc: Jeff Xu, Andy Lutomirski, Jann Horn, Al Viro, Christian Brauner, Kees Cook, Paul Moore, Serge Hallyn, Andy Lutomirski, Arnd Bergmann, Christian Heimes, Dmitry Vyukov, Elliott Hughes, Fan Wu, Florian Weimer, Jonathan Corbet, Jordan R Abrahams, Lakshmi Ramasubramanian, Luca Boccassi, Matt Bobrowski, Miklos Szeredi, Mimi Zohar, Nicolas Bouchinet, Robert Waite, Roberto Sassu, Scott Shell, Steve Dower, Steve Grubb, kernel-hardening@lists.openwall.com, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-integrity@vger.kernel.org, linux-kernel@vger.kernel.org, linux-security-module@vger.kernel.org
In-Reply-To: <20250827.ieRaeNg4pah3@digikod.net>
References: <20250822170800.2116980-1-mic@digikod.net> <20250822170800.2116980-2-mic@digikod.net> <20250824.Ujoh8unahy5a@digikod.net> <20250825.mahNeel0dohz@digikod.net> <20250826.eWi6chuayae4@digikod.net> <20250827.ieRaeNg4pah3@digikod.net>
X-Mailing-List: linux-security-module@vger.kernel.org

Hi Mickaël

On Wed, Aug 27, 2025 at 1:19 AM Mickaël Salaün wrote:
>
> On Tue, Aug 26, 2025 at 01:29:55PM -0700, Jeff Xu wrote:
> > Hi Mickaël
> >
> > On Tue, Aug 26, 2025 at 5:39 AM Mickaël Salaün wrote:
> > >
> > > On Mon, Aug 25, 2025 at 10:57:57AM -0700, Jeff Xu wrote:
> > > > Hi Mickaël
> > > >
> > > > On Mon, Aug 25, 2025 at 2:31 AM Mickaël Salaün wrote:
> > > > >
> > > > > On Sun, Aug 24, 2025 at 11:04:03AM -0700, Andy Lutomirski wrote:
> > > > > > On Sun, Aug 24, 2025 at 4:03 AM Mickaël Salaün wrote:
> > > > > > >
> > > > > > > On Fri, Aug 22, 2025 at 09:45:32PM +0200, Jann Horn wrote:
> > > > > > > > On Fri, Aug 22, 2025 at 7:08 PM Mickaël Salaün wrote:
> > > > > > > > > Add a new O_DENY_WRITE flag usable at open time and on opened file (e.g.
> > > > > > > > > passed file descriptors).  This changes the state of the opened file by
> > > > > > > > > making it read-only until it is closed.  The main use case is for script
> > > > > > > > > interpreters to get the guarantee that script' content cannot be altered
> > > > > > > > > while being read and interpreted.  This is useful for generic distros
> > > > > > > > > that may not have a write-xor-execute policy.  See commit a5874fde3c08
> > > > > > > > > ("exec: Add a new AT_EXECVE_CHECK flag to execveat(2)")
> > > > > > > > >
> > > > > > > > > Both execve(2) and the IOCTL to enable fsverity can already set this
> > > > > > > > > property on files with deny_write_access().  This new O_DENY_WRITE make
> > > > > > > >
> > > > > > > > The kernel actually tried to get rid of this behavior on execve() in
> > > > > > > > commit 2a010c41285345da60cece35575b4e0af7e7bf44.; but sadly that had
> > > > > > > > to be reverted in commit 3b832035387ff508fdcf0fba66701afc78f79e3d
> > > > > > > > because it broke userspace assumptions.
> > > > > > > >
> > > > > > > Oh, good to know.
> > > > > > >
> > > > > > > >
> > > > > > > > > it widely available.  This is similar to what other OSs may provide
> > > > > > > > > e.g., opening a file with only FILE_SHARE_READ on Windows.
> > > > > > > >
> > > > > > > > We used to have the analogous mmap() flag MAP_DENYWRITE, and that was
> > > > > > > > removed for security reasons; as
> > > > > > > > https://man7.org/linux/man-pages/man2/mmap.2.html says:
> > > > > > > >
> > > > > > > > |        MAP_DENYWRITE
> > > > > > > > |               This flag is ignored.  (Long ago—Linux 2.0 and earlier—it
> > > > > > > > |               signaled that attempts to write to the underlying file
> > > > > > > > |               should fail with ETXTBSY.  But this was a source of denial-
> > > > > > > > |               of-service attacks.)"
> > > > > > > >
> > > > > > > > It seems to me that the same issue applies to your patch - it would
> > > > > > > > allow unprivileged processes to essentially lock files such that other
> > > > > > > > processes can't write to them anymore. This might allow unprivileged
> > > > > > > > users to prevent root from updating config files or stuff like that if
> > > > > > > > they're updated in-place.
> > > > > > > >
> > > > > > > Yes, I agree, but since it is the case for executed files I though it
> > > > > > > was worth starting a discussion on this topic.  This new flag could be
> > > > > > > restricted to executable files, but we should avoid system-wide locks
> > > > > > > like this.  I'm not sure how Windows handle these issues though.
> > > > > > >
> > > > > > > Anyway, we should rely on the access control policy to control write and
> > > > > > > execute access in a consistent way (e.g. write-xor-execute).  Thanks for
> > > > > > > the references and the background!
> > > > > >
> > > > > > I'm confused.  I understand that there are many contexts in which one
> > > > > > would want to prevent execution of unapproved content, which might
> > > > > > include preventing a given process from modifying some code and then
> > > > > > executing it.
> > > > > >
> > > > > > I don't understand what these deny-write features have to do with it.
> > > > > > These features merely prevent someone from modifying code *that is
> > > > > > currently in use*, which is not at all the same thing as preventing
> > > > > > modifying code that might get executed -- one can often modify
> > > > > > contents *before* executing those contents.
> > > > >
> > > > > The order of checks would be:
> > > > > 1. open script with O_DENY_WRITE
> > > > > 2. check executability with AT_EXECVE_CHECK
> > > > > 3. read the content and interpret it
> > > > >
> > > > I'm not sure about the O_DENY_WRITE approach, but the problem is worth solving.
> > > >
> > > > AT_EXECVE_CHECK is not just for scripting languages. It could also
> > > > work with bytecodes like Java, for example. If we let the Java runtime
> > > > call AT_EXECVE_CHECK before loading the bytecode, the LSM could
> > > > develop a policy based on that.
> > >
> > > Sure, I'm using "script" to make it simple, but this applies to other
> > > use cases.
> > >
> > That makes sense.
> >
> > > >
> > > > > The deny-write feature was to guarantee that there is no race condition
> > > > > between step 2 and 3.  All these checks are supposed to be done by a
> > > > > trusted interpreter (which is allowed to be executed).  The
> > > > > AT_EXECVE_CHECK call enables the caller to know if the kernel (and
> > > > > associated security policies) allowed the *current* content of the file
> > > > > to be executed.  Whatever happen before or after that (wrt.
> > > > > O_DENY_WRITE) should be covered by the security policy.
> > > > >
> > > > Agree, the race problem needs to be solved in order for AT_EXECVE_CHECK.
> > > >
> > > > Enforcing non-write for the path that stores scripts or bytecodes can
> > > > be challenging due to historical or backward compatibility reasons.
> > > > Since AT_EXECVE_CHECK provides a mechanism to check the file right
> > > > before it is used, we can assume it will detect any "problem" that
> > > > happened before that, (e.g. the file was overwritten). However, that
> > > > also imposes two additional requirements:
> > > > 1> the file doesn't change while AT_EXECVE_CHECK does the check.
> > >
> > > This is already the case, so any kind of LSM checks are good.
> > >
> > May I ask how this is done? some code in do_open_execat() does this ?
> > Apologies if this is a basic question.
>
> do_open_execat() calls exe_file_deny_write_access()
>
Thanks for pointing that out.
With that, I have now read the full history of the discussion on this :-)

> >
> > > > 2>The file content kept by the process remains unchanged after passing
> > > > the AT_EXECVE_CHECK.
> > >
> > > The goal of this patch was to avoid such race condition in the case
> > > where executable files can be updated.  But in most cases it should not
> > > be a security issue (because processes allowed to write to executable
> > > files should be trusted), but this could still lead to bugs (because of
> > > inconsistent file content, half-updated).
> > >
> > There is also a time gap between:
> > a> the time of AT_EXECVE_CHECK
> > b> the time that the app opens the file for execution.
> > right ? another potential attack path (though this is not the case I
> > mentioned previously).
>
> As explained in the documentation, to avoid this specific race
> condition, interpreters should open the script once, check the FD with
> AT_EXECVE_CHECK, and then read the content with the same FD.
>
Ya, now I see that in the description of this patch, sorry that I
missed that previously.

> >
> > For the case I mentioned previously, I have to think more if the race
> > condition is a bug or security issue.
> > IIUC, two solutions are discussed so far:
> > 1> the process could write to fs to update the script.  However, for
> > execution, the process still uses the copy that passed the
> > AT_EXECVE_CHECK. (snapshot solution by Andy Lutomirski)
>
> Yes, the snapshot solution would be the best, but I guess it would rely
> on filesystems to support this feature.
>
Snapshot seems to be the reasonable direction to go.
Is this something related to the VMA? E.g., preserve the in-memory copy
of the file when the file on the fs is updated.

According to man mmap:
 MAP_PRIVATE
              Create a private copy-on-write mapping.  Updates to the
              mapping are not visible to other processes mapping the
              same file, and are not carried through to the underlying
              file.  It is unspecified whether changes made to the file
              after the mmap() call are visible in the mapped region.

So the direction here is:
the process -> updates the VMA -> doesn't carry to the file.

What we want is the reverse direction (the unspecified part in the man page):
file updated on fs -> doesn't carry to the VMA of this process.

> > or 2> the process blocks the write while opening the file as read only
> > and executing the script. (this seems to be the approach of this
> > patch).
>
> Yes, and this is not something we want anymore.
>
Right. Thank you for clarifying this.

> >
> > I wonder if there are other ideas.
>
> I don't see other efficient ways to give the same guarantees.

Right, me neither.

Thanks and regards,
-Jeff
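
As a footnote to the thread: the open-once flow described above (open the script once, check that FD with AT_EXECVE_CHECK, then read the content through the same FD) might look roughly like the C sketch below. It is only an illustration, not code from the patch series. open_checked_script() is a hypothetical helper name, the fallback value for AT_EXECVE_CHECK assumes the definition in recent uapi headers, and kernels that predate the flag are expected to reject it with EINVAL.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for older libc headers. Assumption: values as in Linux uapi. */
#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000
#endif
#ifndef AT_EXECVE_CHECK
#define AT_EXECVE_CHECK 0x10000
#endif

/*
 * Hypothetical helper: open `path` once and ask the kernel whether executing
 * the opened file would be allowed. Returns an FD that the interpreter should
 * read the script from, or -1 with errno set.
 */
static int open_checked_script(const char *path)
{
	char *const argv[] = { (char *)path, NULL };
	char *const envp[] = { NULL };
	int fd, saved_errno;

	fd = open(path, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;

	/* Check only: with AT_EXECVE_CHECK the file is not actually executed. */
	if (syscall(SYS_execveat, fd, "", argv, envp,
		    AT_EMPTY_PATH | AT_EXECVE_CHECK) == 0)
		return fd;

	/* Denied by policy, or flag unknown to this kernel: do not interpret. */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;
	return -1;
}
```

A caller would read() the script from the returned FD instead of reopening the path, which is what closes the gap between the check and the use. The sketch does nothing about the file being rewritten in place while it is being read, which is the part still open in this discussion.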