Re: [PATCH RFC 1/2] fcntl: add F_CREATED

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Jeff Layton <jlayton@kernel.org>
To: Christian Brauner <brauner@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	 linux-fsdevel@vger.kernel.org,
	Josef Bacik <josef@toxicpanda.com>, Jann Horn <jannh@google.com>,
	Jan Kara <jack@suse.com>,
	Daan De Meyer <daan.j.demeyer@gmail.com>
Subject: Re: [PATCH RFC 1/2] fcntl: add F_CREATED
Date: Thu, 25 Jul 2024 07:16:41 -0400	[thread overview]
Message-ID: <afa789e106531ebf6f41a8b8b223ef5f6df36cf0.camel@kernel.org> (raw)
In-Reply-To: <20240725-mimik-herrichten-c51d721324bc@brauner>

On Thu, 2024-07-25 at 10:38 +0200, Christian Brauner wrote:
> On Wed, Jul 24, 2024 at 09:56:09AM GMT, Jeff Layton wrote:
> > On Wed, 2024-07-24 at 15:15 +0200, Christian Brauner wrote:
> > > Systemd has a helper called openat_report_new() that returns whether a
> > > file was created anew or it already existed before for cases where
> > > O_CREAT has to be used without O_EXCL (cf. [1]). That apparently isn't
> > > something that's specific to systemd but it's where I noticed it.
> > > 
> > > The current logic is that it first attempts to open the file without
> > > O_CREAT | O_EXCL and if it gets ENOENT the helper tries again with both
> > > flags. If that succeeds all is well. If it now reports EEXIST it
> > > retries.
> > > 
> > > That works fairly well but some corner cases make this more involved. If
> > > this operates on a dangling symlink the first openat() without O_CREAT |
> > > O_EXCL will return ENOENT but the second openat() with O_CREAT | O_EXCL
> > > will fail with EEXIST. The reason is that openat() without O_CREAT |
> > > O_EXCL follows the symlink while O_CREAT | O_EXCL doesn't for security
> > > reasons. So it's not something we can really change unless we add an
> > > explicit opt-in via O_FOLLOW which seems really ugly.
> > > 
> > > The caller could try and use fanotify() to register to listen for
> > > creation events in the directory before calling openat(). The caller
> > > could then compare the returned tid to its own tid to ensure that even
> > > in threaded environments it actually created the file. That might work
> > > but is a lot of work for something that should be fairly simple and I'm
> > > uncertain about it's reliability.
> > > 
> > > The caller could use a bpf lsm hook to hook into security_file_open() to
> > > figure out whether they created the file. That also seems a bit wild.
> > > 
> > > So let's add F_CREATED which allows the caller to check whether they
> > > actually did create the file. That has caveats of course but I don't
> > > think they are problematic:
> > > 
> > > * In multi-threaded environments a thread can only be sure that it did
> > >   create the file if it calls openat() with O_CREAT. In other words,
> > >   it's obviously not enough to just go through it's fdtable and check
> > >   these fds because another thread could've created the file.
> > > 
> > 
> > Not exactly. FMODE_CREATED is set in the file description. In principle
> > a userland program should know which thread actually did the the open()
> > that results in each fd. This new interface tells us which fd's open
> > actually resulted in the file being created (which is good).
> > 
> > In any case, I don't see this as a problem. The interface does what it
> > says on the tin.
> > 
> > > * If there's any codepaths where an openat() with O_CREAT would yield
> > >   the same struct file as that of another thread it would obviously
> > >   cause wrong results. I'm not aware of any such codepaths from openat()
> > >   itself. Imho, that would be a bug.
> > > 
> > 
> > Definitely a bug. That said, this will have interesting interactions
> > with dup that may need to be documented. IOW, if you dup a file with
> > FMODE_CREATED, then the new fd will also report that F_CREATED is true.
> 
> Yes, but it's the behavior one expects. Dup'ing an fd implies fd1->file
> == fd2->file and so it's correct behavior. IOW, someone must confuse
> open() with dup() to be surprised by this.

It's the behavior _we_ generally expect, but a lot of people don't
understand the subtleties of file descriptions vs. file descriptors.
Spelling out this behavior explicitly would probably prevent a few
headaches.

-- 
Jeff Layton <jlayton@kernel.org>

next prev parent reply	other threads:[~2024-07-25 11:16 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-24 13:15 [PATCH RFC 0/2] Add an fcntl() to check file creation Christian Brauner
2024-07-24 13:15 ` [PATCH RFC 1/2] fcntl: add F_CREATED Christian Brauner
2024-07-24 13:56   ` Jeff Layton
2024-07-25  8:38     ` Christian Brauner
2024-07-25 11:16       ` Jeff Layton [this message]
2024-07-24 16:10   ` Jan Kara
2024-07-24 13:15 ` [PATCH RFC 2/2] selftests: add F_CREATED tests Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=afa789e106531ebf6f41a8b8b223ef5f6df36cf0.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=brauner@kernel.org \
    --cc=daan.j.demeyer@gmail.com \
    --cc=jack@suse.com \
    --cc=jannh@google.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).