From: Omar Sandoval <osandov@osandov.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Trond Myklebust <trondmy@hammerspace.com>,
"amir73il@gmail.com" <amir73il@gmail.com>,
"dhowells@redhat.com" <dhowells@redhat.com>,
"lsf-pc@lists.linux-foundation.org"
<lsf-pc@lists.linux-foundation.org>, "hch@lst.de" <hch@lst.de>,
"miklos@szeredi.hu" <miklos@szeredi.hu>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination
Date: Fri, 17 Jan 2020 15:54:44 -0800 [thread overview]
Message-ID: <20200117235444.GC295250@vader> (raw)
In-Reply-To: <20200117222212.GP8904@ZenIV.linux.org.uk>
On Fri, Jan 17, 2020 at 10:22:12PM +0000, Al Viro wrote:
> On Fri, Jan 17, 2020 at 12:22:19PM -0800, Omar Sandoval wrote:
>
> > Since manpage troff patches aren't particularly nice to read, here is
> > what I had in mind in a readable form:
> >
> > int linkat(int olddirfd, const char *oldpath, int newdirfd,
> > const char *newpath, int flags);
> >
> > AT_REPLACE
> > If newpath exists, replace it atomically. There is no
> > point at which another process attempting to access
> > newpath will find it missing.
> >
> > newpath must not be a directory.
> >
> > If oldpath and newpath are existing hard links referring
> > to the same file, then this does nothing, and returns a
> > success status. If newpath is a mount point, then this
> > comparison considers the mount point, not the mounted
> > file or directory.
> >
> > If newpath refers to a symbolic link, the link will be
> > overwritten.
> >
> > ERRORS
> > EBUSY AT_REPLACE was specified in flags, newpath is a mount
> > point, and the mount point does not refer to the same
> > file as oldpath. Or, AT_REPLACE was specified in flags
> > and newpath ends with a . or .. component.
>
> > EISDIR AT_REPLACE was specified in flags and newpath is an
> > existing directory.
Thanks for taking a look.
> So are <existing directory>/. or <existing directory>/.., so I wonder why
> bother with -EBUSY there.
This was for consistency with rename, as it happens to check for . and
.. before it checks whether oldpath is a directory. I don't feel
strongly either way.
> As for the gaps...
> 1) emptypath for new. Should be an error; EINVAL, probably.
> Avoids arseloads of fun cases ("what if newfd/newpath refers to something
> unlinked?", etc., etc.)
linkat(2) explicitly mentions that AT_EMPTY_PATH only applies to
oldpath, so this falls back to the standard ENOENT documented in
path_resolution(7).
> 2) mountpoint vs. mountpoint in the local namespace. See the
> rationale in that area for unlink()/rmdir()/rename().
I'll add that.
Side-note, unlink(2), rmdir(2), and rename(2) should probably mention
this behavior, too...
> 3) permission checks need to be specified
I believe the only difference here vs standard linkat is that newpath
must not be immutable or append-only?
> 4) as for the hardlinks, I would be very careful with the language;
> if anything, that's probably "if the link specified by newfd/newpath points
> to the object specified by oldfd/oldpath..."
It seems like there's room for elaborating on the distinction between
path-to-a-file vs. path-to-a-dir-entry in path_resolution(7). In the
meantime, I'll mention it explicitly.
> Another thing is, as you
> could've seen for rename(), such "permissive" clauses tend to give surprising
> results. For example, put the check for "it's a hardlink" early enough,
> and you've suddenly gotten a situation when it *can* succeed for directory.
> Or on a filesystem that never allowed hardlinks (the latter is probably
> unavoidable).
Good point. Unless I'm missing something, the only way you'll end up
with two paths referring to the same directory/file on a filesystem not
supporting hardlinks is through multiple mounts of the same filesystem,
and I check for the EXDEV case earlier than the same file case. I'll
double-check for other cases.
> 5) what _really_ needs to be said is that other links to
> newpath are unaffected and so are previously opened files.
Done.
> 6) "atomically" is bloody vague; the explanation you've put there
> is not enough, IMO - in addition to "nobody sees it gone in the middle
> of operation" there's also "if the operation fails, the target remains
> undisturbed" and something along the lines of "if filesystem makes any
> promises about the state after dirty shutdown in the middle of rename(),
> the same promises should apply here".
Done. I'd be nice if I could say "refer to rename(2) for durability
guarantees", but rename(2) doesn't mention it at all.
> references to pathconf, Cthulhu and other equally delightful entities
> are not really welcome.
EOPNOTSUPP is probably the most helpful.
> There might be be more - like I said, it needs to go through
> linux-abi, fsdevel and security lists. The above is all I see right
> now, but I might be missing something subtle (or not so subtle).
Here's the updated description. If you don't see any other glaring
problems, I'll update the implementation and send it out to the
appropriate lists.
int linkat(int olddirfd, const char *oldpath, int newdirfd,
const char *newpath, int flags);
AT_REPLACE
If newpath exists, replace it atomically. There is no
point at which another process attempting to access
newpath will find it missing. If newpath exists but the
operation fails, the original link specified by newpath
will remain in place. This has the same durability
guarantees for newpath as rename(2).
If newpath is replaced, any other hard links referring
to the file are unaffected. Open file descriptors for
newpath are also unaffected.
newpath must not be a directory.
If the entry specified by newpath refers to the file
specified by oldpath, linkat() does nothing and returns
a success status. Note that this comparison does not
follow mounts on newpath.
Otherwise, newpath must not be a mount point in the
local namespace. If it is a mount point in another
namespace and the operation succeeds, all mounts are
detached from newpath in all namespaces, as is the case
for rename(2), rmdir(2), and unlink(2).
If newpath refers to a symbolic link, the link will be
overwritten.
ERRORS
EBUSY AT_REPLACE was specified in flags, newpath is a mount
point in the local namespace, and the mount point does
not refer to the same file as oldpath. Or, AT_REPLACE
was specified in flags and newpath ends with a . or ..
component.
EISDIR AT_REPLACE was specified in flags and newpath is an
existing directory.
EOPNOTSUPP
AT_REPLACE was specified in flags and the filesystem
containing newpath does not support AT_REPLACE.
EPERM AT_REPLACE was specified and newpath refers to an
immutable or append-only file.
> As for the implementation... VFS side should be reasonably easy (OK,
> it'll bloat do_symlinkat() quite a bit, since we won't be able to use
> filename_create() for the target in there, but it's not _that_
> terrible).
Based on my previous attempt at it [1], it's not too bad.
> I'd probably prefer a separate method rather than
> overloading ->link(), with the old method called when the target
> is negative and new - when it's positive - less boilerplate and
> harder for an instance to get confused. They can easily use common
> helpers with ->link() instances, so the code duplication won't
> be an issue.
Agreed, thanks again!
1: https://lore.kernel.org/linux-fsdevel/eac9480f80c689504148b5d658ee4218cc1e421e.1524549513.git.osandov@fb.com/
next prev parent reply other threads:[~2020-01-17 23:54 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-01-17 12:49 [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination David Howells
2020-01-17 14:33 ` Trond Myklebust
2020-01-17 15:46 ` Al Viro
2020-01-17 16:12 ` Trond Myklebust
2020-01-17 16:48 ` Al Viro
2020-01-17 16:36 ` Omar Sandoval
2020-01-17 16:59 ` Al Viro
2020-01-17 17:28 ` Omar Sandoval
2020-01-17 18:17 ` Al Viro
2020-01-17 20:22 ` Omar Sandoval
2020-01-17 22:22 ` Al Viro
2020-01-17 23:54 ` Omar Sandoval [this message]
2020-01-18 0:47 ` Al Viro
2020-01-18 1:17 ` Omar Sandoval
2020-01-18 2:20 ` Al Viro
2020-01-21 23:05 ` Omar Sandoval
2020-01-22 6:57 ` Amir Goldstein
2020-01-22 22:10 ` Omar Sandoval
2020-01-23 3:47 ` Al Viro
2020-01-23 7:16 ` Dave Chinner
2020-01-23 7:47 ` Amir Goldstein
2020-01-24 21:25 ` Dave Chinner
2020-01-31 5:24 ` Darrick J. Wong
2020-01-31 5:29 ` hch
2020-01-31 7:00 ` Amir Goldstein
2020-01-31 20:33 ` Omar Sandoval
2020-01-31 21:55 ` Amir Goldstein
2020-01-28 1:27 ` Omar Sandoval
2020-01-28 14:35 ` David Howells
2020-01-31 5:31 ` hch
2020-01-31 8:04 ` David Howells
2020-01-31 8:56 ` Amir Goldstein
2020-01-22 9:53 ` David Howells
2020-01-17 14:47 ` David Howells
2020-01-17 14:56 ` Trond Myklebust
2020-01-17 16:01 ` Al Viro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200117235444.GC295250@vader \
--to=osandov@osandov.com \
--cc=amir73il@gmail.com \
--cc=dhowells@redhat.com \
--cc=hch@lst.de \
--cc=linux-fsdevel@vger.kernel.org \
--cc=lsf-pc@lists.linux-foundation.org \
--cc=miklos@szeredi.hu \
--cc=trondmy@hammerspace.com \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).