From: Dave Chinner <david@fromorbit.com>
To: Sage Weil <sage@newdream.net>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>,
Zach Brown <zab@redhat.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Linux FS-devel Mailing List <linux-fsdevel@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Linux API Mailing List <linux-api@vger.kernel.org>
Subject: Re: [PATCH RFC] vfs: add a O_NOMTIME flag
Date: Wed, 13 May 2015 10:57:46 +1000
Message-ID: <20150513005746.GJ15721@dastard>
In-Reply-To: <alpine.DEB.2.00.1505121454000.20395@cobra.newdream.net>

On Tue, May 12, 2015 at 04:12:46PM -0700, Sage Weil wrote:
> On Tue, 12 May 2015, Dave Chinner wrote:
> > > > > I'd rather not make this XFS specific as other local filesystems (ext4,
> > > > > f2fs, possibly btrfs) would similarly benefit. (And if we want to target
> > > > > XFS specifically the existing XFS open-by-handle ioctl is sufficient as it
> > > > > already does O_NOMTIME unconditionally.)
> > > >
> > > > Lack of a namespace doesn't imply that you don't want to manage the
> > > > data. The whole point of using object storage instead of plain old
> > > > block storage is to be able to provide whatever metadata you still
> > > > need in order to manage the object.
> > >
> > > Yeah, agreed--this is presumably why open_by_handle(2) (which is what we'd
> > > like to use) doesn't assume O_NOMTIME.
> >
> > Right - the XFS ioctls were designed specifically for applications
> > that interacted directly with the structure of XFS filesystems and
> > so needed invisible IO (e.g. online defragmenter). IOWs, they are
> > not interfaces intended for general usage. They are also only
> > available to root, so a typical user application won't be making use
> > of them, either.
>
> I understand that's what they're intended for, but I'm having a hard time
> parsing out the difference between what they *do* and what O_NOMTIME + -o
> allow_nomtime does. The open-by-handle ioctls have nothing to do with the
> online XFS format--they simply allow you to open a file via an opaque
> handle (albeit a differently formatted one than the generic
> open_by_handle_at(2)). They also force you into an O_NOMTIME-equivalent
> mode.

Actually, the handle is derived from the information on disk. We
don't do directory lookups to build handles in many cases; we do a
bulkstat to get *on-disk* inode information (inode number, generation,
timestamps, etc) and then use that to build a handle in userspace
*and* validate that the file has not changed since the information was
retrieved and the handle was built.

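To make the bulkstat-then-handle flow concrete, here is a minimal Python
sketch. It is a toy model only, not the XFS ioctl or libhandle API: the
names InodeRecord, Handle, build_handle and validate_handle are all
hypothetical, and the "inode table" is just a dict standing in for the
on-disk state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InodeRecord:
    """Hypothetical stand-in for one bulkstat record (on-disk inode info)."""
    ino: int       # inode number
    gen: int       # generation, bumped when the inode number is reused
    mtime_ns: int  # timestamp captured at bulkstat time


@dataclass(frozen=True)
class Handle:
    """Hypothetical opaque handle: built without any directory lookup."""
    fsid: int
    ino: int
    gen: int


def build_handle(fsid: int, rec: InodeRecord) -> Handle:
    # Userspace builds the handle purely from the bulkstat record.
    return Handle(fsid=fsid, ino=rec.ino, gen=rec.gen)


def validate_handle(inode_table: dict, handle: Handle,
                    seen: InodeRecord) -> InodeRecord:
    """Return the current record, or raise if the handle is stale or the
    file changed since the record in `seen` was retrieved."""
    cur = inode_table.get(handle.ino)
    if cur is None or cur.gen != handle.gen:
        raise LookupError("stale handle: inode freed or reused")
    if cur.mtime_ns != seen.mtime_ns:
        raise RuntimeError("file changed since bulkstat")
    return cur
```

The generation check is what catches a reused inode number; the
timestamp comparison is one assumed way to detect that the file changed
in between.
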
> AFAICS the only difference that I see is that
>
> 1) the ioctl is XFS specific. (As open_by_handle_at(2) demonstrates, this
> needn't be the case.)

Of course - it's been in use for 15 years longer than the generic
interface. :)

> 2) the NOMTIME mode is only available via the open-by-handle interface,
> not open(2).

Right, because the XFS handle interfaces are intended for
invisible IO, which is required by applications interacting directly
with the XFS on-disk data layout.

> 3) it is an ioctl interface, and thus more obscure. (Well, there is a
> libhandle library, but it doesn't seem to be widely used.)

The library only exists for xfsdump and the HSMs that interact
directly with the XFS on-disk data. These are very constrained
applications.

> Would you object less if
>
> 1) the O_NOMTIME flag were only available via open_by_handle_at(2)?

Which limits it to files that have already been created and written to
disk, otherwise there is no handle....

> 2) an equivalent ioctl were implemented for each file system of interest
> that (say) called into open_by_handle_at(2) code, adding in the O_NOMTIME
> flag?

Seems like a silly hoop to jump through. I was thinking of a
root-only fcntl() style flag that could be set, but....

> 3) O_NOMTIME required root (vs a mount option that requires root and
> unprivileged O_NOMTIME)?
>
> Just trying to tease apart which part is problematic...

... its very existence as either an open or fcntl flag is still
problematic. :/

The concept of it being an on-disk attribute flag is less prone to
silent abuse - it's easily discoverable and is persistent. And it's
manageable if we make it an "inherit from parent" style flag, because
then ceph can simply set it on the root dir, and every file it then
creates will not do mtime updates.

The other thing that is worth noting here is that we also have a
NODUMP flag on disk (chattr +d). Hence we could define that the
nomtime attribute also implies/sets the nodump attribute, and hence
makes it clear and upfront that turning on the nomtime inode
attribute will mean the files with this set will not get backed up
by mtime sensitive backup programs....

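A rough Python model of those semantics, to show how the pieces would
fit together. Everything here is hypothetical: no nomtime inode
attribute exists, and Attr, Inode, set_nomtime, create, write and
should_dump are illustration-only names, not any filesystem's API.

```python
from dataclasses import dataclass, field
from enum import Flag


class Attr(Flag):
    NODUMP = 1   # models the existing on-disk flag (chattr +d)
    NOMTIME = 2  # hypothetical attribute proposed in this thread


# Attributes a new file copies from its parent directory (assumed policy).
INHERITED = Attr.NODUMP | Attr.NOMTIME


@dataclass
class Inode:
    attrs: Attr = Attr(0)
    mtime: int = 0
    children: dict = field(default_factory=dict)


def set_nomtime(inode: Inode) -> None:
    # Setting nomtime also sets nodump, so the effect on backups is upfront.
    inode.attrs |= Attr.NOMTIME | Attr.NODUMP


def create(parent: Inode, name: str) -> Inode:
    # "Inherit from parent": the child starts with the parent's flags.
    child = Inode(attrs=parent.attrs & INHERITED)
    parent.children[name] = child
    return child


def write(inode: Inode, now: int) -> None:
    # A data write only updates mtime when nomtime is not set.
    if Attr.NOMTIME not in inode.attrs:
        inode.mtime = now


def should_dump(inode: Inode) -> bool:
    # A dump-style backup tool skips anything flagged nodump.
    return Attr.NODUMP not in inode.attrs
```

In this model, an application would call set_nomtime() once on its root
directory; every file it then creates skips mtime updates and is also
visibly excluded from dumps.
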
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com