* beginners project: RENAME_WHITEOUT
@ 2014-11-07 19:09 Christoph Hellwig
2014-11-07 22:59 ` Eric Sandeen
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Christoph Hellwig @ 2014-11-07 19:09 UTC (permalink / raw)
To: xfs; +Cc: Miklos Szeredi
The overlayfs merge introduces a new rename flag to create to whiteouts.
Should be a fairly easy to implement.
Miklos, do you have any good documentation and/or test cases for this?
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT
2014-11-07 19:09 beginners project: RENAME_WHITEOUT Christoph Hellwig
@ 2014-11-07 22:59 ` Eric Sandeen
2014-11-08 6:40 ` Christoph Hellwig
2014-11-08 23:42 ` Dave Chinner
2014-11-10 9:14 ` Miklos Szeredi
2 siblings, 1 reply; 11+ messages in thread
From: Eric Sandeen @ 2014-11-07 22:59 UTC (permalink / raw)
To: Christoph Hellwig, xfs; +Cc: Carlos Maiolino, Miklos Szeredi
On 11/7/14 1:09 PM, Christoph Hellwig wrote:
> The overlayfs merge introduces a new rename flag to create to whiteouts.
> Should be a fairly easy to implement.
>
> Miklos, do you have any good documentation and/or test cases for this?
FWIW, Carlos just expressed some interest today in doing this, and it
builds on his earlier patches, yes?
-Eric
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT
2014-11-07 22:59 ` Eric Sandeen
@ 2014-11-08 6:40 ` Christoph Hellwig
2014-11-09 0:08 ` cem
0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2014-11-08 6:40 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Miklos Szeredi, Carlos Maiolino, xfs
On Fri, Nov 07, 2014 at 04:59:49PM -0600, Eric Sandeen wrote:
> On 11/7/14 1:09 PM, Christoph Hellwig wrote:
> > The overlayfs merge introduces a new rename flag to create to whiteouts.
> > Should be a fairly easy to implement.
> >
> > Miklos, do you have any good documentation and/or test cases for this?
>
> FWIW, Carlos just expressed some interest today in doing this, and it
> builds on his earlier patches, yes?
It's in the same area, but not otherwise related. But I'd be happy if
he picks it up.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT
2014-11-07 19:09 beginners project: RENAME_WHITEOUT Christoph Hellwig
2014-11-07 22:59 ` Eric Sandeen
@ 2014-11-08 23:42 ` Dave Chinner
2014-11-10 9:25 ` Miklos Szeredi
2014-11-10 9:14 ` Miklos Szeredi
2 siblings, 1 reply; 11+ messages in thread
From: Dave Chinner @ 2014-11-08 23:42 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Miklos Szeredi, xfs
On Fri, Nov 07, 2014 at 11:09:59AM -0800, Christoph Hellwig wrote:
> The overlayfs merge introduces a new rename flag to create to whiteouts.
> Should be a fairly easy to implement.
>
> Miklos, do you have any good documentation and/or test cases for this?
So overlayfs uses some weird char dev hack to implement whiteout
inodes in directories? Why do we need a whiteout inode on disk?
what information is actually stored in the whiteout inode that
overlayfs actually needs? Only readdir and lookup care about
whiteouts, and AFAICT nothing of the inode is ever used except
checking the chrdev/whiteoutdev hack via ovl_is_whiteout(dentry).
Indeed, whatever happened to just storing the whiteout in the dirent
via DT_WHT and using that information on lookup in the lower
filesystem to mark the dentry returned appropriately without needing
to lookup a real inode?
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT
2014-11-08 6:40 ` Christoph Hellwig
@ 2014-11-09 0:08 ` cem
0 siblings, 0 replies; 11+ messages in thread
From: cem @ 2014-11-09 0:08 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Miklos Szeredi, Eric Sandeen, xfs
Quoting Christoph Hellwig <hch@infradead.org>:
> On Fri, Nov 07, 2014 at 04:59:49PM -0600, Eric Sandeen wrote:
>> On 11/7/14 1:09 PM, Christoph Hellwig wrote:
>> > The overlayfs merge introduces a new rename flag to create to whiteouts.
>> > Should be a fairly easy to implement.
>> >
>> > Miklos, do you have any good documentation and/or test cases for this?
>>
>> FWIW, Carlos just expressed some interest today in doing this, and it
>> builds on his earlier patches, yes?
>
> It's in the same area, but not otherwise related. But I'd be happy if
> he picks it up.
consider it picked up..
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT
2014-11-07 19:09 beginners project: RENAME_WHITEOUT Christoph Hellwig
2014-11-07 22:59 ` Eric Sandeen
2014-11-08 23:42 ` Dave Chinner
@ 2014-11-10 9:14 ` Miklos Szeredi
2 siblings, 0 replies; 11+ messages in thread
From: Miklos Szeredi @ 2014-11-10 9:14 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
On Fri, Nov 7, 2014 at 8:09 PM, Christoph Hellwig <hch@infradead.org> wrote:
> The overlayfs merge introduces a new rename flag to create to whiteouts.
> Should be a fairly easy to implement.
>
> Miklos, do you have any good documentation and/or test cases for this?
No test case for the RENAME_WHITEOUT flag yet. I'll work on that.
We do have a man page and an xfstest test case for RENAME_EXCHANGE,
which is also important for overlayfs.
Thanks,
Miklos
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT
2014-11-08 23:42 ` Dave Chinner
@ 2014-11-10 9:25 ` Miklos Szeredi
2014-11-10 13:52 ` Dave Chinner
0 siblings, 1 reply; 11+ messages in thread
From: Miklos Szeredi @ 2014-11-10 9:25 UTC (permalink / raw)
To: Dave Chinner; +Cc: Christoph Hellwig, xfs
On Sun, Nov 9, 2014 at 12:42 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Fri, Nov 07, 2014 at 11:09:59AM -0800, Christoph Hellwig wrote:
>> The overlayfs merge introduces a new rename flag to create to whiteouts.
>> Should be a fairly easy to implement.
>>
>> Miklos, do you have any good documentation and/or test cases for this?
>
> So overlayfs uses some weird char dev hack to implement whiteout
> inodes in directories? Why do we need a whiteout inode on disk?
> what information is actually stored in the whiteout inode that
> overlayfs actually needs? Only readdir and lookup care about
> whiteouts, and AFAICT nothing of the inode is ever used except
> checking the chrdev/whiteoutdev hack via ovl_is_whiteout(dentry).
>
> Indeed, whatever happened to just storing the whiteout in the dirent
> via DT_WHT and using that information on lookup in the lower
> filesystem to mark the dentry returned appropriately without needing
> to lookup a real inode?
The filesystem is free to implement whiteouts a dirent without an actual inode.
But we do need at least an inode in the VFS, since the whiteout needs
to be presented to userspace when not part of the overlay. The DT_WHT
makes the typical mistake of trying to make the implementation nice,
while not caring about user interfaces.
This is usually a big mistake, user interfaces are much more important
than implementation details, and an already existing file type on
which all the usual operations work (stat, unlink) is much better in
this respect than a completely new object which is unknown and
unmanageable for the vast majority of applications.
The special chardev was Linus' idea, but I agree with him completely
on this point. Introducing DT_WHT on the userspace API would be much
more of a hack than reusing existing objects and operations.
Thanks,
Miklos
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT
2014-11-10 9:25 ` Miklos Szeredi
@ 2014-11-10 13:52 ` Dave Chinner
2014-11-10 16:55 ` Miklos Szeredi
2015-01-09 13:30 ` Carlos Maiolino
0 siblings, 2 replies; 11+ messages in thread
From: Dave Chinner @ 2014-11-10 13:52 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Christoph Hellwig, xfs
On Mon, Nov 10, 2014 at 10:25:40AM +0100, Miklos Szeredi wrote:
> On Sun, Nov 9, 2014 at 12:42 AM, Dave Chinner <david@fromorbit.com> wrote:
> > On Fri, Nov 07, 2014 at 11:09:59AM -0800, Christoph Hellwig wrote:
> >> The overlayfs merge introduces a new rename flag to create to whiteouts.
> >> Should be a fairly easy to implement.
> >>
> >> Miklos, do you have any good documentation and/or test cases for this?
> >
> > So overlayfs uses some weird char dev hack to implement whiteout
> > inodes in directories? Why do we need a whiteout inode on disk?
> > what information is actually stored in the whiteout inode that
> > overlayfs actually needs? Only readdir and lookup care about
> > whiteouts, and AFAICT nothing of the inode is ever used except
> > checking the chrdev/whiteoutdev hack via ovl_is_whiteout(dentry).
> >
> > Indeed, whatever happened to just storing the whiteout in the dirent
> > via DT_WHT and using that information on lookup in the lower
> > filesystem to mark the dentry returned appropriately without needing
> > to lookup a real inode?
>
> The filesystem is free to implement whiteouts a dirent without an actual inode.
Sure, but overlayfs won't make use of it, so we'd have
have to hack around overlayfs's ignorance of DT_WHT in several
different places to do this. e.g.
- in mknod to intercept creation of magical whiteout chardevs
- in readdir so we can convert them to DT_CHR so overlayfs
can detect them,
- in ->lookup so we can create magical chardev inodes in
memory rather than try to read them from disk.
- in rename we have to detect the magical chardev inodes so
we know it's a whiteout we are dealing with
This is difficult because overlayfs hard codes the definition of a
whiteout into the VFS interface implementation as well as it's
internal directory implementation. This leaves almost no room for
anyone to optimise the back end implementation because the
translation layers are complex and fiddly....
> But we do need at least an inode in the VFS, since the whiteout needs
> to be presented to userspace when not part of the overlay.
Sure, but that's a different problem.
> The DT_WHT
> makes the typical mistake of trying to make the implementation nice,
> while not caring about user interfaces.
You're implying the d_type field in a dirent is something that it is
not. d_type has only one purpose in life - to allowing userspace
applications to avoid a stat() call to find out the type of the
object the dirent points to.
> This is usually a big mistake, user interfaces are much more important
> than implementation details, and an already existing file type on
> which all the usual operations work (stat, unlink) is much better in
> this respect than a completely new object which is unknown and
> unmanageable for the vast majority of applications.
Sure, but again that's not the issue I'm commenting on. The dirent
type has no effect on stat, unlink, etc that are done on the dirent
after it is returned to userspace.
So why is overloading DT_CHR to mean 'either a char device or a
whiteout entry' a sane user interface design decision? d_type *was*
a simple, obvious, effective and efficient user interface that
allowed users to avoid extra syscalls. It's been used this way by
userspace for, what, 15 years?
With the overlayfs "magic" we now have the situation where d_type is
not sufficient to avoid a stat() call to determine the type the
dirent points to. IOWs, we've just fucked up a perfectly good
interface that is widely used because somebody thought that using
the DT_WHT value in the d_type field for whiteout dirents is a "bad
interface".
> The special chardev was Linus' idea, but I agree with him completely
> on this point. Introducing DT_WHT on the userspace API would be much
> more of a hack than reusing existing objects and operations.
Magical char dev for access, unlink, etc: no problems there.
DT_CHR for the whiteout dirent type: completely fucked up.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT
2014-11-10 13:52 ` Dave Chinner
@ 2014-11-10 16:55 ` Miklos Szeredi
2015-01-09 13:30 ` Carlos Maiolino
1 sibling, 0 replies; 11+ messages in thread
From: Miklos Szeredi @ 2014-11-10 16:55 UTC (permalink / raw)
To: Dave Chinner; +Cc: Christoph Hellwig, xfs
On Mon, Nov 10, 2014 at 2:52 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Mon, Nov 10, 2014 at 10:25:40AM +0100, Miklos Szeredi wrote:
>> On Sun, Nov 9, 2014 at 12:42 AM, Dave Chinner <david@fromorbit.com> wrote:
>> > On Fri, Nov 07, 2014 at 11:09:59AM -0800, Christoph Hellwig wrote:
>> >> The overlayfs merge introduces a new rename flag to create to whiteouts.
>> >> Should be a fairly easy to implement.
>> >>
>> >> Miklos, do you have any good documentation and/or test cases for this?
>> >
>> > So overlayfs uses some weird char dev hack to implement whiteout
>> > inodes in directories? Why do we need a whiteout inode on disk?
>> > what information is actually stored in the whiteout inode that
>> > overlayfs actually needs? Only readdir and lookup care about
>> > whiteouts, and AFAICT nothing of the inode is ever used except
>> > checking the chrdev/whiteoutdev hack via ovl_is_whiteout(dentry).
>> >
>> > Indeed, whatever happened to just storing the whiteout in the dirent
>> > via DT_WHT and using that information on lookup in the lower
>> > filesystem to mark the dentry returned appropriately without needing
>> > to lookup a real inode?
>>
>> The filesystem is free to implement whiteouts a dirent without an actual inode.
>
> Sure, but overlayfs won't make use of it, so we'd have
> have to hack around overlayfs's ignorance of DT_WHT in several
> different places to do this. e.g.
>
> - in mknod to intercept creation of magical whiteout chardevs
> - in readdir so we can convert them to DT_CHR so overlayfs
> can detect them,
> - in ->lookup so we can create magical chardev inodes in
> memory rather than try to read them from disk.
> - in rename we have to detect the magical chardev inodes so
> we know it's a whiteout we are dealing with
I care little where it's intercepted, in the filesystem or in the VFS.
Obviously if this is going to be done by more than one filesystem it
makes sense to gather the common bits and pieces into VFS helpers.
And overlayfs could easily make use of DT_WHT, if it was available, I
really have no problem with that.
As long as we don't have the insane DT_WHT == negative dentry on the
userspace API, it's good (i.e. you *should* be able to remove it).
And if "tar" is able to archive the thing and restore it, that's just
an added bonus. The special chardev does all that. So in my book
it's a perfect solution.
And while I don't care if DT_WHT ends up not using an inode *on disk*
or not, a negative entry on disk with a positive dentry in the VFS is
bound to end up with complex and fiddly conversion layers, that may
not actually be worth it.
- "DT_WHT + S_IFCHR inode" OK, needs some conversion but not too difficult
- "DT_WHT + negative indode" OK, but fiddly
- "DT_CHR + S_IFCHR inode" OK, but sligthly suboptimal (or "completely
fucked up" depending on your POV).
The big advantage of not introducing DT_WHT is that it spares the pain
of introducing possibly incompatible on-disk format (i.e. will old
versions of fsck going to balk at it? will old kernels reject the
image containing DT_WHT?)
If the filesystem and tools are already prepared for the DT_WHT dirent
type then that's a non-issue, obviously.
Thanks,
Miklos
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT
2014-11-10 13:52 ` Dave Chinner
2014-11-10 16:55 ` Miklos Szeredi
@ 2015-01-09 13:30 ` Carlos Maiolino
2015-01-22 1:05 ` Dave Chinner
1 sibling, 1 reply; 11+ messages in thread
From: Carlos Maiolino @ 2015-01-09 13:30 UTC (permalink / raw)
To: xfs
So, giving this conversation, should we implement WHITEOUTS in XFS
already, or is this isn't decided yet?
Cheers.
On Tue, Nov 11, 2014 at 12:52:49AM +1100, Dave Chinner wrote:
> On Mon, Nov 10, 2014 at 10:25:40AM +0100, Miklos Szeredi wrote:
> > On Sun, Nov 9, 2014 at 12:42 AM, Dave Chinner <david@fromorbit.com> wrote:
> > > On Fri, Nov 07, 2014 at 11:09:59AM -0800, Christoph Hellwig wrote:
> > >> The overlayfs merge introduces a new rename flag to create to whiteouts.
> > >> Should be a fairly easy to implement.
> > >>
> > >> Miklos, do you have any good documentation and/or test cases for this?
> > >
> > > So overlayfs uses some weird char dev hack to implement whiteout
> > > inodes in directories? Why do we need a whiteout inode on disk?
> > > what information is actually stored in the whiteout inode that
> > > overlayfs actually needs? Only readdir and lookup care about
> > > whiteouts, and AFAICT nothing of the inode is ever used except
> > > checking the chrdev/whiteoutdev hack via ovl_is_whiteout(dentry).
> > >
> > > Indeed, whatever happened to just storing the whiteout in the dirent
> > > via DT_WHT and using that information on lookup in the lower
> > > filesystem to mark the dentry returned appropriately without needing
> > > to lookup a real inode?
> >
> > The filesystem is free to implement whiteouts a dirent without an actual inode.
>
> Sure, but overlayfs won't make use of it, so we'd have
> have to hack around overlayfs's ignorance of DT_WHT in several
> different places to do this. e.g.
>
> - in mknod to intercept creation of magical whiteout chardevs
> - in readdir so we can convert them to DT_CHR so overlayfs
> can detect them,
> - in ->lookup so we can create magical chardev inodes in
> memory rather than try to read them from disk.
> - in rename we have to detect the magical chardev inodes so
> we know it's a whiteout we are dealing with
>
> This is difficult because overlayfs hard codes the definition of a
> whiteout into the VFS interface implementation as well as it's
> internal directory implementation. This leaves almost no room for
> anyone to optimise the back end implementation because the
> translation layers are complex and fiddly....
>
> > But we do need at least an inode in the VFS, since the whiteout needs
> > to be presented to userspace when not part of the overlay.
>
> Sure, but that's a different problem.
>
> > The DT_WHT
> > makes the typical mistake of trying to make the implementation nice,
> > while not caring about user interfaces.
>
> You're implying the d_type field in a dirent is something that it is
> not. d_type has only one purpose in life - to allowing userspace
> applications to avoid a stat() call to find out the type of the
> object the dirent points to.
>
> > This is usually a big mistake, user interfaces are much more important
> > than implementation details, and an already existing file type on
> > which all the usual operations work (stat, unlink) is much better in
> > this respect than a completely new object which is unknown and
> > unmanageable for the vast majority of applications.
>
> Sure, but again that's not the issue I'm commenting on. The dirent
> type has no effect on stat, unlink, etc that are done on the dirent
> after it is returned to userspace.
>
> So why is overloading DT_CHR to mean 'either a char device or a
> whiteout entry' a sane user interface design decision? d_type *was*
> a simple, obvious, effective and efficient user interface that
> allowed users to avoid extra syscalls. It's been used this way by
> userspace for, what, 15 years?
>
> With the overlayfs "magic" we now have the situation where d_type is
> not sufficient to avoid a stat() call to determine the type the
> dirent points to. IOWs, we've just fucked up a perfectly good
> interface that is widely used because somebody thought that using
> the DT_WHT value in the d_type field for whiteout dirents is a "bad
> interface".
>
> > The special chardev was Linus' idea, but I agree with him completely
> > on this point. Introducing DT_WHT on the userspace API would be much
> > more of a hack than reusing existing objects and operations.
>
> Magical char dev for access, unlink, etc: no problems there.
>
> DT_CHR for the whiteout dirent type: completely fucked up.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
--
Carlos
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT
2015-01-09 13:30 ` Carlos Maiolino
@ 2015-01-22 1:05 ` Dave Chinner
0 siblings, 0 replies; 11+ messages in thread
From: Dave Chinner @ 2015-01-22 1:05 UTC (permalink / raw)
To: xfs
On Fri, Jan 09, 2015 at 08:30:19AM -0500, Carlos Maiolino wrote:
> So, giving this conversation, should we implement WHITEOUTS in XFS
> already, or is this isn't decided yet?
Sure, we have to implement whiteouts to support overlayfs.
Though internally we'll do it with DT_WHT in dirents, so we don't
need inodes on disk for them. lookups will just have to allocate in
memory chardev inodes so the rest of the world will function
appropriately if they access whiteouts from the lower filesystems.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2015-01-22 1:05 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-11-07 19:09 beginners project: RENAME_WHITEOUT Christoph Hellwig
2014-11-07 22:59 ` Eric Sandeen
2014-11-08 6:40 ` Christoph Hellwig
2014-11-09 0:08 ` cem
2014-11-08 23:42 ` Dave Chinner
2014-11-10 9:25 ` Miklos Szeredi
2014-11-10 13:52 ` Dave Chinner
2014-11-10 16:55 ` Miklos Szeredi
2015-01-09 13:30 ` Carlos Maiolino
2015-01-22 1:05 ` Dave Chinner
2014-11-10 9:14 ` Miklos Szeredi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox