* beginners project: RENAME_WHITEOUT
@ 2014-11-07 19:09 Christoph Hellwig
2014-11-07 22:59 ` Eric Sandeen
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Christoph Hellwig @ 2014-11-07 19:09 UTC (permalink / raw)
To: xfs; +Cc: Miklos Szeredi
The overlayfs merge introduces a new rename flag to create to whiteouts.
Should be a fairly easy to implement.
Miklos, do you have any good documentation and/or test cases for this?
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: beginners project: RENAME_WHITEOUT 2014-11-07 19:09 beginners project: RENAME_WHITEOUT Christoph Hellwig @ 2014-11-07 22:59 ` Eric Sandeen 2014-11-08 6:40 ` Christoph Hellwig 2014-11-08 23:42 ` Dave Chinner 2014-11-10 9:14 ` Miklos Szeredi 2 siblings, 1 reply; 11+ messages in thread From: Eric Sandeen @ 2014-11-07 22:59 UTC (permalink / raw) To: Christoph Hellwig, xfs; +Cc: Carlos Maiolino, Miklos Szeredi On 11/7/14 1:09 PM, Christoph Hellwig wrote: > The overlayfs merge introduces a new rename flag to create to whiteouts. > Should be a fairly easy to implement. > > Miklos, do you have any good documentation and/or test cases for this? FWIW, Carlos just expressed some interest today in doing this, and it builds on his earlier patches, yes? -Eric _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT 2014-11-07 22:59 ` Eric Sandeen @ 2014-11-08 6:40 ` Christoph Hellwig 2014-11-09 0:08 ` cem 0 siblings, 1 reply; 11+ messages in thread From: Christoph Hellwig @ 2014-11-08 6:40 UTC (permalink / raw) To: Eric Sandeen; +Cc: Miklos Szeredi, Carlos Maiolino, xfs On Fri, Nov 07, 2014 at 04:59:49PM -0600, Eric Sandeen wrote: > On 11/7/14 1:09 PM, Christoph Hellwig wrote: > > The overlayfs merge introduces a new rename flag to create to whiteouts. > > Should be a fairly easy to implement. > > > > Miklos, do you have any good documentation and/or test cases for this? > > FWIW, Carlos just expressed some interest today in doing this, and it > builds on his earlier patches, yes? It's in the same area, but not otherwise related. But I'd be happy if he picks it up. _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT 2014-11-08 6:40 ` Christoph Hellwig @ 2014-11-09 0:08 ` cem 0 siblings, 0 replies; 11+ messages in thread From: cem @ 2014-11-09 0:08 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Miklos Szeredi, Eric Sandeen, xfs Quoting Christoph Hellwig <hch@infradead.org>: > On Fri, Nov 07, 2014 at 04:59:49PM -0600, Eric Sandeen wrote: >> On 11/7/14 1:09 PM, Christoph Hellwig wrote: >> > The overlayfs merge introduces a new rename flag to create to whiteouts. >> > Should be a fairly easy to implement. >> > >> > Miklos, do you have any good documentation and/or test cases for this? >> >> FWIW, Carlos just expressed some interest today in doing this, and it >> builds on his earlier patches, yes? > > It's in the same area, but not otherwise related. But I'd be happy if > he picks it up. consider it picked up.. _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT 2014-11-07 19:09 beginners project: RENAME_WHITEOUT Christoph Hellwig 2014-11-07 22:59 ` Eric Sandeen @ 2014-11-08 23:42 ` Dave Chinner 2014-11-10 9:25 ` Miklos Szeredi 2014-11-10 9:14 ` Miklos Szeredi 2 siblings, 1 reply; 11+ messages in thread From: Dave Chinner @ 2014-11-08 23:42 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Miklos Szeredi, xfs On Fri, Nov 07, 2014 at 11:09:59AM -0800, Christoph Hellwig wrote: > The overlayfs merge introduces a new rename flag to create to whiteouts. > Should be a fairly easy to implement. > > Miklos, do you have any good documentation and/or test cases for this? So overlayfs uses some weird char dev hack to implement whiteout inodes in directories? Why do we need a whiteout inode on disk? what information is actually stored in the whiteout inode that overlayfs actually needs? Only readdir and lookup care about whiteouts, and AFAICT nothing of the inode is ever used except checking the chrdev/whiteoutdev hack via ovl_is_whiteout(dentry). Indeed, whatever happened to just storing the whiteout in the dirent via DT_WHT and using that information on lookup in the lower filesystem to mark the dentry returned appropriately without needing to lookup a real inode? Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT 2014-11-08 23:42 ` Dave Chinner @ 2014-11-10 9:25 ` Miklos Szeredi 2014-11-10 13:52 ` Dave Chinner 0 siblings, 1 reply; 11+ messages in thread From: Miklos Szeredi @ 2014-11-10 9:25 UTC (permalink / raw) To: Dave Chinner; +Cc: Christoph Hellwig, xfs On Sun, Nov 9, 2014 at 12:42 AM, Dave Chinner <david@fromorbit.com> wrote: > On Fri, Nov 07, 2014 at 11:09:59AM -0800, Christoph Hellwig wrote: >> The overlayfs merge introduces a new rename flag to create to whiteouts. >> Should be a fairly easy to implement. >> >> Miklos, do you have any good documentation and/or test cases for this? > > So overlayfs uses some weird char dev hack to implement whiteout > inodes in directories? Why do we need a whiteout inode on disk? > what information is actually stored in the whiteout inode that > overlayfs actually needs? Only readdir and lookup care about > whiteouts, and AFAICT nothing of the inode is ever used except > checking the chrdev/whiteoutdev hack via ovl_is_whiteout(dentry). > > Indeed, whatever happened to just storing the whiteout in the dirent > via DT_WHT and using that information on lookup in the lower > filesystem to mark the dentry returned appropriately without needing > to lookup a real inode? The filesystem is free to implement whiteouts a dirent without an actual inode. But we do need at least an inode in the VFS, since the whiteout needs to be presented to userspace when not part of the overlay. The DT_WHT makes the typical mistake of trying to make the implementation nice, while not caring about user interfaces. This is usually a big mistake, user interfaces are much more important than implementation details, and an already existing file type on which all the usual operations work (stat, unlink) is much better in this respect than a completely new object which is unknown and unmanageable for the vast majority of applications. The special chardev was Linus' idea, but I agree with him completely on this point. Introducing DT_WHT on the userspace API would be much more of a hack than reusing existing objects and operations. Thanks, Miklos _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT 2014-11-10 9:25 ` Miklos Szeredi @ 2014-11-10 13:52 ` Dave Chinner 2014-11-10 16:55 ` Miklos Szeredi 2015-01-09 13:30 ` Carlos Maiolino 0 siblings, 2 replies; 11+ messages in thread From: Dave Chinner @ 2014-11-10 13:52 UTC (permalink / raw) To: Miklos Szeredi; +Cc: Christoph Hellwig, xfs On Mon, Nov 10, 2014 at 10:25:40AM +0100, Miklos Szeredi wrote: > On Sun, Nov 9, 2014 at 12:42 AM, Dave Chinner <david@fromorbit.com> wrote: > > On Fri, Nov 07, 2014 at 11:09:59AM -0800, Christoph Hellwig wrote: > >> The overlayfs merge introduces a new rename flag to create to whiteouts. > >> Should be a fairly easy to implement. > >> > >> Miklos, do you have any good documentation and/or test cases for this? > > > > So overlayfs uses some weird char dev hack to implement whiteout > > inodes in directories? Why do we need a whiteout inode on disk? > > what information is actually stored in the whiteout inode that > > overlayfs actually needs? Only readdir and lookup care about > > whiteouts, and AFAICT nothing of the inode is ever used except > > checking the chrdev/whiteoutdev hack via ovl_is_whiteout(dentry). > > > > Indeed, whatever happened to just storing the whiteout in the dirent > > via DT_WHT and using that information on lookup in the lower > > filesystem to mark the dentry returned appropriately without needing > > to lookup a real inode? > > The filesystem is free to implement whiteouts a dirent without an actual inode. Sure, but overlayfs won't make use of it, so we'd have have to hack around overlayfs's ignorance of DT_WHT in several different places to do this. e.g. - in mknod to intercept creation of magical whiteout chardevs - in readdir so we can convert them to DT_CHR so overlayfs can detect them, - in ->lookup so we can create magical chardev inodes in memory rather than try to read them from disk. - in rename we have to detect the magical chardev inodes so we know it's a whiteout we are dealing with This is difficult because overlayfs hard codes the definition of a whiteout into the VFS interface implementation as well as it's internal directory implementation. This leaves almost no room for anyone to optimise the back end implementation because the translation layers are complex and fiddly.... > But we do need at least an inode in the VFS, since the whiteout needs > to be presented to userspace when not part of the overlay. Sure, but that's a different problem. > The DT_WHT > makes the typical mistake of trying to make the implementation nice, > while not caring about user interfaces. You're implying the d_type field in a dirent is something that it is not. d_type has only one purpose in life - to allowing userspace applications to avoid a stat() call to find out the type of the object the dirent points to. > This is usually a big mistake, user interfaces are much more important > than implementation details, and an already existing file type on > which all the usual operations work (stat, unlink) is much better in > this respect than a completely new object which is unknown and > unmanageable for the vast majority of applications. Sure, but again that's not the issue I'm commenting on. The dirent type has no effect on stat, unlink, etc that are done on the dirent after it is returned to userspace. So why is overloading DT_CHR to mean 'either a char device or a whiteout entry' a sane user interface design decision? d_type *was* a simple, obvious, effective and efficient user interface that allowed users to avoid extra syscalls. It's been used this way by userspace for, what, 15 years? With the overlayfs "magic" we now have the situation where d_type is not sufficient to avoid a stat() call to determine the type the dirent points to. IOWs, we've just fucked up a perfectly good interface that is widely used because somebody thought that using the DT_WHT value in the d_type field for whiteout dirents is a "bad interface". > The special chardev was Linus' idea, but I agree with him completely > on this point. Introducing DT_WHT on the userspace API would be much > more of a hack than reusing existing objects and operations. Magical char dev for access, unlink, etc: no problems there. DT_CHR for the whiteout dirent type: completely fucked up. Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT 2014-11-10 13:52 ` Dave Chinner @ 2014-11-10 16:55 ` Miklos Szeredi 2015-01-09 13:30 ` Carlos Maiolino 1 sibling, 0 replies; 11+ messages in thread From: Miklos Szeredi @ 2014-11-10 16:55 UTC (permalink / raw) To: Dave Chinner; +Cc: Christoph Hellwig, xfs On Mon, Nov 10, 2014 at 2:52 PM, Dave Chinner <david@fromorbit.com> wrote: > On Mon, Nov 10, 2014 at 10:25:40AM +0100, Miklos Szeredi wrote: >> On Sun, Nov 9, 2014 at 12:42 AM, Dave Chinner <david@fromorbit.com> wrote: >> > On Fri, Nov 07, 2014 at 11:09:59AM -0800, Christoph Hellwig wrote: >> >> The overlayfs merge introduces a new rename flag to create to whiteouts. >> >> Should be a fairly easy to implement. >> >> >> >> Miklos, do you have any good documentation and/or test cases for this? >> > >> > So overlayfs uses some weird char dev hack to implement whiteout >> > inodes in directories? Why do we need a whiteout inode on disk? >> > what information is actually stored in the whiteout inode that >> > overlayfs actually needs? Only readdir and lookup care about >> > whiteouts, and AFAICT nothing of the inode is ever used except >> > checking the chrdev/whiteoutdev hack via ovl_is_whiteout(dentry). >> > >> > Indeed, whatever happened to just storing the whiteout in the dirent >> > via DT_WHT and using that information on lookup in the lower >> > filesystem to mark the dentry returned appropriately without needing >> > to lookup a real inode? >> >> The filesystem is free to implement whiteouts a dirent without an actual inode. > > Sure, but overlayfs won't make use of it, so we'd have > have to hack around overlayfs's ignorance of DT_WHT in several > different places to do this. e.g. > > - in mknod to intercept creation of magical whiteout chardevs > - in readdir so we can convert them to DT_CHR so overlayfs > can detect them, > - in ->lookup so we can create magical chardev inodes in > memory rather than try to read them from disk. > - in rename we have to detect the magical chardev inodes so > we know it's a whiteout we are dealing with I care little where it's intercepted, in the filesystem or in the VFS. Obviously if this is going to be done by more than one filesystem it makes sense to gather the common bits and pieces into VFS helpers. And overlayfs could easily make use of DT_WHT, if it was available, I really have no problem with that. As long as we don't have the insane DT_WHT == negative dentry on the userspace API, it's good (i.e. you *should* be able to remove it). And if "tar" is able to archive the thing and restore it, that's just an added bonus. The special chardev does all that. So in my book it's a perfect solution. And while I don't care if DT_WHT ends up not using an inode *on disk* or not, a negative entry on disk with a positive dentry in the VFS is bound to end up with complex and fiddly conversion layers, that may not actually be worth it. - "DT_WHT + S_IFCHR inode" OK, needs some conversion but not too difficult - "DT_WHT + negative indode" OK, but fiddly - "DT_CHR + S_IFCHR inode" OK, but sligthly suboptimal (or "completely fucked up" depending on your POV). The big advantage of not introducing DT_WHT is that it spares the pain of introducing possibly incompatible on-disk format (i.e. will old versions of fsck going to balk at it? will old kernels reject the image containing DT_WHT?) If the filesystem and tools are already prepared for the DT_WHT dirent type then that's a non-issue, obviously. Thanks, Miklos _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT 2014-11-10 13:52 ` Dave Chinner 2014-11-10 16:55 ` Miklos Szeredi @ 2015-01-09 13:30 ` Carlos Maiolino 2015-01-22 1:05 ` Dave Chinner 1 sibling, 1 reply; 11+ messages in thread From: Carlos Maiolino @ 2015-01-09 13:30 UTC (permalink / raw) To: xfs So, giving this conversation, should we implement WHITEOUTS in XFS already, or is this isn't decided yet? Cheers. On Tue, Nov 11, 2014 at 12:52:49AM +1100, Dave Chinner wrote: > On Mon, Nov 10, 2014 at 10:25:40AM +0100, Miklos Szeredi wrote: > > On Sun, Nov 9, 2014 at 12:42 AM, Dave Chinner <david@fromorbit.com> wrote: > > > On Fri, Nov 07, 2014 at 11:09:59AM -0800, Christoph Hellwig wrote: > > >> The overlayfs merge introduces a new rename flag to create to whiteouts. > > >> Should be a fairly easy to implement. > > >> > > >> Miklos, do you have any good documentation and/or test cases for this? > > > > > > So overlayfs uses some weird char dev hack to implement whiteout > > > inodes in directories? Why do we need a whiteout inode on disk? > > > what information is actually stored in the whiteout inode that > > > overlayfs actually needs? Only readdir and lookup care about > > > whiteouts, and AFAICT nothing of the inode is ever used except > > > checking the chrdev/whiteoutdev hack via ovl_is_whiteout(dentry). > > > > > > Indeed, whatever happened to just storing the whiteout in the dirent > > > via DT_WHT and using that information on lookup in the lower > > > filesystem to mark the dentry returned appropriately without needing > > > to lookup a real inode? > > > > The filesystem is free to implement whiteouts a dirent without an actual inode. > > Sure, but overlayfs won't make use of it, so we'd have > have to hack around overlayfs's ignorance of DT_WHT in several > different places to do this. e.g. > > - in mknod to intercept creation of magical whiteout chardevs > - in readdir so we can convert them to DT_CHR so overlayfs > can detect them, > - in ->lookup so we can create magical chardev inodes in > memory rather than try to read them from disk. > - in rename we have to detect the magical chardev inodes so > we know it's a whiteout we are dealing with > > This is difficult because overlayfs hard codes the definition of a > whiteout into the VFS interface implementation as well as it's > internal directory implementation. This leaves almost no room for > anyone to optimise the back end implementation because the > translation layers are complex and fiddly.... > > > But we do need at least an inode in the VFS, since the whiteout needs > > to be presented to userspace when not part of the overlay. > > Sure, but that's a different problem. > > > The DT_WHT > > makes the typical mistake of trying to make the implementation nice, > > while not caring about user interfaces. > > You're implying the d_type field in a dirent is something that it is > not. d_type has only one purpose in life - to allowing userspace > applications to avoid a stat() call to find out the type of the > object the dirent points to. > > > This is usually a big mistake, user interfaces are much more important > > than implementation details, and an already existing file type on > > which all the usual operations work (stat, unlink) is much better in > > this respect than a completely new object which is unknown and > > unmanageable for the vast majority of applications. > > Sure, but again that's not the issue I'm commenting on. The dirent > type has no effect on stat, unlink, etc that are done on the dirent > after it is returned to userspace. > > So why is overloading DT_CHR to mean 'either a char device or a > whiteout entry' a sane user interface design decision? d_type *was* > a simple, obvious, effective and efficient user interface that > allowed users to avoid extra syscalls. It's been used this way by > userspace for, what, 15 years? > > With the overlayfs "magic" we now have the situation where d_type is > not sufficient to avoid a stat() call to determine the type the > dirent points to. IOWs, we've just fucked up a perfectly good > interface that is widely used because somebody thought that using > the DT_WHT value in the d_type field for whiteout dirents is a "bad > interface". > > > The special chardev was Linus' idea, but I agree with him completely > > on this point. Introducing DT_WHT on the userspace API would be much > > more of a hack than reusing existing objects and operations. > > Magical char dev for access, unlink, etc: no problems there. > > DT_CHR for the whiteout dirent type: completely fucked up. > > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com > > _______________________________________________ > xfs mailing list > xfs@oss.sgi.com > http://oss.sgi.com/mailman/listinfo/xfs -- Carlos _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT 2015-01-09 13:30 ` Carlos Maiolino @ 2015-01-22 1:05 ` Dave Chinner 0 siblings, 0 replies; 11+ messages in thread From: Dave Chinner @ 2015-01-22 1:05 UTC (permalink / raw) To: xfs On Fri, Jan 09, 2015 at 08:30:19AM -0500, Carlos Maiolino wrote: > So, giving this conversation, should we implement WHITEOUTS in XFS > already, or is this isn't decided yet? Sure, we have to implement whiteouts to support overlayfs. Though internally we'll do it with DT_WHT in dirents, so we don't need inodes on disk for them. lookups will just have to allocate in memory chardev inodes so the rest of the world will function appropriately if they access whiteouts from the lower filesystems. Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: beginners project: RENAME_WHITEOUT 2014-11-07 19:09 beginners project: RENAME_WHITEOUT Christoph Hellwig 2014-11-07 22:59 ` Eric Sandeen 2014-11-08 23:42 ` Dave Chinner @ 2014-11-10 9:14 ` Miklos Szeredi 2 siblings, 0 replies; 11+ messages in thread From: Miklos Szeredi @ 2014-11-10 9:14 UTC (permalink / raw) To: Christoph Hellwig; +Cc: xfs On Fri, Nov 7, 2014 at 8:09 PM, Christoph Hellwig <hch@infradead.org> wrote: > The overlayfs merge introduces a new rename flag to create to whiteouts. > Should be a fairly easy to implement. > > Miklos, do you have any good documentation and/or test cases for this? No test case for the RENAME_WHITEOUT flag yet. I'll work on that. We do have a man page and an xfstest test case for RENAME_EXCHANGE, which is also important for overlayfs. Thanks, Miklos _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2015-01-22 1:05 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-11-07 19:09 beginners project: RENAME_WHITEOUT Christoph Hellwig 2014-11-07 22:59 ` Eric Sandeen 2014-11-08 6:40 ` Christoph Hellwig 2014-11-09 0:08 ` cem 2014-11-08 23:42 ` Dave Chinner 2014-11-10 9:25 ` Miklos Szeredi 2014-11-10 13:52 ` Dave Chinner 2014-11-10 16:55 ` Miklos Szeredi 2015-01-09 13:30 ` Carlos Maiolino 2015-01-22 1:05 ` Dave Chinner 2014-11-10 9:14 ` Miklos Szeredi
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox