* Stable inode numbers @ 2016-07-21 23:33 Vito Caputo 2016-07-22 12:15 ` Miklos Szeredi 0 siblings, 1 reply; 9+ messages in thread From: Vito Caputo @ 2016-07-21 23:33 UTC (permalink / raw) To: linux-unionfs Hello list, We're receiving bug reports from users simply untarring archives onto overlayfs mounts (docker + overlayfs) where tar fails with "Directory renamed before its status could be extracted" errors. This is triggered in GNU tar when the inode number of an ancestor directory changes unexpectedly in the deferred application of ancestral metadata. It's not consistently reproducible, but in a slightly memory-constrained vm it is very easy to reproduce. What appears to be happening is the tar extraction activity is churning through the kernel's cache so a subsequent lookup of the directory in question returns a new inode number. One can easily simulate this using `stat` and `echo 2 > /proc/sys/vm/drop_caches` to observe the overlayfs inode number change across the cache drop. The seemingly erratic nature of this failure is particularly frustrating for users, as it gives the impression things _should_ work and just randomly aren't due to a bug. Scrutiny of the overlayfs implementation reveals that unstable inode numbers is intentional, and spurious failures of applications which notice intersecting relevant cache evictions should be expected. Are there any plans or strategies for changing this situation in overlayfs? I can think of a few potential solutions, but they all tend to revolve around persistently allocating inode numbers on first lookup, which of course requires both space and time overlayfs currently elides. Cheers, Vito Caputo ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Stable inode numbers 2016-07-21 23:33 Stable inode numbers Vito Caputo @ 2016-07-22 12:15 ` Miklos Szeredi 2016-07-22 20:55 ` Vito Caputo 0 siblings, 1 reply; 9+ messages in thread From: Miklos Szeredi @ 2016-07-22 12:15 UTC (permalink / raw) To: Vito Caputo; +Cc: linux-unionfs On Thu, Jul 21, 2016 at 04:33:38PM -0700, Vito Caputo wrote: > Hello list, > > We're receiving bug reports from users simply untarring archives onto overlayfs > mounts (docker + overlayfs) where tar fails with "Directory renamed before its > status could be extracted" errors. > > This is triggered in GNU tar when the inode number of an ancestor directory > changes unexpectedly in the deferred application of ancestral metadata. > > It's not consistently reproducible, but in a slightly memory-constrained vm it > is very easy to reproduce. > > What appears to be happening is the tar extraction activity is churning through > the kernel's cache so a subsequent lookup of the directory in question returns > a new inode number. One can easily simulate this using `stat` and `echo 2 > > /proc/sys/vm/drop_caches` to observe the overlayfs inode number change across > the cache drop. > > The seemingly erratic nature of this failure is particularly frustrating for > users, as it gives the impression things _should_ work and just randomly aren't > due to a bug. Scrutiny of the overlayfs implementation reveals that unstable > inode numbers is intentional, and spurious failures of applications which > notice intersecting relevant cache evictions should be expected. > > Are there any plans or strategies for changing this situation in overlayfs? > > I can think of a few potential solutions, but they all tend to revolve around > persistently allocating inode numbers on first lookup, which of course requires > both space and time overlayfs currently elides. Hmm, I haven't thought about this problem yet. Maybe we are better off returning the underlying dev/ino in stat (which is persistent long term, but not persistent during copy-up). This would be cheating a bit, since stat would then return the same numbers for the overlay directory and the underlying directory, which can be different (unlike non-directories, which overlayfs returns verbatim). Can you please try the following patch. It passes unionmount-testsuite with the layering check disabled. Thanks, Miklos --- diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c index ddda948109d3..f913b7b73e9d 100644 --- a/fs/overlayfs/dir.c +++ b/fs/overlayfs/dir.c @@ -147,9 +147,6 @@ static int ovl_dir_getattr(struct vfsmount *mnt, struct dentry *dentry, if (err) return err; - stat->dev = dentry->d_sb->s_dev; - stat->ino = dentry->d_inode->i_ino; - /* * It's probably not worth it to count subdirs to get the * correct link count. nlink=1 seems to pacify 'find' and ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: Stable inode numbers 2016-07-22 12:15 ` Miklos Szeredi @ 2016-07-22 20:55 ` Vito Caputo 2016-07-25 11:34 ` Miklos Szeredi 0 siblings, 1 reply; 9+ messages in thread From: Vito Caputo @ 2016-07-22 20:55 UTC (permalink / raw) To: Miklos Szeredi; +Cc: linux-unionfs On Fri, Jul 22, 2016 at 5:15 AM, Miklos Szeredi <miklos@szeredi.hu> wrote: > On Thu, Jul 21, 2016 at 04:33:38PM -0700, Vito Caputo wrote: >> Hello list, >> >> We're receiving bug reports from users simply untarring archives onto overlayfs >> mounts (docker + overlayfs) where tar fails with "Directory renamed before its >> status could be extracted" errors. >> >> This is triggered in GNU tar when the inode number of an ancestor directory >> changes unexpectedly in the deferred application of ancestral metadata. >> >> It's not consistently reproducible, but in a slightly memory-constrained vm it >> is very easy to reproduce. >> >> What appears to be happening is the tar extraction activity is churning through >> the kernel's cache so a subsequent lookup of the directory in question returns >> a new inode number. One can easily simulate this using `stat` and `echo 2 > >> /proc/sys/vm/drop_caches` to observe the overlayfs inode number change across >> the cache drop. >> >> The seemingly erratic nature of this failure is particularly frustrating for >> users, as it gives the impression things _should_ work and just randomly aren't >> due to a bug. Scrutiny of the overlayfs implementation reveals that unstable >> inode numbers is intentional, and spurious failures of applications which >> notice intersecting relevant cache evictions should be expected. >> >> Are there any plans or strategies for changing this situation in overlayfs? >> >> I can think of a few potential solutions, but they all tend to revolve around >> persistently allocating inode numbers on first lookup, which of course requires >> both space and time overlayfs currently elides. > > Hmm, I haven't thought about this problem yet. > > Maybe we are better off returning the underlying dev/ino in stat (which is > persistent long term, but not persistent during copy-up). This would be > cheating a bit, since stat would then return the same numbers for the overlay > directory and the underlying directory, which can be different (unlike > non-directories, which overlayfs returns verbatim). > > Can you please try the following patch. It passes unionmount-testsuite with the > layering check disabled. > The patch eliminates the errors from tar in the test I've been using to reproduce the user-reported issues. However, I'm doubtful of it being a satisfactory general solution because the inode number of a pre-existing directory which undergoes copy-up spuriously changes. There's probably a scenario where this still upsets tar when extracting over an partial tree pre-existing in the lower layer, adding names to existing directories, causing their inode numbers to change. The relevant code where tar detects these shenanigans may be seen here: http://git.savannah.gnu.org/cgit/tar.git/tree/src/extract.c#n867 Perhaps we can come up with a better, more general solution without adding substantial complexity or overhead? Thanks, Vito Caputo ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Stable inode numbers 2016-07-22 20:55 ` Vito Caputo @ 2016-07-25 11:34 ` Miklos Szeredi 2016-07-25 21:25 ` Vito Caputo 0 siblings, 1 reply; 9+ messages in thread From: Miklos Szeredi @ 2016-07-25 11:34 UTC (permalink / raw) To: Vito Caputo; +Cc: linux-unionfs On Fri, Jul 22, 2016 at 01:55:56PM -0700, Vito Caputo wrote: > The patch eliminates the errors from tar in the test I've been using > to reproduce the user-reported issues. However, I'm doubtful of it > being a satisfactory general solution because the inode number of a > pre-existing directory which undergoes copy-up spuriously changes. > > There's probably a scenario where this still upsets tar when > extracting over an partial tree pre-existing in the lower layer, > adding names to existing directories, causing their inode numbers to > change. > > The relevant code where tar detects these shenanigans may be seen here: > http://git.savannah.gnu.org/cgit/tar.git/tree/src/extract.c#n867 > > Perhaps we can come up with a better, more general solution without > adding substantial complexity or overhead? Here's another try. It's simple enough, as to overhead, please let me know if you are comfortable with this. Thanks, Miklos --- From: Miklos Szeredi <mszeredi@redhat.com> Subject: ovl: optionally copy up directory on getattr This is to make directory dev/ino stable. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> --- fs/overlayfs/dir.c | 15 +++++++++++++-- fs/overlayfs/overlayfs.h | 6 ++++++ fs/overlayfs/super.c | 19 +++++++++++++++++++ 3 files changed, 38 insertions(+), 2 deletions(-) --- a/fs/overlayfs/dir.c +++ b/fs/overlayfs/dir.c @@ -139,6 +139,15 @@ static int ovl_dir_getattr(struct vfsmou enum ovl_path_type type; struct path realpath; const struct cred *old_cred; + bool devino_from_ovl = true; + + if (ovl_copy_up_type(dentry) == OVL_COPY_UP_ON_GETATTR) { + err = ovl_copy_up(dentry); + if (err) + return err; + + devino_from_ovl = false; + } type = ovl_path_real(dentry, &realpath); old_cred = ovl_override_creds(dentry->d_sb); @@ -147,8 +156,10 @@ static int ovl_dir_getattr(struct vfsmou if (err) return err; - stat->dev = dentry->d_sb->s_dev; - stat->ino = dentry->d_inode->i_ino; + if (devino_from_ovl) { + stat->dev = dentry->d_sb->s_dev; + stat->ino = dentry->d_inode->i_ino; + } /* * It's probably not worth it to count subdirs to get the --- a/fs/overlayfs/overlayfs.h +++ b/fs/overlayfs/overlayfs.h @@ -24,6 +24,11 @@ enum ovl_path_type { (OVL_TYPE_MERGE(type) || !OVL_TYPE_UPPER(type)) +enum ovl_copy_up_type { + OVL_COPY_UP_ON_MODIF, + OVL_COPY_UP_ON_GETATTR, +}; + #define OVL_XATTR_PREFIX XATTR_TRUSTED_PREFIX "overlay" #define OVL_XATTR_OPAQUE OVL_XATTR_PREFIX ".opaque" @@ -172,6 +177,7 @@ struct file *ovl_path_open(struct path * struct dentry *ovl_upper_create(struct dentry *upperdir, struct dentry *dentry, struct kstat *stat, const char *link); +enum ovl_copy_up_type ovl_copy_up_type(struct dentry *dentry); /* readdir.c */ extern const struct file_operations ovl_dir_operations; --- a/fs/overlayfs/super.c +++ b/fs/overlayfs/super.c @@ -31,6 +31,7 @@ struct ovl_config { char *upperdir; char *workdir; bool default_permissions; + enum ovl_copy_up_type dir_copy_up_type; }; /* private information held for overlayfs's superblock */ @@ -203,6 +204,16 @@ struct dentry *ovl_workdir(struct dentry return ofs->workdir; } +enum ovl_copy_up_type ovl_copy_up_type(struct dentry *dentry) +{ + struct ovl_fs *ofs = dentry->d_sb->s_fs_info; + + if (d_is_dir(dentry)) + return ofs->config.dir_copy_up_type; + + return OVL_COPY_UP_ON_MODIF; +} + bool ovl_dentry_is_opaque(struct dentry *dentry) { struct ovl_entry *oe = dentry->d_fsdata; @@ -681,6 +692,8 @@ static int ovl_show_options(struct seq_f } if (ufs->config.default_permissions) seq_puts(m, ",default_permissions"); + if (ufs->config.dir_copy_up_type == OVL_COPY_UP_ON_GETATTR) + seq_puts(m, ",dir_copy_up_on=getattr"); return 0; } @@ -707,6 +720,7 @@ enum { OPT_UPPERDIR, OPT_WORKDIR, OPT_DEFAULT_PERMISSIONS, + OPT_DIR_COPY_UP_ON_GETATTR, OPT_ERR, }; @@ -715,6 +729,7 @@ static const match_table_t ovl_tokens = {OPT_UPPERDIR, "upperdir=%s"}, {OPT_WORKDIR, "workdir=%s"}, {OPT_DEFAULT_PERMISSIONS, "default_permissions"}, + {OPT_DIR_COPY_UP_ON_GETATTR, "dir_copy_up_on=getattr"}, {OPT_ERR, NULL} }; @@ -779,6 +794,10 @@ static int ovl_parse_opt(char *opt, stru config->default_permissions = true; break; + case OPT_DIR_COPY_UP_ON_GETATTR: + config->dir_copy_up_type = OVL_COPY_UP_ON_GETATTR; + break; + default: pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p); return -EINVAL; ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Stable inode numbers 2016-07-25 11:34 ` Miklos Szeredi @ 2016-07-25 21:25 ` Vito Caputo 2016-07-26 1:51 ` J. R. Okajima 2016-07-26 10:02 ` Miklos Szeredi 0 siblings, 2 replies; 9+ messages in thread From: Vito Caputo @ 2016-07-25 21:25 UTC (permalink / raw) To: Miklos Szeredi; +Cc: linux-unionfs I think this strategy is an improvement, but I'm a bit apprehensive about how specific it is to this particular style of untar-like directory metadata preservation failure in only providing stability to directory inode numbers. Additionally it introduces a userspace-visible mount option, for something which /feels/ like a stop-gap kludge to make some things work better in the short-term. Maybe it's acceptable, since a more general solution could be implemented later which ignores the added mount option without conflict. As overlayfs usage increases in the "enterprise" via containers I suspect we'll be seeing substantial support load from unsuspecting users tripping over quirks like this and at some point there's a threshold where user confidence is lost. That is the lens I view overlayfs problems like these through, hence I'd prefer a more general solution with stable inode numbers making overlayfs behavior more consistent with the filesystems backing it. It's difficult for me to define what is "good enough" for overlayfs correctness, and this situation seems like it might be one of those epitomizing the saying "perfect is the enemy of good". Thanks, Vito Caputo On Mon, Jul 25, 2016 at 4:34 AM, Miklos Szeredi <miklos@szeredi.hu> wrote: > On Fri, Jul 22, 2016 at 01:55:56PM -0700, Vito Caputo wrote: > >> The patch eliminates the errors from tar in the test I've been using >> to reproduce the user-reported issues. However, I'm doubtful of it >> being a satisfactory general solution because the inode number of a >> pre-existing directory which undergoes copy-up spuriously changes. >> >> There's probably a scenario where this still upsets tar when >> extracting over an partial tree pre-existing in the lower layer, >> adding names to existing directories, causing their inode numbers to >> change. >> >> The relevant code where tar detects these shenanigans may be seen here: >> http://git.savannah.gnu.org/cgit/tar.git/tree/src/extract.c#n867 >> >> Perhaps we can come up with a better, more general solution without >> adding substantial complexity or overhead? > > Here's another try. It's simple enough, as to overhead, please let me know if > you are comfortable with this. > > Thanks, > Miklos > > --- > From: Miklos Szeredi <mszeredi@redhat.com> > Subject: ovl: optionally copy up directory on getattr > > This is to make directory dev/ino stable. > > Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> > --- > fs/overlayfs/dir.c | 15 +++++++++++++-- > fs/overlayfs/overlayfs.h | 6 ++++++ > fs/overlayfs/super.c | 19 +++++++++++++++++++ > 3 files changed, 38 insertions(+), 2 deletions(-) > > --- a/fs/overlayfs/dir.c > +++ b/fs/overlayfs/dir.c > @@ -139,6 +139,15 @@ static int ovl_dir_getattr(struct vfsmou > enum ovl_path_type type; > struct path realpath; > const struct cred *old_cred; > + bool devino_from_ovl = true; > + > + if (ovl_copy_up_type(dentry) == OVL_COPY_UP_ON_GETATTR) { > + err = ovl_copy_up(dentry); > + if (err) > + return err; > + > + devino_from_ovl = false; > + } > > type = ovl_path_real(dentry, &realpath); > old_cred = ovl_override_creds(dentry->d_sb); > @@ -147,8 +156,10 @@ static int ovl_dir_getattr(struct vfsmou > if (err) > return err; > > - stat->dev = dentry->d_sb->s_dev; > - stat->ino = dentry->d_inode->i_ino; > + if (devino_from_ovl) { > + stat->dev = dentry->d_sb->s_dev; > + stat->ino = dentry->d_inode->i_ino; > + } > > /* > * It's probably not worth it to count subdirs to get the > --- a/fs/overlayfs/overlayfs.h > +++ b/fs/overlayfs/overlayfs.h > @@ -24,6 +24,11 @@ enum ovl_path_type { > (OVL_TYPE_MERGE(type) || !OVL_TYPE_UPPER(type)) > > > +enum ovl_copy_up_type { > + OVL_COPY_UP_ON_MODIF, > + OVL_COPY_UP_ON_GETATTR, > +}; > + > #define OVL_XATTR_PREFIX XATTR_TRUSTED_PREFIX "overlay" > #define OVL_XATTR_OPAQUE OVL_XATTR_PREFIX ".opaque" > > @@ -172,6 +177,7 @@ struct file *ovl_path_open(struct path * > > struct dentry *ovl_upper_create(struct dentry *upperdir, struct dentry *dentry, > struct kstat *stat, const char *link); > +enum ovl_copy_up_type ovl_copy_up_type(struct dentry *dentry); > > /* readdir.c */ > extern const struct file_operations ovl_dir_operations; > --- a/fs/overlayfs/super.c > +++ b/fs/overlayfs/super.c > @@ -31,6 +31,7 @@ struct ovl_config { > char *upperdir; > char *workdir; > bool default_permissions; > + enum ovl_copy_up_type dir_copy_up_type; > }; > > /* private information held for overlayfs's superblock */ > @@ -203,6 +204,16 @@ struct dentry *ovl_workdir(struct dentry > return ofs->workdir; > } > > +enum ovl_copy_up_type ovl_copy_up_type(struct dentry *dentry) > +{ > + struct ovl_fs *ofs = dentry->d_sb->s_fs_info; > + > + if (d_is_dir(dentry)) > + return ofs->config.dir_copy_up_type; > + > + return OVL_COPY_UP_ON_MODIF; > +} > + > bool ovl_dentry_is_opaque(struct dentry *dentry) > { > struct ovl_entry *oe = dentry->d_fsdata; > @@ -681,6 +692,8 @@ static int ovl_show_options(struct seq_f > } > if (ufs->config.default_permissions) > seq_puts(m, ",default_permissions"); > + if (ufs->config.dir_copy_up_type == OVL_COPY_UP_ON_GETATTR) > + seq_puts(m, ",dir_copy_up_on=getattr"); > return 0; > } > > @@ -707,6 +720,7 @@ enum { > OPT_UPPERDIR, > OPT_WORKDIR, > OPT_DEFAULT_PERMISSIONS, > + OPT_DIR_COPY_UP_ON_GETATTR, > OPT_ERR, > }; > > @@ -715,6 +729,7 @@ static const match_table_t ovl_tokens = > {OPT_UPPERDIR, "upperdir=%s"}, > {OPT_WORKDIR, "workdir=%s"}, > {OPT_DEFAULT_PERMISSIONS, "default_permissions"}, > + {OPT_DIR_COPY_UP_ON_GETATTR, "dir_copy_up_on=getattr"}, > {OPT_ERR, NULL} > }; > > @@ -779,6 +794,10 @@ static int ovl_parse_opt(char *opt, stru > config->default_permissions = true; > break; > > + case OPT_DIR_COPY_UP_ON_GETATTR: > + config->dir_copy_up_type = OVL_COPY_UP_ON_GETATTR; > + break; > + > default: > pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p); > return -EINVAL; ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Stable inode numbers 2016-07-25 21:25 ` Vito Caputo @ 2016-07-26 1:51 ` J. R. Okajima 2016-07-26 10:02 ` Miklos Szeredi 1 sibling, 0 replies; 9+ messages in thread From: J. R. Okajima @ 2016-07-26 1:51 UTC (permalink / raw) To: Vito Caputo; +Cc: Miklos Szeredi, linux-unionfs Vito Caputo: > ... That is the lens I view > overlayfs problems like these through, hence I'd prefer a more general > solution with stable inode numbers making overlayfs behavior more > consistent with the filesystems backing it. You may want to try aufs instead of overlayfs since aufs has a such feature called XINO (external inode number). It is working well for over a decade, but some users set 'noxino' mount option which makes aufs not to use XINO. Because XINO consumes some disk space, and if XINO file is placed on tmpfs, then it means consuming memory. J. R. Okajima ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Stable inode numbers 2016-07-25 21:25 ` Vito Caputo 2016-07-26 1:51 ` J. R. Okajima @ 2016-07-26 10:02 ` Miklos Szeredi 2016-07-26 11:09 ` Miklos Szeredi 1 sibling, 1 reply; 9+ messages in thread From: Miklos Szeredi @ 2016-07-26 10:02 UTC (permalink / raw) To: Vito Caputo; +Cc: linux-unionfs@vger.kernel.org On Mon, Jul 25, 2016 at 11:25 PM, Vito Caputo <vito.caputo@coreos.com> wrote: > I think this strategy is an improvement, but I'm a bit apprehensive > about how specific it is to this particular style of untar-like > directory metadata preservation failure in only providing stability to > directory inode numbers. > > Additionally it introduces a userspace-visible mount option, for > something which /feels/ like a stop-gap kludge to make some things > work better in the short-term. Maybe it's acceptable, since a more > general solution could be implemented later which ignores the added > mount option without conflict. Not sure that it's a kludge. AFAIR union-mounts copy up directory on lookup for simplicity of implementation. Overlayfs doesn't do that and it turns out to be not a big problem. But I don't have any input into the performance impact of doing this way or that. If it turns out to be unacceptable then we can think about some other solution (possibly something out-of-band, like aufs does). > As overlayfs usage increases in the "enterprise" via containers I > suspect we'll be seeing substantial support load from unsuspecting > users tripping over quirks like this and at some point there's a > threshold where user confidence is lost. Too true. > That is the lens I view > overlayfs problems like these through, hence I'd prefer a more general > solution with stable inode numbers making overlayfs behavior more > consistent with the filesystems backing it. Perhaps this needs to be the default then, turning it off to get the more space efficient but less conformant version. That risks being a regression for some, but hey, overlayfs is still somewhat experimental. > It's difficult for me to define what is "good enough" for overlayfs > correctness, and this situation seems like it might be one of those > epitomizing the saying "perfect is the enemy of good". If it breaks real apps, then it's not good. We need to fix it one way or the other. Thanks, Miklos ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Stable inode numbers 2016-07-26 10:02 ` Miklos Szeredi @ 2016-07-26 11:09 ` Miklos Szeredi 2016-08-16 0:03 ` Vito Caputo 0 siblings, 1 reply; 9+ messages in thread From: Miklos Szeredi @ 2016-07-26 11:09 UTC (permalink / raw) To: Vito Caputo; +Cc: linux-unionfs@vger.kernel.org On Tue, Jul 26, 2016 at 12:02:13PM +0200, Miklos Szeredi wrote: > On Mon, Jul 25, 2016 at 11:25 PM, Vito Caputo <vito.caputo@coreos.com> wrote: > > I think this strategy is an improvement, but I'm a bit apprehensive > > about how specific it is to this particular style of untar-like > > directory metadata preservation failure in only providing stability to > > directory inode numbers. > > > > Additionally it introduces a userspace-visible mount option, for > > something which /feels/ like a stop-gap kludge to make some things > > work better in the short-term. Maybe it's acceptable, since a more > > general solution could be implemented later which ignores the added > > mount option without conflict. Here's another patch that achieves stability without compromising performance/disk usage. Again not perfect, but perhaps good enough? Thanks, Miklos diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c index 12bcd07b9e32..5b7de7d489e0 100644 --- a/fs/overlayfs/dir.c +++ b/fs/overlayfs/dir.c @@ -143,13 +143,24 @@ static int ovl_dir_getattr(struct vfsmount *mnt, struct dentry *dentry, type = ovl_path_real(dentry, &realpath); old_cred = ovl_override_creds(dentry->d_sb); err = vfs_getattr(&realpath, stat); + /* + * Use lower ino/dev for merged dir, so they are stable across copy up. + */ + if (!err && OVL_TYPE_MERGE(type) && OVL_TYPE_UPPER(type)) { + struct path lowerpath; + struct kstat lowerstat; + + ovl_path_lower(dentry, &lowerpath); + err = vfs_getattr(&lowerpath, &lowerstat); + if (!err) { + stat->dev = lowerstat.dev; + stat->ino = lowerstat.ino; + } + } revert_creds(old_cred); if (err) return err; - stat->dev = dentry->d_sb->s_dev; - stat->ino = dentry->d_inode->i_ino; - /* * It's probably not worth it to count subdirs to get the * correct link count. nlink=1 seems to pacify 'find' and ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: Stable inode numbers 2016-07-26 11:09 ` Miklos Szeredi @ 2016-08-16 0:03 ` Vito Caputo 0 siblings, 0 replies; 9+ messages in thread From: Vito Caputo @ 2016-08-16 0:03 UTC (permalink / raw) To: Miklos Szeredi; +Cc: linux-unionfs@vger.kernel.org Miklos, This most recent patch looks ok to me, and it works fine in my limited testing. Thanks, Vito Caputo On Tue, Jul 26, 2016 at 4:09 AM, Miklos Szeredi <miklos@szeredi.hu> wrote: > On Tue, Jul 26, 2016 at 12:02:13PM +0200, Miklos Szeredi wrote: >> On Mon, Jul 25, 2016 at 11:25 PM, Vito Caputo <vito.caputo@coreos.com> wrote: >> > I think this strategy is an improvement, but I'm a bit apprehensive >> > about how specific it is to this particular style of untar-like >> > directory metadata preservation failure in only providing stability to >> > directory inode numbers. >> > >> > Additionally it introduces a userspace-visible mount option, for >> > something which /feels/ like a stop-gap kludge to make some things >> > work better in the short-term. Maybe it's acceptable, since a more >> > general solution could be implemented later which ignores the added >> > mount option without conflict. > > Here's another patch that achieves stability without compromising > performance/disk usage. > > Again not perfect, but perhaps good enough? > > Thanks, > Miklos > > diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c > index 12bcd07b9e32..5b7de7d489e0 100644 > --- a/fs/overlayfs/dir.c > +++ b/fs/overlayfs/dir.c > @@ -143,13 +143,24 @@ static int ovl_dir_getattr(struct vfsmount *mnt, struct dentry *dentry, > type = ovl_path_real(dentry, &realpath); > old_cred = ovl_override_creds(dentry->d_sb); > err = vfs_getattr(&realpath, stat); > + /* > + * Use lower ino/dev for merged dir, so they are stable across copy up. > + */ > + if (!err && OVL_TYPE_MERGE(type) && OVL_TYPE_UPPER(type)) { > + struct path lowerpath; > + struct kstat lowerstat; > + > + ovl_path_lower(dentry, &lowerpath); > + err = vfs_getattr(&lowerpath, &lowerstat); > + if (!err) { > + stat->dev = lowerstat.dev; > + stat->ino = lowerstat.ino; > + } > + } > revert_creds(old_cred); > if (err) > return err; > > - stat->dev = dentry->d_sb->s_dev; > - stat->ino = dentry->d_inode->i_ino; > - > /* > * It's probably not worth it to count subdirs to get the > * correct link count. nlink=1 seems to pacify 'find' and ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2016-08-16 0:03 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-07-21 23:33 Stable inode numbers Vito Caputo 2016-07-22 12:15 ` Miklos Szeredi 2016-07-22 20:55 ` Vito Caputo 2016-07-25 11:34 ` Miklos Szeredi 2016-07-25 21:25 ` Vito Caputo 2016-07-26 1:51 ` J. R. Okajima 2016-07-26 10:02 ` Miklos Szeredi 2016-07-26 11:09 ` Miklos Szeredi 2016-08-16 0:03 ` Vito Caputo
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).