* [PATCH v3 00/23] Overlayfs inodes index
@ 2017-06-14 7:26 Amir Goldstein
2017-06-14 7:26 ` [PATCH v3 01/23] vfs: introduce inode 'inuse' lock Amir Goldstein
2017-06-14 7:30 ` [PATCH v3 00/23] Overlayfs inodes index Miklos Szeredi
0 siblings, 2 replies; 4+ messages in thread
From: Amir Goldstein @ 2017-06-14 7:26 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: linux-unionfs, linux-fsdevel
Miklos,
I've made all the changes we discussed on v2 review and now all
upper/lower hardlinks are consistent. Note that I chose to invalidate
lower indexed dentries instead of updating all aliases on first copy up
(patch 21). That seemed simpler to me. Please let me know if you see any
culprit with that approach. I also re-introduced ovl_update_type()
(patch 8) for easier and more compact storing of the INDEX type flag.
The remaining work on persistent nlink accounting and orphan index
cleanup is left for a differnt posting. Without that complementary work,
the index entries will remain forever. I am not aware of any other
culprit or correctness issues that may arrise from lack of index cleanup.
This work introduces the inodes index opt-in feature, which provides:
- Hardlinks are not broken on copy up
- Infrastructure for overlayfs NFS export
The work is available on my ovl-hardlinks branch [1].
For the curious, documentation about how overlay NFS export would work
with the inodes index is available in the linked commit [2].
The most significant change w.r.t. vfs is that with inodes index,
overlay inodes are hashed and unique throughout the lifecycle of an
overlay object (i.e. across copy up). This is required for not breaking
hardlinks on copy up.
Hardlink copy up tests including coverage for concurrent copy up of
lower hardlinks and lower/upper hardlinks consistency are available on
my xfstests dev branch [3].
This work also fixes constant st_ino for samefs case for lower hardlinks,
so the few constant st_ino tests that fail on v4.12-rc1 due to copy up of
lower hardlinks now pass [3][4].
Thanks,
Amir.
Changes since v2:
- Introduce ovl_inode and ovl_inode mutex
- Synchronize copy up with ovl_inode mutex
- Don't store index dentry in overlay dentry
- Consistency of lower/upper hardlinks by link-up on lookup
- Preemptive copy up before lower hardlink unlink
TODO:
- Persistent overlay nlink accounting
- Whiteout index when overlay nlink drops to zero
- Cleanup stale and orphan index entries on mount
- Document the inodes index feature
[1] https://github.com/amir73il/linux/commits/ovl-hardlinks
[2] https://github.com/amir73il/linux/commits/ovl-nfs-export
[3] https://github.com/amir73il/xfstests/commits/overlayfs-devel
[4] https://github.com/amir73il/unionmount-testsuite/commits/overlayfs-devel
Amir Goldstein (23):
vfs: introduce inode 'inuse' lock
ovl: get exclusive ownership on upper/work dirs
ovl: relax same fs constrain for ovl_check_origin()
ovl: generalize ovl_create_workdir()
ovl: introduce the inodes index dir feature
ovl: verify upper root dir matches lower root dir
ovl: verify index dir matches upper dir
ovl: store path type in dentry
ovl: cram dentry state booleans into type flags
ovl: lookup index entry for copy up origin
ovl: allocate an ovl_inode struct
ovl: store upper/lower real inode in ovl_inode_info
ovl: use ovl_inode_init() for initializing new inode
ovl: hash overlay non-dir inodes by copy up origin inode
ovl: defer upper dir lock to tempfile link
ovl: factor out ovl_copy_up_inode() helper
ovl: generalize ovl_copy_up_locked() using actors
ovl: generalize ovl_copy_up_one() using actors
ovl: use ovl_inode mutex to synchronize concurrent copy up
ovl: implement index dir copy up method
ovl: link up indexed lower hardlink on lookup
ovl: fix nlink leak in ovl_rename()
ovl: adjust overlay inode nlink for indexed inodes
fs/inode.c | 50 ++++
fs/overlayfs/Kconfig | 20 ++
fs/overlayfs/copy_up.c | 592 +++++++++++++++++++++++++++++++++++++----------
fs/overlayfs/dir.c | 50 +++-
fs/overlayfs/inode.c | 68 +++++-
fs/overlayfs/namei.c | 283 ++++++++++++++++++----
fs/overlayfs/overlayfs.h | 37 +--
fs/overlayfs/ovl_entry.h | 37 ++-
fs/overlayfs/super.c | 308 +++++++++++++++++++++---
fs/overlayfs/util.c | 164 ++++++++++---
include/linux/fs.h | 14 ++
11 files changed, 1356 insertions(+), 267 deletions(-)
--
2.7.4
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH v3 01/23] vfs: introduce inode 'inuse' lock
2017-06-14 7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
@ 2017-06-14 7:26 ` Amir Goldstein
2017-06-14 7:30 ` [PATCH v3 00/23] Overlayfs inodes index Miklos Szeredi
1 sibling, 0 replies; 4+ messages in thread
From: Amir Goldstein @ 2017-06-14 7:26 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: linux-unionfs, linux-fsdevel
Added an i_state flag I_INUSE and helpers to set/clear/test the bit.
The 'inuse' lock is an 'advisory' inode lock, that can be used to extend
exclusive create protection beyond parent->i_mutex lock among cooperating
users.
This is going to be used by overlayfs to get exclusive ownership on upper
and work dirs among overlayfs mounts.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
fs/inode.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/fs.h | 14 ++++++++++++++
2 files changed, 64 insertions(+)
diff --git a/fs/inode.c b/fs/inode.c
index db5914783a71..6a41b27467eb 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -2120,3 +2120,53 @@ struct timespec current_time(struct inode *inode)
return timespec_trunc(now, inode->i_sb->s_time_gran);
}
EXPORT_SYMBOL(current_time);
+
+/**
+ * inode_inuse_trylock - try to get an exclusive 'inuse' lock on inode
+ * @inode: inode being locked
+ *
+ * The 'inuse' lock is an 'advisory' lock that can be used to extend exclusive
+ * create protection beyond parent->i_mutex lock among cooperating users.
+ * Used by overlayfs to get exclusive ownership on upper and work dirs among
+ * overlayfs mounts.
+ *
+ * Caller must hold a reference to inode to prevent it from being freed while
+ * it is marked inuse.
+ *
+ * Return true if I_INUSE flag was set by this call.
+ */
+bool inode_inuse_trylock(struct inode *inode)
+{
+ bool locked = false;
+
+ spin_lock(&inode->i_lock);
+ if (!WARN_ON(!atomic_read(&inode->i_count)) &&
+ !WARN_ON(inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) &&
+ !(inode->i_state & I_INUSE)) {
+ inode->i_state |= I_INUSE;
+ locked = true;
+ }
+ spin_unlock(&inode->i_lock);
+ return locked;
+}
+EXPORT_SYMBOL(inode_inuse_trylock);
+
+/**
+ * inode_inuse_unlock - release exclusive 'inuse' lock
+ * @inode: inode inuse to unlock
+ *
+ * Clear the I_INUSE state.
+ *
+ * Caller must hold a reference to inode and must have successfully marked
+ * the inode 'inuse' prior to this call.
+ */
+void inode_inuse_unlock(struct inode *inode)
+{
+ spin_lock(&inode->i_lock);
+ WARN_ON(!atomic_read(&inode->i_count));
+ WARN_ON(inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE));
+ WARN_ON(!(inode->i_state & I_INUSE));
+ inode->i_state &= ~I_INUSE;
+ spin_unlock(&inode->i_lock);
+}
+EXPORT_SYMBOL(inode_inuse_unlock);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index aab10f93ef23..532802120258 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1929,6 +1929,12 @@ static inline bool HAS_UNMAPPED_ID(struct inode *inode)
* wb stat updates to grab mapping->tree_lock. See
* inode_switch_wb_work_fn() for details.
*
+ * I_INUSE An 'advisory' bit to get exclusive ownership on inode
+ * using inode_inuse_trylock(). It can be used to extend
+ * exclusive create protection beyond parent->i_mutex lock.
+ * Used by overlayfs to get exclusive ownership on upper
+ * and work dirs among overlayfs mounts.
+ *
* Q: What is the difference between I_WILL_FREE and I_FREEING?
*/
#define I_DIRTY_SYNC (1 << 0)
@@ -1949,6 +1955,7 @@ static inline bool HAS_UNMAPPED_ID(struct inode *inode)
#define __I_DIRTY_TIME_EXPIRED 12
#define I_DIRTY_TIME_EXPIRED (1 << __I_DIRTY_TIME_EXPIRED)
#define I_WB_SWITCH (1 << 13)
+#define I_INUSE (1 << 14)
#define I_DIRTY (I_DIRTY_SYNC | I_DIRTY_DATASYNC | I_DIRTY_PAGES)
#define I_DIRTY_ALL (I_DIRTY | I_DIRTY_TIME)
@@ -3258,5 +3265,12 @@ static inline bool dir_relax_shared(struct inode *inode)
extern bool path_noexec(const struct path *path);
extern void inode_nohighmem(struct inode *inode);
+extern bool inode_inuse_trylock(struct inode *inode);
+extern void inode_inuse_unlock(struct inode *inode);
+
+static inline bool inode_inuse(struct inode *inode)
+{
+ return inode->i_state & I_INUSE;
+}
#endif /* _LINUX_FS_H */
--
2.7.4
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH v3 00/23] Overlayfs inodes index
2017-06-14 7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
2017-06-14 7:26 ` [PATCH v3 01/23] vfs: introduce inode 'inuse' lock Amir Goldstein
@ 2017-06-14 7:30 ` Miklos Szeredi
2017-06-14 7:48 ` Amir Goldstein
1 sibling, 1 reply; 4+ messages in thread
From: Miklos Szeredi @ 2017-06-14 7:30 UTC (permalink / raw)
To: Amir Goldstein; +Cc: linux-unionfs@vger.kernel.org, linux-fsdevel
On Wed, Jun 14, 2017 at 9:26 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> Miklos,
>
> I've made all the changes we discussed on v2 review and now all
> upper/lower hardlinks are consistent. Note that I chose to invalidate
> lower indexed dentries instead of updating all aliases on first copy up
> (patch 21). That seemed simpler to me. Please let me know if you see any
> culprit with that approach.
If the alias is open, it won't receive the st_ino updates through
fstat(2) since it will still have the ref to the old dentry/old inode.
So it's more correct to update the aliases, but I guess it's also more
complex.
> I also re-introduced ovl_update_type()
> (patch 8) for easier and more compact storing of the INDEX type flag.
>
> The remaining work on persistent nlink accounting and orphan index
> cleanup is left for a differnt posting. Without that complementary work,
> the index entries will remain forever. I am not aware of any other
> culprit or correctness issues that may arrise from lack of index cleanup.
>
> This work introduces the inodes index opt-in feature, which provides:
> - Hardlinks are not broken on copy up
> - Infrastructure for overlayfs NFS export
>
> The work is available on my ovl-hardlinks branch [1].
> For the curious, documentation about how overlay NFS export would work
> with the inodes index is available in the linked commit [2].
>
> The most significant change w.r.t. vfs is that with inodes index,
> overlay inodes are hashed and unique throughout the lifecycle of an
> overlay object (i.e. across copy up). This is required for not breaking
> hardlinks on copy up.
>
> Hardlink copy up tests including coverage for concurrent copy up of
> lower hardlinks and lower/upper hardlinks consistency are available on
> my xfstests dev branch [3].
>
> This work also fixes constant st_ino for samefs case for lower hardlinks,
> so the few constant st_ino tests that fail on v4.12-rc1 due to copy up of
> lower hardlinks now pass [3][4].
Thanks, will have a look though.
Miklos
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v3 00/23] Overlayfs inodes index
2017-06-14 7:30 ` [PATCH v3 00/23] Overlayfs inodes index Miklos Szeredi
@ 2017-06-14 7:48 ` Amir Goldstein
0 siblings, 0 replies; 4+ messages in thread
From: Amir Goldstein @ 2017-06-14 7:48 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: linux-unionfs@vger.kernel.org, linux-fsdevel
On Wed, Jun 14, 2017 at 10:30 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Wed, Jun 14, 2017 at 9:26 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> Miklos,
>>
>> I've made all the changes we discussed on v2 review and now all
>> upper/lower hardlinks are consistent. Note that I chose to invalidate
>> lower indexed dentries instead of updating all aliases on first copy up
>> (patch 21). That seemed simpler to me. Please let me know if you see any
>> culprit with that approach.
>
> If the alias is open, it won't receive the st_ino updates through
> fstat(2) since it will still have the ref to the old dentry/old inode.
> So it's more correct to update the aliases, but I guess it's also more
> complex.
>
fstat(2) and st_ino are not a problem, because st_ino of lower is returned
both before and after link up. The issue is with other attributes like mtime.
However, if the alias is open, then read(2) will read the lower content, so you
may as well say that returning the lower mtime is a feature...
My main concern for indexed lower inconsistency was that ovl_inode_real()
may be upper, while ovl_dentry_real() is lower, especially when checking
overlay object access permissions.
The only places I found where this inconsistency might be exposed are
ovl_get_acl() and ovl_permission() and both those cases are relevant for
lookup and not for the "open alias" case. Right?
Amir.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2017-06-14 7:48 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-06-14 7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
2017-06-14 7:26 ` [PATCH v3 01/23] vfs: introduce inode 'inuse' lock Amir Goldstein
2017-06-14 7:30 ` [PATCH v3 00/23] Overlayfs inodes index Miklos Szeredi
2017-06-14 7:48 ` Amir Goldstein
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).