* [RFC PATCH 0/6] Fix NFSv4 delegations @ 2012-01-17 16:00 J. Bruce Fields [not found] ` <1326816029-13913-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> ` (4 more replies) 0 siblings, 5 replies; 16+ messages in thread From: J. Bruce Fields @ 2012-01-17 16:00 UTC (permalink / raw) To: linux-fsdevel, linux-nfs; +Cc: J. Bruce Fields From: "J. Bruce Fields" <bfields@redhat.com> NFSv4 delegations are currently implemented as leases. However, leases don't have quite the same semantics, which means that delegations are currently enforced correctly only for conflicts between NFSv4 clients. I previously attempted to address this by "fixing" the existing lease behavior, assuming it was a bug--but from further investigation it looks like that would be the wrong thing to do for Samba. So instead I'm defining a new FL_DELEG flag to mark NFSv4 delegations, and varying behavior based on that. Delegations aren't meant to be available to userspace for now--nfsd is the only user. In theory this code allows both read and write delegations, but I'm only using read delegations for now--write delegations are a project for another day. (At which point write leases will probably need some fixing while we're at it.) Still to do: if OPEN(dirfd, name) returns a delegation to the client, then the client should be guaranteed that the opened file is still linked as (dirfd, name). But the current nfsd code allows a rename or unlink to intervene between the lookup and the request for a delegation. --b. J. Bruce Fields (6): locks: introduce new FL_DELEG lock flag. locks: give break_lease its own flags locks: break delegations on unlink locks: break delegations on rename locks: break delegations on any attribute modification locks: break delegations on link Documentation/filesystems/directory-locking | 11 +++--- fs/attr.c | 3 ++ fs/locks.c | 51 +++++++++++++++++++------- fs/namei.c | 27 ++++++++++++-- fs/nfsd/nfs4state.c | 2 +- fs/nfsd/vfs.c | 6 ++-- fs/open.c | 4 +- include/linux/fs.h | 50 ++++++++++++++++++++++++-- 8 files changed, 122 insertions(+), 32 deletions(-) -- 1.7.5.4 ^ permalink raw reply [flat|nested] 16+ messages in thread
[parent not found: <1326816029-13913-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* [RFC PATCH 1/6] locks: introduce new FL_DELEG lock flag. [not found] ` <1326816029-13913-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2012-01-17 16:00 ` J. Bruce Fields 2012-01-17 16:00 ` [RFC PATCH 3/6] locks: break delegations on unlink J. Bruce Fields 2012-01-17 16:00 ` [RFC PATCH 6/6] locks: break delegations on link J. Bruce Fields 2 siblings, 0 replies; 16+ messages in thread From: J. Bruce Fields @ 2012-01-17 16:00 UTC (permalink / raw) To: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, linux-nfs-u79uwXL29TY76Z2rM5mHXA Cc: J. Bruce Fields From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> For now FL_DELEG is just a synonym for FL_LEASE. So this patch doesn't change behavior. Next we'll modify break_lease to treat FL_DELEG leases differently, to account for the fact that NFSv4 delegations should be broken in more situations than Windows oplocks. Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> --- fs/locks.c | 2 +- fs/nfsd/nfs4state.c | 2 +- include/linux/fs.h | 1 + 3 files changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/locks.c b/fs/locks.c index 3b0d05d..bc55194 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -131,7 +131,7 @@ #define IS_POSIX(fl) (fl->fl_flags & FL_POSIX) #define IS_FLOCK(fl) (fl->fl_flags & FL_FLOCK) -#define IS_LEASE(fl) (fl->fl_flags & FL_LEASE) +#define IS_LEASE(fl) (fl->fl_flags & (FL_LEASE|FL_DELEG)) static bool lease_breaking(struct file_lock *fl) { diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index 7355fe4..207b6b8 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -2804,7 +2804,7 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_delegation *dp, int f return NULL; locks_init_lock(fl); fl->fl_lmops = &nfsd_lease_mng_ops; - fl->fl_flags = FL_LEASE; + fl->fl_flags = FL_DELEG; fl->fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK; fl->fl_end = OFFSET_MAX; fl->fl_owner = (fl_owner_t)(dp->dl_file); diff --git a/include/linux/fs.h b/include/linux/fs.h index 0c4df26..425d05d 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1074,6 +1074,7 @@ static inline int file_check_writeable(struct file *filp) #define FL_POSIX 1 #define FL_FLOCK 2 +#define FL_DELEG 4 /* NFSv4 delegation */ #define FL_ACCESS 8 /* not trying to lock, just looking */ #define FL_EXISTS 16 /* when unlocking, test for existence */ #define FL_LEASE 32 /* lease held on this file */ -- 1.7.5.4 -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [RFC PATCH 3/6] locks: break delegations on unlink [not found] ` <1326816029-13913-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2012-01-17 16:00 ` [RFC PATCH 1/6] locks: introduce new FL_DELEG lock flag J. Bruce Fields @ 2012-01-17 16:00 ` J. Bruce Fields 2012-01-17 16:00 ` [RFC PATCH 6/6] locks: break delegations on link J. Bruce Fields 2 siblings, 0 replies; 16+ messages in thread From: J. Bruce Fields @ 2012-01-17 16:00 UTC (permalink / raw) To: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, linux-nfs-u79uwXL29TY76Z2rM5mHXA Cc: J. Bruce Fields From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> A read delegation is used by NFSv4 as a guarantee that a client can perform local read opens without informing the server. The open operation takes the last component of the pathname as an argument, thus is also a lookup operation, and giving the client the above guarantee means informing the client before we allow anything that would change the set of names pointing to the inode. Therefore, we need to break delegations on rename, link, and unlink. Start with unlink. The simplest thing to do is just to use the fact that unlink always takes the i_mutex to prevent new delegations from being acquired while the unlink is in progress. The delegation is generally just an optimization--it's always OK not to give one out. So we can just do a mutex_trylock() in setlease() and fail the setlease if we don't get the lock. Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> --- fs/locks.c | 39 +++++++++++++++++++++++++++++++-------- fs/namei.c | 3 +++ 2 files changed, 34 insertions(+), 8 deletions(-) diff --git a/fs/locks.c b/fs/locks.c index d6661a6..d746a6e 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -1185,6 +1185,13 @@ static void time_out_leases(struct inode *inode) } } +static bool leases_conflict(struct file_lock *lease, struct file_lock *breaker) +{ + if ((breaker->fl_flags & FL_DELEG) && (lease->fl_flags & FL_LEASE)) + return false; + return locks_conflict(breaker, lease); +} + /** * __break_lease - revoke all outstanding leases on file * @inode: the inode of the file to return @@ -1202,9 +1209,11 @@ int __break_lease(struct inode *inode, unsigned int mode) struct file_lock *fl; unsigned long break_time; int i_have_this_lease = 0; + bool lease_conflict = false; int want_write = break_all_leases(mode); new_fl = lease_alloc(NULL, want_write ? F_WRLCK : F_RDLCK); + new_fl->fl_flags = mode & BREAK_ONLY_DELEGS ? FL_DELEG : FL_LEASE; lock_flocks(); @@ -1214,13 +1223,16 @@ int __break_lease(struct inode *inode, unsigned int mode) if ((flock == NULL) || !IS_LEASE(flock)) goto out; - if (!locks_conflict(flock, new_fl)) + for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) { + if (leases_conflict(fl, new_fl)) { + lease_conflict = true; + if (fl->fl_owner == current->files) + i_have_this_lease = 1; + } + } + if (!lease_conflict) goto out; - for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) - if (fl->fl_owner == current->files) - i_have_this_lease = 1; - if (IS_ERR(new_fl) && !i_have_this_lease && ((mode & O_NONBLOCK) == 0)) { error = PTR_ERR(new_fl); @@ -1235,6 +1247,8 @@ int __break_lease(struct inode *inode, unsigned int mode) } for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) { + if (!leases_conflict(fl, new_fl)) + continue; if (want_write) { if (fl->fl_flags & FL_UNLOCK_PENDING) continue; @@ -1276,7 +1290,7 @@ restart: */ for (flock = inode->i_flock; flock && IS_LEASE(flock); flock = flock->fl_next) { - if (locks_conflict(new_fl, flock)) + if (leases_conflict(new_fl, flock)) goto restart; } error = 0; @@ -1357,10 +1371,18 @@ int generic_add_lease(struct file *filp, long arg, struct file_lock **flp) struct file_lock *fl, **before, **my_before = NULL, *lease; struct dentry *dentry = filp->f_path.dentry; struct inode *inode = dentry->d_inode; + bool is_deleg = (*flp)->fl_flags & FL_DELEG; int error; lease = *flp; + /* + * In the delegation case we need mutual exclusion with + * a number of operations that take the i_mutex: + */ + if (is_deleg && !mutex_trylock(&inode->i_mutex)) + return -EAGAIN; + error = -EAGAIN; if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0)) goto out; @@ -1411,9 +1433,10 @@ int generic_add_lease(struct file *filp, long arg, struct file_lock **flp) goto out; locks_insert_lock(before, lease); - return 0; - + error = 0; out: + if (is_deleg) + mutex_unlock(&inode->i_mutex); return error; } diff --git a/fs/namei.c b/fs/namei.c index 5008f01..7182209 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -2787,6 +2787,9 @@ static long do_unlinkat(int dfd, const char __user *pathname) error = security_path_unlink(&nd.path, dentry); if (error) goto exit3; + error = break_lease(inode, BREAK_ONLY_DELEGS|BREAK_ALL_LEASES); + if (error) + goto exit3; error = vfs_unlink(nd.path.dentry->d_inode, dentry); exit3: mnt_drop_write(nd.path.mnt); -- 1.7.5.4 -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [RFC PATCH 6/6] locks: break delegations on link [not found] ` <1326816029-13913-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2012-01-17 16:00 ` [RFC PATCH 1/6] locks: introduce new FL_DELEG lock flag J. Bruce Fields 2012-01-17 16:00 ` [RFC PATCH 3/6] locks: break delegations on unlink J. Bruce Fields @ 2012-01-17 16:00 ` J. Bruce Fields 2 siblings, 0 replies; 16+ messages in thread From: J. Bruce Fields @ 2012-01-17 16:00 UTC (permalink / raw) To: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, linux-nfs-u79uwXL29TY76Z2rM5mHXA Cc: J. Bruce Fields From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> --- fs/namei.c | 7 +++++-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index facf295..1607e5e 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -2919,8 +2919,11 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de /* Make sure we don't allow creating hardlink to an unlinked file */ if (inode->i_nlink == 0) error = -ENOENT; - else - error = dir->i_op->link(old_dentry, dir, new_dentry); + else { + error = break_lease(inode, BREAK_ONLY_DELEGS|BREAK_ALL_LEASES); + if (!error) + error = dir->i_op->link(old_dentry, dir, new_dentry); + } mutex_unlock(&inode->i_mutex); if (!error) fsnotify_link(dir, inode, new_dentry); -- 1.7.5.4 -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [RFC PATCH 2/6] locks: give break_lease its own flags 2012-01-17 16:00 [RFC PATCH 0/6] Fix NFSv4 delegations J. Bruce Fields [not found] ` <1326816029-13913-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2012-01-17 16:00 ` J. Bruce Fields [not found] ` <1326816029-13913-3-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2012-01-17 16:00 ` [RFC PATCH 4/6] locks: break delegations on rename J. Bruce Fields ` (2 subsequent siblings) 4 siblings, 1 reply; 16+ messages in thread From: J. Bruce Fields @ 2012-01-17 16:00 UTC (permalink / raw) To: linux-fsdevel, linux-nfs; +Cc: J. Bruce Fields From: "J. Bruce Fields" <bfields@redhat.com> Instead of passing O_* flags to break_lease, give it its own set of constants to use. We'll want more bits here soon. Be careful to continue treating these bits the same way for now, so there isn't a "flag day" for break_lease callers. We'll probably change the constants eventually, though. Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- fs/locks.c | 10 +++++----- fs/nfsd/vfs.c | 6 +++--- fs/open.c | 4 ++-- include/linux/fs.h | 40 ++++++++++++++++++++++++++++++++++++++++ 4 files changed, 50 insertions(+), 10 deletions(-) diff --git a/fs/locks.c b/fs/locks.c index bc55194..d6661a6 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -1188,12 +1188,12 @@ static void time_out_leases(struct inode *inode) /** * __break_lease - revoke all outstanding leases on file * @inode: the inode of the file to return - * @mode: the open mode (read or write) + * @mode: the type of leases to break * * break_lease (inlined for speed) has checked there already is at least - * some kind of lock (maybe a lease) on this file. Leases are broken on - * a call to open() or truncate(). This function can sleep unless you - * specified %O_NONBLOCK to your open(). + * some kind of lock (maybe a lease) on this file. + * This function will wait for the lease to be broken unless you include + * BREAK_NONBLOCK in the mode. */ int __break_lease(struct inode *inode, unsigned int mode) { @@ -1202,7 +1202,7 @@ int __break_lease(struct inode *inode, unsigned int mode) struct file_lock *fl; unsigned long break_time; int i_have_this_lease = 0; - int want_write = (mode & O_ACCMODE) != O_RDONLY; + int want_write = break_all_leases(mode); new_fl = lease_alloc(NULL, want_write ? F_WRLCK : F_RDLCK); diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 8968913..a522b2f 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -276,7 +276,7 @@ static int nfsd_break_lease(struct inode *inode) { if (!S_ISREG(inode->i_mode)) return 0; - return break_lease(inode, O_WRONLY | O_NONBLOCK); + return break_lease(inode, BREAK_ALL_LEASES | BREAK_NONBLOCK); } /* @@ -731,8 +731,8 @@ static int nfsd_open_break_lease(struct inode *inode, int access) if (access & NFSD_MAY_NOT_BREAK_LEASE) return 0; - mode = (access & NFSD_MAY_WRITE) ? O_WRONLY : O_RDONLY; - return break_lease(inode, mode | O_NONBLOCK); + mode = (access & NFSD_MAY_WRITE) ? BREAK_ALL_LEASES : BREAK_WRITE_LEASES; + return break_lease(inode, mode | BREAK_NONBLOCK); } /* diff --git a/fs/open.c b/fs/open.c index 22c41b5..b7d55e2 100644 --- a/fs/open.c +++ b/fs/open.c @@ -105,7 +105,7 @@ static long do_sys_truncate(const char __user *pathname, loff_t length) * Make sure that there are no leases. get_write_access() protects * against the truncate racing with a lease-granting setlease(). */ - error = break_lease(inode, O_WRONLY); + error = break_lease(inode, BREAK_ALL_LEASES); if (error) goto put_write_and_out; @@ -685,7 +685,7 @@ static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt, if (error) goto cleanup_all; - error = break_lease(inode, f->f_flags); + error = open_break_lease(inode, f->f_flags); if (error) goto cleanup_all; diff --git a/include/linux/fs.h b/include/linux/fs.h index 425d05d..939cb3b 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1083,6 +1083,35 @@ static inline int file_check_writeable(struct file *filp) #define FL_DOWNGRADE_PENDING 256 /* Lease is being downgraded */ #define FL_UNLOCK_PENDING 512 /* Lease is being broken */ +/* break-lease flags */ +/* + * These agree with O_RDONLY, O_WRONLY, O_ACCMODE, and O_NONBLOCK as a + * (possibly temporary) courtesy to developers of any out-of-tree + * code that calls break_lease: + */ +#define BREAK_WRITE_LEASES 00000000 +#define BREAK_ALL_LEASES 00000001 +#define BREAK_TYPE_MASK 00000003 +#define BREAK_NONBLOCK 00004000 +/* + * Read delegations are also broken on operations that modify file + * metadata or the set of links that point to a file: + */ +#define BREAK_ONLY_DELEGS 00000010 + +/* + * Make it easy to allow O_RDONLY, O_RDWR, O_WRONLY to continue to work + * as they did before, for now: + */ +static inline bool break_write_leases(unsigned int type) +{ + return ((type & BREAK_TYPE_MASK) == BREAK_WRITE_LEASES); +} +static inline bool break_all_leases(unsigned int type) +{ + return ((type & BREAK_TYPE_MASK) == BREAK_ALL_LEASES); +} + /* * Special return value from posix_lock_file() and vfs_lock_file() for * asynchronous locking. @@ -2005,6 +2034,13 @@ static inline int break_lease(struct inode *inode, unsigned int mode) return __break_lease(inode, mode); return 0; } + +static inline int open_break_lease(struct inode *inode, unsigned int mode) +{ + unsigned int type = (mode & O_ACCMODE) == O_RDONLY ? + BREAK_WRITE_LEASES : BREAK_ALL_LEASES; + return __break_lease(inode, type); +} #else /* !CONFIG_FILE_LOCKING */ static inline int locks_mandatory_locked(struct inode *inode) { @@ -2044,6 +2080,10 @@ static inline int break_lease(struct inode *inode, unsigned int mode) return 0; } +static inline int open_break_lease(struct inode *inode, unsigned int mode) +{ + return 0; +} #endif /* CONFIG_FILE_LOCKING */ /* fs/open.c */ -- 1.7.5.4 ^ permalink raw reply related [flat|nested] 16+ messages in thread
[parent not found: <1326816029-13913-3-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Bypass encrypt and decrypt data in dm-crypt [not found] ` <1326816029-13913-3-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2012-01-17 16:17 ` Fan Zhang 0 siblings, 0 replies; 16+ messages in thread From: Fan Zhang @ 2012-01-17 16:17 UTC (permalink / raw) To: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org All: We are using dm-crypt for Android device encryption. However, we need reserve some sectors in block device for status and integration check and do not want to encrypt/decrypt some sectors when using dm-crypt. So in crypt_convert_block() When offset sector of ctx + sector number of bio_in is the range of bypass sector list. instead call if (bio_data_dir(ctx->bio_in) == WRITE) r = crypt_copy_write_data(bv_in, bv_out, offset, 1 << SECTOR_SHIFT); else r = crypt_copy_read_data(bv_in, bv_out, offset, 1 << SECTOR_SHIFT); I want to call another function to copy data of a sector from ctx->bio_in to ctx->bio_out directly. I tried the following implementation struct bio_vec *bv_in = bio_iovec_idx(ctx->bio_in, ctx->idx_in); struct bio_vec *bv_out = bio_iovec_idx(ctx->bio_out, ctx->idx_out); struct page * page_in = bv_in->bv_page; struct page * page_out = bv_out->bv_page; void * addr1 = kmap_atomic(page_in, KM_USER0); void * addr2 = kmap_atomic(page_out, KM_USER1); unsigned int offset = ctx->offset_in; memcpy(addr2 + offset, addr1 + offset, 1 << SECTOR_SHIFT); kunmap_atomic(addr2, KM_USER1); kunmap_atomic(addr1, KM_USER0); but above implementation does not work. Could you give me a kint to reslove this issue? Thanks Fan -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 16+ messages in thread
* [RFC PATCH 4/6] locks: break delegations on rename 2012-01-17 16:00 [RFC PATCH 0/6] Fix NFSv4 delegations J. Bruce Fields [not found] ` <1326816029-13913-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2012-01-17 16:00 ` [RFC PATCH 2/6] locks: give break_lease its own flags J. Bruce Fields @ 2012-01-17 16:00 ` J. Bruce Fields 2012-01-17 16:00 ` [RFC PATCH 5/6] locks: break delegations on any attribute modification J. Bruce Fields 2012-02-03 20:58 ` [RFC PATCH 0/6] Fix NFSv4 delegations J. Bruce Fields 4 siblings, 0 replies; 16+ messages in thread From: J. Bruce Fields @ 2012-01-17 16:00 UTC (permalink / raw) To: linux-fsdevel, linux-nfs; +Cc: J. Bruce Fields From: "J. Bruce Fields" <bfields@redhat.com> To rely on the i_mutex for exclusion between setlease and rename, we need rename to take the i_mutex on the source as well as on any possible target. Also fix up lockdep and the Documentation/filesystems/directory-locking documentation. (I need to review the latter one more time to make sure I've got it right.) Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- Documentation/filesystems/directory-locking | 11 ++++++----- fs/namei.c | 17 +++++++++++++++-- include/linux/fs.h | 9 ++++++--- 3 files changed, 27 insertions(+), 10 deletions(-) diff --git a/Documentation/filesystems/directory-locking b/Documentation/filesystems/directory-locking index ff7b611..c51cbed 100644 --- a/Documentation/filesystems/directory-locking +++ b/Documentation/filesystems/directory-locking @@ -12,8 +12,8 @@ kinds of locks - per-inode (->i_mutex) and per-filesystem locks victim and calls the method. 4) rename() that is _not_ cross-directory. Locking rules: caller locks -the parent, finds source and target, if target already exists - locks it -and then calls the method. +the parent, finds source and target, locks source, also locks target if +it already exists, and then calls the method. 5) link creation. Locking rules: * lock parent @@ -30,6 +30,7 @@ rules: fail with -ENOTEMPTY * if new parent is equal to or is a descendent of source fail with -ELOOP + * lock source if it is not a directory. * if target exists - lock it. * call the method. @@ -56,9 +57,9 @@ objects - A < B iff A is an ancestor of B. renames will be blocked on filesystem lock and we don't start changing the order until we had acquired all locks). -(3) any operation holds at most one lock on non-directory object and - that lock is acquired after all other locks. (Proof: see descriptions - of operations). +(3) locks on non-directory objects are acquired only after taking locks + on their parents (which remain their parents by (1) and (2)). + (Proof: see descriptions of operations). Now consider the minimal deadlock. Each process is blocked on attempt to acquire some lock and already holds at least one lock. Let's diff --git a/fs/namei.c b/fs/namei.c index 7182209..facf295 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -3075,6 +3075,7 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry, struct inode *new_dir, struct dentry *new_dentry) { struct inode *target = new_dentry->d_inode; + struct inode *source = old_dentry->d_inode; int error; error = security_inode_rename(old_dir, old_dentry, new_dir, new_dentry); @@ -3082,13 +3083,23 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry, return error; dget(new_dentry); - if (target) + mutex_lock_nested(&source->i_mutex, I_MUTEX_RENAME_SOURCE); + error = break_lease(source, BREAK_ONLY_DELEGS|BREAK_ALL_LEASES); + if (error) + goto out_unlock_source; + if (target) { mutex_lock(&target->i_mutex); - + error = break_lease(target, BREAK_ONLY_DELEGS|BREAK_ALL_LEASES); + if (error) + goto out; + } error = -EBUSY; if (d_mountpoint(old_dentry)||d_mountpoint(new_dentry)) goto out; + error = break_lease(old_dentry->d_inode, BREAK_ONLY_DELEGS|BREAK_ALL_LEASES); + if (error) + goto out; error = old_dir->i_op->rename(old_dir, old_dentry, new_dir, new_dentry); if (error) goto out; @@ -3100,6 +3111,8 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry, out: if (target) mutex_unlock(&target->i_mutex); +out_unlock_source: + mutex_unlock(&source->i_mutex); dput(new_dentry); return error; } diff --git a/include/linux/fs.h b/include/linux/fs.h index 939cb3b..dccec8c 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -848,10 +848,12 @@ static inline int inode_unhashed(struct inode *inode) * 0: the object of the current VFS operation * 1: parent * 2: child/target - * 3: quota file + * 3: xattr + * 4: quota file + * 5: the file being renamed (used only in rename of a non-directory) * * The locking order between these classes is - * parent -> child -> normal -> xattr -> quota + * parent -> child -> rename_source -> normal -> xattr -> quota */ enum inode_i_mutex_lock_class { @@ -859,7 +861,8 @@ enum inode_i_mutex_lock_class I_MUTEX_PARENT, I_MUTEX_CHILD, I_MUTEX_XATTR, - I_MUTEX_QUOTA + I_MUTEX_QUOTA, + I_MUTEX_RENAME_SOURCE }; /* -- 1.7.5.4 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [RFC PATCH 5/6] locks: break delegations on any attribute modification 2012-01-17 16:00 [RFC PATCH 0/6] Fix NFSv4 delegations J. Bruce Fields ` (2 preceding siblings ...) 2012-01-17 16:00 ` [RFC PATCH 4/6] locks: break delegations on rename J. Bruce Fields @ 2012-01-17 16:00 ` J. Bruce Fields 2012-02-03 20:58 ` [RFC PATCH 0/6] Fix NFSv4 delegations J. Bruce Fields 4 siblings, 0 replies; 16+ messages in thread From: J. Bruce Fields @ 2012-01-17 16:00 UTC (permalink / raw) To: linux-fsdevel, linux-nfs; +Cc: J. Bruce Fields From: "J. Bruce Fields" <bfields@redhat.com> NFSv4 uses leases to guarantee that clients can cache metadata as well as data. This covers chmod, chown, etc. Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- fs/attr.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/fs/attr.c b/fs/attr.c index 7ee7ba4..15a2a09 100644 --- a/fs/attr.c +++ b/fs/attr.c @@ -232,6 +232,9 @@ int notify_change(struct dentry * dentry, struct iattr * attr) error = security_inode_setattr(dentry, attr); if (error) return error; + error = break_lease(inode, BREAK_ONLY_DELEGS|BREAK_ALL_LEASES); + if (error) + return error; if (inode->i_op->setattr) error = inode->i_op->setattr(dentry, attr); -- 1.7.5.4 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [RFC PATCH 0/6] Fix NFSv4 delegations 2012-01-17 16:00 [RFC PATCH 0/6] Fix NFSv4 delegations J. Bruce Fields ` (3 preceding siblings ...) 2012-01-17 16:00 ` [RFC PATCH 5/6] locks: break delegations on any attribute modification J. Bruce Fields @ 2012-02-03 20:58 ` J. Bruce Fields 2012-02-03 21:09 ` [PATCH 1/6] locks: introduce new FL_DELEG lock flag J. Bruce Fields ` (3 more replies) 4 siblings, 4 replies; 16+ messages in thread From: J. Bruce Fields @ 2012-02-03 20:58 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-fsdevel, linux-nfs On Tue, Jan 17, 2012 at 11:00:23AM -0500, J. Bruce Fields wrote: > From: "J. Bruce Fields" <bfields@redhat.com> > > NFSv4 delegations are currently implemented as leases. However, leases > don't have quite the same semantics, which means that delegations are > currently enforced correctly only for conflicts between NFSv4 clients. > > I previously attempted to address this by "fixing" the existing lease > behavior, assuming it was a bug--but from further investigation it looks > like that would be the wrong thing to do for Samba. > > So instead I'm defining a new FL_DELEG flag to mark NFSv4 delegations, > and varying behavior based on that. Delegations aren't meant to be > available to userspace for now--nfsd is the only user. > > In theory this code allows both read and write delegations, but I'm only > using read delegations for now--write delegations are a project for > another day. (At which point write leases will probably need some > fixing while we're at it.) > > Still to do: if OPEN(dirfd, name) returns a delegation to the client, > then the client should be guaranteed that the opened file is still > linked as (dirfd, name). But the current nfsd code allows a rename or > unlink to intervene between the lookup and the request for a delegation. OK, I decided I could fix that in the nfsd4 open code by retrying the lookup after obtaining the delegation, and just dropping the delegation (they're always optional) if it turned out there'd been a race. I'll post revised vfs patches (without that extra nfsd stuff). --b. ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 1/6] locks: introduce new FL_DELEG lock flag. 2012-02-03 20:58 ` [RFC PATCH 0/6] Fix NFSv4 delegations J. Bruce Fields @ 2012-02-03 21:09 ` J. Bruce Fields [not found] ` <20120203205818.GE2999-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> ` (2 subsequent siblings) 3 siblings, 0 replies; 16+ messages in thread From: J. Bruce Fields @ 2012-02-03 21:09 UTC (permalink / raw) To: Al Viro; +Cc: linux-nfs, linux-fsdevel, Christoph Hellwig, J. Bruce Fields From: "J. Bruce Fields" <bfields@redhat.com> For now FL_DELEG is just a synonym for FL_LEASE. So this patch doesn't change behavior. Next we'll modify break_lease to treat FL_DELEG leases differently, to account for the fact that NFSv4 delegations should be broken in more situations than Windows oplocks. Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- fs/locks.c | 2 +- fs/nfsd/nfs4state.c | 2 +- include/linux/fs.h | 1 + 3 files changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/locks.c b/fs/locks.c index 637694b..ea07a98 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -131,7 +131,7 @@ #define IS_POSIX(fl) (fl->fl_flags & FL_POSIX) #define IS_FLOCK(fl) (fl->fl_flags & FL_FLOCK) -#define IS_LEASE(fl) (fl->fl_flags & FL_LEASE) +#define IS_LEASE(fl) (fl->fl_flags & (FL_LEASE|FL_DELEG)) static bool lease_breaking(struct file_lock *fl) { diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index e8c98f0..ad82255 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -2804,7 +2804,7 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_delegation *dp, int f return NULL; locks_init_lock(fl); fl->fl_lmops = &nfsd_lease_mng_ops; - fl->fl_flags = FL_LEASE; + fl->fl_flags = FL_DELEG; fl->fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK; fl->fl_end = OFFSET_MAX; fl->fl_owner = (fl_owner_t)(dp->dl_file); diff --git a/include/linux/fs.h b/include/linux/fs.h index 386da09..234b086 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1082,6 +1082,7 @@ static inline int file_check_writeable(struct file *filp) #define FL_POSIX 1 #define FL_FLOCK 2 +#define FL_DELEG 4 /* NFSv4 delegation */ #define FL_ACCESS 8 /* not trying to lock, just looking */ #define FL_EXISTS 16 /* when unlocking, test for existence */ #define FL_LEASE 32 /* lease held on this file */ -- 1.7.5.4 ^ permalink raw reply related [flat|nested] 16+ messages in thread
[parent not found: <20120203205818.GE2999-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>]
* Provide vfs support for NFSv4 delegations [not found] ` <20120203205818.GE2999-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2012-02-03 21:09 ` J. Bruce Fields 2012-02-03 21:09 ` [PATCH 2/6] locks: give break_lease its own flags J. Bruce Fields ` (2 subsequent siblings) 3 siblings, 0 replies; 16+ messages in thread From: J. Bruce Fields @ 2012-02-03 21:09 UTC (permalink / raw) To: Al Viro Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Christoph Hellwig The following patches provide vfs support for NFSv4 delegations. If they look sane then I'd like to get some version of them merged into 3.4. I'm assuming Al would be the one to do that. Currently nfsd implements delegations as leases. However, leases don't have quite the same semantics, which means that delegations are currently enforced correctly only for conflicts between NFSv4 clients. So, I define a new FL_DELEG flag to mark NFSv4 delegations. In most respects delegations behave just like leases, but they need to be broken on more operations. Delegations aren't exposed to userspace; nfsd is the only user. In theory this code allows both read and write delegations, but I'm only using read delegations for now--write delegations are a project for another day. (At which point write leases will probably need some fixing too, while we're at it.) --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 2/6] locks: give break_lease its own flags [not found] ` <20120203205818.GE2999-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2012-02-03 21:09 ` Provide vfs support for NFSv4 delegations J. Bruce Fields @ 2012-02-03 21:09 ` J. Bruce Fields 2012-02-03 21:09 ` [PATCH 3/6] locks: break delegations on unlink J. Bruce Fields 2012-02-03 21:09 ` [PATCH 6/6] locks: break delegations on link J. Bruce Fields 3 siblings, 0 replies; 16+ messages in thread From: J. Bruce Fields @ 2012-02-03 21:09 UTC (permalink / raw) To: Al Viro Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Christoph Hellwig, J. Bruce Fields From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> Instead of passing O_* flags to break_lease, give it its own set of constants to use. We'll want more bits here soon. Be careful to continue treating these bits the same way for now, so there isn't a "flag day" for break_lease callers. We'll probably change the constants eventually, though. Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> --- fs/locks.c | 10 +++++----- fs/nfsd/vfs.c | 6 +++--- fs/open.c | 4 ++-- include/linux/fs.h | 27 +++++++++++++++++++++++++++ 4 files changed, 37 insertions(+), 10 deletions(-) diff --git a/fs/locks.c b/fs/locks.c index ea07a98..5f9ae63 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -1188,12 +1188,12 @@ static void time_out_leases(struct inode *inode) /** * __break_lease - revoke all outstanding leases on file * @inode: the inode of the file to return - * @mode: the open mode (read or write) + * @mode: the type of leases to break * * break_lease (inlined for speed) has checked there already is at least - * some kind of lock (maybe a lease) on this file. Leases are broken on - * a call to open() or truncate(). This function can sleep unless you - * specified %O_NONBLOCK to your open(). + * some kind of lock (maybe a lease) on this file. + * This function will wait for the lease to be broken unless you include + * BREAK_NONBLOCK in the mode. */ int __break_lease(struct inode *inode, unsigned int mode) { @@ -1202,7 +1202,7 @@ int __break_lease(struct inode *inode, unsigned int mode) struct file_lock *fl; unsigned long break_time; int i_have_this_lease = 0; - int want_write = (mode & O_ACCMODE) != O_RDONLY; + int want_write = (mode & BREAK_RW_MASK) != BREAK_W_LEASES; new_fl = lease_alloc(NULL, want_write ? F_WRLCK : F_RDLCK); if (IS_ERR(new_fl)) diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index edf6d3e..4ef03536 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -276,7 +276,7 @@ static int nfsd_break_lease(struct inode *inode) { if (!S_ISREG(inode->i_mode)) return 0; - return break_lease(inode, O_WRONLY | O_NONBLOCK); + return break_lease(inode, BREAK_R_AND_W_LEASES | BREAK_NONBLOCK); } /* @@ -731,8 +731,8 @@ static int nfsd_open_break_lease(struct inode *inode, int access) if (access & NFSD_MAY_NOT_BREAK_LEASE) return 0; - mode = (access & NFSD_MAY_WRITE) ? O_WRONLY : O_RDONLY; - return break_lease(inode, mode | O_NONBLOCK); + mode = (access & NFSD_MAY_WRITE) ? BREAK_R_AND_W_LEASES : BREAK_W_LEASES; + return break_lease(inode, mode | BREAK_NONBLOCK); } /* diff --git a/fs/open.c b/fs/open.c index 77becc0..9e45cd5 100644 --- a/fs/open.c +++ b/fs/open.c @@ -105,7 +105,7 @@ static long do_sys_truncate(const char __user *pathname, loff_t length) * Make sure that there are no leases. get_write_access() protects * against the truncate racing with a lease-granting setlease(). */ - error = break_lease(inode, O_WRONLY); + error = break_lease(inode, BREAK_R_AND_W_LEASES); if (error) goto put_write_and_out; @@ -685,7 +685,7 @@ static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt, if (error) goto cleanup_all; - error = break_lease(inode, f->f_flags); + error = open_break_lease(inode, f->f_flags); if (error) goto cleanup_all; diff --git a/include/linux/fs.h b/include/linux/fs.h index 234b086..89c8481 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1091,6 +1091,22 @@ static inline int file_check_writeable(struct file *filp) #define FL_DOWNGRADE_PENDING 256 /* Lease is being downgraded */ #define FL_UNLOCK_PENDING 512 /* Lease is being broken */ +/* break-lease flags */ +/* + * These agree with O_RDONLY, O_WRONLY, O_ACCMODE, and O_NONBLOCK as a + * (possibly temporary) courtesy to developers of any out-of-tree + * code that calls break_lease: + */ +#define BREAK_W_LEASES 00000000 +#define BREAK_R_AND_W_LEASES 00000001 +#define BREAK_RW_MASK 00000003 +#define BREAK_NONBLOCK 00004000 +/* + * Read delegations are also broken on operations that modify file + * metadata or the set of links that point to a file: + */ +#define BREAK_ONLY_DELEGS 00000010 + /* * Special return value from posix_lock_file() and vfs_lock_file() for * asynchronous locking. @@ -1972,6 +1988,13 @@ static inline int break_lease(struct inode *inode, unsigned int mode) return __break_lease(inode, mode); return 0; } + +static inline int open_break_lease(struct inode *inode, unsigned int mode) +{ + unsigned int type = (mode & O_ACCMODE) == O_RDONLY ? + BREAK_W_LEASES : BREAK_R_AND_W_LEASES; + return __break_lease(inode, type); +} #else /* !CONFIG_FILE_LOCKING */ static inline int locks_mandatory_locked(struct inode *inode) { @@ -2011,6 +2034,10 @@ static inline int break_lease(struct inode *inode, unsigned int mode) return 0; } +static inline int open_break_lease(struct inode *inode, unsigned int mode) +{ + return 0; +} #endif /* CONFIG_FILE_LOCKING */ /* fs/open.c */ -- 1.7.5.4 -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 3/6] locks: break delegations on unlink [not found] ` <20120203205818.GE2999-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2012-02-03 21:09 ` Provide vfs support for NFSv4 delegations J. Bruce Fields 2012-02-03 21:09 ` [PATCH 2/6] locks: give break_lease its own flags J. Bruce Fields @ 2012-02-03 21:09 ` J. Bruce Fields 2012-02-03 21:09 ` [PATCH 6/6] locks: break delegations on link J. Bruce Fields 3 siblings, 0 replies; 16+ messages in thread From: J. Bruce Fields @ 2012-02-03 21:09 UTC (permalink / raw) To: Al Viro Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Christoph Hellwig, J. Bruce Fields From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> A read delegation is used by NFSv4 as a guarantee that a client can perform local read opens without informing the server. The open operation takes the last component of the pathname as an argument, thus is also a lookup operation, and giving the client the above guarantee means informing the client before we allow anything that would change the set of names pointing to the inode. Therefore, we need to break delegations on rename, link, and unlink. Start with unlink. The simplest thing to do is just to use the fact that unlink always takes the i_mutex to prevent new delegations from being acquired while the unlink is in progress. The delegation is generally just an optimization--it's always OK not to give one out. So we can just do a mutex_trylock() in setlease() and fail the setlease if we don't get the lock. Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> --- fs/locks.c | 42 ++++++++++++++++++++++++++++++++++-------- fs/namei.c | 3 +++ 2 files changed, 37 insertions(+), 8 deletions(-) diff --git a/fs/locks.c b/fs/locks.c index 5f9ae63..fce9760 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -1185,6 +1185,13 @@ static void time_out_leases(struct inode *inode) } } +static bool leases_conflict(struct file_lock *lease, struct file_lock *breaker) +{ + if ((breaker->fl_flags & FL_DELEG) && (lease->fl_flags & FL_LEASE)) + return false; + return locks_conflict(breaker, lease); +} + /** * __break_lease - revoke all outstanding leases on file * @inode: the inode of the file to return @@ -1202,11 +1209,13 @@ int __break_lease(struct inode *inode, unsigned int mode) struct file_lock *fl; unsigned long break_time; int i_have_this_lease = 0; + bool lease_conflict = false; int want_write = (mode & BREAK_RW_MASK) != BREAK_W_LEASES; new_fl = lease_alloc(NULL, want_write ? F_WRLCK : F_RDLCK); if (IS_ERR(new_fl)) return PTR_ERR(new_fl); + new_fl->fl_flags = mode & BREAK_ONLY_DELEGS ? FL_DELEG : FL_LEASE; lock_flocks(); @@ -1216,13 +1225,16 @@ int __break_lease(struct inode *inode, unsigned int mode) if ((flock == NULL) || !IS_LEASE(flock)) goto out; - if (!locks_conflict(flock, new_fl)) + for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) { + if (leases_conflict(fl, new_fl)) { + lease_conflict = true; + if (fl->fl_owner == current->files) + i_have_this_lease = 1; + } + } + if (!lease_conflict) goto out; - for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) - if (fl->fl_owner == current->files) - i_have_this_lease = 1; - break_time = 0; if (lease_break_time > 0) { break_time = jiffies + lease_break_time * HZ; @@ -1231,6 +1243,8 @@ int __break_lease(struct inode *inode, unsigned int mode) } for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) { + if (!leases_conflict(fl, new_fl)) + continue; if (want_write) { if (fl->fl_flags & FL_UNLOCK_PENDING) continue; @@ -1272,7 +1286,7 @@ restart: */ for (flock = inode->i_flock; flock && IS_LEASE(flock); flock = flock->fl_next) { - if (locks_conflict(new_fl, flock)) + if (leases_conflict(new_fl, flock)) goto restart; } error = 0; @@ -1352,9 +1366,20 @@ int generic_add_lease(struct file *filp, long arg, struct file_lock **flp) struct file_lock *fl, **before, **my_before = NULL, *lease; struct dentry *dentry = filp->f_path.dentry; struct inode *inode = dentry->d_inode; + bool is_deleg = (*flp)->fl_flags & FL_DELEG; int error; lease = *flp; + /* + * In the delegation case we need mutual exclusion with + * a number of operations that take the i_mutex. We trylock + * because delegations are an optional optimization, and if + * there's some chance of a conflict--we'd rather not + * bother, maybe that's a sign this just isn't a good file to + * hand out a delegation on. + */ + if (is_deleg && !mutex_trylock(&inode->i_mutex)) + return -EAGAIN; error = -EAGAIN; if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0)) @@ -1406,9 +1431,10 @@ int generic_add_lease(struct file *filp, long arg, struct file_lock **flp) goto out; locks_insert_lock(before, lease); - return 0; - + error = 0; out: + if (is_deleg) + mutex_unlock(&inode->i_mutex); return error; } diff --git a/fs/namei.c b/fs/namei.c index 208c6aa..4516958 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -2788,6 +2788,9 @@ static long do_unlinkat(int dfd, const char __user *pathname) error = security_path_unlink(&nd.path, dentry); if (error) goto exit3; + error = break_lease(inode, BREAK_ONLY_DELEGS|BREAK_R_AND_W_LEASES); + if (error) + goto exit3; error = vfs_unlink(nd.path.dentry->d_inode, dentry); exit3: mnt_drop_write(nd.path.mnt); -- 1.7.5.4 -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 6/6] locks: break delegations on link [not found] ` <20120203205818.GE2999-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> ` (2 preceding siblings ...) 2012-02-03 21:09 ` [PATCH 3/6] locks: break delegations on unlink J. Bruce Fields @ 2012-02-03 21:09 ` J. Bruce Fields 3 siblings, 0 replies; 16+ messages in thread From: J. Bruce Fields @ 2012-02-03 21:09 UTC (permalink / raw) To: Al Viro Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Christoph Hellwig, J. Bruce Fields From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> --- fs/namei.c | 7 +++++-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index 1e0c383..b8f3cbd 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -2920,8 +2920,11 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de /* Make sure we don't allow creating hardlink to an unlinked file */ if (inode->i_nlink == 0) error = -ENOENT; - else - error = dir->i_op->link(old_dentry, dir, new_dentry); + else { + error = break_lease(inode, BREAK_ONLY_DELEGS|BREAK_R_AND_W_LEASES); + if (!error) + error = dir->i_op->link(old_dentry, dir, new_dentry); + } mutex_unlock(&inode->i_mutex); if (!error) fsnotify_link(dir, inode, new_dentry); -- 1.7.5.4 -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 4/6] locks: break delegations on rename 2012-02-03 20:58 ` [RFC PATCH 0/6] Fix NFSv4 delegations J. Bruce Fields 2012-02-03 21:09 ` [PATCH 1/6] locks: introduce new FL_DELEG lock flag J. Bruce Fields [not found] ` <20120203205818.GE2999-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2012-02-03 21:09 ` J. Bruce Fields 2012-02-03 21:09 ` [PATCH 5/6] locks: break delegations on any attribute modification J. Bruce Fields 3 siblings, 0 replies; 16+ messages in thread From: J. Bruce Fields @ 2012-02-03 21:09 UTC (permalink / raw) To: Al Viro; +Cc: linux-nfs, linux-fsdevel, Christoph Hellwig, J. Bruce Fields From: "J. Bruce Fields" <bfields@redhat.com> To rely on the i_mutex for exclusion between setlease and rename, we need rename to take the i_mutex on the source as well as on any possible target. Also fix up lockdep and the Documentation/filesystems/directory-locking documentation. Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- Documentation/filesystems/directory-locking | 11 ++++++----- fs/namei.c | 12 ++++++++++++ include/linux/fs.h | 9 ++++++--- 3 files changed, 24 insertions(+), 8 deletions(-) diff --git a/Documentation/filesystems/directory-locking b/Documentation/filesystems/directory-locking index ff7b611..9edbcd2 100644 --- a/Documentation/filesystems/directory-locking +++ b/Documentation/filesystems/directory-locking @@ -12,8 +12,8 @@ kinds of locks - per-inode (->i_mutex) and per-filesystem locks victim and calls the method. 4) rename() that is _not_ cross-directory. Locking rules: caller locks -the parent, finds source and target, if target already exists - locks it -and then calls the method. +the parent, finds source and target, locks source, also locks target if +it already exists, and then calls the method. 5) link creation. Locking rules: * lock parent @@ -30,6 +30,7 @@ rules: fail with -ENOTEMPTY * if new parent is equal to or is a descendent of source fail with -ELOOP + * lock source if it is not a directory. * if target exists - lock it. * call the method. @@ -56,9 +57,9 @@ objects - A < B iff A is an ancestor of B. renames will be blocked on filesystem lock and we don't start changing the order until we had acquired all locks). -(3) any operation holds at most one lock on non-directory object and - that lock is acquired after all other locks. (Proof: see descriptions - of operations). +(3) locks on non-directory objects are acquired only after taking locks + on their parents (which remain their parents until all locks are + acquired, by (1) and (2)). (Proof: see descriptions of operations). Now consider the minimal deadlock. Each process is blocked on attempt to acquire some lock and already holds at least one lock. Let's diff --git a/fs/namei.c b/fs/namei.c index 4516958..1e0c383 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -3076,6 +3076,7 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry, struct inode *new_dir, struct dentry *new_dentry) { struct inode *target = new_dentry->d_inode; + struct inode *source = old_dentry->d_inode; int error; error = security_inode_rename(old_dir, old_dentry, new_dir, new_dentry); @@ -3083,6 +3084,7 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry, return error; dget(new_dentry); + mutex_lock_nested(&source->i_mutex, I_MUTEX_RENAME_SOURCE); if (target) mutex_lock(&target->i_mutex); @@ -3090,6 +3092,14 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry, if (d_mountpoint(old_dentry)||d_mountpoint(new_dentry)) goto out; + error = break_lease(source, BREAK_ONLY_DELEGS|BREAK_R_AND_W_LEASES); + if (error) + goto out; + if (target) { + error = break_lease(target, BREAK_ONLY_DELEGS|BREAK_R_AND_W_LEASES); + if (error) + goto out; + } error = old_dir->i_op->rename(old_dir, old_dentry, new_dir, new_dentry); if (error) goto out; @@ -3101,6 +3111,8 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry, out: if (target) mutex_unlock(&target->i_mutex); +out_unlock_source: + mutex_unlock(&source->i_mutex); dput(new_dentry); return error; } diff --git a/include/linux/fs.h b/include/linux/fs.h index 89c8481..f6a09b4 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -855,10 +855,12 @@ static inline int inode_unhashed(struct inode *inode) * 0: the object of the current VFS operation * 1: parent * 2: child/target - * 3: quota file + * 3: xattr + * 4: quota file + * 5: the file being renamed (used only in rename of a non-directory) * * The locking order between these classes is - * parent -> child -> normal -> xattr -> quota + * parent -> child -> rename_source -> normal -> xattr -> quota */ enum inode_i_mutex_lock_class { @@ -866,7 +868,8 @@ enum inode_i_mutex_lock_class I_MUTEX_PARENT, I_MUTEX_CHILD, I_MUTEX_XATTR, - I_MUTEX_QUOTA + I_MUTEX_QUOTA, + I_MUTEX_RENAME_SOURCE }; /* -- 1.7.5.4 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 5/6] locks: break delegations on any attribute modification 2012-02-03 20:58 ` [RFC PATCH 0/6] Fix NFSv4 delegations J. Bruce Fields ` (2 preceding siblings ...) 2012-02-03 21:09 ` [PATCH 4/6] locks: break delegations on rename J. Bruce Fields @ 2012-02-03 21:09 ` J. Bruce Fields 3 siblings, 0 replies; 16+ messages in thread From: J. Bruce Fields @ 2012-02-03 21:09 UTC (permalink / raw) To: Al Viro; +Cc: linux-nfs, linux-fsdevel, Christoph Hellwig, J. Bruce Fields From: "J. Bruce Fields" <bfields@redhat.com> NFSv4 uses leases to guarantee that clients can cache metadata as well as data. This covers chmod, chown, etc. Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- fs/attr.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/fs/attr.c b/fs/attr.c index 95053ad..ac97995 100644 --- a/fs/attr.c +++ b/fs/attr.c @@ -232,6 +232,9 @@ int notify_change(struct dentry * dentry, struct iattr * attr) error = security_inode_setattr(dentry, attr); if (error) return error; + error = break_lease(inode, BREAK_ONLY_DELEGS|BREAK_R_AND_W_LEASES); + if (error) + return error; if (inode->i_op->setattr) error = inode->i_op->setattr(dentry, attr); -- 1.7.5.4 ^ permalink raw reply related [flat|nested] 16+ messages in thread
end of thread, other threads:[~2012-02-03 21:10 UTC | newest] Thread overview: 16+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-01-17 16:00 [RFC PATCH 0/6] Fix NFSv4 delegations J. Bruce Fields [not found] ` <1326816029-13913-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2012-01-17 16:00 ` [RFC PATCH 1/6] locks: introduce new FL_DELEG lock flag J. Bruce Fields 2012-01-17 16:00 ` [RFC PATCH 3/6] locks: break delegations on unlink J. Bruce Fields 2012-01-17 16:00 ` [RFC PATCH 6/6] locks: break delegations on link J. Bruce Fields 2012-01-17 16:00 ` [RFC PATCH 2/6] locks: give break_lease its own flags J. Bruce Fields [not found] ` <1326816029-13913-3-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2012-01-17 16:17 ` Bypass encrypt and decrypt data in dm-crypt Fan Zhang 2012-01-17 16:00 ` [RFC PATCH 4/6] locks: break delegations on rename J. Bruce Fields 2012-01-17 16:00 ` [RFC PATCH 5/6] locks: break delegations on any attribute modification J. Bruce Fields 2012-02-03 20:58 ` [RFC PATCH 0/6] Fix NFSv4 delegations J. Bruce Fields 2012-02-03 21:09 ` [PATCH 1/6] locks: introduce new FL_DELEG lock flag J. Bruce Fields [not found] ` <20120203205818.GE2999-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2012-02-03 21:09 ` Provide vfs support for NFSv4 delegations J. Bruce Fields 2012-02-03 21:09 ` [PATCH 2/6] locks: give break_lease its own flags J. Bruce Fields 2012-02-03 21:09 ` [PATCH 3/6] locks: break delegations on unlink J. Bruce Fields 2012-02-03 21:09 ` [PATCH 6/6] locks: break delegations on link J. Bruce Fields 2012-02-03 21:09 ` [PATCH 4/6] locks: break delegations on rename J. Bruce Fields 2012-02-03 21:09 ` [PATCH 5/6] locks: break delegations on any attribute modification J. Bruce Fields
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).