* [PATCH 0/2 v5] Fixups for l_pid @ 2017-06-19 13:24 Benjamin Coddington 2017-06-19 13:24 ` [PATCH 1/2] fs/locks: Use allocation rather than the stack in fcntl_getlk() Benjamin Coddington 2017-06-19 13:24 ` [PATCH 2/2] fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks Benjamin Coddington 0 siblings, 2 replies; 11+ messages in thread From: Benjamin Coddington @ 2017-06-19 13:24 UTC (permalink / raw) To: Jeff Layton, bfields, Alexander Viro Cc: linux-nfs, linux-fsdevel, linux-kernel LTP fcntl tests (fcntl11 fcntl14 fcntl17 fcntl19 fcntl20 fcntl21) have been failing for NFSv4 mounts due to an unexpected l_pid. What follows are some fixups: v2: - Rebase onto linux-next - Revert back to using the stack in locks_mandatory_area(), and fixup patch description for 1/3 v3: - The lkp-robot found some serious per_thread_ops performance regressions for v1 and v2, so this version changes things around to not acquire a reference to struct pid in fl_nspid for every lock. Instead, it drops fl_nspid altogether, and defers the lookup of the namespace-translated pid until it actually needed. v4: - Instead of looking up the virtual pid by way of referencing the struct task of the that pid, instead use find_pid_ns() and pid_nr_ns(), which avoids a the problem where we race to get a reference to the struct task while it may be freed. v5: - Squash previous 2/3 and 3/3 to avoid regression where F_GETLK would return the init_ns pid instead of a translated pid. Benjamin Coddington (2): fs/locks: Use allocation rather than the stack in fcntl_getlk() fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks fs/locks.c | 116 ++++++++++++++++++++++++++++++++--------------------- include/linux/fs.h | 2 +- 2 files changed, 72 insertions(+), 46 deletions(-) -- 2.9.3 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/2] fs/locks: Use allocation rather than the stack in fcntl_getlk() 2017-06-19 13:24 [PATCH 0/2 v5] Fixups for l_pid Benjamin Coddington @ 2017-06-19 13:24 ` Benjamin Coddington 2017-06-19 13:24 ` [PATCH 2/2] fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks Benjamin Coddington 1 sibling, 0 replies; 11+ messages in thread From: Benjamin Coddington @ 2017-06-19 13:24 UTC (permalink / raw) To: Jeff Layton, bfields, Alexander Viro Cc: linux-nfs, linux-fsdevel, linux-kernel Struct file_lock is fairly large, so let's save some space on the stack by using an allocation for struct file_lock in fcntl_getlk(), just as we do for fcntl_setlk(). Signed-off-by: Benjamin Coddington <bcodding@redhat.com> --- fs/locks.c | 46 ++++++++++++++++++++++++++-------------------- 1 file changed, 26 insertions(+), 20 deletions(-) diff --git a/fs/locks.c b/fs/locks.c index afefeb4ad6de..d7daa6c8932f 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -2086,14 +2086,17 @@ static void posix_lock_to_flock64(struct flock64 *flock, struct file_lock *fl) */ int fcntl_getlk(struct file *filp, unsigned int cmd, struct flock *flock) { - struct file_lock file_lock; + struct file_lock *fl; int error; + fl = locks_alloc_lock(); + if (fl == NULL) + return -ENOMEM; error = -EINVAL; if (flock->l_type != F_RDLCK && flock->l_type != F_WRLCK) goto out; - error = flock_to_posix_lock(filp, &file_lock, flock); + error = flock_to_posix_lock(filp, fl, flock); if (error) goto out; @@ -2103,23 +2106,22 @@ int fcntl_getlk(struct file *filp, unsigned int cmd, struct flock *flock) goto out; cmd = F_GETLK; - file_lock.fl_flags |= FL_OFDLCK; - file_lock.fl_owner = filp; + fl->fl_flags |= FL_OFDLCK; + fl->fl_owner = filp; } - error = vfs_test_lock(filp, &file_lock); + error = vfs_test_lock(filp, fl); if (error) goto out; - flock->l_type = file_lock.fl_type; - if (file_lock.fl_type != F_UNLCK) { - error = posix_lock_to_flock(flock, &file_lock); + flock->l_type = fl->fl_type; + if (fl->fl_type != F_UNLCK) { + error = posix_lock_to_flock(flock, fl); if (error) - goto rel_priv; + goto out; } -rel_priv: - locks_release_private(&file_lock); out: + locks_free_lock(fl); return error; } @@ -2298,14 +2300,18 @@ int fcntl_setlk(unsigned int fd, struct file *filp, unsigned int cmd, */ int fcntl_getlk64(struct file *filp, unsigned int cmd, struct flock64 *flock) { - struct file_lock file_lock; + struct file_lock *fl; int error; + fl = locks_alloc_lock(); + if (fl == NULL) + return -ENOMEM; + error = -EINVAL; if (flock->l_type != F_RDLCK && flock->l_type != F_WRLCK) goto out; - error = flock64_to_posix_lock(filp, &file_lock, flock); + error = flock64_to_posix_lock(filp, fl, flock); if (error) goto out; @@ -2315,20 +2321,20 @@ int fcntl_getlk64(struct file *filp, unsigned int cmd, struct flock64 *flock) goto out; cmd = F_GETLK64; - file_lock.fl_flags |= FL_OFDLCK; - file_lock.fl_owner = filp; + fl->fl_flags |= FL_OFDLCK; + fl->fl_owner = filp; } - error = vfs_test_lock(filp, &file_lock); + error = vfs_test_lock(filp, fl); if (error) goto out; - flock->l_type = file_lock.fl_type; - if (file_lock.fl_type != F_UNLCK) - posix_lock_to_flock64(flock, &file_lock); + flock->l_type = fl->fl_type; + if (fl->fl_type != F_UNLCK) + posix_lock_to_flock64(flock, fl); - locks_release_private(&file_lock); out: + locks_free_lock(fl); return error; } -- 2.9.3 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/2] fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks 2017-06-19 13:24 [PATCH 0/2 v5] Fixups for l_pid Benjamin Coddington 2017-06-19 13:24 ` [PATCH 1/2] fs/locks: Use allocation rather than the stack in fcntl_getlk() Benjamin Coddington @ 2017-06-19 13:24 ` Benjamin Coddington 2017-06-19 17:32 ` Jeff Layton 1 sibling, 1 reply; 11+ messages in thread From: Benjamin Coddington @ 2017-06-19 13:24 UTC (permalink / raw) To: Jeff Layton, bfields, Alexander Viro Cc: linux-nfs, linux-fsdevel, linux-kernel Since commit c69899a17ca4 "NFSv4: Update of VFS byte range lock must be atomic with the stateid update", NFSv4 has been inserting locks in rpciod worker context. The result is that the file_lock's fl_nspid is the kworker's pid instead of the original userspace pid. The fl_nspid is only used to represent the namespaced virtual pid number when displaying locks or returning from F_GETLK. There's no reason to set it for every inserted lock, since we can usually just look it up from fl_pid. So, instead of looking up and holding struct pid for every lock, let's just look up the virtual pid number from fl_pid when it is needed. That means we can remove fl_nspid entirely. Also, if we're now translating fl_pid for F_GETLK and /proc/locks, we need to handle the case where a remote filesystem directly sets fl_pid. In that case, the fl_pid should not be translated into a local pid namespace. If the filesystem implements the lock operation, set a flag to return the lock's fl_pid value directly, rather translate it. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> --- fs/locks.c | 70 +++++++++++++++++++++++++++++++++++------------------- include/linux/fs.h | 2 +- 2 files changed, 46 insertions(+), 26 deletions(-) diff --git a/fs/locks.c b/fs/locks.c index d7daa6c8932f..206a46d28bbd 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -733,7 +733,6 @@ static void locks_wake_up_blocks(struct file_lock *blocker) static void locks_insert_lock_ctx(struct file_lock *fl, struct list_head *before) { - fl->fl_nspid = get_pid(task_tgid(current)); list_add_tail(&fl->fl_list, before); locks_insert_global_locks(fl); } @@ -743,10 +742,6 @@ locks_unlink_lock_ctx(struct file_lock *fl) { locks_delete_global_locks(fl); list_del_init(&fl->fl_list); - if (fl->fl_nspid) { - put_pid(fl->fl_nspid); - fl->fl_nspid = NULL; - } locks_wake_up_blocks(fl); } @@ -823,8 +818,6 @@ posix_test_lock(struct file *filp, struct file_lock *fl) list_for_each_entry(cfl, &ctx->flc_posix, fl_list) { if (posix_locks_conflict(fl, cfl)) { locks_copy_conflock(fl, cfl); - if (cfl->fl_nspid) - fl->fl_pid = pid_vnr(cfl->fl_nspid); goto out; } } @@ -2041,16 +2034,46 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd) */ int vfs_test_lock(struct file *filp, struct file_lock *fl) { - if (filp->f_op->lock && is_remote_lock(filp)) + if (filp->f_op->lock && is_remote_lock(filp)) { + fl->fl_flags |= FL_PID_PRIV; return filp->f_op->lock(filp, F_GETLK, fl); + } posix_test_lock(filp, fl); return 0; } EXPORT_SYMBOL_GPL(vfs_test_lock); +/** + * locks_translate_pid - translate a pid number into a namespace + * @nr: The pid number in the init_pid_ns + * @ns: The namespace into which the pid should be translated + * + * Used to tranlate a fl_pid into a namespace virtual pid number + */ +static pid_t locks_translate_pid(int init_nr, struct pid_namespace *ns) +{ + pid_t vnr; + struct pid *pid; + + rcu_read_lock(); + pid = find_pid_ns(init_nr, &init_pid_ns); + vnr = pid_nr_ns(pid, ns); + rcu_read_unlock(); + return vnr; +} + +static pid_t flock_translate_pid(struct file_lock *fl) +{ + if (IS_OFDLCK(fl)) + return -1; + if (fl->fl_flags & FL_PID_PRIV) + return fl->fl_pid; + return locks_translate_pid(fl->fl_pid, task_active_pid_ns(current)); +} + static int posix_lock_to_flock(struct flock *flock, struct file_lock *fl) { - flock->l_pid = IS_OFDLCK(fl) ? -1 : fl->fl_pid; + flock->l_pid = flock_translate_pid(fl); #if BITS_PER_LONG == 32 /* * Make sure we can represent the posix lock via @@ -2072,7 +2095,7 @@ static int posix_lock_to_flock(struct flock *flock, struct file_lock *fl) #if BITS_PER_LONG == 32 static void posix_lock_to_flock64(struct flock64 *flock, struct file_lock *fl) { - flock->l_pid = IS_OFDLCK(fl) ? -1 : fl->fl_pid; + flock->l_pid = flock_translate_pid(fl); flock->l_start = fl->fl_start; flock->l_len = fl->fl_end == OFFSET_MAX ? 0 : fl->fl_end - fl->fl_start + 1; @@ -2584,22 +2607,19 @@ static void lock_get_status(struct seq_file *f, struct file_lock *fl, { struct inode *inode = NULL; unsigned int fl_pid; + struct pid_namespace *proc_pidns = file_inode(f->file)->i_sb->s_fs_info; - if (fl->fl_nspid) { - struct pid_namespace *proc_pidns = file_inode(f->file)->i_sb->s_fs_info; - - /* Don't let fl_pid change based on who is reading the file */ - fl_pid = pid_nr_ns(fl->fl_nspid, proc_pidns); - - /* - * If there isn't a fl_pid don't display who is waiting on - * the lock if we are called from locks_show, or if we are - * called from __show_fd_info - skip lock entirely - */ - if (fl_pid == 0) - return; - } else + if (fl->fl_flags & FL_PID_PRIV) fl_pid = fl->fl_pid; + else + fl_pid = locks_translate_pid(fl->fl_pid, proc_pidns); + /* + * If there isn't a fl_pid don't display who is waiting on + * the lock if we are called from locks_show, or if we are + * called from __show_fd_info - skip lock entirely + */ + if (fl_pid == 0) + return; if (fl->fl_file != NULL) inode = locks_inode(fl->fl_file); @@ -2674,7 +2694,7 @@ static int locks_show(struct seq_file *f, void *v) fl = hlist_entry(v, struct file_lock, fl_link); - if (fl->fl_nspid && !pid_nr_ns(fl->fl_nspid, proc_pidns)) + if (locks_translate_pid(fl->fl_pid, proc_pidns) == 0) return 0; lock_get_status(f, fl, iter->li_pos, ""); diff --git a/include/linux/fs.h b/include/linux/fs.h index aa4affb38c39..179496a9719d 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -908,6 +908,7 @@ static inline struct file *get_file(struct file *f) #define FL_UNLOCK_PENDING 512 /* Lease is being broken */ #define FL_OFDLCK 1024 /* lock is "owned" by struct file */ #define FL_LAYOUT 2048 /* outstanding pNFS layout */ +#define FL_PID_PRIV 4096 /* F_GETLK should report fl_pid */ #define FL_CLOSE_POSIX (FL_POSIX | FL_CLOSE) @@ -984,7 +985,6 @@ struct file_lock { unsigned char fl_type; unsigned int fl_pid; int fl_link_cpu; /* what cpu's list is this on? */ - struct pid *fl_nspid; wait_queue_head_t fl_wait; struct file *fl_file; loff_t fl_start; -- 2.9.3 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks 2017-06-19 13:24 ` [PATCH 2/2] fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks Benjamin Coddington @ 2017-06-19 17:32 ` Jeff Layton 2017-06-20 14:03 ` Benjamin Coddington 0 siblings, 1 reply; 11+ messages in thread From: Jeff Layton @ 2017-06-19 17:32 UTC (permalink / raw) To: Benjamin Coddington, bfields, Alexander Viro Cc: linux-nfs, linux-fsdevel, open list On Mon, 2017-06-19 at 09:24 -0400, Benjamin Coddington wrote: > Since commit c69899a17ca4 "NFSv4: Update of VFS byte range lock must be > atomic with the stateid update", NFSv4 has been inserting locks in rpciod > worker context. The result is that the file_lock's fl_nspid is the > kworker's pid instead of the original userspace pid. > > The fl_nspid is only used to represent the namespaced virtual pid number > when displaying locks or returning from F_GETLK. There's no reason to set > it for every inserted lock, since we can usually just look it up from > fl_pid. So, instead of looking up and holding struct pid for every lock, > let's just look up the virtual pid number from fl_pid when it is needed. > That means we can remove fl_nspid entirely. > > Also, if we're now translating fl_pid for F_GETLK and /proc/locks, we need > to handle the case where a remote filesystem directly sets fl_pid. In that > case, the fl_pid should not be translated into a local pid namespace. If > the filesystem implements the lock operation, set a flag to return the > lock's fl_pid value directly, rather translate it. > > Signed-off-by: Benjamin Coddington <bcodding@redhat.com> > --- > fs/locks.c | 70 +++++++++++++++++++++++++++++++++++------------------- > include/linux/fs.h | 2 +- > 2 files changed, 46 insertions(+), 26 deletions(-) > > diff --git a/fs/locks.c b/fs/locks.c > index d7daa6c8932f..206a46d28bbd 100644 > --- a/fs/locks.c > +++ b/fs/locks.c > @@ -733,7 +733,6 @@ static void locks_wake_up_blocks(struct file_lock *blocker) > static void > locks_insert_lock_ctx(struct file_lock *fl, struct list_head *before) > { > - fl->fl_nspid = get_pid(task_tgid(current)); > list_add_tail(&fl->fl_list, before); > locks_insert_global_locks(fl); > } > @@ -743,10 +742,6 @@ locks_unlink_lock_ctx(struct file_lock *fl) > { > locks_delete_global_locks(fl); > list_del_init(&fl->fl_list); > - if (fl->fl_nspid) { > - put_pid(fl->fl_nspid); > - fl->fl_nspid = NULL; > - } > locks_wake_up_blocks(fl); > } > > @@ -823,8 +818,6 @@ posix_test_lock(struct file *filp, struct file_lock *fl) > list_for_each_entry(cfl, &ctx->flc_posix, fl_list) { > if (posix_locks_conflict(fl, cfl)) { > locks_copy_conflock(fl, cfl); > - if (cfl->fl_nspid) > - fl->fl_pid = pid_vnr(cfl->fl_nspid); > goto out; > } > } > @@ -2041,16 +2034,46 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd) > */ > int vfs_test_lock(struct file *filp, struct file_lock *fl) > { > - if (filp->f_op->lock && is_remote_lock(filp)) > + if (filp->f_op->lock && is_remote_lock(filp)) { > + fl->fl_flags |= FL_PID_PRIV; > return filp->f_op->lock(filp, F_GETLK, fl); > + } > posix_test_lock(filp, fl); > return 0; > } > EXPORT_SYMBOL_GPL(vfs_test_lock); > I think this looks wrong for NFS. There are really two cases we're concerned with here: 1) the lock is held by a task on the client itself, in which case we probably want to report the pid as we would on a local fs. ...or... 2) the lock is held by another host entirely in which case the pid doesn't have any meaning. We probably ought to return something like '- 1' as the pid (like we would for OFD locks). The problem for NFS is that you're setting the flag unconditionally there. It may very well be the case that we _want_ to translate the fl_pid according to the local namespace (i.e. if the lock is held by a task on the same host). I think what you want to do here is have the fs ->lock operation set that flag if the fl_pid should be used "as-is" instead of being translated. Most of the current lock operations can just set it early (to preserve the existing behavior), but NFS could be set up to set that flag if the lock request goes to the server. > +/** > + * locks_translate_pid - translate a pid number into a namespace > + * @nr: The pid number in the init_pid_ns > + * @ns: The namespace into which the pid should be translated > + * > + * Used to tranlate a fl_pid into a namespace virtual pid number > + */ > +static pid_t locks_translate_pid(int init_nr, struct pid_namespace *ns) > +{ > + pid_t vnr; > + struct pid *pid; > + > + rcu_read_lock(); > + pid = find_pid_ns(init_nr, &init_pid_ns); > + vnr = pid_nr_ns(pid, ns); > + rcu_read_unlock(); > + return vnr; > +} > + > +static pid_t flock_translate_pid(struct file_lock *fl) > +{ > + if (IS_OFDLCK(fl)) > + return -1; > + if (fl->fl_flags & FL_PID_PRIV) > + return fl->fl_pid; > + return locks_translate_pid(fl->fl_pid, task_active_pid_ns(current)); > +} > + > static int posix_lock_to_flock(struct flock *flock, struct file_lock *fl) > { > - flock->l_pid = IS_OFDLCK(fl) ? -1 : fl->fl_pid; > + flock->l_pid = flock_translate_pid(fl); > #if BITS_PER_LONG == 32 > /* > * Make sure we can represent the posix lock via > @@ -2072,7 +2095,7 @@ static int posix_lock_to_flock(struct flock *flock, struct file_lock *fl) > #if BITS_PER_LONG == 32 > static void posix_lock_to_flock64(struct flock64 *flock, struct file_lock *fl) > { > - flock->l_pid = IS_OFDLCK(fl) ? -1 : fl->fl_pid; > + flock->l_pid = flock_translate_pid(fl); > flock->l_start = fl->fl_start; > flock->l_len = fl->fl_end == OFFSET_MAX ? 0 : > fl->fl_end - fl->fl_start + 1; > @@ -2584,22 +2607,19 @@ static void lock_get_status(struct seq_file *f, struct file_lock *fl, > { > struct inode *inode = NULL; > unsigned int fl_pid; > + struct pid_namespace *proc_pidns = file_inode(f->file)->i_sb->s_fs_info; > > - if (fl->fl_nspid) { > - struct pid_namespace *proc_pidns = file_inode(f->file)->i_sb->s_fs_info; > - > - /* Don't let fl_pid change based on who is reading the file */ > - fl_pid = pid_nr_ns(fl->fl_nspid, proc_pidns); > - > - /* > - * If there isn't a fl_pid don't display who is waiting on > - * the lock if we are called from locks_show, or if we are > - * called from __show_fd_info - skip lock entirely > - */ > - if (fl_pid == 0) > - return; > - } else > + if (fl->fl_flags & FL_PID_PRIV) > fl_pid = fl->fl_pid; > + else > + fl_pid = locks_translate_pid(fl->fl_pid, proc_pidns); > + /* > + * If there isn't a fl_pid don't display who is waiting on > + * the lock if we are called from locks_show, or if we are > + * called from __show_fd_info - skip lock entirely > + */ > + if (fl_pid == 0) > + return; > > if (fl->fl_file != NULL) > inode = locks_inode(fl->fl_file); > @@ -2674,7 +2694,7 @@ static int locks_show(struct seq_file *f, void *v) > > fl = hlist_entry(v, struct file_lock, fl_link); > > - if (fl->fl_nspid && !pid_nr_ns(fl->fl_nspid, proc_pidns)) > + if (locks_translate_pid(fl->fl_pid, proc_pidns) == 0) > return 0; > > lock_get_status(f, fl, iter->li_pos, ""); > diff --git a/include/linux/fs.h b/include/linux/fs.h > index aa4affb38c39..179496a9719d 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -908,6 +908,7 @@ static inline struct file *get_file(struct file *f) > #define FL_UNLOCK_PENDING 512 /* Lease is being broken */ > #define FL_OFDLCK 1024 /* lock is "owned" by struct file */ > #define FL_LAYOUT 2048 /* outstanding pNFS layout */ > +#define FL_PID_PRIV 4096 /* F_GETLK should report fl_pid */ > > #define FL_CLOSE_POSIX (FL_POSIX | FL_CLOSE) > > @@ -984,7 +985,6 @@ struct file_lock { > unsigned char fl_type; > unsigned int fl_pid; > int fl_link_cpu; /* what cpu's list is this on? */ > - struct pid *fl_nspid; > wait_queue_head_t fl_wait; > struct file *fl_file; > loff_t fl_start; -- Jeff Layton <jlayton@poochiereds.net> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks 2017-06-19 17:32 ` Jeff Layton @ 2017-06-20 14:03 ` Benjamin Coddington 2017-06-20 16:09 ` Benjamin Coddington 0 siblings, 1 reply; 11+ messages in thread From: Benjamin Coddington @ 2017-06-20 14:03 UTC (permalink / raw) To: Jeff Layton; +Cc: bfields, Alexander Viro, linux-nfs, linux-fsdevel, open list On 19 Jun 2017, at 13:32, Jeff Layton wrote: > On Mon, 2017-06-19 at 09:24 -0400, Benjamin Coddington wrote: >> @@ -2041,16 +2034,46 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, >> unsigned int, cmd) >> */ >> int vfs_test_lock(struct file *filp, struct file_lock *fl) >> { >> - if (filp->f_op->lock && is_remote_lock(filp)) >> + if (filp->f_op->lock && is_remote_lock(filp)) { >> + fl->fl_flags |= FL_PID_PRIV; >> return filp->f_op->lock(filp, F_GETLK, fl); >> + } >> posix_test_lock(filp, fl); >> return 0; >> } >> EXPORT_SYMBOL_GPL(vfs_test_lock); >> > > I think this looks wrong for NFS. Oh yes, this is completely wrong.. It should be looking for fl_ops, which would set the flag for lock managers. > There are really two cases we're concerned with here: > > 1) the lock is held by a task on the client itself, in which case we > probably want to report the pid as we would on a local fs. > > ...or... > > 2) the lock is held by another host entirely in which case the pid > doesn't have any meaning. We probably ought to return something like > '- > 1' as the pid (like we would for OFD locks). I don't think we have f_op->lock() users that only set remote locks. For NFS, the remote lock is always matched by a local lock. > The problem for NFS is that you're setting the flag unconditionally > there. It may very well be the case that we _want_ to translate the > fl_pid according to the local namespace (i.e. if the lock is held by a > task on the same host). > > I think what you want to do here is have the fs ->lock operation set > that flag if the fl_pid should be used "as-is" instead of being > translated. > > Most of the current lock operations can just set it early (to preserve > the existing behavior), but NFS could be set up to set that flag if > the > lock request goes to the server. I think this is just a mistake.. I think we want to always translate all local locks, unless the lock is placed by a lock manager. I'll send a corrected version. Ben ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks 2017-06-20 14:03 ` Benjamin Coddington @ 2017-06-20 16:09 ` Benjamin Coddington 2017-06-20 17:06 ` Jeff Layton 0 siblings, 1 reply; 11+ messages in thread From: Benjamin Coddington @ 2017-06-20 16:09 UTC (permalink / raw) To: Jeff Layton; +Cc: bfields, Alexander Viro, linux-nfs, linux-fsdevel, open list On 20 Jun 2017, at 10:03, Benjamin Coddington wrote: > On 19 Jun 2017, at 13:32, Jeff Layton wrote: > >> On Mon, 2017-06-19 at 09:24 -0400, Benjamin Coddington wrote: >>> @@ -2041,16 +2034,46 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, >>> unsigned int, cmd) >>> */ >>> int vfs_test_lock(struct file *filp, struct file_lock *fl) >>> { >>> - if (filp->f_op->lock && is_remote_lock(filp)) >>> + if (filp->f_op->lock && is_remote_lock(filp)) { >>> + fl->fl_flags |= FL_PID_PRIV; >>> return filp->f_op->lock(filp, F_GETLK, fl); >>> + } >>> posix_test_lock(filp, fl); >>> return 0; >>> } >>> EXPORT_SYMBOL_GPL(vfs_test_lock); >>> >> >> I think this looks wrong for NFS. > > Oh yes, this is completely wrong.. It should be looking for fl_ops, > which > would set the flag for lock managers. OK, please disregard this response completely. You're absolutely correct. I spent too much time away from this problem and was confused. >> There are really two cases we're concerned with here: >> >> 1) the lock is held by a task on the client itself, in which case we >> probably want to report the pid as we would on a local fs. >> >> ...or... >> >> 2) the lock is held by another host entirely in which case the pid >> doesn't have any meaning. We probably ought to return something like >> '- >> 1' as the pid (like we would for OFD locks). Right, exactly. > I don't think we have f_op->lock() users that only set remote locks. > For > NFS, the remote lock is always matched by a local lock. But we can do F_GETLK for a remote file with a remote lock. >> The problem for NFS is that you're setting the flag unconditionally >> there. It may very well be the case that we _want_ to translate the >> fl_pid according to the local namespace (i.e. if the lock is held by >> a >> task on the same host). >> >> I think what you want to do here is have the fs ->lock operation set >> that flag if the fl_pid should be used "as-is" instead of being >> translated. >> >> Most of the current lock operations can just set it early (to >> preserve >> the existing behavior), but NFS could be set up to set that flag if >> the >> lock request goes to the server. Yes, I think we ought to add the flag in this patch, but as you suggest push the responsibility for setting it out to the filesystems. I'll send one more version that adds the flag, but doesn't set it in vfs_test_lock(), and follow that with a patch for where the flag ought to be set. Ben ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks 2017-06-20 16:09 ` Benjamin Coddington @ 2017-06-20 17:06 ` Jeff Layton 2017-06-20 19:17 ` Benjamin Coddington 0 siblings, 1 reply; 11+ messages in thread From: Jeff Layton @ 2017-06-20 17:06 UTC (permalink / raw) To: Benjamin Coddington Cc: bfields, Alexander Viro, linux-nfs, linux-fsdevel, open list On Tue, 2017-06-20 at 12:09 -0400, Benjamin Coddington wrote: > On 20 Jun 2017, at 10:03, Benjamin Coddington wrote: > > > On 19 Jun 2017, at 13:32, Jeff Layton wrote: > > > > > On Mon, 2017-06-19 at 09:24 -0400, Benjamin Coddington wrote: > > > > @@ -2041,16 +2034,46 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, > > > > unsigned int, cmd) > > > > */ > > > > int vfs_test_lock(struct file *filp, struct file_lock *fl) > > > > { > > > > - if (filp->f_op->lock && is_remote_lock(filp)) > > > > + if (filp->f_op->lock && is_remote_lock(filp)) { > > > > + fl->fl_flags |= FL_PID_PRIV; > > > > return filp->f_op->lock(filp, F_GETLK, fl); > > > > + } > > > > posix_test_lock(filp, fl); > > > > return 0; > > > > } > > > > EXPORT_SYMBOL_GPL(vfs_test_lock); > > > > > > > > > > I think this looks wrong for NFS. > > > > Oh yes, this is completely wrong.. It should be looking for fl_ops, > > which > > would set the flag for lock managers. > > OK, please disregard this response completely. You're absolutely > correct. > I spent too much time away from this problem and was confused. > > > > There are really two cases we're concerned with here: > > > > > > 1) the lock is held by a task on the client itself, in which case we > > > probably want to report the pid as we would on a local fs. > > > > > > ...or... > > > > > > 2) the lock is held by another host entirely in which case the pid > > > doesn't have any meaning. We probably ought to return something like > > > '- > > > 1' as the pid (like we would for OFD locks). > > Right, exactly. > > > I don't think we have f_op->lock() users that only set remote locks. > > For > > NFS, the remote lock is always matched by a local lock. > > But we can do F_GETLK for a remote file with a remote lock. > > > > The problem for NFS is that you're setting the flag unconditionally > > > there. It may very well be the case that we _want_ to translate the > > > fl_pid according to the local namespace (i.e. if the lock is held by > > > a > > > task on the same host). > > > > > > I think what you want to do here is have the fs ->lock operation set > > > that flag if the fl_pid should be used "as-is" instead of being > > > translated. > > > > > > Most of the current lock operations can just set it early (to > > > preserve > > > the existing behavior), but NFS could be set up to set that flag if > > > the > > > lock request goes to the server. > > Yes, I think we ought to add the flag in this patch, but as you suggest > push > the responsibility for setting it out to the filesystems. I'll send one > more version that adds the flag, but doesn't set it in vfs_test_lock(), > and > follow that with a patch for where the flag ought to be set. > > Ben Now that I think about it a bit more, I don't think we really need a flag here. Just have the ->lock operation set the fl_pid to a negative value. That will never be a valid pid anyway. Then flock_translate_pid could just return any negative value directly instead of trying to translate it. In practice we would always just set it to -1. Maybe even add something like this that the lock-> operation could set it to? #define FILE_LOCK_OWNER_UNDEFINED -1 -- Jeff Layton <jlayton@poochiereds.net> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks 2017-06-20 17:06 ` Jeff Layton @ 2017-06-20 19:17 ` Benjamin Coddington 2017-06-20 19:32 ` Jeff Layton 0 siblings, 1 reply; 11+ messages in thread From: Benjamin Coddington @ 2017-06-20 19:17 UTC (permalink / raw) To: Jeff Layton; +Cc: bfields, Alexander Viro, linux-nfs, linux-fsdevel, open list On 20 Jun 2017, at 13:06, Jeff Layton wrote: > > Now that I think about it a bit more, I don't think we really need a > flag here. > > Just have the ->lock operation set the fl_pid to a negative value. That > will never be a valid pid anyway. Then flock_translate_pid could just > return any negative value directly instead of trying to translate it. > > In practice we would always just set it to -1. Maybe even add something > like this that the lock-> operation could set it to? > > #define FILE_LOCK_OWNER_UNDEFINED -1 So for filesystems that set a remote pid, they should negate the pid to mean that the pid should not be translated? Then when we return that pid, we flip it back again, or display a negative number, or turn it into -1? The flag, having a readable name, would make things a bit clearer as to what the filesystems expect to happen to that pid value. Ben ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks 2017-06-20 19:17 ` Benjamin Coddington @ 2017-06-20 19:32 ` Jeff Layton 2017-06-20 19:39 ` Benjamin Coddington 0 siblings, 1 reply; 11+ messages in thread From: Jeff Layton @ 2017-06-20 19:32 UTC (permalink / raw) To: Benjamin Coddington Cc: bfields, Alexander Viro, linux-nfs, linux-fsdevel, open list On Tue, 2017-06-20 at 15:17 -0400, Benjamin Coddington wrote: > On 20 Jun 2017, at 13:06, Jeff Layton wrote: > > > > Now that I think about it a bit more, I don't think we really need a > > flag here. > > > > Just have the ->lock operation set the fl_pid to a negative value. That > > will never be a valid pid anyway. Then flock_translate_pid could just > > return any negative value directly instead of trying to translate it. > > > > In practice we would always just set it to -1. Maybe even add something > > like this that the lock-> operation could set it to? > > > > #define FILE_LOCK_OWNER_UNDEFINED -1 > > So for filesystems that set a remote pid, they should negate the pid to mean > that the pid should not be translated? Then when we return that pid, we > flip it back again, or display a negative number, or turn it into -1? > > The flag, having a readable name, would make things a bit clearer as to what > the filesystems expect to happen to that pid value. > I now think that we really only ought to be filling out the pid when it refers to a process on the local host. It seems sketchy to me to return a pid here that is really the pid on another host, but happens to have the same pid as something else on this host. It's misleading at best, and if anyone tries to act on that info it could be dangerous. So I'm thinking that we should just set it to -1 when the lock is held by another host entirely. But, since pid values must be positive, we can code the basic infrastructure to return any negative value as-is instead of trying to translate it. -- Jeff Layton <jlayton@poochiereds.net> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks 2017-06-20 19:32 ` Jeff Layton @ 2017-06-20 19:39 ` Benjamin Coddington 2017-06-20 20:13 ` Jeff Layton 0 siblings, 1 reply; 11+ messages in thread From: Benjamin Coddington @ 2017-06-20 19:39 UTC (permalink / raw) To: Jeff Layton; +Cc: bfields, Alexander Viro, linux-nfs, linux-fsdevel, open list On 20 Jun 2017, at 15:32, Jeff Layton wrote: > On Tue, 2017-06-20 at 15:17 -0400, Benjamin Coddington wrote: >> On 20 Jun 2017, at 13:06, Jeff Layton wrote: >>> >>> Now that I think about it a bit more, I don't think we really need a >>> flag here. >>> >>> Just have the ->lock operation set the fl_pid to a negative value. >>> That >>> will never be a valid pid anyway. Then flock_translate_pid could >>> just >>> return any negative value directly instead of trying to translate >>> it. >>> >>> In practice we would always just set it to -1. Maybe even add >>> something >>> like this that the lock-> operation could set it to? >>> >>> #define FILE_LOCK_OWNER_UNDEFINED -1 >> >> So for filesystems that set a remote pid, they should negate the pid >> to mean >> that the pid should not be translated? Then when we return that pid, >> we >> flip it back again, or display a negative number, or turn it into -1? >> >> The flag, having a readable name, would make things a bit clearer as >> to what >> the filesystems expect to happen to that pid value. >> > > I now think that we really only ought to be filling out the pid when > it > refers to a process on the local host. It seems sketchy to me to > return > a pid here that is really the pid on another host, but happens to have > the same pid as something else on this host. It's misleading at best, > and if anyone tries to act on that info it could be dangerous. So I'm > thinking that we should just set it to -1 when the lock is held by > another host entirely. > > But, since pid values must be positive, we can code the basic > infrastructure to return any negative value as-is instead of trying to > translate it. Ok, so we have to patch several filesystems. The question is do we patch those filesystems that set remote pids to negate their pid values in the lock they return from F_GETLK, or do we ask them to set a flag? We'd be patching them to negate their pid just to then transform it to -1.. I'd prefer a flag rather than carrying meaning in a modified value since the flag has readable information. No one will come along later and wonder why some filesystems are negating their pid values. If we're going to touch filesystems that set have remote locks anyway, perhaps it makes sense to take a step toward l_sysid by adding another member to file_lock. Then a special value of fl_sysid would indicate the local system. Ben ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks 2017-06-20 19:39 ` Benjamin Coddington @ 2017-06-20 20:13 ` Jeff Layton 0 siblings, 0 replies; 11+ messages in thread From: Jeff Layton @ 2017-06-20 20:13 UTC (permalink / raw) To: Benjamin Coddington Cc: bfields, Alexander Viro, linux-nfs, linux-fsdevel, open list On Tue, 2017-06-20 at 15:39 -0400, Benjamin Coddington wrote: > On 20 Jun 2017, at 15:32, Jeff Layton wrote: > > > On Tue, 2017-06-20 at 15:17 -0400, Benjamin Coddington wrote: > > > On 20 Jun 2017, at 13:06, Jeff Layton wrote: > > > > > > > > Now that I think about it a bit more, I don't think we really need a > > > > flag here. > > > > > > > > Just have the ->lock operation set the fl_pid to a negative value. > > > > That > > > > will never be a valid pid anyway. Then flock_translate_pid could > > > > just > > > > return any negative value directly instead of trying to translate > > > > it. > > > > > > > > In practice we would always just set it to -1. Maybe even add > > > > something > > > > like this that the lock-> operation could set it to? > > > > > > > > #define FILE_LOCK_OWNER_UNDEFINED -1 > > > > > > So for filesystems that set a remote pid, they should negate the pid > > > to mean > > > that the pid should not be translated? Then when we return that pid, > > > we > > > flip it back again, or display a negative number, or turn it into -1? > > > > > > The flag, having a readable name, would make things a bit clearer as > > > to what > > > the filesystems expect to happen to that pid value. > > > > > > > I now think that we really only ought to be filling out the pid when > > it > > refers to a process on the local host. It seems sketchy to me to > > return > > a pid here that is really the pid on another host, but happens to have > > the same pid as something else on this host. It's misleading at best, > > and if anyone tries to act on that info it could be dangerous. So I'm > > thinking that we should just set it to -1 when the lock is held by > > another host entirely. > > > > But, since pid values must be positive, we can code the basic > > infrastructure to return any negative value as-is instead of trying to > > translate it. > > Ok, so we have to patch several filesystems. The question is do we > patch > those filesystems that set remote pids to negate their pid values in the > lock > they return from F_GETLK, or do we ask them to set a flag? We'd be > patching > them to negate their pid just to then transform it to -1.. > > I'd prefer a flag rather than carrying meaning in a modified value since > the > flag has readable information. No one will come along later and wonder > why > some filesystems are negating their pid values. > > If we're going to touch filesystems that set have remote locks anyway, > perhaps it makes sense to take a step toward l_sysid by adding another > member to file_lock. Then a special value of fl_sysid would indicate > the > local system. > I think we need to fix up the current API first. My main interest is that we have the kernel report l_pid properly to the best of its ability, and when it can't that it report some clearly non- sensical value (e.g., -1) for the pid. I think that's the only sane thing we can do at this point. If we want to start discussing new locking APIs then I'm fine with that, but I'd still want to do something sane here before we start down that road anyway. -- Jeff Layton <jlayton@poochiereds.net> ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2017-06-20 20:14 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-06-19 13:24 [PATCH 0/2 v5] Fixups for l_pid Benjamin Coddington 2017-06-19 13:24 ` [PATCH 1/2] fs/locks: Use allocation rather than the stack in fcntl_getlk() Benjamin Coddington 2017-06-19 13:24 ` [PATCH 2/2] fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks Benjamin Coddington 2017-06-19 17:32 ` Jeff Layton 2017-06-20 14:03 ` Benjamin Coddington 2017-06-20 16:09 ` Benjamin Coddington 2017-06-20 17:06 ` Jeff Layton 2017-06-20 19:17 ` Benjamin Coddington 2017-06-20 19:32 ` Jeff Layton 2017-06-20 19:39 ` Benjamin Coddington 2017-06-20 20:13 ` Jeff Layton
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).