linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/6] locks: implement "filp-private" (aka UNPOSIX) locks
@ 2013-12-10 19:17 Jeff Layton
  2013-12-10 19:17 ` [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines Jeff Layton
                   ` (6 more replies)
  0 siblings, 7 replies; 29+ messages in thread
From: Jeff Layton @ 2013-12-10 19:17 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: nfs-ganesha-devel, samba-technical, linux-kernel

This patchset is the third posting of this set. Behavior between this
set and the last should be more or less the same. Here is a summary of
changes:

v3:
- more consolidation of common code between flock_to_posix_lock and
  flock_to_posix_lock64
- better bisectability by reordering changes, such that partial
  implementation won't be exposed
- s/filp/file/ s/FILP/FILE/ in symbol names

v2:
- inheritance semantics have been change to be more BSD-lock like
- patchset size has been reduced by changing how lock ownership
  is handled
- new F_UNLCKP l_type value has been added

Note too that I've gone ahead and opened a request for the POSIX folks
to consider adding this to the POSIX spec once we have something
mergeable. They seem amenable to the idea but don't want to enshrine it
into the standard until there's a real implementation of it:

    http://austingroupbugs.net/view.php?id=768

I also owe them a better writeup of the semantics for these new locks,
but have been holding off on doing that until they're a little more
settled.

As a side note, I've also had a few other userland developers reach
out to me as to the status of this work. There seems to be a lot of
interest since classic POSIX locks are such a pain to work with in
threaded programs. Hopefully some will chime in on this posting...

Original cover letter from v1 posting follows. Comments and suggestions
welcome.

-------------------------------[snip]------------------------------

At LSF this year, there was a discussion about the "wishlist" for
userland file servers. One of the things brought up was the goofy and
problematic behavior of POSIX locks when a file is closed. Boaz started
a thread on it here:

    http://permalink.gmane.org/gmane.linux.file-systems/73364

Userland fileservers often need to maintain more than one open file
descriptor on a file. The POSIX spec says:

"All locks associated with a file for a given process shall be removed
 when a file descriptor for that file is closed by that process or the
 process holding that file descriptor terminates."

This is problematic since you can't close any file descriptor without
dropping all your POSIX locks. Most userland file servers therefore
end up opening the file with more access than is really necessary, and
keeping fd's open for longer than is necessary to work around this.

This patchset is a first stab at an approach to address this problem by
adding two new l_type values -- F_RDLCKP and F_WRLCKP (the 'P' is short
for "private" -- I'm open to changing that if you have a better
mnemonic).

For all intents and purposes these lock types act just like their
"non-P" counterpart. The difference is that they are only implicitly
released when the fd against which they were acquired is closed. As a
side effect, these locks cannot be merged with "non-P" locks since they
have different semantics on close.

I've given this patchset some very basic smoke testing and it seems to
do the right thing, but it is still pretty rough. If this looks
reasonable I'll plan to do some documentation updates and will take a
stab at trying to get these new lock types added to the POSIX spec (as
HCH recommended).

At this point, my main questions are:

1) does this look useful, particularly for fileserver implementors?

2) does this look OK API-wise? We could consider different "cmd" values
   or even different syscalls, but I figured this makes it clearer that
   "P" and "non-P" locks will still conflict with one another.

Jeff Layton (6):
  locks: consolidate common code in the flock_to_posix_lock routines
  locks: consolidate checks for compatible filp->f_mode values in setlk
    handlers
  locks: rename locks_remove_flock to locks_remove_file
  locks: show private lock types in /proc/locks
  locks: report l_pid as -1 for FL_FILE_PVT locks
  locks: add new "private" lock type that is owned by the filp

 fs/file_table.c                  |   2 +-
 fs/locks.c                       | 122 +++++++++++++++++++++------------------
 include/linux/fs.h               |   5 +-
 include/uapi/asm-generic/fcntl.h |  16 +++++
 4 files changed, 85 insertions(+), 60 deletions(-)

-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines
  2013-12-10 19:17 [PATCH v3 0/6] locks: implement "filp-private" (aka UNPOSIX) locks Jeff Layton
@ 2013-12-10 19:17 ` Jeff Layton
  2013-12-10 21:22   ` J. Bruce Fields
  2013-12-10 19:17 ` [PATCH v3 2/6] locks: consolidate checks for compatible filp->f_mode values in setlk handlers Jeff Layton
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 29+ messages in thread
From: Jeff Layton @ 2013-12-10 19:17 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: nfs-ganesha-devel, samba-technical, linux-kernel

Currently, there's a lot of copy and paste between the two.  Add some
functions to do the initialization of the file_lock from values
passed in, and turn the flock/flock64 variants of those functions into
wrappers around them.

Unfortunately it's harder to consolidate the fl_start/fl_end
calculations due to the differently sized types involved so I've left
them separate for now.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/locks.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 6084f5a..a5848ed 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -344,6 +344,19 @@ static int assign_type(struct file_lock *fl, long type)
 	return 0;
 }
 
+static int
+flock_to_posix_lock_common(struct file_lock *fl, struct file *filp, short type)
+{
+	fl->fl_owner = current->files;
+	fl->fl_pid = current->tgid;
+	fl->fl_file = filp;
+	fl->fl_flags = FL_POSIX;
+	fl->fl_ops = NULL;
+	fl->fl_lmops = NULL;
+
+	return assign_type(fl, type);
+}
+
 /* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
  * style lock.
  */
@@ -386,14 +399,7 @@ static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
 	if (fl->fl_end < fl->fl_start)
 		return -EOVERFLOW;
 	
-	fl->fl_owner = current->files;
-	fl->fl_pid = current->tgid;
-	fl->fl_file = filp;
-	fl->fl_flags = FL_POSIX;
-	fl->fl_ops = NULL;
-	fl->fl_lmops = NULL;
-
-	return assign_type(fl, l->l_type);
+	return flock_to_posix_lock_common(fl, filp, l->l_type);
 }
 
 #if BITS_PER_LONG == 32
@@ -431,15 +437,8 @@ static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
 	fl->fl_start = start;	/* we record the absolute position */
 	if (fl->fl_end < fl->fl_start)
 		return -EOVERFLOW;
-	
-	fl->fl_owner = current->files;
-	fl->fl_pid = current->tgid;
-	fl->fl_file = filp;
-	fl->fl_flags = FL_POSIX;
-	fl->fl_ops = NULL;
-	fl->fl_lmops = NULL;
 
-	return assign_type(fl, l->l_type);
+	return flock_to_posix_lock_common(fl, filp, l->l_type);
 }
 #endif
 
-- 
1.8.4.2

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 2/6] locks: consolidate checks for compatible filp->f_mode values in setlk handlers
  2013-12-10 19:17 [PATCH v3 0/6] locks: implement "filp-private" (aka UNPOSIX) locks Jeff Layton
  2013-12-10 19:17 ` [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines Jeff Layton
@ 2013-12-10 19:17 ` Jeff Layton
  2013-12-10 19:17 ` [PATCH v3 3/6] locks: rename locks_remove_flock to locks_remove_file Jeff Layton
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Jeff Layton @ 2013-12-10 19:17 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: nfs-ganesha-devel, samba-technical, linux-kernel

Move this check into the posix_lock_setup_common handler instead of
duplicating this check in two places. This also fixes a minor wart in
the code where we continue referring to the struct flock after
converting it to struct file_lock.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/locks.c | 46 ++++++++++++----------------------------------
 1 file changed, 12 insertions(+), 34 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index a5848ed..f96c455 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -354,6 +354,18 @@ flock_to_posix_lock_common(struct file_lock *fl, struct file *filp, short type)
 	fl->fl_ops = NULL;
 	fl->fl_lmops = NULL;
 
+	/* Ensure that fl->fl_filp has compatible f_mode */
+	switch (type) {
+	case F_RDLCK:
+		if (!(filp->f_mode & FMODE_READ))
+			return -EBADF;
+		break;
+	case F_WRLCK:
+		if (!(filp->f_mode & FMODE_WRITE))
+			return -EBADF;
+		break;
+	}
+
 	return assign_type(fl, type);
 }
 
@@ -2060,23 +2072,6 @@ again:
 		file_lock->fl_flags |= FL_SLEEP;
 	}
 	
-	error = -EBADF;
-	switch (flock.l_type) {
-	case F_RDLCK:
-		if (!(filp->f_mode & FMODE_READ))
-			goto out;
-		break;
-	case F_WRLCK:
-		if (!(filp->f_mode & FMODE_WRITE))
-			goto out;
-		break;
-	case F_UNLCK:
-		break;
-	default:
-		error = -EINVAL;
-		goto out;
-	}
-
 	error = do_lock_file_wait(filp, cmd, file_lock);
 
 	/*
@@ -2178,23 +2173,6 @@ again:
 		file_lock->fl_flags |= FL_SLEEP;
 	}
 	
-	error = -EBADF;
-	switch (flock.l_type) {
-	case F_RDLCK:
-		if (!(filp->f_mode & FMODE_READ))
-			goto out;
-		break;
-	case F_WRLCK:
-		if (!(filp->f_mode & FMODE_WRITE))
-			goto out;
-		break;
-	case F_UNLCK:
-		break;
-	default:
-		error = -EINVAL;
-		goto out;
-	}
-
 	error = do_lock_file_wait(filp, cmd, file_lock);
 
 	/*
-- 
1.8.4.2

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 3/6] locks: rename locks_remove_flock to locks_remove_file
  2013-12-10 19:17 [PATCH v3 0/6] locks: implement "filp-private" (aka UNPOSIX) locks Jeff Layton
  2013-12-10 19:17 ` [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines Jeff Layton
  2013-12-10 19:17 ` [PATCH v3 2/6] locks: consolidate checks for compatible filp->f_mode values in setlk handlers Jeff Layton
@ 2013-12-10 19:17 ` Jeff Layton
  2013-12-10 19:17 ` [PATCH v3 4/6] locks: show private lock types in /proc/locks Jeff Layton
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Jeff Layton @ 2013-12-10 19:17 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: nfs-ganesha-devel, samba-technical, linux-kernel

This function currently removes leases in addition to flock locks, so
the name is misleading. Rename it to locks_remove_file to indicate that
it removes locks that are associated with a particular struct file.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/file_table.c    | 2 +-
 fs/locks.c         | 2 +-
 include/linux/fs.h | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index 5fff903..468543c 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -234,7 +234,7 @@ static void __fput(struct file *file)
 	 * in the file cleanup chain.
 	 */
 	eventpoll_release(file);
-	locks_remove_flock(file);
+	locks_remove_file(file);
 
 	if (unlikely(file->f_flags & FASYNC)) {
 		if (file->f_op->fasync)
diff --git a/fs/locks.c b/fs/locks.c
index f96c455..fb10d58 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2231,7 +2231,7 @@ EXPORT_SYMBOL(locks_remove_posix);
 /*
  * This function is called on the last close of an open file.
  */
-void locks_remove_flock(struct file *filp)
+void locks_remove_file(struct file *filp)
 {
 	struct inode * inode = file_inode(filp);
 	struct file_lock *fl;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 04be202..51974e0 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1012,7 +1012,7 @@ extern struct file_lock * locks_alloc_lock(void);
 extern void locks_copy_lock(struct file_lock *, struct file_lock *);
 extern void __locks_copy_lock(struct file_lock *, const struct file_lock *);
 extern void locks_remove_posix(struct file *, fl_owner_t);
-extern void locks_remove_flock(struct file *);
+extern void locks_remove_file(struct file *);
 extern void locks_release_private(struct file_lock *);
 extern void posix_test_lock(struct file *, struct file_lock *);
 extern int posix_lock_file(struct file *, struct file_lock *, struct file_lock *);
@@ -1083,7 +1083,7 @@ static inline void locks_remove_posix(struct file *filp, fl_owner_t owner)
 	return;
 }
 
-static inline void locks_remove_flock(struct file *filp)
+static inline void locks_remove_file(struct file *filp)
 {
 	return;
 }
-- 
1.8.4.2

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 4/6] locks: show private lock types in /proc/locks
  2013-12-10 19:17 [PATCH v3 0/6] locks: implement "filp-private" (aka UNPOSIX) locks Jeff Layton
                   ` (2 preceding siblings ...)
  2013-12-10 19:17 ` [PATCH v3 3/6] locks: rename locks_remove_flock to locks_remove_file Jeff Layton
@ 2013-12-10 19:17 ` Jeff Layton
  2013-12-10 19:17 ` [PATCH v3 5/6] locks: report l_pid as -1 for FL_FILE_PVT locks Jeff Layton
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Jeff Layton @ 2013-12-10 19:17 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: nfs-ganesha-devel, samba-technical, linux-kernel

In a later patch, we'll be adding a new type of lock that's owned by
the struct file instead of the files_struct. Those sorts of locks
will be flagged with a new IS_FILE_PVT flag.

Show "private" locks in /proc/locks with a 'P' suffix. This does mean
that we need to widen the column by one character.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/locks.c         | 9 ++++++---
 include/linux/fs.h | 1 +
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index fb10d58..e163a30 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -135,6 +135,7 @@
 #define IS_POSIX(fl)	(fl->fl_flags & FL_POSIX)
 #define IS_FLOCK(fl)	(fl->fl_flags & FL_FLOCK)
 #define IS_LEASE(fl)	(fl->fl_flags & (FL_LEASE|FL_DELEG))
+#define IS_FILE_PVT(fl)	(fl->fl_flags & FL_FILE_PVT)
 
 static bool lease_breaking(struct file_lock *fl)
 {
@@ -2358,13 +2359,15 @@ static void lock_get_status(struct seq_file *f, struct file_lock *fl,
 		seq_printf(f, "UNKNOWN UNKNOWN  ");
 	}
 	if (fl->fl_type & LOCK_MAND) {
-		seq_printf(f, "%s ",
+		seq_printf(f, "%s  ",
 			       (fl->fl_type & LOCK_READ)
 			       ? (fl->fl_type & LOCK_WRITE) ? "RW   " : "READ "
 			       : (fl->fl_type & LOCK_WRITE) ? "WRITE" : "NONE ");
+	} else if (IS_FILE_PVT(fl)) {
+		seq_printf(f, "%s ", fl->fl_type == F_WRLCK ?  "WRITEP" : "READP ");
 	} else {
-		seq_printf(f, "%s ",
-			       (lease_breaking(fl))
+		seq_printf(f, "%s  ",
+			       lease_breaking(fl)
 			       ? (fl->fl_type == F_UNLCK) ? "UNLCK" : "READ "
 			       : (fl->fl_type == F_WRLCK) ? "WRITE" : "READ ");
 	}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 51974e0..639a3b7 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -888,6 +888,7 @@ static inline int file_check_writeable(struct file *filp)
 #define FL_SLEEP	128	/* A blocking lock */
 #define FL_DOWNGRADE_PENDING	256 /* Lease is being downgraded */
 #define FL_UNLOCK_PENDING	512 /* Lease is being broken */
+#define FL_FILE_PVT	1024	/* lock is private to the file */
 
 /*
  * Special return value from posix_lock_file() and vfs_lock_file() for
-- 
1.8.4.2

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 5/6] locks: report l_pid as -1 for FL_FILE_PVT locks
  2013-12-10 19:17 [PATCH v3 0/6] locks: implement "filp-private" (aka UNPOSIX) locks Jeff Layton
                   ` (3 preceding siblings ...)
  2013-12-10 19:17 ` [PATCH v3 4/6] locks: show private lock types in /proc/locks Jeff Layton
@ 2013-12-10 19:17 ` Jeff Layton
  2013-12-10 19:31   ` Jeff Layton
  2013-12-10 19:17 ` [PATCH v3 6/6] locks: add new "private" lock type that is owned by the filp Jeff Layton
  2013-12-10 19:30 ` [Nfs-ganesha-devel] [PATCH v3 0/6] locks: implement "filp-private" (aka UNPOSIX) locks Frank Filz
  6 siblings, 1 reply; 29+ messages in thread
From: Jeff Layton @ 2013-12-10 19:17 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: nfs-ganesha-devel, samba-technical, linux-kernel

FL_FILE_PVT locks are no longer tied to a particular pid, and are
instead inheritable by child processes. Report a l_pid of '-1' for
these sorts of locks since the pid is somewhat meaningless for them.

This precedent comes from FreeBSD. There, POSIX and flock() locks can
conflict with one another. If fcntl(F_GETLK, ...) returns a lock set
with flock() then the l_pid member cannot be a process ID because the
lock is not held by a process as such.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/locks.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index e163a30..5372ddd 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1899,7 +1899,7 @@ EXPORT_SYMBOL_GPL(vfs_test_lock);
 
 static int posix_lock_to_flock(struct flock *flock, struct file_lock *fl)
 {
-	flock->l_pid = fl->fl_pid;
+	flock->l_pid = IS_FILE_PVT(fl) ? -1 : fl->fl_pid;
 #if BITS_PER_LONG == 32
 	/*
 	 * Make sure we can represent the posix lock via
@@ -1921,7 +1921,7 @@ static int posix_lock_to_flock(struct flock *flock, struct file_lock *fl)
 #if BITS_PER_LONG == 32
 static void posix_lock_to_flock64(struct flock64 *flock, struct file_lock *fl)
 {
-	flock->l_pid = fl->fl_pid;
+	flock->l_pid = IS_FILE_PVT(fl) ? -1 : fl->fl_pid;
 	flock->l_start = fl->fl_start;
 	flock->l_len = fl->fl_end == OFFSET_MAX ? 0 :
 		fl->fl_end - fl->fl_start + 1;
-- 
1.8.4.2

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 6/6] locks: add new "private" lock type that is owned by the filp
  2013-12-10 19:17 [PATCH v3 0/6] locks: implement "filp-private" (aka UNPOSIX) locks Jeff Layton
                   ` (4 preceding siblings ...)
  2013-12-10 19:17 ` [PATCH v3 5/6] locks: report l_pid as -1 for FL_FILE_PVT locks Jeff Layton
@ 2013-12-10 19:17 ` Jeff Layton
  2013-12-17 13:31   ` Jeff Layton
  2013-12-10 19:30 ` [Nfs-ganesha-devel] [PATCH v3 0/6] locks: implement "filp-private" (aka UNPOSIX) locks Frank Filz
  6 siblings, 1 reply; 29+ messages in thread
From: Jeff Layton @ 2013-12-10 19:17 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: linux-kernel, nfs-ganesha-devel, samba-technical

Due to some unfortunate history, POSIX locks have very strange and
unhelpful semantics. The thing that usually catches people by surprise
is that they are dropped whenever the process closes any file descriptor
associated with the inode.

This is extremely problematic for people developing file servers that
need to implement byte-range locks. Developers often need a "lock
management" facility to ensure that file descriptors are not closed
until all of the locks associated with the inode are finished.

Additionally, "classic" POSIX locks are owned by the process. Locks
taken between threads within the same process won't conflict with one
another, which renders classic POSIX locks useless for synchronization
between threads.

This patchset adds a new type of lock that attempts to address these
issues. These locks work just like classic POSIX read/write locks, but
have semantics that are more like BSD locks with respect to inheritance
and behavior on close.

This is implemented primarily by changing how fl_owner field is set for
these locks. Instead of having them owned by the files_struct of the
process, they are instead owned by the filp on which they were acquired.
Thus, they are inherited across fork() and are only released when the
last reference to a filp is put.

These new semantics prevent them from being merged with classic POSIX
locks, even if they are acquired by the same process. These locks will
also conflict with classic POSIX locks even if they are acquired by
the same process or on the same file descriptor.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/locks.c                       | 32 ++++++++++++++++++++++++++++++--
 include/uapi/asm-generic/fcntl.h | 16 ++++++++++++++++
 2 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 5372ddd..98a503d 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -367,7 +367,34 @@ flock_to_posix_lock_common(struct file_lock *fl, struct file *filp, short type)
 		break;
 	}
 
-	return assign_type(fl, type);
+	/*
+	 * FL_FILE_PVT locks are "owned" by the filp upon which they were
+	 * acquired, regardless of what task is dealing with them. Set the
+	 * fl_owner appropriately and flag them as private.
+	 */
+	switch(type) {
+	case F_RDLCKP:
+		fl->fl_owner = (fl_owner_t)fl->fl_file;
+		fl->fl_type = F_RDLCK;
+		fl->fl_flags |= FL_FILE_PVT;
+		break;
+	case F_WRLCKP:
+		fl->fl_owner = (fl_owner_t)fl->fl_file;
+		fl->fl_type = F_WRLCK;
+		fl->fl_flags |= FL_FILE_PVT;
+		break;
+	case F_UNLCKP:
+		fl->fl_owner = (fl_owner_t)fl->fl_file;
+		fl->fl_type = F_UNLCK;
+		fl->fl_flags |= FL_FILE_PVT;
+		break;
+	default:
+		/* Any other POSIX lock is owned by the file_struct */
+		fl->fl_owner = current->files;
+		return assign_type(fl, type);
+	}
+
+	return 0;
 }
 
 /* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
@@ -2259,7 +2286,8 @@ void locks_remove_file(struct file *filp)
 
 	while ((fl = *before) != NULL) {
 		if (fl->fl_file == filp) {
-			if (IS_FLOCK(fl)) {
+			if (IS_FLOCK(fl) ||
+			    (IS_POSIX(fl) && IS_FILE_PVT(fl))) {
 				locks_delete_lock(before);
 				continue;
 			}
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index 95e46c8..ed09fc5 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -151,6 +151,22 @@ struct f_owner_ex {
 #define F_UNLCK		2
 #endif
 
+/*
+ * fd "private" POSIX locks.
+ *
+ * Usually POSIX locks held by a process are released on *any* close and are
+ * not inherited across a fork().
+ *
+ * These lock types will conflict with normal POSIX locks, but are "owned"
+ * by the opened file, not the process. This means that they are inherited
+ * across fork() like BSD (flock) locks, and they are only released
+ * automatically when the last reference to the the open file against which
+ * they were acquired is put.
+ */
+#define F_RDLCKP	5
+#define F_WRLCKP	6
+#define F_UNLCKP	7
+
 /* for old implementation of bsd flock () */
 #ifndef F_EXLCK
 #define F_EXLCK		4	/* or 3 */
-- 
1.8.4.2

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* RE: [Nfs-ganesha-devel] [PATCH v3 0/6] locks: implement "filp-private" (aka UNPOSIX) locks
  2013-12-10 19:17 [PATCH v3 0/6] locks: implement "filp-private" (aka UNPOSIX) locks Jeff Layton
                   ` (5 preceding siblings ...)
  2013-12-10 19:17 ` [PATCH v3 6/6] locks: add new "private" lock type that is owned by the filp Jeff Layton
@ 2013-12-10 19:30 ` Frank Filz
  6 siblings, 0 replies; 29+ messages in thread
From: Frank Filz @ 2013-12-10 19:30 UTC (permalink / raw)
  To: 'Jeff Layton', linux-fsdevel
  Cc: nfs-ganesha-devel, samba-technical, linux-kernel

> This patchset is the third posting of this set. Behavior between this set
and
> the last should be more or less the same. Here is a summary of
> changes:
> 
> v3:
> - more consolidation of common code between flock_to_posix_lock and
>   flock_to_posix_lock64
> - better bisectability by reordering changes, such that partial
>   implementation won't be exposed
> - s/filp/file/ s/FILP/FILE/ in symbol names
> 
> v2:
> - inheritance semantics have been change to be more BSD-lock like
> - patchset size has been reduced by changing how lock ownership
>   is handled
> - new F_UNLCKP l_type value has been added
> 
> Note too that I've gone ahead and opened a request for the POSIX folks to
> consider adding this to the POSIX spec once we have something mergeable.
> They seem amenable to the idea but don't want to enshrine it into the
> standard until there's a real implementation of it:
> 
>     http://austingroupbugs.net/view.php?id=768
> 
> I also owe them a better writeup of the semantics for these new locks, but
> have been holding off on doing that until they're a little more settled.
> 
> As a side note, I've also had a few other userland developers reach out to
me
> as to the status of this work. There seems to be a lot of interest since
classic
> POSIX locks are such a pain to work with in threaded programs. Hopefully
> some will chime in on this posting...

This is looking really good to me. I will  be excited to utilize this
capability in Ganesha.

Thanks

Frank

> Original cover letter from v1 posting follows. Comments and suggestions
> welcome.
> 
> -------------------------------[snip]------------------------------
> 
> At LSF this year, there was a discussion about the "wishlist" for userland
file
> servers. One of the things brought up was the goofy and problematic
> behavior of POSIX locks when a file is closed. Boaz started a thread on it
here:
> 
>     http://permalink.gmane.org/gmane.linux.file-systems/73364
> 
> Userland fileservers often need to maintain more than one open file
> descriptor on a file. The POSIX spec says:
> 
> "All locks associated with a file for a given process shall be removed
when a
> file descriptor for that file is closed by that process or the  process
holding
> that file descriptor terminates."
> 
> This is problematic since you can't close any file descriptor without
dropping
> all your POSIX locks. Most userland file servers therefore end up opening
the
> file with more access than is really necessary, and keeping fd's open for
> longer than is necessary to work around this.
> 
> This patchset is a first stab at an approach to address this problem by
adding
> two new l_type values -- F_RDLCKP and F_WRLCKP (the 'P' is short for
> "private" -- I'm open to changing that if you have a better mnemonic).
> 
> For all intents and purposes these lock types act just like their "non-P"
> counterpart. The difference is that they are only implicitly released when
the
> fd against which they were acquired is closed. As a side effect, these
locks
> cannot be merged with "non-P" locks since they have different semantics on
> close.
> 
> I've given this patchset some very basic smoke testing and it seems to do
the
> right thing, but it is still pretty rough. If this looks reasonable I'll
plan to do
> some documentation updates and will take a stab at trying to get these new
> lock types added to the POSIX spec (as HCH recommended).
> 
> At this point, my main questions are:
> 
> 1) does this look useful, particularly for fileserver implementors?
> 
> 2) does this look OK API-wise? We could consider different "cmd" values
>    or even different syscalls, but I figured this makes it clearer that
>    "P" and "non-P" locks will still conflict with one another.
> 
> Jeff Layton (6):
>   locks: consolidate common code in the flock_to_posix_lock routines
>   locks: consolidate checks for compatible filp->f_mode values in setlk
>     handlers
>   locks: rename locks_remove_flock to locks_remove_file
>   locks: show private lock types in /proc/locks
>   locks: report l_pid as -1 for FL_FILE_PVT locks
>   locks: add new "private" lock type that is owned by the filp
> 
>  fs/file_table.c                  |   2 +-
>  fs/locks.c                       | 122
+++++++++++++++++++++------------------
>  include/linux/fs.h               |   5 +-
>  include/uapi/asm-generic/fcntl.h |  16 +++++
>  4 files changed, 85 insertions(+), 60 deletions(-)
> 
> --
> 1.8.4.2
> 
> 
>
----------------------------------------------------------------------------
--
> Rapidly troubleshoot problems before they affect your business. Most IT
> organizations don't have a clear picture of how application performance
> affects their revenue. With AppDynamics, you get 100% visibility into your
> Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics
> Pro!
> http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clk
> trk
> _______________________________________________
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 5/6] locks: report l_pid as -1 for FL_FILE_PVT locks
  2013-12-10 19:17 ` [PATCH v3 5/6] locks: report l_pid as -1 for FL_FILE_PVT locks Jeff Layton
@ 2013-12-10 19:31   ` Jeff Layton
  2013-12-10 19:41     ` [Nfs-ganesha-devel] " Frank Filz
  0 siblings, 1 reply; 29+ messages in thread
From: Jeff Layton @ 2013-12-10 19:31 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, nfs-ganesha-devel, samba-technical, linux-kernel

On Tue, 10 Dec 2013 14:17:34 -0500
Jeff Layton <jlayton@redhat.com> wrote:

> FL_FILE_PVT locks are no longer tied to a particular pid, and are
> instead inheritable by child processes. Report a l_pid of '-1' for
> these sorts of locks since the pid is somewhat meaningless for them.
> 
> This precedent comes from FreeBSD. There, POSIX and flock() locks can
> conflict with one another. If fcntl(F_GETLK, ...) returns a lock set
> with flock() then the l_pid member cannot be a process ID because the
> lock is not held by a process as such.
> 
> Signed-off-by: Jeff Layton <jlayton@redhat.com>
> ---
>  fs/locks.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index e163a30..5372ddd 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -1899,7 +1899,7 @@ EXPORT_SYMBOL_GPL(vfs_test_lock);
>  
>  static int posix_lock_to_flock(struct flock *flock, struct file_lock *fl)
>  {
> -	flock->l_pid = fl->fl_pid;
> +	flock->l_pid = IS_FILE_PVT(fl) ? -1 : fl->fl_pid;
>  #if BITS_PER_LONG == 32
>  	/*
>  	 * Make sure we can represent the posix lock via
> @@ -1921,7 +1921,7 @@ static int posix_lock_to_flock(struct flock *flock, struct file_lock *fl)
>  #if BITS_PER_LONG == 32
>  static void posix_lock_to_flock64(struct flock64 *flock, struct file_lock *fl)
>  {
> -	flock->l_pid = fl->fl_pid;
> +	flock->l_pid = IS_FILE_PVT(fl) ? -1 : fl->fl_pid;
>  	flock->l_start = fl->fl_start;
>  	flock->l_len = fl->fl_end == OFFSET_MAX ? 0 :
>  		fl->fl_end - fl->fl_start + 1;

While I think this behavior is reasonable, I do wonder if we ought to
consider more changes to how F_GETLK works. Currently the F_GETLK code
won't handle the new l_type values, but maybe it should...

For instance, if there is a conflicting lock, and the F_GETLK caller
specified F_RDLCKP or F_WRLCKP, might it make sense to report the
l_type on the conflicting lock as F_RDLCKP or F_WRLCKP if that
conflicting lock is also a *P type?

...or maybe we should consider a new F_GETLKP cmd value, and a new
expanded struct flock that gives more info. The pid is already somewhat
meaningless with this sort of lock. Perhaps we could obfuscate the
fl_owner value and report that instead? What other sorts of info would
be useful to programs that intend to use these new interfaces?

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [Nfs-ganesha-devel] [PATCH v3 5/6] locks: report l_pid as -1 for FL_FILE_PVT locks
  2013-12-10 19:31   ` Jeff Layton
@ 2013-12-10 19:41     ` Frank Filz
  2013-12-10 19:57       ` Jeff Layton
  0 siblings, 1 reply; 29+ messages in thread
From: Frank Filz @ 2013-12-10 19:41 UTC (permalink / raw)
  To: 'Jeff Layton'
  Cc: linux-fsdevel, nfs-ganesha-devel, samba-technical, linux-kernel

> On Tue, 10 Dec 2013 14:17:34 -0500
> Jeff Layton <jlayton@redhat.com> wrote:
> 
> > FL_FILE_PVT locks are no longer tied to a particular pid, and are
> > instead inheritable by child processes. Report a l_pid of '-1' for
> > these sorts of locks since the pid is somewhat meaningless for them.
> >
> > This precedent comes from FreeBSD. There, POSIX and flock() locks can
> > conflict with one another. If fcntl(F_GETLK, ...) returns a lock set
> > with flock() then the l_pid member cannot be a process ID because the
> > lock is not held by a process as such.
> >
> > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > ---
> >  fs/locks.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/locks.c b/fs/locks.c
> > index e163a30..5372ddd 100644
> > --- a/fs/locks.c
> > +++ b/fs/locks.c
> > @@ -1899,7 +1899,7 @@ EXPORT_SYMBOL_GPL(vfs_test_lock);
> >
> >  static int posix_lock_to_flock(struct flock *flock, struct file_lock
> > *fl)  {
> > -	flock->l_pid = fl->fl_pid;
> > +	flock->l_pid = IS_FILE_PVT(fl) ? -1 : fl->fl_pid;
> >  #if BITS_PER_LONG == 32
> >  	/*
> >  	 * Make sure we can represent the posix lock via @@ -1921,7 +1921,7
> > @@ static int posix_lock_to_flock(struct flock *flock, struct
> > file_lock *fl)  #if BITS_PER_LONG == 32  static void
> > posix_lock_to_flock64(struct flock64 *flock, struct file_lock *fl)  {
> > -	flock->l_pid = fl->fl_pid;
> > +	flock->l_pid = IS_FILE_PVT(fl) ? -1 : fl->fl_pid;
> >  	flock->l_start = fl->fl_start;
> >  	flock->l_len = fl->fl_end == OFFSET_MAX ? 0 :
> >  		fl->fl_end - fl->fl_start + 1;
> 
> While I think this behavior is reasonable, I do wonder if we ought to
consider
> more changes to how F_GETLK works. Currently the F_GETLK code won't
> handle the new l_type values, but maybe it should...
> 
> For instance, if there is a conflicting lock, and the F_GETLK caller
specified
> F_RDLCKP or F_WRLCKP, might it make sense to report the l_type on the
> conflicting lock as F_RDLCKP or F_WRLCKP if that conflicting lock is also
a *P
> type?
> 
> ...or maybe we should consider a new F_GETLKP cmd value, and a new
> expanded struct flock that gives more info. The pid is already somewhat
> meaningless with this sort of lock. Perhaps we could obfuscate the
fl_owner
> value and report that instead? What other sorts of info would be useful to
> programs that intend to use these new interfaces?

I always wonder if anyone actually uses the conflicting lock information...

I think returning the lock type as F_RDLCK or F_WRLCK is fine, but maybe an
extended F_GETLKP could report the extended type. For Ganesha, we wouldn't
use the extended type since the NFS protocol has no concept of separate
private and non-private locks (all NFS locks are "private" to the specified
owner in the request). The pid is also not that useful to Ganesha.

Frank

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Nfs-ganesha-devel] [PATCH v3 5/6] locks: report l_pid as -1 for FL_FILE_PVT locks
  2013-12-10 19:41     ` [Nfs-ganesha-devel] " Frank Filz
@ 2013-12-10 19:57       ` Jeff Layton
  0 siblings, 0 replies; 29+ messages in thread
From: Jeff Layton @ 2013-12-10 19:57 UTC (permalink / raw)
  To: Frank Filz
  Cc: linux-fsdevel, nfs-ganesha-devel, samba-technical, linux-kernel

On Tue, 10 Dec 2013 11:41:04 -0800
"Frank Filz" <ffilzlnx@mindspring.com> wrote:

> > On Tue, 10 Dec 2013 14:17:34 -0500
> > Jeff Layton <jlayton@redhat.com> wrote:
> > 
> > > FL_FILE_PVT locks are no longer tied to a particular pid, and are
> > > instead inheritable by child processes. Report a l_pid of '-1' for
> > > these sorts of locks since the pid is somewhat meaningless for them.
> > >
> > > This precedent comes from FreeBSD. There, POSIX and flock() locks can
> > > conflict with one another. If fcntl(F_GETLK, ...) returns a lock set
> > > with flock() then the l_pid member cannot be a process ID because the
> > > lock is not held by a process as such.
> > >
> > > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > > ---
> > >  fs/locks.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/fs/locks.c b/fs/locks.c
> > > index e163a30..5372ddd 100644
> > > --- a/fs/locks.c
> > > +++ b/fs/locks.c
> > > @@ -1899,7 +1899,7 @@ EXPORT_SYMBOL_GPL(vfs_test_lock);
> > >
> > >  static int posix_lock_to_flock(struct flock *flock, struct file_lock
> > > *fl)  {
> > > -	flock->l_pid = fl->fl_pid;
> > > +	flock->l_pid = IS_FILE_PVT(fl) ? -1 : fl->fl_pid;
> > >  #if BITS_PER_LONG == 32
> > >  	/*
> > >  	 * Make sure we can represent the posix lock via @@ -1921,7 +1921,7
> > > @@ static int posix_lock_to_flock(struct flock *flock, struct
> > > file_lock *fl)  #if BITS_PER_LONG == 32  static void
> > > posix_lock_to_flock64(struct flock64 *flock, struct file_lock *fl)  {
> > > -	flock->l_pid = fl->fl_pid;
> > > +	flock->l_pid = IS_FILE_PVT(fl) ? -1 : fl->fl_pid;
> > >  	flock->l_start = fl->fl_start;
> > >  	flock->l_len = fl->fl_end == OFFSET_MAX ? 0 :
> > >  		fl->fl_end - fl->fl_start + 1;
> > 
> > While I think this behavior is reasonable, I do wonder if we ought to
> consider
> > more changes to how F_GETLK works. Currently the F_GETLK code won't
> > handle the new l_type values, but maybe it should...
> > 
> > For instance, if there is a conflicting lock, and the F_GETLK caller
> specified
> > F_RDLCKP or F_WRLCKP, might it make sense to report the l_type on the
> > conflicting lock as F_RDLCKP or F_WRLCKP if that conflicting lock is also
> a *P
> > type?
> > 
> > ...or maybe we should consider a new F_GETLKP cmd value, and a new
> > expanded struct flock that gives more info. The pid is already somewhat
> > meaningless with this sort of lock. Perhaps we could obfuscate the
> fl_owner
> > value and report that instead? What other sorts of info would be useful to
> > programs that intend to use these new interfaces?
> 
> I always wonder if anyone actually uses the conflicting lock information...
> 
> I think returning the lock type as F_RDLCK or F_WRLCK is fine, but maybe an
> extended F_GETLKP could report the extended type. For Ganesha, we wouldn't
> use the extended type since the NFS protocol has no concept of separate
> private and non-private locks (all NFS locks are "private" to the specified
> owner in the request). The pid is also not that useful to Ganesha.
> 

Yeah, the value of that has always been a little questionable to me
too...

FWIW, we don't actually need a new cmd value to get that. What we
could do is simply report the new lock types in there iff the F_GETLK
request had a F_RDLCKP or F_WRLCKP l_type value. If a classic l_type
value was specified though, then I think we'd need to report only
classic l_type values for the conflicting lock.

That said, since the utility of that isn't completely clear, it's
probably best not to try to do anything too fancy here. We could always
add that later if we need it.

Thanks...
-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines
  2013-12-10 19:17 ` [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines Jeff Layton
@ 2013-12-10 21:22   ` J. Bruce Fields
  2013-12-10 23:22     ` J. Bruce Fields
  0 siblings, 1 reply; 29+ messages in thread
From: J. Bruce Fields @ 2013-12-10 21:22 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, nfs-ganesha-devel, samba-technical, linux-kernel

On Tue, Dec 10, 2013 at 02:17:30PM -0500, Jeff Layton wrote:
> Currently, there's a lot of copy and paste between the two.  Add some
> functions to do the initialization of the file_lock from values
> passed in, and turn the flock/flock64 variants of those functions into
> wrappers around them.
> 
> Unfortunately it's harder to consolidate the fl_start/fl_end
> calculations due to the differently sized types involved so I've left
> them separate for now.

I'd think you could assign everything to the flock64 type and do the
common work there or something.

But I'm confused about what the current code is actually trying to do:
if I'm chasing down the definitions right, these quantities are all
signed, and when start is defined as an off_t it can overflow in the
SEEK_CUR and SEEK_END cases.  And

	if (fl->fl_end < fl->fl_start)
		return -EOVERFLOW

is counting on overlow wrapping around, which I thought wasn't
guaranteed in the case of signed arithmetic?

--b.


> 
> Signed-off-by: Jeff Layton <jlayton@redhat.com>
> ---
>  fs/locks.c | 31 +++++++++++++++----------------
>  1 file changed, 15 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index 6084f5a..a5848ed 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -344,6 +344,19 @@ static int assign_type(struct file_lock *fl, long type)
>  	return 0;
>  }
>  
> +static int
> +flock_to_posix_lock_common(struct file_lock *fl, struct file *filp, short type)
> +{
> +	fl->fl_owner = current->files;
> +	fl->fl_pid = current->tgid;
> +	fl->fl_file = filp;
> +	fl->fl_flags = FL_POSIX;
> +	fl->fl_ops = NULL;
> +	fl->fl_lmops = NULL;
> +
> +	return assign_type(fl, type);
> +}
> +
>  /* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
>   * style lock.
>   */
> @@ -386,14 +399,7 @@ static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
>  	if (fl->fl_end < fl->fl_start)
>  		return -EOVERFLOW;
>  	
> -	fl->fl_owner = current->files;
> -	fl->fl_pid = current->tgid;
> -	fl->fl_file = filp;
> -	fl->fl_flags = FL_POSIX;
> -	fl->fl_ops = NULL;
> -	fl->fl_lmops = NULL;
> -
> -	return assign_type(fl, l->l_type);
> +	return flock_to_posix_lock_common(fl, filp, l->l_type);
>  }
>  
>  #if BITS_PER_LONG == 32
> @@ -431,15 +437,8 @@ static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
>  	fl->fl_start = start;	/* we record the absolute position */
>  	if (fl->fl_end < fl->fl_start)
>  		return -EOVERFLOW;
> -	
> -	fl->fl_owner = current->files;
> -	fl->fl_pid = current->tgid;
> -	fl->fl_file = filp;
> -	fl->fl_flags = FL_POSIX;
> -	fl->fl_ops = NULL;
> -	fl->fl_lmops = NULL;
>  
> -	return assign_type(fl, l->l_type);
> +	return flock_to_posix_lock_common(fl, filp, l->l_type);
>  }
>  #endif
>  
> -- 
> 1.8.4.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines
  2013-12-10 21:22   ` J. Bruce Fields
@ 2013-12-10 23:22     ` J. Bruce Fields
  2013-12-11 11:18       ` Jeff Layton
  0 siblings, 1 reply; 29+ messages in thread
From: J. Bruce Fields @ 2013-12-10 23:22 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, linux-kernel, nfs-ganesha-devel, samba-technical

On Tue, Dec 10, 2013 at 04:22:53PM -0500, bfields wrote:
> On Tue, Dec 10, 2013 at 02:17:30PM -0500, Jeff Layton wrote:
> > Currently, there's a lot of copy and paste between the two.  Add some
> > functions to do the initialization of the file_lock from values
> > passed in, and turn the flock/flock64 variants of those functions into
> > wrappers around them.
> > 
> > Unfortunately it's harder to consolidate the fl_start/fl_end
> > calculations due to the differently sized types involved so I've left
> > them separate for now.
> 
> I'd think you could assign everything to the flock64 type and do the
> common work there or something.
> 
> But I'm confused about what the current code is actually trying to do:
> if I'm chasing down the definitions right, these quantities are all
> signed, and when start is defined as an off_t it can overflow in the
> SEEK_CUR and SEEK_END cases.  And
> 
> 	if (fl->fl_end < fl->fl_start)
> 		return -EOVERFLOW
> 
> is counting on overlow wrapping around, which I thought wasn't
> guaranteed in the case of signed arithmetic?

E.g. the following (untested) removes the duplication and should return
-EOVERFLOW in the cases we currently do a random conversion from 64- to
32-bit and back.  Susv3 says:

	[EOVERFLOW]
	The cmd argument is F_GETLK, F_SETLK, or F_SETLKW and the
	smallest or, if l_len is non-zero, the largest offset of any
	byte in the requested segment cannot be represented correctly in
	an object of type off_t.

so that's what I tried to do.

--b.

diff --git a/fs/locks.c b/fs/locks.c
index 92a0f0a..47832f5 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -344,48 +344,41 @@ static int assign_type(struct file_lock *fl, long type)
 	return 0;
 }
 
-/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
- * style lock.
- */
-static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
-			       struct flock *l)
+static int flock_to_posix_lock_common(struct file *filp, struct file_lock *fl,
+					struct flock64 *l, loff_t offset_max)
 {
-	off_t start, end;
+	loff_t start;
 
 	switch (l->l_whence) {
 	case SEEK_SET:
-		start = 0;
-		break;
+		fl->fl_start = 0;
 	case SEEK_CUR:
-		start = filp->f_pos;
-		break;
+		fl->fl_start = filp->f_pos;
 	case SEEK_END:
-		start = i_size_read(file_inode(filp));
-		break;
+		fl->fl_start = i_size_read(file_inode(filp));
 	default:
 		return -EINVAL;
 	}
+	if (l->l_start < 0)
+		return -EINVAL;
+	if (l->l_start > offset_max - fl->fl_start)
+		return -EOVERFLOW;
+	fl->fl_start += l->l_start;
+	if (l->l_len > offset_max - fl->fl_start)
+		return -EOVERFLOW;
+	if (fl->fl_start + l->l_len < 0)
+		return -EINVAL;
 
 	/* POSIX-1996 leaves the case l->l_len < 0 undefined;
 	   POSIX-2001 defines it. */
-	start += l->l_start;
-	if (start < 0)
-		return -EINVAL;
-	fl->fl_end = OFFSET_MAX;
-	if (l->l_len > 0) {
-		end = start + l->l_len - 1;
-		fl->fl_end = end;
-	} else if (l->l_len < 0) {
-		end = start - 1;
-		fl->fl_end = end;
-		start += l->l_len;
-		if (start < 0)
-			return -EINVAL;
-	}
-	fl->fl_start = start;	/* we record the absolute position */
-	if (fl->fl_end < fl->fl_start)
-		return -EOVERFLOW;
-	
+	if (l->l_len > 0)
+		fl->fl_end = fl->fl_start + l->l_len - 1;
+	else if (l->l_len < 0) {
+		fl->fl_end = start - 1;
+		fl->fl_start += l->l_len;
+	} else
+		fl->fl_end = OFFSET_MAX;
+
 	fl->fl_owner = current->files;
 	fl->fl_pid = current->tgid;
 	fl->fl_file = filp;
@@ -396,50 +389,27 @@ static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
 	return assign_type(fl, l->l_type);
 }
 
+/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
+ * style lock.
+ */
+static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
+			       struct flock *l)
+{
+	struct flock64 ll = {
+		.l_type = l->l_type,
+		.l_whence = l->l_whence,
+		.l_start = l->l_start,
+		.l_len = l->l_len,
+	};
+	
+	return flock_to_posix_lock_common(filp, fl, &ll, OFFT_OFFSET_MAX);
+}
+
 #if BITS_PER_LONG == 32
 static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
 				 struct flock64 *l)
 {
-	loff_t start;
-
-	switch (l->l_whence) {
-	case SEEK_SET:
-		start = 0;
-		break;
-	case SEEK_CUR:
-		start = filp->f_pos;
-		break;
-	case SEEK_END:
-		start = i_size_read(file_inode(filp));
-		break;
-	default:
-		return -EINVAL;
-	}
-
-	start += l->l_start;
-	if (start < 0)
-		return -EINVAL;
-	fl->fl_end = OFFSET_MAX;
-	if (l->l_len > 0) {
-		fl->fl_end = start + l->l_len - 1;
-	} else if (l->l_len < 0) {
-		fl->fl_end = start - 1;
-		start += l->l_len;
-		if (start < 0)
-			return -EINVAL;
-	}
-	fl->fl_start = start;	/* we record the absolute position */
-	if (fl->fl_end < fl->fl_start)
-		return -EOVERFLOW;
-	
-	fl->fl_owner = current->files;
-	fl->fl_pid = current->tgid;
-	fl->fl_file = filp;
-	fl->fl_flags = FL_POSIX;
-	fl->fl_ops = NULL;
-	fl->fl_lmops = NULL;
-
-	return assign_type(fl, l->l_type);
+	return flock_to_posix_lock_common(filp, fl, l, OFFSET_MAX);
 }
 #endif
 
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index 95e46c8..36025f7 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -186,8 +186,6 @@ struct flock {
 };
 #endif
 
-#ifndef CONFIG_64BIT
-
 #ifndef HAVE_ARCH_STRUCT_FLOCK64
 #ifndef __ARCH_FLOCK64_PAD
 #define __ARCH_FLOCK64_PAD
@@ -202,6 +200,5 @@ struct flock64 {
 	__ARCH_FLOCK64_PAD
 };
 #endif
-#endif /* !CONFIG_64BIT */
 
 #endif /* _ASM_GENERIC_FCNTL_H */

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines
  2013-12-10 23:22     ` J. Bruce Fields
@ 2013-12-11 11:18       ` Jeff Layton
  2013-12-11 14:37         ` J. Bruce Fields
  0 siblings, 1 reply; 29+ messages in thread
From: Jeff Layton @ 2013-12-11 11:18 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: linux-fsdevel, nfs-ganesha-devel, samba-technical, linux-kernel

On Tue, 10 Dec 2013 18:22:04 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Tue, Dec 10, 2013 at 04:22:53PM -0500, bfields wrote:
> > On Tue, Dec 10, 2013 at 02:17:30PM -0500, Jeff Layton wrote:
> > > Currently, there's a lot of copy and paste between the two.  Add some
> > > functions to do the initialization of the file_lock from values
> > > passed in, and turn the flock/flock64 variants of those functions into
> > > wrappers around them.
> > > 
> > > Unfortunately it's harder to consolidate the fl_start/fl_end
> > > calculations due to the differently sized types involved so I've left
> > > them separate for now.
> > 
> > I'd think you could assign everything to the flock64 type and do the
> > common work there or something.
> > 
> > But I'm confused about what the current code is actually trying to do:
> > if I'm chasing down the definitions right, these quantities are all
> > signed, and when start is defined as an off_t it can overflow in the
> > SEEK_CUR and SEEK_END cases.  And
> > 
> > 	if (fl->fl_end < fl->fl_start)
> > 		return -EOVERFLOW
> > 
> > is counting on overlow wrapping around, which I thought wasn't
> > guaranteed in the case of signed arithmetic?
> 
> E.g. the following (untested) removes the duplication and should return
> -EOVERFLOW in the cases we currently do a random conversion from 64- to
> 32-bit and back.  Susv3 says:
> 
> 	[EOVERFLOW]
> 	The cmd argument is F_GETLK, F_SETLK, or F_SETLKW and the
> 	smallest or, if l_len is non-zero, the largest offset of any
> 	byte in the requested segment cannot be represented correctly in
> 	an object of type off_t.
> 
> so that's what I tried to do.
> 
> --b.
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index 92a0f0a..47832f5 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -344,48 +344,41 @@ static int assign_type(struct file_lock *fl, long type)
>  	return 0;
>  }
>  
> -/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
> - * style lock.
> - */
> -static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
> -			       struct flock *l)
> +static int flock_to_posix_lock_common(struct file *filp, struct file_lock *fl,
> +					struct flock64 *l, loff_t offset_max)
>  {
> -	off_t start, end;
> +	loff_t start;
>  
>  	switch (l->l_whence) {
>  	case SEEK_SET:
> -		start = 0;
> -		break;
> +		fl->fl_start = 0;
>  	case SEEK_CUR:
> -		start = filp->f_pos;
> -		break;
> +		fl->fl_start = filp->f_pos;
>  	case SEEK_END:
> -		start = i_size_read(file_inode(filp));
> -		break;
> +		fl->fl_start = i_size_read(file_inode(filp));
>  	default:
>  		return -EINVAL;
>  	}
> +	if (l->l_start < 0)
> +		return -EINVAL;
> +	if (l->l_start > offset_max - fl->fl_start)
> +		return -EOVERFLOW;
> +	fl->fl_start += l->l_start;
> +	if (l->l_len > offset_max - fl->fl_start)
> +		return -EOVERFLOW;
> +	if (fl->fl_start + l->l_len < 0)
> +		return -EINVAL;
>  
>  	/* POSIX-1996 leaves the case l->l_len < 0 undefined;
>  	   POSIX-2001 defines it. */
> -	start += l->l_start;
> -	if (start < 0)
> -		return -EINVAL;
> -	fl->fl_end = OFFSET_MAX;
> -	if (l->l_len > 0) {
> -		end = start + l->l_len - 1;
> -		fl->fl_end = end;
> -	} else if (l->l_len < 0) {
> -		end = start - 1;
> -		fl->fl_end = end;
> -		start += l->l_len;
> -		if (start < 0)
> -			return -EINVAL;
> -	}
> -	fl->fl_start = start;	/* we record the absolute position */
> -	if (fl->fl_end < fl->fl_start)
> -		return -EOVERFLOW;
> -	
> +	if (l->l_len > 0)
> +		fl->fl_end = fl->fl_start + l->l_len - 1;
> +	else if (l->l_len < 0) {
> +		fl->fl_end = start - 1;
> +		fl->fl_start += l->l_len;
> +	} else
> +		fl->fl_end = OFFSET_MAX;
> +
>  	fl->fl_owner = current->files;
>  	fl->fl_pid = current->tgid;
>  	fl->fl_file = filp;
> @@ -396,50 +389,27 @@ static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
>  	return assign_type(fl, l->l_type);
>  }
>  
> +/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
> + * style lock.
> + */
> +static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
> +			       struct flock *l)
> +{
> +	struct flock64 ll = {
> +		.l_type = l->l_type,
> +		.l_whence = l->l_whence,
> +		.l_start = l->l_start,
> +		.l_len = l->l_len,
> +	};
> +	
> +	return flock_to_posix_lock_common(filp, fl, &ll, OFFT_OFFSET_MAX);
> +}
> +
>  #if BITS_PER_LONG == 32
>  static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
>  				 struct flock64 *l)
>  {
> -	loff_t start;
> -
> -	switch (l->l_whence) {
> -	case SEEK_SET:
> -		start = 0;
> -		break;
> -	case SEEK_CUR:
> -		start = filp->f_pos;
> -		break;
> -	case SEEK_END:
> -		start = i_size_read(file_inode(filp));
> -		break;
> -	default:
> -		return -EINVAL;
> -	}
> -
> -	start += l->l_start;
> -	if (start < 0)
> -		return -EINVAL;
> -	fl->fl_end = OFFSET_MAX;
> -	if (l->l_len > 0) {
> -		fl->fl_end = start + l->l_len - 1;
> -	} else if (l->l_len < 0) {
> -		fl->fl_end = start - 1;
> -		start += l->l_len;
> -		if (start < 0)
> -			return -EINVAL;
> -	}
> -	fl->fl_start = start;	/* we record the absolute position */
> -	if (fl->fl_end < fl->fl_start)
> -		return -EOVERFLOW;
> -	
> -	fl->fl_owner = current->files;
> -	fl->fl_pid = current->tgid;
> -	fl->fl_file = filp;
> -	fl->fl_flags = FL_POSIX;
> -	fl->fl_ops = NULL;
> -	fl->fl_lmops = NULL;
> -
> -	return assign_type(fl, l->l_type);
> +	return flock_to_posix_lock_common(filp, fl, l, OFFSET_MAX);
>  }
>  #endif
>  
> diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
> index 95e46c8..36025f7 100644
> --- a/include/uapi/asm-generic/fcntl.h
> +++ b/include/uapi/asm-generic/fcntl.h
> @@ -186,8 +186,6 @@ struct flock {
>  };
>  #endif
>  
> -#ifndef CONFIG_64BIT
> -
>  #ifndef HAVE_ARCH_STRUCT_FLOCK64
>  #ifndef __ARCH_FLOCK64_PAD
>  #define __ARCH_FLOCK64_PAD
> @@ -202,6 +200,5 @@ struct flock64 {
>  	__ARCH_FLOCK64_PAD
>  };
>  #endif
> -#endif /* !CONFIG_64BIT */
>  
>  #endif /* _ASM_GENERIC_FCNTL_H */

Nice. I had started to consolidate them, but then figured out that
there are so many edge cases and I didn't have a good way to test them
all.

As far as I can tell though, this looks correct. I'll plan to drop my
patch and base the rest of the set on top of yours.

Thanks,
-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines
  2013-12-11 11:18       ` Jeff Layton
@ 2013-12-11 14:37         ` J. Bruce Fields
  2013-12-11 15:19           ` J. Bruce Fields
  0 siblings, 1 reply; 29+ messages in thread
From: J. Bruce Fields @ 2013-12-11 14:37 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, linux-kernel, nfs-ganesha-devel, samba-technical

On Wed, Dec 11, 2013 at 06:18:56AM -0500, Jeff Layton wrote:
> On Tue, 10 Dec 2013 18:22:04 -0500
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
> 
> > On Tue, Dec 10, 2013 at 04:22:53PM -0500, bfields wrote:
> > > On Tue, Dec 10, 2013 at 02:17:30PM -0500, Jeff Layton wrote:
> > > > Currently, there's a lot of copy and paste between the two.  Add some
> > > > functions to do the initialization of the file_lock from values
> > > > passed in, and turn the flock/flock64 variants of those functions into
> > > > wrappers around them.
> > > > 
> > > > Unfortunately it's harder to consolidate the fl_start/fl_end
> > > > calculations due to the differently sized types involved so I've left
> > > > them separate for now.
> > > 
> > > I'd think you could assign everything to the flock64 type and do the
> > > common work there or something.
> > > 
> > > But I'm confused about what the current code is actually trying to do:
> > > if I'm chasing down the definitions right, these quantities are all
> > > signed, and when start is defined as an off_t it can overflow in the
> > > SEEK_CUR and SEEK_END cases.  And
> > > 
> > > 	if (fl->fl_end < fl->fl_start)
> > > 		return -EOVERFLOW
> > > 
> > > is counting on overlow wrapping around, which I thought wasn't
> > > guaranteed in the case of signed arithmetic?
> > 
> > E.g. the following (untested) removes the duplication and should return
> > -EOVERFLOW in the cases we currently do a random conversion from 64- to
> > 32-bit and back.  Susv3 says:
> > 
> > 	[EOVERFLOW]
> > 	The cmd argument is F_GETLK, F_SETLK, or F_SETLKW and the
> > 	smallest or, if l_len is non-zero, the largest offset of any
> > 	byte in the requested segment cannot be represented correctly in
> > 	an object of type off_t.
> > 
> > so that's what I tried to do.
> > 
> > --b.
> > 
> > diff --git a/fs/locks.c b/fs/locks.c
> > index 92a0f0a..47832f5 100644
> > --- a/fs/locks.c
> > +++ b/fs/locks.c
> > @@ -344,48 +344,41 @@ static int assign_type(struct file_lock *fl, long type)
> >  	return 0;
> >  }
> >  
> > -/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
> > - * style lock.
> > - */
> > -static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
> > -			       struct flock *l)
> > +static int flock_to_posix_lock_common(struct file *filp, struct file_lock *fl,
> > +					struct flock64 *l, loff_t offset_max)
> >  {
> > -	off_t start, end;
> > +	loff_t start;
> >  
> >  	switch (l->l_whence) {
> >  	case SEEK_SET:
> > -		start = 0;
> > -		break;
> > +		fl->fl_start = 0;
> >  	case SEEK_CUR:
> > -		start = filp->f_pos;
> > -		break;
> > +		fl->fl_start = filp->f_pos;
> >  	case SEEK_END:
> > -		start = i_size_read(file_inode(filp));
> > -		break;
> > +		fl->fl_start = i_size_read(file_inode(filp));
> >  	default:
> >  		return -EINVAL;
> >  	}
> > +	if (l->l_start < 0)
> > +		return -EINVAL;
> > +	if (l->l_start > offset_max - fl->fl_start)
> > +		return -EOVERFLOW;
> > +	fl->fl_start += l->l_start;
> > +	if (l->l_len > offset_max - fl->fl_start)
> > +		return -EOVERFLOW;
> > +	if (fl->fl_start + l->l_len < 0)
> > +		return -EINVAL;
> >  
> >  	/* POSIX-1996 leaves the case l->l_len < 0 undefined;
> >  	   POSIX-2001 defines it. */
> > -	start += l->l_start;
> > -	if (start < 0)
> > -		return -EINVAL;
> > -	fl->fl_end = OFFSET_MAX;
> > -	if (l->l_len > 0) {
> > -		end = start + l->l_len - 1;
> > -		fl->fl_end = end;
> > -	} else if (l->l_len < 0) {
> > -		end = start - 1;
> > -		fl->fl_end = end;
> > -		start += l->l_len;
> > -		if (start < 0)
> > -			return -EINVAL;
> > -	}
> > -	fl->fl_start = start;	/* we record the absolute position */
> > -	if (fl->fl_end < fl->fl_start)
> > -		return -EOVERFLOW;
> > -	
> > +	if (l->l_len > 0)
> > +		fl->fl_end = fl->fl_start + l->l_len - 1;
> > +	else if (l->l_len < 0) {
> > +		fl->fl_end = start - 1;
> > +		fl->fl_start += l->l_len;
> > +	} else
> > +		fl->fl_end = OFFSET_MAX;
> > +
> >  	fl->fl_owner = current->files;
> >  	fl->fl_pid = current->tgid;
> >  	fl->fl_file = filp;
> > @@ -396,50 +389,27 @@ static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
> >  	return assign_type(fl, l->l_type);
> >  }
> >  
> > +/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
> > + * style lock.
> > + */
> > +static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
> > +			       struct flock *l)
> > +{
> > +	struct flock64 ll = {
> > +		.l_type = l->l_type,
> > +		.l_whence = l->l_whence,
> > +		.l_start = l->l_start,
> > +		.l_len = l->l_len,
> > +	};
> > +	
> > +	return flock_to_posix_lock_common(filp, fl, &ll, OFFT_OFFSET_MAX);
> > +}
> > +
> >  #if BITS_PER_LONG == 32
> >  static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
> >  				 struct flock64 *l)
> >  {
> > -	loff_t start;
> > -
> > -	switch (l->l_whence) {
> > -	case SEEK_SET:
> > -		start = 0;
> > -		break;
> > -	case SEEK_CUR:
> > -		start = filp->f_pos;
> > -		break;
> > -	case SEEK_END:
> > -		start = i_size_read(file_inode(filp));
> > -		break;
> > -	default:
> > -		return -EINVAL;
> > -	}
> > -
> > -	start += l->l_start;
> > -	if (start < 0)
> > -		return -EINVAL;
> > -	fl->fl_end = OFFSET_MAX;
> > -	if (l->l_len > 0) {
> > -		fl->fl_end = start + l->l_len - 1;
> > -	} else if (l->l_len < 0) {
> > -		fl->fl_end = start - 1;
> > -		start += l->l_len;
> > -		if (start < 0)
> > -			return -EINVAL;
> > -	}
> > -	fl->fl_start = start;	/* we record the absolute position */
> > -	if (fl->fl_end < fl->fl_start)
> > -		return -EOVERFLOW;
> > -	
> > -	fl->fl_owner = current->files;
> > -	fl->fl_pid = current->tgid;
> > -	fl->fl_file = filp;
> > -	fl->fl_flags = FL_POSIX;
> > -	fl->fl_ops = NULL;
> > -	fl->fl_lmops = NULL;
> > -
> > -	return assign_type(fl, l->l_type);
> > +	return flock_to_posix_lock_common(filp, fl, l, OFFSET_MAX);
> >  }
> >  #endif
> >  
> > diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
> > index 95e46c8..36025f7 100644
> > --- a/include/uapi/asm-generic/fcntl.h
> > +++ b/include/uapi/asm-generic/fcntl.h
> > @@ -186,8 +186,6 @@ struct flock {
> >  };
> >  #endif
> >  
> > -#ifndef CONFIG_64BIT
> > -
> >  #ifndef HAVE_ARCH_STRUCT_FLOCK64
> >  #ifndef __ARCH_FLOCK64_PAD
> >  #define __ARCH_FLOCK64_PAD
> > @@ -202,6 +200,5 @@ struct flock64 {
> >  	__ARCH_FLOCK64_PAD
> >  };
> >  #endif
> > -#endif /* !CONFIG_64BIT */
> >  
> >  #endif /* _ASM_GENERIC_FCNTL_H */
> 
> Nice. I had started to consolidate them, but then figured out that
> there are so many edge cases and I didn't have a good way to test them
> all.
> 
> As far as I can tell though, this looks correct. I'll plan to drop my
> patch and base the rest of the set on top of yours.

Well, it'd be weird if I didn't screw up something somewhere.  If we've
both looked at it and not found anything then maybe it's OK.  But I'll
see if I can do some basic tests for the edge cases.

--b.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines
  2013-12-11 14:37         ` J. Bruce Fields
@ 2013-12-11 15:19           ` J. Bruce Fields
  2013-12-11 16:54             ` Jeff Layton
  2013-12-11 19:07             ` Jeff Layton
  0 siblings, 2 replies; 29+ messages in thread
From: J. Bruce Fields @ 2013-12-11 15:19 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, linux-kernel, nfs-ganesha-devel, samba-technical

On Wed, Dec 11, 2013 at 09:37:24AM -0500, J. Bruce Fields wrote:
> Well, it'd be weird if I didn't screw up something somewhere.

Yes, a classic: I forgot the breaks after each switch case.

Here's a version that at least doesn't return -EINVAL on every lock
attempt.

--b.

commit 70b7b9a442d06a71306736eb01cedba5dd6d86cf
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Tue Dec 10 18:14:28 2013 -0500

    locks: fix posix lock range overflow handling
    
    In the 32-bit case fcntl is converting signed 64-bit to signed 32-bit
    quantities in a couple places, with probably incorrect results.
    
    So instead let's return -EOVERFLOW as described in SUSv3, whenever "the
    smallest or, if l_len is non-zero, the largest offset of any byte in the
    requested segment cannot be represented correctly in an object of type
    off_t."
    
    While we're here, do some cleanup including consolidating code for the
    flock and flock64 cases.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>

diff --git a/fs/locks.c b/fs/locks.c
index 92a0f0a..b70486a 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -344,48 +344,44 @@ static int assign_type(struct file_lock *fl, long type)
 	return 0;
 }
 
-/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
- * style lock.
- */
-static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
-			       struct flock *l)
+static int flock_to_posix_lock_common(struct file *filp, struct file_lock *fl,
+					struct flock64 *l, loff_t offset_max)
 {
-	off_t start, end;
+	loff_t start;
 
 	switch (l->l_whence) {
 	case SEEK_SET:
-		start = 0;
+		fl->fl_start = 0;
 		break;
 	case SEEK_CUR:
-		start = filp->f_pos;
+		fl->fl_start = filp->f_pos;
 		break;
 	case SEEK_END:
-		start = i_size_read(file_inode(filp));
+		fl->fl_start = i_size_read(file_inode(filp));
 		break;
 	default:
 		return -EINVAL;
 	}
+	if (l->l_start < 0)
+		return -EINVAL;
+	if (l->l_start > offset_max - fl->fl_start)
+		return -EOVERFLOW;
+	fl->fl_start += l->l_start;
+	if (l->l_len > offset_max - fl->fl_start)
+		return -EOVERFLOW;
+	if (fl->fl_start + l->l_len < 0)
+		return -EINVAL;
 
 	/* POSIX-1996 leaves the case l->l_len < 0 undefined;
 	   POSIX-2001 defines it. */
-	start += l->l_start;
-	if (start < 0)
-		return -EINVAL;
-	fl->fl_end = OFFSET_MAX;
-	if (l->l_len > 0) {
-		end = start + l->l_len - 1;
-		fl->fl_end = end;
-	} else if (l->l_len < 0) {
-		end = start - 1;
-		fl->fl_end = end;
-		start += l->l_len;
-		if (start < 0)
-			return -EINVAL;
-	}
-	fl->fl_start = start;	/* we record the absolute position */
-	if (fl->fl_end < fl->fl_start)
-		return -EOVERFLOW;
-	
+	if (l->l_len > 0)
+		fl->fl_end = fl->fl_start + l->l_len - 1;
+	else if (l->l_len < 0) {
+		fl->fl_end = start - 1;
+		fl->fl_start += l->l_len;
+	} else
+		fl->fl_end = OFFSET_MAX;
+
 	fl->fl_owner = current->files;
 	fl->fl_pid = current->tgid;
 	fl->fl_file = filp;
@@ -396,50 +392,27 @@ static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
 	return assign_type(fl, l->l_type);
 }
 
+/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
+ * style lock.
+ */
+static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
+			       struct flock *l)
+{
+	struct flock64 ll = {
+		.l_type = l->l_type,
+		.l_whence = l->l_whence,
+		.l_start = l->l_start,
+		.l_len = l->l_len,
+	};
+	
+	return flock_to_posix_lock_common(filp, fl, &ll, OFFT_OFFSET_MAX);
+}
+
 #if BITS_PER_LONG == 32
 static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
 				 struct flock64 *l)
 {
-	loff_t start;
-
-	switch (l->l_whence) {
-	case SEEK_SET:
-		start = 0;
-		break;
-	case SEEK_CUR:
-		start = filp->f_pos;
-		break;
-	case SEEK_END:
-		start = i_size_read(file_inode(filp));
-		break;
-	default:
-		return -EINVAL;
-	}
-
-	start += l->l_start;
-	if (start < 0)
-		return -EINVAL;
-	fl->fl_end = OFFSET_MAX;
-	if (l->l_len > 0) {
-		fl->fl_end = start + l->l_len - 1;
-	} else if (l->l_len < 0) {
-		fl->fl_end = start - 1;
-		start += l->l_len;
-		if (start < 0)
-			return -EINVAL;
-	}
-	fl->fl_start = start;	/* we record the absolute position */
-	if (fl->fl_end < fl->fl_start)
-		return -EOVERFLOW;
-	
-	fl->fl_owner = current->files;
-	fl->fl_pid = current->tgid;
-	fl->fl_file = filp;
-	fl->fl_flags = FL_POSIX;
-	fl->fl_ops = NULL;
-	fl->fl_lmops = NULL;
-
-	return assign_type(fl, l->l_type);
+	return flock_to_posix_lock_common(filp, fl, l, OFFSET_MAX);
 }
 #endif
 
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index 95e46c8..36025f7 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -186,8 +186,6 @@ struct flock {
 };
 #endif
 
-#ifndef CONFIG_64BIT
-
 #ifndef HAVE_ARCH_STRUCT_FLOCK64
 #ifndef __ARCH_FLOCK64_PAD
 #define __ARCH_FLOCK64_PAD
@@ -202,6 +200,5 @@ struct flock64 {
 	__ARCH_FLOCK64_PAD
 };
 #endif
-#endif /* !CONFIG_64BIT */
 
 #endif /* _ASM_GENERIC_FCNTL_H */

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines
  2013-12-11 15:19           ` J. Bruce Fields
@ 2013-12-11 16:54             ` Jeff Layton
  2013-12-11 16:59               ` J. Bruce Fields
  2013-12-11 19:07             ` Jeff Layton
  1 sibling, 1 reply; 29+ messages in thread
From: Jeff Layton @ 2013-12-11 16:54 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: linux-fsdevel, nfs-ganesha-devel, samba-technical, linux-kernel

On Wed, 11 Dec 2013 10:19:31 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Wed, Dec 11, 2013 at 09:37:24AM -0500, J. Bruce Fields wrote:
> > Well, it'd be weird if I didn't screw up something somewhere.
> 
> Yes, a classic: I forgot the breaks after each switch case.
> 
> Here's a version that at least doesn't return -EINVAL on every lock
> attempt.
> 
> --b.
> 
> commit 70b7b9a442d06a71306736eb01cedba5dd6d86cf
> Author: J. Bruce Fields <bfields@redhat.com>
> Date:   Tue Dec 10 18:14:28 2013 -0500
> 
>     locks: fix posix lock range overflow handling
>     
>     In the 32-bit case fcntl is converting signed 64-bit to signed 32-bit
>     quantities in a couple places, with probably incorrect results.
>     
>     So instead let's return -EOVERFLOW as described in SUSv3, whenever "the
>     smallest or, if l_len is non-zero, the largest offset of any byte in the
>     requested segment cannot be represented correctly in an object of type
>     off_t."
>     
>     While we're here, do some cleanup including consolidating code for the
>     flock and flock64 cases.
>     
>     Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index 92a0f0a..b70486a 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -344,48 +344,44 @@ static int assign_type(struct file_lock *fl, long type)
>  	return 0;
>  }
>  
> -/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
> - * style lock.
> - */
> -static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
> -			       struct flock *l)
> +static int flock_to_posix_lock_common(struct file *filp, struct file_lock *fl,
> +					struct flock64 *l, loff_t offset_max)
>  {
> -	off_t start, end;
> +	loff_t start;
>  
>  	switch (l->l_whence) {
>  	case SEEK_SET:
> -		start = 0;
> +		fl->fl_start = 0;
>  		break;
>  	case SEEK_CUR:
> -		start = filp->f_pos;
> +		fl->fl_start = filp->f_pos;
>  		break;
>  	case SEEK_END:
> -		start = i_size_read(file_inode(filp));
> +		fl->fl_start = i_size_read(file_inode(filp));
>  		break;
>  	default:
>  		return -EINVAL;
>  	}
> +	if (l->l_start < 0)
> +		return -EINVAL;
> +	if (l->l_start > offset_max - fl->fl_start)
> +		return -EOVERFLOW;
> +	fl->fl_start += l->l_start;
> +	if (l->l_len > offset_max - fl->fl_start)
> +		return -EOVERFLOW;
> +	if (fl->fl_start + l->l_len < 0)
> +		return -EINVAL;
>  
>  	/* POSIX-1996 leaves the case l->l_len < 0 undefined;
>  	   POSIX-2001 defines it. */
> -	start += l->l_start;
> -	if (start < 0)
> -		return -EINVAL;
> -	fl->fl_end = OFFSET_MAX;
> -	if (l->l_len > 0) {
> -		end = start + l->l_len - 1;
> -		fl->fl_end = end;
> -	} else if (l->l_len < 0) {
> -		end = start - 1;
> -		fl->fl_end = end;
> -		start += l->l_len;
> -		if (start < 0)
> -			return -EINVAL;
> -	}
> -	fl->fl_start = start;	/* we record the absolute position */
> -	if (fl->fl_end < fl->fl_start)
> -		return -EOVERFLOW;
> -	
> +	if (l->l_len > 0)
> +		fl->fl_end = fl->fl_start + l->l_len - 1;
> +	else if (l->l_len < 0) {
> +		fl->fl_end = start - 1;
> +		fl->fl_start += l->l_len;
> +	} else
> +		fl->fl_end = OFFSET_MAX;
> +
>  	fl->fl_owner = current->files;
>  	fl->fl_pid = current->tgid;
>  	fl->fl_file = filp;
> @@ -396,50 +392,27 @@ static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
>  	return assign_type(fl, l->l_type);
>  }
>  
> +/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
> + * style lock.
> + */
> +static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
> +			       struct flock *l)
> +{
> +	struct flock64 ll = {
> +		.l_type = l->l_type,
> +		.l_whence = l->l_whence,
> +		.l_start = l->l_start,
> +		.l_len = l->l_len,
> +	};
> +	
> +	return flock_to_posix_lock_common(filp, fl, &ll, OFFT_OFFSET_MAX);
> +}
> +
>  #if BITS_PER_LONG == 32
>  static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
>  				 struct flock64 *l)
>  {
> -	loff_t start;
> -
> -	switch (l->l_whence) {
> -	case SEEK_SET:
> -		start = 0;
> -		break;
> -	case SEEK_CUR:
> -		start = filp->f_pos;
> -		break;
> -	case SEEK_END:
> -		start = i_size_read(file_inode(filp));
> -		break;
> -	default:
> -		return -EINVAL;
> -	}
> -
> -	start += l->l_start;
> -	if (start < 0)
> -		return -EINVAL;
> -	fl->fl_end = OFFSET_MAX;
> -	if (l->l_len > 0) {
> -		fl->fl_end = start + l->l_len - 1;
> -	} else if (l->l_len < 0) {
> -		fl->fl_end = start - 1;
> -		start += l->l_len;
> -		if (start < 0)
> -			return -EINVAL;
> -	}
> -	fl->fl_start = start;	/* we record the absolute position */
> -	if (fl->fl_end < fl->fl_start)
> -		return -EOVERFLOW;
> -	
> -	fl->fl_owner = current->files;
> -	fl->fl_pid = current->tgid;
> -	fl->fl_file = filp;
> -	fl->fl_flags = FL_POSIX;
> -	fl->fl_ops = NULL;
> -	fl->fl_lmops = NULL;
> -
> -	return assign_type(fl, l->l_type);
> +	return flock_to_posix_lock_common(filp, fl, l, OFFSET_MAX);
>  }
>  #endif
>  
> diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
> index 95e46c8..36025f7 100644
> --- a/include/uapi/asm-generic/fcntl.h
> +++ b/include/uapi/asm-generic/fcntl.h
> @@ -186,8 +186,6 @@ struct flock {
>  };
>  #endif
>  
> -#ifndef CONFIG_64BIT
> -
>  #ifndef HAVE_ARCH_STRUCT_FLOCK64
>  #ifndef __ARCH_FLOCK64_PAD
>  #define __ARCH_FLOCK64_PAD
> @@ -202,6 +200,5 @@ struct flock64 {
>  	__ARCH_FLOCK64_PAD
>  };
>  #endif
> -#endif /* !CONFIG_64BIT */
>  
>  #endif /* _ASM_GENERIC_FCNTL_H */

Thanks -- looks like this is hitting the same discrepancy that I was
getting with my consolidation attempt. Setting a lock like this on
x86_64 fails with the current kernel:

	struct flock lck1 = {
		.l_type		= F_WRLCK,
		.l_whence	= SEEK_SET,
		.l_start	= 0,
		.l_len		= LONG_MAX,
	};


...but succeeds with your patch in place. I've not sat down yet to
figure out whether you broke or fixed something though :).

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines
  2013-12-11 16:54             ` Jeff Layton
@ 2013-12-11 16:59               ` J. Bruce Fields
  2013-12-11 18:09                 ` Jeff Layton
  0 siblings, 1 reply; 29+ messages in thread
From: J. Bruce Fields @ 2013-12-11 16:59 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, linux-kernel, nfs-ganesha-devel, samba-technical

On Wed, Dec 11, 2013 at 11:54:33AM -0500, Jeff Layton wrote:
> On Wed, 11 Dec 2013 10:19:31 -0500
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
> 
> > On Wed, Dec 11, 2013 at 09:37:24AM -0500, J. Bruce Fields wrote:
> > > Well, it'd be weird if I didn't screw up something somewhere.
> > 
> > Yes, a classic: I forgot the breaks after each switch case.
> > 
> > Here's a version that at least doesn't return -EINVAL on every lock
> > attempt.
> > 
> > --b.
> > 
> > commit 70b7b9a442d06a71306736eb01cedba5dd6d86cf
> > Author: J. Bruce Fields <bfields@redhat.com>
> > Date:   Tue Dec 10 18:14:28 2013 -0500
> > 
> >     locks: fix posix lock range overflow handling
> >     
> >     In the 32-bit case fcntl is converting signed 64-bit to signed 32-bit
> >     quantities in a couple places, with probably incorrect results.
> >     
> >     So instead let's return -EOVERFLOW as described in SUSv3, whenever "the
> >     smallest or, if l_len is non-zero, the largest offset of any byte in the
> >     requested segment cannot be represented correctly in an object of type
> >     off_t."
> >     
> >     While we're here, do some cleanup including consolidating code for the
> >     flock and flock64 cases.
> >     
> >     Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > 
> > diff --git a/fs/locks.c b/fs/locks.c
> > index 92a0f0a..b70486a 100644
> > --- a/fs/locks.c
> > +++ b/fs/locks.c
> > @@ -344,48 +344,44 @@ static int assign_type(struct file_lock *fl, long type)
> >  	return 0;
> >  }
> >  
> > -/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
> > - * style lock.
> > - */
> > -static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
> > -			       struct flock *l)
> > +static int flock_to_posix_lock_common(struct file *filp, struct file_lock *fl,
> > +					struct flock64 *l, loff_t offset_max)
> >  {
> > -	off_t start, end;
> > +	loff_t start;
> >  
> >  	switch (l->l_whence) {
> >  	case SEEK_SET:
> > -		start = 0;
> > +		fl->fl_start = 0;
> >  		break;
> >  	case SEEK_CUR:
> > -		start = filp->f_pos;
> > +		fl->fl_start = filp->f_pos;
> >  		break;
> >  	case SEEK_END:
> > -		start = i_size_read(file_inode(filp));
> > +		fl->fl_start = i_size_read(file_inode(filp));
> >  		break;
> >  	default:
> >  		return -EINVAL;
> >  	}
> > +	if (l->l_start < 0)
> > +		return -EINVAL;
> > +	if (l->l_start > offset_max - fl->fl_start)
> > +		return -EOVERFLOW;
> > +	fl->fl_start += l->l_start;
> > +	if (l->l_len > offset_max - fl->fl_start)
> > +		return -EOVERFLOW;
> > +	if (fl->fl_start + l->l_len < 0)
> > +		return -EINVAL;
> >  
> >  	/* POSIX-1996 leaves the case l->l_len < 0 undefined;
> >  	   POSIX-2001 defines it. */
> > -	start += l->l_start;
> > -	if (start < 0)
> > -		return -EINVAL;
> > -	fl->fl_end = OFFSET_MAX;
> > -	if (l->l_len > 0) {
> > -		end = start + l->l_len - 1;
> > -		fl->fl_end = end;
> > -	} else if (l->l_len < 0) {
> > -		end = start - 1;
> > -		fl->fl_end = end;
> > -		start += l->l_len;
> > -		if (start < 0)
> > -			return -EINVAL;
> > -	}
> > -	fl->fl_start = start;	/* we record the absolute position */
> > -	if (fl->fl_end < fl->fl_start)
> > -		return -EOVERFLOW;
> > -	
> > +	if (l->l_len > 0)
> > +		fl->fl_end = fl->fl_start + l->l_len - 1;
> > +	else if (l->l_len < 0) {
> > +		fl->fl_end = start - 1;
> > +		fl->fl_start += l->l_len;
> > +	} else
> > +		fl->fl_end = OFFSET_MAX;
> > +
> >  	fl->fl_owner = current->files;
> >  	fl->fl_pid = current->tgid;
> >  	fl->fl_file = filp;
> > @@ -396,50 +392,27 @@ static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
> >  	return assign_type(fl, l->l_type);
> >  }
> >  
> > +/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
> > + * style lock.
> > + */
> > +static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
> > +			       struct flock *l)
> > +{
> > +	struct flock64 ll = {
> > +		.l_type = l->l_type,
> > +		.l_whence = l->l_whence,
> > +		.l_start = l->l_start,
> > +		.l_len = l->l_len,
> > +	};
> > +	
> > +	return flock_to_posix_lock_common(filp, fl, &ll, OFFT_OFFSET_MAX);
> > +}
> > +
> >  #if BITS_PER_LONG == 32
> >  static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
> >  				 struct flock64 *l)
> >  {
> > -	loff_t start;
> > -
> > -	switch (l->l_whence) {
> > -	case SEEK_SET:
> > -		start = 0;
> > -		break;
> > -	case SEEK_CUR:
> > -		start = filp->f_pos;
> > -		break;
> > -	case SEEK_END:
> > -		start = i_size_read(file_inode(filp));
> > -		break;
> > -	default:
> > -		return -EINVAL;
> > -	}
> > -
> > -	start += l->l_start;
> > -	if (start < 0)
> > -		return -EINVAL;
> > -	fl->fl_end = OFFSET_MAX;
> > -	if (l->l_len > 0) {
> > -		fl->fl_end = start + l->l_len - 1;
> > -	} else if (l->l_len < 0) {
> > -		fl->fl_end = start - 1;
> > -		start += l->l_len;
> > -		if (start < 0)
> > -			return -EINVAL;
> > -	}
> > -	fl->fl_start = start;	/* we record the absolute position */
> > -	if (fl->fl_end < fl->fl_start)
> > -		return -EOVERFLOW;
> > -	
> > -	fl->fl_owner = current->files;
> > -	fl->fl_pid = current->tgid;
> > -	fl->fl_file = filp;
> > -	fl->fl_flags = FL_POSIX;
> > -	fl->fl_ops = NULL;
> > -	fl->fl_lmops = NULL;
> > -
> > -	return assign_type(fl, l->l_type);
> > +	return flock_to_posix_lock_common(filp, fl, l, OFFSET_MAX);
> >  }
> >  #endif
> >  
> > diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
> > index 95e46c8..36025f7 100644
> > --- a/include/uapi/asm-generic/fcntl.h
> > +++ b/include/uapi/asm-generic/fcntl.h
> > @@ -186,8 +186,6 @@ struct flock {
> >  };
> >  #endif
> >  
> > -#ifndef CONFIG_64BIT
> > -
> >  #ifndef HAVE_ARCH_STRUCT_FLOCK64
> >  #ifndef __ARCH_FLOCK64_PAD
> >  #define __ARCH_FLOCK64_PAD
> > @@ -202,6 +200,5 @@ struct flock64 {
> >  	__ARCH_FLOCK64_PAD
> >  };
> >  #endif
> > -#endif /* !CONFIG_64BIT */
> >  
> >  #endif /* _ASM_GENERIC_FCNTL_H */
> 
> Thanks -- looks like this is hitting the same discrepancy that I was
> getting with my consolidation attempt. Setting a lock like this on
> x86_64 fails with the current kernel:
> 
> 	struct flock lck1 = {
> 		.l_type		= F_WRLCK,
> 		.l_whence	= SEEK_SET,
> 		.l_start	= 0,
> 		.l_len		= LONG_MAX,
> 	};

Are you sure?  I've tried but can't reproduce that.  I'm on 3.13-rc3
plus a couple DRC patches from you.

--b.

> 
> 
> ...but succeeds with your patch in place. I've not sat down yet to
> figure out whether you broke or fixed something though :).
> 
> -- 
> Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines
  2013-12-11 16:59               ` J. Bruce Fields
@ 2013-12-11 18:09                 ` Jeff Layton
  0 siblings, 0 replies; 29+ messages in thread
From: Jeff Layton @ 2013-12-11 18:09 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: linux-fsdevel, nfs-ganesha-devel, samba-technical, linux-kernel

On Wed, 11 Dec 2013 11:59:18 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Wed, Dec 11, 2013 at 11:54:33AM -0500, Jeff Layton wrote:
> > On Wed, 11 Dec 2013 10:19:31 -0500
> > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > 
> > > On Wed, Dec 11, 2013 at 09:37:24AM -0500, J. Bruce Fields wrote:
> > > > Well, it'd be weird if I didn't screw up something somewhere.
> > > 
> > > Yes, a classic: I forgot the breaks after each switch case.
> > > 
> > > Here's a version that at least doesn't return -EINVAL on every lock
> > > attempt.
> > > 
> > > --b.
> > > 
> > > commit 70b7b9a442d06a71306736eb01cedba5dd6d86cf
> > > Author: J. Bruce Fields <bfields@redhat.com>
> > > Date:   Tue Dec 10 18:14:28 2013 -0500
> > > 
> > >     locks: fix posix lock range overflow handling
> > >     
> > >     In the 32-bit case fcntl is converting signed 64-bit to signed 32-bit
> > >     quantities in a couple places, with probably incorrect results.
> > >     
> > >     So instead let's return -EOVERFLOW as described in SUSv3, whenever "the
> > >     smallest or, if l_len is non-zero, the largest offset of any byte in the
> > >     requested segment cannot be represented correctly in an object of type
> > >     off_t."
> > >     
> > >     While we're here, do some cleanup including consolidating code for the
> > >     flock and flock64 cases.
> > >     
> > >     Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > > 
> > > diff --git a/fs/locks.c b/fs/locks.c
> > > index 92a0f0a..b70486a 100644
> > > --- a/fs/locks.c
> > > +++ b/fs/locks.c
> > > @@ -344,48 +344,44 @@ static int assign_type(struct file_lock *fl, long type)
> > >  	return 0;
> > >  }
> > >  
> > > -/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
> > > - * style lock.
> > > - */
> > > -static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
> > > -			       struct flock *l)
> > > +static int flock_to_posix_lock_common(struct file *filp, struct file_lock *fl,
> > > +					struct flock64 *l, loff_t offset_max)
> > >  {
> > > -	off_t start, end;
> > > +	loff_t start;
> > >  
> > >  	switch (l->l_whence) {
> > >  	case SEEK_SET:
> > > -		start = 0;
> > > +		fl->fl_start = 0;
> > >  		break;
> > >  	case SEEK_CUR:
> > > -		start = filp->f_pos;
> > > +		fl->fl_start = filp->f_pos;
> > >  		break;
> > >  	case SEEK_END:
> > > -		start = i_size_read(file_inode(filp));
> > > +		fl->fl_start = i_size_read(file_inode(filp));
> > >  		break;
> > >  	default:
> > >  		return -EINVAL;
> > >  	}
> > > +	if (l->l_start < 0)
> > > +		return -EINVAL;
> > > +	if (l->l_start > offset_max - fl->fl_start)
> > > +		return -EOVERFLOW;
> > > +	fl->fl_start += l->l_start;
> > > +	if (l->l_len > offset_max - fl->fl_start)
> > > +		return -EOVERFLOW;
> > > +	if (fl->fl_start + l->l_len < 0)
> > > +		return -EINVAL;
> > >  
> > >  	/* POSIX-1996 leaves the case l->l_len < 0 undefined;
> > >  	   POSIX-2001 defines it. */
> > > -	start += l->l_start;
> > > -	if (start < 0)
> > > -		return -EINVAL;
> > > -	fl->fl_end = OFFSET_MAX;
> > > -	if (l->l_len > 0) {
> > > -		end = start + l->l_len - 1;
> > > -		fl->fl_end = end;
> > > -	} else if (l->l_len < 0) {
> > > -		end = start - 1;
> > > -		fl->fl_end = end;
> > > -		start += l->l_len;
> > > -		if (start < 0)
> > > -			return -EINVAL;
> > > -	}
> > > -	fl->fl_start = start;	/* we record the absolute position */
> > > -	if (fl->fl_end < fl->fl_start)
> > > -		return -EOVERFLOW;
> > > -	
> > > +	if (l->l_len > 0)
> > > +		fl->fl_end = fl->fl_start + l->l_len - 1;
> > > +	else if (l->l_len < 0) {
> > > +		fl->fl_end = start - 1;
> > > +		fl->fl_start += l->l_len;
> > > +	} else
> > > +		fl->fl_end = OFFSET_MAX;
> > > +
> > >  	fl->fl_owner = current->files;
> > >  	fl->fl_pid = current->tgid;
> > >  	fl->fl_file = filp;
> > > @@ -396,50 +392,27 @@ static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
> > >  	return assign_type(fl, l->l_type);
> > >  }
> > >  
> > > +/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
> > > + * style lock.
> > > + */
> > > +static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
> > > +			       struct flock *l)
> > > +{
> > > +	struct flock64 ll = {
> > > +		.l_type = l->l_type,
> > > +		.l_whence = l->l_whence,
> > > +		.l_start = l->l_start,
> > > +		.l_len = l->l_len,
> > > +	};
> > > +	
> > > +	return flock_to_posix_lock_common(filp, fl, &ll, OFFT_OFFSET_MAX);
> > > +}
> > > +
> > >  #if BITS_PER_LONG == 32
> > >  static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
> > >  				 struct flock64 *l)
> > >  {
> > > -	loff_t start;
> > > -
> > > -	switch (l->l_whence) {
> > > -	case SEEK_SET:
> > > -		start = 0;
> > > -		break;
> > > -	case SEEK_CUR:
> > > -		start = filp->f_pos;
> > > -		break;
> > > -	case SEEK_END:
> > > -		start = i_size_read(file_inode(filp));
> > > -		break;
> > > -	default:
> > > -		return -EINVAL;
> > > -	}
> > > -
> > > -	start += l->l_start;
> > > -	if (start < 0)
> > > -		return -EINVAL;
> > > -	fl->fl_end = OFFSET_MAX;
> > > -	if (l->l_len > 0) {
> > > -		fl->fl_end = start + l->l_len - 1;
> > > -	} else if (l->l_len < 0) {
> > > -		fl->fl_end = start - 1;
> > > -		start += l->l_len;
> > > -		if (start < 0)
> > > -			return -EINVAL;
> > > -	}
> > > -	fl->fl_start = start;	/* we record the absolute position */
> > > -	if (fl->fl_end < fl->fl_start)
> > > -		return -EOVERFLOW;
> > > -	
> > > -	fl->fl_owner = current->files;
> > > -	fl->fl_pid = current->tgid;
> > > -	fl->fl_file = filp;
> > > -	fl->fl_flags = FL_POSIX;
> > > -	fl->fl_ops = NULL;
> > > -	fl->fl_lmops = NULL;
> > > -
> > > -	return assign_type(fl, l->l_type);
> > > +	return flock_to_posix_lock_common(filp, fl, l, OFFSET_MAX);
> > >  }
> > >  #endif
> > >  
> > > diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
> > > index 95e46c8..36025f7 100644
> > > --- a/include/uapi/asm-generic/fcntl.h
> > > +++ b/include/uapi/asm-generic/fcntl.h
> > > @@ -186,8 +186,6 @@ struct flock {
> > >  };
> > >  #endif
> > >  
> > > -#ifndef CONFIG_64BIT
> > > -
> > >  #ifndef HAVE_ARCH_STRUCT_FLOCK64
> > >  #ifndef __ARCH_FLOCK64_PAD
> > >  #define __ARCH_FLOCK64_PAD
> > > @@ -202,6 +200,5 @@ struct flock64 {
> > >  	__ARCH_FLOCK64_PAD
> > >  };
> > >  #endif
> > > -#endif /* !CONFIG_64BIT */
> > >  
> > >  #endif /* _ASM_GENERIC_FCNTL_H */
> > 
> > Thanks -- looks like this is hitting the same discrepancy that I was
> > getting with my consolidation attempt. Setting a lock like this on
> > x86_64 fails with the current kernel:
> > 
> > 	struct flock lck1 = {
> > 		.l_type		= F_WRLCK,
> > 		.l_whence	= SEEK_SET,
> > 		.l_start	= 0,
> > 		.l_len		= LONG_MAX,
> > 	};
> 
> Are you sure?  I've tried but can't reproduce that.  I'm on 3.13-rc3
> plus a couple DRC patches from you.
> 
> --b.
> 

Nope, my bad -- stupid mistake on my part. My test program had F_WRLCKP,
which of course *should* get an EINVAL on the older kernel. Patch seems
to be fine so far in all of the edge cases I've tested (which
admittedly, isn't many).

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines
  2013-12-11 15:19           ` J. Bruce Fields
  2013-12-11 16:54             ` Jeff Layton
@ 2013-12-11 19:07             ` Jeff Layton
  2013-12-11 22:56               ` J. Bruce Fields
  1 sibling, 1 reply; 29+ messages in thread
From: Jeff Layton @ 2013-12-11 19:07 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: linux-fsdevel, linux-kernel, nfs-ganesha-devel, samba-technical

On Wed, 11 Dec 2013 10:19:31 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Wed, Dec 11, 2013 at 09:37:24AM -0500, J. Bruce Fields wrote:
> > Well, it'd be weird if I didn't screw up something somewhere.
> 
> Yes, a classic: I forgot the breaks after each switch case.
> 
> Here's a version that at least doesn't return -EINVAL on every lock
> attempt.
> 
> --b.
> 
> commit 70b7b9a442d06a71306736eb01cedba5dd6d86cf
> Author: J. Bruce Fields <bfields@redhat.com>
> Date:   Tue Dec 10 18:14:28 2013 -0500
> 
>     locks: fix posix lock range overflow handling
>     
>     In the 32-bit case fcntl is converting signed 64-bit to signed 32-bit
>     quantities in a couple places, with probably incorrect results.
>     
>     So instead let's return -EOVERFLOW as described in SUSv3, whenever "the
>     smallest or, if l_len is non-zero, the largest offset of any byte in the
>     requested segment cannot be represented correctly in an object of type
>     off_t."
>     
>     While we're here, do some cleanup including consolidating code for the
>     flock and flock64 cases.
>     
>     Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index 92a0f0a..b70486a 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -344,48 +344,44 @@ static int assign_type(struct file_lock *fl, long type)
>  	return 0;
>  }
>  
> -/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
> - * style lock.
> - */
> -static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
> -			       struct flock *l)
> +static int flock_to_posix_lock_common(struct file *filp, struct file_lock *fl,
> +					struct flock64 *l, loff_t offset_max)
>  {
> -	off_t start, end;
> +	loff_t start;
>  
>  	switch (l->l_whence) {
>  	case SEEK_SET:
> -		start = 0;
> +		fl->fl_start = 0;
>  		break;
>  	case SEEK_CUR:
> -		start = filp->f_pos;
> +		fl->fl_start = filp->f_pos;
>  		break;
>  	case SEEK_END:
> -		start = i_size_read(file_inode(filp));
> +		fl->fl_start = i_size_read(file_inode(filp));
>  		break;
>  	default:
>  		return -EINVAL;
>  	}
> +	if (l->l_start < 0)
> +		return -EINVAL;
> +	if (l->l_start > offset_max - fl->fl_start)
> +		return -EOVERFLOW;
> +	fl->fl_start += l->l_start;
> +	if (l->l_len > offset_max - fl->fl_start)
> +		return -EOVERFLOW;
> +	if (fl->fl_start + l->l_len < 0)
> +		return -EINVAL;
>  
>  	/* POSIX-1996 leaves the case l->l_len < 0 undefined;
>  	   POSIX-2001 defines it. */
> -	start += l->l_start;
> -	if (start < 0)
> -		return -EINVAL;
> -	fl->fl_end = OFFSET_MAX;
> -	if (l->l_len > 0) {
> -		end = start + l->l_len - 1;
> -		fl->fl_end = end;
> -	} else if (l->l_len < 0) {
> -		end = start - 1;
> -		fl->fl_end = end;
> -		start += l->l_len;
> -		if (start < 0)
> -			return -EINVAL;
> -	}
> -	fl->fl_start = start;	/* we record the absolute position */
> -	if (fl->fl_end < fl->fl_start)
> -		return -EOVERFLOW;
> -	
> +	if (l->l_len > 0)
> +		fl->fl_end = fl->fl_start + l->l_len - 1;
> +	else if (l->l_len < 0) {
> +		fl->fl_end = start - 1;

Erm... I think this is not quite right...

"start" is uninitialized here. I think this should be:

    fl->fl_end = fl->fl_start - 1

With that too, we can get rid of the local "start" variable. I think
this may explain why I'm tripping over the BUG() in locks_remove_file.

While we're in here, I'm going to see what can be done to get rid of
that BUG() call too. That problem doesn't seem like something we ought
to be bringing down the box over...


> +		fl->fl_start += l->l_len;
> +	} else
> +		fl->fl_end = OFFSET_MAX;
> +
>  	fl->fl_owner = current->files;
>  	fl->fl_pid = current->tgid;
>  	fl->fl_file = filp;
> @@ -396,50 +392,27 @@ static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
>  	return assign_type(fl, l->l_type);
>  }
>  
> +/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
> + * style lock.
> + */
> +static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
> +			       struct flock *l)
> +{
> +	struct flock64 ll = {
> +		.l_type = l->l_type,
> +		.l_whence = l->l_whence,
> +		.l_start = l->l_start,
> +		.l_len = l->l_len,
> +	};
> +	
> +	return flock_to_posix_lock_common(filp, fl, &ll, OFFT_OFFSET_MAX);
> +}
> +
>  #if BITS_PER_LONG == 32
>  static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
>  				 struct flock64 *l)
>  {
> -	loff_t start;
> -
> -	switch (l->l_whence) {
> -	case SEEK_SET:
> -		start = 0;
> -		break;
> -	case SEEK_CUR:
> -		start = filp->f_pos;
> -		break;
> -	case SEEK_END:
> -		start = i_size_read(file_inode(filp));
> -		break;
> -	default:
> -		return -EINVAL;
> -	}
> -
> -	start += l->l_start;
> -	if (start < 0)
> -		return -EINVAL;
> -	fl->fl_end = OFFSET_MAX;
> -	if (l->l_len > 0) {
> -		fl->fl_end = start + l->l_len - 1;
> -	} else if (l->l_len < 0) {
> -		fl->fl_end = start - 1;
> -		start += l->l_len;
> -		if (start < 0)
> -			return -EINVAL;
> -	}
> -	fl->fl_start = start;	/* we record the absolute position */
> -	if (fl->fl_end < fl->fl_start)
> -		return -EOVERFLOW;
> -	
> -	fl->fl_owner = current->files;
> -	fl->fl_pid = current->tgid;
> -	fl->fl_file = filp;
> -	fl->fl_flags = FL_POSIX;
> -	fl->fl_ops = NULL;
> -	fl->fl_lmops = NULL;
> -
> -	return assign_type(fl, l->l_type);
> +	return flock_to_posix_lock_common(filp, fl, l, OFFSET_MAX);
>  }
>  #endif
>  
> diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
> index 95e46c8..36025f7 100644
> --- a/include/uapi/asm-generic/fcntl.h
> +++ b/include/uapi/asm-generic/fcntl.h
> @@ -186,8 +186,6 @@ struct flock {
>  };
>  #endif
>  
> -#ifndef CONFIG_64BIT
> -
>  #ifndef HAVE_ARCH_STRUCT_FLOCK64
>  #ifndef __ARCH_FLOCK64_PAD
>  #define __ARCH_FLOCK64_PAD
> @@ -202,6 +200,5 @@ struct flock64 {
>  	__ARCH_FLOCK64_PAD
>  };
>  #endif
> -#endif /* !CONFIG_64BIT */
>  
>  #endif /* _ASM_GENERIC_FCNTL_H */


-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines
  2013-12-11 19:07             ` Jeff Layton
@ 2013-12-11 22:56               ` J. Bruce Fields
  2013-12-11 22:57                 ` J. Bruce Fields
  2013-12-12 10:44                 ` Jeff Layton
  0 siblings, 2 replies; 29+ messages in thread
From: J. Bruce Fields @ 2013-12-11 22:56 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, linux-kernel, nfs-ganesha-devel, samba-technical

On Wed, Dec 11, 2013 at 02:07:41PM -0500, Jeff Layton wrote:
> On Wed, 11 Dec 2013 10:19:31 -0500
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
...
> > +	if (l->l_len > 0)
> > +		fl->fl_end = fl->fl_start + l->l_len - 1;
> > +	else if (l->l_len < 0) {
> > +		fl->fl_end = start - 1;
> 
> Erm... I think this is not quite right...
> 
> "start" is uninitialized here. I think this should be:
> 
>     fl->fl_end = fl->fl_start - 1
> 
> With that too, we can get rid of the local "start" variable. I think
> this may explain why I'm tripping over the BUG() in locks_remove_file.

Yep.

One other bug: I think l_start < 0 is actually fine in the
SEEK_CUR/SEEK_END cases.

With that fixed and another comment (though I don't know how much it
helps), it looks like the below.

--b.

commit 052f395880af17919d4c13037deed581f367b2a8
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Tue Dec 10 18:14:28 2013 -0500

    locks: fix posix lock range overflow handling
    
    In the 32-bit case fcntl assigns f_pos and i_size to a 32-bit off_t,
    with probably incorrect results.
    
    So instead let's return -EOVERFLOW as described in SUSv3, whenever "the
    smallest or, if l_len is non-zero, the largest offset of any byte in the
    requested segment cannot be represented correctly in an object of type
    off_t."
    
    While we're here, do some cleanup including consolidating code for the
    flock and flock64 cases.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>

diff --git a/fs/locks.c b/fs/locks.c
index 92a0f0a..39f2ca9 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -344,48 +344,42 @@ static int assign_type(struct file_lock *fl, long type)
 	return 0;
 }
 
-/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
- * style lock.
- */
-static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
-			       struct flock *l)
+static int flock_to_posix_lock_common(struct file *filp, struct file_lock *fl,
+					struct flock64 *l, loff_t offset_max)
 {
-	off_t start, end;
-
 	switch (l->l_whence) {
 	case SEEK_SET:
-		start = 0;
+		fl->fl_start = 0;
 		break;
 	case SEEK_CUR:
-		start = filp->f_pos;
+		fl->fl_start = filp->f_pos;
 		break;
 	case SEEK_END:
-		start = i_size_read(file_inode(filp));
+		fl->fl_start = i_size_read(file_inode(filp));
 		break;
 	default:
 		return -EINVAL;
 	}
+	if (l->l_start > offset_max - fl->fl_start)
+		return -EOVERFLOW;
+	fl->fl_start += l->l_start;
+	if (fl->fl_start < 0)
+		return -EINVAL;
+	if (l->l_len > offset_max - fl->fl_start)
+		return -EOVERFLOW;
+	if (fl->fl_start + l->l_len < 0)
+		return -EINVAL;
 
 	/* POSIX-1996 leaves the case l->l_len < 0 undefined;
 	   POSIX-2001 defines it. */
-	start += l->l_start;
-	if (start < 0)
-		return -EINVAL;
-	fl->fl_end = OFFSET_MAX;
-	if (l->l_len > 0) {
-		end = start + l->l_len - 1;
-		fl->fl_end = end;
-	} else if (l->l_len < 0) {
-		end = start - 1;
-		fl->fl_end = end;
-		start += l->l_len;
-		if (start < 0)
-			return -EINVAL;
-	}
-	fl->fl_start = start;	/* we record the absolute position */
-	if (fl->fl_end < fl->fl_start)
-		return -EOVERFLOW;
-	
+	if (l->l_len > 0)
+		fl->fl_end = fl->fl_start + l->l_len - 1;
+	else if (l->l_len < 0) {
+		fl->fl_end = fl->fl_start - 1;
+		fl->fl_start += l->l_len;
+	} else
+		fl->fl_end = OFFSET_MAX;
+
 	fl->fl_owner = current->files;
 	fl->fl_pid = current->tgid;
 	fl->fl_file = filp;
@@ -396,50 +390,32 @@ static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
 	return assign_type(fl, l->l_type);
 }
 
+/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
+ * style lock.
+ */
+static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
+			       struct flock *l)
+{
+	struct flock64 ll = {
+		.l_type = l->l_type,
+		.l_whence = l->l_whence,
+		.l_start = l->l_start,
+		.l_len = l->l_len,
+	};
+
+	/*
+	 * The use of OFFT_OFFSET_MAX here ensures we return -EOVERFLOW
+	 * if the start or end of the lock could not be represented as
+	 * an off_t, following SUSv3.
+	 */
+	return flock_to_posix_lock_common(filp, fl, &ll, OFFT_OFFSET_MAX);
+}
+
 #if BITS_PER_LONG == 32
 static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
 				 struct flock64 *l)
 {
-	loff_t start;
-
-	switch (l->l_whence) {
-	case SEEK_SET:
-		start = 0;
-		break;
-	case SEEK_CUR:
-		start = filp->f_pos;
-		break;
-	case SEEK_END:
-		start = i_size_read(file_inode(filp));
-		break;
-	default:
-		return -EINVAL;
-	}
-
-	start += l->l_start;
-	if (start < 0)
-		return -EINVAL;
-	fl->fl_end = OFFSET_MAX;
-	if (l->l_len > 0) {
-		fl->fl_end = start + l->l_len - 1;
-	} else if (l->l_len < 0) {
-		fl->fl_end = start - 1;
-		start += l->l_len;
-		if (start < 0)
-			return -EINVAL;
-	}
-	fl->fl_start = start;	/* we record the absolute position */
-	if (fl->fl_end < fl->fl_start)
-		return -EOVERFLOW;
-	
-	fl->fl_owner = current->files;
-	fl->fl_pid = current->tgid;
-	fl->fl_file = filp;
-	fl->fl_flags = FL_POSIX;
-	fl->fl_ops = NULL;
-	fl->fl_lmops = NULL;
-
-	return assign_type(fl, l->l_type);
+	return flock_to_posix_lock_common(filp, fl, l, OFFSET_MAX);
 }
 #endif
 
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index 95e46c8..36025f7 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -186,8 +186,6 @@ struct flock {
 };
 #endif
 
-#ifndef CONFIG_64BIT
-
 #ifndef HAVE_ARCH_STRUCT_FLOCK64
 #ifndef __ARCH_FLOCK64_PAD
 #define __ARCH_FLOCK64_PAD
@@ -202,6 +200,5 @@ struct flock64 {
 	__ARCH_FLOCK64_PAD
 };
 #endif
-#endif /* !CONFIG_64BIT */
 
 #endif /* _ASM_GENERIC_FCNTL_H */

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines
  2013-12-11 22:56               ` J. Bruce Fields
@ 2013-12-11 22:57                 ` J. Bruce Fields
  2013-12-12 10:43                   ` Jeff Layton
  2013-12-12 10:44                 ` Jeff Layton
  1 sibling, 1 reply; 29+ messages in thread
From: J. Bruce Fields @ 2013-12-11 22:57 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, linux-kernel, nfs-ganesha-devel, samba-technical

On Wed, Dec 11, 2013 at 05:56:16PM -0500, J. Bruce Fields wrote:
> On Wed, Dec 11, 2013 at 02:07:41PM -0500, Jeff Layton wrote:
> > On Wed, 11 Dec 2013 10:19:31 -0500
> > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> ...
> > > +	if (l->l_len > 0)
> > > +		fl->fl_end = fl->fl_start + l->l_len - 1;
> > > +	else if (l->l_len < 0) {
> > > +		fl->fl_end = start - 1;
> > 
> > Erm... I think this is not quite right...
> > 
> > "start" is uninitialized here. I think this should be:
> > 
> >     fl->fl_end = fl->fl_start - 1
> > 
> > With that too, we can get rid of the local "start" variable. I think
> > this may explain why I'm tripping over the BUG() in locks_remove_file.
> 
> Yep.
> 
> One other bug: I think l_start < 0 is actually fine in the
> SEEK_CUR/SEEK_END cases.
> 
> With that fixed and another comment (though I don't know how much it
> helps), it looks like the below.

Alternatively, maybe we could simplify?  (On top of the previous):

commit d4bf5cb021a3ac1ec07530ebda904e262cc89d11
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Wed Dec 11 17:42:32 2013 -0500

    locks: simplify overflow checking
    
    Or maybe we don't actually care about indicating overflow in the 32-bit
    case: sure we could fail if e.g. f_pos+start or f_pos+start+len would
    exceed 32-bits, but do we really need to?
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>

diff --git a/fs/locks.c b/fs/locks.c
index 39f2ca9..efbf577 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -344,8 +344,8 @@ static int assign_type(struct file_lock *fl, long type)
 	return 0;
 }
 
-static int flock_to_posix_lock_common(struct file *filp, struct file_lock *fl,
-					struct flock64 *l, loff_t offset_max)
+static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
+				 struct flock64 *l)
 {
 	switch (l->l_whence) {
 	case SEEK_SET:
@@ -360,12 +360,12 @@ static int flock_to_posix_lock_common(struct file *filp, struct file_lock *fl,
 	default:
 		return -EINVAL;
 	}
-	if (l->l_start > offset_max - fl->fl_start)
+	if (l->l_start > OFFSET_MAX - fl->fl_start)
 		return -EOVERFLOW;
 	fl->fl_start += l->l_start;
 	if (fl->fl_start < 0)
 		return -EINVAL;
-	if (l->l_len > offset_max - fl->fl_start)
+	if (l->l_len > OFFSET_MAX - fl->fl_start)
 		return -EOVERFLOW;
 	if (fl->fl_start + l->l_len < 0)
 		return -EINVAL;
@@ -403,22 +403,9 @@ static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
 		.l_len = l->l_len,
 	};
 
-	/*
-	 * The use of OFFT_OFFSET_MAX here ensures we return -EOVERFLOW
-	 * if the start or end of the lock could not be represented as
-	 * an off_t, following SUSv3.
-	 */
-	return flock_to_posix_lock_common(filp, fl, &ll, OFFT_OFFSET_MAX);
+	return flock64_to_posix_lock(filp, fl, &ll);
 }
 
-#if BITS_PER_LONG == 32
-static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
-				 struct flock64 *l)
-{
-	return flock_to_posix_lock_common(filp, fl, l, OFFSET_MAX);
-}
-#endif
-
 /* default lease lock manager operations */
 static void lease_break_callback(struct file_lock *fl)
 {

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines
  2013-12-11 22:57                 ` J. Bruce Fields
@ 2013-12-12 10:43                   ` Jeff Layton
  0 siblings, 0 replies; 29+ messages in thread
From: Jeff Layton @ 2013-12-12 10:43 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: linux-fsdevel, nfs-ganesha-devel, samba-technical, linux-kernel

On Wed, 11 Dec 2013 17:57:24 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Wed, Dec 11, 2013 at 05:56:16PM -0500, J. Bruce Fields wrote:
> > On Wed, Dec 11, 2013 at 02:07:41PM -0500, Jeff Layton wrote:
> > > On Wed, 11 Dec 2013 10:19:31 -0500
> > > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > ...
> > > > +	if (l->l_len > 0)
> > > > +		fl->fl_end = fl->fl_start + l->l_len - 1;
> > > > +	else if (l->l_len < 0) {
> > > > +		fl->fl_end = start - 1;
> > > 
> > > Erm... I think this is not quite right...
> > > 
> > > "start" is uninitialized here. I think this should be:
> > > 
> > >     fl->fl_end = fl->fl_start - 1
> > > 
> > > With that too, we can get rid of the local "start" variable. I think
> > > this may explain why I'm tripping over the BUG() in locks_remove_file.
> > 
> > Yep.
> > 
> > One other bug: I think l_start < 0 is actually fine in the
> > SEEK_CUR/SEEK_END cases.
> > 
> > With that fixed and another comment (though I don't know how much it
> > helps), it looks like the below.
> 
> Alternatively, maybe we could simplify?  (On top of the previous):
> 
> commit d4bf5cb021a3ac1ec07530ebda904e262cc89d11
> Author: J. Bruce Fields <bfields@redhat.com>
> Date:   Wed Dec 11 17:42:32 2013 -0500
> 
>     locks: simplify overflow checking
>     
>     Or maybe we don't actually care about indicating overflow in the 32-bit
>     case: sure we could fail if e.g. f_pos+start or f_pos+start+len would
>     exceed 32-bits, but do we really need to?
>     
>     Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index 39f2ca9..efbf577 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -344,8 +344,8 @@ static int assign_type(struct file_lock *fl, long type)
>  	return 0;
>  }
>  
> -static int flock_to_posix_lock_common(struct file *filp, struct file_lock *fl,
> -					struct flock64 *l, loff_t offset_max)
> +static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
> +				 struct flock64 *l)
>  {
>  	switch (l->l_whence) {
>  	case SEEK_SET:
> @@ -360,12 +360,12 @@ static int flock_to_posix_lock_common(struct file *filp, struct file_lock *fl,
>  	default:
>  		return -EINVAL;
>  	}
> -	if (l->l_start > offset_max - fl->fl_start)
> +	if (l->l_start > OFFSET_MAX - fl->fl_start)
>  		return -EOVERFLOW;
>  	fl->fl_start += l->l_start;
>  	if (fl->fl_start < 0)
>  		return -EINVAL;
> -	if (l->l_len > offset_max - fl->fl_start)
> +	if (l->l_len > OFFSET_MAX - fl->fl_start)
>  		return -EOVERFLOW;
>  	if (fl->fl_start + l->l_len < 0)
>  		return -EINVAL;
> @@ -403,22 +403,9 @@ static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
>  		.l_len = l->l_len,
>  	};
>  
> -	/*
> -	 * The use of OFFT_OFFSET_MAX here ensures we return -EOVERFLOW
> -	 * if the start or end of the lock could not be represented as
> -	 * an off_t, following SUSv3.
> -	 */
> -	return flock_to_posix_lock_common(filp, fl, &ll, OFFT_OFFSET_MAX);
> +	return flock64_to_posix_lock(filp, fl, &ll);
>  }
>  
> -#if BITS_PER_LONG == 32
> -static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
> -				 struct flock64 *l)
> -{
> -	return flock_to_posix_lock_common(filp, fl, l, OFFSET_MAX);
> -}
> -#endif
> -
>  /* default lease lock manager operations */
>  static void lease_break_callback(struct file_lock *fl)
>  {

Makes sense to me -- be liberal in what you accept and all that...

It might be problematic for 32-bit apps that then later try to do
F_GETLK against such a lock, I doubt that'll be any worse than the
current situation. I'll toss this one onto the pile...

Thanks!
-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines
  2013-12-11 22:56               ` J. Bruce Fields
  2013-12-11 22:57                 ` J. Bruce Fields
@ 2013-12-12 10:44                 ` Jeff Layton
  2014-01-05 20:39                   ` J. Bruce Fields
  1 sibling, 1 reply; 29+ messages in thread
From: Jeff Layton @ 2013-12-12 10:44 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: linux-fsdevel, linux-kernel, nfs-ganesha-devel, samba-technical

On Wed, 11 Dec 2013 17:56:16 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Wed, Dec 11, 2013 at 02:07:41PM -0500, Jeff Layton wrote:
> > On Wed, 11 Dec 2013 10:19:31 -0500
> > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> ...
> > > +	if (l->l_len > 0)
> > > +		fl->fl_end = fl->fl_start + l->l_len - 1;
> > > +	else if (l->l_len < 0) {
> > > +		fl->fl_end = start - 1;
> > 
> > Erm... I think this is not quite right...
> > 
> > "start" is uninitialized here. I think this should be:
> > 
> >     fl->fl_end = fl->fl_start - 1
> > 
> > With that too, we can get rid of the local "start" variable. I think
> > this may explain why I'm tripping over the BUG() in locks_remove_file.
> 
> Yep.
> 
> One other bug: I think l_start < 0 is actually fine in the
> SEEK_CUR/SEEK_END cases.
> 
> With that fixed and another comment (though I don't know how much it
> helps), it looks like the below.
> 
> --b.
> 
> commit 052f395880af17919d4c13037deed581f367b2a8
> Author: J. Bruce Fields <bfields@redhat.com>
> Date:   Tue Dec 10 18:14:28 2013 -0500
> 
>     locks: fix posix lock range overflow handling
>     
>     In the 32-bit case fcntl assigns f_pos and i_size to a 32-bit off_t,
>     with probably incorrect results.
>     
>     So instead let's return -EOVERFLOW as described in SUSv3, whenever "the
>     smallest or, if l_len is non-zero, the largest offset of any byte in the
>     requested segment cannot be represented correctly in an object of type
>     off_t."
>     
>     While we're here, do some cleanup including consolidating code for the
>     flock and flock64 cases.
>     
>     Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index 92a0f0a..39f2ca9 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -344,48 +344,42 @@ static int assign_type(struct file_lock *fl, long type)
>  	return 0;
>  }
>  
> -/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
> - * style lock.
> - */
> -static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
> -			       struct flock *l)
> +static int flock_to_posix_lock_common(struct file *filp, struct file_lock *fl,
> +					struct flock64 *l, loff_t offset_max)
>  {
> -	off_t start, end;
> -
>  	switch (l->l_whence) {
>  	case SEEK_SET:
> -		start = 0;
> +		fl->fl_start = 0;
>  		break;
>  	case SEEK_CUR:
> -		start = filp->f_pos;
> +		fl->fl_start = filp->f_pos;
>  		break;
>  	case SEEK_END:
> -		start = i_size_read(file_inode(filp));
> +		fl->fl_start = i_size_read(file_inode(filp));
>  		break;
>  	default:
>  		return -EINVAL;
>  	}
> +	if (l->l_start > offset_max - fl->fl_start)
> +		return -EOVERFLOW;
> +	fl->fl_start += l->l_start;
> +	if (fl->fl_start < 0)
> +		return -EINVAL;
> +	if (l->l_len > offset_max - fl->fl_start)
> +		return -EOVERFLOW;
> +	if (fl->fl_start + l->l_len < 0)
> +		return -EINVAL;
>  


Well spotted. The change looks correct to me.

>  	/* POSIX-1996 leaves the case l->l_len < 0 undefined;
>  	   POSIX-2001 defines it. */
> -	start += l->l_start;
> -	if (start < 0)
> -		return -EINVAL;
> -	fl->fl_end = OFFSET_MAX;
> -	if (l->l_len > 0) {
> -		end = start + l->l_len - 1;
> -		fl->fl_end = end;
> -	} else if (l->l_len < 0) {
> -		end = start - 1;
> -		fl->fl_end = end;
> -		start += l->l_len;
> -		if (start < 0)
> -			return -EINVAL;
> -	}
> -	fl->fl_start = start;	/* we record the absolute position */
> -	if (fl->fl_end < fl->fl_start)
> -		return -EOVERFLOW;
> -	
> +	if (l->l_len > 0)
> +		fl->fl_end = fl->fl_start + l->l_len - 1;
> +	else if (l->l_len < 0) {
> +		fl->fl_end = fl->fl_start - 1;
> +		fl->fl_start += l->l_len;
> +	} else
> +		fl->fl_end = OFFSET_MAX;
> +
>  	fl->fl_owner = current->files;
>  	fl->fl_pid = current->tgid;
>  	fl->fl_file = filp;
> @@ -396,50 +390,32 @@ static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
>  	return assign_type(fl, l->l_type);
>  }
>  
> +/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
> + * style lock.
> + */
> +static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
> +			       struct flock *l)
> +{
> +	struct flock64 ll = {
> +		.l_type = l->l_type,
> +		.l_whence = l->l_whence,
> +		.l_start = l->l_start,
> +		.l_len = l->l_len,
> +	};
> +
> +	/*
> +	 * The use of OFFT_OFFSET_MAX here ensures we return -EOVERFLOW
> +	 * if the start or end of the lock could not be represented as
> +	 * an off_t, following SUSv3.
> +	 */
> +	return flock_to_posix_lock_common(filp, fl, &ll, OFFT_OFFSET_MAX);
> +}
> +
>  #if BITS_PER_LONG == 32
>  static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
>  				 struct flock64 *l)
>  {
> -	loff_t start;
> -
> -	switch (l->l_whence) {
> -	case SEEK_SET:
> -		start = 0;
> -		break;
> -	case SEEK_CUR:
> -		start = filp->f_pos;
> -		break;
> -	case SEEK_END:
> -		start = i_size_read(file_inode(filp));
> -		break;
> -	default:
> -		return -EINVAL;
> -	}
> -
> -	start += l->l_start;
> -	if (start < 0)
> -		return -EINVAL;
> -	fl->fl_end = OFFSET_MAX;
> -	if (l->l_len > 0) {
> -		fl->fl_end = start + l->l_len - 1;
> -	} else if (l->l_len < 0) {
> -		fl->fl_end = start - 1;
> -		start += l->l_len;
> -		if (start < 0)
> -			return -EINVAL;
> -	}
> -	fl->fl_start = start;	/* we record the absolute position */
> -	if (fl->fl_end < fl->fl_start)
> -		return -EOVERFLOW;
> -	
> -	fl->fl_owner = current->files;
> -	fl->fl_pid = current->tgid;
> -	fl->fl_file = filp;
> -	fl->fl_flags = FL_POSIX;
> -	fl->fl_ops = NULL;
> -	fl->fl_lmops = NULL;
> -
> -	return assign_type(fl, l->l_type);
> +	return flock_to_posix_lock_common(filp, fl, l, OFFSET_MAX);
>  }
>  #endif
>  
> diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
> index 95e46c8..36025f7 100644
> --- a/include/uapi/asm-generic/fcntl.h
> +++ b/include/uapi/asm-generic/fcntl.h
> @@ -186,8 +186,6 @@ struct flock {
>  };
>  #endif
>  
> -#ifndef CONFIG_64BIT
> -
>  #ifndef HAVE_ARCH_STRUCT_FLOCK64
>  #ifndef __ARCH_FLOCK64_PAD
>  #define __ARCH_FLOCK64_PAD
> @@ -202,6 +200,5 @@ struct flock64 {
>  	__ARCH_FLOCK64_PAD
>  };
>  #endif
> -#endif /* !CONFIG_64BIT */
>  
>  #endif /* _ASM_GENERIC_FCNTL_H */


-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 6/6] locks: add new "private" lock type that is owned by the filp
  2013-12-10 19:17 ` [PATCH v3 6/6] locks: add new "private" lock type that is owned by the filp Jeff Layton
@ 2013-12-17 13:31   ` Jeff Layton
  2013-12-17 13:37     ` Christoph Hellwig
  0 siblings, 1 reply; 29+ messages in thread
From: Jeff Layton @ 2013-12-17 13:31 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: linux-kernel, nfs-ganesha-devel, samba-technical

On Tue, 10 Dec 2013 14:17:35 -0500
Jeff Layton <jlayton@redhat.com> wrote:

> Due to some unfortunate history, POSIX locks have very strange and
> unhelpful semantics. The thing that usually catches people by surprise
> is that they are dropped whenever the process closes any file descriptor
> associated with the inode.
> 
> This is extremely problematic for people developing file servers that
> need to implement byte-range locks. Developers often need a "lock
> management" facility to ensure that file descriptors are not closed
> until all of the locks associated with the inode are finished.
> 
> Additionally, "classic" POSIX locks are owned by the process. Locks
> taken between threads within the same process won't conflict with one
> another, which renders classic POSIX locks useless for synchronization
> between threads.
> 
> This patchset adds a new type of lock that attempts to address these
> issues. These locks work just like classic POSIX read/write locks, but
> have semantics that are more like BSD locks with respect to inheritance
> and behavior on close.
> 
> This is implemented primarily by changing how fl_owner field is set for
> these locks. Instead of having them owned by the files_struct of the
> process, they are instead owned by the filp on which they were acquired.
> Thus, they are inherited across fork() and are only released when the
> last reference to a filp is put.
> 
> These new semantics prevent them from being merged with classic POSIX
> locks, even if they are acquired by the same process. These locks will
> also conflict with classic POSIX locks even if they are acquired by
> the same process or on the same file descriptor.
> 
> Signed-off-by: Jeff Layton <jlayton@redhat.com>
> ---
>  fs/locks.c                       | 32 ++++++++++++++++++++++++++++++--
>  include/uapi/asm-generic/fcntl.h | 16 ++++++++++++++++
>  2 files changed, 46 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index 5372ddd..98a503d 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -367,7 +367,34 @@ flock_to_posix_lock_common(struct file_lock *fl, struct file *filp, short type)
>  		break;
>  	}
>  
> -	return assign_type(fl, type);
> +	/*
> +	 * FL_FILE_PVT locks are "owned" by the filp upon which they were
> +	 * acquired, regardless of what task is dealing with them. Set the
> +	 * fl_owner appropriately and flag them as private.
> +	 */
> +	switch(type) {
> +	case F_RDLCKP:
> +		fl->fl_owner = (fl_owner_t)fl->fl_file;
> +		fl->fl_type = F_RDLCK;
> +		fl->fl_flags |= FL_FILE_PVT;
> +		break;
> +	case F_WRLCKP:
> +		fl->fl_owner = (fl_owner_t)fl->fl_file;
> +		fl->fl_type = F_WRLCK;
> +		fl->fl_flags |= FL_FILE_PVT;
> +		break;
> +	case F_UNLCKP:
> +		fl->fl_owner = (fl_owner_t)fl->fl_file;
> +		fl->fl_type = F_UNLCK;
> +		fl->fl_flags |= FL_FILE_PVT;
> +		break;
> +	default:
> +		/* Any other POSIX lock is owned by the file_struct */
> +		fl->fl_owner = current->files;
> +		return assign_type(fl, type);
> +	}
> +
> +	return 0;
>  }
>  
>  /* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
> @@ -2259,7 +2286,8 @@ void locks_remove_file(struct file *filp)
>  
>  	while ((fl = *before) != NULL) {
>  		if (fl->fl_file == filp) {
> -			if (IS_FLOCK(fl)) {
> +			if (IS_FLOCK(fl) ||
> +			    (IS_POSIX(fl) && IS_FILE_PVT(fl))) {
>  				locks_delete_lock(before);
>  				continue;
>  			}
> diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
> index 95e46c8..ed09fc5 100644
> --- a/include/uapi/asm-generic/fcntl.h
> +++ b/include/uapi/asm-generic/fcntl.h
> @@ -151,6 +151,22 @@ struct f_owner_ex {
>  #define F_UNLCK		2
>  #endif
>  
> +/*
> + * fd "private" POSIX locks.
> + *
> + * Usually POSIX locks held by a process are released on *any* close and are
> + * not inherited across a fork().
> + *
> + * These lock types will conflict with normal POSIX locks, but are "owned"
> + * by the opened file, not the process. This means that they are inherited
> + * across fork() like BSD (flock) locks, and they are only released
> + * automatically when the last reference to the the open file against which
> + * they were acquired is put.
> + */
> +#define F_RDLCKP	5
> +#define F_WRLCKP	6
> +#define F_UNLCKP	7
> +
>  /* for old implementation of bsd flock () */
>  #ifndef F_EXLCK
>  #define F_EXLCK		4	/* or 3 */


So, I think the above semantics are pretty clear, but now that I've had
a go at sitting down to document this stuff for the POSIX spec and
manpages, it's clear how convoluted the text in there is becoming.

That makes me wonder...would we be better off with a new set of cmd
values here instead of new l_type values? IOW, we could add new:

F_GETLKP
F_SETLKP
F_SETLKPW

...and then just reuse the same F_RDLCK/F_WRLCK/F_UNLCK values? With
that too, we could create a new equivalent to struct flock that has
fixed length types instead of dealing with the off_t mess.

I think doing that would make it a little easier to document, at the
expense of being a little trickier to code up for all of the different
arches. What would be the most intuitive interface from the standpoint
of a userland developer?

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 6/6] locks: add new "private" lock type that is owned by the filp
  2013-12-17 13:31   ` Jeff Layton
@ 2013-12-17 13:37     ` Christoph Hellwig
  2013-12-17 13:50       ` Jeff Layton
  0 siblings, 1 reply; 29+ messages in thread
From: Christoph Hellwig @ 2013-12-17 13:37 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, linux-kernel, nfs-ganesha-devel, samba-technical

On Tue, Dec 17, 2013 at 08:31:25AM -0500, Jeff Layton wrote:
> So, I think the above semantics are pretty clear, but now that I've had
> a go at sitting down to document this stuff for the POSIX spec and
> manpages, it's clear how convoluted the text in there is becoming.
> 
> That makes me wonder...would we be better off with a new set of cmd
> values here instead of new l_type values? IOW, we could add new:
> 
> F_GETLKP
> F_SETLKP
> F_SETLKPW

That seems a tad cleaner to me indeed.

> ...and then just reuse the same F_RDLCK/F_WRLCK/F_UNLCK values? With
> that too, we could create a new equivalent to struct flock that has
> fixed length types instead of dealing with the off_t mess.

For the Posix interface you'd need an off_t as that's what the whole
API uses for file offsets.  We could make sure to always use a off64_t
for the kernel interface though.

What is the API you propose to posix?  An new posix_lockf?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 6/6] locks: add new "private" lock type that is owned by the filp
  2013-12-17 13:37     ` Christoph Hellwig
@ 2013-12-17 13:50       ` Jeff Layton
  0 siblings, 0 replies; 29+ messages in thread
From: Jeff Layton @ 2013-12-17 13:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-fsdevel, nfs-ganesha-devel, samba-technical, linux-kernel

On Tue, 17 Dec 2013 05:37:21 -0800
Christoph Hellwig <hch@infradead.org> wrote:

> On Tue, Dec 17, 2013 at 08:31:25AM -0500, Jeff Layton wrote:
> > So, I think the above semantics are pretty clear, but now that I've had
> > a go at sitting down to document this stuff for the POSIX spec and
> > manpages, it's clear how convoluted the text in there is becoming.
> > 
> > That makes me wonder...would we be better off with a new set of cmd
> > values here instead of new l_type values? IOW, we could add new:
> > 
> > F_GETLKP
> > F_SETLKP
> > F_SETLKPW
> 
> That seems a tad cleaner to me indeed.
> 
> > ...and then just reuse the same F_RDLCK/F_WRLCK/F_UNLCK values? With
> > that too, we could create a new equivalent to struct flock that has
> > fixed length types instead of dealing with the off_t mess.
> 
> For the Posix interface you'd need an off_t as that's what the whole
> API uses for file offsets.  We could make sure to always use a off64_t
> for the kernel interface though.
> 

Ok.

> What is the API you propose to posix?  An new posix_lockf?
> 

I haven't proposed anything concrete to POSIX just yet. I'm trying to
get the Linux patches done and then I'll do that. I don't think we want
a new syscall when fcntl() will work.

If we use new cmd values however, then we don't necessarily need to use
struct flock/flock64. I think the question is -- should we stick with
struct flock/flock64, or would we be better served with something new
for this?

Thanks,
-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines
  2013-12-12 10:44                 ` Jeff Layton
@ 2014-01-05 20:39                   ` J. Bruce Fields
  2014-01-05 20:42                     ` [PATCH] locks: fix posix lock range overflow handling J. Bruce Fields
  0 siblings, 1 reply; 29+ messages in thread
From: J. Bruce Fields @ 2014-01-05 20:39 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, nfs-ganesha-devel, samba-technical, linux-kernel

Ugh, I screwed up one more when rewriting flock{64}_to_posix_lock, an
off-by-one error caused by not noticing that the "end" offset of a lock
is at start + len - 1, not start + len. 

(So for example, a 1-byte lock starting at offset 5 is recorded as
(fl_start, fl_end) == (5, 5), not (5,6)....)

This actually causes "cthon -l" fails as it attempts a lock with
(start, len) == (1, OFFSET_MAX).

--b.

diff --git a/fs/locks.c b/fs/locks.c
index 9523b89..f017280 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -365,16 +365,17 @@ static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
 	fl->fl_start += l->l_start;
 	if (fl->fl_start < 0)
 		return -EINVAL;
-	if (l->l_len > 0 && l->l_len - 1 > OFFSET_MAX - fl->fl_start)
-		return -EOVERFLOW;
-	if (fl->fl_start + l->l_len < 0)
-		return -EINVAL;
 
 	/* POSIX-1996 leaves the case l->l_len < 0 undefined;
 	   POSIX-2001 defines it. */
-	if (l->l_len > 0)
+	if (l->l_len > 0) {
+		if (l->l_len - 1 > OFFSET_MAX - fl->fl_start)
+			return -EOVERFLOW;
 		fl->fl_end = fl->fl_start + l->l_len - 1;
-	else if (l->l_len < 0) {
+
+	} else if (l->l_len < 0) {
+		if (fl->fl_start + l->l_len < 0)
+			return -EINVAL;
 		fl->fl_end = fl->fl_start - 1;
 		fl->fl_start += l->l_len;
 	} else

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH] locks: fix posix lock range overflow handling
  2014-01-05 20:39                   ` J. Bruce Fields
@ 2014-01-05 20:42                     ` J. Bruce Fields
  0 siblings, 0 replies; 29+ messages in thread
From: J. Bruce Fields @ 2014-01-05 20:42 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-fsdevel, nfs-ganesha-devel, samba-technical, linux-kernel

From: "J. Bruce Fields" <bfields@redhat.com>

In the 32-bit case fcntl assigns the 64-bit f_pos and i_size to a 32-bit
off_t.

The existing range checks also seem to depend on signed arithmetic
wrapping when it overflows.  In practice maybe that works, but we can be
more careful.  That also allows us to make a more reliable distinction
between -EINVAL and -EOVERFLOW.

Note that in the 32-bit case SEEK_CUR or SEEK_END might allow the caller
to set a lock with starting point no longer representable as a 32-bit
value.  We could return -EOVERFLOW in such cases, but the locks code is
capable of handling such ranges, so we choose to be lenient here.  The
only problem is that subsequent GETLK calls on such a lock will fail
with EOVERFLOW.

While we're here, do some cleanup including consolidating code for the
flock and flock64 cases.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/locks.c                       | 100 +++++++++++++--------------------------
 include/uapi/asm-generic/fcntl.h |   3 --
 2 files changed, 32 insertions(+), 71 deletions(-)

Here's the updated patch.  Could probably use another read-through, I'm
tired of looking at it....

--b.

diff --git a/fs/locks.c b/fs/locks.c
index 92a0f0a..f017280 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -344,48 +344,43 @@ static int assign_type(struct file_lock *fl, long type)
 	return 0;
 }
 
-/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
- * style lock.
- */
-static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
-			       struct flock *l)
+static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
+				 struct flock64 *l)
 {
-	off_t start, end;
-
 	switch (l->l_whence) {
 	case SEEK_SET:
-		start = 0;
+		fl->fl_start = 0;
 		break;
 	case SEEK_CUR:
-		start = filp->f_pos;
+		fl->fl_start = filp->f_pos;
 		break;
 	case SEEK_END:
-		start = i_size_read(file_inode(filp));
+		fl->fl_start = i_size_read(file_inode(filp));
 		break;
 	default:
 		return -EINVAL;
 	}
+	if (l->l_start > OFFSET_MAX - fl->fl_start)
+		return -EOVERFLOW;
+	fl->fl_start += l->l_start;
+	if (fl->fl_start < 0)
+		return -EINVAL;
 
 	/* POSIX-1996 leaves the case l->l_len < 0 undefined;
 	   POSIX-2001 defines it. */
-	start += l->l_start;
-	if (start < 0)
-		return -EINVAL;
-	fl->fl_end = OFFSET_MAX;
 	if (l->l_len > 0) {
-		end = start + l->l_len - 1;
-		fl->fl_end = end;
+		if (l->l_len - 1 > OFFSET_MAX - fl->fl_start)
+			return -EOVERFLOW;
+		fl->fl_end = fl->fl_start + l->l_len - 1;
+
 	} else if (l->l_len < 0) {
-		end = start - 1;
-		fl->fl_end = end;
-		start += l->l_len;
-		if (start < 0)
+		if (fl->fl_start + l->l_len < 0)
 			return -EINVAL;
-	}
-	fl->fl_start = start;	/* we record the absolute position */
-	if (fl->fl_end < fl->fl_start)
-		return -EOVERFLOW;
-	
+		fl->fl_end = fl->fl_start - 1;
+		fl->fl_start += l->l_len;
+	} else
+		fl->fl_end = OFFSET_MAX;
+
 	fl->fl_owner = current->files;
 	fl->fl_pid = current->tgid;
 	fl->fl_file = filp;
@@ -396,52 +391,21 @@ static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
 	return assign_type(fl, l->l_type);
 }
 
-#if BITS_PER_LONG == 32
-static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
-				 struct flock64 *l)
+/* Verify a "struct flock" and copy it to a "struct file_lock" as a POSIX
+ * style lock.
+ */
+static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
+			       struct flock *l)
 {
-	loff_t start;
-
-	switch (l->l_whence) {
-	case SEEK_SET:
-		start = 0;
-		break;
-	case SEEK_CUR:
-		start = filp->f_pos;
-		break;
-	case SEEK_END:
-		start = i_size_read(file_inode(filp));
-		break;
-	default:
-		return -EINVAL;
-	}
+	struct flock64 ll = {
+		.l_type = l->l_type,
+		.l_whence = l->l_whence,
+		.l_start = l->l_start,
+		.l_len = l->l_len,
+	};
 
-	start += l->l_start;
-	if (start < 0)
-		return -EINVAL;
-	fl->fl_end = OFFSET_MAX;
-	if (l->l_len > 0) {
-		fl->fl_end = start + l->l_len - 1;
-	} else if (l->l_len < 0) {
-		fl->fl_end = start - 1;
-		start += l->l_len;
-		if (start < 0)
-			return -EINVAL;
-	}
-	fl->fl_start = start;	/* we record the absolute position */
-	if (fl->fl_end < fl->fl_start)
-		return -EOVERFLOW;
-	
-	fl->fl_owner = current->files;
-	fl->fl_pid = current->tgid;
-	fl->fl_file = filp;
-	fl->fl_flags = FL_POSIX;
-	fl->fl_ops = NULL;
-	fl->fl_lmops = NULL;
-
-	return assign_type(fl, l->l_type);
+	return flock64_to_posix_lock(filp, fl, &ll);
 }
-#endif
 
 /* default lease lock manager operations */
 static void lease_break_callback(struct file_lock *fl)
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index 95e46c8..36025f7 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -186,8 +186,6 @@ struct flock {
 };
 #endif
 
-#ifndef CONFIG_64BIT
-
 #ifndef HAVE_ARCH_STRUCT_FLOCK64
 #ifndef __ARCH_FLOCK64_PAD
 #define __ARCH_FLOCK64_PAD
@@ -202,6 +200,5 @@ struct flock64 {
 	__ARCH_FLOCK64_PAD
 };
 #endif
-#endif /* !CONFIG_64BIT */
 
 #endif /* _ASM_GENERIC_FCNTL_H */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2014-01-05 20:42 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-12-10 19:17 [PATCH v3 0/6] locks: implement "filp-private" (aka UNPOSIX) locks Jeff Layton
2013-12-10 19:17 ` [PATCH v3 1/6] locks: consolidate common code in the flock_to_posix_lock routines Jeff Layton
2013-12-10 21:22   ` J. Bruce Fields
2013-12-10 23:22     ` J. Bruce Fields
2013-12-11 11:18       ` Jeff Layton
2013-12-11 14:37         ` J. Bruce Fields
2013-12-11 15:19           ` J. Bruce Fields
2013-12-11 16:54             ` Jeff Layton
2013-12-11 16:59               ` J. Bruce Fields
2013-12-11 18:09                 ` Jeff Layton
2013-12-11 19:07             ` Jeff Layton
2013-12-11 22:56               ` J. Bruce Fields
2013-12-11 22:57                 ` J. Bruce Fields
2013-12-12 10:43                   ` Jeff Layton
2013-12-12 10:44                 ` Jeff Layton
2014-01-05 20:39                   ` J. Bruce Fields
2014-01-05 20:42                     ` [PATCH] locks: fix posix lock range overflow handling J. Bruce Fields
2013-12-10 19:17 ` [PATCH v3 2/6] locks: consolidate checks for compatible filp->f_mode values in setlk handlers Jeff Layton
2013-12-10 19:17 ` [PATCH v3 3/6] locks: rename locks_remove_flock to locks_remove_file Jeff Layton
2013-12-10 19:17 ` [PATCH v3 4/6] locks: show private lock types in /proc/locks Jeff Layton
2013-12-10 19:17 ` [PATCH v3 5/6] locks: report l_pid as -1 for FL_FILE_PVT locks Jeff Layton
2013-12-10 19:31   ` Jeff Layton
2013-12-10 19:41     ` [Nfs-ganesha-devel] " Frank Filz
2013-12-10 19:57       ` Jeff Layton
2013-12-10 19:17 ` [PATCH v3 6/6] locks: add new "private" lock type that is owned by the filp Jeff Layton
2013-12-17 13:31   ` Jeff Layton
2013-12-17 13:37     ` Christoph Hellwig
2013-12-17 13:50       ` Jeff Layton
2013-12-10 19:30 ` [Nfs-ganesha-devel] [PATCH v3 0/6] locks: implement "filp-private" (aka UNPOSIX) locks Frank Filz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).