[patch 2/2] vfs: relax count check in rw_verify

linux-fsdevel.vger.kernel.org archive mirror

* [patch 2/2] vfs: relax count check in rw_verify_area
@ 2010-10-13 20:46 Edward Shishkin
  2010-10-14 23:30 ` Andrew Morton
  0 siblings, 1 reply; 3+ messages in thread
From: Edward Shishkin @ 2010-10-13 20:46 UTC
  To: Andrew Morton, linux-fsdevel; +Cc: Eric Sandeen, lmcilroy, LKML

Increase count limit in rw_verify_area().

Signed-off-by: Edward Shishkin <edward@redhat.com>
---
 fs/read_write.c    |   17 +++++++----------
 fs/splice.c        |    4 ++--
 include/linux/fs.h |    2 +-
 3 files changed, 10 insertions(+), 13 deletions(-)

--- linux-2.6.36-rc7.orig/fs/read_write.c
+++ linux-2.6.36-rc7/fs/read_write.c
@@ -223,21 +223,20 @@ bad:
 #endif
 
 /*
- * rw_verify_area doesn't like huge counts. We limit
- * them to something that fits in "int" so that others
- * won't have to do range checks all the time.
+ * We limit huge counts to something that fits in "ssize_t"
  */
-#define MAX_RW_COUNT (INT_MAX & PAGE_CACHE_MASK)
+#define MAX_RW_COUNT ((~(size_t)0) >> 1 & PAGE_CACHE_MASK)
 
-int rw_verify_area(int read_write, struct file *file, loff_t *ppos, size_t count)
+ssize_t rw_verify_area(int read_write, struct file *file, loff_t *ppos,
+		       size_t count)
 {
 	struct inode *inode;
 	loff_t pos;
 	int retval = -EINVAL;
 
 	inode = file->f_path.dentry->d_inode;
-	if (unlikely((ssize_t) count < 0))
-		return retval;
+	if (unlikely(count > MAX_RW_COUNT))
+		count = MAX_RW_COUNT;
 	pos = *ppos;
 	if (unlikely((pos < 0) || (loff_t) (pos + count) < 0))
 		return retval;
@@ -251,9 +250,7 @@ int rw_verify_area(int read_write, struc
 	}
 	retval = security_file_permission(file,
 				read_write == READ ? MAY_READ : MAY_WRITE);
-	if (retval)
-		return retval;
-	return count > MAX_RW_COUNT ? MAX_RW_COUNT : count;
+	return retval ? retval : count;
 }
 
 static void wait_on_retry_sync_kiocb(struct kiocb *iocb)
--- linux-2.6.36-rc7.orig/fs/splice.c
+++ linux-2.6.36-rc7/fs/splice.c
@@ -1097,7 +1097,7 @@ static long do_splice_from(struct pipe_i
 {
 	ssize_t (*splice_write)(struct pipe_inode_info *, struct file *,
 				loff_t *, size_t, unsigned int);
-	int ret;
+	ssize_t ret;
 
 	if (unlikely(!(out->f_mode & FMODE_WRITE)))
 		return -EBADF;
@@ -1126,7 +1126,7 @@ static long do_splice_to(struct file *in
 {
 	ssize_t (*splice_read)(struct file *, loff_t *,
 			       struct pipe_inode_info *, size_t, unsigned int);
-	int ret;
+	ssize_t ret;
 
 	if (unlikely(!(in->f_mode & FMODE_READ)))
 		return -EBADF;
--- linux-2.6.36-rc7.orig/include/linux/fs.h
+++ linux-2.6.36-rc7/include/linux/fs.h
@@ -1824,7 +1824,7 @@ extern int current_umask(void);
 /* /sys/fs */
 extern struct kobject *fs_kobj;
 
-extern int rw_verify_area(int, struct file *, loff_t *, size_t);
+extern ssize_t rw_verify_area(int, struct file *, loff_t *, size_t);
 
 #define FLOCK_VERIFY_READ  1
 #define FLOCK_VERIFY_WRITE 2


* Re: [patch 2/2] vfs: relax count check in rw_verify_area
  2010-10-13 20:46 [patch 2/2] vfs: relax count check in rw_verify_area Edward Shishkin
@ 2010-10-14 23:30 ` Andrew Morton
  2010-10-26 14:44   ` Edward Shishkin
  0 siblings, 1 reply; 3+ messages in thread
From: Andrew Morton @ 2010-10-14 23:30 UTC
  To: Edward Shishkin; +Cc: linux-fsdevel, Eric Sandeen, lmcilroy, LKML

On Wed, 13 Oct 2010 22:46:21 +0200
Edward Shishkin <edward.shishkin@gmail.com> wrote:

> Increase count limit in rw_verify_area().
> 

OK, now this is a truly awful attempt to describe a patch.

afaict what the patch does is to change rw_verify_area() so that the
kernel now permits single reads and writes of up to 2^63 bytes on
64-bit systems.  Whereas it was previously limited to 2^31.  And the
patch also fixes up a couple of callsites which were assuming that
rw_verify_area() had that particular behaviour.
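
As a rough illustration of the two limits discussed above, here is a
userspace sketch (not kernel code). It assumes 4 KiB pages and a 32-bit
"int"; the SKETCH_* macros and the helper names are made up for this
example. The new-limit expression follows the MAX_RW_COUNT definition
in the patch.

```c
#include <limits.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. The kernel's PAGE_CACHE_MASK is arch-dependent. */
#define SKETCH_PAGE_SIZE ((size_t)4096)
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Old limit: largest page-aligned count that fits in "int". */
static size_t old_max_rw_count(void)
{
	return (size_t)INT_MAX & SKETCH_PAGE_MASK;
}

/* New limit: largest page-aligned count that fits in "ssize_t". */
static size_t new_max_rw_count(void)
{
	return ((~(size_t)0) >> 1) & SKETCH_PAGE_MASK;
}

/* Clamp a requested count to a limit, as the patched function does up front. */
static size_t clamp_count(size_t count, size_t limit)
{
	return count > limit ? limit : count;
}
```

Under these assumptions the old limit is 2^31 - 4096 bytes, and on a
64-bit system the new one is 2^63 - 4096 bytes.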

But that's just my guess, based on a quick read of the implementation. 
I didn't check how far this change penetrates.  Does it affect all
filesystems, for example?  If so were they all reviewed (or tested!)
for correctness?

And why was this patch written?  What motivated you?  What are the
user-visible effects?  Do manpages need updating?

I don't want to have to sit here scratching my head over the
implications and intent of *your* patch.  As at least a starting
point, you should be telling us, please.


* Re: [patch 2/2] vfs: relax count check in rw_verify_area
  2010-10-14 23:30 ` Andrew Morton
@ 2010-10-26 14:44   ` Edward Shishkin
  0 siblings, 0 replies; 3+ messages in thread
From: Edward Shishkin @ 2010-10-26 14:44 UTC
  To: Andrew Morton, Al Viro
  Cc: linux-fsdevel, Eric Sandeen, lmcilroy, LKML, Christoph Hellwig

Andrew Morton wrote:
> On Wed, 13 Oct 2010 22:46:21 +0200
> Edward Shishkin <edward.shishkin@gmail.com> wrote:
>
>> Increase count limit in rw_verify_area().
>
> OK, now this is a truly awful attempt to describe a patch.

I thought I had described everything in
"[patch 0/2][RFC] vfs: artefact(?) in rw_verify_area".
Well, I'll provide more details.

> afaict what the patch does is to change rw_verify_area() so that the
> kernel now permits single reads and writes of up to 2^63 bytes on
> 64-bit systems.  Whereas it was previously limited to 2^31.  And the
> patch also fixes up a couple of callsites which were assuming that
> rw_verify_area() had that particular behaviour.
>   

I found such assumptions rather strange. Why not rely on the
documentation for read(2) and write(2), which lets us request
up to SSIZE_MAX bytes in a single read/write?

Now about the bad aspect of this limitation.
There is the concept of transactions, which is very useful.
Sometimes we want some operations to be performed atomically, for
example, when you pay with your credit card. Should I explain what
can happen if such an operation is only half done?

Now note that the 2G restriction in rw_verify_area means that a file
system cannot write more than 2G bytes atomically without a special
notification from user space. Do we really need such workarounds?
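
To make the user-space side of this concrete: with the count clamped,
one large write(2) comes back short and the caller has to loop, so the
data reaches the file system as several separate requests. A minimal
sketch of such a loop follows. The helper name write_all is
hypothetical, and it assumes len does not exceed SSIZE_MAX.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch: keep calling write(2)
 * until len bytes are written, an error occurs, or no progress is made.
 * Returns the number of bytes written, or -1 on error.
 * Assumes len <= SSIZE_MAX.
 */
static ssize_t write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;
	size_t left = len;

	while (left > 0) {
		ssize_t n = write(fd, p, left);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		if (n == 0)
			break;		/* no progress, stop */
		p += n;			/* short write: advance and go on */
		left -= (size_t)n;
	}
	return (ssize_t)(len - left);
}
```

Each iteration is an independent write(2) call, so nothing ties the
pieces together into one atomic operation.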

Large transactions are possible: they can be issued, for example,
by some trusted centre that has many clients (such as a commercial
bank, a notary, etc). Actually, 2G is not a large value nowadays.

> But that's just my guess, based on a quick read of the implementation. 
> I didn't check how far this change penetrates.  Does it affect all
> filesystems, for example?  If so were they all reviewed (or tested!)
> for correctness?
>   

So far I have tested 15 callsites, and only 2 of them failed
(direct-io and ecryptfs). The direct-io one has already been fixed:
there was a truncation bug (see
[patch 1/2] vfs: fix overflow in direct-io subsystem).

I am ready to check/fix the other ones if there is any chance that
permitting large IOs will eventually be accepted.

> And why was this patch written?  What motivated you?

Our users asked us for it.

>   What are the user-visible effects?

There should not be any: according to the documentation,
we can already request up to SSIZE_MAX bytes per read/write.

>   Do manpages need updating?
>   

No, they don't.

Thanks,
Edward.
