linux-fsdevel.vger.kernel.org archive mirror
* [RFC] block_dev:Fix bug when read/write block-device which is larger than 16TB in 32bit-OS.
@ 2012-05-29  8:56 majianpeng
  2012-07-24 12:44 ` majianpeng
  0 siblings, 1 reply; 5+ messages in thread
From: majianpeng @ 2012-05-29  8:56 UTC (permalink / raw)
  To: Andrew Morton, viro; +Cc: linux-fsdevel, linux-mm

The block device is larger than 16TB and the OS is 32-bit.
If the read/write offset is larger than 16TB, the address_space index will
overflow and data will be supplied from a low offset instead.

For reads, in do_generic_file_read():
>index = *ppos >> PAGE_CACHE_SHIFT;
Because *ppos is larger than 16TB and index is of type pgoff_t, which is 32 bits wide
on a 32-bit OS, index overflows.

For writes, in generic_write_checks():
>if (likely(!isblk)) {
>		.....
>	} else {
>#ifdef CONFIG_BLOCK
>		loff_t isize;
>		if (bdev_read_only(I_BDEV(inode)))
>			return -EPERM;
>		isize = i_size_read(inode);
>		if (*pos >= isize) {
>			if (*count || *pos > isize)
>				return -ENOSPC;
>		}
>
>		if (*pos + *count > isize)
>			*count = isize - *pos;
This code only checks the device size. The write then continues through:
generic_file_buffered_write-->generic_perform_write-->blkdev_write_begin
--->block_write_begin()
> pgoff_t index = pos >> PAGE_CACHE_SHIFT;
and the index overflows again.

Although a filesystem has the s_maxbytes attribute, it is not applied to the block device itself, so it has no effect here.


Signed-off-by: majianpeng <majianpeng@gmail.com>
---
 fs/block_dev.c |    4 +++-
 mm/filemap.c   |   28 ++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index c2bbe1f..1752c0e 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -382,7 +382,9 @@ static loff_t block_llseek(struct file *file, loff_t offset, int origin)
 
 	mutex_lock(&bd_inode->i_mutex);
 	size = i_size_read(bd_inode);
-
+#if BITS_PER_LONG == 32
+	size = min_t(loff_t, size, (loff_t)0xFFFFFFFF * PAGE_CACHE_SIZE - 1);
+#endif
 	retval = -EINVAL;
 	switch (origin) {
 		case SEEK_END:
diff --git a/mm/filemap.c b/mm/filemap.c
index 79c4b2b..34a15bf 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1373,6 +1373,25 @@ int generic_segment_checks(const struct iovec *iov,
 }
 EXPORT_SYMBOL(generic_segment_checks);
 
+static inline
+int generic_read_block_checks(struct file *file, loff_t *pos, size_t *count)
+{
+	struct inode *inode = file->f_mapping->host;
+	loff_t isize = 0;
+#if BITS_PER_LONG == 32 && defined(CONFIG_BLOCK)
+	isize = min_t(loff_t, i_size_read(inode),
+			(loff_t)0xFFFFFFFF * PAGE_CACHE_SIZE - 1);
+	if (*pos >= isize) {
+		if (*count || *pos > isize)
+			return -ENOSPC;
+	}
+
+	if (*pos + *count > isize)
+		*count = isize - *pos;
+#endif
+	return 0;
+}
+
 /**
  * generic_file_aio_read - generic filesystem read routine
  * @iocb:	kernel I/O control block
@@ -1398,6 +1417,11 @@ generic_file_aio_read(struct kiocb *iocb, const struct iovec *iov,
 	if (retval)
 		return retval;
 
+	if (S_ISBLK(filp->f_mapping->host->i_mode)) {
+		retval = generic_read_block_checks(filp, &pos, &count);
+		if (retval)
+			return retval;
+	}
 	/* coalesce the iovecs and go direct-to-BIO for O_DIRECT */
 	if (filp->f_flags & O_DIRECT) {
 		loff_t size;
@@ -2214,6 +2238,10 @@ inline int generic_write_checks(struct file *file, loff_t *pos, size_t *count, i
 		if (bdev_read_only(I_BDEV(inode)))
 			return -EPERM;
 		isize = i_size_read(inode);
+#if BITS_PER_LONG == 32
+		isize = min_t(loff_t, isize,
+				(loff_t)0xFFFFFFFF * PAGE_CACHE_SIZE - 1);
+#endif
 		if (*pos >= isize) {
 			if (*count || *pos > isize)
 				return -ENOSPC;
-- 
1.7.9.5

 				
--------------
majianpeng
2012-05-29

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RFC] block_dev:Fix bug when read/write block-device which is larger than 16TB in 32bit-OS.
  2012-05-29  8:56 [RFC] block_dev:Fix bug when read/write block-device which is larger than 16TB in 32bit-OS majianpeng
@ 2012-07-24 12:44 ` majianpeng
  2012-07-24 13:48   ` Christoph Hellwig
  0 siblings, 1 reply; 5+ messages in thread
From: majianpeng @ 2012-07-24 12:44 UTC (permalink / raw)
  To: Andrew Morton, viro@ZenIV.linux.org.uk; +Cc: linux-fsdevel, linux-mm

What do you think of this patch? Is it OK, or is it wrong?
Nobody has replied so far; maybe the patch makes no sense.

Thanks!

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC] block_dev:Fix bug when read/write block-device which is larger than 16TB in 32bit-OS.
  2012-07-24 12:44 ` majianpeng
@ 2012-07-24 13:48   ` Christoph Hellwig
  2012-07-26  5:22     ` majianpeng
  2012-07-27  5:38     ` majianpeng
  0 siblings, 2 replies; 5+ messages in thread
From: Christoph Hellwig @ 2012-07-24 13:48 UTC (permalink / raw)
  To: majianpeng
  Cc: Andrew Morton, viro@ZenIV.linux.org.uk, linux-fsdevel, linux-mm

On Tue, Jul 24, 2012 at 08:44:27PM +0800, majianpeng wrote:
> On 2012-05-29 16:56 majianpeng <majianpeng@gmail.com> Wrote:
> >The size of block-device is larger than 16TB, and the os is 32bit.
> >If the offset of read/write is larger then 16TB. The index of address_space will
> >overflow and supply data from low offset instead.

We can't support > 16TB block device on 32-bit systems with 4k page
size, just like we can't support files that large.

For filesystems the s_maxbytes limit of MAX_LFS_FILESIZE takes care of
that, but it seems like we miss that check for block devices.

The proper fix is to add that check (either via s_maxbytes or by
checking MAX_LFS_FILESIZE) to generic_write_checks and
generic_file_aio_read (or a block device specific wrapper)


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Re: [RFC] block_dev:Fix bug when read/write block-device which is larger than 16TB in 32bit-OS.
  2012-07-24 13:48   ` Christoph Hellwig
@ 2012-07-26  5:22     ` majianpeng
  2012-07-27  5:38     ` majianpeng
  1 sibling, 0 replies; 5+ messages in thread
From: majianpeng @ 2012-07-26  5:22 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Andrew Morton, viro, linux-fsdevel, linux-mm

On 2012-07-24 21:48 Christoph Hellwig <hch@infradead.org> Wrote:
>On Tue, Jul 24, 2012 at 08:44:27PM +0800, majianpeng wrote:
>> On 2012-05-29 16:56 majianpeng <majianpeng@gmail.com> Wrote:
>> >The size of block-device is larger than 16TB, and the os is 32bit.
>> >If the offset of read/write is larger then 16TB. The index of address_space will
>> >overflow and supply data from low offset instead.
>
>We can't support > 16TB block device on 32-bit systems with 4k page
>size, just like we can't support files that large.
>
>For filesystems the s_maxbytes limit of MAX_LFS_FILESIZE takes care of
>that, but it seems like we miss that check for block devices.
>
>The proper fix is to add that check (either via s_maxbytes or by
>checking MAX_LFS_FILESIZE) to generic_write_checks and
>generic_file_aio_read (or a block device specific wrapper)
>
I have a question: why does the read path not perform a check like generic_write_checks()?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Re: [RFC] block_dev:Fix bug when read/write block-device which is larger than 16TB in 32bit-OS.
  2012-07-24 13:48   ` Christoph Hellwig
  2012-07-26  5:22     ` majianpeng
@ 2012-07-27  5:38     ` majianpeng
  1 sibling, 0 replies; 5+ messages in thread
From: majianpeng @ 2012-07-27  5:38 UTC (permalink / raw)
  To: Christoph Hellwig, fengguang.wu
  Cc: Andrew Morton, viro, linux-fsdevel, linux-mm

On 2012-07-24 21:48 Christoph Hellwig <hch@infradead.org> Wrote:
>On Tue, Jul 24, 2012 at 08:44:27PM +0800, majianpeng wrote:
>> On 2012-05-29 16:56 majianpeng <majianpeng@gmail.com> Wrote:
>> >The size of block-device is larger than 16TB, and the os is 32bit.
>> >If the offset of read/write is larger then 16TB. The index of address_space will
>> >overflow and supply data from low offset instead.
>
>We can't support > 16TB block device on 32-bit systems with 4k page
>size, just like we can't support files that large.
>
>For filesystems the s_maxbytes limit of MAX_LFS_FILESIZE takes care of
>that, but it seems like we miss that check for block devices.
>
>The proper fix is to add that check (either via s_maxbytes or by
>checking MAX_LFS_FILESIZE) to generic_write_checks and
>generic_file_aio_read (or a block device specific wrapper)
>
/* Page cache limit. The filesystems should put that into their s_maxbytes 
   limits, otherwise bad things can happen in VM. */ 
#if BITS_PER_LONG==32
#define MAX_LFS_FILESIZE	(((u64)PAGE_CACHE_SIZE << (BITS_PER_LONG-1))-1) 
#elif BITS_PER_LONG==64
#define MAX_LFS_FILESIZE 	0x7fffffffffffffffUL
#endif

If we use MAX_LFS_FILESIZE to limit the block device, then on a 32-bit OS the usable size of a
block device is only 8TB - 1.
But in do_generic_file_read():
>>index = *ppos >> PAGE_CACHE_SHIFT;
index is of type unsigned long, so *ppos could be as large as 16TB - 1.

But the comment says:
>>/* Page cache limit. The filesystems should put that into their s_maxbytes 
>>   limits, otherwise bad things can happen in VM. */ 
Why is that?

Thanks!

^ permalink raw reply	[flat|nested] 5+ messages in thread
