public inbox for linux-kernel@vger.kernel.org
* [PATCH] Fix generic_block_fiemap for files bigger than 4GB
From: Mike Hommey @ 2009-10-26 17:24 UTC (permalink / raw)
  To: Alexander Viro; +Cc: Linus Torvalds, linux-kernel

Because of an integer overflow on start_blk, various kinds of wrong results
would be returned by the generic_block_fiemap handler, such as no extents
when there is a 4GB+ hole at the beginning of the file, or wrong fe_logical
when an extent starts after the first 4GB.

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 fs/ioctl.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/ioctl.c b/fs/ioctl.c
index 7b17a14..6c75110 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -254,7 +254,7 @@ int __generic_block_fiemap(struct inode *inode,
 			   u64 len, get_block_t *get_block)
 {
 	struct buffer_head tmp;
-	unsigned int start_blk;
+	unsigned long long start_blk;
 	long long length = 0, map_len = 0;
 	u64 logical = 0, phys = 0, size = 0;
 	u32 flags = FIEMAP_EXTENT_MERGED;
-- 
1.6.5
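
The overflow described in the commit message can be reproduced in a small
userspace sketch. It assumes 4 KiB blocks (an i_blkbits of 12) and uses
uint32_t as a stand-in for the old `unsigned int start_blk`; the two helper
names are hypothetical and do not exist in fs/ioctl.c. Because
blk_to_logical() is a macro, the shift is evaluated in the type of its
argument, so a 32-bit block number wraps as soon as the byte offset reaches
4GB.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for blk_to_logical() applied to a 32-bit
 * start_blk: the result is truncated to 32 bits, i.e. it wraps
 * modulo 2^32.
 */
static uint64_t blk_to_logical_32(uint32_t blk, unsigned int blkbits)
{
	uint32_t wrapped = blk << blkbits;

	return wrapped;
}

/* The same shift once start_blk is 64 bits wide, as in the patch. */
static uint64_t blk_to_logical_64(uint64_t blk, unsigned int blkbits)
{
	return blk << blkbits;
}
```

With 4 KiB blocks, block 0x100000 sits at byte offset 4GB:
blk_to_logical_32(0x100000, 12) yields 0, while
blk_to_logical_64(0x100000, 12) yields 0x100000000. That would be consistent
with the wrong fe_logical values reported for extents past the first 4GB.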



* Re: [PATCH] Fix generic_block_fiemap for files bigger than 4GB
From: Andrew Morton @ 2009-10-29  6:00 UTC (permalink / raw)
  To: Mike Hommey
  Cc: Alexander Viro, Linus Torvalds, linux-kernel, Steven Whitehouse,
	Theodore Ts'o, Eric Sandeen, Josef Bacik, Mark Fasheh

On Mon, 26 Oct 2009 18:24:28 +0100 Mike Hommey <mh@glandium.org> wrote:

> Because of an integer overflow on start_blk, various kinds of wrong results
> would be returned by the generic_block_fiemap handler, such as no extents
> when there is a 4GB+ hole at the beginning of the file, or wrong fe_logical
> when an extent starts after the first 4GB.
> 
> Signed-off-by: Mike Hommey <mh@glandium.org>
> ---
>  fs/ioctl.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/ioctl.c b/fs/ioctl.c
> index 7b17a14..6c75110 100644
> --- a/fs/ioctl.c
> +++ b/fs/ioctl.c
> @@ -254,7 +254,7 @@ int __generic_block_fiemap(struct inode *inode,
>  			   u64 len, get_block_t *get_block)
>  {
>  	struct buffer_head tmp;
> -	unsigned int start_blk;
> +	unsigned long long start_blk;
>  	long long length = 0, map_len = 0;
>  	u64 logical = 0, phys = 0, size = 0;
>  	u32 flags = FIEMAP_EXTENT_MERGED;

Well.  Should it be unsigned long long, or u64 or sector_t?  Or even loff_t.

The code's a bit confused about types in there.  And it's made much
more confusing by the moronic and wholly unnecessary use of macros for
blk_to_logical() and logical_to_blk().

It's also unhelpful that the `u64 start' argument forgot to get itself
documented in the kerneldoc comment.  Sigh.

Ah, generic_block_fiemap() has it:

 * @start: The initial block to map

I guess u64 was logical there as it comes in from userspace.  But at
some boundary we should start talking kernel types so I suspect the
correct thing to do here is to use sector_t?
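
A minimal userspace sketch of what typed helpers could look like in place
of the macros. The typedefs and the fake_inode struct are stand-ins invented
for this illustration (the kernel types are sector_t, loff_t and struct
inode), and the block type is assumed to be 64 bits wide here. With typed
parameters, a narrower argument is widened before the shift happens, so the
result no longer depends on the type of the caller's variable.

```c
#include <stdint.h>

typedef uint64_t demo_sector_t;	/* stand-in for sector_t, assumed 64-bit */
typedef int64_t demo_loff_t;	/* stand-in for loff_t */

/* Only the field these helpers need; not the real struct inode. */
struct fake_inode {
	unsigned int i_blkbits;
};

/* Byte offset -> block number. */
static inline demo_sector_t logical_to_blk(const struct fake_inode *inode,
					   demo_loff_t offset)
{
	return (demo_sector_t)(offset >> inode->i_blkbits);
}

/* Block number -> byte offset; blk is already 64 bits wide at the shift. */
static inline demo_loff_t blk_to_logical(const struct fake_inode *inode,
					 demo_sector_t blk)
{
	return (demo_loff_t)(blk << inode->i_blkbits);
}
```

Passing a 32-bit block number of 0x100000 with i_blkbits of 12 gives
0x100000000 rather than wrapping to 0, because the conversion to the
parameter type takes place before the shift.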



* Re: [PATCH] Fix generic_block_fiemap for files bigger than 4GB
From: Josef Bacik @ 2009-10-29 12:31 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mike Hommey, Alexander Viro, Linus Torvalds, linux-kernel,
	Steven Whitehouse, Theodore Ts'o, Eric Sandeen, Josef Bacik,
	Mark Fasheh

On Wed, Oct 28, 2009 at 11:00:22PM -0700, Andrew Morton wrote:
> On Mon, 26 Oct 2009 18:24:28 +0100 Mike Hommey <mh@glandium.org> wrote:
> 
> > Because of an integer overflow on start_blk, various kinds of wrong results
> > would be returned by the generic_block_fiemap handler, such as no extents
> > when there is a 4GB+ hole at the beginning of the file, or wrong fe_logical
> > when an extent starts after the first 4GB.
> > 
> > Signed-off-by: Mike Hommey <mh@glandium.org>
> > ---
> >  fs/ioctl.c |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> > 
> > diff --git a/fs/ioctl.c b/fs/ioctl.c
> > index 7b17a14..6c75110 100644
> > --- a/fs/ioctl.c
> > +++ b/fs/ioctl.c
> > @@ -254,7 +254,7 @@ int __generic_block_fiemap(struct inode *inode,
> >  			   u64 len, get_block_t *get_block)
> >  {
> >  	struct buffer_head tmp;
> > -	unsigned int start_blk;
> > +	unsigned long long start_blk;
> >  	long long length = 0, map_len = 0;
> >  	u64 logical = 0, phys = 0, size = 0;
> >  	u32 flags = FIEMAP_EXTENT_MERGED;
> 
> Well.  Should it be unsigned long long, or u64 or sector_t?  Or even loff_t.
> 
> The code's a bit confused about types in there.  And it's made much
> more confusing by the moronic and wholly unnecessary use of macros for
> blk_to_logical() and logical_to_blk().
> 
> It's also unhelpful that the `u64 start' argument forgot to get itself
> documented in the kerneldoc comment.  Sigh.
> 
> Ah, generic_block_fiemap() has it:
> 
>  * @start: The initial block to map
> 
> I guess u64 was logical there as it comes in from userspace.  But at
> some boundary we should start talking kernel types so I suspect the
> correct thing to do here is to use sector_t?
> 

Hmm, this is strange.  I sent a patch a few months ago, after you chewed me out
the first time for this, to change all the types and such and turn the macros
into inline functions.  I will see if I can find it and resend it.  Thanks,

Josef


* Re: [PATCH] Fix generic_block_fiemap for files bigger than 4GB
From: Josef Bacik @ 2009-10-29 12:40 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Andrew Morton, Mike Hommey, Alexander Viro, Linus Torvalds,
	linux-kernel, Steven Whitehouse, Theodore Ts'o, Eric Sandeen,
	Josef Bacik, Mark Fasheh

On Thu, Oct 29, 2009 at 08:31:54AM -0400, Josef Bacik wrote:
> On Wed, Oct 28, 2009 at 11:00:22PM -0700, Andrew Morton wrote:
> > On Mon, 26 Oct 2009 18:24:28 +0100 Mike Hommey <mh@glandium.org> wrote:
> > 
> > > Because of an integer overflow on start_blk, various kinds of wrong results
> > > would be returned by the generic_block_fiemap handler, such as no extents
> > > when there is a 4GB+ hole at the beginning of the file, or wrong fe_logical
> > > when an extent starts after the first 4GB.
> > > 
> > > Signed-off-by: Mike Hommey <mh@glandium.org>
> > > ---
> > >  fs/ioctl.c |    2 +-
> > >  1 files changed, 1 insertions(+), 1 deletions(-)
> > > 
> > > diff --git a/fs/ioctl.c b/fs/ioctl.c
> > > index 7b17a14..6c75110 100644
> > > --- a/fs/ioctl.c
> > > +++ b/fs/ioctl.c
> > > @@ -254,7 +254,7 @@ int __generic_block_fiemap(struct inode *inode,
> > >  			   u64 len, get_block_t *get_block)
> > >  {
> > >  	struct buffer_head tmp;
> > > -	unsigned int start_blk;
> > > +	unsigned long long start_blk;
> > >  	long long length = 0, map_len = 0;
> > >  	u64 logical = 0, phys = 0, size = 0;
> > >  	u32 flags = FIEMAP_EXTENT_MERGED;
> > 
> > Well.  Should it be unsigned long long, or u64 or sector_t?  Or even loff_t.
> > 
> > The code's a bit confused about types in there.  And it's made much
> > more confusing by the moronic and wholly unnecessary use of macros for
> > blk_to_logical() and logical_to_blk().
> > 
> > It's also unhelpful that the `u64 start' argument forgot to get itself
> > documented in the kerneldoc comment.  Sigh.
> > 
> > Ah, generic_block_fiemap() has it:
> > 
> >  * @start: The initial block to map
> > 
> > I guess u64 was logical there as it comes in from userspace.  But at
> > some boundary we should start talking kernel types so I suspect the
> > correct thing to do here is to use sector_t?
> > 
> 
> Hmm, this is strange.  I sent a patch a few months ago, after you chewed me out
> the first time for this, to change all the types and such and turn the macros
> into inline functions.  I will see if I can find it and resend it.  Thanks,
> 

OK, here it is.  I've fixed all the type issues, along with other problems
where FIEMAP_EXTENT_LAST was not being set properly.  If this is more
reasonable to you I will update it and send it along properly.  Thanks,

Josef

diff --git a/fs/ioctl.c b/fs/ioctl.c
index ac2d47e..ee9fba0 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -228,13 +228,22 @@ static int ioctl_fiemap(struct file *filp, unsigned long arg)
 
 #ifdef CONFIG_BLOCK
 
-#define blk_to_logical(inode, blk) (blk << (inode)->i_blkbits)
-#define logical_to_blk(inode, offset) (offset >> (inode)->i_blkbits);
+static inline sector_t logical_to_blk(struct inode *inode, loff_t offset)
+{
+	return (offset >> inode->i_blkbits);
+}
+
+static inline loff_t blk_to_logical(struct inode *inode, sector_t blk)
+{
+	return (blk << inode->i_blkbits);
+}
 
 /**
  * __generic_block_fiemap - FIEMAP for block based inodes (no locking)
  * @inode - the inode to map
- * @arg - the pointer to userspace where we copy everything to
+ * @fieinfo - the fiemap info struct that will be passed back to userspace
+ * @start - where to start mapping in the inode
+ * @len - how much space to map
  * @get_block - the fs's get_block function
  *
  * This does FIEMAP for block based inodes.  Basically it will just loop
@@ -250,43 +259,63 @@ static int ioctl_fiemap(struct file *filp, unsigned long arg)
  */
 
 int __generic_block_fiemap(struct inode *inode,
-			   struct fiemap_extent_info *fieinfo, u64 start,
-			   u64 len, get_block_t *get_block)
+			   struct fiemap_extent_info *fieinfo, loff_t start,
+			   loff_t len, get_block_t *get_block)
 {
-	struct buffer_head tmp;
-	unsigned int start_blk;
-	long long length = 0, map_len = 0;
+	struct buffer_head map_bh;
+	sector_t start_blk;
+	loff_t end;
 	u64 logical = 0, phys = 0, size = 0;
 	u32 flags = FIEMAP_EXTENT_MERGED;
+	bool past_eof = false, whole_file = false;
 	int ret = 0;
 
-	if ((ret = fiemap_check_flags(fieinfo, FIEMAP_FLAG_SYNC)))
+	ret = fiemap_check_flags(fieinfo, FIEMAP_FLAG_SYNC);
+	if (ret)
 		return ret;
 
 	start_blk = logical_to_blk(inode, start);
 
-	length = (long long)min_t(u64, len, i_size_read(inode));
-	map_len = length;
+	if (len >= i_size_read(inode)) {
+		whole_file = true;
+		len = i_size_read(inode);
+	}
+
+	end = start + len;
 
 	do {
 		/*
-		 * we set b_size to the total size we want so it will map as
+		 * We set b_size to the total size we want so it will map as
 		 * many contiguous blocks as possible at once
 		 */
-		memset(&tmp, 0, sizeof(struct buffer_head));
-		tmp.b_size = map_len;
+		memset(&map_bh, 0, sizeof(struct buffer_head));
+		map_bh.b_size = len;
 
-		ret = get_block(inode, start_blk, &tmp, 0);
+		ret = get_block(inode, start_blk, &map_bh, 0);
 		if (ret)
 			break;
 
 		/* HOLE */
-		if (!buffer_mapped(&tmp)) {
+		if (!buffer_mapped(&map_bh)) {
+			start_blk++;
+
 			/*
-			 * first hole after going past the EOF, this is our
+			 * We want to handle the case where there is an
+			 * allocated block at the front of the file, and then
+			 * nothing but holes up to the end of the file properly,
+			 * to make sure that extent at the front gets properly
+			 * marked with FIEMAP_EXTENT_LAST
+			 */
+			if (!past_eof &&
+			    blk_to_logical(inode, start_blk) >=
+			    blk_to_logical(inode, 0) + i_size_read(inode))
+				past_eof = true;
+
+			/*
+			 * First hole after going past the EOF, this is our
 			 * last extent
 			 */
-			if (length <= 0) {
+			if (past_eof && size) {
 				flags = FIEMAP_EXTENT_MERGED|FIEMAP_EXTENT_LAST;
 				ret = fiemap_fill_next_extent(fieinfo, logical,
 							      phys, size,
@@ -294,15 +323,39 @@ int __generic_block_fiemap(struct inode *inode,
 				break;
 			}
 
-			length -= blk_to_logical(inode, 1);
-
-			/* if we have holes up to/past EOF then we're done */
-			if (length <= 0)
+			/* If we have holes up to/past EOF then we're done */
+			if (blk_to_logical(inode, start_blk) >= end
+			    || past_eof)
 				break;
-
-			start_blk++;
 		} else {
-			if (length <= 0 && size) {
+			/*
+			 * We have gone over the length of what we wanted to
+			 * map, and it wasn't the entire file, so add the extent
+			 * we got last time and exit.
+			 *
+			 * This is for the case where say we want to map all the
+			 * way up to the second to the last block in a file, but
+			 * the last block is a hole, making the second to last
+			 * block FIEMAP_EXTENT_LAST.  In this case we want to
+			 * see if there is a hole after the second to last block
+			 * so we can mark it properly.  If we found data after
+			 * we exceeded the length we were requesting, then we
+			 * are good to go, just add the extent to the fieinfo
+			 * and break
+			 */
+			if (blk_to_logical(inode, start_blk) >= end
+			    && !whole_file) {
+				ret = fiemap_fill_next_extent(fieinfo, logical,
+							      phys, size,
+							      flags);
+				break;
+			}
+
+			/*
+			 * If size != 0 then we know we already have an extent
+			 * to add, so add it.
+			 */
+			if (size) {
 				ret = fiemap_fill_next_extent(fieinfo, logical,
 							      phys, size,
 							      flags);
@@ -311,32 +364,26 @@ int __generic_block_fiemap(struct inode *inode,
 			}
 
 			logical = blk_to_logical(inode, start_blk);
-			phys = blk_to_logical(inode, tmp.b_blocknr);
-			size = tmp.b_size;
+			phys = blk_to_logical(inode, map_bh.b_blocknr);
+			size = map_bh.b_size;
 			flags = FIEMAP_EXTENT_MERGED;
 
-			length -= tmp.b_size;
 			start_blk += logical_to_blk(inode, size);
 
 			/*
-			 * if we are past the EOF we need to loop again to see
-			 * if there is a hole so we can mark this extent as the
-			 * last one, and if not keep mapping things until we
-			 * find a hole, or we run out of slots in the extent
-			 * array
+			 * If we are past the EOF, then we need to make sure as
+			 * soon as we find a hole that the last extent we found
+			 * is marked with FIEMAP_EXTENT_LAST
 			 */
-			if (length <= 0)
-				continue;
-
-			ret = fiemap_fill_next_extent(fieinfo, logical, phys,
-						      size, flags);
-			if (ret)
-				break;
+			if (!past_eof &&
+			    logical + size >=
+			    blk_to_logical(inode, 0) + i_size_read(inode))
+				past_eof = true;
 		}
 		cond_resched();
 	} while (1);
 
-	/* if ret is 1 then we just hit the end of the extent array */
+	/* If ret is 1 then we just hit the end of the extent array */
 	if (ret == 1)
 		ret = 0;
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5bed436..b23c725 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2305,8 +2305,9 @@ extern int vfs_fstatat(int , char __user *, struct kstat *, int);
 extern int do_vfs_ioctl(struct file *filp, unsigned int fd, unsigned int cmd,
 		    unsigned long arg);
 extern int __generic_block_fiemap(struct inode *inode,
-				  struct fiemap_extent_info *fieinfo, u64 start,
-				  u64 len, get_block_t *get_block);
+				  struct fiemap_extent_info *fieinfo,
+				  loff_t start, loff_t len,
+				  get_block_t *get_block);
 extern int generic_block_fiemap(struct inode *inode,
 				struct fiemap_extent_info *fieinfo, u64 start,
 				u64 len, get_block_t *get_block);


* Re: [PATCH] Fix generic_block_fiemap for files bigger than 4GB
From: Andrew Morton @ 2009-10-29 15:17 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Mike Hommey, Alexander Viro, Linus Torvalds, linux-kernel,
	Steven Whitehouse, Theodore Ts'o, Eric Sandeen, Josef Bacik,
	Mark Fasheh

On Thu, 29 Oct 2009 08:40:56 -0400 Josef Bacik <josef@redhat.com> wrote:

> 
> OK, here it is.  I've fixed all the type issues, along with other problems
> where FIEMAP_EXTENT_LAST was not being set properly.  If this is more
> reasonable to you I will update it and send it along properly.  Thanks,

geeze, how did we lose a patch of this magnitude?

> diff --git a/fs/ioctl.c b/fs/ioctl.c
> index ac2d47e..ee9fba0 100644
> --- a/fs/ioctl.c
> +++ b/fs/ioctl.c
> @@ -228,13 +228,22 @@ static int ioctl_fiemap(struct file *filp, unsigned long arg)
>  
>  #ifdef CONFIG_BLOCK
>  
> -#define blk_to_logical(inode, blk) (blk << (inode)->i_blkbits)
> -#define logical_to_blk(inode, offset) (offset >> (inode)->i_blkbits);
> +static inline sector_t logical_to_blk(struct inode *inode, loff_t offset)
> +{
> +	return (offset >> inode->i_blkbits);
> +}
> +
> +static inline loff_t blk_to_logical(struct inode *inode, sector_t blk)
> +{
> +	return (blk << inode->i_blkbits);
> +}

ah.  Adding the types really does clarify things.

>  /**
>   * __generic_block_fiemap - FIEMAP for block based inodes (no locking)
>   * @inode - the inode to map
> - * @arg - the pointer to userspace where we copy everything to
> + * @fieinfo - the fiemap info struct that will be passed back to userspace
> + * @start - where to start mapping in the inode
> + * @len - how much space to map
>   * @get_block - the fs's get_block function
>   *
>   * This does FIEMAP for block based inodes.  Basically it will just loop
> @@ -250,43 +259,63 @@ static int ioctl_fiemap(struct file *filp, unsigned long arg)
>   */
>  
>  int __generic_block_fiemap(struct inode *inode,
> -			   struct fiemap_extent_info *fieinfo, u64 start,
> -			   u64 len, get_block_t *get_block)
> +			   struct fiemap_extent_info *fieinfo, loff_t start,
> +			   loff_t len, get_block_t *get_block)
>  {
> -	struct buffer_head tmp;
> -	unsigned int start_blk;
> -	long long length = 0, map_len = 0;
> +	struct buffer_head map_bh;
> +	sector_t start_blk;

And the bugfix is there.

Has anyone actually tested this code on large files?  Greater than 4G
sectors and greater than 4G pages?




* Re: [PATCH] Fix generic_block_fiemap for files bigger than 4GB
From: Mike Hommey @ 2009-10-29 15:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Josef Bacik, Alexander Viro, Linus Torvalds, linux-kernel,
	Steven Whitehouse, Theodore Ts'o, Eric Sandeen, Josef Bacik,
	Mark Fasheh

On Thu, Oct 29, 2009 at 08:17:01AM -0700, Andrew Morton wrote:
> > -	struct buffer_head tmp;
> > -	unsigned int start_blk;
> > -	long long length = 0, map_len = 0;
> > +	struct buffer_head map_bh;
> > +	sector_t start_blk;
> 
> And the bugfix is there.
> 
> Has anyone actually tested this code on large files?  Greater than 4G
> sectors and greater than 4G pages?
> 

I tested my patch with 2TB sparse files that would fail without the patch.

I can give Josef's patch a try if that's necessary.

Mike
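
For anyone wanting to repeat this kind of test, the following is a rough
sketch of a userspace extent dumper built on the FS_IOC_FIEMAP ioctl. It is
not the tool used in this thread; dump_extents() and MAX_EXTENTS are names
made up for the example, and the fixed 32-extent buffer is an arbitrary
simplification.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>		/* FS_IOC_FIEMAP */
#include <linux/fiemap.h>	/* struct fiemap, struct fiemap_extent */

#define MAX_EXTENTS 32		/* arbitrary limit for this sketch */

/* Print up to MAX_EXTENTS extents of path; return the count, -1 on error. */
static int dump_extents(const char *path)
{
	struct fiemap *fm;
	unsigned int i;
	int fd, n;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	fm = calloc(1, sizeof(*fm) +
		       MAX_EXTENTS * sizeof(struct fiemap_extent));
	if (!fm) {
		close(fd);
		return -1;
	}

	fm->fm_start = 0;
	fm->fm_length = FIEMAP_MAX_OFFSET;	/* map the whole file */
	fm->fm_flags = FIEMAP_FLAG_SYNC;
	fm->fm_extent_count = MAX_EXTENTS;

	if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
		free(fm);
		close(fd);
		return -1;
	}

	for (i = 0; i < fm->fm_mapped_extents; i++) {
		struct fiemap_extent *fe = &fm->fm_extents[i];

		printf("logical=%llu physical=%llu length=%llu%s\n",
		       (unsigned long long)fe->fe_logical,
		       (unsigned long long)fe->fe_physical,
		       (unsigned long long)fe->fe_length,
		       (fe->fe_flags & FIEMAP_EXTENT_LAST) ? " LAST" : "");
	}

	n = (int)fm->fm_mapped_extents;
	free(fm);
	close(fd);
	return n;
}
```

One way to exercise the 4GB boundary would be a sparse file with a single
block written at offset 4GB, for instance created with
`dd if=/dev/zero of=sparse bs=4096 count=1 seek=1048576`, on a filesystem
whose fiemap handler goes through generic_block_fiemap(). The expectation is
one extent with logical=4294967296 flagged LAST.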
