public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
* btrfs fallocate woes
@ 2010-01-14 11:28 Paul Komkoff
  2010-01-14 17:27 ` Roland Dreier
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Komkoff @ 2010-01-14 11:28 UTC (permalink / raw)
  To: linux-btrfs

Oh hai

Sorry if it's already fixed, but at least in fedora's
2.6.32.3-10.fc12.i686.PAE btrfs has a regression which can be
illustrated by running the following dumb test:

=== cut here ===
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main() {
  int fd = open("testfile", O_CLOEXEC | O_CREAT | O_NOATIME | O_WRONLY, 0666);

  fallocate(fd, 0, 0, 100);

  write(fd, "test", 4);

  close(fd);
}
=== cut here ===

if you run it on ext4, it will create a 4-byte file with "test" in it.
On btrfs, however, the file size would be 4096, and the remaining
space will be filled with zeroes.

I will try more recent kernels with that though.
-- 
This message represents the official view of the voices in my head

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: btrfs fallocate woes
  2010-01-14 11:28 btrfs fallocate woes Paul Komkoff
@ 2010-01-14 17:27 ` Roland Dreier
  2010-01-14 18:24   ` Paul Komkoff
                     ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Roland Dreier @ 2010-01-14 17:27 UTC (permalink / raw)
  To: Paul Komkoff; +Cc: linux-btrfs


 > if you run it on ext4, it will create a 4-byte file with "test" in it.
 > On btrfs, however, the file size would be 4096, and the remaining
 > space will be filled with zeroes.

My fallocate man page says:

       Because allocation is done in block size chunks, fallocate() may
       allocate a larger range than that which was specified.

so the btrfs behavior seems OK to me.

You say this is a regression.  What btrfs version behaved differently?

 - R.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: btrfs fallocate woes
  2010-01-14 17:27 ` Roland Dreier
@ 2010-01-14 18:24   ` Paul Komkoff
  2010-01-14 18:25     ` Paul Komkoff
  2010-01-14 18:26   ` Aneesh Kumar K. V
  2010-01-14 19:20   ` Chris Mason
  2 siblings, 1 reply; 10+ messages in thread
From: Paul Komkoff @ 2010-01-14 18:24 UTC (permalink / raw)
  To: Roland Dreier; +Cc: linux-btrfs

On Thu, Jan 14, 2010 at 5:27 PM, Roland Dreier <rdreier@cisco.com> wrot=
e:
> My fallocate man page says:
>
> =A0 =A0 =A0 Because allocation is done in block size chunks, fallocat=
e() may
> =A0 =A0 =A0 allocate a larger range than that which was specified.
>
> so the btrfs behavior seems OK to me.
>
> You say this is a regression. =A0What btrfs version behaved different=
ly?

Err. My wording may not be very precise.
I see this as the regression from every other filesystem, to the
extent that rsync --preallocate will give you directory tree full of
corrupted files.

Also, your man page also says something else:
       If  FALLOC_FL_KEEP_SIZE  flag is not specified in mode, the
default behavior is almost same as when this flag is specified.  The
only difference is that on
       success, the file size will be changed if offset + len is
greater than the file size.   This  default  behavior  closely
resembles  the  behavior  of  the
       posix_fallocate(3) library function, and is intended as a
method of optimally implementing that function.

See, here nothing says to which value file size will be changed. Also,
posix_fallocate man page (which is actually a shortcut to
fallocate(fd, 0, ...)) says:
       If the size of the file is less than offset+len, then the file
is increased to this size; otherwise the file size is left unchanged.

Nowhere there it is said that size is ceil()'d to a blocksize.
Moreover, I never saw an app that does fallocate() and then
ftruncate() - every single app out there assumes that posix_fallocate
(and fallocate, fwiw), while may allocate more space, will set the
file size to something different than offset+len.
--=20
This message represents the official view of the voices in my head
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" =
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: btrfs fallocate woes
  2010-01-14 18:24   ` Paul Komkoff
@ 2010-01-14 18:25     ` Paul Komkoff
  0 siblings, 0 replies; 10+ messages in thread
From: Paul Komkoff @ 2010-01-14 18:25 UTC (permalink / raw)
  To: Roland Dreier; +Cc: linux-btrfs

2010/1/14 i <i@stingr.net>:
> Nowhere there it is said that size is ceil()'d to a blocksize. Moreover, I never saw an app that does fallocate() and then ftruncate() - every single app out there assumes that posix_fallocate (and fallocate, fwiw), while may allocate more space, will set the file size to something different than offset+len.

Err, of course it's to just offset+len. I may need more sleep (this
won't purge away the fallocate problem though).
-- 
This message represents the official view of the voices in my head

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: btrfs fallocate woes
  2010-01-14 17:27 ` Roland Dreier
  2010-01-14 18:24   ` Paul Komkoff
@ 2010-01-14 18:26   ` Aneesh Kumar K. V
  2010-01-14 19:20   ` Chris Mason
  2 siblings, 0 replies; 10+ messages in thread
From: Aneesh Kumar K. V @ 2010-01-14 18:26 UTC (permalink / raw)
  To: Roland Dreier, Paul Komkoff; +Cc: linux-btrfs

On Thu, 14 Jan 2010 09:27:38 -0800, Roland Dreier <rdreier@cisco.com> wrote:
> 
>  > if you run it on ext4, it will create a 4-byte file with "test" in it.
>  > On btrfs, however, the file size would be 4096, and the remaining
>  > space will be filled with zeroes.
> 
> My fallocate man page says:
> 
>        Because allocation is done in block size chunks, fallocate() may
>        allocate a larger range than that which was specified.
> 

But with zero as the mode value it should not have updated 
i_size. Looking at the latest linus tree it seems to be doing the right
thing

-aneesh

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: btrfs fallocate woes
  2010-01-14 17:27 ` Roland Dreier
  2010-01-14 18:24   ` Paul Komkoff
  2010-01-14 18:26   ` Aneesh Kumar K. V
@ 2010-01-14 19:20   ` Chris Mason
  2010-01-14 20:33     ` Paul Komkoff
  2 siblings, 1 reply; 10+ messages in thread
From: Chris Mason @ 2010-01-14 19:20 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Paul Komkoff, linux-btrfs

On Thu, Jan 14, 2010 at 09:27:38AM -0800, Roland Dreier wrote:
> 
>  > if you run it on ext4, it will create a 4-byte file with "test" in it.
>  > On btrfs, however, the file size would be 4096, and the remaining
>  > space will be filled with zeroes.
> 
> My fallocate man page says:
> 
>        Because allocation is done in block size chunks, fallocate() may
>        allocate a larger range than that which was specified.
> 
> so the btrfs behavior seems OK to me.
> 
> You say this is a regression.  What btrfs version behaved differently?

The file size should still be 4 bytes I think, even if we allocate 4096.

-chris


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: btrfs fallocate woes
  2010-01-14 19:20   ` Chris Mason
@ 2010-01-14 20:33     ` Paul Komkoff
  2010-01-19 15:18       ` Paul Komkoff
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Komkoff @ 2010-01-14 20:33 UTC (permalink / raw)
  To: Chris Mason, Roland Dreier, Paul Komkoff, linux-btrfs

On Thu, Jan 14, 2010 at 7:20 PM, Chris Mason <chris.mason@oracle.com> wrote:
> The file size should still be 4 bytes I think, even if we allocate 4096.

Yes, to clarify - I changed the code to be
  fallocate(fd, 0, 0, 4);
and then
  posix_fallocate(fd, 0, 4);

(which is the same I guess)

and tried it on different filesystems, and on btrfs I ended up with
entire block (of zeroes).
If it's fixed in latest tree it's fine, I guess that fix isn't in
fedora's 2.6.32.3
-- 
This message represents the official view of the voices in my head

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: btrfs fallocate woes
  2010-01-14 20:33     ` Paul Komkoff
@ 2010-01-19 15:18       ` Paul Komkoff
  2010-01-20  7:28         ` Aneesh Kumar K. V
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Komkoff @ 2010-01-19 15:18 UTC (permalink / raw)
  To: Chris Mason, Roland Dreier, Paul Komkoff, linux-btrfs

On Thu, Jan 14, 2010 at 8:33 PM, Paul Komkoff <i@stingr.net> wrote:
> If it's fixed in latest tree it's fine, I guess that fix isn't in
> fedora's 2.6.32.3

Sorry for popping up again, but did anyone fix this/verified there's
no problem in recent kernels? For some reasons I cannot run latest git
so I'm stuck with fedora kernels, and every one I have around me (with
btrfs) has this problem.

Thanks.
-- 
This message represents the official view of the voices in my head

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: btrfs fallocate woes
  2010-01-19 15:18       ` Paul Komkoff
@ 2010-01-20  7:28         ` Aneesh Kumar K. V
  2010-01-20 15:13           ` Paul Komkoff
  0 siblings, 1 reply; 10+ messages in thread
From: Aneesh Kumar K. V @ 2010-01-20  7:28 UTC (permalink / raw)
  To: Paul Komkoff, Chris Mason, Roland Dreier, Paul Komkoff,
	linux-btrfs

On Tue, 19 Jan 2010 15:18:26 +0000, Paul Komkoff <i@stingr.net> wrote:
> On Thu, Jan 14, 2010 at 8:33 PM, Paul Komkoff <i@stingr.net> wrote:
> > If it's fixed in latest tree it's fine, I guess that fix isn't in
> > fedora's 2.6.32.3
> 
> Sorry for popping up again, but did anyone fix this/verified there's
> no problem in recent kernels? For some reasons I cannot run latest git
> so I'm stuck with fedora kernels, and every one I have around me (with
> btrfs) has this problem.
> 

the below change fixes this for me on btrfs

commit f2bc9dd07e3424c4ec5f3949961fe053d47bc825
Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Date:   Wed Jan 20 12:57:53 2010 +0530

    btrfs: Use correct values when updating inode i_size on fallocate
    
    Even though we allocate more, we should be updating inode i_size
    as per the arguments passed
    
    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 5440bab..db406a4 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5789,7 +5789,7 @@ out_fail:
 }
 
 static int prealloc_file_range(struct inode *inode, u64 start, u64 end,
-			       u64 alloc_hint, int mode)
+			u64 alloc_hint, int mode, loff_t actual_len)
 {
 	struct btrfs_trans_handle *trans;
 	struct btrfs_root *root = BTRFS_I(inode)->root;
@@ -5798,6 +5798,7 @@ static int prealloc_file_range(struct inode *inode, u64 start, u64 end,
 	u64 cur_offset = start;
 	u64 num_bytes = end - start;
 	int ret = 0;
+	u64 i_size;
 
 	while (num_bytes > 0) {
 		alloc_size = min(num_bytes, root->fs_info->max_extent);
@@ -5836,8 +5837,12 @@ static int prealloc_file_range(struct inode *inode, u64 start, u64 end,
 		BTRFS_I(inode)->flags |= BTRFS_INODE_PREALLOC;
 		if (!(mode & FALLOC_FL_KEEP_SIZE) &&
 		    cur_offset > inode->i_size) {
-			i_size_write(inode, cur_offset);
-			btrfs_ordered_update_i_size(inode, cur_offset, NULL);
+			if (cur_offset > actual_len)
+				i_size  = actual_len;
+			else
+				i_size = cur_offset;
+			i_size_write(inode, i_size);
+			btrfs_ordered_update_i_size(inode, i_size, NULL);
 		}
 
 		ret = btrfs_update_inode(trans, root, inode);
@@ -5930,7 +5935,7 @@ static long btrfs_fallocate(struct inode *inode, int mode,
 		     !test_bit(EXTENT_FLAG_PREALLOC, &em->flags))) {
 			ret = prealloc_file_range(inode,
 						  cur_offset, last_byte,
-						  alloc_hint, mode);
+						alloc_hint, mode, offset+len);
 			if (ret < 0) {
 				free_extent_map(em);
 				break;

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: btrfs fallocate woes
  2010-01-20  7:28         ` Aneesh Kumar K. V
@ 2010-01-20 15:13           ` Paul Komkoff
  0 siblings, 0 replies; 10+ messages in thread
From: Paul Komkoff @ 2010-01-20 15:13 UTC (permalink / raw)
  To: Aneesh Kumar K. V; +Cc: Chris Mason, Roland Dreier, linux-btrfs

On Wed, Jan 20, 2010 at 7:28 AM, Aneesh Kumar K. V
<aneesh.kumar@linux.vnet.ibm.com> wrote:
> the below change fixes this for me on btrfs

Thanks a million, now I guess we're waiting for Chris to pull it.
Will it qualify for stable update?
-- 
This message represents the official view of the voices in my head

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2010-01-20 15:13 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-14 11:28 btrfs fallocate woes Paul Komkoff
2010-01-14 17:27 ` Roland Dreier
2010-01-14 18:24   ` Paul Komkoff
2010-01-14 18:25     ` Paul Komkoff
2010-01-14 18:26   ` Aneesh Kumar K. V
2010-01-14 19:20   ` Chris Mason
2010-01-14 20:33     ` Paul Komkoff
2010-01-19 15:18       ` Paul Komkoff
2010-01-20  7:28         ` Aneesh Kumar K. V
2010-01-20 15:13           ` Paul Komkoff

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox