Odd fallocate behavior on BTRFS.

linux-btrfs.vger.kernel.org archive mirror

* Odd fallocate behavior on BTRFS.
@ 2017-08-01 17:34 Austin S. Hemmelgarn
  2017-08-01 18:15 ` Holger Hoffstätte
  0 siblings, 1 reply; 4+ messages in thread
From: Austin S. Hemmelgarn @ 2017-08-01 17:34 UTC
  To: Btrfs BTRFS; +Cc: linux-fsdevel

A recent thread on the BTRFS mailing list [1] brought up some odd 
behavior in BTRFS that I've long suspected but not had prior reason to 
test.  I've put the fsdevel mailing list on CC since I'm curious to hear 
what people there think about this.

Apparently, if you call fallocate() on a file with an offset of 0 and a 
length longer than the length of the file itself, BTRFS will allocate 
that exact amount of space, instead of just filling in holes in the file 
and allocating space to extend it.  If there isn't enough space on the 
filesystem for this, then it will fail, even though it would succeed on 
ext4, XFS, and F2FS.  The following script demonstrates this:

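     #!/bin/bash
     # Create a 4 GiB sparse image file and format it as BTRFS.
     touch ./test-fs
     truncate --size=4G ./test-fs
     mkfs.btrfs ./test-fs
     mkdir ./test
     mount -t auto ./test-fs ./test
     # Write 2 GiB (65536 * 32768 bytes) of zeros into a single file.
     dd if=/dev/zero of=./test/test bs=65536 count=32768
     # Request 2 bytes more than the file's current size, from offset 0.
     fallocate -l 2147483650 ./test/test && echo "Success!"
     umount ./test
     rmdir ./test
     rm -f ./test-fs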

This will spit out a -ENOSPC error from the fallocate call, but if you 
change the mkfs call to ext4, XFS, or F2FS, it will instead succeed 
without error.  If the fallocate call is changed to `fallocate -o 
2147483648 -l 2 ./test/test`, it will succeed on all filesystems.

I have not yet done any testing to determine if this also applies to 
offsets other than 0, but I suspect it does (it would be kind of odd if 
it didn't).

My thought on this is that the behavior that BTRFS exhibits is incorrect 
in this case, at a minimum because it does not follow the apparent 
de-facto standard, and because it keeps some things from working (the OP 
in the thread that resulted in me finding this was having issues trying 
to extend a SnapRAID parity file that was already larger than half the 
size of the BTRFS volume it was stored on).

I'm curious to hear anybody's thoughts on this, namely:
1. Is this behavior that should be considered implementation defined?
2. If not, is my assessment that BTRFS is behaving incorrectly in this 
case accurate?

[1] https://marc.info/?l=linux-btrfs&m=150158963921123&w=2

* Re: Odd fallocate behavior on BTRFS.
  2017-08-01 17:34 Odd fallocate behavior on BTRFS Austin S. Hemmelgarn
@ 2017-08-01 18:15 ` Holger Hoffstätte
  2017-08-01 19:07   ` Holger Hoffstätte
  0 siblings, 1 reply; 4+ messages in thread
From: Holger Hoffstätte @ 2017-08-01 18:15 UTC
  To: Austin S. Hemmelgarn, linux-btrfs; +Cc: linux-fsdevel

On 08/01/17 19:34, Austin S. Hemmelgarn wrote:
[..]
> Apparently, if you call fallocate() on a file with an offset of 0 and
> a length longer than the length of the file itself, BTRFS will
> allocate that exact amount of space, instead of just filling in holes
> in the file and allocating space to extend it.  If there isn't enough
> space on the filesystem for this, then it will fail, even though it
> would succeed on ext4, XFS, and F2FS.
[..]
> I'm curious to hear anybody's thoughts on this, namely: 1. Is this
> behavior that should be considered implementation defined? 2. If not,
> is my assessment that BTRFS is behaving incorrectly in this case
> accurate?

IMHO no and yes, respectively. Both fallocate(2) and posix_fallocate(3)
make it very clear that the expected default behaviour is to extend.
I don't think this can be interpreted in any other way than incorrect
behaviour on behalf of btrfs.
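
A minimal sketch of that default, assuming util-linux fallocate(1), GNU
stat, and a scratch file on a filesystem that supports fallocate. The
function name demo_extend_vs_keep_size is made up for illustration; it
prints the file size after a default call and after a --keep-size call.

```shell
#!/bin/bash
# Compare default fallocate(1) with --keep-size on a scratch file.
# Assumption: the filesystem mktemp uses supports fallocate; if it does
# not, the function prints "unsupported" instead of two sizes.
demo_extend_vs_keep_size() {
    local f extended kept
    f=$(mktemp) || return 1
    # Start with a 1 KiB file.
    head -c 1024 /dev/zero > "$f"
    # Default mode: the file size grows to offset + length.
    if ! fallocate -l 4096 "$f" 2>/dev/null; then
        rm -f "$f"
        echo "unsupported"
        return 0
    fi
    extended=$(stat -c %s "$f")
    # --keep-size: space may be reserved past EOF, but the size stays put.
    fallocate --keep-size -l 8192 "$f" 2>/dev/null
    kept=$(stat -c %s "$f")
    rm -f "$f"
    echo "$extended $kept"
}

demo_extend_vs_keep_size
```

Where fallocate is supported this should report 4096 for both sizes: the
first call extends the 1 KiB file, the second leaves the size alone.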

Your script reproduces for me, so that's a start.

-h

* Re: Odd fallocate behavior on BTRFS.
  2017-08-01 18:15 ` Holger Hoffstätte
@ 2017-08-01 19:07   ` Holger Hoffstätte
  2017-08-01 19:14     ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 4+ messages in thread
From: Holger Hoffstätte @ 2017-08-01 19:07 UTC
  To: Austin S. Hemmelgarn, linux-btrfs; +Cc: linux-fsdevel

On 08/01/17 20:15, Holger Hoffstätte wrote:
> On 08/01/17 19:34, Austin S. Hemmelgarn wrote:
> [..]
>> Apparently, if you call fallocate() on a file with an offset of 0 and
>> a length longer than the length of the file itself, BTRFS will
>> allocate that exact amount of space, instead of just filling in holes
>> in the file and allocating space to extend it.  If there isn't enough
>> space on the filesystem for this, then it will fail, even though it
>> would succeed on ext4, XFS, and F2FS.
> [..]
>> I'm curious to hear anybody's thoughts on this, namely: 1. Is this
>> behavior that should be considered implementation defined? 2. If not,
>> is my assessment that BTRFS is behaving incorrectly in this case
>> accurate?
> 
> IMHO no and yes, respectively. Both fallocate(2) and posix_fallocate(3)
> make it very clear that the expected default behaviour is to extend.
> I don't think this can be interpreted in any other way than incorrect
> behaviour on behalf of btrfs.
> 
> Your script reproduces for me, so that's a start.

Your reproducer should never ENOSPC because it requires exactly 0 new bytes
to be allocated, yet it also fails with --keep-size.

From a quick look it seems that btrfs_fallocate() unconditionally calls
btrfs_alloc_data_chunk_ondemand(inode, alloc_end - alloc_start) to lazily
allocate the necessary extent(s), which goes ENOSPC because that size
is again the full size of the requested range, not the difference between
the existing file size and the new range length. 
But I might be misreading things..
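
To put numbers on the two sizes being contrasted, here is a rough model
in shell arithmetic. It is not kernel code: the 4 KiB block size, the
rounding of the range to block boundaries, and the function names are
all assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical model of the two reservation sizes; not kernel code.
# BLOCK is an assumed 4 KiB block size.
BLOCK=4096

round_down() { echo $(( $1 / $2 * $2 )); }
round_up()   { echo $(( ($1 + $2 - 1) / $2 * $2 )); }

# Reserve the whole requested range: alloc_end - alloc_start.
full_range_reservation() {
    local offset=$1 len=$2 alloc_start alloc_end
    alloc_start=$(round_down "$offset" "$BLOCK")
    alloc_end=$(round_up $(( offset + len )) "$BLOCK")
    echo $(( alloc_end - alloc_start ))
}

# Reserve only the part of the range that lies past the current EOF.
past_eof_reservation() {
    local file_size=$1 offset=$2 len=$3
    local start=$offset alloc_start alloc_end
    if (( file_size > start )); then start=$file_size; fi
    alloc_start=$(round_down "$start" "$BLOCK")
    alloc_end=$(round_up $(( offset + len )) "$BLOCK")
    # A range that ends before EOF needs nothing under this model.
    if (( alloc_end < alloc_start )); then alloc_end=$alloc_start; fi
    echo $(( alloc_end - alloc_start ))
}

# Numbers from the reproducer: 2 GiB file, request of 2147483650 bytes.
full_range_reservation 0 2147483650
past_eof_reservation 2147483648 0 2147483650
```

Under these assumptions the first figure is 2147487744 bytes (just over
2 GiB) and the second is 4096 bytes, which is the gap described above.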

-h

* Re: Odd fallocate behavior on BTRFS.
  2017-08-01 19:07   ` Holger Hoffstätte
@ 2017-08-01 19:14     ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 4+ messages in thread
From: Austin S. Hemmelgarn @ 2017-08-01 19:14 UTC
  To: Holger Hoffstätte, linux-btrfs; +Cc: linux-fsdevel

On 2017-08-01 15:07, Holger Hoffstätte wrote:
> On 08/01/17 20:15, Holger Hoffstätte wrote:
>> On 08/01/17 19:34, Austin S. Hemmelgarn wrote:
>> [..]
>>> Apparently, if you call fallocate() on a file with an offset of 0 and
>>> a length longer than the length of the file itself, BTRFS will
>>> allocate that exact amount of space, instead of just filling in holes
>>> in the file and allocating space to extend it.  If there isn't enough
>>> space on the filesystem for this, then it will fail, even though it
>>> would succeed on ext4, XFS, and F2FS.
>> [..]
>>> I'm curious to hear anybody's thoughts on this, namely: 1. Is this
>>> behavior that should be considered implementation defined? 2. If not,
>>> is my assessment that BTRFS is behaving incorrectly in this case
>>> accurate?
>>
>> IMHO no and yes, respectively. Both fallocate(2) and posix_fallocate(3)
>> make it very clear that the expected default behaviour is to extend.
>> I don't think this can be interpreted in any other way than incorrect
>> behaviour on behalf of btrfs.
>>
>> Your script reproduces for me, so that's a start.
> 
> Your reproducer should never ENOSPC because it requires exactly 0 new bytes
> to be allocated, yet it also fails with --keep-size.
Unless I'm doing the math wrong, it should require exactly 2 new bytes. 
65536 (the block size for dd) times 32768 (the block count for dd) is 
2147483648 (2^31), while the fallocate call requests a total size of 
2147483650 bytes.  It may not need to allocate a new block, but it 
should definitely be extending the file.
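
The arithmetic can be checked directly in the shell. A small sketch,
using only the numbers from the reproducer (the variable names are
arbitrary):

```shell
#!/bin/bash
# Size written by dd: block size times block count.
written=$(( 65536 * 32768 ))
# Total size requested by `fallocate -l`.
requested=2147483650

echo "written:   $written"
echo "2^31:      $(( 1 << 31 ))"
echo "extension: $(( requested - written ))"
```

This reports 2147483648 for both the dd size and 2^31, and an extension
of 2 bytes.
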
> From a quick look it seems that btrfs_fallocate() unconditionally calls
> btrfs_alloc_data_chunk_ondemand(inode, alloc_end - alloc_start) to lazily
> allocate the necessary extent(s), which goes ENOSPC because that size
> is again the full size of the requested range, not the difference between
> the existing file size and the new range length.
> But I might be misreading things..
As far as I can tell, that is correct.  However, we can't just extend 
the range, because the existing file might have sparse regions, and 
those need to have allocations forced too (and based on the code, this 
will also cause issues any time the fallocate range includes already 
allocated extents, so I don't think it can be special cased either).
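
A small model of why the file size alone is not enough: the space a
request needs depends on which parts of the range are already backed by
extents. The function name bytes_to_allocate and the "start:end" extent
lists are made up for illustration, and the extents are assumed to be
sorted and non-overlapping.

```shell
#!/bin/bash
# Hypothetical model: count the bytes inside [start, end) that are not
# yet backed by an extent. Extents are passed as "first:end_exclusive".
bytes_to_allocate() {
    local start=$1 end=$2
    shift 2
    local missing=$(( end - start ))
    local extent lo hi
    for extent in "$@"; do
        lo=${extent%%:*}
        hi=${extent##*:}
        # Clamp the extent to the requested range.
        if (( lo < start )); then lo=$start; fi
        if (( hi > end )); then hi=$end; fi
        # Subtract the part of the extent that overlaps the range.
        if (( hi > lo )); then missing=$(( missing - (hi - lo) )); fi
    done
    echo "$missing"
}

# Fully written 2 GiB file: only the 2 bytes past EOF are missing.
bytes_to_allocate 0 2147483650 0:2147483648
# Same file size, but with a 1 GiB hole in the middle.
bytes_to_allocate 0 2147483650 0:536870912 1610612736:2147483648
```

The first call reports 2 bytes and the second 1073741826 bytes, even
though both files have the same size.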
