* Question about fallocate
From: Gionatan Danti @ 2014-07-01 10:16 UTC (permalink / raw)
To: linux-btrfs; +Cc: g.danti
Hi all,
I'm running some tests on a small BTRFS volume on CentOS 6.5 x86_64 (I
know that CentOS 6 uses an old kernel and btrfs version, and I plan to
replicate the same tests on Fedora 20).
From my understanding, disabling CoW and fallocating a file should give a
non-fragmented file. The following commands show that:
[root@blackhole test]# fallocate test.img -l 1G
[root@blackhole test]# sync
[root@blackhole test]# filefrag -v test.img
Filesystem type is: 9123683e
File size of test.img is 1073741824 (262144 blocks, blocksize 4096)
 ext  logical  physical  expected  length  flags
   0        0    269312            262144  eof
test.img: 1 extent found
As you can see, I have a single, contiguous run of blocks.
However, writing some 4k blocks into the file leads to fragmentation:
[root@blackhole test]# for id in `seq 1 32`; do dd if=/dev/zero \
    of=test.img bs=4k count=1 seek=$id conv=notrunc,nocreat \
    oflag=direct,sync; done
...
[root@blackhole test]# filefrag -v test.img
Filesystem type is: 9123683e
File size of test.img is 1073741824 (262144 blocks, blocksize 4096)
 ext  logical  physical  expected  length  flags
   0        0    269312                 1
   1        1    269313                 1
   2        2    531456    269314      31
   3       33    269345    531487  262111  eof
test.img: 3 extents found
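To compare listings like the ones above without reading them by eye, the physical start of each extent can be checked against the end of the previous one. What follows is a rough sketch, assuming the older column layout that filefrag -v prints here (ext, logical, physical, optional expected, length, optional flags). The helper name count_runs is made up for illustration, and newer e2fsprogs releases print a different layout that this does not parse.

```shell
# count_runs: read old-style "filefrag -v" output on stdin and print the
# number of physically contiguous runs (1 means the file is unfragmented).
count_runs() {
    awk '
        # Extent rows start with three numeric fields: ext, logical, physical.
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
            phys = $3
            # When the "expected" column is filled in, length moves to field 5.
            len = ($5 ~ /^[0-9]+$/) ? $5 : $4
            # A new run starts whenever this extent does not begin where
            # the previous one ended.
            if (runs == 0 || phys != prev_end) runs++
            prev_end = phys + len
        }
        END { print runs + 0 }
    '
}

# Typical use (not executed here):
#   filefrag -v test.img | count_runs
```

Fed the four extent rows of the fragmented listing, this reports 3 runs, which agrees with the "3 extents found" summary line.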
If I don't use fallocate to reserve space, and instead use a simple dd to
write zeros to the target file, the fragmentation does not occur:
[root@blackhole test]# dd if=/dev/zero of=test.img bs=2M count=512
[root@blackhole test]# sync
[root@blackhole test]# for id in `seq 1 32`; do dd if=/dev/zero \
    of=test.img bs=4k count=1 seek=$id conv=notrunc,nocreat \
    oflag=direct,sync; done
...
[root@blackhole test]# filefrag -v test.img
Filesystem type is: 9123683e
File size of test.img is 1073741824 (262144 blocks, blocksize 4096)
 ext  logical  physical  expected  length  flags
   0        0    269312            262144  eof
test.img: 1 extent found
So, my question is: why does writing to a fallocated file produce
fragmentation, even with CoW disabled?
Regards.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8
* Re: Question about fallocate
From: Duncan @ 2014-07-02 8:04 UTC (permalink / raw)
To: linux-btrfs
Gionatan Danti posted on Tue, 01 Jul 2014 12:16:34 +0200 as excerpted:
> So, my question is: why does writing to a fallocated file produce
> fragmentation, even with CoW disabled?
Good question.
But how did you disable COW? The nodatacow mount option? Setting the
NOCOW attribute on the file or parent-dir (chattr +C)? Something else?
Because there are caveats to both the mount option and the file attribute
methods.
For the nodatacow mount option, the caveat is that, to the best of my
knowledge, it is one of several btrfs-specific options that are
still whole-filesystem-based. So if you are mounting multiple subvolumes
from the same filesystem, nodatacow will toggle together for all of them
-- you can't use it for just one.
The possible explanation there is thus that if multiple subvolumes are
mounted and not all of them have the same nodatacow option applied, you
might not have nodatacow set on that subvolume after all.
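One way to rule this out is to look at the options the kernel actually reports for the mount point. A minimal sketch, assuming /proc/mounts-style input (device, mount point, filesystem type, options) and a mount point path without spaces; the helper name has_nodatacow and the path /mnt/test are made up for illustration.

```shell
# has_nodatacow MOUNTPOINT: read /proc/mounts-style lines on stdin and
# print "yes" if MOUNTPOINT is a btrfs mount with nodatacow, else "no".
has_nodatacow() {
    awk -v mnt="$1" '
        $2 == mnt && $3 == "btrfs" {
            # Options are a comma-separated list in the fourth field.
            n = split($4, opts, ",")
            found = "no"
            for (i = 1; i <= n; i++)
                if (opts[i] == "nodatacow") found = "yes"
        }
        # If several lines match, the last one (the topmost mount) wins.
        END { print (found == "" ? "no" : found) }
    '
}

# Typical use (not executed here):
#   has_nodatacow /mnt/test < /proc/mounts
```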
For the nocow file attribute case, there are two possibilities. First,
it's critical how the file attribute was set. If it was set on a file
with existing data, nocow behavior isn't guaranteed. The nocow attribute
must be set on the file while it is zero-sized, before it has any data in
it. The easiest way to do that is to set the attribute on the directory,
such that newly created files (and subdirs) in it inherit the attribute.
Nocow doesn't affect the directory itself and thus only determines
inheritance.
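The zero-size requirement can be enforced with a small guard around chattr. A minimal sketch, assuming a POSIX shell and the chattr utility from e2fsprogs; set_nocow is a hypothetical helper name, not an existing tool.

```shell
# set_nocow FILE: set the NOCOW attribute only while FILE is still empty.
set_nocow() {
    file=$1
    # [ -s ] is true when the file exists and has a size greater than zero.
    if [ -s "$file" ]; then
        echo "refusing: $file already has data, NOCOW is not guaranteed" >&2
        return 1
    fi
    # Create the file if needed, then flag it before any data is written.
    touch "$file" && chattr +C "$file"
}

# Typical use (not executed here):
#   set_nocow test.img && fallocate -l 1G test.img
```

For the directory method, running chattr +C on the directory before creating any files in it gives newly created files the attribute through inheritance.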
Second, there's the snapshotting exception. Because a btrfs snapshot
locks the existing file data in place with the snapshot, the first
modification to a fileblock after a snapshot will force a COW for that
block, even on an otherwise nocow file. The nocow attribute remains in
effect, however, and further writes to the same block will modify it
in-place... until the next snapshot of course. This becomes a real issue
for people running automated snapshotting scripts, particularly if
they've set it to something extreme like once a minute, since in that
case the existing file content is locked in place once a minute, meaning
the nocow attribute is effectively worthless for them. The
recommendation in that case is to put the nocow files in a dedicated
subvolume, since snapshots stop at subvolume boundaries, and to use
conventional backup instead of snapshotting for that subvolume.
So the possible explanation for the file attribute case is either that
the file attribute wasn't applied correctly (the file already had content
when the attribute was set), or that a snapshot intervened between file
creation and modification, locking the existing file data in place and
thus forcing a cow for the first post-snapshot modification per block.
Other than that, I don't know, but it'd be interesting to see if the
behavior replicates on a current kernel.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: Question about fallocate
From: Gionatan Danti @ 2014-07-02 10:36 UTC (permalink / raw)
To: linux-btrfs; +Cc: Duncan, g.danti
Hi,
> But how did you disable COW? The nodatacow mount option? Setting the
> NOCOW attribute on the file or parent-dir (chattr +C)? Something else?
>
> Because there are caveats to both the mount option and the file
> attribute methods.
>
I used the nodatacow mount option. When doing the test, I had no other
subvolumes or snapshots.
> Other than that, I don't know, but it'd be interesting to see if the
> behavior replicates on a current kernel.
>
Yes, I'll try to replicate it on a 3.14.x based Fedora 20.
Thanks.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8
* Re: Question about fallocate
From: Maurizio Lombardi @ 2014-07-02 10:52 UTC (permalink / raw)
To: Gionatan Danti, linux-btrfs; +Cc: Duncan
On 07/02/2014 12:36 PM, Gionatan Danti wrote:
> Hi,
>
>> But how did you disable COW? The nodatacow mount option? Setting the
>> NOCOW attribute on the file or parent-dir (chattr +C)? Something else?
>>
>> Because there are caveats to both the mount option and the file
>> attribute methods.
>>
>
> I used the nodatacow mount option. When doing the test, I had no other subvolumes or snapshots.
>
>> Other than that, I don't know, but it'd be interesting to see if the
>> behavior replicates on a current kernel.
>>
>
> Yes, I'll try to replicate it on a 3.14.x based Fedora 20.
I am unable to reproduce the problem on kernel version 3.16.0-rc2
using nodatacow.
$ mkfs.btrfs testdisk
WARNING! - Btrfs v3.14.1 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using
Performing full device TRIM (20.00GiB) ...
Turning ON incompat feature 'extref': increased hardlink limit per file to 65536
fs created label (null) on testdisk
$ sudo mount -o nodatacow testdisk mnt/
$ mount
/mnt/iscsi-disk/test/testdisk on /mnt/iscsi-disk/test/mnt type btrfs (rw,relatime,seclabel,nodatasum,nodatacow,space_cache)
$ cd mnt
$ fallocate test.img -l 1G
$ sync
$ filefrag -v test.img
Filesystem type is: 9123683e
File size of test.img is 1073741824 (262144 blocks of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
0: 0.. 65535: 269312.. 334847: 65536:
1: 65536.. 131071: 334848.. 400383: 65536:
2: 131072.. 196607: 400384.. 465919: 65536:
3: 196608.. 262143: 465920.. 531455: 65536: eof
test.img: 1 extent found
$ for id in `seq 1 32`; do dd if=/dev/zero of=test.img bs=4k count=1 seek=$id conv=notrunc,nocreat oflag=direct,sync; done
$ filefrag -v test.img
Filesystem type is: 9123683e
File size of test.img is 1073741824 (262144 blocks of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
0: 0.. 0: 269312.. 269312: 1:
1: 1.. 32: 269313.. 269344: 32:
2: 33.. 262143: 269345.. 531455: 262111: eof
test.img: 1 extent found
$ uname -a
Linux dhcp-27-189.brq.redhat.com 3.16.0-rc2-mainline #17 SMP Mon Jun 23 12:28:11 CEST 2014 x86_64 x86_64 x86_64 GNU/Linux
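The extent boundaries in the second listing line up with the range touched by the dd loop: with bs=4k and seek running from 1 to 32, the loop writes logical blocks 1 through 32 and leaves block 0 and everything from block 33 onward alone. A small sketch of that arithmetic follows; the function name block_ranges is made up for illustration.

```shell
# block_ranges SIZE_BYTES BLOCK_SIZE FIRST_SEEK LAST_SEEK
# Prints which logical blocks a "dd bs=BLOCK_SIZE count=1 seek=N" loop
# touches, and which blocks before and after that range it leaves alone.
block_ranges() {
    size=$1; bs=$2; first=$3; last=$4
    total=$((size / bs))
    echo "total blocks: $total"
    if [ "$first" -gt 0 ]; then
        echo "untouched head: 0..$((first - 1)) length $first"
    fi
    echo "written: $first..$last length $((last - first + 1))"
    echo "untouched tail: $((last + 1))..$((total - 1)) length $((total - last - 1))"
}

# 1 GiB file, 4 KiB blocks, seek values 1 through 32 as in the test above.
block_ranges 1073741824 4096 1 32
```

The lengths it prints (1, 32 and 262111) are the same as the three rows in the listing, and each row's physical start is exactly where the previous one ends, so the file is still one contiguous run on disk.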
Regards,
Maurizio Lombardi
* Re: Question about fallocate
From: Gionatan Danti @ 2014-07-02 14:04 UTC (permalink / raw)
To: Maurizio Lombardi, linux-btrfs; +Cc: Duncan, g.danti
> I am unable to reproduce the problem on kernel version 3.16.0-rc2
> using nodatacow.
Hi all,
I can confirm that with Fedora 20 x86_64 (kernel 3.14.9-200.fc20.x86_64)
the problem does not occur: using the nodatacow option leads to an
unfragmented file.
Sorry for the noise.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8