* cp --reflink and NOCOW files
@ 2022-05-24 19:02 Goffredo Baroncelli
2022-05-25 2:19 ` Matthew Warren
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Goffredo Baroncelli @ 2022-05-24 19:02 UTC
To: linux-btrfs
Hi All,
I recently discovered that BTRFS does not allow reflinking a file
when the source is marked as NOCOW:
$ lsattr
---------------C------ ./file-very-big-nocow
$ cp --reflink file-very-big-nocow file2
cp: failed to clone 'file2' from 'file-very-big-nocow': Invalid argument
$ strace cp --reflink file-very-big-nocow file2 2>&1 | egrep ioctl
ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3) = -1 EINVAL (Invalid argument)
My first thought was that it would be sufficient to remove the "nocow" flag.
But I was unable to do that.
$ chattr -C file-very-big-nocow
$ strace cp --reflink file-very-big-nocow file2 2>&1 | egrep ioctl
ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3) = -1 EINVAL (Invalid argument)
(I tried "chattr +C ..." too)
OK, now my question is: how can we remove the NOCOW flag from a file?
My use case is moving files between subvolumes, some of which are marked as NOCOW.
The files are videos, so I want to avoid copying the data.
BR
--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
* Re: cp --reflink and NOCOW files
From: Matthew Warren @ 2022-05-25 2:19 UTC
To: kreijack; +Cc: linux-btrfs
Goffredo,
You can only reflink files which are COW. If you want to reflink files
which are NOCOW, you have to copy them to a COW file (e.g. cp
file-very-big-nocow file-very-big-cow) and then you can reflink it.
It's recommended to keep everything COW unless it has many random
writes, as with databases and virtual machines.
Matthew Warren
* Re: cp --reflink and NOCOW files
From: Forza @ 2022-05-25 4:34 UTC
To: Matthew Warren, kreijack; +Cc: linux-btrfs
Hi,
On 2022-05-25 04:19, Matthew Warren wrote:
> Goffredo,
>
> You can only reflink files which are COW. If you want to reflink files
> which are NOCOW, you have to copy them to a COW file (eg. cp
> file-very-big-nocow file-very-big-cow) and then you can reflink it.
> It's recommended to keep everything COW unless it has many random
> writes like databases and virtual machines.
>
> Matthew Warren
The problem is with coreutils and 'cp', not Btrfs. It is possible to
reflink copy nodatacow files. The requirement is that the target is also
nodatacow. Deduplication of nodatacow files has the same limitation.
Example:
# touch foo
# chattr +C foo
# truncate -s1G foo
# ll
total 16
drwxr-xr-x 1 root root 6 May 25 06:24 ./
drwxr-xr-x 1 root root 148 May 16 21:53 ../
-rw-r--r-- 1 root root 1073741824 May 25 06:24 foo
# lsattr
---------------C------ ./foo
# cp --reflink=auto foo bar
'foo' -> 'bar'
# ll
total 1048592
drwxr-xr-x 1 root root 12 May 25 06:24 ./
drwxr-xr-x 1 root root 148 May 16 21:53 ../
-rw-r--r-- 1 root root 1073741824 May 25 06:24 bar
-rw-r--r-- 1 root root 1073741824 May 25 06:24 foo
# lsattr
---------------C------ ./foo
---------------------- ./bar
We can see that 'bar' is a normal (COW) file, and the 'total' grew by
1 GiB: the data was copied, not reflinked.
The solution is instead to create the target file with +C first.
# rm bar
# touch bar
# chattr +C bar
# cp --reflink=auto foo bar
'foo' -> 'bar'
# ll
total 16
drwxr-xr-x 1 root root 12 May 25 06:25 ./
drwxr-xr-x 1 root root 148 May 16 21:53 ../
-rw-r--r-- 1 root root 1073741824 May 25 06:25 bar
-rw-r--r-- 1 root root 1073741824 May 25 06:24 foo
# lsattr
---------------C------ ./foo
---------------C------ ./bar
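The two-step workaround above (create the target, mark it +C while it is
still empty, then clone into it) could be wrapped in a small shell helper.
This is only a sketch: the name reflink_nocow is made up for illustration,
and it assumes GNU cp and the chattr tool from e2fsprogs are available.

```shell
# reflink_nocow SRC DST: hypothetical helper, not part of coreutils or btrfs-progs.
# Creates DST empty, marks it NOCOW while it is still empty, then clones SRC into it.
reflink_nocow() {
    src=$1
    dst=$2
    # Refuse to touch an existing target: +C only takes effect on an empty file.
    if [ -e "$dst" ]; then
        echo "reflink_nocow: $dst already exists" >&2
        return 1
    fi
    touch "$dst" || return 1
    # If chattr or the clone fails, remove the empty target again.
    if ! chattr +C "$dst" || ! cp --reflink=always "$src" "$dst"; then
        rm -f "$dst"
        return 1
    fi
}
```

Called as "reflink_nocow foo bar", it is expected to behave like the manual
sequence shown above. On a filesystem without reflink support the clone
step fails and the empty target is removed.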
* Re: cp --reflink and NOCOW files
From: Andrei Borzenkov @ 2022-05-25 5:11 UTC
To: kreijack, linux-btrfs
On 24.05.2022 22:02, Goffredo Baroncelli wrote:
> Hi All,
>
> recently I discovered that BTRFS doesn't allow to reflink a file
> when the source is marked as NOCOW
>
> $ lsattr
> ---------------C------ ./file-very-big-nocow
> $ cp --reflink file-very-big-nocow file2
> cp: failed to clone 'file2' from 'file-very-big-nocow': Invalid argument
> $ strace cp --reflink file-very-big-nocow file2 2>&1 | egrep ioctl
> ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3) = -1 EINVAL (Invalid argument)
>
> My first thought was that it would be sufficient to remove the "nocow" flag.
> But I was unable to do that.
>
> $ chattr -C file-very-big-nocow
>
> $ strace cp --reflink file-very-big-nocow file2 2>&1 | egrep ioctl
> ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3) = -1 EINVAL (Invalid argument)
>
> (I tried "chattr +C ..." too)
>
E-h-h ... you tried to set this attribute expecting it to be unset as
a result?
> Ok, now my question is: how we can remove the NOCOW flag from a file ?
>
btrfs silently ignores changes to NOCOW for non-empty files.
NOCOW also implies no checksumming. Enabling COW would require rebuilding
all checksums for existing data. I am not saying it is impossible, but it
is certainly much more involved than a single bit flip.
And disabling COW would (potentially) invalidate all existing references
for which checksums are already stored, and those are likely read-only
and cannot be changed at all. So it is only possible if it is verified
that no other references exist.
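Because the change is ignored silently, the exit status of chattr says
nothing about whether the flag really changed. A rough way to check is to
re-read the attributes afterwards. The sketch below assumes the lsattr
output format shown earlier in this thread (flags field, a space, then the
path); all three function names are hypothetical.

```shell
# flags_have_nocow FLAGS: succeeds if an lsattr flags field contains 'C'.
flags_have_nocow() {
    case "$1" in
        *C*) return 0 ;;
        *)   return 1 ;;
    esac
}

# file_is_nocow FILE: hypothetical helper; inspects the first field of lsattr output.
file_is_nocow() {
    flags=$(lsattr -d "$1" 2>/dev/null | cut -d' ' -f1)
    flags_have_nocow "$flags"
}

# try_clear_nocow FILE: runs chattr -C, then checks whether the flag really went away.
try_clear_nocow() {
    chattr -C "$1" || return 2
    if file_is_nocow "$1"; then
        echo "$1: NOCOW still set, the change was ignored"
        return 1
    fi
    echo "$1: NOCOW cleared"
}
```

For a non-empty NOCOW file, "try_clear_nocow file-very-big-nocow" would be
expected to report that the flag is still set.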
> My use case is to move files between subvolumes some of which are marked as NOCOW.
> The files are videos, so I want to avoid to copy the data.
>
>
> BR
>
* Re: cp --reflink and NOCOW files
From: Goffredo Baroncelli @ 2022-05-25 16:15 UTC
To: Forza, Matthew Warren; +Cc: linux-btrfs
On 25/05/2022 06.34, Forza wrote:
> > You can only reflink files which are COW. If you want to reflink files
> > which are NOCOW, you have to copy them to a COW file (eg. cp
> > file-very-big-nocow file-very-big-cow) and then you can reflink it.
> > It's recommended to keep everything COW unless it has many random
> > writes like databases and virtual machines.
> >
> > Matthew Warren
>
> The problem is with coreutils and 'cp', not Btrfs. It is possible to reflink copy nodatacow files. The requirement is that the target is also nodatacow. Deduplication of nodatacow files has the same limitation as well.
>
Correct. More in another thread.
--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
* Re: cp --reflink and NOCOW files
From: Forza @ 2022-05-27 14:37 UTC
To: kreijack, linux-btrfs
Hi Goffredo,
On 2022-05-24 21:02, Goffredo Baroncelli wrote:
> Hi All,
>
> recently I discovered that BTRFS doesn't allow to reflink a file
> when the source is marked as NOCOW
You probably saw in my earlier email to the mailing list that this
is a problem with the 'cp' tool, because it tries to set the +C after it
has appended data to the target file.
>
> $ lsattr
> ---------------C------ ./file-very-big-nocow
> $ cp --reflink file-very-big-nocow file2
> cp: failed to clone 'file2' from 'file-very-big-nocow': Invalid argument
> $ strace cp --reflink file-very-big-nocow file2 2>&1 | egrep ioctl
> ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3) = -1 EINVAL (Invalid argument)
>
> My first thought was that it would be sufficient to remove the "nocow"
> flag.
> But I was unable to do that.
>
> $ chattr -C file-very-big-nocow
>
> $ strace cp --reflink file-very-big-nocow file2 2>&1 | egrep ioctl
> ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3) = -1 EINVAL (Invalid argument)
>
> (I tried "chattr +C ..." too)
>
> Ok, now my question is: how we can remove the NOCOW flag from a file ?
>
The issue here is that nocow also means nodatasum (no checksums). The
checksums are needed to verify the integrity of the data blocks.
So in order to turn off nocow, checksums would have to be created for
every block of data. Btrfs does not do this, because there is no way for
Btrfs to actually know if the data is correct. The result is that Btrfs
refuses to turn off the nocow fsattr.
The solution is to make a normal copy as you just experienced.
> My use case is to move files between subvolumes some of which are marked
> as NOCOW.
> The files are videos, so I want to avoid to copy the data.
>
You can still reflink copy the files if you do something like:
# touch target/foo
# chattr +C target/foo
# cp --reflink=always source/foo target/foo
Target and source must be reachable from the same mountpoint. The easiest
way to manage such a situation is to mount the top level (subvolid=5) at
/mnt/btrfs/ and handle the files from there.
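Whether two paths satisfy the "same mountpoint" requirement can be checked
before attempting the clone. The following is a sketch only: it assumes
GNU stat, whose %m format prints the mount point a file lives on, and the
function name same_mountpoint is made up for illustration.

```shell
# same_mountpoint A B: succeeds if both paths live under the same mount point.
# Relies on GNU stat's %m format, which is an assumption about the environment.
# Typical setup beforehand (as root, /dev/sdX is a placeholder device):
#   mount -o subvolid=5 /dev/sdX /mnt/btrfs
same_mountpoint() {
    mp_a=$(stat -c %m "$1") || return 2
    mp_b=$(stat -c %m "$2") || return 2
    [ "$mp_a" = "$mp_b" ]
}
```

Since the target file may not exist yet, it can be called on the target
directory instead, for example "same_mountpoint source/foo target/".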
It is unusual to keep video files as nocow. In my own experience, nocow
should always be a last resort, used only to solve an urgent performance
issue. Certain workloads can cause lots of extents and fragmentation due
to COW, but video editing or similar is normally not such a use case.
If you download large files to a spinning HDD, and they end up very
fragmented, I suggest you use 'btrfs filesystem defrag' before you
snapshot or reflink copy the files.
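The defragment step could be made conditional on how fragmented a file
really is. The sketch below assumes the filefrag tool from e2fsprogs and
its usual "NAME: N extents found" summary line. The helper names and the
extent threshold are arbitrary choices for illustration. Defragmenting
rewrites data and can unshare extents, which is why it belongs before the
snapshot or reflink copy and not after it.

```shell
# parse_extent_count: reads filefrag output on stdin, prints the extent count.
parse_extent_count() {
    sed -n 's/.*: \([0-9][0-9]*\) extents\{0,1\} found.*/\1/p'
}

# defrag_if_fragmented FILE MAX: hypothetical helper; defragments FILE only when
# filefrag reports more than MAX extents.
defrag_if_fragmented() {
    count=$(filefrag "$1" | parse_extent_count)
    [ -n "$count" ] || return 2
    if [ "$count" -gt "$2" ]; then
        btrfs filesystem defragment "$1"
    fi
}
```

For example, "defrag_if_fragmented video.mkv 100" would leave a file with
at most 100 extents alone (video.mkv and 100 are placeholders).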
There are a lot of guides on the internet that suggest turning on nocow,
but they rarely discuss the (potentially harmful) downsides of this
choice.
You should know that because nocow also means nodatasum, Btrfs cannot
self-heal or even detect data corruption in nocow files, even when using
tools like 'scrub' or 'check'. In addition, there is no guaranteed
redundancy with RAID/DUP profiles for those files because Btrfs relies
on checksums to determine which device has a correct copy.
I advise against nocow in most situations because I have myself had
corruption on precious photographs and videos due to bitrot. I have
written a little about this at https://wiki.tnonline.net/w/Btrfs/Scrub
Good luck!
>
> BR
>