Linux Btrfs filesystem development
* cp --reflink and NOCOW files
@ 2022-05-24 19:02 Goffredo Baroncelli
  2022-05-25  2:19 ` Matthew Warren
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Goffredo Baroncelli @ 2022-05-24 19:02 UTC (permalink / raw)
  To: linux-btrfs

Hi All,

Recently I discovered that Btrfs doesn't allow reflinking a file
when the source is marked as NOCOW:

$ lsattr
---------------C------ ./file-very-big-nocow
$ cp --reflink file-very-big-nocow file2
cp: failed to clone 'file2' from 'file-very-big-nocow': Invalid argument
$ strace cp --reflink file-very-big-nocow file2 2>&1 | egrep ioctl
ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3) = -1 EINVAL (Invalid argument)

My first thought was that it would be sufficient to remove the "nocow" flag.
But I was unable to do that.

$ chattr -C file-very-big-nocow

$ strace cp --reflink file-very-big-nocow file2 2>&1 | egrep ioctl
ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3) = -1 EINVAL (Invalid argument)

(I tried "chattr +C ..." too)

Ok, now my question is: how can we remove the NOCOW flag from a file?

My use case is to move files between subvolumes, some of which are marked as NOCOW.
The files are videos, so I want to avoid copying the data.


BR

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


* Re: cp --reflink and NOCOW files
From: Matthew Warren @ 2022-05-25  2:19 UTC (permalink / raw)
  To: kreijack; +Cc: linux-btrfs

Goffredo,

You can only reflink files which are COW. If you want to reflink files
which are NOCOW, you have to copy them to a COW file (e.g. cp
file-very-big-nocow file-very-big-cow) and then you can reflink it.
It's recommended to keep everything COW unless it has many random
writes like databases and virtual machines.

Matthew Warren


* Re: cp --reflink and NOCOW files
From: Forza @ 2022-05-25  4:34 UTC (permalink / raw)
  To: Matthew Warren, kreijack; +Cc: linux-btrfs

Hi,

On 2022-05-25 04:19, Matthew Warren wrote:
 > Goffredo,
 >
 > You can only reflink files which are COW. If you want to reflink files
 > which are NOCOW, you have to copy them to a COW file (eg. cp
 > file-very-big-nocow file-very-big-cow) and then you can reflink it.
 > It's recommended to keep everything COW unless it has many random
 > writes like databases and virtual machines.
 >
 > Matthew Warren

The problem is with coreutils and 'cp', not Btrfs. It is possible to 
reflink copy nodatacow files. The requirement is that the target is also 
nodatacow. Deduplication of nodatacow files has the same limitation as well.

Example:

# touch foo
# chattr +C foo
# truncate -s1G foo
# ll
total 16
drwxr-xr-x 1 root root          6 May 25 06:24 ./
drwxr-xr-x 1 root root        148 May 16 21:53 ../
-rw-r--r-- 1 root root 1073741824 May 25 06:24 foo

# lsattr
---------------C------ ./foo

# cp --reflink=auto foo bar
'foo' -> 'bar'

# ll
total 1048592
drwxr-xr-x 1 root root         12 May 25 06:24 ./
drwxr-xr-x 1 root root        148 May 16 21:53 ../
-rw-r--r-- 1 root root 1073741824 May 25 06:24 bar
-rw-r--r-- 1 root root 1073741824 May 25 06:24 foo

# lsattr
---------------C------ ./foo
---------------------- ./bar

We can see that 'bar' is a normal file. It did not get reflinked.
The solution is instead to create the target file with +C first.

# rm bar
# touch bar
# chattr +C bar
# cp --reflink=auto foo bar
'foo' -> 'bar'

# ll
total 16
drwxr-xr-x 1 root root         12 May 25 06:25 ./
drwxr-xr-x 1 root root        148 May 16 21:53 ../
-rw-r--r-- 1 root root 1073741824 May 25 06:25 bar
-rw-r--r-- 1 root root 1073741824 May 25 06:24 foo

# lsattr
---------------C------ ./foo
---------------C------ ./bar
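
The workaround above can be wrapped in a small shell function. This is only a
sketch: reflink_nocow, run and the DRY_RUN variable are made-up names for
illustration, not part of coreutils or btrfs-progs. It assumes the target does
not exist yet, because +C has to be set while the file is still empty.

```shell
#!/bin/sh
# Sketch: reflink-copy a NOCOW file by preparing a NOCOW target first.
# reflink_nocow, run and DRY_RUN are hypothetical names used for illustration.

run() {
    # In dry-run mode only print the command, otherwise execute it.
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

reflink_nocow() {
    src=$1
    dst=$2
    # Refuse to reuse an existing target: +C must be set on an empty file.
    if [ -e "$dst" ]; then
        echo "reflink_nocow: $dst already exists" >&2
        return 1
    fi
    run touch "$dst" &&
    run chattr +C "$dst" &&
    run cp --reflink=always "$src" "$dst"
}
```

With DRY_RUN=1 the function only prints the three commands, which is a cheap
way to check the paths before touching anything.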


* Re: cp --reflink and NOCOW files
From: Andrei Borzenkov @ 2022-05-25  5:11 UTC (permalink / raw)
  To: kreijack, linux-btrfs

On 24.05.2022 22:02, Goffredo Baroncelli wrote:
> Hi All,
> 
> recently I discovered that Btrfs doesn't allow reflinking a file
> when the source is marked as NOCOW
> 
> $ lsattr
> ---------------C------ ./file-very-big-nocow
> $ cp --reflink file-very-big-nocow file2
> cp: failed to clone 'file2' from 'file-very-big-nocow': Invalid argument
> $ strace cp --reflink file-very-big-nocow file2 2>&1 | egrep ioctl
> ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3) = -1 EINVAL (Invalid argument)
> 
> My first thought was that it would be sufficient to remove the "nocow" flag.
> But I was unable to do that.
> 
> $ chattr -C file-very-big-nocow
> 
> $ strace cp --reflink file-very-big-nocow file2 2>&1 | egrep ioctl
> ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3) = -1 EINVAL (Invalid argument)
> 
> (I tried "chattr +C ..." too)
> 

E-h-h ... you tried to set this attribute expecting it to be unset as a
result?

> Ok, now my question is: how can we remove the NOCOW flag from a file?
> 

btrfs silently ignores changes to NOCOW for non-empty files.

NOCOW also implies no checksumming. Enabling COW would require rebuilding
all checksums for existing data. I am not saying it is impossible, but it
is certainly much more involved than a single bit flip.

And disabling COW will (potentially) invalidate all existing references
for which checksums are already stored, and those are likely read-only
and cannot be changed at all. So it is only possible if it is verified
that no other references exist.
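
Because the change is ignored silently, the only way to notice is to look at
the flags again afterwards. A minimal sketch of such a check follows;
has_nocow_flag is a hypothetical helper name, and it assumes the lsattr output
format shown earlier in this thread (flags field, one space, file name).

```shell
# Sketch: detect whether the NOCOW flag is (still) set on a file.
# has_nocow_flag is a hypothetical helper, not an existing tool.

has_nocow_flag() {
    # $1 is one line of lsattr output, e.g. "---------------C------ ./foo".
    flags=${1%% *}      # keep only the first space-separated field
    case $flags in
        *C*) return 0 ;;    # upper-case C marks NOCOW
        *)   return 1 ;;
    esac
}

# Usage sketch (f is a placeholder variable for the file of interest):
#   chattr -C "$f"
#   if has_nocow_flag "$(lsattr "$f")"; then
#       echo "$f is still NOCOW, the change was ignored" >&2
#   fi
```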

> My use case is to move files between subvolumes, some of which are marked as NOCOW.
> The files are videos, so I want to avoid copying the data.
> 
> 
> BR
> 



* Re: cp --reflink and NOCOW files
From: Goffredo Baroncelli @ 2022-05-25 16:15 UTC (permalink / raw)
  To: Forza, Matthew Warren; +Cc: linux-btrfs

On 25/05/2022 06.34, Forza wrote:
>  > You can only reflink files which are COW. If you want to reflink files
>  > which are NOCOW, you have to copy them to a COW file (eg. cp
>  > file-very-big-nocow file-very-big-cow) and then you can reflink it.
>  > It's recommended to keep everything COW unless it has many random
>  > writes like databases and virtual machines.
>  >
>  > Matthew Warren
> 
> The problem is with coreutils and 'cp', not Btrfs. It is possible to reflink copy nodatacow files. The requirement is that the target is also nodatacow. Deduplication of nodatacow files has the same limitation as well.
> 

Correct. More in another thread.

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


* Re: cp --reflink and NOCOW files
From: Forza @ 2022-05-27 14:37 UTC (permalink / raw)
  To: kreijack, linux-btrfs

Hi Goffredo,

On 2022-05-24 21:02, Goffredo Baroncelli wrote:
> Hi All,
> 
> recently I discovered that Btrfs doesn't allow reflinking a file
> when the source is marked as NOCOW

You probably saw in my earlier email to the mailing list that this 
is a problem with the 'cp' tool, because it tries to set +C after it 
has appended data to the target file.

> 
> $ lsattr
> ---------------C------ ./file-very-big-nocow
> $ cp --reflink file-very-big-nocow file2
> cp: failed to clone 'file2' from 'file-very-big-nocow': Invalid argument
> $ strace cp --reflink file-very-big-nocow file2 2>&1 | egrep ioctl
> ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3) = -1 EINVAL (Invalid argument)
> 
> My first thought was that it would be sufficient to remove the "nocow" 
> flag.
> But I was unable to do that.
> 
> $ chattr -C file-very-big-nocow
> 
> $ strace cp --reflink file-very-big-nocow file2 2>&1 | egrep ioctl
> ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3) = -1 EINVAL (Invalid argument)
> 
> (I tried "chattr +C ..." too)
> 
> Ok, now my question is: how can we remove the NOCOW flag from a file?
> 

The issue here is that nocow also means nodatasum (no checksums). The 
checksums are needed to verify the integrity of the data blocks.

So in order to turn off nocow, checksums would have to be created for 
every block of data. Btrfs does not do this, because there is no way for 
Btrfs to actually know if the data is correct. The result is that Btrfs 
refuses to turn off the nocow fsattr.

The solution is to make a normal copy as you just experienced.
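
A minimal sketch of the "normal copy" approach: write a full copy next to the
original and rename it over the original. recopy_as_new_file is a hypothetical
name. The sketch assumes there is room for a second full copy and that the
containing directory is not itself marked +C, since new files inherit that
attribute from their directory. Hard links to the old file are not carried
over.

```shell
# Sketch: replace a file with a freshly written copy of itself.
# recopy_as_new_file is a hypothetical helper, not an existing tool.

recopy_as_new_file() {
    src=$1
    tmp=$src.recopy.$$      # temporary name in the same directory
    if cp -p -- "$src" "$tmp"; then
        # Rename the copy over the original only after it was fully written.
        mv -- "$tmp" "$src"
    else
        rm -f -- "$tmp"
        return 1
    fi
}
```

Afterwards lsattr on the file shows whether the new copy really ended up
without the C flag.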

> My use case is to move files between subvolumes, some of which are marked 
> as NOCOW.
> The files are videos, so I want to avoid copying the data.
> 

You can still reflink copy the files if you do something like:

# touch target/foo
# chattr +C target/foo
# cp --reflink=always source/foo target/foo

Target and source must be reachable from the same mountpoint. The easiest
way to manage this is to mount the top-level subvolume (subvolid=5) at
/mnt/btrfs/ and handle the files from there.
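
A small sketch of a pre-flight check for the "same mountpoint" condition.
mount_point_of and same_mount are hypothetical helper names, /dev/sdX is a
placeholder device, and the parsing assumes mount points without spaces.

```shell
# Sketch: check that two paths are served by the same mount point.
# mount_point_of and same_mount are hypothetical helpers.

mount_point_of() {
    # df -P prints a header line, then one line whose last field is the
    # mount point that serves the given path.
    df -P -- "$1" 2>/dev/null | awk 'NR == 2 { print $NF }'
}

same_mount() {
    a=$(mount_point_of "$1")
    b=$(mount_point_of "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage sketch (needs root; /dev/sdX is a placeholder):
#   mount -o subvolid=5 /dev/sdX /mnt/btrfs
#   same_mount /mnt/btrfs/source/foo /mnt/btrfs/target &&
#       cp --reflink=always /mnt/btrfs/source/foo /mnt/btrfs/target/foo
```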

Video files are unusual to keep as nocow. From my own experience, and 
IMHO, nocow should always be a last resort, used only to solve an urgent 
performance issue. Certain workloads can cause lots of extents and 
fragmentation due to cow, but video editing or similar is normally not 
such a use case.

If you download large files to a spinning HDD, and they end up very 
fragmented, I suggest you use 'btrfs filesystem defrag' before you 
snapshot or reflink copy the files.
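
To judge whether a file is fragmented enough to bother, the extent count
reported by filefrag can be used. The sketch below only parses that output;
extent_count is a hypothetical helper name and the threshold of 1000 extents
in the usage comment is an arbitrary example, not a recommendation.

```shell
# Sketch: extract the extent count from filefrag's summary line.
# extent_count is a hypothetical helper, not an existing tool.

extent_count() {
    # Reads lines such as "video.mkv: 1234 extents found" on stdin
    # and prints just the number.
    sed -n 's/.*: \([0-9][0-9]*\) extents\{0,1\} found.*/\1/p'
}

# Usage sketch (f is a placeholder; 1000 is an arbitrary threshold):
#   n=$(filefrag "$f" | extent_count)
#   if [ "$n" -gt 1000 ]; then
#       btrfs filesystem defragment "$f"
#   fi
```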

There are a lot of guides on the internet that suggest turning on nocow, 
but they rarely discuss the (potentially harmful) downsides with this 
choice.

You should know that because nocow also means nodatasum, Btrfs cannot 
self-heal or even detect data corruption in nocow files, even when using 
tools like 'scrub' or 'check'. In addition, there is no guaranteed 
redundancy with RAID/DUP profiles for those files because Btrfs relies 
on checksums to determine which device has a correct copy.
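
To find out which files are affected, the lsattr output can be filtered for
the C flag. This is a sketch only: list_nocow is a hypothetical helper name,
and /mnt/btrfs/videos in the usage comment is a placeholder path.

```shell
# Sketch: list the files that carry the NOCOW flag.
# list_nocow is a hypothetical helper, not an existing tool.

list_nocow() {
    # Reads `lsattr -R` output on stdin. Lines with a flags field containing
    # an upper-case C are printed without that field; directory headers
    # ("./dir:") and blank lines are skipped.
    awk 'NF >= 2 && $1 ~ /^[-A-Za-z]+$/ && $1 ~ /C/ {
        sub(/^[^ ]+ /, "")
        print
    }'
}

# Usage sketch (placeholder path):
#   lsattr -R /mnt/btrfs/videos 2>/dev/null | list_nocow
```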

I advise against nocow in most situations because I have myself had 
corruption on precious photographs and videos due to bitrot. I have 
written a little about this at https://wiki.tnonline.net/w/Btrfs/Scrub

Good luck!

> 
> BR
> 

