* Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Demi Marie Obenour @ 2025-09-22 0:07 UTC
To: linux-btrfs

Wyng Backup (https://codeberg.org/tasket/wyng-backup) relies on FIEMAP
to determine which parts of a file have not changed since it was last
backed up. Specifically, the output of filefrag -v is passed to sort and
then to uniq, and differences between the outputs for the file and
the previous version (a reflink copy) determine what gets backed up.

Is this safe under BTRFS, or can it result in data loss due to data
not being backed up that should be? In other words, can it result
in data being considered unchanged when it has really changed?

--
Sincerely,
Demi Marie Obenour (she/her/hers)
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Qu Wenruo @ 2025-09-22 0:50 UTC
To: Demi Marie Obenour, linux-btrfs

On 2025/9/22 09:37, Demi Marie Obenour wrote:
> Wyng Backup (https://codeberg.org/tasket/wyng-backup) relies on FIEMAP
> to determine which parts of a file have not changed since it was last
> backed up. Specifically, the output of filefrag -v is passed to sort and
> then to uniq, and differences between the outputs for the file and
> the previous version (a reflink copy) determine what gets backed up.
>
> Is this safe under BTRFS,

No. Several factors affect this; some are minor, some are not:

- Inlined extents
  The returned bytenr is unreliable in that case, although the fiemap
  flags should indicate it by having the 'inline' flag set.

- Balance
  Btrfs can balance the data extents, which changes the fiemap output.

  E.g.
  ## Before balance
  # md5sum /mnt/btrfs/foobar
  27c9068d1b51da575a53ad34c57ca5cc  /mnt/btrfs/foobar
  # filefrag -v /mnt/btrfs/foobar
  Filesystem type is: 9123683e
  File size of /mnt/btrfs/foobar is 65536 (8 blocks of 8192 bytes)
   ext:  logical_offset:  physical_offset:  length:  expected:  flags:
     0:      0..     7:    1664..   1671:        8:             last,eof
  /mnt/btrfs/foobar: 1 extent found

  ## Do data balance
  # btrfs balance start -d /mnt/btrfs/
  Done, had to relocate 1 out of 3 chunks

  ## After data balance
  # filefrag -v /mnt/btrfs/foobar
  Filesystem type is: 9123683e
  File size of /mnt/btrfs/foobar is 65536 (8 blocks of 8192 bytes)
   ext:  logical_offset:  physical_offset:  length:  expected:  flags:
     0:      0..     7:   36480..  36487:        8:             last,eof
  /mnt/btrfs/foobar: 1 extent found

- NODATACOW cases
  In that case new data is written into the same location, without any
  new data extents. This completely breaks the assumption.

- Dirty data that is not yet written to disk
  In that case fiemap won't show that data, only the data that is
  already on disk.

> or can it result in data loss due to data
> not being backed up that should be? In other words, can it result
> in data being considered unchanged when it has really changed?

Dirty data and NODATACOW will result in changed data being considered
unchanged when using fiemap only.

And balance will make unchanged data be considered changed.

So overall, a fiemap-based solution on btrfs is unreliable.

Thanks,
Qu
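Editorial sketch, not part of the thread: the two outcomes described above (balance causing a false "changed", NODATACOW causing a false "unchanged") can be illustrated with a simplified extent comparison. The `Extent` type and `naive_changed_ranges` function are hypothetical names invented for this illustration; they only approximate the sort/uniq comparison of filefrag output described in the original question and are not Wyng's actual code. The block numbers come from the filefrag transcript above.

```python
from typing import FrozenSet, List, NamedTuple, Tuple


class Extent(NamedTuple):
    logical: int            # first logical block within the file
    physical: int           # first block number as reported by FIEMAP
    length: int             # number of blocks
    flags: FrozenSet[str]   # e.g. {"last", "eof"} or {"inline"}


def naive_changed_ranges(old: List[Extent], new: List[Extent]) -> List[Tuple[int, int]]:
    """Return logical block ranges of `new` whose mapping does not appear in `old`.

    This trusts the reported physical offsets, which is exactly the
    assumption the thread says is unsafe.
    """
    known = {(e.logical, e.physical, e.length) for e in old}
    return [
        (e.logical, e.logical + e.length - 1)
        for e in new
        if (e.logical, e.physical, e.length) not in known
    ]


before = [Extent(0, 1664, 8, frozenset({"last", "eof"}))]

# Balance relocated the extent: the data is identical, the mapping is not.
after_balance = [Extent(0, 36480, 8, frozenset({"last", "eof"}))]

# NODATACOW overwrite: the data changed in place, the mapping is identical.
after_nocow_overwrite = list(before)

print(naive_changed_ranges(before, after_balance))          # [(0, 7)]
print(naive_changed_ranges(before, after_nocow_overwrite))  # []
```

The first result is only wasted work (unchanged data is backed up again); the second is the dangerous case, because modified data is silently skipped.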
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Demi Marie Obenour @ 2025-09-22 18:24 UTC
To: Qu Wenruo, linux-btrfs

On 9/21/25 20:50, Qu Wenruo wrote:
> On 2025/9/22 09:37, Demi Marie Obenour wrote:
>> Wyng Backup (https://codeberg.org/tasket/wyng-backup) relies on FIEMAP
>> to determine which parts of a file have not changed since it was last
>> backed up. Specifically, the output of filefrag -v is passed to sort and
>> then to uniq, and differences between the outputs for the file and
>> the previous version (a reflink copy) determine what gets backed up.
>>
>> Is this safe under BTRFS,
>
> No. Several factors affect this; some are minor, some are not:
>
> - Inlined extents
>   The returned bytenr is unreliable in that case, although the fiemap
>   flags should indicate it by having the 'inline' flag set.
>
> - Balance
>   Btrfs can balance the data extents, which changes the fiemap output.
>
>   E.g.
>   ## Before balance
>   # md5sum /mnt/btrfs/foobar
>   27c9068d1b51da575a53ad34c57ca5cc  /mnt/btrfs/foobar
>   # filefrag -v /mnt/btrfs/foobar
>   Filesystem type is: 9123683e
>   File size of /mnt/btrfs/foobar is 65536 (8 blocks of 8192 bytes)
>    ext:  logical_offset:  physical_offset:  length:  expected:  flags:
>      0:      0..     7:    1664..   1671:        8:             last,eof
>   /mnt/btrfs/foobar: 1 extent found
>
>   ## Do data balance
>   # btrfs balance start -d /mnt/btrfs/
>   Done, had to relocate 1 out of 3 chunks
>
>   ## After data balance
>   # filefrag -v /mnt/btrfs/foobar
>   Filesystem type is: 9123683e
>   File size of /mnt/btrfs/foobar is 65536 (8 blocks of 8192 bytes)
>    ext:  logical_offset:  physical_offset:  length:  expected:  flags:
>      0:      0..     7:   36480..  36487:        8:             last,eof
>   /mnt/btrfs/foobar: 1 extent found
>
> - NODATACOW cases
>   In that case new data is written into the same location, without any
>   new data extents. This completely breaks the assumption.
>
> - Dirty data that is not yet written to disk
>   In that case fiemap won't show that data, only the data that is
>   already on disk.
>
>> or can it result in data loss due to data
>> not being backed up that should be? In other words, can it result
>> in data being considered unchanged when it has really changed?
>
> Dirty data and NODATACOW will result in changed data being considered
> unchanged when using fiemap only.
>
> And balance will make unchanged data be considered changed.
>
> So overall, a fiemap-based solution on btrfs is unreliable.

Can one implement this reliably with TREE_SEARCH_V2 or by parsing
the output of `btrfs send`? Using `btrfs receive` to apply deltas
might work, but the backups must be encrypted with a key the
destination doesn't have, and that means the backups must be
opaque to the remote. Therefore, I think periodic full backups
with incremental backups on top of them (each containing the hash
of the last) is the best that could be done. This means that old
backups cannot be garbage-collected without taking a full backup
first.

--
Sincerely,
Demi Marie Obenour (she/her/hers)
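Editorial sketch, not part of the thread: the scheme proposed above (a full backup followed by incrementals, each carrying the hash of its predecessor) amounts to a hash chain over opaque payloads. The record layout and the names `make_record` and `chain_is_valid` are assumptions made for this illustration only; SHA-256 is an arbitrary choice, and the payloads stand in for ciphertext that the remote cannot read.

```python
import hashlib
from typing import List, Optional


def make_record(payload: bytes, prev_digest: Optional[str]) -> dict:
    """Build a backup record whose digest covers the payload and the previous digest."""
    h = hashlib.sha256()
    h.update((prev_digest or "").encode())
    h.update(payload)
    return {"prev": prev_digest, "payload": payload, "digest": h.hexdigest()}


def chain_is_valid(records: List[dict]) -> bool:
    """Check that every record links to its predecessor and has an intact digest."""
    prev = None
    for rec in records:
        if rec["prev"] != prev:
            return False  # broken link: a predecessor is missing or reordered
        if make_record(rec["payload"], rec["prev"])["digest"] != rec["digest"]:
            return False  # payload or link was altered
        prev = rec["digest"]
    return True


full = make_record(b"full backup (opaque ciphertext)", None)
inc1 = make_record(b"incremental 1", full["digest"])
inc2 = make_record(b"incremental 2", inc1["digest"])

print(chain_is_valid([full, inc1, inc2]))  # True
print(chain_is_valid([inc1, inc2]))        # False
```

The second call shows the garbage-collection constraint mentioned above: once the oldest record is removed, the remaining incrementals no longer verify, so a new full backup has to start a fresh chain first.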
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Qu Wenruo @ 2025-09-22 21:38 UTC
To: Demi Marie Obenour, linux-btrfs

On 2025/9/23 03:54, Demi Marie Obenour wrote:
> On 9/21/25 20:50, Qu Wenruo wrote:
>> On 2025/9/22 09:37, Demi Marie Obenour wrote:
>>> Wyng Backup (https://codeberg.org/tasket/wyng-backup) relies on FIEMAP
>>> to determine which parts of a file have not changed since it was last
>>> backed up. Specifically, the output of filefrag -v is passed to sort and
>>> then to uniq, and differences between the outputs for the file and
>>> the previous version (a reflink copy) determine what gets backed up.
>>>
>>> Is this safe under BTRFS,
>>
>> No. Several factors affect this; some are minor, some are not:
>>
>> - Inlined extents
>>   The returned bytenr is unreliable in that case, although the fiemap
>>   flags should indicate it by having the 'inline' flag set.
>>
>> - Balance
>>   Btrfs can balance the data extents, which changes the fiemap output.
>>
>>   E.g.
>>   ## Before balance
>>   # md5sum /mnt/btrfs/foobar
>>   27c9068d1b51da575a53ad34c57ca5cc  /mnt/btrfs/foobar
>>   # filefrag -v /mnt/btrfs/foobar
>>   Filesystem type is: 9123683e
>>   File size of /mnt/btrfs/foobar is 65536 (8 blocks of 8192 bytes)
>>    ext:  logical_offset:  physical_offset:  length:  expected:  flags:
>>      0:      0..     7:    1664..   1671:        8:             last,eof
>>   /mnt/btrfs/foobar: 1 extent found
>>
>>   ## Do data balance
>>   # btrfs balance start -d /mnt/btrfs/
>>   Done, had to relocate 1 out of 3 chunks
>>
>>   ## After data balance
>>   # filefrag -v /mnt/btrfs/foobar
>>   Filesystem type is: 9123683e
>>   File size of /mnt/btrfs/foobar is 65536 (8 blocks of 8192 bytes)
>>    ext:  logical_offset:  physical_offset:  length:  expected:  flags:
>>      0:      0..     7:   36480..  36487:        8:             last,eof
>>   /mnt/btrfs/foobar: 1 extent found
>>
>> - NODATACOW cases
>>   In that case new data is written into the same location, without any
>>   new data extents. This completely breaks the assumption.
>>
>> - Dirty data that is not yet written to disk
>>   In that case fiemap won't show that data, only the data that is
>>   already on disk.
>>
>>> or can it result in data loss due to data
>>> not being backed up that should be? In other words, can it result
>>> in data being considered unchanged when it has really changed?
>>
>> Dirty data and NODATACOW will result in changed data being considered
>> unchanged when using fiemap only.
>>
>> And balance will make unchanged data be considered changed.
>>
>> So overall, a fiemap-based solution on btrfs is unreliable.
>
> Can one implement this reliably with TREE_SEARCH_V2

If you go deep into that rabbit hole, I guess you must be very
experienced with the btrfs on-disk format.

> or by parsing
> the output of `btrfs send`? Using `btrfs receive` to apply deltas
> might work, but the backups must be encrypted with a key the
> destination doesn't have,

You're mixing extra objectives into the situation.

And why do you need to handle encryption yourself?
There are things like ecryptfs to do that for you.

Your objective looks overly complex, without proper layer separation.

Thanks,
Qu

> and that means the backups must be
> opaque to the remote. Therefore, I think periodic full backups
> with incremental backups on top of them (each containing the hash
> of the last) is the best that could be done. This means that old
> backups cannot be garbage-collected without taking a full backup
> first.
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Christoph Hellwig @ 2025-09-22 16:48 UTC
To: Demi Marie Obenour; +Cc: linux-btrfs

On Sun, Sep 21, 2025 at 08:07:10PM -0400, Demi Marie Obenour wrote:
> Wyng Backup (https://codeberg.org/tasket/wyng-backup) relies on FIEMAP
> to determine which parts of a file have not changed since it was last
> backed up. Specifically, the output of filefrag -v is passed to sort and
> then to uniq, and differences between the outputs for the file and
> the previous version (a reflink copy) determine what gets backed up.
>
> Is this safe under BTRFS, or can it result in data loss due to data
> not being backed up that should be? In other words, can it result
> in data being considered unchanged when it has really changed?

This is not safe with any file system. FIEMAP is purely a debugging
interface, and the output may or may not correspond to physical
block numbers. E.g. for btrfs it points to virtual space, for XFS
it can point to the RT device. It also is racy against file I/O.

The idea of that tool looks nice, but without a proper kernel interface
to look at the relationship between two files in a way that is locked
against I/O it is fundamentally unsafe.
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Demi Marie Obenour @ 2025-09-22 17:18 UTC
To: Christoph Hellwig; +Cc: linux-btrfs

On 9/22/25 12:48, Christoph Hellwig wrote:
>> then to uniq, and differences between the outputs for the file and
>> the previous version (a reflink copy) determine what gets backed up.
>>
>> Is this safe under BTRFS, or can it result in data loss due to data
>> not being backed up that should be? In other words, can it result
>> in data being considered unchanged when it has really changed?
> This is not safe with any file system. FIEMAP is purely a debugging
> interface, and the output may or may not correspond to physical
> block numbers. E.g. for btrfs it points to virtual space, for XFS
> it can point to the RT device. It also is racy against file I/O.
>
> The idea of that tool looks nice, but without a proper kernel interface
> to look at the relationship between two files in a way that is locked
> against I/O it is fundamentally unsafe.

Is it safe on XFS if there is no RT device and both files have been
fsync'd and are not modified by userspace? I believe this is the case
here: reflinks are used to snapshot the files before they are looked at.

I am glad to know that BTRFS is unsafe and will report this.

--
Sincerely,
Demi Marie Obenour (she/her/hers)
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Christoph Hellwig @ 2025-09-22 17:20 UTC
To: Demi Marie Obenour; +Cc: Christoph Hellwig, linux-btrfs

On Mon, Sep 22, 2025 at 01:18:52PM -0400, Demi Marie Obenour wrote:
> Is it safe on XFS if there is no RT device and both files have been
> fsync'd and are not modified by userspace? I believe this is the case
> here: reflinks are used to snapshot the files before they are looked at.

No. You can still have defragmentation or garbage collection going on
underneath.
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Demi Marie Obenour @ 2025-09-22 17:30 UTC
To: Christoph Hellwig; +Cc: linux-btrfs

On 9/22/25 13:20, Christoph Hellwig wrote:
> On Mon, Sep 22, 2025 at 01:18:52PM -0400, Demi Marie Obenour wrote:
>> Is it safe on XFS if there is no RT device and both files have been
>> fsync'd and are not modified by userspace? I believe this is the case
>> here: reflinks are used to snapshot the files before they are looked at.
>
> No. You can still have defragmentation or garbage collection going on
> underneath.

Can these be prevented by mounting a read-only device-mapper snapshot,
replaying the journal, and then doing processing in userspace on the
block device?

--
Sincerely,
Demi Marie Obenour (she/her/hers)
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Christoph Hellwig @ 2025-09-22 17:31 UTC
To: Demi Marie Obenour; +Cc: Christoph Hellwig, linux-btrfs

On Mon, Sep 22, 2025 at 01:30:36PM -0400, Demi Marie Obenour wrote:
> On 9/22/25 13:20, Christoph Hellwig wrote:
> > On Mon, Sep 22, 2025 at 01:18:52PM -0400, Demi Marie Obenour wrote:
> >> Is it safe on XFS if there is no RT device and both files have been
> >> fsync'd and are not modified by userspace? I believe this is the case
> >> here: reflinks are used to snapshot the files before they are looked at.
> >
> > No. You can still have defragmentation or garbage collection going on
> > underneath.
>
> Can these be prevented by mounting a read-only device-mapper snapshot,
> replaying the journal, and then doing processing in userspace on the
> block device?

Which part of "looking at FIEMAP output except for debugging the file
system is highly dangerous" did you not understand? Don't do it, you
will lose data eventually.
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Demi Marie Obenour @ 2025-09-22 17:54 UTC
To: Christoph Hellwig; +Cc: linux-btrfs

On 9/22/25 13:31, Christoph Hellwig wrote:
> On Mon, Sep 22, 2025 at 01:30:36PM -0400, Demi Marie Obenour wrote:
>> On 9/22/25 13:20, Christoph Hellwig wrote:
>>> On Mon, Sep 22, 2025 at 01:18:52PM -0400, Demi Marie Obenour wrote:
>>>> Is it safe on XFS if there is no RT device and both files have been
>>>> fsync'd and are not modified by userspace? I believe this is the case
>>>> here: reflinks are used to snapshot the files before they are looked at.
>>>
>>> No. You can still have defragmentation or garbage collection going on
>>> underneath.
>>
>> Can these be prevented by mounting a read-only device-mapper snapshot,
>> replaying the journal, and then doing processing in userspace on the
>> block device?
>
> Which part of "looking at FIEMAP output except for debugging the file
> system is highly dangerous" did you not understand? Don't do it, you
> will lose data eventually.

I understand that FIEMAP is not to be used. This also explains why
BTRFS_IOCTL_TREE_SEARCH_V2 is privileged: there is no need for it to
be used in production.

This leaves the question of whether the needed information is in the
filesystem metadata. If so, xfsprogs and/or btrfsprogs could obtain
it from a block-layer snapshot offline without needing kernel changes.
Otherwise, kernel changes will be needed. I don't know if the changes
to the userspace tools will be accepted, though. Until then, btrfs
send/receive will be the only way to efficiently back up a BTRFS
filesystem, and XFS will only be able to be efficiently backed up
at the block level.

What makes thin_delta awesome is that it allows backing up a block
device *without having to read the entire device*. This makes backups
O(log N) instead of O(N) in the size of the device.

--
Sincerely,
Demi Marie Obenour (she/her/hers)
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Christoph Hellwig @ 2025-09-29 8:50 UTC
To: Demi Marie Obenour; +Cc: Christoph Hellwig, linux-btrfs

On Mon, Sep 22, 2025 at 01:54:56PM -0400, Demi Marie Obenour wrote:
> This leaves the question of whether the needed information is in the
> filesystem metadata. If so, xfsprogs and/or btrfsprogs could obtain
> it from a block-layer snapshot offline without needing kernel changes.
> Otherwise, kernel changes will be needed. I don't know if the changes
> to the userspace tools will be accepted, though. Until then, btrfs
> send/receive will be the only way to efficiently back up a BTRFS
> filesystem, and XFS will only be able to be efficiently backed up
> at the block level.

Using userspace tools that poke at the block-level mapping is
fundamentally unsafe because it is not synchronized with the file
system.
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Demi Marie Obenour @ 2025-09-29 23:56 UTC
To: Christoph Hellwig; +Cc: linux-btrfs

On 9/29/25 04:50, Christoph Hellwig wrote:
> On Mon, Sep 22, 2025 at 01:54:56PM -0400, Demi Marie Obenour wrote:
>> This leaves the question of whether the needed information is in the
>> filesystem metadata. If so, xfsprogs and/or btrfsprogs could obtain
>> it from a block-layer snapshot offline without needing kernel changes.
>> Otherwise, kernel changes will be needed. I don't know if the changes
>> to the userspace tools will be accepted, though. Until then, btrfs
>> send/receive will be the only way to efficiently back up a BTRFS
>> filesystem, and XFS will only be able to be efficiently backed up
>> at the block level.
>
> Using userspace tools that poke at the block-level mapping is
> fundamentally unsafe because it is not synchronized with the file
> system.

I should have been clearer: this would be operating on an unmounted
filesystem, most likely obtained via a device-mapper snapshot of
the underlying block device.

--
Sincerely,
Demi Marie Obenour (she/her/hers)
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Demi Marie Obenour @ 2025-09-30 1:34 UTC
To: Christoph Hellwig; +Cc: linux-btrfs

On 9/29/25 04:50, Christoph Hellwig wrote:
> On Mon, Sep 22, 2025 at 01:54:56PM -0400, Demi Marie Obenour wrote:
>> This leaves the question of whether the needed information is in the
>> filesystem metadata. If so, xfsprogs and/or btrfsprogs could obtain
>> it from a block-layer snapshot offline without needing kernel changes.
>> Otherwise, kernel changes will be needed. I don't know if the changes
>> to the userspace tools will be accepted, though. Until then, btrfs
>> send/receive will be the only way to efficiently back up a BTRFS
>> filesystem, and XFS will only be able to be efficiently backed up
>> at the block level.
>
> Using userspace tools that poke at the block-level mapping is
> fundamentally unsafe because it is not synchronized with the file
> system.

Is it unsafe even if the filesystem is not mounted?

--
Sincerely,
Demi Marie Obenour (she/her/hers)
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Christoph Hellwig @ 2025-10-03 7:45 UTC
To: Demi Marie Obenour; +Cc: Christoph Hellwig, linux-btrfs

On Mon, Sep 29, 2025 at 09:34:32PM -0400, Demi Marie Obenour wrote:
> On 9/29/25 04:50, Christoph Hellwig wrote:
> > On Mon, Sep 22, 2025 at 01:54:56PM -0400, Demi Marie Obenour wrote:
> >> This leaves the question of whether the needed information is in the
> >> filesystem metadata. If so, xfsprogs and/or btrfsprogs could obtain
> >> it from a block-layer snapshot offline without needing kernel changes.
> >> Otherwise, kernel changes will be needed. I don't know if the changes
> >> to the userspace tools will be accepted, though. Until then, btrfs
> >> send/receive will be the only way to efficiently back up a BTRFS
> >> filesystem, and XFS will only be able to be efficiently backed up
> >> at the block level.
> >
> > Using userspace tools that poke at the block-level mapping is
> > fundamentally unsafe because it is not synchronized with the file
> > system.
>
> Is it unsafe even if the filesystem is not mounted?

Well, if the file system is not mounted whoever pokes at it is
obviously in control. And needs full understanding of the on-disk
structures.
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Chris Laprise @ 2025-09-22 23:25 UTC
To: Christoph Hellwig, Demi Marie Obenour; +Cc: linux-btrfs

On 9/22/25 13:20, Christoph Hellwig wrote:
> On Mon, Sep 22, 2025 at 01:18:52PM -0400, Demi Marie Obenour wrote:
>> Is it safe on XFS if there is no RT device and both files have been
>> fsync'd and are not modified by userspace? I believe this is the case
>> here: reflinks are used to snapshot the files before they are looked at.
>
> No. You can still have defragmentation or garbage collection going on
> underneath.
>

Hi, Wyng developer here...

The overall procedure is:

1. Get subvolume's Generation ID

2. Read FIEMAP data

3. Get subvolume's Generation ID again

4. Check that the Generation ID hasn't changed: No match skips the file
   or raises an error

I believe with those steps it is safe to use FIEMAP data for a
read-only use case (backup), even when online maintenance is occurring.
That has certainly been my experience after years of daily use and
verification with no incidents of related data corruption. Ironically,
this has been more reliable than using 'thin_delta' output from
thin-provisioned LVM volumes.

I should clarify that in this application we are not interested in
physical mappings, but the logical representation of data. It is also
understood that for some other applications FIEMAP would not be
sufficient; however, Wyng is not fetching or manipulating data at a low
level. Also, there is a lack of accuracy in the form of false
positives, where unchanged data show up as deltas, but this only
results in longer processing time, not data corruption; false negatives
are the only thing that must be avoided.

Other concerns upthread appear to be addressed by Wyng. For instance,
NODATACOW files are not an issue because the snapshots are reflink
based; Wyng uses a conventional "full scan" sans snapshot in those
cases.

--
Chris Laprise, tasket@posteo.net
https://codeberg.org/tasket
PGP: BEE2 20C5 356E 764A 73EB 4AB3 1DC4 D106 F07F 1886
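Editorial sketch, not part of the thread: the four-step procedure above is a read, check, and discard pattern. A minimal version is shown here with the two reads injected as callables, so nothing is assumed about how Wyng actually obtains the generation ID or the FIEMAP data; `read_extents_if_stable`, `read_generation` and `read_extents` are hypothetical names. Whether an unchanged generation is a sufficient guarantee is the point under dispute in this thread, and the sketch does not settle it.

```python
from typing import Callable, Optional, Sequence


def read_extents_if_stable(
    read_generation: Callable[[], int],
    read_extents: Callable[[], Sequence],
) -> Optional[Sequence]:
    """Return the extent map only if the generation was the same before and after."""
    gen_before = read_generation()   # step 1
    extents = read_extents()         # step 2
    gen_after = read_generation()    # step 3
    if gen_before != gen_after:      # step 4: something changed in between
        return None                  # caller skips the file or raises an error
    return extents


# Stand-in readers for demonstration only; real ones would query the filesystem.
stable = read_extents_if_stable(lambda: 42, lambda: [(0, 1664, 8)])

generations = iter([42, 43])  # generation changes between the two reads
racy = read_extents_if_stable(lambda: next(generations), lambda: [(0, 1664, 8)])

print(stable)  # [(0, 1664, 8)]
print(racy)    # None
```

Returning `None` instead of a possibly stale map keeps the failure on the safe side: the caller can fall back to a full scan of the file.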
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Christoph Hellwig @ 2025-09-29 8:49 UTC
To: Chris Laprise; +Cc: Christoph Hellwig, Demi Marie Obenour, linux-btrfs

On Mon, Sep 22, 2025 at 11:25:48PM +0000, Chris Laprise wrote:
> The overall procedure is:
>
> 1. Get subvolume's Generation ID
>
> 2. Read FIEMAP data
>
> 3. Get subvolume's Generation ID again
>
> 4. Check that the Generation ID hasn't changed: No match skips the file or
> raises an error

I'll let the btrfs developers answer this as it's clearly not about
XFS.

> I should clarify that in this application we are not interested in physical
> mappings, but the logical representation of data.

And that's not what FIEMAP provides.

> It is also understood that
> for some other applications FIEMAP would not be sufficient; however Wyng is
> not fetching or manipulating data at a low level. Also, there is a lack of
> accuracy in the form of false positives, where unchanged data show up as
> deltas, but this only results in longer processing time not data corruption;
> false negatives are the only thing that must be avoided.

I think what you want/need is a way to look at the delta between two
reflinked files. At least for XFS (and I'm pretty sure for btrfs as
well) the low-level data structures could provide this, but building an
actually safe interface to that is unfortunately hard.
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Demi Marie Obenour @ 2025-09-29 23:55 UTC
To: Christoph Hellwig, Chris Laprise; +Cc: linux-btrfs

On 9/29/25 04:49, Christoph Hellwig wrote:
> On Mon, Sep 22, 2025 at 11:25:48PM +0000, Chris Laprise wrote:
>> The overall procedure is:
>>
>> 1. Get subvolume's Generation ID
>>
>> 2. Read FIEMAP data
>>
>> 3. Get subvolume's Generation ID again
>>
>> 4. Check that the Generation ID hasn't changed: No match skips the file or
>> raises an error
>
> I'll let the btrfs developers answer this as it's clearly not about
> XFS.
>
>> I should clarify that in this application we are not interested in physical
>> mappings, but the logical representation of data.
>
> And that's not what FIEMAP provides.

Can two extents have the same offset on disk but different logical contents?
For XFS that seems impossible unless a realtime device is involved, which
is not the case here.

>> It is also understood that
>> for some other applications FIEMAP would not be sufficient; however Wyng is
>> not fetching or manipulating data at a low level. Also, there is a lack of
>> accuracy in the form of false positives, where unchanged data show up as
>> deltas, but this only results in longer processing time not data corruption;
>> false negatives are the only thing that must be avoided.
>
> I think what you want/need is a way to look at the delta between two
> reflinked files. At least for XFS (and I'm pretty sure for btrfs as
> well) the low-level data structures could provide this, but building an
> actually safe interface to that is unfortunately hard.

Is it easier if one requires the filesystem to be read-only? Taking a
device-mapper snapshot (thin or CoW) before the backup is not too onerous,
at least if the filesystem is already on an LVM LV.

--
Sincerely,
Demi Marie Obenour (she/her/hers)
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Christoph Hellwig @ 2025-10-03 7:44 UTC
To: Demi Marie Obenour; +Cc: Christoph Hellwig, Chris Laprise, linux-btrfs

On Mon, Sep 29, 2025 at 07:55:22PM -0400, Demi Marie Obenour wrote:
> Can two extents have the same offset on disk but different logical contents?
> For XFS that seems impossible unless a realtime device is involved, which
> is not the case here.

Only with the RT device. But again that is insider knowledge and not an API exposed to applications. More importantly the offset can change underneath without any notice to the application.

> Is it easier if one requires the filesystem to be read-only? Taking a
> device-mapper snapshot (thin or CoW) before the backup is not too onerous,
> at least if the filesystem is already on an LVM LV.

At least that locks out other changes. But it is a rather opaque setup.
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Demi Marie Obenour @ 2025-10-04 1:09 UTC
To: Christoph Hellwig; +Cc: Chris Laprise, linux-btrfs, linux-xfs

On 10/3/25 03:44, Christoph Hellwig wrote:
> On Mon, Sep 29, 2025 at 07:55:22PM -0400, Demi Marie Obenour wrote:
>> Can two extents have the same offset on disk but different logical contents?
>> For XFS that seems impossible unless a realtime device is involved, which
>> is not the case here.
> Only with the RT device. But again that is insider knowledge and not an
> API exposed to applications. More importantly the offset can change
> underneath without any notice to the application.

Are there cases where the offset can change even if neither file is written to and there is no RT device?

>> Is it easier if one requires the filesystem to be read-only? Taking a
>> device-mapper snapshot (thin or CoW) before the backup is not too onerous,
>> at least if the filesystem is already on an LVM LV.
> At least that locks out other changes. But it is a rather opaque setup.

It's indeed not great, but if it is set up by the OS installer the snapshots could all be automated. Is it currently safe in this case, provided that online fsck is disabled?

--
Sincerely,
Demi Marie Obenour (she/her/hers)
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Chris Laprise @ 2025-10-04 1:43 UTC
To: Christoph Hellwig; +Cc: linux-btrfs

On 9/29/25 04:49, Christoph Hellwig wrote:
> On Mon, Sep 22, 2025 at 11:25:48PM +0000, Chris Laprise wrote:
>> The overall procedure is:
>>
>> 1. Get subvolume's Generation ID
>>
>> 2. Read FIEMAP data
>>
>> 3. Get subvolume's Generation ID again
>>
>> 4. Check that the Generation ID hasn't changed: No match skips the file or
>> raises an error
>
> I'll let the btrfs developers answer this as it's clearly not about
> XFS.
>
>> I should clarify that in this application we are not interested in physical
>> mappings, but the logical representation of data.
>
> And that's not what FIEMAP provides.

It's actually what FIEMAP provides: the reported 'logical offsets' are intra-file offsets mapped to extent addresses which are labeled 'physical' (in FIEMAP only) as a historical holdover in Btrfs. From the "BTRFS: The Linux B-Tree Filesystem" paper [1] and developer reference [2]:

> A chunk tree maintains a mapping from logical chunks to physical
> chunks. A device tree maintains the reverse mapping. The rest of the
> filesystem sees logical chunks, and all extent references address
> logical chunks

[1] https://web.archive.org/web/20140423000340/http://domino.watson.ibm.com/library/CyberDig.nsf/papers/6E1C5B6A1B6EDD9885257A38006B6130/$File/rj10501.pdf
[2] https://btrfs.readthedocs.io/en/latest/dev/On-disk-format.html#extent-data-6c

FIEMAP is a readout from an inode's extent references. Btrfs extent references are pointers to logical addresses.

The distinction between physical and logical (deemed "opaque") is important mainly in cases not grounded in VFS access, such as when the Btrfs docs warn not to use FIEMAP addresses "reported as physical" for system hibernation storage. (This is the only warning on FIEMAP use I could find in the documentation, BTW.) Knowing that the value is unusable for hibernation on Btrfs but usable on XFS only means the developer has to know the difference and check the filesystem type.

For Wyng's purposes, each record coming from FIEMAP is logical because it's a mapping of something (logical for Btrfs or physical for other fs) to a logical range in a file, and the list of those ranges (w/extent references) under an inode record in the subvolume tree defines a file's contents. The two files being compared in a read-only subvol must contain the same data before, during and after maintenance transactions. So it only matters that FIEMAPs for both files were read under the same version of the subvol tree (filesystem layer), which contains the logical extent reference mapping, so checking the subvol generation ID works here. (Noting also, it likely won't work much longer on XFS since online maintenance was recently added, apparently without a way to check transaction IDs online; Wyng has an open issue for this.)

Notes on general usage: The FIEMAP ioctl is used by Btrfs userspace utils, samba, libarchive, bees, duperemove, blockdiff and probably more. Coreutils 'cp' used it for a time before deciding they preferred the portability of lseek(). 'duperemove' and 'blockdiff' use FIEMAP to find shared extents similar to Wyng's usage. 'libarchive' notes [3] that extent references map to "logical file blocks".

[3] https://github.com/libarchive/libarchive/blob/372e709c1a143c08281fef76edaf84db42327559/libarchive/archive_read_disk_entry_from_file.c#L841

Given that, the oddly disparate warnings to limit FIEMAP use to only debugging (one person) or only fragmentation detection (another person) look disingenuous. Neither the FIEMAP documentation for the kernel nor that for Btrfs puts such restrictions on its use.

So in the spirit of purported "hacker ethos" and not aesthetics or appeals to proximate expertise, I'll need to see an actually demonstrated or fully explained code path for data corruption for _this_ specific application in order to view the OP claims as serious. Even a moderately fleshed-out thought experiment would be a big improvement. Additionally, I welcome sincere requests to explore the ramifications of using Wyng under specific scenarios in the Wyng issues system, although not catastrophizing based on loosely offered opinion.

> I think what you want/need is a way to look at the delta between two
> reflinked files. [...]

Yes, but only the deltas that matter for read()-able content. We are not trying to make a comprehensive picture of storage internals as btrfs-send/receive do; we only want to know where to seek() during an otherwise normal incremental file backup.

--
Chris Laprise, tasket@posteo.net
https://codeberg.org/tasket
PGP: BEE2 20C5 356E 764A 73EB 4AB3 1DC4 D106 F07F 1886
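The comparison described in this thread (sort the two `filefrag -v` listings, then keep the records that are not common to both) can be sketched as below. This is a minimal illustration under stated assumptions, not Wyng's implementation: `Extent` and `changed_ranges` are hypothetical names, offsets are in filesystem blocks as `filefrag -v` prints them, and parsing the tool's output is omitted. Inline extents are always reported as changed, following the remark earlier in the thread that the address returned for them is unreliable. The sketch errs toward false positives; it does not settle whether false negatives are possible, which is the point under dispute.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Extent(NamedTuple):
    logical: int    # offset within the file, in blocks
    physical: int   # address reported by FIEMAP (labeled "physical")
    length: int     # extent length, in blocks
    flags: frozenset = frozenset()  # e.g. frozenset({"inline"})


def changed_ranges(old: Iterable[Extent], new: Iterable[Extent]) -> List[Tuple[int, int]]:
    """Return merged [start, end) file ranges that must be re-read."""
    old_set, new_set = set(old), set(new)
    # Records present in only one of the two listings.
    suspects = old_set ^ new_set
    # Inline extents carry no trustworthy address, so never treat them as unchanged.
    suspects |= {e for e in old_set | new_set if "inline" in e.flags}
    merged: List[Tuple[int, int]] = []
    for start, end in sorted((e.logical, e.logical + e.length) for e in suspects):
        if merged and start <= merged[-1][1]:
            # Overlapping or adjacent range: extend the previous one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

Every range returned is then read from the current file with ordinary seek() and read() calls. Any difference in a record, including a flag such as `last` moving to another extent, marks that range as changed, which only lengthens the backup.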
* Re: Can the output of FIEMAP on BTRFS be used to check if a file and its reflink copy might have diverged?
From: Christoph Hellwig @ 2025-10-04 4:51 UTC
To: Chris Laprise; +Cc: Christoph Hellwig, linux-btrfs

On Sat, Oct 04, 2025 at 01:43:32AM +0000, Chris Laprise wrote:
> > > I should clarify that in this application we are not interested in physical
> > > mappings, but the logical representation of data.
> >
> > And that's not what FIEMAP provides.
>
> It's actually what FIEMAP provides: the reported 'logical offsets' are
> intra-file offsets mapped to extent addresses which are labeled 'physical'
> (in FIEMAP only) as a historical holdover in Btrfs. From the "BTRFS: The
> Linux B-Tree Filesystem" paper [1] and developer reference [2]:

FIEMAP is NOT an application tool but a provider of debug output. The target is supposed to be physical addresses, but as you've noticed the interpretation varies widely. If you use it, you will eventually lose data. DON'T do it, and do not blame the file systems if it eventually happens.

If you want to be known for a backup tool that loses data, go ahead. But don't blame it on anyone else.

And please, don't give me a reading on FIEMAP. I've been involved with it from the very beginning. Reading out the documentation feels rather patronizing.