* FIDEDUPERANGE and compression
From: ov 2k @ 2022-02-19 3:14 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 757 bytes --]
FIDEDUPERANGE does not seem to behave as expected with compressible
data on a btrfs volume with compression enabled, at least with small
adjacent FIDEDUPERANGE requests. I've attached a basic test case. It
writes two short identical files and calls FIDEDUPERANGE three times,
on the thirds of the file, in order. filefrag -v reports that the
destination file has three extents that each reference the first third
of the source file.
To be clear, the data in the destination file remains correct.
However, the second and third FIDEDUPERANGE calls do not seem to cause
the destination file to reference the expected source extents. I'm
not actually certain whether this is a bug in FIDEDUPERANGE or
FS_IOC_FIEMAP or something deeper within btrfs itself.
[-- Attachment #2: test.c --]
[-- Type: text/x-c-code, Size: 990 bytes --]
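#include <linux/fs.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/statvfs.h>
#include <unistd.h>

int main(void) {
    FILE *src;
    FILE *dest;
    struct statvfs vfsstats;
    unsigned long bsize;
    size_t size;
    struct file_dedupe_range *fdr;

    src = fopen("src", "w+");
    dest = fopen("dest", "w+");
    if (src == NULL || dest == NULL) {
        perror("fopen");
        return 1;
    }
    if (fstatvfs(fileno(src), &vfsstats) != 0) {
        perror("fstatvfs");
        return 1;
    }
    bsize = vfsstats.f_bsize;

    /* Write 6 * bsize bytes of highly compressible data to both files. */
    for (size_t i = 0; i < bsize * 2; i++) {
        fprintf(src, "%s", "foo");
        fprintf(dest, "%s", "foo");
    }
    fflush(src);
    fsync(fileno(src));
    fflush(dest);
    fsync(fileno(dest));

    /* One file_dedupe_range header followed by one destination record. */
    size = sizeof (struct file_dedupe_range);
    size += sizeof (struct file_dedupe_range_info);
    fdr = calloc(1, size);
    if (fdr == NULL) {
        perror("calloc");
        return 1;
    }
    fdr->src_length = 2 * bsize;
    fdr->dest_count = 1;
    fdr->info[0].dest_fd = fileno(dest);

    /* Dedupe each third of the file in order, at matching offsets. */
    for (size_t i = 0; i < 3; i++) {
        if (ioctl(fileno(src), FIDEDUPERANGE, fdr) != 0)
            perror("FIDEDUPERANGE");
        else if (fdr->info[0].status != FILE_DEDUPE_RANGE_SAME)
            fprintf(stderr, "range %zu not deduped (status %d)\n",
                    i, (int)fdr->info[0].status);
        fdr->src_offset += 2 * bsize;
        fdr->info[0].dest_offset += 2 * bsize;
    }
    fflush(dest);
    fsync(fileno(dest));
    free(fdr);
    fclose(src);
    fclose(dest);
    return 0;
}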
* Re: FIDEDUPERANGE and compression
From: Zygo Blaxell @ 2022-02-21 6:37 UTC (permalink / raw)
To: ov 2k; +Cc: linux-btrfs
On Fri, Feb 18, 2022 at 10:14:20PM -0500, ov 2k wrote:
> FIDEDUPERANGE does not seem to behave as expected with compressible
> data on a btrfs volume with compression enabled, at least with small
> adjacent FIDEDUPERANGE requests. I've attached a basic test case. It
> writes two short identical files and calls FIDEDUPERANGE three times,
> on the thirds of the file, in order. filefrag -v reports that the
> destination file has three extents that each reference the first third
> of the source file.
>
> To be clear, the data in the destination file remains correct.
> However, the second and third FIDEDUPERANGE calls do not seem to cause
> the destination file to reference the expected source extents. I'm
> not actually certain whether this is a bug in FIDEDUPERANGE or
> FS_IOC_FIEMAP or something deeper within btrfs itself.
FIEMAP's output cannot correctly represent btrfs compressed data.
In some cases you may be able to identify logical blocks as belonging
to the same underlying compressed extent, but not with enough precision
to infer data content of the blocks.
The physical location of a compressed byte is a two-dimensional
quantity--one to identify the physical compressed extent, one to identify
the byte's offset within the decompressed data. The length is similarly
two-dimensional, one for the physical size and one for the logical size.
Since compressed bytes are a different size unit than uncompressed bytes,
we can't add a compressed offset or length to a physical position and
get a number that isn't garbage, so we can't fill in distinct values
for physical location of compressed data blocks that make numerical sense.
Try 'btrfs-search-metadata file' (from the python-btrfs package) for
an accurate description of what's going on with the extent references.
It uses TREE_SEARCH_V2 and the underlying btrfs file extent reference
structure, which has the fields that FIEMAP is missing.
Underneath, the compressed extent is an immutable contiguous region of
storage, identified by the bytenr (virtual address) of the first byte
of the storage. Each reference to the extent in the file refers to a
contiguous range of the extent's logical blocks (after decompression).
The fields are, in no particular order:
1. the logical offset within the file (seek offset) where
the referenced data appears in the file
2. the extent bytenr (extent identifier for reference counting
and backref search, first physical byte of the extent)
3. the logical length of the referenced data (the portion
of the compressed data referenced at this offset in the file)
4. the logical offset within the extent where the referenced
data begins (after decompressing the extent, where to start
reading the data in memory)
5. the physical (compressed) length of the complete extent data
(how many bytes are used in physical storage)
6. the logical (decompressed) length of the complete extent data
(how much RAM is required to decompress the extent)
Only the first three of these fields are available via FIEMAP. FIEMAP
provides only one length field, so it can't handle compressed extents
which have two distinct lengths. FIEMAP provides only one integer for
physical position, so it can't handle references to blocks that are
not the first block in a compressed extent.
TREE_SEARCH_V2 provides all six fields, so you can get accurate logical or
physical extent boundary information as needed.
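As a rough illustration of the six fields and of what is lost when they are squeezed into FIEMAP's three, here is a minimal sketch in C. The struct and function names are made up for this example; they are not the kernel's or python-btrfs's names, and the projection only models the compressed-extent behaviour described above.

```c
#include <stdint.h>

/* One file extent reference, using the six fields listed above.
 * Field names are illustrative, not the kernel's. */
struct extent_ref {
    uint64_t file_offset;  /* #1: seek offset in the file */
    uint64_t bytenr;       /* #2: first physical byte of the extent */
    uint64_t ref_length;   /* #3: logical length of the referenced data */
    uint64_t ref_offset;   /* #4: logical offset within the decompressed extent */
    uint64_t disk_length;  /* #5: compressed length of the whole extent */
    uint64_t ram_length;   /* #6: decompressed length of the whole extent */
};

/* The three values FIEMAP has room for. */
struct fiemap_view {
    uint64_t logical;   /* taken from #1 */
    uint64_t physical;  /* taken from #2 */
    uint64_t length;    /* taken from #3 */
};

/* Project a reference to a compressed extent onto the FIEMAP fields:
 * #4, #5 and #6 are simply dropped. */
static struct fiemap_view fiemap_project(const struct extent_ref *r)
{
    struct fiemap_view v = { r->file_offset, r->bytenr, r->ref_length };
    return v;
}
```

Two references into the same compressed extent that differ only in #4 come out of fiemap_project() with the same physical value, which is the effect seen in the test case.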
In simple write() cases, the offset fields are zero, so FIEMAP appears to
work at first:
1. seek offset is some number, FIEMAP returns that number
2. extent bytenr is the FIEMAP physical start of extent
3. logical length of the referenced data (#3) is the same as
the logical decompressed length (#6). FIEMAP gives #3.
This value will change if the extent is partially overwritten
in the file.
4. logical offset within the extent is 0, since the extent
was created for exactly this file data reference
5. physical length of the compressed extent isn't reported in
FIEMAP. Tools like 'filefrag -v' which try to compute extent
boundary adjacency won't work--they will use the length in #3
when they should use field #2 + #5 to compute physical extent
end boundaries.
6. logical length of the compressed extent is the same as #3.
This value never changes until the extent is destroyed.
In the test case, FIEMAP reports the same number at #2 for all extents
since the same physical extent is referenced, but the referenced data
location is actually a function of fields #2 and #4. The second and
third extents have non-zero offsets for #4, and the length at #3 becomes
different from the length at #6, making any computed values based on
these fields nonsense.
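To make the test case concrete, the sketch below builds the three references that the dest file would hold after the three FIDEDUPERANGE calls. It assumes the source data was written as a single compressed extent of 6 * bsize decompressed bytes; the bytenr and compressed length are placeholders passed in by the caller, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names for the six fields; not the kernel's structures. */
struct extent_ref {
    uint64_t file_offset;  /* #1 */
    uint64_t bytenr;       /* #2 */
    uint64_t ref_length;   /* #3 */
    uint64_t ref_offset;   /* #4 */
    uint64_t disk_length;  /* #5 */
    uint64_t ram_length;   /* #6 */
};

/* Reference for third i (0, 1 or 2) of the test file, assuming one
 * compressed extent holding 6 * bsize decompressed bytes.
 * bytenr and disk_length are placeholder values from the caller. */
static struct extent_ref test_case_third(uint64_t bsize, unsigned i,
                                         uint64_t bytenr, uint64_t disk_length)
{
    struct extent_ref r;
    r.file_offset = 2 * bsize * i;
    r.bytenr = bytenr;              /* same extent for all three thirds */
    r.ref_length = 2 * bsize;
    r.ref_offset = 2 * bsize * i;   /* non-zero for the 2nd and 3rd third */
    r.disk_length = disk_length;
    r.ram_length = 6 * bsize;
    return r;
}

/* True only in the simple write() case, where FIEMAP appears to work:
 * the reference starts at offset 0 and covers the whole extent. */
static bool ref_covers_whole_extent(const struct extent_ref *r)
{
    return r->ref_offset == 0 && r->ref_length == r->ram_length;
}
```

None of the three references passes ref_covers_whole_extent(), so none of them is in the case where the FIEMAP numbers can be taken at face value.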
* Re: FIDEDUPERANGE and compression
From: ov2k @ 2022-02-21 22:31 UTC (permalink / raw)
To: linux-btrfs
It looks like btrfs coalesces adjacent uncompressed extents. I'm not
sure whether this is done by FIDEDUPERANGE or FS_IOC_FIEMAP. I think
the problem is that adjacent decompressed ranges (defined by #3 and
#4) within the same compressed block are not coalesced in a similar
manner. Is there a particular reason why this isn't done, or is this
simply a case of nobody having done it?
* Re: FIDEDUPERANGE and compression
From: Zygo Blaxell @ 2022-03-06 4:44 UTC (permalink / raw)
To: ov2k; +Cc: linux-btrfs
On Mon, Feb 21, 2022 at 05:31:13PM -0500, ov2k wrote:
> It looks like btrfs coalesces adjacent uncompressed extents. I'm not
> sure whether this is done by FIDEDUPERANGE or FS_IOC_FIEMAP. I think
> the problem is that adjacent decompressed ranges (defined by #3 and
> #4) within the same compressed block are not coalesced in a similar
> manner. Is there a particular reason why this isn't done, or is this
> simply a case of nobody having done it?
It hasn't been done because FIEMAP can't produce results for compressed
extents that aren't nonsense. The interface can't cope with compressed
data.
Adjacent compressed extents occur when all of the following are true:
first extent #4 (decompressed start offset) + #3 (decompressed
logical length) == #6 (end of decompressed extent)
second extent #4 (decompressed start offset) == 0 (beginning
of decompressed extent)
first extent #2 (physical start offset) + #5 (physical compressed
length) == second extent #2 (physical start offset)
FIEMAP doesn't have access to #5, so it can't evaluate that condition
(and neither can anything that uses FIEMAP).
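Written out as code, the three conditions look roughly like this. This is a sketch only: the struct and its field names are invented for illustration, and the fields are matched to the conditions by their meaning in the earlier six-field list (start offset within the extent, referenced length, compressed length, decompressed length).

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names for the six fields; not the kernel's structures. */
struct extent_ref {
    uint64_t file_offset;  /* #1: seek offset in the file */
    uint64_t bytenr;       /* #2: physical start of the extent */
    uint64_t ref_length;   /* #3: logical length of the referenced data */
    uint64_t ref_offset;   /* #4: logical start offset within the extent */
    uint64_t disk_length;  /* #5: physical (compressed) length */
    uint64_t ram_length;   /* #6: logical (decompressed) length */
};

/* True when 'second' starts physically right after 'first' ends and
 * the two references meet at the extent boundary. */
static bool physically_adjacent(const struct extent_ref *first,
                                const struct extent_ref *second)
{
    /* first reference runs to the end of its decompressed extent */
    if (first->ref_offset + first->ref_length != first->ram_length)
        return false;
    /* second reference starts at the beginning of its extent */
    if (second->ref_offset != 0)
        return false;
    /* the compressed extents abut on disk; this needs #5 */
    return first->bytenr + first->disk_length == second->bytenr;
}
```

The last comparison is the one that cannot be made from FIEMAP output alone, because disk_length (#5) is not among the values FIEMAP returns.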
Suppose you have two adjacent extents, 128K and 96K that are compressed
to 64K and 48K respectively. They start at physical block 10000 at
offset 0 in the file. Then:
Extent 1 starts at physical 10000 and ends at 10063.
Extent 1 starts at logical offset 0 and ends at 127.
Extent 2 starts at physical 10064 and ends at 10111.
Extent 2 starts at logical offset 128 and ends at 223.
FIEMAP reports:
extent 1 physical 10000 offset 0 length 128
extent 2 physical 10064 offset 128 length 96
How would you be able to determine from this information that these
extents are physically adjacent and contiguous?
Let's add extents 3 and 4:
Extent 3 starts at physical 10112 and ends at 10127.
Extent 3 starts at logical offset 224 and ends at 239.
Extent 4 starts at physical 10128 and ends at 10143.
Extent 4 starts at logical offset 240 and ends at 255.
FIEMAP reports:
extent 1 physical 10000 offset 0 length 128
extent 2 physical 10064 offset 128 length 96
extent 3 physical 10112 offset 224 length 16
extent 4 physical 10128 offset 240 length 16
How would you be able to determine extents 1 and 4 are _not_ physically
adjacent?
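To spell out why both questions are unanswerable, here is a small sketch of the only adjacency test a FIEMAP-only tool can perform: physical start plus reported length. The struct and function names are hypothetical, and this is not taken from filefrag's source.

```c
#include <stdbool.h>
#include <stdint.h>

/* The three values FIEMAP reports per extent (names are illustrative). */
struct fiemap_view {
    uint64_t logical;   /* offset in the file */
    uint64_t physical;  /* physical start of the extent */
    uint64_t length;    /* logical length, not the compressed length */
};

/* Naive check: does b start where a appears to end?  For compressed
 * extents this adds a logical length to a physical address, so the
 * answer is unreliable in both directions. */
static bool naive_adjacent(const struct fiemap_view *a,
                           const struct fiemap_view *b)
{
    return a->physical + a->length == b->physical;
}
```

With the numbers above, extent 1 appears to end at 10000 + 128 = 10128. The check therefore says extents 1 and 2 are not adjacent (10128 != 10064) although they are, and that extents 1 and 4 are adjacent (10128 == 10128) although they are not.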
* Re: FIDEDUPERANGE and compression
From: ov2k @ 2022-03-09 20:04 UTC (permalink / raw)
To: linux-btrfs
On Sat, Mar 5, 2022 at 11:44 PM Zygo Blaxell
<ce3g8jdj@umail.furryterror.org> wrote:
>
> On Mon, Feb 21, 2022 at 05:31:13PM -0500, ov2k wrote:
> > It looks like btrfs coalesces adjacent uncompressed extents. I'm not
> > sure whether this is done by FIDEDUPERANGE or FS_IOC_FIEMAP. I think
> > the problem is that adjacent decompressed ranges (defined by #3 and
> > #4) within the same compressed block are not coalesced in a similar
> > manner. Is there a particular reason why this isn't done, or is this
> > simply a case of nobody having done it?
>
> It hasn't been done because FIEMAP can't produce results for compressed
> extents that aren't nonsense. The interface can't cope with compressed
> data.
>
I think there's a misunderstanding here. The issue isn't making FS_IOC_FIEMAP
represent compressed data sensibly. The goal is for btrfs_fiemap() to handle
adjacent subranges of a compressed extent in much the same way as it handles
adjacent uncompressed extents. The result should be no more or less
nonsensical than it already is.
> Adjacent compressed extents occur when all of the following are true:
>
> first extent #4 (decompressed start offset) + #3 (decompressed
> logical length) == #6 (end of decompressed extent)
>
> second extent #4 (decompressed start offset) == 0 (beginning
> of decompressed extent)
>
> first extent #2 (physical start offset) + #5 (physical compressed
> length) == second extent #2 (physical start offset)
>
> FIEMAP doesn't have access to #5, so it can't evaluate that condition
> (and neither can anything that uses FIEMAP).
>
This is incorrect. btrfs_fiemap() has access to #5. I believe that's the
disk_num_bytes member of struct btrfs_file_extent_item. Extent merging is handled
by emit_fiemap_extent(). However, it looks like a lot of the information
needed to merge adjacent subranges of a compressed extent is discarded before
emit_fiemap_extent() is called.
> Suppose you have two adjacent extents, 128K and 96K that are compressed
> to 64K and 48K respectively. They start at physical block 10000 at
> offset 0 in the file. Then:
>
> Extent 1 starts at physical 10000 and ends at 10063.
> Extent 1 starts at logical offset 0 and ends at 127.
>
> Extent 2 starts at physical 10064 and ends at 10111.
> Extent 2 starts at logical offset 128 and ends at 223.
>
> FIEMAP reports:
>
> extent 1 physical 10000 offset 0 length 128
> extent 2 physical 10064 offset 128 length 96
>
> How would you be able to determine from this information that these
> extents are physically adjacent and contiguous?
>
> Let's add extents 3 and 4:
>
> Extent 3 starts at physical 10112 and ends at 10127.
> Extent 3 starts at logical offset 224 and ends at 239.
>
> Extent 4 starts at physical 10128 and ends at 10143.
> Extent 4 starts at logical offset 240 and ends at 255.
>
> FIEMAP reports:
>
> extent 1 physical 10000 offset 0 length 128
> extent 2 physical 10064 offset 128 length 96
> extent 3 physical 10112 offset 224 length 16
> extent 4 physical 10128 offset 240 length 16
>
> How would you be able to determine extents 1 and 4 are _not_ physically
> adjacent?
>
I'm talking about emitting a single struct fiemap_extent that corresponds to
two adjacent subranges of the same compressed btrfs extent. The two btrfs
extents would simply have to satisfy:
extent 1 #2 (bytenr) == extent 2 #2 (bytenr)
extent 1 #1 (seek offset) + extent 1 #3 (decompressed subrange length)
== extent 2 #1 (seek offset)
extent 1 #4 (decompressed subrange offset) + extent 1 #3 (decompressed
subrange length) == extent 2 #4 (decompressed subrange offset)
The resulting struct fiemap_extent would have:
fe_logical: extent 1 #1 (seek offset)
fe_physical: extent 1 #2 (bytenr)
fe_length: extent 1 #3 (decompressed subrange length) + extent 2 #3
(decompressed subrange length)
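In code, the merge I have in mind would look something like the sketch below. The names are hypothetical and this is not the actual emit_fiemap_extent() logic; it only shows the three conditions and the resulting fiemap_extent-like values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names for the six fields; not the kernel's structures. */
struct extent_ref {
    uint64_t file_offset;  /* #1: seek offset */
    uint64_t bytenr;       /* #2: extent bytenr */
    uint64_t ref_length;   /* #3: decompressed subrange length */
    uint64_t ref_offset;   /* #4: decompressed subrange offset */
    uint64_t disk_length;  /* #5 */
    uint64_t ram_length;   /* #6 */
};

/* Stand-in for the three relevant members of struct fiemap_extent. */
struct merged_extent {
    uint64_t fe_logical;
    uint64_t fe_physical;
    uint64_t fe_length;
};

/* If b continues a within the same compressed extent, fill *out with
 * the single merged record and return true; otherwise leave *out
 * untouched and return false. */
static bool merge_subranges(const struct extent_ref *a,
                            const struct extent_ref *b,
                            struct merged_extent *out)
{
    if (a->bytenr != b->bytenr)
        return false;  /* different extents */
    if (a->file_offset + a->ref_length != b->file_offset)
        return false;  /* not contiguous in the file */
    if (a->ref_offset + a->ref_length != b->ref_offset)
        return false;  /* not contiguous within the extent */
    out->fe_logical = a->file_offset;
    out->fe_physical = a->bytenr;
    out->fe_length = a->ref_length + b->ref_length;
    return true;
}
```

Applied pairwise to the three thirds in my test case, this would collapse them into one record covering the whole file, which is what filefrag reports for the source file.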
* Re: FIDEDUPERANGE and compression
From: Zygo Blaxell @ 2022-03-12 2:47 UTC (permalink / raw)
To: ov2k; +Cc: linux-btrfs
On Wed, Mar 09, 2022 at 03:04:40PM -0500, ov2k wrote:
> On Sat, Mar 5, 2022 at 11:44 PM Zygo Blaxell
> <ce3g8jdj@umail.furryterror.org> wrote:
> >
> > On Mon, Feb 21, 2022 at 05:31:13PM -0500, ov2k wrote:
> > > It looks like btrfs coalesces adjacent uncompressed extents. I'm not
> > > sure whether this is done by FIDEDUPERANGE or FS_IOC_FIEMAP. I think
> > > the problem is that adjacent decompressed ranges (defined by #3 and
> > > #4) within the same compressed block are not coalesced in a similar
> > > manner. Is there a particular reason why this isn't done, or is this
> > > simply a case of nobody having done it?
> >
> > It hasn't been done because FIEMAP can't produce results for compressed
> > extents that aren't nonsense. The interface can't cope with compressed
> > data.
> >
>
> I think there's a misunderstanding here. The issue isn't making FS_IOC_FIEMAP
> represent compressed data sensibly. The goal is for btrfs_fiemap() to handle
> adjacent subranges of a compressed extent in much the same way as it handles
> adjacent uncompressed extents. The result should be no more or less
> nonsensical than it already is.
[...]
> I'm talking about emitting a single struct fiemap_extent that corresponds to
> two adjacent subranges of the same compressed btrfs extent. The two btrfs
> extents would simply have to satisfy:
>
> extent 1 #2 (bytenr) == extent 2 #2 (bytenr)
>
> extent 1 #1 (seek offset) + extent 1 #3 (decompressed subrange length)
> == extent 2 #1 (seek offset)
>
> extent 1 #4 (decompressed subrange offset) + extent 1 #3 (decompressed
> subrange length) == extent 2 #4 (decompressed subrange offset)
>
> The resulting struct fiemap_extent would have:
>
> fe_logical: extent 1 #1 (seek offset)
>
> fe_physical: extent 1 #2 (bytenr)
>
> fe_length: extent 1 #3 (decompressed subrange length) + extent 2 #3
> (decompressed subrange length)
OK, FIEMAP could handle that one special case. And it is a frequently
requested feature--filefrag's physically-contiguous-extent counter report
doesn't work at all on compressed files, and it could work in the common
case of a simple sequential write (or reflink thereof).
On the other hand, if you're trying to do dedupe on btrfs, you'll need
access to all the other extent fields to avoid bookending issues.