linux-btrfs.vger.kernel.org archive mirror
* Extents for a particular subvolume
From: Graham Cobb @ 2016-08-03 19:56 UTC
  To: linux-btrfs

Are there any btrfs commands (or APIs) to allow a script to create a
list of all the extents referred to within a particular (mounted)
subvolume?  And is it a reasonably efficient process (i.e. doesn't
involve backrefs and, preferably, doesn't involve following directory
trees)?

I am not looking to relate the extents to files/inodes/paths.  My
particular need, at the moment, is to work out how much of two snapshots
is shared data, but I can think of other uses for the information.

Graham


* Re: Extents for a particular subvolume
From: Adam Borowski @ 2016-08-03 20:37 UTC
  To: Graham Cobb; +Cc: linux-btrfs

On Wed, Aug 03, 2016 at 08:56:01PM +0100, Graham Cobb wrote:
> Are there any btrfs commands (or APIs) to allow a script to create a
> list of all the extents referred to within a particular (mounted)
> subvolume?  And is it a reasonably efficient process (i.e. doesn't
> involve backrefs and, preferably, doesn't involve following directory
> trees)?

Since the size of your output is linear in the number of extents, which is
somewhere between the number of files and the sum of their sizes, I see no
gain in trying to avoid following the directory tree.

And that can be done with FIEMAP on any filesystem, not just btrfs.
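
As a rough illustration of the FIEMAP route: the filefrag tool queries FIEMAP, and its "-v" output can be parsed from a script. The sketch below is only an assumption-laden example, not part of any btrfs tool. The function name parse_filefrag, the regular expressions, and the SAMPLE text (including its path) are made up for illustration, and the row layout is assumed to match what "filefrag -v" typically prints.

```python
import re

# One extent row of `filefrag -v` output (assumed layout), e.g.
#    0:        0..     255:      34816..     35071:    256:
_ROW = re.compile(
    r"^\s*\d+:\s*(\d+)\.\.\s*(\d+):\s*(\d+)\.\.\s*(\d+):\s*(\d+):"
)
# Header line that reports the block size, e.g. "(257 blocks of 4096 bytes)"
_BLOCK = re.compile(r"blocks? of (\d+) bytes")


def parse_filefrag(text):
    """Return a list of (physical_start_byte, length_bytes) tuples."""
    block_size = 4096  # assumed fallback if no block-size header is seen
    extents = []
    for line in text.splitlines():
        m = _BLOCK.search(line)
        if m:
            block_size = int(m.group(1))
            continue
        m = _ROW.match(line)
        if m:
            phys_start = int(m.group(3))  # first physical block of the extent
            length = int(m.group(5))      # extent length in blocks
            extents.append((phys_start * block_size, length * block_size))
    return extents


# Illustrative sample only; real output depends on the file and filesystem.
SAMPLE = """\
Filesystem type is: 9123683e
File size of /mnt/snap/a is 1052672 (257 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..     255:      34816..     35071:    256:
   1:      256..     256:      40000..     40000:      1:      35072: last,eof
/mnt/snap/a: 2 extents found
"""

if __name__ == "__main__":
    for start, length in parse_filefrag(SAMPLE):
        print(start, length)
```

In practice the text would come from running "filefrag -v" on each file found while walking the tree, so this is no cheaper than following the directories.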

> I am not looking to relate the extents to files/inodes/paths.  My
> particular need, at the moment, is to work out how much of two snapshots
> is shared data, but I can think of other uses for the information.

Thus, unlike the question you asked above, you're not interested in _all_
extents, merely those which changed.

You may want to look at "btrfs subv find-new" and "btrfs send --no-data".


Meow!
-- 
An imaginary friend squared is a real enemy.


* Re: Extents for a particular subvolume
From: Graham Cobb @ 2016-08-03 21:55 UTC
  To: linux-btrfs

On 03/08/16 21:37, Adam Borowski wrote:
> On Wed, Aug 03, 2016 at 08:56:01PM +0100, Graham Cobb wrote:
>> Are there any btrfs commands (or APIs) to allow a script to create a
>> list of all the extents referred to within a particular (mounted)
>> subvolume?  And is it a reasonably efficient process (i.e. doesn't
>> involve backrefs and, preferably, doesn't involve following directory
>> trees)?
> 
> Since the size of your output is linear in the number of extents, which is
> somewhere between the number of files and the sum of their sizes, I see no
> gain in trying to avoid following the directory tree.

Thanks for the help, Adam.  There are a lot of files and a lot of
directories - find, "ls -R" and similar operations take a very long
time. I was hoping that I could query some sort of extent tree for the
subvolume and get the answer back in seconds instead of multiple minutes.

But I can follow the directory tree if I need to.

>> I am not looking to relate the extents to files/inodes/paths.  My
>> particular need, at the moment, is to work out how much of two snapshots
>> is shared data, but I can think of other uses for the information.
> 
> Thus, unlike the question you asked above, you're not interested in _all_
> extents, merely those which changed.
> 
> You may want to look at "btrfs subv find-new" and "btrfs send --no-data".

Unfortunately, the subvolumes do not have an ancestor-descendent
relationship (although they do have some common ancestors), so I don't
think find-new is much help (as far as I can see).

But just looking at the size of the output from "send -c" would work
well enough for the particular problem I am trying to solve tonight!
Although I will need to take read-only snapshots of the subvolumes to
allow send to work. Thanks for the suggestion.

I would still be interested in the extent list, though.  The main
problem with find-new and send is that they don't tell me how much has
been deleted, only added.  I am thinking about using the extents to get
a much better handle on what is using up space and what I could recover
if I removed (or moved to another volume) various groups of related
subvolumes.

Thanks again for the help.


* Re: Extents for a particular subvolume
From: Austin S. Hemmelgarn @ 2016-08-04 11:34 UTC
  To: Graham Cobb, linux-btrfs

On 2016-08-03 17:55, Graham Cobb wrote:
> On 03/08/16 21:37, Adam Borowski wrote:
>> On Wed, Aug 03, 2016 at 08:56:01PM +0100, Graham Cobb wrote:
>>> Are there any btrfs commands (or APIs) to allow a script to create a
>>> list of all the extents referred to within a particular (mounted)
>>> subvolume?  And is it a reasonably efficient process (i.e. doesn't
>>> involve backrefs and, preferably, doesn't involve following directory
>>> trees)?
>>
>> Since the size of your output is linear in the number of extents, which is
>> somewhere between the number of files and the sum of their sizes, I see no
>> gain in trying to avoid following the directory tree.
>
> Thanks for the help, Adam.  There are a lot of files and a lot of
> directories - find, "ls -R" and similar operations take a very long
> time. I was hoping that I could query some sort of extent tree for the
> subvolume and get the answer back in seconds instead of multiple minutes.
>
> But I can follow the directory tree if I need to.
>
>>> I am not looking to relate the extents to files/inodes/paths.  My
>>> particular need, at the moment, is to work out how much of two snapshots
>>> is shared data, but I can think of other uses for the information.
>>
>> Thus, unlike the question you asked above, you're not interested in _all_
>> extents, merely those which changed.
>>
>> You may want to look at "btrfs subv find-new" and "btrfs send --no-data".
>
> Unfortunately, the subvolumes do not have an ancestor-descendent
> relationship (although they do have some common ancestors), so I don't
> think find-new is much help (as far as I can see).
>
> But just looking at the size of the output from "send -c" would work
> well enough for the particular problem I am trying to solve tonight!
> Although I will need to take read-only snapshots of the subvolumes to
> allow send to work. Thanks for the suggestion.
FWIW, if you're not using any files in the subvolumes, you can run:
btrfs property set <subvolume> ro true

to mark them read-only so you don't need the snapshots, and then run the 
same command with 'false' at the end instead of true to mark them 
writable again.
>
> I would still be interested in the extent list, though.  The main
> problem with find-new and send is that they don't tell me how much has
> been deleted, only added.  I am thinking about using the extents to get
> a much better handle on what is using up space and what I could recover
> if I removed (or moved to another volume) various groups of related
> subvolumes.
You may want to look into the 'btrfs filesystem usage' and 'btrfs filesystem
du' commands.  I'm not sure if they'll cover what you need, but they can
show info about how much is shared.



* Re: Extents for a particular subvolume
From: Graham Cobb @ 2016-08-15 19:18 UTC
  To: linux-btrfs

On 03/08/16 22:55, Graham Cobb wrote:
> On 03/08/16 21:37, Adam Borowski wrote:
>> On Wed, Aug 03, 2016 at 08:56:01PM +0100, Graham Cobb wrote:
>>> Are there any btrfs commands (or APIs) to allow a script to create a
>>> list of all the extents referred to within a particular (mounted)
>>> subvolume?  And is it a reasonably efficient process (i.e. doesn't
>>> involve backrefs and, preferably, doesn't involve following directory
>>> trees)?

In case anyone else is interested in this, I ended up creating some
simple scripts to allow me to do this.  They are slow because they walk
the directory tree and they use filefrag to get the extent data, but
they do let me answer questions like:

* How much space am I wasting by keeping historical snapshots?
* How much data is being shared between two subvolumes?
* How much of the data in my latest snapshot is unique to that snapshot?
* How much data would I actually free up if I removed (just) these
particular subvolumes?

If they are useful to anyone else you can find them at:

https://github.com/GrahamCobb/extents-lists

If anyone knows of more efficient ways to get this information please
let me know. And, of course, feel free to suggest improvements/bugfixes!
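
For readers who want the general idea without reading the scripts, here is a minimal sketch of how the questions above could be answered once each subvolume's extents are available as (start, length) byte ranges. It is not taken from the extents-lists repository. The function names and the example numbers are hypothetical, and it assumes that two subvolumes share data exactly where their physical ranges overlap.

```python
def merge(extents):
    """Merge (start, length) extents into sorted, non-overlapping (start, end) ranges."""
    ranges = sorted((s, s + n) for s, n in extents if n > 0)
    merged = []
    for start, end in ranges:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def total(ranges):
    """Total bytes covered by a list of merged (start, end) ranges."""
    return sum(end - start for start, end in ranges)


def shared_bytes(a, b):
    """Bytes covered by both merged range lists, using a two-pointer sweep."""
    i = j = shared = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            shared += hi - lo
        # Advance whichever range ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


# Hypothetical extent lists for two snapshots (byte offsets and lengths).
snap_a = merge([(0, 100), (50, 100), (300, 50)])
snap_b = merge([(100, 100), (320, 100)])

shared = shared_bytes(snap_a, snap_b)
unique_a = total(snap_a) - shared

if __name__ == "__main__":
    print("shared:", shared, "unique to A:", unique_a)
```

Merging first means a file that is reflinked several times within one subvolume is only counted once, which matters when estimating how much space removing that subvolume would free.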


