* counting fragments takes more time than defragmenting
@ 2015-06-04 8:42 Marc MERLIN
2015-06-24 3:20 ` Marc MERLIN
0 siblings, 1 reply; 13+ messages in thread
From: Marc MERLIN @ 2015-06-04 8:42 UTC (permalink / raw)
To: Chris Mason, linux-btrfs
Hi Chris,
After our quick chat, I gave it a shot on 3.19.6, and things are better
than last time I tried.
legolas:/var/local/nobck/VirtualBox VMs# lsattr Win7/
---------------C Win7/Logs
---------------C Win7/Snapshots
---------------C Win7/Win7.vdi
---------------C Win7/Win7.png
---------------C Win7/autotune1.png
---------------C Win7/new_autotune2.png
---------------C Win7/Win7.vbox-prev
---------------C Win7/Win7.vbox
But I have snapshots of that subvolume, so obviously that gets
in the way of disabling COW.
I had a look, and I have 100K fragments. That took 10mn to figure out:
legolas:/var/local/nobck/VirtualBox VMs/Win7# filefrag Win7.vdi
Win7.vdi: 104306 extents found
This first filefrag run took about 10mn to count all the fragments on my
SSD. That feels a bit slow, but maybe the userland tool is doing things
in suboptimal ways.
Defrag actually worked (mostly) and wasn't too slow. It used to take hours
not to finish, and now it worked in 3mn:
legolas:/var/local/nobck/VirtualBox VMs/Win7# time btrfs fi defrag Win7.vdi
real 3m43.807s
user 0m0.000s
sys 0m44.044s
This is definitely better than before.
Note that it's not fully defragged, but close enough. Each subsequent
run, filefrag is faster, and defrag is still faster than filefrag:
legolas:/var/local/nobck/VirtualBox VMs/Win7# time filefrag Win7.vdi
Win7.vdi: 11428 extents found
real 2m42.090s
user 0m0.000s
sys 2m37.308s
legolas:/var/local/nobck/VirtualBox VMs/Win7# time btrfs fi defrag Win7.vdi
real 0m7.483s
user 0m0.000s
sys 0m2.672s
legolas:/var/local/nobck/VirtualBox VMs/Win7# time filefrag Win7.vdi
Win7.vdi: 11132 extents found
real 0m22.525s
user 0m0.000s
sys 0m22.264s
It's a bit unexpected that I still have 10k fragments after 2 defrag
runs, but it's better than 100k :)
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
* Re: counting fragments takes more time than defragmenting
2015-06-04 8:42 counting fragments takes more time than defragmenting Marc MERLIN
@ 2015-06-24 3:20 ` Marc MERLIN
2015-06-24 8:28 ` Patrik Lundquist
0 siblings, 1 reply; 13+ messages in thread
From: Marc MERLIN @ 2015-06-24 3:20 UTC (permalink / raw)
To: Chris Mason, linux-btrfs
Hello again,
Just curious, is anyone seeing similar things with big VM images or other
DBs?
I forgot to mention that my vdi file is 88GB.
It's surprising that it took longer to count the fragments than to actually
defragment the file.
Or that it took 3 defrag runs to get down to 11K extents from 104K.
Are others seeing similar things?
Marc
* Re: counting fragments takes more time than defragmenting
2015-06-24 3:20 ` Marc MERLIN
@ 2015-06-24 8:28 ` Patrik Lundquist
2015-06-24 10:46 ` Duncan
0 siblings, 1 reply; 13+ messages in thread
From: Patrik Lundquist @ 2015-06-24 8:28 UTC (permalink / raw)
To: Marc MERLIN; +Cc: Chris Mason, linux-btrfs
On 24 June 2015 at 05:20, Marc MERLIN <marc@merlins.org> wrote:
>
> Hello again,
>
> Just curious, is anyone seeing similar things with big VM images or other
> DBs?
> I forgot to mention that my vdi file is 88GB.
>
> It's surprising that it took longer to count the fragments than to actually
> defragment the file.
> Or that it took 3 defrag runs to get down to 11K extents from 104K.
>
> Are others seeing similar things?
Filefrag is pretty much instant for my 30GB (150 extents) virtual
disk, no CoW on file, no snapshots on volume.
But what doesn't make sense to me is btrfs fi defrag; the -t option says
-t <size>
defragment only files at least <size> bytes big
The -t value goes into struct
btrfs_ioctl_defrag_range_args.extent_thresh which is documented as
/*
* any extent bigger than this will be considered
* already defragged. Use 0 to take the kernel default
* Use 1 to say every single extent must be rewritten
*/
Default extent_thresh is 256K. I can't see how 1 would say every
single extent must be rewritten. On the contrary; 1 skips every
extent. The compress flag even sets extent_thresh=(u32)-1 to force a
rewrite.
Marc, try btrfs fi defrag -t 4294967295 Win7.vdi for maximum defrag
and time filefrag again with fewer extents.
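The rule in that kernel comment can be modelled in a few lines. This is only a sketch of the documented behaviour, not the kernel's code: the name `needs_rewrite`, the sample extent sizes, and the exact comparison operator are assumptions made for illustration. Only the 256K default and the (u32)-1 value come from the discussion above.

```python
DEFAULT_THRESH = 256 * 1024   # kernel default mentioned above (256K)
U32_MAX = 2**32 - 1           # (u32)-1, the value the compress flag is said to use


def needs_rewrite(extent_len, extent_thresh=0):
    """Model of the quoted comment: an extent bigger than the threshold
    is treated as already defragged and is skipped (assumed comparison)."""
    if extent_thresh == 0:    # 0 means "take the kernel default"
        extent_thresh = DEFAULT_THRESH
    return extent_len <= extent_thresh


# Hypothetical extent sizes: 4 KiB, 128 KiB, 1 MiB, 512 MiB
sample_extents = [4096, 128 * 1024, 1024 * 1024, 512 * 1024 * 1024]

for thresh in (0, 1, U32_MAX):
    selected = [e for e in sample_extents if needs_rewrite(e, thresh)]
    print(f"extent_thresh={thresh}: {len(selected)} of {len(sample_extents)} extents rewritten")
```

Under this model a threshold of 1 selects nothing and a threshold of (u32)-1 selects everything, which is the opposite of what the "Use 1" line in the comment claims.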
/Patrik
* Re: counting fragments takes more time than defragmenting
2015-06-24 8:28 ` Patrik Lundquist
@ 2015-06-24 10:46 ` Duncan
2015-06-24 12:05 ` Patrik Lundquist
2015-07-14 11:57 ` Patrik Lundquist
0 siblings, 2 replies; 13+ messages in thread
From: Duncan @ 2015-06-24 10:46 UTC (permalink / raw)
To: linux-btrfs
Patrik Lundquist posted on Wed, 24 Jun 2015 10:28:09 +0200 as excerpted:
> But what doesn't make sense to me is btrfs fi defrag; the -t option says
>
> -t <size>
> defragment only files at least <size> bytes big
>
> The -t value goes into struct
> btrfs_ioctl_defrag_range_args.extent_thresh which is documented as
>
> /*
> * any extent bigger than this will be considered
> * already defragged. Use 0 to take the kernel default
> * Use 1 to say every single extent must be rewritten
> */
>
> Default extent_thresh is 256K. I can't see how 1 would say every single
> extent must be rewritten. On the contrary; 1 skips every extent. The
> compress flag even sets extent_thresh=(u32)-1 to force a rewrite.
>
> Marc, try btrfs fi defrag -t 4294967295 Win7.vdi for maximum defrag and
> time filefrag again with fewer extents.
The manpage wording for btrfs fi defrag -t has been debated on-list
several times, and I believe remains (as of btrfs-progs v4.1) confusing
still.
First, under the general defragment description, before the individual
options, it says:
>>>> Any extent bigger than threshold given by -t option, will be
>>>> considered already defragged. Use 0 to take the kernel default.
So according to that, an extent BIGGER than -t is treated as already
defragged (it defrags SMALLER extents)
But, under the -t option, it says:
>>>> -t <size>[kKmMgGtTpPeE]
>>>> defragment only files at least <size> bytes big
So according to that, only extents BIGGER than -t are defragged (smaller
is ignored).
Again, that's the btrfs-filesystem (8) manpage as of -progs 4.1.
So which is it? The manpage itself can't make up its mind.
AFAIK, it's set huge to defrag everything, but last time I posted on this
I got it wrong, and I don't remember for sure what I said then, so... try
it and see to be sure, which is what I'd do.
Meanwhile, it's worth noting that btrfs data chunks are normally 1 GiB
(tho apparently they can be bigger under certain circumstances). 1
extent per chunk is the best btrfs normally does, which means 1 GiB per
extent is nominally the best that can be done, with the first and last
extent possibly less than a gig (the first taking up the remainder of a
partially used chunk and the last finishing up the file, which probably
won't end on an even chunk boundary).
Assuming "set a huge -t to defrag to the maximum extent possible" is
correct, that means -t 1G should be exactly as effective as -t 1T...
Regardless of whether 1 or huge -t means maximum defrag, however, the
nominal data chunk size of 1 GiB means that 30 GiB file you mentioned
should be considered ideally defragged at 31 extents. This is a
departure from ext4, which AFAIK in theory has no extent upper limit, so
should be able to do that 30 GiB file in a single extent.
But btrfs or ext4, 31 extents ideal or a single extent ideal, 150 extents
still indicates at least some remaining fragmentation.
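To make the arithmetic behind "31 extents" explicit, here is a rough sketch. It assumes, as above, that no extent can cross a 1 GiB data chunk boundary; the function name and the `first_chunk_free` parameter (free space left in the chunk where the file starts) are made up for the illustration.

```python
GIB = 1024**3


def extents_needed(file_size, first_chunk_free, chunk=GIB):
    """Fewest extents for a file if no extent may cross a chunk boundary.

    first_chunk_free is the space left in the partially used chunk where
    the file starts (hypothetical parameter, pass `chunk` for a fresh chunk).
    """
    first = min(file_size, first_chunk_free)
    rest = file_size - first
    full_and_tail = -(-rest // chunk)          # ceiling division
    return (1 if first else 0) + full_and_tail


# 30 GiB file starting halfway into a partially used chunk
print(extents_needed(30 * GIB, GIB // 2))      # 31
# 30 GiB file starting at the beginning of an empty chunk
print(extents_needed(30 * GIB, GIB))           # 30
```

So 31 is the count when the file starts in a partially used chunk, and 30 when it happens to start on a chunk boundary.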
Finally, last I remember, filefrag didn't understand btrfs compression
(which is off for nocow, so this shouldn't apply there), which uses 128
KiB blocks IIRC. Until it does, large btrfs-compressed files will always
show many extents (8/MiB, so thousands on anything even close to a GiB,
tens of thousands on multiple GiBs). But I believe there had been some
work to teach filefrag about btrfs compression, tho I don't know if it
has made it into an e2fsprogs release, yet. If so, it'll be pretty close
to the latest release. So anything but the latest filefrag won't be
accurate with btrfs-compressed files, while the latest may now be
accurate, or not, I'm not sure. I guess one could check e2fsprogs'
release notes...
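The 8/MiB figure follows directly from the 128 KiB block size recalled above. A quick check, assuming every 128 KiB compressed block is reported as its own extent (the helper name is made up for the illustration):

```python
KIB = 1024
MIB = 1024**2
GIB = 1024**3

COMPRESSED_BLOCK = 128 * KIB   # block size recalled above (IIRC in the original)


def reported_extents(file_size, block=COMPRESSED_BLOCK):
    """Extents reported if each compressed block shows up separately."""
    return -(-file_size // block)              # ceiling division


print(reported_extents(MIB))        # 8
print(reported_extents(GIB))        # 8192
print(reported_extents(10 * GIB))   # 81920
```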
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: counting fragments takes more time than defragmenting
2015-06-24 10:46 ` Duncan
@ 2015-06-24 12:05 ` Patrik Lundquist
2015-06-25 4:01 ` Duncan
2015-07-14 11:57 ` Patrik Lundquist
1 sibling, 1 reply; 13+ messages in thread
From: Patrik Lundquist @ 2015-06-24 12:05 UTC (permalink / raw)
To: linux-btrfs@vger.kernel.org
On 24 June 2015 at 12:46, Duncan <1i5t5.duncan@cox.net> wrote:
> Patrik Lundquist posted on Wed, 24 Jun 2015 10:28:09 +0200 as excerpted:
>
> AFAIK, it's set huge to defrag everything,
It's set to 256K by default.
> Assuming "set a huge -t to defrag to the maximum extent possible" is
> correct, that means -t 1G should be exactly as effective as -t 1T...
1G is actually more effective because 1T overflows the uint32
extent_thresh field, so 1T, 0, and 256K are currently the same.
3G is the largest value that works with -t as expected (disregarding
the man page) and is easy to type.
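The overflow is easy to reproduce outside btrfs-progs. This sketch just truncates a few sizes to 32 bits the way an unchecked assignment to a uint32 field would; it does not use the actual btrfs-progs parsing code, and `to_u32` is a made-up helper.

```python
KIB, GIB, TIB = 1024, 1024**3, 1024**4


def to_u32(value):
    """Keep only the low 32 bits, as an unchecked uint32 store would."""
    return value & 0xFFFFFFFF


for label, size in [("256K", 256 * KIB), ("1G", GIB), ("3G", 3 * GIB),
                    ("4G", 4 * GIB), ("1T", TIB)]:
    print(f"-t {label:>4} -> extent_thresh = {to_u32(size)}")
```

Both 4G and 1T truncate to 0, which the kernel reads as "use the default", while 3G still fits in 32 bits.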
> But btrfs or ext4, 31 extents ideal or a single extent ideal, 150 extents
> still indicates at least some remaining fragmentation.
I gave it another shot but I've now got 154 extents instead. :-)
* Re: counting fragments takes more time than defragmenting
2015-06-24 12:05 ` Patrik Lundquist
@ 2015-06-25 4:01 ` Duncan
2015-06-25 6:30 ` Patrik Lundquist
0 siblings, 1 reply; 13+ messages in thread
From: Duncan @ 2015-06-25 4:01 UTC (permalink / raw)
To: linux-btrfs
Patrik Lundquist posted on Wed, 24 Jun 2015 14:05:57 +0200 as excerpted:
> On 24 June 2015 at 12:46, Duncan <1i5t5.duncan@cox.net> wrote:
>> Patrik Lundquist posted on Wed, 24 Jun 2015 10:28:09 +0200 as
>> excerpted:
>>
>> AFAIK, it's set huge to defrag everything,
>
> It's set to 256K by default.
What I meant is that AFAIK, set it huge to defrag everything...
>> Assuming "set a huge -t to defrag to the maximum extent possible" is
>> correct, that means -t 1G should be exactly as effective as -t 1T...
>
> 1G is actually more effective because 1T overflows the uint32
> extent_thresh field, so 1T, 0, and 256K are currently the same.
Then the manpage needs some work (in addition to the more serious
ambiguity over whether 1 or 1G means defrag everything), since it
mentions up to petabyte (P), without any indication that setting anything
that large won't work as expected.
If it's uint32 limited, either kill everything above that in both the
documentation and code, or alias everything above that to 3G (your next
paragraph) or whatever.
> 3G is the largest value that works with -t as expected (disregarding the
> man page) and is easy to type.
>
>
>> But btrfs or ext4, 31 extents ideal or a single extent ideal, 150
>> extents still indicates at least some remaining fragmentation.
>
> I gave it another shot but I've now got 154 extents instead. :-)
Is it possible there's simply no gig-size free-space holes in the
filesystem allocation, so it simply /can't/ defrag further than that,
because there's no place to allocate whole-gig data chunks at a time?
Which brings up a more general defrag functionality question. For
multi-gig files, does btrfs fi defrag allocate fresh data chunks in order to
create the largest extents possible (possibly after filling the remainder
of the original first chunk), thereby increasing data chunk allocation
before fully using currently allocated chunks, or does it try to find the
biggest extents possible in currently allocated chunks, first, and only
allocate new chunks when all current allocation is full?
Obviously if it uses up current allocations first, that could explain
your problem. OTOH, if either defrag or general allocation strategy
favors new chunks for large extents when necessary, that would explain
the "deoptimization" some people report from running balance.
* Re: counting fragments takes more time than defragmenting
2015-06-25 4:01 ` Duncan
@ 2015-06-25 6:30 ` Patrik Lundquist
0 siblings, 0 replies; 13+ messages in thread
From: Patrik Lundquist @ 2015-06-25 6:30 UTC (permalink / raw)
To: linux-btrfs@vger.kernel.org
On 25 June 2015 at 06:01, Duncan <1i5t5.duncan@cox.net> wrote:
>
> Patrik Lundquist posted on Wed, 24 Jun 2015 14:05:57 +0200 as excerpted:
>
> > On 24 June 2015 at 12:46, Duncan <1i5t5.duncan@cox.net> wrote:
>
> If it's uint32 limited, either kill everything above that in both the
> documentation and code, or alias everything above that to 3G (your next
> paragraph) or whatever.
My simple overflow patch yesterday fixes the problem, so 4G or larger
is max instead of 0.
> >> But btrfs or ext4, 31 extents ideal or a single extent ideal, 150
> >> extents still indicates at least some remaining fragmentation.
> >
> > I gave it another shot but I've now got 154 extents instead. :-)
>
> Is it possible there's simply no gig-size free-space holes in the
> filesystem allocation, so it simply /can't/ defrag further than that,
> because there's no place to allocate whole-gig data chunks at a time?
I would guess so, without allocating new chunks. Defrag can probably
be smarter and avoid rewriting extents if it means splitting them
(unless the compression flag is set and it must rewrite everything).
* Re: counting fragments takes more time than defragmenting
2015-06-24 10:46 ` Duncan
2015-06-24 12:05 ` Patrik Lundquist
@ 2015-07-14 11:57 ` Patrik Lundquist
2015-07-14 17:32 ` Duncan
2015-07-14 18:41 ` Hugo Mills
1 sibling, 2 replies; 13+ messages in thread
From: Patrik Lundquist @ 2015-07-14 11:57 UTC (permalink / raw)
To: linux-btrfs@vger.kernel.org
On 24 June 2015 at 12:46, Duncan <1i5t5.duncan@cox.net> wrote:
>
> Regardless of whether 1 or huge -t means maximum defrag, however, the
> nominal data chunk size of 1 GiB means that 30 GiB file you mentioned
> should be considered ideally defragged at 31 extents. This is a
> departure from ext4, which AFAIK in theory has no extent upper limit, so
> should be able to do that 30 GiB file in a single extent.
>
> But btrfs or ext4, 31 extents ideal or a single extent ideal, 150 extents
> still indicates at least some remaining fragmentation.
So I converted the VMware VMDK file to a VirtualBox VDI file:
-rw------- 1 plu plu 28845539328 jul 13 13:36 Windows7-disk1.vmdk
-rw------- 1 plu plu 28993126400 jul 13 14:04 Windows7.vdi
$ filefrag Windows7.vdi
Windows7.vdi: 15 extents found
$ btrfs filesystem defragment -t 3g Windows7.vdi
$ filefrag Windows7.vdi
Windows7.vdi: 24 extents found
How can it be less than 28 extents with a chunk size of 1 GiB?
E2fsprogs version 1.42.12
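For reference, the numbers behind the question, taking filefrag's counts at face value and assuming a hard 1 GiB limit per extent (the variable names are only for this sketch):

```python
GIB = 1024**3

vdi_size = 28993126400                 # Windows7.vdi size from the listing above

# Fewest extents possible if no extent can exceed 1 GiB
min_extents = -(-vdi_size // GIB)      # ceiling division
print(min_extents)                     # 28

# Average extent size implied by each filefrag count
for reported in (15, 24):
    mean_gib = vdi_size / reported / GIB
    print(f"{reported} extents -> mean extent size {mean_gib:.2f} GiB")
```

With either count the mean extent is larger than 1 GiB, so at least one extent has to exceed the nominal chunk size.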
* Re: counting fragments takes more time than defragmenting
2015-07-14 11:57 ` Patrik Lundquist
@ 2015-07-14 17:32 ` Duncan
2015-07-14 18:41 ` Hugo Mills
1 sibling, 0 replies; 13+ messages in thread
From: Duncan @ 2015-07-14 17:32 UTC (permalink / raw)
To: linux-btrfs
Patrik Lundquist posted on Tue, 14 Jul 2015 13:57:07 +0200 as excerpted:
> On 24 June 2015 at 12:46, Duncan <1i5t5.duncan@cox.net> wrote:
>>
>> Regardless of whether 1 or huge -t means maximum defrag, however, the
>> nominal data chunk size of 1 GiB means that 30 GiB file you mentioned
>> should be considered ideally defragged at 31 extents. This is a
>> departure from ext4, which AFAIK in theory has no extent upper limit,
>> so should be able to do that 30 GiB file in a single extent.
>>
>> But btrfs or ext4, 31 extents ideal or a single extent ideal, 150
>> extents still indicates at least some remaining fragmentation.
>
> So I converted the VMware VMDK file to a VirtualBox VDI file:
>
> -rw------- 1 plu plu 28845539328 jul 13 13:36 Windows7-disk1.vmdk
> -rw------- 1 plu plu 28993126400 jul 13 14:04 Windows7.vdi
>
> $ filefrag Windows7.vdi
> Windows7.vdi: 15 extents found
>
> $ btrfs filesystem defragment -t 3g Windows7.vdi
> $ filefrag Windows7.vdi
> Windows7.vdi: 24 extents found
>
> How can it be less than 28 extents with a chunk size of 1 GiB?
>
> E2fsprogs version 1.42.12
That's why I said "nominal"[1] 1 GiB. I'm just a list and filesystem
user, not a dev, and I don't know the details, but someone (a dev or at
least someone that can actually read code, but not a btrfs dev) mentioned
in reply to a post of mine a few months ago, that under the right
conditions, btrfs can allocate larger-than 1 GiB data chunks.
I /believe/ data chunk allocation size has something to do with the
amount of unallocated space on the filesystem; that on large (TiB plus,
perhaps) btrfs some of the initial allocations will be multiple GiB,
which of course would allow greater-than 1 GiB extents as well. But I
really don't know the conditions under which that can happen and I've not
seen an actual btrfs dev comment on it, and AFAIK the "base" data chunk
size remains 1 GiB under most conditions. Meanwhile, I tend to partition
up my storage here, and while I have multiple separate btrfs, the
partitions are all under 50 GiB, so I'm unlikely to see that sort of > 1
GiB data chunk allocations at all, here.
So rather than go to the complexity of explaining all this detail that
I'm not sure of anyway, I deliberately blurred out a bit as not necessary
to the primary point, which was that for files over a GiB, don't expect
to see or be able to defrag to a single extent, as 1 GiB data chunks and
thus extents are nominal/normal.
If it does happen, I'd consider it due to those data "superchunks" and
wouldn't be entirely surprised, but the point remains that you're
unlikely to get the number of extents much below the file size number in
GiB using defrag, even when everything is working "perfectly as designed".
---
[1] Nominal: In the sense of normal or standard as-designed value, see
wiktionary's English adjective sense 6 and 10, as well as the wikipedia
writeups on real vs. nominal values and nominal size:
https://en.wiktionary.org/wiki/nominal#Adjective
https://en.wikipedia.org/wiki/Real_versus_nominal_value
https://en.wikipedia.org/wiki/Nominal_size
* Re: counting fragments takes more time than defragmenting
2015-07-14 11:57 ` Patrik Lundquist
2015-07-14 17:32 ` Duncan
@ 2015-07-14 18:41 ` Hugo Mills
2015-07-14 19:09 ` Patrik Lundquist
1 sibling, 1 reply; 13+ messages in thread
From: Hugo Mills @ 2015-07-14 18:41 UTC (permalink / raw)
To: Patrik Lundquist; +Cc: linux-btrfs@vger.kernel.org
On Tue, Jul 14, 2015 at 01:57:07PM +0200, Patrik Lundquist wrote:
> So I converted the VMware VMDK file to a VirtualBox VDI file:
>
> -rw------- 1 plu plu 28845539328 jul 13 13:36 Windows7-disk1.vmdk
> -rw------- 1 plu plu 28993126400 jul 13 14:04 Windows7.vdi
>
> $ filefrag Windows7.vdi
> Windows7.vdi: 15 extents found
>
> $ btrfs filesystem defragment -t 3g Windows7.vdi
> $ filefrag Windows7.vdi
> Windows7.vdi: 24 extents found
>
> How can it be less than 28 extents with a chunk size of 1 GiB?
I _think_ the fragment size will be limited by the block group
size. This is not the same as the chunk size for some RAID levels --
for example, RAID-0, a block group can be anything from 2 to n chunks
(across the same number of devices), where each chunk is 1 GiB, so
potentially you could have arbitrary-sized block groups. The same
would apply to RAID-10, -5 and -6.
(Note, I haven't verified this, but it makes sense based on what I
know of the internal data structures).
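A rough model of that idea, with the same caveat that it is unverified. The per-profile counts of data-bearing chunks below are an assumption added for illustration and are not taken from the btrfs source; `data_chunks` and `block_group_size` are made-up names.

```python
GIB = 1024**3


def data_chunks(profile, n_devices):
    """Assumed number of data-bearing 1 GiB chunks in one block group."""
    table = {
        "single": 1,
        "raid1": 1,                    # second copy holds the same data
        "raid0": n_devices,            # striped across all devices
        "raid10": n_devices // 2,      # striped pairs of mirrors
        "raid5": n_devices - 1,        # one chunk's worth of parity
        "raid6": n_devices - 2,        # two chunks' worth of parity
    }
    return table[profile]


def block_group_size(profile, n_devices, chunk=GIB):
    """Usable size of a block group under the assumptions above."""
    return data_chunks(profile, n_devices) * chunk


for profile, devices in [("raid1", 2), ("raid0", 2), ("raid0", 4), ("raid5", 4)]:
    size_gib = block_group_size(profile, devices) // GIB
    print(f"{profile} on {devices} devices: {size_gib} GiB block group")
```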
Hugo.
--
Hugo Mills | Go not to the elves for counsel, for they will say
hugo@... carfax.org.uk | both no and yes.
http://carfax.org.uk/ |
PGP: E2AB1DE4 |
* Re: counting fragments takes more time than defragmenting
2015-07-14 18:41 ` Hugo Mills
@ 2015-07-14 19:09 ` Patrik Lundquist
2015-07-14 19:15 ` Hugo Mills
0 siblings, 1 reply; 13+ messages in thread
From: Patrik Lundquist @ 2015-07-14 19:09 UTC (permalink / raw)
To: linux-btrfs@vger.kernel.org
On 14 July 2015 at 20:41, Hugo Mills <hugo@carfax.org.uk> wrote:
> I _think_ the fragment size will be limited by the block group
> size. This is not the same as the chunk size for some RAID levels --
> for example, RAID-0, a block group can be anything from 2 to n chunks
> (across the same number of devices), where each chunk is 1 GiB, so
> potentially you could have arbitrary-sized block groups. The same
> would apply to RAID-10, -5 and -6.
>
> (Note, I haven't verified this, but it makes sense based on what I
> know of the internal data structures).
It's a raid1 filesystem, so the block group ought to be the same size
as the chunk, right?
A 2GiB block group would suffice to explain it though.
* Re: counting fragments takes more time than defragmenting
2015-07-14 19:09 ` Patrik Lundquist
@ 2015-07-14 19:15 ` Hugo Mills
2015-07-21 15:35 ` Patrik Lundquist
0 siblings, 1 reply; 13+ messages in thread
From: Hugo Mills @ 2015-07-14 19:15 UTC (permalink / raw)
To: Patrik Lundquist; +Cc: linux-btrfs@vger.kernel.org
On Tue, Jul 14, 2015 at 09:09:00PM +0200, Patrik Lundquist wrote:
> It's a raid1 filesystem, so the block group ought to be the same size
> as the chunk, right?
Yes.
> A 2GiB block group would suffice to explain it though.
Not with RAID-1 -- I'd expect the block group size to be 1 GiB.
Hugo.
--
Hugo Mills | There isn't a noun that can't be verbed.
hugo@... carfax.org.uk |
http://carfax.org.uk/ |
PGP: E2AB1DE4 |
* Re: counting fragments takes more time than defragmenting
2015-07-14 19:15 ` Hugo Mills
@ 2015-07-21 15:35 ` Patrik Lundquist
0 siblings, 0 replies; 13+ messages in thread
From: Patrik Lundquist @ 2015-07-21 15:35 UTC (permalink / raw)
To: linux-btrfs@vger.kernel.org
On 14 July 2015 at 21:15, Hugo Mills <hugo@carfax.org.uk> wrote:
> On Tue, Jul 14, 2015 at 09:09:00PM +0200, Patrik Lundquist wrote:
>> It's a raid1 filesystem, so the block group ought to be the same size
>> as the chunk, right?
>
> Yes.
>
>> A 2GiB block group would suffice to explain it though.
>
> Not with RAID-1 -- I'd expect the block group size to be 1 GiB.
So I had a look at the filefrag source and filefrag actually doesn't
print the number of extents but the number of disk fragments.
Contiguously allocated extents count as one fragment.
"Windows7.vdi: 47 extents found" is really 213 extents over 47 disk fragments.
But I have one 2GiB extent, according to filefrag -v, so the question
remains. :-)
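The difference between the two counts can be shown with a small sketch. This is not filefrag's code, just the counting rule described above applied to a made-up extent list of (physical start, length) pairs in file order.

```python
def count_fragments(extents):
    """Count runs of physically contiguous extents.

    extents: list of (physical_start, length) tuples in logical file order.
    An extent that starts exactly where the previous one ended continues
    the current fragment instead of starting a new one.
    """
    fragments = 0
    prev_end = None
    for physical_start, length in extents:
        if physical_start != prev_end:
            fragments += 1
        prev_end = physical_start + length
    return fragments


# Hypothetical layout: five extents, two pairs of which are adjacent on disk
layout = [(0, 100), (100, 50), (500, 20), (520, 30), (1000, 10)]
print(len(layout), "extents,", count_fragments(layout), "fragments")
```

For this layout the extent count is 5 but the fragment count is 3, the same kind of gap as 213 extents over 47 fragments.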