linux-btrfs.vger.kernel.org archive mirror
From: Kai Krakow <hurikhan77@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: defragmenting best practice?
Date: Thu, 14 Sep 2017 17:47:15 +0200
Message-ID: <20170914174715.7eed39cb@jupiter.sol.kaishome.de>
In-Reply-To: 20170914172434.39eae89d@jupiter.sol.kaishome.de

On Thu, 14 Sep 2017 17:24:34 +0200
Kai Krakow <hurikhan77@gmail.com> wrote:

Errors corrected, see below...


> On Thu, 14 Sep 2017 14:31:48 +0100
> Tomasz Kłoczko <kloczko.tomasz@gmail.com> wrote:
> 
> > On 14 September 2017 at 12:38, Kai Krakow <hurikhan77@gmail.com>
> > wrote: [..]  
> > >
> > > I suggest you only ever defragment parts of your main subvolume or
> > > rely on autodefrag, and let bees do optimizing the snapshots.  
> 
> Please read that again including the parts you omitted.
> 
> 
> > > Also, I experimented with adding btrfs support to shake, still
> > > working on better integration but currently lacking time... :-(
> > >
> > > Shake is an adaptive defragger which rewrites files. With my
> > > current patches it clones each file, and then rewrites it to its
> > > original location. This approach is currently not optimal as it
> > > simply bails out if some other process is accessing the file and
> > > leaves you with an (intact) temporary copy you need to move back
> > > in place manually.    
> > 
> > If you really want to have real and *ideal* distribution of the data
> > across physical disk first you need to build time travel device.
> > This device will allow you to put all blocks which needs to be read
> > in perfect order (to read all data only sequentially without seek).
> > However it will be working only in case of spindles because in case
> > of SSDs there is no seek time.
> > Please let us know when you will write drivers/timetravel/ Linux
> > kernel driver. When such driver will be available I promise I'll
> > write all necessary btrfs code by myself in matter of few days (it
> > will be piece of cake compare to build such device).
> > 
> > But seriously ..  
> 
> Seriously: Defragmentation on spindles is IMHO not about getting the
> perfect continuous allocation but providing better spatial layout of
> the files you work with.
> 
> Getting e.g. boot files into read order or at least nearby improves
> boot time a lot. Similar for loading applications. Shake tries to
> improve this by rewriting the files - and this works because file
> systems (given enough free space) already do a very good job at doing
> this. But constant system updates degrade this order over time.
> 
> It really doesn't matter if some big file is laid out in 1 allocation
> of 1 GB or in 250 allocations of 4MB: It really doesn't make a big
> difference.
> 
> Recombining extents into bigger ones, though, can make a big
> difference in an aging btrfs, even on SSDs.
> 
> Bees is, btw, not about defragmentation: I have some OS containers
> running and I want to deduplicate data after updates. It seems to do a
> good job here, better than other deduplicators I found. And if some
> defrag tools destroyed your snapshot reflinks, bees can also help
> here. On its way it may recombine extents so it may improve
> fragmentation. But usually it probably defragments because it needs
                                         ^^^^^^^^^^^
It fragments!

> to split extents that a defragger combined.
> 
> But well, I think getting 100% continuous allocation is really not the
> achievement you want to get, especially when reflinks are a primary
> concern.
> 
> 
> > Only context/scenario when you may want to lower defragmentation is
> > when you are something needs to allocate continuous area lower than
> > free space and larger than largest free chunk. Something like this
> > happens only when volume is working on almost 100% allocated space.
> > In such scenario even you bees cannot do to much as it may be not
> > enough free space to move some other data in larger chunks to
> > defragment FS physical space.  
> 
> Bees does not do that.
> 
> 
> > If your workload will be still writing
> > new data to FS such defragmentation may give you (maybe) few more
> > seconds and just after this FS will be 100% full,
> > 
> > In other words if someone is thinking that such defragmentation
> > daemon is solving any problems he/she may be 100% right .. such
> > person is only *thinking* that this is truth.  
> 
> Bees is not about that.
> 
> 
> > kloczek
> > PS. Do you know first McGyver rule? -> "If it ain't broke, don't fix
> > it".  
> 
> Do you know the saying "think first, then act"?
> 
> 
> > So first show that fragmentation is hurting latency of the
> > access to btrfs data and that it is possible to measure such
> > impact. Before you start measuring this you need to learn how to
> > sample for example VFS layer latency. Do you know how to do this to
> > deliver such proof?  
> 
> You didn't get the point. You only read "defragmentation" and your
> alarm lights lit up. You even think bees is a defragmenter. It
> is probably more the opposite because it introduces more fragments in
> exchange for more reflinks.
> 
> 
> > PS2. The same "discussions" about fragmentation where in the past
> > about +10 years ago after ZFS has been introduced. Just to let you
> > know that after initial ZFS introduction up to now was not written
> > even single line of ZFS code to handle active fragmentation and no
> > one been able to prove that something about active defragmentation
> > needs to be done in case of ZFS.  
> 
> Btrfs has autodefrag to reduce the number of fragments by rewriting
> small portions of the file being written to. This is needed, otherwise
> the feature wouldn't be there. Why? Have you tried working with 1GB
> files broken into 100000+ fragments just because of how CoW works?
> Try it, there's your latency.
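To put rough numbers on the fragment counts mentioned in this thread (1, 250, and 100000+ extents for a file of about 1 GB), here is a back-of-the-envelope sketch. It is only a worst-case model for spinning disks: it assumes every extent costs one full seek, and the 10 ms seek time and 150 MB/s transfer rate are illustrative assumptions, not measurements from any system discussed here.

```python
def estimated_read_seconds(size_mb, extents, seek_ms=10.0, throughput_mb_s=150.0):
    """Worst-case estimate: one full seek per extent plus pure transfer time.

    seek_ms and throughput_mb_s are assumed, illustrative values.
    """
    seek_time = extents * seek_ms / 1000.0
    transfer_time = size_mb / throughput_mb_s
    return seek_time + transfer_time


# Fragment counts taken from the discussion; file size is roughly 1 GB.
for extents in (1, 250, 100000):
    seconds = estimated_read_seconds(1024, extents)
    print(f"{extents:>6} extents: {seconds:8.2f} s")
```

Under these assumed numbers, 250 extents add about 2.5 s of seek time in the worst case, while 100000 extents add about 1000 s. Real layouts are usually kinder because neighbouring extents need only short seeks, and on SSDs the seek term mostly disappears, so the cost there comes from per-extent overhead rather than head movement.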
> 
> 
> > Why? Because it all stands on the shoulders of a clever enough
> > *allocation algorithm*. Only this and nothing more.
> > PS3. Please can we stop this/EOT?  
> 
> Can we please not start a flame war just because you hate defrag
> tools?
> 
> I think the whole discussion about "defragmenting" should be stopped.
> Let's call it "optimizers":
> 
> If it reduces needed storage space, it optimizes. And I need a tool
> for that. Otherwise tell me how btrfs solves this in-kernel, when
> applications break reflinks by rewriting data...
> 
> If you're on spindles you want files that are needed at around the
> same time to be kept spatially close together. This improves boot
> times and application start times. The file system already does a good
> job at doing this. But for some workloads (like booting) this degrades
> over time and the FS can do nothing about it because this is just not
> how package managers work (or Windows updates; NTFS also uses extent
> allocation and as such solves the same problems in a similar way as
> most Linux file systems). Let the package manager reinstall all files
> accessed at boot and it would probably be solved. But who wants that?
> Btrfs does not solve this, SSDs do. I use bcache for that on my local
> system. Without SSDs, shake (and other tools) can solve this.
> 
> If you are on SSD and work with almost full file systems, you may get
> back performance by recombining free space. Defragmentation here is
> not about files but free space. This can also be called an optimizer
> then.
> 
> 
> I really have no interest in defragmenting a file system to 100%
> continuous allocation. That was needed for FAT and small systems
> without enough RAM for caching all the file system infrastructure.
> Today's systems use extent allocation and that solves the problem
> where the original idea of defragmentation came from. When I speak of
> defragmentation I mean something more intelligent, like optimizing the
> file system layout for the access patterns you use.
> 
> 
> Conclusion: The original question was about defrag best practice with
> regard to reflinked snapshots. I partially recommended against it and
> instead recommended bees, which restores and optimizes the reflinks
> and may recombine some of the extents. From my wording, and I
> apologize for that, it was probably not completely clear what this
> means:
> 
> [I wrote]
> > You may want to try https://github.com/Zygo/bees. It is a daemon
> > watching the file system generation changes, scanning the blocks and
> > then recombines them. Of course, this process somewhat defeats the
> > purpose of defragging in the first place as it will undo some of the
> > defragmenting.  
> 
> It scans for duplicate blocks and recombines them into reflinked
> blocks. This is done by recombining extents. For that purpose, extents
> that the file system allocated usually need to be broken up again
> into smaller chunks. Bees then tries to recombine such broken extents
> back into bigger ones. But it is not a defragger, seriously! It does
> indeed break extents into smaller chunks.
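To illustrate why block-level deduplication tends to split extents, here is a minimal sketch. It is not how bees is implemented; it only models the idea described above with in-memory byte strings. The 4096-byte block size, the function names, and the sample data are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def find_duplicate_blocks(files):
    """Return {digest: [(name, offset), ...]} for blocks that occur more than once.

    `files` maps a name to bytes and stands in for real file contents.
    """
    seen = defaultdict(list)
    for name, data in files.items():
        for offset in range(0, len(data), BLOCK_SIZE):
            digest = hashlib.sha256(data[offset:offset + BLOCK_SIZE]).hexdigest()
            seen[digest].append((name, offset))
    return {d: locs for d, locs in seen.items() if len(locs) > 1}


def split_extent(length, shared_offsets):
    """Split one extent into runs of (offset, size, is_shared).

    Adjacent blocks with the same shared/unshared state are merged into one run.
    """
    pieces = []
    for offset in range(0, length, BLOCK_SIZE):
        size = min(BLOCK_SIZE, length - offset)
        shared = offset in shared_offsets
        if pieces and pieces[-1][2] == shared:
            start, run, _ = pieces[-1]
            pieces[-1] = (start, run + size, shared)
        else:
            pieces.append((offset, size, shared))
    return pieces


# Two hypothetical files that share only their middle block.
files = {
    "a": b"A" * BLOCK_SIZE + b"S" * BLOCK_SIZE + b"C" * BLOCK_SIZE,
    "b": b"X" * BLOCK_SIZE + b"S" * BLOCK_SIZE + b"Y" * BLOCK_SIZE,
}
duplicates = find_duplicate_blocks(files)
shared_in_a = {off for locs in duplicates.values() for name, off in locs if name == "a"}
print(split_extent(len(files["a"]), shared_in_a))
```

In this toy case, file "a" starts as one extent of three blocks. Sharing the middle block with "b" leaves it with three pieces: unshared, shared, unshared. That is the trade described above: more reflinks in exchange for more fragments.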
> 
> Later I recommended having a look at shake, which I experimented with.
> I also recommended letting btrfs autodefrag do the work and only ever
> defragmenting very selected parts of the file system that the user
> feels need "defragmentation". My patches to shake try to avoid
> btrfs shared extents so actually they reduce the effect of
> defragmenting the FS, because I think keeping reflinked extents is
> more important. But I see the main purpose of shake as re-laying out
> the supplied files into nearby space. I think it is more important to
> improve spatial locality of files than having them 100% continuous.
> 
> I will try to make my intent clearer next time but I guess you
> probably won't read it in its entirety anyway. :,-(
> 
> 



-- 
Regards,
Kai

Replies to list-only preferred.



