* Balance & scrub & defrag
From: sys.syphus @ 2014-12-10 22:15 UTC
To: linux-btrfs
I am working on a script that i can run daily that will do maintenance
on my btrfs mountpoints. is there any reason not to concurrently do
all of the above? possibly including discards as well.
also, is there anything existing currently that will do maintenance on
btrfs so i don't have to reinvent the wheel?
#!/bin/bash
btrfs filesystem defragment -r -v /media/btrfs/ &
btrfs scrub start /media/btrfs/ &
btrfs balance start /media/btrfs/ &
watch -d -n 30 "btrfs balance status /media/btrfs/; btrfs scrub status /media/btrfs/"
* Re: Balance & scrub & defrag
From: Robert White @ 2014-12-11 1:17 UTC
To: sys.syphus, linux-btrfs
On 12/10/2014 02:15 PM, sys.syphus wrote:
> I am working on a script that i can run daily that will do maintenance
> on my btrfs mountpoints. is there any reason not to concurrently do
> all of the above? possibly including discards as well.
>
>
> also, is there anything existing currently that will do maintenance on
> btrfs so i don't have to reinvent the wheel?
>
> #!/bin/bash
> btrfs filesystem defragment -r -v /media/btrfs/ &
> btrfs scrub start /media/btrfs/ &
> btrfs balance start /media/btrfs/ &
>
>
> watch -d -n 30 "btrfs balance status /media/btrfs/; btrfs scrub status
> /media/btrfs/"
I'd recommend doing "none of the above" on a daily basis. One of the
goals of the filesystem design is to remove the need for any of these
operations on any regular basis. You are just going to bog down your
system and increase your heat and wear profiles for no good reason.
Those tools should be used if you notice something fishy like recent
decreases in efficiency or errors in your log files.
A _monthly_ scrub is maybe worth scheduling if you have a lot of churn
in your disk contents.
Defragging should be done after significant content additions/changes
(like replacing a lot of files via package management) and limited to
the directories most likely changed.
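A minimal sketch of a defrag limited to specific directories. The helper name `defrag_dirs` and the directories in the usage line are hypothetical; pick whatever your package manager actually touched.

```shell
#!/bin/bash
# Hypothetical helper: defragment only the directories given as arguments,
# one after the other, instead of the whole filesystem.
defrag_dirs() {
    local dir
    for dir in "$@"; do
        # -r recurses into the directory; defragment runs in the foreground.
        btrfs filesystem defragment -r "$dir" || return 1
    done
}

# Usage (as root), example directories only:
#   defrag_dirs /usr /opt
```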
Balancing is almost never necessary and can be anti-helpful if a file
experiences random updates in batches (because the nicely packed file
may end up far, far away from the active data extent where its COW
events are taking place).
Resist the urge to tinker with production systems. The exposure
(rewriting stable data is just the chance to destabilize your data,
balancing your drive can take two files that always change together and
put them far away from one another, etc) is not worth the nearly
non-existent chance of benefit. Once the system is "good" just leave it
that way until you notice something "not good" coming on the horizon.
If you feel you _must_ do these tasks then doing them all at once, where
possible, will just make both tasks take longer. If you are transcribing
a file over on one side of the disk to defrag it, and you are
transcribing an extent on the other side of the disk to balance it, you
are just bouncing your disk heads back-and-forth and wasting wall-clock time.
So yeah, it's not Windows, it doesn't need the defrag hammer.
Trying to over-manage the system will prevent it from seeking its
dynamic (and so predictable) equilibrium.
* Re: Balance & scrub & defrag
From: Duncan @ 2014-12-11 8:33 UTC
To: linux-btrfs
sys.syphus posted on Wed, 10 Dec 2014 16:15:17 -0600 as excerpted:
> I am working on a script that i can run daily that will do maintenance
> on my btrfs mountpoints. is there any reason not to concurrently do all
> of the above? possibly including discards as well.
>
>
> also, is there anything existing currently that will do maintenance on
> btrfs so i don't have to reinvent the wheel?
>
> #!/bin/bash
> btrfs filesystem defragment -r -v /media/btrfs/ &
> btrfs scrub start /media/btrfs/ &
> btrfs balance start /media/btrfs/ &
Btrfs has had concurrency issues in the past, tho there has been a recent
patch series aimed at fixing many of them. Still, running more than one
of defrag/scrub/balance at once, particularly on spinning rust (as
opposed to SSD which is faster and doesn't have I/O bottlenecks to the
same degree) does put a lot of stress on the system and is thus more
likely to trigger bugs than running them one at a time. If your goal is
to stress-test and find and report bugs, that's a reasonable start,
otherwise consider doing one at a time.
There's also the memory issue. These utilities can take quite a bit of
memory at times, particularly if you're running with lots of snapshots.
Meanwhile, as others have said, doing these daily is overkill. If you're
running multi-TB filesystems on spinning rust, it'll take several hours
for one of these anyway. Maybe once a week for scrub, which won't
rewrite anything unless it finds errors. Balance you don't need to run
routinely, only when adding/deleting devices or if your data/metadata
chunk balance (see btrfs fi df) gets out of balance.
And for defrag, take a look at the autodefrag mount option. Tho be aware
that it can interact badly with large files (say half a gig or larger)
that are actively rewritten in place, such as VM images and
databases. With autodefrag on, you shouldn't have to worry about
fragmentation at all, unless of course you're using big VMs or the like,
but there's available solutions for that as well. See the wiki and many
previous threads here for more.
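For reference, a rough sketch of turning autodefrag on for an already mounted filesystem. The helper name `enable_autodefrag` is made up for illustration, and the fstab line in the comment is only a template with a placeholder UUID.

```shell
#!/bin/bash
# Hypothetical helper: enable the autodefrag mount option on a mounted
# btrfs filesystem without unmounting it.
enable_autodefrag() {
    local mnt="$1"
    mount -o remount,autodefrag "$mnt"
}

# Usage (as root):
#   enable_autodefrag /media/btrfs
#
# To make it permanent, add the option in /etc/fstab (placeholder UUID):
#   UUID=<your-fs-uuid>  /media/btrfs  btrfs  defaults,autodefrag  0  0
```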
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: Balance & scrub & defrag
From: Russell Coker @ 2014-12-12 1:00 UTC
To: Robert White; +Cc: sys.syphus, linux-btrfs
On Wed, 10 Dec 2014 17:17:28 Robert White wrote:
> A _monthly_ scrub is maybe worth scheduling if you have a lot of churn
> in your disk contents.
I do weekly scrubs. I recently had 2 disks in a RAID-1 array develop read
errors within a month of each other. The first scrub after replacing sdb
revealed an error on sdc!
> Defragging should be done after significant content additions/changes
> (like replacing a lot of files via package management) and limited to
> the directories most likely changed.
I have never run defrag. Currently all my BTRFS filesystems that have any
performance requirements are on SSD and I don't think that defragmenting a SSD
does much good.
> Balancing is almost never necessary and can be anti-helpful if a file
> experiences random updates in batches (because the nicely packed file
> may end up far, far away from the active data extent where its COW
> events are taking place).
Running out of metadata space is a problem that calls for an
occasional data balance. If you set it to only balance chunks that are less
than 10% used then it doesn't take much time.
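A sketch of such a filtered balance, using the `usage` filter for data chunks. The wrapper name `balance_mostly_empty_chunks` and its default threshold argument are assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical wrapper: rewrite only data chunks whose usage is below the
# given percentage (default 10), leaving fuller chunks alone.
balance_mostly_empty_chunks() {
    local mnt="$1"
    local threshold="${2:-10}"
    btrfs balance start -dusage="$threshold" "$mnt"
}

# Usage (as root):
#   balance_mostly_empty_chunks /media/btrfs/        # chunks under 10% used
#   balance_mostly_empty_chunks /media/btrfs/ 25     # chunks under 25% used
```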
--
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/
* Re: Balance & scrub & defrag
From: Robert White @ 2014-12-12 1:31 UTC
To: russell; +Cc: sys.syphus, linux-btrfs
On 12/11/2014 05:00 PM, Russell Coker wrote:
> On Wed, 10 Dec 2014 17:17:28 Robert White wrote:
>> A _monthly_ scrub is maybe worth scheduling if you have a lot of churn
>> in your disk contents.
>
> I do weekly scrubs. I recently had 2 disks in a RAID-1 array develop read
> errors within a month of each other. The first scrub after replacing sdb
> revealed an error on sdc!
You need to buy better disks. 8-)
I use SMART (smartmontools etc) and its tests to keep track of and warn
me of such issues. It's way more likely to catch incipient media
failures long before scrub would. It's also more likely to correct
situations before they become visible to userspace. It's also a way
better full-platter scan that involves less real time delay and won't
bog down a running system.
I reserve scrub for after maintenance and the occasional look-see.
But whatever works for you.
>
> Running out of metadata space is a problem that calls for an
> occasional data balance. If you set it to only balance chunks that are less
> than 10% used then it doesn't take much time.
In very recent kernels the empty extent remover will take up most of
this burden.
A shallow balance is fast, but you are missing most of its potential
benefits at that point. I wash my clothes instead of just taking a lint
brush to them. Half measures, repeated, lead to more and more fractional
results.
Every time you sweep a 10% full extent into another extent far, far
away you are perturbing your locality and probably shaving a little off
of probable peak performance. It's the equivalent of organizing your
sock drawer by just taking all the socks out of the dryer in a lump and
cramming them into the back of the drawer. That is you are moving the
most-changed items back to pack them against the least-changed ones. The
natural lay of the filesystem is to spread out and churn. Repeatedly
smashing it down is just going to wrinkle your data.
If you are getting anywhere near running out of metadata extents on any
kind of regular basis then you need to reexamine your entire deal. Make
sure you are running a recent kernel with the reclaim update. Do a full
balance _once_ and then leave it alone. Maybe consider autodefrag if
your file load is compatible (not a lot of VMs and RDBMS extents).
Of course if this is your pirate warez machine and you are regularly
passing torrents through it, then you just need more space and better
delete discipline.
* Re: Balance & scrub & defrag
From: Zygo Blaxell @ 2014-12-12 4:32 UTC
To: sys.syphus; +Cc: linux-btrfs
On Wed, Dec 10, 2014 at 04:15:17PM -0600, sys.syphus wrote:
> I am working on a script that i can run daily that will do maintenance
> on my btrfs mountpoints. is there any reason not to concurrently do
> all of the above? possibly including discards as well.
>
>
> also, is there anything existing currently that will do maintenance on
> btrfs so i don't have to reinvent the wheel?
There's not a lot of wheel to reinvent. Just a one-liner in a crontab
is sufficient.
> #!/bin/bash
> btrfs filesystem defragment -r -v /media/btrfs/ &
> btrfs scrub start /media/btrfs/ &
> btrfs balance start /media/btrfs/ &
They should be run sequentially for simple performance reasons. They all
attempt to occupy all the available disk bandwidth, so running them all
at the same time just increases access latency and usually makes them
much slower than if they were run sequentially. There is no cooperative
scheduling of these operations in btrfs, even though they theoretically
could be combined into a single pass.
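If you do decide to run more than one of these, a minimal sketch of the original script made sequential could look like the following. The function name `run_sequential_maintenance` is hypothetical; the commands are the ones from the original post with the `&` removed and `-B` added so that scrub stays in the foreground.

```shell
#!/bin/bash
# Hypothetical helper: run the three operations one at a time so they do
# not compete for disk bandwidth. Each step starts only after the previous
# one has finished, and the sequence stops at the first failure.
run_sequential_maintenance() {
    local mnt="$1"

    # defragment runs in the foreground by default.
    btrfs filesystem defragment -r -v "$mnt" || return 1

    # -B tells scrub not to background itself.
    btrfs scrub start -B "$mnt" || return 1

    # balance start blocks until the balance completes.
    btrfs balance start "$mnt" || return 1
}

# Usage (as root):
#   run_sequential_maintenance /media/btrfs/
```

This only shows the sequencing; whether defrag and balance belong in a routine job at all is a separate question.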
Run scrub once a week on low-end consumer drives, once a month on drives
designed for NAS applications. Scrub is a fast and (assuming no errors
are detected) read-only scan of allocated data areas that is well worth
its relatively low cost. There's no need to run it daily--but there's no
reason _not_ to run it daily either, if your disks' speed-to-size ratio
is big enough.
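As an illustration of the crontab one-liner, here is a small sketch that prints a system crontab entry for a weekly scrub. The helper name `weekly_scrub_cron_line`, the schedule (Sunday 03:00), and the file name in the usage line are assumptions; it also assumes `btrfs` is on cron's PATH, so use the full path to the binary if it is not.

```shell
#!/bin/bash
# Hypothetical helper: print a cron.d-style line that scrubs the given
# mount point once a week.
weekly_scrub_cron_line() {
    local mnt="$1"
    # Fields: minute hour day-of-month month day-of-week user command
    printf '0 3 * * 0 root btrfs scrub start -B %s\n' "$mnt"
}

# Usage (as root), file name is only an example:
#   weekly_scrub_cron_line /media/btrfs/ > /etc/cron.d/btrfs-scrub
```

For a monthly scrub, the schedule fields could instead be `0 3 1 * *`.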
Don't run defragment at all, unless you have a database or VM image,
and if you do, run defrag only on that. It's necessary for databases
because each fragment ends up being the size of a database page, and
the extent records for large badly-fragmented files consume almost
as much RAM as the file pages themselves. defrag on arbitrary large
files is a fairly good way to lock yourself out of your system: defrag
will eventually finish, but in pathological cases it can take hours and
prevent you from using the filesystem while it runs. You can try using
the autodefrag mount option instead, but be prepared to turn it off if
autodefrag is not right for your workload.
Balance is something to use only when there is a configuration change
(e.g. you added a new disk or replaced one with a larger one) or you've
drastically changed the average size of files in a nearly-full filesystem.
It will make the filesystem painfully slow the whole time it runs, and it
can run for weeks on a filesystem smaller than 1TB. There's a _reason_
why balance requests persist across reboots. Speaking of reboots: if
a balance is interrupted by a reboot, it can delay the next mount for
minutes or hours (the mount command seems to hang until it has processed
the interrupted block group) depending on filesystem size.
> watch -d -n 30 "btrfs balance status /media/btrfs/; btrfs scrub status
> /media/btrfs/"
That part is fine. I throw in 'btrfs fi df' into the watches too.
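A sketch of the watch command with `btrfs fi df` added, wrapped in a hypothetical helper named `watch_btrfs_status`. It assumes the mount point contains no spaces, since the path is pasted into the command string that watch runs.

```shell
#!/bin/bash
# Hypothetical helper: refresh balance status, scrub status and chunk
# allocation every 30 seconds, highlighting differences between updates.
watch_btrfs_status() {
    local mnt="$1"
    watch -d -n 30 "btrfs balance status $mnt; btrfs scrub status $mnt; btrfs filesystem df $mnt"
}

# Usage:
#   watch_btrfs_status /media/btrfs/
```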
* Re: Balance & scrub & defrag
From: Erkki Seppala @ 2014-12-12 9:17 UTC
To: linux-btrfs
Robert White <rwhite@pobox.com> writes:
> You need to buy better disks. 8-)
Where can one buy these better disks with reasonable prices?-) Disks are
best thought of as consumables.
> I use SMART (smartmontools etc) and its tests to keep track of and
> warn me of such issues. It's way more likely to catch incipient media
> failures long before scrub would.
That may be sort of true, but I think even SMART is helped by the fact
that the media is read through from the beginning to the end*, so it can
detect even the errors that don't bubble through the IO layer. And BTRFS
can indeed note errors that the media doesn't - two checksums is better
than one checksum, assuming they aren't exactly the same algorithm ;).
Do you alternatively execute SMART self tests?
* scrub doesn't do this, it reads only through used data
--
_____________________________________________________________________
/ __// /__ ____ __ http://www.modeemi.fi/~flux/\ \
/ /_ / // // /\ \/ / \ /
/_/ /_/ \___/ /_/\_\@modeemi.fi \/
* Re: Balance & scrub & defrag
From: Tomasz Chmielewski @ 2014-12-12 9:49 UTC
To: Btrfs BTRFS
> I use SMART (smartmontools etc) and its tests to keep track of and warn
> me of such issues. It's way more likely to catch incipient media
> failures long before scrub would. It's also more likely to correct
> situations before they become visible to userspace. It's also a way
> better full-platter scan that involves less real time delay and won't
> bog down a running system.
Don't put too much trust in SMART - sectors can rot unexpectedly even if
SMART is thinking everything is fine with the drive.
I had exactly this issue recently:
1) one of the drives in the server failed and was replaced
2) "btrfs device delete missing" (which basically moves data from the
remaining drive to the new one) was failing with IO error
3) according to SMART, the drive with IO error was fine (no reallocated
sectors, no warnings etc.)
So, scrub to the rescue - it printed the "broken" files; after removing them
manually, it was possible to finish "btrfs device delete missing".
Probably it makes sense to run scrub occasionally (just like mdraid is
doing on most distributions).
--
Tomasz Chmielewski
http://www.sslrack.com
* Re: Balance & scrub & defrag
From: Robert White @ 2014-12-12 13:32 UTC
To: Erkki Seppala, linux-btrfs
On 12/12/2014 01:17 AM, Erkki Seppala wrote:
> Robert White <rwhite@pobox.com> writes:
>
>> You need to buy better disks. 8-)
>
> Where can one buy these better disks with reasonable prices?-) Disks are
> best thought of as consumables.
A good disk is only about 9% more expensive. So like the WD "green"
disks were all cheap because they were (essentially) the disks that
didn't pass the full quality suite for the higher WD lines like "caviar".
"Inexpensive" and "Cheap" are not the same thing.
Disks are not best thought of as consumables unless the data you store
on them is discardable.
> Do you alternatively execute SMART self tests?
Indeed. If you install and activate SMART but you never run the tests
you've done another one of those half-measures I was talking about.
The "long offline" test reads 100% of the disk surface (well, up until
it hits an error anyway). But since none of that data has to leave the
disk controller and go out through the interface etc it doesn't bog the
rest of the system.
All but the oldest or cheapest drives have controllers that will "resume
the offline test after any command" so you do
smartctl --test=long /dev/sda # or whatever
every few days and you'll know when things start to get dicey.
The one thing you do have to be watchful of is that the tests _stop_
when they hit the first read error, so you do have to keep up with things.
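Keeping up with the tests can be sketched as: read the self-test log from the previous run, then queue the next long test. The helper name `smart_check_and_retest` is hypothetical and the device name is only an example.

```shell
#!/bin/bash
# Hypothetical helper: show the results of earlier self-tests, then start
# a new long test. The test itself runs inside the drive.
smart_check_and_retest() {
    local dev="$1"

    # Self-test log: look here for tests that stopped at a read error.
    smartctl -l selftest "$dev"

    # Queue the next long (full surface) self-test.
    smartctl -t long "$dev"
}

# Usage (as root), every few days:
#   smart_check_and_retest /dev/sda
```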
For instance I just had a pair of uncorrectable read errors. When I used
hdparm to write the sectors, however, the disk didn't need to relocate
the block(s) as bad. So it was some funky event on the disk itself.
Of course it's a very old disk (1525 days of power-on runtime) so two
correctable-with-overwrite read errors isn't bad.
* Re: Balance & scrub & defrag
From: Zygo Blaxell @ 2014-12-13 5:15 UTC
To: Erkki Seppala; +Cc: linux-btrfs
On Fri, Dec 12, 2014 at 11:17:58AM +0200, Erkki Seppala wrote:
> That may be sort of true, but I think even SMART is helped by the fact
> that the media is read through from the beginning to the end*, so it can
> detect even the errors that don't bubble through the IO layer. And BTRFS
> can indeed note errors that the media doesn't - two checksums is better
> than one checksum, assuming they aren't exactly the same algorithm ;).
>
> Do you alternatively execute SMART self tests?
>
> * scrub doesn't do this, it reads only through used data
I do both. They operate at different layers of the storage stack, and have
access to different information. They also have different (and hopefully
non-overlapping) bugs.
scrub pros:
+ can compare data with the other copies in RAID1 or DUP mode
+ can fix bad data when good copies available
+ slows down when other processes want to use the disk
+ can be suspended and resumed at will by software
+ error data is impervious to drive firmware bugs
+ straightforward error reports
+ only scans allocated data
scrub cons:
- only scans allocated data
- btrfs filesystems only
- CPU and I/O burden
- error sources are not localized: scrub errors could be software
bugs, bad RAM, bad CPU cooling, bad cabling, bad power supply,
or bad hard drive
smart pros:
+ runs in the background
+ no CPU or I/O required, just read results from previous run
and launch new test daily
+ access to electrical and mechanical data from the drive
that are otherwise unavailable to the host
+ 100% surface scan (including bad sector count)
+ logs host I/O errors that OS might miss
(e.g. because they occur during BIOS booting)
+ works with any filesystems, partitions, swap, etc.
+ error sources are localized to the drive in test
smart cons:
- buggy firmware does not detect or report error events when
significant failures occur
- buggy firmware does detect and report error events when
significant failures do not occur
- buggy firmware will make host accesses painfully slow during
scan (WD Green is very bad for this)
- firmware does not implement useful subset of SMART command set
- SMART command set can be inaccessible through some SATA bridge
chips (especially USB)
- cannot fix anything, only report quantities of data already lost
- cannot reliably detect RAM or CPU failure (on host or drive)
- requires the drive to spin for 1-2 continuous hours during test
- interpreting the raw data is a black art
Thread overview: 10+ messages
2014-12-10 22:15 Balance & scrub & defrag sys.syphus
2014-12-11 1:17 ` Robert White
2014-12-12 1:00 ` Russell Coker
2014-12-12 1:31 ` Robert White
2014-12-12 9:17 ` Erkki Seppala
2014-12-12 13:32 ` Robert White
2014-12-13 5:15 ` Zygo Blaxell
2014-12-11 8:33 ` Duncan
2014-12-12 4:32 ` Zygo Blaxell
-- strict thread matches above, loose matches on Subject: below --
2014-12-12 9:49 Tomasz Chmielewski