From: Austin S Hemmelgarn <ahferroin7@gmail.com>
To: Mitch Fossen <msfossen@gmail.com>, linux-btrfs@vger.kernel.org
Subject: Re: btrfs: poor performance on deleting many large files
Date: Mon, 23 Nov 2015 07:59:25 -0500
Message-ID: <56530DAD.9080607@gmail.com>
In-Reply-To: <CA+ve2MYBAPbLPiX4i2oZeDeu+9=JurXHsx5fMef2iV3rRrCKxg@mail.gmail.com>

On 2015-11-22 20:43, Mitch Fossen wrote:
> Hi all,
>
> I have a btrfs setup of 4x2TB HDDs for /home in btrfs RAID0 on Ubuntu
> 15.10 (kernel 4.2) and btrfs-progs 4.3.1. Root is on a separate SSD
> also running btrfs.
>
> About 6 people use it via ssh and run simulations. One of these
> simulations generates a lot of intermediate data that can be discarded
> after it is run, it usually ends up being around 100GB to 300GB spread
> across dozens of files 500M to 5GB apiece.
>
> The problem is that, when it comes time to do a "rm -rf
> ~/working_directory" the entire machine locks up and sporadically
> allows other IO requests to go through, with a 5 to 10 minute delay
> before other requests seem to be served. It can end up taking half an
> hour or more to fully remove the offending directory, with the hangs
> happening frequently enough to be frustrating. This didn't seem to
> happen when the system was using ext4 on LVM.
Based on this description, this sounds to me like an issue with 
fragmentation.
>
> Is there a way to fix this performance issue or at least mitigate it?
> Would using ionice and the CFQ scheduler help? As far as I know Ubuntu
> uses deadline by default which ignores ionice values.
This depends on a number of factors.  If you are on a new enough kernel,
you may actually be using the blk-mq code path instead of one of the
traditional I/O schedulers.  That path does honor ionice values, and is
generally a lot better than CFQ or deadline at actual fairness and
performance.  If you aren't running on that code path, then whether
deadline or CFQ is better is pretty hard to determine.  In general, CFQ
needs some serious effort and benchmarking to get reasonable performance
out of it.  CFQ can beat deadline when properly tuned to the workload,
except on really small rotational media (smaller than 32G or so) or when
you absolutely need deterministic scheduling.  When you don't take the
time to tune CFQ, deadline is usually better, except on SSDs, where CFQ
is generally better than deadline even without tuning.
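
Before tuning anything, it helps to confirm which scheduler each disk is
actually using.  The kernel lists the available schedulers in
/sys/block/<device>/queue/scheduler and marks the active one with square
brackets.  The following is a minimal sketch of reading that file; the
function names and the device names passed on the command line are
illustrative assumptions, not part of any existing tool.

```python
from pathlib import Path


def active_scheduler(sysfs_text: str) -> str:
    """Return the scheduler name shown in [brackets].

    Falls back to "none" when nothing is bracketed, which is what a
    queue without a selectable scheduler typically reports.
    """
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return "none"


def scheduler_for(device: str) -> str:
    """Read the active scheduler for a block device such as "sda"."""
    text = Path(f"/sys/block/{device}/queue/scheduler").read_text()
    return active_scheduler(text)


if __name__ == "__main__":
    import sys

    # Device names (e.g. sda sdb) are supplied by the caller.
    for dev in sys.argv[1:]:
        print(dev, scheduler_for(dev))
```

For a four-disk array this would be run once per member device, since
the scheduler is a per-device setting.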
>
> Alternatively, would balancing and defragging data more often help?
> The current mount options are compress=lzo and space_cache, and I will
> try it with autodefrag enabled as well to see if that helps.
Balance is not likely to help much, but defragmentation might.  I would
suggest running the defrag when nobody else is actively using the
filesystem, as it will likely cause a severe drop in performance the
first time it's run.  Autodefrag might help, but it may also make
performance worse while writing the files in the first place.  You might
also try compress=none: depending on your storage hardware, in-line
compression can actually make things go significantly slower (I see this
a lot with SSDs, and also with some high-end storage controllers, and
especially when dealing with large data-sets that aren't very
compressible).
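
One rough way to judge whether the intermediate data is compressible at
all is to compress a sample of a file and look at the size ratio.  The
sketch below uses zlib from the Python standard library purely as a
stand-in, since LZO is not in the standard library; the ratios will
differ from what btrfs achieves with compress=lzo, so treat the result
only as a hint.  The function names and the 1 MiB sample size are
assumptions made for illustration.

```python
import zlib


def compression_ratio(sample: bytes, level: int = 1) -> float:
    """Return compressed size divided by original size.

    Values near (or above) 1.0 suggest the data barely compresses.
    zlib is a stand-in here, not the LZO algorithm btrfs would use.
    """
    if not sample:
        return 1.0
    return len(zlib.compress(sample, level)) / len(sample)


def sample_file(path: str, size: int = 1 << 20) -> bytes:
    """Read up to `size` bytes from the start of a file."""
    with open(path, "rb") as f:
        return f.read(size)


if __name__ == "__main__":
    import sys

    # File paths are supplied by the caller.
    for path in sys.argv[1:]:
        print(f"{path}: {compression_ratio(sample_file(path)):.2f}")
```

If the ratio stays close to 1.0 across typical output files, that is
consistent with the case described above where compression mostly adds
overhead.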
>
> For now I think I'll recommend that everyone use subvolumes for these
> runs and then enable user_subvol_rm_allowed.
As Duncan said, this is probably the best option short term.  It is
worth noting, however, that removing a subvolume still has some overhead
(which appears to scale linearly with the amount of data in the
subvolume).  This overhead isn't likely to be an issue unless a bunch of
subvolumes get removed in bulk.
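
The per-run subvolume workflow could be wrapped roughly as follows.
This is only a sketch: it builds the standard `btrfs subvolume create`
and `btrfs subvolume delete` command lines, and the helper names and the
example path are hypothetical.  For non-root users, the delete step
relies on the filesystem being mounted with user_subvol_rm_allowed, as
mentioned above.

```python
import subprocess
from typing import List


def create_cmd(path: str) -> List[str]:
    """Command line that creates a subvolume at `path`."""
    return ["btrfs", "subvolume", "create", path]


def delete_cmd(path: str) -> List[str]:
    """Command line that deletes the subvolume at `path`."""
    return ["btrfs", "subvolume", "delete", path]


def run(cmd: List[str], dry_run: bool = True) -> None:
    """Print the command, and execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical working directory for one simulation run.
    workdir = "/home/user/working_directory"
    run(create_cmd(workdir))   # before the simulation starts
    run(delete_cmd(workdir))   # instead of rm -rf afterwards
```

The dry-run default only prints the commands, so the script can be
checked before it touches the filesystem.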