linux-ext4.vger.kernel.org archive mirror
From: Theodore Ts'o <tytso@mit.edu>
To: "Joseph D. Wagner" <joe@josephdwagner.info>
Cc: linux-ext4@vger.kernel.org
Subject: Re: dump ext4 performance degrades linearly as disk fills
Date: Mon, 16 Jun 2014 08:42:29 -0400
Message-ID: <20140616124229.GA8465@thunk.org>
In-Reply-To: <539E8401.5000607@josephdwagner.info>

On Sun, Jun 15, 2014 at 10:43:29PM -0700, Joseph D. Wagner wrote:
> Background:
> - I use lvm snapshots on ext4 for backup.  I use dump to backup the
> snapshots.  The backup goes to an external hard drive over usb 3.0. The
> external hard drive has 1 partition formatted with ext4.
>
> My Thoughts So Far:
> - I suspect that either 1) dump is doing something which lowers performance
> as the backup progresses, or 2) the ext4 algorithm for finding and
> allocating free blocks is vulnerable to performance degradation as the
> volume fills.
> 
> - I haven't tested this thoroughly.  However, performance appears to improve
> when I clear out the external drive and do a fresh, full dump (-0), and
> performance appears to remain degraded on incremental backups on a nearly
> full volume.  This leads me to suspect #2.

The issue is that when the external disk is freshly mounted, we don't
have any of the block allocation bitmaps cached.  We also cache, at run
time, information about the largest contiguous run of free blocks in
each block group.  On a freshly mounted file system we don't have any
of this information.

So it's a known issue that on a freshly mounted file system,
allocation performance is bad for a little while until we have more
information cached.  It's not something we've really worked on trying
to improve, but there are a number of things we can do.  In
particular, with a native ext4 file system (as opposed to an ext3 file
system which was upgraded to ext4), the block allocation bitmaps are
much more contiguous.  So one of the things we could do is to readahead
a chunk of allocation bitmaps, so we avoid a whole series of 4k random
reads.
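
To put rough numbers on that series of reads, here is a back-of-the-envelope
sketch.  It assumes 4 KiB blocks and one bitmap block per block group, with
each group covering 8 * block_size blocks (the usual mke2fs default); the
function name bitmap_stats is made up for illustration, and a given file
system may be laid out differently.

```shell
# Estimate how many block bitmaps a file system has, and their total size.
# Assumptions (not from this thread): one bitmap block per block group,
# each group spanning 8 * block_size blocks.
bitmap_stats() {
    fs_bytes=$1
    block_size=${2:-4096}
    blocks_per_group=$((block_size * 8))
    group_bytes=$((blocks_per_group * block_size))
    # Round up: a partial last group still has its own bitmap block.
    groups=$(( (fs_bytes + group_bytes - 1) / group_bytes ))
    # Print "<number of bitmap blocks> <total bitmap bytes>".
    echo "$groups $((groups * block_size))"
}
```

Under those assumptions, "bitmap_stats $((1024 * 1024 * 1024 * 1024))" prints
"8192 33554432": a 1 TiB file system has 8192 bitmap blocks, 32 MiB in
total, and in the worst case each one is fetched as a separate 4k read.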

> - What steps can I take to isolate the cause of the problem?  If there's any
> information I can provide, please let me know.

If you run dumpe2fs on the file system and send us the output, we can
probably confirm this pretty quickly.  The e2freefrag program can also
show us how fragmented the free space is, but I'm pretty sure that's
not the problem.
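
One way to capture both reports for sending to the list might look like the
sketch below.  The function name collect_ext4_info and the output file
names are placeholders, not anything the tools require; both programs
normally need read access to the device, so expect to run this as root.

```shell
# Save dumpe2fs and e2freefrag output for a device into a directory.
# Usage: collect_ext4_info DEVICE [OUTDIR]
collect_ext4_info() {
    dev=$1
    outdir=${2:-.}
    mkdir -p "$outdir" || return 1
    # Keep stderr in separate files so warnings do not mix into the reports.
    dumpe2fs "$dev" > "$outdir/dumpe2fs.txt" 2> "$outdir/dumpe2fs.err" || return 1
    e2freefrag "$dev" > "$outdir/e2freefrag.txt" 2> "$outdir/e2freefrag.err" || return 1
}
```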

Something that might help is simply running "dumpe2fs /dev/sdXX >
/dev/null" or "e2freefrag /dev/sdXX > /dev/null" after you mount the
file system and before you kick off the backup.  This will load all of
the block allocation bitmaps into the buffer cache, and the libext2fs
functions used by dumpe2fs and e2freefrag will do so much more
efficiently than the kernel code will as it demand-loads the bitmap
blocks.
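
Wrapped up as a step in a backup script, that warm-up could be sketched
roughly as follows.  The function name warm_bitmaps is invented here, and
the existence check is only a convenience; the one essential line is the
dumpe2fs call with its output thrown away.

```shell
# Pre-load the block allocation bitmaps for a mounted ext4 device.
# Usage: warm_bitmaps DEVICE   (run after mount, before the backup starts)
warm_bitmaps() {
    dev=$1
    if [ ! -e "$dev" ]; then
        echo "warm_bitmaps: no such device: $dev" >&2
        return 1
    fi
    # The report itself is discarded; the useful side effect is that the
    # bitmap blocks get read and end up in the buffer cache.
    dumpe2fs "$dev" > /dev/null 2>&1
}
```

Calling "warm_bitmaps /dev/sdXX" right after the mount and before dump
starts is the intended use; whether it helps in a given setup is
something to measure rather than assume.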

Hope this helps!

					- Ted
