From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: btrfs-transaction blocked for more than 120 seconds
Date: Fri, 3 Jan 2014 09:18:51 +0000 (UTC)
Message-ID: <pan$ce611$d2dfc4df$982de663$6d9e3ddf@cox.net>
In-Reply-To: h2ghpa-u74.ln1@hurikhan77.spdns.de

Kai Krakow posted on Fri, 03 Jan 2014 02:24:01 +0100 as excerpted:

> Duncan <1i5t5.duncan@cox.net> wrote:
> 
>> But because a full balance rewrites everything anyway, it'll
>> effectively defrag too.
> 
> Is that really true? I thought it just rewrites each distinct extent and
> shuffles chunks around... This would mean it does not merge extents
> together.

I'm not a coder, and the devs are free to correct me if I'm wrong, but...

A full balance rewrites all chunks, merging data (or metadata) into fewer 
chunks where possible, eliminating the then-unused chunks and returning 
the space they took to the unallocated pool.  (There are now filter 
options allowing one to balance only data, only metadata, or for that 
matter only system chunks, or to rebalance only chunks less than 10% used 
or only those not yet converted to a new raid level, but we're talking 
about a full balance here.)
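
For reference, the filter options mentioned above can be sketched as 
command lines.  This is only a minimal sketch that builds (and does not 
run) `btrfs balance start` invocations; the helper name balance_argv and 
the /mnt mountpoint are made up for illustration, and the filter strings 
(usage=, convert=, soft) are as I understand them from the btrfs-balance 
manpage, so check the manpage for your btrfs-progs version before relying 
on them.

```python
def balance_argv(mountpoint, data=None, metadata=None, system=None):
    """Build (but do not run) a `btrfs balance start` command line.

    data/metadata/system are each either None (do not select that chunk
    type) or a list of filter strings such as ["usage=10"] or
    ["convert=raid1", "soft"].  An empty list selects the chunk type
    with no filter.  With all three left as None the result is a plain,
    unfiltered (full) balance.
    """
    argv = ["btrfs", "balance", "start"]
    for flag, filters in (("-d", data), ("-m", metadata), ("-s", system)):
        if filters is None:
            continue
        # Filters attach directly to the flag, comma-separated: -dusage=10
        argv.append(flag + ",".join(filters))
    if system is not None:
        # Assumption from the manpage: selecting system chunks needs -f.
        argv.insert(3, "-f")
    argv.append(mountpoint)
    return argv


if __name__ == "__main__":
    # Full balance, the case discussed in this mail.
    print(" ".join(balance_argv("/mnt")))
    # Only data chunks that are less than 10% used.
    print(" ".join(balance_argv("/mnt", data=["usage=10"])))
    # Only chunks not yet converted to the new raid level.
    print(" ".join(balance_argv("/mnt",
                                data=["convert=raid1", "soft"],
                                metadata=["convert=raid1", "soft"])))
```

Building the argument list separately from running it makes it easy to 
eyeball the command before handing it to something like subprocess.run() 
as root.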

Given that everything is being rewritten anyway, a process that can take 
hours or even days on multi-terabyte spinning rust filesystems, /not/ 
doing a file defrag as part of the process would be stupid.

So doing a separate defrag and balance isn't necessary.  And while we're 
at it, doing a separate scrub and balance isn't necessary either, for the 
same reason.  (If one copy of the data is invalid and there's another, 
valid copy, the valid one is used for the rewrite, re-duplicated if 
necessary, and the invalid copy is simply erased.  If there's no valid 
copy, the balance will report errors.  I believe the chunks containing 
the bad data are then simply not rewritten at all, tho it's possible that 
the valid data in them is rewritten, leaving only the bad data behind; 
I'm not sure which.  Either way, the admin can then try other tools to 
clean up or recover from the damage as necessary.)

That's one reason a balance can take so much longer than a straight 
sequential read/write of the data would suggest: it's doing all that 
extra work behind the scenes as well.

I'm not sure that it defrags across chunks, tho, particularly if a file's 
fragments are spread across so many chunks that they haven't all been 
processed by the time a written chunk is full and the balance moves on to 
the next one.  However, given that data chunks are 1 GiB in size, that 
should still cut a multi-thousand-extent file down to perhaps a few dozen 
extents, one per rewritten chunk.
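
To put a number on that, here's a minimal back-of-the-envelope sketch.  
It only models my assumption above (at best one contiguous piece of the 
file per 1 GiB data chunk); it doesn't query a real filesystem and says 
nothing about what the balance code actually does.  The function name is 
made up for illustration.

```python
GIB = 1024 ** 3  # data chunk size assumed above: 1 GiB


def best_case_extents(file_size_bytes, chunk_size_bytes=GIB):
    """Fewest extents a file could have if it ends up as one contiguous
    piece per chunk, i.e. ceil(file size / chunk size)."""
    if file_size_bytes <= 0:
        return 0
    # Integer ceiling division, avoiding float rounding on huge files.
    return -(-file_size_bytes // chunk_size_bytes)


if __name__ == "__main__":
    # Hypothetical 30 GiB VM image that started out heavily fragmented.
    print(best_case_extents(30 * GIB))
```

So under that assumption a 30 GiB file would bottom out at 30 extents, 
which is where the "few dozen" figure comes from.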

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

