From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: umount waiting for 12 hours and still running
Date: Tue, 5 Nov 2013 18:21:15 +0000 (UTC)
Message-ID: <pan$eb9c$5cdff220$5f3b344b$1690c54d@cox.net>
In-Reply-To: <loom.20131105T170351-783@post.gmane.org>
John Goerzen posted on Tue, 05 Nov 2013 16:11:56 +0000 as excerpted:
> Duncan <1i5t5.duncan <at> cox.net> writes:
>
>
>> John Goerzen posted on Tue, 05 Nov 2013 07:42:02 -0600 as excerpted:
>>
>> > The filesystem in question involves two 2TB USB hard drives. It is
>> > 49% full. Data is RAID0, metadata is RAID1. The files stored on it
>> > are for BackupPC, meaning there are many, many directories and
>> > hardlinks. I would estimate 30 million inodes in use and many of
>> > them have dozens of hardlinks to them.
>>
>> That's a bit of a problem for btrfs at this point, as you rightly
>> mention.
> Can you clarify a bit about what sort of problems I might expect to
> encounter with this sort of setup on btrfs?
I'm not a dev, nor do I run that sort of setup, so I won't attempt a lot
of detail. This is admittedly a bit handwavy; if you need more, use it
as a starting point for your own research.
That out of the way: having followed the list for a while, I've seen
several reports of complications with high hardlink counts, much like
yours, mostly unresponsive-for-N-seconds warnings, inordinately long
unmount times, and the like.
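If you want a rough picture of just how hardlink-heavy a tree like that
actually is, a quick walk-and-tally script along these lines would do it.
Just a sketch, untested as pasted here, so treat the details as
illustrative; point it at the BackupPC pool or wherever:

import os
import sys
from collections import Counter

def link_profile(root):
    seen = {}                                  # (st_dev, st_ino) -> st_nlink
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue                       # vanished or unreadable; skip
            seen[(st.st_dev, st.st_ino)] = st.st_nlink
    return Counter(seen.values())              # link count -> number of inodes

if __name__ == "__main__":
    hist = link_profile(sys.argv[1] if len(sys.argv) > 1 else ".")
    for nlink, count in sorted(hist.items()):
        print("%10d inodes with %d link(s)" % (count, nlink))

Dozens of links per inode across tens of millions of inodes, which is what
you're estimating, is exactly the territory those reports come from.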
Additionally, it's worth noting that until relatively recently (the wiki
changelog page says 3.7), btrfs had a rather low limit on the number of
hardlinks to a single file within a single directory, which people using
btrfs for hardlink-intensive purposes kept hitting. A developer could give
you more details, but IIRC the fix that lifted that limit, while it /did/
give btrfs the ability to handle large link counts, left the first few
hardlinks stored inline and thus reasonably fast, with anything beyond
that falling back to an indirect referencing scheme (extended inode refs)
that is rather less efficient.
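To make that concrete, the old limit showed up as plain EMLINK errors from
link(2) once you piled enough links to one file into one directory.
Something like the sketch below (illustrative only, not something I've run,
and don't point it at a filesystem you care about) would stop somewhere in
the tens-to-hundreds of links on a pre-3.7 btrfs, the exact count depending
on file-name length, while on 3.7+ it just keeps going:

import errno
import os
import sys

def link_until_refused(directory, max_links=100000):
    target = os.path.join(directory, "target")
    open(target, "w").close()                  # the file we keep linking to
    made = 0
    for i in range(max_links):
        try:
            os.link(target, os.path.join(directory, "link-%06d" % i))
            made += 1
        except OSError as e:
            if e.errno == errno.EMLINK:
                print("hit the hardlink limit after %d links" % made)
                return made
            raise
    print("made %d links without hitting a limit" % made)
    return made

if __name__ == "__main__":
    link_until_refused(sys.argv[1] if len(sys.argv) > 1 else ".")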
I'd guess btrfs' current problems in that regard are thus two-fold: first,
above a certain link count the implementation /does/ get less efficient;
and second, given the relatively recent kernel 3.7 implementation, the
many-hardlinks code hasn't had nearly the time to shake out bugs and pick
up the incremental optimizations that the more basic code has had. I doubt
btrfs will ever be a speed demon in this area, but I expect that given
another year or so, the high-count hardlink code will be somewhat better
optimized and tested, simply from the incremental effect of bug shakeout
and small code changes as btrfs continues maturing.
Meanwhile, my own interest in btrfs is as a filesystem for SSDs (I still
use reiserfs on my spinning rust, and I've had very good luck with it even
through various shoddy-hardware experiences since the ordered-by-default
code went in around 2.6.16, IIRC, but its journaling isn't well suited to
SSDs), and in actually being able to use btrfs' data checksumming and
integrity features, which means raid1 or raid10 mode (raid1 in my case).
The speed of SSDs mitigates, to a large degree, the slowness I see others
reporting for this and other cases.

Additionally, I run several independent smaller partitions, so if there
/is/ a problem the damage is contained. That means I'm typically dealing
with double-digit gigs per partition at most, which cuts full-partition
scrub and rebalance times from the hours to days I see people reporting
on-list for multi-terabyte spinning rust down to seconds, perhaps a couple
of minutes, here. The time is short enough that I typically use the
don't-background option and run the scrub/balance in real time, waiting
for the result.

Needless to say, if a full balance is going to take days you don't run it
very often, but since it's only a couple of minutes here, I scrub and
balance reasonably frequently, say after a bad shutdown (I use
suspend-to-ram, and sometimes on resume the SSDs don't stabilize fast
enough for the kernel, so a device drops from the btrfs raid1 and the
whole system goes unstable after that, often leading to a bad shutdown
and reboot). Since a full balance rewrites everything to new chunks, that
tends to limit bitrot and the chance for errors to build up over time.
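For the curious, the whole routine amounts to little more than wrapping
"btrfs scrub start -B" and "btrfs balance start" in a loop over the
mountpoints, something like the sketch below (the mountpoints are made-up
examples, it needs btrfs-progs and root, and again it's untested as
pasted):

import subprocess
import sys

MOUNTS = ["/", "/home", "/var/log"]            # hypothetical small btrfs mounts

def maintain(mounts):
    for mnt in mounts:
        # -B keeps the scrub in the foreground so errors show up immediately.
        subprocess.check_call(["btrfs", "scrub", "start", "-B", mnt])
        # A full balance rewrites every chunk; cheap when the fs is a few GiB.
        subprocess.check_call(["btrfs", "balance", "start", mnt])

if __name__ == "__main__":
    maintain(sys.argv[1:] or MOUNTS)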
My point being that my particular use-case is pretty much diametrically
opposite yours! For your backups use-case, I'd probably use something
less experimental than btrfs, like xfs or ext4 with ordered journaling...
or the reiserfs I still use on spinning rust, though people's experience
with it seems to be either really good or really bad, and while mine is
definitely good, that doesn't mean yours will be.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman