From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: Rapid memory exhaustion during normal operation
Date: Wed, 29 Jan 2014 01:55:28 +0000 (UTC)
Message-ID: <pan$4117f$37c109bd$5e86c97$a007d311@cox.net>
In-Reply-To: 52E430F7.7030801@gmail.com
Dan Merillat posted on Sat, 25 Jan 2014 16:47:35 -0500 as excerpted:
> I'm trying to track this down - this started happening without changing
> the kernel in use, so probably a corrupted filesystem. The symptoms are
> that all memory is suddenly used by no apparent source. OOM killer is
> invoked on every task, still can't free up enough memory to continue.
>
> When it goes wrong, it's extremely rapid - system goes from stable to
> dead in less than 30 seconds.
>
> Tested 3.9.0, 3.12.0, 3.12.8. Limited testing on 3.13 shows I think
> the same problem but I need to double-check that it's not a different
> issue. Blows up the exact same way on a real kernel or in UML.
>
> All sorts of things can trigger it - defrag, random writes to files.
> Balance and scrub don't, readonly mount doesn't.
>
> I can reproduce this trivially, mount the filesystem read-write and
> perform some activity. It only takes a few minutes. The other btrfs
> filesystems on the same machine don't show similar problems.
I was hoping someone with a bit more expertise in the area would reply to
this, but if they did, I missed it, and I had kept this marked unread to
reply to after the weekend if nobody better qualified replied first. So
here it is... sorry it took so long (I've been on the other end myself),
but under the circumstances...
Two possibilities I'm aware of.
The one that best matches the outlined circumstances is qgroups. Are you
using quotas/qgroups on that filesystem? There's some weird corner-cases
with them still, including negative quotas after subvolume delete and
apparently qgroup-triggered runaway memory usage as reported here, that
remain a problem. I see patches addressing various bits going by on the
list, but I've been steering a wide course around any potential qgroups
usage here in part because of the scary reports I keep seeing onlist, and
would recommend others not directly involved in qgroup development and
testing do the same for now. So if you can avoid qgroups on your btrfs
deployments, do so for now. If your use-case NEEDS quota/qgroup
functionality, then I'd recommend using something other than btrfs for
the time being, perhaps with a reexamination scheduled in a year, as
hopefully the qgroup bugs will be worked through by then and it'll be
reasonably stable functionality, something I'd definitely NOT
characterize qgroups as, ATM.
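
For anyone unsure whether quotas are even enabled on the affected
filesystem, here is a minimal sketch of one way to check, in Python. It
assumes btrfs-progs is installed and that "btrfs qgroup show" exits
non-zero when quotas are off. The function names and the /mnt/data
mountpoint are made up for illustration. A non-zero exit can also mean
insufficient privileges or a path that isn't btrfs at all, so treat a
False result as "not confirmed" rather than as proof.

```python
import shutil
import subprocess


def qgroup_check_command(mountpoint):
    """Build the command that lists qgroups on a mounted btrfs."""
    return ["btrfs", "qgroup", "show", mountpoint]


def quota_disable_command(mountpoint):
    """Build the command that turns quota/qgroup tracking off."""
    return ["btrfs", "quota", "disable", mountpoint]


def quotas_enabled(mountpoint):
    """Return True if listing qgroups succeeds, False if it fails,
    or None if the btrfs tool is not installed.

    Assumption: a failing 'btrfs qgroup show' usually means quotas
    are off, but it can also mean missing privileges or a path that
    is not on btrfs.
    """
    if shutil.which("btrfs") is None:
        return None
    result = subprocess.run(
        qgroup_check_command(mountpoint),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical mountpoint; substitute the affected filesystem.
    print(quotas_enabled("/mnt/data"))
```

If that reports quotas on and you don't actually need them, running the
command built by quota_disable_command() (as root) should turn them off,
after which it would be worth retesting the workload that triggers the
memory blowup.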
The other possibility I'm aware of, though a less close match, is the
large (half-gig plus) internally-rewritten file case, with VM images,
large database files and pre-allocated-then-written files such as
bittorrent clients often create being prime examples. Ideally these
should be located in a directory with the NOCOW attribute (chattr +C)
set on the directory BEFORE the files are created and written into, so
they inherit it. There are currently reported problems, sometimes
reaching pathological degree, with these files if NOT properly marked
NOCOW, but the biggest trigger there appears to be extreme snapshotting
(thousand-plus) in addition to the large internally-rewritten files, and
the bottleneck is reported to be CPU, not IO or memory. Additionally,
balance will trigger that issue too, and you're saying it doesn't for
you, so I'd say this isn't likely to be your particular problem ATM. I'm
mostly just throwing it in in case you're not using qgroups, so the
above can't be your issue, and as a heads-up on something to be on the
lookout for.
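
As a rough sketch of the NOCOW setup described above, again in Python:
create the directory first, set +C on it while it's still empty, then
create the big files inside it so they inherit the attribute. The
function name, the dry_run flag and the /srv/vm-images path are all
hypothetical, and chattr/lsattr are assumed to be the usual e2fsprogs
tools.

```python
import os
import subprocess


def prepare_nocow_dir(path, dry_run=True):
    """Return the commands that mark a directory NOCOW and verify it.

    With dry_run=False the directory is created if needed and the
    commands are executed. New files created in the directory
    afterwards inherit the attribute.
    """
    commands = [
        ["chattr", "+C", path],   # set the No_COW attribute
        ["lsattr", "-d", path],   # show the directory's own attributes
    ]
    if not dry_run:
        os.makedirs(path, exist_ok=True)
        for cmd in commands:
            # check=True raises CalledProcessError if a command fails,
            # e.g. on a filesystem that does not support +C.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical location for VM images; dry run only prints commands.
    for cmd in prepare_nocow_dir("/srv/vm-images"):
        print(" ".join(cmd))
```

Setting +C on a file that already has data isn't reliable, which is why
the order matters: directory first, files afterwards. Existing files
would have to be copied (not reflinked or moved) into the new directory
to pick the attribute up.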
If you're using qgroups, I'd consider that the 90+% likely culprit.
They're Just. Not. Ready.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman