From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Kai Krakow <hurikhan77@gmail.com>, linux-btrfs@vger.kernel.org
Subject: Re: 4.11.1: cannot btrfs check --repair a filesystem, causes heavy memory stalls
Date: Wed, 24 May 2017 07:57:30 -0400	[thread overview]
Message-ID: <0a689ab1-739c-a35c-8e7c-23e9395ad7be@gmail.com> (raw)
In-Reply-To: <20170523203207.6fb276c4@jupiter.sol.kaishome.de>

On 2017-05-23 14:32, Kai Krakow wrote:
> On Tue, 23 May 2017 07:21:33 -0400, "Austin S. Hemmelgarn"
> <ahferroin7@gmail.com> wrote:
> 
>> On 2017-05-22 22:07, Chris Murphy wrote:
>>> On Mon, May 22, 2017 at 5:57 PM, Marc MERLIN <marc@merlins.org>
>>> wrote:
>>>> On Mon, May 22, 2017 at 05:26:25PM -0600, Chris Murphy wrote:
>>   [...]
>>   [...]
>>   [...]
>>>>
>>>> Oh, swap will work, you're sure?
>>>> I already have an SSD; if that's good enough, I can give it a
>>>> shot.
>>>
>>> Yeah although I have no idea how much swap is needed for it to
>>> succeed. I'm not sure what the relationship between fs metadata
>>> chunk size and the btrfs check RAM requirement is; but if it wants
>>> all of the
>>> metadata in RAM, then whatever btrfs fi us shows you for metadata
>>> may be a guide (?) for how much memory it's going to want.
>> I think the in-memory storage is a bit more space-efficient than the
>> on-disk storage, but I'm not certain, and I'm pretty sure it takes up
>> more space when it's actually repairing things.  If I'm doing the
>> math correctly, you _may_ need up to 50% _more_ than the total
>> metadata size for the FS in virtual memory space.
>>>
>>> Another possibility is zswap, which still requires a backing device,
>>> but it might be able to limit how much swap to disk is needed if the
>>> data to swap out is highly compressible. *shrug*
>>>   
>> zswap won't help in that respect, but it might make swapping stuff
>> back in faster.  It just keeps a compressed copy in memory in
>> parallel to writing the full copy out to disk, then uses that
>> compressed copy to swap in instead of going to disk if the copy is
>> still in memory (but it will discard the compressed copies if memory
>> gets really low).  In essence, it reduces the impact of swapping when
>> memory pressure is moderate (the situation for most desktops for
>> example), but becomes almost useless when you have very high memory
>> pressure (which is what describes this usage).
> 
> Is this really how zswap works?
OK, looking at the documentation, you're correct: my assumption, based 
on the description of the front-end (frontswap) and on how the other 
back-end (the Xen transcendent memory driver) appears to behave, was 
wrong. However, given how zswap does behave, I can't see how it would 
ever be useful with the default kernel settings, since without manual 
configuration the kernel won't try to swap until memory pressure is 
already pretty high, at which point zswap is unlikely to have much 
impact.
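For anyone who wants to experiment with it anyway, the relevant knobs
are the zswap module parameters and the swappiness sysctl (the values
below are examples, not recommendations):

    # Enable zswap at runtime (or boot with zswap.enabled=1):
    echo 1 > /sys/module/zswap/parameters/enabled

    # Cap the compressed pool at a percentage of RAM (20 is the
    # default):
    echo 20 > /sys/module/zswap/parameters/max_pool_percent

    # Raise swappiness from the default of 60 so the kernel starts
    # swapping before memory pressure is already critical:
    sysctl vm.swappiness=100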
> 
> I always thought it acts as a compressed write-back cache in front of
> the swap devices. Pages first go to zswap compressed, and later
> write-back kicks in and migrates those compressed pages to real swap,
> but still compressed. This is done by zswap putting two (or up to three
> in modern kernels) compressed pages into one page. It has the downside
> of uncompressing all "buddy pages" when only one is needed back in. But
> it stays compressed. This also tells me zswap will achieve either
> around a 1:2 or 1:3 effective compression ratio or none at all. So
> it cannot be compared to how streaming compression works.
> 
> OTOH, if the page is reloaded from cache before write-back kicks in, it
> will never be written to swap but just uncompressed and discarded from
> the cache.
> 
> Under high memory pressure it doesn't really work that well due to high
> CPU overhead if pages constantly swap out, compress, write, read,
> uncompress, swap in... This usually results in very low CPU usage for
> processes but high IO and disk wait and high kernel CPU usage. But it
> defers memory pressure conditions to a little later in exchange for
> a little more IO usage and more CPU usage. If you have a lot of
> inactive memory around, it can make a difference. But it becomes
> counterproductive if almost all your memory is active and pressure
> is high.
> 
> So, in this scenario, it probably still doesn't help.


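For completeness, if anyone does try zswap for this, the debugfs
counters give a rough picture of whether it is actually helping
(assuming debugfs is mounted at the usual location):

    # Pages currently held compressed, and the pool's real size:
    cat /sys/kernel/debug/zswap/stored_pages
    cat /sys/kernel/debug/zswap/pool_total_size

    # The effective ratio is roughly stored_pages * 4096 divided
    # by pool_total_size; close to 2 means the 1:2 packing
    # described above is working.

    # Pages rejected because they compressed poorly:
    cat /sys/kernel/debug/zswap/reject_compress_poor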