Re: Deleting large amounts of data causes system freeze due to OOM.

From: fdavidl073rnovn@tutanota.com
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: Qu Wenruo <wqu@suse.com>, Linux Btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Deleting large amounts of data causes system freeze due to OOM.
Date: Sat, 14 Oct 2023 00:28:22 +0200 (CEST)
Message-ID: <Ngf8uVZ--7-9@tutanota.com>
In-Reply-To: <d6fb2fd0-8c59-449c-a342-84eb908de969@gmx.com>


Sep 29, 2023, 01:02 by quwenruo.btrfs@gmx.com:

>
>
> On 2023/9/29 09:02, fdavidl073rnovn@tutanota.com wrote:
>
>>
>>
>> Sep 27, 2023, 04:53 by quwenruo.btrfs@gmx.com:
>>
>>>
>>> Compression is an easy way to create tons of small file extents
>>> (the limit of a compressed extent is only 128K).
>>>
>>> Furthermore, each file extent would need an in-memory structure (struct
>>> extent_map, for a debug kernel, it's 122 bytes) to cache the contents.
>>>
>>> Thus for an 8TiB file with all compressed file extents at their max size
>>> (pretty common if it's only for backup),
>>> we still have 64M file extents (8TiB / 128K).
>>>
>>> Just multiply that by 122 bytes, and you can see how this goes crazy.
>>>
>>> But still, if you're only deleting the file, the result shouldn't go
>>> this crazy, as deletion itself won't try to read the file extents, so
>>> no such cache is populated.
>>>
>>> However, as soon as we start doing read/write, the cache can grow very
>>> large, especially if you use compression, and it only gets released when
>>> the whole inode gets released from the kernel.
>>>
>>> On the other hand, if you use uncompressed data, the maximum file extent
>>> size is enlarged to 128M (a 1024x increase), thus a huge reduction in the
>>> number of extents.
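
The arithmetic above can be checked with a short sketch. It assumes binary units (1 TiB = 2^40 bytes), that every extent sits at its maximum size, and the 122-byte struct extent_map figure quoted for a debug kernel; the function name is hypothetical and only for illustration. Under these assumptions the compressed case comes to 2^26 (about 67 million) extents and roughly 7.6 GiB of extent_map, while the uncompressed case comes to 2^16 extents and a few MiB.

```python
# Back-of-the-envelope estimate of the extent_map cache for one large file.
KIB = 1024
MIB = 1024 * KIB
TIB = 1024 ** 4

# Per-extent size quoted in the thread for a debug kernel; other builds
# may differ, so treat this as an assumption.
EXTENT_MAP_BYTES = 122


def extent_map_estimate(file_size, max_extent_size, per_extent=EXTENT_MAP_BYTES):
    """Return (extent_count, cache_bytes) if every extent is at its max size."""
    extents = -(-file_size // max_extent_size)  # ceiling division
    return extents, extents * per_extent


compressed = extent_map_estimate(8 * TIB, 128 * KIB)    # 128K max per compressed extent
uncompressed = extent_map_estimate(8 * TIB, 128 * MIB)  # 128M max per uncompressed extent

for label, (count, size) in (("compressed", compressed),
                             ("uncompressed", uncompressed)):
    print(f"{label}: {count:,} extents, ~{size / MIB:,.1f} MiB of extent_map")
```

This is a worst case for a fully cached file; real files rarely have every extent at the maximum size, so actual numbers will vary.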
>>>
>>> In the long run I guess we need some way to release the extent_map when
>>> low on memory.
>>> But for now, I'm afraid I don't have a better suggestion than
>>> turning off compression and defragmenting the compressed files using a
>>> newer kernel (v6.2 and newer).
>>>
>>> In v6.2, there is a patch to prevent defrag from populating the extent
>>> map cache, thus it won't take all the memory just by defrag.
>>> And with all those files converted from compression, I believe the
>>> situation would be greatly improved.
>>>
>>> Thanks,
>>> Qu
>>>
>> The backup itself is gone and will need to be re-sent. If I'm understanding things properly, then by mounting the btrfs device for the backup without compression and enforcing send protocol version 1, it should be written uncompressed, which will avoid the issue, correct?
>>
>
> IIRC yes.
>
> The send stream only contains the decompressed content, thus as long as
> it's mounted without compression, the received data on-disk would not be
> compressed either.
>
>>
>> I was also looking at the source code and it seems relatively straightforward to change BTRFS_MAX_COMPRESSED and BTRFS_MAX_UNCOMPRESSED to SZ_128M or somewhere in between like SZ_8M. Do you have any thoughts on how well that might work?
>>
>
> The size is a trade-off between space wasted by COW and memory needed to
> decompress an extent.
>
> Remember even if we only need part of the compressed extent, we still
> need to decompress the whole extent.
> Imagine if we have to read 8 compressed extents at the same time, and
> BTRFS_MAX_COMPRESSED is 128M.
>
> So I'm afraid we cannot go super large on the value.
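
The trade-off described here can be put into rough numbers. This is only a sketch: it counts the decompressed output alone (one full extent per concurrent read) and ignores compressed input buffers and compressor workspace; the function name is hypothetical.

```python
# Rough worst case for memory needed to hold fully decompressed extents
# when several compressed extents are being read at once.
KIB = 1024
MIB = 1024 * KIB


def worst_case_decompress_bytes(concurrent_reads, max_extent_size):
    """Each read must decompress its whole extent, even for a partial read."""
    return concurrent_reads * max_extent_size


current = worst_case_decompress_bytes(8, 128 * KIB)   # today's 128K limit
proposed = worst_case_decompress_bytes(8, 128 * MIB)  # hypothetical 128M limit

print(f"128K limit: {current // MIB} MiB, 128M limit: {proposed // MIB} MiB")
```

With 8 concurrent reads the 128K limit needs about 1 MiB, while a 128M limit would need about 1 GiB.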
>
>>
>> Do you have any idea how complicated the long-term fix is or when it might be added? v6.8 maybe?
>>
>
> At least not near term, I'm not aware of any ongoing project related to
> this.
>
> Thanks,
> Qu
>
>>
>> Thank you for your prompt responses. Sending the backup again will take some days but I will email you to tell you if disabling compression fixes the issue.
>>
>> Sincerely,
>> David
>>
To follow up on this, I was able to transfer my backup and then both make and delete snapshots of it without running out of memory. I will update my ticket on their bug tracker, and I think there should be a warning about this in the documentation.

Is there anything else I can do to make sure this is addressed at some point? I would like to eventually be able to re-enable compression as it was saving me several terabytes.

Sincerely,
David
