linux-btrfs.vger.kernel.org archive mirror
From: fdavidl073rnovn@tutanota.com
To: Linux Btrfs <linux-btrfs@vger.kernel.org>
Subject: Deleting large amounts of data causes system freeze due to OOM.
Date: Wed, 13 Sep 2023 04:28:47 +0200 (CEST)
Message-ID: <NeBMdyL--3-9@tutanota.com>

Dear Btrfs Mailing List,

Full disclosure: I reported this on kernel.org, but I am hoping to get more exposure on the mailing list.

When I delete several terabytes of data, memory usage increases until the system becomes entirely unresponsive. This has been an issue for several kernel versions, since at least 5.19, and continues to be an issue up to 6.5.2-artix1-1. This is on an older computer with several hard drives, eight gigabytes of memory, and a four-core x86_64 CPU. Slabtop output right before the system becomes unresponsive shows about four gigabytes used by khugepaged_mm_slot and three gigabytes used by btrfs_extent_map. This happens over the span of a couple of minutes, and during this time btrfs-transaction is using a moderate amount of CPU time.
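
To make the slab growth easier to capture before the machine freezes, a minimal Python sketch along the following lines could log the largest slab caches. It assumes the standard /proc/slabinfo version 2.1 layout and approximates each cache's size as num_objs * objsize, which is not exactly how slabtop computes its cache size column. The function names and the sample numbers are made up for illustration and are not measurements from this system.

```python
def parse_slabinfo(text):
    """Return {cache_name: approx_bytes} from /proc/slabinfo-style text.

    Size is approximated as num_objs * objsize (an assumption; slabtop
    derives its cache size from slab and page counts instead).
    """
    usage = {}
    for line in text.splitlines():
        # Skip the version line, the column header, and blank lines.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        name, num_objs, objsize = fields[0], int(fields[2]), int(fields[3])
        usage[name] = num_objs * objsize
    return usage


def top_caches(text, n=5):
    """Return the n largest caches as (name, approx_bytes), largest first."""
    usage = parse_slabinfo(text)
    return sorted(usage.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Illustrative sample only; these numbers are invented.
SAMPLE = """\
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
btrfs_extent_map 1000 2000 144 28 1 : tunables 0 0 0 : slabdata 72 72 0
khugepaged_mm_slot 10 20 112 36 1 : tunables 0 0 0 : slabdata 1 1 0
"""

if __name__ == "__main__":
    try:
        # Reading /proc/slabinfo normally requires root.
        with open("/proc/slabinfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the invented sample
    for name, size in top_caches(text):
        print(f"{name:30s} {size / 2**20:10.1f} MiB")
```

Run in a loop (for example once per second, appending to a file on a different filesystem), this would leave a record of which caches were growing right up to the freeze.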

While this is happening, the free space reported by btrfs filesystem usage slowly falls until the system is unresponsive. If I delete smaller amounts of data at a time, memory usage increases, but as long as the system doesn't run out of memory, all the disk space is freed and memory usage comes back down. Deleting things bit by bit isn't a useful workaround, because this also happens when deleting a snapshot, even if it won't free any disk space, and I am trying to use this computer for incremental backups.

The only things that seem to make a difference are the checksum used and slower hard drives. The checksum changes how the issue behaves after a remount. With xxhash, remounting the filesystem seems to either restart or continue the delete operation, causing another out-of-memory condition. With the default crc32c, remounting leaves the filesystem in its original state from before the delete command was issued, and nothing happens (I haven't tried any other checksums). Having slower (SMR) drives as part of the filesystem makes the out-of-memory condition happen much faster. Nothing else, such as RAID level, compression, kernel version, or block group tree, has seemed to change anything.

My speculation is that the operations needed to finish the delete are being queued in memory faster than they can be completed, until the system runs out of memory entirely. That would explain what is happening, why slower drives make it worse, and why deleting small amounts of data works. I'm not sure why the checksum seems to change the behavior when remounting the filesystem.
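
The speculation above can be illustrated with a toy model. This is only a sketch of the general "work is queued faster than it is completed" idea. It is not a description of what btrfs does internally, and every rate, size, and name in it is a hypothetical value chosen for illustration.

```python
def simulate_backlog(total_items, enqueue_rate, complete_rate,
                     item_bytes, mem_limit_bytes, max_seconds=100_000):
    """Toy model of a delete whose pending work is held in memory.

    Each second, up to enqueue_rate items are queued and up to
    complete_rate items are finished. Returns a tuple
    (seconds_elapsed, peak_backlog_bytes, ran_out_of_memory).
    """
    remaining = total_items   # items not yet queued
    pending = 0               # items queued but not yet completed
    peak = 0
    for second in range(1, max_seconds + 1):
        queued = min(enqueue_rate, remaining)
        remaining -= queued
        pending += queued
        pending -= min(complete_rate, pending)
        backlog_bytes = pending * item_bytes
        peak = max(peak, backlog_bytes)
        if backlog_bytes > mem_limit_bytes:
            return second, peak, True    # backlog outgrew memory
        if remaining == 0 and pending == 0:
            return second, peak, False   # delete finished cleanly
    return max_seconds, peak, False


if __name__ == "__main__":
    limit = 10_000_000  # hypothetical memory budget in bytes
    # Large delete on a faster drive, then on a slower drive.
    print(simulate_backlog(100_000, 1000, 100, 1000, limit))
    print(simulate_backlog(100_000, 1000, 50, 1000, limit))
    # Small delete: the backlog drains before memory runs out.
    print(simulate_backlog(500, 1000, 100, 1000, limit))
```

In this model the large delete exhausts the budget, the slower completion rate gets there sooner, and the small delete finishes with the backlog draining back to zero, which matches the three observations the speculation is meant to explain.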

I am willing to do destructive testing on this data to hopefully get this fixed.

Sincerely,
David

Thread overview: 13+ messages
2023-09-13  2:28 fdavidl073rnovn [this message]
2023-09-13  5:55 ` Deleting large amounts of data causes system freeze due to OOM Qu Wenruo
2023-09-14  3:38   ` fdavidl073rnovn
2023-09-14  5:12     ` Qu Wenruo
2023-09-14 23:08       ` fdavidl073rnovn
2023-09-27  1:46         ` fdavidl073rnovn
2023-09-27  4:53           ` Qu Wenruo
2023-09-28 23:32             ` fdavidl073rnovn
2023-09-29  1:01               ` Qu Wenruo
2023-10-13 22:28                 ` fdavidl073rnovn
2023-10-13 22:32                   ` Qu Wenruo
2023-10-14 19:09                     ` Chris Murphy
2023-10-14 22:10                       ` Qu Wenruo
