linux-btrfs.vger.kernel.org archive mirror
* btrfs_extent_map memory consumption results in "Out of memory"
@ 2023-10-10 15:02 Ospan, Abylay
  2023-10-10 15:47 ` Filipe Manana
  0 siblings, 1 reply; 6+ messages in thread
From: Ospan, Abylay @ 2023-10-10 15:02 UTC (permalink / raw)
  To: linux-btrfs@vger.kernel.org

Greetings Btrfs development team!

I would like to express my gratitude for your outstanding work on Btrfs. However, I recently experienced an 'out of memory' issue as described below.

Steps to reproduce:

1. Run an fio random-write test against a 300GB file on a btrfs partition:

cat <<EOF > rand.fio
[global]
name=fio-rand-write
filename=fio-rand-write
rw=randwrite
bs=4K
direct=1
numjobs=16
time_based
runtime=90000

[file1]
size=300G
ioengine=libaio
iodepth=16
EOF

fio rand.fio

2. Monitor slab consumption with "slabtop -s -a"

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
25820620 23138538  89%    0.14K 922165       28   3688660K btrfs_extent_map
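
As an alternative, the same counter can be polled from /proc/slabinfo directly (a
minimal sketch; it assumes root access and that the btrfs_extent_map cache has not
been merged with another slab cache):

# print the btrfs_extent_map line every 5 seconds
watch -n 5 'grep btrfs_extent_map /proc/slabinfo'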

3. Observe oom-killer:
[49689.294138] ip invoked oom-killer: gfp_mask=0xc2cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_COMP|__GFP_NOMEMALLOC), order=3, oom_score_adj=0
...
[49689.294425] Unreclaimable slab info:
[49689.294426] Name                      Used          Total	
[49689.329363] btrfs_extent_map     3207098KB    3375622KB
...

Memory usage by btrfs_extent_map gradually increases until it reaches a critical point, causing the system to run out of memory.

Test environment: Intel CPU, 8GB RAM (to expedite reproduction of this issue, I also ran tests in QEMU with a restricted amount of memory).
Linux kernels tested: LTS 5.15.133 and mainline 6.6-rc5

A quick review of the 'fs/btrfs/extent_map.c' code reveals no built-in limit on the memory allocated for extent maps.
Are there any known workarounds or alternative solutions to mitigate this issue?

Thank you!

--
Abylay Ospan



* RE: btrfs_extent_map memory consumption results in "Out of memory"
@ 2023-10-18 22:45 fdavidl073rnovn
  0 siblings, 0 replies; 6+ messages in thread
From: fdavidl073rnovn @ 2023-10-18 22:45 UTC (permalink / raw)
  To: Fdmanana, Aospan; +Cc: Linux Btrfs

>Hi Filipe,
>
>> > I was just wondering about "direct IO writes", so I ran a quick test by fully
>> > removing fio's config option "direct=1" (default value is false).
>> > Unfortunately, I'm still experiencing the same oom-kill:
>> >
>> > [ 4843.936881] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=fio,pid=649,uid=0
>> > [ 4843.939001] Out of memory: Killed process 649 (fio) total-vm:216868kB, anon-rss:896kB, file-rss:128kB, shmem-rss:2176kB, UID:0 pgtables:100kB oom_score_adj:0
>> > [ 5306.210082] tmux: server invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
>> > ...
>> > [ 5306.240968] Unreclaimable slab info:
>> > [ 5306.241271] Name                      Used          Total
>> > [ 5306.242700] btrfs_extent_map       26093KB      26093KB
>> >
>> > Here's my updated fio config:
>> > [global]
>> > name=fio-rand-write
>> > filename=fio-rand-write
>> > rw=randwrite
>> > bs=4K
>> > numjobs=1
>> > time_based
>> > runtime=90000
>> >
>> > [file1]
>> > size=3G
>> > iodepth=1
>> >
>> > "slabtop -s -a" output:
>> >   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
>> > 206080 206080 100%    0.14K   7360       28     29440K btrfs_extent_map
>> >
>> > I accelerated my testing by running the fio test inside a QEMU VM with a limited
>> > amount of RAM (140MB):
>> >
>> > qemu-kvm
>> >   -kernel bzImage.v6.6 \
>> >   -m 140M  \
>> >   -drive file=rootfs.btrfs,format=raw,if=none,id=drive0
>> > ...
>> >
>> > It appears that this issue may not be limited to direct IO writes alone?
>> 
>> In the buffered IO case it's typically much less likely to happen.
>> 
>> The reason why it happens in your test is that the VM has very little RAM,
>> 140M, which is very unlikely to be found in the real world nowadays.
>
>I increased the memory to 8GB and ran the test overnight without any OOM errors. Glad the memory management mechanism works as expected!
>
>> Pages can only
>> be released when they are not dirty and not under writeback, and in this case
>> there's no fsync, so the amount of dirty pages (or under writeback)
>> accumulates very quickly.
>> If pages can not be released, extent maps can not be released either.
>> 
>> If you add "fsync=1" to your fio test, things should change dramatically.
>> 
>> Thanks.
>> 
>> (And btw, try to avoid top posting if possible, as that makes the thread harder
>> to follow.)
>My apologies for the top posting.
>
>Thanks for your help!
>
>--
>Abylay Ospan
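
Following up on the "fsync=1" suggestion quoted above, a minimal sketch of the
adjusted job file (fio's fsync=N issues an fsync after every N write operations, so
fsync=1 forces writeback after each write and lets dirty pages, and with them the
extent maps, be released):

[global]
name=fio-rand-write
filename=fio-rand-write
rw=randwrite
bs=4K
fsync=1
numjobs=1
time_based
runtime=90000

[file1]
size=3G
iodepth=1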

I did not see compression mentioned anywhere in the email chain, but the extent map issue appears to be compounded significantly by compression. I've run into it under normal loads while deleting snapshots on real machines with only 8GB of memory:
https://www.spinics.net/lists/linux-btrfs/msg139657.html
Sincerely,
David
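
For readers wanting to check whether compression is in play on their own setup, two
stock commands can confirm it (a sketch, not specific to the machines discussed in
this thread; the path below is a placeholder):

# show btrfs mounts and their options (look for compress= or compress-force=)
findmnt -t btrfs -o TARGET,OPTIONS

# show the compression property set on a file or subvolume, if any
btrfs property get /path/to/subvolume compression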


Thread overview: 6+ messages
2023-10-10 15:02 btrfs_extent_map memory consumption results in "Out of memory" Ospan, Abylay
2023-10-10 15:47 ` Filipe Manana
2023-10-10 21:23   ` Ospan, Abylay
2023-10-10 21:44     ` Filipe Manana
2023-10-12 14:24       ` Ospan, Abylay
2023-10-18 22:45 fdavidl073rnovn
