* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-04 17:25 ` Filipe Manana
@ 2024-07-04 17:31 ` Filipe Manana
2024-07-04 22:15 ` Andrea Gelmini
` (3 subsequent siblings)
4 siblings, 0 replies; 56+ messages in thread
From: Filipe Manana @ 2024-07-04 17:31 UTC (permalink / raw)
To: Andrea Gelmini
Cc: Mikhail Gavrilov, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Thu, Jul 4, 2024 at 6:25 PM Filipe Manana <fdmanana@kernel.org> wrote:
>
> On Thu, Jul 4, 2024 at 3:48 PM Andrea Gelmini <andrea.gelmini@gmail.com> wrote:
> >
> > On Thu, Jul 4, 2024 at 3:47 PM Andrea Gelmini
> > <andrea.gelmini@gmail.com> wrote:
> > > I'll send you everything once I've collected enough data.
> >
> > Here we are.
> >
> > Kernel rc6+branch:
> > Output of bpftrace:
> > https://pastebin.com/P9RFp5mg
>
> So a couple of interesting things here, which we didn't get in the short
> capture from Mikhail:
>
> 1) There are apparently multiple tasks entering the shrinker at the same time:
> kswapd0, Chrome_ChildIOT, Chrome_IOThread, chrome, Xorg.
>
> 2) In some cases we get very large negative numbers for the number of
> extent maps to scan.
> This shouldn't happen. Either our own btrfs counter has overflowed (or
> there is some other bug), or the super block's shrinker is being called
> with a negative sc->nr_to_scan, which is outside btrfs' control and
> seemingly also outside the control of the VFS's shrinker callback
> (see fs/super.c:super_cache_scan()).
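
To illustrate how "very large negative numbers" can show up at all: a 64-bit counter that is decremented below zero, or that grows past 2^63, looks like a huge or negative value once it is printed as a signed 64-bit integer. The sketch below only demonstrates that reinterpretation in userspace; it is not a diagnosis of where the values in the trace come from, and the helper names are made up for the example.

```python
# Illustration only: how a wrapped 64-bit counter looks when read as signed.
# wrap_u64() and as_s64() are hypothetical helpers, not kernel functions.

MASK64 = (1 << 64) - 1


def wrap_u64(value: int) -> int:
    """Wrap an arbitrary integer into the unsigned 64-bit range."""
    return value & MASK64


def as_s64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as a signed 64-bit value."""
    value &= MASK64
    return value - (1 << 64) if value >= (1 << 63) else value


if __name__ == "__main__":
    # A counter at 5 that is decremented 8 times wraps around...
    wrapped = wrap_u64(5 - 8)
    print("unsigned view:", wrapped)
    # ...and reads as a small negative number when treated as signed.
    print("signed view:  ", as_s64(wrapped))
    # A value just above 2^63 reads as a very large negative number.
    print("signed view:  ", as_s64((1 << 63) + 1000))
```

Either mechanism produces negative values in a signed print format, which is why both a btrfs counter problem and a bad sc->nr_to_scan remain candidates here.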
>
> >
> > Recording of tar session: (summary: starts fast, then becomes extremely slow)
> > https://asciinema.org/a/BxYI83TkrlOhEe42IWXNY135D
> >
> > Recording of htop session: (summary: PSI high and two threads at 100%)
> > https://asciinema.org/a/ZwGSepZZ8TSpFfPssACUUXcCB
>
> Ok, so maybe I missed it, but I haven't seen kswapd0 in there, nor anything
> taking 100% CPU.
> Maybe it was just Mikhail running into that?
>
> I was looking at the memory PSI and I never noticed it going over 60%.
> As for cpu and IO PSI, for cpu it was always low, under 3% from what
> I've seen and for IO even lower than that, very close to 0%.
>
> So I'm surprised that you get an unresponsive desktop.
>
> >
> >
> > Kernel 6.6.36:
> > Recording of tar session: (summary: tar always fast)
> > https://asciinema.org/a/a6dOkbjyPFkkQ5aNTaRiFD3H8
> >
> > Recording of htop session: (summary: no busy threads and no PSI load)
> > https://asciinema.org/a/mFsypWzHfSdsjrIQf8zpzNpKo
>
> Interestingly, here the memory PSI stays at 0% or very close to that,
> it never reaches anything close to the 60%.
>
> >
> > If you need me to run it for a longer time, I can do it over the weekend.
> > If you need a dump of my BTRFS fs, no problem, but I need 'btrfs image
> > -s' working (the point is scrambling filenames).
>
> Ok, so I have been delaying my reply because I kept accumulating
> things for you (or Mikhail) to try, and wanted to avoid sending several
> messages with very little in them.
>
> So first thing, I tried reproducing your scenario like you described
> in a previous message using tar:
>
> On a fresh btrfs filesystem, I cloned Linus' kernel tree into /mnt/git/linux
> Compiled a kernel.
> Then copied the tree 3 times like this:
>
> cd /mnt/git
> cp --reflink=never -r linux linux2
> cp --reflink=never -r linux linux3
> cp --reflink=never -r linux linux4
>
> The total size of /mnt/git was 62G (as reported by: du -hs /mnt/git).
>
> Then I ran:
>
> cd /mnt/git
> tar cp git/ | pv > /dev/null
>
> With htop in parallel, the bpftrace script, and since my htop version
> doesn't show PSI information (probably an older version than yours), I
> kept monitoring PSI like this:
>
> watch -d -n 3 'echo "cpu:\n"; cat /proc/pressure/cpu ; echo "\nmemory:\n" ; cat /proc/pressure/memory ; echo "\nio:\n" ; cat /proc/pressure/io'
>
> Nothing went through the roof, the machine was always responsive, I never
> saw kswapd0 anywhere near the top, and the process using the most CPU was
> tar (and always under 30%).
> PSI had all values low.
>
> The shrinker was being triggered very often, for small numbers (mostly
> under 1000, and most of the time much less than that), but I never had
> those large negative numbers nor apparently different tasks entering
> into it concurrently.
> It took a few seconds at most in each run.
>
> I also tried monitoring while doing the "cp --reflink=never -r"
> commands and while IO PSI often peaked at 92%, 93%, the system was always
> responsive (and such IO PSI seems reasonable since we are doing a lot
> of read and write IO).
>
> So several different things to try here:
>
> 1) First let's check that the problem is really a consequence of the shrinker.
> Try this patch:
>
> https://gist.githubusercontent.com/fdmanana/b44abaade0000d28ba0e1e1ae3ac4fee/raw/5c9bf0beb5aa156b893be2837c9244d035962c74/gistfile1.txt
>
> This disables the shrinker. This is just to confirm if I'm looking
> in the right direction, if your problem is the same as Mikhail's and
> double check his bisection.
>
> 2) Then drop that patch that disables the shrinker.
> With all the previous 4 patches applied, apply this one on top of them:
>
> https://gist.githubusercontent.com/fdmanana/9cea16ca56594f8c7e20b67dc66c6c94/raw/557bd5f6b37b65d210218f8da8987b74bfe5e515/gistfile1.txt
>
> The goal here is to see if the extent map eviction done by the
> shrinker is making reads from other tasks too slow, and check if
> that's what's making your system unresponsive.
>
> 3) Then drop the patch from step 2), and on top of the previous 4
> patches from my git tree, apply this one:
>
> https://gist.githubusercontent.com/fdmanana/a7c9c2abb69c978cf5b80c2f784243d5/raw/b4cca964904d3ec15c74e36ccf111a3a2f530520/gistfile1.txt
>
> This is just to confirm if we do have concurrent calls to the
> shrinker, as the tracing seems to suggest, and where the negative
> numbers come from.
> It also helps to check if not allowing concurrent calls to it, by
> skipping if it's already running, helps making the problems go away.
Oh and for this one, show your 'dmesg' after your testing to see if
any stack traces or warning messages were logged (even if it happens
to solve all the problems).
Thanks!
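
For reference, the "skip if it's already running" idea from step 3 can be sketched in userspace with a non-blocking try-lock. This is only an analogy under my own assumptions: the class and method names below are hypothetical and this is not the code from the gist.

```python
# Userspace analogy of "skip the shrinker if another call is already running".
# ExtentMapShrinker and scan() are hypothetical names for illustration.
import threading


class ExtentMapShrinker:
    def __init__(self):
        self._running = threading.Lock()
        # Counters are for illustration only and are not synchronized.
        self.skipped = 0
        self.completed = 0

    def scan(self, nr_to_scan, work=None):
        """Run one scan, or return 0 immediately if a scan is in progress."""
        if not self._running.acquire(blocking=False):
            # Someone else is already scanning: do not pile up behind them.
            self.skipped += 1
            return 0
        try:
            if work is not None:
                work()  # stand-in for the actual eviction work
            self.completed += 1
            return nr_to_scan
        finally:
            self._running.release()


if __name__ == "__main__":
    shrinker = ExtentMapShrinker()
    inner = []
    # Simulate a second caller arriving while the first scan is still running.
    outer = shrinker.scan(10, work=lambda: inner.append(shrinker.scan(2)))
    print("outer scanned:", outer, "inner scanned:", inner[0])
    print("skipped:", shrinker.skipped, "completed:", shrinker.completed)
```

The point of the try-lock is that late callers return right away instead of waiting, so concurrent entries cannot stack up on the same work.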
>
> >
> > Thanks a lot,
>
> Thanks a lot to you and Mikhail, not just for the reporting but also
> to apply patches, compile a kernel, run the tests and do all those
> valuable observations which are all very time consuming.
>
> Thanks!
>
> > Gelma
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-04 17:25 ` Filipe Manana
2024-07-04 17:31 ` Filipe Manana
@ 2024-07-04 22:15 ` Andrea Gelmini
2024-07-04 22:23 ` Andrea Gelmini
2024-07-05 11:00 ` Filipe Manana
2024-07-05 6:30 ` Andrea Gelmini
` (2 subsequent siblings)
4 siblings, 2 replies; 56+ messages in thread
From: Andrea Gelmini @ 2024-07-04 22:15 UTC (permalink / raw)
To: Filipe Manana
Cc: Mikhail Gavrilov, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Thu, Jul 4, 2024 at 7:25 PM Filipe Manana
<fdmanana@kernel.org> wrote:
> 2) In some cases we get very large negative numbers for the number of
> extent maps to scan.
> This shouldn't happen and either our own btrfs counter might have
> overflowed or some other bug,
Well, I was thinking about what is peculiar to my setup, and I tried this:
a) booted kernel 6.6.36;
b) created a shiny new btrfs on a spare NVMe partition;
c) mounted it forcing compression;
d) ran multiple parallel cp of the kernel and libreoffice sources;
e) rebooted into the same rc6+branch already used;
f) tar of the new btrfs: no problem at all;
g) let it finish;
h) tar of /.snapshots: memory PSI skyrockets, with the usual slowdown in reading;
i) stopped it;
l) tar of the new btrfs again: no problem;
m) repeated a few times.
You can see the output here:
https://asciinema.org/a/rJpGWvXYH6IDBXWYhtJckkKWo
At the end you can see that I kill tar and let the PSI go down to zero, if
you are interested.
> Ok, so maybe I missed it, but I haven't seen kswapd0 in there, nor anything
> taking 100% CPU.
> Maybe it was just Mikhail running into that?
To get this effect and the extremely sluggish response (I mean, I click
something and it takes more than 30 seconds to react),
I need to work for at least one day on my laptop. At that point even
switching virtual desktops takes a long time.
Thinking about what is different in my use case:
a) I always suspend. I only reboot when I change kernel. So, I can work
for weeks with the same kernel. Suspend to RAM, not disk, btw;
b) months ago I let beesd run for a day.
> So I'm surprised that you get an unresponsive desktop.
Same point as before. In this case it is not so sluggish, but, e.g., if
I click for the screenlock it doesn't start immediately; it waits for a
little bit more than one second.
> Interestingly, here the memory PSI stays at 0% or very close to that,
> it never reaches anything close to the 60%.
You see the same thing with the last test with new btrfs partition.
New partition: ~0%
/.snapshots/: near 60%.
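
The memory PSI percentages compared here are the ones the kernel exposes in /proc/pressure/memory. Below is a minimal sketch, assuming the usual "some"/"full" line format of that file, of how they could be read from a script instead of from htop or watch. The function names and the sample values are made up for illustration.

```python
# Sketch: parse the text of a /proc/pressure/<resource> file into a dict.
# parse_psi() and read_psi() are hypothetical helper names.


def parse_psi(text):
    """Return {"some": {...}, "full": {...}} with float values per field."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]
        result[kind] = {
            key: float(value)
            for key, value in (pair.split("=", 1) for pair in pairs)
        }
    return result


def read_psi(resource="memory"):
    """Read and parse /proc/pressure/<resource> (needs a PSI-enabled kernel)."""
    with open(f"/proc/pressure/{resource}") as f:
        return parse_psi(f.read())


if __name__ == "__main__":
    # Sample text with invented numbers, in the format the kernel prints.
    sample = (
        "some avg10=58.21 avg60=40.10 avg300=12.00 total=123456\n"
        "full avg10=55.00 avg60=38.00 avg300=11.50 total=100000\n"
    )
    psi = parse_psi(sample)
    print("memory some avg10:", psi["some"]["avg10"])
    print("memory full avg10:", psi["full"]["avg10"])
```

Logging avg10 once per second during a tar run would make the "~0% vs near 60%" comparison easy to attach to a report.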
> With htop in parallel, the bpftrace script, and since my htop version
> doesn't show PSI information (probably an older version than yours), I
> kept monitoring PSI like this:
Well, mine is taken from here:
https://github.com/htop-dev/htop.git
Compiled with:
./configure --enable-capabilities --enable-delayacct --enable-sensors
--enable-werror --enable-affinity
And tweaked config file. If you want I can send it.
> So several different things to try here:
I'll stop here for the moment; I have to sleep.
Over the weekend I'll do the rest and reply to you!
> Thanks a lot to you and Mikhail, not just for the reporting but also
> to apply patches, compile a kernel, run the tests and do all those
> valuable observations which are all very time consuming.
My little contribution to free software!
Ciao,
Gelma
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-04 22:15 ` Andrea Gelmini
@ 2024-07-04 22:23 ` Andrea Gelmini
2024-07-05 11:00 ` Filipe Manana
1 sibling, 0 replies; 56+ messages in thread
From: Andrea Gelmini @ 2024-07-04 22:23 UTC (permalink / raw)
To: Filipe Manana
Cc: Mikhail Gavrilov, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
[-- Attachment #1: Type: text/plain, Size: 231 bytes --]
On Fri, Jul 5, 2024 at 12:15 AM Andrea Gelmini
<andrea.gelmini@gmail.com> wrote:
>
> You can see the output here:
> https://asciinema.org/a/rJpGWvXYH6IDBXWYhtJckkKWo
Sorry, attached is the bpftrace log of this session.
[-- Attachment #2: em_shrinker_log.txt.bz2 --]
[-- Type: application/x-bzip, Size: 280773 bytes --]
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-04 22:15 ` Andrea Gelmini
2024-07-04 22:23 ` Andrea Gelmini
@ 2024-07-05 11:00 ` Filipe Manana
1 sibling, 0 replies; 56+ messages in thread
From: Filipe Manana @ 2024-07-05 11:00 UTC (permalink / raw)
To: Andrea Gelmini
Cc: Mikhail Gavrilov, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Thu, Jul 4, 2024 at 11:15 PM Andrea Gelmini <andrea.gelmini@gmail.com> wrote:
>
> On Thu, Jul 4, 2024 at 7:25 PM Filipe Manana
> <fdmanana@kernel.org> wrote:
> > 2) In some cases we get very large negative numbers for the number of
> > extent maps to scan.
> > This shouldn't happen and either our own btrfs counter might have
> > overflowed or some other bug,
>
> Well, I was thinking about what is peculiar to my setup, and I tried this:
> a) booted kernel 6.6.36;
> b) created a shiny new btrfs on a spare NVMe partition;
> c) mounted it forcing compression;
> d) ran multiple parallel cp of the kernel and libreoffice sources;
> e) rebooted into the same rc6+branch already used;
> f) tar of the new btrfs: no problem at all;
> g) let it finish;
> h) tar of /.snapshots: memory PSI skyrockets, with the usual slowdown in reading;
> i) stopped it;
> l) tar of the new btrfs again: no problem;
> m) repeated a few times.
>
> You can see the output here:
> https://asciinema.org/a/rJpGWvXYH6IDBXWYhtJckkKWo
>
> At the end you can see that I kill tar and let the PSI go down to zero, if
> you are interested.
>
> > Ok, so maybe I missed it, but I haven't seen kswapd0 in there, nor anything
> > taking 100% CPU.
> > Maybe it was just Mikhail running into that?
>
> To get this effect and the extremely sluggish response (I mean, I click
> something and it takes more than 30 seconds to react),
> I need to work for at least one day on my laptop. At that point even
> switching virtual desktops takes a long time.
>
> Thinking about what is different in my use case:
> a) I always suspend. I only reboot when I change kernel. So, I can work
> for weeks with the same kernel. Suspend to RAM, not disk, btw;
> b) months ago I let beesd run for a day.
>
> > So I'm surprised that you get an unresponsive desktop.
> Same point as before. In this case it is not so sluggish, but, e.g., if
> I click for the screenlock it doesn't start immediately; it waits for a
> little bit more than one second.
Oh, I see that too on my main desktop, which only uses ext4 and usually
has 2 qemu VMs running (Debian and openSUSE).
Sometimes, even when the VMs aren't doing anything but were previously
doing IO-heavy testing, the desktop on the host gets unresponsive:
clicking the screenlock often takes at least some 5 seconds,
changing workspaces takes a few seconds too, etc. Shouldn't happen in
theory.
>
> > Interestingly, here the memory PSI stays at 0% or very close to that,
> > it never reaches anything close to the 60%.
>
> You see the same thing with the last test with new btrfs partition.
> New partition: ~0%
> /.snapshots/: near 60%.
It could be due to heavy fragmentation, but that should only be too
slow if you were using a spinning disk.
I think somewhere you mentioned nvme or ssd.
Removing the extent maps could cause extra reads of metadata and be slow.
But the number of extent maps removed on every iteration is relatively
small, and round-robin, so... it's strange that it causes such huge
pressure and desktop unresponsiveness.
We will know if that's the case with the 2nd test patch.
>
>
> > With htop in parallel, the bpftrace script, and since my htop version
> > doesn't show PSI information (probably an older version than yours), I
> > kept monitoring PSI like this:
>
> Well, mine is taken from here:
> https://github.com/htop-dev/htop.git
> Compiled with:
> ./configure --enable-capabilities --enable-delayacct --enable-sensors
> --enable-werror --enable-affinity
> And tweaked config file. If you want I can send it.
Thanks, I'll have to try it eventually.
>
>
> > So several different things to try here:
>
> I'll stop here for the moment; I have to sleep.
> Over the weekend I'll do the rest and reply to you!
Sure, take your time. It takes time patching and building kernels,
plus the testing, etc.
Many thanks for that!
>
> > Thanks a lot to you and Mikhail, not just for the reporting but also
> > to apply patches, compile a kernel, run the tests and do all those
> > valuable observations which are all very time consuming.
>
> My little contribution to free software!
>
> Ciao,
> Gelma
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-04 17:25 ` Filipe Manana
2024-07-04 17:31 ` Filipe Manana
2024-07-04 22:15 ` Andrea Gelmini
@ 2024-07-05 6:30 ` Andrea Gelmini
2024-07-05 11:06 ` Filipe Manana
2024-07-05 18:36 ` Mikhail Gavrilov
2024-07-06 0:11 ` Andrea Gelmini
4 siblings, 1 reply; 56+ messages in thread
From: Andrea Gelmini @ 2024-07-05 6:30 UTC (permalink / raw)
To: Filipe Manana
Cc: Mikhail Gavrilov, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Thu, Jul 4, 2024 at 7:25 PM Filipe Manana
<fdmanana@kernel.org> wrote:
> 1) First let's check that the problem is really a consequence of the shrinker.
> Try this patch:
>
> https://gist.githubusercontent.com/fdmanana/b44abaade0000d28ba0e1e1ae3ac4fee/raw/5c9bf0beb5aa156b893be2837c9244d035962c74/gistfile1.txt
>
> This disables the shrinker. This is just to confirm if I'm looking
> in the right direction, if your problem is the same as Mikhail's and
> double check his bisection.
Ok, so, I confirm. With this change there is only a little memory PSI
at times (<3%), but no spike. Also, tar runs at full speed.
Now, I'm going to prepare the btrfs image to send you.
The other steps later.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-05 6:30 ` Andrea Gelmini
@ 2024-07-05 11:06 ` Filipe Manana
0 siblings, 0 replies; 56+ messages in thread
From: Filipe Manana @ 2024-07-05 11:06 UTC (permalink / raw)
To: Andrea Gelmini
Cc: Mikhail Gavrilov, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Fri, Jul 5, 2024 at 7:30 AM Andrea Gelmini <andrea.gelmini@gmail.com> wrote:
>
> On Thu, Jul 4, 2024 at 7:25 PM Filipe Manana
> <fdmanana@kernel.org> wrote:
> > 1) First let's check that the problem is really a consequence of the shrinker.
> > Try this patch:
> >
> > https://gist.githubusercontent.com/fdmanana/b44abaade0000d28ba0e1e1ae3ac4fee/raw/5c9bf0beb5aa156b893be2837c9244d035962c74/gistfile1.txt
> >
> > This disables the shrinker. This is just to confirm if I'm looking
> > in the right direction, if your problem is the same as Mikhail's and
> > double check his bisection.
>
> Ok, so, I confirm. With this change there is only a little memory PSI
> at times (<3%), but no spike. Also, tar runs at full speed.
Ok, so the bisection is reliable and it means you are experiencing the
same problem that Mikhail reported.
>
> Now, I'm going to prepare the btrfs image to send you.
That might not be necessary; I'm not sure how it would help. The 2nd patch
to try would confirm whether fragmentation is causing too many slow
reads after extent map eviction.
So save yourself some time for now, because making the image is likely slow.
>
> The other steps later.
Thanks!
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-04 17:25 ` Filipe Manana
` (2 preceding siblings ...)
2024-07-05 6:30 ` Andrea Gelmini
@ 2024-07-05 18:36 ` Mikhail Gavrilov
2024-07-05 23:09 ` Filipe Manana
2024-07-06 0:11 ` Andrea Gelmini
4 siblings, 1 reply; 56+ messages in thread
From: Mikhail Gavrilov @ 2024-07-05 18:36 UTC (permalink / raw)
To: Filipe Manana
Cc: Andrea Gelmini, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
[-- Attachment #1: Type: text/plain, Size: 3855 bytes --]
On Thu, Jul 4, 2024 at 10:25 PM Filipe Manana <fdmanana@kernel.org> wrote:
>
> So several different things to try here:
>
> 1) First let's check that the problem is really a consequence of the shrinker.
> Try this patch:
>
> https://gist.githubusercontent.com/fdmanana/b44abaade0000d28ba0e1e1ae3ac4fee/raw/5c9bf0beb5aa156b893be2837c9244d035962c74/gistfile1.txt
>
> This disables the shrinker. This is just to confirm if I'm looking
> in the right direction, if your problem is the same as Mikhail's and
> double check his bisection.
[1]
I can't check it because the patch does not apply on top of 661e504db04c.
> git apply debug-1.patch
error: patch failed: fs/btrfs/super.c:2410
error: fs/btrfs/super.c: patch does not apply
> cat debug-1.patch
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index f05cce7c8b8d..06c0db641d18 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2410,8 +2410,10 @@ static const struct super_operations btrfs_super_ops = {
.statfs = btrfs_statfs,
.freeze_fs = btrfs_freeze,
.unfreeze_fs = btrfs_unfreeze,
+ /*
.nr_cached_objects = btrfs_nr_cached_objects,
.free_cached_objects = btrfs_free_cached_objects,
+ */
};
static const struct file_operations btrfs_ctl_fops = {
> 2) Then drop that patch that disables the shrinker.
> With all the previous 4 patches applied, apply this one on top of them:
>
> https://gist.githubusercontent.com/fdmanana/9cea16ca56594f8c7e20b67dc66c6c94/raw/557bd5f6b37b65d210218f8da8987b74bfe5e515/gistfile1.txt
>
> The goal here is to see if the extent map eviction done by the
> shrinker is making reads from other tasks too slow, and check if
> that's what's making your system unresponsive.
>
[2]
6.10.0-rc6-661e504db04c-test2
up 1:00
root 269 15.5 0.0 0 0 ? R 10:23 9:20 [kswapd0]
up 2:02
root 269 21.6 0.0 0 0 ? S 10:23 26:27 [kswapd0]
up 3:10
root 269 25.2 0.0 0 0 ? R 10:23 48:11 [kswapd0]
up 4:04
root 269 29.0 0.0 0 0 ? S 10:23 71:12 [kswapd0]
up 5:04
root 269 26.8 0.0 0 0 ? R 10:23 81:47 [kswapd0]
up 6:07
root 269 27.9 0.0 0 0 ? R 10:23 102:40 [kswapd0]
dmesg attached below as 6.10.0-rc6-661e504db04c-test2.zip
> 3) Then drop the patch from step 2), and on top of the previous 4
> patches from my git tree, apply this one:
>
> https://gist.githubusercontent.com/fdmanana/a7c9c2abb69c978cf5b80c2f784243d5/raw/b4cca964904d3ec15c74e36ccf111a3a2f530520/gistfile1.txt
>
> This is just to confirm if we do have concurrent calls to the
> shrinker, as the tracing seems to suggest, and where the negative
> numbers come from.
> It also helps to check if not allowing concurrent calls to it, by
> skipping if it's already running, helps making the problems go away.
[3]
6.10.0-rc6-661e504db04c-test3
up 1:00
root 269 18.6 0.0 0 0 ? S 17:09 11:12 [kswapd0]
up 2:00
root 269 23.7 0.0 0 0 ? R 17:09 28:30 [kswapd0]
up 3:00
root 269 27.0 0.0 0 0 ? S 17:09 48:47 [kswapd0]
up 4:00
root 269 28.8 0.0 0 0 ? S 17:09 69:10 [kswapd0]
up 5:00
root 269 32.0 0.0 0 0 ? S 17:09 96:17 [kswapd0]
up 6:00
root 269 29.7 0.0 0 0 ? S 17:09 107:12 [kswapd0]
dmesg attached below as 6.10.0-rc6-661e504db04c-test3.zip
As we can see, the CPU time of kswapd0 has increased significantly. It was
30 min in 6 hours; now it is about 100 min. That is, it is about three times
worse even with the proposed patches (1-4).
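
As a cross-check of the numbers above, the TIME column from ps can be divided by the uptime to get the average CPU share of kswapd0. This is a small sketch with made-up helper names; it assumes TIME is in minutes:seconds and uptime in hours:minutes, as in the listings above, and that kswapd0 has been running since boot.

```python
# Sketch: average CPU share of a process from ps TIME and system uptime.
# The helper names are hypothetical; the inputs come from the listings above.


def mmss_to_seconds(cpu_time):
    """Convert a ps TIME value such as "102:40" (minutes:seconds) to seconds."""
    minutes, seconds = cpu_time.split(":")
    return int(minutes) * 60 + int(seconds)


def hmm_to_seconds(uptime):
    """Convert an uptime such as "6:07" (hours:minutes) to seconds."""
    hours, minutes = uptime.split(":")
    return int(hours) * 3600 + int(minutes) * 60


def cpu_share(cpu_time, uptime):
    """Average CPU usage in percent over the whole uptime."""
    return 100.0 * mmss_to_seconds(cpu_time) / hmm_to_seconds(uptime)


if __name__ == "__main__":
    for label, cpu_time, uptime in [
        ("test2", "102:40", "6:07"),
        ("test3", "107:12", "6:00"),
    ]:
        print(f"{label}: kswapd0 averaged {cpu_share(cpu_time, uptime):.1f}% CPU")
```

Both results land close to the %CPU column that ps printed at the six-hour mark, so the listings are internally consistent.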
--
Best Regards,
Mike Gavrilov.
[-- Attachment #2: 6.10.0-rc6-661e504db04c-test2.zip --]
[-- Type: application/zip, Size: 53393 bytes --]
[-- Attachment #3: 6.10.0-rc6-661e504db04c-test3.zip --]
[-- Type: application/zip, Size: 54961 bytes --]
^ permalink raw reply related [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-05 18:36 ` Mikhail Gavrilov
@ 2024-07-05 23:09 ` Filipe Manana
0 siblings, 0 replies; 56+ messages in thread
From: Filipe Manana @ 2024-07-05 23:09 UTC (permalink / raw)
To: Mikhail Gavrilov
Cc: Andrea Gelmini, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Fri, Jul 5, 2024 at 7:36 PM Mikhail Gavrilov
<mikhail.v.gavrilov@gmail.com> wrote:
>
> On Thu, Jul 4, 2024 at 10:25 PM Filipe Manana <fdmanana@kernel.org> wrote:
> >
> > So several different things to try here:
> >
> > 1) First let's check that the problem is really a consequence of the shrinker.
> > Try this patch:
> >
> > https://gist.githubusercontent.com/fdmanana/b44abaade0000d28ba0e1e1ae3ac4fee/raw/5c9bf0beb5aa156b893be2837c9244d035962c74/gistfile1.txt
> >
> > This disables the shrinker. This is just to confirm if I'm looking
> > in the right direction, if your problem is the same as Mikhail's and
> > double check his bisection.
>
> [1]
> I can't check it because the patch does not apply on top of 661e504db04c.
> > git apply debug-1.patch
> error: patch failed: fs/btrfs/super.c:2410
> error: fs/btrfs/super.c: patch does not apply
> > cat debug-1.patch
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index f05cce7c8b8d..06c0db641d18 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -2410,8 +2410,10 @@ static const struct super_operations btrfs_super_ops = {
> .statfs = btrfs_statfs,
> .freeze_fs = btrfs_freeze,
> .unfreeze_fs = btrfs_unfreeze,
> + /*
> .nr_cached_objects = btrfs_nr_cached_objects,
> .free_cached_objects = btrfs_free_cached_objects,
> + */
> };
>
> static const struct file_operations btrfs_ctl_fops = {
>
>
>
> > 2) Then drop that patch that disables the shrinker.
> > With all the previous 4 patches applied, apply this one on top of them:
> >
> > https://gist.githubusercontent.com/fdmanana/9cea16ca56594f8c7e20b67dc66c6c94/raw/557bd5f6b37b65d210218f8da8987b74bfe5e515/gistfile1.txt
> >
> > The goal here is to see if the extent map eviction done by the
> > shrinker is making reads from other tasks too slow, and check if
> > that's what's making your system unresponsive.
> >
>
> [2]
> 6.10.0-rc6-661e504db04c-test2
> up 1:00
> root 269 15.5 0.0 0 0 ? R 10:23 9:20 [kswapd0]
> up 2:02
> root 269 21.6 0.0 0 0 ? S 10:23 26:27 [kswapd0]
> up 3:10
> root 269 25.2 0.0 0 0 ? R 10:23 48:11 [kswapd0]
> up 4:04
> root 269 29.0 0.0 0 0 ? S 10:23 71:12 [kswapd0]
> up 5:04
> root 269 26.8 0.0 0 0 ? R 10:23 81:47 [kswapd0]
> up 6:07
> root 269 27.9 0.0 0 0 ? R 10:23 102:40 [kswapd0]
> dmesg attached below as 6.10.0-rc6-661e504db04c-test2.zip
>
> > 3) Then drop the patch from step 2), and on top of the previous 4
> > patches from my git tree, apply this one:
> >
> > https://gist.githubusercontent.com/fdmanana/a7c9c2abb69c978cf5b80c2f784243d5/raw/b4cca964904d3ec15c74e36ccf111a3a2f530520/gistfile1.txt
> >
> > This is just to confirm if we do have concurrent calls to the
> > shrinker, as the tracing seems to suggest, and where the negative
> > numbers come from.
> > It also helps to check if not allowing concurrent calls to it, by
> > skipping if it's already running, helps making the problems go away.
>
> [3]
> 6.10.0-rc6-661e504db04c-test3
> up 1:00
> root 269 18.6 0.0 0 0 ? S 17:09 11:12 [kswapd0]
> up 2:00
> root 269 23.7 0.0 0 0 ? R 17:09 28:30 [kswapd0]
> up 3:00
> root 269 27.0 0.0 0 0 ? S 17:09 48:47 [kswapd0]
> up 4:00
> root 269 28.8 0.0 0 0 ? S 17:09 69:10 [kswapd0]
> up 5:00
> root 269 32.0 0.0 0 0 ? S 17:09 96:17 [kswapd0]
> up 6:00
> root 269 29.7 0.0 0 0 ? S 17:09 107:12 [kswapd0]
> dmesg attached below as 6.10.0-rc6-661e504db04c-test3.zip
>
> As we can see, the CPU time of kswapd0 has increased significantly. It was
> 30 min in 6 hours; now it is about 100 min. That is, it is about three times
> worse even with the proposed patches (1-4).
Can you try the following two branches based on 6.10-rc6?
1) https://git.kernel.org/pub/scm/linux/kernel/git/fdmanana/linux.git/log/?h=test1_em_shrinker_6.10
2) https://git.kernel.org/pub/scm/linux/kernel/git/fdmanana/linux.git/log/?h=test2_em_shrinker_6.10
Even if the first one makes things good, also try the second one please.
The first just includes some changes for the next merge window (for
6.11) that might help speed things up.
The second just has a change that would be simple to add to 6.10 and
we'll probably always want it or some variation of it.
Thanks!
>
> --
> Best Regards,
> Mike Gavrilov.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-04 17:25 ` Filipe Manana
` (3 preceding siblings ...)
2024-07-05 18:36 ` Mikhail Gavrilov
@ 2024-07-06 0:11 ` Andrea Gelmini
2024-07-06 12:07 ` Andrea Gelmini
4 siblings, 1 reply; 56+ messages in thread
From: Andrea Gelmini @ 2024-07-06 0:11 UTC (permalink / raw)
To: Filipe Manana
Cc: Mikhail Gavrilov, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Thu, Jul 4, 2024 at 7:25 PM Filipe Manana
<fdmanana@kernel.org> wrote:
> 2) Then drop that patch that disables the shrinker.
> With all the previous 4 patches applied, apply this one on top of them:
>
> https://gist.githubusercontent.com/fdmanana/9cea16ca56594f8c7e20b67dc66c6c94/raw/557bd5f6b37b65d210218f8da8987b74bfe5e515/gistfile1.txt
>
> The goal here is to see if the extent map eviction done by the
> shrinker is making reads from other tasks too slow, and check if
> that's what's making your system unresponsive.
>
> 3) Then drop the patch from step 2), and on top of the previous 4
> patches from my git tree, apply this one:
>
> https://gist.githubusercontent.com/fdmanana/a7c9c2abb69c978cf5b80c2f784243d5/raw/b4cca964904d3ec15c74e36ccf111a3a2f530520/gistfile1.txt
>
> This is just to confirm if we do have concurrent calls to the
> shrinker, as the tracing seems to suggest, and where the negative
> numbers come from.
> It also helps to check if not allowing concurrent calls to it, by
> skipping if it's already running, helps making the problems go away.
Uhm... good news...
To recap, here are this evening's tests:
Kernel 6.6.36:
Fresh BTRFS: (tar cp . | pv -ta > /dev/null): 0:03:53 [ 231MiB/s]
(time and average speed)
Aged snapshots: (tar cp /.snapshots/ | pv -at -s 100G -S > /dev/null): 0:02:20 [ 726MiB/s]
Kernel rc6+branch+2nd patch:
Fresh BTRFS: 0:03:14 [ 278MiB/s]
Aged snapshots: I had to stop. Memory PSI > 80%. Processes were stuck
most of the time, e.g. mplayer via NFS stops every few seconds for a
while, and switching virtual desktops takes >5 seconds. Also "echo 3 >
drop_caches" takes more than 5 minutes to finish (on the other two
kernels, it was almost immediate).
Kernel rc6+branch+3rd patch:
Fresh BTRFS: 0:03:40 [ 245MiB/s]
Aged snapshots: 0:02:03 [ 826MiB/s]
N.B.: no memory PSI spike, no swap pressure, no sluggishness!!!
Now, that was just one run, I'm going to use this patch for a few
days. Next week I can tell you for sure if everything is right!
For the moment it seems we have a winner!
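
A quick sanity check on the figures above: multiplying pv's elapsed time by its average rate gives the amount of data each run pushed through, which should be about the same within each group if the runs are comparable. The sketch below does that arithmetic; the function names are made up, and since pv rounds its averages the totals are approximate.

```python
# Sketch: estimate the data volume of each run from pv's time and average rate.
# Function names are hypothetical; the figures are the ones reported above.


def elapsed_seconds(hms):
    """Convert an elapsed time such as "0:03:53" (h:mm:ss) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def total_mib(elapsed, rate_mib_per_s):
    """Approximate data volume in MiB: elapsed time times average rate."""
    return elapsed_seconds(elapsed) * rate_mib_per_s


RUNS = {
    "6.6.36, fresh btrfs": ("0:03:53", 231),
    "rc6+branch+2nd patch, fresh btrfs": ("0:03:14", 278),
    "rc6+branch+3rd patch, fresh btrfs": ("0:03:40", 245),
    "6.6.36, aged snapshots": ("0:02:20", 726),
    "rc6+branch+3rd patch, aged snapshots": ("0:02:03", 826),
}

if __name__ == "__main__":
    for name, (elapsed, rate) in RUNS.items():
        gib = total_mib(elapsed, rate) / 1024
        print(f"{name}: about {gib:.1f} GiB")
```

The fresh-btrfs runs all come out at roughly the same volume, and the snapshot runs sit just under the 100G limit given to pv with -s and -S, so the timings compare like with like.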
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-06 0:11 ` Andrea Gelmini
@ 2024-07-06 12:07 ` Andrea Gelmini
2024-07-06 17:37 ` Filipe Manana
0 siblings, 1 reply; 56+ messages in thread
From: Andrea Gelmini @ 2024-07-06 12:07 UTC (permalink / raw)
To: Filipe Manana
Cc: Mikhail Gavrilov, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Sat, Jul 6, 2024 at 2:11 AM Andrea Gelmini
<andrea.gelmini@gmail.com> wrote:
> For the moment it seems we have a winner!
I confirm this, but I forgot to add this (a lot of these):
[sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm firefox-bin nr_to_scan 2
[sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm firefox-bin nr_to_scan 2
[sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
[sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
shrinker already running, comm cc1plus nr_to_scan 2
Just for the record, this was while compiling LibreOffice.
Meanwhile I was also running restic (a full backup, to force reading
everything), with no sluggishness at all.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-06 12:07 ` Andrea Gelmini
@ 2024-07-06 17:37 ` Filipe Manana
2024-07-07 9:41 ` Filipe Manana
2024-07-07 11:35 ` Mikhail Gavrilov
0 siblings, 2 replies; 56+ messages in thread
From: Filipe Manana @ 2024-07-06 17:37 UTC (permalink / raw)
To: Andrea Gelmini
Cc: Mikhail Gavrilov, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Sat, Jul 6, 2024 at 1:07 PM Andrea Gelmini <andrea.gelmini@gmail.com> wrote:
>
> On Sat, Jul 6, 2024 at 02:11, Andrea Gelmini
> <andrea.gelmini@gmail.com> wrote:
> > For the moment it seems we have a winner!
>
> I confirm this, but I forgot to add this (a lot of these):
Oh, those I added on purpose to confirm what the bpftrace logs
suggested: concurrent calls into the shrinker.
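For readers following along, the idea behind such a warning can be
sketched outside the kernel. This is only a minimal Python illustration
and not the code in the test branch: the class name Shrinker, the scan()
method and the non-blocking lock are assumptions made for the example;
only the comm and nr_to_scan fields come from the log lines. A caller
that finds the shrinker busy is recorded and returns without scanning.

```python
import threading


class Shrinker:
    """Hypothetical stand-in for a shrinker that allows one scan at a time."""

    def __init__(self):
        self._running = threading.Lock()
        self.skipped = []  # (comm, nr_to_scan) of callers that found it busy

    def scan(self, comm, nr_to_scan, work):
        # Try to become the only active scanner without waiting.
        if not self._running.acquire(blocking=False):
            # Someone else is already scanning: record it and bail out,
            # which is the situation the warning in the log reports.
            self.skipped.append((comm, nr_to_scan))
            return 0
        try:
            return work(nr_to_scan)
        finally:
            self._running.release()
```

Returning early instead of waiting keeps a second caller from piling up
behind the first one, at the cost of that caller freeing nothing.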
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm firefox-bin nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm firefox-bin nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
>
> Just for the record, compiling LibreOffice.
>
> In the meanwhile running restic (full backup to force read
> everything), no sluggish at all.
That's great!
So I've been working on a proper approach following all those test
results from you and Mikhail, and I would like to ask you both to try
this branch:
https://git.kernel.org/pub/scm/linux/kernel/git/fdmanana/linux.git/log/?h=test3_em_shrinker_6.10
Again, this is based on 6.10-rc6 plus 3 fixes for this issue you're both having.
Can you guys test that branch?
Thank you a lot for all the time spent on this!
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-06 17:37 ` Filipe Manana
@ 2024-07-07 9:41 ` Filipe Manana
2024-07-07 10:15 ` Andrea Gelmini
2024-07-07 11:35 ` Mikhail Gavrilov
1 sibling, 1 reply; 56+ messages in thread
From: Filipe Manana @ 2024-07-07 9:41 UTC (permalink / raw)
To: Andrea Gelmini
Cc: Mikhail Gavrilov, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Sat, Jul 6, 2024 at 6:37 PM Filipe Manana <fdmanana@kernel.org> wrote:
>
> On Sat, Jul 6, 2024 at 1:07 PM Andrea Gelmini <andrea.gelmini@gmail.com> wrote:
> >
> > On Sat, Jul 6, 2024 at 02:11, Andrea Gelmini
> > <andrea.gelmini@gmail.com> wrote:
> > > For the moment it seems we have a winner!
> >
> > I confirm this, but I forgot to add this (a lot of these):
>
> Oh, those I added on purpose to confirm what the bpftrace logs
> suggested: concurrent calls into the shrinker.
>
>
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm firefox-bin nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm firefox-bin nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> >
> > Just for the record, compiling LibreOffice.
> >
> > In the meanwhile running restic (full backup to force read
> > everything), no sluggish at all.
>
> That's great!
>
> So I've been working on a proper approach following all those test
> results from you and Mikhail, and I would like to ask you both to try
> this branch:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/fdmanana/linux.git/log/?h=test3_em_shrinker_6.10
>
> Again, this is based on 6.10-rc6 plus 3 fixes for this issue you're both having.
>
> Can you guys test that branch?
I just updated the branch with a last minute change to avoid an
unnecessary reschedule and re-lock, therefore helping reduce latency.
Thanks.
>
> Thank you a lot for all the time spent on this!
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-07 9:41 ` Filipe Manana
@ 2024-07-07 10:15 ` Andrea Gelmini
2024-07-07 10:28 ` Filipe Manana
0 siblings, 1 reply; 56+ messages in thread
From: Andrea Gelmini @ 2024-07-07 10:15 UTC (permalink / raw)
To: Filipe Manana
Cc: Mikhail Gavrilov, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Sun, Jul 7, 2024 at 11:41, Filipe Manana
<fdmanana@kernel.org> wrote:
> > Again, this is based on 6.10-rc6 plus 3 fixes for this issue you're both having.
> >
> > Can you guys test that branch?
I used it yesterday and today. It seems fine. In a quick test I
sometimes see the PSI memory spike over 40, but (the important thing)
there is no effect on interactivity, so I didn't investigate further.
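For anyone who wants to watch the same figure: the kernel exposes memory
PSI in /proc/pressure/memory. A minimal Python sketch for reading it
follows. The function names are made up for this example, and which
average corresponds to the "over 40" spike is an assumption on my part
(I would look at avg10 on the "some" line).

```python
def parse_psi(text):
    """Parse PSI file contents into {'some': {...}, 'full': {...}}."""
    result = {}
    for line in text.strip().splitlines():
        # Each line looks like: "some avg10=0.00 avg60=0.00 avg300=0.00 total=0"
        kind, *fields = line.split()
        result[kind] = {k: float(v) for k, v in (f.split("=") for f in fields)}
    return result


def read_memory_psi(path="/proc/pressure/memory"):
    """Read and parse the memory pressure file (needs a kernel with PSI enabled)."""
    with open(path) as f:
        return parse_psi(f.read())
```

Calling read_memory_psi()["some"]["avg10"] in a loop with a short sleep
is enough to log spikes during a benchmark run.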
Just to be sure: I compiled the latest git with -rc6 and
test3_em_shrinker_6.10, with no other patches.
Anyway, just for the record:
kernel: test3
fresh: 0:03:44 [ 241MiB/s]
aged: 0:02:07 [ 801MiB/s]
funny thing: the next runs of aged took no more than 0:03:22 [ 504MiB/s]
(but, as I wrote, no problem with interactivity).
> I just updated the branch with a last minute change to avoid an
> unnecessary reschedule and re-lock, therefore helping reduce latency.
Ok, recompile now and test!
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-07 10:15 ` Andrea Gelmini
@ 2024-07-07 10:28 ` Filipe Manana
2024-07-07 11:15 ` Andrea Gelmini
0 siblings, 1 reply; 56+ messages in thread
From: Filipe Manana @ 2024-07-07 10:28 UTC (permalink / raw)
To: Andrea Gelmini
Cc: Mikhail Gavrilov, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Sun, Jul 7, 2024 at 11:15 AM Andrea Gelmini <andrea.gelmini@gmail.com> wrote:
>
> On Sun, Jul 7, 2024 at 11:41, Filipe Manana
> <fdmanana@kernel.org> wrote:
> > > Again, this is based on 6.10-rc6 plus 3 fixes for this issue you're both having.
> > >
> > > Can you guys test that branch?
>
> I used it yesterday and today. It seems fine. In a quick test I
> sometimes see the PSI memory spike over 40, but (the important thing)
> there is no effect on interactivity, so I didn't investigate further.
Awesome!
>
> Just to be sure: I compiled the latest git with -rc6 and
> test3_em_shrinker_6.10, with no other patches.
That's right, just that branch. It has all the necessary patches (3),
no need to apply any other patches on top of it.
>
> Anyway, just for the record:
> kernel: test3
> fresh: 0:03:44 [ 241MiB/s]
> aged: 0:02:07 [ 801MiB/s]
> funny thing: the next runs of aged took no more than 0:03:22 [ 504MiB/s]
> (but, as I wrote, no problem with interactivity).
>
> > I just updated the branch with a last minute change to avoid an
> > unnecessary reschedule and re-lock, therefore helping reduce latency.
>
> Ok, recompile now and test!
Thanks! Much appreciated!
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-07 10:28 ` Filipe Manana
@ 2024-07-07 11:15 ` Andrea Gelmini
2024-07-07 12:10 ` Filipe Manana
0 siblings, 1 reply; 56+ messages in thread
From: Andrea Gelmini @ 2024-07-07 11:15 UTC (permalink / raw)
To: Filipe Manana
Cc: Mikhail Gavrilov, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Sun, Jul 7, 2024 at 12:28, Filipe Manana
<fdmanana@kernel.org> wrote:
> > Ok, recompile now and test!
>
> Thanks! Much appreciated!
So, usual benchmark:
fresh: 0:03:16 [ 275MiB/s]
aged: 0:02:30 [ 680MiB/s]
I'll let you know in a few days.
Also, would it make sense to add an option to disable the shrinker via /proc?
Thanks to you,
Gelma
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-07 11:15 ` Andrea Gelmini
@ 2024-07-07 12:10 ` Filipe Manana
0 siblings, 0 replies; 56+ messages in thread
From: Filipe Manana @ 2024-07-07 12:10 UTC (permalink / raw)
To: Andrea Gelmini
Cc: Mikhail Gavrilov, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Sun, Jul 7, 2024 at 12:15 PM Andrea Gelmini <andrea.gelmini@gmail.com> wrote:
>
> On Sun, Jul 7, 2024 at 12:28, Filipe Manana
> <fdmanana@kernel.org> wrote:
>
> > > Ok, recompile now and test!
> >
> > Thanks! Much appreciated!
>
> So, usual benchmark:
> fresh: 0:03:16 [ 275MiB/s]
> aged: 0:02:30 [ 680MiB/s]
>
> I'll let you know in a few days.
> Also, would it make sense to add an option to disable the shrinker via /proc?
Maybe (through sysfs), but the shrinker is important to prevent OOM
situations because otherwise we can create an unlimited number of
extent maps.
It can be triggered by a regular user, intentionally or not.
>
> Thanks to you,
> Gelma
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-06 17:37 ` Filipe Manana
2024-07-07 9:41 ` Filipe Manana
@ 2024-07-07 11:35 ` Mikhail Gavrilov
2024-07-07 12:15 ` Filipe Manana
1 sibling, 1 reply; 56+ messages in thread
From: Mikhail Gavrilov @ 2024-07-07 11:35 UTC (permalink / raw)
To: Filipe Manana
Cc: Andrea Gelmini, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
[-- Attachment #1: Type: text/plain, Size: 2624 bytes --]
On Sat, Jul 6, 2024 at 10:38 PM Filipe Manana <fdmanana@kernel.org> wrote:
> So I've been working on a proper approach following all those test
> results from you and Mikhail, and I would like to ask you both to try
> this branch:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/fdmanana/linux.git/log/?h=test3_em_shrinker_6.10
>
> Again, this is based on 6.10-rc6 plus 3 fixes for this issue you're both having.
>
> Can you guys test that branch?
>
> Thank you a lot for all the time spent on this!
6.10.0-rc6-test1_em_shrinker_6.10
up 1:01
root 269 25.8 0.0 0 0 ? R 10:59 15:47 [kswapd0]
up 2:00
root 269 25.5 0.0 0 0 ? S 10:59 30:46 [kswapd0]
up 3:00
root 269 27.9 0.0 0 0 ? S 10:59 50:18 [kswapd0]
up 4:00
root 269 27.8 0.0 0 0 ? S 10:59 67:08 [kswapd0]
up 5:00
root 269 27.5 0.0 0 0 ? S 10:59 83:01 [kswapd0]
up 6:00
root 269 27.5 0.0 0 0 ? S 10:59 99:31 [kswapd0]
kswapd0 on the test1 branch is as bad as in
https://gist.githubusercontent.com/fdmanana/9cea16ca56594f8c7e20b67dc66c6c94/raw/557bd5f6b37b65d210218f8da8987b74bfe5e515/gistfile1.txt
6.10.0-rc6-test2_em_shrinker_6.10
up 1:00
root 269 11.7 0.0 0 0 ? S 19:23 7:03 [kswapd0]
up 2:02
root 269 11.9 0.0 0 0 ? S 19:23 14:38 [kswapd0]
up 3:00
root 269 11.9 0.0 0 0 ? S 19:23 21:30 [kswapd0]
up 4:01
root 269 11.2 0.0 0 0 ? S 19:23 27:15 [kswapd0]
up 5:00
root 269 11.4 0.0 0 0 ? R Jul06 34:25 [kswapd0]
up 6:00
root 269 13.9 0.0 0 0 ? S Jul06 50:14 [kswapd0]
On the test2 branch, kswapd0 is two times better.
6.10.0-rc6-test3_em_shrinker_6.10 (d22fedf5058d)
up 1:02
root 269 11.0 0.0 0 0 ? S 09:54 6:50 [kswapd0]
up 2:00
root 269 10.7 0.0 0 0 ? S 09:54 12:54 [kswapd0]
up 3:00
root 269 10.1 0.0 0 0 ? S 09:54 18:18 [kswapd0]
up 4:00
root 269 9.5 0.0 0 0 ? S 09:54 23:03 [kswapd0]
up 5:01
root 269 10.0 0.0 0 0 ? S 09:54 30:24 [kswapd0]
up 6:00
root 269 9.9 0.0 0 0 ? S 09:54 35:42 [kswapd0]
On the test3 branch, kswapd0 is three times better.
To catch up with the 6.9 branch, the timing needs to be 4 times better.
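As a rough cross-check of those ratios, here is a small Python sketch
that compares the kswapd0 TIME column at about 6 hours of uptime. The
helper name cpu_minutes is made up for this example, and it assumes the
TIME field is in the MM:SS form shown in the ps output above.

```python
def cpu_minutes(time_field):
    """Convert a ps TIME field such as '99:31' (minutes:seconds) to minutes."""
    minutes, seconds = time_field.split(":")
    return int(minutes) + int(seconds) / 60


# kswapd0 TIME at roughly 6 hours of uptime, copied from the reports above.
at_6h = {"test1": "99:31", "test2": "50:14", "test3": "35:42"}

# How many times less CPU time each branch used compared with test1.
baseline = cpu_minutes(at_6h["test1"])
speedup = {branch: baseline / cpu_minutes(t) for branch, t in at_6h.items()}

for branch, factor in speedup.items():
    print(f"{branch}: {factor:.2f}x")
```

With these inputs test2 comes out at about 2x and test3 at a bit under
3x relative to test1, which matches the summary lines above.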
--
Best Regards,
Mike Gavrilov.
[-- Attachment #2: 6.10.0-rc6-test1_em_shrinker_6.10.zip --]
[-- Type: application/zip, Size: 52872 bytes --]
[-- Attachment #3: 6.10.0-rc6-test2_em_shrinker_6.10.zip --]
[-- Type: application/zip, Size: 45286 bytes --]
[-- Attachment #4: 6.10.0-rc6-test3_em_shrinker_6.10.zip --]
[-- Type: application/zip, Size: 52841 bytes --]
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-07 11:35 ` Mikhail Gavrilov
@ 2024-07-07 12:15 ` Filipe Manana
2024-07-07 19:16 ` Mikhail Gavrilov
0 siblings, 1 reply; 56+ messages in thread
From: Filipe Manana @ 2024-07-07 12:15 UTC (permalink / raw)
To: Mikhail Gavrilov
Cc: Andrea Gelmini, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Sun, Jul 7, 2024 at 12:35 PM Mikhail Gavrilov
<mikhail.v.gavrilov@gmail.com> wrote:
>
> On Sat, Jul 6, 2024 at 10:38 PM Filipe Manana <fdmanana@kernel.org> wrote:
> > So I've been working on a proper approach following all those test
> > results from you and Mikhail, and I would like to ask you both to try
> > this branch:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/fdmanana/linux.git/log/?h=test3_em_shrinker_6.10
> >
> > Again, this is based on 6.10-rc6 plus 3 fixes for this issue you're both having.
> >
> > Can you guys test that branch?
> >
> > Thank you a lot for all the time spent on this!
>
> 6.10.0-rc6-test1_em_shrinker_6.10
> up 1:01
> root 269 25.8 0.0 0 0 ? R 10:59 15:47 [kswapd0]
> up 2:00
> root 269 25.5 0.0 0 0 ? S 10:59 30:46 [kswapd0]
> up 3:00
> root 269 27.9 0.0 0 0 ? S 10:59 50:18 [kswapd0]
> up 4:00
> root 269 27.8 0.0 0 0 ? S 10:59 67:08 [kswapd0]
> up 5:00
> root 269 27.5 0.0 0 0 ? S 10:59 83:01 [kswapd0]
> up 6:00
> root 269 27.5 0.0 0 0 ? S 10:59 99:31 [kswapd0]
> kswapd0 on the test1 branch is as bad as in
> https://gist.githubusercontent.com/fdmanana/9cea16ca56594f8c7e20b67dc66c6c94/raw/557bd5f6b37b65d210218f8da8987b74bfe5e515/gistfile1.txt
>
>
> 6.10.0-rc6-test2_em_shrinker_6.10
> up 1:00
> root 269 11.7 0.0 0 0 ? S 19:23 7:03 [kswapd0]
> up 2:02
> root 269 11.9 0.0 0 0 ? S 19:23 14:38 [kswapd0]
> up 3:00
> root 269 11.9 0.0 0 0 ? S 19:23 21:30 [kswapd0]
> up 4:01
> root 269 11.2 0.0 0 0 ? S 19:23 27:15 [kswapd0]
> up 5:00
> root 269 11.4 0.0 0 0 ? R Jul06 34:25 [kswapd0]
> up 6:00
> root 269 13.9 0.0 0 0 ? S Jul06 50:14 [kswapd0]
> On the test2 branch, kswapd0 is two times better.
>
>
> 6.10.0-rc6-test3_em_shrinker_6.10 (d22fedf5058d)
> up 1:02
> root 269 11.0 0.0 0 0 ? S 09:54 6:50 [kswapd0]
> up 2:00
> root 269 10.7 0.0 0 0 ? S 09:54 12:54 [kswapd0]
> up 3:00
> root 269 10.1 0.0 0 0 ? S 09:54 18:18 [kswapd0]
> up 4:00
> root 269 9.5 0.0 0 0 ? S 09:54 23:03 [kswapd0]
> up 5:01
> root 269 10.0 0.0 0 0 ? S 09:54 30:24 [kswapd0]
> up 6:00
> root 269 9.9 0.0 0 0 ? S 09:54 35:42 [kswapd0]
> On the test3 branch, kswapd0 is three times better.
That's good. And is the DE unresponsiveness gone too?
I see you tested d22fedf5058d, but I updated the branch a couple hours
ago, now the top commit is fa8b5dd7fa18.
Can you test the updated branch? It may help further in your case.
https://git.kernel.org/pub/scm/linux/kernel/git/fdmanana/linux.git/commit/?h=test3_em_shrinker_6.10&id=fa8b5dd7fa18a4dc2ea6bdeaf5525b1af348f383
>
> To catch up with the 6.9 branch, the timing needs to be 4 times better.
Hopefully it will be much closer to that with the updated branch.
The upcoming changes for 6.11 would help there too, but anyway we can
still further optimize on top of the 6.10-rc code.
Thanks Mikhail!
>
> --
> Best Regards,
> Mike Gavrilov.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-07 12:15 ` Filipe Manana
@ 2024-07-07 19:16 ` Mikhail Gavrilov
2024-07-08 14:15 ` Filipe Manana
0 siblings, 1 reply; 56+ messages in thread
From: Mikhail Gavrilov @ 2024-07-07 19:16 UTC (permalink / raw)
To: Filipe Manana
Cc: Andrea Gelmini, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
[-- Attachment #1: Type: text/plain, Size: 1371 bytes --]
On Sun, Jul 7, 2024 at 5:15 PM Filipe Manana <fdmanana@kernel.org> wrote:
> That's good. And is the DE unresponsiveness gone too?
Yes. I don’t know how to objectively measure responsiveness, but
there were no more freezes like those in my video.
> I see you tested d22fedf5058d, but I updated the branch a couple hours
> ago, now the top commit is fa8b5dd7fa18.
> Can you test the updated branch? It may help further in your case.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/fdmanana/linux.git/commit/?h=test3_em_shrinker_6.10&id=fa8b5dd7fa18a4dc2ea6bdeaf5525b1af348f383
6.10.0-rc6-test3_em_shrinker_6.10-fa8b5dd7fa18
up 1:00
root 269 13.1 0.0 0 0 ? S 18:01 7:54 [kswapd0]
up 2:00
root 269 9.8 0.0 0 0 ? S 18:01 11:46 [kswapd0]
up 3:00
root 269 10.8 0.0 0 0 ? S 18:01 19:36 [kswapd0]
up 4:00
root 269 11.9 0.0 0 0 ? R 18:01 28:37 [kswapd0]
up 5:00
root 269 13.1 0.0 0 0 ? S 18:01 39:29 [kswapd0]
up 6:00
root 269 13.1 0.0 0 0 ? S Jul07 47:24 [kswapd0]
It’s as if kswapd0 got worse based on time measurements (it became
like on the test2 branch), but subjectively, the responsiveness got
better.
--
Best Regards,
Mike Gavrilov.
[-- Attachment #2: 6.10.0-rc6-test3_em_shrinker_6.10-fa8b5dd7fa18.zip --]
[-- Type: application/zip, Size: 54067 bytes --]
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-07 19:16 ` Mikhail Gavrilov
@ 2024-07-08 14:15 ` Filipe Manana
2024-07-10 9:24 ` Mikhail Gavrilov
0 siblings, 1 reply; 56+ messages in thread
From: Filipe Manana @ 2024-07-08 14:15 UTC (permalink / raw)
To: Mikhail Gavrilov
Cc: Andrea Gelmini, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Sun, Jul 7, 2024 at 8:16 PM Mikhail Gavrilov
<mikhail.v.gavrilov@gmail.com> wrote:
>
> On Sun, Jul 7, 2024 at 5:15 PM Filipe Manana <fdmanana@kernel.org> wrote:
> > That's good. And is the DE unresponsiveness gone too?
>
> Yes. I don’t know how to objectively measure responsiveness, but
> there were no more freezes like those in my video.
That's good.
>
> > I see you tested d22fedf5058d, but I updated the branch a couple hours
> > ago, now the top commit is fa8b5dd7fa18.
> > Can you test the updated branch? It may help further in your case.
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/fdmanana/linux.git/commit/?h=test3_em_shrinker_6.10&id=fa8b5dd7fa18a4dc2ea6bdeaf5525b1af348f383
>
> 6.10.0-rc6-test3_em_shrinker_6.10-fa8b5dd7fa18
> up 1:00
> root 269 13.1 0.0 0 0 ? S 18:01 7:54 [kswapd0]
> up 2:00
> root 269 9.8 0.0 0 0 ? S 18:01 11:46 [kswapd0]
> up 3:00
> root 269 10.8 0.0 0 0 ? S 18:01 19:36 [kswapd0]
> up 4:00
> root 269 11.9 0.0 0 0 ? R 18:01 28:37 [kswapd0]
> up 5:00
> root 269 13.1 0.0 0 0 ? S 18:01 39:29 [kswapd0]
> up 6:00
> root 269 13.1 0.0 0 0 ? S Jul07 47:24 [kswapd0]
>
> It’s as if kswapd0 got worse based on time measurements (it became
> like on the test2 branch), but subjectively, the responsiveness got
> better.
That's weird, I think you might be observing some variance.
I noticed that too in your reports for the test2 branch and the old
test3 branch, which were nearly identical, yet you got a very
significant difference between them.
Thanks.
>
> --
> Best Regards,
> Mike Gavrilov.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-08 14:15 ` Filipe Manana
@ 2024-07-10 9:24 ` Mikhail Gavrilov
2024-07-10 10:53 ` Filipe Manana
0 siblings, 1 reply; 56+ messages in thread
From: Mikhail Gavrilov @ 2024-07-10 9:24 UTC (permalink / raw)
To: Filipe Manana
Cc: Andrea Gelmini, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Mon, Jul 8, 2024 at 7:16 PM Filipe Manana <fdmanana@kernel.org> wrote:
>
> That's weird, I think you might be observing some variance.
> I noticed that too in your reports for the test2 branch and the old
> test3 branch, which were nearly identical, yet you got a very
> significant difference between them.
>
> Thanks.
>
up 1:00
root 269 10.2 0.0 0 0 ? S 10:06 6:13 [kswapd0]
up 2:01
root 269 9.1 0.0 0 0 ? S 10:06 11:07 [kswapd0]
up 3:00
root 269 8.4 0.0 0 0 ? R 10:06 15:18 [kswapd0]
up 4:21
root 269 11.7 0.0 0 0 ? S 10:06 30:33 [kswapd0]
up 5:01
root 269 11.7 0.0 0 0 ? S 10:06 35:19 [kswapd0]
up 6:27
root 269 11.5 0.0 0 0 ? S 10:06 44:39 [kswapd0]
up 7:00
root 269 11.2 0.0 0 0 ? R 10:06 47:18 [kswapd0]
The measurement error can reach ±10 min.
Do you plan to merge the fix before the 6.10 release?
--
Best Regards,
Mike Gavrilov.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-10 9:24 ` Mikhail Gavrilov
@ 2024-07-10 10:53 ` Filipe Manana
2024-08-11 8:08 ` Jannik Glückert
0 siblings, 1 reply; 56+ messages in thread
From: Filipe Manana @ 2024-07-10 10:53 UTC (permalink / raw)
To: Mikhail Gavrilov
Cc: Andrea Gelmini, Linux List Kernel Mailing,
Linux regressions mailing list, Btrfs BTRFS, dsterba, josef
On Wed, Jul 10, 2024 at 10:24 AM Mikhail Gavrilov
<mikhail.v.gavrilov@gmail.com> wrote:
>
> On Mon, Jul 8, 2024 at 7:16 PM Filipe Manana <fdmanana@kernel.org> wrote:
> >
> > That's weird, I think you might be observing some variance.
> > I noticed that too in your reports for the test2 branch and the old
> > test3 branch, which were nearly identical, yet you got a very
> > significant difference between them.
> >
> > Thanks.
> >
>
> up 1:00
> root 269 10.2 0.0 0 0 ? S 10:06 6:13 [kswapd0]
> up 2:01
> root 269 9.1 0.0 0 0 ? S 10:06 11:07 [kswapd0]
> up 3:00
> root 269 8.4 0.0 0 0 ? R 10:06 15:18 [kswapd0]
> up 4:21
> root 269 11.7 0.0 0 0 ? S 10:06 30:33 [kswapd0]
> up 5:01
> root 269 11.7 0.0 0 0 ? S 10:06 35:19 [kswapd0]
> up 6:27
> root 269 11.5 0.0 0 0 ? S 10:06 44:39 [kswapd0]
> up 7:00
> root 269 11.2 0.0 0 0 ? R 10:06 47:18 [kswapd0]
>
> The measurement error can reach ±10 min.
> Do you plan to merge the fix before the 6.10 release?
I've submitted a patchset with the goal to apply against 6.10 (see the
notes there in the cover letter):
https://lore.kernel.org/linux-btrfs/cover.1720448663.git.fdmanana@suse.com/
But it's up to David to submit to Linus, as he's the maintainer.
Though I haven't heard from him yet.
I plan at least one more improvement for the shrinker, but I would
like to know too if those patches go into 6.10 before it's released or
not, because there are conflicts with the for-next branch.
> --
> Best Regards,
> Mike Gavrilov.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-07-10 10:53 ` Filipe Manana
@ 2024-08-11 8:08 ` Jannik Glückert
2024-08-11 15:33 ` Filipe Manana
0 siblings, 1 reply; 56+ messages in thread
From: Jannik Glückert @ 2024-08-11 8:08 UTC (permalink / raw)
To: fdmanana
Cc: andrea.gelmini, dsterba, josef, linux-btrfs, linux-kernel,
mikhail.v.gavrilov, regressions
[-- Attachment #1: Type: text/plain, Size: 794 bytes --]
Hello,
I am still encountering this issue on 6.10.3. As far as I can see this
is the last post in the thread, if the discussion continued elsewhere
please let me know.
My workload is a backup via restic, the system is idle otherwise.
This is on a Zen4 CPU with a very fast PCIe Gen4 nvme, so perhaps it was
fixed for others because they had comparatively slow IO or a smaller
workload?
I have attached the bpftrace run and a graph of the memory PSI. kswapd0
is at 100% during the critical sections. dmesg is empty.
Important events were e.g. 09:31-09:32 and 09:33-09:34 where the system
was completely unresponsive multiple times, for about 5 seconds at a time.
I did also mention this on the #btrfs IRC channel and there are other
users still encountering this on 6.10.
Best
Jannik
[-- Attachment #2: psi_memory.jpg --]
[-- Type: image/jpeg, Size: 297648 bytes --]
[-- Attachment #3: bpftrace.log.gz --]
[-- Type: application/gzip, Size: 178863 bytes --]
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-08-11 8:08 ` Jannik Glückert
@ 2024-08-11 15:33 ` Filipe Manana
2024-08-14 21:24 ` Jannik Glückert
2024-08-15 22:21 ` intelfx
0 siblings, 2 replies; 56+ messages in thread
From: Filipe Manana @ 2024-08-11 15:33 UTC (permalink / raw)
To: Jannik Glückert
Cc: andrea.gelmini, dsterba, josef, linux-btrfs, linux-kernel,
mikhail.v.gavrilov, regressions
On Sun, Aug 11, 2024 at 9:08 AM Jannik Glückert
<jannik.glueckert@gmail.com> wrote:
>
> Hello,
>
> I am still encountering this issue on 6.10.3. As far as I can see this
> is the last post in the thread, if the discussion continued elsewhere
> please let me know.
>
> My workload is a backup via restic, the system is idle otherwise.
> This is on a Zen4 CPU with a very fast PCIe Gen4 nvme, so perhaps it was
> fixed for others because they had comparatively slow IO or a smaller
> workload?
>
> I have attached the bpftrace run and a graph of the memory PSI. kswapd0
> is at 100% during the critical sections. dmesg is empty.
> Important events were e.g. 09:31-09:32 and 09:33-09:34 where the system
> was completely unresponsive multiple times, for about 5 seconds at a time.
>
> I did also mention this on the #btrfs IRC channel and there are other
> users still encountering this on 6.10
This came to my attention a couple days ago in a bugzilla report here:
https://bugzilla.kernel.org/show_bug.cgi?id=219121
There are also 2 other recent threads on the mailing list about it.
There's a fix there in the bugzilla, and I've just sent it to the mailing list.
In case you want to try it:
https://lore.kernel.org/linux-btrfs/d85d72b968a1f7b8538c581eeb8f5baa973dfc95.1723377230.git.fdmanana@suse.com/
Thanks.
>
> Best
> Jannik
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-08-11 15:33 ` Filipe Manana
@ 2024-08-14 21:24 ` Jannik Glückert
2024-08-15 22:21 ` intelfx
1 sibling, 0 replies; 56+ messages in thread
From: Jannik Glückert @ 2024-08-14 21:24 UTC (permalink / raw)
To: Filipe Manana
Cc: andrea.gelmini, dsterba, josef, linux-btrfs, linux-kernel,
mikhail.v.gavrilov, regressions
On 8/11/24 17:33, Filipe Manana wrote:
> There's a fix there in the bugzilla, and I've just sent it to the mailing list.
> In case you want to try it:
>
> https://lore.kernel.org/linux-btrfs/d85d72b968a1f7b8538c581eeb8f5baa973dfc95.1723377230.git.fdmanana@suse.com/
>
> Thanks
Hi Filipe,
this patch mostly fixes the issue, but I still get a 1-2 second window
of freezing every now and then. I also still see long periods of 100%
kswapd0 usage, but without the periodic freezing.
Thanks
Jannik
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-08-11 15:33 ` Filipe Manana
2024-08-14 21:24 ` Jannik Glückert
@ 2024-08-15 22:21 ` intelfx
2024-08-15 23:17 ` intelfx
1 sibling, 1 reply; 56+ messages in thread
From: intelfx @ 2024-08-15 22:21 UTC (permalink / raw)
To: Filipe Manana, Jannik Glückert
Cc: andrea.gelmini, dsterba, josef, linux-btrfs, linux-kernel,
mikhail.v.gavrilov, regressions
[-- Attachment #1: Type: text/plain, Size: 1027 bytes --]
On 2024-08-11 at 16:33 +0100, Filipe Manana wrote:
> <...>
> This came to my attention a couple days ago in a bugzilla report here:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=219121
>
> There's also 2 other recent threads in the mailing about it.
>
> There's a fix there in the bugzilla, and I've just sent it to the mailing list.
> In case you want to try it:
>
> https://lore.kernel.org/linux-btrfs/d85d72b968a1f7b8538c581eeb8f5baa973dfc95.1723377230.git.fdmanana@suse.com/
>
> Thanks.
Hello,
I confirm that excessive "system" CPU usage by kswapd and btrfs-cleaner
kernel threads is still happening on the latest 6.10 stable with all
quoted patches applied, making the system close to unusable (not to
mention excessive power usage which crosses the line well *into*
"unusable" for low-power systems such as laptops).
With just 5 minutes of uptime on a freshly booted 6.10.5 system, the
cumulative CPU time of kswapd is already at 2 minutes.
Regards,
--
Ivan Shapovalov / intelfx /
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 862 bytes --]
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-08-15 22:21 ` intelfx
@ 2024-08-15 23:17 ` intelfx
2024-08-16 0:02 ` David Sterba
` (2 more replies)
0 siblings, 3 replies; 56+ messages in thread
From: intelfx @ 2024-08-15 23:17 UTC (permalink / raw)
To: Filipe Manana, Jannik Glückert
Cc: andrea.gelmini, dsterba, josef, linux-btrfs, linux-kernel,
mikhail.v.gavrilov, regressions
[-- Attachment #1: Type: text/plain, Size: 1508 bytes --]
On 2024-08-16 at 00:21 +0200, intelfx@intelfx.name wrote:
> On 2024-08-11 at 16:33 +0100, Filipe Manana wrote:
> > <...>
> > This came to my attention a couple days ago in a bugzilla report here:
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=219121
> >
> > There's also 2 other recent threads in the mailing about it.
> >
> > There's a fix there in the bugzilla, and I've just sent it to the mailing list.
> > In case you want to try it:
> >
> > https://lore.kernel.org/linux-btrfs/d85d72b968a1f7b8538c581eeb8f5baa973dfc95.1723377230.git.fdmanana@suse.com/
> >
> > Thanks.
>
> Hello,
>
> I confirm that excessive "system" CPU usage by kswapd and btrfs-cleaner
> kernel threads is still happening on the latest 6.10 stable with all
> quoted patches applied, making the system close to unusable (not to
> mention excessive power usage which crosses the line well *into*
> "unusable" for low-power systems such as laptops).
>
> With just 5 minutes of uptime on a freshly booted 6.10.5 system, the
> cumulative CPU time of kswapd is already at 2 minutes.
As a follow-up, after 1 hour of uptime of this system the total CPU
time of kswapd0 is exactly 30 minutes. So whatever is the theoretical
OOM issue that the extent map shrinker is trying to solve, the solution
in its current form is clearly unacceptable.
Can we please have it reverted on the basis of this severe regression,
until a better solution is found?
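To put the two figures reported above on a common scale, a minimal sketch in C of the arithmetic: cumulative CPU time divided by elapsed uptime. The helper name `cpu_share_percent()` is hypothetical and chosen only for illustration; the result is relative to a single CPU core.

```c
#include <assert.h>

/*
 * Share of elapsed wall-clock time that a task spent on CPU, in percent.
 * Hypothetical helper for illustration; both arguments are in seconds.
 */
double cpu_share_percent(unsigned long cpu_seconds, unsigned long elapsed_seconds)
{
    assert(elapsed_seconds > 0);
    return 100.0 * (double)cpu_seconds / (double)elapsed_seconds;
}
```

With these inputs, 2 minutes of kswapd CPU time over 5 minutes of uptime is 40% of one core, and 30 minutes over 1 hour is 50%.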
Thanks,
--
Ivan Shapovalov / intelfx /
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 862 bytes --]
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-08-15 23:17 ` intelfx
@ 2024-08-16 0:02 ` David Sterba
2024-08-16 6:42 ` Andrea Gelmini
2024-08-16 10:58 ` Filipe Manana
2 siblings, 0 replies; 56+ messages in thread
From: David Sterba @ 2024-08-16 0:02 UTC (permalink / raw)
To: intelfx
Cc: Filipe Manana, Jannik Glückert, andrea.gelmini, dsterba,
josef, linux-btrfs, linux-kernel, mikhail.v.gavrilov, regressions
On Fri, Aug 16, 2024 at 01:17:25AM +0200, intelfx@intelfx.name wrote:
> On 2024-08-16 at 00:21 +0200, intelfx@intelfx.name wrote:
> > On 2024-08-11 at 16:33 +0100, Filipe Manana wrote:
> > > <...>
> > > This came to my attention a couple days ago in a bugzilla report here:
> > >
> > > https://bugzilla.kernel.org/show_bug.cgi?id=219121
> > >
> > > There's also 2 other recent threads in the mailing about it.
> > >
> > > There's a fix there in the bugzilla, and I've just sent it to the mailing list.
> > > In case you want to try it:
> > >
> > > https://lore.kernel.org/linux-btrfs/d85d72b968a1f7b8538c581eeb8f5baa973dfc95.1723377230.git.fdmanana@suse.com/
> > >
> > > Thanks.
> >
> > Hello,
> >
> > I confirm that excessive "system" CPU usage by kswapd and btrfs-cleaner
> > kernel threads is still happening on the latest 6.10 stable with all
> > quoted patches applied, making the system close to unusable (not to
> > mention excessive power usage which crosses the line well *into*
> > "unusable" for low-power systems such as laptops).
> >
> > With just 5 minutes of uptime on a freshly booted 6.10.5 system, the
> > cumulative CPU time of kswapd is already at 2 minutes.
>
> As a follow-up, after 1 hour of uptime of this system the total CPU
> time of kswapd0 is exactly 30 minutes. So whatever is the theoretical
> OOM issue that the extent map shrinker is trying to solve, the solution
> in its current form is clearly unacceptable.
>
> Can we please have it reverted on the basis of this severe regression,
> until a better solution is found?
It's not just one patch, so a clean revert may not be possible. I'll see
if there's another possibility, either to avoid depending on the shrinker
to free the data or to do a different workaround.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-08-15 23:17 ` intelfx
2024-08-16 0:02 ` David Sterba
@ 2024-08-16 6:42 ` Andrea Gelmini
2024-08-16 6:47 ` Ivan Shapovalov
2024-08-16 10:58 ` Filipe Manana
2 siblings, 1 reply; 56+ messages in thread
From: Andrea Gelmini @ 2024-08-16 6:42 UTC (permalink / raw)
To: intelfx
Cc: Filipe Manana, Jannik Glückert, dsterba, josef, linux-btrfs,
linux-kernel, mikhail.v.gavrilov, regressions
On Fri, Aug 16, 2024 at 01:17 <intelfx@intelfx.name> wrote:
> Can we please have it reverted on the basis of this severe regression,
> until a better solution is found?
To disable the shrinker I simply remove two items:
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index f05cce7c8b8d..4f958ba61e0e 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2410,8 +2410,6 @@ static const struct super_operations btrfs_super_ops = {
.statfs = btrfs_statfs,
.freeze_fs = btrfs_freeze,
.unfreeze_fs = btrfs_unfreeze,
- .nr_cached_objects = btrfs_nr_cached_objects,
- .free_cached_objects = btrfs_free_cached_objects,
};
static const struct file_operations btrfs_ctl_fops = {
This is from my thread with Filipe about the same topic, which you can
find in the mailing list archive.
Ciao,
Gelma
^ permalink raw reply related [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-08-16 6:42 ` Andrea Gelmini
@ 2024-08-16 6:47 ` Ivan Shapovalov
2024-08-16 7:45 ` Qu Wenruo
0 siblings, 1 reply; 56+ messages in thread
From: Ivan Shapovalov @ 2024-08-16 6:47 UTC (permalink / raw)
To: Andrea Gelmini; +Cc: linux-btrfs, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1225 bytes --]
On 2024-08-16 at 08:42 +0200, Andrea Gelmini wrote:
> On Fri, Aug 16, 2024 at 01:17 <intelfx@intelfx.name> wrote:
> > Can we please have it reverted on the basis of this severe regression,
> > until a better solution is found?
>
> To disable the shrinker I simply remove two items:
>
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index f05cce7c8b8d..4f958ba61e0e 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -2410,8 +2410,6 @@ static const struct super_operations btrfs_super_ops = {
> .statfs = btrfs_statfs,
> .freeze_fs = btrfs_freeze,
> .unfreeze_fs = btrfs_unfreeze,
> - .nr_cached_objects = btrfs_nr_cached_objects,
> - .free_cached_objects = btrfs_free_cached_objects,
> };
>
> static const struct file_operations btrfs_ctl_fops = {
>
> This is from my thread with Filipe about same topic you can find in
> the mailing list archive.
Yes, that's what I did locally so far, on those systems that I _can_
run custom kernels on. The others I had to downgrade to 6.9 for the
time being. So I do have a vested interest in this being resolved in
the mainline/stable tree :-)
--
Ivan Shapovalov / intelfx /
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 862 bytes --]
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-08-16 6:47 ` Ivan Shapovalov
@ 2024-08-16 7:45 ` Qu Wenruo
0 siblings, 0 replies; 56+ messages in thread
From: Qu Wenruo @ 2024-08-16 7:45 UTC (permalink / raw)
To: Ivan Shapovalov, Andrea Gelmini; +Cc: linux-btrfs, linux-kernel
On 2024/8/16 16:17, Ivan Shapovalov wrote:
> On 2024-08-16 at 08:42 +0200, Andrea Gelmini wrote:
>> On Fri, Aug 16, 2024 at 01:17 <intelfx@intelfx.name> wrote:
>>> Can we please have it reverted on the basis of this severe regression,
>>> until a better solution is found?
>>
>> To disable the shrinker I simply remove two items:
>>
>> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
>> index f05cce7c8b8d..4f958ba61e0e 100644
>> --- a/fs/btrfs/super.c
>> +++ b/fs/btrfs/super.c
>> @@ -2410,8 +2410,6 @@ static const struct super_operations btrfs_super_ops = {
>> .statfs = btrfs_statfs,
>> .freeze_fs = btrfs_freeze,
>> .unfreeze_fs = btrfs_unfreeze,
>> - .nr_cached_objects = btrfs_nr_cached_objects,
>> - .free_cached_objects = btrfs_free_cached_objects,
>> };
>>
>> static const struct file_operations btrfs_ctl_fops = {
>>
>> This is from my thread with Filipe about same topic you can find in
>> the mailing list archive.
>
> Yes, that's what I did locally so far, on those systems that I _can_
> run custom kernels on. The others I had to downgrade to 6.9 for the
> time being. So I do have a vested interest in this being resolved in
> the mainline/stable tree :-)
>
That's the most straightforward way to revert to the previous behavior.
Or you can try this patch, which is less obvious but should do the same
thing:
https://lore.kernel.org/linux-btrfs/09ca70ddac244d13780bd82866b8b708088362fb.1723770634.git.wqu@suse.com/T/#u
Meanwhile after looking into how XFS triggers its reclaim, I believe we
should not even bother using those callbacks.
XFS handles the trigger by making sure there is only one reclaim
workload queued, and the workload is always delayed by 18s by default.
So for btrfs, I believe it's better to do the reclaim in the cleaner thread.
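The "only one queued, always delayed" pattern described above can be sketched in plain userspace C. This is an illustration only, not XFS or btrfs code: `struct delayed_reclaim`, `reclaim_request()` and `reclaim_tick()` are hypothetical names, time is passed in explicitly as milliseconds, and the delay is a parameter (the 18s figure is the one mentioned above, not something verified here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state for a single, delayed reclaim work item. */
struct delayed_reclaim {
    bool queued;      /* true while one run is pending */
    uint64_t due_ms;  /* time at which the pending run may start */
    unsigned runs;    /* number of reclaim passes executed so far */
};

/*
 * Ask for a reclaim pass. If one is already pending, the request is
 * dropped, so concurrent callers can never pile up more than one run.
 * Returns true only when a new run was queued.
 */
bool reclaim_request(struct delayed_reclaim *r, uint64_t now_ms, uint64_t delay_ms)
{
    assert(r);
    if (r->queued)
        return false;
    r->queued = true;
    r->due_ms = now_ms + delay_ms;
    return true;
}

/*
 * Called periodically by the worker. Runs the pending pass once its
 * delay has expired. Returns true if a pass was executed.
 */
bool reclaim_tick(struct delayed_reclaim *r, uint64_t now_ms)
{
    assert(r);
    if (!r->queued || now_ms < r->due_ms)
        return false;
    r->queued = false;
    r->runs++;  /* the actual reclaim work would happen here */
    return true;
}
```

The point of the design is that memory-pressure callers only flip a flag, so the cost of reclaim is bounded to one pass per delay period no matter how many tasks ask for it at the same time.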
I will craft a proper fix for you guys to test, and since Filipe is on
vacation, we may just disable the reclaim workload for now.
Thanks,
Qu
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-08-15 23:17 ` intelfx
2024-08-16 0:02 ` David Sterba
2024-08-16 6:42 ` Andrea Gelmini
@ 2024-08-16 10:58 ` Filipe Manana
2024-08-16 11:16 ` Ivan Shapovalov
2 siblings, 1 reply; 56+ messages in thread
From: Filipe Manana @ 2024-08-16 10:58 UTC (permalink / raw)
To: intelfx
Cc: Jannik Glückert, andrea.gelmini, dsterba, josef, linux-btrfs,
linux-kernel, mikhail.v.gavrilov, regressions
On Fri, Aug 16, 2024 at 12:17 AM <intelfx@intelfx.name> wrote:
>
> On 2024-08-16 at 00:21 +0200, intelfx@intelfx.name wrote:
> > On 2024-08-11 at 16:33 +0100, Filipe Manana wrote:
> > > <...>
> > > This came to my attention a couple days ago in a bugzilla report here:
> > >
> > > https://bugzilla.kernel.org/show_bug.cgi?id=219121
> > >
> > > There's also 2 other recent threads in the mailing about it.
> > >
> > > There's a fix there in the bugzilla, and I've just sent it to the mailing list.
> > > In case you want to try it:
> > >
> > > https://lore.kernel.org/linux-btrfs/d85d72b968a1f7b8538c581eeb8f5baa973dfc95.1723377230.git.fdmanana@suse.com/
> > >
> > > Thanks.
> >
> > Hello,
> >
> > I confirm that excessive "system" CPU usage by kswapd and btrfs-cleaner
> > kernel threads is still happening on the latest 6.10 stable with all
> > quoted patches applied, making the system close to unusable (not to
> > mention excessive power usage which crosses the line well *into*
> > "unusable" for low-power systems such as laptops).
> >
> > With just 5 minutes of uptime on a freshly booted 6.10.5 system, the
> > cumulative CPU time of kswapd is already at 2 minutes.
Less than 24 hours before your message, there was a patch merged to
Linus' tree, which was not (and is not) yet in any stable release
(including 6.10.5 of course).
Have you tried that patch?
>
> As a follow-up, after 1 hour of uptime of this system the total CPU
> time of kswapd0 is exactly 30 minutes. So whatever is the theoretical
> OOM issue that the extent map shrinker is trying to solve, the solution
It's not a theoretical problem.
It's a problem that any unprivileged user can trigger provided that
the amount of available disk space is much higher than total RAM,
which is by far the most common case.
The problem is explained in the commit change log, there's a
reproducer and it was even reported by a user:
https://lore.kernel.org/linux-btrfs/13f94633dcf04d29aaf1f0a43d42c55e@amazon.com/
This link was included in the changelog of the patch when submitted to
the list [1], but somehow it disappeared when it was merged to the git
repository.
Any user can effectively trigger a denial of service by creating an
unlimited number of extent maps that never get removed while it keeps
a file descriptor open and doing writes, either with direct IO, which
is simpler, or even buffered IO in case it creates holes in the files
(example: keep doing append writes starting after current eof, to
create a bunch of holes). Even if the task doing that gets killed by
the OOM killer, as long as there are idle processes keeping the file open,
the problem doesn't go away.
[1] https://lore.kernel.org/linux-btrfs/1cb649870b6cad4411da7998735ab1141bb9f2f0.1712837044.git.fdmanana@suse.com/
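The buffered IO case described above (append writes starting after the current eof) boils down to the following I/O pattern. This is a bounded sketch in userspace C, not the reproducer from the commit change log; `write_with_holes()` and its parameters are hypothetical names chosen for illustration. How many extent maps the kernel ends up keeping for such a file is not something this sketch measures.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write `count` chunks of `len` bytes to fd, each one starting `gap`
 * bytes past the current end of file, so that every write leaves a
 * hole behind it. Returns the resulting file size, or -1 on error
 * (including a short write).
 */
off_t write_with_holes(int fd, int count, size_t len, off_t gap)
{
    assert(count >= 0 && gap >= 0);
    if (len == 0)
        return -1;

    char *buf = malloc(len);
    if (!buf)
        return -1;
    memset(buf, 'x', len);

    /* Start from wherever the file currently ends. */
    off_t eof = lseek(fd, 0, SEEK_END);
    if (eof < 0) {
        free(buf);
        return -1;
    }

    for (int i = 0; i < count; i++) {
        off_t offset = eof + gap;  /* skip `gap` bytes: this is the hole */
        ssize_t n = pwrite(fd, buf, len, offset);
        if (n < 0 || (size_t)n != len) {
            free(buf);
            return -1;
        }
        eof = offset + (off_t)len;
    }

    free(buf);
    return eof;
}
```

Each iteration grows the file by `gap + len` bytes while only `len` bytes are actually written, which is why the number of separate written regions can grow much faster than the memory or disk space a user appears to consume.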
> in its current form is clearly unacceptable.
>
> Can we please have it reverted on the basis of this severe regression,
> until a better solution is found?
Disabling the shrinker might be the best for now. I'm on vacation and
can't write and test code, but I do have plans for making it better
and solving any remaining issues.
There's already a patch for that from Qu.
>
> Thanks,
> --
> Ivan Shapovalov / intelfx /
>
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-08-16 10:58 ` Filipe Manana
@ 2024-08-16 11:16 ` Ivan Shapovalov
2024-09-26 13:45 ` Filipe Manana
0 siblings, 1 reply; 56+ messages in thread
From: Ivan Shapovalov @ 2024-08-16 11:16 UTC (permalink / raw)
To: Filipe Manana
Cc: Jannik Glückert, andrea.gelmini, dsterba, josef, linux-btrfs,
linux-kernel, mikhail.v.gavrilov, regressions
[-- Attachment #1: Type: text/plain, Size: 4004 bytes --]
On 2024-08-16 at 11:58 +0100, Filipe Manana wrote:
> On Fri, Aug 16, 2024 at 12:17 AM <intelfx@intelfx.name> wrote:
> >
> > On 2024-08-16 at 00:21 +0200, intelfx@intelfx.name wrote:
> > > On 2024-08-11 at 16:33 +0100, Filipe Manana wrote:
> > > > <...>
> > > > This came to my attention a couple days ago in a bugzilla report here:
> > > >
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=219121
> > > >
> > > > There's also 2 other recent threads in the mailing about it.
> > > >
> > > > There's a fix there in the bugzilla, and I've just sent it to the mailing list.
> > > > In case you want to try it:
> > > >
> > > > https://lore.kernel.org/linux-btrfs/d85d72b968a1f7b8538c581eeb8f5baa973dfc95.1723377230.git.fdmanana@suse.com/
> > > >
> > > > Thanks.
> > >
> > > Hello,
> > >
> > > I confirm that excessive "system" CPU usage by kswapd and btrfs-cleaner
> > > kernel threads is still happening on the latest 6.10 stable with all
> > > quoted patches applied, making the system close to unusable (not to
> > > mention excessive power usage which crosses the line well *into*
> > > "unusable" for low-power systems such as laptops).
> > >
> > > With just 5 minutes of uptime on a freshly booted 6.10.5 system, the
> > > cumulative CPU time of kswapd is already at 2 minutes.
>
> Less than 24 hours before your message, there was a patch merged to
> Linus' tree, which was not (and is not) yet in any stable release
> (including 6.10.5 of course).
> Have you tried that patch?
Yes, I did — as I said, I tried 6.10.5 with all combinations of patches
ever posted in this thread (skipping those that I was not able to
apply; it seems that there were a few mutually incompatible attempts to
improve the extent map shrinker, some of which have already gone into
the stable tree, thus making others inapplicable).
> > As a follow-up, after 1 hour of uptime of this system the total CPU
> > time of kswapd0 is exactly 30 minutes. So whatever is the theoretical
> > OOM issue that the extent map shrinker is trying to solve, the solution
>
> It's not a theoretical problem.
> It's a problem that any unprivileged user can trigger provided that
> the amount of available disk space is much higher than total RAM,
> which is by far the most common case.
>
> The problem is explained in the commit change log, there's a
> reproducer and it was even reported by a user:
>
> https://lore.kernel.org/linux-btrfs/13f94633dcf04d29aaf1f0a43d42c55e@amazon.com/
>
> This link was included in the changelog of the patch when submitted to
> the list [1], but somehow it disappeared when it was merged to the git
> repository.
>
> Any user can effectively trigger a denial of service by creating an
> unlimited number of extent maps that never get removed while it keeps
> a file descriptor open and doing writes, either with direct IO, which
> is simpler, or even buffered IO in case it creates holes in the files
> (example: keep doing append writes starting after current eof, to
> create a bunch of holes). Even if that task doing that gets killed by
> the OOM, as long as there are idle processes keeping the file open,
> the problem doesn't go away.
Sorry, I did not intend to sound dismissive — what I wanted to say was
that we fixed an edge case (and yes, I acknowledge that this edge case
could be a security problem) by instead pessimizing a common case.
--
Ivan Shapovalov / intelfx /
> [1] https://lore.kernel.org/linux-btrfs/1cb649870b6cad4411da7998735ab1141bb9f2f0.1712837044.git.fdmanana@suse.com/
>
> > in its current form is clearly unacceptable.
> >
> > Can we please have it reverted on the basis of this severe regression,
> > until a better solution is found?
>
> Disabling the shrinker might be the best for now. I'm on vacation and
> can't write and test code, but I do have plans for making it better
> and solving any remaining issues.
> There's already a patch for that from Qu.
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 862 bytes --]
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
2024-08-16 11:16 ` Ivan Shapovalov
@ 2024-09-26 13:45 ` Filipe Manana
0 siblings, 0 replies; 56+ messages in thread
From: Filipe Manana @ 2024-09-26 13:45 UTC (permalink / raw)
To: Ivan Shapovalov
Cc: Jannik Glückert, andrea.gelmini, dsterba, josef, linux-btrfs,
linux-kernel, mikhail.v.gavrilov, regressions
On Fri, Aug 16, 2024 at 12:16 PM Ivan Shapovalov <intelfx@intelfx.name> wrote:
>
> On 2024-08-16 at 11:58 +0100, Filipe Manana wrote:
> > On Fri, Aug 16, 2024 at 12:17 AM <intelfx@intelfx.name> wrote:
> > >
> > > On 2024-08-16 at 00:21 +0200, intelfx@intelfx.name wrote:
> > > > On 2024-08-11 at 16:33 +0100, Filipe Manana wrote:
> > > > > <...>
> > > > > This came to my attention a couple days ago in a bugzilla report here:
> > > > >
> > > > > https://bugzilla.kernel.org/show_bug.cgi?id=219121
> > > > >
> > > > > There's also 2 other recent threads in the mailing about it.
> > > > >
> > > > > There's a fix there in the bugzilla, and I've just sent it to the mailing list.
> > > > > In case you want to try it:
> > > > >
> > > > > https://lore.kernel.org/linux-btrfs/d85d72b968a1f7b8538c581eeb8f5baa973dfc95.1723377230.git.fdmanana@suse.com/
> > > > >
> > > > > Thanks.
> > > >
> > > > Hello,
> > > >
> > > > I confirm that excessive "system" CPU usage by kswapd and btrfs-cleaner
> > > > kernel threads is still happening on the latest 6.10 stable with all
> > > > quoted patches applied, making the system close to unusable (not to
> > > > mention excessive power usage which crosses the line well *into*
> > > > "unusable" for low-power systems such as laptops).
> > > >
> > > > With just 5 minutes of uptime on a freshly booted 6.10.5 system, the
> > > > cumulative CPU time of kswapd is already at 2 minutes.
> >
> > Less than 24 hours before your message, there was a patch merged to
> > Linus' tree, which was not (and is not) yet in any stable release
> > (including 6.10.5 of course).
> > Have you tried that patch?
>
> Yes, I did — as I said, I tried 6.10.5 with all combinations of patches
> ever posted in this thread (skipping those that I was not able to
> apply; it seems that there were a few mutually incompatible attempts to
> improve the extent map shrinker, some of which have already gone into
> the stable tree, thus making others inapplicable).
>
> > > As a follow-up, after 1 hour of uptime of this system the total CPU
> > > time of kswapd0 is exactly 30 minutes. So whatever is the theoretical
> > > OOM issue that the extent map shrinker is trying to solve, the solution
> >
> > It's not a theoretical problem.
> > It's a problem that any unprivileged user can trigger provided that
> > the amount of available disk space is much higher than total RAM,
> > which is by far the most common case.
> >
> > The problem is explained in the commit change log, there's a
> > reproducer and it was even reported by a user:
> >
> > https://lore.kernel.org/linux-btrfs/13f94633dcf04d29aaf1f0a43d42c55e@amazon.com/
> >
> > This link was included in the changelog of the patch when submitted to
> > the list [1], but somehow it disappeared when it was merged to the git
> > repository.
> >
> > Any user can effectively trigger a denial of service by creating an
> > unlimited number of extent maps that never get removed while it keeps
> > a file descriptor open and doing writes, either with direct IO, which
> > is simpler, or even buffered IO in case it creates holes in the files
> > (example: keep doing append writes starting after current eof, to
> > create a bunch of holes). Even if that task doing that gets killed by
> > the OOM, as long as there are idle processes keeping the file open,
> > the problem doesn't go away.
>
> Sorry, I did not intend to sound dismissive — what I wanted to say was
> that we fixed an edge case (and yes, I acknowledge that this edge case
> could be a security problem) by instead pessimizing a common case.
So I've recently sent out a patchset to update the shrinker and
re-enable it:
https://lore.kernel.org/linux-btrfs/cover.1727174151.git.fdmanana@suse.com/
It applies against the current for-next branch, and should apply
against a 6.11 release too, except for the last patch due to a rename
in a function: CONFIG_BTRFS_DEBUG to CONFIG_BTRFS_EXPERIMENTAL.
I can prepare a git branch based on a 6.11 release (or 6.10) if anyone
prefers that rather than manually picking patches and resolving
conflicts (or testing for-next which has many unrelated changes).
If any of you can test it and report, it would be much appreciated.
Thanks.
>
> --
> Ivan Shapovalov / intelfx /
>
> > [1] https://lore.kernel.org/linux-btrfs/1cb649870b6cad4411da7998735ab1141bb9f2f0.1712837044.git.fdmanana@suse.com/
> >
> > > in its current form is clearly unacceptable.
> > >
> > > Can we please have it reverted on the basis of this severe regression,
> > > until a better solution is found?
> >
> > Disabling the shrinker might be the best for now. I'm on vacation and
> > can't write and test code, but I do have plans for making it better
> > and solving any remaining issues.
> > There's already a patch for that from Qu.
^ permalink raw reply [flat|nested] 56+ messages in thread