From: Ondrej Kozina <okozina@redhat.com>
To: dm-devel@redhat.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Matthias Dahl <ml_linux-dm-devel@binary-island.eu>, rientjes@google.com
Subject: Re: [dm-devel] [4.4, 4.5, 4.6] Regression: encrypted swap (dm-crypt) freezes system while under memory pressure and swapping
Date: Thu, 5 May 2016 17:54:27 +0200
Message-ID: <572B6CB3.10802@redhat.com>
In-Reply-To: <8125260b-30b0-e80d-c451-8194e6866227@binary-island.eu>
On 04/21/2016 09:48 AM, Matthias Dahl wrote:
> Hello @all,
>
> first of all, I sent this exact msg also to the lkml a few days ago but
> since I received no reaction, I thought this list might be a better
> place for this problem -- or I might at least reach the right persons to
> get this fixed/debugged/... . :-)
>
> Recently I started seeing freezes while compiling bigger packages that
> do require lots of memory (I use Gentoo).
>
> The freezes were in the form that while in Xorg, the system would just
> completely hang -- no magic sysrq keys, no mouse movement, nothing.
> While in a terminal, one could still issue a magic sysrq command but it
> would only echo the command itself but not execute it -- except for the
> reboot command. So there was no way to get a backtrace or states or
> anything alike.
>
> After debugging this further, it became clear that the system always
> froze when it started hitting the encrypted swap. It worked absolutely
> fine as soon as you took the encryption out of the picture.
>
> My setup then was: An 8 GiB swap on S/W-RAID5 for my 8 GiB physical RAM
> that was encrypted with dm-crypt and AES256-CBC-ESSIV.
>
> I debugged this further and changed my setup to several swap partitions
> on the physical disks w/o a RAID in-between to isolate the culprit. This
> made no difference -- neither did switching ciphers and so forth.
>
> Since this setup had worked for ages, I started looking into what had
> changed the weeks before and noticed I had done several kernel upgrades.
>
> To make a long story short, here are my findings:
>
> 4.3.0, 4.4.0-final, 4.5-rc1 to 4.5-rc2:
> No problems, except for the usual sluggishness with encrypted swap that
> has been there since forever (it is like the encryption has the highest
> priority and takes over the system, e.g. no terminal input is accepted
> on a different terminal while high memory pressure is going on which is
> in contrast with unencrypted swap, where this still works fine).
>
> 4.4.x, >= 4.5-rc3 (incl. 4.6-rcX and master):
> The system freezes under memory pressure as soon as it starts swapping
> out. 4.6 master is an exception here, it still responds to magic sysrq
> commands properly, but after some time it freezes hard completely.
>
> I haven't had the time to test all 4.3.x and 4.4.x releases, I am afraid.
> What I can say though is that 4.4.6 is affected as well.
>
> A git bisect between 4.5-rc2 and 4.5-rc3 led me to the following commit:
>
> 564e81a57f9788b1475127012e0fd44e9049e342 is the first bad commit
> commit 564e81a57f9788b1475127012e0fd44e9049e342
> Author: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
> Date: Fri Feb 5 15:36:30 2016 -0800
>
> mm, vmstat: fix wrong WQ sleep when memory reclaim doesn't make any
> progress
>
> This is obviously not the real culprit in my opinion but a trigger.
> Reverting that commit on 4.5.1 for example, makes the encrypted swap
> work flawlessly again (except for the usual system sluggishness).
>
> Reverting it on 4.6 master@c3b46c73264b03000d1e18b22f5caf63332547c9,
> does show a different picture though: The system freezes while the sysrq
> keys do still work and usually recovers after some while if the
> corresponding task that triggered the swapping in the first place, gets
> killed. It sometimes does a bit of swapping, and sometimes doesn't, while
> it hangs there -- while usually with the other kernels in the "frozen"
> state, the swapping stops completely.
>
> I managed to get a bit more information out of 4.6 master though since
> it sometimes recovers after quite some time and I can copy backtraces
> and such to the disk, which I have attached.
>
> I hope this helps in finding the real issue behind this. I am sorry I
> could not provide more information but this has been a rather time
> consuming task thus far. :-)
>
> If there is anything else I can do to help or test, please let me know
> and I will gladly do so.
>
> Thanks in advance.
>
> So long,
> Matthias
>
Hello,
I second the observation that something is wrong, and it doesn't seem to
be related to the dm-crypt target. My test setup is as follows:
2 CPUs
system ram: 1 GB
swap on top of dm-crypt: 2 GB
Whenever I start a workload that consumes more memory than the system RAM
but much less than the total memory including swap, I end up with the
following OOM message, which I find premature and unexpected:
https://okozina.fedorapeople.org/bugs/swap_on_dmcrypt/vmlog-1462458369-00000/sample-00011/dmesg
The important snippet just before the OOM:
active_anon:4096kB inactive_anon:4636kB, writeback:4636kB
and also:
Free swap = 2039832kB
Total swap = 2097148kB
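As a rough sanity check on those two swap figures, the following sketch computes how much swap was actually in use when the OOM killer fired. It only uses the values quoted above; the variable names are illustrative and not taken from any kernel interface.

```python
# Values copied from the dmesg snippet quoted above, in kB.
total_swap_kb = 2097148
free_swap_kb = 2039832

# Swap in use is simply the difference between total and free.
used_swap_kb = total_swap_kb - free_swap_kb
used_fraction = used_swap_kb / total_swap_kb

print(f"swap used: {used_swap_kb} kB ({used_fraction:.1%} of total)")
# prints: swap used: 57316 kB (2.7% of total)
```

In other words, under these numbers roughly 97% of the swap was still free at the time of the OOM kill.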
You can find more details in the sample-* directories located in:
https://okozina.fedorapeople.org/bugs/swap_on_dmcrypt/vmlog-1462458369-00000/
Each sample directory contains statistics collected approximately every
second after I started the workload (a script from the
http://linux-mm.org/OOM site).
For me, the OOM killer message can be observed starting with this commit:
commit f9054c70d28bc214b2857cf8db8269f4f45a5e23
Author: David Rientjes <rientjes@google.com>
Date: Thu Mar 17 14:19:19 2016 -0700
mm, mempool: only set __GFP_NOMEMALLOC if there are free elements
(...)
Kind regards
Ondrej