* shrink_inactive_list() failed to reclaim pages
@ 2017-01-11 17:16 Cheng-yu Lee
2017-01-11 17:38 ` Michal Hocko
0 siblings, 1 reply; 6+ messages in thread
From: Cheng-yu Lee @ 2017-01-11 17:16 UTC (permalink / raw)
To: linux-mm; +Cc: Luigi Semenzato, Ben Cheng
Hi community,
I have an x86_64 Chromebook running a 3.14 kernel with 8 GB of memory, using
zram with the swap size set to ~12 GB. When memory is low, kswapd is woken to
reclaim pages, but under some circumstances the kernel cannot find pages
to reclaim, even though I'm sure there is still plenty of memory that could be
reclaimed from background processes. (For example, I run some C programs
that just malloc() lots of memory and get suspended in the background;
there's no reason they couldn't be swapped.) The consequence is that most of
the CPU time is spent on page reclaim. The system hangs or becomes very
laggy for a long period. Sometimes the hung task detector even triggers a
kernel panic like:
<0>[46246.676366] Kernel panic - not syncing: hung_task: blocked tasks
I've added kernel messages to trace the problem. I found that
shrink_inactive_list() can barely find any pages to reclaim. More precisely,
when the problem happens, lots of pages have _count > 2 in
__remove_mapping(), so the condition at line 662 of vmscan.c holds:
http://lxr.free-electrons.com/source/mm/vmscan.c#L662
Thus the kernel fails to reclaim those pages at line 1209:
http://lxr.free-electrons.com/source/mm/vmscan.c#L1209
It's weird that the inactive anonymous list is huge (several GB), but
nothing can really be freed. So I tried a hack to see whether moving more
pages from the active list helps. I commented out the inactive_list_is_low()
check at line 2420 in shrink_node_memcg() so that shrink_active_list() is
always called:
http://lxr.free-electrons.com/source/mm/vmscan.c#L2420
It turns out that the hack helps. When more pages are moved from the active
list, kswapd works smoothly, and the whole 12 GB of zram can be used up before
the system enters an OOM condition.
Any idea why the whole inactive anonymous LRU is occupied by pages that
cannot be freed for a long time (several minutes before the system dies)?
Are there any parameters I can tune to help the situation? I've tried
swappiness, but it doesn't help.
An alternative is to patch the kernel to call shrink_active_list() more
frequently when it finds there's nothing that can be reclaimed. But I am
not sure whether that's the right direction. Also, it's not trivial to figure
out where to add the call.
Thanks,
Cheng-Yu
* Re: shrink_inactive_list() failed to reclaim pages
2017-01-11 17:16 shrink_inactive_list() failed to reclaim pages Cheng-yu Lee
@ 2017-01-11 17:38 ` Michal Hocko
2017-01-12 4:02 ` Pintu Kumar
` (3 more replies)
0 siblings, 4 replies; 6+ messages in thread
From: Michal Hocko @ 2017-01-11 17:38 UTC (permalink / raw)
To: Cheng-yu Lee
Cc: linux-mm, Luigi Semenzato, Ben Cheng, Sergey Senozhatsky,
Minchan Kim
[CC Minchan and Sergey for the zram part]
On Thu 12-01-17 01:16:11, Cheng-yu Lee wrote:
> Hi community,
>
> I have a x86_64 Chromebook running 3.14 kernel with 8G of memory. Using
Do you see the same with the current Linus tree?
> zram with swap size set to ~12GB. When in low memory, kswapd is awaken to
> reclaim pages, but under some circumstances the kernel can not find pages
> to reclaim while I'm sure there're still plenty of memory which could be
> reclaimed from background processes (For example, I run some C programs
> which just malloc() lots of memory and get suspended in the background.
> There's no reason they could't be swapped). The consequence is that most of
> CPU time is spent on page reclamation. The system hangs or becomes very
> laggy for a long period. Sometimes it even triggers a kernel panic by the
> hung task detector like:
> <0>[46246.676366] Kernel panic - not syncing: hung_task: blocked tasks
>
> I've added kernel message to trace the problem. I found shrink_inactive_list()
> can barely find any page to reclaim. More precisely, when the problem
> happens, lots of page have _count > 2 in __remove_mapping(). So the
> condition at line 662 of vmscan.c holds:
> http://lxr.free-electrons.com/source/mm/vmscan.c#L662
> Thus the kernel fails to reclaim those pages at line 1209
> http://lxr.free-electrons.com/source/mm/vmscan.c#L1209
I assume that you are talking about the anonymous LRU
> It's weird that the inactive anonymous list is huge (several GB), but
> nothing can really be freed. So I did some hack to see if moving more pages
> from the active list helps. I commented out the "inactive_list_is_low()"
> checking at line 2420
> in shrink_node_memcg() so shrink_active_list() is always called.
> http://lxr.free-electrons.com/source/mm/vmscan.c#L2420
> It turns out that the hack helps. If moving more pages from the active
> list, kswapd works smoothly. The whole 12G zram can be used up before
> system enters OOM condition.
>
> Any idea why the whole inactive anonymous LRU is occupied by pages which
> can not be freed for la long time (several minutes before system dies) ?
> Are there any parameters I can tune to help the situation ? I've tried
> swappiness but it doesn't help.
>
> An alternative is to patch the kernel to call shrink_active_list() more
> frequently when it finds there's nothing that can be reclaimed . But I am
> not sure if it's the right direction. Also it's not so trivial to figure
> out where to add the call.
>
> Thanks,
> Cheng-Yu
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: shrink_inactive_list() failed to reclaim pages
2017-01-11 17:38 ` Michal Hocko
@ 2017-01-12 4:02 ` Pintu Kumar
2017-01-12 5:33 ` Minchan Kim
` (2 subsequent siblings)
3 siblings, 0 replies; 6+ messages in thread
From: Pintu Kumar @ 2017-01-12 4:02 UTC (permalink / raw)
To: Michal Hocko
Cc: Cheng-yu Lee, linux-mm@kvack.org, Luigi Semenzato, Ben Cheng,
Sergey Senozhatsky, minchan@kernel.org, Pintu Kumar, Pintu Kumar
Adding myself.
* Re: shrink_inactive_list() failed to reclaim pages
2017-01-11 17:38 ` Michal Hocko
2017-01-12 4:02 ` Pintu Kumar
@ 2017-01-12 5:33 ` Minchan Kim
2017-01-12 12:55 ` Sergey Senozhatsky
2017-01-12 16:34 ` Cheng-yu Lee
3 siblings, 0 replies; 6+ messages in thread
From: Minchan Kim @ 2017-01-12 5:33 UTC (permalink / raw)
To: Michal Hocko
Cc: Cheng-yu Lee, linux-mm, Luigi Semenzato, Ben Cheng,
Sergey Senozhatsky
Thanks for Ccing me, Michal.
On Wed, Jan 11, 2017 at 06:38:02PM +0100, Michal Hocko wrote:
> [CC Minchan and Sergey for the zram part]
>
> On Thu 12-01-17 01:16:11, Cheng-yu Lee wrote:
> > Hi community,
> >
> > I have a x86_64 Chromebook running 3.14 kernel with 8G of memory. Using
>
> Do you see the same with the current Linus tree?
>
> > zram with swap size set to ~12GB. When in low memory, kswapd is awaken to
> > reclaim pages, but under some circumstances the kernel can not find pages
> > to reclaim while I'm sure there're still plenty of memory which could be
> > reclaimed from background processes (For example, I run some C programs
> > which just malloc() lots of memory and get suspended in the background.
> > There's no reason they could't be swapped). The consequence is that most of
> > CPU time is spent on page reclamation. The system hangs or becomes very
> > laggy for a long period. Sometimes it even triggers a kernel panic by the
> > hung task detector like:
> > <0>[46246.676366] Kernel panic - not syncing: hung_task: blocked tasks
> >
> > I've added kernel message to trace the problem. I found shrink_inactive_list()
> > can barely find any page to reclaim. More precisely, when the problem
> > happens, lots of page have _count > 2 in __remove_mapping(). So the
> > condition at line 662 of vmscan.c holds:
> > http://lxr.free-electrons.com/source/mm/vmscan.c#L662
> > Thus the kernel fails to reclaim those pages at line 1209
> > http://lxr.free-electrons.com/source/mm/vmscan.c#L1209
>
> I assume that you are talking about the anonymous LRU
>
> > It's weird that the inactive anonymous list is huge (several GB), but
> > nothing can really be freed. So I did some hack to see if moving more pages
> > from the active list helps. I commented out the "inactive_list_is_low()"
> > checking at line 2420
> > in shrink_node_memcg() so shrink_active_list() is always called.
> > http://lxr.free-electrons.com/source/mm/vmscan.c#L2420
> > It turns out that the hack helps. If moving more pages from the active
> > list, kswapd works smoothly. The whole 12G zram can be used up before
> > system enters OOM condition.
> >
> > Any idea why the whole inactive anonymous LRU is occupied by pages which
> > can not be freed for la long time (several minutes before system dies) ?
> > Are there any parameters I can tune to help the situation ? I've tried
> > swappiness but it doesn't help.
I've never heard of such a problem until now, so my *imaginary* scenario is
that some driver or something else in your system calls get_user_pages() or
friends and takes a page reference, so that lots of anonymous pages are pinned.
In that case, the VM swaps the page out but cannot free it until someone
releases the extra reference.
In that situation, all the VM can do is rotate the page back to the head of
the inactive LRU. As a result, the inactive list's size never changes, so
inactive_anon_is_low() always returns false. That means the VM cannot
deactivate reclaimable pages from the active list to the inactive list, so it
ends up scanning an inactive anonymous LRU list full of pinned pages.
There would be several ways to solve this, but before that, I want to confirm
my random guess.
> >
> > An alternative is to patch the kernel to call shrink_active_list() more
> > frequently when it finds there's nothing that can be reclaimed . But I am
> > not sure if it's the right direction. Also it's not so trivial to figure
> > out where to add the call.
> >
> > Thanks,
> > Cheng-Yu
* Re: shrink_inactive_list() failed to reclaim pages
2017-01-11 17:38 ` Michal Hocko
2017-01-12 4:02 ` Pintu Kumar
2017-01-12 5:33 ` Minchan Kim
@ 2017-01-12 12:55 ` Sergey Senozhatsky
2017-01-12 16:34 ` Cheng-yu Lee
3 siblings, 0 replies; 6+ messages in thread
From: Sergey Senozhatsky @ 2017-01-12 12:55 UTC (permalink / raw)
To: Michal Hocko
Cc: Cheng-yu Lee, linux-mm, Luigi Semenzato, Ben Cheng,
Sergey Senozhatsky, Minchan Kim
Hello,
On (01/11/17 18:38), Michal Hocko wrote:
> On Thu 12-01-17 01:16:11, Cheng-yu Lee wrote:
> > Hi community,
> >
> > I have a x86_64 Chromebook running 3.14 kernel with 8G of memory. Using
>
> Do you see the same with the current Linus tree?
>
> > zram with swap size set to ~12GB. When in low memory, kswapd is awaken to
> > reclaim pages, but under some circumstances the kernel can not find pages
> > to reclaim while I'm sure there're still plenty of memory which could be
> > reclaimed from background processes (For example, I run some C programs
> > which just malloc() lots of memory and get suspended in the background.
> > There's no reason they could't be swapped). The consequence is that most of
> > CPU time is spent on page reclamation. The system hangs or becomes very
> > laggy for a long period. Sometimes it even triggers a kernel panic by the
> > hung task detector like:
> > <0>[46246.676366] Kernel panic - not syncing: hung_task: blocked tasks
> >
> > I've added kernel message to trace the problem. I found shrink_inactive_list()
> > can barely find any page to reclaim. More precisely, when the problem
> > happens, lots of page have _count > 2 in __remove_mapping(). So the
> > condition at line 662 of vmscan.c holds:
> > http://lxr.free-electrons.com/source/mm/vmscan.c#L662
> > Thus the kernel fails to reclaim those pages at line 1209
> > http://lxr.free-electrons.com/source/mm/vmscan.c#L1209
>
> I assume that you are talking about the anonymous LRU
Hm, as a side note, I think this is not the first time I've seen a
"kswapd consumes 100% cpu" report.
https://bugzilla.kernel.org/show_bug.cgi?id=65201#c50
http://lkml.iu.edu//hypermail/linux/kernel/1601.2/03564.html
https://marc.info/?l=linux-mm&m=145442159521487
https://marc.info/?l=linux-mm&m=145443027124595
-ss
> > It's weird that the inactive anonymous list is huge (several GB), but
> > nothing can really be freed. So I did some hack to see if moving more pages
> > from the active list helps. I commented out the "inactive_list_is_low()"
> > checking at line 2420
> > in shrink_node_memcg() so shrink_active_list() is always called.
> > http://lxr.free-electrons.com/source/mm/vmscan.c#L2420
> > It turns out that the hack helps. If moving more pages from the active
> > list, kswapd works smoothly. The whole 12G zram can be used up before
> > system enters OOM condition.
> >
> > Any idea why the whole inactive anonymous LRU is occupied by pages which
> > can not be freed for la long time (several minutes before system dies) ?
> > Are there any parameters I can tune to help the situation ? I've tried
> > swappiness but it doesn't help.
> >
> > An alternative is to patch the kernel to call shrink_active_list() more
> > frequently when it finds there's nothing that can be reclaimed . But I am
> > not sure if it's the right direction. Also it's not so trivial to figure
> > out where to add the call.
* Re: shrink_inactive_list() failed to reclaim pages
2017-01-11 17:38 ` Michal Hocko
` (2 preceding siblings ...)
2017-01-12 12:55 ` Sergey Senozhatsky
@ 2017-01-12 16:34 ` Cheng-yu Lee
3 siblings, 0 replies; 6+ messages in thread
From: Cheng-yu Lee @ 2017-01-12 16:34 UTC (permalink / raw)
To: Michal Hocko
Cc: linux-mm, Luigi Semenzato, Ben Cheng, Sergey Senozhatsky,
Minchan Kim
>
> > I have a x86_64 Chromebook running 3.14 kernel with 8G of memory. Using
>
> Do you see the same with the current Linus tree?
>
I haven't tried ToT because it takes a lot of effort to port it to this
specific device.
But I've managed to try v4.4.
Surprisingly, the problem goes away.
> > Thus the kernel fails to reclaim those pages at line 1209
> > http://lxr.free-electrons.com/source/mm/vmscan.c#L1209
>
> I assume that you are talking about the anonymous LRU
>
Yes, I mean anonymous LRU.