* [PATCH v2] mm,ksm: fix endless looping in allocating memory when ksm enable
@ 2016-09-20  5:50 zhongjiang
  2016-09-20  7:07 ` Michal Hocko
  0 siblings, 1 reply; 4+ messages in thread
From: zhongjiang @ 2016-09-20  5:50 UTC
  To: hughd, mhocko, akpm; +Cc: linux-mm

From: zhong jiang <zhongjiang@huawei.com>

I hit the following issue when running an LTP OOM test case with KSM
enabled.

Call trace:
[<ffffffc000086a88>] __switch_to+0x74/0x8c
[<ffffffc000a1bae0>] __schedule+0x23c/0x7bc
[<ffffffc000a1c09c>] schedule+0x3c/0x94
[<ffffffc000a1eb84>] rwsem_down_write_failed+0x214/0x350
[<ffffffc000a1e32c>] down_write+0x64/0x80
[<ffffffc00021f794>] __ksm_exit+0x90/0x19c
[<ffffffc0000be650>] mmput+0x118/0x11c
[<ffffffc0000c3ec4>] do_exit+0x2dc/0xa74
[<ffffffc0000c46f8>] do_group_exit+0x4c/0xe4
[<ffffffc0000d0f34>] get_signal+0x444/0x5e0
[<ffffffc000089fcc>] do_signal+0x1d8/0x450
[<ffffffc00008a35c>] do_notify_resume+0x70/0x78

This leads to a hung task because the exiting task cannot take
mmap_sem for write. The root cause is that ksmd holds it for read
while allocating memory, which takes ages to complete, and ksmd
loops in the following path:

 scan_get_next_rmap_item
          down_read
                get_next_rmap_item
                        alloc_rmap_item   #ksmd will loop permanently.

alloc_rmap_item allocates with GFP_KERNEL, which can trigger the OOM killer
when memory is under pressure. The allocation can, however, bail out without
calling out_of_memory because it finds that an OOM invoked by another process
is already in progress in the same zone. The allocation therefore loops again
and again.
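
To make the retry behaviour described above concrete, here is a small
userspace model. It is only a sketch under stated assumptions, not the
kernel's page allocator: model_alloc, struct model_state, MODEL_NORETRY and
the max_attempts budget (which stands in for "forever" so the model
terminates) are all hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag bit for this model only; not the kernel's value. */
#define MODEL_NORETRY 0x1u

struct model_state {
	bool memory_available;	/* can an allocation succeed right now? */
	bool oom_in_progress;	/* has some task already invoked the OOM killer? */
	unsigned long attempts;	/* how many times the allocator has tried */
};

/*
 * Toy model of the loop: without MODEL_NORETRY the allocator keeps
 * retrying while an OOM is in progress elsewhere; with it, the first
 * failure is reported to the caller. Returns NULL on failure or when the
 * attempt budget is used up (which models "still looping").
 */
void *model_alloc(struct model_state *st, unsigned int flags,
		  unsigned long max_attempts, size_t size)
{
	assert(st != NULL);

	while (st->attempts < max_attempts) {
		st->attempts++;
		if (st->memory_available)
			return calloc(1, size);
		if (flags & MODEL_NORETRY)
			return NULL;	/* fail fast, the caller copes */
		/* In the model the first failure "invokes" the OOM killer. */
		if (!st->oom_in_progress)
			st->oom_in_progress = true;
		/* OOM in progress and nothing freed yet: retry. */
	}
	return NULL;
}
```

In this model, a call with flags 0 and a budget of 1000 uses all 1000
attempts when no memory ever becomes available, while the same call with
MODEL_NORETRY returns NULL after a single attempt.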

We fix it by adding __GFP_NORETRY to the GFP flags. With that,
alloc_rmap_item is allowed to fail occasionally; if it fails, ksmd just gives
up and goes to sleep. Even when memory is low, this allocation does not
trigger the OOM killer. __GFP_NOWARN should be added at the same time, because
we are not at all interested in hearing about such failures.

CC: <stable@vger.kernel.org>
Suggested-by: Hugh Dickins <hughd@google.com>
Suggested-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 mm/ksm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 73d43ba..5048083 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -283,7 +283,8 @@ static inline struct rmap_item *alloc_rmap_item(void)
 {
 	struct rmap_item *rmap_item;
 
-	rmap_item = kmem_cache_zalloc(rmap_item_cache, GFP_KERNEL);
+	rmap_item = kmem_cache_zalloc(rmap_item_cache, GFP_KERNEL |
+						__GFP_NORETRY | __GFP_NOWARN);
 	if (rmap_item)
 		ksm_rmap_items++;
 	return rmap_item;
-- 
1.8.3.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH v2] mm,ksm: fix endless looping in allocating memory when ksm enable
  2016-09-20  5:50 [PATCH v2] mm,ksm: fix endless looping in allocating memory when ksm enable zhongjiang
@ 2016-09-20  7:07 ` Michal Hocko
  2016-09-20  7:12   ` zhong jiang
  2016-09-20  7:14   ` Michal Hocko
  0 siblings, 2 replies; 4+ messages in thread
From: Michal Hocko @ 2016-09-20  7:07 UTC
  To: zhongjiang; +Cc: hughd, akpm, linux-mm

[CCing Tetsuo again - please make sure you CC everybody who responded
 to earlier versions of the patch]

I am sorry to insist here but this doesn't address the previous review
feedback. Let me try to show you what I would find much better. I do not
insist on this precise wording of course, but I do insist on mentioning
the current state and making clear why __GFP_NORETRY is really ok.

On Tue 20-09-16 13:50:13, zhongjiang wrote:
> From: zhong jiang <zhongjiang@huawei.com>
> 
> I hit the following issue when running an LTP OOM test case with KSM
> enabled.

"
I hit the following hung task when running an OOM LTP test case with 4.1
kernel.
"

> 
> Call trace:
> [<ffffffc000086a88>] __switch_to+0x74/0x8c
> [<ffffffc000a1bae0>] __schedule+0x23c/0x7bc
> [<ffffffc000a1c09c>] schedule+0x3c/0x94
> [<ffffffc000a1eb84>] rwsem_down_write_failed+0x214/0x350
> [<ffffffc000a1e32c>] down_write+0x64/0x80
> [<ffffffc00021f794>] __ksm_exit+0x90/0x19c
> [<ffffffc0000be650>] mmput+0x118/0x11c
> [<ffffffc0000c3ec4>] do_exit+0x2dc/0xa74
> [<ffffffc0000c46f8>] do_group_exit+0x4c/0xe4
> [<ffffffc0000d0f34>] get_signal+0x444/0x5e0
> [<ffffffc000089fcc>] do_signal+0x1d8/0x450
> [<ffffffc00008a35c>] do_notify_resume+0x70/0x78
> 
> This leads to a hung task because the exiting task cannot take
> mmap_sem for write. The root cause is that ksmd holds it for read
> while allocating memory, which takes ages to complete, and ksmd
> loops in the following path:

"
The oom victim cannot terminate because it needs to take mmap_sem for
write while the lock is held for read by ksmd, which loops in the page
allocator:

ksm_do_scan
	scan_get_next_rmap_item
		down_read
		get_next_rmap_item
			alloc_rmap_item   #ksmd will loop permanently.

There is no way forward because the oom victim cannot release any
memory in a 4.1-based kernel. Since 4.6 we have the oom reaper, which
would solve this problem because it releases the memory asynchronously.
Nevertheless, we can relax the alloc_rmap_item requirements and use
__GFP_NORETRY because an allocation failure is acceptable: ksm_do_scan
would just retry later, after the lock has been dropped.

Such a patch would also be easy to backport to older stable kernels
which do not have the oom_reaper.

While we are at it, add __GFP_NOWARN, as the admin doesn't have to be
alarmed by the allocation failure.
"
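
The reason a failed allocation is acceptable can be sketched as a tiny
userspace model of one scan pass. This is an illustration only, not the
mm/ksm.c code: struct model_rwsem, the model_* helpers and the
allocs_before_failure parameter are hypothetical names made up for the
example. The point it shows is that the pass drops the read lock as soon as
an allocation fails, so a waiting writer such as the exiting task can make
progress.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, single-threaded stand-in for a reader/writer semaphore. */
struct model_rwsem {
	int readers;	/* number of read holders */
};

void model_down_read(struct model_rwsem *sem)
{
	sem->readers++;
}

void model_up_read(struct model_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

/* A writer can only get the lock when no reader holds it. */
bool model_down_write_trylock(const struct model_rwsem *sem)
{
	return sem->readers == 0;
}

/*
 * One scan pass: take the lock for read, allocate one item per step and
 * stop at the first allocation failure instead of retrying under the
 * lock. allocs_before_failure says how many allocations succeed before
 * the model allocator starts failing. Returns the number of items
 * handled in this pass.
 */
int model_scan_pass(struct model_rwsem *sem, int items_wanted,
		    int allocs_before_failure)
{
	int done = 0;

	model_down_read(sem);
	while (done < items_wanted) {
		if (done >= allocs_before_failure)
			break;	/* allocation failed: give up this pass */
		done++;
	}
	model_up_read(sem);	/* lock is dropped on success and on failure */
	return done;
}
```

After model_scan_pass returns, whether or not it handled every item, the
read side is released and model_down_write_trylock succeeds; the scanner can
simply try again on its next wakeup.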
> 
> CC: <stable@vger.kernel.org>
> Suggested-by: Hugh Dickins <hughd@google.com>
> Suggested-by: Michal Hocko <mhocko@suse.cz>
> Signed-off-by: zhong jiang <zhongjiang@huawei.com>
> ---
>  mm/ksm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 73d43ba..5048083 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -283,7 +283,8 @@ static inline struct rmap_item *alloc_rmap_item(void)
>  {
>  	struct rmap_item *rmap_item;
>  
> -	rmap_item = kmem_cache_zalloc(rmap_item_cache, GFP_KERNEL);
> +	rmap_item = kmem_cache_zalloc(rmap_item_cache, GFP_KERNEL |
> +						__GFP_NORETRY | __GFP_NOWARN);
>  	if (rmap_item)
>  		ksm_rmap_items++;
>  	return rmap_item;
> -- 
> 1.8.3.1

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH v2] mm,ksm: fix endless looping in allocating memory when ksm enable
  2016-09-20  7:07 ` Michal Hocko
@ 2016-09-20  7:12   ` zhong jiang
  2016-09-20  7:14   ` Michal Hocko
  1 sibling, 0 replies; 4+ messages in thread
From: zhong jiang @ 2016-09-20  7:12 UTC
  To: Michal Hocko; +Cc: hughd, akpm, linux-mm

Thank you for the advice, I will modify it now.


* Re: [PATCH v2] mm,ksm: fix endless looping in allocating memory when ksm enable
  2016-09-20  7:07 ` Michal Hocko
  2016-09-20  7:12   ` zhong jiang
@ 2016-09-20  7:14   ` Michal Hocko
  1 sibling, 0 replies; 4+ messages in thread
From: Michal Hocko @ 2016-09-20  7:14 UTC
  To: zhongjiang; +Cc: hughd, akpm, linux-mm, Tetsuo Handa

On Tue 20-09-16 09:07:43, Michal Hocko wrote:
> [CCing Tetsuo again - please make sure you CC everybody who did respond
>  in earlier versions of the patch]

now for real

-- 
Michal Hocko
SUSE Labs
