From: Qian Cai <cai@lca.pw>
To: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.com>,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org
Subject: Re: "mm: account nr_isolated_xxx in [isolate|putback]_lru_page" breaks OOM with swap
Date: Wed, 31 Jul 2019 12:09:06 -0400
Message-ID: <1564589346.11067.38.camel@lca.pw>
In-Reply-To: <20190731053444.GA155569@google.com>

On Wed, 2019-07-31 at 14:34 +0900, Minchan Kim wrote:
> On Tue, Jul 30, 2019 at 12:25:28PM -0400, Qian Cai wrote:
> > OOM workloads with swapping are unable to recover on linux-next since
> > next-20190729 due to the commit "mm: account nr_isolated_xxx in
> > [isolate|putback]_lru_page" [1]
> > 
> > [1] https://lore.kernel.org/linux-mm/20190726023435.214162-4-minchan@kernel.org/T/#mdcd03bcb4746f2f23e6f508c205943726aee8355
> > 
> > For example, the LTP oom01 test case is stuck for hours, while it finishes
> > in a few minutes here after reverting the above commit. Sometimes it prints
> > these messages while hanging.
> > 
> > [  509.983393][  T711] INFO: task oom01:5331 blocked for more than 122 seconds.
> > [  509.983431][  T711]       Not tainted 5.3.0-rc2-next-20190730 #7
> > [  509.983447][  T711] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [  509.983477][  T711] oom01           D24656  5331   5157 0x00040000
> > [  509.983513][  T711] Call Trace:
> > [  509.983538][  T711] [c00020037d00f880] [0000000000000008] 0x8 (unreliable)
> > [  509.983583][  T711] [c00020037d00fa60] [c000000000023724] __switch_to+0x3a4/0x520
> > [  509.983615][  T711] [c00020037d00fad0] [c0000000008d17bc] __schedule+0x2fc/0x950
> > [  509.983647][  T711] [c00020037d00fba0] [c0000000008d1e68] schedule+0x58/0x150
> > [  509.983684][  T711] [c00020037d00fbd0] [c0000000008d7614] rwsem_down_read_slowpath+0x4b4/0x630
> > [  509.983727][  T711] [c00020037d00fc90] [c0000000008d7dfc] down_read+0x12c/0x240
> > [  509.983758][  T711] [c00020037d00fd20] [c00000000005fb28] __do_page_fault+0x6f8/0xee0
> > [  509.983801][  T711] [c00020037d00fe20] [c00000000000a364] handle_page_fault+0x18/0x38
> 
> Thanks for the testing! No surprise that the patch introduces some bugs,
> because it's rather tricky.
> 
> Could you test this patch?

It does help the situation a bit, but recovery is still far slower than with
the commit "mm: account nr_isolated_xxx in [isolate|putback]_lru_page"
reverted. For example, on this powerpc system, oom01 used to finish in
4 minutes, while with this patch it still takes 13 minutes.

oom02 (which tests NUMA mempolicy) takes even longer. I gave up after
26 minutes, with several hung tasks reported below.

[ 7881.086027][  T723]       Tainted: G        W         5.3.0-rc2-next-20190731+ #4
[ 7881.086045][  T723] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 7881.086064][  T723] oom02           D26080 112911 112776 0x00040000
[ 7881.086100][  T723] Call Trace:
[ 7881.086113][  T723] [c00000185deef880] [0000000000000008] 0x8 (unreliable)
[ 7881.086142][  T723] [c00000185deefa60] [c0000000000236e4] __switch_to+0x3a4/0x520
[ 7881.086182][  T723] [c00000185deefad0] [c0000000008d045c] __schedule+0x2fc/0x950
[ 7881.086225][  T723] [c00000185deefba0] [c0000000008d0b08] schedule+0x58/0x150
[ 7881.086279][  T723] [c00000185deefbd0] [c0000000008d6284] rwsem_down_read_slowpath+0x4b4/0x630
[ 7881.086311][  T723] [c00000185deefc90] [c0000000008d6a6c] down_read+0x12c/0x240
[ 7881.086340][  T723] [c00000185deefd20] [c00000000005fa34] __do_page_fault+0x6e4/0xeb0
[ 7881.086406][  T723] [c00000185deefe20] [c00000000000a364] handle_page_fault+0x18/0x38
[ 7881.086435][  T723] INFO: task oom02:112913 blocked for more than 368 seconds.
[ 7881.086472][  T723]       Tainted: G        W         5.3.0-rc2-next-20190731+ #4
[ 7881.086509][  T723] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 7881.086551][  T723] oom02           D26832 112913 112776 0x00040000
[ 7881.086583][  T723] Call Trace:
[ 7881.086596][  T723] [c000201c450af890] [0000000000000008] 0x8 (unreliable)
[ 7881.086636][  T723] [c000201c450afa70] [c0000000000236e4] __switch_to+0x3a4/0x520
[ 7881.086679][  T723] [c000201c450afae0] [c0000000008d045c] __schedule+0x2fc/0x950
[ 7881.086720][  T723] [c000201c450afbb0] [c0000000008d0b08] schedule+0x58/0x150
[ 7881.086762][  T723] [c000201c450afbe0] [c0000000008d6284] rwsem_down_read_slowpath+0x4b4/0x630
[ 7881.086818][  T723] [c000201c450afca0] [c0000000008d6a6c] down_read+0x12c/0x240
[ 7881.086860][  T723] [c000201c450afd30] [c00000000035534c] __mm_populate+0x12c/0x200
[ 7881.086902][  T723] [c000201c450afda0] [c00000000036a65c] do_mlock+0xec/0x2f0
[ 7881.086955][  T723] [c000201c450afe00] [c00000000036aa24] sys_mlock+0x24/0x40
[ 7881.086987][  T723] [c000201c450afe20] [c00000000000ae08] system_call+0x5c/0x70
[ 7881.087025][  T723] 
[ 7881.087025][  T723] Showing all locks held in the system:
[ 7881.087065][  T723] 3 locks held by systemd/1:
[ 7881.087111][  T723]  #0: 000000002f8cb0d9 (&ep->mtx){....}, at: ep_scan_ready_list+0x2a8/0x2d0
[ 7881.087159][  T723]  #1: 000000004e0b13a9 (&mm->mmap_sem){....}, at: __do_page_fault+0x184/0xeb0
[ 7881.087209][  T723]  #2: 000000006dafe1e3 (fs_reclaim){....}, at: fs_reclaim_acquire.part.17+0x10/0x60
[ 7881.087292][  T723] 1 lock held by khungtaskd/723:
[ 7881.087327][  T723]  #0: 00000000e4addba8 (rcu_read_lock){....}, at: debug_show_all_locks+0x50/0x170
[ 7881.087388][  T723] 1 lock held by oom02/112907:
[ 7881.087411][  T723]  #0: 000000003463bed2 (&mm->mmap_sem){....}, at: vm_mmap_pgoff+0x8c/0x160
[ 7881.087487][  T723] 1 lock held by oom02/112908:
[ 7881.087522][  T723]  #0: 000000003463bed2 (&mm->mmap_sem){....}, at: vm_mmap_pgoff+0x8c/0x160
[ 7881.087566][  T723] 1 lock held by oom02/112909:
[ 7881.087591][  T723]  #0: 000000003463bed2 (&mm->mmap_sem){....}, at: vm_mmap_pgoff+0x8c/0x160
[ 7881.087627][  T723] 1 lock held by oom02/112910:
[ 7881.087662][  T723]  #0: 000000003463bed2 (&mm->mmap_sem){....}, at: vm_mmap_pgoff+0x8c/0x160
[ 7881.087707][  T723] 1 lock held by oom02/112911:
[ 7881.087743][  T723]  #0: 000000003463bed2 (&mm->mmap_sem){....}, at: __do_page_fault+0x6e4/0xeb0
[ 7881.087793][  T723] 1 lock held by oom02/112912:
[ 7881.087827][  T723]  #0: 000000003463bed2 (&mm->mmap_sem){....}, at: vm_mmap_pgoff+0x8c/0x160
[ 7881.087872][  T723] 1 lock held by oom02/112913:
[ 7881.087897][  T723]  #0: 000000003463bed2 (&mm->mmap_sem){....}, at: __mm_populate+0x12c/0x200
[ 7881.087943][  T723] 1 lock held by oom02/112914:
[ 7881.087979][  T723]  #0: 000000003463bed2 (&mm->mmap_sem){....}, at: vm_mmap_pgoff+0x8c/0x160
[ 7881.088037][  T723] 1 lock held by oom02/112915:
[ 7881.088060][  T723]  #0: 000000003463bed2 (&mm->mmap_sem){....}, at: vm_mmap_pgoff+0x8c/0x160
[ 7881.088095][  T723] 2 locks held by oom02/112916:
[ 7881.088134][  T723]  #0: 000000003463bed2 (&mm->mmap_sem){....}, at: __mm_populate+0x12c/0x200
[ 7881.088180][  T723]  #1: 000000006dafe1e3 (fs_reclaim){....}, at: fs_reclaim_acquire.part.17+0x10/0x60
[ 7881.088230][  T723] 1 lock held by oom02/112917:
[ 7881.088257][  T723]  #0: 000000003463bed2 (&mm->mmap_sem){....}, at: do_mlock+0x88/0x2f0
[ 7881.088291][  T723] 1 lock held by oom02/112918:
[ 7881.088325][  T723]  #0: 000000003463bed2 (&mm->mmap_sem){....}, at: vm_mmap_pgoff+0x8c/0x160
[ 7881.088370][  T723] 
[ 7881.088391][  T723] =============================================

> 
> From b31667210dd747f4d8aeb7bdc1f5c14f1f00bff5 Mon Sep 17 00:00:00 2001
> From: Minchan Kim <minchan@kernel.org>
> Date: Wed, 31 Jul 2019 14:18:01 +0900
> Subject: [PATCH] mm: decrease NR_ISOLATED count at successful migration
> 
> If migration fails, the page should go back to the LRU list, so
> putback_lru_page can handle the NR_ISOLATED count in pair with
> isolate_lru_page. However, if migration is successful, the page will be
> freed, so there is no need to add it back to the LRU list. Thus, the
> NR_ISOLATED count should be decremented manually.
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  mm/migrate.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 84b89d2d69065..96ae0c3cada8d 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1166,6 +1166,7 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
>  {
>  	int rc = MIGRATEPAGE_SUCCESS;
>  	struct page *newpage;
> +	bool is_lru = __PageMovable(page);
>  
>  	if (!thp_migration_supported() && PageTransHuge(page))
>  		return -ENOMEM;
> @@ -1175,17 +1176,10 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
>  		return -ENOMEM;
>  
>  	if (page_count(page) == 1) {
> -		bool is_lru = !__PageMovable(page);
> -
>  		/* page was freed from under us. So we are done. */
>  		ClearPageActive(page);
>  		ClearPageUnevictable(page);
> -		if (likely(is_lru))
> -			mod_node_page_state(page_pgdat(page),
> -						NR_ISOLATED_ANON +
> -						page_is_file_cache(page),
> -						-hpage_nr_pages(page));
> -		else {
> +		if (unlikely(!is_lru)) {
>  			lock_page(page);
>  			if (!PageMovable(page))
>  				__ClearPageIsolated(page);
> @@ -1229,6 +1223,12 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
>  			if (set_hwpoison_free_buddy_page(page))
>  				num_poisoned_pages_inc();
>  		}
> +
> +		if (likely(is_lru))
> +			mod_node_page_state(page_pgdat(page),
> +					NR_ISOLATED_ANON +
> +						page_is_file_cache(page),
> +					-hpage_nr_pages(page));
>  	} else {
>  		if (rc != -EAGAIN) {
>  			if (likely(!__PageMovable(page))) {
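
For reference, the accounting rule described in the patch's commit message can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions, not the kernel implementation: struct node_stats, struct model_page and the model_*() helpers are hypothetical names invented for illustration, and only LRU pages are assumed to be counted as isolated. Note that the removed line in the hunk initialises is_lru as !__PageMovable(page), while the added line as quoted omits the negation; the model takes "is_lru" to mean "not a driver-movable page", which is an assumption about the intent.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct node_stats {
	long nr_isolated;	/* stands in for the NR_ISOLATED_* counters */
};

struct model_page {
	bool is_lru;		/* assumed meaning: not a driver-movable page */
	bool isolated;
	int nr_pages;		/* 1 for a base page, more for a huge page */
};

/* Isolation takes the page off the LRU and bumps the counter. */
static void model_isolate(struct node_stats *n, struct model_page *p)
{
	assert(!p->isolated);
	p->isolated = true;
	if (p->is_lru)
		n->nr_isolated += p->nr_pages;
}

/* Putback is the paired operation: it undoes the increment. */
static void model_putback(struct node_stats *n, struct model_page *p)
{
	assert(p->isolated);
	p->isolated = false;
	if (p->is_lru)
		n->nr_isolated -= p->nr_pages;
}

static void model_finish_migration(struct node_stats *n,
				   struct model_page *p, bool success)
{
	if (!success) {
		/* Failed migration: the page returns to the LRU via putback. */
		model_putback(n, p);
		return;
	}
	/*
	 * Successful migration: the old page is freed and never put back,
	 * so nothing else will undo the increment. Decrement by hand.
	 */
	p->isolated = false;
	if (p->is_lru)
		n->nr_isolated -= p->nr_pages;
}

/* Returns the number of pages still counted as isolated afterwards. */
static long model_demo(void)
{
	struct node_stats node = { 0 };
	struct model_page failed = { .is_lru = true, .nr_pages = 1 };
	struct model_page moved = { .is_lru = true, .nr_pages = 512 };

	model_isolate(&node, &failed);
	model_isolate(&node, &moved);
	model_finish_migration(&node, &failed, false);
	model_finish_migration(&node, &moved, true);
	return node.nr_isolated;
}
```

In this model, model_demo() ends with the counter back at zero. Dropping the manual decrement on the success path would leave 512 pages counted as isolated indefinitely, which is the kind of imbalance the commit message is concerned with.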


