public inbox for linux-kernel@vger.kernel.org
From: Mel Gorman <mgorman@techsingularity.net>
To: valdis.kletnieks@vt.edu
Cc: Pavel Machek <pavel@ucw.cz>,
	kernel list <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@osdl.org>,
	vbabka@suse.cz, aarcange@redhat.com, rientjes@google.com,
	mhocko@kernel.org, zi.yan@cs.rutgers.edu, hannes@cmpxchg.org,
	jack@suse.cz
Subject: Re: [regression -next0117] What is kcompactd and why is he eating 100% of my cpu?
Date: Sun, 27 Jan 2019 14:15:56 +0000
Message-ID: <20190127141556.GB9565@techsingularity.net>
In-Reply-To: <12171.1548557813@turing-police.cc.vt.edu>

On Sat, Jan 26, 2019 at 09:56:53PM -0500, valdis.kletnieks@vt.edu wrote:
> On Sat, 26 Jan 2019 21:00:05 +0100, Pavel Machek said:
> 
> > top - 13:38:51 up  1:42, 16 users,  load average: 1.41, 1.93, 1.62
> > Tasks: 182 total,   3 running, 138 sleeping,   0 stopped,   0 zombie
> > %Cpu(s):  2.3 us, 57.8 sy,  0.0 ni, 39.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> > KiB Mem:   3020044 total,  2429420 used,   590624 free,    27468 buffers
> > KiB Swap:  2097148 total,        0 used,  2097148 free.  1924268 cached Mem
> >
> >   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> >   608 root      20   0       0      0      0 R  99.6  0.0  11:34.38 kcompactd0
> >  9782 root      20   0       0      0      0 I   7.9  0.0   0:59.02 kworker/0:
> >  2971 root      20   0   46624  23076  13576 S   4.3  0.8   2:50.22 Xorg
> 
> I've noticed this as well on earlier kernels (next-20181224 to 20190115)
> 
> Some more info:
> 
> 1) echo 3 > /proc/sys/vm/drop_caches  unwedges kcompactd in 1-3 seconds.
> 

This aspect is curious as it indicates that kcompactd could potentially
be looping infinitely, but it's not something I've experienced myself. By
any chance, is there a predictable reproduction case for this?

> I've also seen khugepaged hung up:
> 
> cat /proc/29/stack
> [<0>] ___preempt_schedule+0x16/0x18
> [<0>] page_vma_mapped_walk+0x60/0x840
> [<0>] remove_migration_pte+0x67/0x390
> [<0>] rmap_walk_file+0x186/0x380
> [<0>] rmap_walk+0xa3/0xd0
> [<0>] remove_migration_ptes+0x69/0x70
> [<0>] migrate_pages+0xb6d/0xfd8
> [<0>] compact_zone+0xb70/0x1370
> [<0>] compact_zone_order+0xd8/0x120
> [<0>] try_to_compact_pages+0xe5/0x550
> [<0>] __alloc_pages_direct_compact+0x6d/0x1a0
> [<0>] __alloc_pages_slowpath+0x6c9/0x1640
> [<0>] __alloc_pages_nodemask+0x558/0x5b0
> [<0>] khugepaged+0x499/0x810
> [<0>] kthread+0x158/0x170
> [<0>] ret_from_fork+0x3a/0x50
> [<0>] 0xffffffffffffffff
> 
> Looks like something has gone astray with compact_zone.
> 

It's possible that the buffer aspect of the trace is a red herring
and that some corner case prevents the migration scanner and the free
scanner from meeting, so compaction never exits. Again, a reproduction
case of some sort would be nice, or an indication of how long it takes
to trigger. An update of the series is due which may or may not fix
this, but if it doesn't, we'll need to start tracing this to see what's
going on at the point of failure.
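
To illustrate the exit condition referred to above, here is a toy
userspace model, not kernel code. It assumes a migration scanner that
walks up from the start of a zone and a free scanner that walks down
from the end, with compaction finishing once the two meet. The names
toy_scan, scanners_met, toy_compact and the "stall" flag are all
hypothetical and exist only for this sketch; the stall stands in for
whatever corner case might stop the scanners from making progress.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the kernel's data structures. */
struct toy_scan {
	unsigned long migrate_pfn;	/* walks up from the zone start */
	unsigned long free_pfn;		/* walks down from the zone end */
};

/* Compaction of the zone is considered complete once the scanners meet. */
static bool scanners_met(const struct toy_scan *s)
{
	return s->free_pfn <= s->migrate_pfn;
}

/*
 * Advance both scanners one block per step until they meet.
 * Returns the number of steps taken, or -1 if max_steps was reached
 * without the scanners meeting (the "spinning forever" symptom).
 */
static long toy_compact(unsigned long start, unsigned long end,
			unsigned long block, bool stall, long max_steps)
{
	struct toy_scan s = { .migrate_pfn = start, .free_pfn = end };
	long steps = 0;

	assert(block > 0);

	while (!scanners_met(&s)) {
		if (steps++ >= max_steps)
			return -1;

		/* Assumed corner case: neither scanner makes progress. */
		if (stall)
			continue;

		s.migrate_pfn += block;
		s.free_pfn = (s.free_pfn > block) ? s.free_pfn - block : 0;
	}

	return steps;
}
```

Without the stall, the loop ends as soon as the two positions cross.
With it, only the step cap ends the loop, which is roughly the shape
of failure that tracing would need to confirm or rule out.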

-- 
Mel Gorman
SUSE Labs

Thread overview (18+ messages):
2019-01-26 20:00 [regression -next0117] What is kcompactd and why is he eating 100% of my cpu? Pavel Machek
2019-01-27  2:56 ` valdis.kletnieks
2019-01-27 14:09   ` Mel Gorman
2019-01-27 14:15   ` Mel Gorman [this message]
2019-01-27 16:00     ` Pavel Machek
2019-01-27 21:36       ` valdis.kletnieks
2019-01-28  9:16         ` Jan Kara
2019-01-28 10:57           ` Sergey Senozhatsky
2019-01-28 11:03           ` Mel Gorman
2019-01-30  1:06           ` valdis.kletnieks
2019-01-30  4:29             ` valdis.kletnieks
2019-01-30 10:40               ` Mel Gorman
2021-01-25 18:54                 ` Tibor Bana
2021-01-26  8:52                   ` Valdis Klētnieks
2021-01-26  9:17                   ` Mel Gorman
2021-01-27 19:29                     ` Tibor Bana
2021-02-16 12:36                   ` Jason A. Donenfeld
2021-02-16 22:33                     ` Valdis Klētnieks
