Re: khugepaged eating 100%CPU


From: Andrea Arcangeli <aarcange@redhat.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: LKML <linux-kernel@vger.kernel.org>, linux-mm@kvack.org
Subject: Re: khugepaged eating 100%CPU
Date: Tue, 8 Feb 2011 00:12:28 +0100
Message-ID: <20110207231228.GI3347@random.random>
In-Reply-To: <20110207211601.GA25665@tiehlicka.suse.cz>

Hello Michal,

On Mon, Feb 07, 2011 at 10:16:01PM +0100, Michal Hocko wrote:
> On Mon 07-02-11 22:06:54, Michal Hocko wrote:
> > Hi Andrea,
> > 
> > I am currently running into an issue when khugepaged is running 100% on
> > one of my CPUs for a long time (at least one hour as I am writing the
> > email). The kernel is the clean 2.6.38-rc3 (i386) vanilla kernel.
> > 
> > I have tried to disable defrag but it didn't help (I haven't rebooted
> > after setting the value). I am not sure what information is helpful and
> > also not sure whether I am able to reproduce it after restart (it is the
> > first time I can see this problem) so sorry for the poor report.
> > 
> > Here is some basic info which might be useful (config and sysrq+t are
> > attached):
> > =========
> 
> And I have just realized that I forgot about the daemon stack:
> # cat /proc/573/stack 
> [<c019c981>] shrink_zone+0x1b9/0x455
> [<c019d462>] do_try_to_free_pages+0x9d/0x301
> [<c019d803>] try_to_free_pages+0xb3/0x104
> [<c01966d7>] __alloc_pages_nodemask+0x358/0x589
> [<c01bf314>] khugepaged+0x13f/0xc60
> [<c014c301>] kthread+0x67/0x6c
> [<c0102db6>] kernel_thread_helper+0x6/0x10
> [<ffffffff>] 0xffffffff

It would be great to know if __alloc_pages_nodemask returned or if it
was calling it in a loop.

When __alloc_pages_nodemask fails in collapse_huge_page, hpage is set
to ERR_PTR(-ENOMEM) and khugepaged_scan_pmd returns 1.
khugepaged_scan_mm_slot then jumps to breakouterloop_mmap_sem and
returns progress. The khugepaged_do_scan main loop should notice that
IS_ERR(*hpage) is set, break out of the loop and return. Finally,
khugepaged_loop should notice that IS_ERR(hpage) is set and throttle
for alloc_sleep_millisecs inside khugepaged_alloc_sleep before setting
hpage to NULL and trying to allocate again.

I wonder what could be going wrong in khugepaged, or whether it's a
bug inside __alloc_pages_nodemask and not a khugepaged issue. It would
be best if you ran sysrq+l several times: /proc/*/stack doesn't seem
to be the best tool for running tasks, even if it should be accurate
enough already, but if you sample it often and together with sysrq+l
it will be clearer what is running.

I hope you can reproduce it. If it's an allocator issue, you should
notice it again by keeping the same workload on that same system. I
doubt I can reproduce it at the moment, as I don't know enough about
what's going on to simulate your load.
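
If the problem does reproduce, the repeated sysrq+l sampling suggested
earlier could be automated with a small helper along these lines. It
is only a sketch: it assumes the standard /proc/sysrq-trigger
interface, root privileges and sysrq being enabled, and the function
names, sample count and interval are hypothetical. The backtraces
themselves end up in the kernel log, not in the program's output.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/* Write a single sysrq key to the trigger file; 0 on success, -1 on error. */
static int trigger_sysrq(const char *trigger_path, char key)
{
	FILE *f = fopen(trigger_path, "w");
	int ok;

	if (!f)
		return -1;
	ok = (fputc(key, f) != EOF);
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * Request `samples` backtraces of the active CPUs (sysrq key 'l'),
 * pausing interval_ms between requests. Returns how many requests
 * were written successfully.
 */
static int sample_backtraces(const char *trigger_path, int samples,
			     long interval_ms)
{
	struct timespec pause = {
		.tv_sec = interval_ms / 1000,
		.tv_nsec = (interval_ms % 1000) * 1000000L,
	};
	int done = 0;

	for (int i = 0; i < samples; i++) {
		if (trigger_sysrq(trigger_path, 'l') == 0)
			done++;
		if (i + 1 < samples)
			nanosleep(&pause, NULL);
	}
	return done;
}
```

A call such as sample_backtraces("/proc/sysrq-trigger", 10, 500) would
request ten backtraces half a second apart. Comparing them in dmesg
should show whether khugepaged keeps re-entering
__alloc_pages_nodemask or stays inside a single call.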

Thanks a lot,
Andrea

