From: Thorsten Leemhuis <fedora@leemhuis.info>
To: Josh Boyer <jwboyer@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>,
Zdenek Kabelac <zkabelac@redhat.com>,
Seth Jennings <sjenning@linux.vnet.ibm.com>,
Jiri Slaby <jslaby@suse.cz>,
Valdis.Kletnieks@vt.edu, Jiri Slaby <jirislaby@gmail.com>,
linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Rik van Riel <riel@redhat.com>,
Robert Jennings <rcj@linux.vnet.ibm.com>,
bruno@wolff.to
Subject: Re: [PATCH] Revert "mm: remove __GFP_NO_KSWAPD"
Date: Tue, 20 Nov 2012 18:43:04 +0100
Message-ID: <50ABC128.80706@leemhuis.info>
In-Reply-To: <CA+5PVA7__=JcjLAhs5cpVK-WaZbF5bQhp5WojBJsdEt9SnG3cw@mail.gmail.com>

On 20.11.2012 16:38, Josh Boyer wrote:
> On Fri, Nov 16, 2012 at 3:06 PM, Mel Gorman <mgorman@suse.de> wrote:
>> On Fri, Nov 16, 2012 at 02:14:47PM -0500, Josh Boyer wrote:
>>> On Mon, Nov 12, 2012 at 6:37 AM, Mel Gorman <mgorman@suse.de> wrote:
>>>> With "mm: vmscan: scale number of pages reclaimed by reclaim/compaction
>>>> based on failures" reverted, Zdenek Kabelac reported the following
>>>>
>>>> Hmm, so it's just took longer to hit the problem and observe
>>>> kswapd0 spinning on my CPU again - it's not as endless like before -
>>>> but still it easily eats minutes - it helps to turn off Firefox
>>>> or TB (memory hungry apps) so kswapd0 stops soon - and restart
>>>> those apps again. (And I still have like >1GB of cached memory)
>>>>
>>>> kswapd0 R running task 0 30 2 0x00000000
>>>> ffff8801331efae8 0000000000000082 0000000000000018 0000000000000246
>>>> ffff880135b9a340 ffff8801331effd8 ffff8801331effd8 ffff8801331effd8
>>>> ffff880055dfa340 ffff880135b9a340 00000000331efad8 ffff8801331ee000
>>>> Call Trace:
>>>> [<ffffffff81555bf2>] preempt_schedule+0x42/0x60
>>>> [<ffffffff81557a95>] _raw_spin_unlock+0x55/0x60
>>>> [<ffffffff81192971>] put_super+0x31/0x40
>>>> [<ffffffff81192a42>] drop_super+0x22/0x30
>>>> [<ffffffff81193b89>] prune_super+0x149/0x1b0
>>>> [<ffffffff81141e2a>] shrink_slab+0xba/0x510
>>>>
>>>> The sysrq+m indicates the system has no swap so it'll never reclaim
>>>> anonymous pages as part of reclaim/compaction. That is one part of the
>>>> problem but not the root cause as file-backed pages could also be reclaimed.
>>>>
>>>> The likely underlying problem is that kswapd is woken up or kept awake
>>>> for each THP allocation request in the page allocator slow path.
>>>>
>>>> If compaction fails for the requesting process then compaction will be
>>>> deferred for a time and direct reclaim is avoided. However, if there
>>>> is a storm of THP requests that are simply rejected, it will still
>>>> be the case that kswapd is awake for a prolonged period of time
>>>> as pgdat->kswapd_max_order is updated each time. This is noticed by
>>>> the main kswapd() loop and it will not call kswapd_try_to_sleep().
>>>> Instead it will loop, shrinking a small number of pages and calling
>>>> shrink_slab() on each iteration.
>>>>
>>>> The temptation is to supply a patch that checks if kswapd was woken for
>>>> THP and if so ignore pgdat->kswapd_max_order but it'll be a hack and not
>>>> backed up by proper testing. As 3.7 is very close to release and this is
>>>> not a bug we should release with, a safer path is to revert "mm: remove
>>>> __GFP_NO_KSWAPD" for now and revisit it with the view to ironing out the
>>>> balance_pgdat() logic in general.
>>>>
>>>> Signed-off-by: Mel Gorman <mgorman@suse.de>
>>>
>>> Does anyone know if this is queued to go into 3.7 somewhere? I looked
>>> a bit and can't find it in a tree. We have a few reports of Fedora
>>> rawhide users hitting this.
>>
>> No, because I was waiting to hear if a) it worked and b) preferably if the
>> alternative "less safe" option worked. This close to release it might be
>> better to just go with the safe option.
>
> We've been tracking it in https://bugzilla.redhat.com/show_bug.cgi?id=866988
> and people say this revert patch doesn't seem to make the issue go away
> fully. Thorsten has created another kernel with the other patch applied
> for testing.
>
> At least I think that is the latest status from the bug. Hopefully the
> commenters will chime in.

The short story from my current point of view is:

* My main machine at home, where I initially saw the issue that started
  this thread, seems to be running fine with rc6 and the "safe" patch Mel
  posted in https://lkml.org/lkml/2012/11/12/113 . Before that I ran an rc5
  kernel with the revert that went into rc6 and the "safe" patch; that
  worked fine for a few days, too.

* I have a second machine where I started to use 3.7-rc kernels only
  yesterday (the machine triggered a bug in the radeon driver that seems
  to be fixed in rc6). It showed symptoms like the ones Zdenek Kabelac
  mentions in this thread. I wasn't able to look closer at it, but simply
  tried rc6 with the safe patch, which didn't help. I'm now running rc6
  with the "riskier" patch from https://lkml.org/lkml/2012/11/12/151 and
  can't yet tell if it helps. If the problem shows up again I'll try to
  capture more debugging data via sysrq; there wasn't any time for that
  when I was running rc6 with the safe patch, sorry.

Thorsten