linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Thorsten Leemhuis <fedora@leemhuis.info>
To: Josh Boyer <jwboyer@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>,
	Zdenek Kabelac <zkabelac@redhat.com>,
	Seth Jennings <sjenning@linux.vnet.ibm.com>,
	Jiri Slaby <jslaby@suse.cz>,
	Valdis.Kletnieks@vt.edu, Jiri Slaby <jirislaby@gmail.com>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>,
	Robert Jennings <rcj@linux.vnet.ibm.com>,
	bruno@wolff.to
Subject: Re: [PATCH] Revert "mm: remove __GFP_NO_KSWAPD"
Date: Tue, 20 Nov 2012 18:43:04 +0100	[thread overview]
Message-ID: <50ABC128.80706@leemhuis.info> (raw)
In-Reply-To: <CA+5PVA7__=JcjLAhs5cpVK-WaZbF5bQhp5WojBJsdEt9SnG3cw@mail.gmail.com>

On 20.11.2012 16:38, Josh Boyer wrote:
> On Fri, Nov 16, 2012 at 3:06 PM, Mel Gorman <mgorman@suse.de> wrote:
>> On Fri, Nov 16, 2012 at 02:14:47PM -0500, Josh Boyer wrote:
>>> On Mon, Nov 12, 2012 at 6:37 AM, Mel Gorman <mgorman@suse.de> wrote:
>>>> With "mm: vmscan: scale number of pages reclaimed by reclaim/compaction
>>>> based on failures" reverted, Zdenek Kabelac reported the following
>>>>
>>>>          Hmm,  so it's just took longer to hit the problem and observe
>>>>          kswapd0 spinning on my CPU again - it's not as endless like before -
>>>>          but still it easily eats minutes - it helps to  turn off  Firefox
>>>>          or TB  (memory hungry apps) so kswapd0 stops soon - and restart
>>>>          those apps again.  (And I still have like >1GB of cached memory)
>>>>
>>>>          kswapd0         R  running task        0    30      2 0x00000000
>>>>           ffff8801331efae8 0000000000000082 0000000000000018 0000000000000246
>>>>           ffff880135b9a340 ffff8801331effd8 ffff8801331effd8 ffff8801331effd8
>>>>           ffff880055dfa340 ffff880135b9a340 00000000331efad8 ffff8801331ee000
>>>>          Call Trace:
>>>>           [<ffffffff81555bf2>] preempt_schedule+0x42/0x60
>>>>           [<ffffffff81557a95>] _raw_spin_unlock+0x55/0x60
>>>>           [<ffffffff81192971>] put_super+0x31/0x40
>>>>           [<ffffffff81192a42>] drop_super+0x22/0x30
>>>>           [<ffffffff81193b89>] prune_super+0x149/0x1b0
>>>>           [<ffffffff81141e2a>] shrink_slab+0xba/0x510
>>>>
>>>> The sysrq+m indicates the system has no swap so it'll never reclaim
>>>> anonymous pages as part of reclaim/compaction. That is one part of the
>>>> problem but not the root cause as file-backed pages could also be reclaimed.
>>>>
>>>> The likely underlying problem is that kswapd is woken up or kept awake
>>>> for each THP allocation request in the page allocator slow path.
>>>>
>>>> If compaction fails for the requesting process then compaction will be
>>>> deferred for a time and direct reclaim is avoided. However, if there
>>>> are a storm of THP requests that are simply rejected, it will still
>>>> be the the case that kswapd is awake for a prolonged period of time
>>>> as pgdat->kswapd_max_order is updated each time. This is noticed by
>>>> the main kswapd() loop and it will not call kswapd_try_to_sleep().
>>>> Instead it will loopp, shrinking a small number of pages and calling
>>>> shrink_slab() on each iteration.
>>>>
>>>> The temptation is to supply a patch that checks if kswapd was woken for
>>>> THP and if so ignore pgdat->kswapd_max_order but it'll be a hack and not
>>>> backed up by proper testing. As 3.7 is very close to release and this is
>>>> not a bug we should release with, a safer path is to revert "mm: remove
>>>> __GFP_NO_KSWAPD" for now and revisit it with the view to ironing out the
>>>> balance_pgdat() logic in general.
>>>>
>>>> Signed-off-by: Mel Gorman <mgorman@suse.de>
>>>
>>> Does anyone know if this is queued to go into 3.7 somewhere?  I looked
>>> a bit and can't find it in a tree.  We have a few reports of Fedora
>>> rawhide users hitting this.
>>
>> No, because I was waiting to hear if a) it worked and preferably if the
>> alternative "less safe" option worked. This close to release it might be
>> better to just go with the safe option.
>
> We've been tracking it in https://bugzilla.redhat.com/show_bug.cgi?id=866988
> and people say this revert patch doesn't seem to make the issue go away
> fully.  Thorsten has created another kernel with the other patch applied
> for testing.
>
> At least I think that is the latest status from the bug.  Hopefully the
> commenters will chime in.

The short story from my current point of view is:

  * my main machine at home where I initially saw the issue that started 
this thread seems to be running fine with rc6 and the "safe" patch Mel 
posted in https://lkml.org/lkml/2012/11/12/113 Before that I ran a rc5 
kernel with the revert that went into rc6 and the "safe" patch -- that 
worked fine for a few days, too.

  * I have a second machine where I started to use 3.7-rc kernels only 
yesterday (the machine triggered a bug in the radeon driver that seems 
to be fixed in rc6) which showed symptoms like the ones Zdenek Kabelac 
mentions in this thread. I wasn't able to look closer at it, but simply 
tried rc6 with the safe patch, which didn't help. I'm now running rc6 
with the "riskier" patch from https://lkml.org/lkml/2012/11/12/151
I can't yet tell if it helps. If the problems shows up again I'll try to 
capture more debugging data via sysrq -- there wasn't any time for that 
when I was running rc6 with the safe patch, sorry.

Thorsten

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2012-11-20 17:43 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-11  8:52 kswapd0: wxcessive CPU usage Jiri Slaby
2012-10-11 13:44 ` Valdis.Kletnieks
2012-10-11 15:34   ` Jiri Slaby
2012-10-11 17:56     ` Valdis.Kletnieks
2012-10-11 17:59       ` Jiri Slaby
2012-10-11 18:19         ` Valdis.Kletnieks
2012-10-11 22:08           ` kswapd0: excessive " Jiri Slaby
2012-10-12 12:37             ` Jiri Slaby
2012-10-12 13:57               ` Mel Gorman
2012-10-15  9:54                 ` Jiri Slaby
2012-10-15 11:09                   ` Mel Gorman
2012-10-29 10:52                     ` Thorsten Leemhuis
2012-10-30 19:18                       ` Mel Gorman
2012-10-31 11:25                         ` Thorsten Leemhuis
2012-10-31 15:04                           ` Mel Gorman
2012-11-04 16:36                         ` Rik van Riel
2012-11-02 10:44                     ` Zdenek Kabelac
2012-11-02 10:53                       ` Jiri Slaby
2012-11-02 19:45                         ` Jiri Slaby
2012-11-04 11:26                           ` Zdenek Kabelac
2012-11-05 14:24                           ` [PATCH] Revert "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures" Mel Gorman
2012-11-06 10:15                             ` Johannes Hirte
2012-11-09  8:36                               ` Mel Gorman
2012-11-14 21:43                                 ` Johannes Hirte
2012-11-09  9:12                             ` Mel Gorman
2012-11-09  4:22                           ` kswapd0: excessive CPU usage Seth Jennings
2012-11-09  8:07                             ` Zdenek Kabelac
2012-11-09  9:06                               ` Mel Gorman
2012-11-11  9:13                                 ` Zdenek Kabelac
2012-11-12 11:37                                   ` [PATCH] Revert "mm: remove __GFP_NO_KSWAPD" Mel Gorman
2012-11-16 19:14                                     ` Josh Boyer
2012-11-16 19:51                                       ` Andrew Morton
2012-11-20  1:43                                         ` Valdis.Kletnieks
2012-11-16 20:06                                       ` Mel Gorman
2012-11-20 15:38                                         ` Josh Boyer
2012-11-20 16:13                                           ` Bruno Wolff III
2012-11-20 17:43                                           ` Thorsten Leemhuis [this message]
2012-11-23 15:20                                             ` Thorsten Leemhuis
2012-11-27 11:12                                               ` Mel Gorman
2012-11-21 15:08                                           ` Mel Gorman
2012-11-20  9:18                                     ` Glauber Costa
2012-11-20 20:18                                       ` Andrew Morton
2012-11-21  8:30                                         ` Glauber Costa
2012-11-12 12:19                                   ` kswapd0: excessive CPU usage Mel Gorman
2012-11-12 13:13                                     ` Zdenek Kabelac
2012-11-12 13:31                                       ` Mel Gorman
2012-11-12 14:50                                         ` Zdenek Kabelac
2012-11-18 19:00                                         ` Zdenek Kabelac
2012-11-18 19:07                                           ` Jiri Slaby
2012-11-09  8:40                             ` Mel Gorman
2012-10-11 22:14 ` kswapd0: wxcessive " Andrew Morton
2012-10-11 22:26   ` Jiri Slaby

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50ABC128.80706@leemhuis.info \
    --to=fedora@leemhuis.info \
    --cc=Valdis.Kletnieks@vt.edu \
    --cc=akpm@linux-foundation.org \
    --cc=bruno@wolff.to \
    --cc=jirislaby@gmail.com \
    --cc=jslaby@suse.cz \
    --cc=jwboyer@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=rcj@linux.vnet.ibm.com \
    --cc=riel@redhat.com \
    --cc=sjenning@linux.vnet.ibm.com \
    --cc=zkabelac@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).