From: Mel Gorman <mgorman@suse.de>
To: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>, Ingo Molnar <mingo@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Stephen Rothwell <sfr@canb.auug.org.au>,
linux-next@vger.kernel.org, linux-kernel@vger.kernel.org,
Linus Torvalds <torvalds@linux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Andrea Arcangeli <aarcange@redhat.com>
Subject: Re: linux-next: Tree for Nov 14
Date: Thu, 15 Nov 2012 12:10:57 +0000
Message-ID: <20121115121057.GU8218@suse.de>
In-Reply-To: <50A3CF4B.8020806@redhat.com>

On Wed, Nov 14, 2012 at 12:05:15PM -0500, Rik van Riel wrote:
> On 11/14/2012 03:13 AM, Hugh Dickins wrote:
>
> >Please, Ingo, stop trying to force this in ahead of time, yet again.
> >
> >People are still reviewing and comparing competing solutions.
> >Maybe this latest will prove to be closest to the right answer,
> >maybe it will not. It's, what, about two days old right now?
> >
> >If we had wanted to push in a good solution a little prematurely,
> >we would surely have chosen Andrea's AutoNUMA months ago, despite
> >efforts to block it; and maybe we shall still want to go that way.
>
> As much as I would like to see NUMA stuff going upstream
> the day before yesterday, I have to agree with Hugh that
> we need to do things right.
>

After my last set of tests against schednuma I have to agree. While the
differences we see in different tests could be explained by different JVM
configurations, that does not tell us *why* they performed differently. Because
of the monolithic nature of some of the patches it's non-trivial to
establish which part is causing the problems. I still have not got
around to sending the latest schednuma through a spidey decoder ring to
see exactly how it works. FWIW the idea that is described sounds great.

> Having unreviewed (some of it NAKed) code sitting in
> tip.git and you trying to force it upstream is not the
> right way to go.
>
> >Please, forget about v3.8, cut this branch out of linux-next,
> >and seek consensus around getting it right for v3.9.
>
> I suspect that no matter how long we delay merging the
> NUMA placement code, we will always run into some kind
> of regression. I am not sure if a delay will buy us much.
>
> On the mm/ bits, there appears to be consensus already.
> Mel Gorman's patch series contains the nicest mm/ bits
> from autonuma and sched/numa, plus further improvements.
> Andrea has supported Mel's series, and Ingo is pulling
> code from it.
>
> That leads me to believe Mel's NUMA bits may be worth
> considering for 3.8.
>

I still think the series is not fully baked. I'm still working on getting
some of the basics right and getting the system CPU usage down, which right
now is through the roof. It's going to take me time and while I think I'll
have something working semi-properly by the time 3.8 rolls around I severely
doubt it'll have seen any widespread testing. My preference is that the final
result be sort of comparable with autonuma's performance but satisfy the
scheduler folk in terms of how it integrates with kernel/sched/* and not use
kernel threads except as a last resort.

Big chunks are still missing. No knob for turning it off from the command line,
no THP native migration (getting the simple case right first), placement
policy is still extremely heavy (was run in kernel thread context before and
needs to change now), page struct elements are not folded into page->flags,
task_struct has fields that should move to task_struct->task_balancenuma,
no docs etc.

> On top of that, we could place the policy code by
> Peter and Ingo, but as a nice reviewable patch series,
> not hidden away through various tip.git branches.
>
> Does a combination of Mel's NUMA mm/ bits and the
> policy code from Peter and Ingo sound reasonable?
>
> Mel, is that reasonable to you?
>

It'd be reasonable to me. Preferably patches would affect individual areas
rather than being a large patch affecting multiple areas. As well as being
easier to comprehend, we can also bisect the result. To me, the obvious
discrete areas that a single patch would affect are

1. The PTE update helper functions
2. The PTE scanning machinery driven from task_numa_tick
3. Task and process fault accounting and how that information is used
   to determine if a page is misplaced
4. Fault handling, migrating the page if misplaced, what information is
   provided to the placement policy

Obviously that is not always possible.

Thanks to the kernel.org folk I have a git tree at
git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma.git

The mm-balancenuma-v1r15[*] and mm-balancenuma-v2r45 branches correspond to
the V1 and V2 series I released. I've pushed a mm-balancenuma-v3r22-snapshot
branch which is unreleased but shows where the tree currently stands.
Almost nothing in there after the initial placement policy has been tested
at all, but it shows the initial adjustment to how PMD faults are handled
and some preliminary migration rate-limiting code. The same patches when
complete should be usable by schednuma, be it due to a rebase on top or
because they pull the patches in and adjust them accordingly.

--
Mel Gorman
SUSE Labs