From: Christoph Lameter <clameter@engr.sgi.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mel@skynet.ie>,
	npiggin@suse.de, mingo@elte.hu, jschopp@austin.ibm.com,
	arjan@infradead.org, torvalds@linux-foundation.org,
	mbligh@mbligh.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: The performance and behaviour of the anti-fragmentation related patches
Date: Thu, 1 Mar 2007 19:05:48 -0800 (PST)
Message-ID: <Pine.LNX.4.64.0703011854540.5530@schroedinger.engr.sgi.com>
In-Reply-To: <20070301160915.6da876c5.akpm@linux-foundation.org>

On Thu, 1 Mar 2007, Andrew Morton wrote:

> What worries me is memory hot-unplug and per-container RSS limits.  We
> don't know how we're going to do either of these yet, and it could well be
> that the anti-frag work significantly complexicates whatever we end up
> doing there.

Right now it seems that the per-container RSS limits differ from the
statistics calculated per zone. There would be a conceptual overlap, but
the containers are optional and track numbers differently. There is no RSS
counter in a zone, for example.

Memory hot-unplug would tap directly into the anti-frag work. Essentially
only the zone with movable pages would be unpluggable without additional
measures. Making slab items and other currently fixed allocations movable
requires work anyway. A new zone concept will not help.

> For prioritisation purposes I'd judge that memory hot-unplug is of similar
> value to the antifrag work (because memory hot-unplug permits DIMM
> poweroff).

I would say that anti-frag / defrag enables memory unplug.

> And I'd judge that per-container RSS limits are of considerably more value
> than antifrag (in fact per-container RSS might be a superset of antifrag,
> in the sense that per-container RSS and containers could be abused to fix
> the i-cant-get-any-hugepages problem, dunno).

Do they relate? How can a container perform antifrag? Do you mean that a
container reserves a portion of a hardware zone and becomes a software zone?

> So some urgent questions are: how are we going to do mem hotunplug and
> per-container RSS?

Separately. There is no need to mingle these two together.

> Our basic unit of memory management is the zone.  Right now, a zone maps
> onto some hardware-imposed thing.  But the zone-based MM works *well*.  I

That's a value judgement that I doubt. Zone-based balancing is bad and has
been repeatedly patched up so that it works with the usual loads.

> suspect that a good way to solve both per-container RSS and mem hotunplug
> is to split the zone concept away from its hardware limitations: create a
> "software zone" and a "hardware zone".  All the existing page allocator and
> reclaim code remains basically unchanged, and it operates on "software
> zones".  Each software zones always lies within a single hardware zone. 
> The software zones are resizeable.  For per-container RSS we give each
> container one (or perhaps multiple) resizeable software zones.

Resizable software zones? Are they contiguous or not? If not then we
add another layer to the defrag problem.

> For memory hotunplug, some of the hardware zone's software zones are marked
> reclaimable and some are not; DIMMs which are wholly within reclaimable
> zones can be depopulated and powered off or removed.

So subzones indeed. How about calling the MAX_ORDER entities that Mel's 
patches create "software zones"?

> NUMA and cpusets screwed up: they've gone and used nodes as their basic
> unit of memory management whereas they should have used zones.  This will
> need to be untangled.

Zones have hardware characteristics at their core. In a NUMA setting, zones
determine the performance of loads from those areas. I would like to have
zones and nodes merged. Maybe extend node numbers into the negative range:
-1 = DMA, -2 = DMA32, etc.? All systems then manage the "nones" (nodes /
zones merged). One could create additional "virtual" nones after the real
nones (the ones that have hardware characteristics behind them). The virtual
nones would be something like the software zones? They would contain
MAX_ORDER portions of hardware nones?
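
To make the numbering idea concrete, here is a rough sketch of what a merged
"none" identifier could look like. Nothing here is existing kernel code: the
names none_t, NONE_DMA, NONE_DMA32, NR_REAL_NODES and the helper functions are
all made up for illustration, and the node count of 4 is an arbitrary
assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical merged node/zone identifier ("none"). Not a kernel type. */
typedef int none_t;

/* Restricted zones take negative numbers, as suggested in the mail. */
enum {
	NONE_DMA32 = -2,
	NONE_DMA   = -1,
};

/* Assumption for this example only: four real NUMA nodes, ids 0..3. */
#define NR_REAL_NODES 4

/* Negative ids are the former DMA-style zones. */
static inline bool none_is_restricted_zone(none_t id)
{
	return id < 0;
}

/* Ids 0..NR_REAL_NODES-1 are real nodes with hardware behind them. */
static inline bool none_is_real_node(none_t id)
{
	return id >= 0 && id < NR_REAL_NODES;
}

/* Virtual nones are numbered after the real ones. */
static inline bool none_is_virtual(none_t id)
{
	return id >= NR_REAL_NODES;
}

/* Id of the idx-th virtual none (idx counts from 0). */
static inline none_t virtual_none(int idx)
{
	assert(idx >= 0);
	return NR_REAL_NODES + idx;
}
```

With this layout a single integer comparison tells the three kinds apart, and
existing code that only deals with non-negative node numbers would not see
the restricted zones unless it asks for them.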

> Anyway, that's just a shot in the dark.  Could be that we implement unplug
> and RSS control by totally different means.  But I do wish that we'd sort
> out what those means will be before we potentially complicate the story a
> lot by adding antifragmentation.

Hmmm.... My shot:

1. Merge zones/nodes

2. Create new virtual zones/nodes that are subsets of MAX_ORDER blocks of
the real zones/nodes. These may then have additional characteristics such
as:

A. movable/unmovable
B. DMA restrictions
C. container assignment.
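
A minimal sketch of point 2, to show how the three characteristics might hang
off a virtual zone and how hot-unplug could use them. This is an illustration
under stated assumptions, not a proposal for actual kernel structures: struct
virtual_zone, the VZ_* flags and vz_range_unpluggable() are hypothetical
names, and the sketch assumes each virtual zone is a contiguous run of
MAX_ORDER blocks within its parent, which is still an open question above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical characteristic flags for a virtual zone. */
#define VZ_MOVABLE (1u << 0)	/* A. holds only movable pages */
#define VZ_DMA     (1u << 1)	/* B. subject to DMA restrictions */

/* Hypothetical virtual zone: a run of MAX_ORDER blocks in a real zone/node. */
struct virtual_zone {
	int parent;		/* id of the real zone/node it lives in */
	unsigned int flags;	/* VZ_* characteristics */
	int container;		/* C. owning container, -1 if none */
	size_t first_block;	/* first MAX_ORDER block index in parent */
	size_t nr_blocks;	/* number of MAX_ORDER blocks (assumed contiguous) */
};

/* Does this virtual zone cover the given block of the given parent? */
static inline bool vz_covers(const struct virtual_zone *vz, int parent,
			     size_t block)
{
	return vz->parent == parent &&
	       block >= vz->first_block &&
	       block - vz->first_block < vz->nr_blocks;
}

/*
 * A block range [start, start + len) of a parent is treated as unpluggable
 * only if every block in it belongs to some movable virtual zone.
 */
static inline bool vz_range_unpluggable(const struct virtual_zone *zones,
					size_t nr_zones, int parent,
					size_t start, size_t len)
{
	for (size_t b = start; b < start + len; b++) {
		bool movable = false;

		for (size_t i = 0; i < nr_zones; i++) {
			if (vz_covers(&zones[i], parent, b)) {
				movable = (zones[i].flags & VZ_MOVABLE) != 0;
				break;
			}
		}
		if (!movable)
			return false;
	}
	return true;
}
```

The check is deliberately naive (linear in blocks times zones). The point is
only that unpluggability falls out of the movable attribute, without any
separate zone concept.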

