From: Andrew Morton <akpm@osdl.org>
To: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Cc: magnus.damm@gmail.com, clameter@sgi.com, kravetz@us.ibm.com,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 0/4] Swap migration V3: Overview
Date: Thu, 27 Oct 2005 20:07:45 -0700
Message-ID: <20051027200745.17d0767b.akpm@osdl.org>
In-Reply-To: <20051027213548.GB8128@logos.cnet>
Marcelo Tosatti <marcelo.tosatti@cyclades.com> wrote:
>
> Hi Andrew!
>
> On Thu, Oct 27, 2005 at 01:43:47PM -0700, Andrew Morton wrote:
> > Marcelo Tosatti <marcelo.tosatti@cyclades.com> wrote:
> > >
> > > The fair approach would be to have the
> > > number of pages to reclaim also relative to zone size.
> > >
> > > sc->nr_to_reclaim = (zone->present_pages * sc->swap_cluster_max) /
> > > total_memory;
> >
> > You can try it, but that shouldn't matter. SWAP_CLUSTER_MAX is just a
> > batching factor used to reduce CPU consumption. If you make it twice as
> > big, we run DMA-zone reclaim half as often - it should balance out.
>
> But you're not taking the relationship between DMA and NORMAL zone
> into account?
We need to be careful to differentiate between page allocation and page
reclaim. In some ways they're coupled, but the VM does attempt to make one
independent from the other.
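
As a rough illustration of the proportional formula quoted above, here is a minimal sketch in plain userspace C. The helper name proportional_nr_to_reclaim() is hypothetical, and it assumes total_memory is the sum of the zones' present pages and that swap_cluster_max is 32; neither is stated in this thread.

```c
#include <assert.h>

/*
 * Hypothetical helper, not kernel code: evaluates the quoted formula
 *   nr_to_reclaim = (present_pages * swap_cluster_max) / total_memory
 * with plain integer division, so the result is truncated toward zero.
 */
static unsigned long proportional_nr_to_reclaim(unsigned long present_pages,
                                                unsigned long swap_cluster_max,
                                                unsigned long total_memory)
{
    assert(total_memory > 0);   /* avoid dividing by zero */
    return (present_pages * swap_cluster_max) / total_memory;
}
```

With the zone sizes quoted further down in this message (4096, 225280 and 30342 pages, 259718 in total) and a batch of 32, this evaluates to 0 for DMA, 27 for Normal and 3 for HighMem. Taken literally, the integer division rounds the DMA target down to zero, so an actual patch would presumably need to round up or enforce a minimum.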
> I suppose that a side effect of such a change is that more allocations
> will be serviced from the NORMAL/HIGHMEM zones ("more intensively
> reclaimed") while fewer allocations will be serviced by the DMA zone
> (whose scan/reclaim progress should now be _much_ lighter than that of
> the NORMAL zone), i.e. the DMA zone will be much less often "available"
> for GFP_HIGHMEM/GFP_KERNEL allocations, which are the vast majority.
The use of SWAP_CLUSTER_MAX in the reclaim code shouldn't affect the
inter-zone balancing over in the allocation code. Much.
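
The "batching factor" argument can be sketched with a toy model. This is not the kernel's reclaim loop; reclaim_passes() and pages_processed() are hypothetical names, and the model simply assumes that a fixed amount of reclaim work is handed out in batches of a given size.

```c
#include <assert.h>

/* Toy model: how many reclaim passes are needed to cover pages_needed
 * pages when each pass handles `batch` pages (rounded up). */
static unsigned long reclaim_passes(unsigned long pages_needed,
                                    unsigned long batch)
{
    assert(batch > 0);   /* a zero batch would never make progress */
    return (pages_needed + batch - 1) / batch;
}

/* Total pages handled across all passes in the same toy model. */
static unsigned long pages_processed(unsigned long pages_needed,
                                     unsigned long batch)
{
    return reclaim_passes(pages_needed, batch) * batch;
}
```

In this model, covering 1024 pages takes 32 passes with a batch of 32 and 16 passes with a batch of 64, and the total number of pages processed is 1024 either way. Doubling the batch halves the number of passes without changing the overall amount of work.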
> Might be talking BS though.
>
> What else could explain these numbers from Magnus, taking into account
> that a large number of pages in the DMA zone are used for kernel text,
> etc.? This imbalance seems to be potentially suboptimal (and to result
> in unpredictable behaviour depending on which zone pages are
> allocated from):
>
> "$ cat /proc/zoneinfo | grep present
> present 4096
> present 225280
> present 30342
>
> $ cat /proc/zoneinfo | grep tscanned
> tscanned 151352
> tscanned 3480599
> tscanned 541466
>
> "tscanned" counts how many pages have been scanned in each zone
> since power on. Executive summary assuming that only LRU pages exist
> in the zone:
>
> DMA: each page has been scanned ~37 times
> Normal: each page has been scanned ~15 times
> HighMem: each page has been scanned ~18 times"
Yes, I've noticed that.
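
The per-zone figures quoted above can be reproduced by dividing each zone's "tscanned" count by its "present" count. A minimal sketch follows; scans_per_page() is a hypothetical helper name, and like the quoted summary it assumes every present page is on the LRU.

```c
#include <assert.h>

/* Average number of times each page in a zone has been scanned,
 * assuming (as the quoted summary does) that all present pages
 * are LRU pages. */
static double scans_per_page(unsigned long tscanned,
                             unsigned long present_pages)
{
    assert(present_pages > 0);   /* an empty zone has no meaningful ratio */
    return (double)tscanned / (double)present_pages;
}
```

Called with the quoted values, this gives roughly 36.95 for DMA (151352 / 4096), 15.45 for Normal (3480599 / 225280) and 17.85 for HighMem (541466 / 30342), in line with the ~37, ~15 and ~18 figures above. If only part of a zone is on the LRU, the real per-LRU-page figure would be higher than this ratio.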
> I feel that I'm reaching the point where things should be confirmed
> instead of guessed (on my part!).
Need to check the numbers, but I expect you'll find that ZONE_DMA is
basically never used for either __GFP_HIGHMEM or GFP_KERNEL allocations,
due to the watermark thingies.
So it's basically just sitting there, being used by GFP_DMA allocations.
And IIRC there _are_ a batch of GFP_DMA allocations early in boot for
block-related stuff(?). It's all hazy ;)
But that would mean that most of the ZONE_DMA pages are used for
unreclaimable purposes, and only a small proportion of them are on the LRU.
That might cause the arithmetic to perform more scanning down there.