linux-kernel.vger.kernel.org archive mirror
* Re: [PATCH 00/07][RFC] i386: NUMA emulation
       [not found] <20050930073232.10631.63786.sendpatchset@cherry.local>
@ 2005-09-30 15:23 ` Dave Hansen
  2005-10-03  2:08   ` Magnus Damm
  2005-10-03  3:21   ` Paul Jackson
       [not found] ` <20050930073258.10631.74982.sendpatchset@cherry.local>
       [not found] ` <20050930073308.10631.24247.sendpatchset@cherry.local>
  2 siblings, 2 replies; 30+ messages in thread
From: Dave Hansen @ 2005-09-30 15:23 UTC (permalink / raw)
  To: Magnus Damm; +Cc: linux-mm, Linux Kernel Mailing List

On Fri, 2005-09-30 at 16:33 +0900, Magnus Damm wrote:
> These patches implement NUMA memory node emulation for regular i386 PCs.
> 
> NUMA emulation could be used to provide coarse-grained memory resource control
> using CPUSETS. Another use is as a test environment for NUMA memory code or
> CPUSETS using an i386 emulator such as QEMU.

This patch set basically allows the "NUMA depends on SMP" dependency to
be removed.  I'm not sure this is the right approach.  There will likely
never be a real-world NUMA system without SMP.  So, this set would seem
to include some increased (#ifdef) complexity for supporting NUMA &&
!SMP, which will likely never happen in the real world.

Also, I worry that simply #ifdef'ing things out like CPUsets' update
means that CPUsets lacks some kind of abstraction that it should have
been using in the first place.  An #ifdef just papers over the real
problem.  

I think it would likely be cleaner if the approach was to emulate an SMP
NUMA system where each NUMA node simply doesn't have all of its CPUs
online.

-- Dave



* Re: [PATCH 05/07] i386: sparsemem on pc
       [not found] ` <20050930073258.10631.74982.sendpatchset@cherry.local>
@ 2005-09-30 15:25   ` Dave Hansen
  2005-10-01  0:32     ` Magnus Damm
  0 siblings, 1 reply; 30+ messages in thread
From: Dave Hansen @ 2005-09-30 15:25 UTC (permalink / raw)
  To: Magnus Damm; +Cc: linux-mm, Linux Kernel Mailing List

On Fri, 2005-09-30 at 16:33 +0900, Magnus Damm wrote:
> This patch enables and fixes sparsemem support on i386. This is the
> same patch that was sent to linux-kernel on September 6th, 2005, but this
> patch is forward-ported to apply on top of the patches written by Dave Hansen.

I'll post a more comprehensive way to do this in just a moment.  

	Subject: memhotplug testing: hack for flat systems

-- Dave



* Re: [PATCH 07/07] i386: numa emulation on pc
       [not found] ` <20050930073308.10631.24247.sendpatchset@cherry.local>
@ 2005-09-30 18:55   ` Dave Hansen
  2005-10-03  9:59     ` Magnus Damm
  2005-10-04  7:52   ` Hirokazu Takahashi
  1 sibling, 1 reply; 30+ messages in thread
From: Dave Hansen @ 2005-09-30 18:55 UTC (permalink / raw)
  To: Magnus Damm, Isaku Yamahata; +Cc: linux-mm, Linux Kernel Mailing List

On Fri, 2005-09-30 at 16:33 +0900, Magnus Damm wrote:
>  void __init nid_zone_sizes_init(int nid)
>  {
>  	unsigned long zones_size[MAX_NR_ZONES] = {0, 0, 0};
> -	unsigned long max_dma;
> +	unsigned long max_dma = min(max_hardware_dma_pfn(), max_low_pfn);
>  	unsigned long start = node_start_pfn[nid];
>  	unsigned long end = node_end_pfn[nid];
>  
>  	if (node_has_online_mem(nid)){
> -		if (nid_starts_in_highmem(nid)) {
> -			zones_size[ZONE_HIGHMEM] = nid_size_pages(nid);
> -		} else {
> -			max_dma = min(max_hardware_dma_pfn(), max_low_pfn);
> -			zones_size[ZONE_DMA] = max_dma;
> -			zones_size[ZONE_NORMAL] = max_low_pfn - max_dma;
> -			zones_size[ZONE_HIGHMEM] = end - max_low_pfn;
> +		if (start < max_dma) {
> +			zones_size[ZONE_DMA] = min(end, max_dma) - start;
> +		}
> +		if (start < max_low_pfn && max_dma < end) {
> +			zones_size[ZONE_NORMAL] = min(end, max_low_pfn) - max(start, max_dma);
> +		}
> +		if (max_low_pfn <= end) {
> +			zones_size[ZONE_HIGHMEM] = end - max(start, max_low_pfn);
>  		}
>  	}

That is a decent cleanup all by itself.  You might want to break it out.
Take a look at the patches I just sent out.  They do some similar things
to the same code.
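
For readers following the hunk above: the rewritten logic amounts to
clipping the node's pfn range [start, end) against the DMA and lowmem
boundaries. Below is a minimal standalone sketch in plain C, not kernel
code. The names node_zone_sizes(), min_ul() and max_ul() are made up for
illustration, and the sketch assumes start <= end and
max_dma <= max_low_pfn.

```c
#include <assert.h>

#define ZONE_DMA	0
#define ZONE_NORMAL	1
#define ZONE_HIGHMEM	2
#define MAX_NR_ZONES	3

/* Hypothetical helpers standing in for the kernel's min()/max() macros. */
unsigned long min_ul(unsigned long a, unsigned long b) { return a < b ? a : b; }
unsigned long max_ul(unsigned long a, unsigned long b) { return a > b ? a : b; }

/*
 * Split the pfn range [start, end) of one node across the three zones.
 * Pages below max_dma go to ZONE_DMA, pages in [max_dma, max_low_pfn)
 * go to ZONE_NORMAL, and everything at or above max_low_pfn is highmem.
 */
void node_zone_sizes(unsigned long start, unsigned long end,
		     unsigned long max_dma, unsigned long max_low_pfn,
		     unsigned long zones_size[MAX_NR_ZONES])
{
	/* Preconditions assumed by this sketch. */
	assert(start <= end);
	assert(max_dma <= max_low_pfn);

	zones_size[ZONE_DMA] = 0;
	zones_size[ZONE_NORMAL] = 0;
	zones_size[ZONE_HIGHMEM] = 0;

	if (start < max_dma)
		zones_size[ZONE_DMA] = min_ul(end, max_dma) - start;
	if (start < max_low_pfn && max_dma < end)
		zones_size[ZONE_NORMAL] = min_ul(end, max_low_pfn) -
					  max_ul(start, max_dma);
	if (max_low_pfn <= end)
		zones_size[ZONE_HIGHMEM] = end - max_ul(start, max_low_pfn);
}
```

As a worked case with assumed numbers (4 KiB pages, max_dma = 4096,
max_low_pfn = 229376, and 1 GB split into two emulated nodes of 131072
pages each): node 0 gets 4096 DMA pages and 126976 NORMAL pages, while
node 1 gets 98304 NORMAL pages and 32768 HIGHMEM pages.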

> @@ -1270,7 +1273,12 @@ void __init setup_bootmem_allocator(void
>  	/*
>  	 * Initialize the boot-time allocator (with low memory only):
>  	 */
> +#ifdef CONFIG_NUMA_EMU
> +	bootmap_size = init_bootmem(max(min_low_pfn, node_start_pfn[0]),
> +				    min(max_low_pfn, node_end_pfn[0]));
> +#else
>  	bootmap_size = init_bootmem(min_low_pfn, max_low_pfn);
> +#endif

This shouldn't be necessary.  Again, take a look at my discontig
separation patches and see if what I did works for you here.

>  	register_bootmem_low_pages(max_low_pfn);
>  
> --- from-0006/arch/i386/mm/numa.c
> +++ to-work/arch/i386/mm/numa.c	2005-09-28 17:49:53.000000000 +0900
> @@ -165,3 +165,103 @@ int early_pfn_to_nid(unsigned long pfn)
>  
>  	return 0;
>  }
> +
> +#ifdef CONFIG_NUMA_EMU
...
> +#endif

Ewwwwww :)  No real need to put new functions in a big #ifdef like that.
Can you just create a new file for NUMA emulation?

> --- from-0001/include/asm-i386/numnodes.h
> +++ to-work/include/asm-i386/numnodes.h	2005-09-28 17:49:53.000000000 +0900
> @@ -8,7 +8,7 @@
>  /* Max 16 Nodes */
>  #define NODES_SHIFT	4
>  
> -#elif defined(CONFIG_ACPI_SRAT)
> +#elif defined(CONFIG_ACPI_SRAT) || defined(CONFIG_NUMA_EMU)
>  
>  /* Max 8 Nodes */
>  #define NODES_SHIFT	3

Geez.  We should probably just do those in the Kconfig files.  Would
look much simpler.  But, that's a patch for another day.  This is fine
by itself.

-- Dave



* Re: [PATCH 05/07] i386: sparsemem on pc
  2005-09-30 15:25   ` [PATCH 05/07] i386: sparsemem on pc Dave Hansen
@ 2005-10-01  0:32     ` Magnus Damm
  0 siblings, 0 replies; 30+ messages in thread
From: Magnus Damm @ 2005-10-01  0:32 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Magnus Damm, linux-mm, Linux Kernel Mailing List

On 10/1/05, Dave Hansen <haveblue@us.ibm.com> wrote:
> On Fri, 2005-09-30 at 16:33 +0900, Magnus Damm wrote:
> > This patch enables and fixes sparsemem support on i386. This is the
> > same patch that was sent to linux-kernel on September 6th, 2005, but this
> > patch is forward-ported to apply on top of the patches written by Dave Hansen.
>
> I'll post a more comprehensive way to do this in just a moment.
>
>         Subject: memhotplug testing: hack for flat systems

Looks much better, will compile and test on Monday. Thanks.

/ magnus


* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-09-30 15:23 ` [PATCH 00/07][RFC] i386: NUMA emulation Dave Hansen
@ 2005-10-03  2:08   ` Magnus Damm
  2005-10-03  7:34     ` David Lang
  2005-10-03  3:21   ` Paul Jackson
  1 sibling, 1 reply; 30+ messages in thread
From: Magnus Damm @ 2005-10-03  2:08 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Magnus Damm, linux-mm, Linux Kernel Mailing List

On 10/1/05, Dave Hansen <haveblue@us.ibm.com> wrote:
> On Fri, 2005-09-30 at 16:33 +0900, Magnus Damm wrote:
> > These patches implement NUMA memory node emulation for regular i386 PCs.
> >
> > NUMA emulation could be used to provide coarse-grained memory resource control
> > using CPUSETS. Another use is as a test environment for NUMA memory code or
> > CPUSETS using an i386 emulator such as QEMU.
>
> This patch set basically allows the "NUMA depends on SMP" dependency to
> be removed.  I'm not sure this is the right approach.  There will likely
> never be a real-world NUMA system without SMP.  So, this set would seem
> to include some increased (#ifdef) complexity for supporting NUMA &&
> !SMP, which will likely never happen in the real world.

Yes, this patch set removes "NUMA depends on SMP". It also adds some
simple NUMA emulation code too, but I am sure you are aware of that!
=)

I agree that it is very unlikely to find a single-processor NUMA
system in the real world. So yes, "[PATCH 02/07] i386: numa on
non-smp" adds _some_ extra complexity. But because SMP is set when
supporting more than one cpu, and NUMA is set when supporting more
than one memory node, I see no reason why they should be dependent on
each other. Except that they depend on each other today and breaking
them loose will increase complexity a bit.

> Also, I worry that simply #ifdef'ing things out like CPUsets' update
> means that CPUsets lacks some kind of abstraction that it should have
> been using in the first place.  An #ifdef just papers over the real
> problem.

Maybe. CPUSETS has two bitmaps, one for cpus and one for mems. So
depending on SMP or NUMA seems logical to me. Regarding the #ifdef, it
was added because partition_sched_domain() is only implemented for
SMP. That symbol has no prototype or implementation when CONFIG_SMP is
not set. Maybe it is better to add an empty inline function in
linux/sched.h for !SMP?

> I think it would likely be cleaner if the approach was to emulate an SMP
> NUMA system where each NUMA node simply doesn't have all of its CPUs
> online.

Absolutely. And that removes the need for some of my patches. QEMU
runs SMP kernels. It is possible to run SMP kernels on UP hardware.
But there is of course a certain performance loss introduced by all
the SMP locks. I'd rather not force !SMP users to run SMP kernels if
they want coarse-grained memory resource control.

Thanks for your input!

/ magnus


* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-09-30 15:23 ` [PATCH 00/07][RFC] i386: NUMA emulation Dave Hansen
  2005-10-03  2:08   ` Magnus Damm
@ 2005-10-03  3:21   ` Paul Jackson
  2005-10-03  5:05     ` Magnus Damm
  1 sibling, 1 reply; 30+ messages in thread
From: Paul Jackson @ 2005-10-03  3:21 UTC (permalink / raw)
  To: Dave Hansen; +Cc: magnus, linux-mm, linux-kernel

Dave wrote:
> Also, I worry that simply #ifdef'ing things out like CPUsets' update
> means that CPUsets lacks some kind of abstraction that it should have
> been using in the first place. 

In the abstract, cpusets should just assume that the system has one or
more CPUs, and one or more Memory Nodes.  Ideally, it should not
require either SMP or NUMA.  Indeed, if you (Magnus) can get it
to compile with just one or the other of those two:

     config CPUSETS
	    bool "Cpuset support"
    -       depends on SMP
    +       depends on SMP || NUMA

then I would hope that it would compile with neither.  The cpuset
hierarchy on such a system would be rather boring, with all cpusets
having the same one CPU and one Memory Node, but it should work ... in
theory of course.

In practice of course, there may be details on the edges that depend on
the current SMP/NUMA limitations, such as:

Magnus wrote:
> Regarding the #ifdef, it
> was added because partition_sched_domain() is only implemented for
> SMP. That symbol has no prototype or implementation when CONFIG_SMP is
> not set. Maybe it is better to add an empty inline function in
> linux/sched.h for !SMP?

An empty inline partition_sched_domain() would be better than ifdef's
in cpuset.c, yes.  Or at least, that's usually the case.  Probably here
too.
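
A minimal sketch of that stub pattern, written as standalone C rather
than kernel code. The cpumask_t typedef, the two-argument signature, the
sched_domain_rebuilds counter and the update_cpu_domains() caller are
all assumptions made for illustration; only the function name comes from
this thread. The point is that the call site needs no #ifdef once the
header provides an empty inline for !SMP.

```c
/* Hypothetical stand-in for the kernel's cpumask_t; illustration only. */
typedef struct { unsigned long bits; } cpumask_t;

/* Counts how often the "real" implementation ran (illustration only). */
int sched_domain_rebuilds;

#ifdef CONFIG_SMP
/* SMP build: stand-in for the real implementation. */
void partition_sched_domain(cpumask_t *partition1, cpumask_t *partition2)
{
	(void)partition1;
	(void)partition2;
	sched_domain_rebuilds++;
}
#else
/* !SMP build: empty inline stub that the compiler can drop entirely. */
static inline void partition_sched_domain(cpumask_t *partition1,
					  cpumask_t *partition2)
{
	(void)partition1;
	(void)partition2;
}
#endif

/* Caller in the style of cpuset.c: no #ifdef needed at the call site. */
int update_cpu_domains(cpumask_t *cpus_a, cpumask_t *cpus_b)
{
	partition_sched_domain(cpus_a, cpus_b);
	return 0;
}
```

Built without -DCONFIG_SMP, the call in update_cpu_domains() resolves to
the empty stub and the counter stays at zero.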

In theory at least, I applaud Magnus's work here.  The asymmetry of the
SMP/NUMA define structure has always annoyed me slightly, and only been
explainable in my view as a consequence of the historical order of
development.  I had a PC with a second memory board in an ISA slot,
which would qualify as a one CPU, two Memory Node system.

Or, as for what might bite us in the future (that PC was a long time ago),
the kinks in the current setup might be a thorn in our side as we extend to
increasingly interesting architectures.

Aside - for those reading this thread on lkml, it originated
on linux-mm.  It looks like Dave added lkml to the cc list.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401


* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03  3:21   ` Paul Jackson
@ 2005-10-03  5:05     ` Magnus Damm
  2005-10-03  5:26       ` Hirokazu Takahashi
                         ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Magnus Damm @ 2005-10-03  5:05 UTC (permalink / raw)
  To: Paul Jackson; +Cc: Dave Hansen, magnus, linux-mm, linux-kernel

On 10/3/05, Paul Jackson <pj@sgi.com> wrote:
> Dave wrote:
> > Also, I worry that simply #ifdef'ing things out like CPUsets' update
> > means that CPUsets lacks some kind of abstraction that it should have
> > been using in the first place.
>
> In the abstract, cpusets should just assume that the system has one or
> more CPUs, and one or more Memory Nodes.  Ideally, it should not
> require either SMP or NUMA.  Indeed, if you (Magnus) can get it
> to compile with just one or the other of those two:
>
>      config CPUSETS
>             bool "Cpuset support"
>     -       depends on SMP
>     +       depends on SMP || NUMA
>
> then I would hope that it would compile with neither.  The cpuset
> hierarchy on such a system would be rather boring, with all cpusets
> having the same one CPU and one Memory Node, but it should work ... in
> theory of course.

I just tested this on top of my patches:
@@ -245,7 +245,6 @@ config IKCONFIG_PROC

 config CPUSETS
        bool "Cpuset support"
-       depends on SMP || NUMA
        help

and it seems to work OK in practice too. On a regular !SMP !NUMA PC
anyway. As you note, the hierarchy is not that exciting. =) Anyway,
both "SMP || NUMA" and no dependency at all seem to work. After
partition_sched_domain() gets fixed, that is.

> In practice of course, there may be details on the edges that depend on
> the current SMP/NUMA limitations, such as:
>
> Magnus wrote:
> > Regarding the #ifdef, it
> > was added because partition_sched_domain() is only implemented for
> > SMP. That symbol has no prototype or implementation when CONFIG_SMP is
> > not set. Maybe it is better to add an empty inline function in
> > linux/sched.h for !SMP?
>
> An empty inline partition_sched_domain() would be better than ifdef's
> in cpuset.c, yes.  Or at least, that's usually the case.  Probably here
> too.

I agree.

> In theory at least, I applaud Magnus's work here.  The asymmetry of the
> SMP/NUMA define structure has always annoyed me slightly, and only been
> explainable in my view as a consequence of the historical order of
> development.  I had a PC with a second memory board in an ISA slot,
> which would qualify as a one CPU, two Memory Node system.
>
> Or, as for what might bite us in the future (that PC was a long time ago),
> the kinks in the current setup might be a thorn in our side as we extend to
> increasingly interesting architectures.

Nice to hear that you like the idea.

Maybe I should have broken down my patches into three smaller sets:

1) i386: NUMA without SMP
2) CPUSETS: NUMA || SMP
3) i386: NUMA emulation

If people like 1) then it's probably a good idea to convert other
architectures too. Both 2) and 3) above are separate but related
issues. And now seems like a good time to solve 2).

So, Paul, please let me know if you prefer SMP || NUMA or no
dependencies in the Kconfig. Once I know, I will create a new
patch that hopefully can get into -mm later on.

> Aside - for those reading this thread on lkml, it originated
> on linux-mm.  It looks like Dave added lkml to the cc list.

Huh? I sent my patches both to lkml and linux-mm...

Thank you for the feedback!

/ magnus


* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03  5:05     ` Magnus Damm
@ 2005-10-03  5:26       ` Hirokazu Takahashi
  2005-10-03  5:33       ` Paul Jackson
  2005-10-03  5:34       ` Paul Jackson
  2 siblings, 0 replies; 30+ messages in thread
From: Hirokazu Takahashi @ 2005-10-03  5:26 UTC (permalink / raw)
  To: pj; +Cc: magnus.damm, haveblue, magnus, linux-mm, linux-kernel

Hi,

> > In theory at least, I applaud Magnus's work here.  The asymmetry of the
> > SMP/NUMA define structure has always annoyed me slightly, and only been
> > explainable in my view as a consequence of the historical order of
> > development.  I had a PC with a second memory board in an ISA slot,
> > which would qualify as a one CPU, two Memory Node system.
> >
> > Or, as for what might bite us in the future (that PC was a long time ago),
> > the kinks in the current setup might be a thorn in our side as we extend to
> > increasingly interesting architectures.
> 
> Nice to hear that you like the idea.
> 
> Maybe I should have broken down my patches into three smaller sets:
> 
> 1) i386: NUMA without SMP
> 2) CPUSETS: NUMA || SMP
> 3) i386: NUMA emulation
> 
> If people like 1) then it's probably a good idea to convert other
> architectures too. Both 2) and 3) above are separate but related
> issues. And now seems like a good time to solve 2).
> 
> So, Paul, please let me know if you prefer SMP || NUMA or no
> dependencies in the Kconfig. Once I know, I will create a new
> patch that hopefully can get into -mm later on.

The latter seems like a good idea to me if you're going to enhance CPUSETS
to make it suitable for CPUMETER or something like that.

Thanks.


* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03  5:05     ` Magnus Damm
  2005-10-03  5:26       ` Hirokazu Takahashi
@ 2005-10-03  5:33       ` Paul Jackson
  2005-10-03  5:59         ` Magnus Damm
  2005-10-03  5:34       ` Paul Jackson
  2 siblings, 1 reply; 30+ messages in thread
From: Paul Jackson @ 2005-10-03  5:33 UTC (permalink / raw)
  To: Magnus Damm; +Cc: haveblue, magnus, linux-mm, linux-kernel

Magnus wrote:
> So, Paul, please let me know if you prefer SMP || NUMA or no
> dependencies in the Kconfig.

In theory, I prefer none.  But the devil is in the details here,
and I really don't care that much.

So pick whichever you prefer, or whichever provides the nicest
looking code or patch, or flip a coin ;).

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401


* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03  5:05     ` Magnus Damm
  2005-10-03  5:26       ` Hirokazu Takahashi
  2005-10-03  5:33       ` Paul Jackson
@ 2005-10-03  5:34       ` Paul Jackson
  2 siblings, 0 replies; 30+ messages in thread
From: Paul Jackson @ 2005-10-03  5:34 UTC (permalink / raw)
  To: Magnus Damm; +Cc: haveblue, magnus, linux-mm, linux-kernel

Magnus wrote:
> I sent my patches both to lkml and linux-mm...

Must be confusion on my end then.  Sorry.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401


* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03  5:33       ` Paul Jackson
@ 2005-10-03  5:59         ` Magnus Damm
  2005-10-03  7:26           ` Paul Jackson
  0 siblings, 1 reply; 30+ messages in thread
From: Magnus Damm @ 2005-10-03  5:59 UTC (permalink / raw)
  To: Paul Jackson; +Cc: haveblue, magnus, linux-mm, linux-kernel

On 10/3/05, Paul Jackson <pj@sgi.com> wrote:
> Magnus wrote:
> > So, Paul, please let me know if you prefer SMP || NUMA or no
> > dependencies in the Kconfig.
>
> In theory, I prefer none.  But the devil is in the details here,
> and I really don't care that much.
>
> So pick whichever you prefer, or whichever provides the nicest
> looking code or patch, or flip a coin ;).

I'm tempted to consult the magic eight-ball, but I think I will stick
with the advice from Takahashi-san instead. =) So, the dependency will
be removed.

/ magnus


* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03  5:59         ` Magnus Damm
@ 2005-10-03  7:26           ` Paul Jackson
  0 siblings, 0 replies; 30+ messages in thread
From: Paul Jackson @ 2005-10-03  7:26 UTC (permalink / raw)
  To: Magnus Damm; +Cc: haveblue, magnus, linux-mm, linux-kernel

Magnus wrote:
> I think I will stick with the advice from Takahashi-san

Yes - Takahashi-san gives much better advice than an eight-ball.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401


* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03  2:08   ` Magnus Damm
@ 2005-10-03  7:34     ` David Lang
  2005-10-03 10:02       ` Magnus Damm
  2005-10-03 14:45       ` Martin J. Bligh
  0 siblings, 2 replies; 30+ messages in thread
From: David Lang @ 2005-10-03  7:34 UTC (permalink / raw)
  To: Magnus Damm; +Cc: Dave Hansen, Magnus Damm, linux-mm, Linux Kernel Mailing List

On Mon, 3 Oct 2005, Magnus Damm wrote:

> On 10/1/05, Dave Hansen <haveblue@us.ibm.com> wrote:
>> On Fri, 2005-09-30 at 16:33 +0900, Magnus Damm wrote:
>>> These patches implement NUMA memory node emulation for regular i386 PCs.
>>>
>>> NUMA emulation could be used to provide coarse-grained memory resource control
>>> using CPUSETS. Another use is as a test environment for NUMA memory code or
>>> CPUSETS using an i386 emulator such as QEMU.
>>
>> This patch set basically allows the "NUMA depends on SMP" dependency to
>> be removed.  I'm not sure this is the right approach.  There will likely
>> never be a real-world NUMA system without SMP.  So, this set would seem
>> to include some increased (#ifdef) complexity for supporting NUMA &&
>> !SMP, which will likely never happen in the real world.
>
> Yes, this patch set removes "NUMA depends on SMP". It also adds some
> simple NUMA emulation code too, but I am sure you are aware of that!
> =)
>
> I agree that it is very unlikely to find a single-processor NUMA
> system in the real world. So yes, "[PATCH 02/07] i386: numa on
> non-smp" adds _some_ extra complexity. But because SMP is set when
> supporting more than one cpu, and NUMA is set when supporting more
> than one memory node, I see no reason why they should be dependent on
> each other. Except that they depend on each other today and breaking
> them loose will increase complexity a bit.

hmm, observation from the peanut gallery: would it make sense to look at
using the NUMA code on single-processor machines that use PAE to access more
than 4G of RAM on a 32-bit system?

David Lang


* Re: [PATCH 07/07] i386: numa emulation on pc
  2005-09-30 18:55   ` [PATCH 07/07] i386: numa emulation " Dave Hansen
@ 2005-10-03  9:59     ` Magnus Damm
  2005-10-03 16:16       ` Dave Hansen
  0 siblings, 1 reply; 30+ messages in thread
From: Magnus Damm @ 2005-10-03  9:59 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Magnus Damm, Isaku Yamahata, linux-mm, Linux Kernel Mailing List

Hi again Dave,

On 10/1/05, Dave Hansen <haveblue@us.ibm.com> wrote:
> On Fri, 2005-09-30 at 16:33 +0900, Magnus Damm wrote:
> >  void __init nid_zone_sizes_init(int nid)
> >  {
> >       unsigned long zones_size[MAX_NR_ZONES] = {0, 0, 0};
> > -     unsigned long max_dma;
> > +     unsigned long max_dma = min(max_hardware_dma_pfn(), max_low_pfn);
> >       unsigned long start = node_start_pfn[nid];
> >       unsigned long end = node_end_pfn[nid];
> >
> >       if (node_has_online_mem(nid)){
> > -             if (nid_starts_in_highmem(nid)) {
> > -                     zones_size[ZONE_HIGHMEM] = nid_size_pages(nid);
> > -             } else {
> > -                     max_dma = min(max_hardware_dma_pfn(), max_low_pfn);
> > -                     zones_size[ZONE_DMA] = max_dma;
> > -                     zones_size[ZONE_NORMAL] = max_low_pfn - max_dma;
> > -                     zones_size[ZONE_HIGHMEM] = end - max_low_pfn;
> > +             if (start < max_dma) {
> > +                     zones_size[ZONE_DMA] = min(end, max_dma) - start;
> > +             }
> > +             if (start < max_low_pfn && max_dma < end) {
> > +                     zones_size[ZONE_NORMAL] = min(end, max_low_pfn) - max(start, max_dma);
> > +             }
> > +             if (max_low_pfn <= end) {
> > +                     zones_size[ZONE_HIGHMEM] = end - max(start, max_low_pfn);
> >               }
> >       }
>
> That is a decent cleanup all by itself.  You might want to break it out.
> Take a look at the patches I just sent out.  They do some similar things
> to the same code.

Break it out, sure! I'm not sure which patch to look at, though.

> > @@ -1270,7 +1273,12 @@ void __init setup_bootmem_allocator(void
> >       /*
> >        * Initialize the boot-time allocator (with low memory only):
> >        */
> > +#ifdef CONFIG_NUMA_EMU
> > +     bootmap_size = init_bootmem(max(min_low_pfn, node_start_pfn[0]),
> > +                                 min(max_low_pfn, node_end_pfn[0]));
> > +#else
> >       bootmap_size = init_bootmem(min_low_pfn, max_low_pfn);
> > +#endif
>
> This shouldn't be necessary.  Again, take a look at my discontig
> separation patches and see if what I did works for you here.

Do you mean "discontig-consolidate0.patch"? Maybe I'm misunderstanding.

> > +#ifdef CONFIG_NUMA_EMU
> ...
> > +#endif
>
> Ewwwwww :)  No real need to put new functions in a big #ifdef like that.
> Can you just create a new file for NUMA emulation?

Hehe, what is this, a beauty contest? =) I agree, but I guess the
reason for this code to be here is that a similar arrangement is done
by x86_64...

I will create a new file. Is arch/i386/mm/numa_emu.c good?

> > --- from-0001/include/asm-i386/numnodes.h
> > +++ to-work/include/asm-i386/numnodes.h       2005-09-28 17:49:53.000000000 +0900
> > @@ -8,7 +8,7 @@
> >  /* Max 16 Nodes */
> >  #define NODES_SHIFT  4
> >
> > -#elif defined(CONFIG_ACPI_SRAT)
> > +#elif defined(CONFIG_ACPI_SRAT) || defined(CONFIG_NUMA_EMU)
> >
> >  /* Max 8 Nodes */
> >  #define NODES_SHIFT  3
>
> Geez.  We should probably just do those in the Kconfig files.  Would
> look much simpler.  But, that's a patch for another day.  This is fine
> by itself.

No biggie, I will add a config option.

But first, you have written lots and lots of patches, and I am
confused. Could you please tell me on which patches I should base my
code to make things as easy as possible?

Many thanks,

/ magnus


* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03  7:34     ` David Lang
@ 2005-10-03 10:02       ` Magnus Damm
  2005-10-03 13:33         ` David Lang
  2005-10-03 14:45       ` Martin J. Bligh
  1 sibling, 1 reply; 30+ messages in thread
From: Magnus Damm @ 2005-10-03 10:02 UTC (permalink / raw)
  To: David Lang; +Cc: Dave Hansen, Magnus Damm, linux-mm, Linux Kernel Mailing List

On 10/3/05, David Lang <david.lang@digitalinsight.com> wrote:
> On Mon, 3 Oct 2005, Magnus Damm wrote:
>
> > On 10/1/05, Dave Hansen <haveblue@us.ibm.com> wrote:
> >> On Fri, 2005-09-30 at 16:33 +0900, Magnus Damm wrote:
> >>> These patches implement NUMA memory node emulation for regular i386 PCs.
> >>>
> >>> NUMA emulation could be used to provide coarse-grained memory resource control
> >>> using CPUSETS. Another use is as a test environment for NUMA memory code or
> >>> CPUSETS using an i386 emulator such as QEMU.
> >>
> >> This patch set basically allows the "NUMA depends on SMP" dependency to
> >> be removed.  I'm not sure this is the right approach.  There will likely
> >> never be a real-world NUMA system without SMP.  So, this set would seem
> >> to include some increased (#ifdef) complexity for supporting NUMA &&
> >> !SMP, which will likely never happen in the real world.
> >
> > Yes, this patch set removes "NUMA depends on SMP". It also adds some
> > simple NUMA emulation code too, but I am sure you are aware of that!
> > =)
> >
> > I agree that it is very unlikely to find a single-processor NUMA
> > system in the real world. So yes, "[PATCH 02/07] i386: numa on
> > non-smp" adds _some_ extra complexity. But because SMP is set when
> > supporting more than one cpu, and NUMA is set when supporting more
> > than one memory node, I see no reason why they should be dependent on
> > each other. Except that they depend on each other today and breaking
> > them loose will increase complexity a bit.
>
> hmm, observation from the peanut gallery: would it make sense to look at
> using the NUMA code on single-processor machines that use PAE to access more
> than 4G of RAM on a 32-bit system?

Hm, maybe? =) What would you like to accomplish by that?

/ magnus


* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03 10:02       ` Magnus Damm
@ 2005-10-03 13:33         ` David Lang
  2005-10-03 14:59           ` Martin J. Bligh
  0 siblings, 1 reply; 30+ messages in thread
From: David Lang @ 2005-10-03 13:33 UTC (permalink / raw)
  To: Magnus Damm; +Cc: Dave Hansen, Magnus Damm, linux-mm, Linux Kernel Mailing List

On Mon, 3 Oct 2005, Magnus Damm wrote:

> Date: Mon, 3 Oct 2005 19:02:08 +0900
> From: Magnus Damm <magnus.damm@gmail.com>
> To: David Lang <david.lang@digitalinsight.com>
> Cc: Dave Hansen <haveblue@us.ibm.com>, Magnus Damm <magnus@valinux.co.jp>,
>     linux-mm <linux-mm@kvack.org>,
>     Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
> Subject: Re: [PATCH 00/07][RFC] i386: NUMA emulation
> 
> On 10/3/05, David Lang <david.lang@digitalinsight.com> wrote:
>> On Mon, 3 Oct 2005, Magnus Damm wrote:
>>
>>> On 10/1/05, Dave Hansen <haveblue@us.ibm.com> wrote:
>>>> On Fri, 2005-09-30 at 16:33 +0900, Magnus Damm wrote:
>>>>> These patches implement NUMA memory node emulation for regular i386 PCs.
>>>>>
>>>>> NUMA emulation could be used to provide coarse-grained memory resource control
>>>>> using CPUSETS. Another use is as a test environment for NUMA memory code or
>>>>> CPUSETS using an i386 emulator such as QEMU.
>>>>
>>>> This patch set basically allows the "NUMA depends on SMP" dependency to
>>>> be removed.  I'm not sure this is the right approach.  There will likely
>>>> never be a real-world NUMA system without SMP.  So, this set would seem
>>>> to include some increased (#ifdef) complexity for supporting NUMA &&
>>>> !SMP, which will likely never happen in the real world.
>>>
>>> Yes, this patch set removes "NUMA depends on SMP". It also adds some
>>> simple NUMA emulation code too, but I am sure you are aware of that!
>>> =)
>>>
>>> I agree that it is very unlikely to find a single-processor NUMA
>>> system in the real world. So yes, "[PATCH 02/07] i386: numa on
>>> non-smp" adds _some_ extra complexity. But because SMP is set when
>>> supporting more than one cpu, and NUMA is set when supporting more
>>> than one memory node, I see no reason why they should be dependent on
>>> each other. Except that they depend on each other today and breaking
>>> them loose will increase complexity a bit.
>>
>> hmm, observation from the peanut gallery: would it make sense to look at
>> using the NUMA code on single-processor machines that use PAE to access more
>> than 4G of RAM on a 32-bit system?
>
> Hm, maybe? =) What would you like to accomplish by that?

if nothing else, preferential use of 'local' (non-PAE) memory over 'remote'
(PAE) memory for programs, while still using it all as needed.

this may be done already, but this type of difference between the access 
speed of different chunks of ram seems to be exactly the type of thing 
that the NUMA code solves the general case for. I'm thinking that it may 
end up simplifying things if the same general-purpose logic will work for 
the specific case of PAE instead of it being hard coded as a special case.

it also just struck me as the most obvious example of where a UP box could
have a NUMA-like memory arrangement (and therefore a case to justify
decoupling the SMP and NUMA options)

David Lang

> / magnus
>

-- 
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
  -- C.A.R. Hoare


* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03  7:34     ` David Lang
  2005-10-03 10:02       ` Magnus Damm
@ 2005-10-03 14:45       ` Martin J. Bligh
  2005-10-03 14:49         ` David Lang
  1 sibling, 1 reply; 30+ messages in thread
From: Martin J. Bligh @ 2005-10-03 14:45 UTC (permalink / raw)
  To: David Lang, Magnus Damm
  Cc: Dave Hansen, Magnus Damm, linux-mm, Linux Kernel Mailing List

--David Lang <david.lang@digitalinsight.com> wrote (on Monday, October 03, 2005 00:34:40 -0700):

> On Mon, 3 Oct 2005, Magnus Damm wrote:
> 
>> On 10/1/05, Dave Hansen <haveblue@us.ibm.com> wrote:
>>> On Fri, 2005-09-30 at 16:33 +0900, Magnus Damm wrote:
>>>> These patches implement NUMA memory node emulation for regular i386 PC:s.
>>>> 
>>>> NUMA emulation could be used to provide coarse-grained memory resource control
>>>> using CPUSETS. Another use is as a test environment for NUMA memory code or
>>>> CPUSETS using an i386 emulator such as QEMU.
>>> 
>>> This patch set basically allows the "NUMA depends on SMP" dependency to
>>> be removed.  I'm not sure this is the right approach.  There will likely
>>> never be a real-world NUMA system without SMP.  So, this set would seem
>>> to include some increased (#ifdef) complexity for supporting SMP && !
>>> NUMA, which will likely never happen in the real world.
>> 
>> Yes, this patch set removes "NUMA depends on SMP". It also adds some
>> simple NUMA emulation code too, but I am sure you are aware of that!
>> =)
>> 
>> I agree that it is very unlikely to find a single-processor NUMA
>> system in the real world. So yes, "[PATCH 02/07] i386: numa on
>> non-smp" adds _some_ extra complexity. But because SMP is set when
>> supporting more than one cpu, and NUMA is set when supporting more
>> than one memory node, I see no reason why they should be dependent on
>> each other. Except that they depend on each other today and breaking
>> them loose will increase complexity a bit.
> 
> hmm, observation from the peanut gallery, would it make sense to look at
> using the NUMA code on single proc machines that use PAE to access
> more than 4G of ram on a 32 bit system?

2 problems:

1) there aren't any ;-)
2) The memory is not physically differently separated from the CPUs, so
it's not NUMA.

M.



* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03 14:45       ` Martin J. Bligh
@ 2005-10-03 14:49         ` David Lang
  0 siblings, 0 replies; 30+ messages in thread
From: David Lang @ 2005-10-03 14:49 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Magnus Damm, Dave Hansen, Magnus Damm, linux-mm,
	Linux Kernel Mailing List

On Mon, 3 Oct 2005, Martin J. Bligh wrote:

>>> I agree that it is very unlikely to find a single-processor NUMA
>>> system in the real world. So yes, "[PATCH 02/07] i386: numa on
>>> non-smp" adds _some_ extra complexity. But because SMP is set when
>>> supporting more than one cpu, and NUMA is set when supporting more
>>> than one memory node, I see no reason why they should be dependent on
>>> each other. Except that they depend on each other today and breaking
>>> them loose will increase complexity a bit.
>>
>> hmm, observation from the peanut gallery, would it make sense to look at
>> using the NUMA code on single proc machines that use PAE to access
>> more than 4G of ram on a 32 bit system?
>
> 2 problems:
>
> 1) there aren't any ;-)
> 2) The memory is not physically differently separated from the CPUs, so
> it's not NUMA.

even though it's not physically differently separated from the CPU(s),
doesn't its differing performance amount to the same thing?

David Lang



* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03 13:33         ` David Lang
@ 2005-10-03 14:59           ` Martin J. Bligh
  2005-10-03 15:03             ` David Lang
  0 siblings, 1 reply; 30+ messages in thread
From: Martin J. Bligh @ 2005-10-03 14:59 UTC (permalink / raw)
  To: David Lang, Magnus Damm
  Cc: Dave Hansen, Magnus Damm, linux-mm, Linux Kernel Mailing List

> if nothing else, preferential use of 'local' (non PAE) memory over
> 'remote' (PAE) memory for programs, while still using it all as needed.

Why would you want to do that? ;-)

> this may be done already, but this type of difference between the access 
> speed of different chunks of ram seems to be exactly the type of thing 
> that the NUMA code solves the general case for.

It is! 

> I'm thinking that it 
> may end up simplifying things if the same general-purpose logic will 
> work for the specific case of PAE instead of it being hard coded as 
> a special case.

But that's not the same at all! ;-) PAE memory is the same speed as
the other stuff. You just have a 3rd level of pagetables for everything.
One could (correctly) argue it made *all* memory slower, but it does so
in a uniform fashion.
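
To make the "3rd level of pagetables" point concrete, here is a minimal
sketch of how a 32-bit linear address is split under i386 PAE paging with
4 KiB pages (2 + 9 + 9 + 12 bits). The struct and function names are made
up for illustration and are not kernel identifiers. The walk has the same
depth for every address, whichever physical page the last entry points
at, which is why the extra cost applies uniformly.

```c
#include <stdint.h>

/* Index fields of a 32-bit linear address under PAE paging, 4 KiB pages. */
struct pae_indices {
	unsigned int pdpt;   /* bits 31..30: page-directory-pointer table (4 entries) */
	unsigned int pd;     /* bits 29..21: page directory (512 entries) */
	unsigned int pt;     /* bits 20..12: page table (512 entries) */
	unsigned int offset; /* bits 11..0:  byte offset inside the page */
};

/* Hypothetical helper: split a linear address into its three table
 * indices plus the page offset. */
static struct pae_indices pae_split(uint32_t vaddr)
{
	struct pae_indices ix;

	ix.pdpt   = (vaddr >> 30) & 0x3;
	ix.pd     = (vaddr >> 21) & 0x1ff;
	ix.pt     = (vaddr >> 12) & 0x1ff;
	ix.offset = vaddr & 0xfff;
	return ix;
}
```

For example, pae_split(0xC0101234) gives pdpt = 3, pd = 0, pt = 0x101 and
offset = 0x234.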

M.



* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03 14:59           ` Martin J. Bligh
@ 2005-10-03 15:03             ` David Lang
  2005-10-03 15:08               ` Martin J. Bligh
  0 siblings, 1 reply; 30+ messages in thread
From: David Lang @ 2005-10-03 15:03 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Magnus Damm, Dave Hansen, Magnus Damm, linux-mm,
	Linux Kernel Mailing List

On Mon, 3 Oct 2005, Martin J. Bligh wrote:

> But that's not the same at all! ;-) PAE memory is the same speed as
> the other stuff. You just have a 3rd level of pagetables for everything.
> One could (correctly) argue it made *all* memory slower, but it does so
> in a uniform fashion.

is it? I've seen during the memory self-test at boot that machines slow
down noticeably as they pass the 4G mark.

David Lang



* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03 15:03             ` David Lang
@ 2005-10-03 15:08               ` Martin J. Bligh
  2005-10-03 15:13                 ` David Lang
  0 siblings, 1 reply; 30+ messages in thread
From: Martin J. Bligh @ 2005-10-03 15:08 UTC (permalink / raw)
  To: David Lang
  Cc: Magnus Damm, Dave Hansen, Magnus Damm, linux-mm,
	Linux Kernel Mailing List



--David Lang <david.lang@digitalinsight.com> wrote (on Monday, October 03, 2005 08:03:44 -0700):

> On Mon, 3 Oct 2005, Martin J. Bligh wrote:
> 
>> But that's not the same at all! ;-) PAE memory is the same speed as
>> the other stuff. You just have a 3rd level of pagetables for everything.
>> One could (correctly) argue it made *all* memory slower, but it does so
>> in a uniform fashion.
> 
> is it? I've seen during the memory self-test at boot that machines slow down noticeably as they pass the 4G mark.

Not noticed that, and I can't see why it should be the case in general,
though I suppose some machines might be odd. Got any numbers?

M.



* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03 15:08               ` Martin J. Bligh
@ 2005-10-03 15:13                 ` David Lang
  2005-10-03 15:25                   ` Martin J. Bligh
  0 siblings, 1 reply; 30+ messages in thread
From: David Lang @ 2005-10-03 15:13 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Magnus Damm, Dave Hansen, Magnus Damm, linux-mm,
	Linux Kernel Mailing List

On Mon, 3 Oct 2005, Martin J. Bligh wrote:

> --David Lang <david.lang@digitalinsight.com> wrote (on Monday, October 03, 2005 08:03:44 -0700):
>
>> On Mon, 3 Oct 2005, Martin J. Bligh wrote:
>>
>>> But that's not the same at all! ;-) PAE memory is the same speed as
>>> the other stuff. You just have a 3rd level of pagetables for everything.
>>> One could (correctly) argue it made *all* memory slower, but it does so
>>> in a uniform fashion.
>>
>> is it? I've seen during the memory self-test at boot that machines slow down noticeably as they pass the 4G mark.
>
> Not noticed that, and I can't see why it should be the case in general,
> though I suppose some machines might be odd. Got any numbers?

just the fact that the system boot memory test takes 3-4 times as long
with 8G of ram than with 4G of ram. I then boot a 64 bit kernel on the
system and never use PAE mode again :-)

if you can point me at a utility that will test the speed of the memory in
different chunks I'll do some testing on the Opteron systems I have
available. unfortunately I don't have any Xeon systems to test this on.
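
A rough sketch of the kind of per-chunk timing asked for here, using only
standard C. This is not an existing tool, and time_chunks() is a made-up
name. Two caveats limit what it can show: user space only sees virtual
addresses, so nothing guarantees which chunk lands above the 4G physical
mark, and clock() reports coarse CPU time that also includes the
first-touch page faults.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper: allocate nchunks * chunk_bytes, write every byte of
 * each chunk once, and store the CPU time spent on chunk i in seconds[i].
 * Returns 0 on success, -1 on bad sizes or a failed allocation.
 *
 * The kernel picks the physical pages, so chunk i is not guaranteed to
 * sit below or above any particular physical address.
 */
static int time_chunks(size_t chunk_bytes, size_t nchunks, double *seconds)
{
	volatile unsigned char *buf;
	size_t i, j;

	if (chunk_bytes == 0 || nchunks == 0 ||
	    chunk_bytes > SIZE_MAX / nchunks)
		return -1;

	buf = malloc(chunk_bytes * nchunks);
	if (!buf)
		return -1;

	for (i = 0; i < nchunks; i++) {
		volatile unsigned char *p = buf + i * chunk_bytes;
		clock_t t0 = clock();

		/* volatile stores keep the compiler from dropping the writes */
		for (j = 0; j < chunk_bytes; j++)
			p[j] = (unsigned char)j;

		seconds[i] = (double)(clock() - t0) / CLOCKS_PER_SEC;
	}

	free((void *)buf);
	return 0;
}
```

Comparing physical ranges properly would need control over page placement
(for instance from inside the kernel), which this sketch does not attempt.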

David Lang



* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03 15:13                 ` David Lang
@ 2005-10-03 15:25                   ` Martin J. Bligh
  2005-10-03 15:32                     ` David Lang
  0 siblings, 1 reply; 30+ messages in thread
From: Martin J. Bligh @ 2005-10-03 15:25 UTC (permalink / raw)
  To: David Lang
  Cc: Magnus Damm, Dave Hansen, Magnus Damm, linux-mm,
	Linux Kernel Mailing List



--David Lang <david.lang@digitalinsight.com> wrote (on Monday, October 03, 2005 08:13:09 -0700):

> On Mon, 3 Oct 2005, Martin J. Bligh wrote:
> 
>> --David Lang <david.lang@digitalinsight.com> wrote (on Monday, October 03, 2005 08:03:44 -0700):
>> 
>>> On Mon, 3 Oct 2005, Martin J. Bligh wrote:
>>> 
>>>> But that's not the same at all! ;-) PAE memory is the same speed as
>>>> the other stuff. You just have a 3rd level of pagetables for everything.
>>>> One could (correctly) argue it made *all* memory slower, but it does so
>>>> in a uniform fashion.
>>> 
>>> is it? I've seen during the memory self-test at boot that machines slow down noticeably as they pass the 4G mark.
>> 
>> Not noticed that, and I can't see why it should be the case in general,
>> though I suppose some machines might be odd. Got any numbers?
> 
> just the fact that the system boot memory test takes 3-4 times as long with 8G of ram than with 4G of ram. I then boot a 64 bit kernel on the system and never use PAE mode again :-)
> 
> if you can point me at a utility that will test the speed of the memory in different chunks I'll do some testing on the Opteron systems I have available. unfortunately I don't have any Xeon systems to test this on.

Mmm. 64-bit uniproc systems, with > 4GB of RAM, running a 32 bit kernel
don't really strike me as a huge market segment ;-)

M.



* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03 15:25                   ` Martin J. Bligh
@ 2005-10-03 15:32                     ` David Lang
  2005-10-03 15:54                       ` Martin J. Bligh
  0 siblings, 1 reply; 30+ messages in thread
From: David Lang @ 2005-10-03 15:32 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Magnus Damm, Dave Hansen, Magnus Damm, linux-mm,
	Linux Kernel Mailing List

On Mon, 3 Oct 2005, Martin J. Bligh wrote:

> --David Lang <david.lang@digitalinsight.com> wrote (on Monday, October 03, 2005 08:13:09 -0700):
>
>> On Mon, 3 Oct 2005, Martin J. Bligh wrote:
>>
>>> --David Lang <david.lang@digitalinsight.com> wrote (on Monday, October 03, 2005 08:03:44 -0700):
>>>
>>>> On Mon, 3 Oct 2005, Martin J. Bligh wrote:
>>>>
>>>>> But that's not the same at all! ;-) PAE memory is the same speed as
>>>>> the other stuff. You just have a 3rd level of pagetables for everything.
>>>>> One could (correctly) argue it made *all* memory slower, but it does so
>>>>> in a uniform fashion.
>>>>
>>>> is it? I've seen during the memory self-test at boot that machines slow down noticeably as they pass the 4G mark.
>>>
>>> Not noticed that, and I can't see why it should be the case in general,
>>> though I suppose some machines might be odd. Got any numbers?
>>
>> just the fact that the system boot memory test takes 3-4 times as long with 8G of ram than with 4G of ram. I then boot a 64 bit kernel on the system and never use PAE mode again :-)
>>
>> if you can point me at a utility that will test the speed of the memory in different chunks I'll do some testing on the Opteron systems I have available. unfortunately I don't have any Xeon systems to test this on.
>
> Mmm. 64-bit uniproc systems, with > 4GB of RAM, running a 32 bit kernel
> don't really strike me as a huge market segment ;-)

true, but there are a lot of 32-bit uniproc systems sold by Intel that
have (or can have) more than 4G of ram. These are the machines I was
thinking of.

David Lang



* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03 15:32                     ` David Lang
@ 2005-10-03 15:54                       ` Martin J. Bligh
  2005-10-03 16:44                         ` David Lang
  0 siblings, 1 reply; 30+ messages in thread
From: Martin J. Bligh @ 2005-10-03 15:54 UTC (permalink / raw)
  To: David Lang
  Cc: Magnus Damm, Dave Hansen, Magnus Damm, linux-mm,
	Linux Kernel Mailing List



--David Lang <david.lang@digitalinsight.com> wrote (on Monday, October 03, 2005 08:32:47 -0700):

> On Mon, 3 Oct 2005, Martin J. Bligh wrote:
> 
>> --David Lang <david.lang@digitalinsight.com> wrote (on Monday, October 03, 2005 08:13:09 -0700):
>> 
>>> On Mon, 3 Oct 2005, Martin J. Bligh wrote:
>>> 
>>>> --David Lang <david.lang@digitalinsight.com> wrote (on Monday, October 03, 2005 08:03:44 -0700):
>>>> 
>>>>> On Mon, 3 Oct 2005, Martin J. Bligh wrote:
>>>>> 
>>>>>> But that's not the same at all! ;-) PAE memory is the same speed as
>>>>>> the other stuff. You just have a 3rd level of pagetables for everything.
>>>>>> One could (correctly) argue it made *all* memory slower, but it does so
>>>>>> in a uniform fashion.
>>>>> 
>>>>> is it? I've seen during the memory self-test at boot that machines slow down noticeably as they pass the 4G mark.
>>>> 
>>>> Not noticed that, and I can't see why it should be the case in general,
>>>> though I suppose some machines might be odd. Got any numbers?
>>> 
>>> just the fact that the system boot memory test takes 3-4 times as long with 8G of ram than with 4G of ram. I then boot a 64 bit kernel on the system and never use PAE mode again :-)
>>> 
>>> if you can point me at a utility that will test the speed of the memory in different chunks I'll do some testing on the Opteron systems I have available. unfortunately I don't have any Xeon systems to test this on.
>> 
>> Mmm. 64-bit uniproc systems, with > 4GB of RAM, running a 32 bit kernel
>> don't really strike me as a huge market segment ;-)
> 
> true, but there are a lot of 32-bit uniproc systems sold by Intel that have (or can have) more than 4G of ram. These are the machines I was thinking of.

Does your opteron box have more than 1 socket? that'd explain it.

Anyway, it shouldn't happen on any normal platform. Until we get 
numbers that prove that it does (and understand why), I don't think
we need NUMA for PAE.

M.



* Re: [PATCH 07/07] i386: numa emulation on pc
  2005-10-03  9:59     ` Magnus Damm
@ 2005-10-03 16:16       ` Dave Hansen
  2005-10-04  5:06         ` Magnus Damm
  0 siblings, 1 reply; 30+ messages in thread
From: Dave Hansen @ 2005-10-03 16:16 UTC (permalink / raw)
  To: Magnus Damm
  Cc: Magnus Damm, Isaku Yamahata, linux-mm, Linux Kernel Mailing List

On Mon, 2005-10-03 at 18:59 +0900, Magnus Damm wrote:
> > > +#ifdef CONFIG_NUMA_EMU
> > > +     bootmap_size = init_bootmem(max(min_low_pfn, node_start_pfn[0]),
> > > +                                 min(max_low_pfn, node_end_pfn[0]));
> > > +#else
> > >       bootmap_size = init_bootmem(min_low_pfn, max_low_pfn);
> > > +#endif
> >
> > This shouldn't be necessary.  Again, take a look at my discontig
> > separation patches and see if what I did works for you here.
> 
> Do you mean "discontig-consolidate0.patch"? Maybe I'm misunderstanding.

This one, I believe:

http://sr71.net/patches/2.6.14/2.6.14-rc2-git8-mhp1/broken-out/B2.1-i386-discontig-consolidation.patch
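
For reference, the CONFIG_NUMA_EMU hunk quoted above hands init_bootmem()
the intersection of node 0's pfn range with the lowmem pfn range. A
minimal stand-alone sketch of just that max()/min() arithmetic follows;
the struct, the helper name and any pfn values fed to it are illustrative
assumptions, and nothing here calls init_bootmem() itself.

```c
/* Half-open page frame number range [start, end). */
struct pfn_range {
	unsigned long start;
	unsigned long end;
};

/*
 * Hypothetical helper: intersect a node's pfn range with the lowmem range,
 * mirroring max(min_low_pfn, node_start_pfn[0]) and
 * min(max_low_pfn, node_end_pfn[0]) from the quoted hunk.
 * If the node lies entirely outside lowmem the result has start >= end,
 * which a caller would have to check for.
 */
static struct pfn_range clamp_to_lowmem(unsigned long min_low_pfn,
					unsigned long max_low_pfn,
					unsigned long node_start,
					unsigned long node_end)
{
	struct pfn_range r;

	r.start = node_start > min_low_pfn ? node_start : min_low_pfn;
	r.end   = node_end < max_low_pfn ? node_end : max_low_pfn;
	return r;
}
```

With lowmem at [0x100, 0x38000) and an emulated node 0 of [0, 0x20000),
the result is [0x100, 0x20000); a node 0 reaching to 0x80000 is cut back
to 0x38000.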

> > > +#ifdef CONFIG_NUMA_EMU
> > ...
> > > +#endif
> >
> > Ewwwwww :)  No real need to put new function in a big #ifdef like that.
> > Can you just create a new file for NUMA emulation?
> 
> Hehe, what is this, a beauty contest? =) I agree, but I guess the
> reason for this code to be here is that a similar arrangement is done
> by x86_64...

If that's really the case, can they _actually_ share code?  Maybe we can
do this NUMA emulation thing in non-arch code.  Just guessing...
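
One common way to get the function out of the big #ifdef is to keep the
body in its own file and give callers a static inline stub when the
option is off. The sketch below shows the shape only. numa_emu_setup()
and nodes_after_setup() are hypothetical names, and as written it builds
only with CONFIG_NUMA_EMU undefined, because the real definition would
live in the separate emulation file.

```c
/* --- would sit in a shared header --- */
#ifdef CONFIG_NUMA_EMU
/* Real version, defined in the separate NUMA emulation file. */
int numa_emu_setup(unsigned long start_pfn, unsigned long end_pfn,
		   int nr_nodes);
#else
/* Stub: with emulation compiled out, report that no nodes were emulated. */
static inline int numa_emu_setup(unsigned long start_pfn,
				 unsigned long end_pfn, int nr_nodes)
{
	(void)start_pfn;
	(void)end_pfn;
	(void)nr_nodes;
	return 0;
}
#endif

/* --- caller: no #ifdef needed at the call site --- */
static int nodes_after_setup(unsigned long start_pfn, unsigned long end_pfn,
			     int requested)
{
	int emulated = numa_emu_setup(start_pfn, end_pfn, requested);

	/* Fall back to a single node when nothing was emulated. */
	return emulated > 0 ? emulated : 1;
}
```

The compiler drops the stub call entirely, so the non-emulated build pays
nothing for the cleaner call site.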

> I will create a new file. Is arch/i386/mm/numa_emu.c good?

> But first, you have written lots and lots of patches, and I am
> confused. Could you please tell me on which patches I should base my
> code to make things as easy as possible?

This is the staging ground for my memory hotplug work.  But, it contains
all of my work on other stuff, too.  If you build on top of this, it
would be great:

http://sr71.net/patches/2.6.14/2.6.14-rc2-git8-mhp1/

-- Dave



* Re: [PATCH 00/07][RFC] i386: NUMA emulation
  2005-10-03 15:54                       ` Martin J. Bligh
@ 2005-10-03 16:44                         ` David Lang
  0 siblings, 0 replies; 30+ messages in thread
From: David Lang @ 2005-10-03 16:44 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Magnus Damm, Dave Hansen, Magnus Damm, linux-mm,
	Linux Kernel Mailing List

On Mon, 3 Oct 2005, Martin J. Bligh wrote:

>>>>>
>>>>> Not noticed that, and I can't see why it should be the case in general,
>>>>> though I suppose some machines might be odd. Got any numbers?
>>>>
>>>> just the fact that the system boot memory test takes 3-4 times as long with 8G of ram than with 4G of ram. I then boot a 64 bit kernel on the system and never use PAE mode again :-)
>>>>
>>>> if you can point me at a utility that will test the speed of the memory in different chunks I'll do some testing on the Opteron systems I have available. unfortunately I don't have any Xeon systems to test this on.
>>>
>>> Mmm. 64-bit uniproc systems, with > 4GB of RAM, running a 32 bit kernel
>>> don't really strike me as a huge market segment ;-)
>>
>> true, but there are a lot of 32-bit uniproc systems sold by Intel that have (or can have) more than 4G of ram. These are the machines I was thinking of.
>
> Does your opteron box have more than 1 socket? that'd explain it.

yes, but I see the same 4G breakpoint no matter what the memory config
(including one dual proc machine with 16G; if it were a matter of hitting
memory connected to the other socket I would expect the slowdown at 8G,
not at 4G)
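
The arithmetic behind that expectation, as a small sketch. It assumes,
purely for illustration, that RAM is divided evenly and contiguously
between the sockets, with no interleaving and no remapping around the PCI
hole; real firmware may lay memory out differently. node_base() is a
made-up name.

```c
#include <stdint.h>

/*
 * Hypothetical helper: lowest physical address owned by `node` when
 * total_bytes of RAM is split evenly and contiguously over nr_nodes.
 * Returns 0 for invalid arguments.
 */
static uint64_t node_base(uint64_t total_bytes, unsigned int nr_nodes,
			  unsigned int node)
{
	if (nr_nodes == 0 || node >= nr_nodes)
		return 0;
	return (total_bytes / nr_nodes) * node;
}
```

For 16G over two sockets, node_base(16ULL << 30, 2, 1) is 8ULL << 30, so
under these assumptions the second socket's memory starts at 8G.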

> Anyway, it shouldn't happen on any normal platform. Until we get
> numbers that prove that it does (and understand why), I don't think
> we need NUMA for PAE.

Ok, if nobody else is seeing any slowdown.

David Lang



* Re: [PATCH 07/07] i386: numa emulation on pc
  2005-10-03 16:16       ` Dave Hansen
@ 2005-10-04  5:06         ` Magnus Damm
  0 siblings, 0 replies; 30+ messages in thread
From: Magnus Damm @ 2005-10-04  5:06 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Magnus Damm, Isaku Yamahata, linux-mm, Linux Kernel Mailing List

On 10/4/05, Dave Hansen <haveblue@us.ibm.com> wrote:
> On Mon, 2005-10-03 at 18:59 +0900, Magnus Damm wrote:
> > > > +#ifdef CONFIG_NUMA_EMU
> > > ...
> > > > +#endif
> > >
> > > Ewwwwww :)  No real need to put new function in a big #ifdef like that.
> > > Can you just create a new file for NUMA emulation?
> >
> > Hehe, what is this, a beauty contest? =) I agree, but I guess the
> > reason for this code to be here is that a similar arrangement is done
> > by x86_64...
>
> If that's really the case, can they _actually_ share code?  Maybe we can
> do this NUMA emulation thing in non-arch code.  Just guessing...

I'd like to avoid duplication as much as you, but at a quick glance
the x86_64 and i386 architectures looked pretty different. But I will
see what I can do.

> > I will create a new file. Is arch/i386/mm/numa_emu.c good?
>
> > But first, you have written lots and lots of patches, and I am
> > confused. Could you please tell me on which patches I should base my
> > code to make things as easy as possible?
>
> This is the staging ground for my memory hotplug work.  But, it contains
> all of my work on other stuff, too.  If you build on top of this, it
> would be great:
>
> http://sr71.net/patches/2.6.14/2.6.14-rc2-git8-mhp1/

I will build on top of that then.

Thanks,

/ magnus


* Re: [PATCH 07/07] i386: numa emulation on pc
       [not found] ` <20050930073308.10631.24247.sendpatchset@cherry.local>
  2005-09-30 18:55   ` [PATCH 07/07] i386: numa emulation " Dave Hansen
@ 2005-10-04  7:52   ` Hirokazu Takahashi
  2005-10-04  9:49     ` Magnus Damm
  1 sibling, 1 reply; 30+ messages in thread
From: Hirokazu Takahashi @ 2005-10-04  7:52 UTC (permalink / raw)
  To: magnus; +Cc: linux-mm, linux-kernel

Hi,

> This patch adds NUMA emulation for i386 on top of the fixes for sparsemem and
> discontigmem. NUMA emulation already exists for x86_64, and this patch adds
> the same feature using the same config option CONFIG_NUMA_EMU. The kernel
> command line option used is also the same as for x86_64.

It seems like you've forgotten to bind cpus to emulated nodes as Linux for
x86_64 does. I don't think that's your intention.
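
What such a binding could look like, as a sketch only. The round-robin
policy and the function name are assumptions chosen for illustration, not
a description of the x86_64 code or of this patch.

```c
/*
 * Hypothetical helper: assign each cpu to an emulated node in round-robin
 * order. cpu_to_node must have room for nr_cpus entries.
 * Returns 0 on success, -1 on invalid arguments.
 */
static int bind_cpus_round_robin(int *cpu_to_node, int nr_cpus, int nr_nodes)
{
	int cpu;

	if (!cpu_to_node || nr_cpus < 0 || nr_nodes < 1)
		return -1;

	for (cpu = 0; cpu < nr_cpus; cpu++)
		cpu_to_node[cpu] = cpu % nr_nodes;
	return 0;
}
```

With 4 cpus and 2 nodes this yields the mapping 0, 1, 0, 1. On a
uniprocessor box with several emulated nodes, cpu 0 ends up on node 0 and
the remaining nodes are memory-only.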


Thanks.


* Re: [PATCH 07/07] i386: numa emulation on pc
  2005-10-04  7:52   ` Hirokazu Takahashi
@ 2005-10-04  9:49     ` Magnus Damm
  0 siblings, 0 replies; 30+ messages in thread
From: Magnus Damm @ 2005-10-04  9:49 UTC (permalink / raw)
  To: Hirokazu Takahashi; +Cc: magnus, linux-mm, linux-kernel

On 10/4/05, Hirokazu Takahashi <taka@valinux.co.jp> wrote:
> It seems like you've forgotten to bind cpus to emulated nodes as Linux for
> x86_64 does. I don't think that's your intention.

True, not my intention. I will have a look at that. Thanks.

/ magnus


end of thread

Thread overview: 30+ messages
     [not found] <20050930073232.10631.63786.sendpatchset@cherry.local>
2005-09-30 15:23 ` [PATCH 00/07][RFC] i386: NUMA emulation Dave Hansen
2005-10-03  2:08   ` Magnus Damm
2005-10-03  7:34     ` David Lang
2005-10-03 10:02       ` Magnus Damm
2005-10-03 13:33         ` David Lang
2005-10-03 14:59           ` Martin J. Bligh
2005-10-03 15:03             ` David Lang
2005-10-03 15:08               ` Martin J. Bligh
2005-10-03 15:13                 ` David Lang
2005-10-03 15:25                   ` Martin J. Bligh
2005-10-03 15:32                     ` David Lang
2005-10-03 15:54                       ` Martin J. Bligh
2005-10-03 16:44                         ` David Lang
2005-10-03 14:45       ` Martin J. Bligh
2005-10-03 14:49         ` David Lang
2005-10-03  3:21   ` Paul Jackson
2005-10-03  5:05     ` Magnus Damm
2005-10-03  5:26       ` Hirokazu Takahashi
2005-10-03  5:33       ` Paul Jackson
2005-10-03  5:59         ` Magnus Damm
2005-10-03  7:26           ` Paul Jackson
2005-10-03  5:34       ` Paul Jackson
     [not found] ` <20050930073258.10631.74982.sendpatchset@cherry.local>
2005-09-30 15:25   ` [PATCH 05/07] i386: sparsemem on pc Dave Hansen
2005-10-01  0:32     ` Magnus Damm
     [not found] ` <20050930073308.10631.24247.sendpatchset@cherry.local>
2005-09-30 18:55   ` [PATCH 07/07] i386: numa emulation " Dave Hansen
2005-10-03  9:59     ` Magnus Damm
2005-10-03 16:16       ` Dave Hansen
2005-10-04  5:06         ` Magnus Damm
2005-10-04  7:52   ` Hirokazu Takahashi
2005-10-04  9:49     ` Magnus Damm
