* A tale of three memory allocators
@ 2005-03-17 23:08 Magenheimer, Dan (HP Labs Fort Collins)
2005-03-18 4:48 ` Rik van Riel
0 siblings, 1 reply; 12+ messages in thread
From: Magenheimer, Dan (HP Labs Fort Collins) @ 2005-03-17 23:08 UTC (permalink / raw)
To: xen-devel
This should be a good meaty developers discussion, full
of opinions. I'm far from an expert in this area and hereby
solicit help.
Once upon a time there was the Xen simplified memory allocator,
written by Keir. When Dan (the ia64 man) looked at this allocator
for Xen/ia64, he was very displeased. "Ia64 has a much more
complex physical memory architecture than x86," he said.
"For example, physical memory space is often not contiguous,
and what about the cool NUMA stuff that is in Linux 2.6?"
So Dan took the Linux memory allocation code, hacked on
it mightily, adding numerous ugly ifdefs and used the result
in place of Keir's code for Xen/ia64. The result works,
but is truly an abomination.
Some time later, Rusty looked at Keir's code and too was
displeased. He rewrote it to be much cleaner and more
simplified, much to Keir's delight. And this code was
placed in common, still to be ignored by Xen/ia64.
Now Arun gazed upon the ugly Xen/ia64 memory allocator full
of ifdefs and was much dismayed. He preferred Rusty's code
and, even better, sharing more code with Xen/x86, so he busily
generated a patch for Xen/ia64 to use the common code.
Said patch is still pending because coincidentally Greg
is currently looking at porting Xen/ia64 to one of those newfangled
ia64 NUMA machines. "I would like to turn on CONFIG_NUMA,
CONFIG_DISCONTIGMEM, and CONFIG_VIRTUAL_MEM_MAP," said Greg.
What should Dan (the ia64 man) do??? How complex is too
complex? How ugly is too ugly?
This concludes (for now) the tale of three memory allocators...
(please respond/discuss :-)
-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: A tale of three memory allocators
2005-03-17 23:08 A tale of three memory allocators Magenheimer, Dan (HP Labs Fort Collins)
@ 2005-03-18 4:48 ` Rik van Riel
2005-03-18 16:56 ` Jesse Barnes
0 siblings, 1 reply; 12+ messages in thread
From: Rik van Riel @ 2005-03-18 4:48 UTC (permalink / raw)
To: Magenheimer, Dan (HP Labs Fort Collins); +Cc: xen-devel
On Thu, 17 Mar 2005, Magenheimer, Dan (HP Labs Fort Collins) wrote:
> Said patch is still pending because coincidentally Greg
> is currently looking at porting Xen/ia64 to one of those newfangled
> ia64 NUMA machines. "I would like to turn on CONFIG_NUMA,
> CONFIG_DISCONTIGMEM, and CONFIG_VIRTUAL_MEM_MAP," said Greg.
Two out of three is enough. I don't see the need for
both CONFIG_DISCONTIGMEM and CONFIG_VIRTUAL_MEM_MAP.
I guess that having a NUMA aware allocator could come
in handy though, so guest domains get their memory from
the right node with respect to where they get CPU time scheduled.
To me, that would suggest that Ian's hunch is right and
the proper thing to do would be different instances of
Rusty's simple allocator.
--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: A tale of three memory allocators
2005-03-18 4:48 ` Rik van Riel
@ 2005-03-18 16:56 ` Jesse Barnes
2005-03-18 19:42 ` Jesse Barnes
0 siblings, 1 reply; 12+ messages in thread
From: Jesse Barnes @ 2005-03-18 16:56 UTC (permalink / raw)
To: xen-devel
Rik van Riel <riel <at> redhat.com> writes:
> Two out of three is enough. I don't see the need for
> both CONFIG_DISCONTIGMEM and CONFIG_VIRTUAL_MEM_MAP.
Well, we need it for sn2. We have very large memory holes within a node, so
we use the virtual memmap to make the mem_map array within a node virtually
contiguous.
> I guess that having a NUMA aware allocator could come
> in handy though, so guest domains get their memory from
> the right node with respect to where they get CPU time scheduled.
Yep, it would probably be a mistake to overoptimize Xen on NUMA at this point,
but doing basic things like this makes sense.
Jesse
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Re: A tale of three memory allocators
2005-03-18 16:56 ` Jesse Barnes
@ 2005-03-18 19:42 ` Jesse Barnes
0 siblings, 0 replies; 12+ messages in thread
From: Jesse Barnes @ 2005-03-18 19:42 UTC (permalink / raw)
To: xen-devel
On Friday, March 18, 2005 8:56 am, Jesse Barnes wrote:
> Rik van Riel <riel <at> redhat.com> writes:
> > Two out of three is enough. I don't see the need for
> > both CONFIG_DISCONTIGMEM and CONFIG_VIRTUAL_MEM_MAP.
>
> Well, we need it for sn2. We have very large memory holes within a node,
> so we use the virtual memmap to make the mem_map array within a node
> virtually contiguous.
Of course I mean that the hypervisor probably has to support this stuff to
work correctly on multi-node ia64 machines. The guests can probably get away
with being special cased since they can be presented with a contiguous memory
model (Dan is this what you're doing now?)...
Thanks,
Jesse
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: A tale of three memory allocators
@ 2005-03-18 0:08 Ian Pratt
0 siblings, 0 replies; 12+ messages in thread
From: Ian Pratt @ 2005-03-18 0:08 UTC (permalink / raw)
To: Magenheimer, Dan (HP Labs Fort Collins), xen-devel; +Cc: ian.pratt
> Said patch is still pending because coincidentally Greg
> is currently looking at porting Xen/ia64 to one of those newfangled
> ia64 NUMA machines. "I would like to turn on CONFIG_NUMA,
> CONFIG_DISCONTIGMEM, and CONFIG_VIRTUAL_MEM_MAP," said Greg.
I'd vote strongly for:
Stick with Rusty's allocator and just have different instances for
different memory banks. Wrap the allocation function to prioritise which
pool to allocate from. This handles discontig memory and NUMA nicely.
Ian
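The suggestion above can be pictured with a minimal sketch. This is not Xen's code: struct bank_pool, pool_alloc() and alloc_prioritised() are hypothetical names, and a simple page counter stands in for a real allocator instance. It only shows the shape of "one instance per bank, plus a wrapper that picks the pool by priority".

```c
#include <assert.h>

/* Hypothetical stand-in for one allocator instance covering one memory
 * bank. A real instance would be a buddy or slab allocator; a page
 * counter is enough to show the wrapper logic. */
struct bank_pool {
    unsigned long free_pages;   /* pages still available in this bank */
};

#define NR_BANKS 3

/* Try one pool; returns 0 on success, -1 if the bank cannot satisfy it. */
static int pool_alloc(struct bank_pool *p, unsigned long nr_pages)
{
    if (p->free_pages < nr_pages)
        return -1;
    p->free_pages -= nr_pages;
    return 0;
}

/* Wrapper: walk the banks in the caller's order of preference (for
 * example, the local node first) and return the index of the bank that
 * satisfied the request, or -1 if no bank could. */
static int alloc_prioritised(struct bank_pool pools[NR_BANKS],
                             const int preference[NR_BANKS],
                             unsigned long nr_pages)
{
    for (int i = 0; i < NR_BANKS; i++) {
        int bank = preference[i];
        assert(bank >= 0 && bank < NR_BANKS);
        if (pool_alloc(&pools[bank], nr_pages) == 0)
            return bank;
    }
    return -1;
}
```

Under these assumptions, discontiguous memory is handled by giving each populated bank its own pool, and NUMA placement is handled by the preference order the caller passes in.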
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: A tale of three memory allocators
@ 2005-03-18 19:44 Tian, Kevin
0 siblings, 0 replies; 12+ messages in thread
From: Tian, Kevin @ 2005-03-18 19:44 UTC (permalink / raw)
To: Magenheimer, Dan (HP Labs Fort Collins), xen-devel
Hi, Dan,
This is really good writing. :) My feeling is that we could
still try to merge with the Xen common memory system first, and then, based
on this simplified model, carefully investigate the best
and cleanest approach to supporting add-on features, like NUMA here. Yes,
it would be great if Xen could run on so many different working models soon.
Building on existing Linux code really is the quickest way to support
NUMA. However, sometimes a redesign based on a new usage model may
be more valuable than simply copying code written for a largely different one. :)
A VM hypervisor differs from a normal OS to a large extent. Linux memory
management is efficient, but it contains much that is redundant
for Xen. For example, Linux has to distinguish allocation requests for
normal memory from those for DMA-capable memory. Xen instead lets Dom0 (the
service OS) take charge of physical devices, so it has no need to know about
DMA. Another example: a large part of the Linux memory management code
handles user processes, which look quite different from a
domain. That brings many uncertainties for later development.
So adopting the Xen common code lets us pick up new
fixes, updates and designs that benefit the virtual machine concept quickly,
whereas borrowing Linux code brings the complexity of maintaining it and
keeping up with changes in Linux. It is very likely that much of that effort
would turn out to be unrelated to the Xen usage model.
Actually, our patch replaces the Linux code with the Xen common memory
code, including the boot-time allocator, the buddy system and Rusty's
simple slab allocator. If this can be merged earlier, we then get a
better base from which to consider how to support NUMA in the Xen environment.
IMO, Xen already leaves room for such an enhancement. The buddy
system is based on the concept of a zone, which you can also think of as a
node. A quick way may be (as Ian points out):
1. Define more zone IDs, like:
    #define MEMZONE_XEN  0
    #define MEMZONE_DOM  1
    #define MEMZONE_NODE 4 /* say, 4 nodes */
    #define NR_ZONES     (MEMZONE_NODE + 2)
2. Define a new wrapper interface:
    struct pfn_info *alloc_node_pages(struct domain *d, unsigned int order)
    {
        /* scan node_list */
        /* alloc_heap_pages(node_id, order) ... */
        ...
    }
Maybe later it can also be enhanced to use a hierarchical domain structure,
if really required. Who knows? Anyway, I just throw the above out as an
example showing that it's not so difficult if we merge with the Xen common
code first. :)
Thanks,
Kevin
>-----Original Message-----
>From: xen-devel-admin@lists.sourceforge.net
>[mailto:xen-devel-admin@lists.sourceforge.net] On Behalf Of Magenheimer, Dan
>(HP Labs Fort Collins)
>Sent: Thursday, March 17, 2005 3:09 PM
>To: xen-devel@lists.sourceforge.net
>Subject: [Xen-devel] A tale of three memory allocators
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: A tale of three memory allocators
@ 2005-03-18 19:48 Tian, Kevin
0 siblings, 0 replies; 12+ messages in thread
From: Tian, Kevin @ 2005-03-18 19:48 UTC (permalink / raw)
To: Ian Pratt, Magenheimer, Dan (HP Labs Fort Collins), xen-devel; +Cc: ian.pratt
>-----Original Message-----
>From: xen-devel-admin@lists.sourceforge.net
>[mailto:xen-devel-admin@lists.sourceforge.net] On Behalf Of Ian Pratt
>Sent: Thursday, March 17, 2005 4:08 PM
>
>> Said patch is still pending because coincidentally Greg
>> is currently looking at porting Xen/ia64 to one of those newfangled
>> ia64 NUMA machines. "I would like to turn on CONFIG_NUMA,
>> CONFIG_DISCONTIGMEM, and CONFIG_VIRTUAL_MEM_MAP," said Greg.
>
>I'd vote strongly for:
>Stick with Rusty's allocator and just have different instances for
>different memory banks. Wrap the allocation function to prioritise which
>pool to allocate from. This handles discontig memory and NUMA nicely.
>
>
>Ian
I assume that we need no change to Rusty's allocator at all, apart from
some wrapper interfaces to the buddy system, right? Slab should only deal with
memory in the xenheap, and the xenheap seems to be node-agnostic... :)
Thanks,
Kevin
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: A tale of three memory allocators
@ 2005-03-18 20:07 Magenheimer, Dan (HP Labs Fort Collins)
2005-03-18 20:18 ` Andrew Theurer
0 siblings, 1 reply; 12+ messages in thread
From: Magenheimer, Dan (HP Labs Fort Collins) @ 2005-03-18 20:07 UTC (permalink / raw)
To: xen-devel
>Of course I mean that the hypervisor probably has to support this stuff to
>work correctly on multi-node ia64 machines. The guests can probably get away
>with being special cased since they can be presented with a contiguous memory
>model (Dan is this what you're doing now?)...
On Xen/ia64, domain0 is given the real EFI memory map and has the capability
to map any part of memory not owned by the Xen hypervisor. All other
domains are presented with contiguous memory starting at zero.
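As an illustration only: the thread does not say how Xen/ia64 builds the contiguous view, so the following is a sketch under assumed names (struct extent, guest_to_machine(), INVALID_FRAME are all hypothetical). It shows how guest frame numbers that start at zero could be mapped onto scattered chunks of machine memory.

```c
#include <assert.h>
#include <stdint.h>

/* One chunk of machine memory backing part of a guest (hypothetical). */
struct extent {
    uint64_t machine_start;   /* first machine frame of the chunk */
    uint64_t nr_frames;       /* length of the chunk in frames */
};

#define INVALID_FRAME UINT64_MAX

/* Translate a guest frame number, counted from zero, into a machine
 * frame by walking the extents in order. The guest sees one contiguous
 * range even though the extents may lie far apart in machine memory. */
static uint64_t guest_to_machine(const struct extent *ext, int nr_ext,
                                 uint64_t gfn)
{
    assert(nr_ext >= 0);
    for (int i = 0; i < nr_ext; i++) {
        if (gfn < ext[i].nr_frames)
            return ext[i].machine_start + gfn;
        gfn -= ext[i].nr_frames;   /* skip past this extent */
    }
    return INVALID_FRAME;          /* beyond the guest's memory */
}
```

A linear walk keeps the sketch short; a real implementation would more likely use a per-domain lookup table.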
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: A tale of three memory allocators
2005-03-18 20:07 Magenheimer, Dan (HP Labs Fort Collins)
@ 2005-03-18 20:18 ` Andrew Theurer
0 siblings, 0 replies; 12+ messages in thread
From: Andrew Theurer @ 2005-03-18 20:18 UTC (permalink / raw)
To: Magenheimer, Dan (HP Labs Fort Collins); +Cc: xen-devel
On Friday 18 March 2005 14:07, Magenheimer, Dan (HP Labs Fort Collins) wrote:
> >Of course I mean that the hypervisor probably has to support this stuff to
> >work correctly on multi-node ia64 machines. The guests can probably get away
> >with being special cased since they can be presented with a contiguous memory
> >model (Dan is this what you're doing now?)...
>
> On Xen/ia64, domain0 is given the real EFI memory map and has capability
> to map any part of memory not owned by the Xen hypervisor. All other
> domains are presented with contiguous memory starting at zero.
Even though the memory presented to the domUs is contiguous and starts at
zero, do you think it would still be advantageous to provide some sort of
cpu<->memory topology information to the domU, so that domUs which span more
than one node can take advantage of the NUMA code in the Linux kernel? I
would think eventually this is something many of the platforms would want,
no?
-Andrew Theurer
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: A tale of three memory allocators
@ 2005-03-18 20:11 Magenheimer, Dan (HP Labs Fort Collins)
2005-03-18 20:24 ` Hollis Blanchard
0 siblings, 1 reply; 12+ messages in thread
From: Magenheimer, Dan (HP Labs Fort Collins) @ 2005-03-18 20:11 UTC (permalink / raw)
To: Tian, Kevin, xen-devel
I'm less concerned about NUMA configurations... I agree that
NUMA support could be added later. My first concern would
be discontiguous memory as, on ia64, it is not uncommon
for a machine to have a physical memory map of something like:
0GB-1GB
2GB-4GB
64GB-69GB
Total 8GB
How does the Rusty Russell allocator handle a map like this?
Dan
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: A tale of three memory allocators
2005-03-18 20:11 Magenheimer, Dan (HP Labs Fort Collins)
@ 2005-03-18 20:24 ` Hollis Blanchard
0 siblings, 0 replies; 12+ messages in thread
From: Hollis Blanchard @ 2005-03-18 20:24 UTC (permalink / raw)
To: xen-devel; +Cc: Magenheimer, Dan (HP Labs Fort Collins), Tian, Kevin
On Friday 18 March 2005 14:11, Magenheimer, Dan (HP Labs Fort Collins) wrote:
> I'm less concerned about NUMA configurations... I agree that
> NUMA support could be added later. My first concern would
> be discontiguous memory as, on ia64, it is not uncommon
> for a machine to have a physical memory map of something like:
>
> 0GB-1GB
> 2GB-4GB
> 64GB-69GB
> Total 8GB
>
> How does the Rusty Russell allocator handle a map like this?
xmalloc.c calls alloc_xenheap_pages() as needed.
So call page_alloc.c's init_boot_pages() repeatedly (as is done already for
x86), passing it the memory ranges you do have, and xmalloc will use only
that memory.
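A minimal sketch of that idea, using the example map from the question. register_boot_range() is a hypothetical stand-in for the real range-registration call (its actual signature is not given in this thread), and it only counts bytes rather than setting up an allocator.

```c
#include <assert.h>
#include <stdint.h>

#define GB (1ULL << 30)

struct mem_range { uint64_t start, end; };   /* [start, end) in bytes */

/* The example map from the question: 0-1GB, 2-4GB, 64-69GB. */
static const struct mem_range example_map[] = {
    {  0 * GB,  1 * GB },
    {  2 * GB,  4 * GB },
    { 64 * GB, 69 * GB },
};
#define NR_RANGES (sizeof(example_map) / sizeof(example_map[0]))

static uint64_t registered_bytes;   /* running total handed over so far */

/* Hypothetical stand-in for the boot allocator's registration call;
 * it only records how much memory it was given. */
static void register_boot_range(uint64_t start, uint64_t end)
{
    assert(start < end);
    registered_bytes += end - start;
}

/* Hand each populated range to the allocator. The holes are never
 * registered, so nothing can be allocated from them. */
static uint64_t register_all_ranges(void)
{
    registered_bytes = 0;
    for (unsigned int i = 0; i < NR_RANGES; i++)
        register_boot_range(example_map[i].start, example_map[i].end);
    return registered_bytes;
}
```

With this map, register_all_ranges() accounts for 1GB + 2GB + 5GB, matching the 8GB total quoted above.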
--
Hollis Blanchard
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: A tale of three memory allocators
@ 2005-03-18 22:05 Tian, Kevin
0 siblings, 0 replies; 12+ messages in thread
From: Tian, Kevin @ 2005-03-18 22:05 UTC (permalink / raw)
To: Magenheimer, Dan (HP Labs Fort Collins), xen-devel
>-----Original Message-----
>From: Magenheimer, Dan (HP Labs Fort Collins) [mailto:dan.magenheimer@hp.com]
>Sent: Friday, March 18, 2005 12:12 PM
>
>I'm less concerned about NUMA configurations... I agree that
>NUMA support could be added later. My first concern would
>be discontiguous memory as, on ia64, it is not uncommon
>for a machine to have a physical memory map of something like:
>
>0GB-1GB
>2GB-4GB
>64GB-69GB
>Total 8GB
>
>How does the Rusty Russell allocator handle a map like this?
>
>Dan
Rusty's allocator only handles the slab layer, which sits on top of the buddy
system. What the holes actually affect is the memmap, which is the
base of the buddy system. It is easier to add a virtual memmap when
initializing the frame_table, without touching the slab part at all.
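To make the effect of the holes on the memmap concrete, here is a rough sketch that counts frame-table entries for Dan's example map. The 16KB page size is an assumption, and flat_entries() and backed_entries() are hypothetical helpers, not Xen functions. A flat table needs an entry for every frame up to the highest address, while only the entries that describe real RAM need backing when the table is mapped virtually.

```c
#include <assert.h>
#include <stdint.h>

#define GB          (1ULL << 30)
#define PAGE_SHIFT  14              /* assumption: 16KB pages */

struct mem_range { uint64_t start, end; };   /* [start, end) in bytes */

/* Entries a flat frame table needs: one per frame from address zero up
 * to the highest populated address, holes included. */
static uint64_t flat_entries(const struct mem_range *r, int n)
{
    uint64_t top = 0;
    for (int i = 0; i < n; i++)
        if (r[i].end > top)
            top = r[i].end;
    return top >> PAGE_SHIFT;
}

/* Entries that describe real RAM: one per frame inside a populated
 * range. Roughly, these are the ones a virtual memmap must back. */
static uint64_t backed_entries(const struct mem_range *r, int n)
{
    uint64_t frames = 0;
    for (int i = 0; i < n; i++) {
        assert(r[i].start <= r[i].end);
        frames += (r[i].end - r[i].start) >> PAGE_SHIFT;
    }
    return frames;
}
```

For the 0-1GB, 2-4GB, 64-69GB map, the flat table spans 69GB worth of frames while only 8GB worth are populated, which is why the holes matter at the frame_table level and not in the slab layer.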
Thanks,
Kevin
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2005-03-18 22:05 UTC | newest]
Thread overview: 12+ messages
2005-03-17 23:08 A tale of three memory allocators Magenheimer, Dan (HP Labs Fort Collins)
2005-03-18 4:48 ` Rik van Riel
2005-03-18 16:56 ` Jesse Barnes
2005-03-18 19:42 ` Jesse Barnes
-- strict thread matches above, loose matches on Subject: below --
2005-03-18 0:08 Ian Pratt
2005-03-18 19:44 Tian, Kevin
2005-03-18 19:48 Tian, Kevin
2005-03-18 20:07 Magenheimer, Dan (HP Labs Fort Collins)
2005-03-18 20:18 ` Andrew Theurer
2005-03-18 20:11 Magenheimer, Dan (HP Labs Fort Collins)
2005-03-18 20:24 ` Hollis Blanchard
2005-03-18 22:05 Tian, Kevin