* A tale of three memory allocators
@ 2005-03-17 23:08 Magenheimer, Dan (HP Labs Fort Collins)
  2005-03-18  4:48 ` Rik van Riel
  0 siblings, 1 reply; 12+ messages in thread
From: Magenheimer, Dan (HP Labs Fort Collins) @ 2005-03-17 23:08 UTC (permalink / raw)
  To: xen-devel

This should be a good meaty developers discussion, full
of opinions. I'm far from an expert in this area and hereby
solicit help.

Once upon a time there was the Xen simplified memory allocator,
written by Keir.  When Dan (the ia64 man) looked at this allocator
for Xen/ia64, he was very displeased.  "Ia64 has a much more
complex physical memory architecture than x86," he said.
"For example, physical memory space is often not contiguous,
and what about the cool NUMA stuff that is in Linux 2.6?"

So Dan took the Linux memory allocation code, hacked on
it mightily, adding numerous ugly ifdefs and used the result
in place of Keir's code for Xen/ia64.  The result works,
but is truly an abomination.

Some time later, Rusty looked at Keir's code and too was
displeased.  He rewrote it to be much cleaner and more
simplified, much to Keir's delight.  And this code was
placed in common, still to be ignored by Xen/ia64.

Now Arun gazed upon the ugly Xen/ia64 memory allocator full
of ifdefs and was much dismayed.  He preferred Rusty's code
and, even better, sharing more code with Xen/x86, so he busily
generated a patch for Xen/ia64 to use the common code.

Said patch is still pending because coincidentally Greg
is currently looking at porting Xen/ia64 to one of those newfangled
ia64 NUMA machines.  "I would like to turn on CONFIG_NUMA,
CONFIG_DISCONTIGMEM, and CONFIG_VIRTUAL_MEM_MAP," said Greg.

What should Dan (the ia64 man) do???  How complex is too
complex?  How ugly is too ugly?

This concludes (for now) the tale of three memory allocators...
(please respond/discuss :-)


-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://ads.osdn.com/?ad_ide95&alloc_id\x14396&op=click

* RE: A tale of three memory allocators
@ 2005-03-18  0:08 Ian Pratt
  0 siblings, 0 replies; 12+ messages in thread
From: Ian Pratt @ 2005-03-18  0:08 UTC (permalink / raw)
  To: Magenheimer, Dan (HP Labs Fort Collins), xen-devel; +Cc: ian.pratt

> Said patch is still pending because coincidentally Greg
> is currently looking at porting Xen/ia64 to one of those newfangled
> ia64 NUMA machines.  "I would like to turn on CONFIG_NUMA.
> CONFIG_DISCONTIGMEM, and CONFIG_VIRTUAL_MEM_MAP," said Greg.

I'd vote strongly for:
Stick with Rusty's allocator and just have different instances for
different memory banks. Wrap the allocation function to prioritise which
pool to allocate from. This handles discontig memory and NUMA nicely.
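
A minimal sketch of that idea, under loud assumptions: struct bank_pool,
pool_alloc() and alloc_pages_prioritised() are hypothetical names, not
existing Xen code, and a trivial bump allocator stands in for the real
per-bank allocator instance. It only illustrates the wrapper trying the
pools in a caller-supplied order of preference.

```c
#include <stddef.h>

#define INVALID_PFN (~0UL)

/* Hypothetical stand-in for one allocator instance covering one bank. */
struct bank_pool {
    unsigned long base_pfn;   /* first page frame of the bank */
    unsigned long nr_pages;   /* number of page frames in the bank */
    unsigned long next_free;  /* bump offset; a real pool would be a buddy allocator */
};

/* Allocate 2^order contiguous frames from one bank, or return INVALID_PFN. */
static unsigned long pool_alloc(struct bank_pool *p, unsigned int order)
{
    unsigned long n = 1UL << order;
    /* Round up so the block is naturally aligned, as a buddy allocator would. */
    unsigned long start = (p->base_pfn + p->next_free + n - 1) & ~(n - 1);
    unsigned long off = start - p->base_pfn;

    if (off + n > p->nr_pages)
        return INVALID_PFN;
    p->next_free = off + n;
    return start;
}

/* Wrapper: try each pool in the given order of preference. */
static unsigned long alloc_pages_prioritised(struct bank_pool *pools,
                                             const unsigned int *pref,
                                             size_t nr_pref,
                                             unsigned int order)
{
    for (size_t i = 0; i < nr_pref; i++) {
        unsigned long pfn = pool_alloc(&pools[pref[i]], order);
        if (pfn != INVALID_PFN)
            return pfn;
    }
    return INVALID_PFN;
}
```

With one pool per bank, holes between banks never reach the allocator,
and a NUMA policy reduces to choosing the order of the pref array.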


Ian


* RE: A tale of three memory allocators
@ 2005-03-18 19:44 Tian, Kevin
  0 siblings, 0 replies; 12+ messages in thread
From: Tian, Kevin @ 2005-03-18 19:44 UTC (permalink / raw)
  To: Magenheimer, Dan (HP Labs Fort Collins), xen-devel

Hi, Dan,
	This is really good writing. :) My feeling is that we could
still try to merge with the Xen common memory system first and then,
based on this simplified model, investigate carefully what the best and
cleanest approach is to support add-on features such as NUMA. Yes, it
would be great if Xen could run on so many different models soon.

	Building on existing Linux code is certainly the quickest way to
support NUMA. However, redesigning for a new usage model can sometimes
be more valuable than simply copying code written for a largely
different one. :) A VM hypervisor differs from a normal OS to a large
extent. Linux memory management is efficient, but it contains many
things that are redundant for Xen. For example, Linux has to distinguish
allocation requests for normal memory from requests for DMA-capable
memory. Xen instead lets Dom0 (the service OS) take charge of physical
devices, so it does not need to know about DMA. Another example: a large
part of the Linux memory management code handles user processes, which
look quite different from domains. That brings many uncertainties for
later development.

	So adopting the Xen common code lets us quickly pick up new
fixes, updates and designs that benefit the virtual machine model,
whereas borrowing Linux code brings the burden of maintaining it and
keeping up with Linux changes. It is quite likely that much of that
effort would turn out to be unrelated to the Xen usage model.

	Actually our patch replaces the Linux code with the Xen common
memory code, including the boot-time allocator, the buddy system and
Rusty's simple slab allocator. If this can be merged early, we then get
a better base for considering how to support the NUMA model in the Xen
environment. IMO, Xen already leaves room for such an enhancement. The
buddy system is based on the concept of a zone, which you can also think
of as a node. A quick way may be (as Ian points out):

1. Define more zone IDs, like:
#define MEMZONE_XEN 0
#define MEMZONE_DOM 1
#define MEMZONE_NODE	4	/* say 4 nodes */
#define NR_ZONES    (MEMZONE_NODE + 2)

2. Define new wrapper interfaces:
struct pfn_info *alloc_node_pages(struct domain *d, unsigned int order)
{
	/* Scan node_list in order of preference ... */
	/* ... and try alloc_heap_pages(node_id, order) on each node. */
	...
}
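
The two steps above can be sketched in a self-contained form. Everything
beyond MEMZONE_XEN and MEMZONE_DOM is an assumption made for the
illustration: NR_NODES, MEMZONE_NODE0, zone_free_pages[] and both
functions are hypothetical, and a per-zone free-page counter stands in
for the buddy free lists. The point is only the zone layout (one zone
per node after the two existing zones) and the home-node-first scan.

```c
#include <stddef.h>

#define MEMZONE_XEN   0
#define MEMZONE_DOM   1
#define NR_NODES      4                          /* assumption: four nodes */
#define MEMZONE_NODE0 2                          /* hypothetical: first per-node zone */
#define NR_ZONES      (MEMZONE_NODE0 + NR_NODES) /* parenthesised so it composes safely */

/* Hypothetical per-zone free page counters standing in for buddy free lists. */
static unsigned long zone_free_pages[NR_ZONES];

/* Take 2^order pages from one zone; returns 1 on success, 0 on failure. */
static int alloc_from_zone(unsigned int zone, unsigned int order)
{
    unsigned long n = 1UL << order;

    if (zone_free_pages[zone] < n)
        return 0;
    zone_free_pages[zone] -= n;
    return 1;
}

/*
 * Try the home node first, then the remaining nodes in round-robin order.
 * Returns the zone index that satisfied the request, or -1 if none could.
 */
static int alloc_node_zone(unsigned int home_node, unsigned int order)
{
    for (unsigned int i = 0; i < NR_NODES; i++) {
        unsigned int node = (home_node + i) % NR_NODES;
        unsigned int zone = MEMZONE_NODE0 + node;

        if (alloc_from_zone(zone, order))
            return (int)zone;
    }
    return -1;
}
```

A real version would presumably order the fallback nodes by distance
rather than round-robin; that policy is left out here.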

Maybe later it can also be enhanced to use a hierarchical domain
structure, if really required. Who knows? Anyway, I just throw the above
out as an example that it's not so difficult if we merge with the Xen
common code first. :)

Thanks,
Kevin

>-----Original Message-----
>From: xen-devel-admin@lists.sourceforge.net
>[mailto:xen-devel-admin@lists.sourceforge.net] On Behalf Of
>Magenheimer, Dan (HP Labs Fort Collins)
>Sent: Thursday, March 17, 2005 3:09 PM
>To: xen-devel@lists.sourceforge.net
>Subject: [Xen-devel] A tale of three memory allocators


* RE: A tale of three memory allocators
@ 2005-03-18 19:48 Tian, Kevin
  0 siblings, 0 replies; 12+ messages in thread
From: Tian, Kevin @ 2005-03-18 19:48 UTC (permalink / raw)
  To: Ian Pratt, Magenheimer, Dan (HP Labs Fort Collins), xen-devel; +Cc: ian.pratt

>-----Original Message-----
>From: xen-devel-admin@lists.sourceforge.net
>[mailto:xen-devel-admin@lists.sourceforge.net] On Behalf Of Ian Pratt
>Sent: Thursday, March 17, 2005 4:08 PM
>
>> Said patch is still pending because coincidentally Greg
>> is currently looking at porting Xen/ia64 to one of those newfangled
>> ia64 NUMA machines.  "I would like to turn on CONFIG_NUMA.
>> CONFIG_DISCONTIGMEM, and CONFIG_VIRTUAL_MEM_MAP," said Greg.
>
>I'd vote strongly for:
>Stick with Rusty's allocator and just have different instances for
>different memory banks. Wrap the allocation function to prioritise
>which pool to allocate from. This handles discontig memory and NUMA nicely.
>
>
>Ian

I assume we need no change to Rusty's allocator at all, only some
wrapper interfaces around the buddy system, right? The slab allocator
only deals with memory in xenheap, and xenheap seems to be
node-independent... :)

Thanks,
Kevin


* RE: A tale of three memory allocators
@ 2005-03-18 20:07 Magenheimer, Dan (HP Labs Fort Collins)
  2005-03-18 20:18 ` Andrew Theurer
  0 siblings, 1 reply; 12+ messages in thread
From: Magenheimer, Dan (HP Labs Fort Collins) @ 2005-03-18 20:07 UTC (permalink / raw)
  To: xen-devel

>Of course I mean that the hypervisor probably has to support this stuff
>to work correctly on multi-node ia64 machines.  The guests can probably
>get away with being special cased since they can be presented with a
>contiguous memory model (Dan is this what you're doing now?)...

On Xen/ia64, domain0 is given the real EFI memory map and can map any
part of memory not owned by the Xen hypervisor.  All other domains are
presented with contiguous memory starting at zero.


* RE: A tale of three memory allocators
@ 2005-03-18 20:11 Magenheimer, Dan (HP Labs Fort Collins)
  2005-03-18 20:24 ` Hollis Blanchard
  0 siblings, 1 reply; 12+ messages in thread
From: Magenheimer, Dan (HP Labs Fort Collins) @ 2005-03-18 20:11 UTC (permalink / raw)
  To: Tian, Kevin, xen-devel

I'm less concerned about NUMA configurations... I agree that
NUMA support could be added later.  My first concern would
be discontiguous memory as, on ia64, it is not uncommon
for a machine to have a physical memory map of something like:

0GB-1GB
2GB-4GB
64GB-69GB
Total  8GB

How does the Rusty Russell allocator handle a map like this?
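
To make the shape of such a map concrete, here is a small sketch. The
names struct mem_range, example_map, total_ram() and addr_is_ram() are
made up for the illustration and are not part of Xen or of the EFI
interfaces; ranges are half-open, [start, end), in bytes.

```c
#include <stddef.h>
#include <stdint.h>

#define GB (1ULL << 30)

/* One contiguous chunk of physical RAM, [start, end) in bytes. */
struct mem_range {
    uint64_t start;
    uint64_t end;
};

/* The example map from above: three chunks with large holes between them. */
static const struct mem_range example_map[] = {
    {  0 * GB,  1 * GB },
    {  2 * GB,  4 * GB },
    { 64 * GB, 69 * GB },
};
#define EXAMPLE_MAP_LEN (sizeof(example_map) / sizeof(example_map[0]))

/* Sum the sizes of all chunks. */
static uint64_t total_ram(const struct mem_range *map, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += map[i].end - map[i].start;
    return total;
}

/* Return 1 if addr falls inside some chunk, 0 if it falls in a hole. */
static int addr_is_ram(const struct mem_range *map, size_t n, uint64_t addr)
{
    for (size_t i = 0; i < n; i++)
        if (addr >= map[i].start && addr < map[i].end)
            return 1;
    return 0;
}
```

The chunks add up to 8GB, but the highest address is 69GB, so anything
indexed directly by physical address has to cope with about 61GB of
holes.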

Dan


* RE: A tale of three memory allocators
@ 2005-03-18 22:05 Tian, Kevin
  0 siblings, 0 replies; 12+ messages in thread
From: Tian, Kevin @ 2005-03-18 22:05 UTC (permalink / raw)
  To: Magenheimer, Dan (HP Labs Fort Collins), xen-devel

>-----Original Message-----
>From: Magenheimer, Dan (HP Labs Fort Collins)
>[mailto:dan.magenheimer@hp.com]
>Sent: Friday, March 18, 2005 12:12 PM
>
>I'm less concerned about NUMA configurations... I agree that
>NUMA support could be added later.  My first concern would
>be discontiguous memory as, on ia64, it is not uncommon
>for a machine to have a physical memory map of something like:
>
>0GB-1GB
>2GB-4GB
>64GB-69GB
>Total  8GB
>
>How does the Rusty Russell allocator handle a map like this?
>
>Dan

Rusty's allocator only handles the slab layer, which sits on top of the
buddy system. What the holes actually affect is the memmap, which is the
base of the buddy system. It is easier to add a virtual memmap when
initializing frame_table, without touching the slab part at all.
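
A rough way to see why the holes matter for the memmap, as a sketch
under stated assumptions: 16KB pages (PAGE_SHIFT 14) are assumed, the
names below are hypothetical, and the rounding of the table's own
backing pages is ignored. A flat frame table needs one entry per frame
up to the highest pfn, while a virtually mapped table only needs backing
for entries of frames that exist.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 14  /* assumption: 16KB pages */

/* One contiguous run of page frames, [start_pfn, end_pfn). */
struct pfn_range {
    uint64_t start_pfn;
    uint64_t end_pfn;
};

/* Convert a size in GB to a page frame number. */
static uint64_t gb_to_pfn(uint64_t gb)
{
    return (gb << 30) >> PAGE_SHIFT;
}

/* Flat table: one entry for every frame below the highest pfn, holes included. */
static uint64_t flat_table_entries(const struct pfn_range *r, size_t n)
{
    uint64_t max_pfn = 0;

    for (size_t i = 0; i < n; i++)
        if (r[i].end_pfn > max_pfn)
            max_pfn = r[i].end_pfn;
    return max_pfn;
}

/* Virtually mapped table: only entries for frames that exist need backing. */
static uint64_t backed_table_entries(const struct pfn_range *r, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += r[i].end_pfn - r[i].start_pfn;
    return total;
}
```

For Dan's example map (0-1GB, 2-4GB, 64-69GB) the flat table spans 69GB
worth of frames while only 8GB worth are present, a factor of more than
eight in table size.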

Thanks,
Kevin



end of thread, other threads:[~2005-03-18 22:05 UTC | newest]

Thread overview: 12+ messages
2005-03-17 23:08 A tale of three memory allocators Magenheimer, Dan (HP Labs Fort Collins)
2005-03-18  4:48 ` Rik van Riel
2005-03-18 16:56   ` Jesse Barnes
2005-03-18 19:42     ` Jesse Barnes
  -- strict thread matches above, loose matches on Subject: below --
2005-03-18  0:08 Ian Pratt
2005-03-18 19:44 Tian, Kevin
2005-03-18 19:48 Tian, Kevin
2005-03-18 20:07 Magenheimer, Dan (HP Labs Fort Collins)
2005-03-18 20:18 ` Andrew Theurer
2005-03-18 20:11 Magenheimer, Dan (HP Labs Fort Collins)
2005-03-18 20:24 ` Hollis Blanchard
2005-03-18 22:05 Tian, Kevin
