From: Ryan Harper <ryanh@us.ibm.com>
To: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
Cc: xen-devel@lists.xensource.com, Ryan Grimm <grimm@us.ibm.com>
Subject: Re: [PATCH] 0/7 xen: Add basic NUMA support
Date: Fri, 16 Dec 2005 22:53:31 -0600
Message-ID: <20051217045331.GF18911@us.ibm.com>
In-Reply-To: <A95E2296287EAD4EB592B5DEEFCE0E9D409DA4@liverpoolst.ad.cl.cam.ac.uk>
* Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk> [2005-12-16 19:28]:
>
> > The patchset will add basic NUMA support to Xen (hypervisor
> > only).
>
> I think we need a lot more discussion on this -- your approach differs
> from what we've previously discussed on the list. We need a session at
> the Jan summit.

OK.

> > Using this information, we also modified the page allocator
> > to provide a simple NUMA-aware API. The modified allocator
> > will attempt to find pages local to the cpu where possible,
> > but will fall back on using memory that is of the requested
> > size rather than fragmenting larger contiguous chunks to find
> > local pages. We expect to tune this algorithm in the future
> > after further study.
>
> Personally, I think we should have separate buddy allocators for each of
> the zones; much simpler and faster in the common case.

I'm not sure how having multiple buddy allocators helps one choose
memory local to a node. Do you mean to have a buddy allocator per node?

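
For readers following the thread, here is a toy model of the fallback policy quoted above, laid out with one set of free lists per node. It is only a sketch, not code from the patchset: NR_NODES, MAX_ORDER, free_chunks and numa_alloc() are hypothetical names, and the lists are reduced to counters so the ordering of the fallback steps is easy to see.

```c
#include <assert.h>

#define NR_NODES  2   /* hypothetical: two NUMA nodes */
#define MAX_ORDER 4   /* hypothetical: chunk orders 0..3 */

/* free_chunks[n][o] = number of free chunks of 2^o pages on node n. */
static unsigned int free_chunks[NR_NODES][MAX_ORDER];

/* Take one chunk of exactly `order` from `node`; returns 1 on success. */
static int take_exact(int node, int order)
{
    if (free_chunks[node][order] == 0)
        return 0;
    free_chunks[node][order]--;
    return 1;
}

/* Split the smallest larger chunk on `node` down to `order`. */
static int take_by_splitting(int node, int order)
{
    for (int o = order + 1; o < MAX_ORDER; o++) {
        if (free_chunks[node][o] == 0)
            continue;
        free_chunks[node][o]--;
        /* Each split leaves one free buddy at orders o-1 down to order. */
        for (int s = o - 1; s >= order; s--)
            free_chunks[node][s]++;
        return 1;
    }
    return 0;
}

/* Returns the node the chunk came from, or -1 if nothing is free. */
int numa_alloc(int local_node, int order)
{
    assert(local_node >= 0 && local_node < NR_NODES);
    assert(order >= 0 && order < MAX_ORDER);

    /* 1. A chunk of the requested size on the local node. */
    if (take_exact(local_node, order))
        return local_node;
    /* 2. A chunk of the requested size on a remote node. */
    for (int n = 0; n < NR_NODES; n++)
        if (n != local_node && take_exact(n, order))
            return n;
    /* 3. Only now fragment a larger local chunk. */
    if (take_by_splitting(local_node, order))
        return local_node;
    /* 4. Last resort: fragment a larger remote chunk. */
    for (int n = 0; n < NR_NODES; n++)
        if (n != local_node && take_by_splitting(n, order))
            return n;
    return -1;
}
```

Swapping steps 2 and 3 would give a policy that prefers locality over keeping large chunks intact, which is the kind of tuning the original description leaves open.
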
> > We also modified Xen's increase_reservation memory op to
> > balance memory distribution across the vcpus in use by a
> > domain. Relying on previous patches which have already been
> > committed to xen-unstable, a guest can be constructed such
> > that its entire memory is contained within a specific NUMA node.
>
> This makes sense for 1 vcpu guests, but for multi vcpu guests this needs
> way more discussion. How do we expose the (potentially dynamic) mapping
> of vcpus to nodes? How do we expose the different memory zones to
> guests? How does Linux make use of this information? This is a can of
> worms, definitely phase 2.

I believe this makes sense for multi-vcpu guests as well, since the
vcpu-to-cpu mapping is currently known at domain construction time,
before memory is allocated. The dynamic case requires some thought: we
don't want to spread memory around, unplug two or three vcpus, and then
incur a large number of misses because the remaining vcpus are not
local to all of the domain's memory.

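
To make the multi-vcpu case concrete, here is a minimal sketch of spreading a domain's pages evenly over its vcpus and crediting each share to the node that vcpu sits on. It is an illustration only and assumes the vcpu-to-node mapping is fixed when the domain is built; distribute_pages(), vcpu_node and NR_NODES are hypothetical names, not taken from the patches.

```c
#include <assert.h>
#include <string.h>

#define NR_NODES 4  /* hypothetical upper bound on NUMA nodes */

/*
 * Split total_pages evenly across vcpus, then credit each vcpu's share
 * to the node it runs on. vcpu_node[i] is the node of vcpu i.
 */
void distribute_pages(const int *vcpu_node, int nr_vcpus,
                      unsigned long total_pages,
                      unsigned long pages_per_node[NR_NODES])
{
    assert(nr_vcpus > 0);
    memset(pages_per_node, 0, NR_NODES * sizeof(pages_per_node[0]));

    unsigned long share = total_pages / (unsigned long)nr_vcpus;
    unsigned long extra = total_pages % (unsigned long)nr_vcpus;

    for (int i = 0; i < nr_vcpus; i++) {
        assert(vcpu_node[i] >= 0 && vcpu_node[i] < NR_NODES);
        /* The first `extra` vcpus each absorb one leftover page. */
        pages_per_node[vcpu_node[i]] +=
            share + ((unsigned long)i < extra ? 1UL : 0UL);
    }
}
```

With vcpus on nodes {0, 1, 1} and 10 pages, node 0 ends up with 4 pages and node 1 with 6. If the two vcpus on node 1 are later unplugged, those 6 pages stay where they are, which is the dynamic-case problem described above.
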
The phase two plan is to provide virtual SRAT and SLIT tables to the
guests to leverage existing Linux NUMA code. Lots to discuss here.
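
As a rough illustration of what a virtual SLIT could contain: the ACPI SLIT is a square matrix of relative distances between nodes, with 10 meaning local. The sketch below derives a guest-visible matrix from the host's, given which host node backs each virtual node. This is an assumption about one possible approach, not a description of the planned design; build_virtual_slit() and vnode_to_pnode are hypothetical names.

```c
#include <assert.h>

#define SLIT_LOCAL 10  /* ACPI SLIT: distance from a node to itself */

/*
 * host_slit is nr_pnodes x nr_pnodes and vslit is nr_vnodes x nr_vnodes,
 * both row-major. vnode_to_pnode[v] is the host node backing virtual node v.
 */
void build_virtual_slit(const unsigned char *host_slit, int nr_pnodes,
                        const int *vnode_to_pnode, int nr_vnodes,
                        unsigned char *vslit)
{
    for (int i = 0; i < nr_vnodes; i++) {
        int pi = vnode_to_pnode[i];
        assert(pi >= 0 && pi < nr_pnodes);
        for (int j = 0; j < nr_vnodes; j++) {
            int pj = vnode_to_pnode[j];
            assert(pj >= 0 && pj < nr_pnodes);
            /* A virtual node is always local to itself. */
            vslit[i * nr_vnodes + j] =
                (i == j) ? SLIT_LOCAL : host_slit[pi * nr_pnodes + pj];
        }
    }
}
```

A guest whose two virtual nodes are backed by host nodes 2 and 0 would then see the host's distance between nodes 2 and 0 as the distance between its own nodes 0 and 1.
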

> If only we had an x445 to be able to work on these patches :)

=)

Thanks for the feedback.

--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253 T/L: 678-9253
ryanh@us.ibm.com