From: Konrad Rzeszutek Wilk
Subject: Re: [Hackathon minutes] PV frontends/backends and NUMA machines
Date: Tue, 21 May 2013 21:28:40 -0400
Message-ID: <20130522012840.GA5599@phenom.dumpdata.com>
To: George Dunlap
Cc: "xen-devel@lists.xensource.com", Stefano Stabellini
List-Id: xen-devel@lists.xenproject.org

On Mon, May 20, 2013 at 02:48:50PM +0100, George Dunlap wrote:
> On Mon, May 20, 2013 at 2:44 PM, Stefano Stabellini wrote:
> > Hi all,
> > these are my notes from the discussion that we had at the Hackathon
> > regarding PV frontends and backends running on NUMA machines.
> >
> > ---
> >
> > The problem: how can we make sure that frontends and backends run on
> > the same NUMA node?
> >
> > We would need to run one backend kthread per NUMA node: we already have
> > one kthread per netback vif (one per guest), so we could pin each of
> > them to a different NUMA node, the same one the frontend is running on.
> >
> > But that means that dom0 would be running on several NUMA nodes at
> > once; how much of a performance penalty would that be?
> > We would need to export NUMA information to dom0, so that dom0 can make
> > smart decisions about memory allocation, and we would also need to
> > allocate memory for dom0 from multiple nodes.
> >
> > We need a way for Xen to automatically allocate the initial dom0 memory
> > in a NUMA-aware way, and we need Xen to automatically create one dom0
> > vcpu per NUMA node.
> >
> > After dom0 boots, the toolstack decides where to place any new guests:
> > it allocates the memory from the NUMA node it wants to run the guest
> > on, and it asks dom0 to allocate the kthread on that node too (maybe by
> > writing the NUMA node to xenstore).
> >
> > We need to make sure that the interrupts/MSIs coming from the NIC
> > arrive on the same pcpu that is running the vcpu that needs to receive
> > them. If we do irqbalancing in dom0, Xen will automatically make the
> > physical MSIs follow the vcpu.
> >
> > If the card is multiqueue, we need to make sure that we use the
> > multiple queues so that we have different sources of interrupts/MSIs
> > for each vif. This allows us to notify each dom0 vcpu independently.
>
> So the work items I remember are as follows:
> 1. Implement NUMA affinity for vcpus
> 2. Implement Guest NUMA support for PV guests

Did anybody volunteer for this one?
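
For reference, something like this is what I had in mind for the kthread
pinning (completely untested sketch; set_cpus_allowed_ptr() and
cpumask_of_node() are the normal kernel APIs, while the function name and
the assumption that we already know the frontend's node are made up):

#include <linux/sched.h>
#include <linux/cpumask.h>
#include <linux/topology.h>

/* Pin a per-vif backend kthread to the pcpus of the NUMA node the
 * frontend runs on.  Falls back to all online cpus if dom0 has no
 * vcpus on that node. */
static int xenvif_bind_kthread_to_node(struct task_struct *task, int node)
{
	const struct cpumask *mask = cpumask_of_node(node);

	if (cpumask_empty(mask))
		mask = cpu_online_mask;

	return set_cpus_allowed_ptr(task, mask);
}

Netback would call this right after creating the vif's kthread, once it
knows which node the frontend's memory came from.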
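On the toolstack side, writing the chosen node into the backend's
xenstore directory could be as simple as the sketch below. Note that the
"numa-node" key does not exist today - it is just a placeholder for
whatever we agree on; xs_open()/xs_write() are the usual libxenstore
calls:

#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <xenstore.h>

/* Record the NUMA node chosen for a vif backend so that dom0 can pin
 * the matching kthread.  Path layout and key name are hypothetical. */
static int write_backend_numa_node(int domid, int vif, int node)
{
	struct xs_handle *xsh = xs_open(0);
	char path[80], val[16];
	bool ok;

	if (!xsh)
		return -1;
	snprintf(path, sizeof(path),
		 "/local/domain/0/backend/vif/%d/%d/numa-node", domid, vif);
	snprintf(val, sizeof(val), "%d", node);
	ok = xs_write(xsh, XBT_NULL, path, val, strlen(val));
	xs_close(xsh);
	return ok ? 0 : -1;
}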
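As for the interrupts: irqbalance would move the affinity around
dynamically, but the static equivalent from dom0 userspace is just a
write to /proc/irq/<n>/smp_affinity. A sketch (the mapping from vif to
irq number is hand-waved here, and this assumes fewer than 32 vcpus):

#include <stdio.h>

/* Steer one irq to a single dom0 vcpu by writing a hex cpu bitmask. */
static int set_irq_affinity(int irq, unsigned int vcpu)
{
	char path[64];
	FILE *f;

	snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", irq);
	f = fopen(path, "w");
	if (!f)
		return -1;
	fprintf(f, "%x\n", 1u << vcpu);
	return fclose(f) ? -1 : 0;
}

With a multiqueue NIC each queue has its own MSI-X vector, so each vif's
queue can get a different irq and be steered to a different dom0 vcpu
this way.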