From: Andre Przywara
Subject: Re: [vNUMA v2][PATCH 2/8] public interface
Date: Tue, 3 Aug 2010 23:21:53 +0200
Message-ID: <4C588871.8030907@amd.com>
To: Dulloor
Cc: xen-devel@lists.xensource.com

Dulloor wrote:
> On Tue, Aug 3, 2010 at 6:37 AM, Andre Przywara wrote:
>> Dulloor wrote:
>>> Interface definition. Structure that will be shared with hvmloader
>>> (with HVMs) and directly with the VMs (with PV).
>>>
>>> -dulloor
>>>
>>> Signed-off-by : Dulloor
>>>
>>> +/* vnodes are 1GB-aligned */
>>> +#define XEN_MIN_VNODE_SHIFT (30)
>> Why that? Do you mean guest memory here? Isn't that a bit restrictive?
>> What if the remaining system resources do not allow this?
>> What about a 5GB guest on 2 nodes?
>> In AMD hardware there is a minimum shift of 16MB, so I think 24 bits
>> would be better.
> Linux has stricter restrictions on the minimum vnode shift (256MB, afair).
> And I remember one of the emails from Jan Beulich where the minimum node
> size was discussed (but in another context). I will verify my facts and
> reply on this.

OK. I was just asking because I wondered how the PCI hole issue is solved
(I haven't managed to review these patches today).
256 MB looks OK to me.

>
>> +struct xen_vnode_info {
>> +    uint8_t mnode_id;   /* physical node vnode is allocated from */
>> +    uint32_t start;     /* start of the vnode range (in pages) */
>> +    uint32_t end;       /* end of the vnode range (in pages) */
>> +};
>> +
>>> +struct xen_domain_numa_info {
>>> +    uint8_t version;    /* Interface version */
>>> +    uint8_t type;       /* VM memory allocation scheme (see above) */
>>> +
>>> +    uint8_t nr_vcpus;
>> Isn't that redundant with info stored somewhere else (for instance
>> in the hvm_info table)?
> But, this being a dynamic structure, nr_vcpus and nr_vnodes determine
> the actual size of the populated structure. It's just easier to use in
> the above helper macros.

Right. That is better. My concern was how to deal with possible
inconsistencies. But the number of VCPUs shouldn't be a problem.

>>> +    uint8_t nr_vnodes;
>>> +    /* data[] has the following entries :
>>> +     * // Only (nr_vnodes) entries are filled, each sizeof(struct xen_vnode_info)
>>> +     * struct xen_vnode_info vnode_info[nr_vnodes];
>> Why would the guest need that info (physical node, start and end) here?
>> Wouldn't just the size of the node's memory be sufficient?
> I changed that from size to (start, end) on the last review. size should
> be sufficient since all nodes are contiguous. Will revert this back to
> use size.

start and end look fine at first glance, but you gain nothing by using
them if you only allow one entry per node. Take the simple example of
4GB in 2 nodes; the SRAT looks like this:
node0: 0 - 640K
node0: 1MB - 2GB
node1: 2GB - 3.5GB
node1: 4GB - 4.5GB
In my patches I did this hole-punching in hvmloader and only sent 2G/2G
via hvm_info. From an architectural point of view the Xen tools code
shouldn't deal with these internals if this can be hidden in hvmloader.

Regards,
Andre.
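
P.S.: to make the hole-punching more concrete, here is a minimal sketch
of the kind of layout code I mean. This is not the actual code from my
patches; the helper names and the 3.5GB hole base are only assumptions
for illustration. It takes nothing but a per-node size (as suggested
above) and prints the per-node ranges with the low hole (640K-1MB) and
the PCI hole below 4GB punched out:

#include <stdint.h>
#include <stdio.h>

#define LOWMEM_END      0xA0000ULL      /* 640K */
#define HIGHMEM_START   0x100000ULL     /* 1MB  */
#define PCI_HOLE_START  0xE0000000ULL   /* 3.5GB, assumed hole base */
#define PCI_HOLE_END    0x100000000ULL  /* 4GB  */

static void emit_range(int node, uint64_t start, uint64_t end)
{
    printf("node%d: %#llx - %#llx\n", node,
           (unsigned long long)start, (unsigned long long)end);
}

/* Lay out 'size' bytes for 'node' starting at *cursor, skipping holes. */
static void emit_node(int node, uint64_t size, uint64_t *cursor)
{
    while (size) {
        uint64_t start = *cursor, end;

        /* If the cursor sits inside a hole, jump past it. */
        if (start >= LOWMEM_END && start < HIGHMEM_START)
            start = HIGHMEM_START;
        if (start >= PCI_HOLE_START && start < PCI_HOLE_END)
            start = PCI_HOLE_END;

        end = start + size;
        /* Clip the range at the next hole boundary, if it crosses one. */
        if (start < LOWMEM_END && end > LOWMEM_END)
            end = LOWMEM_END;
        else if (start < PCI_HOLE_START && end > PCI_HOLE_START)
            end = PCI_HOLE_START;

        emit_range(node, start, end);
        size -= end - start;
        *cursor = end;
    }
}

int main(void)
{
    /* The 4GB / 2-node example from above: 2GB per node. */
    uint64_t node_size[2] = { 2ULL << 30, 2ULL << 30 };
    uint64_t cursor = 0;
    int i;

    for (i = 0; i < 2; i++)
        emit_node(i, node_size[i], &cursor);
    return 0;
}

For the 2G/2G case this prints essentially the four ranges listed above
(shifted up slightly because the skipped low hole is compensated), while
the tools side only ever sees the plain per-node sizes.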
--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448-3567-12