Keir, >>> Yes, but it's a bad interface, particularlty when the function is called >>> alloc_domheap_pages_on_node(). Pass in a nodeid. Write a helper function to >>> work out the nodeid from the domain*. >> I was just looking at this code, too, so I fixed this. Eventually >> alloc_heap_pages is called, which deals with nodes only, so I replaced >> cpu with node everywhere else, too. Now __alloc_domheap_pages and >> alloc_domheap_pages_on_node are almost the same (except parameter >> ordering), so I removed the first one, since the naming of the latter is >> better. Passing node numbers instead of cpu numbers needs cpu_to_node >> and asm/numa.h, if you think there is a better way, I am all ears. > > That's fine. If you reference numa stuff then you need numa.h. > > But vcpu_to_node and domain_to_node as well as cpu_to_node, please. There's > no need to be open-coding v->processor everywhere. Also in future we might > care to pick node based on v's affinity map rather than just current > processor value. And usage of d->vcpu[0] without checking for != NULL is > asking to introduce edge-case bugs. We can easily do that NULL check in one > place if we implement domain_to_node(). Ok, I did this. I provided NUMA_NO_NODE in the case d->vcpu[0] is NULL, this will be resolved to the current node in alloc_heap_pages (at least for now). By the way, can we solve the DMA_BITSIZE problem (your mail from 28th Feb) with this? If no node is specified, use the current behaviour of preferring non DMA zones, else stick to the given node. If you agree, I will implement this. > And, while I'm thinking about the interfaces, let's just stick to > alloc_domheap_page() and alloc_domheap_pages(). Let's add a flags parameter > to the former (so it matches the latter in that respect) and let's add a > MEMF_node() flag subtype (similar to MEMF_bits). Semantics will be that if > MEMF_node(node) is provided then we try to allocate memory from node; else > we try to allocate memory from a node local to specified domain; else if > domain is NULL then we ignore locality. Sounds reasonable. I changed this, too. If domain is NULL, domain_to_node will return NUMA_NO_NODE, which will eventually ignore locality (in alloc_heap_pages). > > Since zero is probably a valid numa nodeid we can define MEMF_node() as > something like ((((node)+1)&0xff)<<8). Then since NUMA_NO_NODE==0xff > everything works nicely: MEMF_node(NUMA_NO_NODE) is equivalent to not > specifying MEMF_node() at all, which is what we would logically expect. Good idea. > NUMA_NO_NODE probably needs to be pulled out of asm-x86/numa.h and made the > official arch-neutral way to specify 'don't care' for numa nodes. Is this really needed? I provided memflags=0 is all don't care cases, this should work and is more compatible. But beware that this silently assumes in page_alloc.c#alloc_domheap_pages that NUMA_NO_NODE is 0xFF, otherwise this trick will not work. Attached again a diff against my last version and the full patch (for some reason a missing bracket slipped through my last one, sorry for that). This is only quick-tested (booted and created a guest on each node). Signed-off-by: Andre Przywara Regards, Andre. -- Andre Przywara AMD-Operating System Research Center (OSRC), Dresden, Germany Tel: +49 351 277-84917 ----to satisfy European Law for business letters: AMD Saxony Limited Liability Company & Co. KG, Wilschdorfer Landstr. 101, 01109 Dresden, Germany Register Court Dresden: HRA 4896, General Partner authorized to represent: AMD Saxony LLC (Wilmington, Delaware, US) General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy