* [PATCH 0/2] NUMA/x86 related adjustments @ 2016-08-10 9:17 Jan Beulich 2016-08-10 9:23 ` [PATCH 1/2] page-alloc/x86: don't restrict DMA heap to node 0 Jan Beulich 2016-08-10 9:24 ` [PATCH 2/2] x86/NUMA: cleanup Jan Beulich 0 siblings, 2 replies; 14+ messages in thread From: Jan Beulich @ 2016-08-10 9:17 UTC (permalink / raw) To: xen-devel Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan 1: page-alloc/x86: don't restrict DMA heap to node 0 2: x86/NUMA: cleanup Signed-off-by: Jan Beulich <jbeulich@suse.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/2] page-alloc/x86: don't restrict DMA heap to node 0 2016-08-10 9:17 [PATCH 0/2] NUMA/x86 related adjustments Jan Beulich @ 2016-08-10 9:23 ` Jan Beulich 2016-08-10 9:58 ` Andrew Cooper 2016-08-11 9:53 ` [PATCH " Julien Grall 2016-08-10 9:24 ` [PATCH 2/2] x86/NUMA: cleanup Jan Beulich 1 sibling, 2 replies; 14+ messages in thread From: Jan Beulich @ 2016-08-10 9:23 UTC (permalink / raw) To: xen-devel Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall [-- Attachment #1: Type: text/plain, Size: 2506 bytes --] When node zero has no memory, the DMA bit width will end up getting set to 9, which is obviously not helpful to hold back a reasonable amount of low enough memory for Dom0 to use for DMA purposes. Find the lowest node with memory below 4Gb instead. Introduce arch_get_dma_bitsize() to keep this arch-specific logic out of common code. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/numa.c +++ b/xen/arch/x86/numa.c @@ -355,11 +355,21 @@ void __init init_cpu_to_node(void) } } -EXPORT_SYMBOL(cpu_to_node); -EXPORT_SYMBOL(node_to_cpumask); -EXPORT_SYMBOL(memnode_shift); -EXPORT_SYMBOL(memnodemap); -EXPORT_SYMBOL(node_data); +unsigned int __init arch_get_dma_bitsize(void) +{ + unsigned int node; + + for_each_online_node(node) + if ( node_spanned_pages(node) && + !(node_start_pfn(node) >> (32 - PAGE_SHIFT)) ) + break; + if ( node >= MAX_NUMNODES ) + panic("No node with memory below 4Gb"); + + return min_t(unsigned int, + flsl(node_start_pfn(node) + node_spanned_pages(node) / 4 - 1) + + PAGE_SHIFT, 32); +} static void dump_numa(unsigned char key) { --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -1368,16 +1368,7 @@ void __init end_boot_allocator(void) init_heap_pages(virt_to_page(bootmem_region_list), 1); if ( !dma_bitsize && (num_online_nodes() > 1) ) - { -#ifdef CONFIG_X86 - dma_bitsize = min_t(unsigned int, - flsl(NODE_DATA(0)->node_spanned_pages) - 1 - + PAGE_SHIFT - 2, - 32); -#else - dma_bitsize = 32; -#endif - } + dma_bitsize = arch_get_dma_bitsize(); printk("Domain heap initialised"); if ( dma_bitsize ) --- a/xen/include/asm-arm/numa.h +++ b/xen/include/asm-arm/numa.h @@ -17,6 +17,11 @@ static inline __attribute__((pure)) node #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx)) #define __node_distance(a, b) (20) +static inline unsigned int arch_get_dma_bitsize(void) +{ + return 32; +} + #endif /* __ARCH_ARM_NUMA_H */ /* * Local variables: --- a/xen/include/asm-x86/numa.h +++ b/xen/include/asm-x86/numa.h @@ -86,5 +86,6 @@ extern int valid_numa_range(u64 start, u void srat_parse_regions(u64 addr); extern u8 __node_distance(nodeid_t a, nodeid_t b); +unsigned int arch_get_dma_bitsize(void); #endif [-- Attachment #2: DMA-heap-fallback.patch --] [-- Type: text/plain, Size: 2553 bytes --] page-alloc/x86: don't restrict DMA heap to node 0 When node zero has no memory, the DMA bit width will end up getting set to 9, which is obviously not helpful to hold back a reasonable amount of low enough memory for Dom0 to use for DMA purposes. Find the lowest node with memory below 4Gb instead. Introduce arch_get_dma_bitsize() to keep this arch-specific logic out of common code. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/numa.c +++ b/xen/arch/x86/numa.c @@ -355,11 +355,21 @@ void __init init_cpu_to_node(void) } } -EXPORT_SYMBOL(cpu_to_node); -EXPORT_SYMBOL(node_to_cpumask); -EXPORT_SYMBOL(memnode_shift); -EXPORT_SYMBOL(memnodemap); -EXPORT_SYMBOL(node_data); +unsigned int __init arch_get_dma_bitsize(void) +{ + unsigned int node; + + for_each_online_node(node) + if ( node_spanned_pages(node) && + !(node_start_pfn(node) >> (32 - PAGE_SHIFT)) ) + break; + if ( node >= MAX_NUMNODES ) + panic("No node with memory below 4Gb"); + + return min_t(unsigned int, + flsl(node_start_pfn(node) + node_spanned_pages(node) / 4 - 1) + + PAGE_SHIFT, 32); +} static void dump_numa(unsigned char key) { --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -1368,16 +1368,7 @@ void __init end_boot_allocator(void) init_heap_pages(virt_to_page(bootmem_region_list), 1); if ( !dma_bitsize && (num_online_nodes() > 1) ) - { -#ifdef CONFIG_X86 - dma_bitsize = min_t(unsigned int, - flsl(NODE_DATA(0)->node_spanned_pages) - 1 - + PAGE_SHIFT - 2, - 32); -#else - dma_bitsize = 32; -#endif - } + dma_bitsize = arch_get_dma_bitsize(); printk("Domain heap initialised"); if ( dma_bitsize ) --- a/xen/include/asm-arm/numa.h +++ b/xen/include/asm-arm/numa.h @@ -17,6 +17,11 @@ static inline __attribute__((pure)) node #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx)) #define __node_distance(a, b) (20) +static inline unsigned int arch_get_dma_bitsize(void) +{ + return 32; +} + #endif /* __ARCH_ARM_NUMA_H */ /* * Local variables: --- a/xen/include/asm-x86/numa.h +++ b/xen/include/asm-x86/numa.h @@ -86,5 +86,6 @@ extern int valid_numa_range(u64 start, u void srat_parse_regions(u64 addr); extern u8 __node_distance(nodeid_t a, nodeid_t b); +unsigned int arch_get_dma_bitsize(void); #endif [-- Attachment #3: Type: text/plain, Size: 127 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] page-alloc/x86: don't restrict DMA heap to node 0 2016-08-10 9:23 ` [PATCH 1/2] page-alloc/x86: don't restrict DMA heap to node 0 Jan Beulich @ 2016-08-10 9:58 ` Andrew Cooper 2016-08-10 10:21 ` Jan Beulich 2016-08-11 9:53 ` [PATCH " Julien Grall 1 sibling, 1 reply; 14+ messages in thread From: Andrew Cooper @ 2016-08-10 9:58 UTC (permalink / raw) To: Jan Beulich, xen-devel Cc: Stefano Stabellini, Wei Liu, George Dunlap, Tim Deegan, Ian Jackson, Julien Grall On 10/08/16 10:23, Jan Beulich wrote: > --- a/xen/arch/x86/numa.c > +++ b/xen/arch/x86/numa.c > @@ -355,11 +355,21 @@ void __init init_cpu_to_node(void) > } > } > > -EXPORT_SYMBOL(cpu_to_node); > -EXPORT_SYMBOL(node_to_cpumask); > -EXPORT_SYMBOL(memnode_shift); > -EXPORT_SYMBOL(memnodemap); > -EXPORT_SYMBOL(node_data); > +unsigned int __init arch_get_dma_bitsize(void) > +{ > + unsigned int node; > + > + for_each_online_node(node) > + if ( node_spanned_pages(node) && > + !(node_start_pfn(node) >> (32 - PAGE_SHIFT)) ) > + break; > + if ( node >= MAX_NUMNODES ) > + panic("No node with memory below 4Gb"); > + > + return min_t(unsigned int, > + flsl(node_start_pfn(node) + node_spanned_pages(node) / 4 - 1) > + + PAGE_SHIFT, 32); You have moved the -1 and -2 inside the flsl() call, which alters its behaviour quite a bit. Is this intentional or accidental? ~Andrew > +} > > static void dump_numa(unsigned char key) > { > --- a/xen/common/page_alloc.c > +++ b/xen/common/page_alloc.c > @@ -1368,16 +1368,7 @@ void __init end_boot_allocator(void) > init_heap_pages(virt_to_page(bootmem_region_list), 1); > > if ( !dma_bitsize && (num_online_nodes() > 1) ) > - { > -#ifdef CONFIG_X86 > - dma_bitsize = min_t(unsigned int, > - flsl(NODE_DATA(0)->node_spanned_pages) - 1 > - + PAGE_SHIFT - 2, > - 32); > -#else > - dma_bitsize = 32; > -#endif > - } > + dma_bitsize = arch_get_dma_bitsize(); > > printk("Domain heap initialised"); > if ( dma_bitsize ) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] page-alloc/x86: don't restrict DMA heap to node 0 2016-08-10 9:58 ` Andrew Cooper @ 2016-08-10 10:21 ` Jan Beulich 2016-08-10 12:18 ` Andrew Cooper 0 siblings, 1 reply; 14+ messages in thread From: Jan Beulich @ 2016-08-10 10:21 UTC (permalink / raw) To: Andrew Cooper Cc: Stefano Stabellini, Wei Liu, George Dunlap, Tim Deegan, IanJackson, Julien Grall, xen-devel >>> On 10.08.16 at 11:58, <andrew.cooper3@citrix.com> wrote: > On 10/08/16 10:23, Jan Beulich wrote: >> --- a/xen/arch/x86/numa.c >> +++ b/xen/arch/x86/numa.c >> @@ -355,11 +355,21 @@ void __init init_cpu_to_node(void) >> } >> } >> >> -EXPORT_SYMBOL(cpu_to_node); >> -EXPORT_SYMBOL(node_to_cpumask); >> -EXPORT_SYMBOL(memnode_shift); >> -EXPORT_SYMBOL(memnodemap); >> -EXPORT_SYMBOL(node_data); >> +unsigned int __init arch_get_dma_bitsize(void) >> +{ >> + unsigned int node; >> + >> + for_each_online_node(node) >> + if ( node_spanned_pages(node) && >> + !(node_start_pfn(node) >> (32 - PAGE_SHIFT)) ) >> + break; >> + if ( node >= MAX_NUMNODES ) >> + panic("No node with memory below 4Gb"); >> + >> + return min_t(unsigned int, >> + flsl(node_start_pfn(node) + node_spanned_pages(node) / 4 - 1) >> + + PAGE_SHIFT, 32); > > You have moved the -1 and -2 inside the flsl() call, which alters its > behaviour quite a bit. Is this intentional or accidental? This is intentional, and their original placement was only not too wrong because of the effective use of zero in place of what is now node_start_pfn(node). (Obviously the division by 4 alone could have gone in either place, but the "- 1" should have been inside the flsl() even before imo.) Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] page-alloc/x86: don't restrict DMA heap to node 0 2016-08-10 10:21 ` Jan Beulich @ 2016-08-10 12:18 ` Andrew Cooper 2016-08-10 12:50 ` Jan Beulich 2016-08-11 10:44 ` [PATCH v2 " Jan Beulich 0 siblings, 2 replies; 14+ messages in thread From: Andrew Cooper @ 2016-08-10 12:18 UTC (permalink / raw) To: Jan Beulich Cc: Stefano Stabellini, Wei Liu, George Dunlap, Tim Deegan, IanJackson, Julien Grall, xen-devel On 10/08/16 11:21, Jan Beulich wrote: >>>> On 10.08.16 at 11:58, <andrew.cooper3@citrix.com> wrote: >> On 10/08/16 10:23, Jan Beulich wrote: >>> --- a/xen/arch/x86/numa.c >>> +++ b/xen/arch/x86/numa.c >>> @@ -355,11 +355,21 @@ void __init init_cpu_to_node(void) >>> } >>> } >>> >>> -EXPORT_SYMBOL(cpu_to_node); >>> -EXPORT_SYMBOL(node_to_cpumask); >>> -EXPORT_SYMBOL(memnode_shift); >>> -EXPORT_SYMBOL(memnodemap); >>> -EXPORT_SYMBOL(node_data); >>> +unsigned int __init arch_get_dma_bitsize(void) >>> +{ >>> + unsigned int node; >>> + >>> + for_each_online_node(node) >>> + if ( node_spanned_pages(node) && >>> + !(node_start_pfn(node) >> (32 - PAGE_SHIFT)) ) >>> + break; >>> + if ( node >= MAX_NUMNODES ) >>> + panic("No node with memory below 4Gb"); >>> + >>> + return min_t(unsigned int, >>> + flsl(node_start_pfn(node) + node_spanned_pages(node) / 4 - 1) >>> + + PAGE_SHIFT, 32); >> You have moved the -1 and -2 inside the flsl() call, which alters its >> behaviour quite a bit. Is this intentional or accidental? > This is intentional, and their original placement was only not too > wrong because of the effective use of zero in place of what is > now node_start_pfn(node). (Obviously the division by 4 alone > could have gone in either place, but the "- 1" should have been > inside the flsl() even before imo.) In which case it should be at least mentioned in the commit message. Finally, do you really mean to only divide the spanned pages by 4? Either way, it could do with further bracketing. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] page-alloc/x86: don't restrict DMA heap to node 0 2016-08-10 12:18 ` Andrew Cooper @ 2016-08-10 12:50 ` Jan Beulich 2016-08-11 10:44 ` [PATCH v2 " Jan Beulich 1 sibling, 0 replies; 14+ messages in thread From: Jan Beulich @ 2016-08-10 12:50 UTC (permalink / raw) To: Andrew Cooper Cc: Stefano Stabellini, Wei Liu, George Dunlap, Tim Deegan, IanJackson, Julien Grall, xen-devel >>> On 10.08.16 at 14:18, <andrew.cooper3@citrix.com> wrote: > On 10/08/16 11:21, Jan Beulich wrote: >>>>> On 10.08.16 at 11:58, <andrew.cooper3@citrix.com> wrote: >>> On 10/08/16 10:23, Jan Beulich wrote: >>>> --- a/xen/arch/x86/numa.c >>>> +++ b/xen/arch/x86/numa.c >>>> @@ -355,11 +355,21 @@ void __init init_cpu_to_node(void) >>>> } >>>> } >>>> >>>> -EXPORT_SYMBOL(cpu_to_node); >>>> -EXPORT_SYMBOL(node_to_cpumask); >>>> -EXPORT_SYMBOL(memnode_shift); >>>> -EXPORT_SYMBOL(memnodemap); >>>> -EXPORT_SYMBOL(node_data); >>>> +unsigned int __init arch_get_dma_bitsize(void) >>>> +{ >>>> + unsigned int node; >>>> + >>>> + for_each_online_node(node) >>>> + if ( node_spanned_pages(node) && >>>> + !(node_start_pfn(node) >> (32 - PAGE_SHIFT)) ) >>>> + break; >>>> + if ( node >= MAX_NUMNODES ) >>>> + panic("No node with memory below 4Gb"); >>>> + >>>> + return min_t(unsigned int, >>>> + flsl(node_start_pfn(node) + node_spanned_pages(node) / 4 - > 1) >>>> + + PAGE_SHIFT, 32); >>> You have moved the -1 and -2 inside the flsl() call, which alters its >>> behaviour quite a bit. Is this intentional or accidental? >> This is intentional, and their original placement was only not too >> wrong because of the effective use of zero in place of what is >> now node_start_pfn(node). (Obviously the division by 4 alone >> could have gone in either place, but the "- 1" should have been >> inside the flsl() even before imo.) > > In which case it should be at least mentioned in the commit message. "Also adjust the original calculation: I think the subtraction of 1 should have been part of the flsl() argument rather than getting applied to its result. And while previously the division by 4 was valid to be done on the flsl() result, this now also needs to be converted, as is should only be applied to the spanned pages value." > Finally, do you really mean to only divide the spanned pages by 4? Hmm, this question reads ambiguous to me: Do you mean the divisor should be larger than 4? I don't think so, or else the area could end up rather small. Or do you mean to divide the sum of start address and spanned pages? That would surely be wrong, as the result could end up below start address, and hence the area could continue to remain empty. > Either way, it could do with further bracketing. Bracketing of what? Even elementary school maths give precedence to multiplication and division over addition and subtraction. I know you like to parenthesize basically every binary expression that's an operand of another expression, but I think you meanwhile also know that I think this goes too far. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2 1/2] page-alloc/x86: don't restrict DMA heap to node 0 2016-08-10 12:18 ` Andrew Cooper 2016-08-10 12:50 ` Jan Beulich @ 2016-08-11 10:44 ` Jan Beulich 2016-08-11 10:45 ` Andrew Cooper 1 sibling, 1 reply; 14+ messages in thread From: Jan Beulich @ 2016-08-11 10:44 UTC (permalink / raw) To: xen-devel Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper, IanJackson, Tim Deegan, Julien Grall [-- Attachment #1: Type: text/plain, Size: 3149 bytes --] When node zero has no memory, the DMA bit width will end up getting set to 9, which is obviously not helpful to hold back a reasonable amount of low enough memory for Dom0 to use for DMA purposes. Find the lowest node with memory below 4Gb instead. Introduce arch_get_dma_bitsize() to keep this arch-specific logic out of common code. Also adjust the original calculation: I think the subtraction of 1 should have been part of the flsl() argument rather than getting applied to its result. And while previously the division by 4 was valid to be done on the flsl() result, this now also needs to be converted, as is should only be applied to the spanned pages value. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Julien Grall <julien.grall@arm.com> --- v2: Extend commit message to reason about the calculation change. Add a comment to the calculation. --- a/xen/arch/x86/numa.c +++ b/xen/arch/x86/numa.c @@ -355,11 +355,25 @@ void __init init_cpu_to_node(void) } } -EXPORT_SYMBOL(cpu_to_node); -EXPORT_SYMBOL(node_to_cpumask); -EXPORT_SYMBOL(memnode_shift); -EXPORT_SYMBOL(memnodemap); -EXPORT_SYMBOL(node_data); +unsigned int __init arch_get_dma_bitsize(void) +{ + unsigned int node; + + for_each_online_node(node) + if ( node_spanned_pages(node) && + !(node_start_pfn(node) >> (32 - PAGE_SHIFT)) ) + break; + if ( node >= MAX_NUMNODES ) + panic("No node with memory below 4Gb"); + + /* + * Try to not reserve the whole node's memory for DMA, but dividing + * its spanned pages by (arbitrarily chosen) 4. + */ + return min_t(unsigned int, + flsl(node_start_pfn(node) + node_spanned_pages(node) / 4 - 1) + + PAGE_SHIFT, 32); +} static void dump_numa(unsigned char key) { --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -1368,16 +1368,7 @@ void __init end_boot_allocator(void) init_heap_pages(virt_to_page(bootmem_region_list), 1); if ( !dma_bitsize && (num_online_nodes() > 1) ) - { -#ifdef CONFIG_X86 - dma_bitsize = min_t(unsigned int, - flsl(NODE_DATA(0)->node_spanned_pages) - 1 - + PAGE_SHIFT - 2, - 32); -#else - dma_bitsize = 32; -#endif - } + dma_bitsize = arch_get_dma_bitsize(); printk("Domain heap initialised"); if ( dma_bitsize ) --- a/xen/include/asm-arm/numa.h +++ b/xen/include/asm-arm/numa.h @@ -17,6 +17,11 @@ static inline __attribute__((pure)) node #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx)) #define __node_distance(a, b) (20) +static inline unsigned int arch_get_dma_bitsize(void) +{ + return 32; +} + #endif /* __ARCH_ARM_NUMA_H */ /* * Local variables: --- a/xen/include/asm-x86/numa.h +++ b/xen/include/asm-x86/numa.h @@ -86,5 +86,6 @@ extern int valid_numa_range(u64 start, u void srat_parse_regions(u64 addr); extern u8 __node_distance(nodeid_t a, nodeid_t b); +unsigned int arch_get_dma_bitsize(void); #endif [-- Attachment #2: DMA-heap-fallback.patch --] [-- Type: text/plain, Size: 3196 bytes --] page-alloc/x86: don't restrict DMA heap to node 0 When node zero has no memory, the DMA bit width will end up getting set to 9, which is obviously not helpful to hold back a reasonable amount of low enough memory for Dom0 to use for DMA purposes. Find the lowest node with memory below 4Gb instead. Introduce arch_get_dma_bitsize() to keep this arch-specific logic out of common code. Also adjust the original calculation: I think the subtraction of 1 should have been part of the flsl() argument rather than getting applied to its result. And while previously the division by 4 was valid to be done on the flsl() result, this now also needs to be converted, as is should only be applied to the spanned pages value. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Julien Grall <julien.grall@arm.com> --- v2: Extend commit message to reason about the calculation change. Add a comment to the calculation. --- a/xen/arch/x86/numa.c +++ b/xen/arch/x86/numa.c @@ -355,11 +355,25 @@ void __init init_cpu_to_node(void) } } -EXPORT_SYMBOL(cpu_to_node); -EXPORT_SYMBOL(node_to_cpumask); -EXPORT_SYMBOL(memnode_shift); -EXPORT_SYMBOL(memnodemap); -EXPORT_SYMBOL(node_data); +unsigned int __init arch_get_dma_bitsize(void) +{ + unsigned int node; + + for_each_online_node(node) + if ( node_spanned_pages(node) && + !(node_start_pfn(node) >> (32 - PAGE_SHIFT)) ) + break; + if ( node >= MAX_NUMNODES ) + panic("No node with memory below 4Gb"); + + /* + * Try to not reserve the whole node's memory for DMA, but dividing + * its spanned pages by (arbitrarily chosen) 4. + */ + return min_t(unsigned int, + flsl(node_start_pfn(node) + node_spanned_pages(node) / 4 - 1) + + PAGE_SHIFT, 32); +} static void dump_numa(unsigned char key) { --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -1368,16 +1368,7 @@ void __init end_boot_allocator(void) init_heap_pages(virt_to_page(bootmem_region_list), 1); if ( !dma_bitsize && (num_online_nodes() > 1) ) - { -#ifdef CONFIG_X86 - dma_bitsize = min_t(unsigned int, - flsl(NODE_DATA(0)->node_spanned_pages) - 1 - + PAGE_SHIFT - 2, - 32); -#else - dma_bitsize = 32; -#endif - } + dma_bitsize = arch_get_dma_bitsize(); printk("Domain heap initialised"); if ( dma_bitsize ) --- a/xen/include/asm-arm/numa.h +++ b/xen/include/asm-arm/numa.h @@ -17,6 +17,11 @@ static inline __attribute__((pure)) node #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx)) #define __node_distance(a, b) (20) +static inline unsigned int arch_get_dma_bitsize(void) +{ + return 32; +} + #endif /* __ARCH_ARM_NUMA_H */ /* * Local variables: --- a/xen/include/asm-x86/numa.h +++ b/xen/include/asm-x86/numa.h @@ -86,5 +86,6 @@ extern int valid_numa_range(u64 start, u void srat_parse_regions(u64 addr); extern u8 __node_distance(nodeid_t a, nodeid_t b); +unsigned int arch_get_dma_bitsize(void); #endif [-- Attachment #3: Type: text/plain, Size: 127 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/2] page-alloc/x86: don't restrict DMA heap to node 0 2016-08-11 10:44 ` [PATCH v2 " Jan Beulich @ 2016-08-11 10:45 ` Andrew Cooper 0 siblings, 0 replies; 14+ messages in thread From: Andrew Cooper @ 2016-08-11 10:45 UTC (permalink / raw) To: Jan Beulich, xen-devel Cc: Stefano Stabellini, Wei Liu, George Dunlap, Tim Deegan, IanJackson, Julien Grall On 11/08/16 11:44, Jan Beulich wrote: > When node zero has no memory, the DMA bit width will end up getting set > to 9, which is obviously not helpful to hold back a reasonable amount > of low enough memory for Dom0 to use for DMA purposes. Find the lowest > node with memory below 4Gb instead. > > Introduce arch_get_dma_bitsize() to keep this arch-specific logic out > of common code. > > Also adjust the original calculation: I think the subtraction of 1 > should have been part of the flsl() argument rather than getting > applied to its result. And while previously the division by 4 was valid > to be done on the flsl() result, this now also needs to be converted, > as is should only be applied to the spanned pages value. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > Acked-by: Julien Grall <julien.grall@arm.com> > --- > v2: Extend commit message to reason about the calculation change. Add > a comment to the calculation. Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] page-alloc/x86: don't restrict DMA heap to node 0 2016-08-10 9:23 ` [PATCH 1/2] page-alloc/x86: don't restrict DMA heap to node 0 Jan Beulich 2016-08-10 9:58 ` Andrew Cooper @ 2016-08-11 9:53 ` Julien Grall 2016-08-11 10:05 ` Jan Beulich 1 sibling, 1 reply; 14+ messages in thread From: Julien Grall @ 2016-08-11 9:53 UTC (permalink / raw) To: Jan Beulich, xen-devel Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan Hi Jan, On 10/08/2016 11:23, Jan Beulich wrote: > --- a/xen/include/asm-arm/numa.h > +++ b/xen/include/asm-arm/numa.h > @@ -17,6 +17,11 @@ static inline __attribute__((pure)) node > #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx)) > #define __node_distance(a, b) (20) > > +static inline unsigned int arch_get_dma_bitsize(void) > +{ > + return 32; > +} > + I am not sure why we return 32 here for ARM. Anyway, as it was already the case before this patch: Acked-by: Julien Grall <julien.grall@arm.com> Regards, -- Julien Grall _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] page-alloc/x86: don't restrict DMA heap to node 0 2016-08-11 9:53 ` [PATCH " Julien Grall @ 2016-08-11 10:05 ` Jan Beulich 0 siblings, 0 replies; 14+ messages in thread From: Jan Beulich @ 2016-08-11 10:05 UTC (permalink / raw) To: Julien Grall Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel >>> On 11.08.16 at 11:53, <julien.grall@arm.com> wrote: > Hi Jan, > > On 10/08/2016 11:23, Jan Beulich wrote: >> --- a/xen/include/asm-arm/numa.h >> +++ b/xen/include/asm-arm/numa.h >> @@ -17,6 +17,11 @@ static inline __attribute__((pure)) node >> #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx)) >> #define __node_distance(a, b) (20) >> >> +static inline unsigned int arch_get_dma_bitsize(void) >> +{ >> + return 32; >> +} >> + > > I am not sure why we return 32 here for ARM. Anyway, as it was already > the case before this patch: In fact I think we could get away without that inline function for the time being: Since NODES_SHIFT is undefined on ARM, the call out of page_alloc.c should get elided by the compiler anyway, so a possible alternative would be to just have a declaration of the function on ARM without actual implementation. > Acked-by: Julien Grall <julien.grall@arm.com> Thanks. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 2/2] x86/NUMA: cleanup 2016-08-10 9:17 [PATCH 0/2] NUMA/x86 related adjustments Jan Beulich 2016-08-10 9:23 ` [PATCH 1/2] page-alloc/x86: don't restrict DMA heap to node 0 Jan Beulich @ 2016-08-10 9:24 ` Jan Beulich 2016-08-10 10:35 ` Andrew Cooper 1 sibling, 1 reply; 14+ messages in thread From: Jan Beulich @ 2016-08-10 9:24 UTC (permalink / raw) To: xen-devel; +Cc: George Dunlap, Andrew Cooper [-- Attachment #1: Type: text/plain, Size: 4010 bytes --] - drop the only left CONFIG_NUMA conditional (this is always true) - drop struct node_data's node_id field (being always equal to the node_data[] array index used) - don't open code node_{start,end}_pfn() nor node_spanned_pages() except when used as lvalues (those could be converted too, but this seems a little awkward) - no longer open code pfn_to_paddr() in an expression being modified anyway - make dump less verbose by logging actual vs intended node IDs only when they don't match Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/numa.c +++ b/xen/arch/x86/numa.c @@ -179,7 +179,6 @@ void __init setup_node_bootmem(nodeid_t start_pfn = start >> PAGE_SHIFT; end_pfn = end >> PAGE_SHIFT; - NODE_DATA(nodeid)->node_id = nodeid; NODE_DATA(nodeid)->node_start_pfn = start_pfn; NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn; @@ -386,16 +385,15 @@ static void dump_numa(unsigned char key) for_each_online_node ( i ) { - paddr_t pa = (paddr_t)(NODE_DATA(i)->node_start_pfn + 1)<< PAGE_SHIFT; - printk("idx%d -> NODE%d start->%lu size->%lu free->%lu\n", - i, NODE_DATA(i)->node_id, - NODE_DATA(i)->node_start_pfn, - NODE_DATA(i)->node_spanned_pages, + paddr_t pa = pfn_to_paddr(node_start_pfn(i) + 1); + + printk("NODE%u start->%lu size->%lu free->%lu\n", + i, node_start_pfn(i), node_spanned_pages(i), avail_node_heap_pages(i)); /* sanity check phys_to_nid() */ - printk("phys_to_nid(%"PRIpaddr") -> %d should be %d\n", pa, - phys_to_nid(pa), - NODE_DATA(i)->node_id); + if ( phys_to_nid(pa) != i ) + printk("phys_to_nid(%"PRIpaddr") -> %d should be %u\n", + pa, phys_to_nid(pa), i); } j = cpumask_first(&cpu_online_map); --- a/xen/arch/x86/x86_64/mm.c +++ b/xen/arch/x86/x86_64/mm.c @@ -1386,21 +1386,21 @@ int memory_add(unsigned long spfn, unsig goto destroy_directmap; } - old_node_start = NODE_DATA(node)->node_start_pfn; - old_node_span = NODE_DATA(node)->node_spanned_pages; + old_node_start = node_start_pfn(node); + old_node_span = node_spanned_pages(node); orig_online = node_online(node); if ( !orig_online ) { dprintk(XENLOG_WARNING, "node %x pxm %x is not online\n",node, pxm); - NODE_DATA(node)->node_id = node; NODE_DATA(node)->node_start_pfn = spfn; NODE_DATA(node)->node_spanned_pages = epfn - node_start_pfn(node); node_set_online(node); - }else + } + else { - if (NODE_DATA(node)->node_start_pfn > spfn) + if (node_start_pfn(node) > spfn) NODE_DATA(node)->node_start_pfn = spfn; if (node_end_pfn(node) < epfn) NODE_DATA(node)->node_spanned_pages = epfn - node_start_pfn(node); --- a/xen/include/asm-x86/numa.h +++ b/xen/include/asm-x86/numa.h @@ -40,7 +40,6 @@ extern void srat_detect_node(int cpu); extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end); extern nodeid_t apicid_to_node[]; -#ifdef CONFIG_NUMA extern void init_cpu_to_node(void); static inline void clear_node_cpumask(int cpu) @@ -56,7 +55,6 @@ extern u8 *memnodemap; struct node_data { unsigned long node_start_pfn; unsigned long node_spanned_pages; - nodeid_t node_id; }; extern struct node_data node_data[]; @@ -78,11 +76,6 @@ static inline __attribute__((pure)) node NODE_DATA(nid)->node_spanned_pages) extern int valid_numa_range(u64 start, u64 end, nodeid_t node); -#else -#define init_cpu_to_node() do {} while (0) -#define clear_node_cpumask(cpu) do {} while (0) -#define valid_numa_range(start, end, node) 1 -#endif void srat_parse_regions(u64 addr); extern u8 __node_distance(nodeid_t a, nodeid_t b); [-- Attachment #2: x86-NUMA-cleanup.patch --] [-- Type: text/plain, Size: 4025 bytes --] x86/NUMA: cleanup - drop the only left CONFIG_NUMA conditional (this is always true) - drop struct node_data's node_id field (being always equal to the node_data[] array index used) - don't open code node_{start,end}_pfn() nor node_spanned_pages() except when used as lvalues (those could be converted too, but this seems a little awkward) - no longer open code pfn_to_paddr() in an expression being modified anyway - make dump less verbose by logging actual vs intended node IDs only when they don't match Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/numa.c +++ b/xen/arch/x86/numa.c @@ -179,7 +179,6 @@ void __init setup_node_bootmem(nodeid_t start_pfn = start >> PAGE_SHIFT; end_pfn = end >> PAGE_SHIFT; - NODE_DATA(nodeid)->node_id = nodeid; NODE_DATA(nodeid)->node_start_pfn = start_pfn; NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn; @@ -386,16 +385,15 @@ static void dump_numa(unsigned char key) for_each_online_node ( i ) { - paddr_t pa = (paddr_t)(NODE_DATA(i)->node_start_pfn + 1)<< PAGE_SHIFT; - printk("idx%d -> NODE%d start->%lu size->%lu free->%lu\n", - i, NODE_DATA(i)->node_id, - NODE_DATA(i)->node_start_pfn, - NODE_DATA(i)->node_spanned_pages, + paddr_t pa = pfn_to_paddr(node_start_pfn(i) + 1); + + printk("NODE%u start->%lu size->%lu free->%lu\n", + i, node_start_pfn(i), node_spanned_pages(i), avail_node_heap_pages(i)); /* sanity check phys_to_nid() */ - printk("phys_to_nid(%"PRIpaddr") -> %d should be %d\n", pa, - phys_to_nid(pa), - NODE_DATA(i)->node_id); + if ( phys_to_nid(pa) != i ) + printk("phys_to_nid(%"PRIpaddr") -> %d should be %u\n", + pa, phys_to_nid(pa), i); } j = cpumask_first(&cpu_online_map); --- a/xen/arch/x86/x86_64/mm.c +++ b/xen/arch/x86/x86_64/mm.c @@ -1386,21 +1386,21 @@ int memory_add(unsigned long spfn, unsig goto destroy_directmap; } - old_node_start = NODE_DATA(node)->node_start_pfn; - old_node_span = NODE_DATA(node)->node_spanned_pages; + old_node_start = node_start_pfn(node); + old_node_span = node_spanned_pages(node); orig_online = node_online(node); if ( !orig_online ) { dprintk(XENLOG_WARNING, "node %x pxm %x is not online\n",node, pxm); - NODE_DATA(node)->node_id = node; NODE_DATA(node)->node_start_pfn = spfn; NODE_DATA(node)->node_spanned_pages = epfn - node_start_pfn(node); node_set_online(node); - }else + } + else { - if (NODE_DATA(node)->node_start_pfn > spfn) + if (node_start_pfn(node) > spfn) NODE_DATA(node)->node_start_pfn = spfn; if (node_end_pfn(node) < epfn) NODE_DATA(node)->node_spanned_pages = epfn - node_start_pfn(node); --- a/xen/include/asm-x86/numa.h +++ b/xen/include/asm-x86/numa.h @@ -40,7 +40,6 @@ extern void srat_detect_node(int cpu); extern void setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end); extern nodeid_t apicid_to_node[]; -#ifdef CONFIG_NUMA extern void init_cpu_to_node(void); static inline void clear_node_cpumask(int cpu) @@ -56,7 +55,6 @@ extern u8 *memnodemap; struct node_data { unsigned long node_start_pfn; unsigned long node_spanned_pages; - nodeid_t node_id; }; extern struct node_data node_data[]; @@ -78,11 +76,6 @@ static inline __attribute__((pure)) node NODE_DATA(nid)->node_spanned_pages) extern int valid_numa_range(u64 start, u64 end, nodeid_t node); -#else -#define init_cpu_to_node() do {} while (0) -#define clear_node_cpumask(cpu) do {} while (0) -#define valid_numa_range(start, end, node) 1 -#endif void srat_parse_regions(u64 addr); extern u8 __node_distance(nodeid_t a, nodeid_t b); [-- Attachment #3: Type: text/plain, Size: 127 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] x86/NUMA: cleanup 2016-08-10 9:24 ` [PATCH 2/2] x86/NUMA: cleanup Jan Beulich @ 2016-08-10 10:35 ` Andrew Cooper 2016-08-10 11:47 ` Jan Beulich 0 siblings, 1 reply; 14+ messages in thread From: Andrew Cooper @ 2016-08-10 10:35 UTC (permalink / raw) To: Jan Beulich, xen-devel; +Cc: George Dunlap On 10/08/16 10:24, Jan Beulich wrote: > - drop the only left CONFIG_NUMA conditional (this is always true) I observe that CONFIG_NUMA_EMU is also unconditionally true, which offers further cleanup opportunities (albeit it probably a separate patch). > - drop struct node_data's node_id field (being always equal to the > node_data[] array index used) > - don't open code node_{start,end}_pfn() nor node_spanned_pages() > except when used as lvalues (those could be converted too, but this > seems a little awkward) > - no longer open code pfn_to_paddr() in an expression being modified > anyway > - make dump less verbose by logging actual vs intended node IDs only > when they don't match > > Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] x86/NUMA: cleanup 2016-08-10 10:35 ` Andrew Cooper @ 2016-08-10 11:47 ` Jan Beulich 2016-08-16 10:02 ` Dario Faggioli 0 siblings, 1 reply; 14+ messages in thread From: Jan Beulich @ 2016-08-10 11:47 UTC (permalink / raw) To: Andrew Cooper, xen-devel; +Cc: George Dunlap >>> On 10.08.16 at 12:35, <andrew.cooper3@citrix.com> wrote: > On 10/08/16 10:24, Jan Beulich wrote: >> - drop the only left CONFIG_NUMA conditional (this is always true) > > I observe that CONFIG_NUMA_EMU is also unconditionally true, which > offers further cleanup opportunities (albeit it probably a separate patch). So I thought, but then wasn't sure eliminating just the conditionals it would actually be the right route - the alternative would be to instead just drop that NUMA emulation code altogether, as no-one should be using it. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] x86/NUMA: cleanup 2016-08-10 11:47 ` Jan Beulich @ 2016-08-16 10:02 ` Dario Faggioli 0 siblings, 0 replies; 14+ messages in thread From: Dario Faggioli @ 2016-08-16 10:02 UTC (permalink / raw) To: Jan Beulich, Andrew Cooper, xen-devel; +Cc: George Dunlap [-- Attachment #1.1: Type: text/plain, Size: 1275 bytes --] On Wed, 2016-08-10 at 05:47 -0600, Jan Beulich wrote: > > On 10.08.16 at 12:35, <andrew.cooper3@citrix.com> wrote: > > I observe that CONFIG_NUMA_EMU is also unconditionally true, which > > offers further cleanup opportunities (albeit it probably a separate > > patch). > So I thought, but then wasn't sure eliminating just the conditionals > it would actually be the right route - the alternative would be to > instead just drop that NUMA emulation code altogether, as no-one > should be using it. > Last time I tried to use it, it indeed was *not* useful at all... and looking at it, it seems that nothing changed since then. AFAICT, it's something imported from Linux quite a few time back, but while Linux evolved its own NUMA_EMU support to something that may actually be useful, ours stayed original. So, for what the opinion of someone that tried to use it is worth, I agree that we either also improve it (e.g., by trying syncing with the Linux one), or kill it. Regards, Dairo -- <<This happens because I choose it to happen!>> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) [-- Attachment #1.2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 819 bytes --] [-- Attachment #2: Type: text/plain, Size: 127 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2016-08-16 10:02 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-08-10 9:17 [PATCH 0/2] NUMA/x86 related adjustments Jan Beulich 2016-08-10 9:23 ` [PATCH 1/2] page-alloc/x86: don't restrict DMA heap to node 0 Jan Beulich 2016-08-10 9:58 ` Andrew Cooper 2016-08-10 10:21 ` Jan Beulich 2016-08-10 12:18 ` Andrew Cooper 2016-08-10 12:50 ` Jan Beulich 2016-08-11 10:44 ` [PATCH v2 " Jan Beulich 2016-08-11 10:45 ` Andrew Cooper 2016-08-11 9:53 ` [PATCH " Julien Grall 2016-08-11 10:05 ` Jan Beulich 2016-08-10 9:24 ` [PATCH 2/2] x86/NUMA: cleanup Jan Beulich 2016-08-10 10:35 ` Andrew Cooper 2016-08-10 11:47 ` Jan Beulich 2016-08-16 10:02 ` Dario Faggioli
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).