linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH v6 4/9] mm: thp: Introduce anon_orders and anon_always_mask sysfs files
       [not found] ` <20230929114421.3761121-5-ryan.roberts@arm.com>
@ 2023-09-29 22:55   ` Andrew Morton
  2023-10-02 10:15     ` Ryan Roberts
  2023-10-07 22:54     ` Michael Ellerman
  0 siblings, 2 replies; 6+ messages in thread
From: Andrew Morton @ 2023-09-29 22:55 UTC (permalink / raw)
  To: Ryan Roberts
  Cc: linux-arm-kernel, David Rientjes, Yu Zhao, John Hubbard,
	Anshuman Khandual, Catalin Marinas, Yang Shi, David Hildenbrand,
	Hugh Dickins, Yin Fengwei, Matthew Wilcox, linux-kernel, linux-mm,
	Luis Chamberlain, Vlastimil Babka, Zi Yan, Huang, Ying,
	Itaru Kitayama, linuxppc-dev, Kirill A. Shutemov

On Fri, 29 Sep 2023 12:44:15 +0100 Ryan Roberts <ryan.roberts@arm.com> wrote:

> In preparation for adding support for anonymous large folios that are
> smaller than the PMD-size, introduce 2 new sysfs files that will be used
> to control the new behaviours via the transparent_hugepage interface.
> For now, the kernel still only supports PMD-order anonymous THP, so when
> reading back anon_orders, it will reflect that. Therefore there are no
> behavioural changes intended here.

powerpc strikes again.  ARCH=powerpc allmodconfig:


In file included from ./include/linux/bits.h:6,
                 from ./include/linux/ratelimit_types.h:5,
                 from ./include/linux/printk.h:9,
                 from ./include/asm-generic/bug.h:22,
                 from ./arch/powerpc/include/asm/bug.h:116,
                 from ./include/linux/bug.h:5,
                 from ./include/linux/mmdebug.h:5,
                 from ./include/linux/mm.h:6,
                 from mm/huge_memory.c:8:
./include/vdso/bits.h:7:33: error: initializer element is not constant
    7 | #define BIT(nr)                 (UL(1) << (nr))
      |                                 ^
mm/huge_memory.c:77:47: note: in expansion of macro 'BIT'
   77 | unsigned int huge_anon_orders __read_mostly = BIT(PMD_ORDER);
      |                                               ^~~

We keep tripping over this.  I wish there was a way to fix it.



Style whine: an all-caps identifier is supposed to be a constant,
dammit.

	#define PTE_INDEX_SIZE  __pte_index_size

Nope.



I did this:

--- a/mm/huge_memory.c~mm-thp-introduce-anon_orders-and-anon_always_mask-sysfs-files-fix
+++ a/mm/huge_memory.c
@@ -74,7 +74,7 @@ static unsigned long deferred_split_scan
 static atomic_t huge_zero_refcount;
 struct page *huge_zero_page __read_mostly;
 unsigned long huge_zero_pfn __read_mostly = ~0UL;
-unsigned int huge_anon_orders __read_mostly = BIT(PMD_ORDER);
+unsigned int huge_anon_orders __read_mostly;
 static unsigned int huge_anon_always_mask __read_mostly;
 
 /**
@@ -528,6 +528,9 @@ static int __init hugepage_init_sysfs(st
 {
 	int err;
 
+	/* powerpc's PMD_ORDER isn't a compile-time constant */
+	huge_anon_orders = BIT(PMD_ORDER);
+
 	*hugepage_kobj = kobject_create_and_add("transparent_hugepage", mm_kobj);
 	if (unlikely(!*hugepage_kobj)) {
 		pr_err("failed to create transparent hugepage kobject\n");
_


I assume this is set up early enough.

I don't know why powerpc's PTE_INDEX_SIZE is variable.  Hopefully it
has been set up by this time and it won't get altered.  


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6 4/9] mm: thp: Introduce anon_orders and anon_always_mask sysfs files
  2023-09-29 22:55   ` [PATCH v6 4/9] mm: thp: Introduce anon_orders and anon_always_mask sysfs files Andrew Morton
@ 2023-10-02 10:15     ` Ryan Roberts
  2023-10-07 22:54     ` Michael Ellerman
  1 sibling, 0 replies; 6+ messages in thread
From: Ryan Roberts @ 2023-10-02 10:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-arm-kernel, David Rientjes, Yu Zhao, John Hubbard,
	Anshuman Khandual, Catalin Marinas, Yang Shi, David Hildenbrand,
	Hugh Dickins, Yin Fengwei, Matthew Wilcox, linux-kernel, linux-mm,
	Luis Chamberlain, Vlastimil Babka, Zi Yan, Huang, Ying,
	Itaru Kitayama, linuxppc-dev, Kirill A. Shutemov

On 29/09/2023 23:55, Andrew Morton wrote:
> On Fri, 29 Sep 2023 12:44:15 +0100 Ryan Roberts <ryan.roberts@arm.com> wrote:
> 
>> In preparation for adding support for anonymous large folios that are
>> smaller than the PMD-size, introduce 2 new sysfs files that will be used
>> to control the new behaviours via the transparent_hugepage interface.
>> For now, the kernel still only supports PMD-order anonymous THP, so when
>> reading back anon_orders, it will reflect that. Therefore there are no
>> behavioural changes intended here.
> 
> powerpc strikes again.  ARCH=powerpc allmodconfig:
> 
> 
> In file included from ./include/linux/bits.h:6,
>                  from ./include/linux/ratelimit_types.h:5,
>                  from ./include/linux/printk.h:9,
>                  from ./include/asm-generic/bug.h:22,
>                  from ./arch/powerpc/include/asm/bug.h:116,
>                  from ./include/linux/bug.h:5,
>                  from ./include/linux/mmdebug.h:5,
>                  from ./include/linux/mm.h:6,
>                  from mm/huge_memory.c:8:
> ./include/vdso/bits.h:7:33: error: initializer element is not constant
>     7 | #define BIT(nr)                 (UL(1) << (nr))
>       |                                 ^
> mm/huge_memory.c:77:47: note: in expansion of macro 'BIT'
>    77 | unsigned int huge_anon_orders __read_mostly = BIT(PMD_ORDER);
>       |                                               ^~~

Ahh my bad, sorry about that - I built various configs and arches but not powerpc.

> 
> We keep tripping over this.  I wish there was a way to fix it.
> 
> 
> 
> Style whine: an all-caps identifier is supposed to be a constant,
> dammit.
> 
> 	#define PTE_INDEX_SIZE  __pte_index_size
> 
> Nope.
> 
> 
> 
> I did this:
> 
> --- a/mm/huge_memory.c~mm-thp-introduce-anon_orders-and-anon_always_mask-sysfs-files-fix
> +++ a/mm/huge_memory.c
> @@ -74,7 +74,7 @@ static unsigned long deferred_split_scan
>  static atomic_t huge_zero_refcount;
>  struct page *huge_zero_page __read_mostly;
>  unsigned long huge_zero_pfn __read_mostly = ~0UL;
> -unsigned int huge_anon_orders __read_mostly = BIT(PMD_ORDER);
> +unsigned int huge_anon_orders __read_mostly;
>  static unsigned int huge_anon_always_mask __read_mostly;
>  
>  /**
> @@ -528,6 +528,9 @@ static int __init hugepage_init_sysfs(st
>  {
>  	int err;
>  
> +	/* powerpc's PMD_ORDER isn't a compile-time constant */
> +	huge_anon_orders = BIT(PMD_ORDER);
> +
>  	*hugepage_kobj = kobject_create_and_add("transparent_hugepage", mm_kobj);
>  	if (unlikely(!*hugepage_kobj)) {
>  		pr_err("failed to create transparent hugepage kobject\n");
> _
> 
> 
> I assume this is set up early enough.

Yes this should be fine.

> 
> I don't know why powerpc's PTE_INDEX_SIZE is variable.  Hopefully it
> has been set up by this time and it won't get altered.  

Looks that way from the code; its set during early_init_mmu().

Anyway, I'll take the fix into my next spin if I need to do one. I see you've
taken it into mm-unstable - thanks! But given I'm introducing UABI, I was
expecting some comments and a probably need for a new rev. I'd like to think we
are getting there though.

Thanks,
Ryan


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6 4/9] mm: thp: Introduce anon_orders and anon_always_mask sysfs files
  2023-09-29 22:55   ` [PATCH v6 4/9] mm: thp: Introduce anon_orders and anon_always_mask sysfs files Andrew Morton
  2023-10-02 10:15     ` Ryan Roberts
@ 2023-10-07 22:54     ` Michael Ellerman
  2023-10-10  0:20       ` Andrew Morton
  1 sibling, 1 reply; 6+ messages in thread
From: Michael Ellerman @ 2023-10-07 22:54 UTC (permalink / raw)
  To: Andrew Morton, Ryan Roberts, Aneesh Kumar K.V
  Cc: linux-arm-kernel, David Rientjes, Yu Zhao, John Hubbard,
	Anshuman Khandual, Catalin Marinas, Yang Shi, David Hildenbrand,
	Hugh Dickins, Yin Fengwei, Matthew Wilcox, linux-kernel, linux-mm,
	Luis Chamberlain, Vlastimil Babka, Zi Yan, Huang, Ying,
	Itaru Kitayama, linuxppc-dev, Kirill A. Shutemov

Andrew Morton <akpm@linux-foundation.org> writes:
> On Fri, 29 Sep 2023 12:44:15 +0100 Ryan Roberts <ryan.roberts@arm.com> wrote:
>
>> In preparation for adding support for anonymous large folios that are
>> smaller than the PMD-size, introduce 2 new sysfs files that will be used
>> to control the new behaviours via the transparent_hugepage interface.
>> For now, the kernel still only supports PMD-order anonymous THP, so when
>> reading back anon_orders, it will reflect that. Therefore there are no
>> behavioural changes intended here.
>
> powerpc strikes again.  ARCH=powerpc allmodconfig:
>
>
> In file included from ./include/linux/bits.h:6,
>                  from ./include/linux/ratelimit_types.h:5,
>                  from ./include/linux/printk.h:9,
>                  from ./include/asm-generic/bug.h:22,
>                  from ./arch/powerpc/include/asm/bug.h:116,
>                  from ./include/linux/bug.h:5,
>                  from ./include/linux/mmdebug.h:5,
>                  from ./include/linux/mm.h:6,
>                  from mm/huge_memory.c:8:
> ./include/vdso/bits.h:7:33: error: initializer element is not constant
>     7 | #define BIT(nr)                 (UL(1) << (nr))
>       |                                 ^
> mm/huge_memory.c:77:47: note: in expansion of macro 'BIT'
>    77 | unsigned int huge_anon_orders __read_mostly = BIT(PMD_ORDER);
>       |                                               ^~~
>
> We keep tripping over this.  I wish there was a way to fix it.

I can't think of any solution, other than ripping the code out.

To catch it earlier we'd need a generic compile-time test that all values
derived from the page table geometry are only used in places that don't
require a constant. I can't think of a way to write a test for that.

Or submitters could compile-test for powerpc - one can dream :D

> Style whine: an all-caps identifier is supposed to be a constant,
> dammit.
>
> 	#define PTE_INDEX_SIZE  __pte_index_size
>
> Nope.

I agree it's ugly. It was done that way because PTE_INDEX_SIZE used to
be constant, and still is for 32-bit PPC and 64-bit Book3E PPC.

We could rename PTE_INDEX_SIZE itself, but we'd still have eg.
PTE_TABLE_SIZE which is used in generic code, and which would be
sometimes constant and sometimes not for different powerpc subarches.

> I did this:
>
> --- a/mm/huge_memory.c~mm-thp-introduce-anon_orders-and-anon_always_mask-sysfs-files-fix
> +++ a/mm/huge_memory.c
> @@ -74,7 +74,7 @@ static unsigned long deferred_split_scan
>  static atomic_t huge_zero_refcount;
>  struct page *huge_zero_page __read_mostly;
>  unsigned long huge_zero_pfn __read_mostly = ~0UL;
> -unsigned int huge_anon_orders __read_mostly = BIT(PMD_ORDER);
> +unsigned int huge_anon_orders __read_mostly;
>  static unsigned int huge_anon_always_mask __read_mostly;
>  
>  /**
> @@ -528,6 +528,9 @@ static int __init hugepage_init_sysfs(st
>  {
>  	int err;
>  
> +	/* powerpc's PMD_ORDER isn't a compile-time constant */
> +	huge_anon_orders = BIT(PMD_ORDER);
> +
>  	*hugepage_kobj = kobject_create_and_add("transparent_hugepage", mm_kobj);
>  	if (unlikely(!*hugepage_kobj)) {
>  		pr_err("failed to create transparent hugepage kobject\n");
> _
>
>
> I assume this is set up early enough.

Yes it should be.

> I don't know why powerpc's PTE_INDEX_SIZE is variable.

To allow a single vmlinux to boot using either the Hashed Page Table
MMU, or Radix Tree MMU, which have different page table geometry.

That's a pretty crucial feature for distros, so that they can build a
single kernel to boot on Power8/9/10.

cheers

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6 4/9] mm: thp: Introduce anon_orders and anon_always_mask sysfs files
  2023-10-07 22:54     ` Michael Ellerman
@ 2023-10-10  0:20       ` Andrew Morton
  2023-10-12  9:31         ` David Hildenbrand
  2023-10-12 11:07         ` Michael Ellerman
  0 siblings, 2 replies; 6+ messages in thread
From: Andrew Morton @ 2023-10-10  0:20 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: David Hildenbrand, Catalin Marinas, Yang Shi, linux-kernel,
	linux-mm, Yu Zhao, Aneesh Kumar K.V, Hugh Dickins, Matthew Wilcox,
	Vlastimil Babka, Zi Yan, Huang, Ying, Ryan Roberts,
	Anshuman Khandual, John Hubbard, David Rientjes, Itaru Kitayama,
	linux-arm-kernel, Yin Fengwei, Luis Chamberlain, linuxppc-dev,
	Kirill A. Shutemov

On Sun, 08 Oct 2023 09:54:22 +1100 Michael Ellerman <mpe@ellerman.id.au> wrote:

> > I don't know why powerpc's PTE_INDEX_SIZE is variable.
> 
> To allow a single vmlinux to boot using either the Hashed Page Table
> MMU, or Radix Tree MMU, which have different page table geometry.
> 
> That's a pretty crucial feature for distros, so that they can build a
> single kernel to boot on Power8/9/10.

Dumb question: why can't distros ship two kernels and have the boot
loader (or something else) pick the appropriate one?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6 4/9] mm: thp: Introduce anon_orders and anon_always_mask sysfs files
  2023-10-10  0:20       ` Andrew Morton
@ 2023-10-12  9:31         ` David Hildenbrand
  2023-10-12 11:07         ` Michael Ellerman
  1 sibling, 0 replies; 6+ messages in thread
From: David Hildenbrand @ 2023-10-12  9:31 UTC (permalink / raw)
  To: Andrew Morton, Michael Ellerman
  Cc: linux-arm-kernel, David Rientjes, Ryan Roberts, Yu Zhao,
	John Hubbard, Anshuman Khandual, Catalin Marinas, Yang Shi,
	Huang, Ying, Hugh Dickins, Yin Fengwei, Matthew Wilcox,
	linux-kernel, linux-mm, Luis Chamberlain, Vlastimil Babka, Zi Yan,
	Aneesh Kumar K.V, Itaru Kitayama, linuxppc-dev,
	Kirill A. Shutemov

On 10.10.23 02:20, Andrew Morton wrote:
> On Sun, 08 Oct 2023 09:54:22 +1100 Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
>>> I don't know why powerpc's PTE_INDEX_SIZE is variable.
>>
>> To allow a single vmlinux to boot using either the Hashed Page Table
>> MMU, or Radix Tree MMU, which have different page table geometry.
>>
>> That's a pretty crucial feature for distros, so that they can build a
>> single kernel to boot on Power8/9/10.
> 
> Dumb question: why can't distros ship two kernels and have the boot
> loader (or something else) pick the appropriate one?

One answer I keep hearing over and over again is "test matrix 
explosion". So distros only do it when unavoidable: for example, when 
differing PAGE_SIZE is required (e.g., 4k vs 64k) or we're dealing with 
RT support (RT vs !RT).

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6 4/9] mm: thp: Introduce anon_orders and anon_always_mask sysfs files
  2023-10-10  0:20       ` Andrew Morton
  2023-10-12  9:31         ` David Hildenbrand
@ 2023-10-12 11:07         ` Michael Ellerman
  1 sibling, 0 replies; 6+ messages in thread
From: Michael Ellerman @ 2023-10-12 11:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Catalin Marinas, Yang Shi, linux-kernel,
	linux-mm, Yu Zhao, Aneesh Kumar K.V, Hugh Dickins, Matthew Wilcox,
	Vlastimil Babka, Zi Yan, Huang, Ying, Ryan Roberts,
	Anshuman Khandual, John Hubbard, David Rientjes, Itaru Kitayama,
	linux-arm-kernel, Yin Fengwei, Luis Chamberlain, linuxppc-dev,
	Kirill A. Shutemov

Andrew Morton <akpm@linux-foundation.org> writes:
> On Sun, 08 Oct 2023 09:54:22 +1100 Michael Ellerman <mpe@ellerman.id.au> wrote:
>
>> > I don't know why powerpc's PTE_INDEX_SIZE is variable.
>> 
>> To allow a single vmlinux to boot using either the Hashed Page Table
>> MMU, or Radix Tree MMU, which have different page table geometry.
>> 
>> That's a pretty crucial feature for distros, so that they can build a
>> single kernel to boot on Power8/9/10.
>
> Dumb question: why can't distros ship two kernels and have the boot
> loader (or something else) pick the appropriate one?

I'm not a grub expert, but AFAIK it doesn't support loading a different
kernel based on CPU/firwmare features. I'm quite sure it can't do that
on powerpc at least.

We also have another bootloader (petitboot) that is still supported by
some distros, and can't do that.

The other problem is like David says, distros are generally reluctant to
add new kernel configurations unless they absolutely have to. It adds
more work for them, more things to track, and can confuse users leading
to spurious bug reports.

cheers

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-10-12 11:08 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20230929114421.3761121-1-ryan.roberts@arm.com>
     [not found] ` <20230929114421.3761121-5-ryan.roberts@arm.com>
2023-09-29 22:55   ` [PATCH v6 4/9] mm: thp: Introduce anon_orders and anon_always_mask sysfs files Andrew Morton
2023-10-02 10:15     ` Ryan Roberts
2023-10-07 22:54     ` Michael Ellerman
2023-10-10  0:20       ` Andrew Morton
2023-10-12  9:31         ` David Hildenbrand
2023-10-12 11:07         ` Michael Ellerman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).