linuxppc-dev.lists.ozlabs.org archive mirror
* Re: linux-next: boot warning after merge of the vfs-brauner tree
       [not found] <20240826175931.1989f99e@canb.auug.org.au>
@ 2024-08-26 15:48 ` Pankaj Raghav (Samsung)
  2024-08-26 17:43   ` Christophe Leroy
  2024-08-27  6:28   ` Michael Ellerman
  0 siblings, 2 replies; 8+ messages in thread
From: Pankaj Raghav (Samsung) @ 2024-08-26 15:48 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Christian Brauner, Luis Chamberlain, Pankaj Raghav,
	Linux Kernel Mailing List, Linux Next Mailing List, djwong,
	ritesh.list, linuxppc-dev, christophe.leroy

On Mon, Aug 26, 2024 at 05:59:31PM +1000, Stephen Rothwell wrote:
> Hi all,
> 
> After merging the vfs-brauner tree, today's linux-next boot test (powerpc
> pseries_le_defconfig) produced this warning:

iomap dio calls set_memory_ro() on the page that is used for sub block
zeroing.

But looking at powerpc code, they don't support set_memory_ro() for
memory region that belongs to the kernel(LINEAR_MAP_REGION_ID).

/*
 * On hash, the linear mapping is not in the Linux page table so
 * apply_to_existing_page_range() will have no effect. If in the future
 * the set_memory_* functions are used on the linear map this will need
 * to be updated.
 */
if (!radix_enabled()) {
        int region = get_region_id(addr);

        if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
                return -EINVAL;
}
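
As a rough illustration of what that check does, here is a userspace model
(not kernel code). The enum values and the name change_memory_attr_model()
are made up for this sketch; only the condition itself mirrors the snippet
above:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the powerpc region IDs; values are arbitrary. */
enum region_id {
	LINEAR_MAP_REGION_ID,
	VMALLOC_REGION_ID,
	IO_REGION_ID,
};

/*
 * Model of the quoted check: with the hash MMU (radix == false) only the
 * vmalloc and I/O regions are accepted, so a request on the linear map is
 * rejected with -EINVAL rather than silently having no effect.
 */
static int change_memory_attr_model(bool radix, enum region_id region)
{
	if (!radix && region != VMALLOC_REGION_ID && region != IO_REGION_ID)
		return -EINVAL;
	return 0;
}
```

With this model, a linear-map address on hash returns -EINVAL, while the
same request with radix enabled passes the check.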

We call set_memory_ro() on the zero page as an extra security measure.
I don't know much about powerpc, but looking at the comment, is it just
adding the following to support it in powerpc:

diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
index ac22bf28086fa..e6e0b40ba6db4 100644
--- a/arch/powerpc/mm/pageattr.c
+++ b/arch/powerpc/mm/pageattr.c
@@ -94,7 +94,9 @@ int change_memory_attr(unsigned long addr, int numpages, long action)
        if (!radix_enabled()) {
                int region = get_region_id(addr);
 
-               if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
+               if (WARN_ON_ONCE(region != VMALLOC_REGION_ID &&
+                                region != IO_REGION_ID &&
+                                region != LINEAR_MAP_REGION_ID))
                        return -EINVAL;
        }
 #endif

 If it involves changing more things and this feature will be added to
 powerpc in the future, we could drop the set_memory_ro() call from
 iomap.

 CC: Darrick(as he suggested set_memory_ro() on zero page), Leroy,
 Ritesh, ppc list

> 
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 1 at arch/powerpc/mm/pageattr.c:97 change_memory_attr+0xbc/0x150
> Modules linked in:
> CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc5-06731-g66e0882fba22 #1
> Hardware name: IBM pSeries (emulated by qemu) POWER8 (architected) 0x4d0200 0xf000004 of:SLOF,HEAD pSeries
> NIP:  c00000000008a1ac LR: c00000000008a14c CTR: 0000000000000000
> REGS: c0000000049b7930 TRAP: 0700   Not tainted  (6.11.0-rc5-06731-g66e0882fba22)
> MSR:  8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 84000482  XER: 00000000
> CFAR: c00000000008a218 IRQMASK: 0 
> GPR00: c00000000008a14c c0000000049b7bd0 c00000000167b100 0000000000000000 
> GPR04: 0000000000000001 0000000000000000 0000000000000200 c000000002b10878 
> GPR08: 000000007da60000 c007ffffffffffff ffffffffffffffff 0000000084000482 
> GPR12: 0000000000000180 c000000002b90000 c00000000001110c 0000000000000000 
> GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> GPR20: 0000000000000000 0000000000000000 0000000000000000 c000000001562288 
> GPR24: c000000002003e6c c000000001632418 000000000000018c c0000000020c1058 
> GPR28: 0000000000000000 0000000000000000 c000000006330000 0000000000000001 
> NIP [c00000000008a1ac] change_memory_attr+0xbc/0x150
> LR [c00000000008a14c] change_memory_attr+0x5c/0x150
> Call Trace:
> [c0000000049b7bd0] [000000000000018c] 0x18c (unreliable)
> [c0000000049b7c10] [c00000000206bf70] iomap_dio_init+0x64/0x88
> [c0000000049b7c30] [c000000000010d98] do_one_initcall+0x80/0x2f8
> [c0000000049b7d00] [c000000002005c9c] kernel_init_freeable+0x32c/0x520
> [c0000000049b7de0] [c000000000011138] kernel_init+0x34/0x26c
> [c0000000049b7e50] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c
> --- interrupt: 0 at 0x0
> Code: 60000000 e8010050 eba10028 7c6307b4 ebc10030 38210040 ebe1fff8 7c0803a6 4e800020 7bc92720 2c29000c 41820058 <0fe00000> 4800002c 60000000 60000000 
> ---[ end trace 0000000000000000 ]---
> 
> Bisected to commit
> 
>   d940b3b7b76b ("iomap: fix iomap_dio_zero() for fs bs > system page size")
> 
> I have reverted commit
> 
>   9b0ebbc72358 ("Merge patch series "enable bs > ps in XFS"")
> 
> for today.
> 
> -- 
> Cheers,
> Stephen Rothwell



-- 
Pankaj Raghav



* Re: linux-next: boot warning after merge of the vfs-brauner tree
  2024-08-26 15:48 ` linux-next: boot warning after merge of the vfs-brauner tree Pankaj Raghav (Samsung)
@ 2024-08-26 17:43   ` Christophe Leroy
  2024-08-26 20:52     ` Luis Chamberlain
  2024-08-27  6:28   ` Michael Ellerman
  1 sibling, 1 reply; 8+ messages in thread
From: Christophe Leroy @ 2024-08-26 17:43 UTC (permalink / raw)
  To: Pankaj Raghav (Samsung), Stephen Rothwell
  Cc: Christian Brauner, Luis Chamberlain, Pankaj Raghav,
	Linux Kernel Mailing List, Linux Next Mailing List, djwong,
	ritesh.list, linuxppc-dev



On 26/08/2024 at 17:48, Pankaj Raghav (Samsung) wrote:
> On Mon, Aug 26, 2024 at 05:59:31PM +1000, Stephen Rothwell wrote:
>> Hi all,
>>
>> After merging the vfs-brauner tree, today's linux-next boot test (powerpc
>> pseries_le_defconfig) produced this warning:
> 
> iomap dio calls set_memory_ro() on the page that is used for sub block
> zeroing.
> 
> But looking at powerpc code, they don't support set_memory_ro() for
> memory region that belongs to the kernel(LINEAR_MAP_REGION_ID).
> 
> /*
>   * On hash, the linear mapping is not in the Linux page table so
>   * apply_to_existing_page_range() will have no effect. If in the future
>   * the set_memory_* functions are used on the linear map this will need
>   * to be updated.
>   */
> if (!radix_enabled()) {
>          int region = get_region_id(addr);
> 
>          if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
>                  return -EINVAL;
> }
> 
> We call set_memory_ro() on the zero page as an extra security measure.
> I don't know much about powerpc, but looking at the comment, is it just
> adding the following to support it in powerpc:
> 
> diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
> index ac22bf28086fa..e6e0b40ba6db4 100644
> --- a/arch/powerpc/mm/pageattr.c
> +++ b/arch/powerpc/mm/pageattr.c
> @@ -94,7 +94,9 @@ int change_memory_attr(unsigned long addr, int numpages, long action)
>          if (!radix_enabled()) {
>                  int region = get_region_id(addr);
>   
> -               if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
> +               if (WARN_ON_ONCE(region != VMALLOC_REGION_ID &&
> +                                region != IO_REGION_ID &&
> +                                region != LINEAR_MAP_REGION_ID))
>                          return -EINVAL;
>          }
>   #endif

By doing this you will just hide the fact that it didn't work.

See commit 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines") 
for details. The linear memory region is not mapped using page tables so 
set_memory_ro() will have no effect on it.

You can either use vmalloc'ed pages, or do a const static allocation at 
buildtime so that it will be allocated in the kernel static rodata area.

By the way, your code should check the value returned by 
set_memory_ro(), there is some work in progress to make it mandatory, 
see https://github.com/KSPP/linux/issues/7
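
A minimal userspace analogue of checking that return value, assuming POSIX
mmap()/mprotect() as stand-ins for alloc_pages()/set_memory_ro(). The helper
name alloc_ro_zero_buf() is hypothetical and this is only a sketch of the
pattern, not the iomap code:

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Allocate npages of zero-filled memory and make it read-only.
 * Returns 0 and stores the buffer in *out, or a negative errno value
 * with nothing left allocated.
 */
static int alloc_ro_zero_buf(size_t npages, void **out)
{
	long page_size = sysconf(_SC_PAGESIZE);
	size_t len;
	void *buf;

	if (page_size <= 0 || npages == 0)
		return -EINVAL;
	len = npages * (size_t)page_size;

	/* Anonymous private mappings start out zero-filled. */
	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -errno;

	/* Check the result instead of assuming the protection took effect. */
	if (mprotect(buf, len, PROT_READ) != 0) {
		int err = errno;

		munmap(buf, len);
		return -err;
	}

	*out = buf;
	return 0;
}
```

The point is only that a failed protection change is reported to the caller
and the allocation is released, rather than being ignored.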

Christophe

> 
>   If it involves changing more things and this feature will be added to
>   powerpc in the future, we could drop the set_memory_ro() call from
>   iomap.
> 
>   CC: Darrick(as he suggested set_memory_ro() on zero page), Leroy,
>   Ritesh, ppc list
> 



* Re: linux-next: boot warning after merge of the vfs-brauner tree
  2024-08-26 17:43   ` Christophe Leroy
@ 2024-08-26 20:52     ` Luis Chamberlain
  2024-08-26 21:10       ` Darrick J. Wong
  2024-08-27 15:38       ` Mike Rapoport
  0 siblings, 2 replies; 8+ messages in thread
From: Luis Chamberlain @ 2024-08-26 20:52 UTC (permalink / raw)
  To: Christophe Leroy, Mike Rapoport, Song Liu, Arnd Bergmann
  Cc: Pankaj Raghav (Samsung), Stephen Rothwell, Christian Brauner,
	Pankaj Raghav, Linux Kernel Mailing List, Linux Next Mailing List,
	djwong, ritesh.list, linuxppc-dev

On Mon, Aug 26, 2024 at 07:43:20PM +0200, Christophe Leroy wrote:
> 
> 
> On 26/08/2024 at 17:48, Pankaj Raghav (Samsung) wrote:
> > On Mon, Aug 26, 2024 at 05:59:31PM +1000, Stephen Rothwell wrote:
> > > Hi all,
> > > 
> > > After merging the vfs-brauner tree, today's linux-next boot test (powerpc
> > > pseries_le_defconfig) produced this warning:
> > 
> > iomap dio calls set_memory_ro() on the page that is used for sub block
> > zeroing.
> > 
> > But looking at powerpc code, they don't support set_memory_ro() for
> > memory region that belongs to the kernel(LINEAR_MAP_REGION_ID).
> > 
> > /*
> >   * On hash, the linear mapping is not in the Linux page table so
> >   * apply_to_existing_page_range() will have no effect. If in the future
> >   * the set_memory_* functions are used on the linear map this will need
> >   * to be updated.
> >   */
> > if (!radix_enabled()) {
> >          int region = get_region_id(addr);
> > 
> >          if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
> >                  return -EINVAL;
> > }
> > 
> > We call set_memory_ro() on the zero page as an extra security measure.
> > I don't know much about powerpc, but looking at the comment, is it just
> > adding the following to support it in powerpc:
> > 
> > diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
> > index ac22bf28086fa..e6e0b40ba6db4 100644
> > --- a/arch/powerpc/mm/pageattr.c
> > +++ b/arch/powerpc/mm/pageattr.c
> > @@ -94,7 +94,9 @@ int change_memory_attr(unsigned long addr, int numpages, long action)
> >          if (!radix_enabled()) {
> >                  int region = get_region_id(addr);
> > -               if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
> > +               if (WARN_ON_ONCE(region != VMALLOC_REGION_ID &&
> > +                                region != IO_REGION_ID &&
> > +                                region != LINEAR_MAP_REGION_ID))
> >                          return -EINVAL;
> >          }
> >   #endif
> 
> By doing this you will just hide the fact that it didn't work.
> 
> See commit 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines") for
> details. The linear memory region is not mapped using page tables so
> set_memory_ro() will have no effect on it.
> 
> You can either use vmalloc'ed pages, or do a const static allocation at
> buildtime so that it will be allocated in the kernel static rodata area.
> 
> By the way, your code should check the value returned by set_memory_ro(),
> there is some work in progress to make it mandatory, see
> https://github.com/KSPP/linux/issues/7

Our users expect contiguous memory [0] and so we use alloc_pages() here,
so if we're architecture limited by this I'd rather we just remove the
set_memory_ro() only for PPC; I don't see why others have to skip this.

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index c02b266bba52..aba5cde89e14 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -775,14 +775,22 @@ EXPORT_SYMBOL_GPL(iomap_dio_rw);
 
 static int __init iomap_dio_init(void)
 {
+	int ret;
+
 	zero_page = alloc_pages(GFP_KERNEL | __GFP_ZERO,
 				IOMAP_ZERO_PAGE_ORDER);
 
 	if (!zero_page)
 		return -ENOMEM;
 
-	set_memory_ro((unsigned long)page_address(zero_page),
-		      1U << IOMAP_ZERO_PAGE_ORDER);
-	return 0;
+	if (IS_ENABLED(CONFIG_PPC))
+		return 0;
+
+	ret = set_memory_ro((unsigned long)page_address(zero_page),
+			    1U << IOMAP_ZERO_PAGE_ORDER);
+	if (ret)
+		__free_pages(zero_page, IOMAP_ZERO_PAGE_ORDER);
+
+	return ret;
 }
 fs_initcall(iomap_dio_init);

Thoughts?

[0] https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/commit/?h=vfs.blocksize&id=d940b3b7b76b409b0550fdf2de6dc2183f01526f

  Luis



* Re: linux-next: boot warning after merge of the vfs-brauner tree
  2024-08-26 20:52     ` Luis Chamberlain
@ 2024-08-26 21:10       ` Darrick J. Wong
  2024-08-26 21:41         ` Luis Chamberlain
  2024-08-27 15:38       ` Mike Rapoport
  1 sibling, 1 reply; 8+ messages in thread
From: Darrick J. Wong @ 2024-08-26 21:10 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Christophe Leroy, Mike Rapoport, Song Liu, Arnd Bergmann,
	Pankaj Raghav (Samsung), Stephen Rothwell, Christian Brauner,
	Pankaj Raghav, Linux Kernel Mailing List, Linux Next Mailing List,
	ritesh.list, linuxppc-dev

On Mon, Aug 26, 2024 at 01:52:54PM -0700, Luis Chamberlain wrote:
> On Mon, Aug 26, 2024 at 07:43:20PM +0200, Christophe Leroy wrote:
> > 
> > 
> > On 26/08/2024 at 17:48, Pankaj Raghav (Samsung) wrote:
> > > On Mon, Aug 26, 2024 at 05:59:31PM +1000, Stephen Rothwell wrote:
> > > > Hi all,
> > > > 
> > > > After merging the vfs-brauner tree, today's linux-next boot test (powerpc
> > > > pseries_le_defconfig) produced this warning:
> > > 
> > > iomap dio calls set_memory_ro() on the page that is used for sub block
> > > zeroing.
> > > 
> > > But looking at powerpc code, they don't support set_memory_ro() for
> > > memory region that belongs to the kernel(LINEAR_MAP_REGION_ID).
> > > 
> > > /*
> > >   * On hash, the linear mapping is not in the Linux page table so
> > >   * apply_to_existing_page_range() will have no effect. If in the future
> > >   * the set_memory_* functions are used on the linear map this will need
> > >   * to be updated.
> > >   */
> > > if (!radix_enabled()) {
> > >          int region = get_region_id(addr);
> > > 
> > >          if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
> > >                  return -EINVAL;
> > > }
> > > 
> > > We call set_memory_ro() on the zero page as an extra security measure.
> > > I don't know much about powerpc, but looking at the comment, is it just
> > > adding the following to support it in powerpc:
> > > 
> > > diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
> > > index ac22bf28086fa..e6e0b40ba6db4 100644
> > > --- a/arch/powerpc/mm/pageattr.c
> > > +++ b/arch/powerpc/mm/pageattr.c
> > > @@ -94,7 +94,9 @@ int change_memory_attr(unsigned long addr, int numpages, long action)
> > >          if (!radix_enabled()) {
> > >                  int region = get_region_id(addr);
> > > -               if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
> > > +               if (WARN_ON_ONCE(region != VMALLOC_REGION_ID &&
> > > +                                region != IO_REGION_ID &&
> > > +                                region != LINEAR_MAP_REGION_ID))
> > >                          return -EINVAL;
> > >          }
> > >   #endif
> > 
> > By doing this you will just hide the fact that it didn't work.
> > 
> > See commit 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines") for
> > details. The linear memory region is not mapped using page tables so
> > set_memory_ro() will have no effect on it.
> > 
> > You can either use vmalloc'ed pages, or do a const static allocation at
> > buildtime so that it will be allocated in the kernel static rodata area.
> > 
> > By the way, your code should check the value returned by set_memory_ro(),
> > there is some work in progress to make it mandatory, see
> > https://github.com/KSPP/linux/issues/7
> 
> Our users expect contiguous memory [0] and so we use alloc_pages() here,
> so if we're architecture limited by this I'd rather we just remove the
> set_memory_ro() only for PPC; I don't see why others have to skip this.

Just drop it, then.

--D

> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index c02b266bba52..aba5cde89e14 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -775,14 +775,22 @@ EXPORT_SYMBOL_GPL(iomap_dio_rw);
>  
>  static int __init iomap_dio_init(void)
>  {
> +	int ret;
> +
>  	zero_page = alloc_pages(GFP_KERNEL | __GFP_ZERO,
>  				IOMAP_ZERO_PAGE_ORDER);
>  
>  	if (!zero_page)
>  		return -ENOMEM;
>  
> -	set_memory_ro((unsigned long)page_address(zero_page),
> -		      1U << IOMAP_ZERO_PAGE_ORDER);
> -	return 0;
> +	if (IS_ENABLED(CONFIG_PPC))
> +		return 0;
> +
> +	ret = set_memory_ro((unsigned long)page_address(zero_page),
> +			    1U << IOMAP_ZERO_PAGE_ORDER);
> +	if (ret)
> +		__free_pages(zero_page, IOMAP_ZERO_PAGE_ORDER);
> +
> +	return ret;
>  }
>  fs_initcall(iomap_dio_init);
> 
> Thoughts?
> 
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/commit/?h=vfs.blocksize&id=d940b3b7b76b409b0550fdf2de6dc2183f01526f
> 
>   Luis



* Re: linux-next: boot warning after merge of the vfs-brauner tree
  2024-08-26 21:10       ` Darrick J. Wong
@ 2024-08-26 21:41         ` Luis Chamberlain
  2024-08-27  5:26           ` Ritesh Harjani
  0 siblings, 1 reply; 8+ messages in thread
From: Luis Chamberlain @ 2024-08-26 21:41 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christophe Leroy, Mike Rapoport, Song Liu, Arnd Bergmann,
	Pankaj Raghav (Samsung), Stephen Rothwell, Christian Brauner,
	Pankaj Raghav, Linux Kernel Mailing List, Linux Next Mailing List,
	ritesh.list, linuxppc-dev

On Mon, Aug 26, 2024 at 02:10:49PM -0700, Darrick J. Wong wrote:
> On Mon, Aug 26, 2024 at 01:52:54PM -0700, Luis Chamberlain wrote:
> > On Mon, Aug 26, 2024 at 07:43:20PM +0200, Christophe Leroy wrote:
> > > 
> > > 
> > > On 26/08/2024 at 17:48, Pankaj Raghav (Samsung) wrote:
> > > > On Mon, Aug 26, 2024 at 05:59:31PM +1000, Stephen Rothwell wrote:
> > > > > Hi all,
> > > > > 
> > > > > After merging the vfs-brauner tree, today's linux-next boot test (powerpc
> > > > > pseries_le_defconfig) produced this warning:
> > > > 
> > > > iomap dio calls set_memory_ro() on the page that is used for sub block
> > > > zeroing.
> > > > 
> > > > But looking at powerpc code, they don't support set_memory_ro() for
> > > > memory region that belongs to the kernel(LINEAR_MAP_REGION_ID).
> > > > 
> > > > /*
> > > >   * On hash, the linear mapping is not in the Linux page table so
> > > >   * apply_to_existing_page_range() will have no effect. If in the future
> > > >   * the set_memory_* functions are used on the linear map this will need
> > > >   * to be updated.
> > > >   */
> > > > if (!radix_enabled()) {
> > > >          int region = get_region_id(addr);
> > > > 
> > > >          if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
> > > >                  return -EINVAL;
> > > > }
> > > > 
> > > > We call set_memory_ro() on the zero page as an extra security measure.
> > > > I don't know much about powerpc, but looking at the comment, is it just
> > > > adding the following to support it in powerpc:
> > > > 
> > > > diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
> > > > index ac22bf28086fa..e6e0b40ba6db4 100644
> > > > --- a/arch/powerpc/mm/pageattr.c
> > > > +++ b/arch/powerpc/mm/pageattr.c
> > > > @@ -94,7 +94,9 @@ int change_memory_attr(unsigned long addr, int numpages, long action)
> > > >          if (!radix_enabled()) {
> > > >                  int region = get_region_id(addr);
> > > > -               if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
> > > > +               if (WARN_ON_ONCE(region != VMALLOC_REGION_ID &&
> > > > +                                region != IO_REGION_ID &&
> > > > +                                region != LINEAR_MAP_REGION_ID))
> > > >                          return -EINVAL;
> > > >          }
> > > >   #endif
> > > 
> > > By doing this you will just hide the fact that it didn't work.
> > > 
> > > See commit 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines") for
> > > details. The linear memory region is not mapped using page tables so
> > > set_memory_ro() will have no effect on it.
> > > 
> > > You can either use vmalloc'ed pages, or do a const static allocation at
> > > buildtime so that it will be allocated in the kernel static rodata area.
> > > 
> > > By the way, your code should check the value returned by set_memory_ro(),
> > > there is some work in progress to make it mandatory, see
> > > https://github.com/KSPP/linux/issues/7
> > 
> > Our users expect contiguous memory [0] and so we use alloc_pages() here,
> > so if we're architecture limited by this I'd rather we just remove the
> > set_memory_ro() only for PPC; I don't see why others have to skip this.
> 
> Just drop it, then.

OK sent a patch for that.

  Luis



* Re: linux-next: boot warning after merge of the vfs-brauner tree
  2024-08-26 21:41         ` Luis Chamberlain
@ 2024-08-27  5:26           ` Ritesh Harjani
  0 siblings, 0 replies; 8+ messages in thread
From: Ritesh Harjani @ 2024-08-27  5:26 UTC (permalink / raw)
  To: Luis Chamberlain, Darrick J. Wong
  Cc: Christophe Leroy, Mike Rapoport, Song Liu, Arnd Bergmann,
	Pankaj Raghav (Samsung), Stephen Rothwell, Christian Brauner,
	Pankaj Raghav, Linux Kernel Mailing List, Linux Next Mailing List,
	linuxppc-dev

Luis Chamberlain <mcgrof@kernel.org> writes:

> On Mon, Aug 26, 2024 at 02:10:49PM -0700, Darrick J. Wong wrote:
>> On Mon, Aug 26, 2024 at 01:52:54PM -0700, Luis Chamberlain wrote:
>> > On Mon, Aug 26, 2024 at 07:43:20PM +0200, Christophe Leroy wrote:
>> > > 
>> > > 
>> > > On 26/08/2024 at 17:48, Pankaj Raghav (Samsung) wrote:
>> > > > On Mon, Aug 26, 2024 at 05:59:31PM +1000, Stephen Rothwell wrote:
>> > > > > Hi all,
>> > > > > 
>> > > > > After merging the vfs-brauner tree, today's linux-next boot test (powerpc
>> > > > > pseries_le_defconfig) produced this warning:
>> > > > 
>> > > > iomap dio calls set_memory_ro() on the page that is used for sub block
>> > > > zeroing.
>> > > > 
>> > > > But looking at powerpc code, they don't support set_memory_ro() for
>> > > > memory region that belongs to the kernel(LINEAR_MAP_REGION_ID).
>> > > > 
>> > > > /*
>> > > >   * On hash, the linear mapping is not in the Linux page table so
>> > > >   * apply_to_existing_page_range() will have no effect. If in the future
>> > > >   * the set_memory_* functions are used on the linear map this will need
>> > > >   * to be updated.
>> > > >   */
>> > > > if (!radix_enabled()) {
>> > > >          int region = get_region_id(addr);
>> > > > 
>> > > >          if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
>> > > >                  return -EINVAL;
>> > > > }
>> > > > 
>> > > > We call set_memory_ro() on the zero page as an extra security measure.
>> > > > I don't know much about powerpc, but looking at the comment, is it just
>> > > > adding the following to support it in powerpc:
>> > > > 
>> > > > diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
>> > > > index ac22bf28086fa..e6e0b40ba6db4 100644
>> > > > --- a/arch/powerpc/mm/pageattr.c
>> > > > +++ b/arch/powerpc/mm/pageattr.c
>> > > > @@ -94,7 +94,9 @@ int change_memory_attr(unsigned long addr, int numpages, long action)
>> > > >          if (!radix_enabled()) {
>> > > >                  int region = get_region_id(addr);
>> > > > -               if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
>> > > > +               if (WARN_ON_ONCE(region != VMALLOC_REGION_ID &&
>> > > > +                                region != IO_REGION_ID &&
>> > > > +                                region != LINEAR_MAP_REGION_ID))
>> > > >                          return -EINVAL;
>> > > >          }
>> > > >   #endif
>> > > 
>> > > By doing this you will just hide the fact that it didn't work.
>> > > 
>> > > See commit 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines") for
>> > > details. The linear memory region is not mapped using page tables so
>> > > set_memory_ro() will have no effect on it.
>> > > 
>> > > You can either use vmalloc'ed pages, or do a const static allocation at
>> > > buildtime so that it will be allocated in the kernel static rodata area.
>> > > 
>> > > By the way, your code should check the value returned by set_memory_ro(),
>> > > there is some work in progress to make it mandatory, see
>> > > https://github.com/KSPP/linux/issues/7
>> > 
>> > Our users expect contiguous memory [0] and so we use alloc_pages() here,
>> > so if we're architecture limited by this I'd rather we just remove the
>> > set_memory_ro() only for PPC; I don't see why others have to skip this.

Looks like this is not a standard thing to do for the kernel linear memory
map region then, and maybe a few other archs could be ignoring it too?

>> 
>> Just drop it, then.
>
> OK sent a patch for that.
>

Thanks for fixing it!

-ritesh



* Re: linux-next: boot warning after merge of the vfs-brauner tree
  2024-08-26 15:48 ` linux-next: boot warning after merge of the vfs-brauner tree Pankaj Raghav (Samsung)
  2024-08-26 17:43   ` Christophe Leroy
@ 2024-08-27  6:28   ` Michael Ellerman
  1 sibling, 0 replies; 8+ messages in thread
From: Michael Ellerman @ 2024-08-27  6:28 UTC (permalink / raw)
  To: Pankaj Raghav (Samsung), Stephen Rothwell
  Cc: Christian Brauner, Luis Chamberlain, Pankaj Raghav,
	Linux Kernel Mailing List, Linux Next Mailing List, djwong,
	ritesh.list, linuxppc-dev, christophe.leroy

"Pankaj Raghav (Samsung)" <kernel@pankajraghav.com> writes:
> On Mon, Aug 26, 2024 at 05:59:31PM +1000, Stephen Rothwell wrote:
>> Hi all,
>> 
>> After merging the vfs-brauner tree, today's linux-next boot test (powerpc
>> pseries_le_defconfig) produced this warning:
>
> iomap dio calls set_memory_ro() on the page that is used for sub block
> zeroing.
>
> But looking at powerpc code, they don't support set_memory_ro() for
> memory region that belongs to the kernel(LINEAR_MAP_REGION_ID).
>
> /*
>  * On hash, the linear mapping is not in the Linux page table so
>  * apply_to_existing_page_range() will have no effect. If in the future
>  * the set_memory_* functions are used on the linear map this will need
>  * to be updated.
>  */
> if (!radix_enabled()) {
>         int region = get_region_id(addr);
>
>         if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
>                 return -EINVAL;
> }

We should probably just turn that into a printk(); WARN is kind of heavy-handed.

> We call set_memory_ro() on the zero page as an extra security measure.
 
Or a data integrity measure. But either way it makes sense.

On architectures that do implement set_memory_ro() it potentially breaks
the linear mapping into small pages, which could have a performance impact.
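
To put a rough number on that split, here is a small sketch. The helper name
entries_after_split() is hypothetical, and the sizes in the note below are
illustrative assumptions only; the actual large and small page sizes depend
on the architecture and configuration:

```c
#include <stddef.h>

/*
 * Number of small-page entries needed to cover one large mapping once it
 * has been split. Returns 0 if the sizes do not nest exactly.
 */
static size_t entries_after_split(size_t large_size, size_t small_size)
{
	if (small_size == 0 || large_size % small_size != 0)
		return 0;
	return large_size / small_size;
}
```

For example, assuming a 2 MiB large mapping and 4 KiB small pages, one
mapping entry becomes 512 entries, which is where the extra TLB pressure
would come from.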

cheers



* Re: linux-next: boot warning after merge of the vfs-brauner tree
  2024-08-26 20:52     ` Luis Chamberlain
  2024-08-26 21:10       ` Darrick J. Wong
@ 2024-08-27 15:38       ` Mike Rapoport
  1 sibling, 0 replies; 8+ messages in thread
From: Mike Rapoport @ 2024-08-27 15:38 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Christophe Leroy, Song Liu, Arnd Bergmann,
	Pankaj Raghav (Samsung), Stephen Rothwell, Christian Brauner,
	Pankaj Raghav, Linux Kernel Mailing List, Linux Next Mailing List,
	djwong, ritesh.list, linuxppc-dev

On Mon, Aug 26, 2024 at 01:52:54PM -0700, Luis Chamberlain wrote:
> On Mon, Aug 26, 2024 at 07:43:20PM +0200, Christophe Leroy wrote:
> > 
> > 
> > On 26/08/2024 at 17:48, Pankaj Raghav (Samsung) wrote:
> > > On Mon, Aug 26, 2024 at 05:59:31PM +1000, Stephen Rothwell wrote:
> > > > Hi all,
> > > > 
> > > > After merging the vfs-brauner tree, today's linux-next boot test (powerpc
> > > > pseries_le_defconfig) produced this warning:
> > > 
> > > iomap dio calls set_memory_ro() on the page that is used for sub block
> > > zeroing.
> > > 
> > > But looking at powerpc code, they don't support set_memory_ro() for
> > > memory region that belongs to the kernel(LINEAR_MAP_REGION_ID).
> > > 
> > > /*
> > >   * On hash, the linear mapping is not in the Linux page table so
> > >   * apply_to_existing_page_range() will have no effect. If in the future
> > >   * the set_memory_* functions are used on the linear map this will need
> > >   * to be updated.
> > >   */
> > > if (!radix_enabled()) {
> > >          int region = get_region_id(addr);
> > > 
> > >          if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
> > >                  return -EINVAL;
> > > }
> > > 
> > > We call set_memory_ro() on the zero page as an extra security measure.
> > > I don't know much about powerpc, but looking at the comment, is it just
> > > adding the following to support it in powerpc:
> > > 
> > > diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
> > > index ac22bf28086fa..e6e0b40ba6db4 100644
> > > --- a/arch/powerpc/mm/pageattr.c
> > > +++ b/arch/powerpc/mm/pageattr.c
> > > @@ -94,7 +94,9 @@ int change_memory_attr(unsigned long addr, int numpages, long action)
> > >          if (!radix_enabled()) {
> > >                  int region = get_region_id(addr);
> > > -               if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
> > > +               if (WARN_ON_ONCE(region != VMALLOC_REGION_ID &&
> > > +                                region != IO_REGION_ID &&
> > > +                                region != LINEAR_MAP_REGION_ID))
> > >                          return -EINVAL;
> > >          }
> > >   #endif
> > 
> > By doing this you will just hide the fact that it didn't work.
> > 
> > See commit 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines") for
> > details. The linear memory region is not mapped using page tables so
> > set_memory_ro() will have no effect on it.
> > 
> > You can either use vmalloc'ed pages, or do a const static allocation at
> > buildtime so that it will be allocated in the kernel static rodata area.
> > 
> > By the way, your code should check the value returned by set_memory_ro(),
> > there is some work in progress to make it mandatory, see
> > https://github.com/KSPP/linux/issues/7
> 
> Our users expect contiguous memory [0] and so we use alloc_pages() here,
> so if we're architecture limited by this I'd rather we just remove the
> set_memory_ro() only for PPC; I don't see why others have to skip this.
> 
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index c02b266bba52..aba5cde89e14 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -775,14 +775,22 @@ EXPORT_SYMBOL_GPL(iomap_dio_rw);
>  
>  static int __init iomap_dio_init(void)
>  {
> +	int ret;
> +
>  	zero_page = alloc_pages(GFP_KERNEL | __GFP_ZERO,
>  				IOMAP_ZERO_PAGE_ORDER);
>  
>  	if (!zero_page)
>  		return -ENOMEM;
>  
> -	set_memory_ro((unsigned long)page_address(zero_page),
> -		      1U << IOMAP_ZERO_PAGE_ORDER);
> -	return 0;
> +	if (IS_ENABLED(CONFIG_PPC))
> +		return 0;
> +
> +	ret = set_memory_ro((unsigned long)page_address(zero_page),
> +			    1U << IOMAP_ZERO_PAGE_ORDER);
> +	if (ret)
> +		__free_pages(zero_page, IOMAP_ZERO_PAGE_ORDER);

arm64 will return -EINVAL here; its code for changing memory attributes
only works on vmalloc mappings:

	 * Let's restrict ourselves to mappings created by vmalloc (or vmap).
	 * Those are guaranteed to consist entirely of page mappings, and
	 * splitting is never needed.


> +
> +	return ret;
>  }
>  fs_initcall(iomap_dio_init);
> 
> Thoughts?
> 
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/commit/?h=vfs.blocksize&id=d940b3b7b76b409b0550fdf2de6dc2183f01526f
> 
>   Luis

-- 
Sincerely yours,
Mike.


