LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 02/11] fs: don't allow kernel reads and writes without iter ops
From: Kees Cook @ 2020-08-18 19:34 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-arch, x86, linux-kernel, Al Viro, linux-fsdevel,
	linuxppc-dev
In-Reply-To: <20200817073212.830069-3-hch@lst.de>

On Mon, Aug 17, 2020 at 09:32:03AM +0200, Christoph Hellwig wrote:
> Don't allow calling ->read or ->write with set_fs as a preparation for
> killing off set_fs.  All the instances that we use kernel_read/write on
> are using the iter ops already.
> 
> If a file has both the regular ->read/->write methods and the iter
> variants those could have different semantics for messed up enough
> drivers.  Also fails the kernel access to them in that case.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH] powerpc/pseries: Do not initiate shutdown when system is running on UPS
From: Tyrel Datwyler @ 2020-08-18 19:35 UTC (permalink / raw)
  To: Vasant Hegde, linuxppc-dev; +Cc: stable
In-Reply-To: <20200818105424.234108-1-hegdevasant@linux.vnet.ibm.com>

On 8/18/20 3:54 AM, Vasant Hegde wrote:
> As per PAPR specification whenever system is running on UPS we have to
> wait for predefined time (default 10mins) before initiating shutdown.

The wording in PAPR seems a little unclear. It states for an
EPOW_SYSTEM_SHUTDOWN action code that an EPOW error should be logged followed by
scheduling a shutdown to begin after an OS defined delay interval (with 10
minutes the suggested default).

However, the modifier code descriptions seems to imply that a normal shutdown is
the only one that should happen with no additional delay.

For EPOW sensor value = 3 (EPOW_SYSTEM_SHUTDOWN)
0x01 = Normal system shutdown with no additional delay
0x02 = Loss of utility power, system is running on UPS/Battery
0x03 = Loss of system critical functions, system should be shutdown
0x04 = Ambient temperature too high

For 0x03-0x04 we also do an orderly_poweroff().

Not sure if it really matters, but I was curious and this is just what I gleaned
from glancing at PAPR.

-Tyrel

> 
> We have user space tool (rtas_errd) to monitor for EPOW events and
> initiate shutdown after predefined time. Hence do not initiate shutdown
> whenever we get EPOW_SHUTDOWN_ON_UPS event.
> 
> Fixes: 79872e35 (powerpc/pseries: All events of EPOW_SYSTEM_SHUTDOWN must initiate shutdown)
> Cc: stable@vger.kernel.org # v4.0+
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/ras.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
> index f3736fcd98fc..13c86a292c6d 100644
> --- a/arch/powerpc/platforms/pseries/ras.c
> +++ b/arch/powerpc/platforms/pseries/ras.c
> @@ -184,7 +184,6 @@ static void handle_system_shutdown(char event_modifier)
>  	case EPOW_SHUTDOWN_ON_UPS:
>  		pr_emerg("Loss of system power detected. System is running on"
>  			 " UPS/battery. Check RTAS error log for details\n");
> -		orderly_poweroff(true);
>  		break;
> 
>  	case EPOW_SHUTDOWN_LOSS_OF_CRITICAL_FUNCTIONS:
> 


^ permalink raw reply

* Re: [PATCH 03/11] fs: don't allow splice read/write without explicit ops
From: Kees Cook @ 2020-08-18 19:39 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-arch, x86, linux-kernel, Al Viro, linux-fsdevel,
	linuxppc-dev
In-Reply-To: <20200817073212.830069-4-hch@lst.de>

On Mon, Aug 17, 2020 at 09:32:04AM +0200, Christoph Hellwig wrote:
> default_file_splice_write is the last piece of generic code that uses
> set_fs to make the uaccess routines operate on kernel pointers.  It
> implements a "fallback loop" for splicing from files that do not actually
> provide a proper splice_read method.  The usual file systems and other
> high bandwith instances all provide a ->splice_read, so this just removes
> support for various device drivers and procfs/debugfs files.  If splice
> support for any of those turns out to be important it can be added back
> by switching them to the iter ops and using generic_file_splice_read.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

This seems a bit disruptive? I feel like this is going to make fuzzers
really noisy (e.g. trinity likes to splice random stuff out of /sys and
/proc).

Conceptually, though:

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH 04/11] uaccess: add infrastructure for kernel builds with set_fs()
From: Kees Cook @ 2020-08-18 19:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-arch, x86, linux-kernel, Al Viro, linux-fsdevel,
	linuxppc-dev
In-Reply-To: <20200817073212.830069-5-hch@lst.de>

On Mon, Aug 17, 2020 at 09:32:05AM +0200, Christoph Hellwig wrote:
> Add a CONFIG_SET_FS option that is selected by architecturess that
> implement set_fs, which is all of them initially.  If the option is not
> set stubs for routines related to overriding the address space are
> provided so that architectures can start to opt out of providing set_fs.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH 05/11] test_bitmap: skip user bitmap tests for !CONFIG_SET_FS
From: Kees Cook @ 2020-08-18 19:43 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-arch, x86, linux-kernel, Al Viro, linux-fsdevel,
	linuxppc-dev
In-Reply-To: <20200817073212.830069-6-hch@lst.de>

On Mon, Aug 17, 2020 at 09:32:06AM +0200, Christoph Hellwig wrote:
> We can't run the tests for userspace bitmap parsing if set_fs() doesn't
> exist.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code
From: Kees Cook @ 2020-08-18 19:44 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-arch, x86, linux-kernel, Al Viro, linux-fsdevel,
	linuxppc-dev
In-Reply-To: <20200817073212.830069-9-hch@lst.de>

On Mon, Aug 17, 2020 at 09:32:09AM +0200, Christoph Hellwig wrote:
> For 64-bit the only hing missing was a strategic _AC, and for 32-bit we

typo: thing

> need to use __PAGE_OFFSET instead of PAGE_OFFSET in the TASK_SIZE
> definition to escape the explicit unsigned long cast.  This just works
> because __PAGE_OFFSET is defined using _AC itself and thus never needs
> the cast anyway.

Shouldn't this be folded into the prior patch so there's no bisection
problem?

-Kees

> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  arch/x86/include/asm/page_32_types.h | 4 ++--
>  arch/x86/include/asm/page_64_types.h | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/include/asm/page_32_types.h b/arch/x86/include/asm/page_32_types.h
> index 26236925fb2c36..f462895a33e452 100644
> --- a/arch/x86/include/asm/page_32_types.h
> +++ b/arch/x86/include/asm/page_32_types.h
> @@ -44,8 +44,8 @@
>  /*
>   * User space process size: 3GB (default).
>   */
> -#define IA32_PAGE_OFFSET	PAGE_OFFSET
> -#define TASK_SIZE		PAGE_OFFSET
> +#define IA32_PAGE_OFFSET	__PAGE_OFFSET
> +#define TASK_SIZE		__PAGE_OFFSET
>  #define TASK_SIZE_LOW		TASK_SIZE
>  #define TASK_SIZE_MAX		TASK_SIZE
>  #define DEFAULT_MAP_WINDOW	TASK_SIZE
> diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
> index 996595c9897e0a..838515daf87b36 100644
> --- a/arch/x86/include/asm/page_64_types.h
> +++ b/arch/x86/include/asm/page_64_types.h
> @@ -76,7 +76,7 @@
>   *
>   * With page table isolation enabled, we map the LDT in ... [stay tuned]
>   */
> -#define TASK_SIZE_MAX	((1UL << __VIRTUAL_MASK_SHIFT) - PAGE_SIZE)
> +#define TASK_SIZE_MAX	((_AC(1,UL) << __VIRTUAL_MASK_SHIFT) - PAGE_SIZE)
>  
>  #define DEFAULT_MAP_WINDOW	((1UL << 47) - PAGE_SIZE)
>  
> -- 
> 2.28.0
> 

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH 09/11] x86: remove address space overrides using set_fs()
From: Kees Cook @ 2020-08-18 19:46 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-arch, x86, linux-kernel, Al Viro, linux-fsdevel,
	linuxppc-dev
In-Reply-To: <20200817073212.830069-10-hch@lst.de>

On Mon, Aug 17, 2020 at 09:32:10AM +0200, Christoph Hellwig wrote:
> Stop providing the possibility to override the address space using
> set_fs() now that there is no need for that any more.  To properly
> handle the TASK_SIZE_MAX checking for 4 vs 5-level page tables on
> x86 a new alternative is introduced, which just like the one in
> entry_64.S has to use the hardcoded virtual address bits to escape
> the fact that TASK_SIZE_MAX isn't actually a constant when 5-level
> page tables are enabled.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Awesome. :)

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH v2] powerpc/pseries/svm: Allocate SWIOTLB buffer anywhere in memory
From: Thiago Jung Bauermann @ 2020-08-18 19:51 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Konrad Rzeszutek Wilk, linuxppc-dev, Ram Pai, linux-kernel, iommu,
	Satheesh Rajendran, Robin Murphy, Marek Szyprowski
In-Reply-To: <20200818065911.GA2324@lst.de>


Christoph Hellwig <hch@lst.de> writes:

> On Mon, Aug 17, 2020 at 06:46:58PM -0300, Thiago Jung Bauermann wrote:
>> POWER secure guests (i.e., guests which use the Protection Execution
>> Facility) need to use SWIOTLB to be able to do I/O with the hypervisor, but
>> they don't need the SWIOTLB memory to be in low addresses since the
>> hypervisor doesn't have any addressing limitation.
>> 
>> This solves a SWIOTLB initialization problem we are seeing in secure guests
>> with 128 GB of RAM: they are configured with 4 GB of crashkernel reserved
>> memory, which leaves no space for SWIOTLB in low addresses.
>> 
>> To do this, we use mostly the same code as swiotlb_init(), but allocate the
>> buffer using memblock_alloc() instead of memblock_alloc_low().
>> 
>> We also need to add swiotlb_set_no_iotlb_memory() in order to set the
>> no_iotlb_memory flag if initialization fails.
>
> Do you really need the helper?  As far as I can tell the secure guests
> very much rely on swiotlb for all I/O, so you might as well panic if
> you fail to allocate it.

That is true. Ok, I will do that.

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply

* Re: [PATCH 03/11] fs: don't allow splice read/write without explicit ops
From: Christoph Hellwig @ 2020-08-18 19:54 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-arch, x86, linux-kernel, Al Viro, linux-fsdevel,
	linuxppc-dev, Christoph Hellwig
In-Reply-To: <202008181239.E51B80265@keescook>

On Tue, Aug 18, 2020 at 12:39:34PM -0700, Kees Cook wrote:
> On Mon, Aug 17, 2020 at 09:32:04AM +0200, Christoph Hellwig wrote:
> > default_file_splice_write is the last piece of generic code that uses
> > set_fs to make the uaccess routines operate on kernel pointers.  It
> > implements a "fallback loop" for splicing from files that do not actually
> > provide a proper splice_read method.  The usual file systems and other
> > high bandwith instances all provide a ->splice_read, so this just removes
> > support for various device drivers and procfs/debugfs files.  If splice
> > support for any of those turns out to be important it can be added back
> > by switching them to the iter ops and using generic_file_splice_read.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> This seems a bit disruptive? I feel like this is going to make fuzzers
> really noisy (e.g. trinity likes to splice random stuff out of /sys and
> /proc).

Noisy in the sence of triggering the pr_debug or because they can't
handle -EINVAL?

^ permalink raw reply

* Re: [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code
From: Christoph Hellwig @ 2020-08-18 19:55 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-arch, x86, linux-kernel, Al Viro, linux-fsdevel,
	linuxppc-dev, Christoph Hellwig
In-Reply-To: <202008181244.BBDA7DAB@keescook>

On Tue, Aug 18, 2020 at 12:44:49PM -0700, Kees Cook wrote:
> On Mon, Aug 17, 2020 at 09:32:09AM +0200, Christoph Hellwig wrote:
> > For 64-bit the only hing missing was a strategic _AC, and for 32-bit we
> 
> typo: thing
> 
> > need to use __PAGE_OFFSET instead of PAGE_OFFSET in the TASK_SIZE
> > definition to escape the explicit unsigned long cast.  This just works
> > because __PAGE_OFFSET is defined using _AC itself and thus never needs
> > the cast anyway.
> 
> Shouldn't this be folded into the prior patch so there's no bisection
> problem?

I didn't see a problem bisecting, do you have something particular in
mind?

^ permalink raw reply

* Re: [PATCH 03/11] fs: don't allow splice read/write without explicit ops
From: Kees Cook @ 2020-08-18 19:58 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-arch, x86, linux-kernel, Al Viro, linux-fsdevel,
	linuxppc-dev
In-Reply-To: <20200818195446.GA32691@lst.de>

On Tue, Aug 18, 2020 at 09:54:46PM +0200, Christoph Hellwig wrote:
> On Tue, Aug 18, 2020 at 12:39:34PM -0700, Kees Cook wrote:
> > On Mon, Aug 17, 2020 at 09:32:04AM +0200, Christoph Hellwig wrote:
> > > default_file_splice_write is the last piece of generic code that uses
> > > set_fs to make the uaccess routines operate on kernel pointers.  It
> > > implements a "fallback loop" for splicing from files that do not actually
> > > provide a proper splice_read method.  The usual file systems and other
> > > high bandwith instances all provide a ->splice_read, so this just removes
> > > support for various device drivers and procfs/debugfs files.  If splice
> > > support for any of those turns out to be important it can be added back
> > > by switching them to the iter ops and using generic_file_splice_read.
> > > 
> > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > 
> > This seems a bit disruptive? I feel like this is going to make fuzzers
> > really noisy (e.g. trinity likes to splice random stuff out of /sys and
> > /proc).
> 
> Noisy in the sence of triggering the pr_debug or because they can't
> handle -EINVAL?

Well, maybe both? I doubt much _expects_ to be using splice, so I'm fine
with that, but it seems weird not to have a fall-back, especially if
something would like to splice a file out of there. But, I'm not opposed
to the change, it just seems like it might cause pain down the road.

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code
From: Kees Cook @ 2020-08-18 19:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-arch, x86, linux-kernel, Al Viro, linux-fsdevel,
	linuxppc-dev
In-Reply-To: <20200818195539.GB32691@lst.de>

On Tue, Aug 18, 2020 at 09:55:39PM +0200, Christoph Hellwig wrote:
> On Tue, Aug 18, 2020 at 12:44:49PM -0700, Kees Cook wrote:
> > On Mon, Aug 17, 2020 at 09:32:09AM +0200, Christoph Hellwig wrote:
> > > For 64-bit the only hing missing was a strategic _AC, and for 32-bit we
> > 
> > typo: thing
> > 
> > > need to use __PAGE_OFFSET instead of PAGE_OFFSET in the TASK_SIZE
> > > definition to escape the explicit unsigned long cast.  This just works
> > > because __PAGE_OFFSET is defined using _AC itself and thus never needs
> > > the cast anyway.
> > 
> > Shouldn't this be folded into the prior patch so there's no bisection
> > problem?
> 
> I didn't see a problem bisecting, do you have something particular in
> mind?

Oh, I misunderstood this patch to be a fix for compilation. Is this just
a correctness fix?

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code
From: Christoph Hellwig @ 2020-08-18 20:00 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-arch, x86, linux-kernel, Al Viro, linux-fsdevel,
	linuxppc-dev, Christoph Hellwig
In-Reply-To: <202008181258.CEC4B8B3@keescook>

On Tue, Aug 18, 2020 at 12:59:05PM -0700, Kees Cook wrote:
> > I didn't see a problem bisecting, do you have something particular in
> > mind?
> 
> Oh, I misunderstood this patch to be a fix for compilation. Is this just
> a correctness fix?

It prepares for using the definition from assembly, which is done in
the next patch.

^ permalink raw reply

* Re: [PATCH 03/11] fs: don't allow splice read/write without explicit ops
From: Christoph Hellwig @ 2020-08-18 20:07 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-arch, x86, linux-kernel, Al Viro, linux-fsdevel,
	linuxppc-dev, Christoph Hellwig
In-Reply-To: <202008181256.CABD56782@keescook>

On Tue, Aug 18, 2020 at 12:58:07PM -0700, Kees Cook wrote:
> On Tue, Aug 18, 2020 at 09:54:46PM +0200, Christoph Hellwig wrote:
> > On Tue, Aug 18, 2020 at 12:39:34PM -0700, Kees Cook wrote:
> > > On Mon, Aug 17, 2020 at 09:32:04AM +0200, Christoph Hellwig wrote:
> > > > default_file_splice_write is the last piece of generic code that uses
> > > > set_fs to make the uaccess routines operate on kernel pointers.  It
> > > > implements a "fallback loop" for splicing from files that do not actually
> > > > provide a proper splice_read method.  The usual file systems and other
> > > > high bandwith instances all provide a ->splice_read, so this just removes
> > > > support for various device drivers and procfs/debugfs files.  If splice
> > > > support for any of those turns out to be important it can be added back
> > > > by switching them to the iter ops and using generic_file_splice_read.
> > > > 
> > > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > > 
> > > This seems a bit disruptive? I feel like this is going to make fuzzers
> > > really noisy (e.g. trinity likes to splice random stuff out of /sys and
> > > /proc).
> > 
> > Noisy in the sence of triggering the pr_debug or because they can't
> > handle -EINVAL?
> 
> Well, maybe both? I doubt much _expects_ to be using splice, so I'm fine
> with that, but it seems weird not to have a fall-back, especially if
> something would like to splice a file out of there. But, I'm not opposed
> to the change, it just seems like it might cause pain down the road.

The problem is that without pretending a buffer is in user space when
it actually isn't, we can't have a generic fallback.  So we'll have to
have specific support - I wrote generic support for seq_file, and
willy did for /proc/sys, but at least the first caused a few problems
and a fair amount of churn, so I'd rather see first if we can get
away without it.

> 
> -- 
> Kees Cook
---end quoted text---

^ permalink raw reply

* Re: [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code
From: Kees Cook @ 2020-08-18 20:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-arch, x86, linux-kernel, Al Viro, linux-fsdevel,
	linuxppc-dev
In-Reply-To: <20200818200016.GA681@lst.de>

On Tue, Aug 18, 2020 at 10:00:16PM +0200, Christoph Hellwig wrote:
> On Tue, Aug 18, 2020 at 12:59:05PM -0700, Kees Cook wrote:
> > > I didn't see a problem bisecting, do you have something particular in
> > > mind?
> > 
> > Oh, I misunderstood this patch to be a fix for compilation. Is this just
> > a correctness fix?
> 
> It prepares for using the definition from assembly, which is done in
> the next patch.

Ah! Okay; thanks.

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH 1/2] powerpc/64s: remove PROT_SAO support
From: Shawn Anastasio @ 2020-08-18 20:59 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev
In-Reply-To: <1597733955.vlt37n2lw9.astroid@bobo.none>

On 8/18/20 2:11 AM, Nicholas Piggin wrote> Very reasonable point.
> 
> The problem we're trying to get a handle on is live partition migration
> where a running guest might be using SAO then get migrated to a P10. I
> don't think we have a good way to handle this case. Potentially the
> hypervisor could revoke the page tables if the guest is running in hash
> mode and the guest kernel could be taught about that and sigbus the
> process, but in radix the guest controls those page tables and the SAO
> state and I don't think there's a way to cause it to take a fault.
> 
> I also don't know what the proprietary hypervisor does here.
> 
> We could add it back, default to n, or make it bare metal only, or
> somehow try to block live migration to a later CPU without the faciliy.
> I wouldn't be against that.


Admittedly I'm not too familiar with the specifics of live migration
or guest memory management, but restoring the functionality and adding
a way to prevent migration of SAO-using guests seems like a reasonable
choice to me. Would this be done with help from the guest using some
sort of infrastructure to signal to the hypervisor that SAO is in use,
or entirely on the hypervisor by e.g. scanning the through the process
table for SAO pages?

> It would be very interesting to know how it performs in such a "real"
> situation. I don't know how well POWER9 has optimised it -- it's
> possible that it's not much better than putting lwsync after every load
> or store.


This is definitely worth investigating in depth. That said, even if the
performance on P9 isn't super great, I think the feature could still be
useful, since it would offer more granularity than the sledgehammer
approach of emitting lwsync everywhere.

I'd be happy to put in some of the work required to get this to a point
where it can be reintroduced without breaking guest migration - I'd just
need some pointers on getting started with whatever approach is decided on.

Thanks,
Shawn

^ permalink raw reply

* [PATCH v3] powerpc/pseries/svm: Allocate SWIOTLB buffer anywhere in memory
From: Thiago Jung Bauermann @ 2020-08-18 22:11 UTC (permalink / raw)
  To: iommu
  Cc: Konrad Rzeszutek Wilk, linuxppc-dev, Ram Pai, linux-kernel,
	Satheesh Rajendran, Robin Murphy, Christoph Hellwig,
	Thiago Jung Bauermann, Marek Szyprowski

POWER secure guests (i.e., guests which use the Protection Execution
Facility) need to use SWIOTLB to be able to do I/O with the hypervisor, but
they don't need the SWIOTLB memory to be in low addresses since the
hypervisor doesn't have any addressing limitation.

This solves a SWIOTLB initialization problem we are seeing in secure guests
with 128 GB of RAM: they are configured with 4 GB of crashkernel reserved
memory, which leaves no space for SWIOTLB in low addresses.

To do this, we use mostly the same code as swiotlb_init(), but allocate the
buffer using memblock_alloc() instead of memblock_alloc_low().

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
---
 arch/powerpc/include/asm/svm.h       |  4 ++++
 arch/powerpc/mm/mem.c                |  6 +++++-
 arch/powerpc/platforms/pseries/svm.c | 26 ++++++++++++++++++++++++++
 3 files changed, 35 insertions(+), 1 deletion(-)

Changes from v2:
- Panic if unable to allocate buffer, as suggested by Christoph.

Changes from v1:
- Open-code swiotlb_init() in arch-specific code, as suggested by
  Christoph.

diff --git a/arch/powerpc/include/asm/svm.h b/arch/powerpc/include/asm/svm.h
index 85580b30aba4..7546402d796a 100644
--- a/arch/powerpc/include/asm/svm.h
+++ b/arch/powerpc/include/asm/svm.h
@@ -15,6 +15,8 @@ static inline bool is_secure_guest(void)
 	return mfmsr() & MSR_S;
 }
 
+void __init svm_swiotlb_init(void);
+
 void dtl_cache_ctor(void *addr);
 #define get_dtl_cache_ctor()	(is_secure_guest() ? dtl_cache_ctor : NULL)
 
@@ -25,6 +27,8 @@ static inline bool is_secure_guest(void)
 	return false;
 }
 
+static inline void svm_swiotlb_init(void) {}
+
 #define get_dtl_cache_ctor() NULL
 
 #endif /* CONFIG_PPC_SVM */
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index c2c11eb8dcfc..0f21bcb16405 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -50,6 +50,7 @@
 #include <asm/swiotlb.h>
 #include <asm/rtas.h>
 #include <asm/kasan.h>
+#include <asm/svm.h>
 
 #include <mm/mmu_decl.h>
 
@@ -290,7 +291,10 @@ void __init mem_init(void)
 	 * back to to-down.
 	 */
 	memblock_set_bottom_up(true);
-	swiotlb_init(0);
+	if (is_secure_guest())
+		svm_swiotlb_init();
+	else
+		swiotlb_init(0);
 #endif
 
 	high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
diff --git a/arch/powerpc/platforms/pseries/svm.c b/arch/powerpc/platforms/pseries/svm.c
index 40c0637203d5..81085eb8f225 100644
--- a/arch/powerpc/platforms/pseries/svm.c
+++ b/arch/powerpc/platforms/pseries/svm.c
@@ -7,6 +7,7 @@
  */
 
 #include <linux/mm.h>
+#include <linux/memblock.h>
 #include <asm/machdep.h>
 #include <asm/svm.h>
 #include <asm/swiotlb.h>
@@ -34,6 +35,31 @@ static int __init init_svm(void)
 }
 machine_early_initcall(pseries, init_svm);
 
+/*
+ * Initialize SWIOTLB. Essentially the same as swiotlb_init(), except that it
+ * can allocate the buffer anywhere in memory. Since the hypervisor doesn't have
+ * any addressing limitation, we don't need to allocate it in low addresses.
+ */
+void __init svm_swiotlb_init(void)
+{
+	unsigned char *vstart;
+	unsigned long bytes, io_tlb_nslabs;
+
+	io_tlb_nslabs = (swiotlb_size_or_default() >> IO_TLB_SHIFT);
+	io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
+
+	bytes = io_tlb_nslabs << IO_TLB_SHIFT;
+
+	vstart = memblock_alloc(PAGE_ALIGN(bytes), PAGE_SIZE);
+	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs, false))
+		return;
+
+	if (io_tlb_start)
+		memblock_free_early(io_tlb_start,
+				    PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT));
+	panic("SVM: Cannot allocate SWIOTLB buffer");
+}
+
 int set_memory_encrypted(unsigned long addr, int numpages)
 {
 	if (!PAGE_ALIGNED(addr))

^ permalink raw reply related

* Re: [PATCH v3 10/17] memblock: reduce number of parameters in for_each_mem_range()
From: Miguel Ojeda @ 2020-08-18 22:18 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Thomas Gleixner, Emil Renner Berthing, linux-sh, Peter Zijlstra,
	Dave Hansen, linux-mips, Max Filippov, Paul Mackerras, sparclinux,
	linux-riscv, Will Deacon, Christoph Hellwig, Marek Szyprowski,
	linux-arch, linux-s390, linux-c6x-dev, Baoquan He,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Russell King,
	Mike Rapoport, clang-built-linux, Ingo Molnar, Linux ARM,
	Catalin Marinas, uclinux-h8-devel, linux-xtensa, openrisc,
	Borislav Petkov, Andy Lutomirski, Paul Walmsley, Stafford Horne,
	Hari Bathini, Daniel Axtens, Michal Simek, Thomas Bogendoerfer,
	Yoshinori Sato, Linux-MM, linux-kernel, iommu, Palmer Dabbelt,
	Andrew Morton, linuxppc-dev
In-Reply-To: <20200818151634.14343-11-rppt@kernel.org>

On Tue, Aug 18, 2020 at 5:19 PM Mike Rapoport <rppt@kernel.org> wrote:
>
>  .clang-format                          |  2 ++

For the .clang-format bit:

Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>

Cheers,
Miguel

^ permalink raw reply

* [Bug 208957] New: 5.9-rc1 fails to build for a PowerMac G5: .../book3s64/hash_utils.c:1119:21: error: ‘default_uamor’ undeclared (first use in this function)  1119 |   mtspr(SPRN_UAMOR, default_uamor);
From: bugzilla-daemon @ 2020-08-18 22:32 UTC (permalink / raw)
  To: linuxppc-dev

https://bugzilla.kernel.org/show_bug.cgi?id=208957

            Bug ID: 208957
           Summary: 5.9-rc1 fails to build for a PowerMac G5:
                    .../book3s64/hash_utils.c:1119:21: error:
                    ‘default_uamor’ undeclared (first use in this
                    function)  1119 |   mtspr(SPRN_UAMOR, default_uamor);
           Product: Platform Specific/Hardware
           Version: 2.5
    Kernel Version: 5.9-rc1
          Hardware: PPC-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: PPC-64
          Assignee: platform_ppc-64@kernel-bugs.osdl.org
          Reporter: erhard_f@mailbox.org
        Regression: No

Created attachment 292021
  --> https://bugzilla.kernel.org/attachment.cgi?id=292021&action=edit
kernel .config (kernel 5.9-rc1, PowerMac G5 11,2)

[...]
  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  CHK     include/generated/compile.h
  CC      arch/powerpc/mm/book3s64/hash_utils.o
In file included from ./arch/powerpc/include/asm/processor.h:9,
                 from ./arch/powerpc/include/asm/thread_info.h:40,
                 from ./include/linux/thread_info.h:38,
                 from ./include/asm-generic/preempt.h:5,
                 from ./arch/powerpc/include/generated/asm/preempt.h:1,
                 from ./include/linux/preempt.h:78,
                 from ./include/linux/spinlock.h:51,
                 from arch/powerpc/mm/book3s64/hash_utils.c:21:
arch/powerpc/mm/book3s64/hash_utils.c: In function
‘hash__early_init_mmu_secondary’:
arch/powerpc/mm/book3s64/hash_utils.c:1119:21: error: ‘default_uamor’
undeclared (first use in this function)
 1119 |   mtspr(SPRN_UAMOR, default_uamor);
      |                     ^~~~~~~~~~~~~
./arch/powerpc/include/asm/reg.h:1396:33: note: in definition of macro ‘mtspr’
 1396 |          : "r" ((unsigned long)(v)) \
      |                                 ^
arch/powerpc/mm/book3s64/hash_utils.c:1119:21: note: each undeclared identifier
is reported only once for each function it appears in
 1119 |   mtspr(SPRN_UAMOR, default_uamor);
      |                     ^~~~~~~~~~~~~
./arch/powerpc/include/asm/reg.h:1396:33: note: in definition of macro ‘mtspr’
 1396 |          : "r" ((unsigned long)(v)) \
      |                                 ^
make[3]: *** [scripts/Makefile.build:283:
arch/powerpc/mm/book3s64/hash_utils.o] Error 1
make[2]: *** [scripts/Makefile.build:500: arch/powerpc/mm/book3s64] Error 2
make[1]: *** [scripts/Makefile.build:500: arch/powerpc/mm] Error 2
make: *** [Makefile:1789: arch/powerpc] Error 2

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply

* Re: fsl_espi errors on v5.7.15
From: Chris Packham @ 2020-08-18 22:44 UTC (permalink / raw)
  To: Heiner Kallweit, broonie@kernel.org, mpe@ellerman.id.au,
	benh@kernel.crashing.org, paulus@samba.org,
	tiago.brusamarello@datacom.ind.br
  Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	linux-spi@vger.kernel.org
In-Reply-To: <3c72eec1-41ba-7389-eceb-3de80065555a@alliedtelesis.co.nz>

Hi Again,

On 17/08/20 9:09 am, Chris Packham wrote:

>
> On 14/08/20 6:19 pm, Heiner Kallweit wrote:
>> On 14.08.2020 04:48, Chris Packham wrote:
>>> Hi,
>>>
>>> I'm seeing a problem with accessing spi-nor after upgrading a T2081
>>> based system to linux v5.7.15
>>>
>>> For this board u-boot and the u-boot environment live on spi-nor.
>>>
>>> When I use fw_setenv from userspace I get the following kernel logs
>>>
>>> # fw_setenv foo=1
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but rx/tx fifo's aren't empty!
>>> fsl_espi ffe110000.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>> fsl_espi ffe110000.spi: Transfer done but rx/tx fifo's aren't empty!
>>> fsl_espi ffe110000.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>> fsl_espi ffe110000.spi: Transfer done but rx/tx fifo's aren't empty!
>>> fsl_espi ffe110000.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>> ...
>>>
>> This error reporting doesn't exist yet in 4.4. So you may have an issue
>> under 4.4 too, it's just not reported.
>> Did you verify that under 4.4 fw_setenv actually has an effect?
> Just double checked and yes under 4.4 the setting does get saved.
>>> If I run fw_printenv (before getting it into a bad state) it is able to
>>> display the content of the boards u-boot environment.
>>>
>> This might indicate an issue with spi being locked. I've seen related
>> questions, just use the search engine of your choice and check for
>> fw_setenv and locked.
> I'm running a version of fw_setenv which includes 
> https://gitlab.denx.de/u-boot/u-boot/-/commit/db820159 so it shouldn't 
> be locking things unnecessarily.
>>> If been unsuccessful in producing a setup for bisecting the issue. I do
>>> know the issue doesn't occur on the old 4.4.x based kernel but that's
>>> probably not much help.
>>>
>>> Any pointers on what the issue (and/or solution) might be.

I finally managed to get our board running with a vanilla kernel. With 
corenet64_smp_defconfig I occasionally see

   fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!

other than the message things seem to be working.

With a custom defconfig I see

   fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
   fsl_espi ffe110000.spi: Transfer done but rx/tx fifo's aren't empty!
   fsl_espi ffe110000.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
   ...

and access to the spi-nor does not work until the board is reset.

I'll try and pick apart the differences between the two defconfigs.

^ permalink raw reply

* Re: [PATCH] powerpc/book3s64/radix: Fix boot failure with large amount of guest memory
From: kernel test robot @ 2020-08-18 23:44 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev, mpe
  Cc: Shirisha Ganta, Aneesh Kumar K.V, kbuild-all, Sandipan Das,
	npiggin
In-Reply-To: <20200813162039.608649-1-aneesh.kumar@linux.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 2347 bytes --]

Hi "Aneesh,

I love your patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v5.9-rc1 next-20200818]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Aneesh-Kumar-K-V/powerpc-book3s64-radix-Fix-boot-failure-with-large-amount-of-guest-memory/20200814-002215
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-allnoconfig (attached as .config)
compiler: powerpc-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   arch/powerpc/kernel/prom.c: In function 'early_init_devtree':
>> arch/powerpc/kernel/prom.c:818:3: error: implicit declaration of function 'radix__setup_initial_memory_limit'; did you mean 'setup_initial_memory_limit'? [-Werror=implicit-function-declaration]
     818 |   radix__setup_initial_memory_limit(memstart_addr, first_memblock_size);
         |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         |   setup_initial_memory_limit
   cc1: all warnings being treated as errors

# https://github.com/0day-ci/linux/commit/082f192bfaabd1eeb28421d82574ce76ae0c4fba
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Aneesh-Kumar-K-V/powerpc-book3s64-radix-Fix-boot-failure-with-large-amount-of-guest-memory/20200814-002215
git checkout 082f192bfaabd1eeb28421d82574ce76ae0c4fba
vim +818 arch/powerpc/kernel/prom.c

   811	
   812		mmu_early_init_devtree();
   813	
   814		/*
   815		 * Reset ppc64_rma_size and memblock memory limit
   816		 */
   817		if (early_radix_enabled())
 > 818			radix__setup_initial_memory_limit(memstart_addr, first_memblock_size);
   819	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 6458 bytes --]

^ permalink raw reply

* Re: [PATCH 1/2] lockdep: improve current->(hard|soft)irqs_enabled synchronisation with actual irq state
From: Nicholas Piggin @ 2020-08-18 23:54 UTC (permalink / raw)
  To: peterz
  Cc: linux-arch, Alexey Kardashevskiy, Will Deacon, linux-kernel,
	Ingo Molnar, linuxppc-dev
In-Reply-To: <20200818154143.GT2674@hirez.programming.kicks-ass.net>

Excerpts from peterz@infradead.org's message of August 19, 2020 1:41 am:
> On Tue, Aug 18, 2020 at 05:22:33PM +1000, Nicholas Piggin wrote:
>> Excerpts from peterz@infradead.org's message of August 12, 2020 8:35 pm:
>> > On Wed, Aug 12, 2020 at 06:18:28PM +1000, Nicholas Piggin wrote:
>> >> Excerpts from peterz@infradead.org's message of August 7, 2020 9:11 pm:
>> >> > 
>> >> > What's wrong with something like this?
>> >> > 
>> >> > AFAICT there's no reason to actually try and add IRQ tracing here, it's
>> >> > just a hand full of instructions at the most.
>> >> 
>> >> Because we may want to use that in other places as well, so it would
>> >> be nice to have tracing.
>> >> 
>> >> Hmm... also, I thought NMI context was free to call local_irq_save/restore
>> >> anyway so the bug would still be there in those cases?
>> > 
>> > NMI code has in_nmi() true, in which case the IRQ tracing is disabled
>> > (except for x86 which has CONFIG_TRACE_IRQFLAGS_NMI).
>> > 
>> 
>> That doesn't help. It doesn't fix the lockdep irq state going out of
>> synch with the actual irq state. The code which triggered this with the
>> special powerpc irq disable has in_nmi() true as well.
> 
> Urgh, you're talking about using lockdep_assert_irqs*() from NMI
> context?
> 
> If not, I'm afraid I might've lost the plot a little on what exact
> failure case we're talking about.
> 

Hm, I may have been a bit confused actually. Since your Fix 
TRACE_IRQFLAGS vs NMIs patch it might now work.

I'm worried powerpc disables trace irqs trace_hardirqs_off()
before nmi_enter() might still be a problem, but not sure
actually. Alexey did you end up re-testing with Peter's patch
or current upstream?

Thanks,
Nick

^ permalink raw reply

* Re: [PATCH v2 11/25] powerpc/signal: Refactor bad frame logging
From: Joe Perches @ 2020-08-19  1:19 UTC (permalink / raw)
  To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <fa094445c119fc00315e1c13783b493346306c6a.1597770847.git.christophe.leroy@csgroup.eu>

On Tue, 2020-08-18 at 17:19 +0000, Christophe Leroy wrote:
> The logging of bad frame appears half a dozen of times
> and is pretty similar.
[]
> diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
[]
> @@ -355,3 +355,14 @@ static unsigned long get_tm_stackpointer(struct task_struct *tsk)
>  #endif
>  	return ret;
>  }
> +
> +static const char fm32[] = KERN_INFO "%s[%d]: bad frame in %s: %p nip %08lx lr %08lx\n";
> +static const char fm64[] = KERN_INFO "%s[%d]: bad frame in %s: %p nip %016lx lr %016lx\n";

Why not remove this and use it in place with
%08lx/%016x used as %px with a case to (void *)?

> +void signal_fault(struct task_struct *tsk, struct pt_regs *regs,
> +		  const char *where, void __user *ptr)
> +{
> +	if (show_unhandled_signals)
> +		printk_ratelimited(regs->msr & MSR_64BIT ? fm64 : fm32, tsk->comm,
> +				   task_pid_nr(tsk), where, ptr, regs->nip, regs->link);

	pr_info_ratelimited("%s[%d]: bad frame in %s: %p nip %016lx lr %016lx\n",
			    tsk->comm, task_pid_nr(tsk), where, ptr,
			    (void *)regs->nip, (void *)regs->link);



^ permalink raw reply

* [PATCH 3/3] powerpc/smp: Move ppc_md.cpu_die() to smp_ops.cpu_offline_self()
From: Michael Ellerman @ 2020-08-19  1:56 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <20200819015634.1974478-1-mpe@ellerman.id.au>

We have smp_ops->cpu_die() and ppc_md.cpu_die(). One of them offlines
the current CPU and one offlines another CPU, can you guess which is
which? Also one is in smp_ops and one is in ppc_md?

So rename ppc_md.cpu_die(), to cpu_offline_self(), because that's what
it does. And move it into smp_ops where it belongs.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/include/asm/machdep.h           | 1 -
 arch/powerpc/include/asm/smp.h               | 3 +++
 arch/powerpc/kernel/smp.c                    | 4 ++--
 arch/powerpc/kernel/sysfs.c                  | 4 +++-
 arch/powerpc/platforms/85xx/smp.c            | 4 ++--
 arch/powerpc/platforms/powermac/pmac.h       | 2 +-
 arch/powerpc/platforms/powermac/sleep.S      | 6 +++---
 arch/powerpc/platforms/powermac/smp.c        | 8 ++++----
 arch/powerpc/platforms/powernv/smp.c         | 4 ++--
 arch/powerpc/platforms/pseries/hotplug-cpu.c | 6 +++---
 10 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index a90b892f0bfe..cc2ec7101520 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -65,7 +65,6 @@ struct machdep_calls {
 	void __noreturn	(*restart)(char *cmd);
 	void __noreturn (*halt)(void);
 	void		(*panic)(char *str);
-	void		(*cpu_die)(void);
 
 	long		(*time_init)(void); /* Optional, may be NULL */
 
diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index a314d2d2d2be..0d00faf8f119 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -49,6 +49,9 @@ struct smp_ops_t {
 	int   (*cpu_disable)(void);
 	void  (*cpu_die)(unsigned int nr);
 	int   (*cpu_bootable)(unsigned int nr);
+#ifdef CONFIG_HOTPLUG_CPU
+	void  (*cpu_offline_self)(void);
+#endif
 };
 
 extern int smp_send_nmi_ipi(int cpu, void (*fn)(struct pt_regs *), u64 delay_us);
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index c616d975bf95..faba0fdee500 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1439,8 +1439,8 @@ void arch_cpu_idle_dead(void)
 	 */
 	this_cpu_disable_ftrace();
 
-	if (ppc_md.cpu_die)
-		ppc_md.cpu_die();
+	if (smp_ops->cpu_offline_self)
+		smp_ops->cpu_offline_self();
 
 	/* If we return, we re-enter start_secondary */
 	start_secondary_resume();
diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index 6bebc9a52444..7c4ccc03c2de 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -1161,6 +1161,7 @@ static int __init topology_init(void)
 	for_each_possible_cpu(cpu) {
 		struct cpu *c = &per_cpu(cpu_devices, cpu);
 
+#ifdef CONFIG_HOTPLUG_CPU
 		/*
 		 * For now, we just see if the system supports making
 		 * the RTAS calls for CPU hotplug.  But, there may be a
@@ -1168,8 +1169,9 @@ static int __init topology_init(void)
 		 * CPU.  For instance, the boot cpu might never be valid
 		 * for hotplugging.
 		 */
-		if (ppc_md.cpu_die)
+		if (smp_ops->cpu_offline_self)
 			c->hotpluggable = 1;
+#endif
 
 		if (cpu_online(cpu) || c->hotpluggable) {
 			register_cpu(c, cpu);
diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
index fda108bae95f..c6df294054fe 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -112,7 +112,7 @@ static void mpc85xx_take_timebase(void)
 	local_irq_restore(flags);
 }
 
-static void smp_85xx_mach_cpu_die(void)
+static void smp_85xx_cpu_offline_self(void)
 {
 	unsigned int cpu = smp_processor_id();
 
@@ -506,7 +506,7 @@ void __init mpc85xx_smp_init(void)
 	if (qoriq_pm_ops) {
 		smp_85xx_ops.give_timebase = mpc85xx_give_timebase;
 		smp_85xx_ops.take_timebase = mpc85xx_take_timebase;
-		ppc_md.cpu_die = smp_85xx_mach_cpu_die;
+		smp_85xx_ops.cpu_offline_self = smp_85xx_cpu_offline_self;
 		smp_85xx_ops.cpu_die = qoriq_cpu_kill;
 	}
 #endif
diff --git a/arch/powerpc/platforms/powermac/pmac.h b/arch/powerpc/platforms/powermac/pmac.h
index 16a52afdb76e..0d715db434dc 100644
--- a/arch/powerpc/platforms/powermac/pmac.h
+++ b/arch/powerpc/platforms/powermac/pmac.h
@@ -34,7 +34,7 @@ extern void pmac_check_ht_link(void);
 
 extern void pmac_setup_smp(void);
 extern int psurge_secondary_virq;
-extern void low_cpu_die(void) __attribute__((noreturn));
+extern void low_cpu_offline_self(void) __attribute__((noreturn));
 
 extern int pmac_nvram_init(void);
 extern void pmac_pic_init(void);
diff --git a/arch/powerpc/platforms/powermac/sleep.S b/arch/powerpc/platforms/powermac/sleep.S
index f9a680fdd9c4..c51bb63c9417 100644
--- a/arch/powerpc/platforms/powermac/sleep.S
+++ b/arch/powerpc/platforms/powermac/sleep.S
@@ -201,8 +201,8 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_USE_HIGH_BATS)
 	addi r3,r3,sleep_storage@l
 	stw r5,0(r3)
 
-	.globl	low_cpu_die
-low_cpu_die:
+	.globl	low_cpu_offline_self
+low_cpu_offline_self:
 	/* Flush & disable all caches */
 	bl	flush_disable_caches
 
@@ -244,7 +244,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_SPEC7450)
 	mtmsr	r2
 	isync
 	b	1b
-_ASM_NOKPROBE_SYMBOL(low_cpu_die)
+_ASM_NOKPROBE_SYMBOL(low_cpu_offline_self)
 /*
  * Here is the resume code.
  */
diff --git a/arch/powerpc/platforms/powermac/smp.c b/arch/powerpc/platforms/powermac/smp.c
index eb23264910e1..a6fedcfb714f 100644
--- a/arch/powerpc/platforms/powermac/smp.c
+++ b/arch/powerpc/platforms/powermac/smp.c
@@ -920,7 +920,7 @@ static int smp_core99_cpu_disable(void)
 
 #ifdef CONFIG_PPC32
 
-static void pmac_cpu_die(void)
+static void pmac_cpu_offline_self(void)
 {
 	int cpu = smp_processor_id();
 
@@ -930,12 +930,12 @@ static void pmac_cpu_die(void)
 	generic_set_cpu_dead(cpu);
 	smp_wmb();
 	mb();
-	low_cpu_die();
+	low_cpu_offline_self();
 }
 
 #else /* CONFIG_PPC32 */
 
-static void pmac_cpu_die(void)
+static void pmac_cpu_offline_self(void)
 {
 	int cpu = smp_processor_id();
 
@@ -1020,7 +1020,7 @@ void __init pmac_setup_smp(void)
 #endif /* CONFIG_PPC_PMAC32_PSURGE */
 
 #ifdef CONFIG_HOTPLUG_CPU
-	ppc_md.cpu_die = pmac_cpu_die;
+	smp_ops->cpu_offline_self = pmac_cpu_offline_self;
 #endif
 }
 
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index bbf361f23ae8..54c4ba45c7ce 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -158,7 +158,7 @@ static void pnv_flush_interrupts(void)
 	}
 }
 
-static void pnv_smp_cpu_kill_self(void)
+static void pnv_cpu_offline_self(void)
 {
 	unsigned long srr1, unexpected_mask, wmask;
 	unsigned int cpu;
@@ -417,6 +417,7 @@ static struct smp_ops_t pnv_smp_ops = {
 #ifdef CONFIG_HOTPLUG_CPU
 	.cpu_disable	= pnv_smp_cpu_disable,
 	.cpu_die	= generic_cpu_die,
+	.cpu_offline_self = pnv_cpu_offline_self,
 #endif /* CONFIG_HOTPLUG_CPU */
 };
 
@@ -430,7 +431,6 @@ void __init pnv_smp_init(void)
 	smp_ops = &pnv_smp_ops;
 
 #ifdef CONFIG_HOTPLUG_CPU
-	ppc_md.cpu_die	= pnv_smp_cpu_kill_self;
 #ifdef CONFIG_KEXEC_CORE
 	crash_wake_offline = 1;
 #endif
diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index c6e0d8abf75e..43b020a30072 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -55,7 +55,7 @@ static void rtas_stop_self(void)
 	panic("Alas, I survived.\n");
 }
 
-static void pseries_mach_cpu_die(void)
+static void pseries_cpu_offline_self(void)
 {
 	unsigned int hwcpu = hard_smp_processor_id();
 
@@ -102,7 +102,7 @@ static int pseries_cpu_disable(void)
  * to self-destroy so that the cpu-offline thread can send the CPU_DEAD
  * notifications.
  *
- * OTOH, pseries_mach_cpu_die() is called by the @cpu when it wants to
+ * OTOH, pseries_cpu_offline_self() is called by the @cpu when it wants to
  * self-destruct.
  */
 static void pseries_cpu_die(unsigned int cpu)
@@ -895,7 +895,7 @@ static int __init pseries_cpu_hotplug_init(void)
 		return 0;
 	}
 
-	ppc_md.cpu_die = pseries_mach_cpu_die;
+	smp_ops->cpu_offline_self = pseries_cpu_offline_self;
 	smp_ops->cpu_disable = pseries_cpu_disable;
 	smp_ops->cpu_die = pseries_cpu_die;
 
-- 
2.25.1


^ permalink raw reply related

* [PATCH 2/3] powerpc/smp: Fold cpu_die() into its only caller
From: Michael Ellerman @ 2020-08-19  1:56 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <20200819015634.1974478-1-mpe@ellerman.id.au>

Avoid the eternal confusion between cpu_die() and __cpu_die() by
removing the former, folding it into its only caller.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/include/asm/smp.h | 1 -
 arch/powerpc/kernel/smp.c      | 4 ----
 2 files changed, 5 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 49a25e2400f2..a314d2d2d2be 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -29,7 +29,6 @@ extern int boot_cpuid;
 extern int spinning_secondaries;
 extern u32 *cpu_to_phys_id;
 
-extern void cpu_die(void);
 extern int cpu_to_chip_id(int cpu);
 
 #ifdef CONFIG_SMP
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index b05d2db13d08..c616d975bf95 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1432,11 +1432,7 @@ void __cpu_die(unsigned int cpu)
 void arch_cpu_idle_dead(void)
 {
 	sched_preempt_enable_no_resched();
-	cpu_die();
-}
 
-void cpu_die(void)
-{
 	/*
 	 * Disable on the down path. This will be re-enabled by
 	 * start_secondary() via start_secondary_resume() below
-- 
2.25.1


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox