* Re: [PATCH v4 6/6] sched/fair: Consider SMT in ASYM_PACKING load balance
From: Vincent Guittot @ 2021-08-28 13:25 UTC (permalink / raw)
To: Ricardo Neri
Cc: Juri Lelli, Aubrey Li, Srikar Dronamraju, Ravi V. Shankar,
Peter Zijlstra (Intel), Ricardo Neri, Ben Segall,
Srinivas Pandruvada, Joel Fernandes (Google), Ingo Molnar,
Rafael J. Wysocki, Steven Rostedt, Mel Gorman, Len Brown,
Nicholas Piggin, Aubrey Li, Dietmar Eggemann, Tim Chen,
Quentin Perret, Daniel Bristot de Oliveira, linux-kernel,
linuxppc-dev
In-Reply-To: <20210827194503.GB14720@ranerica-svr.sc.intel.com>
On Fri, 27 Aug 2021 at 21:45, Ricardo Neri
<ricardo.neri-calderon@linux.intel.com> wrote:
>
> On Fri, Aug 27, 2021 at 12:13:42PM +0200, Vincent Guittot wrote:
> > On Tue, 10 Aug 2021 at 16:41, Ricardo Neri
> > <ricardo.neri-calderon@linux.intel.com> wrote:
> > > @@ -9540,6 +9629,12 @@ static struct rq *find_busiest_queue(struct lb_env *env,
> > > nr_running == 1)
> > > continue;
> > >
> > > + /* Make sure we only pull tasks from a CPU of lower priority */
> > > + if ((env->sd->flags & SD_ASYM_PACKING) &&
> > > + sched_asym_prefer(i, env->dst_cpu) &&
> > > + nr_running == 1)
> > > + continue;
> >
> > This really looks similar to the test above for SD_ASYM_CPUCAPACITY.
> > More generally speaking SD_ASYM_PACKING and SD_ASYM_CPUCAPACITY share
> > a lot of common policy and I wonder if at some point we could not
> > merge their behavior in LB
>
> I would like to confirm with you that you are not expecting this merge
> as part of this series, right?
Merging them will probably need more tests on both x86 and Arm so I
suppose that we could keep them separate for now
Regards,
Vincent
>
> Thanks and BR,
> Ricardo
^ permalink raw reply
* [GIT PULL] Please pull powerpc/linux.git powerpc-5.14-7 tag
From: Michael Ellerman @ 2021-08-28 12:46 UTC (permalink / raw)
To: Linus Torvalds; +Cc: lukas.bulwahn, linuxppc-dev, linux-kernel, npiggin
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Hi Linus,
Please pull two more powerpc fixes for 5.14:
The following changes since commit 9f7853d7609d59172eecfc5e7ccf503bc1b690bd:
powerpc/mm: Fix set_memory_*() against concurrent accesses (2021-08-19 09:41:54 +1000)
are available in the git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.14-7
for you to fetch changes up to 787c70f2f9990b5a197320152d2fc32cd8a6ad1a:
powerpc/64s: Fix scv implicit soft-mask table for relocated kernels (2021-08-20 22:35:18 +1000)
- ------------------------------------------------------------------
powerpc fixes for 5.14 #7
- Fix scv implicit soft-mask table for relocated (eg. kdump) kernels.
- Re-enable ARCH_ENABLE_SPLIT_PMD_PTLOCK, which was disabled due to a typo.
Thanks to: Lukas Bulwahn, Nicholas Piggin, Daniel Axtens.
- ------------------------------------------------------------------
Lukas Bulwahn (1):
powerpc: Re-enable ARCH_ENABLE_SPLIT_PMD_PTLOCK
Nicholas Piggin (1):
powerpc/64s: Fix scv implicit soft-mask table for relocated kernels
arch/powerpc/kernel/exceptions-64s.S | 7 ++++---
arch/powerpc/platforms/Kconfig.cputype | 2 +-
2 files changed, 5 insertions(+), 4 deletions(-)
-----BEGIN PGP SIGNATURE-----
iQIzBAEBCAAdFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmEqL8QACgkQUevqPMjh
pYDvjQ//dYPmb8eTLrWTRXUy6vLnem+CuJv6gyLEG+7s6T8qZMSA2uvKr4zXAvgF
2QOAoVYWDth6blxjfHc5zqpdERhZHhOq0o2ofuJYGz1J/HAymvIohGFmiIwvOu5n
rPI8NMmsw61W35jJS7dgRr86b3rBwZSzRpDL14g+zQRRzuCqnBdOCA3Ixxn5JQ/F
D1AhyWL61IpVdg0Tz6FRU8s+VKYHh4Yr/CsozkFRMgqZZ2k3zs7aTcluyN8JhEta
BoqejJStrgRt2KOJPr3HXk5fHaoz/9AJn3lSauhnEPFR/Li2ChjkDZPd9KQysHI/
f56HfO/jRx3lY/qhHQ3HeGVJ8rsQrzEILj7KqL0KHwfQoqAhP3E2sut6oqZBFWii
HzfNl0vDrVkjBW7WDV/Y1hlGYaeiGt6DgXwh6wifek6JhMSABwyrd+Uoi9efG7sg
fOUl11VHvBHHQoT+h526urQrdSvgNn+M2iwjElK3LC+tDStNVZTxEiS/KtvpK1vC
I7f7CuwS+z3y4kPs2lwr1t0qQOnmOsvJP82+IDb7Tq4LJRgajyFlGdUEtMrpxZvP
whpFxWrxlQabX6QEr3CGYTGjbiYaGUCfWLkwhwpcftp7QcadjYko7u6nxIyOjI2G
8XG3+CMcPkNB4fSo4Ijt5upaTjhjl4JUGEenLHbyzdcixfj9Z9w=
=yPpX
-----END PGP SIGNATURE-----
^ permalink raw reply
* Re: [RFC PATCH 4/6] powerpc/64s: Make hash MMU code build configurable
From: Christophe Leroy @ 2021-08-28 9:59 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev
In-Reply-To: <20210827163410.1177154-5-npiggin@gmail.com>
Le 27/08/2021 à 18:34, Nicholas Piggin a écrit :
> Introduce a new option CONFIG_PPC_64S_HASH_MMU which allows the 64s hash
> MMU code to be compiled out if radix is selected and the minimum
> supported CPU type is POWER9 or higher, and KVM is not selected.
>
> This saves 128kB kernel image size (90kB text) on powernv_defconfig
> minus KVM, 350kB on pseries_defconfig minus KVM, 40kB on a tiny config.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/Kconfig | 1 +
> arch/powerpc/include/asm/book3s/64/mmu.h | 22 +++++-
> .../include/asm/book3s/64/tlbflush-hash.h | 7 ++
> arch/powerpc/include/asm/book3s/pgtable.h | 4 ++
> arch/powerpc/include/asm/mmu.h | 38 +++++++++--
> arch/powerpc/include/asm/mmu_context.h | 2 +
> arch/powerpc/include/asm/paca.h | 8 +++
> arch/powerpc/kernel/asm-offsets.c | 2 +
> arch/powerpc/kernel/dt_cpu_ftrs.c | 10 ++-
> arch/powerpc/kernel/entry_64.S | 4 +-
> arch/powerpc/kernel/exceptions-64s.S | 16 +++++
> arch/powerpc/kernel/mce.c | 2 +-
> arch/powerpc/kernel/mce_power.c | 10 ++-
> arch/powerpc/kernel/paca.c | 18 ++---
> arch/powerpc/kernel/process.c | 13 ++--
> arch/powerpc/kernel/prom.c | 2 +
> arch/powerpc/kernel/setup_64.c | 4 ++
> arch/powerpc/kexec/core_64.c | 4 +-
> arch/powerpc/kexec/ranges.c | 4 ++
> arch/powerpc/kvm/Kconfig | 1 +
> arch/powerpc/mm/book3s64/Makefile | 17 +++--
> arch/powerpc/mm/book3s64/hash_pgtable.c | 1 -
> arch/powerpc/mm/book3s64/hash_utils.c | 10 ---
> .../{hash_hugetlbpage.c => hugetlbpage.c} | 6 ++
> arch/powerpc/mm/book3s64/mmu_context.c | 16 +++++
> arch/powerpc/mm/book3s64/pgtable.c | 22 +++++-
> arch/powerpc/mm/book3s64/radix_pgtable.c | 4 ++
> arch/powerpc/mm/book3s64/slb.c | 16 -----
> arch/powerpc/mm/copro_fault.c | 2 +
> arch/powerpc/mm/fault.c | 17 +++++
> arch/powerpc/mm/pgtable.c | 10 ++-
> arch/powerpc/platforms/Kconfig.cputype | 20 +++++-
> arch/powerpc/platforms/cell/Kconfig | 1 +
> arch/powerpc/platforms/maple/Kconfig | 1 +
> arch/powerpc/platforms/microwatt/Kconfig | 2 +-
> arch/powerpc/platforms/pasemi/Kconfig | 1 +
> arch/powerpc/platforms/powermac/Kconfig | 1 +
> arch/powerpc/platforms/powernv/Kconfig | 2 +-
> arch/powerpc/platforms/powernv/idle.c | 2 +
> arch/powerpc/platforms/powernv/setup.c | 2 +
> arch/powerpc/platforms/pseries/lpar.c | 68 ++++++++++---------
> arch/powerpc/platforms/pseries/lparcfg.c | 2 +-
> arch/powerpc/platforms/pseries/mobility.c | 6 ++
> arch/powerpc/platforms/pseries/ras.c | 4 ++
> arch/powerpc/platforms/pseries/reconfig.c | 2 +
> arch/powerpc/platforms/pseries/setup.c | 6 +-
> arch/powerpc/xmon/xmon.c | 8 ++-
> 47 files changed, 310 insertions(+), 111 deletions(-)
> rename arch/powerpc/mm/book3s64/{hash_hugetlbpage.c => hugetlbpage.c} (95%)
Whaou ! Huge patch.
Several places you should be able to use IS_ENABLED() or simply radix_enabled() instead of #ifdefs
and rely on GCC to opt out stuff when radix_enabled() folds into 'true'.
I may do more detailed comments later when I have time.
Christophe
^ permalink raw reply
* Re: [RFC PATCH 6/6] powerpc/microwatt: Stop building the hash MMU code
From: Christophe Leroy @ 2021-08-28 9:54 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev
In-Reply-To: <20210827163410.1177154-7-npiggin@gmail.com>
Le 27/08/2021 à 18:34, Nicholas Piggin a écrit :
> Microwatt is radix-only, so stop selecting the hash MMU code.
>
> This saves 20kB compressed dtbImage and 56kB vmlinux size.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/configs/microwatt_defconfig | 1 -
> arch/powerpc/platforms/Kconfig.cputype | 1 +
> arch/powerpc/platforms/microwatt/Kconfig | 1 -
> 3 files changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/configs/microwatt_defconfig b/arch/powerpc/configs/microwatt_defconfig
> index bf5f2e5905eb..6fbad42f9e3d 100644
> --- a/arch/powerpc/configs/microwatt_defconfig
> +++ b/arch/powerpc/configs/microwatt_defconfig
> @@ -26,7 +26,6 @@ CONFIG_PPC_MICROWATT=y
> # CONFIG_PPC_OF_BOOT_TRAMPOLINE is not set
> CONFIG_CPU_FREQ=y
> CONFIG_HZ_100=y
> -# CONFIG_PPC_MEM_KEYS is not set
> # CONFIG_SECCOMP is not set
> # CONFIG_MQ_IOSCHED_KYBER is not set
> # CONFIG_COREDUMP is not set
> diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
> index 9b90fc501ed4..b9acd6c68c81 100644
> --- a/arch/powerpc/platforms/Kconfig.cputype
> +++ b/arch/powerpc/platforms/Kconfig.cputype
> @@ -371,6 +371,7 @@ config SPE
> config PPC_64S_HASH_MMU
> bool "Hash MMU Support"
> depends on PPC_BOOK3S_64
> + depends on !PPC_MICROWATT
Do we really want to start nit picking which platforms can select such or such option ?
Isn't it enough to get it off through the defconfig ?
Is PPC_MICROWATT mutually exclusive with other book3s64 configs ? Don't we build multiplatform kernels ?
> default y
> select PPC_MM_SLICES
> help
> diff --git a/arch/powerpc/platforms/microwatt/Kconfig b/arch/powerpc/platforms/microwatt/Kconfig
> index e0ff2cfc1ca0..4f0ce292a6ce 100644
> --- a/arch/powerpc/platforms/microwatt/Kconfig
> +++ b/arch/powerpc/platforms/microwatt/Kconfig
> @@ -6,7 +6,6 @@ config PPC_MICROWATT
> select PPC_XICS
> select PPC_ICS_NATIVE
> select PPC_ICP_NATIVE
> - select PPC_HASH_MMU_NATIVE if PPC_64S_HASH_MMU
> select PPC_UDBG_16550
> select ARCH_RANDOM
> help
>
^ permalink raw reply
* Re: [RFC PATCH 5/6] powerpc/microwatt: select POWER9_CPU
From: Christophe Leroy @ 2021-08-28 9:50 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev
In-Reply-To: <20210827163410.1177154-6-npiggin@gmail.com>
Le 27/08/2021 à 18:34, Nicholas Piggin a écrit :
> Microwatt implements a subset of ISA v3.0 which is equivalent to
> the POWER9_CPU selection.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/configs/microwatt_defconfig | 1 +
> arch/powerpc/platforms/microwatt/Kconfig | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/arch/powerpc/configs/microwatt_defconfig b/arch/powerpc/configs/microwatt_defconfig
> index a08b739123da..bf5f2e5905eb 100644
> --- a/arch/powerpc/configs/microwatt_defconfig
> +++ b/arch/powerpc/configs/microwatt_defconfig
> @@ -14,6 +14,7 @@ CONFIG_EMBEDDED=y
> # CONFIG_COMPAT_BRK is not set
> # CONFIG_SLAB_MERGE_DEFAULT is not set
> CONFIG_PPC64=y
> +CONFIG_POWER9_CPU=y
That shouldn't be needed in the defconfig because you select it below. You can use make
savedefconfig to confirm.
> # CONFIG_PPC_KUEP is not set
> # CONFIG_PPC_KUAP is not set
> CONFIG_CPU_LITTLE_ENDIAN=y
> diff --git a/arch/powerpc/platforms/microwatt/Kconfig b/arch/powerpc/platforms/microwatt/Kconfig
> index 823192e9d38a..e0ff2cfc1ca0 100644
> --- a/arch/powerpc/platforms/microwatt/Kconfig
> +++ b/arch/powerpc/platforms/microwatt/Kconfig
> @@ -2,6 +2,7 @@
> config PPC_MICROWATT
> depends on PPC_BOOK3S_64 && !SMP
> bool "Microwatt SoC platform"
> + select POWER9_CPU
> select PPC_XICS
> select PPC_ICS_NATIVE
> select PPC_ICP_NATIVE
>
^ permalink raw reply
* Re: [RFC PATCH 4/6] powerpc/64s: Make hash MMU code build configurable
From: Christophe Leroy @ 2021-08-28 9:34 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev
In-Reply-To: <20210827163410.1177154-5-npiggin@gmail.com>
Le 27/08/2021 à 18:34, Nicholas Piggin a écrit :
> Introduce a new option CONFIG_PPC_64S_HASH_MMU which allows the 64s hash
> MMU code to be compiled out if radix is selected and the minimum
> supported CPU type is POWER9 or higher, and KVM is not selected.
>
> This saves 128kB kernel image size (90kB text) on powernv_defconfig
> minus KVM, 350kB on pseries_defconfig minus KVM, 40kB on a tiny config.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
...
> @@ -324,6 +330,7 @@ static inline void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
> }
> #endif /* !CONFIG_DEBUG_VM */
>
> +#if defined(CONFIG_PPC_RADIX_MMU) && defined(CONFIG_PPC_64S_HASH_MMU)
> static inline bool radix_enabled(void)
> {
> return mmu_has_feature(MMU_FTR_TYPE_RADIX);
> @@ -333,6 +340,27 @@ static inline bool early_radix_enabled(void)
> {
> return early_mmu_has_feature(MMU_FTR_TYPE_RADIX);
> }
> +#elif defined(CONFIG_PPC_64S_HASH_MMU)
> +static inline bool radix_enabled(void)
> +{
> + return false;
> +}
> +
> +static inline bool early_radix_enabled(void)
> +{
> + return false;
> +}
> +#else
> +static inline bool radix_enabled(void)
> +{
> + return true;
> +}
> +
> +static inline bool early_radix_enabled(void)
> +{
> + return true;
> +}
> +#endif
You don't need something that complex. You don't need to change that at all indeed, just have to
ensure that when CONFIG_PPC_64S_HASH_MMU is not selected you have MMU_FTR_TYPE_RADIX included in
MMU_FTRS_ALWAYS and voila.
>
> #ifdef CONFIG_STRICT_KERNEL_RWX
> static inline bool strict_kernel_rwx_enabled(void)
^ permalink raw reply
* Re: [PATCH] net: spider_net: switch from 'pci_' to 'dma_' API
From: Geoff Levand @ 2021-08-28 1:34 UTC (permalink / raw)
To: Christophe JAILLET, kou.ishizaki, davem, kuba
Cc: netdev, kernel-janitors, linuxppc-dev, linux-kernel
In-Reply-To: <60abc3d0c8b4ef8368a4d63326a25a5cb3cd218c.1630094078.git.christophe.jaillet@wanadoo.fr>
Hi Christophe,
On 8/27/21 12:56 PM, Christophe JAILLET wrote:
> It has *not* been compile tested because I don't have the needed
> configuration or cross-compiler.
The powerpc ppc64_defconfig has CONFIG_SPIDER_NET set. My
tdd-builder Docker image has the needed gcc-powerpc-linux-gnu
cross compiler to build ppc64_defconfig:
https://hub.docker.com/r/glevand/tdd-builder
-Geoff
^ permalink raw reply
* Re: [PATCH v2 4/5] KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs
From: Mathieu Desnoyers @ 2021-08-28 0:06 UTC (permalink / raw)
To: Sean Christopherson
Cc: KVM list, Peter Zijlstra, Catalin Marinas, linux-kernel,
Will Deacon, Guo Ren, linux-kselftest, Ben Gardon, shuah,
Paul Mackerras, linux-s390, gor, Russell King, ARM Linux,
linux-csky, Christian Borntraeger, Ingo Molnar, dvhart,
linux-mips, Boqun Feng, paulmck, Heiko Carstens, rostedt,
Shakeel Butt, Andy Lutomirski, Thomas Gleixner, Peter Foley,
linux-arm-kernel, Thomas Bogendoerfer, Oleg Nesterov,
Paolo Bonzini, linuxppc-dev
In-Reply-To: <YSlz8h9SWgeuicak@google.com>
----- On Aug 27, 2021, at 7:23 PM, Sean Christopherson seanjc@google.com wrote:
> On Fri, Aug 27, 2021, Mathieu Desnoyers wrote:
[...]
>> Does it reproduce if we randomize the delay to have it picked randomly from 0us
>> to 100us (with 1us step) ? It would remove a lot of the needs for arch-specific
>> magic delay value.
>
> My less-than-scientific testing shows that it can reproduce at delays up to
> ~500us,
> but above ~10us the reproducibility starts to drop. The bug still reproduces
> reliably, it just takes more iterations, and obviously the test runs a bit
> slower.
>
> Any objection to using a 1-10us delay, e.g. a simple usleep((i % 10) + 1)?
Works for me, thanks!
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
^ permalink raw reply
* Re: [PATCH v2 4/5] KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs
From: Sean Christopherson @ 2021-08-27 23:23 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: KVM list, Peter Zijlstra, Catalin Marinas, linux-kernel,
Will Deacon, Guo Ren, linux-kselftest, Ben Gardon, shuah,
Paul Mackerras, linux-s390, gor, Russell King, ARM Linux,
linux-csky, Christian Borntraeger, Ingo Molnar, dvhart,
linux-mips, Boqun Feng, paulmck, Heiko Carstens, rostedt,
Shakeel Butt, Andy Lutomirski, Thomas Gleixner, Peter Foley,
linux-arm-kernel, Thomas Bogendoerfer, Oleg Nesterov,
Paolo Bonzini, linuxppc-dev
In-Reply-To: <339641531.29941.1630091374065.JavaMail.zimbra@efficios.com>
On Fri, Aug 27, 2021, Mathieu Desnoyers wrote:
> > So there are effectively three reasons we want a delay:
> >
> > 1. To allow sched_setaffinity() to coincide with ioctl(KVM_RUN) before KVM can
> > enter the guest so that the guest doesn't need an arch-specific VM-Exit source.
> >
> > 2. To let ioctl(KVM_RUN) make its way back to the test before the next round
> > of migration.
> >
> > 3. To ensure the read-side can make forward progress, e.g. if sched_getcpu()
> > involves a syscall.
> >
> >
> > After looking at KVM for arm64 and s390, #1 is a bit tenuous because x86 is the
> > only arch that currently uses xfer_to_guest_mode_work(), i.e. the test could be
> > tweaked to be overtly x86-specific. But since a delay is needed for #2 and #3,
> > I'd prefer to rely on it for #1 as well in the hopes that this test provides
> > coverage for arm64 and/or s390 if they're ever converted to use the common
> > xfer_to_guest_mode_work().
>
> Now that we have this understanding of why we need the delay, it would be good to
> write this down in a comment within the test.
Ya, I'll get a new version out next week.
> Does it reproduce if we randomize the delay to have it picked randomly from 0us
> to 100us (with 1us step) ? It would remove a lot of the needs for arch-specific
> magic delay value.
My less-than-scientific testing shows that it can reproduce at delays up to ~500us,
but above ~10us the reproducibility starts to drop. The bug still reproduces
reliably, it just takes more iterations, and obviously the test runs a bit slower.
Any objection to using a 1-10us delay, e.g. a simple usleep((i % 10) + 1)?
^ permalink raw reply
* [PATCH] net: spider_net: switch from 'pci_' to 'dma_' API
From: Christophe JAILLET @ 2021-08-27 19:56 UTC (permalink / raw)
To: kou.ishizaki, geoff, davem, kuba
Cc: netdev, kernel-janitors, Christophe JAILLET, linuxppc-dev,
linux-kernel
In [1], Christoph Hellwig has proposed to remove the wrappers in
include/linux/pci-dma-compat.h.
Some reasons why this API should be removed have been given by Julia
Lawall in [2].
A coccinelle script has been used to perform the needed transformation
Only relevant parts are given below.
@@ @@
- PCI_DMA_BIDIRECTIONAL
+ DMA_BIDIRECTIONAL
@@ @@
- PCI_DMA_TODEVICE
+ DMA_TO_DEVICE
@@ @@
- PCI_DMA_FROMDEVICE
+ DMA_FROM_DEVICE
@@
expression e1, e2, e3, e4;
@@
- pci_map_single(e1, e2, e3, e4)
+ dma_map_single(&e1->dev, e2, e3, e4)
@@
expression e1, e2, e3, e4;
@@
- pci_unmap_single(e1, e2, e3, e4)
+ dma_unmap_single(&e1->dev, e2, e3, e4)
@@
expression e1, e2;
@@
- pci_dma_mapping_error(e1, e2)
+ dma_mapping_error(&e1->dev, e2)
[1]: https://lore.kernel.org/kernel-janitors/20200421081257.GA131897@infradead.org/
[2]: https://lore.kernel.org/kernel-janitors/alpine.DEB.2.22.394.2007120902170.2424@hadrien/
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
---
It has *not* been compile tested because I don't have the needed
configuration or cross-compiler. However, the modification is completely
mechanical and done by coccinelle.
---
drivers/net/ethernet/toshiba/spider_net.c | 27 +++++++++++++----------
1 file changed, 15 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/toshiba/spider_net.c b/drivers/net/ethernet/toshiba/spider_net.c
index 087f0af56c50..66d4e024d11e 100644
--- a/drivers/net/ethernet/toshiba/spider_net.c
+++ b/drivers/net/ethernet/toshiba/spider_net.c
@@ -354,9 +354,10 @@ spider_net_free_rx_chain_contents(struct spider_net_card *card)
descr = card->rx_chain.head;
do {
if (descr->skb) {
- pci_unmap_single(card->pdev, descr->hwdescr->buf_addr,
+ dma_unmap_single(&card->pdev->dev,
+ descr->hwdescr->buf_addr,
SPIDER_NET_MAX_FRAME,
- PCI_DMA_BIDIRECTIONAL);
+ DMA_BIDIRECTIONAL);
dev_kfree_skb(descr->skb);
descr->skb = NULL;
}
@@ -411,9 +412,9 @@ spider_net_prepare_rx_descr(struct spider_net_card *card,
if (offset)
skb_reserve(descr->skb, SPIDER_NET_RXBUF_ALIGN - offset);
/* iommu-map the skb */
- buf = pci_map_single(card->pdev, descr->skb->data,
- SPIDER_NET_MAX_FRAME, PCI_DMA_FROMDEVICE);
- if (pci_dma_mapping_error(card->pdev, buf)) {
+ buf = dma_map_single(&card->pdev->dev, descr->skb->data,
+ SPIDER_NET_MAX_FRAME, DMA_FROM_DEVICE);
+ if (dma_mapping_error(&card->pdev->dev, buf)) {
dev_kfree_skb_any(descr->skb);
descr->skb = NULL;
if (netif_msg_rx_err(card) && net_ratelimit())
@@ -653,8 +654,9 @@ spider_net_prepare_tx_descr(struct spider_net_card *card,
dma_addr_t buf;
unsigned long flags;
- buf = pci_map_single(card->pdev, skb->data, skb->len, PCI_DMA_TODEVICE);
- if (pci_dma_mapping_error(card->pdev, buf)) {
+ buf = dma_map_single(&card->pdev->dev, skb->data, skb->len,
+ DMA_TO_DEVICE);
+ if (dma_mapping_error(&card->pdev->dev, buf)) {
if (netif_msg_tx_err(card) && net_ratelimit())
dev_err(&card->netdev->dev, "could not iommu-map packet (%p, %i). "
"Dropping packet\n", skb->data, skb->len);
@@ -666,7 +668,8 @@ spider_net_prepare_tx_descr(struct spider_net_card *card,
descr = card->tx_chain.head;
if (descr->next == chain->tail->prev) {
spin_unlock_irqrestore(&chain->lock, flags);
- pci_unmap_single(card->pdev, buf, skb->len, PCI_DMA_TODEVICE);
+ dma_unmap_single(&card->pdev->dev, buf, skb->len,
+ DMA_TO_DEVICE);
return -ENOMEM;
}
hwdescr = descr->hwdescr;
@@ -822,8 +825,8 @@ spider_net_release_tx_chain(struct spider_net_card *card, int brutal)
/* unmap the skb */
if (skb) {
- pci_unmap_single(card->pdev, buf_addr, skb->len,
- PCI_DMA_TODEVICE);
+ dma_unmap_single(&card->pdev->dev, buf_addr, skb->len,
+ DMA_TO_DEVICE);
dev_consume_skb_any(skb);
}
}
@@ -1165,8 +1168,8 @@ spider_net_decode_one_descr(struct spider_net_card *card)
/* unmap descriptor */
hw_buf_addr = hwdescr->buf_addr;
hwdescr->buf_addr = 0xffffffff;
- pci_unmap_single(card->pdev, hw_buf_addr,
- SPIDER_NET_MAX_FRAME, PCI_DMA_FROMDEVICE);
+ dma_unmap_single(&card->pdev->dev, hw_buf_addr, SPIDER_NET_MAX_FRAME,
+ DMA_FROM_DEVICE);
if ( (status == SPIDER_NET_DESCR_RESPONSE_ERROR) ||
(status == SPIDER_NET_DESCR_PROTECTION_ERROR) ||
--
2.30.2
^ permalink raw reply related
* Re: [PATCH v4 6/6] sched/fair: Consider SMT in ASYM_PACKING load balance
From: Ricardo Neri @ 2021-08-27 19:45 UTC (permalink / raw)
To: Vincent Guittot
Cc: Juri Lelli, Aubrey Li, Srikar Dronamraju, Ravi V. Shankar,
Peter Zijlstra (Intel), Ricardo Neri, Ben Segall,
Srinivas Pandruvada, Joel Fernandes (Google), Ingo Molnar,
Rafael J. Wysocki, Steven Rostedt, Mel Gorman, Len Brown,
Nicholas Piggin, Aubrey Li, Dietmar Eggemann, Tim Chen,
Quentin Perret, Daniel Bristot de Oliveira, linux-kernel,
linuxppc-dev
In-Reply-To: <CAKfTPtArtmhLG5QYR6TzKevDrEuiQu2HJxm_C3pe549XdGU-1g@mail.gmail.com>
On Fri, Aug 27, 2021 at 12:13:42PM +0200, Vincent Guittot wrote:
> On Tue, 10 Aug 2021 at 16:41, Ricardo Neri
> <ricardo.neri-calderon@linux.intel.com> wrote:
> > @@ -9540,6 +9629,12 @@ static struct rq *find_busiest_queue(struct lb_env *env,
> > nr_running == 1)
> > continue;
> >
> > + /* Make sure we only pull tasks from a CPU of lower priority */
> > + if ((env->sd->flags & SD_ASYM_PACKING) &&
> > + sched_asym_prefer(i, env->dst_cpu) &&
> > + nr_running == 1)
> > + continue;
>
> This really looks similar to the test above for SD_ASYM_CPUCAPACITY.
> More generally speaking SD_ASYM_PACKING and SD_ASYM_CPUCAPACITY share
> a lot of common policy and I wonder if at some point we could not
> merge their behavior in LB
I would like to confirm with you that you are not expecting this merge
as part of this series, right?
Thanks and BR,
Ricardo
^ permalink raw reply
* Re: [PATCH v4 6/6] sched/fair: Consider SMT in ASYM_PACKING load balance
From: Ricardo Neri @ 2021-08-27 19:43 UTC (permalink / raw)
To: Vincent Guittot
Cc: Juri Lelli, Aubrey Li, Srikar Dronamraju, Ravi V. Shankar,
Peter Zijlstra, Ricardo Neri, Ben Segall, Srinivas Pandruvada,
Joel Fernandes (Google), Ingo Molnar, Rafael J. Wysocki,
Steven Rostedt, Mel Gorman, Len Brown, Nicholas Piggin, Aubrey Li,
Dietmar Eggemann, Tim Chen, Quentin Perret,
Daniel Bristot de Oliveira, linux-kernel, linuxppc-dev
In-Reply-To: <CAKfTPtBvhHBA-BLkh-fd2eJk_JOm2chgMy6AKpR=WV_hc3sQKA@mail.gmail.com>
On Fri, Aug 27, 2021 at 05:17:22PM +0200, Vincent Guittot wrote:
> On Fri, 27 Aug 2021 at 16:50, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Fri, Aug 27, 2021 at 12:13:42PM +0200, Vincent Guittot wrote:
> > > > +/**
> > > > + * asym_smt_can_pull_tasks - Check whether the load balancing CPU can pull tasks
> > > > + * @dst_cpu: Destination CPU of the load balancing
> > > > + * @sds: Load-balancing data with statistics of the local group
> > > > + * @sgs: Load-balancing statistics of the candidate busiest group
> > > > + * @sg: The candidate busiet group
> > > > + *
> > > > + * Check the state of the SMT siblings of both @sds::local and @sg and decide
> > > > + * if @dst_cpu can pull tasks. If @dst_cpu does not have SMT siblings, it can
> > > > + * pull tasks if two or more of the SMT siblings of @sg are busy. If only one
> > > > + * CPU in @sg is busy, pull tasks only if @dst_cpu has higher priority.
> > > > + *
> > > > + * If both @dst_cpu and @sg have SMT siblings, even the number of idle CPUs
> > > > + * between @sds::local and @sg. Thus, pull tasks from @sg if the difference
> > > > + * between the number of busy CPUs is 2 or more. If the difference is of 1,
> > > > + * only pull if @dst_cpu has higher priority. If @sg does not have SMT siblings
> > > > + * only pull tasks if all of the SMT siblings of @dst_cpu are idle and @sg
> > > > + * has lower priority.
> > > > + */
> > > > +static bool asym_smt_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
> > > > + struct sg_lb_stats *sgs,
> > > > + struct sched_group *sg)
> > > > +{
> > > > +#ifdef CONFIG_SCHED_SMT
> > > > + bool local_is_smt, sg_is_smt;
> > > > + int sg_busy_cpus;
> > > > +
> > > > + local_is_smt = sds->local->flags & SD_SHARE_CPUCAPACITY;
> > > > + sg_is_smt = sg->flags & SD_SHARE_CPUCAPACITY;
> > > > +
> > > > + sg_busy_cpus = sgs->group_weight - sgs->idle_cpus;
> > > > +
> > > > + if (!local_is_smt) {
> > > > + /*
> > > > + * If we are here, @dst_cpu is idle and does not have SMT
> > > > + * siblings. Pull tasks if candidate group has two or more
> > > > + * busy CPUs.
> > > > + */
> > > > + if (sg_is_smt && sg_busy_cpus >= 2)
> > > > + return true;
> > > > +
> > > > + /*
> > > > + * @dst_cpu does not have SMT siblings. @sg may have SMT
> > > > + * siblings and only one is busy. In such case, @dst_cpu
> > > > + * can help if it has higher priority and is idle.
> > > > + */
> > > > + return !sds->local_stat.group_util &&
> > >
> > > sds->local_stat.group_util can't be used to decide if a CPU or group
> > > of CPUs is idle. util_avg is usually not null when a CPU becomes idle
> > > and you can have to wait more than 300ms before it becomes Null
> > > At the opposite, the utilization of a CPU can be null but a task with
> > > null utilization has just woken up on it.
> > > Utilization is used to reflect the average work of the CPU or group of
> > > CPUs but not the current state
> >
> > If you want immediate idle, sgs->nr_running == 0 or sgs->idle_cpus ==
> > sgs->group_weight come to mind.
>
> yes, I have the same in mind
Thank you very much Vincent and Peter for the feedback! I'll look at
using these stats to determine immediate idle.
>
> >
> > > > + sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
> > > > + }
> > > > +
> > > > + /* @dst_cpu has SMT siblings. */
> > > > +
> > > > + if (sg_is_smt) {
> > > > + int local_busy_cpus = sds->local->group_weight -
> > > > + sds->local_stat.idle_cpus;
> > > > + int busy_cpus_delta = sg_busy_cpus - local_busy_cpus;
> > > > +
> > > > + /* Local can always help to even the number busy CPUs. */
> > >
> > > default behavior of the load balance already tries to even the number
> a> > of idle CPUs.
> >
> > Right, but I suppose this is because we're trapped here and have to deal
> > with the SMT-SMT case too. Ricardo, can you clarify?
>
> IIUC, this function is used in sg_lb_stats to set
> sgs->group_asym_packing which is then used to set the group state to
> group_asym_packing and force asym migration.
> But if we only want to even the number of busy CPUs between the group,
> we should not need to set the group state to group_asym_packing
Yes, what Vincent describe is the intent. Then, I think it is probably
true that it is not necessary to even the number of idle CPUs here.
>
> >
> > > > + if (busy_cpus_delta >= 2)
> > > > + return true;
> > > > +
> > > > + if (busy_cpus_delta == 1)
> > > > + return sched_asym_prefer(dst_cpu,
> > > > + sg->asym_prefer_cpu);
I think we should keep this check in order to move tasks to higher
priority CPUs.
> > > > +
> > > > + return false;
> > > > + }
> > > > +
> > > > + /*
> > > > + * @sg does not have SMT siblings. Ensure that @sds::local does not end
> > > > + * up with more than one busy SMT sibling and only pull tasks if there
> > > > + * are not busy CPUs. As CPUs move in and out of idle state frequently,
> > > > + * also check the group utilization to smoother the decision.
> > > > + */
> > > > + if (!sds->local_stat.group_util)
> > >
> > > same comment as above about the meaning of group_util == 0
I will look into using the suggested statistics.
Thanks and BR,
Ricardo
^ permalink raw reply
* Re: [PATCH v2 4/5] KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs
From: Mathieu Desnoyers @ 2021-08-27 19:09 UTC (permalink / raw)
To: Sean Christopherson
Cc: KVM list, Peter Zijlstra, Catalin Marinas, linux-kernel,
Will Deacon, Guo Ren, linux-kselftest, Ben Gardon, shuah,
Paul Mackerras, linux-s390, gor, Russell King, ARM Linux,
linux-csky, Christian Borntraeger, Ingo Molnar, dvhart,
linux-mips, Boqun Feng, paulmck, Heiko Carstens, rostedt,
Shakeel Butt, Andy Lutomirski, Thomas Gleixner, Peter Foley,
linux-arm-kernel, Thomas Bogendoerfer, Oleg Nesterov,
Paolo Bonzini, linuxppc-dev
In-Reply-To: <YSgpy8iXXXUQ+b/k@google.com>
----- On Aug 26, 2021, at 7:54 PM, Sean Christopherson seanjc@google.com wrote:
> On Thu, Aug 26, 2021, Mathieu Desnoyers wrote:
>> ----- On Aug 25, 2021, at 8:51 PM, Sean Christopherson seanjc@google.com wrote:
>> >> >> + r = sched_setaffinity(0, sizeof(allowed_mask), &allowed_mask);
>> >> >> + TEST_ASSERT(!r, "sched_setaffinity failed, errno = %d (%s)",
>> >> >> + errno, strerror(errno));
>> >> >> + atomic_inc(&seq_cnt);
>> >> >> +
>> >> >> + CPU_CLR(cpu, &allowed_mask);
>> >> >> +
>> >> >> + /*
>> >> >> + * Let the read-side get back into KVM_RUN to improve the odds
>> >> >> + * of task migration coinciding with KVM's run loop.
>> >> >
>> >> > This comment should be about increasing the odds of letting the seqlock
>> >> > read-side complete. Otherwise, the delay between the two back-to-back
>> >> > atomic_inc is so small that the seqlock read-side may never have time to
>> >> > complete the reading the rseq cpu id and the sched_getcpu() call, and can
>> >> > retry forever.
>> >
>> > Hmm, but that's not why there's a delay. I'm not arguing that a livelock isn't
>> > possible (though that syscall would have to be screaming fast),
>>
>> I don't think we have the same understanding of the livelock scenario. AFAIU the
>> livelock
>> would be caused by a too-small delay between the two consecutive atomic_inc()
>> from one
>> loop iteration to the next compared to the time it takes to perform a read-side
>> critical
>> section of the seqlock. Back-to-back atomic_inc can be performed very quickly,
>> so I
>> doubt that the sched_getcpu implementation have good odds to be fast enough to
>> complete
>> in that narrow window, leading to lots of read seqlock retry.
>
> Ooooh, yeah, brain fart on my side. I was thinking of the two atomic_inc() in
> the
> same loop iteration and completely ignoring the next iteration. Yes, I 100%
> agree
> that a delay to ensure forward progress is needed. An assertion in main() that
> the
> reader complete at least some reasonable number of KVM_RUNs is also probably a
> good
> idea, e.g. to rule out a false pass due to the reader never making forward
> progress.
Agreed.
>
> FWIW, the do-while loop does make forward progress without a delay, but at ~50%
> throughput, give or take.
I did not expect absolutely no progress, but a significant slow down of
the read-side.
>
>> > but the primary motivation is very much to allow the read-side enough time
>> > to get back into KVM proper.
>>
>> I'm puzzled by your statement. AFAIU, let's say we don't have the delay, then we
>> have:
>>
>> migration thread KVM_RUN/read-side thread
>> -----------------------------------------------------------------------------------
>> - ioctl(KVM_RUN)
>> - atomic_inc_seq_cst(&seqcnt)
>> - sched_setaffinity
>> - atomic_inc_seq_cst(&seqcnt)
>> - a = atomic_load(&seqcnt) & ~1
>> - smp_rmb()
>> - b = LOAD_ONCE(__rseq_abi->cpu_id);
>> - sched_getcpu()
>> - smp_rmb()
>> - re-load seqcnt/compare (succeeds)
>> - Can only succeed if entire read-side happens while the seqcnt
>> is in an even numbered state.
>> - if (a != b) abort()
>> /* no delay. Even counter state is very
>> short. */
>> - atomic_inc_seq_cst(&seqcnt)
>> /* Let's suppose the lack of delay causes the
>> setaffinity to complete too early compared
>> with KVM_RUN ioctl */
>> - sched_setaffinity
>> - atomic_inc_seq_cst(&seqcnt)
>>
>> /* no delay. Even counter state is very
>> short. */
>> - atomic_inc_seq_cst(&seqcnt)
>> /* Then a setaffinity from a following
>> migration thread loop will run
>> concurrently with KVM_RUN */
>> - ioctl(KVM_RUN)
>> - sched_setaffinity
>> - atomic_inc_seq_cst(&seqcnt)
>>
>> As pointed out here, if the first setaffinity runs too early compared with
>> KVM_RUN,
>> a following setaffinity will run concurrently with it. However, the fact that
>> the even counter state is very short may very well hurt progress of the read
>> seqlock.
>
> *sigh*
>
> Several hours later, I think I finally have my head wrapped around everything.
>
> Due to the way the test is written and because of how KVM's run loop works,
> TIF_NOTIFY_RESUME or TIF_NEED_RESCHED effectively has to be set before KVM
> actually
> enters the guest, otherwise KVM will exit to userspace without touching the
> flag,
> i.e. it will be handled by the normal exit_to_user_mode_loop().
>
> Where I got lost was trying to figure out why I couldn't make the bug reproduce
> by
> causing the guest to exit to KVM, but not userspace, in which case KVM should
> easily trigger the problematic flow as the window for sched_getcpu() to collide
> with KVM would be enormous. The reason I didn't go down this route for the
> "official" test is that, unless there's something clever I'm overlooking, it
> requires arch specific guest code, and ialso I don't know that forcing an exit
> would even be necessary/sufficient on other architectures.
>
> Anyways, I was trying to confirm that the bug was being hit without a delay,
> while
> still retaining the sequence retry in the test. The test doesn't fail because
> the
> back-to-back atomic_inc() changes seqcnt too fast. The read-side makes forward
> progress, but it never observes failure because the do-while loop only ever
> completes after another sched_setaffinity(), never after the one that collides
> with KVM because it takes too long to get out of ioctl(KVM_RUN) and back to the
> test. I.e. the atomic_inc() in the next loop iteration (makes seq_cnt odd)
> always
> completes before the check, and so the check ends up spinning until another
> migration, which correctly updates rseq. This was expected and didn't confuse
> me.
>
> What confused me is that I was trying to confirm the bug was being hit from
> within
> the kernel by confirming KVM observed TIF_NOTIFY_RESUME, but I misunderstood
> when
> TIF_NOTIFY_RESUME would get set. KVM can observe TIF_NOTIFY_RESUME directly,
> but
> it's rare, and I suspect happens iff sched_setaffinity() hits the small window
> where
> it collides with KVM_RUN before KVM enters the guest.
>
> More commonly, the bug occurs when KVM sees TIF_NEED_RESCHED. In that case, KVM
> calls xfer_to_guest_mode_work(), which does schedule() and _that_ sets
> TIF_NOTIFY_RESUME. xfer_to_guest_mode_work() then mishandles TIF_NOTIFY_RESUME
> and the bug is hit, but my confirmation logic in KVM never fired.
>
> So there are effectively three reasons we want a delay:
>
> 1. To allow sched_setaffinity() to coincide with ioctl(KVM_RUN) before KVM can
> enter the guest so that the guest doesn't need an arch-specific VM-Exit source.
>
> 2. To let ioctl(KVM_RUN) make its way back to the test before the next round
> of migration.
>
> 3. To ensure the read-side can make forward progress, e.g. if sched_getcpu()
> involves a syscall.
>
>
> After looking at KVM for arm64 and s390, #1 is a bit tenuous because x86 is the
> only arch that currently uses xfer_to_guest_mode_work(), i.e. the test could be
> tweaked to be overtly x86-specific. But since a delay is needed for #2 and #3,
> I'd prefer to rely on it for #1 as well in the hopes that this test provides
> coverage for arm64 and/or s390 if they're ever converted to use the common
> xfer_to_guest_mode_work().
Now that we have this understanding of why we need the delay, it would be good to
write this down in a comment within the test.
Does it reproduce if we randomize the delay to have it picked randomly from 0us
to 100us (with 1us step) ? It would remove a lot of the needs for arch-specific
magic delay value.
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
^ permalink raw reply
* Re: [FSL P50x0] lscpu reports wrong values since the RC1 of kernel 5.13
From: Christian Zigotzky @ 2021-08-27 18:32 UTC (permalink / raw)
To: Srikar Dronamraju
Cc: Darren Stevens, mad skateman, qemu-devel, R.T.Dickinson,
linuxppc-dev
In-Reply-To: <20210826034310.GA296102@linux.vnet.ibm.com>
> On 26. Aug 2021, at 05:43, Srikar Dronamraju <srikar@linux.vnet.ibm.com> wrote:
>
> * Christian Zigotzky <chzigotzky@xenosoft.de> [2021-08-16 14:29:21]:
>
>
> Hi Christian,
>
>> I tested the RC6 of kernel 5.14 today and unfortunately the issue still
>> exists. We have figured out that only P5040 SoCs are affected. [1]
>> P5020 SoCs display the correct values.
>> Please check the CPU changes in the PowerPC updates 5.13-1 and 5.13-2.
>>
>
> Thanks for reporting the issue.
> Would it be possible to try
> https://lore.kernel.org/linuxppc-dev/20210821092419.167454-3-srikar@linux.vnet.ibm.com/t/#u
Hi Srikar,
This patch works! Thanks a lot!
Cheers,
Christian
>
> If the above patch is not helping, then can you please collect the output of
>
> cat /sys/devices/system/cpu/cpu*/topology/core_siblings
>
> Were all the CPUs online at the time of boot?
> Did we do any CPU online/offline operations post boot?
>
> If we did CPU online/offline, can you capture the output just after the
> boot along with lscpu output..
>
> Since this is being seen on few SOCs, can you summarize the difference
> between P5040 and P5020.
>>
>> [1] https://forum.hyperion-entertainment.com/viewtopic.php?p=53775#p53775
>>
>>
>>> On 09 August 2021 um 02:37 pm, Christian Zigotzky wrote:
>>> Hi All,
>>>
>>> Lscpu reports wrong values [1] since the RC1 of kernel 5.13 on my FSL
>>> P5040 Cyrus+ board (A-EON AmigaOne X5000). [2]
>>> The differences are:
>>>
>>> Since the RC1 of kernel 5.13 (wrong values):
>>>
>>> Core(s) per socket: 1
>>> Socket(s): 3
>>>
>
> I know that the socket count was off by 1, but I cant explain how its off by
> 2 here.
>
>>> Before (correct values):
>>>
>>> Core(s) per socket: 4
>>> Socket(s): 1
>>>
>
> --
> Thanks and Regards
> Srikar Dronamraju
^ permalink raw reply
* Re: [PATCH v6 10/11] powerpc/pseries/iommu: Make use of DDW for indirect mapping
From: Leonardo Brás @ 2021-08-27 17:53 UTC (permalink / raw)
To: Frederic Barrat, Alexey Kardashevskiy, Michael Ellerman,
Benjamin Herrenschmidt, Paul Mackerras, David Gibson,
Nicolin Chen, kernel test robot
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <3d79480a-20df-ea1a-e17f-8bf2c8a8a2be@linux.ibm.com>
Hello Fred,
On Fri, 2021-08-27 at 19:06 +0200, Frederic Barrat wrote:
>
> I think it looks ok now as it was mostly me who was misunderstanding
> one
> part of the previous iteration.
> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
>
Thank you for reviewing this series!
> Sorry for the late review, I was enjoying some time off. And thanks
> for
> that series, I believe it should help with those bugs complaining
> about
> lack of DMA space. It was also very educational for me, thanks to you
> and Alexey for your detailed answers.
Working on this series caused me to learn a lot about how DMA and IOMMU
works in Power, and it was a great experience. Thanks to Alexey who
helped and guided me through this!
Best regards,
Leonardo Bras
^ permalink raw reply
* Re: [PATCH v6 10/11] powerpc/pseries/iommu: Make use of DDW for indirect mapping
From: Frederic Barrat @ 2021-08-27 17:06 UTC (permalink / raw)
To: Leonardo Bras, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras, Alexey Kardashevskiy, David Gibson, Nicolin Chen,
kernel test robot
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20210817063929.38701-11-leobras.c@gmail.com>
On 17/08/2021 08:39, Leonardo Bras wrote:
> So far it's assumed possible to map the guest RAM 1:1 to the bus, which
> works with a small number of devices. SRIOV changes it as the user can
> configure hundreds VFs and since phyp preallocates TCEs and does not
> allow IOMMU pages bigger than 64K, it has to limit the number of TCEs
> per a PE to limit waste of physical pages.
>
> As of today, if the assumed direct mapping is not possible, DDW creation
> is skipped and the default DMA window "ibm,dma-window" is used instead.
>
> By using DDW, indirect mapping can get more TCEs than available for the
> default DMA window, and also get access to using much larger pagesizes
> (16MB as implemented in qemu vs 4k from default DMA window), causing a
> significant increase on the maximum amount of memory that can be IOMMU
> mapped at the same time.
>
> Indirect mapping will only be used if direct mapping is not a
> possibility.
>
> For indirect mapping, it's necessary to re-create the iommu_table with
> the new DMA window parameters, so iommu_alloc() can use it.
>
> Removing the default DMA window for using DDW with indirect mapping
> is only allowed if there is no current IOMMU memory allocated in
> the iommu_table. enable_ddw() is aborted otherwise.
>
> Even though there won't be both direct and indirect mappings at the
> same time, we can't reuse the DIRECT64_PROPNAME property name, or else
> an older kexec()ed kernel can assume direct mapping, and skip
> iommu_alloc(), causing undesirable behavior.
> So a new property name DMA64_PROPNAME "linux,dma64-ddr-window-info"
> was created to represent a DDW that does not allow direct mapping.
>
> Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
> ---
I think it looks ok now as it was mostly me who was misunderstanding one
part of the previous iteration.
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Sorry for the late review, I was enjoying some time off. And thanks for
that series, I believe it should help with those bugs complaining about
lack of DMA space. It was also very educational for me, thanks to you
and Alexey for your detailed answers.
Fred
> arch/powerpc/platforms/pseries/iommu.c | 89 +++++++++++++++++++++-----
> 1 file changed, 74 insertions(+), 15 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> index e11c00b2dc1e..0eccc29f5573 100644
> --- a/arch/powerpc/platforms/pseries/iommu.c
> +++ b/arch/powerpc/platforms/pseries/iommu.c
> @@ -375,6 +375,7 @@ static DEFINE_SPINLOCK(direct_window_list_lock);
> /* protects initializing window twice for same device */
> static DEFINE_MUTEX(direct_window_init_mutex);
> #define DIRECT64_PROPNAME "linux,direct64-ddr-window-info"
> +#define DMA64_PROPNAME "linux,dma64-ddr-window-info"
>
> static int tce_clearrange_multi_pSeriesLP(unsigned long start_pfn,
> unsigned long num_pfn, const void *arg)
> @@ -940,6 +941,7 @@ static int find_existing_ddw_windows(void)
> return 0;
>
> find_existing_ddw_windows_named(DIRECT64_PROPNAME);
> + find_existing_ddw_windows_named(DMA64_PROPNAME);
>
> return 0;
> }
> @@ -1226,14 +1228,17 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
> struct ddw_create_response create;
> int page_shift;
> u64 win_addr;
> + const char *win_name;
> struct device_node *dn;
> u32 ddw_avail[DDW_APPLICABLE_SIZE];
> struct direct_window *window;
> struct property *win64;
> bool ddw_enabled = false;
> struct failed_ddw_pdn *fpdn;
> - bool default_win_removed = false;
> + bool default_win_removed = false, direct_mapping = false;
> bool pmem_present;
> + struct pci_dn *pci = PCI_DN(pdn);
> + struct iommu_table *tbl = pci->table_group->tables[0];
>
> dn = of_find_node_by_type(NULL, "ibm,pmemory");
> pmem_present = dn != NULL;
> @@ -1242,6 +1247,7 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
> mutex_lock(&direct_window_init_mutex);
>
> if (find_existing_ddw(pdn, &dev->dev.archdata.dma_offset, &len)) {
> + direct_mapping = (len >= max_ram_len);
> ddw_enabled = true;
> goto out_unlock;
> }
> @@ -1322,8 +1328,8 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
> query.page_size);
> goto out_failed;
> }
> - /* verify the window * number of ptes will map the partition */
> - /* check largest block * page size > max memory hotplug addr */
> +
> +
> /*
> * The "ibm,pmemory" can appear anywhere in the address space.
> * Assuming it is still backed by page structs, try MAX_PHYSMEM_BITS
> @@ -1339,13 +1345,25 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
> dev_info(&dev->dev, "Skipping ibm,pmemory");
> }
>
> + /* check if the available block * number of ptes will map everything */
> if (query.largest_available_block < (1ULL << (len - page_shift))) {
> dev_dbg(&dev->dev,
> "can't map partition max 0x%llx with %llu %llu-sized pages\n",
> 1ULL << len,
> query.largest_available_block,
> 1ULL << page_shift);
> - goto out_failed;
> +
> + /* DDW + IOMMU on single window may fail if there is any allocation */
> + if (default_win_removed && iommu_table_in_use(tbl)) {
> + dev_dbg(&dev->dev, "current IOMMU table in use, can't be replaced.\n");
> + goto out_failed;
> + }
> +
> + len = order_base_2(query.largest_available_block << page_shift);
> + win_name = DMA64_PROPNAME;
> + } else {
> + direct_mapping = true;
> + win_name = DIRECT64_PROPNAME;
> }
>
> ret = create_ddw(dev, ddw_avail, &create, page_shift, len);
> @@ -1356,8 +1374,8 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
> create.liobn, dn);
>
> win_addr = ((u64)create.addr_hi << 32) | create.addr_lo;
> - win64 = ddw_property_create(DIRECT64_PROPNAME, create.liobn, win_addr,
> - page_shift, len);
> + win64 = ddw_property_create(win_name, create.liobn, win_addr, page_shift, len);
> +
> if (!win64) {
> dev_info(&dev->dev,
> "couldn't allocate property, property name, or value\n");
> @@ -1375,15 +1393,54 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
> if (!window)
> goto out_del_prop;
>
> - ret = walk_system_ram_range(0, memblock_end_of_DRAM() >> PAGE_SHIFT,
> - win64->value, tce_setrange_multi_pSeriesLP_walk);
> - if (ret) {
> - dev_info(&dev->dev, "failed to map direct window for %pOF: %d\n",
> - dn, ret);
> + if (direct_mapping) {
> + /* DDW maps the whole partition, so enable direct DMA mapping */
> + ret = walk_system_ram_range(0, memblock_end_of_DRAM() >> PAGE_SHIFT,
> + win64->value, tce_setrange_multi_pSeriesLP_walk);
> + if (ret) {
> + dev_info(&dev->dev, "failed to map direct window for %pOF: %d\n",
> + dn, ret);
>
> /* Make sure to clean DDW if any TCE was set*/
> clean_dma_window(pdn, win64->value);
> - goto out_del_list;
> + goto out_del_list;
> + }
> + } else {
> + struct iommu_table *newtbl;
> + int i;
> +
> + for (i = 0; i < ARRAY_SIZE(pci->phb->mem_resources); i++) {
> + const unsigned long mask = IORESOURCE_MEM_64 | IORESOURCE_MEM;
> +
> + /* Look for MMIO32 */
> + if ((pci->phb->mem_resources[i].flags & mask) == IORESOURCE_MEM)
> + break;
> + }
> +
> + if (i == ARRAY_SIZE(pci->phb->mem_resources))
> + goto out_del_list;
> +
> + /* New table for using DDW instead of the default DMA window */
> + newtbl = iommu_pseries_alloc_table(pci->phb->node);
> + if (!newtbl) {
> + dev_dbg(&dev->dev, "couldn't create new IOMMU table\n");
> + goto out_del_list;
> + }
> +
> + iommu_table_setparms_common(newtbl, pci->phb->bus->number, create.liobn, win_addr,
> + 1UL << len, page_shift, NULL, &iommu_table_lpar_multi_ops);
> + iommu_init_table(newtbl, pci->phb->node, pci->phb->mem_resources[i].start,
> + pci->phb->mem_resources[i].end);
> +
> + pci->table_group->tables[1] = newtbl;
> +
> + /* Keep default DMA window stuct if removed */
> + if (default_win_removed) {
> + tbl->it_size = 0;
> + kfree(tbl->it_map);
> + }
> +
> + set_iommu_table_base(&dev->dev, newtbl);
> }
>
> spin_lock(&direct_window_list_lock);
> @@ -1427,10 +1484,10 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
> * as RAM, then we failed to create a window to cover persistent
> * memory and need to set the DMA limit.
> */
> - if (pmem_present && ddw_enabled && (len == max_ram_len))
> + if (pmem_present && ddw_enabled && direct_mapping && len == max_ram_len)
> dev->dev.bus_dma_limit = dev->dev.archdata.dma_offset + (1ULL << len);
>
> - return ddw_enabled;
> + return ddw_enabled && direct_mapping;
> }
>
> static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev)
> @@ -1572,7 +1629,9 @@ static int iommu_reconfig_notifier(struct notifier_block *nb, unsigned long acti
> * we have to remove the property when releasing
> * the device node.
> */
> - remove_ddw(np, false, DIRECT64_PROPNAME);
> + if (remove_ddw(np, false, DIRECT64_PROPNAME))
> + remove_ddw(np, false, DMA64_PROPNAME);
> +
> if (pci && pci->table_group)
> iommu_pseries_free_group(pci->table_group,
> np->full_name);
>
^ permalink raw reply
* Re: [PATCH v6 08/11] powerpc/pseries/iommu: Update remove_dma_window() to accept property name
From: Frederic Barrat @ 2021-08-27 16:58 UTC (permalink / raw)
To: Leonardo Bras, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras, Alexey Kardashevskiy, David Gibson, Nicolin Chen,
kernel test robot
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20210817063929.38701-9-leobras.c@gmail.com>
On 17/08/2021 08:39, Leonardo Bras wrote:
> Update remove_dma_window() so it can be used to remove DDW with a given
> property name.
>
> This enables the creation of new property names for DDW, so we can
> have different usage for it, like indirect mapping.
>
> Also, add return values to it so we can check if the property was found
> while removing the active DDW. This allows skipping the remaining property
> names while reducing the impact of multiple property names.
>
> Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
> arch/powerpc/platforms/pseries/iommu.c | 18 ++++++++++--------
> 1 file changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> index a47f59a8f107..901f290999d0 100644
> --- a/arch/powerpc/platforms/pseries/iommu.c
> +++ b/arch/powerpc/platforms/pseries/iommu.c
> @@ -844,31 +844,33 @@ static void remove_dma_window(struct device_node *np, u32 *ddw_avail,
> __remove_dma_window(np, ddw_avail, liobn);
> }
>
> -static void remove_ddw(struct device_node *np, bool remove_prop)
> +static int remove_ddw(struct device_node *np, bool remove_prop, const char *win_name)
> {
> struct property *win;
> u32 ddw_avail[DDW_APPLICABLE_SIZE];
> int ret = 0;
>
> + win = of_find_property(np, win_name, NULL);
> + if (!win)
> + return -EINVAL;
> +
> ret = of_property_read_u32_array(np, "ibm,ddw-applicable",
> &ddw_avail[0], DDW_APPLICABLE_SIZE);
> if (ret)
> - return;
> + return 0;
>
> - win = of_find_property(np, DIRECT64_PROPNAME, NULL);
> - if (!win)
> - return;
>
> if (win->length >= sizeof(struct dynamic_dma_window_prop))
> remove_dma_window(np, ddw_avail, win);
>
> if (!remove_prop)
> - return;
> + return 0;
>
> ret = of_remove_property(np, win);
> if (ret)
> pr_warn("%pOF: failed to remove direct window property: %d\n",
> np, ret);
> + return 0;
> }
>
> static bool find_existing_ddw(struct device_node *pdn, u64 *dma_addr, int *window_shift)
> @@ -921,7 +923,7 @@ static int find_existing_ddw_windows(void)
> for_each_node_with_property(pdn, DIRECT64_PROPNAME) {
> direct64 = of_get_property(pdn, DIRECT64_PROPNAME, &len);
> if (!direct64 || len < sizeof(*direct64)) {
> - remove_ddw(pdn, true);
> + remove_ddw(pdn, true, DIRECT64_PROPNAME);
> continue;
> }
>
> @@ -1565,7 +1567,7 @@ static int iommu_reconfig_notifier(struct notifier_block *nb, unsigned long acti
> * we have to remove the property when releasing
> * the device node.
> */
> - remove_ddw(np, false);
> + remove_ddw(np, false, DIRECT64_PROPNAME);
> if (pci && pci->table_group)
> iommu_pseries_free_group(pci->table_group,
> np->full_name);
>
^ permalink raw reply
* Re: [PATCH v6 06/11] powerpc/pseries/iommu: Add ddw_property_create() and refactor enable_ddw()
From: Frederic Barrat @ 2021-08-27 16:55 UTC (permalink / raw)
To: Leonardo Bras, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras, Alexey Kardashevskiy, David Gibson, Nicolin Chen,
kernel test robot
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20210817063929.38701-7-leobras.c@gmail.com>
On 17/08/2021 08:39, Leonardo Bras wrote:
> Code used to create a ddw property that was previously scattered in
> enable_ddw() is now gathered in ddw_property_create(), which deals with
> allocation and filling the property, letting it ready for
> of_property_add(), which now occurs in sequence.
>
> This created an opportunity to reorganize the second part of enable_ddw():
>
> Without this patch enable_ddw() does, in order:
> kzalloc() property & members, create_ddw(), fill ddwprop inside property,
> ddw_list_new_entry(), do tce_setrange_multi_pSeriesLP_walk in all memory,
> of_add_property(), and list_add().
>
> With this patch enable_ddw() does, in order:
> create_ddw(), ddw_property_create(), of_add_property(),
> ddw_list_new_entry(), do tce_setrange_multi_pSeriesLP_walk in all memory,
> and list_add().
>
> This change requires of_remove_property() in case anything fails after
> of_add_property(), but we get to do tce_setrange_multi_pSeriesLP_walk
> in all memory, which looks the most expensive operation, only if
> everything else succeeds.
>
> Also, the error path got remove_ddw() replaced by a new helper
> __remove_dma_window(), which only removes the new DDW with an rtas-call.
> For this, a new helper clean_dma_window() was needed to clean anything
> that could left if walk_system_ram_range() fails.
>
> Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
Thanks for fixing the error paths
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
> arch/powerpc/platforms/pseries/iommu.c | 129 ++++++++++++++++---------
> 1 file changed, 84 insertions(+), 45 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> index b34b473bbdc1..00392582fe10 100644
> --- a/arch/powerpc/platforms/pseries/iommu.c
> +++ b/arch/powerpc/platforms/pseries/iommu.c
> @@ -795,17 +795,10 @@ static int __init disable_ddw_setup(char *str)
>
> early_param("disable_ddw", disable_ddw_setup);
>
> -static void remove_dma_window(struct device_node *np, u32 *ddw_avail,
> - struct property *win)
> +static void clean_dma_window(struct device_node *np, struct dynamic_dma_window_prop *dwp)
> {
> - struct dynamic_dma_window_prop *dwp;
> - u64 liobn;
> int ret;
>
> - dwp = win->value;
> - liobn = (u64)be32_to_cpu(dwp->liobn);
> -
> - /* clear the whole window, note the arg is in kernel pages */
> ret = tce_clearrange_multi_pSeriesLP(0,
> 1ULL << (be32_to_cpu(dwp->window_shift) - PAGE_SHIFT), dwp);
> if (ret)
> @@ -814,18 +807,39 @@ static void remove_dma_window(struct device_node *np, u32 *ddw_avail,
> else
> pr_debug("%pOF successfully cleared tces in window.\n",
> np);
> +}
> +
> +/*
> + * Call only if DMA window is clean.
> + */
> +static void __remove_dma_window(struct device_node *np, u32 *ddw_avail, u64 liobn)
> +{
> + int ret;
>
> ret = rtas_call(ddw_avail[DDW_REMOVE_PE_DMA_WIN], 1, 1, NULL, liobn);
> if (ret)
> - pr_warn("%pOF: failed to remove direct window: rtas returned "
> + pr_warn("%pOF: failed to remove DMA window: rtas returned "
> "%d to ibm,remove-pe-dma-window(%x) %llx\n",
> np, ret, ddw_avail[DDW_REMOVE_PE_DMA_WIN], liobn);
> else
> - pr_debug("%pOF: successfully removed direct window: rtas returned "
> + pr_debug("%pOF: successfully removed DMA window: rtas returned "
> "%d to ibm,remove-pe-dma-window(%x) %llx\n",
> np, ret, ddw_avail[DDW_REMOVE_PE_DMA_WIN], liobn);
> }
>
> +static void remove_dma_window(struct device_node *np, u32 *ddw_avail,
> + struct property *win)
> +{
> + struct dynamic_dma_window_prop *dwp;
> + u64 liobn;
> +
> + dwp = win->value;
> + liobn = (u64)be32_to_cpu(dwp->liobn);
> +
> + clean_dma_window(np, dwp);
> + __remove_dma_window(np, ddw_avail, liobn);
> +}
> +
> static void remove_ddw(struct device_node *np, bool remove_prop)
> {
> struct property *win;
> @@ -1153,6 +1167,35 @@ static int iommu_get_page_shift(u32 query_page_size)
> return 0;
> }
>
> +static struct property *ddw_property_create(const char *propname, u32 liobn, u64 dma_addr,
> + u32 page_shift, u32 window_shift)
> +{
> + struct dynamic_dma_window_prop *ddwprop;
> + struct property *win64;
> +
> + win64 = kzalloc(sizeof(*win64), GFP_KERNEL);
> + if (!win64)
> + return NULL;
> +
> + win64->name = kstrdup(propname, GFP_KERNEL);
> + ddwprop = kzalloc(sizeof(*ddwprop), GFP_KERNEL);
> + win64->value = ddwprop;
> + win64->length = sizeof(*ddwprop);
> + if (!win64->name || !win64->value) {
> + kfree(win64->name);
> + kfree(win64->value);
> + kfree(win64);
> + return NULL;
> + }
> +
> + ddwprop->liobn = cpu_to_be32(liobn);
> + ddwprop->dma_base = cpu_to_be64(dma_addr);
> + ddwprop->tce_shift = cpu_to_be32(page_shift);
> + ddwprop->window_shift = cpu_to_be32(window_shift);
> +
> + return win64;
> +}
> +
> /*
> * If the PE supports dynamic dma windows, and there is space for a table
> * that can map all pages in a linear offset, then setup such a table,
> @@ -1171,12 +1214,12 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
> struct ddw_query_response query;
> struct ddw_create_response create;
> int page_shift;
> + u64 win_addr;
> struct device_node *dn;
> u32 ddw_avail[DDW_APPLICABLE_SIZE];
> struct direct_window *window;
> struct property *win64;
> bool ddw_enabled = false;
> - struct dynamic_dma_window_prop *ddwprop;
> struct failed_ddw_pdn *fpdn;
> bool default_win_removed = false;
> bool pmem_present;
> @@ -1293,72 +1336,68 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
> 1ULL << page_shift);
> goto out_failed;
> }
> - win64 = kzalloc(sizeof(struct property), GFP_KERNEL);
> - if (!win64) {
> - dev_info(&dev->dev,
> - "couldn't allocate property for 64bit dma window\n");
> - goto out_failed;
> - }
> - win64->name = kstrdup(DIRECT64_PROPNAME, GFP_KERNEL);
> - win64->value = ddwprop = kmalloc(sizeof(*ddwprop), GFP_KERNEL);
> - win64->length = sizeof(*ddwprop);
> - if (!win64->name || !win64->value) {
> - dev_info(&dev->dev,
> - "couldn't allocate property name and value\n");
> - goto out_free_prop;
> - }
>
> ret = create_ddw(dev, ddw_avail, &create, page_shift, len);
> if (ret != 0)
> - goto out_free_prop;
> -
> - ddwprop->liobn = cpu_to_be32(create.liobn);
> - ddwprop->dma_base = cpu_to_be64(((u64)create.addr_hi << 32) |
> - create.addr_lo);
> - ddwprop->tce_shift = cpu_to_be32(page_shift);
> - ddwprop->window_shift = cpu_to_be32(len);
> + goto out_failed;
>
> dev_dbg(&dev->dev, "created tce table LIOBN 0x%x for %pOF\n",
> create.liobn, dn);
>
> - window = ddw_list_new_entry(pdn, ddwprop);
> + win_addr = ((u64)create.addr_hi << 32) | create.addr_lo;
> + win64 = ddw_property_create(DIRECT64_PROPNAME, create.liobn, win_addr,
> + page_shift, len);
> + if (!win64) {
> + dev_info(&dev->dev,
> + "couldn't allocate property, property name, or value\n");
> + goto out_remove_win;
> + }
> +
> + ret = of_add_property(pdn, win64);
> + if (ret) {
> + dev_err(&dev->dev, "unable to add dma window property for %pOF: %d",
> + pdn, ret);
> + goto out_free_prop;
> + }
> +
> + window = ddw_list_new_entry(pdn, win64->value);
> if (!window)
> - goto out_clear_window;
> + goto out_del_prop;
>
> ret = walk_system_ram_range(0, memblock_end_of_DRAM() >> PAGE_SHIFT,
> win64->value, tce_setrange_multi_pSeriesLP_walk);
> if (ret) {
> dev_info(&dev->dev, "failed to map direct window for %pOF: %d\n",
> dn, ret);
> - goto out_free_window;
> - }
>
> - ret = of_add_property(pdn, win64);
> - if (ret) {
> - dev_err(&dev->dev, "unable to add dma window property for %pOF: %d",
> - pdn, ret);
> - goto out_free_window;
> + /* Make sure to clean DDW if any TCE was set*/
> + clean_dma_window(pdn, win64->value);
> + goto out_del_list;
> }
>
> spin_lock(&direct_window_list_lock);
> list_add(&window->list, &direct_window_list);
> spin_unlock(&direct_window_list_lock);
>
> - dev->dev.archdata.dma_offset = be64_to_cpu(ddwprop->dma_base);
> + dev->dev.archdata.dma_offset = win_addr;
> ddw_enabled = true;
> goto out_unlock;
>
> -out_free_window:
> +out_del_list:
> kfree(window);
>
> -out_clear_window:
> - remove_ddw(pdn, true);
> +out_del_prop:
> + of_remove_property(pdn, win64);
>
> out_free_prop:
> kfree(win64->name);
> kfree(win64->value);
> kfree(win64);
>
> +out_remove_win:
> + /* DDW is clean, so it's ok to call this directly. */
> + __remove_dma_window(pdn, ddw_avail, create.liobn);
> +
> out_failed:
> if (default_win_removed)
> reset_dma_window(dev, pdn);
>
^ permalink raw reply
* Re: [PATCH v6 02/11] powerpc/kernel/iommu: Add new iommu_table_in_use() helper
From: Frederic Barrat @ 2021-08-27 16:52 UTC (permalink / raw)
To: Leonardo Bras, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras, Alexey Kardashevskiy, David Gibson, Nicolin Chen,
kernel test robot
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20210817063929.38701-3-leobras.c@gmail.com>
On 17/08/2021 08:39, Leonardo Bras wrote:
> Having a function to check if the iommu table has any allocation helps
> deciding if a tbl can be reset for using a new DMA window.
>
> It should be enough to replace all instances of !bitmap_empty(tbl...).
>
> iommu_table_in_use() skips reserved memory, so we don't need to worry about
> releasing it before testing. This causes iommu_table_release_pages() to
> become unnecessary, given it is only used to remove reserved memory for
> testing.
>
> Also, only allow storing reserved memory values in tbl if they are valid
> in the table, so there is no need to check it in the new helper.
>
> Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
Looks ok to me now, thanks!
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
> arch/powerpc/include/asm/iommu.h | 1 +
> arch/powerpc/kernel/iommu.c | 61 ++++++++++++++++----------------
> 2 files changed, 32 insertions(+), 30 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
> index deef7c94d7b6..bf3b84128525 100644
> --- a/arch/powerpc/include/asm/iommu.h
> +++ b/arch/powerpc/include/asm/iommu.h
> @@ -154,6 +154,7 @@ extern int iommu_tce_table_put(struct iommu_table *tbl);
> */
> extern struct iommu_table *iommu_init_table(struct iommu_table *tbl,
> int nid, unsigned long res_start, unsigned long res_end);
> +bool iommu_table_in_use(struct iommu_table *tbl);
>
> #define IOMMU_TABLE_GROUP_MAX_TABLES 2
>
> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> index 2af89a5e379f..ed98ad63633e 100644
> --- a/arch/powerpc/kernel/iommu.c
> +++ b/arch/powerpc/kernel/iommu.c
> @@ -690,32 +690,24 @@ static void iommu_table_reserve_pages(struct iommu_table *tbl,
> if (tbl->it_offset == 0)
> set_bit(0, tbl->it_map);
>
> - tbl->it_reserved_start = res_start;
> - tbl->it_reserved_end = res_end;
> -
> - /* Check if res_start..res_end isn't empty and overlaps the table */
> - if (res_start && res_end &&
> - (tbl->it_offset + tbl->it_size < res_start ||
> - res_end < tbl->it_offset))
> - return;
> + if (res_start < tbl->it_offset)
> + res_start = tbl->it_offset;
>
> - for (i = tbl->it_reserved_start; i < tbl->it_reserved_end; ++i)
> - set_bit(i - tbl->it_offset, tbl->it_map);
> -}
> + if (res_end > (tbl->it_offset + tbl->it_size))
> + res_end = tbl->it_offset + tbl->it_size;
>
> -static void iommu_table_release_pages(struct iommu_table *tbl)
> -{
> - int i;
> + /* Check if res_start..res_end is a valid range in the table */
> + if (res_start >= res_end) {
> + tbl->it_reserved_start = tbl->it_offset;
> + tbl->it_reserved_end = tbl->it_offset;
> + return;
> + }
>
> - /*
> - * In case we have reserved the first bit, we should not emit
> - * the warning below.
> - */
> - if (tbl->it_offset == 0)
> - clear_bit(0, tbl->it_map);
> + tbl->it_reserved_start = res_start;
> + tbl->it_reserved_end = res_end;
>
> for (i = tbl->it_reserved_start; i < tbl->it_reserved_end; ++i)
> - clear_bit(i - tbl->it_offset, tbl->it_map);
> + set_bit(i - tbl->it_offset, tbl->it_map);
> }
>
> /*
> @@ -779,6 +771,22 @@ struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid,
> return tbl;
> }
>
> +bool iommu_table_in_use(struct iommu_table *tbl)
> +{
> + unsigned long start = 0, end;
> +
> + /* ignore reserved bit0 */
> + if (tbl->it_offset == 0)
> + start = 1;
> + end = tbl->it_reserved_start - tbl->it_offset;
> + if (find_next_bit(tbl->it_map, end, start) != end)
> + return true;
> +
> + start = tbl->it_reserved_end - tbl->it_offset;
> + end = tbl->it_size;
> + return find_next_bit(tbl->it_map, end, start) != end;
> +}
> +
> static void iommu_table_free(struct kref *kref)
> {
> struct iommu_table *tbl;
> @@ -795,10 +803,8 @@ static void iommu_table_free(struct kref *kref)
>
> iommu_debugfs_del(tbl);
>
> - iommu_table_release_pages(tbl);
> -
> /* verify that table contains no entries */
> - if (!bitmap_empty(tbl->it_map, tbl->it_size))
> + if (iommu_table_in_use(tbl))
> pr_warn("%s: Unexpected TCEs\n", __func__);
>
> /* free bitmap */
> @@ -1099,14 +1105,9 @@ int iommu_take_ownership(struct iommu_table *tbl)
> for (i = 0; i < tbl->nr_pools; i++)
> spin_lock_nest_lock(&tbl->pools[i].lock, &tbl->large_pool.lock);
>
> - iommu_table_release_pages(tbl);
> -
> - if (!bitmap_empty(tbl->it_map, tbl->it_size)) {
> + if (iommu_table_in_use(tbl)) {
> pr_err("iommu_tce: it_map is not empty");
> ret = -EBUSY;
> - /* Undo iommu_table_release_pages, i.e. restore bit#0, etc */
> - iommu_table_reserve_pages(tbl, tbl->it_reserved_start,
> - tbl->it_reserved_end);
> } else {
> memset(tbl->it_map, 0xff, sz);
> }
>
^ permalink raw reply
* [RFC PATCH 6/6] powerpc/microwatt: Stop building the hash MMU code
From: Nicholas Piggin @ 2021-08-27 16:34 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20210827163410.1177154-1-npiggin@gmail.com>
Microwatt is radix-only, so stop selecting the hash MMU code.
This saves 20kB compressed dtbImage and 56kB vmlinux size.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/configs/microwatt_defconfig | 1 -
arch/powerpc/platforms/Kconfig.cputype | 1 +
arch/powerpc/platforms/microwatt/Kconfig | 1 -
3 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/powerpc/configs/microwatt_defconfig b/arch/powerpc/configs/microwatt_defconfig
index bf5f2e5905eb..6fbad42f9e3d 100644
--- a/arch/powerpc/configs/microwatt_defconfig
+++ b/arch/powerpc/configs/microwatt_defconfig
@@ -26,7 +26,6 @@ CONFIG_PPC_MICROWATT=y
# CONFIG_PPC_OF_BOOT_TRAMPOLINE is not set
CONFIG_CPU_FREQ=y
CONFIG_HZ_100=y
-# CONFIG_PPC_MEM_KEYS is not set
# CONFIG_SECCOMP is not set
# CONFIG_MQ_IOSCHED_KYBER is not set
# CONFIG_COREDUMP is not set
diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index 9b90fc501ed4..b9acd6c68c81 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -371,6 +371,7 @@ config SPE
config PPC_64S_HASH_MMU
bool "Hash MMU Support"
depends on PPC_BOOK3S_64
+ depends on !PPC_MICROWATT
default y
select PPC_MM_SLICES
help
diff --git a/arch/powerpc/platforms/microwatt/Kconfig b/arch/powerpc/platforms/microwatt/Kconfig
index e0ff2cfc1ca0..4f0ce292a6ce 100644
--- a/arch/powerpc/platforms/microwatt/Kconfig
+++ b/arch/powerpc/platforms/microwatt/Kconfig
@@ -6,7 +6,6 @@ config PPC_MICROWATT
select PPC_XICS
select PPC_ICS_NATIVE
select PPC_ICP_NATIVE
- select PPC_HASH_MMU_NATIVE if PPC_64S_HASH_MMU
select PPC_UDBG_16550
select ARCH_RANDOM
help
--
2.23.0
^ permalink raw reply related
* [RFC PATCH 5/6] powerpc/microwatt: select POWER9_CPU
From: Nicholas Piggin @ 2021-08-27 16:34 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20210827163410.1177154-1-npiggin@gmail.com>
Microwatt implements a subset of ISA v3.0 which is equivalent to
the POWER9_CPU selection.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/configs/microwatt_defconfig | 1 +
arch/powerpc/platforms/microwatt/Kconfig | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/powerpc/configs/microwatt_defconfig b/arch/powerpc/configs/microwatt_defconfig
index a08b739123da..bf5f2e5905eb 100644
--- a/arch/powerpc/configs/microwatt_defconfig
+++ b/arch/powerpc/configs/microwatt_defconfig
@@ -14,6 +14,7 @@ CONFIG_EMBEDDED=y
# CONFIG_COMPAT_BRK is not set
# CONFIG_SLAB_MERGE_DEFAULT is not set
CONFIG_PPC64=y
+CONFIG_POWER9_CPU=y
# CONFIG_PPC_KUEP is not set
# CONFIG_PPC_KUAP is not set
CONFIG_CPU_LITTLE_ENDIAN=y
diff --git a/arch/powerpc/platforms/microwatt/Kconfig b/arch/powerpc/platforms/microwatt/Kconfig
index 823192e9d38a..e0ff2cfc1ca0 100644
--- a/arch/powerpc/platforms/microwatt/Kconfig
+++ b/arch/powerpc/platforms/microwatt/Kconfig
@@ -2,6 +2,7 @@
config PPC_MICROWATT
depends on PPC_BOOK3S_64 && !SMP
bool "Microwatt SoC platform"
+ select POWER9_CPU
select PPC_XICS
select PPC_ICS_NATIVE
select PPC_ICP_NATIVE
--
2.23.0
^ permalink raw reply related
* [RFC PATCH 4/6] powerpc/64s: Make hash MMU code build configurable
From: Nicholas Piggin @ 2021-08-27 16:34 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20210827163410.1177154-1-npiggin@gmail.com>
Introduce a new option CONFIG_PPC_64S_HASH_MMU which allows the 64s hash
MMU code to be compiled out if radix is selected and the minimum
supported CPU type is POWER9 or higher, and KVM is not selected.
This saves 128kB kernel image size (90kB text) on powernv_defconfig
minus KVM, 350kB on pseries_defconfig minus KVM, 40kB on a tiny config.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/Kconfig | 1 +
arch/powerpc/include/asm/book3s/64/mmu.h | 22 +++++-
.../include/asm/book3s/64/tlbflush-hash.h | 7 ++
arch/powerpc/include/asm/book3s/pgtable.h | 4 ++
arch/powerpc/include/asm/mmu.h | 38 +++++++++--
arch/powerpc/include/asm/mmu_context.h | 2 +
arch/powerpc/include/asm/paca.h | 8 +++
arch/powerpc/kernel/asm-offsets.c | 2 +
arch/powerpc/kernel/dt_cpu_ftrs.c | 10 ++-
arch/powerpc/kernel/entry_64.S | 4 +-
arch/powerpc/kernel/exceptions-64s.S | 16 +++++
arch/powerpc/kernel/mce.c | 2 +-
arch/powerpc/kernel/mce_power.c | 10 ++-
arch/powerpc/kernel/paca.c | 18 ++---
arch/powerpc/kernel/process.c | 13 ++--
arch/powerpc/kernel/prom.c | 2 +
arch/powerpc/kernel/setup_64.c | 4 ++
arch/powerpc/kexec/core_64.c | 4 +-
arch/powerpc/kexec/ranges.c | 4 ++
arch/powerpc/kvm/Kconfig | 1 +
arch/powerpc/mm/book3s64/Makefile | 17 +++--
arch/powerpc/mm/book3s64/hash_pgtable.c | 1 -
arch/powerpc/mm/book3s64/hash_utils.c | 10 ---
.../{hash_hugetlbpage.c => hugetlbpage.c} | 6 ++
arch/powerpc/mm/book3s64/mmu_context.c | 16 +++++
arch/powerpc/mm/book3s64/pgtable.c | 22 +++++-
arch/powerpc/mm/book3s64/radix_pgtable.c | 4 ++
arch/powerpc/mm/book3s64/slb.c | 16 -----
arch/powerpc/mm/copro_fault.c | 2 +
arch/powerpc/mm/fault.c | 17 +++++
arch/powerpc/mm/pgtable.c | 10 ++-
arch/powerpc/platforms/Kconfig.cputype | 20 +++++-
arch/powerpc/platforms/cell/Kconfig | 1 +
arch/powerpc/platforms/maple/Kconfig | 1 +
arch/powerpc/platforms/microwatt/Kconfig | 2 +-
arch/powerpc/platforms/pasemi/Kconfig | 1 +
arch/powerpc/platforms/powermac/Kconfig | 1 +
arch/powerpc/platforms/powernv/Kconfig | 2 +-
arch/powerpc/platforms/powernv/idle.c | 2 +
arch/powerpc/platforms/powernv/setup.c | 2 +
arch/powerpc/platforms/pseries/lpar.c | 68 ++++++++++---------
arch/powerpc/platforms/pseries/lparcfg.c | 2 +-
arch/powerpc/platforms/pseries/mobility.c | 6 ++
arch/powerpc/platforms/pseries/ras.c | 4 ++
arch/powerpc/platforms/pseries/reconfig.c | 2 +
arch/powerpc/platforms/pseries/setup.c | 6 +-
arch/powerpc/xmon/xmon.c | 8 ++-
47 files changed, 310 insertions(+), 111 deletions(-)
rename arch/powerpc/mm/book3s64/{hash_hugetlbpage.c => hugetlbpage.c} (95%)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index d01e3401581d..db4e0efd229b 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -940,6 +940,7 @@ config PPC_MEM_KEYS
prompt "PowerPC Memory Protection Keys"
def_bool y
depends on PPC_BOOK3S_64
+ depends on PPC_64S_HASH_MMU
select ARCH_USES_HIGH_VMA_FLAGS
select ARCH_HAS_PKEYS
help
diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index c02f42d1031e..857dc88b0043 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -98,7 +98,9 @@ typedef struct {
* from EA and new context ids to build the new VAs.
*/
mm_context_id_t id;
+#ifdef CONFIG_PPC_64S_HASH_MMU
mm_context_id_t extended_id[TASK_SIZE_USER64/TASK_CONTEXT_SIZE];
+#endif
};
/* Number of bits in the mm_cpumask */
@@ -110,7 +112,9 @@ typedef struct {
/* Number of user space windows opened in process mm_context */
atomic_t vas_windows;
+#ifdef CONFIG_PPC_64S_HASH_MMU
struct hash_mm_context *hash_context;
+#endif
void __user *vdso;
/*
@@ -133,6 +137,7 @@ typedef struct {
#endif
} mm_context_t;
+#ifdef CONFIG_PPC_64S_HASH_MMU
static inline u16 mm_ctx_user_psize(mm_context_t *ctx)
{
return ctx->hash_context->user_psize;
@@ -187,11 +192,22 @@ static inline struct subpage_prot_table *mm_ctx_subpage_prot(mm_context_t *ctx)
}
#endif
+#endif
+
/*
* The current system page and segment sizes
*/
-extern int mmu_linear_psize;
+#if defined(CONFIG_PPC_RADIX_MMU) && !defined(CONFIG_PPC_64S_HASH_MMU)
+#ifdef CONFIG_PPC_64K_PAGES
+#define mmu_virtual_psize MMU_PAGE_64K
+#else
+#define mmu_virtual_psize MMU_PAGE_4K
+#endif
+#else
extern int mmu_virtual_psize;
+#endif
+
+extern int mmu_linear_psize;
extern int mmu_vmalloc_psize;
extern int mmu_vmemmap_psize;
extern int mmu_io_psize;
@@ -228,6 +244,7 @@ extern void hash__setup_initial_memory_limit(phys_addr_t first_memblock_base,
static inline void setup_initial_memory_limit(phys_addr_t first_memblock_base,
phys_addr_t first_memblock_size)
{
+#ifdef CONFIG_PPC_64S_HASH_MMU
/*
* Hash has more strict restrictions. At this point we don't
* know which translations we will pick. Hence go with hash
@@ -235,6 +252,7 @@ static inline void setup_initial_memory_limit(phys_addr_t first_memblock_base,
*/
return hash__setup_initial_memory_limit(first_memblock_base,
first_memblock_size);
+#endif
}
#ifdef CONFIG_PPC_PSERIES
@@ -255,6 +273,7 @@ static inline void radix_init_pseries(void) { }
void cleanup_cpu_mmu_context(void);
#endif
+#ifdef CONFIG_PPC_64S_HASH_MMU
static inline int get_user_context(mm_context_t *ctx, unsigned long ea)
{
int index = ea >> MAX_EA_BITS_PER_CONTEXT;
@@ -274,6 +293,7 @@ static inline unsigned long get_user_vsid(mm_context_t *ctx,
return get_vsid(context, ea, ssize);
}
+#endif
#endif /* __ASSEMBLY__ */
#endif /* _ASM_POWERPC_BOOK3S_64_MMU_H_ */
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
index 3b95769739c7..06f4bd09eecf 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
@@ -112,8 +112,15 @@ static inline void hash__flush_tlb_kernel_range(unsigned long start,
struct mmu_gather;
extern void hash__tlb_flush(struct mmu_gather *tlb);
+extern void flush_tlb_pmd_range(struct mm_struct *mm, pmd_t *pmd,
+ unsigned long addr);
+
+#ifdef CONFIG_PPC_64S_HASH_MMU
/* Private function for use by PCI IO mapping code */
extern void __flush_hash_table_range(unsigned long start, unsigned long end);
extern void flush_tlb_pmd_range(struct mm_struct *mm, pmd_t *pmd,
unsigned long addr);
+#else
+static inline void __flush_hash_table_range(unsigned long start, unsigned long end) { }
+#endif
#endif /* _ASM_POWERPC_BOOK3S_64_TLBFLUSH_HASH_H */
diff --git a/arch/powerpc/include/asm/book3s/pgtable.h b/arch/powerpc/include/asm/book3s/pgtable.h
index ad130e15a126..815f52c23579 100644
--- a/arch/powerpc/include/asm/book3s/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/pgtable.h
@@ -25,6 +25,7 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
unsigned long size, pgprot_t vma_prot);
#define __HAVE_PHYS_MEM_ACCESS_PROT
+#ifdef CONFIG_PPC_64S_HASH_MMU
/*
* This gets called at the end of handling a page fault, when
* the kernel has put a new PTE into the page table for the process.
@@ -35,6 +36,9 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
* waiting for the inevitable extra hash-table miss exception.
*/
void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *ptep);
+#else
+static inline void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *ptep) {}
+#endif
#endif /* __ASSEMBLY__ */
#endif
diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 27016b98ecb2..f46217af74bf 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -157,7 +157,7 @@ DECLARE_PER_CPU(int, next_tlbcam_idx);
enum {
MMU_FTRS_POSSIBLE =
-#if defined(CONFIG_PPC_BOOK3S_64) || defined(CONFIG_PPC_BOOK3S_604)
+#if defined(CONFIG_PPC_BOOK3S_604)
MMU_FTR_HPTE_TABLE |
#endif
#ifdef CONFIG_PPC_8xx
@@ -184,23 +184,29 @@ enum {
MMU_FTR_USE_TLBRSRV | MMU_FTR_USE_PAIRED_MAS |
#endif
#ifdef CONFIG_PPC_BOOK3S_64
+ MMU_FTR_KERNEL_RO |
+#ifdef CONFIG_PPC_64S_HASH_MMU
MMU_FTR_NO_SLBIE_B | MMU_FTR_16M_PAGE | MMU_FTR_TLBIEL |
MMU_FTR_LOCKLESS_TLBIE | MMU_FTR_CI_LARGE_PAGE |
MMU_FTR_1T_SEGMENT | MMU_FTR_TLBIE_CROP_VA |
- MMU_FTR_KERNEL_RO | MMU_FTR_68_BIT_VA |
+ MMU_FTR_68_BIT_VA |
+#endif
#endif
#ifdef CONFIG_PPC_RADIX_MMU
MMU_FTR_TYPE_RADIX |
MMU_FTR_GTSE |
#endif /* CONFIG_PPC_RADIX_MMU */
+#ifdef CONFIG_PPC_64S_HASH_MMU
+ MMU_FTR_HPTE_TABLE |
+#endif /* CONFIG_PPC_64S_HASH_MMU */
#ifdef CONFIG_PPC_KUAP
- MMU_FTR_BOOK3S_KUAP |
+ MMU_FTR_BOOK3S_KUAP |
#endif /* CONFIG_PPC_KUAP */
#ifdef CONFIG_PPC_MEM_KEYS
- MMU_FTR_PKEY |
+ MMU_FTR_PKEY |
#endif
#ifdef CONFIG_PPC_KUEP
- MMU_FTR_BOOK3S_KUEP |
+ MMU_FTR_BOOK3S_KUEP |
#endif /* CONFIG_PPC_KUAP */
0,
@@ -324,6 +330,7 @@ static inline void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
}
#endif /* !CONFIG_DEBUG_VM */
+#if defined(CONFIG_PPC_RADIX_MMU) && defined(CONFIG_PPC_64S_HASH_MMU)
static inline bool radix_enabled(void)
{
return mmu_has_feature(MMU_FTR_TYPE_RADIX);
@@ -333,6 +340,27 @@ static inline bool early_radix_enabled(void)
{
return early_mmu_has_feature(MMU_FTR_TYPE_RADIX);
}
+#elif defined(CONFIG_PPC_64S_HASH_MMU)
+static inline bool radix_enabled(void)
+{
+ return false;
+}
+
+static inline bool early_radix_enabled(void)
+{
+ return false;
+}
+#else
+static inline bool radix_enabled(void)
+{
+ return true;
+}
+
+static inline bool early_radix_enabled(void)
+{
+ return true;
+}
+#endif
#ifdef CONFIG_STRICT_KERNEL_RWX
static inline bool strict_kernel_rwx_enabled(void)
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 9ba6b585337f..e46394d27785 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -75,6 +75,7 @@ extern void hash__reserve_context_id(int id);
extern void __destroy_context(int context_id);
static inline void mmu_context_init(void) { }
+#ifdef CONFIG_PPC_64S_HASH_MMU
static inline int alloc_extended_context(struct mm_struct *mm,
unsigned long ea)
{
@@ -100,6 +101,7 @@ static inline bool need_extra_context(struct mm_struct *mm, unsigned long ea)
return true;
return false;
}
+#endif
#else
extern void switch_mmu_context(struct mm_struct *prev, struct mm_struct *next,
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index dc05a862e72a..295573a82c66 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -97,7 +97,9 @@ struct paca_struct {
/* this becomes non-zero. */
u8 kexec_state; /* set when kexec down has irqs off */
#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
struct slb_shadow *slb_shadow_ptr;
+#endif
struct dtl_entry *dispatch_log;
struct dtl_entry *dispatch_log_end;
#endif
@@ -110,6 +112,7 @@ struct paca_struct {
/* used for most interrupts/exceptions */
u64 exgen[EX_SIZE] __attribute__((aligned(0x80)));
+#ifdef CONFIG_PPC_64S_HASH_MMU
/* SLB related definitions */
u16 vmalloc_sllp;
u8 slb_cache_ptr;
@@ -120,6 +123,7 @@ struct paca_struct {
u32 slb_used_bitmap; /* Bitmaps for first 32 SLB entries. */
u32 slb_kern_bitmap;
u32 slb_cache[SLB_CACHE_ENTRIES];
+#endif
#endif /* CONFIG_PPC_BOOK3S_64 */
#ifdef CONFIG_PPC_BOOK3E
@@ -149,6 +153,7 @@ struct paca_struct {
#endif /* CONFIG_PPC_BOOK3E */
#ifdef CONFIG_PPC_BOOK3S
+#ifdef CONFIG_PPC_64S_HASH_MMU
#ifdef CONFIG_PPC_MM_SLICES
unsigned char mm_ctx_low_slices_psize[BITS_PER_LONG / BITS_PER_BYTE];
unsigned char mm_ctx_high_slices_psize[SLICE_ARRAY_SIZE];
@@ -156,6 +161,7 @@ struct paca_struct {
u16 mm_ctx_user_psize;
u16 mm_ctx_sllp;
#endif
+#endif
#endif
/*
@@ -268,9 +274,11 @@ struct paca_struct {
#endif /* CONFIG_PPC_PSERIES */
#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
/* Capture SLB related old contents in MCE handler. */
struct slb_entry *mce_faulty_slbs;
u16 slb_save_cache_ptr;
+#endif
#endif /* CONFIG_PPC_BOOK3S_64 */
#ifdef CONFIG_STACKPROTECTOR
unsigned long canary;
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index a47eefa09bcb..70b02241fc9d 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -220,10 +220,12 @@ int main(void)
OFFSET(PACA_EXGEN, paca_struct, exgen);
OFFSET(PACA_EXMC, paca_struct, exmc);
OFFSET(PACA_EXNMI, paca_struct, exnmi);
+#ifdef CONFIG_PPC_64S_HASH_MMU
OFFSET(PACA_SLBSHADOWPTR, paca_struct, slb_shadow_ptr);
OFFSET(SLBSHADOW_STACKVSID, slb_shadow, save_area[SLB_NUM_BOLTED - 1].vsid);
OFFSET(SLBSHADOW_STACKESID, slb_shadow, save_area[SLB_NUM_BOLTED - 1].esid);
OFFSET(SLBSHADOW_SAVEAREA, slb_shadow, save_area);
+#endif
OFFSET(LPPACA_PMCINUSE, lppaca, pmcregs_in_use);
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
OFFSET(PACA_PMCINUSE, paca_struct, pmcregs_in_use);
diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
index 358aee7c2d79..0fddb74b1041 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -269,6 +269,7 @@ static int __init feat_enable_idle_stop(struct dt_cpu_feature *f)
static int __init feat_enable_mmu_hash(struct dt_cpu_feature *f)
{
+#ifdef CONFIG_PPC_64S_HASH_MMU
u64 lpcr;
lpcr = mfspr(SPRN_LPCR);
@@ -284,10 +285,13 @@ static int __init feat_enable_mmu_hash(struct dt_cpu_feature *f)
cur_cpu_spec->cpu_user_features |= PPC_FEATURE_HAS_MMU;
return 1;
+#endif
+ return 0;
}
static int __init feat_enable_mmu_hash_v3(struct dt_cpu_feature *f)
{
+#ifdef CONFIG_PPC_64S_HASH_MMU
u64 lpcr;
lpcr = mfspr(SPRN_LPCR);
@@ -298,14 +302,18 @@ static int __init feat_enable_mmu_hash_v3(struct dt_cpu_feature *f)
cur_cpu_spec->cpu_user_features |= PPC_FEATURE_HAS_MMU;
return 1;
+#endif
+ return 0;
}
static int __init feat_enable_mmu_radix(struct dt_cpu_feature *f)
{
+#ifdef CONFIG_PPC_64S_HASH_MMU
+ cur_cpu_spec->mmu_features |= MMU_FTRS_HASH_BASE;
+#endif
#ifdef CONFIG_PPC_RADIX_MMU
cur_cpu_spec->mmu_features |= MMU_FTR_TYPE_RADIX;
- cur_cpu_spec->mmu_features |= MMU_FTRS_HASH_BASE;
cur_cpu_spec->mmu_features |= MMU_FTR_GTSE;
cur_cpu_spec->cpu_user_features |= PPC_FEATURE_HAS_MMU;
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 15720f8661a1..a6d063a914c5 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -180,7 +180,7 @@ _GLOBAL(_switch)
#endif
ld r8,KSP(r4) /* new stack pointer */
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
BEGIN_MMU_FTR_SECTION
b 2f
END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX)
@@ -232,7 +232,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
slbmte r7,r0
isync
2:
-#endif /* CONFIG_PPC_BOOK3S_64 */
+#endif /* CONFIG_PPC_64S_HASH_MMU */
clrrdi r7, r8, THREAD_SHIFT /* base of new stack */
/* Note: this uses SWITCH_FRAME_SIZE rather than INT_FRAME_SIZE
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 4aec59a77d4c..dc6c0254d73a 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1364,11 +1364,15 @@ EXC_COMMON_BEGIN(data_access_common)
addi r3,r1,STACK_FRAME_OVERHEAD
andis. r0,r4,DSISR_DABRMATCH@h
bne- 1f
+#ifdef CONFIG_PPC_64S_HASH_MMU
BEGIN_MMU_FTR_SECTION
bl do_hash_fault
MMU_FTR_SECTION_ELSE
bl do_page_fault
ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
+#else
+ bl do_page_fault
+#endif
b interrupt_return_srr
1: bl do_break
@@ -1411,6 +1415,7 @@ EXC_VIRT_BEGIN(data_access_slb, 0x4380, 0x80)
EXC_VIRT_END(data_access_slb, 0x4380, 0x80)
EXC_COMMON_BEGIN(data_access_slb_common)
GEN_COMMON data_access_slb
+#ifdef CONFIG_PPC_64S_HASH_MMU
BEGIN_MMU_FTR_SECTION
/* HPT case, do SLB fault */
addi r3,r1,STACK_FRAME_OVERHEAD
@@ -1423,6 +1428,9 @@ MMU_FTR_SECTION_ELSE
/* Radix case, access is outside page table range */
li r3,-EFAULT
ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
+#else
+ li r3,-EFAULT
+#endif
std r3,RESULT(r1)
addi r3,r1,STACK_FRAME_OVERHEAD
bl do_bad_slb_fault
@@ -1457,11 +1465,15 @@ EXC_VIRT_END(instruction_access, 0x4400, 0x80)
EXC_COMMON_BEGIN(instruction_access_common)
GEN_COMMON instruction_access
addi r3,r1,STACK_FRAME_OVERHEAD
+#ifdef CONFIG_PPC_64S_HASH_MMU
BEGIN_MMU_FTR_SECTION
bl do_hash_fault
MMU_FTR_SECTION_ELSE
bl do_page_fault
ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
+#else
+ bl do_page_fault
+#endif
b interrupt_return_srr
@@ -1491,6 +1503,7 @@ EXC_VIRT_BEGIN(instruction_access_slb, 0x4480, 0x80)
EXC_VIRT_END(instruction_access_slb, 0x4480, 0x80)
EXC_COMMON_BEGIN(instruction_access_slb_common)
GEN_COMMON instruction_access_slb
+#ifdef CONFIG_PPC_64S_HASH_MMU
BEGIN_MMU_FTR_SECTION
/* HPT case, do SLB fault */
addi r3,r1,STACK_FRAME_OVERHEAD
@@ -1503,6 +1516,9 @@ MMU_FTR_SECTION_ELSE
/* Radix case, access is outside page table range */
li r3,-EFAULT
ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
+#else
+ li r3,-EFAULT
+#endif
std r3,RESULT(r1)
addi r3,r1,STACK_FRAME_OVERHEAD
bl do_bad_slb_fault
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 47a683cd00d2..8265d7908dd8 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -573,7 +573,7 @@ void machine_check_print_event_info(struct machine_check_event *evt,
mc_error_class[evt->error_class] : "Unknown";
printk("%sMCE: CPU%d: %s\n", level, evt->cpu, subtype);
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
/* Display faulty slb contents for SLB errors. */
if (evt->error_type == MCE_ERROR_TYPE_SLB && !in_guest)
slb_dump_contents(local_paca->mce_faulty_slbs);
diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index c2f55fe7092d..dc51e154a54e 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -77,7 +77,7 @@ static bool mce_in_guest(void)
}
/* flush SLBs and reload */
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
void flush_and_reload_slb(void)
{
/* Invalidate all SLBs */
@@ -99,7 +99,7 @@ void flush_and_reload_slb(void)
void flush_erat(void)
{
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
if (!early_cpu_has_feature(CPU_FTR_ARCH_300)) {
flush_and_reload_slb();
return;
@@ -114,7 +114,7 @@ void flush_erat(void)
static int mce_flush(int what)
{
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
if (what == MCE_FLUSH_SLB) {
flush_and_reload_slb();
return 1;
@@ -499,8 +499,10 @@ static int mce_handle_ierror(struct pt_regs *regs, unsigned long srr1,
/* attempt to correct the error */
switch (table[i].error_type) {
case MCE_ERROR_TYPE_SLB:
+#ifdef CONFIG_PPC_64S_HASH_MMU
if (local_paca->in_mce == 1)
slb_save_contents(local_paca->mce_faulty_slbs);
+#endif
handled = mce_flush(MCE_FLUSH_SLB);
break;
case MCE_ERROR_TYPE_ERAT:
@@ -588,8 +590,10 @@ static int mce_handle_derror(struct pt_regs *regs,
/* attempt to correct the error */
switch (table[i].error_type) {
case MCE_ERROR_TYPE_SLB:
+#ifdef CONFIG_PPC_64S_HASH_MMU
if (local_paca->in_mce == 1)
slb_save_contents(local_paca->mce_faulty_slbs);
+#endif
if (mce_flush(MCE_FLUSH_SLB))
handled = 1;
break;
diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index 9bd30cac852b..813930374d24 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -139,8 +139,7 @@ static struct lppaca * __init new_lppaca(int cpu, unsigned long limit)
}
#endif /* CONFIG_PPC_PSERIES */
-#ifdef CONFIG_PPC_BOOK3S_64
-
+#ifdef CONFIG_PPC_64S_HASH_MMU
/*
* 3 persistent SLBs are allocated here. The buffer will be zero
* initially, hence will all be invaild until we actually write them.
@@ -169,8 +168,7 @@ static struct slb_shadow * __init new_slb_shadow(int cpu, unsigned long limit)
return s;
}
-
-#endif /* CONFIG_PPC_BOOK3S_64 */
+#endif /* CONFIG_PPC_64S_HASH_MMU */
#ifdef CONFIG_PPC_PSERIES
/**
@@ -226,7 +224,7 @@ void __init initialise_paca(struct paca_struct *new_paca, int cpu)
new_paca->kexec_state = KEXEC_STATE_NONE;
new_paca->__current = &init_task;
new_paca->data_offset = 0xfeeeeeeeeeeeeeeeULL;
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
new_paca->slb_shadow_ptr = NULL;
#endif
@@ -307,7 +305,7 @@ void __init allocate_paca(int cpu)
#ifdef CONFIG_PPC_PSERIES
paca->lppaca_ptr = new_lppaca(cpu, limit);
#endif
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
paca->slb_shadow_ptr = new_slb_shadow(cpu, limit);
#endif
#ifdef CONFIG_PPC_PSERIES
@@ -328,7 +326,7 @@ void __init free_unused_pacas(void)
paca_nr_cpu_ids = nr_cpu_ids;
paca_ptrs_size = new_ptrs_size;
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
if (early_radix_enabled()) {
/* Ugly fixup, see new_slb_shadow() */
memblock_free(__pa(paca_ptrs[boot_cpuid]->slb_shadow_ptr),
@@ -341,9 +339,9 @@ void __init free_unused_pacas(void)
paca_ptrs_size + paca_struct_size, nr_cpu_ids);
}
+#ifdef CONFIG_PPC_64S_HASH_MMU
void copy_mm_to_paca(struct mm_struct *mm)
{
-#ifdef CONFIG_PPC_BOOK3S
mm_context_t *context = &mm->context;
#ifdef CONFIG_PPC_MM_SLICES
@@ -356,7 +354,5 @@ void copy_mm_to_paca(struct mm_struct *mm)
get_paca()->mm_ctx_user_psize = context->user_psize;
get_paca()->mm_ctx_sllp = context->sllp;
#endif
-#else /* !CONFIG_PPC_BOOK3S */
- return;
-#endif
}
+#endif /* CONFIG_PPC_64S_HASH_MMU */
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 185beb290580..718cc3153da0 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1206,7 +1206,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
{
struct thread_struct *new_thread, *old_thread;
struct task_struct *last;
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
struct ppc64_tlb_batch *batch;
#endif
@@ -1215,7 +1215,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
WARN_ON(!irqs_disabled());
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
batch = this_cpu_ptr(&ppc64_tlb_batch);
if (batch->active) {
current_thread_info()->local_flags |= _TLF_LAZY_MMU;
@@ -1294,6 +1294,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
*/
#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
/*
* This applies to a process that was context switched while inside
* arch_enter_lazy_mmu_mode(), to re-activate the batch that was
@@ -1305,6 +1306,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
batch = this_cpu_ptr(&ppc64_tlb_batch);
batch->active = 1;
}
+#endif
/*
* Math facilities are masked out of the child MSR in copy_thread.
@@ -1655,7 +1657,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
static void setup_ksp_vsid(struct task_struct *p, unsigned long sp)
{
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
unsigned long sp_vsid;
unsigned long llp = mmu_psize_defs[mmu_linear_psize].sllp;
@@ -2302,10 +2304,9 @@ unsigned long arch_randomize_brk(struct mm_struct *mm)
* the heap, we can put it above 1TB so it is backed by a 1TB
* segment. Otherwise the heap will be in the bottom 1TB
* which always uses 256MB segments and this may result in a
- * performance penalty. We don't need to worry about radix. For
- * radix, mmu_highuser_ssize remains unchanged from 256MB.
+ * performance penalty.
*/
- if (!is_32bit_task() && (mmu_highuser_ssize == MMU_SEGSIZE_1T))
+ if (!radix_enabled() && !is_32bit_task() && (mmu_highuser_ssize == MMU_SEGSIZE_1T))
base = max_t(unsigned long, mm->brk, 1UL << SID_SHIFT_1T);
#endif
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index f620e04dc9bf..ebc8871d644e 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -235,6 +235,7 @@ static void __init check_cpu_pa_features(unsigned long node)
#ifdef CONFIG_PPC_BOOK3S_64
static void __init init_mmu_slb_size(unsigned long node)
{
+#ifdef CONFIG_PPC_64S_HASH_MMU
const __be32 *slb_size_ptr;
slb_size_ptr = of_get_flat_dt_prop(node, "slb-size", NULL) ? :
@@ -242,6 +243,7 @@ static void __init init_mmu_slb_size(unsigned long node)
if (slb_size_ptr)
mmu_slb_size = be32_to_cpup(slb_size_ptr);
+#endif
}
#else
#define init_mmu_slb_size(node) do { } while(0)
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 1ff258f6c76c..730bd943f7a7 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -880,6 +880,7 @@ void __init setup_per_cpu_areas(void)
unsigned int cpu;
int rc = -EINVAL;
+#ifdef CONFIG_PPC_64S_HASH_MMU
/*
* Linear mapping is one of 4K, 1M and 16M. For 4K, no need
* to group units. For larger mappings, use 1M atom which
@@ -889,6 +890,9 @@ void __init setup_per_cpu_areas(void)
atom_size = PAGE_SIZE;
else
atom_size = 1 << 20;
+#else
+ atom_size = PAGE_SIZE;
+#endif
if (pcpu_chosen_fc != PCPU_FC_PAGE) {
rc = pcpu_embed_first_chunk(0, dyn_size, atom_size, pcpu_cpu_distance,
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 8a449b2d8715..a66a5c6638dd 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -374,7 +374,7 @@ void default_machine_kexec(struct kimage *image)
/* NOTREACHED */
}
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
/* Values we need to export to the second kernel via the device tree. */
static unsigned long htab_base;
static unsigned long htab_size;
@@ -416,4 +416,4 @@ static int __init export_htab_values(void)
return 0;
}
late_initcall(export_htab_values);
-#endif /* CONFIG_PPC_BOOK3S_64 */
+#endif /* CONFIG_PPC_64S_HASH_MMU */
diff --git a/arch/powerpc/kexec/ranges.c b/arch/powerpc/kexec/ranges.c
index 6b81c852feab..92d831621fa0 100644
--- a/arch/powerpc/kexec/ranges.c
+++ b/arch/powerpc/kexec/ranges.c
@@ -306,10 +306,14 @@ int add_initrd_mem_range(struct crash_mem **mem_ranges)
*/
int add_htab_mem_range(struct crash_mem **mem_ranges)
{
+#ifdef CONFIG_PPC_64S_HASH_MMU
if (!htab_address)
return 0;
return add_mem_range(mem_ranges, __pa(htab_address), htab_size_bytes);
+#else
+ return 0;
+#endif
}
#endif
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index e45644657d49..4ca2f0491e15 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -70,6 +70,7 @@ config KVM_BOOK3S_64
select KVM_BOOK3S_64_HANDLER
select KVM
select KVM_BOOK3S_PR_POSSIBLE if !KVM_BOOK3S_HV_POSSIBLE
+ select PPC_64S_HASH_MMU
select SPAPR_TCE_IOMMU if IOMMU_SUPPORT && (PPC_PSERIES || PPC_POWERNV)
help
Support running unmodified book3s_64 and book3s_32 guest kernels
diff --git a/arch/powerpc/mm/book3s64/Makefile b/arch/powerpc/mm/book3s64/Makefile
index 319f4b7f3357..9a474188f48f 100644
--- a/arch/powerpc/mm/book3s64/Makefile
+++ b/arch/powerpc/mm/book3s64/Makefile
@@ -2,20 +2,23 @@
ccflags-y := $(NO_MINIMAL_TOC)
+obj-y += mmu_context.o pgtable.o
+ifdef CONFIG_PPC_64S_HASH_MMU
CFLAGS_REMOVE_slb.o = $(CC_FLAGS_FTRACE)
-
-obj-y += hash_pgtable.o hash_utils.o slb.o \
- mmu_context.o pgtable.o hash_tlb.o
+obj-y += hash_pgtable.o hash_utils.o hash_tlb.o slb.o
obj-$(CONFIG_PPC_HASH_MMU_NATIVE) += hash_native.o
-obj-$(CONFIG_PPC_RADIX_MMU) += radix_pgtable.o radix_tlb.o
obj-$(CONFIG_PPC_4K_PAGES) += hash_4k.o
obj-$(CONFIG_PPC_64K_PAGES) += hash_64k.o
-obj-$(CONFIG_HUGETLB_PAGE) += hash_hugetlbpage.o
+obj-$(CONFIG_TRANSPARENT_HUGEPAGE) += hash_hugepage.o
+obj-$(CONFIG_PPC_SUBPAGE_PROT) += subpage_prot.o
+endif
+
+obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
+
+obj-$(CONFIG_PPC_RADIX_MMU) += radix_pgtable.o radix_tlb.o
ifdef CONFIG_HUGETLB_PAGE
obj-$(CONFIG_PPC_RADIX_MMU) += radix_hugetlbpage.o
endif
-obj-$(CONFIG_TRANSPARENT_HUGEPAGE) += hash_hugepage.o
-obj-$(CONFIG_PPC_SUBPAGE_PROT) += subpage_prot.o
obj-$(CONFIG_SPAPR_TCE_IOMMU) += iommu_api.o
obj-$(CONFIG_PPC_PKEY) += pkeys.o
diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
index ad5eff097d31..7ce8914992e3 100644
--- a/arch/powerpc/mm/book3s64/hash_pgtable.c
+++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
@@ -16,7 +16,6 @@
#include <mm/mmu_decl.h>
-#define CREATE_TRACE_POINTS
#include <trace/events/thp.h>
#if H_PGTABLE_RANGE > (USER_VSID_RANGE * (TASK_SIZE_USER64 / TASK_CONTEXT_SIZE))
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 43c05cdbba73..b3c82975358d 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -99,8 +99,6 @@
*/
static unsigned long _SDR1;
-struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
-EXPORT_SYMBOL_GPL(mmu_psize_defs);
u8 hpte_page_sizes[1 << LP_BITS];
EXPORT_SYMBOL_GPL(hpte_page_sizes);
@@ -109,15 +107,7 @@ struct hash_pte *htab_address;
unsigned long htab_size_bytes;
unsigned long htab_hash_mask;
EXPORT_SYMBOL_GPL(htab_hash_mask);
-int mmu_linear_psize = MMU_PAGE_4K;
-EXPORT_SYMBOL_GPL(mmu_linear_psize);
int mmu_virtual_psize = MMU_PAGE_4K;
-int mmu_vmalloc_psize = MMU_PAGE_4K;
-EXPORT_SYMBOL_GPL(mmu_vmalloc_psize);
-#ifdef CONFIG_SPARSEMEM_VMEMMAP
-int mmu_vmemmap_psize = MMU_PAGE_4K;
-#endif
-int mmu_io_psize = MMU_PAGE_4K;
int mmu_kernel_ssize = MMU_SEGSIZE_256M;
EXPORT_SYMBOL_GPL(mmu_kernel_ssize);
int mmu_highuser_ssize = MMU_SEGSIZE_256M;
diff --git a/arch/powerpc/mm/book3s64/hash_hugetlbpage.c b/arch/powerpc/mm/book3s64/hugetlbpage.c
similarity index 95%
rename from arch/powerpc/mm/book3s64/hash_hugetlbpage.c
rename to arch/powerpc/mm/book3s64/hugetlbpage.c
index a688e1324ae5..d185c14802aa 100644
--- a/arch/powerpc/mm/book3s64/hash_hugetlbpage.c
+++ b/arch/powerpc/mm/book3s64/hugetlbpage.c
@@ -16,6 +16,11 @@
unsigned int hpage_shift;
EXPORT_SYMBOL(hpage_shift);
+#ifdef CONFIG_PPC_64S_HASH_MMU
+extern long hpte_insert_repeating(unsigned long hash, unsigned long vpn,
+ unsigned long pa, unsigned long rlags,
+ unsigned long vflags, int psize, int ssize);
+
int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
pte_t *ptep, unsigned long trap, unsigned long flags,
int ssize, unsigned int shift, unsigned int mmu_psize)
@@ -122,6 +127,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
*ptep = __pte(new_pte & ~H_PAGE_BUSY);
return 0;
}
+#endif
pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep)
diff --git a/arch/powerpc/mm/book3s64/mmu_context.c b/arch/powerpc/mm/book3s64/mmu_context.c
index c10fc8a72fb3..642cabc25e99 100644
--- a/arch/powerpc/mm/book3s64/mmu_context.c
+++ b/arch/powerpc/mm/book3s64/mmu_context.c
@@ -31,6 +31,7 @@ static int alloc_context_id(int min_id, int max_id)
return ida_alloc_range(&mmu_context_ida, min_id, max_id, GFP_KERNEL);
}
+#ifdef CONFIG_PPC_64S_HASH_MMU
void hash__reserve_context_id(int id)
{
int result = ida_alloc_range(&mmu_context_ida, id, id, GFP_KERNEL);
@@ -50,7 +51,9 @@ int hash__alloc_context_id(void)
return alloc_context_id(MIN_USER_CONTEXT, max);
}
EXPORT_SYMBOL_GPL(hash__alloc_context_id);
+#endif
+#ifdef CONFIG_PPC_64S_HASH_MMU
static int realloc_context_ids(mm_context_t *ctx)
{
int i, id;
@@ -144,12 +147,15 @@ static int hash__init_new_context(struct mm_struct *mm)
return index;
}
+void slb_setup_new_exec(void);
+
void hash__setup_new_exec(void)
{
slice_setup_new_exec();
slb_setup_new_exec();
}
+#endif
static int radix__init_new_context(struct mm_struct *mm)
{
@@ -175,7 +181,9 @@ static int radix__init_new_context(struct mm_struct *mm)
*/
asm volatile("ptesync;isync" : : : "memory");
+#ifdef CONFIG_PPC_64S_HASH_MMU
mm->context.hash_context = NULL;
+#endif
return index;
}
@@ -186,8 +194,10 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
if (radix_enabled())
index = radix__init_new_context(mm);
+#ifdef CONFIG_PPC_64S_HASH_MMU
else
index = hash__init_new_context(mm);
+#endif
if (index < 0)
return index;
@@ -211,6 +221,7 @@ void __destroy_context(int context_id)
}
EXPORT_SYMBOL_GPL(__destroy_context);
+#ifdef CONFIG_PPC_64S_HASH_MMU
static void destroy_contexts(mm_context_t *ctx)
{
int index, context_id;
@@ -222,6 +233,7 @@ static void destroy_contexts(mm_context_t *ctx)
}
kfree(ctx->hash_context);
}
+#endif
static void pmd_frag_destroy(void *pmd_frag)
{
@@ -274,7 +286,11 @@ void destroy_context(struct mm_struct *mm)
process_tb[mm->context.id].prtb0 = 0;
else
subpage_prot_free(mm);
+#ifdef CONFIG_PPC_64S_HASH_MMU
destroy_contexts(&mm->context);
+#else
+ ida_free(&mmu_context_ida, mm->context.id);
+#endif
mm->context.id = MMU_NO_CONTEXT;
}
diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
index 9ffa65074cb0..354023f337f7 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -11,17 +11,29 @@
#include <asm/debugfs.h>
#include <asm/pgalloc.h>
#include <asm/tlb.h>
-#include <asm/trace.h>
#include <asm/powernv.h>
#include <asm/firmware.h>
#include <asm/ultravisor.h>
#include <asm/kexec.h>
#include <mm/mmu_decl.h>
+#define CREATE_TRACE_POINTS
#include <trace/events/thp.h>
#include "internal.h"
+struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
+EXPORT_SYMBOL_GPL(mmu_psize_defs);
+
+int mmu_linear_psize = MMU_PAGE_4K;
+EXPORT_SYMBOL_GPL(mmu_linear_psize);
+int mmu_vmalloc_psize = MMU_PAGE_4K;
+EXPORT_SYMBOL_GPL(mmu_vmalloc_psize);
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+int mmu_vmemmap_psize = MMU_PAGE_4K;
+#endif
+int mmu_io_psize = MMU_PAGE_4K;
+
unsigned long __pmd_frag_nr;
EXPORT_SYMBOL(__pmd_frag_nr);
unsigned long __pmd_frag_size_shift;
@@ -234,7 +246,13 @@ static void flush_partition(unsigned int lpid, bool radix)
"r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
/* do we need fixup here ?*/
asm volatile("eieio; tlbsync; ptesync" : : : "memory");
- trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 0);
+ /*
+ * trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 0);
+ *
+ * Can't use this tracepoint because this file defines the THP
+ * tracepoints. KVM tlbie tracing already has other
+ * deficiencies, so it's not urgent.
+ */
}
}
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 2176a5f70746..a5d687ffcb97 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -334,8 +334,10 @@ static void __init radix_init_pgtable(void)
phys_addr_t start, end;
u64 i;
+#ifdef CONFIG_PPC_64S_HASH_MMU
/* We don't support slb for radix */
mmu_slb_size = 0;
+#endif
/*
* Create the linear mapping
@@ -588,6 +590,7 @@ void __init radix__early_init_mmu(void)
{
unsigned long lpcr;
+#ifdef CONFIG_PPC_64S_HASH_MMU
#ifdef CONFIG_PPC_64K_PAGES
/* PAGE_SIZE mappings */
mmu_virtual_psize = MMU_PAGE_64K;
@@ -604,6 +607,7 @@ void __init radix__early_init_mmu(void)
mmu_vmemmap_psize = MMU_PAGE_2M;
} else
mmu_vmemmap_psize = mmu_virtual_psize;
+#endif
#endif
/*
* initialize page table size
diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
index c91bd85eb90e..83c30b222d4a 100644
--- a/arch/powerpc/mm/book3s64/slb.c
+++ b/arch/powerpc/mm/book3s64/slb.c
@@ -868,19 +868,3 @@ DEFINE_INTERRUPT_HANDLER_RAW(do_slb_fault)
return err;
}
}
-
-DEFINE_INTERRUPT_HANDLER(do_bad_slb_fault)
-{
- int err = regs->result;
-
- if (err == -EFAULT) {
- if (user_mode(regs))
- _exception(SIGSEGV, regs, SEGV_BNDERR, regs->dar);
- else
- bad_page_fault(regs, SIGSEGV);
- } else if (err == -EINVAL) {
- unrecoverable_exception(regs);
- } else {
- BUG();
- }
-}
diff --git a/arch/powerpc/mm/copro_fault.c b/arch/powerpc/mm/copro_fault.c
index 8acd00178956..c1cb21a00884 100644
--- a/arch/powerpc/mm/copro_fault.c
+++ b/arch/powerpc/mm/copro_fault.c
@@ -82,6 +82,7 @@ int copro_handle_mm_fault(struct mm_struct *mm, unsigned long ea,
}
EXPORT_SYMBOL_GPL(copro_handle_mm_fault);
+#ifdef CONFIG_PPC_64S_HASH_MMU
int copro_calculate_slb(struct mm_struct *mm, u64 ea, struct copro_slb *slb)
{
u64 vsid, vsidkey;
@@ -146,3 +147,4 @@ void copro_flush_all_slbs(struct mm_struct *mm)
cxl_slbia(mm);
}
EXPORT_SYMBOL_GPL(copro_flush_all_slbs);
+#endif
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 34f641d4a2fe..fe10695587c0 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -35,6 +35,7 @@
#include <linux/kfence.h>
#include <linux/pkeys.h>
+#include <asm/asm-prototypes.h>
#include <asm/firmware.h>
#include <asm/interrupt.h>
#include <asm/page.h>
@@ -622,4 +623,20 @@ DEFINE_INTERRUPT_HANDLER(do_bad_page_fault_segv)
{
bad_page_fault(regs, SIGSEGV);
}
+
+DEFINE_INTERRUPT_HANDLER(do_bad_slb_fault)
+{
+ int err = regs->result;
+
+ if (err == -EFAULT) {
+ if (user_mode(regs))
+ _exception(SIGSEGV, regs, SEGV_BNDERR, regs->dar);
+ else
+ bad_page_fault(regs, SIGSEGV);
+ } else if (err == -EINVAL) {
+ unrecoverable_exception(regs);
+ } else {
+ BUG();
+ }
+}
#endif
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index cd16b407f47e..ab105d33e0b0 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -81,9 +81,6 @@ static struct page *maybe_pte_to_page(pte_t pte)
static pte_t set_pte_filter_hash(pte_t pte)
{
- if (radix_enabled())
- return pte;
-
pte = __pte(pte_val(pte) & ~_PAGE_HPTEFLAGS);
if (pte_looks_normal(pte) && !(cpu_has_feature(CPU_FTR_COHERENT_ICACHE) ||
cpu_has_feature(CPU_FTR_NOEXECUTE))) {
@@ -112,6 +109,9 @@ static inline pte_t set_pte_filter(pte_t pte)
{
struct page *pg;
+ if (radix_enabled())
+ return pte;
+
if (mmu_has_feature(MMU_FTR_HPTE_TABLE))
return set_pte_filter_hash(pte);
@@ -144,6 +144,10 @@ static pte_t set_access_flags_filter(pte_t pte, struct vm_area_struct *vma,
{
struct page *pg;
+#ifdef CONFIG_PPC_BOOK3S_64
+ return pte;
+#endif
+
if (mmu_has_feature(MMU_FTR_HPTE_TABLE))
return pte;
diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index 0e1bb1cedd13..9b90fc501ed4 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -103,7 +103,6 @@ config PPC_BOOK3S_64
select ARCH_SUPPORTS_HUGETLBFS
select ARCH_SUPPORTS_NUMA_BALANCING
select IRQ_WORK
- select PPC_MM_SLICES
select PPC_HAVE_KUEP
select PPC_HAVE_KUAP
@@ -128,11 +127,13 @@ choice
config GENERIC_CPU
bool "Generic (POWER4 and above)"
depends on PPC64 && !CPU_LITTLE_ENDIAN
+ select PPC_64S_HASH_MMU
config GENERIC_CPU
bool "Generic (POWER8 and above)"
depends on PPC64 && CPU_LITTLE_ENDIAN
select ARCH_HAS_FAST_MULTIPLIER
+ select PPC_64S_HASH_MMU
config GENERIC_CPU
bool "Generic 32 bits powerpc"
@@ -141,24 +142,29 @@ config GENERIC_CPU
config CELL_CPU
bool "Cell Broadband Engine"
depends on PPC_BOOK3S_64 && !CPU_LITTLE_ENDIAN
+ select PPC_64S_HASH_MMU
config POWER5_CPU
bool "POWER5"
depends on PPC_BOOK3S_64 && !CPU_LITTLE_ENDIAN
+ select PPC_64S_HASH_MMU
config POWER6_CPU
bool "POWER6"
depends on PPC_BOOK3S_64 && !CPU_LITTLE_ENDIAN
+ select PPC_64S_HASH_MMU
config POWER7_CPU
bool "POWER7"
depends on PPC_BOOK3S_64
select ARCH_HAS_FAST_MULTIPLIER
+ select PPC_64S_HASH_MMU
config POWER8_CPU
bool "POWER8"
depends on PPC_BOOK3S_64
select ARCH_HAS_FAST_MULTIPLIER
+ select PPC_64S_HASH_MMU
config POWER9_CPU
bool "POWER9"
@@ -362,6 +368,17 @@ config SPE
If in doubt, say Y here.
+config PPC_64S_HASH_MMU
+ bool "Hash MMU Support"
+ depends on PPC_BOOK3S_64
+ default y
+ select PPC_MM_SLICES
+ help
+ Enable support for the Power ISA Hash style MMU. This is implemented
+ by all IBM Power and other Book3S CPUs.
+
+ If you're unsure, say Y.
+
config PPC_RADIX_MMU
bool "Radix MMU Support"
depends on PPC_BOOK3S_64
@@ -375,6 +392,7 @@ config PPC_RADIX_MMU
config PPC_RADIX_MMU_DEFAULT
bool "Default to using the Radix MMU when possible"
depends on PPC_RADIX_MMU
+ depends on PPC_64S_HASH_MMU
default y
help
When the hardware supports the Radix MMU, default to using it unless
diff --git a/arch/powerpc/platforms/cell/Kconfig b/arch/powerpc/platforms/cell/Kconfig
index db4465c51b56..faa894714a2a 100644
--- a/arch/powerpc/platforms/cell/Kconfig
+++ b/arch/powerpc/platforms/cell/Kconfig
@@ -8,6 +8,7 @@ config PPC_CELL_COMMON
select PPC_DCR_MMIO
select PPC_INDIRECT_PIO
select PPC_INDIRECT_MMIO
+ select PPC_64S_HASH_MMU
select PPC_HASH_MMU_NATIVE
select PPC_RTAS
select IRQ_EDGE_EOI_HANDLER
diff --git a/arch/powerpc/platforms/maple/Kconfig b/arch/powerpc/platforms/maple/Kconfig
index 7fd84311ade5..4c058cc57c90 100644
--- a/arch/powerpc/platforms/maple/Kconfig
+++ b/arch/powerpc/platforms/maple/Kconfig
@@ -9,6 +9,7 @@ config PPC_MAPLE
select GENERIC_TBSYNC
select PPC_UDBG_16550
select PPC_970_NAP
+ select PPC_64S_HASH_MMU
select PPC_HASH_MMU_NATIVE
select PPC_RTAS
select MMIO_NVRAM
diff --git a/arch/powerpc/platforms/microwatt/Kconfig b/arch/powerpc/platforms/microwatt/Kconfig
index 62b51e37fc05..823192e9d38a 100644
--- a/arch/powerpc/platforms/microwatt/Kconfig
+++ b/arch/powerpc/platforms/microwatt/Kconfig
@@ -5,7 +5,7 @@ config PPC_MICROWATT
select PPC_XICS
select PPC_ICS_NATIVE
select PPC_ICP_NATIVE
- select PPC_HASH_MMU_NATIVE
+ select PPC_HASH_MMU_NATIVE if PPC_64S_HASH_MMU
select PPC_UDBG_16550
select ARCH_RANDOM
help
diff --git a/arch/powerpc/platforms/pasemi/Kconfig b/arch/powerpc/platforms/pasemi/Kconfig
index bc7137353a7f..85ae18ddd911 100644
--- a/arch/powerpc/platforms/pasemi/Kconfig
+++ b/arch/powerpc/platforms/pasemi/Kconfig
@@ -5,6 +5,7 @@ config PPC_PASEMI
select MPIC
select FORCE_PCI
select PPC_UDBG_16550
+ select PPC_64S_HASH_MMU
select PPC_HASH_MMU_NATIVE
select MPIC_BROKEN_REGREAD
help
diff --git a/arch/powerpc/platforms/powermac/Kconfig b/arch/powerpc/platforms/powermac/Kconfig
index 2b56df145b82..130707ec9f99 100644
--- a/arch/powerpc/platforms/powermac/Kconfig
+++ b/arch/powerpc/platforms/powermac/Kconfig
@@ -6,6 +6,7 @@ config PPC_PMAC
select FORCE_PCI
select PPC_INDIRECT_PCI if PPC32
select PPC_MPC106 if PPC32
+ select PPC_64S_HASH_MMU if PPC64
select PPC_HASH_MMU_NATIVE
select ZONE_DMA if PPC32
default y
diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
index cd754e116184..161dfe024085 100644
--- a/arch/powerpc/platforms/powernv/Kconfig
+++ b/arch/powerpc/platforms/powernv/Kconfig
@@ -2,7 +2,7 @@
config PPC_POWERNV
depends on PPC64 && PPC_BOOK3S
bool "IBM PowerNV (Non-Virtualized) platform support"
- select PPC_HASH_MMU_NATIVE
+ select PPC_HASH_MMU_NATIVE if PPC_64S_HASH_MMU
select PPC_XICS
select PPC_ICP_NATIVE
select PPC_XIVE_NATIVE
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 528a7e0cf83a..ea4c63ebd1f8 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -492,12 +492,14 @@ static unsigned long power7_idle_insn(unsigned long type)
mtspr(SPRN_SPRG3, local_paca->sprg_vdso);
+#ifdef CONFIG_PPC_64S_HASH_MMU
/*
* The SLB has to be restored here, but it sometimes still
* contains entries, so the __ variant must be used to prevent
* multi hits.
*/
__slb_restore_bolted_realmode();
+#endif
return srr1;
}
diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index a8db3f153063..c6dbfa2e075a 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -207,6 +207,7 @@ static void __init pnv_init(void)
#endif
add_preferred_console("hvc", 0, NULL);
+#ifdef CONFIG_PPC_64S_HASH_MMU
if (!radix_enabled()) {
size_t size = sizeof(struct slb_entry) * mmu_slb_size;
int i;
@@ -219,6 +220,7 @@ static void __init pnv_init(void)
cpu_to_node(i));
}
}
+#endif
}
static void __init pnv_init_IRQ(void)
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index dab356e3ff87..880b77ba939b 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -57,6 +57,7 @@ EXPORT_SYMBOL(plpar_hcall);
EXPORT_SYMBOL(plpar_hcall9);
EXPORT_SYMBOL(plpar_hcall_norets);
+#ifdef CONFIG_PPC_64S_HASH_MMU
/*
* H_BLOCK_REMOVE supported block size for this page size in segment who's base
* page size is that page size.
@@ -65,6 +66,7 @@ EXPORT_SYMBOL(plpar_hcall_norets);
* page size.
*/
static int hblkrm_size[MMU_PAGE_COUNT][MMU_PAGE_COUNT] __ro_after_init;
+#endif
/*
* Due to the involved complexity, and that the current hypervisor is only
@@ -688,7 +690,7 @@ void vpa_init(int cpu)
return;
}
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
/*
* PAPR says this feature is SLB-Buffer but firmware never
* reports that. All SPLPAR support SLB shadow buffer.
@@ -701,7 +703,7 @@ void vpa_init(int cpu)
"cpu %d (hw %d) of area %lx failed with %ld\n",
cpu, hwcpu, addr, ret);
}
-#endif /* CONFIG_PPC_BOOK3S_64 */
+#endif /* CONFIG_PPC_64S_HASH_MMU */
/*
* Register dispatch trace log, if one has been allocated.
@@ -709,7 +711,35 @@ void vpa_init(int cpu)
register_dtl_buffer(cpu);
}
-#ifdef CONFIG_PPC_BOOK3S_64
+static int pseries_lpar_register_process_table(unsigned long base,
+ unsigned long page_size, unsigned long table_size)
+{
+ long rc;
+ unsigned long flags = 0;
+
+ if (table_size)
+ flags |= PROC_TABLE_NEW;
+ if (radix_enabled()) {
+ flags |= PROC_TABLE_RADIX;
+ if (mmu_has_feature(MMU_FTR_GTSE))
+ flags |= PROC_TABLE_GTSE;
+ } else
+ flags |= PROC_TABLE_HPT_SLB;
+ for (;;) {
+ rc = plpar_hcall_norets(H_REGISTER_PROC_TBL, flags, base,
+ page_size, table_size);
+ if (!H_IS_LONG_BUSY(rc))
+ break;
+ mdelay(get_longbusy_msecs(rc));
+ }
+ if (rc != H_SUCCESS) {
+ pr_err("Failed to register process table (rc=%ld)\n", rc);
+ BUG();
+ }
+ return rc;
+}
+
+#ifdef CONFIG_PPC_64S_HASH_MMU
static long pSeries_lpar_hpte_insert(unsigned long hpte_group,
unsigned long vpn, unsigned long pa,
@@ -1676,34 +1706,6 @@ static int pseries_lpar_resize_hpt(unsigned long shift)
return 0;
}
-static int pseries_lpar_register_process_table(unsigned long base,
- unsigned long page_size, unsigned long table_size)
-{
- long rc;
- unsigned long flags = 0;
-
- if (table_size)
- flags |= PROC_TABLE_NEW;
- if (radix_enabled()) {
- flags |= PROC_TABLE_RADIX;
- if (mmu_has_feature(MMU_FTR_GTSE))
- flags |= PROC_TABLE_GTSE;
- } else
- flags |= PROC_TABLE_HPT_SLB;
- for (;;) {
- rc = plpar_hcall_norets(H_REGISTER_PROC_TBL, flags, base,
- page_size, table_size);
- if (!H_IS_LONG_BUSY(rc))
- break;
- mdelay(get_longbusy_msecs(rc));
- }
- if (rc != H_SUCCESS) {
- pr_err("Failed to register process table (rc=%ld)\n", rc);
- BUG();
- }
- return rc;
-}
-
void __init hpte_init_pseries(void)
{
mmu_hash_ops.hpte_invalidate = pSeries_lpar_hpte_invalidate;
@@ -1726,6 +1728,7 @@ void __init hpte_init_pseries(void)
if (cpu_has_feature(CPU_FTR_ARCH_300))
pseries_lpar_register_process_table(0, 0, 0);
}
+#endif /* CONFIG_PPC_64S_HASH_MMU */
#ifdef CONFIG_PPC_RADIX_MMU
void radix_init_pseries(void)
@@ -1790,7 +1793,6 @@ void arch_free_page(struct page *page, int order)
EXPORT_SYMBOL(arch_free_page);
#endif /* CONFIG_PPC_SMLPAR */
-#endif /* CONFIG_PPC_BOOK3S_64 */
#ifdef CONFIG_TRACEPOINTS
#ifdef CONFIG_JUMP_LABEL
@@ -1928,6 +1930,7 @@ int h_get_mpp_x(struct hvcall_mpp_x_data *mpp_x_data)
return rc;
}
+#ifdef CONFIG_PPC_64S_HASH_MMU
static unsigned long vsid_unscramble(unsigned long vsid, int ssize)
{
unsigned long protovsid;
@@ -1988,6 +1991,7 @@ static int __init reserve_vrma_context_id(void)
return 0;
}
machine_device_initcall(pseries, reserve_vrma_context_id);
+#endif
#ifdef CONFIG_DEBUG_FS
/* debugfs file interface for vpa data */
diff --git a/arch/powerpc/platforms/pseries/lparcfg.c b/arch/powerpc/platforms/pseries/lparcfg.c
index f71eac74ea92..a0acb9e2b3c7 100644
--- a/arch/powerpc/platforms/pseries/lparcfg.c
+++ b/arch/powerpc/platforms/pseries/lparcfg.c
@@ -531,7 +531,7 @@ static int pseries_lparcfg_data(struct seq_file *m, void *v)
seq_printf(m, "shared_processor_mode=%d\n",
lppaca_shared_proc(get_lppaca()));
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
seq_printf(m, "slb_size=%d\n", mmu_slb_size);
#endif
parse_em_data(m);
diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
index e83e0891272d..aec1971e16a1 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -417,11 +417,15 @@ static void prod_others(void)
static u16 clamp_slb_size(void)
{
+#ifdef CONFIG_PPC_64S_HASH_MMU
u16 prev = mmu_slb_size;
slb_set_size(SLB_MIN_SIZE);
return prev;
+#else
+ return 0;
+#endif
}
static int do_suspend(void)
@@ -446,7 +450,9 @@ static int do_suspend(void)
ret = rtas_ibm_suspend_me(&status);
if (ret != 0) {
pr_err("ibm,suspend-me error: %d\n", status);
+#ifdef CONFIG_PPC_64S_HASH_MMU
slb_set_size(saved_slb_size);
+#endif
}
return ret;
diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 167f2e1b8d39..9286302e841c 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -526,6 +526,7 @@ static int mce_handle_err_realmode(int disposition, u8 error_type)
disposition = RTAS_DISP_FULLY_RECOVERED;
break;
case MC_ERROR_TYPE_SLB:
+#ifdef CONFIG_PPC_64S_HASH_MMU
/*
* Store the old slb content in paca before flushing.
* Print this when we go to virtual mode.
@@ -537,6 +538,9 @@ static int mce_handle_err_realmode(int disposition, u8 error_type)
if (local_paca->in_mce == 1)
slb_save_contents(local_paca->mce_faulty_slbs);
flush_and_reload_slb();
+#else
+ flush_erat();
+#endif
disposition = RTAS_DISP_FULLY_RECOVERED;
break;
default:
diff --git a/arch/powerpc/platforms/pseries/reconfig.c b/arch/powerpc/platforms/pseries/reconfig.c
index 7f7369fec46b..80dae18d6621 100644
--- a/arch/powerpc/platforms/pseries/reconfig.c
+++ b/arch/powerpc/platforms/pseries/reconfig.c
@@ -337,8 +337,10 @@ static int do_update_property(char *buf, size_t bufsize)
if (!newprop)
return -ENOMEM;
+#ifdef CONFIG_PPC_64S_HASH_MMU
if (!strcmp(name, "slb-size") || !strcmp(name, "ibm,slb-size"))
slb_set_size(*(int *)value);
+#endif
return of_update_property(np, newprop);
}
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 631a0d57b6cd..cf626224f6f7 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -113,7 +113,7 @@ static void __init fwnmi_init(void)
u8 *mce_data_buf;
unsigned int i;
int nr_cpus = num_possible_cpus();
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
struct slb_entry *slb_ptr;
size_t size;
#endif
@@ -153,7 +153,7 @@ static void __init fwnmi_init(void)
(RTAS_ERROR_LOG_MAX * i);
}
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
if (!radix_enabled()) {
/* Allocate per cpu area to save old slb contents during MCE */
size = sizeof(struct slb_entry) * mmu_slb_size * nr_cpus;
@@ -799,7 +799,9 @@ static void __init pSeries_setup_arch(void)
fwnmi_init();
pseries_setup_security_mitigations();
+#ifdef CONFIG_PPC_64S_HASH_MMU
pseries_lpar_read_hblkrm_characteristics();
+#endif
/* By default, only probe PCI (can be overridden by rtas_pci) */
pci_add_flags(PCI_PROBE_ONLY);
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index da4d7f225a40..95d0ecd30a2a 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -1170,9 +1170,11 @@ cmds(struct pt_regs *excp)
show_tasks();
break;
#ifdef CONFIG_PPC_BOOK3S
+#ifdef CONFIG_PPC_64S_HASH_MMU
case 'u':
dump_segments();
break;
+#endif
#elif defined(CONFIG_44x)
case 'u':
dump_tlb_44x();
@@ -2618,7 +2620,7 @@ static void dump_tracing(void)
static void dump_one_paca(int cpu)
{
struct paca_struct *p;
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
int i = 0;
#endif
@@ -2660,6 +2662,7 @@ static void dump_one_paca(int cpu)
DUMP(p, cpu_start, "%#-*x");
DUMP(p, kexec_state, "%#-*x");
#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
if (!early_radix_enabled()) {
for (i = 0; i < SLB_NUM_BOLTED; i++) {
u64 esid, vsid;
@@ -2687,6 +2690,7 @@ static void dump_one_paca(int cpu)
22, "slb_cache", i, p->slb_cache[i]);
}
}
+#endif
DUMP(p, rfi_flush_fallback_area, "%-*px");
#endif
@@ -3751,7 +3755,7 @@ static void xmon_print_symbol(unsigned long address, const char *mid,
printf("%s", after);
}
-#ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_64S_HASH_MMU
void dump_segments(void)
{
int i;
--
2.23.0
^ permalink raw reply related
* [RFC PATCH 3/6] powerpc/pseries: Stop selecting PPC_HASH_MMU_NATIVE
From: Nicholas Piggin @ 2021-08-27 16:34 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20210827163410.1177154-1-npiggin@gmail.com>
The pseries platform does not use the native hash code but the PAPR
virtualised hash interfaces. Remove PPC_HASH_MMU_NATIVE. This
requires moving some tlbiel code from hash_native.c to hash_utils.c.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/include/asm/book3s/64/tlbflush.h | 4 -
arch/powerpc/mm/book3s64/hash_native.c | 104 ------------------
arch/powerpc/mm/book3s64/hash_utils.c | 104 ++++++++++++++++++
arch/powerpc/platforms/pseries/Kconfig | 1 -
4 files changed, 104 insertions(+), 109 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index 215973b4cb26..d2e80f178b6d 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -14,7 +14,6 @@ enum {
TLB_INVAL_SCOPE_LPID = 1, /* invalidate TLBs for current LPID */
};
-#ifdef CONFIG_PPC_NATIVE
static inline void tlbiel_all(void)
{
/*
@@ -30,9 +29,6 @@ static inline void tlbiel_all(void)
else
hash__tlbiel_all(TLB_INVAL_SCOPE_GLOBAL);
}
-#else
-static inline void tlbiel_all(void) { BUG(); }
-#endif
static inline void tlbiel_all_lpid(bool radix)
{
diff --git a/arch/powerpc/mm/book3s64/hash_native.c b/arch/powerpc/mm/book3s64/hash_native.c
index 52e170bd95ae..83afd08351f1 100644
--- a/arch/powerpc/mm/book3s64/hash_native.c
+++ b/arch/powerpc/mm/book3s64/hash_native.c
@@ -43,110 +43,6 @@
static DEFINE_RAW_SPINLOCK(native_tlbie_lock);
-static inline void tlbiel_hash_set_isa206(unsigned int set, unsigned int is)
-{
- unsigned long rb;
-
- rb = (set << PPC_BITLSHIFT(51)) | (is << PPC_BITLSHIFT(53));
-
- asm volatile("tlbiel %0" : : "r" (rb));
-}
-
-/*
- * tlbiel instruction for hash, set invalidation
- * i.e., r=1 and is=01 or is=10 or is=11
- */
-static __always_inline void tlbiel_hash_set_isa300(unsigned int set, unsigned int is,
- unsigned int pid,
- unsigned int ric, unsigned int prs)
-{
- unsigned long rb;
- unsigned long rs;
- unsigned int r = 0; /* hash format */
-
- rb = (set << PPC_BITLSHIFT(51)) | (is << PPC_BITLSHIFT(53));
- rs = ((unsigned long)pid << PPC_BITLSHIFT(31));
-
- asm volatile(PPC_TLBIEL(%0, %1, %2, %3, %4)
- : : "r"(rb), "r"(rs), "i"(ric), "i"(prs), "i"(r)
- : "memory");
-}
-
-
-static void tlbiel_all_isa206(unsigned int num_sets, unsigned int is)
-{
- unsigned int set;
-
- asm volatile("ptesync": : :"memory");
-
- for (set = 0; set < num_sets; set++)
- tlbiel_hash_set_isa206(set, is);
-
- ppc_after_tlbiel_barrier();
-}
-
-static void tlbiel_all_isa300(unsigned int num_sets, unsigned int is)
-{
- unsigned int set;
-
- asm volatile("ptesync": : :"memory");
-
- /*
- * Flush the partition table cache if this is HV mode.
- */
- if (early_cpu_has_feature(CPU_FTR_HVMODE))
- tlbiel_hash_set_isa300(0, is, 0, 2, 0);
-
- /*
- * Now invalidate the process table cache. UPRT=0 HPT modes (what
- * current hardware implements) do not use the process table, but
- * add the flushes anyway.
- *
- * From ISA v3.0B p. 1078:
- * The following forms are invalid.
- * * PRS=1, R=0, and RIC!=2 (The only process-scoped
- * HPT caching is of the Process Table.)
- */
- tlbiel_hash_set_isa300(0, is, 0, 2, 1);
-
- /*
- * Then flush the sets of the TLB proper. Hash mode uses
- * partition scoped TLB translations, which may be flushed
- * in !HV mode.
- */
- for (set = 0; set < num_sets; set++)
- tlbiel_hash_set_isa300(set, is, 0, 0, 0);
-
- ppc_after_tlbiel_barrier();
-
- asm volatile(PPC_ISA_3_0_INVALIDATE_ERAT "; isync" : : :"memory");
-}
-
-void hash__tlbiel_all(unsigned int action)
-{
- unsigned int is;
-
- switch (action) {
- case TLB_INVAL_SCOPE_GLOBAL:
- is = 3;
- break;
- case TLB_INVAL_SCOPE_LPID:
- is = 2;
- break;
- default:
- BUG();
- }
-
- if (early_cpu_has_feature(CPU_FTR_ARCH_300))
- tlbiel_all_isa300(POWER9_TLB_SETS_HASH, is);
- else if (early_cpu_has_feature(CPU_FTR_ARCH_207S))
- tlbiel_all_isa206(POWER8_TLB_SETS, is);
- else if (early_cpu_has_feature(CPU_FTR_ARCH_206))
- tlbiel_all_isa206(POWER7_TLB_SETS, is);
- else
- WARN(1, "%s called on pre-POWER7 CPU\n", __func__);
-}
-
static inline unsigned long ___tlbie(unsigned long vpn, int psize,
int apsize, int ssize)
{
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index e35faefa8e59..43c05cdbba73 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -175,6 +175,110 @@ static struct mmu_psize_def mmu_psize_defaults_gp[] = {
},
};
+static inline void tlbiel_hash_set_isa206(unsigned int set, unsigned int is)
+{
+ unsigned long rb;
+
+ rb = (set << PPC_BITLSHIFT(51)) | (is << PPC_BITLSHIFT(53));
+
+ asm volatile("tlbiel %0" : : "r" (rb));
+}
+
+/*
+ * tlbiel instruction for hash, set invalidation
+ * i.e., r=1 and is=01 or is=10 or is=11
+ */
+static __always_inline void tlbiel_hash_set_isa300(unsigned int set, unsigned int is,
+ unsigned int pid,
+ unsigned int ric, unsigned int prs)
+{
+ unsigned long rb;
+ unsigned long rs;
+ unsigned int r = 0; /* hash format */
+
+ rb = (set << PPC_BITLSHIFT(51)) | (is << PPC_BITLSHIFT(53));
+ rs = ((unsigned long)pid << PPC_BITLSHIFT(31));
+
+ asm volatile(PPC_TLBIEL(%0, %1, %2, %3, %4)
+ : : "r"(rb), "r"(rs), "i"(ric), "i"(prs), "i"(r)
+ : "memory");
+}
+
+
+static void tlbiel_all_isa206(unsigned int num_sets, unsigned int is)
+{
+ unsigned int set;
+
+ asm volatile("ptesync": : :"memory");
+
+ for (set = 0; set < num_sets; set++)
+ tlbiel_hash_set_isa206(set, is);
+
+ ppc_after_tlbiel_barrier();
+}
+
+static void tlbiel_all_isa300(unsigned int num_sets, unsigned int is)
+{
+ unsigned int set;
+
+ asm volatile("ptesync": : :"memory");
+
+ /*
+ * Flush the partition table cache if this is HV mode.
+ */
+ if (early_cpu_has_feature(CPU_FTR_HVMODE))
+ tlbiel_hash_set_isa300(0, is, 0, 2, 0);
+
+ /*
+ * Now invalidate the process table cache. UPRT=0 HPT modes (what
+ * current hardware implements) do not use the process table, but
+ * add the flushes anyway.
+ *
+ * From ISA v3.0B p. 1078:
+ * The following forms are invalid.
+ * * PRS=1, R=0, and RIC!=2 (The only process-scoped
+ * HPT caching is of the Process Table.)
+ */
+ tlbiel_hash_set_isa300(0, is, 0, 2, 1);
+
+ /*
+ * Then flush the sets of the TLB proper. Hash mode uses
+ * partition scoped TLB translations, which may be flushed
+ * in !HV mode.
+ */
+ for (set = 0; set < num_sets; set++)
+ tlbiel_hash_set_isa300(set, is, 0, 0, 0);
+
+ ppc_after_tlbiel_barrier();
+
+ asm volatile(PPC_ISA_3_0_INVALIDATE_ERAT "; isync" : : :"memory");
+}
+
+void hash__tlbiel_all(unsigned int action)
+{
+ unsigned int is;
+
+ switch (action) {
+ case TLB_INVAL_SCOPE_GLOBAL:
+ is = 3;
+ break;
+ case TLB_INVAL_SCOPE_LPID:
+ is = 2;
+ break;
+ default:
+ BUG();
+ }
+
+ if (early_cpu_has_feature(CPU_FTR_ARCH_300))
+ tlbiel_all_isa300(POWER9_TLB_SETS_HASH, is);
+ else if (early_cpu_has_feature(CPU_FTR_ARCH_207S))
+ tlbiel_all_isa206(POWER8_TLB_SETS, is);
+ else if (early_cpu_has_feature(CPU_FTR_ARCH_206))
+ tlbiel_all_isa206(POWER7_TLB_SETS, is);
+ else
+ WARN(1, "%s called on pre-POWER7 CPU\n", __func__);
+}
+
/*
* 'R' and 'C' update notes:
* - Under pHyp or KVM, the updatepp path will not set C, thus it *will*
diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index 69a1ff8c079b..98ab9697ab5e 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -17,7 +17,6 @@ config PPC_PSERIES
select PPC_RTAS_DAEMON
select RTAS_ERROR_LOGGING
select PPC_UDBG_16550
- select PPC_HASH_MMU_NATIVE
select PPC_DOORBELL
select HOTPLUG_CPU
select ARCH_RANDOM
--
2.23.0
^ permalink raw reply related
* [RFC PATCH 2/6] powerpc: Rename PPC_NATIVE to PPC_HASH_MMU_NATIVE
From: Nicholas Piggin @ 2021-08-27 16:34 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20210827163410.1177154-1-npiggin@gmail.com>
PPC_NATIVE now only controls the native HPT code, so rename it to be
more descriptive. Restrict it to Book3S only.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/mm/book3s64/Makefile | 2 +-
arch/powerpc/mm/book3s64/hash_utils.c | 2 +-
arch/powerpc/platforms/52xx/Kconfig | 2 +-
arch/powerpc/platforms/Kconfig | 4 ++--
arch/powerpc/platforms/cell/Kconfig | 2 +-
arch/powerpc/platforms/chrp/Kconfig | 2 +-
arch/powerpc/platforms/embedded6xx/Kconfig | 2 +-
arch/powerpc/platforms/maple/Kconfig | 2 +-
arch/powerpc/platforms/microwatt/Kconfig | 2 +-
arch/powerpc/platforms/pasemi/Kconfig | 2 +-
arch/powerpc/platforms/powermac/Kconfig | 2 +-
arch/powerpc/platforms/powernv/Kconfig | 2 +-
arch/powerpc/platforms/pseries/Kconfig | 2 +-
13 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/arch/powerpc/mm/book3s64/Makefile b/arch/powerpc/mm/book3s64/Makefile
index 1b56d3af47d4..319f4b7f3357 100644
--- a/arch/powerpc/mm/book3s64/Makefile
+++ b/arch/powerpc/mm/book3s64/Makefile
@@ -6,7 +6,7 @@ CFLAGS_REMOVE_slb.o = $(CC_FLAGS_FTRACE)
obj-y += hash_pgtable.o hash_utils.o slb.o \
mmu_context.o pgtable.o hash_tlb.o
-obj-$(CONFIG_PPC_NATIVE) += hash_native.o
+obj-$(CONFIG_PPC_HASH_MMU_NATIVE) += hash_native.o
obj-$(CONFIG_PPC_RADIX_MMU) += radix_pgtable.o radix_tlb.o
obj-$(CONFIG_PPC_4K_PAGES) += hash_4k.o
obj-$(CONFIG_PPC_64K_PAGES) += hash_64k.o
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index ac5720371c0d..e35faefa8e59 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -1091,7 +1091,7 @@ void __init hash__early_init_mmu(void)
ps3_early_mm_init();
else if (firmware_has_feature(FW_FEATURE_LPAR))
hpte_init_pseries();
- else if (IS_ENABLED(CONFIG_PPC_NATIVE))
+ else if (IS_ENABLED(CONFIG_PPC_HASH_MMU_NATIVE))
hpte_init_native();
if (!mmu_hash_ops.hpte_insert)
diff --git a/arch/powerpc/platforms/52xx/Kconfig b/arch/powerpc/platforms/52xx/Kconfig
index 99d60acc20c8..b72ed2950ca8 100644
--- a/arch/powerpc/platforms/52xx/Kconfig
+++ b/arch/powerpc/platforms/52xx/Kconfig
@@ -34,7 +34,7 @@ config PPC_EFIKA
bool "bPlan Efika 5k2. MPC5200B based computer"
depends on PPC_MPC52xx
select PPC_RTAS
- select PPC_NATIVE
+ select PPC_HASH_MMU_NATIVE
config PPC_LITE5200
bool "Freescale Lite5200 Eval Board"
diff --git a/arch/powerpc/platforms/Kconfig b/arch/powerpc/platforms/Kconfig
index e02d29a9d12f..d41dad227de8 100644
--- a/arch/powerpc/platforms/Kconfig
+++ b/arch/powerpc/platforms/Kconfig
@@ -40,9 +40,9 @@ config EPAPR_PARAVIRT
In case of doubt, say Y
-config PPC_NATIVE
+config PPC_HASH_MMU_NATIVE
bool
- depends on PPC_BOOK3S_32 || PPC64
+ depends on PPC_BOOK3S
help
Support for running natively on the hardware, i.e. without
a hypervisor. This option is not user-selectable but should
diff --git a/arch/powerpc/platforms/cell/Kconfig b/arch/powerpc/platforms/cell/Kconfig
index cb70c5f25bc6..db4465c51b56 100644
--- a/arch/powerpc/platforms/cell/Kconfig
+++ b/arch/powerpc/platforms/cell/Kconfig
@@ -8,7 +8,7 @@ config PPC_CELL_COMMON
select PPC_DCR_MMIO
select PPC_INDIRECT_PIO
select PPC_INDIRECT_MMIO
- select PPC_NATIVE
+ select PPC_HASH_MMU_NATIVE
select PPC_RTAS
select IRQ_EDGE_EOI_HANDLER
diff --git a/arch/powerpc/platforms/chrp/Kconfig b/arch/powerpc/platforms/chrp/Kconfig
index 9b5c5505718a..ff30ed579a39 100644
--- a/arch/powerpc/platforms/chrp/Kconfig
+++ b/arch/powerpc/platforms/chrp/Kconfig
@@ -11,6 +11,6 @@ config PPC_CHRP
select RTAS_ERROR_LOGGING
select PPC_MPC106
select PPC_UDBG_16550
- select PPC_NATIVE
+ select PPC_HASH_MMU_NATIVE
select FORCE_PCI
default y
diff --git a/arch/powerpc/platforms/embedded6xx/Kconfig b/arch/powerpc/platforms/embedded6xx/Kconfig
index 4c6d703a4284..c54786f8461e 100644
--- a/arch/powerpc/platforms/embedded6xx/Kconfig
+++ b/arch/powerpc/platforms/embedded6xx/Kconfig
@@ -55,7 +55,7 @@ config MVME5100
select FORCE_PCI
select PPC_INDIRECT_PCI
select PPC_I8259
- select PPC_NATIVE
+ select PPC_HASH_MMU_NATIVE
select PPC_UDBG_16550
help
This option enables support for the Motorola (now Emerson) MVME5100
diff --git a/arch/powerpc/platforms/maple/Kconfig b/arch/powerpc/platforms/maple/Kconfig
index 86ae210bee9a..7fd84311ade5 100644
--- a/arch/powerpc/platforms/maple/Kconfig
+++ b/arch/powerpc/platforms/maple/Kconfig
@@ -9,7 +9,7 @@ config PPC_MAPLE
select GENERIC_TBSYNC
select PPC_UDBG_16550
select PPC_970_NAP
- select PPC_NATIVE
+ select PPC_HASH_MMU_NATIVE
select PPC_RTAS
select MMIO_NVRAM
select ATA_NONSTANDARD if ATA
diff --git a/arch/powerpc/platforms/microwatt/Kconfig b/arch/powerpc/platforms/microwatt/Kconfig
index 8f6a81978461..62b51e37fc05 100644
--- a/arch/powerpc/platforms/microwatt/Kconfig
+++ b/arch/powerpc/platforms/microwatt/Kconfig
@@ -5,7 +5,7 @@ config PPC_MICROWATT
select PPC_XICS
select PPC_ICS_NATIVE
select PPC_ICP_NATIVE
- select PPC_NATIVE
+ select PPC_HASH_MMU_NATIVE
select PPC_UDBG_16550
select ARCH_RANDOM
help
diff --git a/arch/powerpc/platforms/pasemi/Kconfig b/arch/powerpc/platforms/pasemi/Kconfig
index c52731a7773f..bc7137353a7f 100644
--- a/arch/powerpc/platforms/pasemi/Kconfig
+++ b/arch/powerpc/platforms/pasemi/Kconfig
@@ -5,7 +5,7 @@ config PPC_PASEMI
select MPIC
select FORCE_PCI
select PPC_UDBG_16550
- select PPC_NATIVE
+ select PPC_HASH_MMU_NATIVE
select MPIC_BROKEN_REGREAD
help
This option enables support for PA Semi's PWRficient line
diff --git a/arch/powerpc/platforms/powermac/Kconfig b/arch/powerpc/platforms/powermac/Kconfig
index b97bf12801eb..2b56df145b82 100644
--- a/arch/powerpc/platforms/powermac/Kconfig
+++ b/arch/powerpc/platforms/powermac/Kconfig
@@ -6,7 +6,7 @@ config PPC_PMAC
select FORCE_PCI
select PPC_INDIRECT_PCI if PPC32
select PPC_MPC106 if PPC32
- select PPC_NATIVE
+ select PPC_HASH_MMU_NATIVE
select ZONE_DMA if PPC32
default y
diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
index 043eefbbdd28..cd754e116184 100644
--- a/arch/powerpc/platforms/powernv/Kconfig
+++ b/arch/powerpc/platforms/powernv/Kconfig
@@ -2,7 +2,7 @@
config PPC_POWERNV
depends on PPC64 && PPC_BOOK3S
bool "IBM PowerNV (Non-Virtualized) platform support"
- select PPC_NATIVE
+ select PPC_HASH_MMU_NATIVE
select PPC_XICS
select PPC_ICP_NATIVE
select PPC_XIVE_NATIVE
diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index 5e037df2a3a1..69a1ff8c079b 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -17,7 +17,7 @@ config PPC_PSERIES
select PPC_RTAS_DAEMON
select RTAS_ERROR_LOGGING
select PPC_UDBG_16550
- select PPC_NATIVE
+ select PPC_HASH_MMU_NATIVE
select PPC_DOORBELL
select HOTPLUG_CPU
select ARCH_RANDOM
--
2.23.0
^ permalink raw reply related
* [RFC PATCH 1/6] powerpc: Remove unused FW_FEATURE_NATIVE references
From: Nicholas Piggin @ 2021-08-27 16:34 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20210827163410.1177154-1-npiggin@gmail.com>
FW_FEATURE_NATIVE_ALWAYS and FW_FEATURE_NATIVE_POSSIBLE are always
zero and never do anything. Remove them.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/include/asm/firmware.h | 8 --------
1 file changed, 8 deletions(-)
diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h
index 7604673787d6..df9e4aae917e 100644
--- a/arch/powerpc/include/asm/firmware.h
+++ b/arch/powerpc/include/asm/firmware.h
@@ -79,8 +79,6 @@ enum {
FW_FEATURE_POWERNV_ALWAYS = 0,
FW_FEATURE_PS3_POSSIBLE = FW_FEATURE_LPAR | FW_FEATURE_PS3_LV1,
FW_FEATURE_PS3_ALWAYS = FW_FEATURE_LPAR | FW_FEATURE_PS3_LV1,
- FW_FEATURE_NATIVE_POSSIBLE = 0,
- FW_FEATURE_NATIVE_ALWAYS = 0,
FW_FEATURE_POSSIBLE =
#ifdef CONFIG_PPC_PSERIES
FW_FEATURE_PSERIES_POSSIBLE |
@@ -90,9 +88,6 @@ enum {
#endif
#ifdef CONFIG_PPC_PS3
FW_FEATURE_PS3_POSSIBLE |
-#endif
-#ifdef CONFIG_PPC_NATIVE
- FW_FEATURE_NATIVE_ALWAYS |
#endif
0,
FW_FEATURE_ALWAYS =
@@ -104,9 +99,6 @@ enum {
#endif
#ifdef CONFIG_PPC_PS3
FW_FEATURE_PS3_ALWAYS &
-#endif
-#ifdef CONFIG_PPC_NATIVE
- FW_FEATURE_NATIVE_ALWAYS &
#endif
FW_FEATURE_POSSIBLE,
--
2.23.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox