LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v2 0/8] powernv/memtrace: don't abuse memory hot(un)plug infrastructure for memory allocations
From: Michael Ellerman @ 2020-11-25 11:57 UTC (permalink / raw)
  To: linux-kernel, David Hildenbrand
  Cc: Michal Hocko, Wei Yang, Nicholas Piggin, Michal Hocko, linux-mm,
	Paul Mackerras, Aneesh Kumar K.V, Rashmica Gupta, linuxppc-dev,
	Andrew Morton, Mike Rapoport, Oscar Salvador
In-Reply-To: <20201111145322.15793-1-david@redhat.com>

On Wed, 11 Nov 2020 15:53:14 +0100, David Hildenbrand wrote:
> Based on latest linux/master
> 
> powernv/memtrace is the only in-kernel user that rips out random memory
> it never added (doesn't own) in order to allocate memory without a
> linear mapping. Let's stop abusing memory hot(un)plug infrastructure for
> that - use alloc_contig_pages() for allocating memory and remove the
> linear mapping manually.
> 
> [...]

Applied to powerpc/next.

[1/8] powerpc/powernv/memtrace: Don't leak kernel memory to user space
      https://git.kernel.org/powerpc/c/c74cf7a3d59a21b290fe0468f5b470d0b8ee37df
[2/8] powerpc/powernv/memtrace: Fix crashing the kernel when enabling concurrently
      https://git.kernel.org/powerpc/c/d6718941a2767fb383e105d257d2105fe4f15f0e
[3/8] powerpc/mm: factor out creating/removing linear mapping
      https://git.kernel.org/powerpc/c/4abb1e5b63ac3281275315fc6b0cde0b9c2e2e42
[4/8] powerpc/mm: protect linear mapping modifications by a mutex
      https://git.kernel.org/powerpc/c/e5b2af044f31bf18defa557a8cd11c23caefa34c
[5/8] powerpc/mm: print warning in arch_remove_linear_mapping()
      https://git.kernel.org/powerpc/c/1f73ad3e8d755dbec52fcec98618a7ce4de12af2
[6/8] powerpc/book3s64/hash: Drop WARN_ON in hash__remove_section_mapping()
      https://git.kernel.org/powerpc/c/d8bd9a121c2f2bc8b36da930dc91b69fd2a705e2
[7/8] powerpc/mm: remove linear mapping if __add_pages() fails in arch_add_memory()
      https://git.kernel.org/powerpc/c/ca2c36cae9d48b180ea51259e35ab3d95d327df2
[8/8] powernv/memtrace: don't abuse memory hot(un)plug infrastructure for memory allocations
      https://git.kernel.org/powerpc/c/0bd4b96d99108b7ea9bac0573957483be7781d70

cheers

^ permalink raw reply

* Re: [PATCH v4 1/2] powerpc/64: Set up a kernel stack for secondaries before cpu_restore()
From: Michael Ellerman @ 2020-11-25 11:57 UTC (permalink / raw)
  To: Jordan Niethe, linuxppc-dev; +Cc: oohall, npiggin
In-Reply-To: <20201014072837.24539-1-jniethe5@gmail.com>

On Wed, 14 Oct 2020 18:28:36 +1100, Jordan Niethe wrote:
> Currently in generic_secondary_smp_init(), cur_cpu_spec->cpu_restore()
> is called before a stack has been set up in r1. This was previously fine
> as the cpu_restore() functions were implemented in assembly and did not
> use a stack. However commit 5a61ef74f269 ("powerpc/64s: Support new
> device tree binding for discovering CPU features") used
> __restore_cpu_cpufeatures() as the cpu_restore() function for a
> device-tree features based cputable entry. This is a C function and
> hence uses a stack in r1.
> 
> [...]

Applied to powerpc/next.

[1/2] powerpc/64: Set up a kernel stack for secondaries before cpu_restore()
      https://git.kernel.org/powerpc/c/3c0b976bf20d236c57adcefa80f86a0a1d737727
[2/2] powerpc/64s: Convert some cpu_setup() and cpu_restore() functions to C
      https://git.kernel.org/powerpc/c/344fbab991a568dc33ad90711b489d870e18d26d

cheers

^ permalink raw reply

* Re: [PATCH v1 0/4] powernv/memtrace: don't abuse memory hot(un)plug infrastructure for memory allocations
From: Michael Ellerman @ 2020-11-25 11:57 UTC (permalink / raw)
  To: linux-kernel, David Hildenbrand
  Cc: Michal Hocko, Wei Yang, Michal Hocko, linux-mm, Paul Mackerras,
	Rashmica Gupta, linuxppc-dev, Andrew Morton, Mike Rapoport,
	Oscar Salvador
In-Reply-To: <20201029162718.29910-1-david@redhat.com>

On Thu, 29 Oct 2020 17:27:14 +0100, David Hildenbrand wrote:
> powernv/memtrace is the only in-kernel user that rips out random memory
> it never added (doesn't own) in order to allocate memory without a
> linear mapping. Let's stop abusing memory hot(un)plug infrastructure for
> that - use alloc_contig_pages() for allocating memory and remove the
> linear mapping manually.
> 
> The original idea was discussed in:
>  https://lkml.kernel.org/r/48340e96-7e6b-736f-9e23-d3111b915b6e@redhat.com
> 
> [...]

Applied to powerpc/next.

[1/4] powerpc/mm: factor out creating/removing linear mapping
      https://git.kernel.org/powerpc/c/4abb1e5b63ac3281275315fc6b0cde0b9c2e2e42
[2/4] powerpc/mm: print warning in arch_remove_linear_mapping()
      https://git.kernel.org/powerpc/c/1f73ad3e8d755dbec52fcec98618a7ce4de12af2
[3/4] powerpc/mm: remove linear mapping if __add_pages() fails in arch_add_memory()
      https://git.kernel.org/powerpc/c/ca2c36cae9d48b180ea51259e35ab3d95d327df2
[4/4] powernv/memtrace: don't abuse memory hot(un)plug infrastructure for memory allocations
      https://git.kernel.org/powerpc/c/0bd4b96d99108b7ea9bac0573957483be7781d70

cheers

^ permalink raw reply

* Re: [PATCH v2 1/3] powerpc/64s: Replace RFI by RFI_TO_KERNEL and remove RFI
From: Michael Ellerman @ 2020-11-25 11:57 UTC (permalink / raw)
  To: Christophe Leroy, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <7719261b0a0d2787772339484c33eb809723bca7.1604854583.git.christophe.leroy@csgroup.eu>

On Sun, 8 Nov 2020 16:57:35 +0000 (UTC), Christophe Leroy wrote:
> In head_64.S, we have two places using RFI to return to
> kernel. Use RFI_TO_KERNEL instead.
> 
> They are the two only places using RFI on book3s/64, so
> the RFI macro can go away.

Applied to powerpc/next.

[1/3] powerpc/64s: Replace RFI by RFI_TO_KERNEL and remove RFI
      https://git.kernel.org/powerpc/c/879add7720172ffd2986c44587510fabb7af52f5
[2/3] powerpc: Replace RFI by rfi on book3s/32 and booke
      https://git.kernel.org/powerpc/c/120c0518ec321f33cdc4670059fb76e96ceb56eb
[3/3] powerpc: Remove RFI macro
      https://git.kernel.org/powerpc/c/62182e6c0faf75117f8d1719c118bb5fc8574012

cheers

^ permalink raw reply

* Re: [PATCH v13 0/8] powerpc: switch VDSO to C implementation
From: Michael Ellerman @ 2020-11-25 11:57 UTC (permalink / raw)
  To: Michael Ellerman, Paul Mackerras, anton, Benjamin Herrenschmidt,
	Christophe Leroy, nathanl
  Cc: linux-arch, arnd, linux-kernel, luto, tglx, vincenzo.frascino,
	linuxppc-dev
In-Reply-To: <cover.1604426550.git.christophe.leroy@csgroup.eu>

On Tue, 3 Nov 2020 18:07:11 +0000 (UTC), Christophe Leroy wrote:
> This is a series to switch powerpc VDSO to generic C implementation.
> 
> Changes in v13:
> - Reorganised headers to avoid the need for a fake 32 bits config for building VDSO32 on PPC64
> - Rebased after the removal of powerpc 601
> - Using DOTSYM() macro to call functions directly without using OPD
> - Explicitely dropped .opd and .got1 sections which are now unused
> 
> [...]

Patch 1 applied to powerpc/next.

[1/8] powerpc/feature: Fix CPU_FTRS_ALWAYS by removing CPU_FTRS_GENERIC_32
      https://git.kernel.org/powerpc/c/78665179e569c7e1fe102fb6c21d0f5b6951f084

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/bitops: Fix possible undefined behaviour with fls() and fls64()
From: Michael Ellerman @ 2020-11-25 11:57 UTC (permalink / raw)
  To: Michael Ellerman, Paul Mackerras, jakub, Benjamin Herrenschmidt,
	Christophe Leroy, segher
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <348c2d3f19ffcff8abe50d52513f989c4581d000.1603375524.git.christophe.leroy@csgroup.eu>

On Thu, 22 Oct 2020 14:05:46 +0000 (UTC), Christophe Leroy wrote:
> fls() and fls64() are using __builtin_ctz() and _builtin_ctzll().
> On powerpc, those builtins trivially use ctlzw and ctlzd power
> instructions.
> 
> Allthough those instructions provide the expected result with
> input argument 0, __builtin_ctz() and __builtin_ctzll() are
> documented as undefined for value 0.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/bitops: Fix possible undefined behaviour with fls() and fls64()
      https://git.kernel.org/powerpc/c/1891ef21d92c4801ea082ee8ed478e304ddc6749

cheers

^ permalink raw reply

* Re: [PATCH] powerpc: avoid broken GCC __attribute__((optimize))
From: Michael Ellerman @ 2020-11-25 11:57 UTC (permalink / raw)
  To: linux-kernel, Ard Biesheuvel
  Cc: Kees Cook, Daniel Borkmann, Peter Zijlstra, Randy Dunlap,
	Nick Desaulniers, Alexei Starovoitov, Arvind Sankar,
	Paul Mackerras, Josh Poimboeuf, Geert Uytterhoeven,
	Thomas Gleixner, linuxppc-dev
In-Reply-To: <20201028080433.26799-1-ardb@kernel.org>

On Wed, 28 Oct 2020 09:04:33 +0100, Ard Biesheuvel wrote:
> Commit 7053f80d9696 ("powerpc/64: Prevent stack protection in early boot")
> introduced a couple of uses of __attribute__((optimize)) with function
> scope, to disable the stack protector in some early boot code.
> 
> Unfortunately, and this is documented in the GCC man pages [0], overriding
> function attributes for optimization is broken, and is only supported for
> debug scenarios, not for production: the problem appears to be that
> setting GCC -f flags using this method will cause it to forget about some
> or all other optimization settings that have been applied.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc: Avoid broken GCC __attribute__((optimize))
      https://git.kernel.org/powerpc/c/a7223f5bfcaeade4a86d35263493bcda6c940891

cheers

^ permalink raw reply

* Re: [PATCH v2] powerpc/mm: Update tlbiel loop on POWER10
From: Michael Ellerman @ 2020-11-25 11:57 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev, mpe; +Cc: npiggin
In-Reply-To: <20201007053305.232879-1-aneesh.kumar@linux.ibm.com>

On Wed, 7 Oct 2020 11:03:05 +0530, Aneesh Kumar K.V wrote:
> With POWER10, single tlbiel instruction invalidates all the congruence
> class of the TLB and hence we need to issue only one tlbiel with SET=0.

Applied to powerpc/next.

[1/1] powerpc/mm: Update tlbiel loop on POWER10
      https://git.kernel.org/powerpc/c/e80639405c40127727812a0e1f8a65ba9979f146

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/mm: move setting pte specific flags to pfn_pmd
From: Michael Ellerman @ 2020-11-25 11:57 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev, mpe; +Cc: linux-mm
In-Reply-To: <20201022091115.39568-1-aneesh.kumar@linux.ibm.com>

On Thu, 22 Oct 2020 14:41:15 +0530, Aneesh Kumar K.V wrote:
> powerpc used to set the pte specific flags in set_pte_at().  This is
> different from other architectures. To be consistent with other
> architecture powerpc updated pfn_pte to set _PAGE_PTE with
> commit 379c926d6334 ("powerpc/mm: move setting pte specific flags to pfn_pte")
> 
> The commit didn't do the same w.r.t pfn_pmd because we expect pmd_mkhuge
> to do that. But as per Linus that is a bad rule [1].
> Hence update pfn_pmd to set _PAGE_PTE.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/mm: Move setting PTE specific flags to pfn_pmd()
      https://git.kernel.org/powerpc/c/53f45ecc9cd04b4b963f3040f2a54c3baf03b229

cheers

^ permalink raw reply

* [PATCH v2 2/2] powerpc/pseries: pass MSI affinity to irq_create_mapping()
From: Laurent Vivier @ 2020-11-25 11:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Laurent Vivier, Michael S . Tsirkin, linux-pci, Greg Kurz,
	linux-block, Paul Mackerras, Marc Zyngier, Thomas Gleixner,
	linuxppc-dev, Christoph Hellwig
In-Reply-To: <20201125111657.1141295-1-lvivier@redhat.com>

With virtio multiqueue, normally each queue IRQ is mapped to a CPU.

But since commit 0d9f0a52c8b9f ("virtio_scsi: use virtio IRQ affinity")
this is broken on pseries.

The affinity is correctly computed in msi_desc but this is not applied
to the system IRQs.

It appears the affinity is correctly passed to rtas_setup_msi_irqs() but
lost at this point and never passed to irq_domain_alloc_descs()
(see commit 06ee6d571f0e ("genirq: Add affinity hint to irq allocation"))
because irq_create_mapping() doesn't take an affinity parameter.

As the previous patch has added the affinity parameter to
irq_create_mapping() we can forward the affinity from rtas_setup_msi_irqs()
to irq_domain_alloc_descs().

With this change, the virtqueues are correctly dispatched between the CPUs
on pseries.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
---
 arch/powerpc/platforms/pseries/msi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
index 133f6adcb39c..b3ac2455faad 100644
--- a/arch/powerpc/platforms/pseries/msi.c
+++ b/arch/powerpc/platforms/pseries/msi.c
@@ -458,7 +458,8 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
 			return hwirq;
 		}
 
-		virq = irq_create_mapping(NULL, hwirq);
+		virq = irq_create_mapping_affinity(NULL, hwirq,
+						   entry->affinity);
 
 		if (!virq) {
 			pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
-- 
2.28.0


^ permalink raw reply related

* [PATCH v2 1/2] genirq: add an irq_create_mapping_affinity() function
From: Laurent Vivier @ 2020-11-25 11:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Laurent Vivier, Michael S . Tsirkin, linux-pci, Greg Kurz,
	linux-block, Paul Mackerras, Marc Zyngier, Thomas Gleixner,
	linuxppc-dev, Christoph Hellwig
In-Reply-To: <20201125111657.1141295-1-lvivier@redhat.com>

This function adds an affinity parameter to irq_create_mapping().
This parameter is needed to pass it to irq_domain_alloc_descs().

irq_create_mapping() is a wrapper around irq_create_mapping_affinity()
to pass NULL for the affinity parameter.

No functional change.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
---
 include/linux/irqdomain.h | 12 ++++++++++--
 kernel/irq/irqdomain.c    | 13 ++++++++-----
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index 71535e87109f..ea5a337e0f8b 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -384,11 +384,19 @@ extern void irq_domain_associate_many(struct irq_domain *domain,
 extern void irq_domain_disassociate(struct irq_domain *domain,
 				    unsigned int irq);
 
-extern unsigned int irq_create_mapping(struct irq_domain *host,
-				       irq_hw_number_t hwirq);
+extern unsigned int irq_create_mapping_affinity(struct irq_domain *host,
+				      irq_hw_number_t hwirq,
+				      const struct irq_affinity_desc *affinity);
 extern unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec);
 extern void irq_dispose_mapping(unsigned int virq);
 
+static inline unsigned int irq_create_mapping(struct irq_domain *host,
+					      irq_hw_number_t hwirq)
+{
+	return irq_create_mapping_affinity(host, hwirq, NULL);
+}
+
+
 /**
  * irq_linear_revmap() - Find a linux irq from a hw irq number.
  * @domain: domain owning this hardware interrupt
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index cf8b374b892d..e4ca69608f3b 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -624,17 +624,19 @@ unsigned int irq_create_direct_mapping(struct irq_domain *domain)
 EXPORT_SYMBOL_GPL(irq_create_direct_mapping);
 
 /**
- * irq_create_mapping() - Map a hardware interrupt into linux irq space
+ * irq_create_mapping_affinity() - Map a hardware interrupt into linux irq space
  * @domain: domain owning this hardware interrupt or NULL for default domain
  * @hwirq: hardware irq number in that domain space
+ * @affinity: irq affinity
  *
  * Only one mapping per hardware interrupt is permitted. Returns a linux
  * irq number.
  * If the sense/trigger is to be specified, set_irq_type() should be called
  * on the number returned from that call.
  */
-unsigned int irq_create_mapping(struct irq_domain *domain,
-				irq_hw_number_t hwirq)
+unsigned int irq_create_mapping_affinity(struct irq_domain *domain,
+				       irq_hw_number_t hwirq,
+				       const struct irq_affinity_desc *affinity)
 {
 	struct device_node *of_node;
 	int virq;
@@ -660,7 +662,8 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
 	}
 
 	/* Allocate a virtual interrupt number */
-	virq = irq_domain_alloc_descs(-1, 1, hwirq, of_node_to_nid(of_node), NULL);
+	virq = irq_domain_alloc_descs(-1, 1, hwirq, of_node_to_nid(of_node),
+				      affinity);
 	if (virq <= 0) {
 		pr_debug("-> virq allocation failed\n");
 		return 0;
@@ -676,7 +679,7 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
 
 	return virq;
 }
-EXPORT_SYMBOL_GPL(irq_create_mapping);
+EXPORT_SYMBOL_GPL(irq_create_mapping_affinity);
 
 /**
  * irq_create_strict_mappings() - Map a range of hw irqs to fixed linux irqs
-- 
2.28.0


^ permalink raw reply related

* [PATCH v2 0/2] powerpc/pseries: fix MSI/X IRQ affinity on pseries
From: Laurent Vivier @ 2020-11-25 11:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Laurent Vivier, Michael S . Tsirkin, linux-pci, Greg Kurz,
	linux-block, Paul Mackerras, Marc Zyngier, Thomas Gleixner,
	linuxppc-dev, Christoph Hellwig

With virtio, in multiqueue case, each queue IRQ is normally
bound to a different CPU using the affinity mask.

This works fine on x86_64 but totally ignored on pseries.

This is not obvious at first look because irqbalance is doing
some balancing to improve that.

It appears that the "managed" flag set in the MSI entry
is never copied to the system IRQ entry.

This series passes the affinity mask from rtas_setup_msi_irqs()
to irq_domain_alloc_descs() by adding an affinity parameter to
irq_create_mapping().

The first patch adds the parameter (no functional change), the
second patch passes the actual affinity mask to irq_create_mapping()
in rtas_setup_msi_irqs().

For instance, with 32 CPUs VM and 32 queues virtio-scsi interface:

... -smp 32 -device virtio-scsi-pci,id=virtio_scsi_pci0,num_queues=32

for IRQ in $(grep virtio2-request /proc/interrupts |cut -d: -f1); do
    for file in /proc/irq/$IRQ/ ; do
        echo -n "IRQ: $(basename $file) CPU: " ; cat $file/smp_affinity_list
    done
done

Without the patch (and without irqbalanced)

IRQ: 268 CPU: 0-31
IRQ: 269 CPU: 0-31
IRQ: 270 CPU: 0-31
IRQ: 271 CPU: 0-31
IRQ: 272 CPU: 0-31
IRQ: 273 CPU: 0-31
IRQ: 274 CPU: 0-31
IRQ: 275 CPU: 0-31
IRQ: 276 CPU: 0-31
IRQ: 277 CPU: 0-31
IRQ: 278 CPU: 0-31
IRQ: 279 CPU: 0-31
IRQ: 280 CPU: 0-31
IRQ: 281 CPU: 0-31
IRQ: 282 CPU: 0-31
IRQ: 283 CPU: 0-31
IRQ: 284 CPU: 0-31
IRQ: 285 CPU: 0-31
IRQ: 286 CPU: 0-31
IRQ: 287 CPU: 0-31
IRQ: 288 CPU: 0-31
IRQ: 289 CPU: 0-31
IRQ: 290 CPU: 0-31
IRQ: 291 CPU: 0-31
IRQ: 292 CPU: 0-31
IRQ: 293 CPU: 0-31
IRQ: 294 CPU: 0-31
IRQ: 295 CPU: 0-31
IRQ: 296 CPU: 0-31
IRQ: 297 CPU: 0-31
IRQ: 298 CPU: 0-31
IRQ: 299 CPU: 0-31

With the patch:

IRQ: 265 CPU: 0
IRQ: 266 CPU: 1
IRQ: 267 CPU: 2
IRQ: 268 CPU: 3
IRQ: 269 CPU: 4
IRQ: 270 CPU: 5
IRQ: 271 CPU: 6
IRQ: 272 CPU: 7
IRQ: 273 CPU: 8
IRQ: 274 CPU: 9
IRQ: 275 CPU: 10
IRQ: 276 CPU: 11
IRQ: 277 CPU: 12
IRQ: 278 CPU: 13
IRQ: 279 CPU: 14
IRQ: 280 CPU: 15
IRQ: 281 CPU: 16
IRQ: 282 CPU: 17
IRQ: 283 CPU: 18
IRQ: 284 CPU: 19
IRQ: 285 CPU: 20
IRQ: 286 CPU: 21
IRQ: 287 CPU: 22
IRQ: 288 CPU: 23
IRQ: 289 CPU: 24
IRQ: 290 CPU: 25
IRQ: 291 CPU: 26
IRQ: 292 CPU: 27
IRQ: 293 CPU: 28
IRQ: 294 CPU: 29
IRQ: 295 CPU: 30
IRQ: 299 CPU: 31

This matches what we have on an x86_64 system.

v2: add a wrapper around original irq_create_mapping() with the
    affinity parameter. Update comments

Laurent Vivier (2):
  genirq: add an irq_create_mapping_affinity() function
  powerpc/pseries: pass MSI affinity to irq_create_mapping()

 arch/powerpc/platforms/pseries/msi.c |  3 ++-
 include/linux/irqdomain.h            | 12 ++++++++++--
 kernel/irq/irqdomain.c               | 13 ++++++++-----
 3 files changed, 20 insertions(+), 8 deletions(-)

-- 
2.28.0



^ permalink raw reply

* Re: [PATCH 1/2] powerpc: sstep: Fix load and update instructions
From: Ravi Bangoria @ 2020-11-25 10:09 UTC (permalink / raw)
  To: Sandipan Das
  Cc: Ravi Bangoria, jniethe5, paulus, naveen.n.rao, linuxppc-dev, dja
In-Reply-To: <20201119054139.244083-1-sandipan@linux.ibm.com>


> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index 855457ed09b5..25a5436be6c6 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -2157,11 +2157,15 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>   
>   		case 23:	/* lwzx */
>   		case 55:	/* lwzux */
> +			if (u && (ra == 0 || ra == rd))
> +				return -1;

I guess you also need to split case 23 and 55?

- Ravi

^ permalink raw reply

* Re: C vdso
From: Christophe Leroy @ 2020-11-25  9:21 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev
In-Reply-To: <87tuteyyxi.fsf@mpe.ellerman.id.au>


Quoting Michael Ellerman <mpe@ellerman.id.au>:

> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>> Le 03/11/2020 à 19:13, Christophe Leroy a écrit :
>>> Le 23/10/2020 à 15:24, Michael Ellerman a écrit :
>>>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>>>> Le 24/09/2020 à 15:17, Christophe Leroy a écrit :
>>>>>> Le 17/09/2020 à 14:33, Michael Ellerman a écrit :
>>>>>>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>>>>>>>
>>>>>>>> What is the status with the generic C vdso merge ?
>>>>>>>> In some mail, you mentionned having difficulties getting it working on
>>>>>>>> ppc64, any progress ? What's the problem ? Can I help ?
>>>>>>>
>>>>>>> Yeah sorry I was hoping to get time to work on it but haven't been able
>>>>>>> to.
>>>>>>>
>>>>>>> It's causing crashes on ppc64 ie. big endian.
>>>> ...
>>>>>>
>>>>>> Can you tell what defconfig you are using ? I have been able to  
>>>>>> setup a full glibc PPC64 cross
>>>>>> compilation chain and been able to test it under QEMU with  
>>>>>> success, using Nathan's vdsotest tool.
>>>>>
>>>>> What config are you using ?
>>>>
>>>> ppc64_defconfig + guest.config
>>>>
>>>> Or pseries_defconfig.
>>>>
>>>> I'm using Ubuntu GCC 9.3.0 mostly, but it happens with other  
>>>> toolchains too.
>>>>
>>>> At a minimum we're seeing relocations in the output, which is a problem:
>>>>
>>>>    $ readelf -r build\~/arch/powerpc/kernel/vdso64/vdso64.so
>>>>    Relocation section '.rela.dyn' at offset 0x12a8 contains 8 entries:
>>>>      Offset          Info           Type           Sym. Value     
>>>> Sym. Name + Addend
>>>>    000000001368  000000000016 R_PPC64_RELATIVE                     7c0
>>>>    000000001370  000000000016 R_PPC64_RELATIVE                     9300
>>>>    000000001380  000000000016 R_PPC64_RELATIVE                     970
>>>>    000000001388  000000000016 R_PPC64_RELATIVE                     9300
>>>>    000000001398  000000000016 R_PPC64_RELATIVE                     a90
>>>>    0000000013a0  000000000016 R_PPC64_RELATIVE                     9300
>>>>    0000000013b0  000000000016 R_PPC64_RELATIVE                     b20
>>>>    0000000013b8  000000000016 R_PPC64_RELATIVE                     9300
>>>
>>> Looks like it's due to the OPD and relation between the function()  
>>> and .function()
>>>
>>> By using DOTSYM() in the 'bl' call, that's directly the dot  
>>> function which is called and the OPD is
>>> not used anymore, it can get dropped.
>>>
>>> Now I get .rela.dyn full of 0, don't know if we should drop it explicitely.
>>
>> What is the status now with latest version of CVDSO ? I saw you had  
>> it in next-test for some time,
>> it is not there anymore today.
>
> Still having some trouble with the compat VDSO.
>
> eg:
>
> $ ./vdsotest clock-gettime-monotonic verify
> timestamp obtained from kernel predates timestamp
> previously obtained from libc/vDSO:
> 	[1346, 821441653] (vDSO)
> 	[570, 769440040] (kernel)
>
>
> And similar for all clocks except the coarse ones.
>

Ok, I managed to get the same with QEMU. Looking at the binary, I only  
see an mftb instead of the mftbu/mftb/mftbu triplet.

Fix below. Can you carry it, or do you prefer a full patch from me ?  
The easiest would be either to squash it into [v13,4/8]  
("powerpc/time: Move timebase functions into new asm/timebase.h"), or  
to add it between patch 4 and 5 ?

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index f877a576b338..c3473eb031a3 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1419,7 +1419,7 @@ static inline void msr_check_and_clear(unsigned  
long bits)
  		__msr_check_and_clear(bits);
  }

-#if defined(CONFIG_PPC_CELL) || defined(CONFIG_E500)
+#if defined(__powerpc64__) && (defined(CONFIG_PPC_CELL) ||  
defined(CONFIG_E500))
  #define mftb()		({unsigned long rval;				\
  			asm volatile(					\
  				"90:	mfspr %0, %2;\n"		\
diff --git a/arch/powerpc/include/asm/timebase.h  
b/arch/powerpc/include/asm/timebase.h
index a8eae3adaa91..7b372976f5a5 100644
--- a/arch/powerpc/include/asm/timebase.h
+++ b/arch/powerpc/include/asm/timebase.h
@@ -21,7 +21,7 @@ static inline u64 get_tb(void)
  {
  	unsigned int tbhi, tblo, tbhi2;

-	if (IS_ENABLED(CONFIG_PPC64))
+	if (IS_BUILTIN(__powerpc64__))
  		return mftb();

  	do {


^ permalink raw reply related

* Re: [PATCH v4 10/18] dt-bindings: usb: Convert DWC USB3 bindings to DT schema
From: Serge Semin @ 2020-11-25  8:32 UTC (permalink / raw)
  To: Rob Herring
  Cc: Neil Armstrong, Bjorn Andersson, Pavel Parkhomenko, Kevin Hilman,
	Krzysztof Kozlowski, Andy Gross, Chunfeng Yun, linux-snps-arc,
	devicetree, Mathias Nyman, Martin Blumenstingl, Lad Prabhakar,
	Alexey Malahov, linux-arm-kernel, Roger Quadros, Felipe Balbi,
	Greg Kroah-Hartman, Yoshihiro Shimoda, linux-usb, linux-mips,
	Serge Semin, linux-kernel, Manu Gautam, linuxppc-dev
In-Reply-To: <20201121124228.GA2039998@robh.at.kernel.org>

On Sat, Nov 21, 2020 at 06:42:28AM -0600, Rob Herring wrote:
> On Thu, Nov 12, 2020 at 01:29:46PM +0300, Serge Semin wrote:
> > On Wed, Nov 11, 2020 at 02:14:23PM -0600, Rob Herring wrote:
> > > On Wed, Nov 11, 2020 at 12:08:45PM +0300, Serge Semin wrote:
> > > > DWC USB3 DT node is supposed to be compliant with the Generic xHCI
> > > > Controller schema, but with additional vendor-specific properties, the
> > > > controller-specific reference clocks and PHYs. So let's convert the
> > > > currently available legacy text-based DWC USB3 bindings to the DT schema
> > > > and make sure the DWC USB3 nodes are also validated against the
> > > > usb-xhci.yaml schema.
> > > > 
> > > > Note we have to discard the nodename restriction of being prefixed with
> > > > "dwc3@" string, since in accordance with the usb-hcd.yaml schema USB nodes
> > > > are supposed to be named as "^usb(@.*)".
> > > > 
> > > > Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
> > > > 
> > > > ---
> > > > 
> > > > Changelog v2:
> > > > - Discard '|' from the descriptions, since we don't need to preserve
> > > >   the text formatting in any of them.
> > > > - Drop quotes from around the string constants.
> > > > - Fix the "clock-names" prop description to be referring the enumerated
> > > >   clock-names instead of the ones from the Databook.
> > > > 
> > > > Changelog v3:
> > > > - Apply usb-xhci.yaml# schema only if the controller is supposed to work
> > > >   as either host or otg.
> > > > 
> > > > Changelog v4:
> > > > - Apply usb-drd.yaml schema first. If the controller is configured
> > > >   to work in a gadget mode only, then apply the usb.yaml schema too,
> > > >   otherwise apply the usb-xhci.yaml schema.
> > > > - Discard the Rob'es Reviewed-by tag. Please review the patch one more
> > > >   time.
> > > > ---
> > > >  .../devicetree/bindings/usb/dwc3.txt          | 125 --------
> > > >  .../devicetree/bindings/usb/snps,dwc3.yaml    | 303 ++++++++++++++++++
> > > >  2 files changed, 303 insertions(+), 125 deletions(-)
> > > >  delete mode 100644 Documentation/devicetree/bindings/usb/dwc3.txt
> > > >  create mode 100644 Documentation/devicetree/bindings/usb/snps,dwc3.yaml
> 
> 
> > > > diff --git a/Documentation/devicetree/bindings/usb/snps,dwc3.yaml b/Documentation/devicetree/bindings/usb/snps,dwc3.yaml
> > > > new file mode 100644
> > > > index 000000000000..079617891da6
> > > > --- /dev/null
> > > > +++ b/Documentation/devicetree/bindings/usb/snps,dwc3.yaml
> > > > @@ -0,0 +1,303 @@
> > > > +# SPDX-License-Identifier: GPL-2.0
> > > > +%YAML 1.2
> > > > +---
> > > > +$id: http://devicetree.org/schemas/usb/snps,dwc3.yaml#
> > > > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > > > +
> > > > +title: Synopsys DesignWare USB3 Controller
> > > > +
> > > > +maintainers:
> > > > +  - Felipe Balbi <balbi@kernel.org>
> > > > +
> > > > +description:
> > > > +  This is usually a subnode to DWC3 glue to which it is connected, but can also
> > > > +  be presented as a standalone DT node with an optional vendor-specific
> > > > +  compatible string.
> > > > +
> > 
> > > > +allOf:
> > > > +  - $ref: usb-drd.yaml#
> > > > +  - if:
> > > > +      properties:
> > > > +        dr_mode:
> > > > +          const: peripheral
> 

> Another thing, this evaluates to true if dr_mode is not present. You 
> need to add 'required'?

Right. Will something like this do that?

+ allOf:
+  - $ref: usb-drd.yaml#
+  - if:
+      properties:
+        dr_mode:
+          const: peripheral
+ 
+      required:
+        - dr_mode
+    then:
+      $ref: usb.yaml#
+    else
+      $ref: usb-xhci.yaml#

> If dr_mode is otg, then don't you need to apply 
> both usb.yaml and usb-xhci.yaml?

No I don't. Since there is no peripheral-specific DT schema, then the
only schema any USB-gadget node needs to pass is usb.yaml, which
is already included into the usb-xhci.yaml schema. So for pure OTG devices
with xHCI host and gadget capabilities it's enough to evaluate: allOf:
[$ref: usb-drd.yaml#, $ref: usb-xhci.yaml#].  Please see the
sketch/ASCII-figure below and the following text for details.

-Sergey

> 
> > > > +    then:
> > > > +      $ref: usb.yaml#
> > > 
> > > This part could be done in usb-drd.yaml?
> > 
> > Originally I was thinking about that, but then in order to minimize
> > the properties validation I've decided to split the properties in
> > accordance with the USB controllers functionality:
> > 
> >             +----- USB Gadget/Peripheral Controller. There is no
> >             |      specific schema for the gadgets since there is no
> >             |      common gadget properties (at least I failed to find
> >             |      ones). So the pure gadget controllers need to be
> >             |      validated just against usb.yaml schema.
> >             |
> > usb.yaml <--+-- usb-hcd.yaml - Generic USB Host Controller. The schema
> >                 ^              turns out to include the OHCI/UHCI/EHCI
> >                 |              properties, which AFAICS are also
> >                 |              applicable for the other host controllers.
> >                 |              So any USB host controller node needs to
> >                 |              be validated against this schema.
> >                 |
> >                 +- usb-xhci.yaml - Generic xHCI Host controller.
> > 
> > usb-drd.yaml -- USB Dual-Role/OTG Controllers. It describes the
> >                 DRD/OTG-specific properties and nothing else. So normally
> >                 it should be applied together with one of the
> >                 schemas described above.
> > 
> > So the use-cases of the suggested schemas is following:
> > 
> > 1) USB Controller is pure gadget? Then:
> >    + allOf:
> >    +  - $ref: usb.yaml#
> > 2) USB Controller is pure USB host (including OHCI/UHCI/EHCI)?
> >    + allOf:
> >    +   - $ref: usb-hcd.yaml#
> >    Note this prevents us from fixing all the currently available USB DT
> >    schemas, which already apply the usb-hcd.yaml schema.
> > 3) USB Controller is pure xHCI host controller? Then:
> >    + allOf:
> >    +   - $ref: usb-xhci.yaml#
> > 4) USB Controller is Dual-Role/OTG controller with USB 2.0 host? Then:
> >    + allOf:
> >    +   - $ref: usb-drd.yaml#
> >    +   - $ref: usb-hcd.yaml#
> > 5) USB Controller is Dual-Role/OTG controller with xHCI host? Then:
> >    + allOf:
> >    +   - $ref: usb-drd.yaml#
> >    +   - $ref: usb-xhci.yaml#
> > 6) USB Controller is Dual-Role/OTG controller which can only be a
> >    gadget? Then:
> >    + allOf:
> >    +   - $ref: usb-drd.yaml#
> >    +   - $ref: usb.yaml#
> > 
> > * Don't know really if controllers like in 6)-th really exist. Most
> > * likely they are still internally capable of dual-roling, but due to
> > * some conditions can be used as gadgets only.
> > 
> > It looks a bit complicated, but at least by having such design we'd minimize
> > the number of properties validation.
> > 

[...]

^ permalink raw reply

* Re: [PATCH 1/3] perf/core: Flush PMU internal buffers for per-CPU events
From: Michael Ellerman @ 2020-11-25  8:12 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ian Rogers, Andi Kleen, Peter Zijlstra, linuxppc-dev,
	linux-kernel, Stephane Eranian, Paul Mackerras,
	Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Gabriel Marin,
	Liang, Kan
In-Reply-To: <CAM9d7cg8kYMyPHQK_rhEiYQaSddqqt93=pLVNKJm8Y6F=if9ow@mail.gmail.com>

Namhyung Kim <namhyung@kernel.org> writes:
> Hello,
>
> On Mon, Nov 23, 2020 at 8:00 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>>
>> Namhyung Kim <namhyung@kernel.org> writes:
>> > Hi Peter and Kan,
>> >
>> > (Adding PPC folks)
>> >
>> > On Tue, Nov 17, 2020 at 2:01 PM Namhyung Kim <namhyung@kernel.org> wrote:
>> >>
>> >> Hello,
>> >>
>> >> On Thu, Nov 12, 2020 at 4:54 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>> >> >
>> >> >
>> >> >
>> >> > On 11/11/2020 11:25 AM, Peter Zijlstra wrote:
>> >> > > On Mon, Nov 09, 2020 at 09:49:31AM -0500, Liang, Kan wrote:
>> >> > >
>> >> > >> - When the large PEBS was introduced (9c964efa4330), the sched_task() should
>> >> > >> be invoked to flush the PEBS buffer in each context switch. However, The
>> >> > >> perf_sched_events in account_event() is not updated accordingly. The
>> >> > >> perf_event_task_sched_* never be invoked for a pure per-CPU context. Only
>> >> > >> per-task event works.
>> >> > >>     At that time, the perf_pmu_sched_task() is outside of
>> >> > >> perf_event_context_sched_in/out. It means that perf has to double
>> >> > >> perf_pmu_disable() for per-task event.
>> >> > >
>> >> > >> - The patch 1 tries to fix broken per-CPU events. The CPU context cannot be
>> >> > >> retrieved from the task->perf_event_ctxp. So it has to be tracked in the
>> >> > >> sched_cb_list. Yes, the code is very similar to the original codes, but it
>> >> > >> is actually the new code for per-CPU events. The optimization for per-task
>> >> > >> events is still kept.
>> >> > >>    For the case, which has both a CPU context and a task context, yes, the
>> >> > >> __perf_pmu_sched_task() in this patch is not invoked. Because the
>> >> > >> sched_task() only need to be invoked once in a context switch. The
>> >> > >> sched_task() will be eventually invoked in the task context.
>> >> > >
>> >> > > The thing is; your first two patches rely on PERF_ATTACH_SCHED_CB and
>> >> > > only set that for large pebs. Are you sure the other users (Intel LBR
>> >> > > and PowerPC BHRB) don't need it?
>> >> >
>> >> > I didn't set it for LBR, because the perf_sched_events is always enabled
>> >> > for LBR. But, yes, we should explicitly set the PERF_ATTACH_SCHED_CB
>> >> > for LBR.
>> >> >
>> >> >         if (has_branch_stack(event))
>> >> >                 inc = true;
>> >> >
>> >> > >
>> >> > > If they indeed do not require the pmu::sched_task() callback for CPU
>> >> > > events, then I still think the whole perf_sched_cb_{inc,dec}() interface
>> >> >
>> >> > No, LBR requires the pmu::sched_task() callback for CPU events.
>> >> >
>> >> > Now, The LBR registers have to be reset in sched in even for CPU events.
>> >> >
>> >> > To fix the shorter LBR callstack issue for CPU events, we also need to
>> >> > save/restore LBRs in pmu::sched_task().
>> >> > https://lore.kernel.org/lkml/1578495789-95006-4-git-send-email-kan.liang@linux.intel.com/
>> >> >
>> >> > > is confusing at best.
>> >> > >
>> >> > > Can't we do something like this instead?
>> >> > >
>> >> > I think the below patch may have two issues.
>> >> > - PERF_ATTACH_SCHED_CB is required for LBR (maybe PowerPC BHRB as well) now.
>> >> > - We may disable the large PEBS later if not all PEBS events support
>> >> > large PEBS. The PMU need a way to notify the generic code to decrease
>> >> > the nr_sched_task.
>> >>
>> >> Any updates on this?  I've reviewed and tested Kan's patches
>> >> and they all look good.
>> >>
>> >> Maybe we can talk to PPC folks to confirm the BHRB case?
>> >
>> > Can we move this forward?  I saw patch 3/3 also adds PERF_ATTACH_SCHED_CB
>> > for PowerPC too.  But it'd be nice if ppc folks can confirm the change.
>>
>> Sorry I've read the whole thread, but I'm still not entirely sure I
>> understand the question.
>
> Thanks for your time and sorry about not being clear enough.
>
> We found per-cpu events are not calling pmu::sched_task()
> on context switches.  So PERF_ATTACH_SCHED_CB was
> added to indicate the core logic that it needs to invoke the
> callback.

OK. TBH I've never thought of using branch stack with a per-cpu event,
but I guess you can do it.

I think the same logic applies as LBR, we need to read the BHRB entries
in the context of the task that they were recorded for.

> The patch 3/3 added the flag to PPC (for BHRB) with other
> changes (I think it should be split like in the patch 2/3) and
> want to get ACKs from the PPC folks.

If you post a new version with Maddy's comments addressed then he or I
can ack it.

cheers

^ permalink raw reply

* Re: [PATCH 1/2] genirq: add an affinity parameter to irq_create_mapping()
From: Laurent Vivier @ 2020-11-25  7:30 UTC (permalink / raw)
  To: Thomas Gleixner, linux-kernel
  Cc: Michael S . Tsirkin, linux-pci, linux-block, Paul Mackerras,
	Marc Zyngier, linuxppc-dev, Christoph Hellwig
In-Reply-To: <87h7pel7ng.fsf@nanos.tec.linutronix.de>

On 24/11/2020 23:19, Thomas Gleixner wrote:
> On Tue, Nov 24 2020 at 21:03, Laurent Vivier wrote:
>> This parameter is needed to pass it to irq_domain_alloc_descs().
>>
>> This seems to have been missed by
>> o06ee6d571f0e ("genirq: Add affinity hint to irq allocation")
> 
> No, this has not been missed at all. There was and is no reason to do
> this.
> 
>> This is needed to implement proper support for multiqueue with
>> pseries.
> 
> And because pseries needs this _all_ callers need to be changed?
> 
>>  123 files changed, 171 insertions(+), 146 deletions(-)
> 
> Lots of churn for nothing. 99% of the callers will never need that.
> 
> What's wrong with simply adding an interface which takes that parameter,
> make the existing one an inline wrapper and and leave the rest alone?

Nothing. I'm going to do like that.

Thank you for your comment.

Laurent


^ permalink raw reply

* [PATCH V2] powerpc/perf: Exclude kernel samples while counting events in user space.
From: Athira Rajeev @ 2020-11-25  7:26 UTC (permalink / raw)
  To: mpe; +Cc: maddy, linuxppc-dev

Perf event attritube supports exclude_kernel flag
to avoid sampling/profiling in supervisor state (kernel).
Based on this event attr flag, Monitor Mode Control Register
bit is set to freeze on supervisor state. But sometime (due
to hardware limitation), Sampled Instruction Address
Register (SIAR) locks on to kernel address even when
freeze on supervisor is set. Patch here adds a check to
drop those samples.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
Changes in v2:
- Initial patch was sent along with series:
  https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=209195
  Moving this patch as separate since this change is applicable
  for all PMU platforms.

 arch/powerpc/perf/core-book3s.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 08643cb..40aa117 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2122,6 +2122,17 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
 	perf_event_update_userpage(event);
 
 	/*
+	 * Due to hardware limitation, sometimes SIAR could
+	 * lock on to kernel address even with freeze on
+	 * supervisor state (kernel) is set in MMCR2.
+	 * Check attr.exclude_kernel and address
+	 * to drop the sample in these cases.
+	 */
+	if (event->attr.exclude_kernel && record)
+		if (is_kernel_addr(mfspr(SPRN_SIAR)))
+			record = 0;
+
+	/*
 	 * Finally record data if requested.
 	 */
 	if (record) {
-- 
1.8.3.1


^ permalink raw reply related

* Re: [PATCH v4] dt-bindings: misc: convert fsl,qoriq-mc from txt to YAML
From: Ioana Ciornei @ 2020-11-25  7:17 UTC (permalink / raw)
  To: Laurentiu Tudor
  Cc: devicetree@vger.kernel.org, corbet@lwn.net,
	netdev@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, Leo Li, robh+dt@kernel.org,
	Ionut-robert Aron, kuba@kernel.org, linuxppc-dev@lists.ozlabs.org,
	davem@davemloft.net, linux-arm-kernel@lists.infradead.org
In-Reply-To: <20201123090035.15734-1-laurentiu.tudor@nxp.com>

On Mon, Nov 23, 2020 at 11:00:35AM +0200, Laurentiu Tudor wrote:
> From: Ionut-robert Aron <ionut-robert.aron@nxp.com>
> 
> Convert fsl,qoriq-mc to YAML in order to automate the verification
> process of dts files. In addition, update MAINTAINERS accordingly
> and, while at it, add some missing files.
> 
> Signed-off-by: Ionut-robert Aron <ionut-robert.aron@nxp.com>
> [laurentiu.tudor@nxp.com: update MINTAINERS, updates & fixes in schema]
> Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>

Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com>


> ---
> Changes in v4:
>  - use $ref to point to fsl,qoriq-mc-dpmac binding
> 
> Changes in v3:
>  - dropped duplicated "fsl,qoriq-mc-dpmac" schema and replaced with
>    reference to it
>  - fixed a dt_binding_check warning
> 
> Changes in v2:
>  - fixed errors reported by yamllint
>  - dropped multiple unnecessary quotes
>  - used schema instead of text in description
>  - added constraints on dpmac reg property
> 
>  .../devicetree/bindings/misc/fsl,qoriq-mc.txt | 196 ------------------
>  .../bindings/misc/fsl,qoriq-mc.yaml           | 186 +++++++++++++++++
>  .../ethernet/freescale/dpaa2/overview.rst     |   5 +-
>  MAINTAINERS                                   |   4 +-
>  4 files changed, 193 insertions(+), 198 deletions(-)
>  delete mode 100644 Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
>  create mode 100644 Documentation/devicetree/bindings/misc/fsl,qoriq-mc.yaml
> 
> diff --git a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
> deleted file mode 100644
> index 7b486d4985dc..000000000000
> --- a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
> +++ /dev/null
> @@ -1,196 +0,0 @@
> -* Freescale Management Complex
> -
> -The Freescale Management Complex (fsl-mc) is a hardware resource
> -manager that manages specialized hardware objects used in
> -network-oriented packet processing applications. After the fsl-mc
> -block is enabled, pools of hardware resources are available, such as
> -queues, buffer pools, I/O interfaces. These resources are building
> -blocks that can be used to create functional hardware objects/devices
> -such as network interfaces, crypto accelerator instances, L2 switches,
> -etc.
> -
> -For an overview of the DPAA2 architecture and fsl-mc bus see:
> -Documentation/networking/device_drivers/ethernet/freescale/dpaa2/overview.rst
> -
> -As described in the above overview, all DPAA2 objects in a DPRC share the
> -same hardware "isolation context" and a 10-bit value called an ICID
> -(isolation context id) is expressed by the hardware to identify
> -the requester.
> -
> -The generic 'iommus' property is insufficient to describe the relationship
> -between ICIDs and IOMMUs, so an iommu-map property is used to define
> -the set of possible ICIDs under a root DPRC and how they map to
> -an IOMMU.
> -
> -For generic IOMMU bindings, see
> -Documentation/devicetree/bindings/iommu/iommu.txt.
> -
> -For arm-smmu binding, see:
> -Documentation/devicetree/bindings/iommu/arm,smmu.yaml.
> -
> -The MSI writes are accompanied by sideband data which is derived from the ICID.
> -The msi-map property is used to associate the devices with both the ITS
> -controller and the sideband data which accompanies the writes.
> -
> -For generic MSI bindings, see
> -Documentation/devicetree/bindings/interrupt-controller/msi.txt.
> -
> -For GICv3 and GIC ITS bindings, see:
> -Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml.
> -
> -Required properties:
> -
> -    - compatible
> -        Value type: <string>
> -        Definition: Must be "fsl,qoriq-mc".  A Freescale Management Complex
> -                    compatible with this binding must have Block Revision
> -                    Registers BRR1 and BRR2 at offset 0x0BF8 and 0x0BFC in
> -                    the MC control register region.
> -
> -    - reg
> -        Value type: <prop-encoded-array>
> -        Definition: A standard property.  Specifies one or two regions
> -                    defining the MC's registers:
> -
> -                       -the first region is the command portal for the
> -                        this machine and must always be present
> -
> -                       -the second region is the MC control registers. This
> -                        region may not be present in some scenarios, such
> -                        as in the device tree presented to a virtual machine.
> -
> -    - ranges
> -        Value type: <prop-encoded-array>
> -        Definition: A standard property.  Defines the mapping between the child
> -                    MC address space and the parent system address space.
> -
> -                    The MC address space is defined by 3 components:
> -                       <region type> <offset hi> <offset lo>
> -
> -                    Valid values for region type are
> -                       0x0 - MC portals
> -                       0x1 - QBMAN portals
> -
> -    - #address-cells
> -        Value type: <u32>
> -        Definition: Must be 3.  (see definition in 'ranges' property)
> -
> -    - #size-cells
> -        Value type: <u32>
> -        Definition: Must be 1.
> -
> -Sub-nodes:
> -
> -        The fsl-mc node may optionally have dpmac sub-nodes that describe
> -        the relationship between the Ethernet MACs which belong to the MC
> -        and the Ethernet PHYs on the system board.
> -
> -        The dpmac nodes must be under a node named "dpmacs" which contains
> -        the following properties:
> -
> -            - #address-cells
> -              Value type: <u32>
> -              Definition: Must be present if dpmac sub-nodes are defined and must
> -                          have a value of 1.
> -
> -            - #size-cells
> -              Value type: <u32>
> -              Definition: Must be present if dpmac sub-nodes are defined and must
> -                          have a value of 0.
> -
> -        These nodes must have the following properties:
> -
> -            - compatible
> -              Value type: <string>
> -              Definition: Must be "fsl,qoriq-mc-dpmac".
> -
> -            - reg
> -              Value type: <prop-encoded-array>
> -              Definition: Specifies the id of the dpmac.
> -
> -            - phy-handle
> -              Value type: <phandle>
> -              Definition: Specifies the phandle to the PHY device node associated
> -                          with the this dpmac.
> -Optional properties:
> -
> -- iommu-map: Maps an ICID to an IOMMU and associated iommu-specifier
> -  data.
> -
> -  The property is an arbitrary number of tuples of
> -  (icid-base,iommu,iommu-base,length).
> -
> -  Any ICID i in the interval [icid-base, icid-base + length) is
> -  associated with the listed IOMMU, with the iommu-specifier
> -  (i - icid-base + iommu-base).
> -
> -- msi-map: Maps an ICID to a GIC ITS and associated msi-specifier
> -  data.
> -
> -  The property is an arbitrary number of tuples of
> -  (icid-base,gic-its,msi-base,length).
> -
> -  Any ICID in the interval [icid-base, icid-base + length) is
> -  associated with the listed GIC ITS, with the msi-specifier
> -  (i - icid-base + msi-base).
> -
> -Deprecated properties:
> -
> -    - msi-parent
> -        Value type: <phandle>
> -        Definition: Describes the MSI controller node handling message
> -                    interrupts for the MC. When there is no translation
> -                    between the ICID and deviceID this property can be used
> -                    to describe the MSI controller used by the devices on the
> -                    mc-bus.
> -                    The use of this property for mc-bus is deprecated. Please
> -                    use msi-map.
> -
> -Example:
> -
> -        smmu: iommu@5000000 {
> -               compatible = "arm,mmu-500";
> -               #iommu-cells = <1>;
> -               stream-match-mask = <0x7C00>;
> -               ...
> -        };
> -
> -        gic: interrupt-controller@6000000 {
> -               compatible = "arm,gic-v3";
> -               ...
> -        }
> -        its: gic-its@6020000 {
> -               compatible = "arm,gic-v3-its";
> -               msi-controller;
> -               ...
> -        };
> -
> -        fsl_mc: fsl-mc@80c000000 {
> -                compatible = "fsl,qoriq-mc";
> -                reg = <0x00000008 0x0c000000 0 0x40>,    /* MC portal base */
> -                      <0x00000000 0x08340000 0 0x40000>; /* MC control reg */
> -                /* define map for ICIDs 23-64 */
> -                iommu-map = <23 &smmu 23 41>;
> -                /* define msi map for ICIDs 23-64 */
> -                msi-map = <23 &its 23 41>;
> -                #address-cells = <3>;
> -                #size-cells = <1>;
> -
> -                /*
> -                 * Region type 0x0 - MC portals
> -                 * Region type 0x1 - QBMAN portals
> -                 */
> -                ranges = <0x0 0x0 0x0 0x8 0x0c000000 0x4000000
> -                          0x1 0x0 0x0 0x8 0x18000000 0x8000000>;
> -
> -                dpmacs {
> -                    #address-cells = <1>;
> -                    #size-cells = <0>;
> -
> -                    dpmac@1 {
> -                        compatible = "fsl,qoriq-mc-dpmac";
> -                        reg = <1>;
> -                        phy-handle = <&mdio0_phy0>;
> -                    }
> -                }
> -        };
> diff --git a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.yaml b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.yaml
> new file mode 100644
> index 000000000000..f45e21872e4f
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.yaml
> @@ -0,0 +1,186 @@
> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> +# Copyright 2020 NXP
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/misc/fsl,qoriq-mc.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +maintainers:
> +  - Laurentiu Tudor <laurentiu.tudor@nxp.com>
> +
> +title: Freescale Management Complex
> +
> +description: |
> +  The Freescale Management Complex (fsl-mc) is a hardware resource
> +  manager that manages specialized hardware objects used in
> +  network-oriented packet processing applications. After the fsl-mc
> +  block is enabled, pools of hardware resources are available, such as
> +  queues, buffer pools, I/O interfaces. These resources are building
> +  blocks that can be used to create functional hardware objects/devices
> +  such as network interfaces, crypto accelerator instances, L2 switches,
> +  etc.
> +
> +  For an overview of the DPAA2 architecture and fsl-mc bus see:
> +  Documentation/networking/device_drivers/freescale/dpaa2/overview.rst
> +
> +  As described in the above overview, all DPAA2 objects in a DPRC share the
> +  same hardware "isolation context" and a 10-bit value called an ICID
> +  (isolation context id) is expressed by the hardware to identify
> +  the requester.
> +
> +  The generic 'iommus' property is insufficient to describe the relationship
> +  between ICIDs and IOMMUs, so an iommu-map property is used to define
> +  the set of possible ICIDs under a root DPRC and how they map to
> +  an IOMMU.
> +
> +  For generic IOMMU bindings, see:
> +  Documentation/devicetree/bindings/iommu/iommu.txt.
> +
> +  For arm-smmu binding, see:
> +  Documentation/devicetree/bindings/iommu/arm,smmu.yaml.
> +
> +  MC firmware binary images can be found here:
> +  https://github.com/NXP/qoriq-mc-binary
> +
> +properties:
> +  compatible:
> +    const: fsl,qoriq-mc
> +    description:
> +      A Freescale Management Complex compatible with this binding must have
> +      Block Revision Registers BRR1 and BRR2 at offset 0x0BF8 and 0x0BFC in
> +      the MC control register region.
> +
> +  reg:
> +    minItems: 1
> +    items:
> +      - description: the command portal for this machine
> +      - description:
> +          MC control registers. This region may not be present in some
> +          scenarios, such as in the device tree presented to a virtual
> +          machine.
> +
> +  ranges:
> +    description: |
> +      A standard property. Defines the mapping between the child MC address
> +      space and the parent system address space.
> +
> +      The MC address space is defined by 3 components:
> +                <region type> <offset hi> <offset lo>
> +
> +      Valid values for region type are:
> +                  0x0 - MC portals
> +                  0x1 - QBMAN portals
> +
> +  '#address-cells':
> +    const: 3
> +
> +  '#size-cells':
> +    const: 1
> +
> +  dpmacs:
> +    type: object
> +    description:
> +      The fsl-mc node may optionally have dpmac sub-nodes that describe the
> +      relationship between the Ethernet MACs which belong to the MC and the
> +      Ethernet PHYs on the system board.
> +
> +    properties:
> +      '#address-cells':
> +        const: 1
> +
> +      '#size-cells':
> +        const: 0
> +
> +    patternProperties:
> +      "^(dpmac@[0-9a-f]+)|(ethernet@[0-9a-f]+)$":
> +        type: object
> +
> +        $ref: /schemas/net/fsl,qoriq-mc-dpmac.yaml#
> +
> +  iommu-map:
> +    description: |
> +      Maps an ICID to an IOMMU and associated iommu-specifier data.
> +
> +      The property is an arbitrary number of tuples of
> +      (icid-base, iommu, iommu-base, length).
> +
> +      Any ICID i in the interval [icid-base, icid-base + length) is
> +      associated with the listed IOMMU, with the iommu-specifier
> +      (i - icid-base + iommu-base).
> +
> +  msi-map:
> +    description: |
> +      Maps an ICID to a GIC ITS and associated msi-specifier data.
> +
> +      The property is an arbitrary number of tuples of
> +      (icid-base, gic-its, msi-base, length).
> +
> +      Any ICID in the interval [icid-base, icid-base + length) is
> +      associated with the listed GIC ITS, with the msi-specifier
> +      (i - icid-base + msi-base).
> +
> +  msi-parent:
> +    deprecated: true
> +    description:
> +      Points to the MSI controller node handling message interrupts for the MC.
> +
> +required:
> +  - compatible
> +  - reg
> +  - iommu-map
> +  - msi-map
> +  - ranges
> +  - '#address-cells'
> +  - '#size-cells'
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +    soc {
> +      #address-cells = <2>;
> +      #size-cells = <2>;
> +
> +      smmu: iommu@5000000 {
> +        compatible = "arm,mmu-500";
> +        #global-interrupts = <1>;
> +        #iommu-cells = <1>;
> +        reg = <0 0x5000000 0 0x800000>;
> +        stream-match-mask = <0x7c00>;
> +        interrupts = <0 13 4>,
> +                     <0 146 4>, <0 147 4>,
> +                     <0 148 4>, <0 149 4>,
> +                     <0 150 4>, <0 151 4>,
> +                     <0 152 4>, <0 153 4>;
> +      };
> +
> +      fsl_mc: fsl-mc@80c000000 {
> +        compatible = "fsl,qoriq-mc";
> +        reg = <0x00000008 0x0c000000 0 0x40>,    /* MC portal base */
> +        <0x00000000 0x08340000 0 0x40000>; /* MC control reg */
> +        /* define map for ICIDs 23-64 */
> +        iommu-map = <23 &smmu 23 41>;
> +        /* define msi map for ICIDs 23-64 */
> +        msi-map = <23 &its 23 41>;
> +        #address-cells = <3>;
> +        #size-cells = <1>;
> +
> +        /*
> +        * Region type 0x0 - MC portals
> +        * Region type 0x1 - QBMAN portals
> +        */
> +        ranges = <0x0 0x0 0x0 0x8 0x0c000000 0x4000000
> +                  0x1 0x0 0x0 0x8 0x18000000 0x8000000>;
> +
> +        dpmacs {
> +          #address-cells = <1>;
> +          #size-cells = <0>;
> +
> +          ethernet@1 {
> +            compatible = "fsl,qoriq-mc-dpmac";
> +            reg = <1>;
> +            phy-handle = <&mdio0_phy0>;
> +          };
> +        };
> +      };
> +    };
> diff --git a/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/overview.rst b/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/overview.rst
> index d638b5a8aadd..b3261c5871cc 100644
> --- a/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/overview.rst
> +++ b/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/overview.rst
> @@ -28,6 +28,9 @@ interfaces, an L2 switch, or accelerator instances.
>  The MC provides memory-mapped I/O command interfaces (MC portals)
>  which DPAA2 software drivers use to operate on DPAA2 objects.
>  
> +MC firmware binary images can be found here:
> +https://github.com/NXP/qoriq-mc-binary
> +
>  The diagram below shows an overview of the DPAA2 resource management
>  architecture::
>  
> @@ -338,7 +341,7 @@ Key functions include:
>    a bind of the root DPRC to the DPRC driver
>  
>  The binding for the MC-bus device-tree node can be consulted at
> -*Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt*.
> +*Documentation/devicetree/bindings/misc/fsl,qoriq-mc.yaml*.
>  The sysfs bind/unbind interfaces for the MC-bus can be consulted at
>  *Documentation/ABI/testing/sysfs-bus-fsl-mc*.
>  
> diff --git a/MAINTAINERS b/MAINTAINERS
> index b516bb34a8d5..e0ce6e2b663c 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -14409,9 +14409,11 @@ M:	Stuart Yoder <stuyoder@gmail.com>
>  M:	Laurentiu Tudor <laurentiu.tudor@nxp.com>
>  L:	linux-kernel@vger.kernel.org
>  S:	Maintained
> -F:	Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
> +F:	Documentation/devicetree/bindings/misc/fsl,dpaa2-console.yaml
> +F:	Documentation/devicetree/bindings/misc/fsl,qoriq-mc.yaml
>  F:	Documentation/networking/device_drivers/ethernet/freescale/dpaa2/overview.rst
>  F:	drivers/bus/fsl-mc/
> +F:	include/linux/fsl/mc.h
>  
>  QT1010 MEDIA DRIVER
>  M:	Antti Palosaari <crope@iki.fi>
> -- 
> 2.17.1
> 

^ permalink raw reply

* [PATCH v1 8/8] powerpc/32: Use SPRN_SPRG_SCRATCH2 in exception prologs
From: Christophe Leroy @ 2020-11-25  7:10 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <da51f7ec632825a4ce43290a904aad61648408c0.1606285013.git.christophe.leroy@csgroup.eu>

Use SPRN_SPRG_SCRATCH2 as a third scratch register in
exception prologs in order to simplify them and avoid
data going back and forth from/to CR.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/head_32.h | 22 +++++++---------------
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index 5e3393122d29..a1ee1e12241e 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -40,7 +40,7 @@
 
 .macro EXCEPTION_PROLOG_1 for_rtas=0
 #ifdef CONFIG_VMAP_STACK
-	mr	r11, r1
+	mtspr	SPRN_SPRG_SCRATCH2,r1
 	subi	r1, r1, INT_FRAME_SIZE		/* use r1 if kernel */
 	beq	1f
 	mfspr	r1,SPRN_SPRG_THREAD
@@ -61,15 +61,10 @@
 
 .macro EXCEPTION_PROLOG_2 handle_dar_dsisr=0
 #ifdef CONFIG_VMAP_STACK
-	mtcr	r10
-	li	r10, MSR_KERNEL & ~(MSR_IR | MSR_RI) /* can take DTLB miss */
-	mtmsr	r10
+	li	r11, MSR_KERNEL & ~(MSR_IR | MSR_RI) /* can take DTLB miss */
+	mtmsr	r11
 	isync
-#else
-	stw	r10,_CCR(r11)		/* save registers */
-#endif
-	mfspr	r10, SPRN_SPRG_SCRATCH0
-#ifdef CONFIG_VMAP_STACK
+	mfspr	r11, SPRN_SPRG_SCRATCH2
 	stw	r11,GPR1(r1)
 	stw	r11,0(r1)
 	mr	r11, r1
@@ -78,14 +73,12 @@
 	stw	r1,0(r11)
 	tovirt(r1, r11)		/* set new kernel sp */
 #endif
+	stw	r10,_CCR(r11)		/* save registers */
 	stw	r12,GPR12(r11)
 	stw	r9,GPR9(r11)
-	stw	r10,GPR10(r11)
-#ifdef CONFIG_VMAP_STACK
-	mfcr	r10
-	stw	r10, _CCR(r11)
-#endif
+	mfspr	r10,SPRN_SPRG_SCRATCH0
 	mfspr	r12,SPRN_SPRG_SCRATCH1
+	stw	r10,GPR10(r11)
 	stw	r12,GPR11(r11)
 	mflr	r10
 	stw	r10,_LINK(r11)
@@ -99,7 +92,6 @@
 	stw	r10, _DSISR(r11)
 	.endif
 	lwz	r9, SRR1(r12)
-	andi.	r10, r9, MSR_PR
 	lwz	r12, SRR0(r12)
 #else
 	mfspr	r12,SPRN_SRR0
-- 
2.25.0


^ permalink raw reply related

* [PATCH v1 7/8] powerpc/32s: Use SPRN_SPRG_SCRATCH2 in DSI prolog
From: Christophe Leroy @ 2020-11-25  7:10 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <da51f7ec632825a4ce43290a904aad61648408c0.1606285013.git.christophe.leroy@csgroup.eu>

Use SPRN_SPRG_SCRATCH2 as an alternative scratch register in
the early part of DSI prolog in order to avoid clobbering
SPRN_SPRG_SCRATCH0/1 used by other prologs.

The 603 doesn't like a jump from DataLoadTLBMiss to the 10 nops
that are now in the beginning of DSI exception as a result of
the feature section. To workaround this, add a jump as alternative.
It also avoids fetching 10 nops for nothing.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/reg.h       |  1 +
 arch/powerpc/kernel/head_book3s_32.S | 24 ++++++++----------------
 2 files changed, 9 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index a37ce826f6f6..acd334ee3936 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1203,6 +1203,7 @@
 #ifdef CONFIG_PPC_BOOK3S_32
 #define SPRN_SPRG_SCRATCH0	SPRN_SPRG0
 #define SPRN_SPRG_SCRATCH1	SPRN_SPRG1
+#define SPRN_SPRG_SCRATCH2	SPRN_SPRG2
 #define SPRN_SPRG_603_LRU	SPRN_SPRG4
 #endif
 
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index 51eef7b82f9c..22d670263222 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -288,9 +288,9 @@ MachineCheck:
 	DO_KVM  0x300
 DataAccess:
 #ifdef CONFIG_VMAP_STACK
-	mtspr	SPRN_SPRG_SCRATCH0,r10
-	mfspr	r10, SPRN_SPRG_THREAD
 BEGIN_MMU_FTR_SECTION
+	mtspr	SPRN_SPRG_SCRATCH2,r10
+	mfspr	r10, SPRN_SPRG_THREAD
 	stw	r11, THR11(r10)
 	mfspr	r10, SPRN_DSISR
 	mfcr	r11
@@ -304,19 +304,11 @@ BEGIN_MMU_FTR_SECTION
 .Lhash_page_dsi_cont:
 	mtcr	r11
 	lwz	r11, THR11(r10)
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
-	mtspr	SPRN_SPRG_SCRATCH1,r11
-	mfspr	r11, SPRN_DAR
-	stw	r11, DAR(r10)
-	mfspr	r11, SPRN_DSISR
-	stw	r11, DSISR(r10)
-	mfspr	r11, SPRN_SRR0
-	stw	r11, SRR0(r10)
-	mfspr	r11, SPRN_SRR1		/* check whether user or kernel */
-	stw	r11, SRR1(r10)
-	mfcr	r10
-	andi.	r11, r11, MSR_PR
-
+	mfspr	r10, SPRN_SPRG_SCRATCH2
+MMU_FTR_SECTION_ELSE
+	b	1f
+ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_HPTE_TABLE)
+1:	EXCEPTION_PROLOG_0 handle_dar_dsisr=1
 	EXCEPTION_PROLOG_1
 	b	handle_page_fault_tramp_1
 #else	/* CONFIG_VMAP_STACK */
@@ -760,7 +752,7 @@ fast_hash_page_return:
 	/* DSI */
 	mtcr	r11
 	lwz	r11, THR11(r10)
-	mfspr	r10, SPRN_SPRG_SCRATCH0
+	mfspr	r10, SPRN_SPRG_SCRATCH2
 	RFI
 
 1:	/* ISI */
-- 
2.25.0


^ permalink raw reply related

* [PATCH v1 6/8] powerpc/32: Simplify EXCEPTION_PROLOG_1 macro
From: Christophe Leroy @ 2020-11-25  7:10 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <da51f7ec632825a4ce43290a904aad61648408c0.1606285013.git.christophe.leroy@csgroup.eu>

Make code more readable with a clear CONFIG_VMAP_STACK
section and a clear non CONFIG_VMAP_STACK section.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/head_32.h | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index 7c767765071d..5e3393122d29 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -46,18 +46,16 @@
 	mfspr	r1,SPRN_SPRG_THREAD
 	lwz	r1,TASK_STACK-THREAD(r1)
 	addi	r1, r1, THREAD_SIZE - INT_FRAME_SIZE
+1:
+	mtcrf	0x7f, r1
+	bt	32 - THREAD_ALIGN_SHIFT, stack_overflow
 #else
 	subi	r11, r1, INT_FRAME_SIZE		/* use r1 if kernel */
 	beq	1f
 	mfspr	r11,SPRN_SPRG_THREAD
 	lwz	r11,TASK_STACK-THREAD(r11)
 	addi	r11, r11, THREAD_SIZE - INT_FRAME_SIZE
-#endif
-1:
-	tophys_novmstack r11, r11
-#ifdef CONFIG_VMAP_STACK
-	mtcrf	0x7f, r1
-	bt	32 - THREAD_ALIGN_SHIFT, stack_overflow
+1:	tophys(r11, r11)
 #endif
 .endm
 
-- 
2.25.0


^ permalink raw reply related

* [PATCH v1 5/8] powerpc/603: Use SPRN_SDR1 to store the pgdir phys address
From: Christophe Leroy @ 2020-11-25  7:10 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <da51f7ec632825a4ce43290a904aad61648408c0.1606285013.git.christophe.leroy@csgroup.eu>

On the 603, SDR1 is not used.

In order to free SPRN_SPRG2, use SPRN_SDR1 to store the pgdir
phys addr.

But only some bits of SDR1 can be used (0xffff01ff).
As the pgdir is 4k aligned, rotate it by 4 bits to the left.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/reg.h       |  1 -
 arch/powerpc/kernel/head_book3s_32.S | 31 +++++++++++++++++++++-------
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index f877a576b338..a37ce826f6f6 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1203,7 +1203,6 @@
 #ifdef CONFIG_PPC_BOOK3S_32
 #define SPRN_SPRG_SCRATCH0	SPRN_SPRG0
 #define SPRN_SPRG_SCRATCH1	SPRN_SPRG1
-#define SPRN_SPRG_PGDIR		SPRN_SPRG2
 #define SPRN_SPRG_603_LRU	SPRN_SPRG4
 #endif
 
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index 236a95d163be..51eef7b82f9c 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -457,8 +457,9 @@ InstructionTLBMiss:
 	lis	r1, TASK_SIZE@h		/* check if kernel address */
 	cmplw	0,r1,r3
 #endif
-	mfspr	r2, SPRN_SPRG_PGDIR
+	mfspr	r2, SPRN_SDR1
 	li	r1,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_EXEC
+	rlwinm	r2, r2, 28, 0xfffff000
 #ifdef CONFIG_MODULES
 	bgt-	112f
 	lis	r2, (swapper_pg_dir - PAGE_OFFSET)@ha	/* if kernel address, use */
@@ -519,8 +520,9 @@ DataLoadTLBMiss:
 	mfspr	r3,SPRN_DMISS
 	lis	r1, TASK_SIZE@h		/* check if kernel address */
 	cmplw	0,r1,r3
-	mfspr	r2, SPRN_SPRG_PGDIR
+	mfspr	r2, SPRN_SDR1
 	li	r1, _PAGE_PRESENT | _PAGE_ACCESSED
+	rlwinm	r2, r2, 28, 0xfffff000
 	bgt-	112f
 	lis	r2, (swapper_pg_dir - PAGE_OFFSET)@ha	/* if kernel address, use */
 	addi	r2, r2, (swapper_pg_dir - PAGE_OFFSET)@l	/* kernel page table */
@@ -595,8 +597,9 @@ DataStoreTLBMiss:
 	mfspr	r3,SPRN_DMISS
 	lis	r1, TASK_SIZE@h		/* check if kernel address */
 	cmplw	0,r1,r3
-	mfspr	r2, SPRN_SPRG_PGDIR
+	mfspr	r2, SPRN_SDR1
 	li	r1, _PAGE_RW | _PAGE_DIRTY | _PAGE_PRESENT | _PAGE_ACCESSED
+	rlwinm	r2, r2, 28, 0xfffff000
 	bgt-	112f
 	lis	r2, (swapper_pg_dir - PAGE_OFFSET)@ha	/* if kernel address, use */
 	addi	r2, r2, (swapper_pg_dir - PAGE_OFFSET)@l	/* kernel page table */
@@ -889,9 +892,12 @@ __secondary_start:
 	tophys(r4,r2)
 	addi	r4,r4,THREAD	/* phys address of our thread_struct */
 	mtspr	SPRN_SPRG_THREAD,r4
+BEGIN_MMU_FTR_SECTION
 	lis	r4, (swapper_pg_dir - PAGE_OFFSET)@h
 	ori	r4, r4, (swapper_pg_dir - PAGE_OFFSET)@l
-	mtspr	SPRN_SPRG_PGDIR, r4
+	rlwinm	r4, r4, 4, 0xffff01ff
+	mtspr	SPRN_SDR1, r4
+END_MMU_FTR_SECTION_IFCLR(MMU_FTR_HPTE_TABLE)
 
 	/* enable MMU and jump to start_secondary */
 	li	r4,MSR_KERNEL
@@ -931,11 +937,13 @@ load_up_mmu:
 	tlbia			/* Clear all TLB entries */
 	sync			/* wait for tlbia/tlbie to finish */
 	TLBSYNC			/* ... on all CPUs */
+BEGIN_MMU_FTR_SECTION
 	/* Load the SDR1 register (hash table base & size) */
 	lis	r6,_SDR1@ha
 	tophys(r6,r6)
 	lwz	r6,_SDR1@l(r6)
 	mtspr	SPRN_SDR1,r6
+END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
 
 /* Load the BAT registers with the values set up by MMU_init. */
 	lis	r3,BATS@ha
@@ -991,9 +999,12 @@ start_here:
 	tophys(r4,r2)
 	addi	r4,r4,THREAD	/* init task's THREAD */
 	mtspr	SPRN_SPRG_THREAD,r4
+BEGIN_MMU_FTR_SECTION
 	lis	r4, (swapper_pg_dir - PAGE_OFFSET)@h
 	ori	r4, r4, (swapper_pg_dir - PAGE_OFFSET)@l
-	mtspr	SPRN_SPRG_PGDIR, r4
+	rlwinm	r4, r4, 4, 0xffff01ff
+	mtspr	SPRN_SDR1, r4
+END_MMU_FTR_SECTION_IFCLR(MMU_FTR_HPTE_TABLE)
 
 	/* stack */
 	lis	r1,init_thread_union@ha
@@ -1073,16 +1084,22 @@ _ENTRY(switch_mmu_context)
 	li	r0,NUM_USER_SEGMENTS
 	mtctr	r0
 
-	lwz	r4, MM_PGD(r4)
 #ifdef CONFIG_BDI_SWITCH
 	/* Context switch the PTE pointer for the Abatron BDI2000.
 	 * The PGDIR is passed as second argument.
 	 */
+	lwz	r4, MM_PGD(r4)
 	lis	r5, abatron_pteptrs@ha
 	stw	r4, abatron_pteptrs@l + 0x4(r5)
+#endif
+BEGIN_MMU_FTR_SECTION
+#ifndef CONFIG_BDI_SWITCH
+	lwz	r4, MM_PGD(r4)
 #endif
 	tophys(r4, r4)
-	mtspr	SPRN_SPRG_PGDIR, r4
+	rlwinm	r4, r4, 4, 0xffff01ff
+	mtspr	SPRN_SDR1, r4
+END_MMU_FTR_SECTION_IFCLR(MMU_FTR_HPTE_TABLE)
 	li	r4,0
 	isync
 3:
-- 
2.25.0


^ permalink raw reply related

* [PATCH v1 2/8] powerpc/32s: Don't hash_preload() kernel text
From: Christophe Leroy @ 2020-11-25  7:10 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <da51f7ec632825a4ce43290a904aad61648408c0.1606285013.git.christophe.leroy@csgroup.eu>

We now always map kernel text with BATs. Neither need to preload
hash with kernel text addresses nor ensure they are never evicted.

This is more or less a revert of commit ee4f2ea48674 ("[POWERPC] Fix
32-bit mm operations when not using BATs")

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/mm/book3s32/hash_low.S | 18 +-----------------
 arch/powerpc/mm/book3s32/mmu.c      |  2 +-
 arch/powerpc/mm/mmu_decl.h          |  2 --
 arch/powerpc/mm/pgtable_32.c        |  4 ----
 4 files changed, 2 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/mm/book3s32/hash_low.S b/arch/powerpc/mm/book3s32/hash_low.S
index b2c912e517b9..48415c857d80 100644
--- a/arch/powerpc/mm/book3s32/hash_low.S
+++ b/arch/powerpc/mm/book3s32/hash_low.S
@@ -411,30 +411,14 @@ END_FTR_SECTION_IFCLR(CPU_FTR_NEED_COHERENT)
 	 * and we know there is a definite (although small) speed
 	 * advantage to putting the PTE in the primary PTEG, we always
 	 * put the PTE in the primary PTEG.
-	 *
-	 * In addition, we skip any slot that is mapping kernel text in
-	 * order to avoid a deadlock when not using BAT mappings if
-	 * trying to hash in the kernel hash code itself after it has
-	 * already taken the hash table lock. This works in conjunction
-	 * with pre-faulting of the kernel text.
-	 *
-	 * If the hash table bucket is full of kernel text entries, we'll
-	 * lockup here but that shouldn't happen
 	 */
 
-1:	lis	r4, (next_slot - PAGE_OFFSET)@ha	/* get next evict slot */
+	lis	r4, (next_slot - PAGE_OFFSET)@ha	/* get next evict slot */
 	lwz	r6, (next_slot - PAGE_OFFSET)@l(r4)
 	addi	r6,r6,HPTE_SIZE			/* search for candidate */
 	andi.	r6,r6,7*HPTE_SIZE
 	stw	r6,next_slot@l(r4)
 	add	r4,r3,r6
-	LDPTE	r0,HPTE_SIZE/2(r4)		/* get PTE second word */
-	clrrwi	r0,r0,12
-	lis	r6,etext@h
-	ori	r6,r6,etext@l			/* get etext */
-	tophys(r6,r6)
-	cmpl	cr0,r0,r6			/* compare and try again */
-	blt	1b
 
 #ifndef CONFIG_SMP
 	/* Store PTE in PTEG */
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index 5c60dcade90a..23f60e97196e 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -302,7 +302,7 @@ void __init setbat(int index, unsigned long virt, phys_addr_t phys,
 /*
  * Preload a translation in the hash table
  */
-void hash_preload(struct mm_struct *mm, unsigned long ea)
+static void hash_preload(struct mm_struct *mm, unsigned long ea)
 {
 	pmd_t *pmd;
 
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 1b6d39e9baed..0ad6d476d01d 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -91,8 +91,6 @@ void print_system_hash_info(void);
 
 #ifdef CONFIG_PPC32
 
-void hash_preload(struct mm_struct *mm, unsigned long ea);
-
 extern void mapin_ram(void);
 extern void setbat(int index, unsigned long virt, phys_addr_t phys,
 		   unsigned int size, pgprot_t prot);
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 079159e97bca..6e0083e7f008 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -112,10 +112,6 @@ static void __init __mapin_ram_chunk(unsigned long offset, unsigned long top)
 		ktext = ((char *)v >= _stext && (char *)v < etext) ||
 			((char *)v >= _sinittext && (char *)v < _einittext);
 		map_kernel_page(v, p, ktext ? PAGE_KERNEL_TEXT : PAGE_KERNEL);
-#ifdef CONFIG_PPC_BOOK3S_32
-		if (ktext)
-			hash_preload(&init_mm, v);
-#endif
 		v += PAGE_SIZE;
 		p += PAGE_SIZE;
 	}
-- 
2.25.0


^ permalink raw reply related

* [PATCH v1 3/8] powerpc/32s: Fix an FTR_SECTION_ELSE
From: Christophe Leroy @ 2020-11-25  7:10 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <da51f7ec632825a4ce43290a904aad61648408c0.1606285013.git.christophe.leroy@csgroup.eu>

An FTR_SECTION_ELSE is in the middle of
BEGIN_MMU_FTR_SECTION/ALT_MMU_FTR_SECTION_END_IFSET

Change it to MMU_FTR_SECTION_ELSE

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/head_book3s_32.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index 27767f3e7ec1..236a95d163be 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -332,7 +332,7 @@ BEGIN_MMU_FTR_SECTION
 	rlwinm	r3, r5, 32 - 15, 21, 21		/* DSISR_STORE -> _PAGE_RW */
 	bl	hash_page
 	b	handle_page_fault_tramp_1
-FTR_SECTION_ELSE
+MMU_FTR_SECTION_ELSE
 	b	handle_page_fault_tramp_2
 ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_HPTE_TABLE)
 #endif	/* CONFIG_VMAP_STACK */
-- 
2.25.0


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox