LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [RFC PATCH 7/8] powerpc/purgatory: drop .machine specifier
From: Daniel Axtens @ 2021-02-26  0:17 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: linuxppc-dev, llvmlinux
In-Reply-To: <20210225155836.GG28121@gate.crashing.org>

Segher Boessenkool <segher@kernel.crashing.org> writes:

> On Thu, Feb 25, 2021 at 02:10:05PM +1100, Daniel Axtens wrote:
>> It's ignored by future versions of llvm's integrated assembler (by not -11).
>> I'm not sure what it does for us in gas.
>
> It enables all insns that exist on 620 (the first 64-bit PowerPC CPU).
>
>> --- a/arch/powerpc/purgatory/trampoline_64.S
>> +++ b/arch/powerpc/purgatory/trampoline_64.S
>> @@ -12,7 +12,7 @@
>>  #include <asm/asm-compat.h>
>>  #include <asm/crashdump-ppc64.h>
>>  
>> -	.machine ppc64
>> +//upgrade clang, gets ignored	.machine ppc64
>
> Why delete it if it is ignored?  Why add a cryptic comment?

Sorry, poor form on my part. I think I will give up on having llvm-11
work and target llvm HEAD, which means I can drop this.

>
>
> Segher

^ permalink raw reply

* Re: [RFC PATCH 8/8] powerpc/64/asm: don't reassign labels
From: Daniel Axtens @ 2021-02-26  0:28 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: linuxppc-dev, llvmlinux
In-Reply-To: <20210225160857.GH28121@gate.crashing.org>

Segher Boessenkool <segher@kernel.crashing.org> writes:

> On Thu, Feb 25, 2021 at 02:10:06PM +1100, Daniel Axtens wrote:
>> The assembler really does not like us reassigning things to the same
>> label:
>> 
>> <instantiation>:7:9: error: invalid reassignment of non-absolute variable 'fs_label'
>> 
>> This happens across a bunch of platforms:
>> https://github.com/ClangBuiltLinux/linux/issues/1043
>> https://github.com/ClangBuiltLinux/linux/issues/1008
>> https://github.com/ClangBuiltLinux/linux/issues/920
>> https://github.com/ClangBuiltLinux/linux/issues/1050
>> 
>> There is no hope of getting this fixed in LLVM, so if we want to build
>> with LLVM_IAS, we need to hack around it ourselves.
>> 
>> For us the big problem comes from this:
>> 
>> \#define USE_FIXED_SECTION(sname)				\
>> 	fs_label = start_##sname;				\
>> 	fs_start = sname##_start;				\
>> 	use_ftsec sname;
>> 
>> \#define USE_TEXT_SECTION()
>> 	fs_label = start_text;					\
>> 	fs_start = text_start;					\
>> 	.text
>> 
>> and in particular fs_label.
>
> The "Setting Symbols" super short chapter reads:
>
> "A symbol can be given an arbitrary value by writing a symbol, followed
> by an equals sign '=', followed by an expression.  This is equivalent
> to using the '.set' directive."
>
> And ".set" has
>
> "Set the value of SYMBOL to EXPRESSION.  This changes SYMBOL's value and
> type to conform to EXPRESSION.  If SYMBOL was flagged as external, it
> remains flagged.
>
> You may '.set' a symbol many times in the same assembly provided that
> the values given to the symbol are constants.  Values that are based on
> expressions involving other symbols are allowed, but some targets may
> restrict this to only being done once per assembly.  This is because
> those targets do not set the addresses of symbols at assembly time, but
> rather delay the assignment until a final link is performed.  This
> allows the linker a chance to change the code in the files, changing the
> location of, and the relative distance between, various different
> symbols.
>
> If you '.set' a global symbol, the value stored in the object file is
> the last value stored into it."
>
> So this really should be fixed in clang: it is basic assembler syntax.

No doubt I have explained this poorly.

LLVM does allow some things, this builds fine for example:

.set foo, 8192
addi %r3, %r3, foo
.set foo, 1234
addi %r3, %r3, foo

However, this does not:

a:
.set foo, a
addi %r3, %r3, foo@l
b:
.set foo, b
addi %r3, %r3, foo-a

clang -target ppc64le -integrated-as  foo.s -o foo.o -c
foo.s:5:11: error: invalid reassignment of non-absolute variable 'foo' in '.set' directive
.set foo, b
          ^

gas otoh, has no issues with reassignment:

$ powerpc64-linux-gnu-as foo.s -c -o foo.o
$ powerpc64-linux-gnu-objdump -dr foo.o

foo.o:     file format elf64-powerpc


Disassembly of section .text:

0000000000000000 <a>:
   0:	38 63 00 00 	addi    r3,r3,0
			2: R_PPC64_ADDR16_LO	.text

0000000000000004 <b>:
   4:	38 63 00 04 	addi    r3,r3,4


It seems the llvm assembler only does a single pass, so they're not keen
on trying to support reassigning labels with non-absolute values.

Kind regards,
Daniel

>
> Segher

^ permalink raw reply

* [PATCH v2] crypto/nx: add missing call to of_node_put()
From: Yang Li @ 2021-02-26  1:23 UTC (permalink / raw)
  To: mpe
  Cc: herbert, linux-kernel, paulus, linux-crypto, Yang Li,
	linuxppc-dev, davem

In one of the error paths of the for_each_child_of_node() loop,
add missing call to of_node_put().

Fix the following coccicheck warning:
./drivers/crypto/nx/nx-common-powernv.c:927:1-23: WARNING: Function
"for_each_child_of_node" should have of_node_put() before return around
line 936.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
---

Changes in v2:
-add braces for if

 drivers/crypto/nx/nx-common-powernv.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/nx/nx-common-powernv.c b/drivers/crypto/nx/nx-common-powernv.c
index 13c65de..446f611 100644
--- a/drivers/crypto/nx/nx-common-powernv.c
+++ b/drivers/crypto/nx/nx-common-powernv.c
@@ -932,8 +932,10 @@ static int __init nx_powernv_probe_vas(struct device_node *pn)
 			ret = find_nx_device_tree(dn, chip_id, vasid,
 				NX_CT_GZIP, "ibm,p9-nx-gzip", &ct_gzip);
 
-		if (ret)
+		if (ret) {
+			of_node_put(dn);
 			return ret;
+		}
 	}
 
 	if (!ct_842 || !ct_gzip) {
-- 
1.8.3.1


^ permalink raw reply related

* Re: [PATCH v4 12/14] swiotlb: Add restricted DMA alloc/free support.
From: Claire Chang @ 2021-02-26  4:17 UTC (permalink / raw)
  To: Rob Herring, mpe, Joerg Roedel, Will Deacon, Frank Rowand,
	Konrad Rzeszutek Wilk, boris.ostrovsky, jgross, Christoph Hellwig,
	Marek Szyprowski
  Cc: heikki.krogerus, peterz, grant.likely, paulus, mingo, sstabellini,
	Saravana Kannan, xypron.glpk, Rafael J . Wysocki,
	Bartosz Golaszewski, xen-devel, Thierry Reding, linux-devicetree,
	linuxppc-dev, Nicolas Boichat, Dan Williams, Andy Shevchenko,
	Greg KH, Randy Dunlap, lkml, list@263.net:IOMMU DRIVERS,
	Jim Quinlan, Robin Murphy, bauerman
In-Reply-To: <20210209062131.2300005-13-tientzu@chromium.org>

> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index fd9c1bd183ac..8b77fd64199e 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -836,6 +836,40 @@ late_initcall(swiotlb_create_default_debugfs);
>  #endif
>
>  #ifdef CONFIG_DMA_RESTRICTED_POOL
> +struct page *dev_swiotlb_alloc(struct device *dev, size_t size, gfp_t gfp)
> +{
> +       struct swiotlb *swiotlb;
> +       phys_addr_t tlb_addr;
> +       unsigned int index;
> +
> +       /* dev_swiotlb_alloc can be used only in the context which permits sleeping. */
> +       if (!dev->dev_swiotlb || !gfpflags_allow_blocking(gfp))

Just noticed that !gfpflags_allow_blocking(gfp) shouldn't be here.

Hi Christoph,

Do you think I should fix this and rebase on the latest linux-next
now? I wonder if there are more factor and clean up coming and I
should wait after that.

Thanks,
Claire

^ permalink raw reply

* Re: [PATCH v2 01/37] KVM: PPC: Book3S 64: remove unused kvmppc_h_protect argument
From: Daniel Axtens @ 2021-02-26  5:01 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin
In-Reply-To: <20210225134652.2127648-2-npiggin@gmail.com>

Hi Nick,

> The va argument is not used in the function or set by its asm caller,
> so remove it to be safe.

Huh, so it isn't. I tracked the original implementation down to commit
a8606e20e41a ("KVM: PPC: Handle some PAPR hcalls in the kernel") where
paulus first added the ability to handle it in the kernel - there it
takes a va argument but even then doesn't do anything with it.

ajd also pointed out that we don't pass a va when linux is running as a
guest, and LoPAR does not mention va as an argument.

One small nit: checkpatch is complaining about spaces vs tabs:
ERROR: code indent should use tabs where possible
#25: FILE: arch/powerpc/include/asm/kvm_ppc.h:770:
+                      unsigned long pte_index, unsigned long avpn);$

WARNING: please, no spaces at the start of a line
#25: FILE: arch/powerpc/include/asm/kvm_ppc.h:770:
+                      unsigned long pte_index, unsigned long avpn);$

Once that is resolved,
  Reviewed-by: Daniel Axtens <dja@axtens.net>

Kind regards,
Daniel Axtens

> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  arch/powerpc/include/asm/kvm_ppc.h  | 3 +--
>  arch/powerpc/kvm/book3s_hv_rm_mmu.c | 3 +--
>  2 files changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
> index 8aacd76bb702..9531b1c1b190 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -767,8 +767,7 @@ long kvmppc_h_remove(struct kvm_vcpu *vcpu, unsigned long flags,
>                       unsigned long pte_index, unsigned long avpn);
>  long kvmppc_h_bulk_remove(struct kvm_vcpu *vcpu);
>  long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long flags,
> -                      unsigned long pte_index, unsigned long avpn,
> -                      unsigned long va);
> +                      unsigned long pte_index, unsigned long avpn);
>  long kvmppc_h_read(struct kvm_vcpu *vcpu, unsigned long flags,
>                     unsigned long pte_index);
>  long kvmppc_h_clear_ref(struct kvm_vcpu *vcpu, unsigned long flags,
> diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> index 88da2764c1bb..7af7c70f1468 100644
> --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> @@ -673,8 +673,7 @@ long kvmppc_h_bulk_remove(struct kvm_vcpu *vcpu)
>  }
>  
>  long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long flags,
> -		      unsigned long pte_index, unsigned long avpn,
> -		      unsigned long va)
> +		      unsigned long pte_index, unsigned long avpn)
>  {
>  	struct kvm *kvm = vcpu->kvm;
>  	__be64 *hpte;
> -- 
> 2.23.0

^ permalink raw reply

* Re: [PATCH v2 02/37] KVM: PPC: Book3S HV: Fix CONFIG_SPAPR_TCE_IOMMU=n default hcalls
From: Daniel Axtens @ 2021-02-26  5:21 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin
In-Reply-To: <20210225134652.2127648-3-npiggin@gmail.com>

Hi Nick,

> This config option causes the warning in init_default_hcalls to fire
> because the TCE handlers are in the default hcall list but not
> implemented.

I checked that the TCE handlers are indeed not defined unless
CONFIG_SPAPR_TCE_IOMMU=y, and so I can see how you would hit the
warning.

This seems like the right solution to me.

Reviewed-by: Daniel Axtens <dja@axtens.net>

Kind regards,
Daniel

>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  arch/powerpc/kvm/book3s_hv.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 13bad6bf4c95..895090636295 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -5369,8 +5369,10 @@ static unsigned int default_hcall_list[] = {
>  	H_READ,
>  	H_PROTECT,
>  	H_BULK_REMOVE,
> +#ifdef CONFIG_SPAPR_TCE_IOMMU
>  	H_GET_TCE,
>  	H_PUT_TCE,
> +#endif
>  	H_SET_DABR,
>  	H_SET_XDABR,
>  	H_CEDE,
> -- 
> 2.23.0

^ permalink raw reply

* Re: [PATCH v4 12/14] swiotlb: Add restricted DMA alloc/free support.
From: Christoph Hellwig @ 2021-02-26  5:17 UTC (permalink / raw)
  To: Claire Chang
  Cc: heikki.krogerus, peterz, grant.likely, paulus, Frank Rowand,
	mingo, Marek Szyprowski, sstabellini, Saravana Kannan,
	Joerg Roedel, Rafael J . Wysocki, Christoph Hellwig,
	Bartosz Golaszewski, xen-devel, Thierry Reding, linux-devicetree,
	Will Deacon, Konrad Rzeszutek Wilk, Dan Williams, linuxppc-dev,
	Rob Herring, boris.ostrovsky, Andy Shevchenko, jgross,
	Nicolas Boichat, Greg KH, Randy Dunlap, lkml,
	list@263.net:IOMMU DRIVERS, Jim Quinlan, xypron.glpk,
	Robin Murphy, bauerman
In-Reply-To: <CALiNf298+DLjTK6ALe0mYrRuCP_LtztMGuQQCS90ubDctbS0kw@mail.gmail.com>

On Fri, Feb 26, 2021 at 12:17:50PM +0800, Claire Chang wrote:
> Do you think I should fix this and rebase on the latest linux-next
> now? I wonder if there are more factor and clean up coming and I
> should wait after that.

Here is my preferred plan:

 1) wait for my series to support the min alignment in swiotlb to
    land in Linus tree
 2) I'll resend my series with the further swiotlb cleanup and
    refactoring, which includes a slightly rebased version of your
    patch to add the io_tlb_mem structure
 3) resend your series on top of that as a baseline

This is my current WIP tree for 2:

  http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/swiotlb-struct

^ permalink raw reply

* Re: [PATCH v2 04/37] powerpc/64s: remove KVM SKIP test from instruction breakpoint handler
From: Daniel Axtens @ 2021-02-26  5:44 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin, Fabiano Rosas
In-Reply-To: <20210225134652.2127648-5-npiggin@gmail.com>

Nicholas Piggin <npiggin@gmail.com> writes:

> The code being executed in KVM_GUEST_MODE_SKIP is hypervisor code with
> MSR[IR]=0, so the faults of concern are the d-side ones caused by access
> to guest context by the hypervisor.
>
> Instruction breakpoint interrupts are not a concern here. It's unlikely
> any good would come of causing breaks in this code, but skipping the
> instruction that caused it won't help matters (e.g., skip the mtmsr that
> sets MSR[DR]=0 or clears KVM_GUEST_MODE_SKIP).

I'm not entirely clear on the example here, but the patch makes sense
and I can follow your logic for removing the IKVM_SKIP handler from the
instruction breakpoint exception.

On that basis:
Reviewed-by: Daniel Axtens <dja@axtens.net>

Kind regards,
Daniel

>
> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  arch/powerpc/kernel/exceptions-64s.S | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index a027600beeb1..0097e0676ed7 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -2553,7 +2553,6 @@ EXC_VIRT_NONE(0x5200, 0x100)
>  INT_DEFINE_BEGIN(instruction_breakpoint)
>  	IVEC=0x1300
>  #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
> -	IKVM_SKIP=1
>  	IKVM_REAL=1
>  #endif
>  INT_DEFINE_END(instruction_breakpoint)
> -- 
> 2.23.0

^ permalink raw reply

* Re: [PATCH v2 05/37] KVM: PPC: Book3S HV: Ensure MSR[ME] is always set in guest MSR
From: Daniel Axtens @ 2021-02-26  6:06 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin, Fabiano Rosas
In-Reply-To: <20210225134652.2127648-6-npiggin@gmail.com>

Hi Nick,

>  void kvmppc_set_msr_hv(struct kvm_vcpu *vcpu, u64 msr)
>  {
> +	/*
> +	 * Guest must always run with machine check interrupt
> +	 * enabled.
> +	 */
> +	if (!(msr & MSR_ME))
> +		msr |= MSR_ME;

This 'if' is technically redundant but you mention a future patch warning
on !(msr & MSR_ME) so I'm holding off on any judgement about the 'if' until
I get to that patch :)

The patch seems sane to me, I agree that we don't want guests running with
MSR_ME=0 and kvmppc_set_msr_hv already ensures that the transactional state is
sane so this is another sanity-enforcement in the same sort of vein.

All up:
Reviewed-by: Daniel Axtens <dja@axtens.net>

Kind regards,
Daniel

> +
>  	/*
>  	 * Check for illegal transactional state bit combination
>  	 * and if we find it, force the TS field to a safe state.
> -- 
> 2.23.0

^ permalink raw reply

* [PATCH V2 1/2] powerpc/perf: Infrastructure to support checking of attr.config*
From: Madhavan Srinivasan @ 2021-02-26  6:50 UTC (permalink / raw)
  To: mpe; +Cc: Madhavan Srinivasan, linuxppc-dev

Introduce code to support the checking of attr.config* for
values which are reserved for a given platform.
Performance Monitoring Unit (PMU) configuration registers
have fields that are reserved and specific value to bit field
as reserved. For ex., MMCRA[61:62] is Randome Sampling Mode (SM)
and value of 0b11 to this field is reserved.

Writing a non-zero values in these fields or writing invalid
value to bit fields will have unknown behaviours.

Patch adds a generic call-back function "check_attr_config"
in "struct power_pmu", to be called in event_init to
check for attr.config* values for a given platform.
"check_attr_config" is valid only for raw event type.

Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
---
Changelog v1:
-Fixed commit message and in-code comments

 arch/powerpc/include/asm/perf_event_server.h |  6 ++++++
 arch/powerpc/perf/core-book3s.c              | 14 ++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
index 00e7e671bb4b..dde97d7d9253 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -67,6 +67,12 @@ struct power_pmu {
 	 * the pmu supports extended perf regs capability
 	 */
 	int		capabilities;
+	/*
+	 * Function to check event code for values which are
+	 * reserved. Function takes struct perf_event as input,
+	 * since event code could be spread in attr.config*
+	 */
+	int		(*check_attr_config)(struct perf_event *ev);
 };
 
 /*
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 6817331e22ff..c6eeb4fdc5fd 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1958,6 +1958,20 @@ static int power_pmu_event_init(struct perf_event *event)
 
 		if (ppmu->blacklist_ev && is_event_blacklisted(ev))
 			return -EINVAL;
+		/*
+		 * PMU config registers have fields that are
+		 * reserved and specific value to bit field as reserved.
+		 * For ex., MMCRA[61:62] is Randome Sampling Mode (SM)
+		 * and value of 0b11 to this field is reserved.
+		 *
+		 * This check is needed only for raw event type,
+		 * since tools like fuzzer use raw event type to
+		 * provide randomized event code values for test.
+		 *
+		 */
+		if (ppmu->check_attr_config &&
+		    ppmu->check_attr_config(event))
+			return -EINVAL;
 		break;
 	default:
 		return -ENOENT;
-- 
2.26.2


^ permalink raw reply related

* [PATCH V2 2/2] powerpc/perf: Add platform specific check_attr_config
From: Madhavan Srinivasan @ 2021-02-26  6:50 UTC (permalink / raw)
  To: mpe; +Cc: Madhavan Srinivasan, linuxppc-dev
In-Reply-To: <20210226065025.1254973-1-maddy@linux.ibm.com>

Add platform specific attr.config value checks. Patch
includes checks for both power9 and power10.

Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
---
Changelog v1:
- No changes.

 arch/powerpc/perf/isa207-common.c | 41 +++++++++++++++++++++++++++++++
 arch/powerpc/perf/isa207-common.h |  2 ++
 arch/powerpc/perf/power10-pmu.c   | 13 ++++++++++
 arch/powerpc/perf/power9-pmu.c    | 13 ++++++++++
 4 files changed, 69 insertions(+)

diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
index e4f577da33d8..b255799f5b51 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -694,3 +694,44 @@ int isa207_get_alternatives(u64 event, u64 alt[], int size, unsigned int flags,
 
 	return num_alt;
 }
+
+int isa3_X_check_attr_config(struct perf_event *ev)
+{
+	u64 val, sample_mode;
+	u64 event = ev->attr.config;
+
+	val = (event >> EVENT_SAMPLE_SHIFT) & EVENT_SAMPLE_MASK;
+	sample_mode = val & 0x3;
+
+	/*
+	 * MMCRA[61:62] is Randome Sampling Mode (SM).
+	 * value of 0b11 is reserved.
+	 */
+	if (sample_mode == 0x3)
+		return -1;
+
+	/*
+	 * Check for all reserved value
+	 */
+	switch (val) {
+	case 0x5:
+	case 0x9:
+	case 0xD:
+	case 0x19:
+	case 0x1D:
+	case 0x1A:
+	case 0x1E:
+		return -1;
+	}
+
+	/*
+	 * MMCRA[48:51]/[52:55]) Threshold Start/Stop
+	 * Events Selection.
+	 * 0b11110000/0b00001111 is reserved.
+	 */
+	val = (event >> EVENT_THR_CTL_SHIFT) & EVENT_THR_CTL_MASK;
+	if (((val & 0xF0) == 0xF0) || ((val & 0xF) == 0xF))
+		return -1;
+
+	return 0;
+}
diff --git a/arch/powerpc/perf/isa207-common.h b/arch/powerpc/perf/isa207-common.h
index 1af0e8c97ac7..ae8eaf05efd1 100644
--- a/arch/powerpc/perf/isa207-common.h
+++ b/arch/powerpc/perf/isa207-common.h
@@ -280,4 +280,6 @@ void isa207_get_mem_data_src(union perf_mem_data_src *dsrc, u32 flags,
 							struct pt_regs *regs);
 void isa207_get_mem_weight(u64 *weight);
 
+int isa3_X_check_attr_config(struct perf_event *ev);
+
 #endif
diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
index a901c1348cad..bc64354cab6a 100644
--- a/arch/powerpc/perf/power10-pmu.c
+++ b/arch/powerpc/perf/power10-pmu.c
@@ -106,6 +106,18 @@ static int power10_get_alternatives(u64 event, unsigned int flags, u64 alt[])
 	return num_alt;
 }
 
+static int power10_check_attr_config(struct perf_event *ev)
+{
+	u64 val;
+	u64 event = ev->attr.config;
+
+	val = (event >> EVENT_SAMPLE_SHIFT) & EVENT_SAMPLE_MASK;
+	if (val == 0x10 || isa3_X_check_attr_config(ev))
+		return -1;
+
+	return 0;
+}
+
 GENERIC_EVENT_ATTR(cpu-cycles,			PM_RUN_CYC);
 GENERIC_EVENT_ATTR(instructions,		PM_RUN_INST_CMPL);
 GENERIC_EVENT_ATTR(branch-instructions,		PM_BR_CMPL);
@@ -559,6 +571,7 @@ static struct power_pmu power10_pmu = {
 	.attr_groups		= power10_pmu_attr_groups,
 	.bhrb_nr		= 32,
 	.capabilities           = PERF_PMU_CAP_EXTENDED_REGS,
+	.check_attr_config	= power10_check_attr_config,
 };
 
 int init_power10_pmu(void)
diff --git a/arch/powerpc/perf/power9-pmu.c b/arch/powerpc/perf/power9-pmu.c
index 2a57e93a79dc..b3b9b226d053 100644
--- a/arch/powerpc/perf/power9-pmu.c
+++ b/arch/powerpc/perf/power9-pmu.c
@@ -151,6 +151,18 @@ static int power9_get_alternatives(u64 event, unsigned int flags, u64 alt[])
 	return num_alt;
 }
 
+static int power9_check_attr_config(struct perf_event *ev)
+{
+	u64 val;
+	u64 event = ev->attr.config;
+
+	val = (event >> EVENT_SAMPLE_SHIFT) & EVENT_SAMPLE_MASK;
+	if (val == 0xC || isa3_X_check_attr_config(ev))
+		return -1;
+
+	return 0;
+}
+
 GENERIC_EVENT_ATTR(cpu-cycles,			PM_CYC);
 GENERIC_EVENT_ATTR(stalled-cycles-frontend,	PM_ICT_NOSLOT_CYC);
 GENERIC_EVENT_ATTR(stalled-cycles-backend,	PM_CMPLU_STALL);
@@ -437,6 +449,7 @@ static struct power_pmu power9_pmu = {
 	.attr_groups		= power9_pmu_attr_groups,
 	.bhrb_nr		= 32,
 	.capabilities           = PERF_PMU_CAP_EXTENDED_REGS,
+	.check_attr_config	= power9_check_attr_config,
 };
 
 int init_power9_pmu(void)
-- 
2.26.2


^ permalink raw reply related

* Re: [PATCH next v3 12/15] printk: introduce a kmsg_dump iterator
From: John Ogness @ 2021-02-26  7:59 UTC (permalink / raw)
  To: Petr Mladek
  Cc: linux-hyperv, Sergey Senozhatsky, Douglas Anderson,
	Paul Mackerras, Miquel Raynal, K. Y. Srinivasan, Thomas Meyer,
	Vignesh Raghavendra, Wei Liu, Madhavan Srinivasan,
	Stephen Hemminger, kernel test robot, Anton Vorontsov,
	clang-built-linux, Joel Stanley, Jason Wessel, Anton Ivanov,
	Wei Li, Haiyang Zhang, Ravi Bangoria, Kees Cook, Alistair Popple,
	Jeff Dike, Colin Cross, linux-um, Daniel Thompson, Steven Rostedt,
	Davidlohr Bueso, Nicholas Piggin, Oleg Nesterov, Thomas Gleixner,
	Andy Shevchenko, Jordan Niethe, Michael Kelley, Christophe Leroy,
	Tony Luck, kbuild-all, Pavel Tatashin, linux-kernel,
	Sergey Senozhatsky, Richard Weinberger, kgdb-bugreport, linux-mtd,
	linuxppc-dev, Mike Rapoport
In-Reply-To: <20210225202438.28985-13-john.ogness@linutronix.de>

Hello,

Thank you kernel test robot!

Despite all of my efforts to carefully construct and test this series,
somehome I managed to miss a compile test with CONFIG_MTD_OOPS. That
kmsg_dumper does require the dumper parameter so that it can use
container_of().

I will discuss this with the printk team. But most likely we will just
re-instate the dumper parameter in the callback.

I apologize for the lack of care on my part.

John Ogness

On 2021-02-26, kernel test robot <lkp@intel.com> wrote:
> Hi John,
>
> I love your patch! Yet something to improve:
>
> [auto build test ERROR on next-20210225]
>
> url:    https://github.com/0day-ci/linux/commits/John-Ogness/printk-remove-logbuf_lock/20210226-043457
> base:    7f206cf3ec2bee4621325cfacb2588e5085c07f5
> config: arm-randconfig-r024-20210225 (attached as .config)
> compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project a921aaf789912d981cbb2036bdc91ad7289e1523)
> reproduce (this is a W=1 build):
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # install arm cross compiling tool for clang build
>         # apt-get install binutils-arm-linux-gnueabi
>         # https://github.com/0day-ci/linux/commit/fc7f655cded40fc98ba5304c200e3a01e8291fb4
>         git remote add linux-review https://github.com/0day-ci/linux
>         git fetch --no-tags linux-review John-Ogness/printk-remove-logbuf_lock/20210226-043457
>         git checkout fc7f655cded40fc98ba5304c200e3a01e8291fb4
>         # save the attached .config to linux build tree
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm 
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All errors (new ones prefixed by >>):
>
>>> drivers/mtd/mtdoops.c:277:45: error: use of undeclared identifier 'dumper'
>            struct mtdoops_context *cxt = container_of(dumper,
>                                                       ^
>>> drivers/mtd/mtdoops.c:277:45: error: use of undeclared identifier 'dumper'
>>> drivers/mtd/mtdoops.c:277:45: error: use of undeclared identifier 'dumper'
>    3 errors generated.
>
>
> vim +/dumper +277 drivers/mtd/mtdoops.c
>
> 4b23aff083649e Richard Purdie 2007-05-29  274  
> fc7f655cded40f John Ogness    2021-02-25  275  static void mtdoops_do_dump(enum kmsg_dump_reason reason)
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  276  {
> 2e386e4bac9055 Simon Kagstrom 2009-11-03 @277  	struct mtdoops_context *cxt = container_of(dumper,
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  278  			struct mtdoops_context, dump);
> fc7f655cded40f John Ogness    2021-02-25  279  	struct kmsg_dump_iter iter;
> fc2d557c74dc58 Seiji Aguchi   2011-01-12  280  
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  281  	/* Only dump oopses if dump_oops is set */
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  282  	if (reason == KMSG_DUMP_OOPS && !dump_oops)
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  283  		return;
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  284  
> fc7f655cded40f John Ogness    2021-02-25  285  	kmsg_dump_rewind(&iter);
> fc7f655cded40f John Ogness    2021-02-25  286  
> df92cad8a03e83 John Ogness    2021-02-25  287  	if (test_and_set_bit(0, &cxt->oops_buf_busy))
> df92cad8a03e83 John Ogness    2021-02-25  288  		return;
> fc7f655cded40f John Ogness    2021-02-25  289  	kmsg_dump_get_buffer(&iter, true, cxt->oops_buf + MTDOOPS_HEADER_SIZE,
> e2ae715d66bf4b Kay Sievers    2012-06-15  290  			     record_size - MTDOOPS_HEADER_SIZE, NULL);
> df92cad8a03e83 John Ogness    2021-02-25  291  	clear_bit(0, &cxt->oops_buf_busy);
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  292  
> c1cf1d57d14922 Mark Tomlinson 2020-09-03  293  	if (reason != KMSG_DUMP_OOPS) {
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  294  		/* Panics must be written immediately */
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  295  		mtdoops_write(cxt, 1);
> c1cf1d57d14922 Mark Tomlinson 2020-09-03  296  	} else {
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  297  		/* For other cases, schedule work to write it "nicely" */
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  298  		schedule_work(&cxt->work_write);
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  299  	}
> c1cf1d57d14922 Mark Tomlinson 2020-09-03  300  }
> 4b23aff083649e Richard Purdie 2007-05-29  301  
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply

* Re: [PATCH] perf bench numa: Fix the condition checks for max number of numa nodes
From: Srikar Dronamraju @ 2021-02-26  8:58 UTC (permalink / raw)
  To: Athira Rajeev
  Cc: ravi.bangoria, maddy, peterz, linux-kernel, acme,
	linux-perf-users, jolsa, kjain, linuxppc-dev, kan.liang
In-Reply-To: <1614271802-1503-1-git-send-email-atrajeev@linux.vnet.ibm.com>

* Athira Rajeev <atrajeev@linux.vnet.ibm.com> [2021-02-25 11:50:02]:

> In systems having higher node numbers available like node
> 255, perf numa bench will fail with SIGABORT.
> 
> <<>>
> perf: bench/numa.c:1416: init: Assertion `!(g->p.nr_nodes > 64 || g->p.nr_nodes < 0)' failed.
> Aborted (core dumped)
> <<>>
> 

Looks good to me.

Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply

* Re: [PATCH V2] powerpc/perf: Fix handling of privilege level checks in perf interrupt context
From: Peter Zijlstra @ 2021-02-26  9:35 UTC (permalink / raw)
  To: Athira Rajeev; +Cc: maddy, omosnace, acme, jolsa, linuxppc-dev, kan.liang
In-Reply-To: <1614247839-1428-1-git-send-email-atrajeev@linux.vnet.ibm.com>

On Thu, Feb 25, 2021 at 05:10:39AM -0500, Athira Rajeev wrote:
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 4b4319d8..c8be44c 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -222,7 +222,7 @@ static inline void perf_get_data_addr(struct perf_event *event, struct pt_regs *
>  	if (!(mmcra & MMCRA_SAMPLE_ENABLE) || sdar_valid)
>  		*addrp = mfspr(SPRN_SDAR);
>  
> -	if (is_kernel_addr(mfspr(SPRN_SDAR)) && perf_allow_kernel(&event->attr) != 0)
> +	if (is_kernel_addr(mfspr(SPRN_SDAR)) && event->attr.exclude_kernel)
>  		*addrp = 0;
>  }
>  
> @@ -507,7 +507,7 @@ static void power_pmu_bhrb_read(struct perf_event *event, struct cpu_hw_events *
>  			 * addresses, hence include a check before filtering code
>  			 */
>  			if (!(ppmu->flags & PPMU_ARCH_31) &&
> -				is_kernel_addr(addr) && perf_allow_kernel(&event->attr) != 0)
> +			    is_kernel_addr(addr) && event->attr.exclude_kernel)
>  				continue;
>  
>  			/* Branches are read most recent first (ie. mfbhrb 0 is

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply

* Re: [PATCH v4 12/14] swiotlb: Add restricted DMA alloc/free support.
From: Claire Chang @ 2021-02-26  9:35 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: heikki.krogerus, peterz, grant.likely, paulus, Frank Rowand,
	mingo, Marek Szyprowski, sstabellini, Saravana Kannan,
	Joerg Roedel, Rafael J . Wysocki, Bartosz Golaszewski, xen-devel,
	Thierry Reding, linux-devicetree, Will Deacon,
	Konrad Rzeszutek Wilk, Dan Williams, linuxppc-dev, Rob Herring,
	boris.ostrovsky, Andy Shevchenko, jgross, Nicolas Boichat,
	Greg KH, Randy Dunlap, lkml, list@263.net:IOMMU DRIVERS,
	Jim Quinlan, xypron.glpk, Robin Murphy, bauerman
In-Reply-To: <20210226051740.GB2072@lst.de>

On Fri, Feb 26, 2021 at 1:17 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Fri, Feb 26, 2021 at 12:17:50PM +0800, Claire Chang wrote:
> > Do you think I should fix this and rebase on the latest linux-next
> > now? I wonder if there are more factor and clean up coming and I
> > should wait after that.
>
> Here is my preferred plan:
>
>  1) wait for my series to support the min alignment in swiotlb to
>     land in Linus tree
>  2) I'll resend my series with the further swiotlb cleanup and
>     refactoring, which includes a slightly rebased version of your
>     patch to add the io_tlb_mem structure
>  3) resend your series on top of that as a baseline
>
> This is my current WIP tree for 2:
>
>   http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/swiotlb-struct

Sounds good to me. Thanks!

^ permalink raw reply

* Re: [PATCH] powerpc/sstep: Fix VSX instruction emulation
From: Ravi Bangoria @ 2021-02-26 11:09 UTC (permalink / raw)
  To: Jordan Niethe; +Cc: Ravi Bangoria, linuxppc-dev, bala24
In-Reply-To: <20210225031946.1458206-1-jniethe5@gmail.com>



On 2/25/21 8:49 AM, Jordan Niethe wrote:
> Commit af99da74333b ("powerpc/sstep: Support VSX vector paired storage
> access instructions") added loading and storing 32 word long data into
> adjacent VSRs. However the calculation used to determine if two VSRs
> needed to be loaded/stored inadvertently prevented the load/storing
> taking place for instructions with a data length less than 16 words.
> 
> This causes the emulation to not function correctly, which can be seen
> by the alignment_handler selftest:
> 
> $ ./alignment_handler
> [snip]
> test: test_alignment_handler_vsx_207
> tags: git_version:powerpc-5.12-1-0-g82d2c16b350f
> VSX: 2.07B
>          Doing lxsspx:   PASSED
>          Doing lxsiwax:  FAILED: Wrong Data
>          Doing lxsiwzx:  PASSED
>          Doing stxsspx:  PASSED
>          Doing stxsiwx:  PASSED
> failure: test_alignment_handler_vsx_207
> test: test_alignment_handler_vsx_300
> tags: git_version:powerpc-5.12-1-0-g82d2c16b350f
> VSX: 3.00B
>          Doing lxsd:     PASSED
>          Doing lxsibzx:  PASSED
>          Doing lxsihzx:  PASSED
>          Doing lxssp:    FAILED: Wrong Data
>          Doing lxv:      PASSED
>          Doing lxvb16x:  PASSED
>          Doing lxvh8x:   PASSED
>          Doing lxvx:     PASSED
>          Doing lxvwsx:   FAILED: Wrong Data
>          Doing lxvl:     PASSED
>          Doing lxvll:    PASSED
>          Doing stxsd:    PASSED
>          Doing stxsibx:  PASSED
>          Doing stxsihx:  PASSED
>          Doing stxssp:   PASSED
>          Doing stxv:     PASSED
>          Doing stxvb16x: PASSED
>          Doing stxvh8x:  PASSED
>          Doing stxvx:    PASSED
>          Doing stxvl:    PASSED
>          Doing stxvll:   PASSED
> failure: test_alignment_handler_vsx_300
> [snip]
> 
> Fix this by making sure all VSX instruction emulation correctly
> load/store from the VSRs.
> 
> Fixes: af99da74333b ("powerpc/sstep: Support VSX vector paired storage access instructions")
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>

Yikes!

Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>

^ permalink raw reply

* Latest Git kernel doesn't compile because of the LINUX_VERSION_CODE issue
From: Christian Zigotzky @ 2021-02-26 12:34 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: Darren Stevens, R.T.Dickinson
In-Reply-To: <CAOSf1CHQ=QDwH=J4kLYqboe481poa7EdbC6gzq29W7KYHhn1YQ@mail.gmail.com>

Hello,

I tried to compile the latest Git kernel today. Unfortunately it doesn't 
compile.

Error messages:

   CC      arch/powerpc/kernel/udbg_16550.o
In file included from ./include/linux/stackprotector.h:10:0,
                  from arch/powerpc/kernel/smp.c:35:
./arch/powerpc/include/asm/stackprotector.h: In function 
‘boot_init_stack_canary’:
./arch/powerpc/include/asm/stackprotector.h:29:30: error: expected 
expression before ‘;’ token
   canary ^= LINUX_VERSION_CODE;
                               ^
scripts/Makefile.build:271: recipe for target 
'arch/powerpc/kernel/smp.o' failed
make[2]: *** [arch/powerpc/kernel/smp.o] Error 1

----

drivers/media/cec/core/cec-api.c: In function ‘cec_adap_g_caps’:
drivers/media/cec/core/cec-api.c:85:35: error: expected expression 
before ‘;’ token
   caps.version = LINUX_VERSION_CODE;

----

I have found the bad commit. It's "Merge tag 'kbuild-v5.12' of 
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild" [1]

The changes in the Makefile (a/Makefile) are responsible for the 
compiling errors. [2]

I was able to revert this bad commit. After that it compiled without any 
problems.

Could you please compile the latest Git kernel and confirm this issue?

Thanks,
Christian

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6fbd6cf85a3be127454a1ad58525a3adcf8612ab
[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/Makefile?id=6fbd6cf85a3be127454a1ad58525a3adcf8612ab

^ permalink raw reply

* Re: [PATCH V2 1/2] powerpc/perf: Infrastructure to support checking of attr.config*
From: Paul A. Clarke @ 2021-02-26 14:03 UTC (permalink / raw)
  To: Madhavan Srinivasan; +Cc: linuxppc-dev
In-Reply-To: <20210226065025.1254973-1-maddy@linux.ibm.com>

Another drive-by review... just some minor nits, below...

On Fri, Feb 26, 2021 at 12:20:24PM +0530, Madhavan Srinivasan wrote:
> Introduce code to support the checking of attr.config* for
> values which are reserved for a given platform.
> Performance Monitoring Unit (PMU) configuration registers
> have fields that are reserved and specific value to bit field

I'd reword to "some specific values for bit fields are reserved".

> as reserved. For ex., MMCRA[61:62] is Randome Sampling Mode (SM)

s/Randome/Random/
This occurs here, and below, and in patch 2/2.

> and value of 0b11 to this field is reserved.

s/to/for/

> Writing a non-zero values in these fields or writing invalid
> value to bit fields will have unknown behaviours.

Suggest: Writing non-zero or invalid values in these fields
will have unknown behaviors. (or "behaviours" ;-)

PC

> Patch adds a generic call-back function "check_attr_config"
> in "struct power_pmu", to be called in event_init to
> check for attr.config* values for a given platform.
> "check_attr_config" is valid only for raw event type.
> 
> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
> ---
> Changelog v1:
> -Fixed commit message and in-code comments
> 
>  arch/powerpc/include/asm/perf_event_server.h |  6 ++++++
>  arch/powerpc/perf/core-book3s.c              | 14 ++++++++++++++
>  2 files changed, 20 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
> index 00e7e671bb4b..dde97d7d9253 100644
> --- a/arch/powerpc/include/asm/perf_event_server.h
> +++ b/arch/powerpc/include/asm/perf_event_server.h
> @@ -67,6 +67,12 @@ struct power_pmu {
>  	 * the pmu supports extended perf regs capability
>  	 */
>  	int		capabilities;
> +	/*
> +	 * Function to check event code for values which are
> +	 * reserved. Function takes struct perf_event as input,
> +	 * since event code could be spread in attr.config*
> +	 */
> +	int		(*check_attr_config)(struct perf_event *ev);
>  };
> 
>  /*
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 6817331e22ff..c6eeb4fdc5fd 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -1958,6 +1958,20 @@ static int power_pmu_event_init(struct perf_event *event)
> 
>  		if (ppmu->blacklist_ev && is_event_blacklisted(ev))
>  			return -EINVAL;
> +		/*
> +		 * PMU config registers have fields that are
> +		 * reserved and specific value to bit field as reserved.
> +		 * For ex., MMCRA[61:62] is Randome Sampling Mode (SM)
> +		 * and value of 0b11 to this field is reserved.
> +		 *
> +		 * This check is needed only for raw event type,
> +		 * since tools like fuzzer use raw event type to
> +		 * provide randomized event code values for test.
> +		 *
> +		 */
> +		if (ppmu->check_attr_config &&
> +		    ppmu->check_attr_config(event))
> +			return -EINVAL;
>  		break;
>  	default:
>  		return -ENOENT;
> -- 
> 2.26.2
> 

^ permalink raw reply

* Re: Freeing page tables through RCU
From: Jason Gunthorpe @ 2021-02-26 14:42 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm, linuxppc-dev, linux-kernel, linux-s390
In-Reply-To: <20210225205820.GC2858050@casper.infradead.org>

On Thu, Feb 25, 2021 at 08:58:20PM +0000, Matthew Wilcox wrote:

> I'd like to hear better ideas than this.

You didn't like my suggestion to put a sleepable lock around the
freeing of page tables during flushing?

I still don't see how you convert the sleepable page walkers to use
rcu??

Jason
 

^ permalink raw reply

* Re: [PATCH v2 13/37] KVM: PPC: Book3S HV P9: Move radix MMU switching instructions together
From: Fabiano Rosas @ 2021-02-26 15:56 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin
In-Reply-To: <20210225134652.2127648-14-npiggin@gmail.com>

Nicholas Piggin <npiggin@gmail.com> writes:

> Switching the MMU from radix<->radix mode is tricky particularly as the
> MMU can remain enabled and requires a certain sequence of SPR updates.
> Move these together into their own functions.
>
> This also includes the radix TLB check / flush because it's tied in to
> MMU switching due to tlbiel getting LPID from LPIDR.
>
> (XXX: isync / hwsync synchronisation TBD)

I see bot mtlpidr and mtlpcr requiring a CSI in the ISA. Do you say we
might need more than an isync?

Regardless, I'd expect that to be separate from the refactoring here,
so:

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>

>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  arch/powerpc/kvm/book3s_hv.c | 55 +++++++++++++++++++++---------------
>  1 file changed, 32 insertions(+), 23 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 895090636295..23d6dc04b0e9 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -3440,12 +3440,38 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
>  	trace_kvmppc_run_core(vc, 1);
>  }
>
> +static void switch_mmu_to_guest_radix(struct kvm *kvm, struct kvm_vcpu *vcpu, u64 lpcr)
> +{
> +	struct kvmppc_vcore *vc = vcpu->arch.vcore;
> +	struct kvm_nested_guest *nested = vcpu->arch.nested;
> +	u32 lpid;
> +
> +	lpid = nested ? nested->shadow_lpid : kvm->arch.lpid;
> +
> +	mtspr(SPRN_LPID, lpid);
> +	mtspr(SPRN_LPCR, lpcr);
> +	mtspr(SPRN_PID, vcpu->arch.pid);
> +	isync();
> +
> +	/* TLBIEL must have LPIDR set, so set guest LPID before flushing. */
> +	kvmppc_check_need_tlb_flush(kvm, vc->pcpu, nested);
> +}
> +
> +static void switch_mmu_to_host_radix(struct kvm *kvm, u32 pid)
> +{
> +	mtspr(SPRN_PID, pid);
> +	mtspr(SPRN_LPID, kvm->arch.host_lpid);
> +	mtspr(SPRN_LPCR, kvm->arch.host_lpcr);
> +	isync();
> +}
> +
>  /*
>   * Load up hypervisor-mode registers on P9.
>   */
>  static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>  				     unsigned long lpcr)
>  {
> +	struct kvm *kvm = vcpu->kvm;
>  	struct kvmppc_vcore *vc = vcpu->arch.vcore;
>  	s64 hdec;
>  	u64 tb, purr, spurr;
> @@ -3468,12 +3494,12 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>  	 * P8 and P9 suppress the HDEC exception when LPCR[HDICE] = 0,
>  	 * so set HDICE before writing HDEC.
>  	 */
> -	mtspr(SPRN_LPCR, vcpu->kvm->arch.host_lpcr | LPCR_HDICE);
> +	mtspr(SPRN_LPCR, kvm->arch.host_lpcr | LPCR_HDICE);
>  	isync();
>
>  	hdec = time_limit - mftb();
>  	if (hdec < 0) {
> -		mtspr(SPRN_LPCR, vcpu->kvm->arch.host_lpcr);
> +		mtspr(SPRN_LPCR, kvm->arch.host_lpcr);
>  		isync();
>  		return BOOK3S_INTERRUPT_HV_DECREMENTER;
>  	}
> @@ -3508,7 +3534,6 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>  	}
>  	mtspr(SPRN_CIABR, vcpu->arch.ciabr);
>  	mtspr(SPRN_IC, vcpu->arch.ic);
> -	mtspr(SPRN_PID, vcpu->arch.pid);
>
>  	mtspr(SPRN_PSSCR, vcpu->arch.psscr | PSSCR_EC |
>  	      (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
> @@ -3522,8 +3547,7 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>
>  	mtspr(SPRN_AMOR, ~0UL);
>
> -	mtspr(SPRN_LPCR, lpcr);
> -	isync();
> +	switch_mmu_to_guest_radix(kvm, vcpu, lpcr);
>
>  	kvmppc_xive_push_vcpu(vcpu);
>
> @@ -3562,7 +3586,6 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>  		mtspr(SPRN_DAWR1, host_dawr1);
>  		mtspr(SPRN_DAWRX1, host_dawrx1);
>  	}
> -	mtspr(SPRN_PID, host_pidr);
>
>  	/*
>  	 * Since this is radix, do a eieio; tlbsync; ptesync sequence in
> @@ -3577,9 +3600,6 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>  	if (cpu_has_feature(CPU_FTR_ARCH_31))
>  		asm volatile(PPC_CP_ABORT);
>
> -	mtspr(SPRN_LPID, vcpu->kvm->arch.host_lpid);	/* restore host LPID */
> -	isync();
> -
>  	vc->dpdes = mfspr(SPRN_DPDES);
>  	vc->vtb = mfspr(SPRN_VTB);
>  	mtspr(SPRN_DPDES, 0);
> @@ -3596,7 +3616,8 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>  	}
>
>  	mtspr(SPRN_HDEC, 0x7fffffff);
> -	mtspr(SPRN_LPCR, vcpu->kvm->arch.host_lpcr);
> +
> +	switch_mmu_to_host_radix(kvm, host_pidr);
>
>  	return trap;
>  }
> @@ -4130,7 +4151,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>  {
>  	struct kvm_run *run = vcpu->run;
>  	int trap, r, pcpu;
> -	int srcu_idx, lpid;
> +	int srcu_idx;
>  	struct kvmppc_vcore *vc;
>  	struct kvm *kvm = vcpu->kvm;
>  	struct kvm_nested_guest *nested = vcpu->arch.nested;
> @@ -4204,13 +4225,6 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>  	vc->vcore_state = VCORE_RUNNING;
>  	trace_kvmppc_run_core(vc, 0);
>
> -	if (cpu_has_feature(CPU_FTR_HVMODE)) {
> -		lpid = nested ? nested->shadow_lpid : kvm->arch.lpid;
> -		mtspr(SPRN_LPID, lpid);
> -		isync();
> -		kvmppc_check_need_tlb_flush(kvm, pcpu, nested);
> -	}
> -
>  	guest_enter_irqoff();
>
>  	srcu_idx = srcu_read_lock(&kvm->srcu);
> @@ -4229,11 +4243,6 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>
>  	srcu_read_unlock(&kvm->srcu, srcu_idx);
>
> -	if (cpu_has_feature(CPU_FTR_HVMODE)) {
> -		mtspr(SPRN_LPID, kvm->arch.host_lpid);
> -		isync();
> -	}
> -
>  	set_irq_happened(trap);
>
>  	kvmppc_set_host_core(pcpu);

^ permalink raw reply

* Re: Freeing page tables through RCU
From: Matthew Wilcox @ 2021-02-26 16:03 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: linux-mm, linuxppc-dev, linux-kernel, linux-s390
In-Reply-To: <20210226144200.GV2643399@ziepe.ca>

On Fri, Feb 26, 2021 at 10:42:00AM -0400, Jason Gunthorpe wrote:
> On Thu, Feb 25, 2021 at 08:58:20PM +0000, Matthew Wilcox wrote:
> 
> > I'd like to hear better ideas than this.
> 
> You didn't like my suggestion to put a sleepable lock around the
> freeing of page tables during flushing?
> 
> I still don't see how you convert the sleepable page walkers to use
> rcu??

I don't want to convert the sleepable ones to use RCU ... I want to
convert the non-sleeping ones to use RCU.  A page_table_free_lock might
work, but it might have its own problems later (eg a sleeping lock can't
be acquired under RCU or spinlock, and it can't be a spinlock because
it'd have to be held while we wait for IPIs).

I think it would solve my immediate problem, and I wonder if it might
solve some other problems ...

^ permalink raw reply

* Re: Latest Git kernel doesn't compile because of the LINUX_VERSION_CODE issue
From: Christophe Leroy @ 2021-02-26 16:10 UTC (permalink / raw)
  To: Christian Zigotzky, Michael Ellerman, linuxppc-dev
  Cc: Darren Stevens, R.T.Dickinson
In-Reply-To: <99f6d05a-d431-7444-bb0a-180c042c2afd@xenosoft.de>



Le 26/02/2021 à 13:34, Christian Zigotzky a écrit :
> Hello,
> 
> I tried to compile the latest Git kernel today. Unfortunately it doesn't compile.

I have no such problem with latest git kernel.

Christophe

> 
> Error messages:
> 
>    CC      arch/powerpc/kernel/udbg_16550.o
> In file included from ./include/linux/stackprotector.h:10:0,
>                   from arch/powerpc/kernel/smp.c:35:
> ./arch/powerpc/include/asm/stackprotector.h: In function ‘boot_init_stack_canary’:
> ./arch/powerpc/include/asm/stackprotector.h:29:30: error: expected expression before ‘;’ token
>    canary ^= LINUX_VERSION_CODE;
>                                ^
> scripts/Makefile.build:271: recipe for target 'arch/powerpc/kernel/smp.o' failed
> make[2]: *** [arch/powerpc/kernel/smp.o] Error 1
> 
> ----
> 
> drivers/media/cec/core/cec-api.c: In function ‘cec_adap_g_caps’:
> drivers/media/cec/core/cec-api.c:85:35: error: expected expression before ‘;’ token
>    caps.version = LINUX_VERSION_CODE;
> 
> ----
> 
> I have found the bad commit. It's "Merge tag 'kbuild-v5.12' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild" [1]
> 
> The changes in the Makefile (a/Makefile) are responsible for the compiling errors. [2]
> 
> I was able to revert this bad commit. After that it compiled without any problems.
> 
> Could you please compile the latest Git kernel and confirm this issue?
> 
> Thanks,
> Christian
> 
> [1] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6fbd6cf85a3be127454a1ad58525a3adcf8612ab 
> 
> [2] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/Makefile?id=6fbd6cf85a3be127454a1ad58525a3adcf8612ab 
> 

^ permalink raw reply

* Re: Freeing page tables through RCU
From: Gerald Schaefer @ 2021-02-26 16:12 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-s390, Janosch Frank, Christian Borntraeger, Heiko Carstens,
	linux-kernel, linux-mm, Alexander Gordeev, Claudio Imbrenda,
	linuxppc-dev
In-Reply-To: <20210225205820.GC2858050@casper.infradead.org>

On Thu, 25 Feb 2021 20:58:20 +0000
Matthew Wilcox <willy@infradead.org> wrote:

> In order to walk the page tables without the mmap semaphore, it must
> be possible to prevent them from being freed and reused (eg if munmap()
> races with viewing /proc/$pid/smaps).
> 
> There is various commentary within the mm on how to prevent this.  One way
> is to disable interrupts, relying on that to block rcu_sched or IPIs.
> I don't think the RT people are terribly happy about reading a proc file
> disabling interrupts, and it doesn't work for architectures that free
> page tables directly instead of batching them into an rcu_sched (because
> the IPI may not be sent to this CPU if the task has never run on it).
> 
> See "Fast GUP" in mm/gup.c
> 
> Ideally, I'd like rcu_read_lock() to delay page table reuse.  This is
> close to trivial for architectures which use entire pages or multiple
> pages for levels of their page tables as we can use the rcu_head embedded
> in struct page to queue the page for RCU.
> 
> s390 and powerpc are the only two architectures I know of that have
> levels of their page table that are smaller than their PAGE_SIZE.
> I'd like to discuss options.  There may be a complicated scheme that
> allows partial pages to be freed via RCU, but I have something simpler
> in mind.  For powerpc in particular, it can have a PAGE_SIZE of 64kB
> and then the MMU wants to see 4kB entries in the PMD.  I suggest that
> instead of allocating each 4kB entry individually, we allocate a 64kB
> page and fill in 16 consecutive PMDs.  This could cost a bit more memory
> (although if you've asked for a CONFIG_PAGE_SIZE of 64kB, you presumably
> don't care too much about it), but it'll make future page faults cheaper
> (as the PMDs will already be present, assuming you have good locality
> of reference).
> 
> I'd like to hear better ideas than this.

Some background on the situation for s390: The architecture defines
an 8 bit pagetable index, so we have 256 entries in a 2 KB pagetable,
but PAGE_SIZE is 4 KB. pte_alloc(_one) will use alloc_page() to allocate
a full 4 KB page, and then do some housekeeping to maintain a per-mm list
of such 4 KB pages, which will contain either one or two 2 KB pagetable
fragments.

This is also the reason why pgtable_t on s390 is not pointing to the
struct page of the (4 KB) page containing a 2 KB pagetable fragment, but
rather to the 2 KB pagetable itself.

I see at least two issues here, with using rcu_head embedded in the
struct page (for a 4 KB page):

1) There might be two 2 KB pagetables present in that 4 KB page,
and the rcu_head would affect both. Not sure if this would really be
a problem, because we already have a similar situation with the split
ptlock embedded in struct page, which also might lock two 2 KB pagetables,
i.e. more than necessary. It still is far less "over-locking" than
using mm->page_table_lock, and the move_pte() code e.g. takes care
to avoid a deadlock if src and dst ptlocks happen to be on the
same page.

So, a similar "over-locking" might also be possible and acceptable
for the rcu_head approach, but I do not really understand if that could
have some deadlock or other unwanted side-effects.

2) The "housekeeping" of our 2 KB pagetable fragments uses page->lru
to maintain the per-mm list. It also (mis)uses page->_refcount to mark
which 2 KB half is used/free, but that should not be an issue I guess.
Using page->lru will be an issue though. IIUC, then page->rcu_head will
overlay with page->lru, so using page->rcu_head for pagetable pages
on s390 would conflict with our page->lru usage for such pagetable pages.

I do not really see how that could be fixed, maybe we could find and
re-use other struct page members for our 2 KB fragment list. Also, for
kvm, there seem to be even more users of page->lru for pagetable pages,
in arch/s390/mm/gmap.c. Not sure though if those would actually also
affect "regular" pagetable walks, or if they are somehow independent.
But if we'd find some new list home for the 2 KB fragments, then that
could probably also be used for the gmap stuff.

^ permalink raw reply

* Re: Freeing page tables through RCU
From: Jason Gunthorpe @ 2021-02-26 16:21 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm, linuxppc-dev, linux-kernel, linux-s390
In-Reply-To: <20210226160354.GE2723601@casper.infradead.org>

On Fri, Feb 26, 2021 at 04:03:54PM +0000, Matthew Wilcox wrote:
> On Fri, Feb 26, 2021 at 10:42:00AM -0400, Jason Gunthorpe wrote:
> > On Thu, Feb 25, 2021 at 08:58:20PM +0000, Matthew Wilcox wrote:
> > 
> > > I'd like to hear better ideas than this.
> > 
> > You didn't like my suggestion to put a sleepable lock around the
> > freeing of page tables during flushing?
> > 
> > I still don't see how you convert the sleepable page walkers to use
> > rcu??
> 
> I don't want to convert the sleepable ones to use RCU ... I want to
> convert the non-sleeping ones to use RCU.  

Why? Convert them to use the normal page table spinlocks?

It makes everything much more understandable if it is locked properly.

The lockless walks should be reserved for special places because they
are actually hard to do properly.

> A page_table_free_lock might work, but it might have its own
> problems later (eg a sleeping lock can't be acquired under RCU or
> spinlock, and it can't be a spinlock because it'd have to be held
> while we wait for IPIs).

The mmap_sem today is serving the function of the page_table_free_lock
idea, so I think a sleepable lock is already proven to work?

Jason

^ permalink raw reply

* Re: [PATCH v4 2/5] ibmvfc: fix invalid sub-CRQ handles after hard reset
From: Brian King @ 2021-02-26 16:39 UTC (permalink / raw)
  To: Tyrel Datwyler, james.bottomley
  Cc: brking, linuxppc-dev, linux-scsi, martin.petersen, linux-kernel
In-Reply-To: <20210225215057.23020-3-tyreld@linux.ibm.com>

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center


^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox