* [PATCH 1/1] KVM/PPC: Fix typo on H_DISABLE_AND_GET hcall
From: Leonardo Bras @ 2020-07-07 0:48 UTC (permalink / raw)
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Bharata B Rao, Leonardo Bras, Vaibhav Jain, Sukadev Bhattiprolu
Cc: linuxppc-dev, linux-kernel, kvm-ppc
On PAPR+ the hcall() on 0x1B0 is called H_DISABLE_AND_GET, but got
defined as H_DISABLE_AND_GETC instead.
This define was introduced with a typo in commit <b13a96cfb055>
("[PATCH] powerpc: Extends HCALL interface for InfiniBand usage"), and was
later used without having the typo noticed.
Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
---
arch/powerpc/include/asm/hvcall.h | 2 +-
arch/powerpc/kvm/trace_hv.h | 2 +-
tools/perf/arch/powerpc/util/book3s_hcalls.h | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index e90c073e437e..d8ada9c7ec78 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -237,7 +237,7 @@
#define H_CREATE_RPT 0x1A4
#define H_REMOVE_RPT 0x1A8
#define H_REGISTER_RPAGES 0x1AC
-#define H_DISABLE_AND_GETC 0x1B0
+#define H_DISABLE_AND_GET 0x1B0
#define H_ERROR_DATA 0x1B4
#define H_GET_HCA_INFO 0x1B8
#define H_GET_PERF_COUNT 0x1BC
diff --git a/arch/powerpc/kvm/trace_hv.h b/arch/powerpc/kvm/trace_hv.h
index 4a61a971c34e..830a126e095d 100644
--- a/arch/powerpc/kvm/trace_hv.h
+++ b/arch/powerpc/kvm/trace_hv.h
@@ -89,7 +89,7 @@
{H_CREATE_RPT, "H_CREATE_RPT"}, \
{H_REMOVE_RPT, "H_REMOVE_RPT"}, \
{H_REGISTER_RPAGES, "H_REGISTER_RPAGES"}, \
- {H_DISABLE_AND_GETC, "H_DISABLE_AND_GETC"}, \
+ {H_DISABLE_AND_GET, "H_DISABLE_AND_GET"}, \
{H_ERROR_DATA, "H_ERROR_DATA"}, \
{H_GET_HCA_INFO, "H_GET_HCA_INFO"}, \
{H_GET_PERF_COUNT, "H_GET_PERF_COUNT"}, \
diff --git a/tools/perf/arch/powerpc/util/book3s_hcalls.h b/tools/perf/arch/powerpc/util/book3s_hcalls.h
index 54cfa0530e86..488f4339b83c 100644
--- a/tools/perf/arch/powerpc/util/book3s_hcalls.h
+++ b/tools/perf/arch/powerpc/util/book3s_hcalls.h
@@ -84,7 +84,7 @@
{0x1a4, "H_CREATE_RPT"}, \
{0x1a8, "H_REMOVE_RPT"}, \
{0x1ac, "H_REGISTER_RPAGES"}, \
- {0x1b0, "H_DISABLE_AND_GETC"}, \
+ {0x1b0, "H_DISABLE_AND_GET"}, \
{0x1b4, "H_ERROR_DATA"}, \
{0x1b8, "H_GET_HCA_INFO"}, \
{0x1bc, "H_GET_PERF_COUNT"}, \
--
2.25.4
^ permalink raw reply related
* Re: [PATCH v5 13/26] powerpc/book3s64/pkeys: Enable MMU_FTR_PKEY
From: Michael Ellerman @ 2020-07-07 1:02 UTC (permalink / raw)
To: Aneesh Kumar K.V, linuxppc-dev; +Cc: linuxram, bauerman
In-Reply-To: <6fa2a91d-c11c-2be4-2057-15f86d4dd39c@linux.ibm.com>
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
>>>
>>>> /*
>>>> * Let's assume 32 pkeys on P8 bare metal, if its not
>>>> defined by device
>>>> * tree. We make this exception since skiboot forgot to
>>>> expose this
>>>> * property on power8.
>>>> */
>>>> if (!firmware_has_feature(FW_FEATURE_LPAR) &&
>>>> - cpu_has_feature(CPU_FTRS_POWER8))
>>>> + early_cpu_has_feature(CPU_FTRS_POWER8))
>>>> pkeys_total = 32;
>>>
>>> That's not how cpu_has_feature() works, we'll need to fix that.
>>>
>>> cheers
>>>
>>
>> I did a separate patch to handle that which switch the above to
>>
>> /*
>> * Let's assume 32 pkeys on P8/P9 bare metal, if its not
>> defined by device
>> * tree. We make this exception since skiboot forgot to expose
>> this
>> * property on power8/9.
>> */
>> if (!firmware_has_feature(FW_FEATURE_LPAR) &&
>> (early_cpu_has_feature(CPU_FTR_ARCH_207S) ||
>> early_cpu_has_feature(CPU_FTR_ARCH_300)))
>> pkeys_total = 32;
>>
>
> We should do a PVR check here i guess.
Yes, the ARCH features don't work because P10 will have both of those
enabled.
> ret = of_scan_flat_dt(dt_scan_storage_keys, &pkeys_total);
> if (ret == 0) {
>
> /*
> * Let's assume 32 pkeys on P8/P9 bare metal, if its not defined by device
> * tree. We make this exception since skiboot forgot to expose this
> * property on power8/9.
Well, it does expose it on Power9 after v6.6, but most P9 systems have
an older firmware than that.
And also the kernel has been enabling that on Power9 because of the
CPU_FTRS_POWER8 bug, so this is not actually a behaviour change.
> */
> if (!firmware_has_feature(FW_FEATURE_LPAR) &&
> (pvr_version_is(PVR_POWER8) || pvr_version_is(PVR_POWER9)))
> pkeys_total = 32;
> }
You need PVR_POWER8E and PVR_POWER8NVL as well.
cheers
^ permalink raw reply
* Re: [PATCH v5 18/26] powerpc/book3s64/keys/kuap: Reset AMR/IAMR values on kexec
From: Michael Ellerman @ 2020-07-07 1:07 UTC (permalink / raw)
To: Aneesh Kumar K.V, linuxppc-dev; +Cc: linuxram, bauerman
In-Reply-To: <2739323b-51c0-c713-0986-aa0bdaab3184@linux.ibm.com>
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> On 7/6/20 5:59 PM, Michael Ellerman wrote:
>> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
>>> As we kexec across kernels that use AMR/IAMR for different purposes
>>> we need to ensure that new kernels get kexec'd with a reset value
>>> of AMR/IAMR. For ex: the new kernel can use key 0 for kernel mapping and the old
>>> AMR value prevents access to key 0.
>>>
>>> This patch also removes reset if IAMR and AMOR in kexec_sequence. Reset of AMOR
>>> is not needed and the IAMR reset is partial (it doesn't do the reset
>>> on secondary cpus) and is redundant with this patch.
>>
>> I like the idea of cleaning this stuff up.
>>
>> But I think tying it into kup/kuep/kuap and the MMU features and ifdefs
>> and so on is overly complicated.
>>
>> We should just have eg:
>>
>> void reset_sprs(void)
>> {
>> if (early_cpu_has_feature(CPU_FTR_ARCH_206)) {
>> mtspr(SPRN_AMR, 0);
>> mtspr(SPRN_UAMOR, 0);
>> }
>>
>> if (early_cpu_has_feature(CPU_FTR_ARCH_207S)) {
>> mtspr(SPRN_IAMR, 0);
>> }
>> }
>>
>> And call that from the kexec paths.
>
> Will fix. Should that be early_cpu_has_feature()? cpu_has_feature()
> should work there right?
Yeah I guess. I was thinking if we crashed before code patching was
done, but if that happens we can't kdump anyway. So I'm probably being
over paranoid.
cheers
^ permalink raw reply
* Re: [PATCH v5 19/26] powerpc/book3s64/kuap: Move KUAP related function outside radix
From: Michael Ellerman @ 2020-07-07 1:22 UTC (permalink / raw)
To: Aneesh Kumar K.V, linuxppc-dev; +Cc: linuxram, bauerman
In-Reply-To: <d09d0150-860a-1e6a-1d4a-01ae8d7e97b9@linux.ibm.com>
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> On 7/6/20 6:11 PM, Michael Ellerman wrote:
>> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
>>> The next set of patches adds support for kuap with hash translation.
>>
>> That's no longer true of this series.
>>
>>> diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
>>> index 0d72c0246052..e93b65a0e6e7 100644
>>> --- a/arch/powerpc/mm/book3s64/pkeys.c
>>> +++ b/arch/powerpc/mm/book3s64/pkeys.c
>>> @@ -199,6 +200,24 @@ void __init pkey_early_init_devtree(void)
>>> return;
>>> }
>>>
>>> +#ifdef CONFIG_PPC_KUAP
>>> +void __init setup_kuap(bool disabled)
>>> +{
>>> + if (disabled || !early_radix_enabled())
>>> + return;
>>> +
>>> + if (smp_processor_id() == boot_cpuid) {
>>> + pr_info("Activating Kernel Userspace Access Prevention\n");
>>> + cur_cpu_spec->mmu_features |= MMU_FTR_RADIX_KUAP;
>>> + }
>>> +
>>> + /* Make sure userspace can't change the AMR */
>>> + mtspr(SPRN_UAMOR, 0);
>>> + mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
>>> + isync();
>>> +}
>>> +#endif
>>
>> This makes this code depend on CONFIG_PPC_MEM_KEYS=y, which it didn't
>> used to.
>>
>> That risks breaking people's existing .configs, if they have
>> PPC_MEM_KEYS=n they will now lose KUAP.
>>
>> And I'm not convinced the two features should be tied together, at least
>> at the user-visible Kconfig level.
>
> That simplifies the addition of hash kuap a lot. Especially in the
> exception entry and return paths. I did try to consider them as
> independent options. But then the feature fixup in asm code gets
> unnecessarily complicated. Also the UAMOR handling also get complicated.
Yep. I'm OK if most of the code is enabled for either/both options, but
I think the user-visible options should not depend on each other.
So something like:
config PPC_PKEY
def_bool y
depends on PPC_MEM_KEYS || PPC_KUAP
And then the low-level code is guarded by PPC_PKEY.
Or we just say that making MEM_KEYS configurable is not worth the
added complexity and turn it on always.
cheers
^ permalink raw reply
* Re: [PATCH v5 10/26] powerpc/book3s64/pkeys: Convert pkey_total to max_pkey
From: Michael Ellerman @ 2020-07-07 1:26 UTC (permalink / raw)
To: Aneesh Kumar K.V, linuxppc-dev; +Cc: linuxram, bauerman
In-Reply-To: <15527f2f-1b50-2ff2-3e05-b03dec985391@linux.ibm.com>
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> On 7/6/20 12:34 PM, Michael Ellerman wrote:
>> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
>>> max_pkey now represents max key value that userspace can allocate.
>>>
>
> I guess commit message is confusing.
And the name.
If it's called max_pkey then it should be the maximum allowed pkey
value.
cheers
>>> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
>>> index 75d2a2c19c04..652bad7334f3 100644
>>> --- a/arch/powerpc/include/asm/pkeys.h
>>> +++ b/arch/powerpc/include/asm/pkeys.h
>>> @@ -12,7 +12,7 @@
>>> #include <asm/firmware.h>
>>>
>>> DECLARE_STATIC_KEY_FALSE(pkey_disabled);
>>> -extern int pkeys_total; /* total pkeys as per device tree */
>>> +extern int max_pkey;
>>> extern u32 initial_allocation_mask; /* bits set for the initially allocated keys */
>>> extern u32 reserved_allocation_mask; /* bits set for reserved keys */
>>>
>>> @@ -44,7 +44,10 @@ static inline int vma_pkey(struct vm_area_struct *vma)
>>> return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
>>> }
>>>
>>> -#define arch_max_pkey() pkeys_total
>>> +static inline int arch_max_pkey(void)
>>> +{
>>> + return max_pkey;
>>> +}
>>
>> If pkeys_total = 32 then max_pkey = 31.
>
> we have
>
> #ifdef CONFIG_PPC_4K_PAGES
> /*
> * The OS can manage only 8 pkeys due to its inability to represent them
> * in the Linux 4K PTE. Mark all other keys reserved.
> */
> max_pkey = min(8, pkeys_total);
> #else
> max_pkey = pkeys_total;
> #endif
>
> so it is 32.
>
>>
>> So we can't just substitute one for the other. ie. arch_max_pkey() must
>> have been wrong, or it is wrong now.
>>
>>> diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
>>> index 87d882a9aaf2..a4d7287082a8 100644
>>> --- a/arch/powerpc/mm/book3s64/pkeys.c
>>> +++ b/arch/powerpc/mm/book3s64/pkeys.c
>>> @@ -14,7 +14,7 @@
>>>
>>> DEFINE_STATIC_KEY_FALSE(pkey_disabled);
>>> DEFINE_STATIC_KEY_FALSE(execute_pkey_disabled);
>>> -int pkeys_total; /* Total pkeys as per device tree */
>>> +int max_pkey; /* Maximum key value supported */
>>> u32 initial_allocation_mask; /* Bits set for the initially allocated keys */
>>> /*
>>> * Keys marked in the reservation list cannot be allocated by userspace
>>> @@ -84,7 +84,7 @@ static int scan_pkey_feature(void)
>>>
>>> static int pkey_initialize(void)
>>> {
>>> - int os_reserved, i;
>>> + int pkeys_total, i;
>>>
>>> /*
>>> * We define PKEY_DISABLE_EXECUTE in addition to the arch-neutral
>>> @@ -122,12 +122,12 @@ static int pkey_initialize(void)
>>> * The OS can manage only 8 pkeys due to its inability to represent them
>>> * in the Linux 4K PTE. Mark all other keys reserved.
>>> */
>>> - os_reserved = pkeys_total - 8;
>>> + max_pkey = min(8, pkeys_total);
>>
>> Shouldn't it be 7 ?
>>
>>> #else
>>> - os_reserved = 0;
>>> + max_pkey = pkeys_total;
>>> #endif
>>>
>>> - if (unlikely((pkeys_total - os_reserved) <= execute_only_key)) {
>>> + if (unlikely(max_pkey <= execute_only_key)) {
>>
>> Isn't that an off-by-one now?
>>
>> This is one-off boot time code, there's no need to clutter it with
>> unlikely.
>>
>>> /*
>>> * Insufficient number of keys to support
>>> * execute only key. Mark it unavailable.
>>> @@ -174,10 +174,10 @@ static int pkey_initialize(void)
>>> default_uamor &= ~(0x3ul << pkeyshift(1));
>>>
>>> /*
>>> - * Prevent the usage of OS reserved the keys. Update UAMOR
>>> + * Prevent the usage of OS reserved keys. Update UAMOR
>>> * for those keys.
>>> */
>>> - for (i = (pkeys_total - os_reserved); i < pkeys_total; i++) {
>>> + for (i = max_pkey; i < pkeys_total; i++) {
>>
>> Another off-by-one? Shouldn't we start from max_pkey + 1 ?
>>
>>> reserved_allocation_mask |= (0x1 << i);
>>> default_uamor &= ~(0x3ul << pkeyshift(i));
>>> }
>>
>
> It is the commit message. It is max pkey such that userspace can
> allocate 0 - maxp_pkey -1.
>
> -aneesh
^ permalink raw reply
* Re: [PATCH] MAINTAINERS: Add Shengjiu to reviewer list of sound/soc/fsl
From: Shengjiu Wang @ 2020-07-07 1:51 UTC (permalink / raw)
To: Nicolin Chen
Cc: Linux-ALSA, Li.Xiubo, Shengjiu Wang, linuxppc-dev, timur,
linux-kernel, Mark Brown, Fabio Estevam
In-Reply-To: <20200702193102.25282-1-nicoleotsuka@gmail.com>
On Fri, Jul 3, 2020 at 3:33 AM Nicolin Chen <nicoleotsuka@gmail.com> wrote:
>
> Add Shengjiu who's actively working on the latest fsl/nxp audio drivers.
>
> Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
> Cc: Shengjiu Wang <shengjiu.wang@nxp.com>
> ---
> To Shengjiu, please ack if you feel okay with this (your email address too).
Thanks Nicolin for nominating me as a reviewer.
I'd like to use my gmail address "shengjiu.wang@gmail.com".
with this then
Acked-by: Shengjiu Wang <shengjiu.wang@gmail.com>
best regards
wang shengjiu
>
> MAINTAINERS | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 496fd4eafb68..54aab083bb88 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6956,6 +6956,7 @@ M: Timur Tabi <timur@kernel.org>
> M: Nicolin Chen <nicoleotsuka@gmail.com>
> M: Xiubo Li <Xiubo.Lee@gmail.com>
> R: Fabio Estevam <festevam@gmail.com>
> +R: Shengjiu Wang <shengjiu.wang@nxp.com>
> L: alsa-devel@alsa-project.org (moderated for non-subscribers)
> L: linuxppc-dev@lists.ozlabs.org
> S: Maintained
> --
> 2.17.1
>
^ permalink raw reply
* Re: [PATCH] powerpc/pseries: detect secure and trusted boot state of the system.
From: Daniel Axtens @ 2020-07-07 2:06 UTC (permalink / raw)
To: Nayna Jain, linuxppc-dev; +Cc: Nayna Jain, linux-kernel, Mimi Zohar
In-Reply-To: <1593882535-21368-1-git-send-email-nayna@linux.ibm.com>
Thanks Nayna!
I'm hoping to get better public documentation for this soon as it's not
documented in a public PAPR yet.
Until then:
The values of ibm,secure-boot under PowerVM are:
0 - disabled
1 - audit mode only. This patch ignores this value for Linux, which I
think is the appropriate thing to do.
2 - enabled and enforcing
3-9 - enabled, OS-defined behaviour. In this patch we map all these
values to enabled and enforcing. Again I think this is the
appropriate thing to do.
ibm,trusted-boot isn't published by a current PowerVM LPAR but will be
published in future. (Currently, trusted boot state is inferred by the
presence or absense of a vTPM.) It's simply 1 = enabled, 0 = disabled.
As for this patch specifically, with the very small nits below,
Reviewed-by: Daniel Axtens <dja@axtens.net>
> - node = get_ppc_fw_sb_node();
> - enabled = of_property_read_bool(node, "os-secureboot-enforcing");
> + if (machine_is(powernv)) {
> + node = get_ppc_fw_sb_node();
> + enabled =
> + of_property_read_bool(node, "os-secureboot-enforcing");
> + of_node_put(node);
> + }
>
> - of_node_put(node);
> + if (machine_is(pseries)) {
Maybe this should be an else if?
> + secureboot = of_get_property(of_root, "ibm,secure-boot", NULL);
> + if (secureboot)
> + enabled = (*secureboot > 1) ? true : false;
> + }
>
> pr_info("Secure boot mode %s\n", enabled ? "enabled" : "disabled");
>
> @@ -38,11 +48,20 @@ bool is_ppc_trustedboot_enabled(void)
> {
> struct device_node *node;
> bool enabled = false;
> + const u32 *trustedboot;
>
> - node = get_ppc_fw_sb_node();
> - enabled = of_property_read_bool(node, "trusted-enabled");
> + if (machine_is(powernv)) {
> + node = get_ppc_fw_sb_node();
> + enabled = of_property_read_bool(node, "trusted-enabled");
> + of_node_put(node);
> + }
>
> - of_node_put(node);
> + if (machine_is(pseries)) {
Likewise.
> + trustedboot =
> + of_get_property(of_root, "ibm,trusted-boot", NULL);
> + if (trustedboot)
> + enabled = (*trustedboot > 0) ? true : false;
Regards,
Daniel
^ permalink raw reply
* RE: [PATCH v6 00/11] Add the multiple PF support for DWC and Layerscape
From: Z.q. Hou @ 2020-07-07 2:10 UTC (permalink / raw)
To: Lorenzo Pieralisi, Xiaowei Bao
Cc: Roy Zang, devicetree@vger.kernel.org, jingoohan1@gmail.com,
linuxppc-dev@lists.ozlabs.org, linux-pci@vger.kernel.org,
linux-kernel@vger.kernel.org, Leo Li, M.h. Lian,
robh+dt@kernel.org, linux-arm-kernel@lists.infradead.org,
gustavo.pimentel@synopsys.com, bhelgaas@google.com,
andrew.murray@arm.com, kishon@ti.com, shawnguo@kernel.org,
Mingkai Hu, amurray@thegoodpenguin.co.uk
In-Reply-To: <20200706104639.GC26377@e121166-lin.cambridge.arm.com>
Hi Lorenzo,
> -----Original Message-----
> From: Lorenzo Pieralisi [mailto:lorenzo.pieralisi@arm.com]
> Sent: 2020年7月6日 18:47
> To: Xiaowei Bao <xiaowei.bao@nxp.com>
> Cc: Z.q. Hou <zhiqiang.hou@nxp.com>; M.h. Lian <minghuan.lian@nxp.com>;
> Mingkai Hu <mingkai.hu@nxp.com>; bhelgaas@google.com;
> robh+dt@kernel.org; shawnguo@kernel.org; Leo Li <leoyang.li@nxp.com>;
> kishon@ti.com; Roy Zang <roy.zang@nxp.com>;
> amurray@thegoodpenguin.co.uk; jingoohan1@gmail.com;
> gustavo.pimentel@synopsys.com; andrew.murray@arm.com;
> linux-pci@vger.kernel.org; devicetree@vger.kernel.org;
> linux-kernel@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
> linuxppc-dev@lists.ozlabs.org
> Subject: Re: [PATCH v6 00/11] Add the multiple PF support for DWC and
> Layerscape
>
> On Sat, Mar 14, 2020 at 11:30:27AM +0800, Xiaowei Bao wrote:
> > Add the PCIe EP multiple PF support for DWC and Layerscape, add the
> > doorbell MSIX function for DWC, use list to manage the PF of one PCIe
> > controller, and refactor the Layerscape EP driver due to some
> > platforms difference.
> >
> > Xiaowei Bao (11):
> > PCI: designware-ep: Add multiple PFs support for DWC
> > PCI: designware-ep: Add the doorbell mode of MSI-X in EP mode
> > PCI: designware-ep: Move the function of getting MSI capability
> > forward
> > PCI: designware-ep: Modify MSI and MSIX CAP way of finding
> > dt-bindings: pci: layerscape-pci: Add compatible strings for ls1088a
> > and ls2088a
> > PCI: layerscape: Fix some format issue of the code
> > PCI: layerscape: Modify the way of getting capability with different
> > PEX
> > PCI: layerscape: Modify the MSIX to the doorbell mode
> > PCI: layerscape: Add EP mode support for ls1088a and ls2088a
> > arm64: dts: layerscape: Add PCIe EP node for ls1088a
> > misc: pci_endpoint_test: Add LS1088a in pci_device_id table
> >
> > .../devicetree/bindings/pci/layerscape-pci.txt | 2 +
> > arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi | 31 +++
> > drivers/misc/pci_endpoint_test.c | 2 +
> > drivers/pci/controller/dwc/pci-layerscape-ep.c | 100 ++++++--
> > drivers/pci/controller/dwc/pcie-designware-ep.c | 255
> +++++++++++++++++----
> > drivers/pci/controller/dwc/pcie-designware.c | 59 +++--
> > drivers/pci/controller/dwc/pcie-designware.h | 48 +++-
> > 7 files changed, 404 insertions(+), 93 deletions(-)
>
> Can you rebase it against v5.8-rc1 please ?
Yes, I will help to rebase.
Thanks,
Zhiqiang
>
> Thanks,
> Lorenzo
^ permalink raw reply
* Re: [PATCH] powerpc/numa: Restrict possible nodes based on platform
From: Srikar Dronamraju @ 2020-07-07 2:50 UTC (permalink / raw)
To: Tyrel Datwyler; +Cc: Nathan Lynch, linuxppc-dev, Bharata B Rao
In-Reply-To: <00388e11-9025-e273-137d-c23f8795457a@linux.ibm.com>
* Tyrel Datwyler <tyreld@linux.ibm.com> [2020-07-06 13:58:42]:
> On 7/5/20 11:40 PM, Srikar Dronamraju wrote:
> > diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> > index 9fcf2d195830..3d55cef1a2dc 100644
> > --- a/arch/powerpc/mm/numa.c
> > +++ b/arch/powerpc/mm/numa.c
> > @@ -897,7 +897,7 @@ static void __init find_possible_nodes(void)
> > return;
> >
> > if (of_property_read_u32_index(rtas,
> > - "ibm,max-associativity-domains",
> > + "ibm,current-associativity-domains",
>
> I'm not sure ibm,current-associativity-domains is guaranteed to exist on older
> firmware. You may need check that it exists and fall back to
> ibm,max-associativity-domains in the event it doesn't
>
> -Tyrel
>
Oh,
thanks Tyrel thats a good observation.
Will fallback on max-associativity.
--
Thanks and Regards
Srikar Dronamraju
^ permalink raw reply
* Re: [PATCH] powerpc/pseries: detect secure and trusted boot state of the system.
From: Daniel Axtens @ 2020-07-07 2:53 UTC (permalink / raw)
To: Nayna Jain, linuxppc-dev; +Cc: Nayna Jain, linux-kernel, Mimi Zohar
In-Reply-To: <87a70c3wpj.fsf@dja-thinkpad.axtens.net>
As a final extra note,
https://github.com/daxtens/qemu/tree/pseries-secboot
is a qemu repository that fakes out the relevant properties if anyone
else wants to test this.
Also,
> 3-9 - enabled, OS-defined behaviour. In this patch we map all these
> values to enabled and enforcing. Again I think this is the
> appropriate thing to do.
this should read "enabled and enforcing, requirements are at the
discretion of the operating system". Apologies.
Regards,
Daniel
> ibm,trusted-boot isn't published by a current PowerVM LPAR but will be
> published in future. (Currently, trusted boot state is inferred by the
> presence or absense of a vTPM.) It's simply 1 = enabled, 0 = disabled.
>
> As for this patch specifically, with the very small nits below,
>
> Reviewed-by: Daniel Axtens <dja@axtens.net>
>
>> - node = get_ppc_fw_sb_node();
>> - enabled = of_property_read_bool(node, "os-secureboot-enforcing");
>> + if (machine_is(powernv)) {
>> + node = get_ppc_fw_sb_node();
>> + enabled =
>> + of_property_read_bool(node, "os-secureboot-enforcing");
>> + of_node_put(node);
>> + }
>>
>> - of_node_put(node);
>> + if (machine_is(pseries)) {
> Maybe this should be an else if?
>
>> + secureboot = of_get_property(of_root, "ibm,secure-boot", NULL);
>> + if (secureboot)
>> + enabled = (*secureboot > 1) ? true : false;
>> + }
>>
>> pr_info("Secure boot mode %s\n", enabled ? "enabled" : "disabled");
>>
>> @@ -38,11 +48,20 @@ bool is_ppc_trustedboot_enabled(void)
>> {
>> struct device_node *node;
>> bool enabled = false;
>> + const u32 *trustedboot;
>>
>> - node = get_ppc_fw_sb_node();
>> - enabled = of_property_read_bool(node, "trusted-enabled");
>> + if (machine_is(powernv)) {
>> + node = get_ppc_fw_sb_node();
>> + enabled = of_property_read_bool(node, "trusted-enabled");
>> + of_node_put(node);
>> + }
>>
>> - of_node_put(node);
>> + if (machine_is(pseries)) {
> Likewise.
>> + trustedboot =
>> + of_get_property(of_root, "ibm,trusted-boot", NULL);
>> + if (trustedboot)
>> + enabled = (*trustedboot > 0) ? true : false;
>
> Regards,
> Daniel
^ permalink raw reply
* Re: [PATCH] powerpc/numa: Restrict possible nodes based on platform
From: Srikar Dronamraju @ 2020-07-07 2:53 UTC (permalink / raw)
To: Nathan Lynch; +Cc: Tyrel Datwyler, linuxppc-dev, Bharata B Rao
In-Reply-To: <874kqkf91s.fsf@linux.ibm.com>
* Nathan Lynch <nathanl@linux.ibm.com> [2020-07-06 19:44:31]:
> Tyrel Datwyler <tyreld@linux.ibm.com> writes:
> >> --- a/arch/powerpc/mm/numa.c
> >> +++ b/arch/powerpc/mm/numa.c
> >> @@ -897,7 +897,7 @@ static void __init find_possible_nodes(void)
> >> return;
> >>
> >> if (of_property_read_u32_index(rtas,
> >> - "ibm,max-associativity-domains",
> >> + "ibm,current-associativity-domains",
> >
> > I'm not sure ibm,current-associativity-domains is guaranteed to exist on older
> > firmware. You may need check that it exists and fall back to
> > ibm,max-associativity-domains in the event it doesn't
>
> Yes. Looks like it's a PowerVM-specific property.
Right, this is just a PowerVM specific property and doesn't affect PowerNV.
On PowerNV thought we have sparse nodes, the max possible nodes is set
correctly.
--
Thanks and Regards
Srikar Dronamraju
^ permalink raw reply
* Re: [PATCH v3 1/2] powerpc/perf/hv-24x7: Add cpu hotplug support
From: Madhavan Srinivasan @ 2020-07-07 4:15 UTC (permalink / raw)
To: Michael Ellerman, Kajol Jain, linuxppc-dev
Cc: nathanl, ego, suka, maddy, anju
In-Reply-To: <87zh8d5oab.fsf@mpe.ellerman.id.au>
On 7/6/20 8:43 AM, Michael Ellerman wrote:
> Kajol Jain <kjain@linux.ibm.com> writes:
>> Patch here adds cpu hotplug functions to hv_24x7 pmu.
>> A new cpuhp_state "CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE" enum
>> is added.
>>
>> The online callback function updates the cpumask only if its
>> empty. As the primary intention of adding hotplug support
>> is to designate a CPU to make HCALL to collect the
>> counter data.
>>
>> The offline function test and clear corresponding cpu in a cpumask
>> and update cpumask to any other active cpu.
>>
>> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
>> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/perf/hv-24x7.c | 45 +++++++++++++++++++++++++++++++++++++
>> include/linux/cpuhotplug.h | 1 +
>> 2 files changed, 46 insertions(+)
>>
>> diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
>> index db213eb7cb02..ce4739e2b407 100644
>> --- a/arch/powerpc/perf/hv-24x7.c
>> +++ b/arch/powerpc/perf/hv-24x7.c
>> @@ -31,6 +31,8 @@ static int interface_version;
>> /* Whether we have to aggregate result data for some domains. */
>> static bool aggregate_result_elements;
>>
>> +static cpumask_t hv_24x7_cpumask;
>> +
>> static bool domain_is_valid(unsigned domain)
>> {
>> switch (domain) {
>> @@ -1641,6 +1643,44 @@ static struct pmu h_24x7_pmu = {
>> .capabilities = PERF_PMU_CAP_NO_EXCLUDE,
>> };
>>
>> +static int ppc_hv_24x7_cpu_online(unsigned int cpu)
>> +{
>> + /* Make this CPU the designated target for counter collection */
> The comment implies every newly onlined CPU will become the target, but
> actually it's only the first onlined CPU.
>
> So I think the comment needs updating, or you could just drop the
> comment, I think the code is fairly clear by itself.
>
>> + if (cpumask_empty(&hv_24x7_cpumask))
>> + cpumask_set_cpu(cpu, &hv_24x7_cpumask);
>> +
>> + return 0;
>> +}
>> +
>> +static int ppc_hv_24x7_cpu_offline(unsigned int cpu)
>> +{
>> + int target = -1;
> No need to initialise target, you assign to it unconditionally below.
>
>> + /* Check if exiting cpu is used for collecting 24x7 events */
>> + if (!cpumask_test_and_clear_cpu(cpu, &hv_24x7_cpumask))
>> + return 0;
>> +
>> + /* Find a new cpu to collect 24x7 events */
>> + target = cpumask_last(cpu_active_mask);
> Any reason to use cpumask_last() vs cpumask_first(), or a randomly
> chosen CPU?
>
>> + if (target < 0 || target >= nr_cpu_ids)
>> + return -1;
>> +
>> + /* Migrate 24x7 events to the new target */
>> + cpumask_set_cpu(target, &hv_24x7_cpumask);
>> + perf_pmu_migrate_context(&h_24x7_pmu, cpu, target);
>> +
>> + return 0;
>> +}
>> +
>> +static int hv_24x7_cpu_hotplug_init(void)
>> +{
>> + return cpuhp_setup_state(CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE,
>> + "perf/powerpc/hv_24x7:online",
>> + ppc_hv_24x7_cpu_online,
>> + ppc_hv_24x7_cpu_offline);
>> +}
>> +
>> static int hv_24x7_init(void)
>> {
>> int r;
>> @@ -1685,6 +1725,11 @@ static int hv_24x7_init(void)
>> if (r)
>> return r;
>>
>> + /* init cpuhotplug */
>> + r = hv_24x7_cpu_hotplug_init();
>> + if (r)
>> + pr_err("hv_24x7: CPU hotplug init failed\n");
>> +
> The hotplug initialisation shouldn't fail unless something is badly
> wrong. I think you should just fail initialisation of the entire PMU if
> that happens, which will make the error handling in the next patch much
> simpler.
We did fail the PMU registration on failure of the hotplug
code (and yes error handling is much simpler), but on internal
review/discussion,
what came up was that, hv_24x7 PMU will still be usable without
the hotplug code (with "-C" option to perf tool command line).
Maddy
>
> cheers
>
>> r = perf_pmu_register(&h_24x7_pmu, h_24x7_pmu.name, -1);
>> if (r)
>> return r;
^ permalink raw reply
* Re: [PATCH v3 1/2] powerpc/perf/hv-24x7: Add cpu hotplug support
From: Michael Ellerman @ 2020-07-07 4:56 UTC (permalink / raw)
To: Madhavan Srinivasan, Kajol Jain, linuxppc-dev
Cc: nathanl, ego, suka, maddy, anju
In-Reply-To: <3b45d050-c27a-e5e7-8649-924910f5b01b@linux.ibm.com>
Madhavan Srinivasan <maddy@linux.ibm.com> writes:
> On 7/6/20 8:43 AM, Michael Ellerman wrote:
>> Kajol Jain <kjain@linux.ibm.com> writes:
>>> Patch here adds cpu hotplug functions to hv_24x7 pmu.
>>> A new cpuhp_state "CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE" enum
>>> is added.
>>>
>>> The online callback function updates the cpumask only if its
>>> empty. As the primary intention of adding hotplug support
>>> is to designate a CPU to make HCALL to collect the
>>> counter data.
>>>
>>> The offline function test and clear corresponding cpu in a cpumask
>>> and update cpumask to any other active cpu.
>>>
>>> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
>>> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
>>> ---
>>> arch/powerpc/perf/hv-24x7.c | 45 +++++++++++++++++++++++++++++++++++++
>>> include/linux/cpuhotplug.h | 1 +
>>> 2 files changed, 46 insertions(+)
>>>
>>> diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
>>> index db213eb7cb02..ce4739e2b407 100644
>>> --- a/arch/powerpc/perf/hv-24x7.c
>>> +++ b/arch/powerpc/perf/hv-24x7.c
>>> @@ -31,6 +31,8 @@ static int interface_version;
>>> /* Whether we have to aggregate result data for some domains. */
>>> static bool aggregate_result_elements;
>>>
>>> +static cpumask_t hv_24x7_cpumask;
>>> +
>>> static bool domain_is_valid(unsigned domain)
>>> {
>>> switch (domain) {
>>> @@ -1641,6 +1643,44 @@ static struct pmu h_24x7_pmu = {
>>> .capabilities = PERF_PMU_CAP_NO_EXCLUDE,
>>> };
>>>
>>> +static int ppc_hv_24x7_cpu_online(unsigned int cpu)
>>> +{
>>> + /* Make this CPU the designated target for counter collection */
>> The comment implies every newly onlined CPU will become the target, but
>> actually it's only the first onlined CPU.
>>
>> So I think the comment needs updating, or you could just drop the
>> comment, I think the code is fairly clear by itself.
>>
>>> + if (cpumask_empty(&hv_24x7_cpumask))
>>> + cpumask_set_cpu(cpu, &hv_24x7_cpumask);
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static int ppc_hv_24x7_cpu_offline(unsigned int cpu)
>>> +{
>>> + int target = -1;
>> No need to initialise target, you assign to it unconditionally below.
>>
>>> + /* Check if exiting cpu is used for collecting 24x7 events */
>>> + if (!cpumask_test_and_clear_cpu(cpu, &hv_24x7_cpumask))
>>> + return 0;
>>> +
>>> + /* Find a new cpu to collect 24x7 events */
>>> + target = cpumask_last(cpu_active_mask);
>> Any reason to use cpumask_last() vs cpumask_first(), or a randomly
>> chosen CPU?
>>
>>> + if (target < 0 || target >= nr_cpu_ids)
>>> + return -1;
>>> +
>>> + /* Migrate 24x7 events to the new target */
>>> + cpumask_set_cpu(target, &hv_24x7_cpumask);
>>> + perf_pmu_migrate_context(&h_24x7_pmu, cpu, target);
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static int hv_24x7_cpu_hotplug_init(void)
>>> +{
>>> + return cpuhp_setup_state(CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE,
>>> + "perf/powerpc/hv_24x7:online",
>>> + ppc_hv_24x7_cpu_online,
>>> + ppc_hv_24x7_cpu_offline);
>>> +}
>>> +
>>> static int hv_24x7_init(void)
>>> {
>>> int r;
>>> @@ -1685,6 +1725,11 @@ static int hv_24x7_init(void)
>>> if (r)
>>> return r;
>>>
>>> + /* init cpuhotplug */
>>> + r = hv_24x7_cpu_hotplug_init();
>>> + if (r)
>>> + pr_err("hv_24x7: CPU hotplug init failed\n");
>>> +
>> The hotplug initialisation shouldn't fail unless something is badly
>> wrong. I think you should just fail initialisation of the entire PMU if
>> that happens, which will make the error handling in the next patch much
>> simpler.
>
> We did fail the PMU registration on failure of the hotplug
> code (and yes error handling is much simpler), but on internal
> review/discussion,
> what came up was that, hv_24x7 PMU will still be usable without
> the hotplug code (with "-C" option to perf tool command line).
In theory yes.
But in reality no one will ever test that case, so the code will easily
bit rot.
Even if it doesn't bit rot, you've now created another state the system
can legally be in (hotplug init failed but PMU still probed), which you
have to test, document & support.
If the hotplug init fails then something is badly wrong, the best thing
we can do is bail on the PMU initialisation and hope the rest of the
system boots OK.
cheers
^ permalink raw reply
* [PATCH v2] MAINTAINERS: Add Shengjiu to reviewer list of sound/soc/fsl
From: Nicolin Chen @ 2020-07-07 4:58 UTC (permalink / raw)
To: broonie
Cc: alsa-devel, shengjiu.wang, Xiubo.Lee, festevam, shengjiu.wang,
timur, linux-kernel, linuxppc-dev
Add Shengjiu who's actively working on the latest fsl/nxp audio drivers.
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Acked-by: Shengjiu Wang <shengjiu.wang@gmail.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
---
Changelog
v1->v2:
* Replaced Shengjiu's emaill address with his gmail one
* Added Acked-by from Shengjiu and Reviewed-by from Fabio
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 496fd4eafb68..ff97b8cefaea 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6956,6 +6956,7 @@ M: Timur Tabi <timur@kernel.org>
M: Nicolin Chen <nicoleotsuka@gmail.com>
M: Xiubo Li <Xiubo.Lee@gmail.com>
R: Fabio Estevam <festevam@gmail.com>
+R: Shengjiu Wang <shengjiu.wang@gmail.com>
L: alsa-devel@alsa-project.org (moderated for non-subscribers)
L: linuxppc-dev@lists.ozlabs.org
S: Maintained
--
2.17.1
^ permalink raw reply related
* Re: [PATCH] powerpc/numa: Restrict possible nodes based on platform
From: Michael Ellerman @ 2020-07-07 5:02 UTC (permalink / raw)
To: Srikar Dronamraju
Cc: Nathan Lynch, linuxppc-dev, Srikar Dronamraju, Bharata B Rao
In-Reply-To: <20200706064002.14848-1-srikar@linux.vnet.ibm.com>
Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> As per PAPR, there are 2 device tree property
> ibm,max-associativity-domains (which defines the maximum number of
> domains that the firmware i.e PowerVM can support) and
> ibm,current-associativity-domains (which defines the maximum number of
> domains that the platform can support). Value of
> ibm,max-associativity-domains property is always greater than or equal
> to ibm,current-associativity-domains property.
Where is it documented?
It's definitely not in LoPAPR.
> Powerpc currently uses ibm,max-associativity-domains property while
> setting the possible number of nodes. This is currently set at 32.
> However the possible number of nodes for a platform may be significantly
> less. Hence set the possible number of nodes based on
> ibm,current-associativity-domains property.
>
> $ lsprop /proc/device-tree/rtas/ibm,*associ*-domains
> /proc/device-tree/rtas/ibm,current-associativity-domains
> 00000005 00000001 00000002 00000002 00000002 00000010
> /proc/device-tree/rtas/ibm,max-associativity-domains
> 00000005 00000001 00000008 00000020 00000020 00000100
>
> $ cat /sys/devices/system/node/possible ##Before patch
> 0-31
>
> $ cat /sys/devices/system/node/possible ##After patch
> 0-1
>
> Note the maximum nodes this platform can support is only 2 but the
> possible nodes is set to 32.
But what about LPM to a system with more nodes?
cheers
^ permalink raw reply
* Re: [PATCH v12 00/31] Speculative page faults
From: Chinwen Chang @ 2020-07-07 5:31 UTC (permalink / raw)
To: Laurent Dufour
Cc: jack, sergey.senozhatsky.work, peterz, Will Deacon, mhocko,
linux-mm, paulus, Punit Agrawal, hpa, Michel Lespinasse,
Alexei Starovoitov, Andrea Arcangeli, ak, Minchan Kim,
aneesh.kumar, x86, Matthew Wilcox, Daniel Jordan, Ingo Molnar,
zhong jiang, David Rientjes, paulmck, npiggin, sj38.park,
Jerome Glisse, dave, kemi.wang, kirill, Thomas Gleixner,
Haiyan Song, Ganesh Mahendran, Yang Shi, Mike Rapoport,
linuxppc-dev, linux-kernel, Sergey Senozhatsky, miles.chen,
vinayak menon, akpm, Tim Chen, haren
In-Reply-To: <490c0811-50cd-0802-2cbc-9c031ef309f6@linux.ibm.com>
On Mon, 2020-07-06 at 14:27 +0200, Laurent Dufour wrote:
> Le 06/07/2020 à 11:25, Chinwen Chang a écrit :
> > On Thu, 2019-06-20 at 16:19 +0800, Haiyan Song wrote:
> >> Hi Laurent,
> >>
> >> I downloaded your script and run it on Intel 2s skylake platform with spf-v12 patch
> >> serials.
> >>
> >> Here attached the output results of this script.
> >>
> >> The following comparison result is statistics from the script outputs.
> >>
> >> a). Enable THP
> >> SPF_0 change SPF_1
> >> will-it-scale.page_fault2.per_thread_ops 2664190.8 -11.7% 2353637.6
> >> will-it-scale.page_fault3.per_thread_ops 4480027.2 -14.7% 3819331.9
> >>
> >>
> >> b). Disable THP
> >> SPF_0 change SPF_1
> >> will-it-scale.page_fault2.per_thread_ops 2653260.7 -10% 2385165.8
> >> will-it-scale.page_fault3.per_thread_ops 4436330.1 -12.4% 3886734.2
> >>
> >>
> >> Thanks,
> >> Haiyan Song
> >>
> >>
> >> On Fri, Jun 14, 2019 at 10:44:47AM +0200, Laurent Dufour wrote:
> >>> Le 14/06/2019 à 10:37, Laurent Dufour a écrit :
> >>>> Please find attached the script I run to get these numbers.
> >>>> This would be nice if you could give it a try on your victim node and share the result.
> >>>
> >>> Sounds that the Intel mail fitering system doesn't like the attached shell script.
> >>> Please find it there: https://urldefense.com/v3/__https://gist.github.com/ldu4/a5cc1a93f293108ea387d43d5d5e7f44__;!!CTRNKA9wMg0ARbw!0lux2FMCbIFxFEl824CdSuSQqT0IVWsvyUqfDVJNEVb9gTWyRltm7cpPZg70N_XhXmMZ$
> >>>
> >>> Thanks,
> >>> Laurent.
> >>>
> >
> > Hi Laurent,
> >
> > We merged SPF v11 and some patches from v12 into our platforms. After
> > several experiments, we observed SPF has obvious improvements on the
> > launch time of applications, especially for those high-TLP ones,
> >
> > # launch time of applications(s):
> >
> > package version w/ SPF w/o SPF improve(%)
> > ------------------------------------------------------------------
> > Baidu maps 10.13.3 0.887 0.98 9.49
> > Taobao 8.4.0.35 1.227 1.293 5.10
> > Meituan 9.12.401 1.107 1.543 28.26
> > WeChat 7.0.3 2.353 2.68 12.20
> > Honor of Kings 1.43.1.6 6.63 6.713 1.24
>
> That's great news, thanks for reporting this!
>
> >
> > By the way, we have verified our platforms with those patches and
> > achieved the goal of mass production.
>
> Another good news!
> For my information, what is your targeted hardware?
>
> Cheers,
> Laurent.
Hi Laurent,
Our targeted hardware belongs to ARM64 multi-core series.
Thanks.
Chinwen
>
^ permalink raw reply
* Re: [PATCH v3 1/2] powerpc/perf/hv-24x7: Add cpu hotplug support
From: Madhavan Srinivasan @ 2020-07-07 5:36 UTC (permalink / raw)
To: Michael Ellerman, Kajol Jain, linuxppc-dev
Cc: nathanl, ego, suka, maddy, anju
In-Reply-To: <87o8or53fh.fsf@mpe.ellerman.id.au>
On 7/7/20 10:26 AM, Michael Ellerman wrote:
> Madhavan Srinivasan <maddy@linux.ibm.com> writes:
>> On 7/6/20 8:43 AM, Michael Ellerman wrote:
>>> Kajol Jain <kjain@linux.ibm.com> writes:
>>>> Patch here adds cpu hotplug functions to hv_24x7 pmu.
>>>> A new cpuhp_state "CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE" enum
>>>> is added.
>>>>
>>>> The online callback function updates the cpumask only if its
>>>> empty. As the primary intention of adding hotplug support
>>>> is to designate a CPU to make HCALL to collect the
>>>> counter data.
>>>>
>>>> The offline function test and clear corresponding cpu in a cpumask
>>>> and update cpumask to any other active cpu.
>>>>
>>>> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
>>>> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
>>>> ---
>>>> arch/powerpc/perf/hv-24x7.c | 45 +++++++++++++++++++++++++++++++++++++
>>>> include/linux/cpuhotplug.h | 1 +
>>>> 2 files changed, 46 insertions(+)
>>>>
>>>> diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
>>>> index db213eb7cb02..ce4739e2b407 100644
>>>> --- a/arch/powerpc/perf/hv-24x7.c
>>>> +++ b/arch/powerpc/perf/hv-24x7.c
>>>> @@ -31,6 +31,8 @@ static int interface_version;
>>>> /* Whether we have to aggregate result data for some domains. */
>>>> static bool aggregate_result_elements;
>>>>
>>>> +static cpumask_t hv_24x7_cpumask;
>>>> +
>>>> static bool domain_is_valid(unsigned domain)
>>>> {
>>>> switch (domain) {
>>>> @@ -1641,6 +1643,44 @@ static struct pmu h_24x7_pmu = {
>>>> .capabilities = PERF_PMU_CAP_NO_EXCLUDE,
>>>> };
>>>>
>>>> +static int ppc_hv_24x7_cpu_online(unsigned int cpu)
>>>> +{
>>>> + /* Make this CPU the designated target for counter collection */
>>> The comment implies every newly onlined CPU will become the target, but
>>> actually it's only the first onlined CPU.
>>>
>>> So I think the comment needs updating, or you could just drop the
>>> comment, I think the code is fairly clear by itself.
>>>
>>>> + if (cpumask_empty(&hv_24x7_cpumask))
>>>> + cpumask_set_cpu(cpu, &hv_24x7_cpumask);
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static int ppc_hv_24x7_cpu_offline(unsigned int cpu)
>>>> +{
>>>> + int target = -1;
>>> No need to initialise target, you assign to it unconditionally below.
>>>
>>>> + /* Check if exiting cpu is used for collecting 24x7 events */
>>>> + if (!cpumask_test_and_clear_cpu(cpu, &hv_24x7_cpumask))
>>>> + return 0;
>>>> +
>>>> + /* Find a new cpu to collect 24x7 events */
>>>> + target = cpumask_last(cpu_active_mask);
>>> Any reason to use cpumask_last() vs cpumask_first(), or a randomly
>>> chosen CPU?
>>>
>>>> + if (target < 0 || target >= nr_cpu_ids)
>>>> + return -1;
>>>> +
>>>> + /* Migrate 24x7 events to the new target */
>>>> + cpumask_set_cpu(target, &hv_24x7_cpumask);
>>>> + perf_pmu_migrate_context(&h_24x7_pmu, cpu, target);
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static int hv_24x7_cpu_hotplug_init(void)
>>>> +{
>>>> + return cpuhp_setup_state(CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE,
>>>> + "perf/powerpc/hv_24x7:online",
>>>> + ppc_hv_24x7_cpu_online,
>>>> + ppc_hv_24x7_cpu_offline);
>>>> +}
>>>> +
>>>> static int hv_24x7_init(void)
>>>> {
>>>> int r;
>>>> @@ -1685,6 +1725,11 @@ static int hv_24x7_init(void)
>>>> if (r)
>>>> return r;
>>>>
>>>> + /* init cpuhotplug */
>>>> + r = hv_24x7_cpu_hotplug_init();
>>>> + if (r)
>>>> + pr_err("hv_24x7: CPU hotplug init failed\n");
>>>> +
>>> The hotplug initialisation shouldn't fail unless something is badly
>>> wrong. I think you should just fail initialisation of the entire PMU if
>>> that happens, which will make the error handling in the next patch much
>>> simpler.
>> We did fail the PMU registration on failure of the hotplug
>> code (and yes error handling is much simpler), but on internal
>> review/discussion,
>> what came up was that, hv_24x7 PMU will still be usable without
>> the hotplug code (with "-C" option to perf tool command line).
> In theory yes.
>
> But in reality no one will ever test that case, so the code will easily
> bit rot.
>
> Even if it doesn't bit rot, you've now created another state the system
> can legally be in (hotplug init failed but PMU still probed), which you
> have to test, document & support.
>
> If the hotplug init fails then something is badly wrong, the best thing
> we can do is bail on the PMU initialisation and hope the rest of the
> system boots OK.
Yep agreed. Thanks for the comments mpe
Maddy
>
> cheers
^ permalink raw reply
* Re: [PATCH] powerpc: select ARCH_HAS_MEMBARRIER_SYNC_CORE
From: Nicholas Piggin @ 2020-07-07 5:50 UTC (permalink / raw)
To: Christophe Leroy, linuxppc-dev; +Cc: linux-arch, Mathieu Desnoyers
In-Reply-To: <cf10b0bc-de79-1b2b-8355-fc7bbeec47c3@csgroup.eu>
Excerpts from Christophe Leroy's message of July 6, 2020 7:53 pm:
>
>
> Le 06/07/2020 à 04:18, Nicholas Piggin a écrit :
>> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
>> index 47bd4ea0837d..b88cb3a989b6 100644
>> --- a/arch/powerpc/include/asm/exception-64s.h
>> +++ b/arch/powerpc/include/asm/exception-64s.h
>> @@ -68,6 +68,10 @@
>> *
>> * The nop instructions allow us to insert one or more instructions to flush the
>> * L1-D cache when returning to userspace or a guest.
>> + *
>> + * powerpc relies on return from interrupt/syscall being context synchronising
>> + * (which hrfid, rfid, and rfscv are) to support ARCH_HAS_MEMBARRIER_SYNC_CORE
>> + * without additional additional synchronisation instructions.
>
> This file is dedicated to BOOK3S/64. What about other ones ?
>
> On 32 bits, this is also valid as 'rfi' is also context synchronising,
> but then why just add some comment in exception-64s.h and only there ?
Yeah you're right, I basically wanted to keep a note there just in case,
because it's possible we would get a less synchronising return (maybe
unlikely with meltdown) or even return from a kernel interrupt using a
something faster (e.g., bctar if we don't use tar register in the kernel
anywhere).
So I wonder where to add the note, entry_32.S and 64e.h as well?
I should actually change the comment for 64-bit because soft masked
interrupt replay is an interesting case. I thought it was okay (because
the IPI would cause a hard interrupt which does do the rfi) but that
should at least be written. The context synchronisation happens before
the Linux IPI function is called, but for the purpose of membarrier I
think that is okay (the membarrier just needs to have caused a memory
barrier + context synchronistaion by the time it has done).
Thanks,
Nick
^ permalink raw reply
* Re: [PATCH] powerpc/pseries: detect secure and trusted boot state of the system.
From: Michael Ellerman @ 2020-07-07 5:56 UTC (permalink / raw)
To: Nayna Jain, linuxppc-dev; +Cc: Nayna Jain, linux-kernel, Mimi Zohar
In-Reply-To: <1593882535-21368-1-git-send-email-nayna@linux.ibm.com>
Nayna Jain <nayna@linux.ibm.com> writes:
> The device-tree property to check secure and trusted boot state is
> different for guests(pseries) compared to baremetal(powernv).
>
> This patch updates the existing is_ppc_secureboot_enabled() and
> is_ppc_trustedboot_enabled() function to add support for pseries.
>
> Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
> ---
> arch/powerpc/kernel/secure_boot.c | 31 +++++++++++++++++++++++++------
> 1 file changed, 25 insertions(+), 6 deletions(-)
>
> diff --git a/arch/powerpc/kernel/secure_boot.c b/arch/powerpc/kernel/secure_boot.c
> index 4b982324d368..43fc6607c7a5 100644
> --- a/arch/powerpc/kernel/secure_boot.c
> +++ b/arch/powerpc/kernel/secure_boot.c
> @@ -6,6 +6,7 @@
> #include <linux/types.h>
> #include <linux/of.h>
> #include <asm/secure_boot.h>
> +#include <asm/machdep.h>
>
> static struct device_node *get_ppc_fw_sb_node(void)
> {
> @@ -23,11 +24,20 @@ bool is_ppc_secureboot_enabled(void)
> {
> struct device_node *node;
> bool enabled = false;
> + const u32 *secureboot;
>
> - node = get_ppc_fw_sb_node();
> - enabled = of_property_read_bool(node, "os-secureboot-enforcing");
> + if (machine_is(powernv)) {
> + node = get_ppc_fw_sb_node();
> + enabled =
> + of_property_read_bool(node, "os-secureboot-enforcing");
> + of_node_put(node);
> + }
We generally try to avoid adding new machine_is() checks if we can.
In a case like this I think you can just check for both properties
regardless of what platform you're on.
> - of_node_put(node);
> + if (machine_is(pseries)) {
> + secureboot = of_get_property(of_root, "ibm,secure-boot", NULL);
> + if (secureboot)
> + enabled = (*secureboot > 1) ? true : false;
> + }
Please don't use of_get_property() in new code. Use one of the properly
typed accessors that handles endian conversion for you.
cheers
^ permalink raw reply
* Re: [PATCH v3 0/6] powerpc: queued spinlocks and rwlocks
From: Nicholas Piggin @ 2020-07-07 5:57 UTC (permalink / raw)
To: linuxppc-dev, Waiman Long
Cc: linux-arch, Peter Zijlstra, Boqun Feng, linux-kernel, kvm-ppc,
virtualization, Ingo Molnar, Will Deacon
In-Reply-To: <24f75d2c-60cd-2766-4aab-1a3b1c80646e@redhat.com>
Excerpts from Waiman Long's message of July 7, 2020 4:39 am:
> On 7/6/20 12:35 AM, Nicholas Piggin wrote:
>> v3 is updated to use __pv_queued_spin_unlock, noticed by Waiman (thank you).
>>
>> Thanks,
>> Nick
>>
>> Nicholas Piggin (6):
>> powerpc/powernv: must include hvcall.h to get PAPR defines
>> powerpc/pseries: move some PAPR paravirt functions to their own file
>> powerpc: move spinlock implementation to simple_spinlock
>> powerpc/64s: implement queued spinlocks and rwlocks
>> powerpc/pseries: implement paravirt qspinlocks for SPLPAR
>> powerpc/qspinlock: optimised atomic_try_cmpxchg_lock that adds the
>> lock hint
>>
>> arch/powerpc/Kconfig | 13 +
>> arch/powerpc/include/asm/Kbuild | 2 +
>> arch/powerpc/include/asm/atomic.h | 28 ++
>> arch/powerpc/include/asm/paravirt.h | 89 +++++
>> arch/powerpc/include/asm/qspinlock.h | 91 ++++++
>> arch/powerpc/include/asm/qspinlock_paravirt.h | 7 +
>> arch/powerpc/include/asm/simple_spinlock.h | 292 +++++++++++++++++
>> .../include/asm/simple_spinlock_types.h | 21 ++
>> arch/powerpc/include/asm/spinlock.h | 308 +-----------------
>> arch/powerpc/include/asm/spinlock_types.h | 17 +-
>> arch/powerpc/lib/Makefile | 3 +
>> arch/powerpc/lib/locks.c | 12 +-
>> arch/powerpc/platforms/powernv/pci-ioda-tce.c | 1 +
>> arch/powerpc/platforms/pseries/Kconfig | 5 +
>> arch/powerpc/platforms/pseries/setup.c | 6 +-
>> include/asm-generic/qspinlock.h | 4 +
>> 16 files changed, 577 insertions(+), 322 deletions(-)
>> create mode 100644 arch/powerpc/include/asm/paravirt.h
>> create mode 100644 arch/powerpc/include/asm/qspinlock.h
>> create mode 100644 arch/powerpc/include/asm/qspinlock_paravirt.h
>> create mode 100644 arch/powerpc/include/asm/simple_spinlock.h
>> create mode 100644 arch/powerpc/include/asm/simple_spinlock_types.h
>>
> This patch looks OK to me.
Thanks for reviewing and testing.
> I had run some microbenchmark on powerpc system with or w/o the patch.
>
> On a 2-socket 160-thread SMT4 POWER9 system (not virtualized):
>
> 5.8.0-rc4
> =========
>
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 160, Min/Mean/Max = 77,665/90,153/106,895
> Threads = 160, Total Rate = 1,441,759 op/s; Percpu Rate = 9,011 op/s
>
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 160, Min/Mean/Max = 47,879/53,807/63,689
> Threads = 160, Total Rate = 860,192 op/s; Percpu Rate = 5,376 op/s
>
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 80, Min/Mean/Max = 242,907/319,514/463,161
> Threads = 80, Total Rate = 2,555 kop/s; Percpu Rate = 32 kop/s
>
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 80, Min/Mean/Max = 146,161/187,474/259,270
> Threads = 80, Total Rate = 1,498 kop/s; Percpu Rate = 19 kop/s
>
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 40, Min/Mean/Max = 646,639/1,000,817/1,455,205
> Threads = 40, Total Rate = 4,001 kop/s; Percpu Rate = 100 kop/s
>
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 40, Min/Mean/Max = 402,165/597,132/814,555
> Threads = 40, Total Rate = 2,388 kop/s; Percpu Rate = 60 kop/s
>
> 5.8.0-rc4-qlock+
> ================
>
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 160, Min/Mean/Max = 123,835/124,580/124,587
> Threads = 160, Total Rate = 1,992 kop/s; Percpu Rate = 12 kop/s
>
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 160, Min/Mean/Max = 254,210/264,714/276,784
> Threads = 160, Total Rate = 4,231 kop/s; Percpu Rate = 26 kop/s
>
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 80, Min/Mean/Max = 599,715/603,397/603,450
> Threads = 80, Total Rate = 4,825 kop/s; Percpu Rate = 60 kop/s
>
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 80, Min/Mean/Max = 492,687/525,224/567,456
> Threads = 80, Total Rate = 4,199 kop/s; Percpu Rate = 52 kop/s
>
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 40, Min/Mean/Max = 1,325,623/1,325,628/1,325,636
> Threads = 40, Total Rate = 5,299 kop/s; Percpu Rate = 132 kop/s
>
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 40, Min/Mean/Max = 1,249,731/1,292,977/1,342,815
> Threads = 40, Total Rate = 5,168 kop/s; Percpu Rate = 129 kop/s
>
> On systems on large number of cpus, qspinlock lock is faster and more fair.
>
> With some tuning, we may be able to squeeze out more performance.
Yes, powerpc could certainly get more performance out of the slow
paths, and then there are a few parameters to tune.
We don't have a good alternate patching for function calls yet, but
that would be something to do for native vs pv.
And then there seem to be one or two tunable parameters we could
experiment with.
The paravirt locks may need a bit more tuning. Some simple testing
under KVM shows we might be a bit slower in some cases. Whether this
is fairness or something else I'm not sure. The current simple pv
spinlock code can do a directed yield to the lock holder CPU, whereas
the pv qspl here just does a general yield. I think we might actually
be able to change that to also support directed yield. Though I'm
not sure if this is actually the cause of the slowdown yet.
Thanks,
Nick
^ permalink raw reply
* Re: [PATCH v5 23/26] powerpc/book3s64/kuap: Move UAMOR setup to key init function
From: Michael Ellerman @ 2020-07-07 6:05 UTC (permalink / raw)
To: Aneesh Kumar K.V, linuxppc-dev; +Cc: Aneesh Kumar K.V, linuxram, bauerman
In-Reply-To: <20200619135850.47155-24-aneesh.kumar@linux.ibm.com>
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> UAMOR values are not application-specific.
It used to be, that's worth mentioning.
> The kernel initializes its value based on different reserved keys.
> Remove the thread-specific UAMOR value and don't switch the UAMOR on
> context switch.
>
> Move UAMOR initialization to key initialization code. Now that
> KUAP/KUEP feature depends on PPC_MEM_KEYS, we can start to consolidate
> all register initialization to keys init.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
> arch/powerpc/include/asm/book3s/64/kup.h | 2 ++
> arch/powerpc/include/asm/processor.h | 1 -
> arch/powerpc/kernel/ptrace/ptrace-view.c | 17 ++++++++----
> arch/powerpc/kernel/smp.c | 5 ++++
> arch/powerpc/mm/book3s64/pkeys.c | 35 ++++++++++++++----------
> 5 files changed, 39 insertions(+), 21 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
> index 3a0e138d2735..942594745dfa 100644
> --- a/arch/powerpc/include/asm/book3s/64/kup.h
> +++ b/arch/powerpc/include/asm/book3s/64/kup.h
> @@ -67,6 +67,8 @@
> #include <asm/mmu.h>
> #include <asm/ptrace.h>
>
> +extern u64 default_uamor;
> +
> static inline void kuap_restore_amr(struct pt_regs *regs, unsigned long amr)
> {
> if (mmu_has_feature(MMU_FTR_KUAP) && unlikely(regs->kuap != amr)) {
> diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
> index 52a67835057a..6ac12168f1fe 100644
> --- a/arch/powerpc/include/asm/processor.h
> +++ b/arch/powerpc/include/asm/processor.h
> @@ -237,7 +237,6 @@ struct thread_struct {
> #ifdef CONFIG_PPC_MEM_KEYS
> unsigned long amr;
> unsigned long iamr;
> - unsigned long uamor;
> #endif
> #ifdef CONFIG_KVM_BOOK3S_32_HANDLER
> void* kvm_shadow_vcpu; /* KVM internal data */
> diff --git a/arch/powerpc/kernel/ptrace/ptrace-view.c b/arch/powerpc/kernel/ptrace/ptrace-view.c
> index caeb5822a8f4..689711eb018a 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace-view.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace-view.c
> @@ -488,14 +488,22 @@ static int pkey_active(struct task_struct *target, const struct user_regset *reg
> static int pkey_get(struct task_struct *target, const struct user_regset *regset,
> unsigned int pos, unsigned int count, void *kbuf, void __user *ubuf)
> {
> + int ret;
> +
> BUILD_BUG_ON(TSO(amr) + sizeof(unsigned long) != TSO(iamr));
> - BUILD_BUG_ON(TSO(iamr) + sizeof(unsigned long) != TSO(uamor));
>
> if (!arch_pkeys_enabled())
> return -ENODEV;
>
> - return user_regset_copyout(&pos, &count, &kbuf, &ubuf, &target->thread.amr,
> - 0, ELF_NPKEY * sizeof(unsigned long));
> + ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf, &target->thread.amr,
> + 0, 2 * sizeof(unsigned long));
> + if (ret)
> + goto err_out;
Why not just return?
> +
> + ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf, &default_uamor,
> + 2 * sizeof(unsigned long), 3 * sizeof(unsigned long));
> +err_out:
> + return ret;
> }
>
> static int pkey_set(struct task_struct *target, const struct user_regset *regset,
> @@ -518,8 +526,7 @@ static int pkey_set(struct task_struct *target, const struct user_regset *regset
> return ret;
>
> /* UAMOR determines which bits of the AMR can be set from userspace. */
> - target->thread.amr = (new_amr & target->thread.uamor) |
> - (target->thread.amr & ~target->thread.uamor);
> + target->thread.amr = (new_amr & default_uamor) | (target->thread.amr & ~default_uamor);
That comment could explain better why we are bothering to mask with ~default_uamor.
> return 0;
> }
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index c820c95162ff..eec40082599f 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -59,6 +59,7 @@
> #include <asm/asm-prototypes.h>
> #include <asm/cpu_has_feature.h>
> #include <asm/ftrace.h>
> +#include <asm/kup.h>
>
> #ifdef DEBUG
> #include <asm/udbg.h>
> @@ -1256,6 +1257,10 @@ void start_secondary(void *unused)
> mmgrab(&init_mm);
> current->active_mm = &init_mm;
>
> +#ifdef CONFIG_PPC_MEM_KEYS
> + mtspr(SPRN_UAMOR, default_uamor);
> +#endif
That's 1) not very pretty and 2) risks blowing up on other CPUs.
It should at least go in early_init_mmu_secondary().
> smp_store_cpu_info(cpu);
> set_dec(tb_ticks_per_jiffy);
> preempt_disable();
> diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
> index aeecc8b8e11c..3f3593f85358 100644
> --- a/arch/powerpc/mm/book3s64/pkeys.c
> +++ b/arch/powerpc/mm/book3s64/pkeys.c
> @@ -24,7 +24,7 @@ static u32 initial_allocation_mask; /* Bits set for the initially allocated k
> static u64 default_amr;
> static u64 default_iamr;
> /* Allow all keys to be modified by default */
> -static u64 default_uamor = ~0x0UL;
> +u64 default_uamor = ~0x0UL;
__ro_after_init?
> /*
> * Key used to implement PROT_EXEC mmap. Denies READ/WRITE
> * We pick key 2 because 0 is special key and 1 is reserved as per ISA.
> @@ -113,8 +113,16 @@ void __init pkey_early_init_devtree(void)
> /* scan the device tree for pkey feature */
> pkeys_total = scan_pkey_feature();
> if (!pkeys_total) {
> - /* No support for pkey. Mark it disabled */
> - return;
> + /*
> + * No key support but on radix we can use key 0
> + * to implement kuap.
> + */
> + if (early_radix_enabled())
> + /*
> + * Make sure userspace can't change the AMR
> + */
> + default_uamor = 0;
> + goto err_out;
Would be cleaner if you inverted that. ie. initialise to 0 and then set
to ~0x0UL when you detect pkeys.
> }
>
> cur_cpu_spec->mmu_features |= MMU_FTR_PKEY;
> @@ -197,6 +205,12 @@ void __init pkey_early_init_devtree(void)
> initial_allocation_mask |= reserved_allocation_mask;
>
> pr_info("Enabling Memory keys with max key count %d", max_pkey);
> +err_out:
It's not "err" out if the OK path goes via here. That's just "out".
> + /*
> + * Setup uamor on boot cpu
> + */
> + mtspr(SPRN_UAMOR, default_uamor);
> +
> return;
> }
>
> @@ -232,8 +246,9 @@ void __init setup_kuap(bool disabled)
> cur_cpu_spec->mmu_features |= MMU_FTR_KUAP;
> }
>
> - /* Make sure userspace can't change the AMR */
> - mtspr(SPRN_UAMOR, 0);
Why not just leave it there. It's extra insurance and it's good
documentation.
> + /*
> + * Set the default kernel AMR values on all cpus.
> + */
> mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
> isync();
> }
> @@ -278,11 +293,6 @@ static inline u64 read_uamor(void)
> return mfspr(SPRN_UAMOR);
> }
>
> -static inline void write_uamor(u64 value)
> -{
> - mtspr(SPRN_UAMOR, value);
> -}
> -
> static bool is_pkey_enabled(int pkey)
> {
> u64 uamor = read_uamor();
> @@ -353,7 +363,6 @@ void thread_pkey_regs_save(struct thread_struct *thread)
> */
> thread->amr = read_amr();
> thread->iamr = read_iamr();
> - thread->uamor = read_uamor();
> }
>
> void thread_pkey_regs_restore(struct thread_struct *new_thread,
> @@ -366,8 +375,6 @@ void thread_pkey_regs_restore(struct thread_struct *new_thread,
> write_amr(new_thread->amr);
> if (old_thread->iamr != new_thread->iamr)
> write_iamr(new_thread->iamr);
> - if (old_thread->uamor != new_thread->uamor)
> - write_uamor(new_thread->uamor);
> }
>
> void thread_pkey_regs_init(struct thread_struct *thread)
> @@ -377,11 +384,9 @@ void thread_pkey_regs_init(struct thread_struct *thread)
>
> thread->amr = default_amr;
> thread->iamr = default_iamr;
> - thread->uamor = default_uamor;
>
> write_amr(default_amr);
> write_iamr(default_iamr);
> - write_uamor(default_uamor);
> }
>
> int execute_only_pkey(struct mm_struct *mm)
cheers
^ permalink raw reply
* Re: [PATCH v2 02/10] KVM: PPC: Book3S HV: Save/restore new PMU registers
From: Michael Neuling @ 2020-07-07 6:13 UTC (permalink / raw)
To: Athira Rajeev, mpe; +Cc: Paul Mackerras, maddy, linuxppc-dev
In-Reply-To: <1593595262-1433-3-git-send-email-atrajeev@linux.vnet.ibm.com>
@@ -637,12 +637,12 @@ struct kvm_vcpu_arch {
> u32 ccr1;
> u32 dbsr;
>
> - u64 mmcr[5];
> + u64 mmcr[6];
> u32 pmc[8];
> u32 spmc[2];
> u64 siar;
> + mfspr r5, SPRN_MMCR3
> + mfspr r6, SPRN_SIER2
> + mfspr r7, SPRN_SIER3
> + std r5, VCPU_MMCR + 40(r9)
> + std r6, VCPU_SIER + 8(r9)
> + std r7, VCPU_SIER + 16(r9)
This is looking pretty fragile now. vcpu mmcr[6] stores (in this strict order):
mmcr0, mmcr1, mmcra, mmcr2, mmcrs, mmmcr3.
Can we clean that up? Give mmcra and mmcrs their own entries in vcpu and then
have a flat array for mmcr0-3.
Mikey
^ permalink raw reply
* Re: [PATCH v2 04/10] powerpc/perf: Add power10_feat to dt_cpu_ftrs
From: Michael Neuling @ 2020-07-07 6:22 UTC (permalink / raw)
To: Athira Rajeev, mpe; +Cc: maddy, linuxppc-dev
In-Reply-To: <1593595262-1433-5-git-send-email-atrajeev@linux.vnet.ibm.com>
On Wed, 2020-07-01 at 05:20 -0400, Athira Rajeev wrote:
> From: Madhavan Srinivasan <maddy@linux.ibm.com>
>
> Add power10 feature function to dt_cpu_ftrs.c along
> with a power10 specific init() to initialize pmu sprs.
Can you say why you're doing this?
Can you add some text about what you're doing to the BHRB in this patch?
Mikey
>
> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
> ---
> arch/powerpc/include/asm/reg.h | 3 +++
> arch/powerpc/kernel/cpu_setup_power.S | 7 +++++++
> arch/powerpc/kernel/dt_cpu_ftrs.c | 26 ++++++++++++++++++++++++++
> 3 files changed, 36 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index 21a1b2d..900ada1 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -1068,6 +1068,9 @@
> #define MMCR0_PMC2_LOADMISSTIME 0x5
> #endif
>
> +/* BHRB disable bit for PowerISA v3.10 */
> +#define MMCRA_BHRB_DISABLE 0x0000002000000000
> +
> /*
> * SPRG usage:
> *
> diff --git a/arch/powerpc/kernel/cpu_setup_power.S
> b/arch/powerpc/kernel/cpu_setup_power.S
> index efdcfa7..e8b3370c 100644
> --- a/arch/powerpc/kernel/cpu_setup_power.S
> +++ b/arch/powerpc/kernel/cpu_setup_power.S
> @@ -233,3 +233,10 @@ __init_PMU_ISA207:
> li r5,0
> mtspr SPRN_MMCRS,r5
> blr
> +
> +__init_PMU_ISA31:
> + li r5,0
> + mtspr SPRN_MMCR3,r5
> + LOAD_REG_IMMEDIATE(r5, MMCRA_BHRB_DISABLE)
> + mtspr SPRN_MMCRA,r5
> + blr
> diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c
> b/arch/powerpc/kernel/dt_cpu_ftrs.c
> index a0edeb3..14a513f 100644
> --- a/arch/powerpc/kernel/dt_cpu_ftrs.c
> +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
> @@ -449,6 +449,31 @@ static int __init feat_enable_pmu_power9(struct
> dt_cpu_feature *f)
> return 1;
> }
>
> +static void init_pmu_power10(void)
> +{
> + init_pmu_power9();
> +
> + mtspr(SPRN_MMCR3, 0);
> + mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
> +}
> +
> +static int __init feat_enable_pmu_power10(struct dt_cpu_feature *f)
> +{
> + hfscr_pmu_enable();
> +
> + init_pmu_power10();
> + init_pmu_registers = init_pmu_power10;
> +
> + cur_cpu_spec->cpu_features |= CPU_FTR_MMCRA;
> + cur_cpu_spec->cpu_user_features |= PPC_FEATURE_PSERIES_PERFMON_COMPAT;
> +
> + cur_cpu_spec->num_pmcs = 6;
> + cur_cpu_spec->pmc_type = PPC_PMC_IBM;
> + cur_cpu_spec->oprofile_cpu_type = "ppc64/power10";
> +
> + return 1;
> +}
> +
> static int __init feat_enable_tm(struct dt_cpu_feature *f)
> {
> #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> @@ -638,6 +663,7 @@ struct dt_cpu_feature_match {
> {"pc-relative-addressing", feat_enable, 0},
> {"machine-check-power9", feat_enable_mce_power9, 0},
> {"performance-monitor-power9", feat_enable_pmu_power9, 0},
> + {"performance-monitor-power10", feat_enable_pmu_power10, 0},
> {"event-based-branch-v3", feat_enable, 0},
> {"random-number-generator", feat_enable, 0},
> {"system-call-vectored", feat_disable, 0},
^ permalink raw reply
* Re: [PATCH v4 24/41] powerpc/book3s64/pkeys: Store/restore userspace AMR correctly on entry and exit from kernel
From: Nicholas Piggin @ 2020-07-07 6:23 UTC (permalink / raw)
To: Aneesh Kumar K.V, linuxppc-dev, mpe; +Cc: linuxram, bauerman
In-Reply-To: <83a38d37-099a-7b98-1262-c16b4f5a0cc4@linux.ibm.com>
Excerpts from Aneesh Kumar K.V's message of July 3, 2020 7:30 pm:
> On 7/3/20 2:48 PM, Nicholas Piggin wrote:
>> Excerpts from Aneesh Kumar K.V's message of June 15, 2020 4:14 pm:
>>> This prepare kernel to operate with a different value than userspace AMR.
>>> For this, AMR needs to be saved and restored on entry and return from the
>>> kernel.
>>>
>>> With KUAP we modify kernel AMR when accessing user address from the kernel
>>> via copy_to/from_user interfaces.
>>>
>>> If MMU_FTR_KEY is enabled we always use the key mechanism to implement KUAP
>>> feature. If MMU_FTR_KEY is not supported and if we support MMU_FTR_KUAP
>>> (radix translation on POWER9), we can skip restoring AMR on return
>>> to userspace. Userspace won't be using AMR in that specific config.
>>>
>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>> ---
>>> arch/powerpc/include/asm/book3s/64/kup.h | 141 ++++++++++++++++++-----
>>> arch/powerpc/kernel/entry_64.S | 6 +-
>>> arch/powerpc/kernel/exceptions-64s.S | 4 +-
>>> arch/powerpc/kernel/syscall_64.c | 26 ++++-
>>> 4 files changed, 144 insertions(+), 33 deletions(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
>>> index e6ee1c34842f..6979cd1a0003 100644
>>> --- a/arch/powerpc/include/asm/book3s/64/kup.h
>>> +++ b/arch/powerpc/include/asm/book3s/64/kup.h
>>> @@ -13,18 +13,47 @@
>>>
>>> #ifdef __ASSEMBLY__
>>>
>>> -.macro kuap_restore_amr gpr1, gpr2
>>> -#ifdef CONFIG_PPC_KUAP
>>> +.macro kuap_restore_user_amr gpr1
>>> +#if defined(CONFIG_PPC_MEM_KEYS)
>>> BEGIN_MMU_FTR_SECTION_NESTED(67)
>>> - mfspr \gpr1, SPRN_AMR
>>> + /*
>>> + * AMR is going to be different when
>>> + * returning to userspace.
>>> + */
>>> + ld \gpr1, STACK_REGS_KUAP(r1)
>>> + isync
>>> + mtspr SPRN_AMR, \gpr1
>>> +
>>> + /* No isync required, see kuap_restore_user_amr() */
>>> + END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_PKEY , 67)
>>> +#endif
>>> +.endm
>>> +
>>> +.macro kuap_restore_kernel_amr gpr1, gpr2
>>> +#if defined(CONFIG_PPC_MEM_KEYS)
>>> + BEGIN_MMU_FTR_SECTION_NESTED(67)
>>> + b 99f // handle_pkey_restore_amr
>>> + END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_PKEY , 67)
>>> +
>>> + BEGIN_MMU_FTR_SECTION_NESTED(68)
>>> + b 99f // handle_kuap_restore_amr
>>> + MMU_FTR_SECTION_ELSE_NESTED(68)
>>> + b 100f // skip_restore_amr
>>> + ALT_MMU_FTR_SECTION_END_NESTED_IFSET(MMU_FTR_KUAP, 68)
>>> +
>>> +99:
>>> + /*
>>> + * AMR is going to be mostly the same since we are
>>> + * returning to the kernel. Compare and do a mtspr.
>>> + */
>>> ld \gpr2, STACK_REGS_KUAP(r1)
>>> + mfspr \gpr1, SPRN_AMR
>>> cmpd \gpr1, \gpr2
>>> - beq 998f
>>> + beq 100f
>>> isync
>>> mtspr SPRN_AMR, \gpr2
>>> /* No isync required, see kuap_restore_amr() */
>>> -998:
>>> - END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_KUAP, 67)
>>> +100: // skip_restore_amr
>>
>> Can't you code it like this? (_IFCLR requires none of the bits to be
>> set)
>>
>> BEGIN_MMU_FTR_SECTION_NESTED(67)
>> b 99f // nothing using AMR, no need to restore
>> END_MMU_FTR_SECTION_NESTED_IFCLR(MMU_FTR_PKEY | MMU_FTR_KUAP, 67)
>>
>> That saves you a branch in the common case of using AMR. Similar
>> for others.
>
>
>
> Yes i could switch to that. The code is taking extra 200 cycles even
> with KUAP/KUEP disabled and no keys being used on hash. I am yet to
> analyze this closely. So will rework things based on that analysis.
>
>>
>>> @@ -69,22 +133,40 @@
>>>
>>> extern u64 default_uamor;
>>>
>>> -static inline void kuap_restore_amr(struct pt_regs *regs, unsigned long amr)
>>> +static inline void kuap_restore_user_amr(struct pt_regs *regs)
>>> {
>>> - if (mmu_has_feature(MMU_FTR_KUAP) && unlikely(regs->kuap != amr)) {
>>> - isync();
>>> - mtspr(SPRN_AMR, regs->kuap);
>>> - /*
>>> - * No isync required here because we are about to RFI back to
>>> - * previous context before any user accesses would be made,
>>> - * which is a CSI.
>>> - */
>>> + if (!mmu_has_feature(MMU_FTR_PKEY))
>>> + return;
>>
>> If you have PKEY but not KUAP, do you still have to restore?
>>
>
>
> Yes, because user space pkey is now set on the exit path. This is needed
> to handle things like exec/fork().
>
>
>>> +
>>> + isync();
>>> + mtspr(SPRN_AMR, regs->kuap);
>>> + /*
>>> + * No isync required here because we are about to rfi
>>> + * back to previous context before any user accesses
>>> + * would be made, which is a CSI.
>>> + */
>>> +}
>>> +
>>> +static inline void kuap_restore_kernel_amr(struct pt_regs *regs,
>>> + unsigned long amr)
>>> +{
>>> + if (mmu_has_feature(MMU_FTR_KUAP) || mmu_has_feature(MMU_FTR_PKEY)) {
>>> +
>>> + if (unlikely(regs->kuap != amr)) {
>>> + isync();
>>> + mtspr(SPRN_AMR, regs->kuap);
>>> + /*
>>> + * No isync required here because we are about to rfi
>>> + * back to previous context before any user accesses
>>> + * would be made, which is a CSI.
>>> + */
>>> + }
>>> }
>>> }
>>>
>>> static inline unsigned long kuap_get_and_check_amr(void)
>>> {
>>> - if (mmu_has_feature(MMU_FTR_KUAP)) {
>>> + if (mmu_has_feature(MMU_FTR_KUAP) || mmu_has_feature(MMU_FTR_PKEY)) {
>>> unsigned long amr = mfspr(SPRN_AMR);
>>> if (IS_ENABLED(CONFIG_PPC_KUAP_DEBUG)) /* kuap_check_amr() */
>>> WARN_ON_ONCE(amr != AMR_KUAP_BLOCKED);
>>
>> We could do a static key that's based on this condition, but that can
>> wait for another day.
>>
>>>
>>> static inline void kuap_check_amr(void)
>>> {
>>> - if (IS_ENABLED(CONFIG_PPC_KUAP_DEBUG) && mmu_has_feature(MMU_FTR_KUAP))
>>> + if (IS_ENABLED(CONFIG_PPC_KUAP_DEBUG) &&
>>> + (mmu_has_feature(MMU_FTR_KUAP) || mmu_has_feature(MMU_FTR_PKEY)))
>>> WARN_ON_ONCE(mfspr(SPRN_AMR) != AMR_KUAP_BLOCKED);
>>> }
>>>
>>> #else /* CONFIG_PPC_MEM_KEYS */
>>>
>>> -static inline void kuap_restore_amr(struct pt_regs *regs, unsigned long amr)
>>> +static inline void kuap_restore_user_amr(struct pt_regs *regs)
>>> +{
>>> +}
>>> +
>>> +static inline void kuap_restore_kernel_amr(struct pt_regs *regs, unsigned long amr)
>>> {
>>> }
>>>
>>> @@ -113,6 +200,7 @@ static inline unsigned long kuap_get_and_check_amr(void)
>>> {
>>> return 0;
>>> }
>>> +
>>> #endif /* CONFIG_PPC_MEM_KEYS */
>>>
>>>
>>> @@ -187,7 +275,6 @@ bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
>>> "Bug: %s fault blocked by AMR!", is_write ? "Write" : "Read");
>>> }
>>> #endif /* CONFIG_PPC_KUAP */
>>> -
>>> #endif /* __ASSEMBLY__ */
>>>
>>> #endif /* _ASM_POWERPC_BOOK3S_64_KUP_H */
>>
>> Unnecessary hunks.
>
>
> will remove
>
>>
>>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>>> index 9d49338e0c85..a087cbe0b17d 100644
>>> --- a/arch/powerpc/kernel/entry_64.S
>>> +++ b/arch/powerpc/kernel/entry_64.S
>>> @@ -481,8 +481,8 @@ _ASM_NOKPROBE_SYMBOL(fast_interrupt_return)
>>> kuap_check_amr r3, r4
>>> ld r5,_MSR(r1)
>>> andi. r0,r5,MSR_PR
>>> - bne .Lfast_user_interrupt_return
>>> - kuap_restore_amr r3, r4
>>> + bne .Lfast_user_interrupt_return_amr
>>> + kuap_restore_kernel_amr r3, r4
>>> andi. r0,r5,MSR_RI
>>> li r3,0 /* 0 return value, no EMULATE_STACK_STORE */
>>> bne+ .Lfast_kernel_interrupt_return
>>> @@ -502,6 +502,8 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return)
>>> cmpdi r3,0
>>> bne- .Lrestore_nvgprs
>>>
>>> +.Lfast_user_interrupt_return_amr:
>>> + kuap_restore_user_amr r3
>>> .Lfast_user_interrupt_return:
>>> ld r11,_NIP(r1)
>>> ld r12,_MSR(r1)
>>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>>> index e70ebb5c318c..8226af444d77 100644
>>> --- a/arch/powerpc/kernel/exceptions-64s.S
>>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>>> @@ -971,7 +971,7 @@ EXC_COMMON_BEGIN(system_reset_common)
>>> ld r10,SOFTE(r1)
>>> stb r10,PACAIRQSOFTMASK(r13)
>>>
>>> - kuap_restore_amr r9, r10
>>> + kuap_restore_kernel_amr r9, r10
>>> EXCEPTION_RESTORE_REGS
>>> RFI_TO_USER_OR_KERNEL
>>>
>>> @@ -2784,7 +2784,7 @@ EXC_COMMON_BEGIN(soft_nmi_common)
>>> ld r10,SOFTE(r1)
>>> stb r10,PACAIRQSOFTMASK(r13)
>>>
>>> - kuap_restore_amr r9, r10
>>> + kuap_restore_kernel_amr r9, r10
>>> EXCEPTION_RESTORE_REGS hsrr=0
>>> RFI_TO_KERNEL
>>>
>>> diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
>>> index 7e560a01afa4..fded67982fbe 100644
>>> --- a/arch/powerpc/kernel/syscall_64.c
>>> +++ b/arch/powerpc/kernel/syscall_64.c
>>> @@ -35,7 +35,21 @@ notrace long system_call_exception(long r3, long r4, long r5,
>>> BUG_ON(!FULL_REGS(regs));
>>> BUG_ON(regs->softe != IRQS_ENABLED);
>>>
>>> - kuap_check_amr();
>>> +#ifdef CONFIG_PPC_MEM_KEYS
>>> + if (mmu_has_feature(MMU_FTR_PKEY)) {
>>> + unsigned long amr;
>>> + /*
>>> + * When entering from userspace we mostly have the AMR
>>> + * different from kernel default values. Hence don't compare.
>>> + */
>>> + amr = mfspr(SPRN_AMR);
>>> + regs->kuap = amr;
>>> + if (mmu_has_feature(MMU_FTR_KUAP))
>>> + mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
>>> + isync();
>>
>> isync should be inside the if(). Again do pkeys need to save this if
>> KUAP is not being used? I haven't really looked at how all that works,
>> but what's changing for the PKEY && !KUAP case?
>>
>
> There is no SPR switch in context switch now and all the AMR/IAMR
> handling is now in the exit to userspace.
If we have pkeys and no kuap, we could keep the switch in context
switch?
If you don't think it's worth bothering to optimise that case because we
expect KUAP to be used, that's probably okay although maybe an
adjustment to the comment (we don't expect userspace to have different
from kernel values if kernel is not using it for KUAP).
>>> + /*
>>> + * We do this at the end so that we do context switch with KERNEL AMR
>>> + */
>>> + kuap_restore_user_amr(regs);
>>> return ret;
>>
>> Comment doesn't make sense, newline required before return.
>
>
> Ok the detail there was we need to make sure we restore AMR towrads the
> end and make sure all the kernel code continue to run with KERNEL AMR
> value. There is a schedule() call in there with _TIF_NEED_RESCHED. But
> those details are not really relevant. That was me tracking down some
> issues and writing comment around that part of the code. The only real
> detail is switch to userspace AMR in the end.
Yep, I don't think that comment is needed at all. A space before the
return would be nice. I guess after the account_cpu_user_exit is fine,
that thing's a pain anyway that needs to be changed to avoid an SPR
stall I think so I'll look at that afterward anyway.
Thanks,
Nick
^ permalink raw reply
* Re: [PATCH v2 06/10] powerpc/perf: power10 Performance Monitoring support
From: Michael Neuling @ 2020-07-07 6:50 UTC (permalink / raw)
To: Athira Rajeev, mpe; +Cc: maddy, linuxppc-dev
In-Reply-To: <1593595262-1433-7-git-send-email-atrajeev@linux.vnet.ibm.com>
> @@ -480,6 +520,7 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
> mmcr[1] = mmcr1;
> mmcr[2] = mmcra;
> mmcr[3] = mmcr2;
> + mmcr[4] = mmcr3;
This is fragile like the kvm vcpu case I commented on before but it gets passed
in via a function parameter?! Can you create a struct to store these in rather
than this odd ball numbering?
The cleanup should start in patch 1/10 here:
/*
* The order of the MMCR array is:
- * - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2
+ * - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2, MMCR3
* - 32-bit, MMCR0, MMCR1, MMCR2
*/
- unsigned long mmcr[4];
+ unsigned long mmcr[5];
mikey
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox