* Re: [PATCH 2/3] powerpc/pci: unmap legacy INTx interrupts of passthrough IO adapters
From: Michael Ellerman @ 2020-05-28 13:25 UTC (permalink / raw)
To: Cédric Le Goater; +Cc: Oliver O'Halloran, linuxppc-dev
In-Reply-To: <44126659-0490-4466-7f08-1726a7f0ce6e@kaod.org>
Cédric Le Goater <clg@kaod.org> writes:
> On 4/29/20 9:51 AM, Cédric Le Goater wrote:
>> When a passthrough IO adapter is removed from a pseries machine using
>> hash MMU and the XIVE interrupt mode, the POWER hypervisor, pHyp,
>> expects the guest OS to have cleared all page table entries related to
>> the adapter. If some are still present, the RTAS call which isolates
>> the PCI slot returns error 9001 "valid outstanding translations" and
>> the removal of the IO adapter fails.
>>
>> INTx interrupt numbers need special care because Linux maps the
>> interrupts automatically in the Linux interrupt number space if they
>> are presented in the device tree node describing the IO adapter. These
>> interrupts are not un-mapped automatically and in case of an hot-plug
>> adapter, the PCI hot-plug layer needs to handle the cleanup to make
>> sure that all the page table entries of the XIVE ESB pages are
>> cleared.
>
> It seems this patch needs more digging to make sure we are handling
> the IRQ unmapping in the correct PCI handler. Could you please keep
> it back for the moment ?
Yep no worries.
cheers
^ permalink raw reply
* Re: [PATCH v2] powerpc/wii: Fix declaration made after definition
From: Michael Ellerman @ 2020-05-28 13:32 UTC (permalink / raw)
To: Nathan Chancellor
Cc: kbuild test robot, Nick Desaulniers, linux-kernel,
clang-built-linux, Paul Mackerras, Nathan Chancellor,
linuxppc-dev
In-Reply-To: <20200526205756.2952882-1-natechancellor@gmail.com>
Nathan Chancellor <natechancellor@gmail.com> writes:
> A 0day randconfig uncovered an error with clang, trimmed for brevity:
>
> arch/powerpc/platforms/embedded6xx/wii.c:195:7: error: attribute
> declaration must precede definition [-Werror,-Wignored-attributes]
> if (!machine_is(wii))
> ^
>
> The macro machine_is declares mach_##name but define_machine actually
> defines mach_##name, hence the warning.
>
> To fix this, move define_machine after the machine_is usage.
>
> Fixes: 5a7ee3198dfa ("powerpc: wii: platform support")
> Reported-by: kbuild test robot <lkp@intel.com>
> Link: https://github.com/ClangBuiltLinux/linux/issues/989
> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
> ---
>
> v1 -> v2:
>
> * s/is_machine/machine_is/ (Nick)
>
> * Add Nick's reviewed-by tag.
I already picked up v1, and it's behind a merge so I don't want to
rebase it. I'll live with the typo. Thanks.
cheers
^ permalink raw reply
* Re: [PATCH] Revert "powerpc/32s: reorder Linux PTE bits to better match Hash PTE bits."
From: Michael Ellerman @ 2020-05-28 14:04 UTC (permalink / raw)
To: Michael Ellerman, Christophe Leroy, Benjamin Herrenschmidt,
Rui Salvaterra, Paul Mackerras
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <b34706f8de87f84d135abb5f3ede6b6f16fb1f41.1589969799.git.christophe.leroy@csgroup.eu>
On Wed, 20 May 2020 10:23:45 +0000 (UTC), Christophe Leroy wrote:
> This reverts commit 697ece78f8f749aeea40f2711389901f0974017a.
>
> The implementation of SWAP on powerpc requires page protection
> bits to not be one of the least significant PTE bits.
>
> Until the SWAP implementation is changed and this requirement voids,
> we have to keep at least _PAGE_RW outside of the 3 last bits.
>
> [...]
Applied to powerpc/fixes.
[1/1] Revert "powerpc/32s: reorder Linux PTE bits to better match Hash PTE bits."
https://git.kernel.org/powerpc/c/40bb0e904212cf7d6f041a98c58c8341b2016670
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/64s: Disable STRICT_KERNEL_RWX
From: Michael Ellerman @ 2020-05-28 14:04 UTC (permalink / raw)
To: Michael Ellerman, linuxppc-dev
In-Reply-To: <20200520133605.972649-1-mpe@ellerman.id.au>
On Wed, 20 May 2020 23:36:05 +1000, Michael Ellerman wrote:
> Several strange crashes have been eventually traced back to
> STRICT_KERNEL_RWX and its interaction with code patching.
>
> Various paths in our ftrace, kprobes and other patching code need to
> be hardened against patching failures, otherwise we can end up running
> with partially/incorrectly patched ftrace paths, kprobes or jump
> labels, which can then cause strange crashes.
>
> [...]
Applied to powerpc/fixes.
[1/1] powerpc/64s: Disable STRICT_KERNEL_RWX
https://git.kernel.org/powerpc/c/8659a0e0efdd975c73355dbc033f79ba3b31e82c
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/64s: Fix restore of NV GPRs after facility unavailable exception
From: Michael Ellerman @ 2020-05-28 14:04 UTC (permalink / raw)
To: Michael Ellerman, linuxppc-dev; +Cc: npiggin
In-Reply-To: <20200526061808.2472279-1-mpe@ellerman.id.au>
On Tue, 26 May 2020 16:18:08 +1000, Michael Ellerman wrote:
> Commit 702f09805222 ("powerpc/64s/exception: Remove lite interrupt
> return") changed the interrupt return path to not restore non-volatile
> registers by default, and explicitly restore them in paths where it is
> required.
>
> But it missed that the facility unavailable exception can sometimes
> modify user registers, ie. when it does emulation of move from DSCR.
>
> [...]
Applied to powerpc/fixes.
[1/1] powerpc/64s: Fix restore of NV GPRs after facility unavailable exception
https://git.kernel.org/powerpc/c/595d153dd1022392083ac93a1550382cbee127e0
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/bpf: Enable bpf_probe_read{, str}() on powerpc again
From: Daniel Borkmann @ 2020-05-28 15:06 UTC (permalink / raw)
To: Michael Ellerman, Petr Mladek
Cc: bpf, Alexei Starovoitov, linux-kernel, Paul Mackerras,
Masami Hiramatsu, Brendan Gregg, Miroslav Benes, linuxppc-dev,
Christoph Hellwig
In-Reply-To: <87d06ojlib.fsf@mpe.ellerman.id.au>
On 5/28/20 2:23 PM, Michael Ellerman wrote:
> Petr Mladek <pmladek@suse.com> writes:
>> On Thu 2020-05-28 11:03:43, Michael Ellerman wrote:
>>> Petr Mladek <pmladek@suse.com> writes:
>>>> The commit 0ebeea8ca8a4d1d453a ("bpf: Restrict bpf_probe_read{, str}() only
>>>> to archs where they work") caused that bpf_probe_read{, str}() functions
>>>> were not longer available on architectures where the same logical address
>>>> might have different content in kernel and user memory mapping. These
>>>> architectures should use probe_read_{user,kernel}_str helpers.
>>>>
>>>> For backward compatibility, the problematic functions are still available
>>>> on architectures where the user and kernel address spaces are not
>>>> overlapping. This is defined CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE.
>>>>
>>>> At the moment, these backward compatible functions are enabled only
>>>> on x86_64, arm, and arm64. Let's do it also on powerpc that has
>>>> the non overlapping address space as well.
>>>>
>>>> Signed-off-by: Petr Mladek <pmladek@suse.com>
>>>
>>> This seems like it should have a Fixes: tag and go into v5.7?
>>
>> Good point:
>>
>> Fixes: commit 0ebeea8ca8a4d1d4 ("bpf: Restrict bpf_probe_read{, str}() only to archs where they work")
>>
>> And yes, it should ideally go into v5.7 either directly or via stable.
>>
>> Should I resend the patch with Fixes and
>> Cc: stable@vger.kernel.org #v45.7 lines, please?
>
> If it goes into v5.7 then it doesn't need a Cc: stable, and I guess a
> Fixes: tag is nice to have but not so important as it already mentions
> the commit that caused the problem. So a resend probably isn't
> necessary.
>
> Acked-by: Michael Ellerman <mpe@ellerman.id.au>
>
> Daniel can you pick this up, or should I?
Yeah I'll take it into bpf tree for v5.7.
Thanks everyone,
Daniel
^ permalink raw reply
* [PATCH net] drivers/net/ibmvnic: Update VNIC protocol version reporting
From: Thomas Falcon @ 2020-05-28 16:19 UTC (permalink / raw)
To: netdev; +Cc: Thomas Falcon, linuxppc-dev
VNIC protocol version is reported in big-endian format, but it
is not byteswapped before logging. Fix that, and remove version
comparison as only one protocol version exists at this time.
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
---
drivers/net/ethernet/ibm/ibmvnic.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 621a6e0..f2c7160 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -4689,12 +4689,10 @@ static void ibmvnic_handle_crq(union ibmvnic_crq *crq,
dev_err(dev, "Error %ld in VERSION_EXCHG_RSP\n", rc);
break;
}
- dev_info(dev, "Partner protocol version is %d\n",
- crq->version_exchange_rsp.version);
- if (be16_to_cpu(crq->version_exchange_rsp.version) <
- ibmvnic_version)
- ibmvnic_version =
+ ibmvnic_version =
be16_to_cpu(crq->version_exchange_rsp.version);
+ dev_info(dev, "Partner protocol version is %d\n",
+ ibmvnic_version);
send_cap_queries(adapter);
break;
case QUERY_CAPABILITY_RSP:
--
1.8.3.1
^ permalink raw reply related
* Re: [PATCH v8 1/5] powerpc: Document details on H_SCM_HEALTH hcall
From: Vaibhav Jain @ 2020-05-28 19:24 UTC (permalink / raw)
To: Dan Williams
Cc: Santosh Sivaraj, Ira Weiny, linux-nvdimm,
Linux Kernel Mailing List, Steven Rostedt, Oliver O'Halloran,
Aneesh Kumar K . V, linuxppc-dev
In-Reply-To: <CAPcyv4jXp1FocSe-fFBA_00TnsjPudrBCuHBfv+zwHA_R0353A@mail.gmail.com>
Thanks for looking into this patchset Dan,
Dan Williams <dan.j.williams@intel.com> writes:
> On Tue, May 26, 2020 at 9:13 PM Vaibhav Jain <vaibhav@linux.ibm.com> wrote:
>>
>> Add documentation to 'papr_hcalls.rst' describing the bitmap flags
>> that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
>> specification.
>>
>
> Please do a global s/SCM/PMEM/ or s/SCM/NVDIMM/. It's unfortunate that
> we already have 2 ways to describe persistent memory devices, let's
> not perpetuate a third so that "grep" has a chance to find
> interrelated code across architectures. Other than that this looks
> good to me.
Sure, will use PAPR_NVDIMM instead of PAPR_SCM for new code being
introduced. However certain identifiers like H_SCM_HEALTH are taken from
the papr specificiation hence need to use the same name.
>
>> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Ira Weiny <ira.weiny@intel.com>
>> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
>> ---
>> Changelog:
>> v7..v8:
>> * Added a clarification on bit-ordering of Health Bitmap
>>
>> Resend:
>> * None
>>
>> v6..v7:
>> * None
>>
>> v5..v6:
>> * New patch in the series
>> ---
>> Documentation/powerpc/papr_hcalls.rst | 45 ++++++++++++++++++++++++---
>> 1 file changed, 41 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/powerpc/papr_hcalls.rst b/Documentation/powerpc/papr_hcalls.rst
>> index 3493631a60f8..45063f305813 100644
>> --- a/Documentation/powerpc/papr_hcalls.rst
>> +++ b/Documentation/powerpc/papr_hcalls.rst
>> @@ -220,13 +220,50 @@ from the LPAR memory.
>> **H_SCM_HEALTH**
>>
>> | Input: drcIndex
>> -| Out: *health-bitmap, health-bit-valid-bitmap*
>> +| Out: *health-bitmap (r4), health-bit-valid-bitmap (r5)*
>> | Return Value: *H_Success, H_Parameter, H_Hardware*
>>
>> Given a DRC Index return the info on predictive failure and overall health of
>> -the NVDIMM. The asserted bits in the health-bitmap indicate a single predictive
>> -failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
>> -valid.
>> +the NVDIMM. The asserted bits in the health-bitmap indicate one or more states
>> +(described in table below) of the NVDIMM and health-bit-valid-bitmap indicate
>> +which bits in health-bitmap are valid. The bits are reported in
>> +reverse bit ordering for example a value of 0xC400000000000000
>> +indicates bits 0, 1, and 5 are valid.
>> +
>> +Health Bitmap Flags:
>> +
>> ++------+-----------------------------------------------------------------------+
>> +| Bit | Definition |
>> ++======+=======================================================================+
>> +| 00 | SCM device is unable to persist memory contents. |
>> +| | If the system is powered down, nothing will be saved. |
>> ++------+-----------------------------------------------------------------------+
>> +| 01 | SCM device failed to persist memory contents. Either contents were not|
>> +| | saved successfully on power down or were not restored properly on |
>> +| | power up. |
>> ++------+-----------------------------------------------------------------------+
>> +| 02 | SCM device contents are persisted from previous IPL. The data from |
>> +| | the last boot were successfully restored. |
>> ++------+-----------------------------------------------------------------------+
>> +| 03 | SCM device contents are not persisted from previous IPL. There was no |
>> +| | data to restore from the last boot. |
>> ++------+-----------------------------------------------------------------------+
>> +| 04 | SCM device memory life remaining is critically low |
>> ++------+-----------------------------------------------------------------------+
>> +| 05 | SCM device will be garded off next IPL due to failure |
>> ++------+-----------------------------------------------------------------------+
>> +| 06 | SCM contents cannot persist due to current platform health status. A |
>> +| | hardware failure may prevent data from being saved or restored. |
>> ++------+-----------------------------------------------------------------------+
>> +| 07 | SCM device is unable to persist memory contents in certain conditions |
>> ++------+-----------------------------------------------------------------------+
>> +| 08 | SCM device is encrypted |
>> ++------+-----------------------------------------------------------------------+
>> +| 09 | SCM device has successfully completed a requested erase or secure |
>> +| | erase procedure. |
>> ++------+-----------------------------------------------------------------------+
>> +|10:63 | Reserved / Unused |
>> ++------+-----------------------------------------------------------------------+
>>
>> **H_SCM_PERFORMANCE_STATS**
>>
>> --
>> 2.26.2
>>
--
Cheers
~ Vaibhav
^ permalink raw reply
* [PATCH] powerpc: Fix misleading small cores print
From: Michael Neuling @ 2020-05-28 23:07 UTC (permalink / raw)
To: mpe; +Cc: Gautham R . Shenoy, Michael Neuling, linuxppc-dev
Currently when we boot on a big core system, we get this print:
[ 0.040500] Using small cores at SMT level
This is misleading as we've actually detected big cores.
This patch clears up the print to say we've detect big cores but are
using small cores for scheduling.
Signed-off-by: Michael Neuling <mikey@neuling.org>
---
arch/powerpc/kernel/smp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 6d2a3a3666..c820c95162 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1383,7 +1383,7 @@ void __init smp_cpus_done(unsigned int max_cpus)
#ifdef CONFIG_SCHED_SMT
if (has_big_cores) {
- pr_info("Using small cores at SMT level\n");
+ pr_info("Big cores detected but using small core scheduling\n");
power9_topology[0].mask = smallcore_smt_mask;
powerpc_topology[0].mask = smallcore_smt_mask;
}
--
2.26.2
^ permalink raw reply related
* Re: [PATCH] powerpc/bpf: Enable bpf_probe_read{, str}() on powerpc again
From: Michael Ellerman @ 2020-05-29 0:05 UTC (permalink / raw)
To: Daniel Borkmann, Petr Mladek
Cc: bpf, Alexei Starovoitov, linux-kernel, Paul Mackerras,
Masami Hiramatsu, Brendan Gregg, Miroslav Benes, linuxppc-dev,
Christoph Hellwig
In-Reply-To: <aace2e9e-c63c-a1a2-a1e1-c7a46904e8c5@iogearbox.net>
Daniel Borkmann <daniel@iogearbox.net> writes:
> On 5/28/20 2:23 PM, Michael Ellerman wrote:
>> Petr Mladek <pmladek@suse.com> writes:
>>> On Thu 2020-05-28 11:03:43, Michael Ellerman wrote:
>>>> Petr Mladek <pmladek@suse.com> writes:
>>>>> The commit 0ebeea8ca8a4d1d453a ("bpf: Restrict bpf_probe_read{, str}() only
>>>>> to archs where they work") caused that bpf_probe_read{, str}() functions
>>>>> were not longer available on architectures where the same logical address
>>>>> might have different content in kernel and user memory mapping. These
>>>>> architectures should use probe_read_{user,kernel}_str helpers.
>>>>>
>>>>> For backward compatibility, the problematic functions are still available
>>>>> on architectures where the user and kernel address spaces are not
>>>>> overlapping. This is defined CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE.
>>>>>
>>>>> At the moment, these backward compatible functions are enabled only
>>>>> on x86_64, arm, and arm64. Let's do it also on powerpc that has
>>>>> the non overlapping address space as well.
>>>>>
>>>>> Signed-off-by: Petr Mladek <pmladek@suse.com>
>>>>
>>>> This seems like it should have a Fixes: tag and go into v5.7?
>>>
>>> Good point:
>>>
>>> Fixes: commit 0ebeea8ca8a4d1d4 ("bpf: Restrict bpf_probe_read{, str}() only to archs where they work")
>>>
>>> And yes, it should ideally go into v5.7 either directly or via stable.
>>>
>>> Should I resend the patch with Fixes and
>>> Cc: stable@vger.kernel.org #v45.7 lines, please?
>>
>> If it goes into v5.7 then it doesn't need a Cc: stable, and I guess a
>> Fixes: tag is nice to have but not so important as it already mentions
>> the commit that caused the problem. So a resend probably isn't
>> necessary.
>>
>> Acked-by: Michael Ellerman <mpe@ellerman.id.au>
>>
>> Daniel can you pick this up, or should I?
>
> Yeah I'll take it into bpf tree for v5.7.
Thanks.
cheers
^ permalink raw reply
* [powerpc:next-test] BUILD SUCCESS bc26d22277c297c35013700e40f276e37b991980
From: kbuild test robot @ 2020-05-29 0:37 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test
branch HEAD: bc26d22277c297c35013700e40f276e37b991980 powerpc/pseries: Update hv-24x7 information after migration
Warning in current branch:
kernel/events/hw_breakpoint.c:216:12: warning: no previous prototype for 'arch_reserve_bp_slot' [-Wmissing-prototypes]
kernel/events/hw_breakpoint.c:221:13: warning: no previous prototype for 'arch_release_bp_slot' [-Wmissing-prototypes]
Warning ids grouped by kconfigs:
recent_errors
|-- i386-allnoconfig
| |-- kernel-events-hw_breakpoint.c:warning:no-previous-prototype-for-arch_release_bp_slot
| `-- kernel-events-hw_breakpoint.c:warning:no-previous-prototype-for-arch_reserve_bp_slot
|-- i386-allyesconfig
| |-- kernel-events-hw_breakpoint.c:warning:no-previous-prototype-for-arch_release_bp_slot
| `-- kernel-events-hw_breakpoint.c:warning:no-previous-prototype-for-arch_reserve_bp_slot
|-- i386-debian-10.3
| |-- kernel-events-hw_breakpoint.c:warning:no-previous-prototype-for-arch_release_bp_slot
| `-- kernel-events-hw_breakpoint.c:warning:no-previous-prototype-for-arch_reserve_bp_slot
|-- i386-defconfig
| |-- kernel-events-hw_breakpoint.c:warning:no-previous-prototype-for-arch_release_bp_slot
| `-- kernel-events-hw_breakpoint.c:warning:no-previous-prototype-for-arch_reserve_bp_slot
|-- i386-randconfig-c001-20200526
| |-- kernel-events-hw_breakpoint.c:warning:no-previous-prototype-for-arch_release_bp_slot
| `-- kernel-events-hw_breakpoint.c:warning:no-previous-prototype-for-arch_reserve_bp_slot
|-- sh-allmodconfig
| |-- kernel-events-hw_breakpoint.c:warning:no-previous-prototype-for-arch_release_bp_slot
| `-- kernel-events-hw_breakpoint.c:warning:no-previous-prototype-for-arch_reserve_bp_slot
|-- sh-allnoconfig
| |-- kernel-events-hw_breakpoint.c:warning:no-previous-prototype-for-arch_release_bp_slot
| `-- kernel-events-hw_breakpoint.c:warning:no-previous-prototype-for-arch_reserve_bp_slot
`-- x86_64-randconfig-c002-20200526
|-- kernel-events-hw_breakpoint.c:warning:no-previous-prototype-for-arch_release_bp_slot
`-- kernel-events-hw_breakpoint.c:warning:no-previous-prototype-for-arch_reserve_bp_slot
elapsed time: 3491m
configs tested: 79
configs skipped: 1
The following configs have been built successfully.
More configs may be tested in the coming days.
arm64 allyesconfig
arm64 defconfig
arm64 allmodconfig
arm64 allnoconfig
arm defconfig
arm allyesconfig
arm allmodconfig
arm allnoconfig
i386 allyesconfig
i386 defconfig
i386 debian-10.3
i386 allnoconfig
ia64 allmodconfig
ia64 defconfig
ia64 allnoconfig
ia64 allyesconfig
m68k allmodconfig
m68k allnoconfig
m68k sun3_defconfig
m68k defconfig
m68k allyesconfig
nios2 defconfig
nios2 allyesconfig
openrisc defconfig
c6x allyesconfig
c6x allnoconfig
openrisc allyesconfig
nds32 defconfig
nds32 allnoconfig
csky allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
xtensa defconfig
h8300 allyesconfig
h8300 allmodconfig
arc defconfig
arc allyesconfig
sh allmodconfig
sh allnoconfig
microblaze allnoconfig
mips allyesconfig
mips allnoconfig
mips allmodconfig
parisc allnoconfig
parisc defconfig
parisc allyesconfig
parisc allmodconfig
powerpc defconfig
powerpc allyesconfig
powerpc rhel-kconfig
powerpc allmodconfig
powerpc allnoconfig
riscv allyesconfig
riscv allnoconfig
riscv defconfig
riscv allmodconfig
s390 allnoconfig
s390 defconfig
s390 allyesconfig
s390 allmodconfig
sparc allyesconfig
sparc defconfig
sparc64 defconfig
sparc64 allnoconfig
sparc64 allyesconfig
sparc64 allmodconfig
um allmodconfig
um allyesconfig
um allnoconfig
um defconfig
x86_64 rhel
x86_64 rhel-7.6
x86_64 rhel-7.6-kselftests
x86_64 rhel-7.2-clear
x86_64 lkp
x86_64 fedora-25
x86_64 kexec
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
^ permalink raw reply
* [PATCH] powerpc/nvram: Replace kmalloc with kzalloc in the error message
From: Yi Wang @ 2020-05-29 1:02 UTC (permalink / raw)
To: mpe
Cc: wang.yi59, tony.luck, keescook, wang.liang82, gregkh, anton,
linux-kernel, paulus, Liao Pingfang, xue.zhihong, ccross, tglx,
linuxppc-dev, allison
From: Liao Pingfang <liao.pingfang@zte.com.cn>
Use kzalloc instead of kmalloc in the error message according to
the previous kzalloc() call.
Signed-off-by: Liao Pingfang <liao.pingfang@zte.com.cn>
---
arch/powerpc/kernel/nvram_64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
index fb4f610..c3a0c8d 100644
--- a/arch/powerpc/kernel/nvram_64.c
+++ b/arch/powerpc/kernel/nvram_64.c
@@ -892,7 +892,7 @@ loff_t __init nvram_create_partition(const char *name, int sig,
/* Create our OS partition */
new_part = kzalloc(sizeof(*new_part), GFP_KERNEL);
if (!new_part) {
- pr_err("%s: kmalloc failed\n", __func__);
+ pr_err("%s: kzalloc failed\n", __func__);
return -ENOMEM;
}
--
2.9.5
^ permalink raw reply related
* Re: [RFC PATCH 1/4] powerpc/64s: Don't init FSCR_DSCR in __init_FSCR()
From: Alistair Popple @ 2020-05-29 1:24 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev, mikey, npiggin, jniethe5
In-Reply-To: <20200527145843.2761782-1-mpe@ellerman.id.au>
For what it's worth I tested this series on Mambo PowerNV and it seems to
correctly enable/disable the prefix FSCR bit based on the cpu feature so feel
free to add:
Tested-by: Alistair Popple <alistair@popple.id.au>
Mikey is going to test out pseries.
- Alistair
On Thursday, 28 May 2020 12:58:40 AM AEST Michael Ellerman wrote:
> __init_FSCR() was added originally in commit 2468dcf641e4 ("powerpc:
> Add support for context switching the TAR register") (Feb 2013), and
> only set FSCR_TAR.
>
> At that point FSCR (Facility Status and Control Register) was not
> context switched, so the setting was permanent after boot.
>
> Later we added initialisation of FSCR_DSCR to __init_FSCR(), in commit
> 54c9b2253d34 ("powerpc: Set DSCR bit in FSCR setup") (Mar 2013), again
> that was permanent after boot.
>
> Then commit 2517617e0de6 ("powerpc: Fix context switch DSCR on
> POWER8") (Aug 2013) added a limited context switch of FSCR, just the
> FSCR_DSCR bit was context switched based on thread.dscr_inherit. That
> commit said "This clears the H/FSCR DSCR bit initially", but it
> didn't, it left the initialisation of FSCR_DSCR in __init_FSCR().
> However the initial context switch from init_task to pid 1 would clear
> FSCR_DSCR because thread.dscr_inherit was 0.
>
> That commit also introduced the requirement that FSCR_DSCR be clear
> for user processes, so that we can take the facility unavailable
> interrupt in order to manage dscr_inherit.
>
> Then in commit 152d523e6307 ("powerpc: Create context switch helpers
> save_sprs() and restore_sprs()") (Dec 2015) FSCR was added to
> thread_struct. However it still wasn't fully context switched, we just
> took the existing value and set FSCR_DSCR if the new thread had
> dscr_inherit set. FSCR was still initialised at boot to FSCR_DSCR |
> FSCR_TAR, but that value was not propagated into the thread_struct, so
> the initial context switch set FSCR_DSCR back to 0.
>
> Finally commit b57bd2de8c6c ("powerpc: Improve FSCR init and context
> switching") (Jun 2016) added a full context switch of the FSCR, and
> added an initialisation of init_task.thread.fscr to FSCR_TAR |
> FSCR_EBB, but omitted FSCR_DSCR.
>
> The end result is that swapper runs with FSCR_DSCR set because of the
> initialisation in __init_FSCR(), but no other processes do, they use
> the value from init_task.thread.fscr.
>
> Having FSCR_DSCR set for swapper allows it to access SPR 3 from
> userspace, but swapper never runs userspace, so it has no useful
> effect. It's also confusing to have the value initialised in two
> places to two different values.
>
> So remove FSCR_DSCR from __init_FSCR(), this at least gets us to the
> point where there's a single value of FSCR, even if it's still set in
> two places.
>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
> arch/powerpc/kernel/cpu_setup_power.S | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/cpu_setup_power.S
> b/arch/powerpc/kernel/cpu_setup_power.S index a460298c7ddb..f91ecb10d0ae
> 100644
> --- a/arch/powerpc/kernel/cpu_setup_power.S
> +++ b/arch/powerpc/kernel/cpu_setup_power.S
> @@ -184,7 +184,7 @@ _GLOBAL(__restore_cpu_power9)
>
> __init_FSCR:
> mfspr r3,SPRN_FSCR
> - ori r3,r3,FSCR_TAR|FSCR_DSCR|FSCR_EBB
> + ori r3,r3,FSCR_TAR|FSCR_EBB
> mtspr SPRN_FSCR,r3
> blr
^ permalink raw reply
* Re: [PATCH v2] selftests: powerpc: Add test for execute-disabled pkeys
From: Michael Ellerman @ 2020-05-29 1:48 UTC (permalink / raw)
To: Sandipan Das
Cc: fweimer, aneesh.kumar, linuxram, linux-mm, linux-kselftest,
linuxppc-dev, bauerman
In-Reply-To: <20200527030342.13712-1-sandipan@linux.ibm.com>
Hi Sandipan,
A few comments below ...
Sandipan Das <sandipan@linux.ibm.com> writes:
> Apart from read and write access, memory protection keys can
> also be used for restricting execute permission of pages on
> powerpc. This adds a test to verify if the feature works as
> expected.
>
> Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
> ---
>
> Previous versions can be found at
> v1: https://lore.kernel.org/linuxppc-dev/20200508162332.65316-1-sandipan@linux.ibm.com/
>
> Changes in v2:
> - Added .gitignore entry for test binary.
> - Fixed builds for older distros where siginfo_t might not have si_pkey as
> a formal member based on discussion with Michael.
>
> ---
> tools/testing/selftests/powerpc/mm/.gitignore | 1 +
> tools/testing/selftests/powerpc/mm/Makefile | 3 +-
> .../selftests/powerpc/mm/pkey_exec_prot.c | 336 ++++++++++++++++++
> 3 files changed, 339 insertions(+), 1 deletion(-)
> create mode 100644 tools/testing/selftests/powerpc/mm/pkey_exec_prot.c
>
> diff --git a/tools/testing/selftests/powerpc/mm/.gitignore b/tools/testing/selftests/powerpc/mm/.gitignore
> index 2ca523255b1b..8f841f925baa 100644
> --- a/tools/testing/selftests/powerpc/mm/.gitignore
> +++ b/tools/testing/selftests/powerpc/mm/.gitignore
> @@ -8,3 +8,4 @@ wild_bctr
> large_vm_fork_separation
> bad_accesses
> tlbie_test
> +pkey_exec_prot
> diff --git a/tools/testing/selftests/powerpc/mm/Makefile b/tools/testing/selftests/powerpc/mm/Makefile
> index b9103c4bb414..2816229f648b 100644
> --- a/tools/testing/selftests/powerpc/mm/Makefile
> +++ b/tools/testing/selftests/powerpc/mm/Makefile
> @@ -3,7 +3,7 @@ noarg:
> $(MAKE) -C ../
>
> TEST_GEN_PROGS := hugetlb_vs_thp_test subpage_prot prot_sao segv_errors wild_bctr \
> - large_vm_fork_separation bad_accesses
> + large_vm_fork_separation bad_accesses pkey_exec_prot
> TEST_GEN_PROGS_EXTENDED := tlbie_test
> TEST_GEN_FILES := tempfile
>
> @@ -17,6 +17,7 @@ $(OUTPUT)/prot_sao: ../utils.c
> $(OUTPUT)/wild_bctr: CFLAGS += -m64
> $(OUTPUT)/large_vm_fork_separation: CFLAGS += -m64
> $(OUTPUT)/bad_accesses: CFLAGS += -m64
> +$(OUTPUT)/pkey_exec_prot: CFLAGS += -m64
>
> $(OUTPUT)/tempfile:
> dd if=/dev/zero of=$@ bs=64k count=1
> diff --git a/tools/testing/selftests/powerpc/mm/pkey_exec_prot.c b/tools/testing/selftests/powerpc/mm/pkey_exec_prot.c
> new file mode 100644
> index 000000000000..147fb9ed47d5
> --- /dev/null
> +++ b/tools/testing/selftests/powerpc/mm/pkey_exec_prot.c
> @@ -0,0 +1,336 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +
> +/*
> + * Copyright 2020, Sandipan Das, IBM Corp.
> + *
> + * Test if applying execute protection on pages using memory
> + * protection keys works as expected.
> + */
> +
> +#define _GNU_SOURCE
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <signal.h>
> +
> +#include <time.h>
> +#include <unistd.h>
> +#include <sys/mman.h>
> +
> +#include "utils.h"
> +
> +/* Override definitions as they might be inconsistent */
Can you please expand the comment to say why/where you've seen problems,
so one day we can drop these once those old libcs are no longer around.
> +#undef PKEY_DISABLE_ACCESS
> +#define PKEY_DISABLE_ACCESS 0x3
> +
> +#undef PKEY_DISABLE_WRITE
> +#define PKEY_DISABLE_WRITE 0x2
> +
> +#undef PKEY_DISABLE_EXECUTE
> +#define PKEY_DISABLE_EXECUTE 0x4
> +
> +/* Older distros might not define this */
> +#ifndef SEGV_PKUERR
> +#define SEGV_PKUERR 4
> +#endif
> +
> +#define SI_PKEY_OFFSET 0x20
> +
> +#define SYS_pkey_mprotect 386
> +#define SYS_pkey_alloc 384
> +#define SYS_pkey_free 385
> +
> +#define PKEY_BITS_PER_PKEY 2
> +#define NR_PKEYS 32
> +
> +#define PKEY_BITS_MASK ((1UL << PKEY_BITS_PER_PKEY) - 1)
If you include "reg.h" then there's a mfspr()/mtspr() macro you can use.
> +static unsigned long pkeyreg_get(void)
> +{
> + unsigned long uamr;
The SPR is AMR not uamr?
> + asm volatile("mfspr %0, 0xd" : "=r"(uamr));
> + return uamr;
> +}
> +
> +static void pkeyreg_set(unsigned long uamr)
> +{
> + asm volatile("isync; mtspr 0xd, %0; isync;" : : "r"(uamr));
> +}
You can use mtspr() there, but you'll need to add the isync's yourself.
> +static void pkey_set_rights(int pkey, unsigned long rights)
> +{
> + unsigned long uamr, shift;
> +
> + shift = (NR_PKEYS - pkey - 1) * PKEY_BITS_PER_PKEY;
> + uamr = pkeyreg_get();
> + uamr &= ~(PKEY_BITS_MASK << shift);
> + uamr |= (rights & PKEY_BITS_MASK) << shift;
> + pkeyreg_set(uamr);
> +}
> +
> +static int sys_pkey_mprotect(void *addr, size_t len, int prot, int pkey)
> +{
> + return syscall(SYS_pkey_mprotect, addr, len, prot, pkey);
> +}
> +
> +static int sys_pkey_alloc(unsigned long flags, unsigned long rights)
> +{
> + return syscall(SYS_pkey_alloc, flags, rights);
> +}
> +
> +static int sys_pkey_free(int pkey)
> +{
> + return syscall(SYS_pkey_free, pkey);
> +}
> +
> +static volatile int fpkey, fcode, ftype, faults;
The "proper" type to use for things accessed in signal handlers is
volatile sig_atomic_t, which should work here AFACIS.
> +static unsigned long pgsize, numinsns;
> +static volatile unsigned int *faddr;
> +static unsigned int *insns;
> +
> +static void segv_handler(int signum, siginfo_t *sinfo, void *ctx)
> +{
> + int pkey;
> +
> +#ifdef si_pkey
> + pkey = sinfo->si_pkey;
> +#else
> + pkey = *((int *)(((char *) sinfo) + SI_PKEY_OFFSET));
> +#endif
> +
> + /* Check if this fault originated because of the expected reasons */
> + if (sinfo->si_code != SEGV_ACCERR && sinfo->si_code != SEGV_PKUERR) {
> + printf("got an unexpected fault, code = %d\n",
> + sinfo->si_code);
printf() isn't signal safe, so this is a bit dicey. You can call
write(2) if you really want to.
If this is an unexpected condition you might better to just call
_exit(1) to bail out.
> + goto fail;
> + }
> +
> + /* Check if this fault originated from the expected address */
> + if (sinfo->si_addr != (void *) faddr) {
> + printf("got an unexpected fault, addr = %p\n",
> + sinfo->si_addr);
> + goto fail;
> + }
> +
> + /* Check if the expected number of faults has been exceeded */
> + if (faults == 0)
> + goto fail;
> +
> + fcode = sinfo->si_code;
> +
> + /* Restore permissions in order to continue */
> + switch (fcode) {
> + case SEGV_ACCERR:
> + if (mprotect(insns, pgsize, PROT_READ | PROT_WRITE)) {
mprotect() also isn't listed as being signal safe, though I don't see
why not. So that's probably fine for test code. We could always call the
syscall directly if necessary.
> + perror("mprotect");
> + goto fail;
> + }
> + break;
> + case SEGV_PKUERR:
> + if (pkey != fpkey)
> + goto fail;
> +
> + if (ftype == PKEY_DISABLE_ACCESS) {
> + pkey_set_rights(fpkey, 0);
> + } else if (ftype == PKEY_DISABLE_EXECUTE) {
> + /*
> + * Reassociate the exec-only pkey with the region
> + * to be able to continue. Unlike AMR, we cannot
> + * set IAMR directly from userspace to restore the
> + * permissions.
> + */
> + if (mprotect(insns, pgsize, PROT_EXEC)) {
> + perror("mprotect");
> + goto fail;
> + }
> + } else {
> + goto fail;
> + }
> + break;
> + }
> +
> + faults--;
> + return;
> +
> +fail:
> + /* Restore all page permissions to avoid repetitive faults */
> + if (mprotect(insns, pgsize, PROT_READ | PROT_WRITE | PROT_EXEC))
> + perror("mprotect");
> + if (sinfo->si_code == SEGV_PKUERR)
> + pkey_set_rights(pkey, 0);
> + faults = -1; /* Something unexpected happened */
> +}
> +
> +static int pkeys_unsupported(void)
> +{
> + bool using_hash = false;
> + char line[128];
> + int pkey;
> + FILE *f;
> +
> + f = fopen("/proc/cpuinfo", "r");
> + FAIL_IF(!f);
> +
> + /* Protection keys are currently supported on Hash MMU only */
> + while (fgets(line, sizeof(line), f)) {
> + if (strcmp(line, "MMU : Hash\n") == 0) {
> + using_hash = true;
> + break;
> + }
> + }
We already have using_hash_mmu() in the bad_accesses.c test.
Can you move using_hash_mmu() into
tools/testing/selftests/powerpc/utils.c, and declare it in
tools/testing/selftests/powerpc/include/utils.h and then use it in your
test.
> + fclose(f);
> + SKIP_IF(!using_hash);
> +
> + /* Check if the system call is supported */
> + pkey = sys_pkey_alloc(0, 0);
> + SKIP_IF(pkey < 0);
> + sys_pkey_free(pkey);
> +
> + return 0;
> +}
> +
> +static int test(void)
> +{
> + struct sigaction act;
> + int pkey, ret, i;
> +
> + ret = pkeys_unsupported();
> + if (ret)
> + return ret;
> +
> + /* Setup signal handler */
> + act.sa_handler = 0;
> + act.sa_sigaction = segv_handler;
> + FAIL_IF(sigprocmask(SIG_SETMASK, 0, &act.sa_mask) != 0);
> + act.sa_flags = SA_SIGINFO;
> + act.sa_restorer = 0;
> + FAIL_IF(sigaction(SIGSEGV, &act, NULL) != 0);
> +
> + /* Setup executable region */
> + pgsize = sysconf(_SC_PAGESIZE);
getpagesize() is cleaner.
> + numinsns = pgsize / sizeof(unsigned int);
> + insns = (unsigned int *) mmap(NULL, pgsize, PROT_READ | PROT_WRITE,
> + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> + FAIL_IF(insns == MAP_FAILED);
> +
> + /* Write the instruction words */
> + for (i = 0; i < numinsns - 1; i++)
> + insns[i] = 0x60000000; /* nop */
> +
> + /*
> + * Later, to jump to the executable region, we use a linked
> + * branch which sets the return address automatically in LR.
"linked branch" is usually called "branch and link".
> + * Use that to return back.
> + */
> + insns[numinsns - 1] = 0x4e800020; /* blr */
> +
> + /* Allocate a pkey that restricts execution */
> + pkey = sys_pkey_alloc(0, PKEY_DISABLE_EXECUTE);
> + FAIL_IF(pkey < 0);
> +
> + /*
> + * Pick a random instruction address from the executable
> + * region.
> + */
> + srand(time(NULL));
> + faddr = &insns[rand() % (numinsns - 1)];
I'm not really sure the randomisation adds much, given it's only
randomised within the page and the protections only operate at page
granularity.
> +
> + /* The following two cases will avoid SEGV_PKUERR */
> + ftype = -1;
> + fpkey = -1;
> +
> + /*
> + * Read an instruction word from the address when AMR bits
> + * are not set.
You should explain for people who aren't familiar with the ISA that "AMR
bits not set" means "read/write access allowed".
> + *
> + * This should not generate a fault as having PROT_EXEC
> + * implicitly allows reads. The pkey currently restricts
Whether PROT_EXEC implies read is not well defined (see the man page).
If you want to test this case I think you'd be better off specifying
PROT_EXEC | PROT_READ explicitly.
> + * execution only based on the IAMR bits. The AMR bits are
> + * cleared.
> + */
> + faults = 0;
> + FAIL_IF(sys_pkey_mprotect(insns, pgsize, PROT_EXEC, pkey) != 0);
> + printf("read from %p, pkey is execute-disabled\n", (void *) faddr);
> + i = *faddr;
> + FAIL_IF(faults != 0);
> +
> + /*
> + * Write an instruction word to the address when AMR bits
> + * are not set.
> + *
> + * This should generate an access fault as having just
> + * PROT_EXEC also restricts writes. The pkey currently
OK that one is correct, PROT_EXEC without PROT_WRITE must prevent writes.
> + * restricts execution only based on the IAMR bits. The
> + * AMR bits are cleared.
> + */
> + faults = 1;
> + FAIL_IF(sys_pkey_mprotect(insns, pgsize, PROT_EXEC, pkey) != 0);
> + printf("write to %p, pkey is execute-disabled\n", (void *) faddr);
> + *faddr = 0x60000000; /* nop */
faddr is already == nop because you set the entire page to nops previously.
It would be a more convincing test if you set faddr to a trap at the
beginning, that way later when you execute it you can test that the
write of the nop succeeded.
> + FAIL_IF(faults != 0 || fcode != SEGV_ACCERR);
> +
> + /* The following three cases will generate SEGV_PKUERR */
> + ftype = PKEY_DISABLE_ACCESS;
> + fpkey = pkey;
> +
> + /*
> + * Read an instruction word from the address when AMR bits
> + * are set.
> + *
> + * This should generate a pkey fault based on AMR bits only
> + * as having PROT_EXEC implicitly allows reads.
Again would be better to specify PROT_READ IMHO.
> + */
> + faults = 1;
> + FAIL_IF(sys_pkey_mprotect(insns, pgsize, PROT_EXEC, pkey) != 0);
> + printf("read from %p, pkey is execute-disabled, access-disabled\n",
> + (void *) faddr);
> + pkey_set_rights(pkey, PKEY_DISABLE_ACCESS);
> + i = *faddr;
> + FAIL_IF(faults != 0 || fcode != SEGV_PKUERR);
> +
> + /*
> + * Write an instruction word to the address when AMR bits
> + * are set.
> + *
> + * This should generate two faults. First, a pkey fault based
> + * on AMR bits and then an access fault based on PROT_EXEC.
> + */
> + faults = 2;
Setting faults to the expected value and decrementing it in the fault
handler is kind of weird.
I think it would be clearer if faults was just a zero-based counter of
how many faults we've taken, and then you test that it's == 2 below.
> + FAIL_IF(sys_pkey_mprotect(insns, pgsize, PROT_EXEC, pkey) != 0);
> + printf("write to %p, pkey is execute-disabled, access-disabled\n",
> + (void *) faddr);
> + pkey_set_rights(pkey, PKEY_DISABLE_ACCESS);
> + *faddr = 0x60000000; /* nop */
> + FAIL_IF(faults != 0 || fcode != SEGV_ACCERR);
ie. FAIL_IF(faults != 2 || ... )
> + /*
> + * Jump to the executable region. This should generate a pkey
> + * fault based on IAMR bits. AMR bits will not affect execution.
> + */
> + faddr = insns;
> + ftype = PKEY_DISABLE_EXECUTE;
> + fpkey = pkey;
> + faults = 1;
> + FAIL_IF(sys_pkey_mprotect(insns, pgsize, PROT_EXEC, pkey) != 0);
> + pkey_set_rights(pkey, PKEY_DISABLE_ACCESS);
> + printf("execute at %p, ", (void *) faddr);
> + printf("pkey is execute-disabled, access-disabled\n");
> +
> + /* Branch into the executable region */
> + asm volatile("mtctr %0" : : "r"((unsigned long) insns));
> + asm volatile("bctrl");
I'm not sure that's safe, they should be part of a single asm block.
> + FAIL_IF(faults != 0 || fcode != SEGV_PKUERR);
I think as a final test you should remove the protections and confirm
you can successfully execute from the insns page.
> + /* Cleanup */
> + munmap((void *) insns, pgsize);
> + sys_pkey_free(pkey);
> +
> + return 0;
> +}
cheers
^ permalink raw reply
* Re: powerpc/pci: [PATCH 1/1 V3] PCIE PHB reset
From: Oliver O'Halloran @ 2020-05-29 2:56 UTC (permalink / raw)
To: wenxiong; +Cc: Brian King, linuxppc-dev, wenxiong
In-Reply-To: <1590499319-6472-1-git-send-email-wenxiong@linux.vnet.ibm.com>
On Wed, May 27, 2020 at 12:48 AM <wenxiong@linux.vnet.ibm.com> wrote:
>
> From: Wen Xiong <wenxiong@linux.vnet.ibm.com>
>
> Several device drivers hit EEH(Extended Error handling) when triggering
> kdump on Pseries PowerVM. This patch implemented a reset of the PHBs
> in pci general code when triggering kdump. PHB reset stop all PCI
> transactions from normal kernel. We have tested the patch in several
> enviroments:
> - direct slot adapters
> - adapters under the switch
> - a VF adapter in PowerVM
> - a VF adapter/adapter in KVM guest.
>
> Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Can you include a patch changelog in future. It's helpful to see what
we need to look at specificly when you post a new revision of a patch.
Looks good to me otherwise.
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
^ permalink raw reply
* Re: powerpc/pci: [PATCH 1/1 V3] PCIE PHB reset
From: Sam Bobroff @ 2020-05-29 3:17 UTC (permalink / raw)
To: wenxiong; +Cc: brking, oohall, linuxppc-dev, wenxiong
In-Reply-To: <1590499319-6472-1-git-send-email-wenxiong@linux.vnet.ibm.com>
[-- Attachment #1: Type: text/plain, Size: 6543 bytes --]
On Tue, May 26, 2020 at 08:21:59AM -0500, wenxiong@linux.vnet.ibm.com wrote:
> From: Wen Xiong <wenxiong@linux.vnet.ibm.com>
>
> Several device drivers hit EEH(Extended Error handling) when triggering
> kdump on Pseries PowerVM. This patch implemented a reset of the PHBs
> in pci general code when triggering kdump. PHB reset stop all PCI
> transactions from normal kernel. We have tested the patch in several
> enviroments:
> - direct slot adapters
> - adapters under the switch
> - a VF adapter in PowerVM
> - a VF adapter/adapter in KVM guest.
>
> Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Looks good to me.
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
(It would be easier to review if you included a patchset change log and
CC'd people who reviewed earlier versions.)
Cheers,
Sam.
>
> ---
> arch/powerpc/platforms/pseries/pci.c | 152 +++++++++++++++++++++++++++
> 1 file changed, 152 insertions(+)
>
> diff --git a/arch/powerpc/platforms/pseries/pci.c b/arch/powerpc/platforms/pseries/pci.c
> index 911534b89c85..cb7e4276cf04 100644
> --- a/arch/powerpc/platforms/pseries/pci.c
> +++ b/arch/powerpc/platforms/pseries/pci.c
> @@ -11,6 +11,8 @@
> #include <linux/kernel.h>
> #include <linux/pci.h>
> #include <linux/string.h>
> +#include <linux/crash_dump.h>
> +#include <linux/delay.h>
>
> #include <asm/eeh.h>
> #include <asm/pci-bridge.h>
> @@ -354,3 +356,153 @@ int pseries_root_bridge_prepare(struct pci_host_bridge *bridge)
>
> return 0;
> }
> +
> +/**
> + * pseries_get_pdn_addr - Retrieve PHB address
> + * @pe: EEH PE
> + *
> + * Retrieve the assocated PHB address. Actually, there're 2 RTAS
> + * function calls dedicated for the purpose. We need implement
> + * it through the new function and then the old one. Besides,
> + * you should make sure the config address is figured out from
> + * FDT node before calling the function.
> + *
> + */
> +static int pseries_get_pdn_addr(struct pci_controller *phb)
> +{
> + int ret = -1;
> + int rets[3];
> + int ibm_get_config_addr_info;
> + int ibm_get_config_addr_info2;
> + int config_addr = 0;
> + struct pci_dn *root_pdn, *pdn;
> +
> + ibm_get_config_addr_info2 = rtas_token("ibm,get-config-addr-info2");
> + ibm_get_config_addr_info = rtas_token("ibm,get-config-addr-info");
> +
> + root_pdn = PCI_DN(phb->dn);
> + pdn = list_first_entry(&root_pdn->child_list, struct pci_dn, list);
> + config_addr = (pdn->busno << 16) | (pdn->devfn << 8);
> +
> + if (ibm_get_config_addr_info2 != RTAS_UNKNOWN_SERVICE) {
> + /*
> + * First of all, we need to make sure there has one PE
> + * associated with the device. If option is 1, it
> + * queries if config address is supported in a PE or not.
> + * If option is 0, it returns PE config address or config
> + * address for the PE primary bus.
> + */
> + ret = rtas_call(ibm_get_config_addr_info2, 4, 2, rets,
> + config_addr, BUID_HI(pdn->phb->buid),
> + BUID_LO(pdn->phb->buid), 1);
> + if (ret || (rets[0] == 0)) {
> + pr_warn("%s: Failed to get address for PHB#%x-PE# option=%d config_addr=%x\n",
> + __func__, pdn->phb->global_number, 1, rets[0]);
> + return -1;
> + }
> +
> + /* Retrieve the associated PE config address */
> + ret = rtas_call(ibm_get_config_addr_info2, 4, 2, rets,
> + config_addr, BUID_HI(pdn->phb->buid),
> + BUID_LO(pdn->phb->buid), 0);
> + if (ret) {
> + pr_warn("%s: Failed to get address for PHB#%x-PE# option=%d config_addr=%x\n",
> + __func__, pdn->phb->global_number, 0, rets[0]);
> + return -1;
> + }
> + return rets[0];
> + }
> +
> + if (ibm_get_config_addr_info != RTAS_UNKNOWN_SERVICE) {
> + ret = rtas_call(ibm_get_config_addr_info, 4, 2, rets,
> + config_addr, BUID_HI(pdn->phb->buid),
> + BUID_LO(pdn->phb->buid), 0);
> + if (ret || rets[0]) {
> + pr_warn("%s: Failed to get address for PHB#%x-PE# config_addr=%x\n",
> + __func__, pdn->phb->global_number, rets[0]);
> + return -1;
> + }
> + return rets[0];
> + }
> +
> + return ret;
> +}
> +
> +static int __init pseries_phb_reset(void)
> +{
> + struct pci_controller *phb;
> + int config_addr;
> + int ibm_set_slot_reset;
> + int ibm_configure_pe;
> + int ret;
> +
> + if (is_kdump_kernel() || reset_devices) {
> + pr_info("Issue PHB reset ...\n");
> + ibm_set_slot_reset = rtas_token("ibm,set-slot-reset");
> + ibm_configure_pe = rtas_token("ibm,configure-pe");
> +
> + if (ibm_set_slot_reset == RTAS_UNKNOWN_SERVICE ||
> + ibm_configure_pe == RTAS_UNKNOWN_SERVICE) {
> + pr_info("%s: EEH functionality not supported\n",
> + __func__);
> + }
> +
> + list_for_each_entry(phb, &hose_list, list_node) {
> + config_addr = pseries_get_pdn_addr(phb);
> + if (config_addr == -1)
> + continue;
> +
> + ret = rtas_call(ibm_set_slot_reset, 4, 1, NULL,
> + config_addr, BUID_HI(phb->buid),
> + BUID_LO(phb->buid), EEH_RESET_FUNDAMENTAL);
> +
> + /* If fundamental-reset not supported, try hot-reset */
> + if (ret == -8)
> + ret = rtas_call(ibm_set_slot_reset, 4, 1, NULL,
> + config_addr, BUID_HI(phb->buid),
> + BUID_LO(phb->buid), EEH_RESET_HOT);
> +
> + if (ret) {
> + pr_err("%s: PHB#%x-PE# failed with rtas_call activate reset=%d\n",
> + __func__, phb->global_number, ret);
> + continue;
> + }
> + }
> + msleep(EEH_PE_RST_SETTLE_TIME);
> +
> + list_for_each_entry(phb, &hose_list, list_node) {
> + config_addr = pseries_get_pdn_addr(phb);
> + if (config_addr == -1)
> + continue;
> +
> + ret = rtas_call(ibm_set_slot_reset, 4, 1, NULL,
> + config_addr, BUID_HI(phb->buid),
> + BUID_LO(phb->buid), EEH_RESET_DEACTIVATE);
> + if (ret) {
> + pr_err("%s: PHB#%x-PE# failed with rtas_call deactive reset=%d\n",
> + __func__, phb->global_number, ret);
> + continue;
> + }
> + }
> + msleep(EEH_PE_RST_SETTLE_TIME);
> +
> + list_for_each_entry(phb, &hose_list, list_node) {
> + config_addr = pseries_get_pdn_addr(phb);
> + if (config_addr == -1)
> + continue;
> +
> + ret = rtas_call(ibm_configure_pe, 3, 1, NULL,
> + config_addr, BUID_HI(phb->buid),
> + BUID_LO(phb->buid));
> + if (ret) {
> + pr_err("%s: PHB#%x-PE# failed with rtas_call configure_pe =%d\n",
> + __func__, phb->global_number, ret);
> + continue;
> + }
> + }
> + }
> +
> + return 0;
> +}
> +machine_postcore_initcall(pseries, pseries_phb_reset);
> +
> --
> 2.18.1
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply
* Re: [PATCH] powerpc/nvram: Replace kmalloc with kzalloc in the error message
From: Michael Ellerman @ 2020-05-29 4:04 UTC (permalink / raw)
To: Yi Wang
Cc: wang.yi59, tony.luck, keescook, wang.liang82, gregkh, anton,
linux-kernel, paulus, Liao Pingfang, xue.zhihong, ccross, tglx,
linuxppc-dev, allison
In-Reply-To: <1590714135-15818-1-git-send-email-wang.yi59@zte.com.cn>
Yi Wang <wang.yi59@zte.com.cn> writes:
> From: Liao Pingfang <liao.pingfang@zte.com.cn>
>
> Use kzalloc instead of kmalloc in the error message according to
> the previous kzalloc() call.
Please just remove the message instead, it's a tiny allocation that's
unlikely to ever fail, and the caller will print an error anyway.
cheers
> diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
> index fb4f610..c3a0c8d 100644
> --- a/arch/powerpc/kernel/nvram_64.c
> +++ b/arch/powerpc/kernel/nvram_64.c
> @@ -892,7 +892,7 @@ loff_t __init nvram_create_partition(const char *name, int sig,
> /* Create our OS partition */
> new_part = kzalloc(sizeof(*new_part), GFP_KERNEL);
> if (!new_part) {
> - pr_err("%s: kmalloc failed\n", __func__);
> + pr_err("%s: kzalloc failed\n", __func__);
> return -ENOMEM;
> }
>
> --
> 2.9.5
^ permalink raw reply
* Re: [PATCH v2 1/5] uaccess: Add user_read_access_begin/end and user_write_access_begin/end
From: Michael Ellerman @ 2020-05-29 4:20 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras, airlied,
daniel, torvalds, viro, akpm, keescook, hpa
Cc: linux-arch, linux-mm, intel-gfx, linuxppc-dev, linux-kernel
In-Reply-To: <36e43241c7f043a24b5069e78c6a7edd11043be5.1585898438.git.christophe.leroy@c-s.fr>
On Fri, 2020-04-03 at 07:20:50 UTC, Christophe Leroy wrote:
> Some architectures like powerpc64 have the capability to separate
> read access and write access protection.
> For get_user() and copy_from_user(), powerpc64 only open read access.
> For put_user() and copy_to_user(), powerpc64 only open write access.
> But when using unsafe_get_user() or unsafe_put_user(),
> user_access_begin open both read and write.
>
> Other architectures like powerpc book3s 32 bits only allow write
> access protection. And on this architecture protection is an heavy
> operation as it requires locking/unlocking per segment of 256Mbytes.
> On those architecture it is therefore desirable to do the unlocking
> only for write access. (Note that book3s/32 ranges from very old
> powermac from the 90's with powerpc 601 processor, till modern
> ADSL boxes with PowerQuicc II processors for instance so it
> is still worth considering.)
>
> In order to avoid any risk based of hacking some variable parameters
> passed to user_access_begin/end that would allow hacking and
> leaving user access open or opening too much, it is preferable to
> use dedicated static functions that can't be overridden.
>
> Add a user_read_access_begin and user_read_access_end to only open
> read access.
>
> Add a user_write_access_begin and user_write_access_end to only open
> write access.
>
> By default, when undefined, those new access helpers default on the
> existing user_access_begin and user_access_end.
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> Reviewed-by: Kees Cook <keescook@chromium.org>
Applied to powerpc topic/uaccess, thanks.
https://git.kernel.org/powerpc/c/999a22890cb183b918e4372395d24426a755cef2
cheers
^ permalink raw reply
* Re: [PATCH v2 2/5] uaccess: Selectively open read or write user access
From: Michael Ellerman @ 2020-05-29 4:20 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras, airlied,
daniel, torvalds, viro, akpm, keescook, hpa
Cc: linux-arch, linux-mm, intel-gfx, linuxppc-dev, linux-kernel
In-Reply-To: <2e73bc57125c2c6ab12a587586a4eed3a47105fc.1585898438.git.christophe.leroy@c-s.fr>
On Fri, 2020-04-03 at 07:20:51 UTC, Christophe Leroy wrote:
> When opening user access to only perform reads, only open read access.
> When opening user access to only perform writes, only open write
> access.
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> Reviewed-by: Kees Cook <keescook@chromium.org>
Applied to powerpc topic/uaccess, thanks.
https://git.kernel.org/powerpc/c/41cd780524674082b037e7c8461f90c5e42103f0
cheers
^ permalink raw reply
* Re: [PATCH v2 3/5] drm/i915/gem: Replace user_access_begin by user_write_access_begin
From: Michael Ellerman @ 2020-05-29 4:20 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras, airlied,
daniel, torvalds, viro, akpm, keescook, hpa
Cc: linux-arch, linux-mm, intel-gfx, linuxppc-dev, linux-kernel
In-Reply-To: <ebf1250b6d4f351469fb339e5399d8b92aa8a1c1.1585898438.git.christophe.leroy@c-s.fr>
On Fri, 2020-04-03 at 07:20:52 UTC, Christophe Leroy wrote:
> When i915_gem_execbuffer2_ioctl() is using user_access_begin(),
> that's only to perform unsafe_put_user() so use
> user_write_access_begin() in order to only open write access.
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> Reviewed-by: Kees Cook <keescook@chromium.org>
Applied to powerpc topic/uaccess, thanks.
https://git.kernel.org/powerpc/c/b44f687386875b714dae2afa768e73401e45c21c
cheers
^ permalink raw reply
* Re: [PATCH v2 4/5] powerpc/uaccess: Implement user_read_access_begin and user_write_access_begin
From: Michael Ellerman @ 2020-05-29 4:24 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras, airlied,
daniel, torvalds, viro, akpm, keescook, hpa
Cc: linux-arch, linux-mm, intel-gfx, linuxppc-dev, linux-kernel
In-Reply-To: <6c83af0f0809ef2a955c39ac622767f6cbede035.1585898438.git.christophe.leroy@c-s.fr>
On Fri, 2020-04-03 at 07:20:53 UTC, Christophe Leroy wrote:
> Add support for selective read or write user access with
> user_read_access_begin/end and user_write_access_begin/end.
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> Reviewed-by: Kees Cook <keescook@chromium.org>
Applied to powerpc topic/uaccess-ppc, thanks.
https://git.kernel.org/powerpc/c/4fe5cda9f89d0aea8e915b7c96ae34bda4e12e51
cheers
^ permalink raw reply
* Re: [PATCH v3] powerpc/uaccess: evaluate macro arguments once, before user access is allowed
From: Michael Ellerman @ 2020-05-29 4:24 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20200407041245.600651-1-npiggin@gmail.com>
On Tue, 2020-04-07 at 04:12:45 UTC, Nicholas Piggin wrote:
> get/put_user can be called with nontrivial arguments. fs/proc/page.c
> has a good example:
>
> if (put_user(stable_page_flags(ppage), out)) {
>
> stable_page_flags is quite a lot of code, including spin locks in the
> page allocator.
>
> Ensure these arguments are evaluated before user access is allowed.
> This improves security by reducing code with access to userspace, but
> it also fixes a PREEMPT bug with KUAP on powerpc/64s:
> stable_page_flags is currently called with AMR set to allow writes,
> it ends up calling spin_unlock(), which can call preempt_schedule. But
> the task switch code can not be called with AMR set (it relies on
> interrupts saving the register), so this blows up.
>
> It's fine if the code inside allow_user_access is preemptible, because
> a timer or IPI will save the AMR, but it's not okay to explicitly
> cause a reschedule.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Applied to powerpc topic/uaccess-ppc, thanks.
https://git.kernel.org/powerpc/c/d02f6b7dab8228487268298ea1f21081c0b4b3eb
cheers
^ permalink raw reply
* Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
From: Michael Ellerman @ 2020-05-29 4:24 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras, npiggin,
segher
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <23e680624680a9a5405f4b88740d2596d4b17c26.1587143308.git.christophe.leroy@c-s.fr>
On Fri, 2020-04-17 at 17:08:51 UTC, Christophe Leroy wrote:
> unsafe_put_user() is designed to take benefit of 'asm goto'.
>
> Instead of using the standard __put_user() approach and branch
> based on the returned error, use 'asm goto' and make the
> exception code branch directly to the error label. There is
> no code anymore in the fixup section.
>
> This change significantly simplifies functions using
> unsafe_put_user()
...
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Applied to powerpc topic/uaccess-ppc, thanks.
https://git.kernel.org/powerpc/c/334710b1496af8a0960e70121f850e209c20958f
cheers
^ permalink raw reply
* Re: [PATCH v4 2/2] powerpc/uaccess: Implement unsafe_copy_to_user() as a simple loop
From: Michael Ellerman @ 2020-05-29 4:24 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras, npiggin,
segher
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <fe952112c29bf6a0a2778c9e6bbb4f4afd2c4258.1587143308.git.christophe.leroy@c-s.fr>
On Fri, 2020-04-17 at 17:08:52 UTC, Christophe Leroy wrote:
> At the time being, unsafe_copy_to_user() is based on
> raw_copy_to_user() which calls __copy_tofrom_user().
>
> __copy_tofrom_user() is a big optimised function to copy big amount
> of data. It aligns destinations to cache line in order to use
> dcbz instruction.
>
> Today unsafe_copy_to_user() is called only from filldir().
> It is used to mainly copy small amount of data like filenames,
> so __copy_tofrom_user() is not fit.
>
> Also, unsafe_copy_to_user() is used within user_access_begin/end
> sections. In those section, it is preferable to not call functions.
>
> Rewrite unsafe_copy_to_user() as a macro that uses __put_user_goto().
> We first perform a loop of long, then we finish with necessary
> complements.
>
> unsafe_copy_to_user() might be used in the near future to copy
> fixed-size data, like pt_regs structs during signal processing.
> Having it as a macro allows GCC to optimise it for instead when
> it knows the size in advance, it can unloop loops, drop complements
> when the size is a multiple of longs, etc ...
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Applied to powerpc topic/uaccess-ppc, thanks.
https://git.kernel.org/powerpc/c/17bc43367fc2a720400d21c745db641c654c1e6b
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/uaccess: Don't use "m<>" constraint
From: Michael Ellerman @ 2020-05-29 4:24 UTC (permalink / raw)
To: Michael Ellerman, linuxppc-dev
In-Reply-To: <20200507123324.2250024-1-mpe@ellerman.id.au>
On Thu, 2020-05-07 at 12:33:24 UTC, Michael Ellerman wrote:
> The "m<>" constraint breaks compilation with GCC 4.6.x era compilers.
>
> The use of the constraint allows the compiler to use update-form
> instructions, however in practice current compilers never generate
> those forms for any of the current uses of __put_user_asm_goto().
>
> We anticipate that GCC 4.6 will be declared unsupported for building
> the kernel in the not too distant future. So for now just switch to
> the "m" constraint.
>
> Fixes: 334710b1496a ("powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'")
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Applied to powerpc topic/uaccess-ppc.
https://git.kernel.org/powerpc/c/e2a8b49e79553bd8ec48f73cead84e6146c09408
cheers
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox