* Re: [PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR
From: Waiman Long @ 2020-07-23 21:58 UTC (permalink / raw)
To: peterz
Cc: linux-arch, Boqun Feng, virtualization, linuxppc-dev,
Nicholas Piggin, linux-kernel, Ingo Molnar, kvm-ppc, Will Deacon
In-Reply-To: <20200723195855.GU119549@hirez.programming.kicks-ass.net>
On 7/23/20 3:58 PM, peterz@infradead.org wrote:
> On Thu, Jul 23, 2020 at 03:04:13PM -0400, Waiman Long wrote:
>> On 7/23/20 2:47 PM, peterz@infradead.org wrote:
>>> On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
>>>> BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch?
>>>> I will have to update the patch to fix the reported 0-day test problem, but
>>>> I want to collect other feedback before sending out v3.
>>> I want to say I hate it all, it adds instructions to a path we spend an
>>> awful lot of time optimizing without really getting anything back for
>>> it.
>> It does add some extra instructions that may slow it down slightly, but I
>> don't agree that it gives nothing back. The cpu lock holder information can
>> be useful in analyzing crash dumps and in some debugging situations. I think
>> it can be useful in RHEL for this reason. How about an x86 config option to
>> allow distros to decide if they want to have it enabled? I will make sure
>> that it will have no performance degradation if the option is not enabled.
> Config knobs suck too; they create a maintenance burden (we get to make
> sure all the permutations work/build/etc.) and effectively nobody uses
> them, since world+dog uses what distros pick.
>
> Anyway, instead of adding a second per-cpu variable, can you see how
> horrible something like this is:
>
> unsigned char adds(unsigned char var, unsigned char val)
> {
> 	unsigned short sat = 0xff, tmp = var;
>
> 	asm ("addb %[val], %b[var];"
> 	     "cmovc %[sat], %[var];"
> 	     : [var] "+r" (tmp)
> 	     : [val] "ir" (val), [sat] "r" (sat)
> 	     );
>
> 	return tmp;
> }
>
> Another thing to try is, instead of threading that lockval throughout
> the thing, simply:
>
> #define _Q_LOCKED_VAL this_cpu_read_stable(cpu_sat)
>
> or combined with the above
>
> #define _Q_LOCKED_VAL adds(this_cpu_read_stable(cpu_number), 2)
>
> and see if the compiler really makes a mess of things.
>
Thanks for the suggestion. I will try that out.
Cheers,
Longman
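
For readers without an x86 toolchain at hand, the saturating add suggested above can be approximated in portable C. This is only a minimal sketch of the intended semantics (clamp at 0xff instead of wrapping), not the proposed kernel code; the names adds_u8() and adds_u8_selftest() are hypothetical, and the compiler is free to generate something other than the addb/cmovc pair.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Saturating 8-bit add: returns var + val, clamped to 0xff on overflow.
 * Hypothetical portable stand-in for the inline-asm adds() quoted above.
 */
static inline uint8_t adds_u8(uint8_t var, uint8_t val)
{
	/* Widen first so the carry out of bit 7 is visible. */
	unsigned int sum = (unsigned int)var + (unsigned int)val;

	return sum > 0xff ? 0xff : (uint8_t)sum;
}

/* Boundary cases: no overflow, exact overflow, and maximum inputs. */
static void adds_u8_selftest(void)
{
	assert(adds_u8(1, 2) == 3);
	assert(adds_u8(0xfe, 2) == 0xff);
	assert(adds_u8(0xff, 0xff) == 0xff);
}
```

With an offset of 2, as in the _Q_LOCKED_VAL suggestion, inputs 0xfd and above all map to 0xff, so the highest encodable values are no longer distinguishable from one another.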
^ permalink raw reply
* Re: [PATCH v4 03/12] powerpc/kexec_file: add helper functions for getting memory ranges
From: Thiago Jung Bauermann @ 2020-07-23 22:12 UTC (permalink / raw)
To: Hari Bathini
Cc: Pingfan Liu, Nayna Jain, Kexec-ml, Mahesh J Salgaonkar,
Mimi Zohar, lkml, linuxppc-dev, Sourabh Jain, Petr Tesarik,
Andrew Morton, Dave Young, Vivek Goyal, Eric Biederman
In-Reply-To: <159524946347.20855.15784642736087777919.stgit@hbathini.in.ibm.com>
Hari Bathini <hbathini@linux.ibm.com> writes:
> In kexec case, the kernel to be loaded uses the same memory layout as
> the running kernel. So, passing on the DT of the running kernel would
> be good enough.
>
> But in case of kdump, different memory ranges are needed to manage
> loading the kdump kernel, booting into it and exporting the elfcore
> of the crashing kernel. The ranges are exclude memory ranges, usable
> memory ranges, reserved memory ranges and crash memory ranges.
>
> Exclude memory ranges specify the list of memory ranges to avoid while
> loading kdump segments. Usable memory ranges list the memory ranges
> that could be used for booting kdump kernel. Reserved memory ranges
> list the memory regions for the loading kernel's reserve map. Crash
> memory ranges list the memory ranges to be exported as the crashing
> kernel's elfcore.
>
> Add helper functions for setting up the above mentioned memory ranges.
> These helpers facilitate understanding the subsequent changes better
> and make it easy to set up the different memory ranges listed above, as
> and when appropriate.
>
> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
> Tested-by: Pingfan Liu <piliu@redhat.com>
Just one comment below, but regardless:
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> +/**
> + * add_htab_mem_range - Adds htab range to the given memory ranges list,
> + * if it exists
> + * @mem_ranges: Range list to add the memory range to.
> + *
> + * Returns 0 on success, negative errno on error.
> + */
> +int add_htab_mem_range(struct crash_mem **mem_ranges)
> +{
> +	if (!htab_address)
> +		return 0;
> +
> +	return add_mem_range(mem_ranges, __pa(htab_address), htab_size_bytes);
> +}
I believe you need to surround this function with `#ifdef
CONFIG_PPC_BOOK3S_64` and `#endif` to match what is done in
<asm/kexec_ranges.h>.
--
Thiago Jung Bauermann
IBM Linux Technology Center
^ permalink raw reply
* Re: [PATCH v4 04/12] ppc64/kexec_file: avoid stomping memory used by special regions
From: Thiago Jung Bauermann @ 2020-07-23 22:19 UTC (permalink / raw)
To: Hari Bathini
Cc: Pingfan Liu, Nayna Jain, Kexec-ml, Mahesh J Salgaonkar,
Mimi Zohar, lkml, linuxppc-dev, Sourabh Jain, Petr Tesarik,
Andrew Morton, Dave Young, Vivek Goyal, Eric Biederman
In-Reply-To: <159524948081.20855.1023953568610670370.stgit@hbathini.in.ibm.com>
Hari Bathini <hbathini@linux.ibm.com> writes:
> crashkernel region could have an overlap with special memory regions
> like opal, rtas, tce-table & such. These regions are referred to as
> exclude memory ranges. Set up these ranges during image probe in order
> to avoid them while finding the buffer for different kdump segments.
> Override arch_kexec_locate_mem_hole() to locate a memory hole taking
> these ranges into account.
>
> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
--
Thiago Jung Bauermann
IBM Linux Technology Center
^ permalink raw reply
* Re: [PATCH v5 1/4] riscv: Move kernel mapping to vmalloc zone
From: Benjamin Herrenschmidt @ 2020-07-23 22:33 UTC (permalink / raw)
To: Alex Ghiti, Palmer Dabbelt
Cc: aou, linux-mm, Anup Patel, linux-kernel, Atish Patra, paulus,
zong.li, Paul Walmsley, linux-riscv, linuxppc-dev
In-Reply-To: <cade70e2-0179-2650-41c5-036679aaf30c@ghiti.fr>
On Thu, 2020-07-23 at 01:21 -0400, Alex Ghiti wrote:
> > works fine with huge pages, what is your problem there ? You rely on
> > punching small-page size holes in there ?
> >
>
> ARCH_HAS_STRICT_KERNEL_RWX prevents the use of a hugepage for the kernel
> mapping in the direct mapping as it sets different permissions to
> different parts of the kernel (data, text, etc.).
Ah ok, that can be solved in a couple of ways...
One is to use the linker script to ensure those sections are linked
HUGE_PAGE_SIZE apart and moved appropriately by early boot code. Another
is to selectively degrade just those huge pages.
I'm not familiar with the RISC-V MMU (I should probably go have a look)
but if it's a classic radix tree with huge pages at PUD/PMD level, then
you could just degrade the one(s) that cross those boundaries.
Cheers,
Ben.
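
The "degrade only the huge pages that cross a boundary" idea can be illustrated with a little address arithmetic. This is a minimal userspace sketch, not RISC-V kernel code: the function names are hypothetical, and the huge page size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if size is a non-zero power of two. */
static bool is_pow2(uint64_t size)
{
	return size && (size & (size - 1)) == 0;
}

/*
 * True if a permission boundary at addr falls strictly inside a huge
 * page, so that page cannot carry a single set of permissions and
 * would have to be degraded to smaller pages.
 */
static bool boundary_splits_huge_page(uint64_t addr, uint64_t huge_size)
{
	assert(is_pow2(huge_size));
	return (addr & (huge_size - 1)) != 0;
}

/* Base address of the huge page that contains addr. */
static uint64_t huge_page_base(uint64_t addr, uint64_t huge_size)
{
	assert(is_pow2(huge_size));
	return addr & ~(huge_size - 1);
}
```

For each section boundary (for example, the end of text), only the huge page returned by huge_page_base() needs degrading, and only when boundary_splits_huge_page() is true. A boundary that is already aligned to the huge page size needs no degrading at all, which is what the linker-script alternative achieves.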
^ permalink raw reply
* Re: [PATCH v4 06/12] ppc64/kexec_file: restrict memory usage of kdump kernel
From: Thiago Jung Bauermann @ 2020-07-24 0:06 UTC (permalink / raw)
To: Hari Bathini
Cc: Pingfan Liu, Nayna Jain, Kexec-ml, Mahesh J Salgaonkar,
Mimi Zohar, lkml, linuxppc-dev, Sourabh Jain, Petr Tesarik,
Andrew Morton, Dave Young, Vivek Goyal, Eric Biederman
In-Reply-To: <159524954805.20855.1164928096364700614.stgit@hbathini.in.ibm.com>
Hari Bathini <hbathini@linux.ibm.com> writes:
> Kdump kernel, used for capturing the kernel core image, is supposed
> to use only specific memory regions to avoid corrupting the image to
> be captured. The regions are crashkernel range - the memory reserved
> explicitly for kdump kernel, memory used for the tce-table, the OPAL
> region and RTAS region as applicable. Restrict kdump kernel memory
> to use only these regions by setting up usable-memory DT property.
> Also, tell the kdump kernel to run at the loaded address by setting
> the magic word at 0x5c.
>
> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
> Tested-by: Pingfan Liu <piliu@redhat.com>
> ---
>
> v3 -> v4:
> * Updated get_node_path() to be an iterative function instead of a
> recursive one.
> * Added comment explaining why low memory is added to kdump kernel's
> usable memory ranges though it doesn't fall in crashkernel region.
> * For correctness, added fdt_add_mem_rsv() for the low memory being
> added to kdump kernel's usable memory ranges.
Good idea.
> * Fixed prop pointer update in add_usable_mem_property() and changed
> duple to tuple as suggested by Thiago.
<snip>
> +/**
> + * get_node_pathlen - Get the full path length of the given node.
> + * @dn: Node.
> + *
> + * Also, counts '/' at the end of the path.
> + * For example, /memory@0 will be "/memory@0/\0" => 11 bytes.
Wouldn't this function return 10 in the case of /memory@0?
Are you saying that it should count the \0 at the end too? It's not
doing that, AFAICS.
> + *
> + * Returns the string length of the node's full path.
> + */
Maybe it's me (by analogy with strlen()), but I would expect "string
length" to not include the terminating \0. I suggest renaming the
function to something like get_node_path_size() and doing s/length/size/ in
the comment above if it's supposed to count the terminating \0.
> +static int get_node_pathlen(struct device_node *dn)
> +{
> +	int len = 0;
> +
> +	if (!dn)
> +		return 0;
> +
> +	while (dn) {
> +		len += strlen(dn->full_name) + 1;
> +		dn = dn->parent;
> +	}
> +
> +	return len + 1;
> +}
> +
> +/**
> + * get_node_path - Get the full path of the given node.
> + * @node: Device node.
> + *
> + * Allocates buffer for node path. The caller must free the buffer
> + * after use.
> + *
> + * Returns buffer with path on success, NULL otherwise.
> + */
> +static char *get_node_path(struct device_node *node)
> +{
> +	struct device_node *dn;
> +	int len, idx, nlen;
> +	char *path = NULL;
> +	char end_char;
> +
> +	if (!node)
> +		goto err;
> +
> +	/*
> +	 * Get the path length first and use it to iteratively build the path
> +	 * from node to root.
> +	 */
> +	len = get_node_pathlen(node);
> +
> +	/* Allocate memory for node path */
> +	path = kzalloc(ALIGN(len, 8), GFP_KERNEL);
> +	if (!path)
> +		goto err;
> +
> +	/*
> +	 * Iteratively update path from node to root by decrementing
> +	 * index appropriately.
> +	 *
> +	 * Also, add %NUL at the end of node & '/' at the end of all its
> +	 * parent nodes.
> +	 */
> +	dn = node;
> +	path[0] = '/';
> +	idx = len - 1;
Here, idx is pointing to the supposed '/' at the end of the node
path ...
> +	end_char = '\0';
> +	while (dn->parent) {
> +		path[--idx] = end_char;
... and in the first iteration, this is writing '\0' at a place which will be
overwritten by the memcpy() below with the last character of
dn->full_name. You need to start idx with len, not len - 1.
> +		end_char = '/';
> +
> +		nlen = strlen(dn->full_name);
> +		idx -= nlen;
> +		memcpy(path + idx, dn->full_name, nlen);
> +
> +		dn = dn->parent;
> +	}
> +
> +	return path;
> +err:
> +	kfree(path);
> +	return NULL;
> +}
--
Thiago Jung Bauermann
IBM Linux Technology Center
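
The size and index bookkeeping discussed in this review is easy to check in userspace. The following is a minimal sketch, not the patch's code: struct node, node_path_size() and node_path() are hypothetical names, the root node is assumed to have an empty basename, and calloc() stands in for kzalloc(). Here the size counts one '/' per non-root node plus the terminating NUL, and the write index starts at the position of that NUL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct device_node: basename and parent link. */
struct node {
	const char *name;	/* assumed to be "" for the root */
	struct node *parent;
};

/* Bytes needed for the full path, including the terminating NUL. */
static size_t node_path_size(const struct node *n)
{
	size_t size = 1;			/* terminating NUL */

	for (; n && n->parent; n = n->parent)
		size += strlen(n->name) + 1;	/* '/' plus basename */

	return size > 1 ? size : 2;		/* the root alone is "/" */
}

/* Build the path from leaf to root; the caller must free() the result. */
static char *node_path(const struct node *n)
{
	size_t size, idx;
	char *path;

	if (!n)
		return NULL;

	size = node_path_size(n);
	path = calloc(size, 1);			/* zeroed, so NUL is in place */
	if (!path)
		return NULL;

	if (!n->parent) {			/* root node */
		path[0] = '/';
		return path;
	}

	idx = size - 1;				/* index of the terminating NUL */
	for (; n->parent; n = n->parent) {
		size_t nlen = strlen(n->name);

		idx -= nlen;
		memcpy(path + idx, n->name, nlen);
		path[--idx] = '/';		/* separator before this name */
	}

	assert(idx == 0);			/* buffer filled exactly */
	return path;
}
```

For a node named memory@0 directly under the root, node_path_size() returns 10 in this sketch: nine characters for "/memory@0" plus the NUL. Naming the helper "size" rather than "length" keeps it distinct from strlen() semantics, as suggested in the review.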
^ permalink raw reply
* [PATCH v 1/1] powerpc/64s: allow for clang's objdump differences
From: Bill Wendling @ 2020-07-24 0:16 UTC (permalink / raw)
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras
Cc: linuxppc-dev, Bill Wendling
Clang's objdump emits slightly different output from GNU's objdump,
causing a list of warnings to be emitted during relocatable builds.
E.g., clang's objdump emits this:
c000000000000004: 2c 00 00 48 b 0xc000000000000030
...
c000000000005c6c: 10 00 82 40 bf 2, 0xc000000000005c7c
while GNU objdump emits:
c000000000000004: 2c 00 00 48 b c000000000000030 <__start+0x30>
...
c000000000005c6c: 10 00 82 40 bne c000000000005c7c <masked_interrupt+0x3c>
Adjust llvm-objdump's output to remove the extraneous '0x' and convert
'bf' and 'bt' to 'bne' and 'beq' resp. to more closely match GNU
objdump's output.
Note that clang's objdump doesn't yet output the relocation symbols on
PPC.
Signed-off-by: Bill Wendling <morbo@google.com>
---
arch/powerpc/tools/unrel_branch_check.sh | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/powerpc/tools/unrel_branch_check.sh b/arch/powerpc/tools/unrel_branch_check.sh
index 77114755dc6f..71ce86b68d18 100755
--- a/arch/powerpc/tools/unrel_branch_check.sh
+++ b/arch/powerpc/tools/unrel_branch_check.sh
@@ -31,6 +31,9 @@ grep -e "^c[0-9a-f]*:[[:space:]]*\([0-9a-f][0-9a-f][[:space:]]\)\{4\}[[:space:]]
grep -v '\<__start_initialization_multiplatform>' |
grep -v -e 'b.\?.\?ctr' |
grep -v -e 'b.\?.\?lr' |
+sed 's/\bbt.\?[[:space:]]*[[:digit:]][[:digit:]]*,/beq/' |
+sed 's/\bbf.\?[[:space:]]*[[:digit:]][[:digit:]]*,/bne/' |
+sed 's/[[:space:]]0x/ /' |
sed 's/://' |
awk '{ print $1 ":" $6 ":0x" $7 ":" $8 " "}'
)
^ permalink raw reply related
* Re: [PATCH v4 09/12] ppc64/kexec_file: setup backup region for kdump kernel
From: Thiago Jung Bauermann @ 2020-07-24 0:28 UTC (permalink / raw)
To: Hari Bathini
Cc: kernel test robot, Pingfan Liu, Nayna Jain, Kexec-ml,
Mahesh J Salgaonkar, Mimi Zohar, lkml, linuxppc-dev, Sourabh Jain,
Petr Tesarik, Andrew Morton, Dave Young, Vivek Goyal,
Eric Biederman
In-Reply-To: <159524964786.20855.15850644504721928289.stgit@hbathini.in.ibm.com>
Hari Bathini <hbathini@linux.ibm.com> writes:
> Though kdump kernel boots from loaded address, the first 64K bytes
> of it is copied down to real 0. So, setup a backup region to copy
> the first 64K bytes of crashed kernel, in purgatory, before booting
> into kdump kernel. Also, update reserve map with backup region and
> crashed kernel's memory to prevent the kdump kernel from accidentally
> using that memory.
>
> Reported-by: kernel test robot <lkp@intel.com>
> [lkp: In v1, purgatory() declaration was missing]
> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Just one minor comment below:
> @@ -1047,13 +1120,26 @@ int setup_new_fdt_ppc64(const struct kimage *image, void *fdt,
> goto out;
> }
>
> - /* Ensure we don't touch crashed kernel's memory */
> - ret = fdt_add_mem_rsv(fdt, 0, crashk_res.start);
> + /*
> + * Ensure we don't touch crashed kernel's memory except the
> + * first 64K of RAM, which will be backed up.
> + */
> + ret = fdt_add_mem_rsv(fdt, BACKUP_SRC_SIZE,
I know BACKUP_SRC_START is 0, but please forgive my pedantry when I say
that I think it's clearer if the start address above is changed to
BACKUP_SRC_START + BACKUP_SRC_SIZE...
> + crashk_res.start - BACKUP_SRC_SIZE);
> if (ret) {
> pr_err("Error reserving crash memory: %s\n",
> fdt_strerror(ret));
> goto out;
> }
> +
> + /* Ensure backup region is not used by kdump/capture kernel */
> + ret = fdt_add_mem_rsv(fdt, image->arch.backup_start,
> + BACKUP_SRC_SIZE);
> + if (ret) {
> + pr_err("Error reserving memory for backup: %s\n",
> + fdt_strerror(ret));
> + goto out;
> + }
> }
>
> out:
--
Thiago Jung Bauermann
IBM Linux Technology Center
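
The reservation arithmetic behind the review comment above can be written out explicitly. This is a minimal sketch under stated assumptions: BACKUP_SRC_START is 0 and BACKUP_SRC_SIZE is 64K as described in the thread, while struct rsv_range and crashed_kernel_rsv() are hypothetical names, not part of the patch or of libfdt.

```c
#include <assert.h>
#include <stdint.h>

/* From the discussion: the backup source is the first 64K of RAM. */
#define BACKUP_SRC_START	0x0ULL
#define BACKUP_SRC_SIZE		0x10000ULL

/* Hypothetical (start, size) pair, as would be passed to fdt_add_mem_rsv(). */
struct rsv_range {
	uint64_t start;
	uint64_t size;
};

/*
 * Range of the crashed kernel's memory to reserve: everything between
 * the end of the backup source region and the start of the crashkernel
 * region.
 */
static struct rsv_range crashed_kernel_rsv(uint64_t crashk_start)
{
	struct rsv_range r;

	r.start = BACKUP_SRC_START + BACKUP_SRC_SIZE;
	assert(crashk_start >= r.start);
	r.size = crashk_start - r.start;

	return r;
}
```

Spelling the start out as BACKUP_SRC_START + BACKUP_SRC_SIZE, and deriving the size from that same start, keeps the range correct even if BACKUP_SRC_START were ever non-zero.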
^ permalink raw reply
* Re: [PATCH v4 10/12] ppc64/kexec_file: prepare elfcore header for crashing kernel
From: Thiago Jung Bauermann @ 2020-07-24 0:33 UTC (permalink / raw)
To: Hari Bathini
Cc: Pingfan Liu, Nayna Jain, Kexec-ml, Mahesh J Salgaonkar,
Mimi Zohar, lkml, linuxppc-dev, Sourabh Jain, Petr Tesarik,
Andrew Morton, Dave Young, Vivek Goyal, Eric Biederman
In-Reply-To: <159524966309.20855.15216784717419378243.stgit@hbathini.in.ibm.com>
Hari Bathini <hbathini@linux.ibm.com> writes:
> Prepare elf headers for the crashing kernel's core file using
> crash_prepare_elf64_headers() and pass on this info to kdump
> kernel by updating its command line with elfcorehdr parameter.
> Also, add elfcorehdr location to reserve map to prevent it from
> being stomped on while booting.
>
> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
> Tested-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
--
Thiago Jung Bauermann
IBM Linux Technology Center
^ permalink raw reply
* [PATCH] powerpc/test_emulate_sstep: Fix build error
From: Michael Ellerman @ 2020-07-24 0:41 UTC (permalink / raw)
To: linuxppc-dev
ppc64_book3e_allmodconfig fails with:
arch/powerpc/lib/test_emulate_step.c: In function 'test_pld':
arch/powerpc/lib/test_emulate_step.c:113:7: error: implicit declaration of function 'cpu_has_feature'
113 | if (!cpu_has_feature(CPU_FTR_ARCH_31)) {
| ^~~~~~~~~~~~~~~
Add an include of cpu_has_feature.h to fix it.
Fixes: b6b54b42722a ("powerpc/sstep: Add tests for prefixed integer load/stores")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
arch/powerpc/lib/test_emulate_step.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c
index 081b05480c47..d242e9f72e0c 100644
--- a/arch/powerpc/lib/test_emulate_step.c
+++ b/arch/powerpc/lib/test_emulate_step.c
@@ -8,6 +8,7 @@
#define pr_fmt(fmt) "emulate_step_test: " fmt
#include <linux/ptrace.h>
+#include <asm/cpu_has_feature.h>
#include <asm/sstep.h>
#include <asm/ppc-opcode.h>
#include <asm/code-patching.h>
--
2.25.1
^ permalink raw reply related
* Re: [PATCH v2 2/3] powerpc/powernv/idle: save-restore DAWR0,DAWRX0 for P10
From: Michael Neuling @ 2020-07-24 1:25 UTC (permalink / raw)
To: Pratik Rajesh Sampat, mpe, benh, paulus, ravi.bangoria, ego,
svaidy, pratik.r.sampat, linuxppc-dev, linux-kernel
In-Reply-To: <20200710052207.12003-3-psampat@linux.ibm.com>
On Fri, 2020-07-10 at 10:52 +0530, Pratik Rajesh Sampat wrote:
> Additional registers DAWR0, DAWRX0 may be lost on Power 10 for
> stop levels < 4.
> Therefore save the values of these SPRs before entering a "stop"
> state and restore their values on wakeup.
>
> Signed-off-by: Pratik Rajesh Sampat <psampat@linux.ibm.com>
> ---
> arch/powerpc/platforms/powernv/idle.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/arch/powerpc/platforms/powernv/idle.c
> b/arch/powerpc/platforms/powernv/idle.c
> index 19d94d021357..f2e2a6a4c274 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -600,6 +600,8 @@ struct p9_sprs {
> u64 iamr;
> u64 amor;
> u64 uamor;
> + u64 dawr0;
> + u64 dawrx0;
> };
>
> static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
> @@ -687,6 +689,10 @@ static unsigned long power9_idle_stop(unsigned long
> psscr, bool mmu_on)
> sprs.iamr = mfspr(SPRN_IAMR);
> sprs.amor = mfspr(SPRN_AMOR);
> sprs.uamor = mfspr(SPRN_UAMOR);
> + if (cpu_has_feature(CPU_FTR_ARCH_31)) {
Can you add a comment here saying that even though DAWR0 is ARCH_30, it's
only required to be saved on 31? Otherwise this looks pretty odd.
> + sprs.dawr0 = mfspr(SPRN_DAWR0);
> + sprs.dawrx0 = mfspr(SPRN_DAWRX0);
> + }
>
> srr1 = isa300_idle_stop_mayloss(psscr); /* go idle */
>
> @@ -710,6 +716,10 @@ static unsigned long power9_idle_stop(unsigned long
> psscr, bool mmu_on)
> mtspr(SPRN_IAMR, sprs.iamr);
> mtspr(SPRN_AMOR, sprs.amor);
> mtspr(SPRN_UAMOR, sprs.uamor);
> + if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> + mtspr(SPRN_DAWR0, sprs.dawr0);
> + mtspr(SPRN_DAWRX0, sprs.dawrx0);
> + }
>
> /*
> * Workaround for POWER9 DD2.0, if we lost resources, the ERAT
^ permalink raw reply
* [powerpc:merge] BUILD SUCCESS d6c13d397d6988ec3e6029cae9e80501364cf9cb
From: kernel test robot @ 2020-07-24 2:03 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge
branch HEAD: d6c13d397d6988ec3e6029cae9e80501364cf9cb Automatic merge of 'master', 'next' and 'fixes' (2020-07-22 23:08)
elapsed time: 2204m
configs tested: 74
configs skipped: 1
The following configs have been built successfully.
More configs may be tested in the coming days.
arm             defconfig
arm             allyesconfig
arm             allmodconfig
arm             allnoconfig
arm64           allyesconfig
arm64           defconfig
arm64           allmodconfig
arm64           allnoconfig
i386            defconfig
i386            debian-10.3
i386            allnoconfig
i386            allyesconfig
ia64            allmodconfig
ia64            defconfig
ia64            allnoconfig
ia64            allyesconfig
m68k            allmodconfig
m68k            allnoconfig
m68k            sun3_defconfig
m68k            defconfig
m68k            allyesconfig
nios2           defconfig
nios2           allyesconfig
openrisc        defconfig
c6x             allyesconfig
c6x             allnoconfig
openrisc        allyesconfig
nds32           defconfig
nds32           allnoconfig
csky            allyesconfig
csky            defconfig
alpha           defconfig
alpha           allyesconfig
xtensa          allyesconfig
h8300           allyesconfig
h8300           allmodconfig
xtensa          defconfig
arc             defconfig
arc             allyesconfig
sh              allmodconfig
sh              allnoconfig
microblaze      allnoconfig
mips            allyesconfig
mips            allnoconfig
mips            allmodconfig
parisc          allnoconfig
parisc          defconfig
parisc          allyesconfig
parisc          allmodconfig
powerpc         defconfig
powerpc         allyesconfig
powerpc         rhel-kconfig
powerpc         allmodconfig
powerpc         allnoconfig
riscv           allyesconfig
riscv           allnoconfig
riscv           defconfig
riscv           allmodconfig
s390            allyesconfig
s390            allnoconfig
s390            allmodconfig
s390            defconfig
sparc           allyesconfig
sparc           defconfig
sparc64         defconfig
sparc64         allnoconfig
sparc64         allyesconfig
sparc64         allmodconfig
x86_64          rhel-7.6-kselftests
x86_64          rhel-8.3
x86_64          kexec
x86_64          rhel
x86_64          lkp
x86_64          fedora-25
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
^ permalink raw reply
* Re: [PATCH 1/2] lockdep: improve current->(hard|soft)irqs_enabled synchronisation with actual irq state
From: kernel test robot @ 2020-07-24 2:19 UTC (permalink / raw)
To: Nicholas Piggin, linux-kernel
Cc: linux-arch, kbuild-all, Peter Zijlstra, Will Deacon,
Nicholas Piggin, Alexey Kardashevskiy, Ingo Molnar, linuxppc-dev
In-Reply-To: <20200723105615.1268126-1-npiggin@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 5221 bytes --]
Hi Nicholas,
I love your patch! Yet something to improve:
[auto build test ERROR on linux/master]
[also build test ERROR on powerpc/next linus/master v5.8-rc6 next-20200723]
[cannot apply to tip/locking/core]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Nicholas-Piggin/lockdep-improve-current-hard-soft-irqs_enabled-synchronisation-with-actual-irq-state/20200723-185938
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 9ebcfadb0610322ac537dd7aa5d9cbc2b2894c68
config: nds32-allyesconfig (attached as .config)
compiler: nds32le-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=nds32
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
In file included from include/asm-generic/bitops.h:14,
from ./arch/nds32/include/generated/asm/bitops.h:1,
from include/linux/bitops.h:29,
from include/linux/kernel.h:12,
from include/linux/list.h:9,
from include/linux/rculist.h:10,
from include/linux/pid.h:5,
from include/linux/sched.h:14,
from arch/nds32/kernel/asm-offsets.c:4:
include/linux/spinlock_api_smp.h: In function '__raw_spin_lock_irq':
>> include/linux/irqflags.h:158:31: error: implicit declaration of function 'arch_irqs_disabled'; did you mean 'raw_irqs_disabled'? [-Werror=implicit-function-declaration]
158 | #define raw_irqs_disabled() (arch_irqs_disabled())
| ^~~~~~~~~~~~~~~~~~
include/linux/irqflags.h:174:23: note: in expansion of macro 'raw_irqs_disabled'
174 | bool was_disabled = raw_irqs_disabled(); \
| ^~~~~~~~~~~~~~~~~
include/linux/spinlock_api_smp.h:126:2: note: in expansion of macro 'local_irq_disable'
126 | local_irq_disable();
| ^~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
make[2]: *** [scripts/Makefile.build:114: arch/nds32/kernel/asm-offsets.s] Error 1
make[2]: Target '__build' not remade because of errors.
make[1]: *** [Makefile:1175: prepare0] Error 2
make[1]: Target 'prepare' not remade because of errors.
make: *** [Makefile:185: __sub-make] Error 2
make: Target 'prepare' not remade because of errors.
vim +158 include/linux/irqflags.h
81d68a96a398448 Steven Rostedt 2008-05-12 132
df9ee29270c11db David Howells 2010-10-07 133 /*
df9ee29270c11db David Howells 2010-10-07 134 * Wrap the arch provided IRQ routines to provide appropriate checks.
df9ee29270c11db David Howells 2010-10-07 135 */
df9ee29270c11db David Howells 2010-10-07 136 #define raw_local_irq_disable() arch_local_irq_disable()
df9ee29270c11db David Howells 2010-10-07 137 #define raw_local_irq_enable() arch_local_irq_enable()
df9ee29270c11db David Howells 2010-10-07 138 #define raw_local_irq_save(flags) \
df9ee29270c11db David Howells 2010-10-07 139 do { \
df9ee29270c11db David Howells 2010-10-07 140 typecheck(unsigned long, flags); \
df9ee29270c11db David Howells 2010-10-07 141 flags = arch_local_irq_save(); \
df9ee29270c11db David Howells 2010-10-07 142 } while (0)
df9ee29270c11db David Howells 2010-10-07 143 #define raw_local_irq_restore(flags) \
df9ee29270c11db David Howells 2010-10-07 144 do { \
df9ee29270c11db David Howells 2010-10-07 145 typecheck(unsigned long, flags); \
df9ee29270c11db David Howells 2010-10-07 146 arch_local_irq_restore(flags); \
df9ee29270c11db David Howells 2010-10-07 147 } while (0)
df9ee29270c11db David Howells 2010-10-07 148 #define raw_local_save_flags(flags) \
df9ee29270c11db David Howells 2010-10-07 149 do { \
df9ee29270c11db David Howells 2010-10-07 150 typecheck(unsigned long, flags); \
df9ee29270c11db David Howells 2010-10-07 151 flags = arch_local_save_flags(); \
df9ee29270c11db David Howells 2010-10-07 152 } while (0)
df9ee29270c11db David Howells 2010-10-07 153 #define raw_irqs_disabled_flags(flags) \
df9ee29270c11db David Howells 2010-10-07 154 ({ \
df9ee29270c11db David Howells 2010-10-07 155 typecheck(unsigned long, flags); \
df9ee29270c11db David Howells 2010-10-07 156 arch_irqs_disabled_flags(flags); \
df9ee29270c11db David Howells 2010-10-07 157 })
df9ee29270c11db David Howells 2010-10-07 @158 #define raw_irqs_disabled() (arch_irqs_disabled())
df9ee29270c11db David Howells 2010-10-07 159 #define raw_safe_halt() arch_safe_halt()
de30a2b355ea853 Ingo Molnar 2006-07-03 160
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 57470 bytes --]
^ permalink raw reply
* Re: [PATCH 2/2] lockdep: warn on redundant or incorrect irq state changes
From: kernel test robot @ 2020-07-24 2:57 UTC (permalink / raw)
To: Nicholas Piggin, linux-kernel
Cc: linux-arch, kbuild-all, Peter Zijlstra, Will Deacon,
Nicholas Piggin, Alexey Kardashevskiy, Ingo Molnar, linuxppc-dev
In-Reply-To: <20200723105615.1268126-2-npiggin@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 3086 bytes --]
Hi Nicholas,
I love your patch! Yet something to improve:
[auto build test ERROR on linux/master]
[also build test ERROR on powerpc/next linus/master v5.8-rc6]
[cannot apply to tip/locking/core next-20200723]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Nicholas-Piggin/lockdep-improve-current-hard-soft-irqs_enabled-synchronisation-with-actual-irq-state/20200723-185938
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 9ebcfadb0610322ac537dd7aa5d9cbc2b2894c68
config: x86_64-randconfig-a002-20200723 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-14) 9.3.0
reproduce (this is a W=1 build):
# save the attached .config to linux build tree
make W=1 ARCH=x86_64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
kernel/locking/lockdep.c: In function 'lockdep_init':
>> kernel/locking/lockdep.c:5673:9: error: 'struct task_struct' has no member named 'hardirqs_enabled'
5673 | current->hardirqs_enabled = 1;
| ^~
>> kernel/locking/lockdep.c:5674:9: error: 'struct task_struct' has no member named 'softirqs_enabled'
5674 | current->softirqs_enabled = 1;
| ^~
In file included from kernel/locking/lockdep.c:60:
At top level:
kernel/locking/lockdep_internals.h:64:28: warning: 'LOCKF_USED_IN_IRQ_READ' defined but not used [-Wunused-const-variable=]
64 | static const unsigned long LOCKF_USED_IN_IRQ_READ =
| ^~~~~~~~~~~~~~~~~~~~~~
In file included from kernel/locking/lockdep.c:60:
kernel/locking/lockdep_internals.h:58:28: warning: 'LOCKF_ENABLED_IRQ_READ' defined but not used [-Wunused-const-variable=]
58 | static const unsigned long LOCKF_ENABLED_IRQ_READ =
| ^~~~~~~~~~~~~~~~~~~~~~
In file included from kernel/locking/lockdep.c:60:
kernel/locking/lockdep_internals.h:52:28: warning: 'LOCKF_USED_IN_IRQ' defined but not used [-Wunused-const-variable=]
52 | static const unsigned long LOCKF_USED_IN_IRQ =
| ^~~~~~~~~~~~~~~~~
In file included from kernel/locking/lockdep.c:60:
kernel/locking/lockdep_internals.h:46:28: warning: 'LOCKF_ENABLED_IRQ' defined but not used [-Wunused-const-variable=]
46 | static const unsigned long LOCKF_ENABLED_IRQ =
| ^~~~~~~~~~~~~~~~~
vim +5673 kernel/locking/lockdep.c
5667
5668 printk(" per task-struct memory footprint: %zu bytes\n",
5669 sizeof(((struct task_struct *)NULL)->held_locks));
5670
5671 WARN_ON(irqs_disabled());
5672
> 5673 current->hardirqs_enabled = 1;
> 5674 current->softirqs_enabled = 1;
5675 }
5676
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 38517 bytes --]
^ permalink raw reply
* Re: [v4] powerpc/perf: Initialize power10 PMU registers in cpu setup routine
From: kernel test robot @ 2020-07-24 3:02 UTC (permalink / raw)
To: Athira Rajeev, mpe; +Cc: jniethe5, mikey, maddy, kbuild-all, linuxppc-dev
In-Reply-To: <1595489557-2047-1-git-send-email-atrajeev@linux.vnet.ibm.com>
[-- Attachment #1: Type: text/plain, Size: 7614 bytes --]
Hi Athira,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on powerpc/next]
[also build test ERROR on v5.8-rc6 next-20200723]
[cannot apply to mpe/next scottwood/next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Athira-Rajeev/powerpc-perf-Initialize-power10-PMU-registers-in-cpu-setup-routine/20200723-153537
base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
arch/powerpc/kernel/cpu_setup_power.S: Assembler messages:
>> arch/powerpc/kernel/cpu_setup_power.S:244: Error: non-constant expression in ".if" statement
>> arch/powerpc/kernel/cpu_setup_power.S:244: Error: non-constant expression in ".if" statement
>> arch/powerpc/kernel/cpu_setup_power.S:243: Error: unsupported relocation against SPRN_MMCR3
vim +244 arch/powerpc/kernel/cpu_setup_power.S
14
15 /* Entry: r3 = crap, r4 = ptr to cputable entry
16 *
17 * Note that we can be called twice for pseudo-PVRs
18 */
19 _GLOBAL(__setup_cpu_power7)
20 mflr r11
21 bl __init_hvmode_206
22 mtlr r11
23 beqlr
24 li r0,0
25 mtspr SPRN_LPID,r0
26 LOAD_REG_IMMEDIATE(r0, PCR_MASK)
27 mtspr SPRN_PCR,r0
28 mfspr r3,SPRN_LPCR
29 li r4,(LPCR_LPES1 >> LPCR_LPES_SH)
30 bl __init_LPCR_ISA206
31 mtlr r11
32 blr
33
34 _GLOBAL(__restore_cpu_power7)
35 mflr r11
36 mfmsr r3
37 rldicl. r0,r3,4,63
38 beqlr
39 li r0,0
40 mtspr SPRN_LPID,r0
41 LOAD_REG_IMMEDIATE(r0, PCR_MASK)
42 mtspr SPRN_PCR,r0
43 mfspr r3,SPRN_LPCR
44 li r4,(LPCR_LPES1 >> LPCR_LPES_SH)
45 bl __init_LPCR_ISA206
46 mtlr r11
47 blr
48
49 _GLOBAL(__setup_cpu_power8)
50 mflr r11
51 bl __init_FSCR
52 bl __init_PMU
53 bl __init_PMU_ISA207
54 bl __init_hvmode_206
55 mtlr r11
56 beqlr
57 li r0,0
58 mtspr SPRN_LPID,r0
59 LOAD_REG_IMMEDIATE(r0, PCR_MASK)
60 mtspr SPRN_PCR,r0
61 mfspr r3,SPRN_LPCR
62 ori r3, r3, LPCR_PECEDH
63 li r4,0 /* LPES = 0 */
64 bl __init_LPCR_ISA206
65 bl __init_HFSCR
66 bl __init_PMU_HV
67 bl __init_PMU_HV_ISA207
68 mtlr r11
69 blr
70
71 _GLOBAL(__restore_cpu_power8)
72 mflr r11
73 bl __init_FSCR
74 bl __init_PMU
75 bl __init_PMU_ISA207
76 mfmsr r3
77 rldicl. r0,r3,4,63
78 mtlr r11
79 beqlr
80 li r0,0
81 mtspr SPRN_LPID,r0
82 LOAD_REG_IMMEDIATE(r0, PCR_MASK)
83 mtspr SPRN_PCR,r0
84 mfspr r3,SPRN_LPCR
85 ori r3, r3, LPCR_PECEDH
86 li r4,0 /* LPES = 0 */
87 bl __init_LPCR_ISA206
88 bl __init_HFSCR
89 bl __init_PMU_HV
90 bl __init_PMU_HV_ISA207
91 mtlr r11
92 blr
93
94 _GLOBAL(__setup_cpu_power10)
95 mflr r11
96 bl __init_FSCR_power10
97 bl __init_PMU
98 bl __init_PMU_ISA31
99 b 1f
100
101 _GLOBAL(__setup_cpu_power9)
102 mflr r11
103 bl __init_FSCR
104 bl __init_PMU
105 1: bl __init_hvmode_206
106 mtlr r11
107 beqlr
108 li r0,0
109 mtspr SPRN_PSSCR,r0
110 mtspr SPRN_LPID,r0
111 mtspr SPRN_PID,r0
112 LOAD_REG_IMMEDIATE(r0, PCR_MASK)
113 mtspr SPRN_PCR,r0
114 mfspr r3,SPRN_LPCR
115 LOAD_REG_IMMEDIATE(r4, LPCR_PECEDH | LPCR_PECE_HVEE | LPCR_HVICE | LPCR_HEIC)
116 or r3, r3, r4
117 LOAD_REG_IMMEDIATE(r4, LPCR_UPRT | LPCR_HR)
118 andc r3, r3, r4
119 li r4,0 /* LPES = 0 */
120 bl __init_LPCR_ISA300
121 bl __init_HFSCR
122 bl __init_PMU_HV
123 mtlr r11
124 blr
125
126 _GLOBAL(__restore_cpu_power10)
127 mflr r11
128 bl __init_FSCR_power10
129 bl __init_PMU
130 bl __init_PMU_ISA31
131 b 1f
132
133 _GLOBAL(__restore_cpu_power9)
134 mflr r11
135 bl __init_FSCR
136 bl __init_PMU
137 1: mfmsr r3
138 rldicl. r0,r3,4,63
139 mtlr r11
140 beqlr
141 li r0,0
142 mtspr SPRN_PSSCR,r0
143 mtspr SPRN_LPID,r0
144 mtspr SPRN_PID,r0
145 LOAD_REG_IMMEDIATE(r0, PCR_MASK)
146 mtspr SPRN_PCR,r0
147 mfspr r3,SPRN_LPCR
148 LOAD_REG_IMMEDIATE(r4, LPCR_PECEDH | LPCR_PECE_HVEE | LPCR_HVICE | LPCR_HEIC)
149 or r3, r3, r4
150 LOAD_REG_IMMEDIATE(r4, LPCR_UPRT | LPCR_HR)
151 andc r3, r3, r4
152 li r4,0 /* LPES = 0 */
153 bl __init_LPCR_ISA300
154 bl __init_HFSCR
155 bl __init_PMU_HV
156 mtlr r11
157 blr
158
159 __init_hvmode_206:
160 /* Disable CPU_FTR_HVMODE and exit if MSR:HV is not set */
161 mfmsr r3
162 rldicl. r0,r3,4,63
163 bnelr
164 ld r5,CPU_SPEC_FEATURES(r4)
165 LOAD_REG_IMMEDIATE(r6,CPU_FTR_HVMODE | CPU_FTR_P9_TM_HV_ASSIST)
166 andc r5,r5,r6
167 std r5,CPU_SPEC_FEATURES(r4)
168 blr
169
170 __init_LPCR_ISA206:
171 /* Setup a sane LPCR:
172 * Called with initial LPCR in R3 and desired LPES 2-bit value in R4
173 *
174 * LPES = 0b01 (HSRR0/1 used for 0x500)
175 * PECE = 0b111
176 * DPFD = 4
177 * HDICE = 0
178 * VC = 0b100 (VPM0=1, VPM1=0, ISL=0)
179 * VRMASD = 0b10000 (L=1, LP=00)
180 *
181 * Other bits untouched for now
182 */
183 li r5,0x10
184 rldimi r3,r5, LPCR_VRMASD_SH, 64-LPCR_VRMASD_SH-5
185
186 /* POWER9 has no VRMASD */
187 __init_LPCR_ISA300:
188 rldimi r3,r4, LPCR_LPES_SH, 64-LPCR_LPES_SH-2
189 ori r3,r3,(LPCR_PECE0|LPCR_PECE1|LPCR_PECE2)
190 li r5,4
191 rldimi r3,r5, LPCR_DPFD_SH, 64-LPCR_DPFD_SH-3
192 clrrdi r3,r3,1 /* clear HDICE */
193 li r5,4
194 rldimi r3,r5, LPCR_VC_SH, 0
195 mtspr SPRN_LPCR,r3
196 isync
197 blr
198
199 __init_FSCR_power10:
200 mfspr r3, SPRN_FSCR
201 ori r3, r3, FSCR_PREFIX
202 mtspr SPRN_FSCR, r3
203 // fall through
204
205 __init_FSCR:
206 mfspr r3,SPRN_FSCR
207 ori r3,r3,FSCR_TAR|FSCR_EBB
208 mtspr SPRN_FSCR,r3
209 blr
210
211 __init_HFSCR:
212 mfspr r3,SPRN_HFSCR
213 ori r3,r3,HFSCR_TAR|HFSCR_TM|HFSCR_BHRB|HFSCR_PM|\
214 HFSCR_DSCR|HFSCR_VECVSX|HFSCR_FP|HFSCR_EBB|HFSCR_MSGP
215 mtspr SPRN_HFSCR,r3
216 blr
217
218 __init_PMU_HV:
219 li r5,0
220 mtspr SPRN_MMCRC,r5
221 blr
222
223 __init_PMU_HV_ISA207:
224 li r5,0
225 mtspr SPRN_MMCRH,r5
226 blr
227
228 __init_PMU:
229 li r5,0
230 mtspr SPRN_MMCRA,r5
231 mtspr SPRN_MMCR0,r5
232 mtspr SPRN_MMCR1,r5
233 mtspr SPRN_MMCR2,r5
234 blr
235
236 __init_PMU_ISA207:
237 li r5,0
238 mtspr SPRN_MMCRS,r5
239 blr
240
241 __init_PMU_ISA31:
242 li r5,0
> 243 mtspr SPRN_MMCR3,r5
> 244 LOAD_REG_IMMEDIATE(r5, MMCRA_BHRB_DISABLE)
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 26258 bytes --]
* Re: [PATCH v5 7/7] KVM: PPC: Book3S HV: rework secure mem slot dropping
From: Bharata B Rao @ 2020-07-24 3:03 UTC (permalink / raw)
To: Ram Pai
Cc: ldufour, cclaudio, kvm-ppc, sathnaga, aneesh.kumar, sukadev,
linuxppc-dev, bauerman, david
In-Reply-To: <1595534844-16188-8-git-send-email-linuxram@us.ibm.com>
On Thu, Jul 23, 2020 at 01:07:24PM -0700, Ram Pai wrote:
> From: Laurent Dufour <ldufour@linux.ibm.com>
>
> When a secure memslot is dropped, all the pages backed in the secure
> device (aka really backed by secure memory by the Ultravisor)
> should be paged out to a normal page. Previously, this was
> achieved by triggering the page fault mechanism which is calling
> kvmppc_svm_page_out() on each page.
>
> This can't work when hot unplugging a memory slot because the memory
> slot is flagged as invalid and gfn_to_pfn() is then not trying to access
> the page, so the page fault mechanism is not triggered.
>
> Since the final goal is to make a call to kvmppc_svm_page_out() it seems
> simpler to call directly instead of triggering such a mechanism. This
> way kvmppc_uvmem_drop_pages() can be called even when hot unplugging a
> memslot.
>
> Since kvmppc_uvmem_drop_pages() is already holding kvm->arch.uvmem_lock,
> the call to __kvmppc_svm_page_out() is made. As
> __kvmppc_svm_page_out needs the vma pointer to migrate the pages,
> the VMA is fetched in a lazy way, to not trigger find_vma() all
> the time. In addition, the mmap_sem is held in read mode during
> that time, not in write mode since the virtual memory layout is not
> impacted, and kvm->arch.uvmem_lock prevents concurrent operation
> on the secure device.
>
> Cc: Ram Pai <linuxram@us.ibm.com>
> Cc: Bharata B Rao <bharata@linux.ibm.com>
> Cc: Paul Mackerras <paulus@ozlabs.org>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> [modified the changelog description]
> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> ---
> arch/powerpc/kvm/book3s_hv_uvmem.c | 54 ++++++++++++++++++++++++++------------
> 1 file changed, 37 insertions(+), 17 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
> index c772e92..daffa6e 100644
> --- a/arch/powerpc/kvm/book3s_hv_uvmem.c
> +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
> @@ -632,35 +632,55 @@ static inline int kvmppc_svm_page_out(struct vm_area_struct *vma,
> * fault on them, do fault time migration to replace the device PTEs in
> * QEMU page table with normal PTEs from newly allocated pages.
> */
> -void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
> +void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *slot,
> struct kvm *kvm, bool skip_page_out)
> {
> int i;
> struct kvmppc_uvmem_page_pvt *pvt;
> - unsigned long pfn, uvmem_pfn;
> - unsigned long gfn = free->base_gfn;
> + struct page *uvmem_page;
> + struct vm_area_struct *vma = NULL;
> + unsigned long uvmem_pfn, gfn;
> + unsigned long addr, end;
> +
> + mmap_read_lock(kvm->mm);
> +
> + addr = slot->userspace_addr;
> + end = addr + (slot->npages * PAGE_SIZE);
>
> - for (i = free->npages; i; --i, ++gfn) {
> - struct page *uvmem_page;
> + gfn = slot->base_gfn;
> + for (i = slot->npages; i; --i, ++gfn, addr += PAGE_SIZE) {
> +
> + /* Fetch the VMA if addr is not in the latest fetched one */
> + if (!vma || (addr < vma->vm_start || addr >= vma->vm_end)) {
> + vma = find_vma_intersection(kvm->mm, addr, end);
> + if (!vma ||
> + vma->vm_start > addr || vma->vm_end < end) {
> + pr_err("Can't find VMA for gfn:0x%lx\n", gfn);
> + break;
> + }
There is a potential issue with the boundary condition check here
which I discussed with Laurent yesterday. I guess he hasn't gotten around
to looking at it yet.
Regards,
Bharata.
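A minimal userspace sketch of why the quoted check may be too strict. It assumes the concern is a memslot whose userspace range spans more than one VMA; that is a guess, not something stated in this thread. `struct range`, `find_intersection()`, `strict_check()`, `per_page_check()` and `walk_slot()` are hypothetical stand-ins, not kernel APIs; `find_intersection()` only imitates what find_vma_intersection() does for sorted, non-overlapping ranges.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical stand-in for a VMA: the half-open range [start, end). */
struct range {
	unsigned long start;
	unsigned long end;
};

/* Imitates find_vma_intersection(): first range overlapping [start, end). */
static const struct range *find_intersection(const struct range *r, size_t n,
					      unsigned long start,
					      unsigned long end)
{
	for (size_t i = 0; i < n; i++)
		if (r[i].start < end && start < r[i].end)
			return &r[i];
	return NULL;
}

/* The check as quoted: the VMA found must cover the whole rest of the slot. */
static bool strict_check(const struct range *vma, unsigned long addr,
			 unsigned long slot_end)
{
	return vma && vma->start <= addr && vma->end >= slot_end;
}

/* Assumed alternative: the VMA only has to contain the current address. */
static bool per_page_check(const struct range *vma, unsigned long addr)
{
	return vma && vma->start <= addr && addr < vma->end;
}

/* Walk a slot page by page; return how many pages were walked before a
 * lookup failed the chosen check (npages means the whole slot passed). */
static unsigned long walk_slot(const struct range *r, size_t n,
			       unsigned long addr, unsigned long npages,
			       unsigned long page_size, bool strict)
{
	unsigned long slot_end = addr + npages * page_size;
	const struct range *vma = NULL;
	unsigned long done = 0;

	for (; done < npages; done++, addr += page_size) {
		/* Refetch only when addr left the last range, as in the patch. */
		if (!vma || addr < vma->start || addr >= vma->end) {
			vma = find_intersection(r, n, addr, slot_end);
			if (strict ? !strict_check(vma, addr, slot_end)
				   : !per_page_check(vma, addr))
				break;
		}
	}
	return done;
}
```

With two adjacent ranges backing one slot, the strict variant stops on the first page because the first range does not reach the end of the slot, while the per-page variant walks all pages. With a single range covering the slot both behave the same.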
* Re: [PATCH 1/2] lockdep: improve current->(hard|soft)irqs_enabled synchronisation with actual irq state
From: kernel test robot @ 2020-07-24 3:15 UTC (permalink / raw)
To: Nicholas Piggin, linux-kernel
Cc: linux-arch, kbuild-all, Peter Zijlstra, Will Deacon,
Nicholas Piggin, Alexey Kardashevskiy, Ingo Molnar, linuxppc-dev
In-Reply-To: <20200723105615.1268126-1-npiggin@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 6352 bytes --]
Hi Nicholas,
I love your patch! Perhaps something to improve:
[auto build test WARNING on linux/master]
[also build test WARNING on powerpc/next linus/master v5.8-rc6 next-20200723]
[cannot apply to tip/locking/core]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Nicholas-Piggin/lockdep-improve-current-hard-soft-irqs_enabled-synchronisation-with-actual-irq-state/20200723-185938
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 9ebcfadb0610322ac537dd7aa5d9cbc2b2894c68
config: i386-randconfig-s001-20200723 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-14) 9.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.2-93-g4c6cbe55-dirty
# save the attached .config to linux build tree
make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=i386
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
sparse warnings: (new ones prefixed by >>)
kernel/locking/spinlock.c:149:17: sparse: sparse: context imbalance in '_raw_spin_lock' - wrong count at exit
kernel/locking/spinlock.c: note: in included file (through include/linux/preempt.h):
>> arch/x86/include/asm/preempt.h:79:9: sparse: sparse: context imbalance in '_raw_spin_lock_irqsave' - wrong count at exit
>> arch/x86/include/asm/preempt.h:79:9: sparse: sparse: context imbalance in '_raw_spin_lock_irq' - wrong count at exit
kernel/locking/spinlock.c:173:17: sparse: sparse: context imbalance in '_raw_spin_lock_bh' - wrong count at exit
kernel/locking/spinlock.c:181:17: sparse: sparse: context imbalance in '_raw_spin_unlock' - unexpected unlock
kernel/locking/spinlock.c:189:17: sparse: sparse: context imbalance in '_raw_spin_unlock_irqrestore' - unexpected unlock
kernel/locking/spinlock.c:197:17: sparse: sparse: context imbalance in '_raw_spin_unlock_irq' - unexpected unlock
kernel/locking/spinlock.c:205:17: sparse: sparse: context imbalance in '_raw_spin_unlock_bh' - unexpected unlock
kernel/locking/spinlock.c:221:17: sparse: sparse: context imbalance in '_raw_read_lock' - wrong count at exit
>> arch/x86/include/asm/preempt.h:79:9: sparse: sparse: context imbalance in '_raw_read_lock_irqsave' - wrong count at exit
>> arch/x86/include/asm/preempt.h:79:9: sparse: sparse: context imbalance in '_raw_read_lock_irq' - wrong count at exit
kernel/locking/spinlock.c:245:17: sparse: sparse: context imbalance in '_raw_read_lock_bh' - wrong count at exit
kernel/locking/spinlock.c:253:17: sparse: sparse: context imbalance in '_raw_read_unlock' - unexpected unlock
kernel/locking/spinlock.c:261:17: sparse: sparse: context imbalance in '_raw_read_unlock_irqrestore' - unexpected unlock
kernel/locking/spinlock.c:269:17: sparse: sparse: context imbalance in '_raw_read_unlock_irq' - unexpected unlock
kernel/locking/spinlock.c:277:17: sparse: sparse: context imbalance in '_raw_read_unlock_bh' - unexpected unlock
kernel/locking/spinlock.c:293:17: sparse: sparse: context imbalance in '_raw_write_lock' - wrong count at exit
>> arch/x86/include/asm/preempt.h:79:9: sparse: sparse: context imbalance in '_raw_write_lock_irqsave' - wrong count at exit
>> arch/x86/include/asm/preempt.h:79:9: sparse: sparse: context imbalance in '_raw_write_lock_irq' - wrong count at exit
kernel/locking/spinlock.c:317:17: sparse: sparse: context imbalance in '_raw_write_lock_bh' - wrong count at exit
kernel/locking/spinlock.c:325:17: sparse: sparse: context imbalance in '_raw_write_unlock' - unexpected unlock
kernel/locking/spinlock.c:333:17: sparse: sparse: context imbalance in '_raw_write_unlock_irqrestore' - unexpected unlock
kernel/locking/spinlock.c:341:17: sparse: sparse: context imbalance in '_raw_write_unlock_irq' - unexpected unlock
kernel/locking/spinlock.c:349:17: sparse: sparse: context imbalance in '_raw_write_unlock_bh' - unexpected unlock
kernel/locking/spinlock.c:358:17: sparse: sparse: context imbalance in '_raw_spin_lock_nested' - wrong count at exit
>> arch/x86/include/asm/preempt.h:79:9: sparse: sparse: context imbalance in '_raw_spin_lock_irqsave_nested' - wrong count at exit
kernel/locking/spinlock.c:380:17: sparse: sparse: context imbalance in '_raw_spin_lock_nest_lock' - wrong count at exit
--
kernel/trace/ring_buffer.c:699:32: sparse: sparse: incorrect type in return expression (different base types) @@ expected restricted __poll_t @@ got int @@
kernel/trace/ring_buffer.c:699:32: sparse: expected restricted __poll_t
kernel/trace/ring_buffer.c:699:32: sparse: got int
kernel/trace/ring_buffer.c: note: in included file (through include/linux/irqflags.h, arch/x86/include/asm/special_insns.h, arch/x86/include/asm/processor.h, ...):
>> arch/x86/include/asm/irqflags.h:162:28: sparse: sparse: context imbalance in 'ring_buffer_peek' - different lock contexts for basic block
>> arch/x86/include/asm/irqflags.h:162:28: sparse: sparse: context imbalance in 'ring_buffer_consume' - different lock contexts for basic block
>> arch/x86/include/asm/irqflags.h:162:28: sparse: sparse: context imbalance in 'ring_buffer_empty' - different lock contexts for basic block
>> arch/x86/include/asm/irqflags.h:162:28: sparse: sparse: context imbalance in 'ring_buffer_empty_cpu' - different lock contexts for basic block
vim +/_raw_spin_lock_irqsave +79 arch/x86/include/asm/preempt.h
c2daa3bed53a811 Peter Zijlstra 2013-08-14 72
c2daa3bed53a811 Peter Zijlstra 2013-08-14 73 /*
c2daa3bed53a811 Peter Zijlstra 2013-08-14 74 * The various preempt_count add/sub methods
c2daa3bed53a811 Peter Zijlstra 2013-08-14 75 */
c2daa3bed53a811 Peter Zijlstra 2013-08-14 76
c2daa3bed53a811 Peter Zijlstra 2013-08-14 77 static __always_inline void __preempt_count_add(int val)
c2daa3bed53a811 Peter Zijlstra 2013-08-14 78 {
b3ca1c10d7b32fd Christoph Lameter 2014-04-07 @79 raw_cpu_add_4(__preempt_count, val);
c2daa3bed53a811 Peter Zijlstra 2013-08-14 80 }
c2daa3bed53a811 Peter Zijlstra 2013-08-14 81
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 35341 bytes --]
* Re: [PATCH 15/15] powerpc/powernv/sriov: Make single PE mode a per-BAR setting
From: Oliver O'Halloran @ 2020-07-24 3:40 UTC (permalink / raw)
To: Alexey Kardashevskiy; +Cc: linuxppc-dev@lists.ozlabs.org
In-Reply-To: <fdd88062-0b62-fc6b-4de7-a4e099768cd9@ozlabs.ru>
On Wed, Jul 22, 2020 at 8:06 PM Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>
> >> Well, realistically the segment size should be 8MB to make this matter
> >> (or the whole window 2GB) which does not seem to happen so it does not
> >> matter.
> >
> > I'm not sure what you mean.
>
> I mean how can we possibly hit this case, what m64_segsize would the
> platform have to trigger this. The whole check seems useless but whatever.
Yeah maybe.
IIRC some old P8 FSP systems had tiny M64 windows so it might have
been an issue there. Maybe we can get rid of it, but I'd rather just
leave the behaviour as-is for now.
* Re: [PATCH 1/2] lockdep: improve current->(hard|soft)irqs_enabled synchronisation with actual irq state
From: Alexey Kardashevskiy @ 2020-07-24 4:16 UTC (permalink / raw)
To: Nicholas Piggin, Peter Zijlstra
Cc: linux-arch, Will Deacon, Ingo Molnar, linuxppc-dev, linux-kernel
In-Reply-To: <1595506730.3mvrxktem5.astroid@bobo.none>
On 23/07/2020 23:11, Nicholas Piggin wrote:
> Excerpts from Peter Zijlstra's message of July 23, 2020 9:40 pm:
>> On Thu, Jul 23, 2020 at 08:56:14PM +1000, Nicholas Piggin wrote:
>>
>>> diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
>>> index 3a0db7b0b46e..35060be09073 100644
>>> --- a/arch/powerpc/include/asm/hw_irq.h
>>> +++ b/arch/powerpc/include/asm/hw_irq.h
>>> @@ -200,17 +200,14 @@ static inline bool arch_irqs_disabled(void)
>>> #define powerpc_local_irq_pmu_save(flags) \
>>> do { \
>>> raw_local_irq_pmu_save(flags); \
>>> - trace_hardirqs_off(); \
>>> + if (!raw_irqs_disabled_flags(flags)) \
>>> + trace_hardirqs_off(); \
>>> } while(0)
>>> #define powerpc_local_irq_pmu_restore(flags) \
>>> do { \
>>> - if (raw_irqs_disabled_flags(flags)) { \
>>> - raw_local_irq_pmu_restore(flags); \
>>> - trace_hardirqs_off(); \
>>> - } else { \
>>> + if (!raw_irqs_disabled_flags(flags)) \
>>> trace_hardirqs_on(); \
>>> - raw_local_irq_pmu_restore(flags); \
>>> - } \
>>> + raw_local_irq_pmu_restore(flags); \
>>> } while(0)
>>
>> You shouldn't be calling lockdep from NMI context!
>
> After this patch it doesn't.
>
> trace_hardirqs_on/off implementation appears to expect to be called in NMI
> context though, for some reason.
>
>> That is, I recently
>> added support for that on x86:
>>
>> https://lkml.kernel.org/r/20200623083721.155449112@infradead.org
>> https://lkml.kernel.org/r/20200623083721.216740948@infradead.org
>>
>> But you need to be very careful on how you order things, as you can see
>> the above relies on preempt_count() already having been incremented with
>> NMI_MASK.
>
> Hmm. My patch seems simpler.
And your patches fix my error while Peter's do not:
IRQs not enabled as expected
WARNING: CPU: 0 PID: 1377 at /home/aik/p/kernel/kernel/softirq.c:169
__local_bh_enable_ip+0x118/0x190
>
> I don't know this stuff very well, I don't really understand what your patch
> enables for x86 but at least it shouldn't be incompatible with this one
> AFAIKS.
>
> Thanks,
> Nick
>
--
Alexey
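A toy userspace model of the save/restore logic in the quoted hw_irq.h hunk, as a sketch only. The three state variables and the `pmu_save()`/`pmu_restore()` helpers are hypothetical stand-ins for the real interrupt state, the lockdep view of it, and the powerpc_local_irq_pmu_save()/restore() macros. Note that the saved flag here uses the opposite polarity to raw_irqs_disabled_flags(): true means interrupts were enabled.

```c
#include <stdbool.h>

/* Toy model: hypothetical stand-ins for the hardware and lockdep state. */
static bool hw_irqs_enabled = true;     /* the "real" interrupt state   */
static bool traced_irqs_enabled = true; /* what irq tracing believes    */
static int trace_calls;                 /* calls into the tracing hooks */

static void trace_hardirqs_off(void)
{
	traced_irqs_enabled = false;
	trace_calls++;
}

static void trace_hardirqs_on(void)
{
	traced_irqs_enabled = true;
	trace_calls++;
}

/* Save the old state in *flags (true = was enabled), then disable. */
static void pmu_save(bool *flags)
{
	*flags = hw_irqs_enabled;
	hw_irqs_enabled = false;
	if (*flags)		/* only trace a real on -> off change */
		trace_hardirqs_off();
}

/* Restore the saved state, tracing only a real off -> on change. */
static void pmu_restore(bool flags)
{
	if (flags)
		trace_hardirqs_on();
	hw_irqs_enabled = flags;
}
```

In this model, a save/restore pair entered with interrupts already disabled never calls a tracing hook, so it cannot change what the tracing state says. A pair entered with interrupts enabled calls each hook exactly once.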
* Re: [PATCH v5 4/7] KVM: PPC: Book3S HV: in H_SVM_INIT_DONE, migrate remaining normal-GFNs to secure-GFNs.
From: Bharata B Rao @ 2020-07-24 4:27 UTC (permalink / raw)
To: Ram Pai
Cc: ldufour, cclaudio, kvm-ppc, sathnaga, aneesh.kumar, sukadev,
linuxppc-dev, bauerman, david
In-Reply-To: <1595534844-16188-5-git-send-email-linuxram@us.ibm.com>
On Thu, Jul 23, 2020 at 01:07:21PM -0700, Ram Pai wrote:
> The Ultravisor is expected to explicitly call H_SVM_PAGE_IN for all the
> pages of the SVM before calling H_SVM_INIT_DONE. This causes a huge
> delay in transitioning the VM to SVM. The Ultravisor is only interested
> in the pages that contain the kernel, initrd and other important data
> structures. The rest contain throw-away content.
>
> However if not all pages are requested by the Ultravisor, the Hypervisor
> continues to consider the GFNs corresponding to the non-requested pages
> as normal GFNs. This can lead to data-corruption and undefined behavior.
>
> In H_SVM_INIT_DONE handler, move all the PFNs associated with the SVM's
> GFNs to secure-PFNs. Skip the GFNs that are already Paged-in or Shared
> or Paged-in followed by a Paged-out.
>
> Cc: Paul Mackerras <paulus@ozlabs.org>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Bharata B Rao <bharata@linux.ibm.com>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> Cc: Laurent Dufour <ldufour@linux.ibm.com>
> Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Cc: Claudio Carvalho <cclaudio@linux.ibm.com>
> Cc: kvm-ppc@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
> Documentation/powerpc/ultravisor.rst | 2 +
> arch/powerpc/include/asm/kvm_book3s_uvmem.h | 2 +
> arch/powerpc/kvm/book3s_hv_uvmem.c | 136 +++++++++++++++++++++++++---
> 3 files changed, 127 insertions(+), 13 deletions(-)
>
> diff --git a/Documentation/powerpc/ultravisor.rst b/Documentation/powerpc/ultravisor.rst
> index a1c8c37..ba6b1bf 100644
> --- a/Documentation/powerpc/ultravisor.rst
> +++ b/Documentation/powerpc/ultravisor.rst
> @@ -934,6 +934,8 @@ Return values
> * H_UNSUPPORTED if called from the wrong context (e.g.
> from an SVM or before an H_SVM_INIT_START
> hypercall).
> + * H_STATE if the hypervisor could not successfully
> + transition the VM to Secure VM.
>
> Description
> ~~~~~~~~~~~
> diff --git a/arch/powerpc/include/asm/kvm_book3s_uvmem.h b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
> index 9cb7d8b..f229ab5 100644
> --- a/arch/powerpc/include/asm/kvm_book3s_uvmem.h
> +++ b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
> @@ -23,6 +23,8 @@ unsigned long kvmppc_h_svm_page_out(struct kvm *kvm,
> unsigned long kvmppc_h_svm_init_abort(struct kvm *kvm);
> void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
> struct kvm *kvm, bool skip_page_out);
> +int kvmppc_uv_migrate_mem_slot(struct kvm *kvm,
> + const struct kvm_memory_slot *memslot);
I still don't see why this should be a global function. You should be able
to move around a few functions in book3s_hv_uvmem.c up/down and
satisfy the calling order dependencies.
Otherwise, Reviewed-by: Bharata B Rao <bharata@linux.ibm.com>
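A small sketch of the reordering suggested above, with hypothetical names throughout (`migrate_one_slot()` and `init_done()` are not the functions in book3s_hv_uvmem.c). It only illustrates the C scoping rule: a static helper defined above its caller needs no prototype in a shared header.

```c
/* Hypothetical helper: file-local, so it stays out of the header. */
static int migrate_one_slot(int npages)
{
	/* Pretend an empty slot is an error, purely for illustration. */
	return npages > 0 ? 0 : -1;
}

/* Hypothetical caller: defined after the helper, so no forward
 * declaration is needed anywhere. */
int init_done(int nslots, const int *npages)
{
	for (int i = 0; i < nslots; i++)
		if (migrate_one_slot(npages[i]))
			return -1;
	return 0;
}
```

The trade-off is purely one of placement: the helper must appear before its first use in the file, or be forward-declared as static in the same file.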
* [powerpc:fixes-test] BUILD SUCCESS 590ce02bd148cd35721560c140e3759e39a6e56a
From: kernel test robot @ 2020-07-24 4:54 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git fixes-test
branch HEAD: 590ce02bd148cd35721560c140e3759e39a6e56a powerpc/64s: Fix irq tracing corruption in interrupt/syscall return caused by perf interrupts
elapsed time: 905m
configs tested: 74
configs skipped: 1
The following configs have been built successfully.
More configs may be tested in the coming days.
arm defconfig
arm allyesconfig
arm allmodconfig
arm allnoconfig
arm64 allyesconfig
arm64 defconfig
arm64 allmodconfig
arm64 allnoconfig
i386 allyesconfig
i386 defconfig
i386 debian-10.3
i386 allnoconfig
ia64 allmodconfig
ia64 defconfig
ia64 allnoconfig
ia64 allyesconfig
m68k allmodconfig
m68k allnoconfig
m68k sun3_defconfig
m68k defconfig
m68k allyesconfig
nds32 defconfig
nds32 allnoconfig
csky allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
h8300 allmodconfig
xtensa defconfig
arc defconfig
arc allyesconfig
sh allmodconfig
sh allnoconfig
microblaze allnoconfig
nios2 defconfig
nios2 allyesconfig
openrisc defconfig
c6x allyesconfig
c6x allnoconfig
openrisc allyesconfig
mips allyesconfig
mips allnoconfig
mips allmodconfig
parisc allnoconfig
parisc defconfig
parisc allyesconfig
parisc allmodconfig
powerpc defconfig
powerpc allyesconfig
powerpc rhel-kconfig
powerpc allmodconfig
powerpc allnoconfig
riscv allyesconfig
riscv allnoconfig
riscv defconfig
riscv allmodconfig
s390 allyesconfig
s390 allnoconfig
s390 allmodconfig
s390 defconfig
sparc allyesconfig
sparc defconfig
sparc64 defconfig
sparc64 allnoconfig
sparc64 allyesconfig
sparc64 allmodconfig
x86_64 rhel-7.6-kselftests
x86_64 rhel-8.3
x86_64 kexec
x86_64 rhel
x86_64 lkp
x86_64 fedora-25
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* Re: [PATCH v2 14/14] powerpc/eeh: Move PE tree setup into the platform
From: Alexey Kardashevskiy @ 2020-07-24 5:01 UTC (permalink / raw)
To: Oliver O'Halloran, linuxppc-dev
In-Reply-To: <20200722042628.1425880-14-oohall@gmail.com>
On 22/07/2020 14:26, Oliver O'Halloran wrote:
> The EEH core has a concept of a "PE tree" to support PowerNV. The PE tree
> follows the PCI bus structures because a reset asserted on an upstream
> bridge will be propagated to the downstream bridges. On pseries there's a
> 1-1 correspondence between what the guest sees as a PHB and a PE so the
> "tree" is really just a single node.
>
> Currently the EEH core is responsible for setting up this PE tree which it
> does by traversing the pci_dn tree. The structure of the pci_dn tree
> matches the bus tree on PowerNV which leads to the PE tree being "correct", but
> this setup method doesn't make a whole lot of sense and it's actively
> confusing for the pseries case where it doesn't really do anything.
>
> We want to remove the dependence on pci_dn anyway so this patch moves
> choosing where to insert a new PE into the platform code rather than
> being part of the generic EEH code. For PowerNV this simplifies the
> tree building logic and removes the use of pci_dn. For pseries we
> keep the existing logic. I'm not really convinced it does anything
> due to the 1-1 PE-to-PHB correspondence so every device under that
> PHB should be in the same PE, but I'd rather not remove it entirely
> until we've had a chance to look at it more deeply.
>
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> v2: Reworked pseries PE setup slightly. NOT DONE YET. mostly done needs test
So far it was looking good and now this :)
When is it going to be done? Is this the broken stuff you mentioned
elsewhere?
--
Alexey
* Re: [PATCH v2 01/14] powerpc/eeh: Remove eeh_dev_phb_init_dynamic()
From: Alexey Kardashevskiy @ 2020-07-24 5:05 UTC (permalink / raw)
To: Oliver O'Halloran, linuxppc-dev
In-Reply-To: <20200722042628.1425880-1-oohall@gmail.com>
On 22/07/2020 14:26, Oliver O'Halloran wrote:
> This function is a one line wrapper around eeh_phb_pe_create() and despite
> the name it doesn't create any eeh_dev structures.
The "eeh_dev_phb_init_dynamic" name does not suggest anything really but
the comment does.
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> Replace it with direct
> calls to eeh_phb_pe_create() since that does what it says on the tin
> and removes a layer of indirection.
>
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
> v2: Added stub prototype of eeh_phb_pe_create() for the !CONFIG_EEH case.
> ---
> arch/powerpc/include/asm/eeh.h | 3 ++-
> arch/powerpc/kernel/eeh.c | 2 +-
> arch/powerpc/kernel/eeh_dev.c | 13 -------------
> arch/powerpc/kernel/of_platform.c | 4 ++--
> arch/powerpc/platforms/pseries/pci_dlpar.c | 2 +-
> 5 files changed, 6 insertions(+), 18 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> index 964a54292b36..64487b88c569 100644
> --- a/arch/powerpc/include/asm/eeh.h
> +++ b/arch/powerpc/include/asm/eeh.h
> @@ -294,7 +294,6 @@ const char *eeh_pe_loc_get(struct eeh_pe *pe);
> struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe);
>
> struct eeh_dev *eeh_dev_init(struct pci_dn *pdn);
> -void eeh_dev_phb_init_dynamic(struct pci_controller *phb);
> void eeh_show_enabled(void);
> int __init eeh_ops_register(struct eeh_ops *ops);
> int __exit eeh_ops_unregister(const char *name);
> @@ -370,6 +369,8 @@ void pseries_eeh_init_edev_recursive(struct pci_dn *pdn);
> #else
> static inline void pseries_eeh_add_device_early(struct pci_dn *pdn) { }
> static inline void pseries_eeh_add_device_tree_early(struct pci_dn *pdn) { }
> +
> +static inline int eeh_phb_pe_create(struct pci_controller *phb) { return 0; }
> #endif
>
> #ifdef CONFIG_PPC64
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index d407981dec76..859f76020256 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -1096,7 +1096,7 @@ static int eeh_init(void)
>
> /* Initialize PHB PEs */
> list_for_each_entry_safe(hose, tmp, &hose_list, list_node)
> - eeh_dev_phb_init_dynamic(hose);
> + eeh_phb_pe_create(hose);
>
> eeh_addr_cache_init();
>
> diff --git a/arch/powerpc/kernel/eeh_dev.c b/arch/powerpc/kernel/eeh_dev.c
> index 7370185c7a05..8e159a12f10c 100644
> --- a/arch/powerpc/kernel/eeh_dev.c
> +++ b/arch/powerpc/kernel/eeh_dev.c
> @@ -52,16 +52,3 @@ struct eeh_dev *eeh_dev_init(struct pci_dn *pdn)
>
> return edev;
> }
> -
> -/**
> - * eeh_dev_phb_init_dynamic - Create EEH devices for devices included in PHB
> - * @phb: PHB
> - *
> - * Scan the PHB OF node and its child association, then create the
> - * EEH devices accordingly
> - */
> -void eeh_dev_phb_init_dynamic(struct pci_controller *phb)
> -{
> - /* EEH PE for PHB */
> - eeh_phb_pe_create(phb);
> -}
> diff --git a/arch/powerpc/kernel/of_platform.c b/arch/powerpc/kernel/of_platform.c
> index 71a3f97dc988..f89376ff633e 100644
> --- a/arch/powerpc/kernel/of_platform.c
> +++ b/arch/powerpc/kernel/of_platform.c
> @@ -62,8 +62,8 @@ static int of_pci_phb_probe(struct platform_device *dev)
> /* Init pci_dn data structures */
> pci_devs_phb_init_dynamic(phb);
>
> - /* Create EEH PEs for the PHB */
> - eeh_dev_phb_init_dynamic(phb);
> + /* Create EEH PE for the PHB */
> + eeh_phb_pe_create(phb);
>
> /* Scan the bus */
> pcibios_scan_phb(phb);
> diff --git a/arch/powerpc/platforms/pseries/pci_dlpar.c b/arch/powerpc/platforms/pseries/pci_dlpar.c
> index b3a38f5a6b68..f9ae17e8a0f4 100644
> --- a/arch/powerpc/platforms/pseries/pci_dlpar.c
> +++ b/arch/powerpc/platforms/pseries/pci_dlpar.c
> @@ -34,7 +34,7 @@ struct pci_controller *init_phb_dynamic(struct device_node *dn)
> pci_devs_phb_init_dynamic(phb);
>
> /* Create EEH devices for the PHB */
> - eeh_dev_phb_init_dynamic(phb);
> + eeh_phb_pe_create(phb);
>
> if (dn->child)
> pseries_eeh_init_edev_recursive(PCI_DN(dn));
>
--
Alexey
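A sketch of the stub pattern the v2 note refers to, using hypothetical names: `HAVE_FEATURE` stands in for CONFIG_EEH, `struct controller` for struct pci_controller, and `feature_create()` for eeh_phb_pe_create(). This is not the kernel header itself.

```c
/* Hypothetical type standing in for struct pci_controller. */
struct controller {
	int id;
};

#ifdef HAVE_FEATURE
/* Feature built in: the real implementation lives in another file. */
int feature_create(struct controller *c);
#else
/* Feature compiled out: a static inline stub that reports success, so
 * callers build unchanged and the call can be optimised away. */
static inline int feature_create(struct controller *c)
{
	(void)c;
	return 0;
}
#endif

/* Hypothetical caller: no #ifdef is needed at the call site. */
int probe(struct controller *c)
{
	return feature_create(c);
}
```

Keeping the conditional in the header rather than at each call site is what lets the patch replace the wrapper with direct calls in of_platform.c and pci_dlpar.c without adding #ifdef blocks there.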
* Re: [PATCH v2 14/14] powerpc/eeh: Move PE tree setup into the platform
From: Oliver O'Halloran @ 2020-07-24 5:06 UTC (permalink / raw)
To: Alexey Kardashevskiy; +Cc: linuxppc-dev
In-Reply-To: <983435c2-0e6b-ded7-d28d-e6728c0a001e@ozlabs.ru>
On Fri, Jul 24, 2020 at 3:01 PM Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>
>
>
> On 22/07/2020 14:26, Oliver O'Halloran wrote:
> > The EEH core has a concept of a "PE tree" to support PowerNV. The PE tree
> > follows the PCI bus structures because a reset asserted on an upstream
> > bridge will be propagated to the downstream bridges. On pseries there's a
> > 1-1 correspondence between what the guest sees as a PHB and a PE so the
> > "tree" is really just a single node.
> >
> > Currently the EEH core is responsible for setting up this PE tree which it
> > does by traversing the pci_dn tree. The structure of the pci_dn tree
> > matches the bus tree on PowerNV which leads to the PE tree being "correct", but
> > this setup method doesn't make a whole lot of sense and it's actively
> > confusing for the pseries case where it doesn't really do anything.
> >
> > We want to remove the dependence on pci_dn anyway so this patch moves
> > choosing where to insert a new PE into the platform code rather than
> > being part of the generic EEH code. For PowerNV this simplifies the
> > tree building logic and removes the use of pci_dn. For pseries we
> > keep the existing logic. I'm not really convinced it does anything
> > due to the 1-1 PE-to-PHB correspondence so every device under that
> > PHB should be in the same PE, but I'd rather not remove it entirely
> > until we've had a chance to look at it more deeply.
> >
> > Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> > Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > ---
> > v2: Reworked pseries PE setup slightly. NOT DONE YET. mostly done needs test
>
> So far it was looking good and now this :)
>
> When is it going to be done? Is this the broken stuff you mentioned
> elsewhere?
I am a dumb.
I put those in there to remind myself what I have / haven't done when
respinning a series. I added that before I tested it and forgot to
remove the comment.
>
>
> --
> Alexey
^ permalink raw reply
* Re: [PATCH] powerpc/64s: Fix irq tracing corruption in interrupt/syscall return caused by perf interrupts
From: Alexey Kardashevskiy @ 2020-07-24 5:14 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev
In-Reply-To: <1595499918.mg25810wnp.astroid@bobo.none>
On 23/07/2020 20:29, Nicholas Piggin wrote:
> Excerpts from Alexey Kardashevskiy's message of July 22, 2020 8:50 pm:
>>
>>
>> On 22/07/2020 17:34, Nicholas Piggin wrote:
>>> Alexey reports lockdep_assert_irqs_enabled() warnings when stress testing perf, e.g.,
>>>
>>> WARNING: CPU: 0 PID: 1556 at kernel/softirq.c:169 __local_bh_enable_ip+0x258/0x270
>>> CPU: 0 PID: 1556 Comm: syz-executor
>>> NIP: c0000000001ec888 LR: c0000000001ec884 CTR: c000000000ef0610
>>> REGS: c000000022d4f8a0 TRAP: 0700 Not tainted (5.8.0-rc3-x)
>>> MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 28008844 XER: 20040000
>>> CFAR: c0000000001dc1d0 IRQMASK: 0
>>>
>>> The interesting thing is MSR[EE] and IRQMASK show interrupts are enabled,
>>> suggesting the current->hardirqs_enabled irq tracing state is going out of sync
>>> with the actual interrupt enable state.
>>>
>>> The cause is a window in interrupt/syscall return where irq tracing state is being
>>> adjusted for an irqs-enabled return while MSR[EE] is still enabled. A perf
>>> interrupt hits and ends up calling trace_hardirqs_off() when restoring
>>> interrupt flags to a disabled state.
>>>
>>> Fix this by disabling perf interrupts as well while adjusting irq tracing state.
>>>
>>> Add a debug check that catches the condition sooner.
>>>
>>> Fixes: 68b34588e202 ("powerpc/64/sycall: Implement syscall entry/exit logic in C")
>>> Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>>> ---
>>>
>>> I can reproduce similar symptoms and this patch fixes my test case,
>>> still trying to confirm Alexey's test case or whether there's another
>>> similar bug causing it.
>>
>>
>> This does not fix my testcase. I applied this on top of 4fa640dc5230
>> ("Merge tag 'vfio-v5.8-rc7' of git://github.com/awilliam/linux-vfio into
>> master") without any of my testing code, just to be clear. Sorry...
>
> Okay, it seems to be a bigger problem and not actually caused by that
> patch; it was possible for the lockdep hardirqs_enabled state to get out
> of sync with the local_irq_disable() state before that too. Root
> cause is similar -- perf interrupts hitting between updating the two
> different bits of state.
>
> Not quite sure why Alexey's test wasn't hitting it before the patch,
> but possibly because of the way masked interrupts get replayed. But I was able
> to hit the problem with a different assertion.
>
> I think I have a fix, but it seems to be a generic irq tracing code
> issue. So this patch can be dropped, and it's not an urgent issue for
> the next release (it only triggers warns on rare occasions and only
> when lockdep is enabled).
I would still like to understand how the last
curr->hardirq_disable_event misses the ftrace buffer and we end up in
the original interrupted kernel code...
--
Alexey
^ permalink raw reply
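The window described in the quoted commit message can be illustrated with a toy model in plain C. This is only a sketch under loud assumptions: `struct cpu_model`, `perf_interrupt()`, `return_to_irqs_enabled()` and `in_sync()` are hypothetical names invented for illustration, not kernel symbols, and the model covers only the ordering problem (tracing state updated first, a perf interrupt saving and restoring a disabled state in between, interrupts actually enabled last). It does not model the generic irq tracing issue mentioned later in the thread.

```c
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, not a kernel symbol. */
struct cpu_model {
    bool soft_masked;    /* stands in for the soft-mask (IRQMASK) state */
    bool traced_enabled; /* stands in for the irq tracing view of the state */
    bool perf_blocked;   /* true while perf interrupts are also held off */
};

/* A perf interrupt that saves the interrupt flags and restores them. */
static void perf_interrupt(struct cpu_model *c)
{
    bool saved_masked = c->soft_masked;

    if (c->perf_blocked)
        return; /* cannot fire while blocked */

    c->soft_masked = true;     /* handler body runs with irqs off */
    c->traced_enabled = false;

    /* Restore: the tracing state follows whatever state is restored. */
    c->soft_masked = saved_masked;
    c->traced_enabled = !saved_masked;
}

/* Return path that ends with interrupts enabled. */
static void return_to_irqs_enabled(struct cpu_model *c, bool block_perf,
                                   bool perf_fires)
{
    c->soft_masked = true;      /* return path runs soft-masked */
    c->perf_blocked = block_perf;

    c->traced_enabled = true;   /* step 1: tracing state adjusted first */
    if (perf_fires)
        perf_interrupt(c);      /* the window between the two updates */
    c->soft_masked = false;     /* step 2: interrupts actually enabled */
    c->perf_blocked = false;
}

/* Tracing state and real state agree. */
static bool in_sync(const struct cpu_model *c)
{
    return c->traced_enabled == !c->soft_masked;
}
```

In this model, a perf interrupt that lands in the window leaves the tracing state "off" while interrupts end up enabled. Setting `block_perf` closes the window, which mirrors the approach the commit message describes.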
* Re: [PATCH v2 10/16] powerpc/powernv/pci: Refactor pnv_ioda_alloc_pe()
From: Alexey Kardashevskiy @ 2020-07-24 5:20 UTC (permalink / raw)
To: Oliver O'Halloran, linuxppc-dev
In-Reply-To: <20200722065715.1432738-10-oohall@gmail.com>
On 22/07/2020 16:57, Oliver O'Halloran wrote:
> Rework the PE allocation logic to allow allocating blocks of PEs rather
> than individually. We'll use this to allocate contiguous blocks of PEs for
> the SR-IOVs.
>
> This patch also adds code to pnv_ioda_alloc_pe() and pnv_ioda_reserve_pe() to
> use the existing, but unused, phb->pe_alloc_mutex. Currently these functions
> use atomic bit ops to release a currently allocated PE number. However,
> pnv_ioda_alloc_pe() wants to have exclusive access to the bit map while
> scanning for a hole large enough to accommodate the allocation size.
>
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
> v2: Add some details about the pe_alloc mutex and why we're using it.
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> arch/powerpc/platforms/powernv/pci-ioda.c | 41 ++++++++++++++++++-----
> arch/powerpc/platforms/powernv/pci.h | 2 +-
> 2 files changed, 34 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 2d36a9ebf0e9..c9c25fb0783c 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -145,23 +145,45 @@ static void pnv_ioda_reserve_pe(struct pnv_phb *phb, int pe_no)
> return;
> }
>
> + mutex_lock(&phb->ioda.pe_alloc_mutex);
> if (test_and_set_bit(pe_no, phb->ioda.pe_alloc))
> pr_debug("%s: PE %x was reserved on PHB#%x\n",
> __func__, pe_no, phb->hose->global_number);
> + mutex_unlock(&phb->ioda.pe_alloc_mutex);
>
> pnv_ioda_init_pe(phb, pe_no);
> }
>
> -struct pnv_ioda_pe *pnv_ioda_alloc_pe(struct pnv_phb *phb)
> +struct pnv_ioda_pe *pnv_ioda_alloc_pe(struct pnv_phb *phb, int count)
> {
> - long pe;
> + struct pnv_ioda_pe *ret = NULL;
> + int run = 0, pe, i;
>
> + mutex_lock(&phb->ioda.pe_alloc_mutex);
> +
> + /* scan backwards for a run of @count cleared bits */
> for (pe = phb->ioda.total_pe_num - 1; pe >= 0; pe--) {
> - if (!test_and_set_bit(pe, phb->ioda.pe_alloc))
> - return pnv_ioda_init_pe(phb, pe);
> + if (test_bit(pe, phb->ioda.pe_alloc)) {
> + run = 0;
> + continue;
> + }
> +
> + run++;
> + if (run == count)
> + break;
> }
> + if (run != count)
> + goto out;
>
> - return NULL;
> + for (i = pe; i < pe + count; i++) {
> + set_bit(i, phb->ioda.pe_alloc);
> + pnv_ioda_init_pe(phb, i);
> + }
> + ret = &phb->ioda.pe_array[pe];
> +
> +out:
> + mutex_unlock(&phb->ioda.pe_alloc_mutex);
> + return ret;
> }
>
> void pnv_ioda_free_pe(struct pnv_ioda_pe *pe)
> @@ -173,7 +195,10 @@ void pnv_ioda_free_pe(struct pnv_ioda_pe *pe)
> WARN_ON(pe->npucomp); /* NPUs for nvlink are not supposed to be freed */
> kfree(pe->npucomp);
> memset(pe, 0, sizeof(struct pnv_ioda_pe));
> +
> + mutex_lock(&phb->ioda.pe_alloc_mutex);
> clear_bit(pe_num, phb->ioda.pe_alloc);
> + mutex_unlock(&phb->ioda.pe_alloc_mutex);
> }
>
> /* The default M64 BAR is shared by all PEs */
> @@ -976,7 +1001,7 @@ static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
> if (pdn->pe_number != IODA_INVALID_PE)
> return NULL;
>
> - pe = pnv_ioda_alloc_pe(phb);
> + pe = pnv_ioda_alloc_pe(phb, 1);
> if (!pe) {
> pr_warn("%s: Not enough PE# available, disabling device\n",
> pci_name(dev));
> @@ -1047,7 +1072,7 @@ static struct pnv_ioda_pe *pnv_ioda_setup_bus_PE(struct pci_bus *bus, bool all)
>
> /* The PE number isn't pinned by M64 */
> if (!pe)
> - pe = pnv_ioda_alloc_pe(phb);
> + pe = pnv_ioda_alloc_pe(phb, 1);
>
> if (!pe) {
> pr_warn("%s: Not enough PE# available for PCI bus %04x:%02x\n",
> @@ -3065,7 +3090,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
> pnv_ioda_reserve_pe(phb, phb->ioda.root_pe_idx);
> } else {
> /* otherwise just allocate one */
> - root_pe = pnv_ioda_alloc_pe(phb);
> + root_pe = pnv_ioda_alloc_pe(phb, 1);
> phb->ioda.root_pe_idx = root_pe->pe_number;
> }
>
> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
> index 23fc5e391c7f..06431a452130 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -224,7 +224,7 @@ int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe);
> void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe);
> void pnv_pci_ioda2_release_pe_dma(struct pnv_ioda_pe *pe);
>
> -struct pnv_ioda_pe *pnv_ioda_alloc_pe(struct pnv_phb *phb);
> +struct pnv_ioda_pe *pnv_ioda_alloc_pe(struct pnv_phb *phb, int count);
> void pnv_ioda_free_pe(struct pnv_ioda_pe *pe);
>
> #ifdef CONFIG_PCI_IOV
>
--
Alexey
^ permalink raw reply
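The backwards scan that the quoted patch adds to pnv_ioda_alloc_pe() can be tried out in user space with a minimal sketch like the one below. The assumptions are stated plainly: `TOTAL_PE`, `pe_alloc[]` (a plain bool array instead of the kernel bitmap) and `alloc_pe_block()` are hypothetical stand-ins, locking is left out entirely, and the guard on `count` is an addition for the sketch rather than something the patch does.

```c
#include <stdbool.h>

/* Hypothetical size; stands in for phb->ioda.total_pe_num. */
#define TOTAL_PE 16

/* One flag per PE number; the kernel code uses a bitmap instead. */
static bool pe_alloc[TOTAL_PE];

/*
 * Scan from the highest PE number downwards for `count` consecutive
 * free entries. On success mark the whole run as allocated and return
 * its lowest PE number; return -1 if no such run exists.
 */
static int alloc_pe_block(int count)
{
    int run = 0, pe, i;

    /* Guard added for the sketch; not part of the quoted patch. */
    if (count <= 0 || count > TOTAL_PE)
        return -1;

    for (pe = TOTAL_PE - 1; pe >= 0; pe--) {
        if (pe_alloc[pe]) {
            run = 0;       /* an allocated entry breaks the run */
            continue;
        }
        run++;
        if (run == count)
            break;         /* pe is now the lowest entry of the run */
    }
    if (run != count)
        return -1;

    for (i = pe; i < pe + count; i++)
        pe_alloc[i] = true;

    return pe;
}
```

Because the scan only tests entries and sets them in a second pass, two concurrent callers could pick the same run, which is why the patch wraps the whole operation in phb->ioda.pe_alloc_mutex.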