* Re: [PATCH v3] soc: fsl: enable acpi support
From: Christophe Leroy @ 2020-08-19 6:48 UTC
To: Ran Wang, Li Yang; +Cc: linuxppc-dev, Peng Ma, linux-kernel, linux-arm-kernel
In-Reply-To: <20200819040031.40204-1-ran.wang_1@nxp.com>
On 19/08/2020 at 06:00, Ran Wang wrote:
> From: Peng Ma <peng.ma@nxp.com>
>
> This patch enables ACPI support in RCPM driver.
Can you change the subject to "soc: fsl: enable acpi support in RCPM
driver" ?
>
> Signed-off-by: Peng Ma <peng.ma@nxp.com>
> Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
> ---
> Change in v3:
> - Add #ifdef CONFIG_ACPI for acpi_device_id
> - Rename rcpm_acpi_imx_ids to rcpm_acpi_ids
>
> Change in v2:
> - Update acpi_device_id to fix conflict with other driver
>
> drivers/soc/fsl/rcpm.c | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/soc/fsl/rcpm.c b/drivers/soc/fsl/rcpm.c
> index a093dbe..55d1d73 100644
> --- a/drivers/soc/fsl/rcpm.c
> +++ b/drivers/soc/fsl/rcpm.c
> @@ -2,7 +2,7 @@
> //
> // rcpm.c - Freescale QorIQ RCPM driver
> //
> -// Copyright 2019 NXP
> +// Copyright 2019-2020 NXP
> //
> // Author: Ran Wang <ran.wang_1@nxp.com>
>
> @@ -13,6 +13,7 @@
> #include <linux/slab.h>
> #include <linux/suspend.h>
> #include <linux/kernel.h>
> +#include <linux/acpi.h>
>
> #define RCPM_WAKEUP_CELL_MAX_SIZE 7
>
> @@ -125,6 +126,7 @@ static int rcpm_probe(struct platform_device *pdev)
>
> ret = device_property_read_u32(&pdev->dev,
> "#fsl,rcpm-wakeup-cells", &rcpm->wakeup_cells);
> +
This blank line addition is unrelated to the patch and shouldn't be there.
Christophe
> if (ret)
> return ret;
>
> @@ -139,10 +141,19 @@ static const struct of_device_id rcpm_of_match[] = {
> };
> MODULE_DEVICE_TABLE(of, rcpm_of_match);
>
> +#ifdef CONFIG_ACPI
> +static const struct acpi_device_id rcpm_acpi_ids[] = {
> + {"NXP0015",},
> + { }
> +};
> +MODULE_DEVICE_TABLE(acpi, rcpm_acpi_ids);
> +#endif
> +
> static struct platform_driver rcpm_driver = {
> .driver = {
> .name = "rcpm",
> .of_match_table = rcpm_of_match,
> + .acpi_match_table = ACPI_PTR(rcpm_acpi_ids),
> .pm = &rcpm_pm_ops,
> },
> .probe = rcpm_probe,
>
* RE: [PATCH v3] soc: fsl: enable acpi support
From: Ran Wang @ 2020-08-19 6:52 UTC
To: Christophe Leroy, Leo Li
Cc: linuxppc-dev@lists.ozlabs.org, Peng Ma,
linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
In-Reply-To: <02fa05ad-6d39-e16d-ad02-7472d28b6627@csgroup.eu>
Hi Christophe
On Wednesday, August 19, 2020 2:48 PM, Christophe Leroy wrote:
>
>
>
> On 19/08/2020 at 06:00, Ran Wang wrote:
> > From: Peng Ma <peng.ma@nxp.com>
> >
> > This patch enables ACPI support in RCPM driver.
>
> Can you change the subject to "soc: fsl: enable acpi support in RCPM driver" ?
Sure.
> >
> > Signed-off-by: Peng Ma <peng.ma@nxp.com>
> > Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
> > ---
> > Change in v3:
> > - Add #ifdef CONFIG_ACPI for acpi_device_id
> > - Rename rcpm_acpi_imx_ids to rcpm_acpi_ids
> >
> > Change in v2:
> > - Update acpi_device_id to fix conflict with other driver
> >
> > drivers/soc/fsl/rcpm.c | 13 ++++++++++++-
> > 1 file changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/soc/fsl/rcpm.c b/drivers/soc/fsl/rcpm.c
> > index a093dbe..55d1d73 100644
> > --- a/drivers/soc/fsl/rcpm.c
> > +++ b/drivers/soc/fsl/rcpm.c
> > @@ -2,7 +2,7 @@
> > //
> > // rcpm.c - Freescale QorIQ RCPM driver
> > //
> > -// Copyright 2019 NXP
> > +// Copyright 2019-2020 NXP
> > //
> > // Author: Ran Wang <ran.wang_1@nxp.com>
> >
> > @@ -13,6 +13,7 @@
> > #include <linux/slab.h>
> > #include <linux/suspend.h>
> > #include <linux/kernel.h>
> > +#include <linux/acpi.h>
> >
> > #define RCPM_WAKEUP_CELL_MAX_SIZE 7
> >
> > @@ -125,6 +126,7 @@ static int rcpm_probe(struct platform_device *pdev)
> >
> > ret = device_property_read_u32(&pdev->dev,
> > "#fsl,rcpm-wakeup-cells", &rcpm->wakeup_cells);
> > +
>
> This blank line addition is unrelated to the patch and shouldn't be there.
Got it, will remove this in v4, thanks.
Regards,
Ran
> Christophe
>
> > if (ret)
> > return ret;
> >
> > @@ -139,10 +141,19 @@ static const struct of_device_id rcpm_of_match[] = {
> > };
> > MODULE_DEVICE_TABLE(of, rcpm_of_match);
> >
> > +#ifdef CONFIG_ACPI
> > +static const struct acpi_device_id rcpm_acpi_ids[] = {
> > + {"NXP0015",},
> > + { }
> > +};
> > +MODULE_DEVICE_TABLE(acpi, rcpm_acpi_ids);
> > +#endif
> > +
> > static struct platform_driver rcpm_driver = {
> > .driver = {
> > .name = "rcpm",
> > .of_match_table = rcpm_of_match,
> > + .acpi_match_table = ACPI_PTR(rcpm_acpi_ids),
> > .pm = &rcpm_pm_ops,
> > },
> > .probe = rcpm_probe,
> >
* [powerpc:next-test] BUILD SUCCESS dc76919c80d7128cd46cfa0f1f356e4c12e50229
From: kernel test robot @ 2020-08-19 6:53 UTC
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test
branch HEAD: dc76919c80d7128cd46cfa0f1f356e4c12e50229 powerpc/pseries/eeh: Fix dumb linebreaks
elapsed time: 1040m
configs tested: 66
configs skipped: 1
The following configs have been built successfully.
More configs may be tested in the coming days.
arm defconfig
arm64 allyesconfig
arm64 defconfig
arm allyesconfig
arm allmodconfig
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k allmodconfig
m68k defconfig
m68k allyesconfig
nios2 defconfig
arc allyesconfig
nds32 allnoconfig
c6x allyesconfig
nds32 defconfig
nios2 allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
s390 allyesconfig
parisc allyesconfig
s390 defconfig
i386 allyesconfig
sparc allyesconfig
sparc defconfig
i386 defconfig
mips allyesconfig
mips allmodconfig
powerpc defconfig
powerpc allyesconfig
powerpc allmodconfig
powerpc allnoconfig
i386 randconfig-a005-20200818
i386 randconfig-a002-20200818
i386 randconfig-a001-20200818
i386 randconfig-a006-20200818
i386 randconfig-a003-20200818
i386 randconfig-a004-20200818
x86_64 randconfig-a013-20200818
x86_64 randconfig-a016-20200818
x86_64 randconfig-a012-20200818
x86_64 randconfig-a011-20200818
x86_64 randconfig-a014-20200818
x86_64 randconfig-a015-20200818
i386 randconfig-a016-20200818
i386 randconfig-a011-20200818
i386 randconfig-a015-20200818
i386 randconfig-a013-20200818
i386 randconfig-a012-20200818
i386 randconfig-a014-20200818
riscv allyesconfig
riscv allnoconfig
riscv defconfig
riscv allmodconfig
x86_64 rhel
x86_64 allyesconfig
x86_64 rhel-7.6-kselftests
x86_64 defconfig
x86_64 rhel-8.3
x86_64 kexec
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* [powerpc:merge] BUILD SUCCESS d4ecce4dcc8f8820286cf4e0859850c555e89854
From: kernel test robot @ 2020-08-19 6:53 UTC
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge
branch HEAD: d4ecce4dcc8f8820286cf4e0859850c555e89854 Automatic merge of 'master', 'next' and 'fixes' (2020-08-18 23:23)
elapsed time: 1041m
configs tested: 66
configs skipped: 1
The following configs have been built successfully.
More configs may be tested in the coming days.
arm defconfig
arm64 allyesconfig
arm64 defconfig
arm allyesconfig
arm allmodconfig
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k allmodconfig
m68k defconfig
m68k allyesconfig
nios2 defconfig
arc allyesconfig
nds32 allnoconfig
c6x allyesconfig
nds32 defconfig
nios2 allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
s390 allyesconfig
parisc allyesconfig
s390 defconfig
i386 allyesconfig
sparc allyesconfig
sparc defconfig
i386 defconfig
mips allyesconfig
mips allmodconfig
powerpc defconfig
powerpc allyesconfig
powerpc allmodconfig
powerpc allnoconfig
i386 randconfig-a005-20200818
i386 randconfig-a002-20200818
i386 randconfig-a001-20200818
i386 randconfig-a006-20200818
i386 randconfig-a003-20200818
i386 randconfig-a004-20200818
x86_64 randconfig-a013-20200818
x86_64 randconfig-a016-20200818
x86_64 randconfig-a012-20200818
x86_64 randconfig-a011-20200818
x86_64 randconfig-a014-20200818
x86_64 randconfig-a015-20200818
i386 randconfig-a016-20200818
i386 randconfig-a011-20200818
i386 randconfig-a015-20200818
i386 randconfig-a013-20200818
i386 randconfig-a012-20200818
i386 randconfig-a014-20200818
riscv allyesconfig
riscv allnoconfig
riscv defconfig
riscv allmodconfig
x86_64 rhel
x86_64 allyesconfig
x86_64 rhel-7.6-kselftests
x86_64 defconfig
x86_64 rhel-8.3
x86_64 kexec
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* [powerpc:fixes-test] BUILD SUCCESS 801980f6497946048709b9b09771a1729551d705
From: kernel test robot @ 2020-08-19 6:52 UTC
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git fixes-test
branch HEAD: 801980f6497946048709b9b09771a1729551d705 powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death
elapsed time: 1042m
configs tested: 72
configs skipped: 68
The following configs have been built successfully.
More configs may be tested in the coming days.
arm defconfig
arm64 allyesconfig
arm64 defconfig
arm allyesconfig
arm allmodconfig
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k allmodconfig
m68k defconfig
m68k allyesconfig
nios2 defconfig
arc allyesconfig
nds32 allnoconfig
c6x allyesconfig
nds32 defconfig
nios2 allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
s390 allyesconfig
parisc allyesconfig
s390 defconfig
i386 allyesconfig
sparc allyesconfig
sparc defconfig
i386 defconfig
mips allyesconfig
mips allmodconfig
powerpc allyesconfig
powerpc allmodconfig
powerpc allnoconfig
powerpc defconfig
i386 randconfig-a005-20200818
i386 randconfig-a002-20200818
i386 randconfig-a001-20200818
i386 randconfig-a006-20200818
i386 randconfig-a003-20200818
i386 randconfig-a004-20200818
x86_64 randconfig-a013-20200818
x86_64 randconfig-a016-20200818
x86_64 randconfig-a012-20200818
x86_64 randconfig-a011-20200818
x86_64 randconfig-a014-20200818
x86_64 randconfig-a015-20200818
i386 randconfig-a016-20200818
i386 randconfig-a011-20200818
i386 randconfig-a015-20200818
i386 randconfig-a013-20200818
i386 randconfig-a012-20200818
i386 randconfig-a014-20200818
x86_64 randconfig-a006-20200819
x86_64 randconfig-a001-20200819
x86_64 randconfig-a003-20200819
x86_64 randconfig-a005-20200819
x86_64 randconfig-a004-20200819
x86_64 randconfig-a002-20200819
riscv allyesconfig
riscv allnoconfig
riscv defconfig
riscv allmodconfig
x86_64 rhel
x86_64 allyesconfig
x86_64 rhel-7.6-kselftests
x86_64 defconfig
x86_64 rhel-8.3
x86_64 kexec
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* Re: [PATCH 14/16] debug_vm_pgtable/hugetlb: Disable hugetlb test on ppc64
From: Aneesh Kumar K.V @ 2020-08-19 6:54 UTC
To: Anshuman Khandual, linux-mm, akpm; +Cc: linuxppc-dev
In-Reply-To: <08c63a65-cd3f-73f3-1698-5e60f398fbad@arm.com>
Anshuman Khandual <anshuman.khandual@arm.com> writes:
> On 08/12/2020 07:22 PM, Aneesh Kumar K.V wrote:
>> On 8/12/20 7:04 PM, Anshuman Khandual wrote:
>>>
>>>
>>> On 08/12/2020 06:46 PM, Aneesh Kumar K.V wrote:
>>>> On 8/12/20 6:33 PM, Anshuman Khandual wrote:
>>>>>
>>>>>
>>>>> On 08/12/2020 12:03 PM, Aneesh Kumar K.V wrote:
>>>>>> The test seems to be missing quite a lot of details w.r.t. allocating
>>>>>> the correct pgtable_t page (huge_pte_alloc()), holding the right
>>>>>> lock (huge_pte_lock()) etc. The vma used is also not a hugetlb VMA.
>>>>>>
>>>>>> ppc64 does have runtime checks within CONFIG_DEBUG_VM for most of these.
>>>>>> Hence disable the test on ppc64.
>>>>>
>>>>> This test is free from any platform specific #ifdefs which should
>>>>> never be broken. If hugetlb_advanced_tests() does not work or is
>>>>> not detailed enough for ppc64, then it would be great if you could
>>>>> suggest some improvements so that it works for all enabled platforms.
>>>>>
>>>>>
>>>>
>>>> As mentioned, the test is broken. For hugetlb, the pgtable_t pages should be allocated by huge_pte_alloc(). We need to hold huge_pte_lock() before updating a hugetlb pte. That takes the hugepage size, which is mostly derived from the vma. Hence the vma needs to be a hugetlb vma. Some of the functions also depend on hstate. Also, we should use set_huge_pte_at() when setting up hugetlb pte entries. I was tempted to remove that test completely, marking it broken, but avoided that by marking it broken only on PPC64.
>>>
>>> The test is not broken; hugetlb helpers on multiple platforms don't complain about
>>> this at all. The tests here emulate 'enough' MM objects required for the helpers
>>> on enabled platforms to perform the primary task, i.e. the page table transformation
>>> they are expected to do. The test does not claim to emulate a perfect MM environment for
>>> a given subsystem's (like HugeTLB) arch helpers. Now in this case, the MM objects
>>> being emulated for the HugeTLB advanced tests do not seem to be sufficient for
>>> ppc64, but that can be improved. That does not mean the test is broken in its current
>>> form for other platforms.
>>>
>>
>> There is nothing ppc64 specific here. It is just that we have CONFIG_DEBUG_VM based checks for different possibly wrong usages of these functions. This was done because we have different page sizes, two different translations to support and we want to avoid any wrong usage. IMHO expecting hugetlb page table helpers to work with a non hugetlb VMA and without holding hugeTLB pte lock is a clear violation of hugetlb interface.
>
> Do you have a modified version of the test with HugeTLB marked VMA and with pte lock
> held, which works on ppc64 ?
Nope. That is one of the reasons I commented that out. We can sort that
out slowly.
-aneesh
* [PATCH v4] soc: fsl: enable acpi support in RCPM driver
From: Ran Wang @ 2020-08-19 6:52 UTC
To: Li Yang, Christophe Leroy
Cc: Peng Ma, Ran Wang, linuxppc-dev, linux-kernel, linux-arm-kernel
From: Peng Ma <peng.ma@nxp.com>
This patch enables ACPI support in RCPM driver.
Signed-off-by: Peng Ma <peng.ma@nxp.com>
Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
---
Change in v4:
- Make commit subject more accurate
- Remove unrelated new blank line
Change in v3:
- Add #ifdef CONFIG_ACPI for acpi_device_id
- Rename rcpm_acpi_imx_ids to rcpm_acpi_ids
Change in v2:
- Update acpi_device_id to fix conflict with other driver
drivers/soc/fsl/rcpm.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/soc/fsl/rcpm.c b/drivers/soc/fsl/rcpm.c
index a093dbe..b5aa6db 100644
--- a/drivers/soc/fsl/rcpm.c
+++ b/drivers/soc/fsl/rcpm.c
@@ -2,7 +2,7 @@
//
// rcpm.c - Freescale QorIQ RCPM driver
//
-// Copyright 2019 NXP
+// Copyright 2019-2020 NXP
//
// Author: Ran Wang <ran.wang_1@nxp.com>
@@ -13,6 +13,7 @@
#include <linux/slab.h>
#include <linux/suspend.h>
#include <linux/kernel.h>
+#include <linux/acpi.h>
#define RCPM_WAKEUP_CELL_MAX_SIZE 7
@@ -139,10 +140,19 @@ static const struct of_device_id rcpm_of_match[] = {
};
MODULE_DEVICE_TABLE(of, rcpm_of_match);
+#ifdef CONFIG_ACPI
+static const struct acpi_device_id rcpm_acpi_ids[] = {
+ {"NXP0015",},
+ { }
+};
+MODULE_DEVICE_TABLE(acpi, rcpm_acpi_ids);
+#endif
+
static struct platform_driver rcpm_driver = {
.driver = {
.name = "rcpm",
.of_match_table = rcpm_of_match,
+ .acpi_match_table = ACPI_PTR(rcpm_acpi_ids),
.pm = &rcpm_pm_ops,
},
.probe = rcpm_probe,
--
2.7.4
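
For readers following the `#ifdef CONFIG_ACPI` / `ACPI_PTR()` pairing discussed in v3, here is a minimal user-space sketch of the idea, not kernel code. All `demo_*` / `DEMO_*` names are hypothetical stand-ins, and the assumption is simply that the pointer macro evaluates to its argument when ACPI is configured and to NULL otherwise, so the table may be compiled out without touching the driver structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Comment this out to model a build without CONFIG_ACPI. */
#define CONFIG_ACPI 1

/* Hypothetical stand-in for the kernel's struct acpi_device_id. */
struct demo_acpi_id {
	const char *id;
};

#ifdef CONFIG_ACPI
#define DEMO_ACPI_PTR(ptr) (ptr)
static const struct demo_acpi_id demo_rcpm_acpi_ids[] = {
	{ "NXP0015" },
	{ NULL }	/* sentinel entry terminates the table */
};
#else
/* The argument is discarded, so the table need not exist at all. */
#define DEMO_ACPI_PTR(ptr) NULL
#endif

/* Returns the table a driver would register, or NULL without ACPI. */
static const struct demo_acpi_id *demo_match_table(void)
{
	return DEMO_ACPI_PTR(demo_rcpm_acpi_ids);
}

/* Walks the sentinel-terminated table looking for the given HID. */
static int demo_matches(const struct demo_acpi_id *table, const char *hid)
{
	assert(hid != NULL);
	if (!table)
		return 0;
	for (; table->id; table++)
		if (strcmp(table->id, hid) == 0)
			return 1;
	return 0;
}
```

With `CONFIG_ACPI` defined, `demo_matches(demo_match_table(), "NXP0015")` finds the entry; with it undefined, the table pointer is NULL and nothing matches, yet the file still compiles because the macro never names the missing array.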
* Re: remove the last set_fs() in common code, and remove it for x86 and powerpc
From: Christophe Leroy @ 2020-08-19 7:16 UTC
To: Christoph Hellwig
Cc: linux-arch, Kees Cook, x86, linux-kernel, Al Viro, linux-fsdevel,
linuxppc-dev
In-Reply-To: <e3781661-2e13-4f46-d892-181907a2e768@csgroup.eu>
On 18/08/2020 at 20:23, Christophe Leroy wrote:
>
>
> On 18/08/2020 at 20:05, Christoph Hellwig wrote:
>> On Tue, Aug 18, 2020 at 07:46:22PM +0200, Christophe Leroy wrote:
>>> I gave it a go on my powerpc mpc832x. I tested it on top of my newest
>>> series that reworks the 32 bits signal handlers (see
>>> https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=196278) with
>>>
>>> the microbenchmark test used in that series.
>>>
>>> With KUAP activated, on top of signal32 rework, performance is
>>> boosted as
>>> system time for the microbenchmark goes from 1.73s down to 1.56s,
>>> that is
>>> 10% quicker
>>>
>>> Surprisingly, with the kernel as is today without my signal's series,
>>> your
>>> series degrades performance slightly (from 2.55s to 2.64s ie 3.5%
>>> slower).
>>>
>>>
>>> I also observe, in both cases, a degradation on
>>>
>>> dd if=/dev/zero of=/dev/null count=1M
>>>
>>> Without your series, it runs in 5.29 seconds.
>>> With your series, it runs in 5.82 seconds, that is 10% more time.
>>
>> That's pretty strange, I wonder if some kernel text cache line
>> effects come into play here?
>>
>> The kernel access side is only used in slow path code, so it should
>> not make a difference, and the uaccess code is simplified and should be
>> (marginally) faster.
>>
>> Btw, was this with the __{get,put}_user_allowed cockup that you noticed
>> fixed?
>>
>
> Yes it is with the __get_user_size() replaced by __get_user_size_allowed().
I ran a test with only the first patch of your series: that's
definitely the culprit. With only that patch applied, the duration is
6.64 seconds, that's a 25% degradation.
A perf record provides the following without the patch:
41.91% dd [kernel.kallsyms] [k] __arch_clear_user
7.02% dd [kernel.kallsyms] [k] vfs_read
6.86% dd [kernel.kallsyms] [k] new_sync_read
6.68% dd [kernel.kallsyms] [k] iov_iter_zero
6.03% dd [kernel.kallsyms] [k] transfer_to_syscall
3.39% dd [kernel.kallsyms] [k] memset
3.07% dd [kernel.kallsyms] [k] __fsnotify_parent
2.68% dd [kernel.kallsyms] [k] ksys_read
2.09% dd [kernel.kallsyms] [k] read_iter_zero
2.01% dd [kernel.kallsyms] [k] __fget_light
1.84% dd [kernel.kallsyms] [k] __fdget_pos
1.35% dd [kernel.kallsyms] [k] rw_verify_area
1.32% dd libc-2.23.so [.] __GI___libc_write
1.21% dd [kernel.kallsyms] [k] vfs_write
...
0.03% dd [kernel.kallsyms] [k] write_null
And the following with the patch:
15.54% dd [kernel.kallsyms] [k] __arch_clear_user
9.17% dd [kernel.kallsyms] [k] vfs_read
6.54% dd [kernel.kallsyms] [k] new_sync_write
6.31% dd [kernel.kallsyms] [k] transfer_to_syscall
6.29% dd [kernel.kallsyms] [k] __fsnotify_parent
6.20% dd [kernel.kallsyms] [k] new_sync_read
5.47% dd [kernel.kallsyms] [k] memset
5.13% dd [kernel.kallsyms] [k] vfs_write
4.44% dd [kernel.kallsyms] [k] iov_iter_zero
2.95% dd [kernel.kallsyms] [k] write_iter_null
2.82% dd [kernel.kallsyms] [k] ksys_read
2.46% dd [kernel.kallsyms] [k] __fget_light
2.34% dd libc-2.23.so [.] __GI___libc_read
1.89% dd [kernel.kallsyms] [k] iov_iter_advance
1.76% dd [kernel.kallsyms] [k] __fdget_pos
1.65% dd [kernel.kallsyms] [k] rw_verify_area
1.63% dd [kernel.kallsyms] [k] read_iter_zero
1.60% dd [kernel.kallsyms] [k] iov_iter_init
1.22% dd [kernel.kallsyms] [k] ksys_write
1.14% dd libc-2.23.so [.] __GI___libc_write
Christophe
>
> Christophe
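
The percentages quoted in this thread can be checked with a few lines of arithmetic. The sketch below is only a convenience for readers; the helper names are made up here, and the timings are the ones reported above.

```c
#include <assert.h>
#include <stdio.h>

/* Relative change of 'after' versus 'before', in percent. */
static double pct_change(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}

/* Prints the four comparisons quoted in the thread. */
static void print_changes(void)
{
	printf("microbenchmark, signal32 rework: %+.1f%%\n",
	       pct_change(1.73, 1.56));
	printf("microbenchmark, current kernel:  %+.1f%%\n",
	       pct_change(2.55, 2.64));
	printf("dd, full series:                 %+.1f%%\n",
	       pct_change(5.29, 5.82));
	printf("dd, first patch only:            %+.1f%%\n",
	       pct_change(5.29, 6.64));
}
```

Calling `print_changes()` gives roughly -9.8%, +3.5%, +10.0% and +25.5%, which agrees with the rounded figures in the messages.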
* iter and normal ops on /dev/zero & co, was Re: remove the last set_fs() in common code, and remove it for x86 and powerpc
From: Christoph Hellwig @ 2020-08-19 7:22 UTC
To: Christophe Leroy, Al Viro
Cc: linux-arch, Kees Cook, x86, linux-kernel, linux-fsdevel,
linuxppc-dev, Christoph Hellwig
In-Reply-To: <f2e31c89-dd9e-f0f8-ef5c-e930d01a3b65@csgroup.eu>
On Wed, Aug 19, 2020 at 09:16:59AM +0200, Christophe Leroy wrote:
> I ran a test with only the first patch of your series: that's definitely
> the culprit. With only that patch applied, the duration is 6.64 seconds,
> that's a 25% degradation.
For the record: the first patch is:
mem: remove duplicate ops for /dev/zero and /dev/null
So these micro-optimizations matter at least for some popular
benchmarks. It would be easy to drop, but that means we either:
- can't support kernel_read/write on these files, which should not
matter
or
- have to drop the check for both ops being present
Al, what do you think?
* Re: [PATCH] powerpc/papr_scm: Limit the readability of 'perf_stats' sysfs attribute
From: Vaibhav Jain @ 2020-08-19 9:19 UTC
To: Michael Ellerman, Aneesh Kumar K.V, linuxppc-dev, linux-nvdimm
Cc: Oliver O'Halloran, Dan Williams, Ira Weiny, Santosh Sivaraj
In-Reply-To: <87imdm9frg.fsf@mpe.ellerman.id.au>
Thanks Aneesh and Mpe for reviewing this patch.
Michael Ellerman <mpe@ellerman.id.au> writes:
> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
[snip]
>>>
>>> + /* Allow access only to perfmon capable users */
>>> + if (!perfmon_capable())
>>> + return -EACCES;
>>> +
>>
>> An access check is usually done in open(). This is the read callback IIUC.
>
> Yes. Otherwise an unprivileged user can open the file, and then trick a
> suid program into reading from it.
Agreed, but since the 'open()' for this sysfs attribute is handled
by kernfs, AFAIK there is no direct way to enforce this policy there.
The only other way, it seems to me, is to convert the 'perf_stats' DEVICE_ATTR_RO
to DEVICE_ATTR_ADMIN_RO.
>
> cheers
--
Cheers
~ Vaibhav
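
To illustrate why switching macros moves the check to open() time: DEVICE_ATTR_RO creates the file with mode 0444 and DEVICE_ATTR_ADMIN_RO with mode 0400, so the ordinary file-permission check at open() already rejects unprivileged users. The following is a simplified user-space model of that check, not kernel code; the `demo_*` names are hypothetical, and group bits and capabilities are deliberately ignored.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>

/* File modes used by the two sysfs attribute macros. */
#define DEMO_MODE_RO       0444	/* DEVICE_ATTR_RO */
#define DEMO_MODE_ADMIN_RO 0400	/* DEVICE_ATTR_ADMIN_RO */

/*
 * Simplified open()-time read check: the owner (root for sysfs files)
 * is tested against the user bit, everyone else against the "other"
 * bit. Group membership and capabilities are ignored in this model.
 */
static int demo_may_open_for_read(mode_t mode, int is_owner)
{
	assert((mode & ~07777) == 0);
	return (mode & (is_owner ? S_IRUSR : S_IROTH)) != 0;
}
```

In this model an unprivileged caller can open a 0444 attribute but not a 0400 one, so a file descriptor never reaches a suid program in the first place.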
* [PATCH] powerpc/powernv/idle: add a basic stop 0-3 driver for POWER10
From: Nicholas Piggin @ 2020-08-19 9:47 UTC
To: linuxppc-dev
Cc: Gautham R . Shenoy, Michael Neuling, Ryan P Grimm,
Pratik Rajesh Sampat, Nicholas Piggin
This driver does not restore stop > 3 state, so it limits itself
to states which do not lose full state or TB.
The POWER10 SPRs are sufficiently different from P9 that it seems
easier to split out the P10 code. The POWER10 deep sleep code
(e.g., the BHRB restore) has been taken out, but it can be re-added
when stop > 3 support is added.
Cc: Ryan P Grimm <rgrimm@us.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Pratik Rajesh Sampat <psampat@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
re-sending with linuxppc-dev list actually cc'ed
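
As a rough illustration of the "limits itself to states which do not lose full state or TB" rule, here is a small user-space sketch of the filtering idea. It is not the driver code: the `demo_*` / `DEMO_*` names are hypothetical and the flag values are illustrative placeholders rather than the real OPAL definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values; the real ones come from the OPAL API header. */
#define DEMO_PM_TIMEBASE_STOP     0x00000002u
#define DEMO_PM_LOSE_FULL_CONTEXT 0x00004000u

/* Hypothetical, trimmed-down idle state descriptor. */
struct demo_idle_state {
	const char *name;
	unsigned int flags;
};

/* Non-zero if a shallow-only driver may use this state. */
static int demo_state_usable(const struct demo_idle_state *s)
{
	assert(s != NULL);
	return !(s->flags &
		 (DEMO_PM_TIMEBASE_STOP | DEMO_PM_LOSE_FULL_CONTEXT));
}

/* Counts the usable states in an array of n entries. */
static size_t demo_count_usable(const struct demo_idle_state *states,
				size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (demo_state_usable(&states[i]))
			count++;
	return count;
}
```

A state that stops the timebase or loses full context is skipped, which mirrors the early `continue` added to the state-scanning loop for POWER10 in this patch.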
arch/powerpc/include/asm/machdep.h | 2 -
arch/powerpc/include/asm/processor.h | 2 +-
arch/powerpc/include/asm/reg.h | 1 +
arch/powerpc/platforms/powernv/idle.c | 304 ++++++++++++++++++--------
drivers/cpuidle/cpuidle-powernv.c | 2 +-
5 files changed, 213 insertions(+), 98 deletions(-)
diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index a90b892f0bfe..5082cd496190 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -222,8 +222,6 @@ struct machdep_calls {
extern void e500_idle(void);
extern void power4_idle(void);
-extern void power7_idle(void);
-extern void power9_idle(void);
extern void ppc6xx_idle(void);
extern void book3e_idle(void);
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index ed0d633ab5aa..6865147209de 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -432,7 +432,7 @@ enum idle_boot_override {IDLE_NO_OVERRIDE = 0, IDLE_POWERSAVE_OFF};
extern int powersave_nap; /* set if nap mode can be used in idle loop */
extern void power7_idle_type(unsigned long type);
-extern void power9_idle_type(unsigned long stop_psscr_val,
+extern void arch300_idle_type(unsigned long stop_psscr_val,
unsigned long stop_psscr_mask);
extern void flush_instruction_cache(void);
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 88fb88491fe9..d3a0aed321d0 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1353,6 +1353,7 @@
#define PVR_POWER8NVL 0x004C
#define PVR_POWER8 0x004D
#define PVR_POWER9 0x004E
+#define PVR_POWER10 0x0080
#define PVR_BE 0x0070
#define PVR_PA6T 0x0090
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 77513a80cef9..1ed7c5286487 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -565,7 +565,7 @@ void power7_idle_type(unsigned long type)
irq_set_pending_from_srr1(srr1);
}
-void power7_idle(void)
+static void power7_idle(void)
{
if (!powersave_nap)
return;
@@ -659,20 +659,6 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
mmcr0 = mfspr(SPRN_MMCR0);
}
- if (cpu_has_feature(CPU_FTR_ARCH_31)) {
- /*
- * POWER10 uses MMCRA (BHRBRD) as BHRB disable bit.
- * If the user hasn't asked for the BHRB to be
- * written, the value of MMCRA[BHRBRD] is 1.
- * On wakeup from stop, MMCRA[BHRBD] will be 0,
- * since it is previleged resource and will be lost.
- * Thus, if we do not save and restore the MMCRA[BHRBD],
- * hardware will be needlessly writing to the BHRB
- * in problem mode.
- */
- mmcra = mfspr(SPRN_MMCRA);
- }
-
if ((psscr & PSSCR_RL_MASK) >= deep_spr_loss_state) {
sprs.lpcr = mfspr(SPRN_LPCR);
sprs.hfscr = mfspr(SPRN_HFSCR);
@@ -735,10 +721,6 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
mtspr(SPRN_MMCR0, mmcr0);
}
- /* Reload MMCRA to restore BHRB disable bit for POWER10 */
- if (cpu_has_feature(CPU_FTR_ARCH_31))
- mtspr(SPRN_MMCRA, mmcra);
-
/*
* DD2.2 and earlier need to set then clear bit 60 in MMCRA
* to ensure the PMU starts running.
@@ -823,73 +805,6 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
return srr1;
}
-#ifdef CONFIG_HOTPLUG_CPU
-static unsigned long power9_offline_stop(unsigned long psscr)
-{
- unsigned long srr1;
-
-#ifndef CONFIG_KVM_BOOK3S_HV_POSSIBLE
- __ppc64_runlatch_off();
- srr1 = power9_idle_stop(psscr, true);
- __ppc64_runlatch_on();
-#else
- /*
- * Tell KVM we're entering idle.
- * This does not have to be done in real mode because the P9 MMU
- * is independent per-thread. Some steppings share radix/hash mode
- * between threads, but in that case KVM has a barrier sync in real
- * mode before and after switching between radix and hash.
- *
- * kvm_start_guest must still be called in real mode though, hence
- * the false argument.
- */
- local_paca->kvm_hstate.hwthread_state = KVM_HWTHREAD_IN_IDLE;
-
- __ppc64_runlatch_off();
- srr1 = power9_idle_stop(psscr, false);
- __ppc64_runlatch_on();
-
- local_paca->kvm_hstate.hwthread_state = KVM_HWTHREAD_IN_KERNEL;
- /* Order setting hwthread_state vs. testing hwthread_req */
- smp_mb();
- if (local_paca->kvm_hstate.hwthread_req)
- srr1 = idle_kvm_start_guest(srr1);
- mtmsr(MSR_KERNEL);
-#endif
-
- return srr1;
-}
-#endif
-
-void power9_idle_type(unsigned long stop_psscr_val,
- unsigned long stop_psscr_mask)
-{
- unsigned long psscr;
- unsigned long srr1;
-
- if (!prep_irq_for_idle_irqsoff())
- return;
-
- psscr = mfspr(SPRN_PSSCR);
- psscr = (psscr & ~stop_psscr_mask) | stop_psscr_val;
-
- __ppc64_runlatch_off();
- srr1 = power9_idle_stop(psscr, true);
- __ppc64_runlatch_on();
-
- fini_irq_for_idle_irqsoff();
-
- irq_set_pending_from_srr1(srr1);
-}
-
-/*
- * Used for ppc_md.power_save which needs a function with no parameters
- */
-void power9_idle(void)
-{
- power9_idle_type(pnv_default_stop_val, pnv_default_stop_mask);
-}
-
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
/*
* This is used in working around bugs in thread reconfiguration
@@ -962,6 +877,198 @@ void pnv_power9_force_smt4_release(void)
EXPORT_SYMBOL_GPL(pnv_power9_force_smt4_release);
#endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
+struct p10_sprs {
+ /*
+ * SPRs that get lost in shallow states:
+ *
+ * P10 loses CR, LR, CTR, FPSCR, VSCR, XER, TAR, SPRG2, and HSPRG1
+ * isa300 idle routines restore CR, LR.
+ * CTR is volatile
+ * idle thread doesn't use FP or VEC
+ * kernel doesn't use TAR
+ * HSPRG1 is only live in HV interrupt entry
+ * SPRG2 is only live in KVM guests, KVM handles it.
+ */
+};
+
+static unsigned long power10_idle_stop(unsigned long psscr, bool mmu_on)
+{
+ int cpu = raw_smp_processor_id();
+ int first = cpu_first_thread_sibling(cpu);
+ unsigned long *state = &paca_ptrs[first]->idle_state;
+ unsigned long core_thread_mask = (1UL << threads_per_core) - 1;
+ unsigned long srr1;
+ unsigned long pls;
+// struct p10_sprs sprs = {}; /* avoid false used-uninitialised */
+ bool sprs_saved = false;
+
+ if (!(psscr & (PSSCR_EC|PSSCR_ESL))) {
+ /* EC=ESL=0 case */
+
+ BUG_ON(!mmu_on);
+
+ /*
+ * Wake synchronously. SRESET via xscom may still cause
+ * a 0x100 powersave wakeup with SRR1 reason!
+ */
+ srr1 = isa300_idle_stop_noloss(psscr); /* go idle */
+ if (likely(!srr1))
+ return 0;
+
+ /*
+ * Registers not saved, can't recover!
+ * This would be a hardware bug
+ */
+ BUG_ON((srr1 & SRR1_WAKESTATE) != SRR1_WS_NOLOSS);
+
+ goto out;
+ }
+
+ /* EC=ESL=1 case */
+ if ((psscr & PSSCR_RL_MASK) >= deep_spr_loss_state) {
+ /* XXX: save SPRs for deep state loss here. */
+
+ sprs_saved = true;
+
+ atomic_start_thread_idle();
+ }
+
+ srr1 = isa300_idle_stop_mayloss(psscr); /* go idle */
+
+ psscr = mfspr(SPRN_PSSCR);
+
+ WARN_ON_ONCE(!srr1);
+ WARN_ON_ONCE(mfmsr() & (MSR_IR|MSR_DR));
+
+ if (unlikely((srr1 & SRR1_WAKEMASK_P8) == SRR1_WAKEHMI))
+ hmi_exception_realmode(NULL);
+
+ /*
+ * On POWER10, SRR1 bits do not match exactly as expected.
+ * SRR1_WS_GPRLOSS (10b) can also result in SPR loss, so
+ * just always test PSSCR for SPR/TB state loss.
+ */
+ pls = (psscr & PSSCR_PLS) >> PSSCR_PLS_SHIFT;
+ if (likely(pls < deep_spr_loss_state)) {
+ if (sprs_saved)
+ atomic_stop_thread_idle();
+ goto out;
+ }
+
+ /* HV state loss */
+ BUG_ON(!sprs_saved);
+
+ atomic_lock_thread_idle();
+
+ if ((*state & core_thread_mask) != 0)
+ goto core_woken;
+
+ /* XXX: restore per-core SPRs here */
+
+ if (pls >= pnv_first_tb_loss_level) {
+ /* TB loss */
+ if (opal_resync_timebase() != OPAL_SUCCESS)
+ BUG();
+ }
+
+ /*
+ * isync after restoring shared SPRs and before unlocking. Unlock
+ * only contains hwsync which does not necessarily do the right
+ * thing for SPRs.
+ */
+ isync();
+
+core_woken:
+ atomic_unlock_and_stop_thread_idle();
+
+ /* XXX: restore per-thread SPRs here */
+
+ if (!radix_enabled())
+ __slb_restore_bolted_realmode();
+
+out:
+ if (mmu_on)
+ mtmsr(MSR_KERNEL);
+
+ return srr1;
+}
+
+#ifdef CONFIG_HOTPLUG_CPU
+static unsigned long arch300_offline_stop(unsigned long psscr)
+{
+ unsigned long srr1;
+
+#ifndef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+ __ppc64_runlatch_off();
+ if (cpu_has_feature(CPU_FTR_ARCH_31))
+ srr1 = power10_idle_stop(psscr, true);
+ else
+ srr1 = power9_idle_stop(psscr, true);
+ __ppc64_runlatch_on();
+#else
+ /*
+ * Tell KVM we're entering idle.
+ * This does not have to be done in real mode because the P9 MMU
+ * is independent per-thread. Some steppings share radix/hash mode
+ * between threads, but in that case KVM has a barrier sync in real
+ * mode before and after switching between radix and hash.
+ *
+ * kvm_start_guest must still be called in real mode though, hence
+ * the false argument.
+ */
+ local_paca->kvm_hstate.hwthread_state = KVM_HWTHREAD_IN_IDLE;
+
+ __ppc64_runlatch_off();
+ if (cpu_has_feature(CPU_FTR_ARCH_31))
+ srr1 = power10_idle_stop(psscr, false);
+ else
+ srr1 = power9_idle_stop(psscr, false);
+ __ppc64_runlatch_on();
+
+ local_paca->kvm_hstate.hwthread_state = KVM_HWTHREAD_IN_KERNEL;
+ /* Order setting hwthread_state vs. testing hwthread_req */
+ smp_mb();
+ if (local_paca->kvm_hstate.hwthread_req)
+ srr1 = idle_kvm_start_guest(srr1);
+ mtmsr(MSR_KERNEL);
+#endif
+
+ return srr1;
+}
+#endif
+
+void arch300_idle_type(unsigned long stop_psscr_val,
+ unsigned long stop_psscr_mask)
+{
+ unsigned long psscr;
+ unsigned long srr1;
+
+ if (!prep_irq_for_idle_irqsoff())
+ return;
+
+ psscr = mfspr(SPRN_PSSCR);
+ psscr = (psscr & ~stop_psscr_mask) | stop_psscr_val;
+
+ __ppc64_runlatch_off();
+ if (cpu_has_feature(CPU_FTR_ARCH_31))
+ srr1 = power10_idle_stop(psscr, true);
+ else
+ srr1 = power9_idle_stop(psscr, true);
+ __ppc64_runlatch_on();
+
+ fini_irq_for_idle_irqsoff();
+
+ irq_set_pending_from_srr1(srr1);
+}
+
+/*
+ * Used for ppc_md.power_save which needs a function with no parameters
+ */
+static void arch300_idle(void)
+{
+ arch300_idle_type(pnv_default_stop_val, pnv_default_stop_mask);
+}
+
#ifdef CONFIG_HOTPLUG_CPU
void pnv_program_cpu_hotplug_lpcr(unsigned int cpu, u64 lpcr_val)
@@ -995,7 +1102,7 @@ unsigned long pnv_cpu_offline(unsigned int cpu)
psscr = mfspr(SPRN_PSSCR);
psscr = (psscr & ~pnv_deepest_stop_psscr_mask) |
pnv_deepest_stop_psscr_val;
- srr1 = power9_offline_stop(psscr);
+ srr1 = arch300_offline_stop(psscr);
} else if (cpu_has_feature(CPU_FTR_ARCH_206) && power7_offline_type) {
srr1 = power7_offline();
} else {
@@ -1093,11 +1200,15 @@ int validate_psscr_val_mask(u64 *psscr_val, u64 *psscr_mask, u32 flags)
* @dt_idle_states: Number of idle state entries
* Returns 0 on success
*/
-static void __init pnv_power9_idle_init(void)
+static void __init pnv_arch300_idle_init(void)
{
u64 max_residency_ns = 0;
int i;
+ /* stop is not really architected, we only have p9,p10 drivers */
+ if (!pvr_version_is(PVR_POWER10) && !pvr_version_is(PVR_POWER9))
+ return;
+
/*
* pnv_deepest_stop_{val,mask} should be set to values corresponding to
* the deepest stop state.
@@ -1112,6 +1223,11 @@ static void __init pnv_power9_idle_init(void)
struct pnv_idle_states_t *state = &pnv_idle_states[i];
u64 psscr_rl = state->psscr_val & PSSCR_RL_MASK;
+ /* No deep loss driver implemented for POWER10 yet */
+ if (pvr_version_is(PVR_POWER10) &&
+ state->flags & (OPAL_PM_TIMEBASE_STOP|OPAL_PM_LOSE_FULL_CONTEXT))
+ continue;
+
if ((state->flags & OPAL_PM_TIMEBASE_STOP) &&
(pnv_first_tb_loss_level > psscr_rl))
pnv_first_tb_loss_level = psscr_rl;
@@ -1162,7 +1278,7 @@ static void __init pnv_power9_idle_init(void)
if (unlikely(!default_stop_found)) {
pr_warn("cpuidle-powernv: No suitable default stop state found. Disabling platform idle.\n");
} else {
- ppc_md.power_save = power9_idle;
+ ppc_md.power_save = arch300_idle;
pr_info("cpuidle-powernv: Default stop: psscr = 0x%016llx,mask=0x%016llx\n",
pnv_default_stop_val, pnv_default_stop_mask);
}
@@ -1223,8 +1339,8 @@ static void __init pnv_probe_idle_states(void)
return;
}
- if (pvr_version_is(PVR_POWER9))
- pnv_power9_idle_init();
+ if (cpu_has_feature(CPU_FTR_ARCH_300))
+ pnv_arch300_idle_init();
for (i = 0; i < nr_pnv_idle_states; i++)
supported_cpuidle_states |= pnv_idle_states[i].flags;
@@ -1295,7 +1411,7 @@ static int pnv_parse_cpuidle_dt(void)
for (i = 0; i < nr_idle_states; i++)
pnv_idle_states[i].residency_ns = temp_u32[i];
- /* For power9 */
+ /* For power9 and later */
if (cpu_has_feature(CPU_FTR_ARCH_300)) {
/* Read pm_crtl_val */
if (of_property_read_u64_array(np, "ibm,cpu-idle-state-psscr",
@@ -1358,8 +1474,8 @@ static int __init pnv_init_idle_states(void)
if (!cpu_has_feature(CPU_FTR_ARCH_300)) {
/* P7/P8 nap */
p->thread_idle_state = PNV_THREAD_RUNNING;
- } else {
- /* P9 stop */
+ } else if (pvr_version_is(PVR_POWER9)) {
+ /* P9 stop workarounds */
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
p->requested_psscr = 0;
atomic_set(&p->dont_stop, 0);
diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
index addaa6e6718b..c32c600b3cf8 100644
--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -141,7 +141,7 @@ static int stop_loop(struct cpuidle_device *dev,
struct cpuidle_driver *drv,
int index)
{
- power9_idle_type(stop_psscr_table[index].val,
+ arch300_idle_type(stop_psscr_table[index].val,
stop_psscr_table[index].mask);
return index;
}
--
2.23.0
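The PSSCR update in arch300_idle_type() above is a masked field replacement: bits selected by the mask are taken from the requested value and all other bits keep their current contents. A minimal userspace sketch of that arithmetic follows; the helper name merge_masked() is hypothetical and is not part of the patch, and it assumes (as the kernel's validation of the value/mask pair is meant to ensure) that the value has no bits set outside the mask.

```c
#include <stdint.h>

/*
 * Replace only the bits selected by mask with the bits from val and
 * leave every other bit of cur untouched. This mirrors the expression
 *     psscr = (psscr & ~stop_psscr_mask) | stop_psscr_val;
 * from the patch. merge_masked() is an illustrative name only.
 * Assumption: val has no bits set outside mask.
 */
static uint64_t merge_masked(uint64_t cur, uint64_t val, uint64_t mask)
{
	return (cur & ~mask) | val;
}
```

For example, merging the value 0x05 under mask 0xFF into 0xFFFF00F0 keeps the upper bits and replaces only the low byte.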
^ permalink raw reply related
* Re: [Virtual ppce500] virtio_gpu virtio0: swiotlb buffer is full
From: Christian Zigotzky @ 2020-08-19 10:22 UTC (permalink / raw)
To: Gerd Hoffmann
Cc: Darren Stevens, R.T.Dickinson, daniel.vetter, Michel Dänzer,
kvm-ppc@vger.kernel.org, gurchetansingh,
Maling list - DRI developers, mad skateman, linuxppc-dev
In-Reply-To: <20200819043515.saq6ey33q7p2uccz@sirius.home.kraxel.org>
On 19 August 2020 at 06:35 am, Gerd Hoffmann wrote:
> On Tue, Aug 18, 2020 at 04:41:38PM +0200, Christian Zigotzky wrote:
>> Hello Gerd,
>>
>> I compiled a new kernel with the latest DRM misc updates today. The patch is
>> included in these updates.
>>
>> This kernel works with the VirtIO-GPU in a virtual e5500 QEMU/KVM HV machine
>> on my X5000.
>>
>> Unfortunately I can only use the VirtIO-GPU (Monitor: Red Hat, Inc. 8") with
>> a resolution of 640x480. If I set a higher resolution then the guest
>> disables the monitor.
>> I can use higher resolutions with the stable kernel 5.8 and the VirtIO-GPU.
>>
>> Please check the latest DRM updates.
> https://patchwork.freedesktop.org/patch/385980/
>
> (tests & reviews & acks are welcome)
>
> HTH,
> Gerd
>
Hello Gerd,
I compiled a new RC1 with our patches today. With these patches, the
VirtIO-GPU works without any problems. I can use higher resolutions again.
Screenshot of the RC1-3 with the VirtIO-GPU in a virtual e5500 QEMU/KVM
HV machine on my X5000:
https://i.pinimg.com/originals/4f/b0/14/4fb01476edd7abe6be1e1203a8e7e152.png
Thanks a lot for your help!
Cheers,
Christian
^ permalink raw reply
* Re: [PATCH 1/2] lockdep: improve current->(hard|soft)irqs_enabled synchronisation with actual irq state
From: Alexey Kardashevskiy @ 2020-08-19 10:39 UTC (permalink / raw)
To: Nicholas Piggin, peterz
Cc: linux-arch, Will Deacon, Ingo Molnar, linuxppc-dev, linux-kernel
In-Reply-To: <1597793862.l8c4pmmzpq.astroid@bobo.none>
On 19/08/2020 09:54, Nicholas Piggin wrote:
> Excerpts from peterz@infradead.org's message of August 19, 2020 1:41 am:
>> On Tue, Aug 18, 2020 at 05:22:33PM +1000, Nicholas Piggin wrote:
>>> Excerpts from peterz@infradead.org's message of August 12, 2020 8:35 pm:
>>>> On Wed, Aug 12, 2020 at 06:18:28PM +1000, Nicholas Piggin wrote:
>>>>> Excerpts from peterz@infradead.org's message of August 7, 2020 9:11 pm:
>>>>>>
>>>>>> What's wrong with something like this?
>>>>>>
>>>>>> AFAICT there's no reason to actually try and add IRQ tracing here, it's
>>>>>> just a hand full of instructions at the most.
>>>>>
>>>>> Because we may want to use that in other places as well, so it would
>>>>> be nice to have tracing.
>>>>>
>>>>> Hmm... also, I thought NMI context was free to call local_irq_save/restore
>>>>> anyway so the bug would still be there in those cases?
>>>>
>>>> NMI code has in_nmi() true, in which case the IRQ tracing is disabled
>>>> (except for x86 which has CONFIG_TRACE_IRQFLAGS_NMI).
>>>>
>>>
>>> That doesn't help. It doesn't fix the lockdep irq state going out of
>>> synch with the actual irq state. The code which triggered this with the
>>> special powerpc irq disable has in_nmi() true as well.
>>
>> Urgh, you're talking about using lockdep_assert_irqs*() from NMI
>> context?
>>
>> If not, I'm afraid I might've lost the plot a little on what exact
>> failure case we're talking about.
>>
>
> Hm, I may have been a bit confused actually. Since your Fix
> TRACE_IRQFLAGS vs NMIs patch it might now work.
>
> I'm worried powerpc disables trace irqs trace_hardirqs_off()
> before nmi_enter() might still be a problem, but not sure
> actually. Alexey did you end up re-testing with Peter's patch
The one above in the thread which replaces powerpc_local_irq_pmu_save()
with raw_powerpc_local_irq_pmu_save()? It did not compile, as there is no
raw_powerpc_local_irq_pmu_save(), so I may be missing something here.
I applied the patch on top of the current upstream and replaced
raw_powerpc_local_irq_pmu_save() with raw_local_irq_pmu_save() (which I
think was the intention), but I still see the issue.
> or current upstream?
The upstream 18445bf405cb (13 hours old) also shows the problem. Your
patch 1/2 still fixes it.
>
> Thanks,
> Nick
>
--
Alexey
^ permalink raw reply
* Re: [PATCH 00/10] sound: convert tasklets to use new tasklet_setup()
From: Allen @ 2020-08-19 10:51 UTC (permalink / raw)
To: Mark Brown
Cc: alsa-devel, Kees Cook, timur, Xiubo.Lee, Takashi Iwai,
linux-kernel, clemens, tiwai, o-takashi, nicoleotsuka, Allen Pais,
perex, linuxppc-dev
In-Reply-To: <20200818104432.GB5337@sirena.org.uk>
>
> > Mark, may I apply those ASoC patches through my tree together with
> > others? Those seem targeting to 5.9, and I have a patch set to
> > convert to tasklet for 5.10, which would be better manageable when
> > based on top of those changes.
>
> These patches which I wasn't CCed on and which need their subject lines
> fixing :( . With the subject lines fixed I guess so so
Extremely sorry. I thought I had it covered. How would you like it
worded?
> Acked-by: Mark Brown <broonie@kernel.org>
>
> but judging from some of the other threads about similar patches that I
> was randomly CCed on I'm not sure people like from_tasklet() so perhaps
> there might be issues.
Yes, a new macro named cast_out() has been suggested in place of
from_tasklet(). Hopefully it will go in soon. I will spin a V2 with that
change and also reword the subject line.
> Allen, as documented in submitting-patches.rst please send patches to
> the maintainers for the code you would like to change. The normal
> kernel workflow is that people apply patches from their inboxes, if they
> aren't copied they are likely to not see the patch at all and it is much
> more difficult to apply patches.
I understand, I'll take care of it in the future. Thank you.
--
- Allen
^ permalink raw reply
* Re: [PATCH 00/10] sound: convert tasklets to use new tasklet_setup()
From: Mark Brown @ 2020-08-19 11:16 UTC (permalink / raw)
To: Allen
Cc: alsa-devel, Kees Cook, timur, Xiubo.Lee, Takashi Iwai,
linux-kernel, clemens, tiwai, o-takashi, nicoleotsuka, Allen Pais,
perex, linuxppc-dev
In-Reply-To: <CAOMdWSK79WWsmsxJH9zUMZMfkBNRWXbmEHg-haxNZopHjC1cGw@mail.gmail.com>
On Wed, Aug 19, 2020 at 04:21:58PM +0530, Allen wrote:
> > These patches which I wasn't CCed on and which need their subject lines
> > fixing :( . With the subject lines fixed I guess so so
> Extremely sorry. I thought I had it covered. How would you like it
> worded?
ASoC:
In general you should try to follow the style for the code you're
modifying, this applies to things like commit logs as well as the code
itself.
^ permalink raw reply
* Re: [PATCH 00/10] sound: convert tasklets to use new tasklet_setup()
From: Takashi Iwai @ 2020-08-19 11:28 UTC (permalink / raw)
To: Allen
Cc: alsa-devel, Kees Cook, timur, Xiubo.Lee, linux-kernel, clemens,
tiwai, o-takashi, nicoleotsuka, Allen Pais, Mark Brown, perex,
linuxppc-dev
In-Reply-To: <20200819111605.GC5441@sirena.org.uk>
On Wed, 19 Aug 2020 13:16:05 +0200,
Mark Brown wrote:
>
> On Wed, Aug 19, 2020 at 04:21:58PM +0530, Allen wrote:
>
> > > These patches which I wasn't CCed on and which need their subject lines
> > > fixing :( . With the subject lines fixed I guess so so
>
> > Extremely sorry. I thought I had it covered. How would you like it
> > worded?
>
> ASoC:
To be more exact, "ASoC:" prefix is for sound/soc/*, and for the rest
sound/*, use "ALSA:" prefix please.
Takashi
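The prefix rule described above can be sketched as a small lookup on the path being changed. This is only an illustration of the convention stated in this thread; the function name subject_prefix() is hypothetical, and paths outside sound/ are simply reported as having no prefix covered by this rule.

```c
#include <string.h>

/*
 * Pick the subject prefix for a patch from the path it touches,
 * following the rule stated in the thread:
 *   sound/soc/...        -> "ASoC:"
 *   anything else sound/ -> "ALSA:"
 * Returns NULL for paths this rule does not cover.
 * subject_prefix() is an illustrative name, not an existing tool.
 */
static const char *subject_prefix(const char *path)
{
	/* Check the more specific directory first. */
	if (strncmp(path, "sound/soc/", strlen("sound/soc/")) == 0)
		return "ASoC:";
	if (strncmp(path, "sound/", strlen("sound/")) == 0)
		return "ALSA:";
	return NULL;
}
```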
^ permalink raw reply
* [PATCH v2 01/13] powerpc/mm: Add DEBUG_VM WARN for pmd_clear
From: Aneesh Kumar K.V @ 2020-08-19 13:00 UTC (permalink / raw)
To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual
In-Reply-To: <20200819130107.478414-1-aneesh.kumar@linux.ibm.com>
With the hash page table, the kernel should not use pmd_clear for clearing
huge pte entries. Add a DEBUG_VM WARN to catch the wrong usage.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/powerpc/include/asm/book3s/64/pgtable.h | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 6de56c3b33c4..079211968987 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -868,6 +868,13 @@ static inline bool pte_ci(pte_t pte)
static inline void pmd_clear(pmd_t *pmdp)
{
+ if (IS_ENABLED(CONFIG_DEBUG_VM) && !radix_enabled()) {
+ /*
+ * Don't use this if we can possibly have a hash page table
+ * entry mapping this.
+ */
+ WARN_ON((pmd_val(*pmdp) & (H_PAGE_HASHPTE | _PAGE_PTE)) == (H_PAGE_HASHPTE | _PAGE_PTE));
+ }
*pmdp = __pmd(0);
}
@@ -916,6 +923,13 @@ static inline int pmd_bad(pmd_t pmd)
static inline void pud_clear(pud_t *pudp)
{
+ if (IS_ENABLED(CONFIG_DEBUG_VM) && !radix_enabled()) {
+ /*
+ * Don't use this if we can possibly have a hash page table
+ * entry mapping this.
+ */
+ WARN_ON((pud_val(*pudp) & (H_PAGE_HASHPTE | _PAGE_PTE)) == (H_PAGE_HASHPTE | _PAGE_PTE));
+ }
*pudp = __pud(0);
}
--
2.26.2
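The WARN_ON() condition in the patch above fires only when both flag bits are set at once, which is why it compares the masked value against the full mask instead of testing for non-zero. A minimal userspace sketch of that check follows. The EX_ constants are placeholders: bit 62 for the pte marker follows the description elsewhere in this series, while the hash bit position is chosen only for illustration and is not taken from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag bits for illustration only. */
#define EX_PAGE_PTE      (UINT64_C(1) << 62)  /* marks a leaf (pte) entry */
#define EX_PAGE_HASHPTE  (UINT64_C(1) << 22)  /* assumed position, not the kernel's */

/*
 * True only if the entry is a leaf entry AND has a hash page table
 * entry behind it, i.e. both bits are set. A plain
 * (entry & both) != 0 test would also fire when just one is set.
 */
static bool is_hashed_leaf(uint64_t entry)
{
	const uint64_t both = EX_PAGE_HASHPTE | EX_PAGE_PTE;

	return (entry & both) == both;
}
```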
^ permalink raw reply related
* [PATCH v2 00/13] mm/debug_vm_pgtable fixes
From: Aneesh Kumar K.V @ 2020-08-19 13:00 UTC (permalink / raw)
To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual
This patch series includes fixes for the debug_vm_pgtable test code so that
it follows the page table update rules correctly. The first two patches introduce
changes w.r.t. ppc64. They are included in this series for completeness. We can
merge them via the ppc64 tree if required.
The hugetlb test is disabled on ppc64 because it needs a larger change to satisfy
the page table update rules.
Changes from V1:
* Address review feedback
* drop test specific pfn_pte and pfn_pmd.
* Update ppc64 page table helper to add _PAGE_PTE
Aneesh Kumar K.V (13):
powerpc/mm: Add DEBUG_VM WARN for pmd_clear
powerpc/mm: Move setting pte specific flags to pfn_pte
mm/debug_vm_pgtable/ppc64: Avoid setting top bits in radom value
mm/debug_vm_pgtables/hugevmap: Use the arch helper to identify huge
vmap support.
mm/debug_vm_pgtable/savedwrite: Enable savedwrite test with
CONFIG_NUMA_BALANCING
mm/debug_vm_pgtable/THP: Mark the pte entry huge before using
set_pmd/pud_at
mm/debug_vm_pgtable/set_pte/pmd/pud: Don't use set_*_at to update an
existing pte entry
mm/debug_vm_pgtable/thp: Use page table depost/withdraw with THP
mm/debug_vm_pgtable/locks: Move non page table modifying test together
mm/debug_vm_pgtable/locks: Take correct page table lock
mm/debug_vm_pgtable/pmd_clear: Don't use pmd/pud_clear on pte entries
mm/debug_vm_pgtable/hugetlb: Disable hugetlb test on ppc64
mm/debug_vm_pgtable: populate a pte entry before fetching it
arch/powerpc/include/asm/book3s/64/pgtable.h | 29 +++-
arch/powerpc/include/asm/nohash/pgtable.h | 5 -
arch/powerpc/mm/book3s64/pgtable.c | 2 +-
arch/powerpc/mm/pgtable.c | 5 -
include/linux/io.h | 12 ++
mm/debug_vm_pgtable.c | 151 +++++++++++--------
6 files changed, 127 insertions(+), 77 deletions(-)
--
2.26.2
^ permalink raw reply
* [PATCH v2 08/13] mm/debug_vm_pgtable/thp: Use page table depost/withdraw with THP
From: Aneesh Kumar K.V @ 2020-08-19 13:01 UTC (permalink / raw)
To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual
In-Reply-To: <20200819130107.478414-1-aneesh.kumar@linux.ibm.com>
Architectures like ppc64 use a deposited page table while updating huge pte
entries.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
mm/debug_vm_pgtable.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 9c7e2c9cfc76..6dcac2b40fef 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -149,7 +149,7 @@ static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
static void __init pmd_advanced_tests(struct mm_struct *mm,
struct vm_area_struct *vma, pmd_t *pmdp,
unsigned long pfn, unsigned long vaddr,
- pgprot_t prot)
+ pgprot_t prot, pgtable_t pgtable)
{
pmd_t pmd;
@@ -160,6 +160,8 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
/* Align the address wrt HPAGE_PMD_SIZE */
vaddr = (vaddr & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE;
+ pgtable_trans_huge_deposit(mm, pmdp, pgtable);
+
pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
set_pmd_at(mm, vaddr, pmdp, pmd);
pmdp_set_wrprotect(mm, vaddr, pmdp);
@@ -188,6 +190,8 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
pmdp_test_and_clear_young(vma, vaddr, pmdp);
pmd = READ_ONCE(*pmdp);
WARN_ON(pmd_young(pmd));
+
+ pgtable = pgtable_trans_huge_withdraw(mm, pmdp);
}
static void __init pmd_leaf_tests(unsigned long pfn, pgprot_t prot)
@@ -1000,7 +1004,7 @@ static int __init debug_vm_pgtable(void)
pgd_clear_tests(mm, pgdp);
pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
- pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot);
+ pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot, saved_ptep);
pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
--
2.26.2
^ permalink raw reply related
* [PATCH v2 02/13] powerpc/mm: Move setting pte specific flags to pfn_pte
From: Aneesh Kumar K.V @ 2020-08-19 13:00 UTC (permalink / raw)
To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual
In-Reply-To: <20200819130107.478414-1-aneesh.kumar@linux.ibm.com>
powerpc used to set the pte-specific flags in set_pte_at(). This is different
from other architectures. To be consistent with other architectures, update
pfn_pte and pfn_pmd to set _PAGE_PTE on ppc64. Also, drop the now unused pte_mkpte.
We add a VM_WARN_ON() to catch callers of set_pte_at() that have not set the
_PAGE_PTE bit. We will remove that after a few releases.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/powerpc/include/asm/book3s/64/pgtable.h | 15 +++++++++------
arch/powerpc/include/asm/nohash/pgtable.h | 5 -----
arch/powerpc/mm/book3s64/pgtable.c | 2 +-
arch/powerpc/mm/pgtable.c | 5 -----
4 files changed, 10 insertions(+), 17 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 079211968987..2382fd516f6b 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -619,7 +619,7 @@ static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot)
VM_BUG_ON(pfn >> (64 - PAGE_SHIFT));
VM_BUG_ON((pfn << PAGE_SHIFT) & ~PTE_RPN_MASK);
- return __pte(((pte_basic_t)pfn << PAGE_SHIFT) | pgprot_val(pgprot));
+ return __pte(((pte_basic_t)pfn << PAGE_SHIFT) | pgprot_val(pgprot) | _PAGE_PTE);
}
static inline unsigned long pte_pfn(pte_t pte)
@@ -655,11 +655,6 @@ static inline pte_t pte_mkexec(pte_t pte)
return __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_EXEC));
}
-static inline pte_t pte_mkpte(pte_t pte)
-{
- return __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_PTE));
-}
-
static inline pte_t pte_mkwrite(pte_t pte)
{
/*
@@ -823,6 +818,14 @@ static inline int pte_none(pte_t pte)
static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte, int percpu)
{
+
+ VM_WARN_ON(!(pte_raw(pte) & cpu_to_be64(_PAGE_PTE)));
+ /*
+ * Keep the _PAGE_PTE added till we are sure we handle _PAGE_PTE
+ * in all the callers.
+ */
+ pte = __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_PTE));
+
if (radix_enabled())
return radix__set_pte_at(mm, addr, ptep, pte, percpu);
return hash__set_pte_at(mm, addr, ptep, pte, percpu);
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index 4b7c3472eab1..6277e7596ae5 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -140,11 +140,6 @@ static inline pte_t pte_mkold(pte_t pte)
return __pte(pte_val(pte) & ~_PAGE_ACCESSED);
}
-static inline pte_t pte_mkpte(pte_t pte)
-{
- return pte;
-}
-
static inline pte_t pte_mkspecial(pte_t pte)
{
return __pte(pte_val(pte) | _PAGE_SPECIAL);
diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
index e18ae50a275c..3b4da7c63e28 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -141,7 +141,7 @@ pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot)
unsigned long pmdv;
pmdv = (pfn << PAGE_SHIFT) & PTE_RPN_MASK;
- return pmd_set_protbits(__pmd(pmdv), pgprot);
+ return __pmd(pmdv | pgprot_val(pgprot) | _PAGE_PTE);
}
pmd_t mk_pmd(struct page *page, pgprot_t pgprot)
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 9c0547d77af3..ab57b07ef39a 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -184,9 +184,6 @@ void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
*/
VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep));
- /* Add the pte bit when trying to set a pte */
- pte = pte_mkpte(pte);
-
/* Note: mm->context.id might not yet have been assigned as
* this context might not have been activated yet when this
* is called.
@@ -275,8 +272,6 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_
*/
VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep));
- pte = pte_mkpte(pte);
-
pte = set_pte_filter(pte);
val = pte_val(pte);
--
2.26.2
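The effect of the pfn_pte() change above is that the leaf marker is already part of the value by the time it reaches set_pte_at(). A minimal userspace sketch of that composition follows. The EX_ names are hypothetical, the 64K page shift is an assumption made for the example, and bit 62 for the marker follows the description in patch 03 of this series; the kernel's range checks (VM_BUG_ON) are omitted.

```c
#include <stdint.h>

#define EX_PAGE_SHIFT  16                    /* assumed 64K pages */
#define EX_PAGE_PTE    (UINT64_C(1) << 62)   /* leaf marker, bit 62 */

/*
 * Build a pte value from a page frame number and protection bits,
 * OR-ing in the leaf marker at construction time so callers do not
 * depend on set_pte_at() adding it later.
 */
static uint64_t ex_pfn_pte(uint64_t pfn, uint64_t prot)
{
	return (pfn << EX_PAGE_SHIFT) | prot | EX_PAGE_PTE;
}
```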
^ permalink raw reply related
* [PATCH v2 10/13] mm/debug_vm_pgtable/locks: Take correct page table lock
From: Aneesh Kumar K.V @ 2020-08-19 13:01 UTC (permalink / raw)
To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual
In-Reply-To: <20200819130107.478414-1-aneesh.kumar@linux.ibm.com>
Make sure we call pte accessors with the correct lock held.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
mm/debug_vm_pgtable.c | 34 ++++++++++++++++++++--------------
1 file changed, 20 insertions(+), 14 deletions(-)
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 69fe3cd8126c..8f7a8ccb5a54 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -1024,33 +1024,39 @@ static int __init debug_vm_pgtable(void)
pmd_thp_tests(pmd_aligned, prot);
pud_thp_tests(pud_aligned, prot);
+ hugetlb_basic_tests(pte_aligned, prot);
+
/*
* Page table modifying tests
*/
- pte_clear_tests(mm, ptep, vaddr);
- pmd_clear_tests(mm, pmdp);
- pud_clear_tests(mm, pudp);
- p4d_clear_tests(mm, p4dp);
- pgd_clear_tests(mm, pgdp);
ptep = pte_alloc_map_lock(mm, pmdp, vaddr, &ptl);
+ pte_clear_tests(mm, ptep, vaddr);
pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
- pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot, saved_ptep);
- pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
- hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
-
+ pte_unmap_unlock(ptep, ptl);
+ ptl = pmd_lock(mm, pmdp);
+ pmd_clear_tests(mm, pmdp);
+ pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot, saved_ptep);
pmd_huge_tests(pmdp, pmd_aligned, prot);
+ pmd_populate_tests(mm, pmdp, saved_ptep);
+ spin_unlock(ptl);
+
+ ptl = pud_lock(mm, pudp);
+ pud_clear_tests(mm, pudp);
+ pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
pud_huge_tests(pudp, pud_aligned, prot);
+ pud_populate_tests(mm, pudp, saved_pmdp);
+ spin_unlock(ptl);
- pte_unmap_unlock(ptep, ptl);
+ //hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
- pmd_populate_tests(mm, pmdp, saved_ptep);
- pud_populate_tests(mm, pudp, saved_pmdp);
+ spin_lock(&mm->page_table_lock);
+ p4d_clear_tests(mm, p4dp);
+ pgd_clear_tests(mm, pgdp);
p4d_populate_tests(mm, p4dp, saved_pudp);
pgd_populate_tests(mm, pgdp, saved_p4dp);
-
- hugetlb_basic_tests(pte_aligned, prot);
+ spin_unlock(&mm->page_table_lock);
p4d_free(mm, saved_p4dp);
pud_free(mm, saved_pudp);
--
2.26.2
^ permalink raw reply related
* [PATCH v2 03/13] mm/debug_vm_pgtable/ppc64: Avoid setting top bits in radom value
From: Aneesh Kumar K.V @ 2020-08-19 13:00 UTC (permalink / raw)
To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual
In-Reply-To: <20200819130107.478414-1-aneesh.kumar@linux.ibm.com>
ppc64 uses bit 62 to indicate a pte entry (_PAGE_PTE). Avoid setting that bit in
the random value.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
mm/debug_vm_pgtable.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 086309fb9b6f..57259e2dbd17 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -44,10 +44,13 @@
* entry type. But these bits might affect the ability to clear entries with
* pxx_clear() because of how dynamic page table folding works on s390. So
* while loading up the entries do not change the lower 4 bits. It does not
- * have affect any other platform.
+ * have affect any other platform. Also avoid the 62nd bit on ppc64 that is
+ * used to mark a pte entry.
*/
-#define S390_MASK_BITS 4
-#define RANDOM_ORVALUE GENMASK(BITS_PER_LONG - 1, S390_MASK_BITS)
+#define S390_SKIP_MASK GENMASK(3, 0)
+#define PPC64_SKIP_MASK GENMASK(62, 62)
+#define ARCH_SKIP_MASK (S390_SKIP_MASK | PPC64_SKIP_MASK)
+#define RANDOM_ORVALUE (GENMASK(BITS_PER_LONG - 1, 0) & ~ARCH_SKIP_MASK)
#define RANDOM_NZVALUE GENMASK(7, 0)
static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
--
2.26.2
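The mask arithmetic in the patch above can be checked in userspace. The sketch below assumes a 64-bit long and uses GENMASK64() as a local stand-in for the kernel's GENMASK(); the accessor random_orvalue() is a hypothetical helper added only so the result can be inspected.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK(h, l), 64-bit only. */
#define GENMASK64(h, l) \
	((~UINT64_C(0) << (l)) & (~UINT64_C(0) >> (63 - (h))))

#define S390_SKIP_MASK   GENMASK64(3, 0)    /* low 4 bits, skipped for s390 */
#define PPC64_SKIP_MASK  GENMASK64(62, 62)  /* bit 62, the ppc64 pte marker */
#define ARCH_SKIP_MASK   (S390_SKIP_MASK | PPC64_SKIP_MASK)

/* Assumes BITS_PER_LONG == 64. */
#define RANDOM_ORVALUE   (GENMASK64(63, 0) & ~ARCH_SKIP_MASK)

/* Illustrative accessor so the resulting constant can be examined. */
static uint64_t random_orvalue(void)
{
	return RANDOM_ORVALUE;
}
```

Under these assumptions every bit is set except bits 0-3 and bit 62, so the old behaviour for s390 is preserved while the ppc64 marker bit is left clear.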
^ permalink raw reply related
* [PATCH v2 11/13] mm/debug_vm_pgtable/pmd_clear: Don't use pmd/pud_clear on pte entries
From: Aneesh Kumar K.V @ 2020-08-19 13:01 UTC (permalink / raw)
To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual
In-Reply-To: <20200819130107.478414-1-aneesh.kumar@linux.ibm.com>
pmd_clear() should not be used to clear pmd level pte entries.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
mm/debug_vm_pgtable.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 8f7a8ccb5a54..63576fe767a2 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -191,6 +191,8 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
pmd = READ_ONCE(*pmdp);
WARN_ON(pmd_young(pmd));
+ /* Clear the pte entries */
+ pmdp_huge_get_and_clear(mm, vaddr, pmdp);
pgtable = pgtable_trans_huge_withdraw(mm, pmdp);
}
@@ -311,6 +313,8 @@ static void __init pud_advanced_tests(struct mm_struct *mm,
pudp_test_and_clear_young(vma, vaddr, pudp);
pud = READ_ONCE(*pudp);
WARN_ON(pud_young(pud));
+
+ pudp_huge_get_and_clear(mm, vaddr, pudp);
}
static void __init pud_leaf_tests(unsigned long pfn, pgprot_t prot)
@@ -429,8 +433,6 @@ static void __init pud_populate_tests(struct mm_struct *mm, pud_t *pudp,
* This entry points to next level page table page.
* Hence this must not qualify as pud_bad().
*/
- pmd_clear(pmdp);
- pud_clear(pudp);
pud_populate(mm, pudp, pmdp);
pud = READ_ONCE(*pudp);
WARN_ON(pud_bad(pud));
@@ -562,7 +564,6 @@ static void __init pmd_populate_tests(struct mm_struct *mm, pmd_t *pmdp,
* This entry points to next level page table page.
* Hence this must not qualify as pmd_bad().
*/
- pmd_clear(pmdp);
pmd_populate(mm, pmdp, pgtable);
pmd = READ_ONCE(*pmdp);
WARN_ON(pmd_bad(pmd));
--
2.26.2
^ permalink raw reply related
* [PATCH v2 12/13] mm/debug_vm_pgtable/hugetlb: Disable hugetlb test on ppc64
From: Aneesh Kumar K.V @ 2020-08-19 13:01 UTC (permalink / raw)
To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual
In-Reply-To: <20200819130107.478414-1-aneesh.kumar@linux.ibm.com>
The test seems to be missing quite a lot of details w.r.t. allocating
the correct pgtable_t page (huge_pte_alloc()), holding the right
lock (huge_pte_lock()), etc. The vma used is also not a hugetlb VMA.
ppc64 does have runtime checks within CONFIG_DEBUG_VM for most of these.
Hence disable the test on ppc64.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
mm/debug_vm_pgtable.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 63576fe767a2..09ce9974c187 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -798,6 +798,7 @@ static void __init hugetlb_basic_tests(unsigned long pfn, pgprot_t prot)
#endif /* CONFIG_ARCH_WANT_GENERAL_HUGETLB */
}
+#ifndef CONFIG_PPC_BOOK3S_64
static void __init hugetlb_advanced_tests(struct mm_struct *mm,
struct vm_area_struct *vma,
pte_t *ptep, unsigned long pfn,
@@ -840,6 +841,7 @@ static void __init hugetlb_advanced_tests(struct mm_struct *mm,
pte = huge_ptep_get(ptep);
WARN_ON(!(huge_pte_write(pte) && huge_pte_dirty(pte)));
}
+#endif
#else /* !CONFIG_HUGETLB_PAGE */
static void __init hugetlb_basic_tests(unsigned long pfn, pgprot_t prot) { }
static void __init hugetlb_advanced_tests(struct mm_struct *mm,
@@ -1050,7 +1052,9 @@ static int __init debug_vm_pgtable(void)
pud_populate_tests(mm, pudp, saved_pmdp);
spin_unlock(ptl);
- //hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
+#ifndef CONFIG_PPC_BOOK3S_64
+ hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
+#endif
spin_lock(&mm->page_table_lock);
p4d_clear_tests(mm, p4dp);
--
2.26.2
^ permalink raw reply related
* [PATCH v2 07/13] mm/debug_vm_pgtable/set_pte/pmd/pud: Don't use set_*_at to update an existing pte entry
From: Aneesh Kumar K.V @ 2020-08-19 13:01 UTC (permalink / raw)
To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual
In-Reply-To: <20200819130107.478414-1-aneesh.kumar@linux.ibm.com>
set_pte_at() should not be used to set a pte entry at a location that
already holds a valid pte entry. Architectures like ppc64 don't do a TLB
invalidate in set_pte_at() and hence expect it to be used only on locations
that do not hold a valid PTE.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
mm/debug_vm_pgtable.c | 35 +++++++++++++++--------------------
1 file changed, 15 insertions(+), 20 deletions(-)
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 76f4c713e5a3..9c7e2c9cfc76 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -74,15 +74,18 @@ static void __init pte_advanced_tests(struct mm_struct *mm,
{
pte_t pte = pfn_pte(pfn, prot);
+ /*
+ * Architectures optimize set_pte_at by avoiding TLB flush.
+ * This requires set_pte_at to be not used to update an
+ * existing pte entry. Clear pte before we do set_pte_at
+ */
+
pr_debug("Validating PTE advanced\n");
pte = pfn_pte(pfn, prot);
set_pte_at(mm, vaddr, ptep, pte);
ptep_set_wrprotect(mm, vaddr, ptep);
pte = ptep_get(ptep);
WARN_ON(pte_write(pte));
-
- pte = pfn_pte(pfn, prot);
- set_pte_at(mm, vaddr, ptep, pte);
ptep_get_and_clear(mm, vaddr, ptep);
pte = ptep_get(ptep);
WARN_ON(!pte_none(pte));
@@ -96,13 +99,11 @@ static void __init pte_advanced_tests(struct mm_struct *mm,
ptep_set_access_flags(vma, vaddr, ptep, pte, 1);
pte = ptep_get(ptep);
WARN_ON(!(pte_write(pte) && pte_dirty(pte)));
-
- pte = pfn_pte(pfn, prot);
- set_pte_at(mm, vaddr, ptep, pte);
ptep_get_and_clear_full(mm, vaddr, ptep, 1);
pte = ptep_get(ptep);
WARN_ON(!pte_none(pte));
+ pte = pfn_pte(pfn, prot);
pte = pte_mkyoung(pte);
set_pte_at(mm, vaddr, ptep, pte);
ptep_test_and_clear_young(vma, vaddr, ptep);
@@ -164,9 +165,6 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
pmdp_set_wrprotect(mm, vaddr, pmdp);
pmd = READ_ONCE(*pmdp);
WARN_ON(pmd_write(pmd));
-
- pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
- set_pmd_at(mm, vaddr, pmdp, pmd);
pmdp_huge_get_and_clear(mm, vaddr, pmdp);
pmd = READ_ONCE(*pmdp);
WARN_ON(!pmd_none(pmd));
@@ -180,13 +178,11 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
pmdp_set_access_flags(vma, vaddr, pmdp, pmd, 1);
pmd = READ_ONCE(*pmdp);
WARN_ON(!(pmd_write(pmd) && pmd_dirty(pmd)));
-
- pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
- set_pmd_at(mm, vaddr, pmdp, pmd);
pmdp_huge_get_and_clear_full(vma, vaddr, pmdp, 1);
pmd = READ_ONCE(*pmdp);
WARN_ON(!pmd_none(pmd));
+ pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
pmd = pmd_mkyoung(pmd);
set_pmd_at(mm, vaddr, pmdp, pmd);
pmdp_test_and_clear_young(vma, vaddr, pmdp);
@@ -283,18 +279,10 @@ static void __init pud_advanced_tests(struct mm_struct *mm,
WARN_ON(pud_write(pud));
#ifndef __PAGETABLE_PMD_FOLDED
-
- pud = pud_mkhuge(pfn_pud(pfn, prot));
- set_pud_at(mm, vaddr, pudp, pud);
pudp_huge_get_and_clear(mm, vaddr, pudp);
pud = READ_ONCE(*pudp);
WARN_ON(!pud_none(pud));
- pud = pud_mkhuge(pfn_pud(pfn, prot));
- set_pud_at(mm, vaddr, pudp, pud);
- pudp_huge_get_and_clear_full(mm, vaddr, pudp, 1);
- pud = READ_ONCE(*pudp);
- WARN_ON(!pud_none(pud));
#endif /* __PAGETABLE_PMD_FOLDED */
pud = pud_mkhuge(pfn_pud(pfn, prot));
@@ -307,6 +295,13 @@ static void __init pud_advanced_tests(struct mm_struct *mm,
pud = READ_ONCE(*pudp);
WARN_ON(!(pud_write(pud) && pud_dirty(pud)));
+#ifndef __PAGETABLE_PMD_FOLDED
+ pudp_huge_get_and_clear_full(vma, vaddr, pudp, 1);
+ pud = READ_ONCE(*pudp);
+ WARN_ON(!pud_none(pud));
+#endif /* __PAGETABLE_PMD_FOLDED */
+
+ pud = pud_mkhuge(pfn_pud(pfn, prot));
pud = pud_mkyoung(pud);
set_pud_at(mm, vaddr, pudp, pud);
pudp_test_and_clear_young(vma, vaddr, pudp);
--
2.26.2
^ permalink raw reply related