* [PATCH AUTOSEL 5.4 102/330] powerpc/eeh: Only dump stack once if an MMIO loop is detected
From: Sasha Levin @ 2020-09-18 1:57 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Sam Bobroff, Oliver O'Halloran, linuxppc-dev, Sasha Levin
In-Reply-To: <20200918020110.2063155-1-sashal@kernel.org>
From: Oliver O'Halloran <oohall@gmail.com>
[ Upstream commit 4e0942c0302b5ad76b228b1a7b8c09f658a1d58a ]
Many drivers don't check for errors when they get a 0xFFs response from an
MMIO load. As a result after an EEH event occurs a driver can get stuck in
a polling loop unless it some kind of internal timeout logic.
Currently EEH tries to detect and report stuck drivers by dumping a stack
trace after eeh_dev_check_failure() is called EEH_MAX_FAILS times on an
already frozen PE. The value of EEH_MAX_FAILS was chosen so that a dump
would occur every few seconds if the driver was spinning in a loop. This
results in a lot of spurious stack traces in the kernel log.
Fix this by limiting it to printing one stack trace for each PE freeze. If
the driver is truely stuck the kernel's hung task detector is better suited
to reporting the probelm anyway.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Tested-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191016012536.22588-1-oohall@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/powerpc/kernel/eeh.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index bc8a551013be9..c35069294ecfb 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -503,7 +503,7 @@ int eeh_dev_check_failure(struct eeh_dev *edev)
rc = 1;
if (pe->state & EEH_PE_ISOLATED) {
pe->check_count++;
- if (pe->check_count % EEH_MAX_FAILS == 0) {
+ if (pe->check_count == EEH_MAX_FAILS) {
dn = pci_device_to_OF_node(dev);
if (dn)
location = of_get_property(dn, "ibm,loc-code",
--
2.25.1
^ permalink raw reply related
* [PATCH AUTOSEL 5.4 030/330] powerpc/64s: Always disable branch profiling for prom_init.o
From: Sasha Levin @ 2020-09-18 1:56 UTC (permalink / raw)
To: linux-kernel, stable; +Cc: Sasha Levin, linuxppc-dev
In-Reply-To: <20200918020110.2063155-1-sashal@kernel.org>
From: Michael Ellerman <mpe@ellerman.id.au>
[ Upstream commit 6266a4dadb1d0976490fdf5af4f7941e36f64e80 ]
Otherwise the build fails because prom_init is calling symbols it's
not allowed to, eg:
Error: External symbol 'ftrace_likely_update' referenced from prom_init.c
make[3]: *** [arch/powerpc/kernel/Makefile:197: arch/powerpc/kernel/prom_init_check] Error 1
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191106051129.7626-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/powerpc/kernel/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index dc0780f930d5b..59260eb962916 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -19,6 +19,7 @@ CFLAGS_btext.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_prom.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_prom_init.o += $(call cc-option, -fno-stack-protector)
+CFLAGS_prom_init.o += -DDISABLE_BRANCH_PROFILING
ifdef CONFIG_FUNCTION_TRACER
# Do not trace early boot code
@@ -36,7 +37,6 @@ KASAN_SANITIZE_btext.o := n
ifdef CONFIG_KASAN
CFLAGS_early_32.o += -DDISABLE_BRANCH_PROFILING
CFLAGS_cputable.o += -DDISABLE_BRANCH_PROFILING
-CFLAGS_prom_init.o += -DDISABLE_BRANCH_PROFILING
CFLAGS_btext.o += -DDISABLE_BRANCH_PROFILING
endif
--
2.25.1
^ permalink raw reply related
* Re: [PATCH] powerpc/process: Fix uninitialised variable error
From: Nick Desaulniers @ 2020-09-18 1:34 UTC (permalink / raw)
To: mpe; +Cc: linuxppc-dev, Nick Desaulniers, clang-built-linux
In-Reply-To: <20200917024509.3253837-1-mpe@ellerman.id.au>
https://lore.kernel.org/linuxppc-dev/20200917024509.3253837-1-mpe@ellerman.id.au/
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
^ permalink raw reply
* Re: [PATCH] selftests/seccomp: fix ptrace tests on powerpc
From: Thadeu Lima de Souza Cascardo @ 2020-09-17 22:51 UTC (permalink / raw)
To: Kees Cook; +Cc: linuxppc-dev, Shuah Khan, Oleg Nesterov, linux-kselftest
In-Reply-To: <202009171536.7FACD19@keescook>
On Thu, Sep 17, 2020 at 03:37:16PM -0700, Kees Cook wrote:
> On Sun, Sep 13, 2020 at 10:34:23PM +1000, Michael Ellerman wrote:
> > Thadeu Lima de Souza Cascardo <cascardo@canonical.com> writes:
> > > On Tue, Sep 08, 2020 at 04:18:17PM -0700, Kees Cook wrote:
> > >> On Tue, Jun 30, 2020 at 01:47:39PM -0300, Thadeu Lima de Souza Cascardo wrote:
> > ...
> > >> > @@ -1809,10 +1818,15 @@ void tracer_ptrace(struct __test_metadata *_metadata, pid_t tracee,
> > >> > EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> > >> > : PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
> > >> >
> > >> > - if (!entry)
> > >> > + if (!entry && !syscall_nr)
> > >> > return;
> > >> >
> > >> > - nr = get_syscall(_metadata, tracee);
> > >> > + if (entry)
> > >> > + nr = get_syscall(_metadata, tracee);
> > >> > + else
> > >> > + nr = *syscall_nr;
> > >>
> > >> This is weird? Shouldn't get_syscall() be modified to do the right thing
> > >> here instead of depending on the extra arg?
> > >>
> > >
> > > R0 might be clobered. Same documentation mentions it as volatile. So, during
> > > syscall exit, we can't tell for sure that R0 will have the original syscall
> > > number. So, we need to grab it during syscall enter, save it somewhere and
> > > reuse it. I used the test context/args for that.
> >
> > The user r0 (in regs->gpr[0]) shouldn't be clobbered.
> >
> > But it is modified if the tracer skips the syscall, by setting the
> > syscall number to -1. Or if the tracer changes the syscall number.
> >
> > So if you need the original syscall number in the exit path then I think
> > you do need to save it at entry.
>
> ... the selftest code wants to test the updated syscall (-1 or
> whatever), so this sounds correct. Was this test actually failing on
> powerpc? (I'd really rather not split entry/exit if I don't have to.)
>
> --
> Kees Cook
Yes, it started failing when the return code started being changed as well.
Though ptrace can change the syscall at entry (IIRC), it can't change the
return code. And that is part of the ABI. If ppc is changed so it allows
changing the return code during ptrace entry, some strace uses will break. So
that is not an option.
Cascardo.
^ permalink raw reply
* Re: [PATCH] selftests/seccomp: fix ptrace tests on powerpc
From: Kees Cook @ 2020-09-17 22:37 UTC (permalink / raw)
To: Michael Ellerman
Cc: Thadeu Lima de Souza Cascardo, linuxppc-dev, Shuah Khan,
Oleg Nesterov, linux-kselftest
In-Reply-To: <875z8hrh2o.fsf@mpe.ellerman.id.au>
On Sun, Sep 13, 2020 at 10:34:23PM +1000, Michael Ellerman wrote:
> Thadeu Lima de Souza Cascardo <cascardo@canonical.com> writes:
> > On Tue, Sep 08, 2020 at 04:18:17PM -0700, Kees Cook wrote:
> >> On Tue, Jun 30, 2020 at 01:47:39PM -0300, Thadeu Lima de Souza Cascardo wrote:
> ...
> >> > @@ -1809,10 +1818,15 @@ void tracer_ptrace(struct __test_metadata *_metadata, pid_t tracee,
> >> > EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> >> > : PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
> >> >
> >> > - if (!entry)
> >> > + if (!entry && !syscall_nr)
> >> > return;
> >> >
> >> > - nr = get_syscall(_metadata, tracee);
> >> > + if (entry)
> >> > + nr = get_syscall(_metadata, tracee);
> >> > + else
> >> > + nr = *syscall_nr;
> >>
> >> This is weird? Shouldn't get_syscall() be modified to do the right thing
> >> here instead of depending on the extra arg?
> >>
> >
> > R0 might be clobered. Same documentation mentions it as volatile. So, during
> > syscall exit, we can't tell for sure that R0 will have the original syscall
> > number. So, we need to grab it during syscall enter, save it somewhere and
> > reuse it. I used the test context/args for that.
>
> The user r0 (in regs->gpr[0]) shouldn't be clobbered.
>
> But it is modified if the tracer skips the syscall, by setting the
> syscall number to -1. Or if the tracer changes the syscall number.
>
> So if you need the original syscall number in the exit path then I think
> you do need to save it at entry.
... the selftest code wants to test the updated syscall (-1 or
whatever), so this sounds correct. Was this test actually failing on
powerpc? (I'd really rather not split entry/exit if I don't have to.)
--
Kees Cook
^ permalink raw reply
* Re: [PATCH -next] ide: Fix symbol undeclared warnings
From: David Miller @ 2020-09-17 19:44 UTC (permalink / raw)
To: mpe; +Cc: wangwensheng4, linux-kernel, linux-ide, paulus, linuxppc-dev
In-Reply-To: <87zh5oobnn.fsf@mpe.ellerman.id.au>
From: Michael Ellerman <mpe@ellerman.id.au>
Date: Thu, 17 Sep 2020 22:01:00 +1000
> Wang Wensheng <wangwensheng4@huawei.com> writes:
>> Build the object file with `C=2` and get the following warnings:
>> make allmodconfig ARCH=powerpc CROSS_COMPILE=powerpc64-linux-gnu-
>> make C=2 drivers/ide/pmac.o ARCH=powerpc64
>> CROSS_COMPILE=powerpc64-linux-gnu-
>>
>> drivers/ide/pmac.c:228:23: warning: symbol 'mdma_timings_33' was not
>> declared. Should it be static?
>> drivers/ide/pmac.c:241:23: warning: symbol 'mdma_timings_33k' was not
>> declared. Should it be static?
>> drivers/ide/pmac.c:254:23: warning: symbol 'mdma_timings_66' was not
>> declared. Should it be static?
>> drivers/ide/pmac.c:272:3: warning: symbol 'kl66_udma_timings' was not
>> declared. Should it be static?
>> drivers/ide/pmac.c:1418:12: warning: symbol 'pmac_ide_probe' was not
>> declared. Should it be static?
>>
>> Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
>> ---
>> drivers/ide/pmac.c | 10 +++++-----
>> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> TIL davem maintains IDE?
>
> But I suspect he isn't that interested in this powerpc only driver, so
> I'll grab this.
I did have it in my queue, but if you want to take it that's fine too :)
^ permalink raw reply
* Re: [trivial PATCH] treewide: Convert switch/case fallthrough; to break;
From: Jacob Keller @ 2020-09-17 19:40 UTC (permalink / raw)
To: Keith Busch, Joe Perches
Cc: linux-wireless, linux-fbdev, oss-drivers, nouveau, alsa-devel,
dri-devel, linux-mips, linux-ide, dm-devel, linux-mtd, linux-i2c,
sparclinux, kvmarm, linux-rtc, linux-s390, linux-scsi, dccp,
linux-rdma, linux-atm-general, linux-afs, coreteam,
intel-wired-lan, linux-serial, linux-input, linux-mmc, Kees Cook,
linux-media, linux-pm, intel-gfx, linux-sctp, linux-mediatek,
linux-nvme, storagedev, ceph-devel, linux-arm-kernel, linux-nfs,
Jiri Kosina, linux-parisc, netdev, linux-usb, Nick Desaulniers,
LKML, iommu, netfilter-devel, linux-crypto, bpf, linuxppc-dev
In-Reply-To: <20200909205558.GA3384631@dhcp-10-100-145-180.wdl.wdc.com>
On 9/9/2020 1:55 PM, Keith Busch wrote:
> On Wed, Sep 09, 2020 at 01:06:39PM -0700, Joe Perches wrote:
>> diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
>> index eea0f453cfb6..8aac5bc60f4c 100644
>> --- a/crypto/tcrypt.c
>> +++ b/crypto/tcrypt.c
>> @@ -2464,7 +2464,7 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
>> test_hash_speed("streebog512", sec,
>> generic_hash_speed_template);
>> if (mode > 300 && mode < 400) break;
>> - fallthrough;
>> + break;
>> case 399:
>> break;
>
> Just imho, this change makes the preceding 'if' look even more
> pointless. Maybe the fallthrough was a deliberate choice? Not that my
> opinion matters here as I don't know this module, but it looked a bit
> odd to me.
>
Yea this does look very odd..
^ permalink raw reply
* Re: [PATCH v2 0/3] ASoC: fsl_sai: update the register list
From: Mark Brown @ 2020-09-17 18:57 UTC (permalink / raw)
To: tiwai, lgirdwood, perex, timur, Shengjiu Wang, Xiubo.Lee,
festevam, nicoleotsuka, alsa-devel
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <1600323079-5317-1-git-send-email-shengjiu.wang@nxp.com>
On Thu, 17 Sep 2020 14:11:16 +0800, Shengjiu Wang wrote:
> As sai ip is upgraded, so update sai register list.
>
> Shengjiu Wang (3):
> ASoC: fsl_sai: Add new added registers and new bit definition
> ASoC: fsl_sai: Add fsl_sai_check_version function
> ASoC: fsl_sai: Set MCLK input or output direction
>
> [...]
Applied to
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next
Thanks!
[1/3] ASoC: fsl_sai: Add new added registers and new bit definition
commit: 0b2cbce6898600aae5e87285f1c2000162d59c76
[2/3] ASoC: fsl_sai: Add fsl_sai_check_version function
commit: 1dc658b13c1c365274b27bfc3c4d4f2955348fb8
[3/3] ASoC: fsl_sai: Set MCLK input or output direction
commit: a57d4e8730c1a55b2547ff81aef4753b67121cb8
All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying
to this mail.
Thanks,
Mark
^ permalink raw reply
* Re: [PATCH] ASoC: fsl_audmix: make clock and output src write only
From: Mark Brown @ 2020-09-17 18:57 UTC (permalink / raw)
To: Xiubo Li, Nicolin Chen, linuxppc-dev, Jaroslav Kysela,
Takashi Iwai, Viorel Suman (OSS), Liam Girdwood, Shengjiu Wang,
Timur Tabi, linux-kernel, alsa-devel, Fabio Estevam
Cc: Viorel Suman, NXP Linux Team, Viorel Suman
In-Reply-To: <1600104274-13110-1-git-send-email-viorel.suman@oss.nxp.com>
On Mon, 14 Sep 2020 20:24:34 +0300, Viorel Suman (OSS) wrote:
> "alsactl -f state.conf store/restore" sequence fails because setting
> "mixing clock source" and "output source" requires active TDM clock
> being started for configuration propagation. Make these two controls
> write only so that their values are not stored at "alsactl store".
Applied to
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next
Thanks!
[1/1] ASoC: fsl_audmix: make clock and output src write only
commit: 944c517b8c838832a166f1c89afbf8724f4a6b49
All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying
to this mail.
Thanks,
Mark
^ permalink raw reply
* Re: [PATCH -next] PCI: rpadlpar: use for_each_child_of_node() and for_each_node_by_name
From: Bjorn Helgaas @ 2020-09-17 16:57 UTC (permalink / raw)
To: Qinglang Miao
Cc: Tyrel Datwyler, linux-pci, linux-kernel, Paul Mackerras,
Bjorn Helgaas, linuxppc-dev
In-Reply-To: <20200916062128.190819-1-miaoqinglang@huawei.com>
On Wed, Sep 16, 2020 at 02:21:28PM +0800, Qinglang Miao wrote:
> Use for_each_child_of_node() and for_each_node_by_name macro
> instead of open coding it.
>
> Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Applied to pci/hotplug for v5.10, thanks!
> ---
> drivers/pci/hotplug/rpadlpar_core.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/pci/hotplug/rpadlpar_core.c b/drivers/pci/hotplug/rpadlpar_core.c
> index f979b7098..0a3c80ba6 100644
> --- a/drivers/pci/hotplug/rpadlpar_core.c
> +++ b/drivers/pci/hotplug/rpadlpar_core.c
> @@ -40,13 +40,13 @@ static DEFINE_MUTEX(rpadlpar_mutex);
> static struct device_node *find_vio_slot_node(char *drc_name)
> {
> struct device_node *parent = of_find_node_by_name(NULL, "vdevice");
> - struct device_node *dn = NULL;
> + struct device_node *dn;
> int rc;
>
> if (!parent)
> return NULL;
>
> - while ((dn = of_get_next_child(parent, dn))) {
> + for_each_child_of_node(parent, dn) {
> rc = rpaphp_check_drc_props(dn, drc_name, NULL);
> if (rc == 0)
> break;
> @@ -60,10 +60,10 @@ static struct device_node *find_vio_slot_node(char *drc_name)
> static struct device_node *find_php_slot_pci_node(char *drc_name,
> char *drc_type)
> {
> - struct device_node *np = NULL;
> + struct device_node *np;
> int rc;
>
> - while ((np = of_find_node_by_name(np, "pci"))) {
> + for_each_node_by_name(np, "pci") {
> rc = rpaphp_check_drc_props(np, drc_name, drc_type);
> if (rc == 0)
> break;
> --
> 2.23.0
>
^ permalink raw reply
* Re: [PATCHv7 00/12]PCI: dwc: Add the multiple PF support for DWC and Layerscape
From: Lorenzo Pieralisi @ 2020-09-17 16:20 UTC (permalink / raw)
To: Zhiqiang Hou
Cc: devicetree, andrew.murray, roy.zang, linux-pci, linux-kernel,
leoyang.li, minghuan.Lian, jingoohan1, robh+dt, mingkai.hu,
gustavo.pimentel, bhelgaas, shawnguo, kishon, linuxppc-dev,
linux-arm-kernel
In-Reply-To: <20200811095441.7636-1-Zhiqiang.Hou@nxp.com>
On Tue, Aug 11, 2020 at 05:54:29PM +0800, Zhiqiang Hou wrote:
> From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
>
> Add the PCIe EP multiple PF support for DWC and Layerscape, and use
> a list to manage the PFs of each PCIe controller; add the doorbell
> MSIX function for DWC; and refactor the Layerscape EP driver due to
> some difference in Layercape platforms PCIe integration.
>
> Hou Zhiqiang (1):
> misc: pci_endpoint_test: Add driver data for Layerscape PCIe
> controllers
>
> Xiaowei Bao (11):
> PCI: designware-ep: Add multiple PFs support for DWC
> PCI: designware-ep: Add the doorbell mode of MSI-X in EP mode
> PCI: designware-ep: Move the function of getting MSI capability
> forward
> PCI: designware-ep: Modify MSI and MSIX CAP way of finding
> dt-bindings: pci: layerscape-pci: Add compatible strings for ls1088a
> and ls2088a
> PCI: layerscape: Fix some format issue of the code
> PCI: layerscape: Modify the way of getting capability with different
> PEX
> PCI: layerscape: Modify the MSIX to the doorbell mode
> PCI: layerscape: Add EP mode support for ls1088a and ls2088a
> arm64: dts: layerscape: Add PCIe EP node for ls1088a
> misc: pci_endpoint_test: Add LS1088a in pci_device_id table
>
> .../bindings/pci/layerscape-pci.txt | 2 +
> .../arm64/boot/dts/freescale/fsl-ls1088a.dtsi | 31 +++
> drivers/misc/pci_endpoint_test.c | 8 +-
> .../pci/controller/dwc/pci-layerscape-ep.c | 100 +++++--
> .../pci/controller/dwc/pcie-designware-ep.c | 258 ++++++++++++++----
> drivers/pci/controller/dwc/pcie-designware.c | 59 ++--
> drivers/pci/controller/dwc/pcie-designware.h | 48 +++-
> 7 files changed, 410 insertions(+), 96 deletions(-)
Side note: I will change it for you but please keep Signed-off-by:
tags together in the log instead of mixing them with other tags
randomly.
Can you rebase this series against my pci/dwc branch please and
send a v8 ?
I will apply it then.
Thanks,
Lorenzo
^ permalink raw reply
* Re: [PATCH v4] pseries/hotplug-memory: hot-add: skip redundant LMB lookup
From: Laurent Dufour @ 2020-09-17 14:34 UTC (permalink / raw)
To: Scott Cheloha, linuxppc-dev
Cc: Nathan Lynch, Michal Suchanek, David Hildenbrand, Rick Lindsley
In-Reply-To: <20200916145122.3408129-1-cheloha@linux.ibm.com>
Le 16/09/2020 à 16:51, Scott Cheloha a écrit :
> During memory hot-add, dlpar_add_lmb() calls memory_add_physaddr_to_nid()
> to determine which node id (nid) to use when later calling __add_memory().
>
> This is wasteful. On pseries, memory_add_physaddr_to_nid() finds an
> appropriate nid for a given address by looking up the LMB containing the
> address and then passing that LMB to of_drconf_to_nid_single() to get the
> nid. In dlpar_add_lmb() we get this address from the LMB itself.
>
> In short, we have a pointer to an LMB and then we are searching for
> that LMB *again* in order to find its nid.
>
> If we call of_drconf_to_nid_single() directly from dlpar_add_lmb() we
> can skip the redundant lookup. The only error handling we need to
> duplicate from memory_add_physaddr_to_nid() is the fallback to the
> default nid when drconf_to_nid_single() returns -1 (NUMA_NO_NODE) or
> an invalid nid.
>
> Skipping the extra lookup makes hot-add operations faster, especially
> on machines with many LMBs.
>
> Consider an LPAR with 126976 LMBs. In one test, hot-adding 126000
> LMBs on an upatched kernel took ~3.5 hours while a patched kernel
> completed the same operation in ~2 hours:
>
> Unpatched (12450 seconds):
> Sep 9 04:06:31 ltc-brazos1 drmgr[810169]: drmgr: -c mem -a -q 126000
> Sep 9 04:06:31 ltc-brazos1 kernel: pseries-hotplug-mem: Attempting to hot-add 126000 LMB(s)
> [...]
> Sep 9 07:34:01 ltc-brazos1 kernel: pseries-hotplug-mem: Memory at 20000000 (drc index 80000002) was hot-added
>
> Patched (7065 seconds):
> Sep 8 21:49:57 ltc-brazos1 drmgr[877703]: drmgr: -c mem -a -q 126000
> Sep 8 21:49:57 ltc-brazos1 kernel: pseries-hotplug-mem: Attempting to hot-add 126000 LMB(s)
> [...]
> Sep 8 23:27:42 ltc-brazos1 kernel: pseries-hotplug-mem: Memory at 20000000 (drc index 80000002) was hot-added
>
> It should be noted that the speedup grows more substantial when
> hot-adding LMBs at the end of the drconf range. This is because we
> are skipping a linear LMB search.
>
> To see the distinction, consider smaller hot-add test on the same
> LPAR. A perf-stat run with 10 iterations showed that hot-adding 4096
> LMBs completed less than 1 second faster on a patched kernel:
>
> Unpatched:
> Performance counter stats for 'drmgr -c mem -a -q 4096' (10 runs):
>
> 104,753.42 msec task-clock # 0.992 CPUs utilized ( +- 0.55% )
> 4,708 context-switches # 0.045 K/sec ( +- 0.69% )
> 2,444 cpu-migrations # 0.023 K/sec ( +- 1.25% )
> 394 page-faults # 0.004 K/sec ( +- 0.22% )
> 445,902,503,057 cycles # 4.257 GHz ( +- 0.55% ) (66.67%)
> 8,558,376,740 stalled-cycles-frontend # 1.92% frontend cycles idle ( +- 0.88% ) (49.99%)
> 300,346,181,651 stalled-cycles-backend # 67.36% backend cycles idle ( +- 0.76% ) (50.01%)
> 258,091,488,691 instructions # 0.58 insn per cycle
> # 1.16 stalled cycles per insn ( +- 0.22% ) (66.67%)
> 70,568,169,256 branches # 673.660 M/sec ( +- 0.17% ) (50.01%)
> 3,100,725,426 branch-misses # 4.39% of all branches ( +- 0.20% ) (49.99%)
>
> 105.583 +- 0.589 seconds time elapsed ( +- 0.56% )
>
> Patched:
> Performance counter stats for 'drmgr -c mem -a -q 4096' (10 runs):
>
> 104,055.69 msec task-clock # 0.993 CPUs utilized ( +- 0.32% )
> 4,606 context-switches # 0.044 K/sec ( +- 0.20% )
> 2,463 cpu-migrations # 0.024 K/sec ( +- 0.93% )
> 394 page-faults # 0.004 K/sec ( +- 0.25% )
> 442,951,129,921 cycles # 4.257 GHz ( +- 0.32% ) (66.66%)
> 8,710,413,329 stalled-cycles-frontend # 1.97% frontend cycles idle ( +- 0.47% ) (50.06%)
> 299,656,905,836 stalled-cycles-backend # 67.65% backend cycles idle ( +- 0.39% ) (50.02%)
> 252,731,168,193 instructions # 0.57 insn per cycle
> # 1.19 stalled cycles per insn ( +- 0.20% ) (66.66%)
> 68,902,851,121 branches # 662.173 M/sec ( +- 0.13% ) (49.94%)
> 3,100,242,882 branch-misses # 4.50% of all branches ( +- 0.15% ) (49.98%)
>
> 104.829 +- 0.325 seconds time elapsed ( +- 0.31% )
>
> This is consistent. An add-by-count hot-add operation adds LMBs
> greedily, so LMBs near the start of the drconf range are considered
> first. On an otherwise idle LPAR with so many LMBs we would expect to
> find the LMBs we need near the start of the drconf range, hence the
> smaller speedup.
>
> Signed-off-by: Scott Cheloha <cheloha@linux.ibm.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
> ---
> Changelog:
>
> v1: https://lore.kernel.org/linuxppc-dev/20200910175637.2865160-1-cheloha@linux.ibm.com/
>
> v2:
> - Move prototype for of_drconf_to_nid_single() to topology.h.
> Requested by Michael Ellerman.
>
> v3:
> - Send the right patch. v2 is from the wrong branch, my mistake.
>
> v4:
> - Fix checkpatch.pl warnings. Reported by Laurent Dufour.
>
> arch/powerpc/include/asm/topology.h | 3 +++
> arch/powerpc/mm/numa.c | 2 +-
> arch/powerpc/platforms/pseries/hotplug-memory.c | 6 ++++--
> 3 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
> index f0b6300e7dd3..ae19b19f9d44 100644
> --- a/arch/powerpc/include/asm/topology.h
> +++ b/arch/powerpc/include/asm/topology.h
> @@ -86,6 +86,9 @@ static inline int cpu_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
>
> #endif /* CONFIG_NUMA */
>
> +struct drmem_lmb;
> +int of_drconf_to_nid_single(struct drmem_lmb *lmb);
> +
> #if defined(CONFIG_NUMA) && defined(CONFIG_PPC_SPLPAR)
> extern int find_and_online_cpu_nid(int cpu);
> #else
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index 1f61fa2148b5..63507b47164d 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -430,7 +430,7 @@ static int of_get_assoc_arrays(struct assoc_arrays *aa)
> * This is like of_node_to_nid_single() for memory represented in the
> * ibm,dynamic-reconfiguration-memory node.
> */
> -static int of_drconf_to_nid_single(struct drmem_lmb *lmb)
> +int of_drconf_to_nid_single(struct drmem_lmb *lmb)
> {
> struct assoc_arrays aa = { .arrays = NULL };
> int default_nid = NUMA_NO_NODE;
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index 0ea976d1cac4..9a533acf8ad0 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -611,8 +611,10 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
>
> block_sz = memory_block_size_bytes();
>
> - /* Find the node id for this address. */
> - nid = memory_add_physaddr_to_nid(lmb->base_addr);
> + /* Find the node id for this LMB. Fake one if necessary. */
> + nid = of_drconf_to_nid_single(lmb);
> + if (nid < 0 || !node_possible(nid))
> + nid = first_online_node;
>
> /* Add the memory */
> rc = __add_memory(nid, lmb->base_addr, block_sz);
>
^ permalink raw reply
* Re: [PATCH v2] powerpc: fix EDEADLOCK redefinition error in uapi/asm/errno.h
From: Andreas Schwab @ 2020-09-17 14:34 UTC (permalink / raw)
To: Arnd Bergmann
Cc: linux-arch, Tony Ambardar, linux-kernel@vger.kernel.org,
Rosen Penev, Paul Mackerras, bpf, linuxppc-dev
In-Reply-To: <CAK8P3a3FVoDzNb1TOA6cRQDdEc+st7KkBL70t0FeStEziQG4+A__37056.5000850306$1600351707$gmane$org@mail.gmail.com>
On Sep 17 2020, Arnd Bergmann wrote:
> The errno man page says they are supposed to be synonyms,
> and glibc defines it that way, while musl uses the numbers
> from the kernel.
glibc also uses whatever the kernel defines.
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
^ permalink raw reply
* Re: [PATCH v2] powerpc: fix EDEADLOCK redefinition error in uapi/asm/errno.h
From: Arnd Bergmann @ 2020-09-17 14:01 UTC (permalink / raw)
To: Michael Ellerman
Cc: linux-arch, Tony Ambardar, linux-kernel@vger.kernel.org,
Paul Mackerras, Rosen Penev, bpf, linuxppc-dev
In-Reply-To: <87363gpqhz.fsf@mpe.ellerman.id.au>
On Thu, Sep 17, 2020 at 1:55 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> [ Cc += linux-arch & Arnd ]
>
> Hi Tony,
>
> This looks OK to me, but I'm always a bit nervous about changes in uapi.
> I've Cc'ed linux-arch and Arnd who look after the asm-generic headers,
> which this is slightly related to, just in case.
>
> One minor comment below.
>
> Tony Ambardar <tony.ambardar@gmail.com> writes:
> > A few archs like powerpc have different errno.h values for macros
> > EDEADLOCK and EDEADLK. In code including both libc and linux versions of
> > errno.h, this can result in multiple definitions of EDEADLOCK in the
> > include chain. Definitions to the same value (e.g. seen with mips) do
> > not raise warnings, but on powerpc there are redefinitions changing the
> > value, which raise warnings and errors (if using "-Werror").
> >
> > Guard against these redefinitions to avoid build errors like the following,
> > first seen cross-compiling libbpf v5.8.9 for powerpc using GCC 8.4.0 with
> > musl 1.1.24:
> >
> > In file included from ../../arch/powerpc/include/uapi/asm/errno.h:5,
> > from ../../include/linux/err.h:8,
> > from libbpf.c:29:
> > ../../include/uapi/asm-generic/errno.h:40: error: "EDEADLOCK" redefined [-Werror]
> > #define EDEADLOCK EDEADLK
> >
> > In file included from toolchain-powerpc_8540_gcc-8.4.0_musl/include/errno.h:10,
> > from libbpf.c:26:
> > toolchain-powerpc_8540_gcc-8.4.0_musl/include/bits/errno.h:58: note: this is the location of the previous definition
> > #define EDEADLOCK 58
> >
> > cc1: all warnings being treated as errors
> >
> > Fixes: 95f28190aa01 ("tools include arch: Grab a copy of errno.h for arch's supported by perf")
> > Fixes: c3617f72036c ("UAPI: (Scripted) Disintegrate arch/powerpc/include/asm")
>
> I suspect that's not the right commit to tag. It just moved errno.h from
> arch/powerpc/include/asm to arch/powerpc/include/uapi/asm. It's content
> was almost identical, and entirely identical as far as EDEADLOCK was
> concerned.
>
> Prior to that the file lived in asm-powerpc/errno.h, eg:
>
> $ git cat-file -p b8b572e1015f^:include/asm-powerpc/errno.h
>
> Before that it was include/asm-ppc64/errno.h, content still the same.
>
> To go back further we'd have to look at the historical git trees, which
> is probably overkill. I'm pretty sure it's always had this problem.
>
> So we should probably drop the Fixes tags and just Cc: stable, that
> means please backport it as far back as possible.
I can see that the two numbers (35 and 58) were consistent across
multiple architectures (i386, m68k, ppc32) up to linux-2.0.1, while
other architectures had two unique numbers (alpha, mips, sparc)
at the time, and sparc had BSD and Solaris compatible numbers
in addition.
In linux-2.0.2, alpha and i386 got changed to use 35 for both,
but the other architectures remained unchanged. All later
architectures followed x86 in using the same number for both.
I foudn a message about tcl breaking at compile time when
it changed:
http://lkml.iu.edu/hypermail/linux/kernel/9607.3/0500.html
The errno man page says they are supposed to be synonyms,
and glibc defines it that way, while musl uses the numbers
from the kernel.
Arnd
^ permalink raw reply
* [PATCH v3] powerpc: fix EDEADLOCK redefinition error in uapi/asm/errno.h
From: Tony Ambardar @ 2020-09-17 13:54 UTC (permalink / raw)
To: Michael Ellerman
Cc: linux-arch, Tony Ambardar, Arnd Bergmann, linux-kernel, Stable,
Paul Mackerras, Rosen Penev, bpf, linuxppc-dev
In-Reply-To: <20200917000757.1232850-1-Tony.Ambardar@gmail.com>
A few archs like powerpc have different errno.h values for macros
EDEADLOCK and EDEADLK. In code including both libc and linux versions of
errno.h, this can result in multiple definitions of EDEADLOCK in the
include chain. Definitions to the same value (e.g. seen with mips) do
not raise warnings, but on powerpc there are redefinitions changing the
value, which raise warnings and errors (if using "-Werror").
Guard against these redefinitions to avoid build errors like the following,
first seen cross-compiling libbpf v5.8.9 for powerpc using GCC 8.4.0 with
musl 1.1.24:
In file included from ../../arch/powerpc/include/uapi/asm/errno.h:5,
from ../../include/linux/err.h:8,
from libbpf.c:29:
../../include/uapi/asm-generic/errno.h:40: error: "EDEADLOCK" redefined [-Werror]
#define EDEADLOCK EDEADLK
In file included from toolchain-powerpc_8540_gcc-8.4.0_musl/include/errno.h:10,
from libbpf.c:26:
toolchain-powerpc_8540_gcc-8.4.0_musl/include/bits/errno.h:58: note: this is the location of the previous definition
#define EDEADLOCK 58
cc1: all warnings being treated as errors
CC: Stable <stable@vger.kernel.org>
Reported-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: Tony Ambardar <Tony.Ambardar@gmail.com>
---
v1 -> v2:
* clean up commit description formatting
v2 -> v3: (per Michael Ellerman)
* drop indeterminate 'Fixes' tags, request stable backports instead
---
arch/powerpc/include/uapi/asm/errno.h | 1 +
tools/arch/powerpc/include/uapi/asm/errno.h | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/powerpc/include/uapi/asm/errno.h b/arch/powerpc/include/uapi/asm/errno.h
index cc79856896a1..4ba87de32be0 100644
--- a/arch/powerpc/include/uapi/asm/errno.h
+++ b/arch/powerpc/include/uapi/asm/errno.h
@@ -2,6 +2,7 @@
#ifndef _ASM_POWERPC_ERRNO_H
#define _ASM_POWERPC_ERRNO_H
+#undef EDEADLOCK
#include <asm-generic/errno.h>
#undef EDEADLOCK
diff --git a/tools/arch/powerpc/include/uapi/asm/errno.h b/tools/arch/powerpc/include/uapi/asm/errno.h
index cc79856896a1..4ba87de32be0 100644
--- a/tools/arch/powerpc/include/uapi/asm/errno.h
+++ b/tools/arch/powerpc/include/uapi/asm/errno.h
@@ -2,6 +2,7 @@
#ifndef _ASM_POWERPC_ERRNO_H
#define _ASM_POWERPC_ERRNO_H
+#undef EDEADLOCK
#include <asm-generic/errno.h>
#undef EDEADLOCK
--
2.25.1
^ permalink raw reply related
* Re: [PATCH 1/2] ASoC: fsl_xcvr: Add XCVR ASoC CPU DAI driver
From: Mark Brown @ 2020-09-17 13:53 UTC (permalink / raw)
To: Viorel Suman (OSS)
Cc: devicetree, alsa-devel, Matthias Schiffer, Viorel Suman,
Timur Tabi, Xiubo Li, Shengjiu Wang, linuxppc-dev, Takashi Iwai,
Liam Girdwood, Nicolin Chen, Rob Herring, NXP Linux Team,
Philipp Zabel, Viorel Suman, Cosmin-Gabriel Samoila,
Jaroslav Kysela, Fabio Estevam, linux-kernel
In-Reply-To: <1600247876-8013-2-git-send-email-viorel.suman@oss.nxp.com>
[-- Attachment #1: Type: text/plain, Size: 2280 bytes --]
On Wed, Sep 16, 2020 at 12:17:55PM +0300, Viorel Suman (OSS) wrote:
This looks mostly good, a few smallish things below but nothing major.
> +static int fsl_xcvr_load_firmware(struct fsl_xcvr *xcvr)
> +{
> + struct device *dev = &xcvr->pdev->dev;
> + const struct firmware *fw;
> + int ret = 0, rem, off, out, page = 0, size = FSL_XCVR_REG_OFFSET;
> + u32 mask, val;
> +
> + ret = request_firmware(&fw, xcvr->fw_name, dev);
> + if (ret) {
> + dev_err(dev, "failed to request firmware.\n");
> + return ret;
> + }
> +
> + rem = fw->size;
It would be good to see some explicit validation of the image size, at
least printing an error message if the image is bigger than can be
loaded. The code should be safe in that it won't overflow the device
region it's writing to but it feels like it'd be better to tell people
if we spot a problem rather than just silently truncating the file.
> + /* RAM is 20KiB => max 10 pages 2KiB each */
> + for (page = 0; page < 10; page++) {
> + ret = regmap_update_bits(xcvr->regmap, FSL_XCVR_EXT_CTRL,
> + FSL_XCVR_EXT_CTRL_PAGE_MASK,
> + FSL_XCVR_EXT_CTRL_PAGE(page));
regmap does have paging support, though given that this is currently the
only place where paging is used this probably doesn't matter too much.
> +static irqreturn_t irq0_isr(int irq, void *devid)
> +{
> + struct fsl_xcvr *xcvr = (struct fsl_xcvr *)devid;
> + struct device *dev = &xcvr->pdev->dev;
> + struct regmap *regmap = xcvr->regmap;
> + void __iomem *reg_ctrl, *reg_buff;
> + u32 isr, val, i;
> +
> + regmap_read(regmap, FSL_XCVR_EXT_ISR, &isr);
> + regmap_write(regmap, FSL_XCVR_EXT_ISR_CLR, isr);
This will unconditionally clear any interrupts, even those we don't
understand - it might be better to only clear bits that are supported so
the IRQ core can complain if there's something unexpected showing up.
> + if (isr & FSL_XCVR_IRQ_FIFO_UOFL_ERR)
> + dev_dbg(dev, "RX/TX FIFO full/empty\n");
Should this be dev_err()?
> +static irqreturn_t irq1_isr(int irq, void *devid)
> +{
> + struct fsl_xcvr *xcvr = (struct fsl_xcvr *)devid;
> + struct device *dev = &xcvr->pdev->dev;
> +
> + dev_dbg(dev, "irq[1]: %d\n", irq);
> +
> + return IRQ_HANDLED;
> +}
Is there any value in even requesting this and irq2 given the lack of
meaningful handling?
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply
* Re: [PATCH v2] powerpc: fix EDEADLOCK redefinition error in uapi/asm/errno.h
From: Tony Ambardar @ 2020-09-17 13:42 UTC (permalink / raw)
To: Michael Ellerman
Cc: linux-arch, Arnd Bergmann, linux-kernel, Rosen Penev,
Paul Mackerras, bpf, linuxppc-dev
In-Reply-To: <87363gpqhz.fsf@mpe.ellerman.id.au>
On Thu, 17 Sep 2020 at 04:55, Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> [ Cc += linux-arch & Arnd ]
>
> Hi Tony,
>
> This looks OK to me, but I'm always a bit nervous about changes in uapi.
> I've Cc'ed linux-arch and Arnd who look after the asm-generic headers,
> which this is slightly related to, just in case.
>
I agree with the caution and would welcome any other insights.
> One minor comment below.
>
> Tony Ambardar <tony.ambardar@gmail.com> writes:
> > A few archs like powerpc have different errno.h values for macros
> > EDEADLOCK and EDEADLK. In code including both libc and linux versions of
> > errno.h, this can result in multiple definitions of EDEADLOCK in the
> > include chain. Definitions to the same value (e.g. seen with mips) do
> > not raise warnings, but on powerpc there are redefinitions changing the
> > value, which raise warnings and errors (if using "-Werror").
> >
> > Guard against these redefinitions to avoid build errors like the following,
> > first seen cross-compiling libbpf v5.8.9 for powerpc using GCC 8.4.0 with
> > musl 1.1.24:
> >
> > In file included from ../../arch/powerpc/include/uapi/asm/errno.h:5,
> > from ../../include/linux/err.h:8,
> > from libbpf.c:29:
> > ../../include/uapi/asm-generic/errno.h:40: error: "EDEADLOCK" redefined [-Werror]
> > #define EDEADLOCK EDEADLK
> >
> > In file included from toolchain-powerpc_8540_gcc-8.4.0_musl/include/errno.h:10,
> > from libbpf.c:26:
> > toolchain-powerpc_8540_gcc-8.4.0_musl/include/bits/errno.h:58: note: this is the location of the previous definition
> > #define EDEADLOCK 58
> >
> > cc1: all warnings being treated as errors
> >
> > Fixes: 95f28190aa01 ("tools include arch: Grab a copy of errno.h for arch's supported by perf")
> > Fixes: c3617f72036c ("UAPI: (Scripted) Disintegrate arch/powerpc/include/asm")
>
> I suspect that's not the right commit to tag. It just moved errno.h from
> arch/powerpc/include/asm to arch/powerpc/include/uapi/asm. It's content
> was almost identical, and entirely identical as far as EDEADLOCK was
> concerned.
>
> Prior to that the file lived in asm-powerpc/errno.h, eg:
>
> $ git cat-file -p b8b572e1015f^:include/asm-powerpc/errno.h
>
> Before that it was include/asm-ppc64/errno.h, content still the same.
>
> To go back further we'd have to look at the historical git trees, which
> is probably overkill. I'm pretty sure it's always had this problem.
>
> So we should probably drop the Fixes tags and just Cc: stable, that
> means please backport it as far back as possible.
>
Yes, you're right. Those two commits were simply where I stopped tracing back
the long chain. I'll drop them as you suggest and request a backport instead in
the next version of the patch.
Thanks for your feedback!
> cheers
>
>
> > Reported-by: Rosen Penev <rosenp@gmail.com>
> > Signed-off-by: Tony Ambardar <Tony.Ambardar@gmail.com>
> > ---
> > v1 -> v2:
> > * clean up commit description formatting
> > ---
> > arch/powerpc/include/uapi/asm/errno.h | 1 +
> > tools/arch/powerpc/include/uapi/asm/errno.h | 1 +
> > 2 files changed, 2 insertions(+)
> >
> > diff --git a/arch/powerpc/include/uapi/asm/errno.h b/arch/powerpc/include/uapi/asm/errno.h
> > index cc79856896a1..4ba87de32be0 100644
> > --- a/arch/powerpc/include/uapi/asm/errno.h
> > +++ b/arch/powerpc/include/uapi/asm/errno.h
> > @@ -2,6 +2,7 @@
> > #ifndef _ASM_POWERPC_ERRNO_H
> > #define _ASM_POWERPC_ERRNO_H
> >
> > +#undef EDEADLOCK
> > #include <asm-generic/errno.h>
> >
> > #undef EDEADLOCK
> > diff --git a/tools/arch/powerpc/include/uapi/asm/errno.h b/tools/arch/powerpc/include/uapi/asm/errno.h
> > index cc79856896a1..4ba87de32be0 100644
> > --- a/tools/arch/powerpc/include/uapi/asm/errno.h
> > +++ b/tools/arch/powerpc/include/uapi/asm/errno.h
> > @@ -2,6 +2,7 @@
> > #ifndef _ASM_POWERPC_ERRNO_H
> > #define _ASM_POWERPC_ERRNO_H
> >
> > +#undef EDEADLOCK
> > #include <asm-generic/errno.h>
> >
> > #undef EDEADLOCK
> > --
> > 2.25.1
^ permalink raw reply
* Re: [PATCH v6 8/8] powerpc/watchpoint/selftests: Tests for kernel accessing user memory
From: Rogerio Alves @ 2020-09-17 13:26 UTC (permalink / raw)
To: Ravi Bangoria, mpe, christophe.leroy
Cc: mikey, jniethe5, pedromfc, linux-kernel, paulus, rogealve,
naveen.n.rao, linuxppc-dev
In-Reply-To: <20200902042945.129369-9-ravi.bangoria@linux.ibm.com>
On 9/2/20 1:29 AM, Ravi Bangoria wrote:
> Introduce tests to cover simple scenarios where user is watching
> memory which can be accessed by kernel as well. We also support
> _MODE_EXACT with _SETHWDEBUG interface. Move those testcases out-
> side of _BP_RANGE condition. This will help to test _MODE_EXACT
> scenarios when CONFIG_HAVE_HW_BREAKPOINT is not set, eg:
>
> $ ./ptrace-hwbreak
> ...
> PTRACE_SET_DEBUGREG, Kernel Access Userspace, len: 8: Ok
> PPC_PTRACE_SETHWDEBUG, MODE_EXACT, WO, len: 1: Ok
> PPC_PTRACE_SETHWDEBUG, MODE_EXACT, RO, len: 1: Ok
> PPC_PTRACE_SETHWDEBUG, MODE_EXACT, RW, len: 1: Ok
> PPC_PTRACE_SETHWDEBUG, MODE_EXACT, Kernel Access Userspace, len: 1: Ok
> success: ptrace-hwbreak
>
> Suggested-by: Pedro Miraglia Franco de Carvalho <pedromfc@linux.ibm.com>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Rogerio Alves <rcardoso@linux.ibm.com>
> ---
> .../selftests/powerpc/ptrace/ptrace-hwbreak.c | 48 ++++++++++++++++++-
> 1 file changed, 46 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c b/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
> index fc477dfe86a2..2e0d86e0687e 100644
> --- a/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
> +++ b/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
> @@ -20,6 +20,8 @@
> #include <signal.h>
> #include <sys/types.h>
> #include <sys/wait.h>
> +#include <sys/syscall.h>
> +#include <linux/limits.h>
> #include "ptrace.h"
>
> #define SPRN_PVR 0x11F
> @@ -44,6 +46,7 @@ struct gstruct {
> };
> static volatile struct gstruct gstruct __attribute__((aligned(512)));
>
> +static volatile char cwd[PATH_MAX] __attribute__((aligned(8)));
>
> static void get_dbginfo(pid_t child_pid, struct ppc_debug_info *dbginfo)
> {
> @@ -138,6 +141,9 @@ static void test_workload(void)
> write_var(len);
> }
>
> + /* PTRACE_SET_DEBUGREG, Kernel Access Userspace test */
> + syscall(__NR_getcwd, &cwd, PATH_MAX);
> +
> /* PPC_PTRACE_SETHWDEBUG, MODE_EXACT, WO test */
> write_var(1);
>
> @@ -150,6 +156,9 @@ static void test_workload(void)
> else
> read_var(1);
>
> + /* PPC_PTRACE_SETHWDEBUG, MODE_EXACT, Kernel Access Userspace test */
> + syscall(__NR_getcwd, &cwd, PATH_MAX);
> +
> /* PPC_PTRACE_SETHWDEBUG, MODE_RANGE, DW ALIGNED, WO test */
> gstruct.a[rand() % A_LEN] = 'a';
>
> @@ -293,6 +302,24 @@ static int test_set_debugreg(pid_t child_pid)
> return 0;
> }
>
> +static int test_set_debugreg_kernel_userspace(pid_t child_pid)
> +{
> + unsigned long wp_addr = (unsigned long)cwd;
> + char *name = "PTRACE_SET_DEBUGREG";
> +
> + /* PTRACE_SET_DEBUGREG, Kernel Access Userspace test */
> + wp_addr &= ~0x7UL;
> + wp_addr |= (1Ul << DABR_READ_SHIFT);
> + wp_addr |= (1UL << DABR_WRITE_SHIFT);
> + wp_addr |= (1UL << DABR_TRANSLATION_SHIFT);
> + ptrace_set_debugreg(child_pid, wp_addr);
> + ptrace(PTRACE_CONT, child_pid, NULL, 0);
> + check_success(child_pid, name, "Kernel Access Userspace", wp_addr, 8);
> +
> + ptrace_set_debugreg(child_pid, 0);
> + return 0;
> +}
> +
> static void get_ppc_hw_breakpoint(struct ppc_hw_breakpoint *info, int type,
> unsigned long addr, int len)
> {
> @@ -338,6 +365,22 @@ static void test_sethwdebug_exact(pid_t child_pid)
> ptrace_delhwdebug(child_pid, wh);
> }
>
> +static void test_sethwdebug_exact_kernel_userspace(pid_t child_pid)
> +{
> + struct ppc_hw_breakpoint info;
> + unsigned long wp_addr = (unsigned long)&cwd;
> + char *name = "PPC_PTRACE_SETHWDEBUG, MODE_EXACT";
> + int len = 1; /* hardcoded in kernel */
> + int wh;
> +
> + /* PPC_PTRACE_SETHWDEBUG, MODE_EXACT, Kernel Access Userspace test */
> + get_ppc_hw_breakpoint(&info, PPC_BREAKPOINT_TRIGGER_WRITE, wp_addr, 0);
> + wh = ptrace_sethwdebug(child_pid, &info);
> + ptrace(PTRACE_CONT, child_pid, NULL, 0);
> + check_success(child_pid, name, "Kernel Access Userspace", wp_addr, len);
> + ptrace_delhwdebug(child_pid, wh);
> +}
> +
> static void test_sethwdebug_range_aligned(pid_t child_pid)
> {
> struct ppc_hw_breakpoint info;
> @@ -452,9 +495,10 @@ static void
> run_tests(pid_t child_pid, struct ppc_debug_info *dbginfo, bool dawr)
> {
> test_set_debugreg(child_pid);
> + test_set_debugreg_kernel_userspace(child_pid);
> + test_sethwdebug_exact(child_pid);
> + test_sethwdebug_exact_kernel_userspace(child_pid);
> if (dbginfo->features & PPC_DEBUG_FEATURE_DATA_BP_RANGE) {
> - test_sethwdebug_exact(child_pid);
> -
> test_sethwdebug_range_aligned(child_pid);
> if (dawr || is_8xx) {
> test_sethwdebug_range_unaligned(child_pid);
>
^ permalink raw reply
* Re: [PATCH v6 7/8] powerpc/watchpoint/ptrace: Introduce PPC_DEBUG_FEATURE_DATA_BP_ARCH_31
From: Rogerio Alves @ 2020-09-17 13:26 UTC (permalink / raw)
To: Ravi Bangoria, mpe, christophe.leroy
Cc: mikey, jniethe5, pedromfc, linux-kernel, paulus, rogealve,
naveen.n.rao, linuxppc-dev
In-Reply-To: <20200902042945.129369-8-ravi.bangoria@linux.ibm.com>
On 9/2/20 1:29 AM, Ravi Bangoria wrote:
> PPC_DEBUG_FEATURE_DATA_BP_ARCH_31 can be used to determine whether
> we are running on an ISA 3.1 compliant machine. Which is needed to
> determine DAR behaviour, 512 byte boundary limit etc. This was
> requested by Pedro Miraglia Franco de Carvalho for extending
> watchpoint features in gdb. Note that availability of 2nd DAWR is
> independent of this flag and should be checked using
> ppc_debug_info->num_data_bps.
>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Rogerio Alves <rcardoso@linux.ibm.com>
> ---
> Documentation/powerpc/ptrace.rst | 1 +
> arch/powerpc/include/uapi/asm/ptrace.h | 1 +
> arch/powerpc/kernel/ptrace/ptrace-noadv.c | 2 ++
> 3 files changed, 4 insertions(+)
>
> diff --git a/Documentation/powerpc/ptrace.rst b/Documentation/powerpc/ptrace.rst
> index 864d4b6dddd1..77725d69eb4a 100644
> --- a/Documentation/powerpc/ptrace.rst
> +++ b/Documentation/powerpc/ptrace.rst
> @@ -46,6 +46,7 @@ features will have bits indicating whether there is support for::
> #define PPC_DEBUG_FEATURE_DATA_BP_RANGE 0x4
> #define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x8
> #define PPC_DEBUG_FEATURE_DATA_BP_DAWR 0x10
> + #define PPC_DEBUG_FEATURE_DATA_BP_ARCH_31 0x20
>
> 2. PTRACE_SETHWDEBUG
>
> diff --git a/arch/powerpc/include/uapi/asm/ptrace.h b/arch/powerpc/include/uapi/asm/ptrace.h
> index f5f1ccc740fc..7004cfea3f5f 100644
> --- a/arch/powerpc/include/uapi/asm/ptrace.h
> +++ b/arch/powerpc/include/uapi/asm/ptrace.h
> @@ -222,6 +222,7 @@ struct ppc_debug_info {
> #define PPC_DEBUG_FEATURE_DATA_BP_RANGE 0x0000000000000004
> #define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x0000000000000008
> #define PPC_DEBUG_FEATURE_DATA_BP_DAWR 0x0000000000000010
> +#define PPC_DEBUG_FEATURE_DATA_BP_ARCH_31 0x0000000000000020
>
> #ifndef __ASSEMBLY__
>
> diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
> index 48c52426af80..aa36fcad36cd 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
> @@ -57,6 +57,8 @@ void ppc_gethwdinfo(struct ppc_debug_info *dbginfo)
> } else {
> dbginfo->features = 0;
> }
> + if (cpu_has_feature(CPU_FTR_ARCH_31))
> + dbginfo->features |= PPC_DEBUG_FEATURE_DATA_BP_ARCH_31;
> }
>
> int ptrace_get_debugreg(struct task_struct *child, unsigned long addr,
>
^ permalink raw reply
* Re: [PATCH v6 6/8] powerpc/watchpoint: Add hw_len wherever missing
From: Rogerio Alves @ 2020-09-17 13:26 UTC (permalink / raw)
To: Ravi Bangoria, mpe, christophe.leroy
Cc: mikey, jniethe5, pedromfc, linux-kernel, paulus, rogealve,
naveen.n.rao, linuxppc-dev
In-Reply-To: <20200902042945.129369-7-ravi.bangoria@linux.ibm.com>
On 9/2/20 1:29 AM, Ravi Bangoria wrote:
> There are couple of places where we set len but not hw_len. For
> ptrace/perf watchpoints, when CONFIG_HAVE_HW_BREAKPOINT=Y, hw_len
> will be calculated and set internally while parsing watchpoint.
> But when CONFIG_HAVE_HW_BREAKPOINT=N, we need to manually set
> 'hw_len'. Similarly for xmon as well, hw_len needs to be set
> directly.
>
> Fixes: b57aeab811db ("powerpc/watchpoint: Fix length calculation for unaligned target")
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Rogerio Alves <rcardoso@linux.ibm.com>
> ---
> arch/powerpc/kernel/ptrace/ptrace-noadv.c | 1 +
> arch/powerpc/xmon/xmon.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
> index c9122ed91340..48c52426af80 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
> @@ -219,6 +219,7 @@ long ppc_set_hwdebug(struct task_struct *child, struct ppc_hw_breakpoint *bp_inf
> brk.address = ALIGN_DOWN(bp_info->addr, HW_BREAKPOINT_SIZE);
> brk.type = HW_BRK_TYPE_TRANSLATE | HW_BRK_TYPE_PRIV_ALL;
> brk.len = DABR_MAX_LEN;
> + brk.hw_len = DABR_MAX_LEN;
> if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_READ)
> brk.type |= HW_BRK_TYPE_READ;
> if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_WRITE)
> diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
> index df7bca00f5ec..55c43a6c9111 100644
> --- a/arch/powerpc/xmon/xmon.c
> +++ b/arch/powerpc/xmon/xmon.c
> @@ -969,6 +969,7 @@ static void insert_cpu_bpts(void)
> brk.address = dabr[i].address;
> brk.type = (dabr[i].enabled & HW_BRK_TYPE_DABR) | HW_BRK_TYPE_PRIV_ALL;
> brk.len = 8;
> + brk.hw_len = 8;
> __set_breakpoint(i, &brk);
> }
> }
>
^ permalink raw reply
* Re: [PATCH v6 5/8] powerpc/watchpoint: Fix exception handling for CONFIG_HAVE_HW_BREAKPOINT=N
From: Rogerio Alves @ 2020-09-17 13:25 UTC (permalink / raw)
To: Ravi Bangoria, mpe, christophe.leroy
Cc: mikey, jniethe5, pedromfc, linux-kernel, paulus, rogealve,
naveen.n.rao, linuxppc-dev
In-Reply-To: <20200902042945.129369-6-ravi.bangoria@linux.ibm.com>
On 9/2/20 1:29 AM, Ravi Bangoria wrote:
> On powerpc, ptrace watchpoint works in one-shot mode. i.e. kernel
> disables event every time it fires and user has to re-enable it.
> Also, in case of ptrace watchpoint, kernel notifies ptrace user
> before executing instruction.
>
> With CONFIG_HAVE_HW_BREAKPOINT=N, kernel is missing to disable
> ptrace event and thus it's causing infinite loop of exceptions.
> This is especially harmful when user watches on a data which is
> also read/written by kernel, eg syscall parameters. In such case,
> infinite exceptions happens in kernel mode which causes soft-lockup.
>
> Fixes: 9422de3e953d ("powerpc: Hardware breakpoints rewrite to handle non DABR breakpoint registers")
> Reported-by: Pedro Miraglia Franco de Carvalho <pedromfc@linux.ibm.com>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Rogerio Alves <rcardoso@linux.ibm.com>
> ---
> arch/powerpc/include/asm/hw_breakpoint.h | 3 ++
> arch/powerpc/kernel/process.c | 48 +++++++++++++++++++++++
> arch/powerpc/kernel/ptrace/ptrace-noadv.c | 4 +-
> 3 files changed, 54 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h
> index 81872c420476..abebfbee5b1c 100644
> --- a/arch/powerpc/include/asm/hw_breakpoint.h
> +++ b/arch/powerpc/include/asm/hw_breakpoint.h
> @@ -18,6 +18,7 @@ struct arch_hw_breakpoint {
> u16 type;
> u16 len; /* length of the target data symbol */
> u16 hw_len; /* length programmed in hw */
> + u8 flags;
> };
>
> /* Note: Don't change the first 6 bits below as they are in the same order
> @@ -37,6 +38,8 @@ struct arch_hw_breakpoint {
> #define HW_BRK_TYPE_PRIV_ALL (HW_BRK_TYPE_USER | HW_BRK_TYPE_KERNEL | \
> HW_BRK_TYPE_HYP)
>
> +#define HW_BRK_FLAG_DISABLED 0x1
> +
> /* Minimum granularity */
> #ifdef CONFIG_PPC_8xx
> #define HW_BREAKPOINT_SIZE 0x4
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 016bd831908e..160fbbf41d40 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -636,6 +636,44 @@ void do_send_trap(struct pt_regs *regs, unsigned long address,
> (void __user *)address);
> }
> #else /* !CONFIG_PPC_ADV_DEBUG_REGS */
> +
> +static void do_break_handler(struct pt_regs *regs)
> +{
> + struct arch_hw_breakpoint null_brk = {0};
> + struct arch_hw_breakpoint *info;
> + struct ppc_inst instr = ppc_inst(0);
> + int type = 0;
> + int size = 0;
> + unsigned long ea;
> + int i;
> +
> + /*
> + * If underneath hw supports only one watchpoint, we know it
> + * caused exception. 8xx also falls into this category.
> + */
> + if (nr_wp_slots() == 1) {
> + __set_breakpoint(0, &null_brk);
> + current->thread.hw_brk[0] = null_brk;
> + current->thread.hw_brk[0].flags |= HW_BRK_FLAG_DISABLED;
> + return;
> + }
> +
> + /* Otherwise findout which DAWR caused exception and disable it. */
> + wp_get_instr_detail(regs, &instr, &type, &size, &ea);
> +
> + for (i = 0; i < nr_wp_slots(); i++) {
> + info = ¤t->thread.hw_brk[i];
> + if (!info->address)
> + continue;
> +
> + if (wp_check_constraints(regs, instr, ea, type, size, info)) {
> + __set_breakpoint(i, &null_brk);
> + current->thread.hw_brk[i] = null_brk;
> + current->thread.hw_brk[i].flags |= HW_BRK_FLAG_DISABLED;
> + }
> + }
> +}
> +
> void do_break (struct pt_regs *regs, unsigned long address,
> unsigned long error_code)
> {
> @@ -647,6 +685,16 @@ void do_break (struct pt_regs *regs, unsigned long address,
> if (debugger_break_match(regs))
> return;
>
> + /*
> + * We reach here only when watchpoint exception is generated by ptrace
> + * event (or hw is buggy!). Now if CONFIG_HAVE_HW_BREAKPOINT is set,
> + * watchpoint is already handled by hw_breakpoint_handler() so we don't
> + * have to do anything. But when CONFIG_HAVE_HW_BREAKPOINT is not set,
> + * we need to manually handle the watchpoint here.
> + */
> + if (!IS_ENABLED(CONFIG_HAVE_HW_BREAKPOINT))
> + do_break_handler(regs);
> +
> /* Deliver the signal to userspace */
> force_sig_fault(SIGTRAP, TRAP_HWBKPT, (void __user *)address);
> }
> diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
> index 57a0ab822334..c9122ed91340 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
> @@ -286,11 +286,13 @@ long ppc_del_hwdebug(struct task_struct *child, long data)
> }
> return ret;
> #else /* CONFIG_HAVE_HW_BREAKPOINT */
> - if (child->thread.hw_brk[data - 1].address == 0)
> + if (!(child->thread.hw_brk[data - 1].flags & HW_BRK_FLAG_DISABLED) &&
> + child->thread.hw_brk[data - 1].address == 0)
> return -ENOENT;
>
> child->thread.hw_brk[data - 1].address = 0;
> child->thread.hw_brk[data - 1].type = 0;
> + child->thread.hw_brk[data - 1].flags = 0;
> #endif /* CONFIG_HAVE_HW_BREAKPOINT */
>
> return 0;
>
^ permalink raw reply
* Re: [PATCH v6 4/8] powerpc/watchpoint: Move DAWR detection logic outside of hw_breakpoint.c
From: Rogerio Alves @ 2020-09-17 13:25 UTC (permalink / raw)
To: Ravi Bangoria, mpe, christophe.leroy
Cc: mikey, jniethe5, pedromfc, linux-kernel, paulus, rogealve,
naveen.n.rao, linuxppc-dev
In-Reply-To: <20200902042945.129369-5-ravi.bangoria@linux.ibm.com>
On 9/2/20 1:29 AM, Ravi Bangoria wrote:
> Power10 hw has multiple DAWRs but hw doesn't tell which DAWR caused
> the exception. So we have a sw logic to detect that in hw_breakpoint.c.
> But hw_breakpoint.c gets compiled only with CONFIG_HAVE_HW_BREAKPOINT=Y.
> Move DAWR detection logic outside of hw_breakpoint.c so that it can be
> reused when CONFIG_HAVE_HW_BREAKPOINT is not set.
>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Rogerio Alves <rcardoso@linux.ibm.com>
> ---
> arch/powerpc/include/asm/hw_breakpoint.h | 8 +
> arch/powerpc/kernel/Makefile | 3 +-
> arch/powerpc/kernel/hw_breakpoint.c | 159 +----------------
> .../kernel/hw_breakpoint_constraints.c | 162 ++++++++++++++++++
> 4 files changed, 174 insertions(+), 158 deletions(-)
> create mode 100644 arch/powerpc/kernel/hw_breakpoint_constraints.c
>
> diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h
> index 9b68eafebf43..81872c420476 100644
> --- a/arch/powerpc/include/asm/hw_breakpoint.h
> +++ b/arch/powerpc/include/asm/hw_breakpoint.h
> @@ -10,6 +10,7 @@
> #define _PPC_BOOK3S_64_HW_BREAKPOINT_H
>
> #include <asm/cpu_has_feature.h>
> +#include <asm/inst.h>
>
> #ifdef __KERNEL__
> struct arch_hw_breakpoint {
> @@ -52,6 +53,13 @@ static inline int nr_wp_slots(void)
> return cpu_has_feature(CPU_FTR_DAWR1) ? 2 : 1;
> }
>
> +bool wp_check_constraints(struct pt_regs *regs, struct ppc_inst instr,
> + unsigned long ea, int type, int size,
> + struct arch_hw_breakpoint *info);
> +
> +void wp_get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr,
> + int *type, int *size, unsigned long *ea);
> +
> #ifdef CONFIG_HAVE_HW_BREAKPOINT
> #include <linux/kdebug.h>
> #include <asm/reg.h>
> diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
> index cbf41fb4ee89..a5550c2b24c4 100644
> --- a/arch/powerpc/kernel/Makefile
> +++ b/arch/powerpc/kernel/Makefile
> @@ -45,7 +45,8 @@ obj-y := cputable.o syscalls.o \
> signal.o sysfs.o cacheinfo.o time.o \
> prom.o traps.o setup-common.o \
> udbg.o misc.o io.o misc_$(BITS).o \
> - of_platform.o prom_parse.o firmware.o
> + of_platform.o prom_parse.o firmware.o \
> + hw_breakpoint_constraints.o
> obj-y += ptrace/
> obj-$(CONFIG_PPC64) += setup_64.o \
> paca.o nvram_64.o note.o syscall_64.o
> diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
> index f6b24838ca3c..f4e8f21046f5 100644
> --- a/arch/powerpc/kernel/hw_breakpoint.c
> +++ b/arch/powerpc/kernel/hw_breakpoint.c
> @@ -494,161 +494,6 @@ void thread_change_pc(struct task_struct *tsk, struct pt_regs *regs)
> }
> }
>
> -static bool dar_in_user_range(unsigned long dar, struct arch_hw_breakpoint *info)
> -{
> - return ((info->address <= dar) && (dar - info->address < info->len));
> -}
> -
> -static bool ea_user_range_overlaps(unsigned long ea, int size,
> - struct arch_hw_breakpoint *info)
> -{
> - return ((ea < info->address + info->len) &&
> - (ea + size > info->address));
> -}
> -
> -static bool dar_in_hw_range(unsigned long dar, struct arch_hw_breakpoint *info)
> -{
> - unsigned long hw_start_addr, hw_end_addr;
> -
> - hw_start_addr = ALIGN_DOWN(info->address, HW_BREAKPOINT_SIZE);
> - hw_end_addr = ALIGN(info->address + info->len, HW_BREAKPOINT_SIZE);
> -
> - return ((hw_start_addr <= dar) && (hw_end_addr > dar));
> -}
> -
> -static bool ea_hw_range_overlaps(unsigned long ea, int size,
> - struct arch_hw_breakpoint *info)
> -{
> - unsigned long hw_start_addr, hw_end_addr;
> - unsigned long align_size = HW_BREAKPOINT_SIZE;
> -
> - /*
> - * On p10 predecessors, quadword is handle differently then
> - * other instructions.
> - */
> - if (!cpu_has_feature(CPU_FTR_ARCH_31) && size == 16)
> - align_size = HW_BREAKPOINT_SIZE_QUADWORD;
> -
> - hw_start_addr = ALIGN_DOWN(info->address, align_size);
> - hw_end_addr = ALIGN(info->address + info->len, align_size);
> -
> - return ((ea < hw_end_addr) && (ea + size > hw_start_addr));
> -}
> -
> -/*
> - * If hw has multiple DAWR registers, we also need to check all
> - * dawrx constraint bits to confirm this is _really_ a valid event.
> - * If type is UNKNOWN, but privilege level matches, consider it as
> - * a positive match.
> - */
> -static bool check_dawrx_constraints(struct pt_regs *regs, int type,
> - struct arch_hw_breakpoint *info)
> -{
> - if (OP_IS_LOAD(type) && !(info->type & HW_BRK_TYPE_READ))
> - return false;
> -
> - /*
> - * The Cache Management instructions other than dcbz never
> - * cause a match. i.e. if type is CACHEOP, the instruction
> - * is dcbz, and dcbz is treated as Store.
> - */
> - if ((OP_IS_STORE(type) || type == CACHEOP) && !(info->type & HW_BRK_TYPE_WRITE))
> - return false;
> -
> - if (is_kernel_addr(regs->nip) && !(info->type & HW_BRK_TYPE_KERNEL))
> - return false;
> -
> - if (user_mode(regs) && !(info->type & HW_BRK_TYPE_USER))
> - return false;
> -
> - return true;
> -}
> -
> -/*
> - * Return true if the event is valid wrt dawr configuration,
> - * including extraneous exception. Otherwise return false.
> - */
> -static bool check_constraints(struct pt_regs *regs, struct ppc_inst instr,
> - unsigned long ea, int type, int size,
> - struct arch_hw_breakpoint *info)
> -{
> - bool in_user_range = dar_in_user_range(regs->dar, info);
> - bool dawrx_constraints;
> -
> - /*
> - * 8xx supports only one breakpoint and thus we can
> - * unconditionally return true.
> - */
> - if (IS_ENABLED(CONFIG_PPC_8xx)) {
> - if (!in_user_range)
> - info->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
> - return true;
> - }
> -
> - if (unlikely(ppc_inst_equal(instr, ppc_inst(0)))) {
> - if (cpu_has_feature(CPU_FTR_ARCH_31) &&
> - !dar_in_hw_range(regs->dar, info))
> - return false;
> -
> - return true;
> - }
> -
> - dawrx_constraints = check_dawrx_constraints(regs, type, info);
> -
> - if (type == UNKNOWN) {
> - if (cpu_has_feature(CPU_FTR_ARCH_31) &&
> - !dar_in_hw_range(regs->dar, info))
> - return false;
> -
> - return dawrx_constraints;
> - }
> -
> - if (ea_user_range_overlaps(ea, size, info))
> - return dawrx_constraints;
> -
> - if (ea_hw_range_overlaps(ea, size, info)) {
> - if (dawrx_constraints) {
> - info->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
> - return true;
> - }
> - }
> - return false;
> -}
> -
> -static int cache_op_size(void)
> -{
> -#ifdef __powerpc64__
> - return ppc64_caches.l1d.block_size;
> -#else
> - return L1_CACHE_BYTES;
> -#endif
> -}
> -
> -static void get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr,
> - int *type, int *size, unsigned long *ea)
> -{
> - struct instruction_op op;
> -
> - if (__get_user_instr_inatomic(*instr, (void __user *)regs->nip))
> - return;
> -
> - analyse_instr(&op, regs, *instr);
> - *type = GETTYPE(op.type);
> - *ea = op.ea;
> -#ifdef __powerpc64__
> - if (!(regs->msr & MSR_64BIT))
> - *ea &= 0xffffffffUL;
> -#endif
> -
> - *size = GETSIZE(op.type);
> - if (*type == CACHEOP) {
> - *size = cache_op_size();
> - *ea &= ~(*size - 1);
> - } else if (*type == LOAD_VMX || *type == STORE_VMX) {
> - *ea &= ~(*size - 1);
> - }
> -}
> -
> static bool is_larx_stcx_instr(int type)
> {
> return type == LARX || type == STCX;
> @@ -732,7 +577,7 @@ int hw_breakpoint_handler(struct die_args *args)
> rcu_read_lock();
>
> if (!IS_ENABLED(CONFIG_PPC_8xx))
> - get_instr_detail(regs, &instr, &type, &size, &ea);
> + wp_get_instr_detail(regs, &instr, &type, &size, &ea);
>
> for (i = 0; i < nr_wp_slots(); i++) {
> bp[i] = __this_cpu_read(bp_per_reg[i]);
> @@ -742,7 +587,7 @@ int hw_breakpoint_handler(struct die_args *args)
> info[i] = counter_arch_bp(bp[i]);
> info[i]->type &= ~HW_BRK_TYPE_EXTRANEOUS_IRQ;
>
> - if (check_constraints(regs, instr, ea, type, size, info[i])) {
> + if (wp_check_constraints(regs, instr, ea, type, size, info[i])) {
> if (!IS_ENABLED(CONFIG_PPC_8xx) &&
> ppc_inst_equal(instr, ppc_inst(0))) {
> handler_error(bp[i], info[i]);
> diff --git a/arch/powerpc/kernel/hw_breakpoint_constraints.c b/arch/powerpc/kernel/hw_breakpoint_constraints.c
> new file mode 100644
> index 000000000000..867ee4aa026a
> --- /dev/null
> +++ b/arch/powerpc/kernel/hw_breakpoint_constraints.c
> @@ -0,0 +1,162 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +#include <linux/kernel.h>
> +#include <linux/uaccess.h>
> +#include <linux/sched.h>
> +#include <asm/hw_breakpoint.h>
> +#include <asm/sstep.h>
> +#include <asm/cache.h>
> +
> +static bool dar_in_user_range(unsigned long dar, struct arch_hw_breakpoint *info)
> +{
> + return ((info->address <= dar) && (dar - info->address < info->len));
> +}
> +
> +static bool ea_user_range_overlaps(unsigned long ea, int size,
> + struct arch_hw_breakpoint *info)
> +{
> + return ((ea < info->address + info->len) &&
> + (ea + size > info->address));
> +}
> +
> +static bool dar_in_hw_range(unsigned long dar, struct arch_hw_breakpoint *info)
> +{
> + unsigned long hw_start_addr, hw_end_addr;
> +
> + hw_start_addr = ALIGN_DOWN(info->address, HW_BREAKPOINT_SIZE);
> + hw_end_addr = ALIGN(info->address + info->len, HW_BREAKPOINT_SIZE);
> +
> + return ((hw_start_addr <= dar) && (hw_end_addr > dar));
> +}
> +
> +static bool ea_hw_range_overlaps(unsigned long ea, int size,
> + struct arch_hw_breakpoint *info)
> +{
> + unsigned long hw_start_addr, hw_end_addr;
> + unsigned long align_size = HW_BREAKPOINT_SIZE;
> +
> + /*
> + * On p10 predecessors, quadword is handle differently then
> + * other instructions.
> + */
> + if (!cpu_has_feature(CPU_FTR_ARCH_31) && size == 16)
> + align_size = HW_BREAKPOINT_SIZE_QUADWORD;
> +
> + hw_start_addr = ALIGN_DOWN(info->address, align_size);
> + hw_end_addr = ALIGN(info->address + info->len, align_size);
> +
> + return ((ea < hw_end_addr) && (ea + size > hw_start_addr));
> +}
> +
> +/*
> + * If hw has multiple DAWR registers, we also need to check all
> + * dawrx constraint bits to confirm this is _really_ a valid event.
> + * If type is UNKNOWN, but privilege level matches, consider it as
> + * a positive match.
> + */
> +static bool check_dawrx_constraints(struct pt_regs *regs, int type,
> + struct arch_hw_breakpoint *info)
> +{
> + if (OP_IS_LOAD(type) && !(info->type & HW_BRK_TYPE_READ))
> + return false;
> +
> + /*
> + * The Cache Management instructions other than dcbz never
> + * cause a match. i.e. if type is CACHEOP, the instruction
> + * is dcbz, and dcbz is treated as Store.
> + */
> + if ((OP_IS_STORE(type) || type == CACHEOP) && !(info->type & HW_BRK_TYPE_WRITE))
> + return false;
> +
> + if (is_kernel_addr(regs->nip) && !(info->type & HW_BRK_TYPE_KERNEL))
> + return false;
> +
> + if (user_mode(regs) && !(info->type & HW_BRK_TYPE_USER))
> + return false;
> +
> + return true;
> +}
> +
> +/*
> + * Return true if the event is valid wrt dawr configuration,
> + * including extraneous exception. Otherwise return false.
> + */
> +bool wp_check_constraints(struct pt_regs *regs, struct ppc_inst instr,
> + unsigned long ea, int type, int size,
> + struct arch_hw_breakpoint *info)
> +{
> + bool in_user_range = dar_in_user_range(regs->dar, info);
> + bool dawrx_constraints;
> +
> + /*
> + * 8xx supports only one breakpoint and thus we can
> + * unconditionally return true.
> + */
> + if (IS_ENABLED(CONFIG_PPC_8xx)) {
> + if (!in_user_range)
> + info->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
> + return true;
> + }
> +
> + if (unlikely(ppc_inst_equal(instr, ppc_inst(0)))) {
> + if (cpu_has_feature(CPU_FTR_ARCH_31) &&
> + !dar_in_hw_range(regs->dar, info))
> + return false;
> +
> + return true;
> + }
> +
> + dawrx_constraints = check_dawrx_constraints(regs, type, info);
> +
> + if (type == UNKNOWN) {
> + if (cpu_has_feature(CPU_FTR_ARCH_31) &&
> + !dar_in_hw_range(regs->dar, info))
> + return false;
> +
> + return dawrx_constraints;
> + }
> +
> + if (ea_user_range_overlaps(ea, size, info))
> + return dawrx_constraints;
> +
> + if (ea_hw_range_overlaps(ea, size, info)) {
> + if (dawrx_constraints) {
> + info->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
> + return true;
> + }
> + }
> + return false;
> +}
> +
> +static int cache_op_size(void)
> +{
> +#ifdef __powerpc64__
> + return ppc64_caches.l1d.block_size;
> +#else
> + return L1_CACHE_BYTES;
> +#endif
> +}
> +
> +void wp_get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr,
> + int *type, int *size, unsigned long *ea)
> +{
> + struct instruction_op op;
> +
> + if (__get_user_instr_inatomic(*instr, (void __user *)regs->nip))
> + return;
> +
> + analyse_instr(&op, regs, *instr);
> + *type = GETTYPE(op.type);
> + *ea = op.ea;
> +#ifdef __powerpc64__
> + if (!(regs->msr & MSR_64BIT))
> + *ea &= 0xffffffffUL;
> +#endif
> +
> + *size = GETSIZE(op.type);
> + if (*type == CACHEOP) {
> + *size = cache_op_size();
> + *ea &= ~(*size - 1);
> + } else if (*type == LOAD_VMX || *type == STORE_VMX) {
> + *ea &= ~(*size - 1);
> + }
> +}
>
^ permalink raw reply
* Re: [PATCH v6 3/8] powerpc/watchpoint/ptrace: Fix SETHWDEBUG when CONFIG_HAVE_HW_BREAKPOINT=N
From: Rogerio Alves @ 2020-09-17 13:25 UTC (permalink / raw)
To: Ravi Bangoria, mpe, christophe.leroy
Cc: mikey, jniethe5, pedromfc, linux-kernel, paulus, rogealve,
naveen.n.rao, linuxppc-dev
In-Reply-To: <20200902042945.129369-4-ravi.bangoria@linux.ibm.com>
On 9/2/20 1:29 AM, Ravi Bangoria wrote:
> When kernel is compiled with CONFIG_HAVE_HW_BREAKPOINT=N, user can
> still create watchpoint using PPC_PTRACE_SETHWDEBUG, with limited
> functionalities. But, such watchpoints are never firing because of
> the missing privilege settings. Fix that.
>
> It's safe to set HW_BRK_TYPE_PRIV_ALL because we don't really leak
> any kernel address in signal info. Setting HW_BRK_TYPE_PRIV_ALL will
> also help to find scenarios when kernel accesses user memory.
>
> Reported-by: Pedro Miraglia Franco de Carvalho <pedromfc@linux.ibm.com>
> Suggested-by: Pedro Miraglia Franco de Carvalho <pedromfc@linux.ibm.com>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Rogerio Alves <rcardoso@linux.ibm.com>
> ---
> arch/powerpc/kernel/ptrace/ptrace-noadv.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
> index 697c7e4b5877..57a0ab822334 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
> @@ -217,7 +217,7 @@ long ppc_set_hwdebug(struct task_struct *child, struct ppc_hw_breakpoint *bp_inf
> return -EIO;
>
> brk.address = ALIGN_DOWN(bp_info->addr, HW_BREAKPOINT_SIZE);
> - brk.type = HW_BRK_TYPE_TRANSLATE;
> + brk.type = HW_BRK_TYPE_TRANSLATE | HW_BRK_TYPE_PRIV_ALL;
> brk.len = DABR_MAX_LEN;
> if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_READ)
> brk.type |= HW_BRK_TYPE_READ;
>
^ permalink raw reply
* Re: [PATCH v6 2/8] powerpc/watchpoint: Fix handling of vector instructions
From: Rogerio Alves @ 2020-09-17 13:25 UTC (permalink / raw)
To: Ravi Bangoria, mpe, christophe.leroy
Cc: mikey, jniethe5, pedromfc, linux-kernel, paulus, rogealve,
naveen.n.rao, linuxppc-dev
In-Reply-To: <20200902042945.129369-3-ravi.bangoria@linux.ibm.com>
On 9/2/20 1:29 AM, Ravi Bangoria wrote:
> Vector load/store instructions are special because they are always
> aligned. Thus unaligned EA needs to be aligned down before comparing
> it with watch ranges. Otherwise we might consider valid event as
> invalid.
>
> Fixes: 74c6881019b7 ("powerpc/watchpoint: Prepare handler to handle more than one watchpoint")
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Rogerio Alves <rcardoso@linux.ibm.com>
> ---
> arch/powerpc/kernel/hw_breakpoint.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
> index 9f7df1c37233..f6b24838ca3c 100644
> --- a/arch/powerpc/kernel/hw_breakpoint.c
> +++ b/arch/powerpc/kernel/hw_breakpoint.c
> @@ -644,6 +644,8 @@ static void get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr,
> if (*type == CACHEOP) {
> *size = cache_op_size();
> *ea &= ~(*size - 1);
> + } else if (*type == LOAD_VMX || *type == STORE_VMX) {
> + *ea &= ~(*size - 1);
> }
> }
>
>
^ permalink raw reply
* Re: [PATCH v6 1/8] powerpc/watchpoint: Fix quarword instruction handling on p10 predecessors
From: Rogerio Alves @ 2020-09-17 13:25 UTC (permalink / raw)
To: Ravi Bangoria, mpe, christophe.leroy
Cc: mikey, jniethe5, pedromfc, linux-kernel, paulus, rogealve,
naveen.n.rao, linuxppc-dev
In-Reply-To: <20200902042945.129369-2-ravi.bangoria@linux.ibm.com>
On 9/2/20 1:29 AM, Ravi Bangoria wrote:
> On p10 predecessors, watchpoint with quarword access is compared at
> quardword length. If the watch range is doubleword or less than that
> in a first half of quarword aligned 16 bytes, and if there is any
> unaligned quadword access which will access only the 2nd half, the
> handler should consider it as extraneous and emulate/single-step it
> before continuing.
>
> Reported-by: Pedro Miraglia Franco de Carvalho <pedromfc@linux.ibm.com>
> Fixes: 74c6881019b7 ("powerpc/watchpoint: Prepare handler to handle more than one watchpoint")
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Rogerio Alves <rcardoso@linux.ibm.com>
> ---
> arch/powerpc/include/asm/hw_breakpoint.h | 1 +
> arch/powerpc/kernel/hw_breakpoint.c | 12 ++++++++++--
> 2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h
> index db206a7f38e2..9b68eafebf43 100644
> --- a/arch/powerpc/include/asm/hw_breakpoint.h
> +++ b/arch/powerpc/include/asm/hw_breakpoint.h
> @@ -42,6 +42,7 @@ struct arch_hw_breakpoint {
> #else
> #define HW_BREAKPOINT_SIZE 0x8
> #endif
> +#define HW_BREAKPOINT_SIZE_QUADWORD 0x10
>
> #define DABR_MAX_LEN 8
> #define DAWR_MAX_LEN 512
> diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
> index 1f4a1efa0074..9f7df1c37233 100644
> --- a/arch/powerpc/kernel/hw_breakpoint.c
> +++ b/arch/powerpc/kernel/hw_breakpoint.c
> @@ -520,9 +520,17 @@ static bool ea_hw_range_overlaps(unsigned long ea, int size,
> struct arch_hw_breakpoint *info)
> {
> unsigned long hw_start_addr, hw_end_addr;
> + unsigned long align_size = HW_BREAKPOINT_SIZE;
>
> - hw_start_addr = ALIGN_DOWN(info->address, HW_BREAKPOINT_SIZE);
> - hw_end_addr = ALIGN(info->address + info->len, HW_BREAKPOINT_SIZE);
> + /*
> + * On p10 predecessors, quadword is handle differently then
> + * other instructions.
> + */
> + if (!cpu_has_feature(CPU_FTR_ARCH_31) && size == 16)
> + align_size = HW_BREAKPOINT_SIZE_QUADWORD;
> +
> + hw_start_addr = ALIGN_DOWN(info->address, align_size);
> + hw_end_addr = ALIGN(info->address + info->len, align_size);
>
> return ((ea < hw_end_addr) && (ea + size > hw_start_addr));
> }
>
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox