LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH kernel 2/2] powerpc/pseries/dma: Enable swiotlb
From: Thiago Jung Bauermann @ 2019-05-10 22:41 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Alistair Popple, linuxppc-dev, David Gibson
In-Reply-To: <20190507062559.20295-3-aik@ozlabs.ru>


Hello Alexey,

Thanks!

I have similar changes in my "Secure Virtual Machine Enablement"
patches, which I am currently preparing for posting again real soon now.

This is the last version:

https://lore.kernel.org/linuxppc-dev/20180824162535.22798-1-bauerman@linux.ibm.com/

Alexey Kardashevskiy <aik@ozlabs.ru> writes:

> So far the pseries platforms has always been using IOMMU making SWIOTLB
> unnecessary. Now we want secure guests which means devices can only
> access certain areas of guest physical memory; we are going to use
> SWIOTLB for this purpose.
>
> This allows SWIOTLB for pseries. By default there is no change in behavior.
>
> This enables SWIOTLB when the "swiotlb" kernel parameter is set to "force".
>
> With the SWIOTLB enabled, the kernel creates a directly mapped DMA window
> (using the usual DDW mechanism) and implements SWIOTLB on top of that.
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>  arch/powerpc/platforms/pseries/setup.c | 5 +++++
>  arch/powerpc/platforms/pseries/Kconfig | 1 +
>  2 files changed, 6 insertions(+)
>
> diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
> index e4f0dfd4ae33..30d72b587ac5 100644
> --- a/arch/powerpc/platforms/pseries/setup.c
> +++ b/arch/powerpc/platforms/pseries/setup.c
> @@ -42,6 +42,7 @@
>  #include <linux/of.h>
>  #include <linux/of_pci.h>
>  #include <linux/memblock.h>
> +#include <linux/swiotlb.h>
>
>  #include <asm/mmu.h>
>  #include <asm/processor.h>
> @@ -71,6 +72,7 @@
>  #include <asm/isa-bridge.h>
>  #include <asm/security_features.h>
>  #include <asm/asm-const.h>
> +#include <asm/swiotlb.h>
>
>  #include "pseries.h"
>  #include "../../../../drivers/pci/pci.h"
> @@ -797,6 +799,9 @@ static void __init pSeries_setup_arch(void)
>  	}
>
>  	ppc_md.pcibios_root_bridge_prepare = pseries_root_bridge_prepare;
> +
> +	if (swiotlb_force == SWIOTLB_FORCE)
> +		ppc_swiotlb_enable = 1;
>  }

Yep! I have this here, enabled when booting as a secure guest:

https://lore.kernel.org/linuxppc-dev/20180824162535.22798-6-bauerman@linux.ibm.com/

And also another patch which makes it so that if booting as a secure
guest it acts as if the swiotlb kernel parameter was set to force:

https://lore.kernel.org/linuxppc-dev/20180824162535.22798-11-bauerman@linux.ibm.com/

>  static void pseries_panic(char *str)
> diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
> index 9c6b3d860518..b9e8b608de01 100644
> --- a/arch/powerpc/platforms/pseries/Kconfig
> +++ b/arch/powerpc/platforms/pseries/Kconfig
> @@ -23,6 +23,7 @@ config PPC_PSERIES
>  	select ARCH_RANDOM
>  	select PPC_DOORBELL
>  	select FORCE_SMP
> +	select SWIOTLB
>  	default y
>
>  config PPC_SPLPAR

I put this in a PPC_SVM config option:

https://lore.kernel.org/linuxppc-dev/20180824162535.22798-3-bauerman@linux.ibm.com/

--
Thiago Jung Bauermann
IBM Linux Technology Center


^ permalink raw reply

* Re: [PATCH, RFC] byteorder: sanity check toolchain vs kernel endianess
From: Arnd Bergmann @ 2019-05-11  0:51 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: linux-arch, linuxppc-dev, Linux Kernel Mailing List,
	Nick Kossifidis, Andrew Morton, Linus Torvalds, Christoph Hellwig
In-Reply-To: <CACT4Y+aKGKm9Wbc1owBr51adkbesHP_Z81pBAoZ5HmJ+uZdsaw@mail.gmail.com>

On Fri, May 10, 2019 at 6:53 AM Dmitry Vyukov <dvyukov@google.com> wrote:
> >
> > I think it's good to have a sanity check in-place for consistency.
>
>
> Hi,
>
> This broke our cross-builds from x86. I am using:
>
> $ powerpc64le-linux-gnu-gcc --version
> powerpc64le-linux-gnu-gcc (Debian 7.2.0-7) 7.2.0
>
> and it says that it's little-endian somehow:
>
> $ powerpc64le-linux-gnu-gcc -dM -E - < /dev/null | grep BYTE_ORDER
> #define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__
>
> Is it broke compiler? Or I always hold it wrong? Is there some
> additional flag I need to add?

It looks like a bug in the kernel Makefiles to me. powerpc32 is always
big-endian,
powerpc64 used to be big-endian but is now usually little-endian. There are
often three separate toolchains that default to the respective user
space targets
(ppc32be, ppc64be, ppc64le), but generally you should be able to build
any of the
three kernel configurations with any of those compilers, and have the Makefile
pass the correct -m32/-m64/-mbig-endian/-mlittle-endian command line options
depending on the kernel configuration. It seems that this is not happening
here. I have not checked why, but if this is the problem, it should be
easy enough
to figure out.

       Arnd

^ permalink raw reply

* [PATCH 1/2] perf ioctl: Add check for the sample_period value
From: Ravi Bangoria @ 2019-05-11  2:42 UTC (permalink / raw)
  To: peterz, jolsa, mpe, maddy; +Cc: Ravi Bangoria, linuxppc-dev, linux-kernel, acme

Add a check for sample_period value sent from userspace. Negative
value does not make sense. And in powerpc arch code this could cause
a recursive PMI leading to a hang (reported when running perf-fuzzer).

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 kernel/events/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index abbd4b3b96c2..e44c90378940 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5005,6 +5005,9 @@ static int perf_event_period(struct perf_event *event, u64 __user *arg)
 	if (perf_event_check_period(event, value))
 		return -EINVAL;
 
+	if (!event->attr.freq && (value & (1ULL << 63)))
+		return -EINVAL;
+
 	event_function_call(event, __perf_event_period, &value);
 
 	return 0;
-- 
2.20.1


^ permalink raw reply related

* [PATCH 2/2] powerpc/perf: Fix mmcra corruption by bhrb_filter
From: Ravi Bangoria @ 2019-05-11  2:42 UTC (permalink / raw)
  To: peterz, jolsa, mpe, maddy; +Cc: Ravi Bangoria, linuxppc-dev, linux-kernel, acme
In-Reply-To: <20190511024217.4013-1-ravi.bangoria@linux.ibm.com>

Consider a scenario where user creates two events:

  1st event:
    attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
    attr.branch_sample_type = PERF_SAMPLE_BRANCH_ANY;
    fd = perf_event_open(attr, 0, 1, -1, 0);

  This sets cpuhw->bhrb_filter to 0 and returns valid fd.

  2nd event:
    attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
    attr.branch_sample_type = PERF_SAMPLE_BRANCH_CALL;
    fd = perf_event_open(attr, 0, 1, -1, 0);

  It overrides cpuhw->bhrb_filter to -1 and returns with error.

Now if power_pmu_enable() gets called by any path other than
power_pmu_add(), ppmu->config_bhrb(-1) will set mmcra to -1.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 arch/powerpc/perf/core-book3s.c | 6 ++++--
 arch/powerpc/perf/power8-pmu.c  | 3 +++
 arch/powerpc/perf/power9-pmu.c  | 3 +++
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index b0723002a396..8eb5dc5df62b 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1846,6 +1846,7 @@ static int power_pmu_event_init(struct perf_event *event)
 	int n;
 	int err;
 	struct cpu_hw_events *cpuhw;
+	u64 bhrb_filter;
 
 	if (!ppmu)
 		return -ENOENT;
@@ -1951,13 +1952,14 @@ static int power_pmu_event_init(struct perf_event *event)
 	err = power_check_constraints(cpuhw, events, cflags, n + 1);
 
 	if (has_branch_stack(event)) {
-		cpuhw->bhrb_filter = ppmu->bhrb_filter_map(
+		bhrb_filter = ppmu->bhrb_filter_map(
 					event->attr.branch_sample_type);
 
-		if (cpuhw->bhrb_filter == -1) {
+		if (bhrb_filter == -1) {
 			put_cpu_var(cpu_hw_events);
 			return -EOPNOTSUPP;
 		}
+		cpuhw->bhrb_filter = bhrb_filter;
 	}
 
 	put_cpu_var(cpu_hw_events);
diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
index d12a2db26353..d10feef93b6b 100644
--- a/arch/powerpc/perf/power8-pmu.c
+++ b/arch/powerpc/perf/power8-pmu.c
@@ -29,6 +29,7 @@ enum {
 #define	POWER8_MMCRA_IFM1		0x0000000040000000UL
 #define	POWER8_MMCRA_IFM2		0x0000000080000000UL
 #define	POWER8_MMCRA_IFM3		0x00000000C0000000UL
+#define	POWER8_MMCRA_BHRB_MASK		0x00000000C0000000UL
 
 /*
  * Raw event encoding for PowerISA v2.07 (Power8):
@@ -243,6 +244,8 @@ static u64 power8_bhrb_filter_map(u64 branch_sample_type)
 
 static void power8_config_bhrb(u64 pmu_bhrb_filter)
 {
+	pmu_bhrb_filter &= POWER8_MMCRA_BHRB_MASK;
+
 	/* Enable BHRB filter in PMU */
 	mtspr(SPRN_MMCRA, (mfspr(SPRN_MMCRA) | pmu_bhrb_filter));
 }
diff --git a/arch/powerpc/perf/power9-pmu.c b/arch/powerpc/perf/power9-pmu.c
index 030544e35959..f3987915cadc 100644
--- a/arch/powerpc/perf/power9-pmu.c
+++ b/arch/powerpc/perf/power9-pmu.c
@@ -92,6 +92,7 @@ enum {
 #define POWER9_MMCRA_IFM1		0x0000000040000000UL
 #define POWER9_MMCRA_IFM2		0x0000000080000000UL
 #define POWER9_MMCRA_IFM3		0x00000000C0000000UL
+#define POWER9_MMCRA_BHRB_MASK		0x00000000C0000000UL
 
 /* Nasty Power9 specific hack */
 #define PVR_POWER9_CUMULUS		0x00002000
@@ -300,6 +301,8 @@ static u64 power9_bhrb_filter_map(u64 branch_sample_type)
 
 static void power9_config_bhrb(u64 pmu_bhrb_filter)
 {
+	pmu_bhrb_filter &= POWER9_MMCRA_BHRB_MASK;
+
 	/* Enable BHRB filter in PMU */
 	mtspr(SPRN_MMCRA, (mfspr(SPRN_MMCRA) | pmu_bhrb_filter));
 }
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH 2/2] powerpc/perf: Fix mmcra corruption by bhrb_filter
From: Ravi Bangoria @ 2019-05-11  2:47 UTC (permalink / raw)
  To: peterz, jolsa, mpe, maddy; +Cc: Ravi Bangoria, linuxppc-dev, linux-kernel, acme
In-Reply-To: <20190511024217.4013-2-ravi.bangoria@linux.ibm.com>



On 5/11/19 8:12 AM, Ravi Bangoria wrote:
> Consider a scenario where user creates two events:
> 
>   1st event:
>     attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
>     attr.branch_sample_type = PERF_SAMPLE_BRANCH_ANY;
>     fd = perf_event_open(attr, 0, 1, -1, 0);
> 
>   This sets cpuhw->bhrb_filter to 0 and returns valid fd.
> 
>   2nd event:
>     attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
>     attr.branch_sample_type = PERF_SAMPLE_BRANCH_CALL;
>     fd = perf_event_open(attr, 0, 1, -1, 0);
> 
>   It overrides cpuhw->bhrb_filter to -1 and returns with error.
> 
> Now if power_pmu_enable() gets called by any path other than
> power_pmu_add(), ppmu->config_bhrb(-1) will set mmcra to -1.
> 
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>

Fixes: 3925f46bb590 ("powerpc/perf: Enable branch stack sampling framework")


^ permalink raw reply

* [Bug 203571] New: may_use_simd() returns false in kworkers
From: bugzilla-daemon @ 2019-05-11  5:01 UTC (permalink / raw)
  To: linuxppc-dev

https://bugzilla.kernel.org/show_bug.cgi?id=203571

            Bug ID: 203571
           Summary: may_use_simd() returns false in kworkers
           Product: Platform Specific/Hardware
           Version: 2.5
    Kernel Version: 4.19
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: PPC-64
          Assignee: platform_ppc-64@kernel-bugs.osdl.org
          Reporter: slandden@gmail.com
        Regression: No

This patch optimizing chacha20 for WireGuard
https://github.com/shawnl/WireGuard/commit/3e02fce92a14cba7b7d1e2733def3a51bec97498

doesn't work because may_use_simd() is always returning false in kworkers. If I
remove the check everything seems to work fine.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply

* [Bug 203571] may_use_simd() returns false in soft irq context
From: bugzilla-daemon @ 2019-05-11 17:27 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <bug-203571-206035@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=203571

--- Comment #1 from Shawn Landden (slandden@gmail.com) ---
                                                      ▒
-   11.41%     0.01%  kworker/1:2-mm_  [kernel.vmlinux]            [k]
process_one_work                                                               
         ▒
   - 11.41% process_one_work                                                   
                                                                               
▒
      - 10.28% wg_packet_tx_worker                                             
                                                                               
▒
         - 10.06% wg_socket_send_skb_to_peer                                   
                                                                               
▒
            - 7.38% _raw_read_unlock_bh                                        
                                                                               
▒
               - 7.35% __local_bh_enable_ip                                    
                                                                               
▒
                  - 7.34% do_softirq.part.2                                    
                                                                               
▒
                     - 7.31% do_softirq_own_stack                              
                                                                               
▒
                        + 7.28% call_do_softirq                                
                                                                               
▒
            - 2.45% send4                                                      
                                                                               
▒
               - 1.85% udp_tunnel_xmit_skb                                     
                                                                               
▒
                  - 1.66% iptunnel_xmit                                        
                                                                               
▒
                     - 1.33% ip_local_out                                      
                                                                               
▒
                        - 1.14% ip_output                                      
                                                                               
▒
                           - 1.06% ip_finish_output2                           
                                                                               
▒
                              + 0.99% __dev_queue_xmit                         
                                                                               
▒
      - 1.06% wg_packet_encrypt_worker                                         
                                                                               
▒
         - 1.03% encrypt_packet                                                
                                                                               
▒
            - 0.98% chacha20poly1305_encrypt_sg                                
                                                                               
▒
                 0.59% chacha20

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply

* [powerpc] Kernel OOPS running kselftest (sigfuz)
From: Sachin Sant @ 2019-05-12 17:00 UTC (permalink / raw)
  To: linuxppc-dev

While running kselftests(sigfuz) against 5.1.0 kernel (sha: 1fb3b526df)
ran into following OOPS.

Last good run was against Commit 8823880561. This is PowerNV environment.

(15/17) avocado-misc-tests/kernel/kselftest.py:kselftest.test:  
[  732.811553] sigfuz[28111]: illegal instruction (4) at 10001884 nip 10001884 lr 10001854 code 1 in sigfuz[10000000+10000]
[  732.811590] sigfuz[28111]: code: e8410018 3d206666 61296667 7d234896 7c6afe70 7d291e70 7d2a4850 552a103a 
[  732.811599] sigfuz[28111]: code: 55292036 7d2a4a14 7f834800 409e0008 <7c00055d> 38210020 e8010010 7c0803a6 
[  732.811676] Bad kernel stack pointer 7fffb949e670 at c000000000071e7c
[  732.811688] Oops: Bad kernel stack pointer, sig: 6 [#1]
[  732.811695] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
[  732.811746] Dumping ftrace buffer:
[  732.811925]    (ftrace buffer empty)
[  732.811934] Modules linked in: xt_CHECKSUM xt_MASQUERADE tun bridge stp llc ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter kvm_hv kvm i2c_dev sunrpc dm_mirror dm_region_hash dm_log dm_mod ses enclosure scsi_transport_sas sg ipmi_powernv ipmi_devintf ipmi_msghandler uio_pdrv_genirq leds_powernv uio ibmpowernv opal_prd powernv_op_panel ip_tables ext4 mbcache jbd2 sd_mod ipr libata tg3 ptp pps_core [last unloaded: kretprobe_example]
[  732.812035] CPU: 118 PID: 28112 Comm: sigfuz Tainted: G           O      5.1.0-autotest-autotest #1
[  732.812040] Bad kernel stack pointer 7fffbacce670 at c000000000071e7c
[  732.812043] NIP:  c000000000071e7c LR: 0000000024c8028d CTR: 000000001b01d9b1
[  732.812046] REGS: c0002007ff283d30 TRAP: 1500   Tainted: G           O       (5.1.0-autotest-autotest)
[  732.812063] MSR:  9000000100001031 <SF,HV,ME,IR,DR,LE,TM[E]>  CR: 52a9219f  XER: 4003a27e
[  732.812076] CFAR: c000000000071b70 IRQMASK: c000200710e42f00 
[  732.812076] GPR00: 00000000000000fa 00007fffb949e670 00007fffbaef7f00 0000000000000000 
[  732.812076] GPR04: 0000000000006dd0 000000000000000a 00000000fa7ecd56 00007fffbaeb00e4 
[  732.812076] GPR08: 0000000000006dd0 0000000000000000 0000000000000000 0000000000000000 
[  732.812076] GPR12: 0000000000000000 00007fffb94a6900 0000000000000000 0000000000000000 
[  732.812076] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[  732.812076] GPR20: 0000000000000000 0000000010003408 00000000100033f0 00000000100033e0 
[  732.812076] GPR24: 00000000100033b8 0000000010003390 0000000010003370 0000000010020164 
[  732.812076] GPR28: 00007fffbaef4428 00000000003d0f00 0000000010020164 000000005cd5df78 
[  732.812124] sigfuz[28118]: illegal instruction (4) at 100018a0 nip 100018a0 lr 10001828 code 1 in sigfuz[10000000+10000]
[  732.812132] NIP [c000000000071e7c] restore_gprs+0x100/0x198
[  732.812136] sigfuz[28118]: code: 552a103a 55292036 7d2a4a14 7f834800 409e0008 7c00055d 38210020 e8010010 
[  732.812141] LR [0000000024c8028d] 0x24c8028d
[  732.812142] Call Trace:
[  732.812150] sigfuz[28118]: code: 7c0803a6 4e800020 60000000 60420000 <7c00051d> 41820008 4bfff679 e8410018 
[  732.812153] Instruction dump:
[  732.812168] e8c700a0 e8a70098 f8a1fff8 e8a70078 f8a1fff0 e8e700a8 38a00000 7ca10164 
[  732.812178] 7c314ba6 e8a1fff8 e821fff0 7c0007dd <7c421378> 7db04aa6 7c314aa6 38800002 
[  732.812189] ---[ end trace 34d7d6321d88e03e ]---
[  732.812311] sigfuz[28119]: illegal instruction (4) at 100018a0 nip 100018a0 lr 10001828 code 1 in sigfuz[10000000+10000]
[  732.812325] sigfuz[28119]: code: 552a103a 55292036 7d2a4a14 7f834800 409e0008 7c00055d 38210020 e8010010 
[  732.812335] sigfuz[28119]: code: 7c0803a6 4e800020 60000000 60420000 <7c00051d> 41820008 4bfff679 e8410018 
[  732.812492] 
[  732.812496] Oops: Bad kernel stack pointer, sig: 6 [#2]
[  732.812503] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
[  732.812522] Dumping ftrace buffer:
[  732.812560]    (ftrace buffer empty)

Thanks
-Sachin

^ permalink raw reply

* [PATCH RESEND] powerpc: add simd.h implementation specific to PowerPC
From: Shawn Landden @ 2019-05-13  0:51 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Shawn Landden, Paul Mackerras, linux-kernel
In-Reply-To: <20190512165032.19942-1-shawn@git.icu>

It is safe to do SIMD in an interrupt on PowerPC.
Only disable when there is no SIMD available
(and this is a static branch).

Tested and works with the WireGuard (Zinc) patch I wrote that needs this.
Also improves performance of the crypto subsystem that checks this.

Re-sending because this linuxppc-dev didn't seem to accept it.

Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=203571
Signed-off-by: Shawn Landden <shawn@git.icu>
---
 arch/powerpc/include/asm/simd.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)
 create mode 100644 arch/powerpc/include/asm/simd.h

diff --git a/arch/powerpc/include/asm/simd.h b/arch/powerpc/include/asm/simd.h
new file mode 100644
index 000000000..b3fecb95a
--- /dev/null
+++ b/arch/powerpc/include/asm/simd.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+
+#include <asm/cpu_has_feature.h>
+
+/*
+ * may_use_simd - whether it is allowable at this time to issue SIMD
+ *                instructions or access the SIMD register file
+ *
+ * As documented in Chapter 6.2.1 Machine Status Save/Restore Registers
+ * of Power ISA (2.07 and 3.0), all registers are saved/restored in an interrupt.
+ */
+static inline bool may_use_simd(void)
+{
+	return !cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE);
+}
-- 
2.21.0.1020.gf2820cf01a


^ permalink raw reply related

* Re: [PATCH] crypto: vmx - fix copy-paste error in CTR mode
From: Herbert Xu @ 2019-05-13  0:59 UTC (permalink / raw)
  To: Eric Biggers
  Cc: leo.barbosa, nayna, Stephan Mueller, Nayna, omosnacek,
	marcelo.cerri, pfsmorigo, linux-crypto, leitao, George Wilson,
	linuxppc-dev, Daniel Axtens
In-Reply-To: <20190506155315.GA661@sol.localdomain>

On Mon, May 06, 2019 at 08:53:17AM -0700, Eric Biggers wrote:
>
> Any progress on this?  Someone just reported this again here:
> https://bugzilla.kernel.org/show_bug.cgi?id=203515

Guys if I don't get a fix for this soon I'll have to disable CTR
in vmx.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* [PATCH] powerpc: add simd.h implementation specific to PowerPC
From: Shawn Landden @ 2019-05-12 16:50 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Shawn Landden, linuxppc-dev, linux-kernel

It is safe to do SIMD in an interrupt on PowerPC.
Only disable when there is no SIMD available
(and this is a static branch).

Tested and works with the WireGuard (Zinc) patch I wrote that needs this.
Also improves performance of the crypto subsystem that checks this.

Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=203571
Signed-off-by: Shawn Landden <shawn@git.icu>
---
 arch/powerpc/include/asm/simd.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)
 create mode 100644 arch/powerpc/include/asm/simd.h

diff --git a/arch/powerpc/include/asm/simd.h b/arch/powerpc/include/asm/simd.h
new file mode 100644
index 000000000..b3fecb95a
--- /dev/null
+++ b/arch/powerpc/include/asm/simd.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+
+#include <asm/cpu_has_feature.h>
+
+/*
+ * may_use_simd - whether it is allowable at this time to issue SIMD
+ *                instructions or access the SIMD register file
+ *
+ * As documented in Chapter 6.2.1 Machine Status Save/Restore Registers
+ * of Power ISA (2.07 and 3.0), all registers are saved/restored in an interrupt.
+ */
+static inline bool may_use_simd(void)
+{
+	return !cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE);
+}
-- 
2.21.0.1020.gf2820cf01a


^ permalink raw reply related

* Re: [RFC PATCH] powerpc/64/ftrace: mprofile-kernel patch out mflr
From: Michael Ellerman @ 2019-05-13  3:25 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20190413015940.31170-1-npiggin@gmail.com>

Nicholas Piggin <npiggin@gmail.com> writes:
> The new mprofile-kernel mcount sequence is
>
>   mflr	r0
>   bl	_mcount
>
> Dynamic ftrace patches the branch instruction with a noop, but leaves
> the mflr. mflr is executed by the branch unit that can only execute one
> per cycle on POWER9 and shared with branches, so it would be nice to
> avoid it where possible.
>
> This patch is a hacky proof of concept to nop out the mflr. Can we do
> this or are there races or other issues with it?

There's a race, isn't there?

We have a function foo which currently has tracing disabled, so the mflr
and bl are nop'ed out.

  CPU 0			CPU 1
  ==================================
  bl foo
  nop (ie. not mflr)
  -> interrupt
  something else	enable tracing for foo
  ...			patch mflr and branch
  <- rfi
  bl _mcount

So we end up in _mcount() but with r0 not populated.

Or else what am I missing?

cheers


> diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
> index 52ee24fd353f..ecc75baef23e 100644
> --- a/arch/powerpc/kernel/trace/ftrace.c
> +++ b/arch/powerpc/kernel/trace/ftrace.c
> @@ -172,6 +172,19 @@ __ftrace_make_nop(struct module *mod,
>  		pr_err("Unexpected instruction %08x around bl _mcount\n", op);
>  		return -EINVAL;
>  	}
> +
> +	if (patch_instruction((unsigned int *)ip, pop)) {
> +		pr_err("Patching NOP failed.\n");
> +		return -EPERM;
> +	}
> +
> +	if (op == PPC_INST_MFLR) {
> +		if (patch_instruction((unsigned int *)(ip - 4), pop)) {
> +			pr_err("Patching NOP failed.\n");
> +			return -EPERM;
> +		}
> +	}
> +
>  #else
>  	/*
>  	 * Our original call site looks like:
> @@ -202,13 +215,14 @@ __ftrace_make_nop(struct module *mod,
>  		pr_err("Expected %08x found %08x\n", PPC_INST_LD_TOC, op);
>  		return -EINVAL;
>  	}
> -#endif /* CONFIG_MPROFILE_KERNEL */
>  
>  	if (patch_instruction((unsigned int *)ip, pop)) {
>  		pr_err("Patching NOP failed.\n");
>  		return -EPERM;
>  	}
>  
> +#endif /* CONFIG_MPROFILE_KERNEL */
> +
>  	return 0;
>  }
>  
> @@ -421,6 +435,20 @@ static int __ftrace_make_nop_kernel(struct dyn_ftrace *rec, unsigned long addr)
>  		return -EPERM;
>  	}
>  
> +#ifdef CONFIG_MPROFILE_KERNEL
> +	if (probe_kernel_read(&op, (void *)(ip - 4), 4)) {
> +		pr_err("Fetching instruction at %lx failed.\n", ip - 4);
> +		return -EFAULT;
> +	}
> +
> +	if (op == PPC_INST_MFLR) {
> +		if (patch_instruction((unsigned int *)(ip - 4), PPC_INST_NOP)) {
> +			pr_err("Patching NOP failed.\n");
> +			return -EPERM;
> +		}
> +	}
> +#endif
> +
>  	return 0;
>  }
>  
> @@ -437,9 +465,20 @@ int ftrace_make_nop(struct module *mod,
>  	 */
>  	if (test_24bit_addr(ip, addr)) {
>  		/* within range */
> +		int rc;
> +
>  		old = ftrace_call_replace(ip, addr, 1);
>  		new = PPC_INST_NOP;
> -		return ftrace_modify_code(ip, old, new);
> +		rc = ftrace_modify_code(ip, old, new);
> +		if (rc)
> +			return rc;
> +#ifdef CONFIG_MPROFILE_KERNEL
> +		old = PPC_INST_MFLR;
> +		new = PPC_INST_NOP;
> +		ftrace_modify_code(ip - 4, old, new);
> +		/* old mprofile kernel will error because no mflr */
> +#endif
> +		return rc;
>  	} else if (core_kernel_text(ip))
>  		return __ftrace_make_nop_kernel(rec, addr);
>  
> @@ -562,6 +601,20 @@ __ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
>  		return -EINVAL;
>  	}
>  
> +#ifdef CONFIG_MPROFILE_KERNEL
> +	if (probe_kernel_read(op, (ip - 4), 4)) {
> +		pr_err("Fetching instruction at %lx failed.\n", (unsigned long)(ip - 4));
> +		return -EFAULT;
> +	}
> +
> +	if (op[0] == PPC_INST_NOP) {
> +		if (patch_instruction((ip - 4), PPC_INST_MFLR)) {
> +			pr_err("Patching mflr failed.\n");
> +			return -EINVAL;
> +		}
> +	}
> +#endif
> +
>  	if (patch_branch(ip, tramp, BRANCH_SET_LINK)) {
>  		pr_err("REL24 out of range!\n");
>  		return -EINVAL;
> @@ -650,6 +703,20 @@ static int __ftrace_make_call_kernel(struct dyn_ftrace *rec, unsigned long addr)
>  		return -EINVAL;
>  	}
>  
> +#ifdef CONFIG_MPROFILE_KERNEL
> +	if (probe_kernel_read(&op, (ip - 4), 4)) {
> +		pr_err("Fetching instruction at %lx failed.\n", (unsigned long)(ip - 4));
> +		return -EFAULT;
> +	}
> +
> +	if (op == PPC_INST_NOP) {
> +		if (patch_instruction((ip - 4), PPC_INST_MFLR)) {
> +			pr_err("Patching mflr failed.\n");
> +			return -EINVAL;
> +		}
> +	}
> +#endif
> +
>  	if (patch_branch(ip, tramp, BRANCH_SET_LINK)) {
>  		pr_err("Error patching branch to ftrace tramp!\n");
>  		return -EINVAL;
> @@ -670,6 +737,12 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
>  	 */
>  	if (test_24bit_addr(ip, addr)) {
>  		/* within range */
> +#ifdef CONFIG_MPROFILE_KERNEL
> +		old = PPC_INST_NOP;
> +		new = PPC_INST_MFLR;
> +		ftrace_modify_code(ip - 4, old, new);
> +		/* old mprofile-kernel sequence will error because no nop */
> +#endif
>  		old = PPC_INST_NOP;
>  		new = ftrace_call_replace(ip, addr, 1);
>  		return ftrace_modify_code(ip, old, new);
> -- 
> 2.20.1

^ permalink raw reply

* [PATCH] powerpc: Document xive=off option
From: Michael Neuling @ 2019-05-13  5:39 UTC (permalink / raw)
  To: mpe; +Cc: mikey, linuxppc-dev, Cédric Le Goater

commit 243e25112d06 ("powerpc/xive: Native exploitation of the XIVE
interrupt controller") added an option to turn off Linux native XIVE
usage via the xive=off kernel command line option.

This documents this option.

Signed-off-by: Michael Neuling <mikey@neuling.org>
---
 Documentation/admin-guide/kernel-parameters.txt | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index c45a19d654..ee410d0ef4 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5177,6 +5177,15 @@
 			Format:
 			<irq>,<irq_mask>,<io>,<full_duplex>,<do_sound>,<lockup_hack>[,<irq2>[,<irq3>[,<irq4>]]]
 
+	xive=		[PPC]
+			By default on POWER9 and above, the kernel will
+			natively use the XIVE interrupt controller. This option
+			allows the fallback firmware mode to be used:
+
+			off       Fallback to firmware control of XIVE interrupt
+				  controller on both pseries and powernv
+				  platforms. Only useful on POWER9 and above.
+
 	xhci-hcd.quirks		[USB,KNL]
 			A hex value specifying bitmask with supplemental xhci
 			host controller quirks. Meaning of each bit can be
-- 
2.21.0


^ permalink raw reply related

* Re: [PATCH kernel 1/2] powerpc/pseries/dma: Allow swiotlb
From: Alexey Kardashevskiy @ 2019-05-13  6:30 UTC (permalink / raw)
  To: Thiago Jung Bauermann; +Cc: Alistair Popple, linuxppc-dev, David Gibson
In-Reply-To: <871s162az1.fsf@morokweng.localdomain>



On 11/05/2019 08:36, Thiago Jung Bauermann wrote:
> 
> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
> 
>> The commit 8617a5c5bc00 ("powerpc/dma: handle iommu bypass in
>> dma_iommu_ops") merged direct DMA ops into the IOMMU DMA ops allowing
>> SWIOTLB as well but only for mapping; the unmapping and bouncing parts
>> were left unmodified.
>>
>> This adds missing direct unmapping calls to .unmap_page() and .unmap_sg().
>>
>> This adds missing sync callbacks and directs them to the direct DMA hooks.
>>
>> Fixes: 8617a5c5bc00 (powerpc/dma: handle iommu bypass in dma_iommu_ops)
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> 
> Nice! Thanks for working on this. I have the patch at the end of this
> email to get virtio-scsi-pci and virtio-blk-pci working in a secure
> guest.

I saw the set_pci_dma_ops(NULL) patch but could not figure out how pass
NULL there sets the DMA ops to direct.

> 
> I applied your patch and reverted my patch and unfortunately the guest
> hangs right after mounting the disk:

Have you applied it on upstream kernel? I cannot see how it affects
current guests as it is...


> 
> [    0.185659] virtio-pci 0000:00:04.0: enabling device (0100 -> 0102)
> [    0.187082] virtio-pci 0000:00:04.0: ibm,query-pe-dma-windows(2026) 2000 8000000 20000000 returned 0
> [    0.187497] virtio-pci 0000:00:04.0: ibm,create-pe-dma-window(2027) 2000 8000000 20000000 10 20 returned 0 (liobn = 0x80000001 startin
> g addr = 8000000 0)
> [    0.226654] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
> [    0.227094] Non-volatile memory driver v1.3
> [    0.228950] brd: module loaded
> [    0.230666] loop: module loaded
> [    0.230773] ipr: IBM Power RAID SCSI Device Driver version: 2.6.4 (March 14, 2017)
> [    0.233323] scsi host0: Virtio SCSI HBA
> [    0.235439] scsi 0:0:0:0: Direct-Access     QEMU     QEMU HARDDISK    2.5+ PQ: 0 ANSI: 5
> [    0.369009] random: fast init done
> [    0.370819] sd 0:0:0:0: Attached scsi generic sg0 type 0
> [    0.371320] sd 0:0:0:0: Power-on or device reset occurred
> 
> <snip>
> 
> [    0.380378] sd 0:0:0:0: [sda] 31457280 512-byte logical blocks: (16.1 GB/15.0 GiB)
> [    0.381102] sd 0:0:0:0: [sda] Write Protect is off
> [    0.381195] sd 0:0:0:0: [sda] Mode Sense: 63 00 00 08
> [    0.382436] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [    0.383630] sd 0:0:0:0: [sda] Optimal transfer size 0 bytes < PAGE_SIZE (65536 bytes)
> [    0.391562]  sda: sda1 sda2
> [    0.398101] sd 0:0:0:0: [sda] Attached SCSI disk
> [    0.398205] md: Waiting for all devices to be available before autodetect
> [    0.398318] md: If you don't use raid, use raid=noautodetect
> [    0.398515] md: Autodetecting RAID arrays.
> [    0.398585] md: autorun ...
> [    0.398631] md: ... autorun DONE.
> [    0.403552] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null)
> [    0.403700] VFS: Mounted root (ext4 filesystem) readonly on device 8:2.
> [    0.405258] devtmpfs: mounted
> [    0.406427] Freeing unused kernel memory: 4224K
> [    0.406519] This architecture does not have kernel memory protection.
> [    0.406633] Run /sbin/init as init process
> 
> Sorry, I don't have any information on where the guest is stuck. I tried
> <sysrq>+l, <sysrq>+t and <sysrq>+w but nothing out of the ordinary
> showed up. Will try something else later.
> 
> --
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 
> 
> 
> From 70d2fba809119ae2d35c9ca4269405bb5c28413a Mon Sep 17 00:00:00 2001
> From: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> Date: Thu, 24 Jan 2019 22:40:16 -0200
> Subject: [PATCH 1/1] powerpc/pseries/iommu: Don't use dma_iommu_ops on secure
>  guests
> 
> Secure guest memory is inacessible to devices so regular DMA isn't
> possible.
> 
> In that case set devices' dma_map_ops to NULL so that the generic
> DMA code path will use SWIOTLB and DMA to bounce buffers.
> 
> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/iommu.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> index 36eb1ddbac69..1636306007eb 100644
> --- a/arch/powerpc/platforms/pseries/iommu.c
> +++ b/arch/powerpc/platforms/pseries/iommu.c
> @@ -50,6 +50,7 @@
>  #include <asm/udbg.h>
>  #include <asm/mmzone.h>
>  #include <asm/plpar_wrappers.h>
> +#include <asm/svm.h>
> 
>  #include "pseries.h"
> 
> @@ -1335,7 +1336,10 @@ void iommu_init_early_pSeries(void)
>  	of_reconfig_notifier_register(&iommu_reconfig_nb);
>  	register_memory_notifier(&iommu_mem_nb);
> 
> -	set_pci_dma_ops(&dma_iommu_ops);
> +	if (is_secure_guest())
> +		set_pci_dma_ops(NULL);
> +	else
> +		set_pci_dma_ops(&dma_iommu_ops);
>  }
> 
>  static int __init disable_multitce(char *str)
> 

-- 
Alexey

^ permalink raw reply

* Re: [PATCH kernel 2/2] powerpc/pseries/dma: Enable swiotlb
From: Alexey Kardashevskiy @ 2019-05-13  6:32 UTC (permalink / raw)
  To: Thiago Jung Bauermann; +Cc: Alistair Popple, linuxppc-dev, David Gibson
In-Reply-To: <87zhnu0w5t.fsf@morokweng.localdomain>



On 11/05/2019 08:41, Thiago Jung Bauermann wrote:
> 
> Hello Alexey,
> 
> Thanks!
> 
> I have similar changes in my "Secure Virtual Machine Enablement"
> patches, which I am currently preparing for posting again real soon now.
> 
> This is the last version:
> 
> https://lore.kernel.org/linuxppc-dev/20180824162535.22798-1-bauerman@linux.ibm.com/
> 
> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
> 
>> So far the pseries platforms has always been using IOMMU making SWIOTLB
>> unnecessary. Now we want secure guests which means devices can only
>> access certain areas of guest physical memory; we are going to use
>> SWIOTLB for this purpose.
>>
>> This allows SWIOTLB for pseries. By default there is no change in behavior.
>>
>> This enables SWIOTLB when the "swiotlb" kernel parameter is set to "force".
>>
>> With the SWIOTLB enabled, the kernel creates a directly mapped DMA window
>> (using the usual DDW mechanism) and implements SWIOTLB on top of that.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>>  arch/powerpc/platforms/pseries/setup.c | 5 +++++
>>  arch/powerpc/platforms/pseries/Kconfig | 1 +
>>  2 files changed, 6 insertions(+)
>>
>> diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
>> index e4f0dfd4ae33..30d72b587ac5 100644
>> --- a/arch/powerpc/platforms/pseries/setup.c
>> +++ b/arch/powerpc/platforms/pseries/setup.c
>> @@ -42,6 +42,7 @@
>>  #include <linux/of.h>
>>  #include <linux/of_pci.h>
>>  #include <linux/memblock.h>
>> +#include <linux/swiotlb.h>
>>
>>  #include <asm/mmu.h>
>>  #include <asm/processor.h>
>> @@ -71,6 +72,7 @@
>>  #include <asm/isa-bridge.h>
>>  #include <asm/security_features.h>
>>  #include <asm/asm-const.h>
>> +#include <asm/swiotlb.h>
>>
>>  #include "pseries.h"
>>  #include "../../../../drivers/pci/pci.h"
>> @@ -797,6 +799,9 @@ static void __init pSeries_setup_arch(void)
>>  	}
>>
>>  	ppc_md.pcibios_root_bridge_prepare = pseries_root_bridge_prepare;
>> +
>> +	if (swiotlb_force == SWIOTLB_FORCE)
>> +		ppc_swiotlb_enable = 1;
>>  }
> 
> Yep! I have this here, enabled when booting as a secure guest:
> 
> https://lore.kernel.org/linuxppc-dev/20180824162535.22798-6-bauerman@linux.ibm.com/
> 
> And also another patch which makes it so that if booting as a secure
> guest it acts as if the swiotlb kernel parameter was set to force:
> 
> https://lore.kernel.org/linuxppc-dev/20180824162535.22798-11-bauerman@linux.ibm.com/
> 
>>  static void pseries_panic(char *str)
>> diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
>> index 9c6b3d860518..b9e8b608de01 100644
>> --- a/arch/powerpc/platforms/pseries/Kconfig
>> +++ b/arch/powerpc/platforms/pseries/Kconfig
>> @@ -23,6 +23,7 @@ config PPC_PSERIES
>>  	select ARCH_RANDOM
>>  	select PPC_DOORBELL
>>  	select FORCE_SMP
>> +	select SWIOTLB
>>  	default y
>>
>>  config PPC_SPLPAR
> 
> I put this in a PPC_SVM config option:
> 
> https://lore.kernel.org/linuxppc-dev/20180824162535.22798-3-bauerman@linux.ibm.com/


Well, my intention is to make it work regardless SVM, just to see if it
works and where the problems are if it does not (right now the NVIDIA
driver does not like SWIOTLB, debugging).



-- 
Alexey

^ permalink raw reply

* Re: [PATCH v10 2/2] powerpc/64s: KVM update for reimplement book3s idle code in C
From: Paul Mackerras @ 2019-05-13  6:42 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: Gautham R . Shenoy, linuxppc-dev, kvm-ppc
In-Reply-To: <20190428114515.32683-3-npiggin@gmail.com>

On Sun, Apr 28, 2019 at 09:45:15PM +1000, Nicholas Piggin wrote:
> This is the KVM update to the new idle code. A few improvements:
> 
> - Idle sleepers now always return to caller rather than branch out
>   to KVM first.
> - This allows optimisations like very fast return to caller when no
>   state has been lost.
> - KVM no longer requires nap_state_lost because it controls NVGPR
>   save/restore itself on the way in and out.
> - The heavy idle wakeup KVM request check can be moved out of the
>   normal host idle code and into the not-performance-critical offline
>   code.
> - KVM nap code now returns from where it is called, which makes the
>   flow a bit easier to follow.

One question below...

> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index 58d0f1ba845d..f66191d8f841 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
...
> @@ -2656,6 +2662,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
>  
>  	lis	r3, LPCR_PECEDP@h	/* Do wake on privileged doorbell */
>  
> +	/* Go back to host stack */
> +	ld	r1, HSTATE_HOST_R1(r13)

At this point we are in kvmppc_h_cede, which we branched to from
hcall_try_real_mode, which came from the guest exit path, where we
have already loaded r1 from HSTATE_HOST_R1(r13).  So if there is a
path to get here with r1 not already set to HSTATE_HOST_R1(r13), then
I missed it - please point it out to me.  Otherwise this statement
seems superfluous.

Paul.

^ permalink raw reply

* Re: [RFC PATCH] powerpc/64/ftrace: mprofile-kernel patch out mflr
From: Naveen N. Rao @ 2019-05-13  6:47 UTC (permalink / raw)
  To: linuxppc-dev, Michael Ellerman, Nicholas Piggin
In-Reply-To: <871s13ujcf.fsf@concordia.ellerman.id.au>

Michael Ellerman wrote:
> Nicholas Piggin <npiggin@gmail.com> writes:
>> The new mprofile-kernel mcount sequence is
>>
>>   mflr	r0
>>   bl	_mcount
>>
>> Dynamic ftrace patches the branch instruction with a noop, but leaves
>> the mflr. mflr is executed by the branch unit that can only execute one
>> per cycle on POWER9 and shared with branches, so it would be nice to
>> avoid it where possible.
>>
>> This patch is a hacky proof of concept to nop out the mflr. Can we do
>> this or are there races or other issues with it?
> 
> There's a race, isn't there?
> 
> We have a function foo which currently has tracing disabled, so the mflr
> and bl are nop'ed out.
> 
>   CPU 0			CPU 1
>   ==================================
>   bl foo
>   nop (ie. not mflr)
>   -> interrupt
>   something else	enable tracing for foo
>   ...			patch mflr and branch
>   <- rfi
>   bl _mcount
> 
> So we end up in _mcount() but with r0 not populated.

Good catch! Looks like we need to patch the mflr with a "b +8" similar 
to what we do in __ftrace_make_nop().


- Naveen



^ permalink raw reply

* Re: [PATCH 03/16] lib,treewide: add new match_string() helper/macro
From: Ardelean, Alexandru @ 2019-05-13  7:00 UTC (permalink / raw)
  To: andriy.shevchenko@linux.intel.com
  Cc: linux-wireless@vger.kernel.org, linux-fbdev@vger.kernel.org,
	kvm@vger.kernel.org, linux-pci@vger.kernel.org,
	alsa-devel@alsa-project.org, dri-devel@lists.freedesktop.org,
	linux-mm@kvack.org, linux-mtd@lists.infradead.org,
	linux-clk@vger.kernel.org, devel@driverdev.osuosl.org,
	linux-rockchip@lists.infradead.org, linux-pm@vger.kernel.org,
	intel-gfx@lists.freedesktop.org, linux-gpio@vger.kernel.org,
	linux-rpi-kernel@lists.infradead.org, linux-tegra@vger.kernel.org,
	cgroups@vger.kernel.org, linux-omap@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-ide@vger.kernel.org,
	gregkh@linuxfoundation.org, linux-usb@vger.kernel.org,
	linux-mmc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, netdev@vger.kernel.org,
	linux-integrity@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20190510143407.GA9224@smile.fi.intel.com>

On Fri, 2019-05-10 at 17:34 +0300, andriy.shevchenko@linux.intel.com wrote:
> [External]
> 
> 
> On Fri, May 10, 2019 at 09:15:27AM +0000, Ardelean, Alexandru wrote:
> > On Wed, 2019-05-08 at 16:22 +0300, Alexandru Ardelean wrote:
> > > On Wed, 2019-05-08 at 15:18 +0200, Greg KH wrote:
> > > > On Wed, May 08, 2019 at 04:11:28PM +0300, Andy Shevchenko wrote:
> > > > > On Wed, May 08, 2019 at 02:28:29PM +0300, Alexandru Ardelean
> > > > > wrote:
> > > > > Can you split include/linux/ change from the rest?
> > > > 
> > > > That would break the build, why do you want it split out?  This
> > > > makes
> > > > sense all as a single patch to me.
> > > > 
> > > 
> > > Not really.
> > > It would be just be the new match_string() helper/macro in a new
> > > commit.
> > > And the conversions of the simple users of match_string() (the ones
> > > using
> > > ARRAY_SIZE()) in another commit.
> > > 
> > 
> > I should have asked in my previous reply.
> > Leave this as-is or re-formulate in 2 patches ?
> 
> Depends on on what you would like to spend your time: collecting Acks for
> all
> pieces in treewide patch or send new API first followed up by per driver
> /
> module update in next cycle.

I actually would have preferred new API first, with the current
`match_string()` -> `__match_string()` rename from the start, but I wasn't
sure. I am still navigating through how feedbacks are working in this
realm.

I'll send a V2 with the API change-first/only; should be a smaller list.
Then see about follow-ups/changes per subsystems.

> 
> I also have no strong preference.
> And I think it's good to add Heikki Krogerus to Cc list for both patch
> series,
> since he is the author of sysfs variant and may have something to comment
> on
> the rest.

Thanks for the reference.

> 
> --
> With Best Regards,
> Andy Shevchenko
> 
> 

^ permalink raw reply

* Re: [PATCH] powerpc: Document xive=off option
From: Cédric Le Goater @ 2019-05-13  6:36 UTC (permalink / raw)
  To: Michael Neuling, mpe; +Cc: linuxppc-dev, Greg Kurz
In-Reply-To: <20190513053910.19227-1-mikey@neuling.org>

On 5/13/19 7:39 AM, Michael Neuling wrote:
> commit 243e25112d06 ("powerpc/xive: Native exploitation of the XIVE
> interrupt controller") added an option to turn off Linux native XIVE
> usage via the xive=off kernel command line option.
> 
> This documents this option.
> 
> Signed-off-by: Michael Neuling <mikey@neuling.org>



Reviewed-by: Cédric Le Goater <clg@kaod.org>


But,

We should fix the behavior because xive=off does not work on pseries. 
This is not handled correctly in prom when CAS negotiates with the 
hypervisor which interrupt mode is to be used. 

I haven't tried this option on PowerNV.

Cheers,

C.


> ---
>  Documentation/admin-guide/kernel-parameters.txt | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index c45a19d654..ee410d0ef4 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -5177,6 +5177,15 @@
>  			Format:
>  			<irq>,<irq_mask>,<io>,<full_duplex>,<do_sound>,<lockup_hack>[,<irq2>[,<irq3>[,<irq4>]]]
>  
> +	xive=		[PPC]
> +			By default on POWER9 and above, the kernel will
> +			natively use the XIVE interrupt controller. This option
> +			allows the fallback firmware mode to be used:
> +
> +			off       Fallback to firmware control of XIVE interrupt
> +				  controller on both pseries and powernv
> +				  platforms. Only useful on POWER9 and above.
> +
>  	xhci-hcd.quirks		[USB,KNL]
>  			A hex value specifying bitmask with supplemental xhci
>  			host controller quirks. Meaning of each bit can be
> 


^ permalink raw reply

* [PATCH v2] powerpc: Fix compile issue with force DAWR
From: Michael Neuling @ 2019-05-13  7:17 UTC (permalink / raw)
  To: mpe; +Cc: mikey, linuxppc-dev

If you compile with KVM but without CONFIG_HAVE_HW_BREAKPOINT you fail
at linking with:
  arch/powerpc/kvm/book3s_hv_rmhandlers.o:(.text+0x708): undefined reference to `dawr_force_enable'

This was caused by commit c1fe190c0672 ("powerpc: Add force enable of
DAWR on P9 option").

This puts more of the dawr infrastructure in a new file.

Signed-off-by: Michael Neuling <mikey@neuling.org>
--
v2:
  Fixes based on Christophe Leroy's comments:
  - Fix commit message formatting
  - Move more DAWR code into dawr.c
---
 arch/powerpc/Kconfig                     |  5 ++
 arch/powerpc/include/asm/hw_breakpoint.h | 20 ++++---
 arch/powerpc/kernel/Makefile             |  1 +
 arch/powerpc/kernel/dawr.c               | 75 ++++++++++++++++++++++++
 arch/powerpc/kernel/hw_breakpoint.c      | 56 ------------------
 arch/powerpc/kvm/Kconfig                 |  1 +
 6 files changed, 94 insertions(+), 64 deletions(-)
 create mode 100644 arch/powerpc/kernel/dawr.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 2711aac246..f4b19e48cc 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -242,6 +242,7 @@ config PPC
 	select SYSCTL_EXCEPTION_TRACE
 	select THREAD_INFO_IN_TASK
 	select VIRT_TO_BUS			if !PPC64
+	select PPC_DAWR_FORCE_ENABLE		if PPC64 || PERF
 	#
 	# Please keep this list sorted alphabetically.
 	#
@@ -369,6 +370,10 @@ config PPC_ADV_DEBUG_DAC_RANGE
 	depends on PPC_ADV_DEBUG_REGS && 44x
 	default y
 
+config PPC_DAWR_FORCE_ENABLE
+	bool
+	default y
+
 config ZONE_DMA
 	bool
 	default y if PPC_BOOK3E_64
diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h
index 0fe8c1e46b..ffbc8eab41 100644
--- a/arch/powerpc/include/asm/hw_breakpoint.h
+++ b/arch/powerpc/include/asm/hw_breakpoint.h
@@ -47,6 +47,8 @@ struct arch_hw_breakpoint {
 #define HW_BRK_TYPE_PRIV_ALL	(HW_BRK_TYPE_USER | HW_BRK_TYPE_KERNEL | \
 				 HW_BRK_TYPE_HYP)
 
+extern int set_dawr(struct arch_hw_breakpoint *brk);
+
 #ifdef CONFIG_HAVE_HW_BREAKPOINT
 #include <linux/kdebug.h>
 #include <asm/reg.h>
@@ -90,18 +92,20 @@ static inline void hw_breakpoint_disable(void)
 extern void thread_change_pc(struct task_struct *tsk, struct pt_regs *regs);
 int hw_breakpoint_handler(struct die_args *args);
 
-extern int set_dawr(struct arch_hw_breakpoint *brk);
-extern bool dawr_force_enable;
-static inline bool dawr_enabled(void)
-{
-	return dawr_force_enable;
-}
-
 #else	/* CONFIG_HAVE_HW_BREAKPOINT */
 static inline void hw_breakpoint_disable(void) { }
 static inline void thread_change_pc(struct task_struct *tsk,
 					struct pt_regs *regs) { }
-static inline bool dawr_enabled(void) { return false; }
+
 #endif	/* CONFIG_HAVE_HW_BREAKPOINT */
+
+extern bool dawr_force_enable;
+
+#ifdef CONFIG_PPC_DAWR_FORCE_ENABLE
+extern bool dawr_enabled(void);
+#else
+#define dawr_enabled() true
+#endif
+
 #endif	/* __KERNEL__ */
 #endif	/* _PPC_BOOK3S_64_HW_BREAKPOINT_H */
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 0ea6c4aa3a..a9c497c34f 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -56,6 +56,7 @@ obj-$(CONFIG_PPC64)		+= setup_64.o sys_ppc32.o \
 obj-$(CONFIG_VDSO32)		+= vdso32/
 obj-$(CONFIG_PPC_WATCHDOG)	+= watchdog.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= hw_breakpoint.o
+obj-$(CONFIG_PPC_DAWR_FORCE_ENABLE)	+= dawr.o
 obj-$(CONFIG_PPC_BOOK3S_64)	+= cpu_setup_ppc970.o cpu_setup_pa6t.o
 obj-$(CONFIG_PPC_BOOK3S_64)	+= cpu_setup_power.o
 obj-$(CONFIG_PPC_BOOK3S_64)	+= mce.o mce_power.o
diff --git a/arch/powerpc/kernel/dawr.c b/arch/powerpc/kernel/dawr.c
new file mode 100644
index 0000000000..cf1d02fe1e
--- /dev/null
+++ b/arch/powerpc/kernel/dawr.c
@@ -0,0 +1,75 @@
+// SPDX-License-Identifier: GPL-2.0+
+//
+// DAWR infrastructure
+//
+// Copyright 2019, Michael Neuling, IBM Corporation.
+
+#include <linux/types.h>
+#include <linux/export.h>
+#include <linux/fs.h>
+#include <linux/debugfs.h>
+#include <asm/debugfs.h>
+#include <asm/machdep.h>
+#include <asm/hvcall.h>
+
+bool dawr_force_enable;
+EXPORT_SYMBOL_GPL(dawr_force_enable);
+
+extern bool dawr_enabled(void)
+{
+	return dawr_force_enable;
+}
+EXPORT_SYMBOL_GPL(dawr_enabled);
+
+static ssize_t dawr_write_file_bool(struct file *file,
+				    const char __user *user_buf,
+				    size_t count, loff_t *ppos)
+{
+	struct arch_hw_breakpoint null_brk = {0, 0, 0};
+	size_t rc;
+
+	/* Send error to user if they hypervisor won't allow us to write DAWR */
+	if ((!dawr_force_enable) &&
+	    (firmware_has_feature(FW_FEATURE_LPAR)) &&
+	    (set_dawr(&null_brk) != H_SUCCESS))
+		return -1;
+
+	rc = debugfs_write_file_bool(file, user_buf, count, ppos);
+	if (rc)
+		return rc;
+
+	/* If we are clearing, make sure all CPUs have the DAWR cleared */
+	if (!dawr_force_enable)
+		smp_call_function((smp_call_func_t)set_dawr, &null_brk, 0);
+
+	return rc;
+}
+
+static const struct file_operations dawr_enable_fops = {
+	.read =		debugfs_read_file_bool,
+	.write =	dawr_write_file_bool,
+	.open =		simple_open,
+	.llseek =	default_llseek,
+};
+
+static int __init dawr_force_setup(void)
+{
+	dawr_force_enable = false;
+
+	if (cpu_has_feature(CPU_FTR_DAWR)) {
+		/* Don't setup sysfs file for user control on P8 */
+		dawr_force_enable = true;
+		return 0;
+	}
+
+	if (PVR_VER(mfspr(SPRN_PVR)) == PVR_POWER9) {
+		/* Turn DAWR off by default, but allow admin to turn it on */
+		dawr_force_enable = false;
+		debugfs_create_file_unsafe("dawr_enable_dangerous", 0600,
+					   powerpc_debugfs_root,
+					   &dawr_force_enable,
+					   &dawr_enable_fops);
+	}
+	return 0;
+}
+arch_initcall(dawr_force_setup);
diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index da307dd93e..95605a9c9a 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -380,59 +380,3 @@ void hw_breakpoint_pmu_read(struct perf_event *bp)
 {
 	/* TODO */
 }
-
-bool dawr_force_enable;
-EXPORT_SYMBOL_GPL(dawr_force_enable);
-
-static ssize_t dawr_write_file_bool(struct file *file,
-				    const char __user *user_buf,
-				    size_t count, loff_t *ppos)
-{
-	struct arch_hw_breakpoint null_brk = {0, 0, 0};
-	size_t rc;
-
-	/* Send error to user if they hypervisor won't allow us to write DAWR */
-	if ((!dawr_force_enable) &&
-	    (firmware_has_feature(FW_FEATURE_LPAR)) &&
-	    (set_dawr(&null_brk) != H_SUCCESS))
-		return -1;
-
-	rc = debugfs_write_file_bool(file, user_buf, count, ppos);
-	if (rc)
-		return rc;
-
-	/* If we are clearing, make sure all CPUs have the DAWR cleared */
-	if (!dawr_force_enable)
-		smp_call_function((smp_call_func_t)set_dawr, &null_brk, 0);
-
-	return rc;
-}
-
-static const struct file_operations dawr_enable_fops = {
-	.read =		debugfs_read_file_bool,
-	.write =	dawr_write_file_bool,
-	.open =		simple_open,
-	.llseek =	default_llseek,
-};
-
-static int __init dawr_force_setup(void)
-{
-	dawr_force_enable = false;
-
-	if (cpu_has_feature(CPU_FTR_DAWR)) {
-		/* Don't setup sysfs file for user control on P8 */
-		dawr_force_enable = true;
-		return 0;
-	}
-
-	if (PVR_VER(mfspr(SPRN_PVR)) == PVR_POWER9) {
-		/* Turn DAWR off by default, but allow admin to turn it on */
-		dawr_force_enable = false;
-		debugfs_create_file_unsafe("dawr_enable_dangerous", 0600,
-					   powerpc_debugfs_root,
-					   &dawr_force_enable,
-					   &dawr_enable_fops);
-	}
-	return 0;
-}
-arch_initcall(dawr_force_setup);
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index bfdde04e49..9c0d315108 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -39,6 +39,7 @@ config KVM_BOOK3S_32_HANDLER
 config KVM_BOOK3S_64_HANDLER
 	bool
 	select KVM_BOOK3S_HANDLER
+	select PPC_DAWR_FORCE_ENABLE
 
 config KVM_BOOK3S_PR_POSSIBLE
 	bool
-- 
2.21.0


^ permalink raw reply related

* Re: [PATCH, RFC] byteorder: sanity check toolchain vs kernel endianess
From: Dmitry Vyukov @ 2019-05-13  7:39 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arch, linuxppc-dev, Linux Kernel Mailing List,
	Nick Kossifidis, Andrew Morton, Linus Torvalds, Christoph Hellwig
In-Reply-To: <CAK8P3a3xRBZrgv16sSigJhY0vGmb=qF9o=6dC_5DqAJtW3qPGQ@mail.gmail.com>

From: Arnd Bergmann <arnd@arndb.de>
Date: Sat, May 11, 2019 at 2:51 AM
To: Dmitry Vyukov
Cc: Nick Kossifidis, Christoph Hellwig, Linus Torvalds, Andrew Morton,
linux-arch, Linux Kernel Mailing List, linuxppc-dev

> On Fri, May 10, 2019 at 6:53 AM Dmitry Vyukov <dvyukov@google.com> wrote:
> > >
> > > I think it's good to have a sanity check in-place for consistency.
> >
> >
> > Hi,
> >
> > This broke our cross-builds from x86. I am using:
> >
> > $ powerpc64le-linux-gnu-gcc --version
> > powerpc64le-linux-gnu-gcc (Debian 7.2.0-7) 7.2.0
> >
> > and it says that it's little-endian somehow:
> >
> > $ powerpc64le-linux-gnu-gcc -dM -E - < /dev/null | grep BYTE_ORDER
> > #define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__
> >
> > Is it broke compiler? Or I always hold it wrong? Is there some
> > additional flag I need to add?
>
> It looks like a bug in the kernel Makefiles to me. powerpc32 is always
> big-endian,
> powerpc64 used to be big-endian but is now usually little-endian. There are
> often three separate toolchains that default to the respective user
> space targets
> (ppc32be, ppc64be, ppc64le), but generally you should be able to build
> any of the
> three kernel configurations with any of those compilers, and have the Makefile
> pass the correct -m32/-m64/-mbig-endian/-mlittle-endian command line options
> depending on the kernel configuration. It seems that this is not happening
> here. I have not checked why, but if this is the problem, it should be
> easy enough
> to figure out.


Thanks! This clears a lot.
This may be a bug in our magic as we try to build kernel files outside
of make with own flags (required to extract parts of kernel
interfaces).
So don't spend time looking for the Makefile bugs yet.

^ permalink raw reply

* Re: [PATCH 1/2] perf ioctl: Add check for the sample_period value
From: Peter Zijlstra @ 2019-05-13  7:42 UTC (permalink / raw)
  To: Ravi Bangoria; +Cc: maddy, linuxppc-dev, linux-kernel, acme, jolsa
In-Reply-To: <20190511024217.4013-1-ravi.bangoria@linux.ibm.com>

On Sat, May 11, 2019 at 08:12:16AM +0530, Ravi Bangoria wrote:
> Add a check for sample_period value sent from userspace. Negative
> value does not make sense. And in powerpc arch code this could cause
> a recursive PMI leading to a hang (reported when running perf-fuzzer).
> 
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
> ---
>  kernel/events/core.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index abbd4b3b96c2..e44c90378940 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5005,6 +5005,9 @@ static int perf_event_period(struct perf_event *event, u64 __user *arg)
>  	if (perf_event_check_period(event, value))
>  		return -EINVAL;
>  
> +	if (!event->attr.freq && (value & (1ULL << 63)))
> +		return -EINVAL;

Well, perf_event_attr::sample_period is __u64. Would not be the site
using it as signed be the one in error?

^ permalink raw reply

* Re: [PATCH v2 3/8] mm/memory_hotplug: arch_remove_memory() and __remove_pages() with CONFIG_MEMORY_HOTPLUG
From: David Hildenbrand @ 2019-05-13  7:48 UTC (permalink / raw)
  To: Dan Williams, Michal Hocko, Oscar Salvador
  Cc: Rich Felker, linux-ia64, Linux-sh, Peter Zijlstra, Dave Hansen,
	Heiko Carstens, Chris Wilson, Linux MM, Paul Mackerras,
	H. Peter Anvin, Qian Cai, linux-s390, Yoshinori Sato,
	Rafael J. Wysocki, Mike Rapoport, Ingo Molnar, Fenghua Yu,
	Pavel Tatashin, Vasily Gorbik, Rob Herring, mike.travis@hpe.com,
	Nicholas Piggin, Alex Deucher, Mark Brown, Borislav Petkov,
	Andy Lutomirski, Thomas Gleixner, Tony Luck, Baoquan He,
	Masahiro Yamada, Mathieu Malaterre, Greg Kroah-Hartman,
	Andrew Banman, Linux Kernel Mailing List, Logan Gunthorpe,
	Wei Yang, Martin Schwidefsky, Arun KS, Andrew Morton,
	linuxppc-dev, David S. Miller, Kirill A. Shutemov
In-Reply-To: <CAPcyv4jpnKjeP3QEvF3_9CzdZhtFXN2nMU7P-Ee7y06J3bGZ0A@mail.gmail.com>

On 07.05.19 23:02, Dan Williams wrote:
> On Tue, May 7, 2019 at 11:38 AM David Hildenbrand <david@redhat.com> wrote:
>>
>> Let's prepare for better error handling while adding memory by allowing
>> to use arch_remove_memory() and __remove_pages() even if
>> CONFIG_MEMORY_HOTREMOVE is not set. CONFIG_MEMORY_HOTREMOVE effectively
>> covers
>> - Offlining of system ram (memory block devices) - offline_pages()
>> - Unplug of system ram - remove_memory()
>> - Unplug/remap of device memory - devm_memremap()
>>
>> This allows e.g. for handling like
>>
>> arch_add_memory()
>> rc = do_something();
>> if (rc) {
>>         arch_remove_memory();
>> }
>>
>> Whereby do_something() will for example be memory block device creation
>> after it has been factored out.
> 
> What's left after this? Can we just get rid of CONFIG_MEMORY_HOTREMOVE
> option completely when CONFIG_MEMORY_HOTPLUG is enabled? It's not
> clear to me why there was ever the option to compile out the remove
> code when the add code is included.
> 

If there are no other comments, I will go ahead and rip out
CONFIG_MEMORY_HOTREMOVE completely, gluing the functionality to
CONFIG_MEMORY_HOTPLUG.

-- 

Thanks,

David / dhildenb

^ permalink raw reply

* Re: [PATCH v2 3/8] mm/memory_hotplug: arch_remove_memory() and __remove_pages() with CONFIG_MEMORY_HOTPLUG
From: David Hildenbrand @ 2019-05-13  8:20 UTC (permalink / raw)
  To: Dan Williams, Michal Hocko, Oscar Salvador
  Cc: Rich Felker, linux-ia64, Linux-sh, Peter Zijlstra, Dave Hansen,
	Heiko Carstens, Chris Wilson, Linux MM, Paul Mackerras,
	H. Peter Anvin, Qian Cai, linux-s390, Yoshinori Sato,
	Rafael J. Wysocki, Mike Rapoport, Ingo Molnar, Fenghua Yu,
	Pavel Tatashin, Vasily Gorbik, Rob Herring, mike.travis@hpe.com,
	Nicholas Piggin, Alex Deucher, Mark Brown, Borislav Petkov,
	Andy Lutomirski, Thomas Gleixner, Tony Luck, Baoquan He,
	Masahiro Yamada, Mathieu Malaterre, Greg Kroah-Hartman,
	Andrew Banman, Linux Kernel Mailing List, Logan Gunthorpe,
	Wei Yang, Martin Schwidefsky, Arun KS, Andrew Morton,
	linuxppc-dev, David S. Miller, Kirill A. Shutemov
In-Reply-To: <c027a782-1cef-a076-92a3-3ce36140f3f2@redhat.com>

On 13.05.19 09:48, David Hildenbrand wrote:
> On 07.05.19 23:02, Dan Williams wrote:
>> On Tue, May 7, 2019 at 11:38 AM David Hildenbrand <david@redhat.com> wrote:
>>>
>>> Let's prepare for better error handling while adding memory by allowing
>>> to use arch_remove_memory() and __remove_pages() even if
>>> CONFIG_MEMORY_HOTREMOVE is not set. CONFIG_MEMORY_HOTREMOVE effectively
>>> covers
>>> - Offlining of system ram (memory block devices) - offline_pages()
>>> - Unplug of system ram - remove_memory()
>>> - Unplug/remap of device memory - devm_memremap()
>>>
>>> This allows e.g. for handling like
>>>
>>> arch_add_memory()
>>> rc = do_something();
>>> if (rc) {
>>>         arch_remove_memory();
>>> }
>>>
>>> Whereby do_something() will for example be memory block device creation
>>> after it has been factored out.
>>
>> What's left after this? Can we just get rid of CONFIG_MEMORY_HOTREMOVE
>> option completely when CONFIG_MEMORY_HOTPLUG is enabled? It's not
>> clear to me why there was ever the option to compile out the remove
>> code when the add code is included.
>>
> 
> If there are no other comments, I will go ahead and rip out
> CONFIG_MEMORY_HOTREMOVE completely, gluing the functionality to
> CONFIG_MEMORY_HOTPLUG.
> 

Hmmmm, however this will require CONFIG_MEMORY_HOTPLUG to require

- MEMORY_ISOLATION
- HAVE_BOOTMEM_INFO_NODE if (X86_64 || PPC64)

And depends on
- MIGRATION

Which would limit the configurations where memory hotplug would be
available. I guess going with this patch here is ok as a first step.

I just realized, that we'll need arch_remove_memory() for arm64 to make
this patch here work.

-- 

Thanks,

David / dhildenb

^ permalink raw reply

* RE: [PATCH] vsprintf: Do not break early boot with probing addresses
From: David Laight @ 2019-05-13  8:52 UTC (permalink / raw)
  To: 'christophe leroy', Steven Rostedt, Petr Mladek
  Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
	Sergey Senozhatsky, Heiko Carstens, linuxppc-dev@lists.ozlabs.org,
	Rasmus Villemoes, linux-kernel@vger.kernel.org, Michal Hocko,
	Sergey Senozhatsky, Stephen Rothwell, Andy Shevchenko,
	Linus Torvalds, Martin Schwidefsky, Tobin C . Harding
In-Reply-To: <daf4dfd1-7f4f-8b92-6866-437c3a2be28b@c-s.fr>

From: christophe leroy
> Sent: 10 May 2019 18:35
> Le 10/05/2019 à 18:24, Steven Rostedt a écrit :
> > On Fri, 10 May 2019 10:42:13 +0200
> > Petr Mladek <pmladek@suse.com> wrote:
> >
> >>   static const char *check_pointer_msg(const void *ptr)
> >>   {
> >> -	char byte;
> >> -
> >>   	if (!ptr)
> >>   		return "(null)";
> >>
> >> -	if (probe_kernel_address(ptr, byte))
> >> +	if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> >>   		return "(efault)";

"efault" looks a bit like a spellling mistake for "default".

> > 	< PAGE_SIZE ?
> >
> > do you mean: < TASK_SIZE ?
> 
> I guess not.
> 
> Usually, < PAGE_SIZE means NULL pointer dereference (via the member of a
> struct)

Maybe the caller should pass in a short buffer so that you can return
"(err-%d)" or "(null+%#x)" ?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox