* Re: [v2] powerpc/lib: Adjust .balign inside string functions for PPC32
From: Michael Ellerman @ 2018-06-04 14:10 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20180518130116.A1A3B6F937@po14934vm.idsi0.si.c-s.fr>
On Fri, 2018-05-18 at 13:01:16 UTC, Christophe Leroy wrote:
> commit 87a156fb18fe1 ("Align hot loops of some string functions")
> degraded the performance of string functions by adding useless
> nops
>
> A simple benchmark on an 8xx calling 100000x a memchr() that
> matches the first byte runs in 41668 TB ticks before this patch
> and in 35986 TB ticks after this patch. So this gives an
> improvement of approx 10%
>
> Another benchmark doing the same with a memchr() matching the 128th
> byte runs in 1011365 TB ticks before this patch and 1005682 TB ticks
> after this patch, so regardless of the number of loops, removing
> those useless nops improves the test by 5683 TB ticks.
>
> Fixes: 87a156fb18fe1 ("Align hot loops of some string functions")
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/1128bb7813a896bd608fb622eee3c2
cheers
^ permalink raw reply
* Re: [v2,1/4] powerpc/perf: Rearrange memory freeing in imc init
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Anju T Sudhakar; +Cc: maddy, linuxppc-dev, anju
In-Reply-To: <1526980357-25385-2-git-send-email-anju@linux.vnet.ibm.com>
On Tue, 2018-05-22 at 09:12:34 UTC, Anju T Sudhakar wrote:
> When any of the IMC (In-Memory Collection counter) devices fail
> to initialize, imc_common_mem_free() frees a set of memory. In doing so,
> the pmu_ptr pointer is also freed. But pmu_ptr is then used in the subsequent
> function (imc_common_cpuhp_mem_free()), which is wrong. This patch reorders
> the code to avoid such access.
>
> Also free the memory which is dynamically allocated during imc
> initialization, wherever required.
>
> Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Series applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/cb094fa5af7c9623084aa4c3cf529b
cheers
^ permalink raw reply
* Re: [1/3] powerpc/sstep: Introduce GETTYPE macro
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Ravi Bangoria, mikey
Cc: sandipan, Ravi Bangoria, linux-kernel, matthew.brown.dev, paulus,
anton, naveen.n.rao, linuxppc-dev, cyrilbur
In-Reply-To: <20180521042108.8318-2-ravi.bangoria@linux.ibm.com>
On Mon, 2018-05-21 at 04:21:06 UTC, Ravi Bangoria wrote:
> Replace 'op->type & INSTR_TYPE_MASK' expression with GETTYPE(op->type)
> macro.
>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Series applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/e6684d07e4308430b9b6497265781a
cheers
^ permalink raw reply
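A minimal sketch of the kind of macro the patch above describes, for readers
who want to see the shape of the change. The mask value 0x1f, the type codes
and the flag bit are assumptions made up for this illustration; the real
definitions live in the kernel's sstep header and may differ.

```c
#include <assert.h>

/* Assumed for illustration: the low 5 bits of 'type' hold the instruction type. */
#define INSTR_TYPE_MASK_SKETCH 0x1f

/* Give the open-coded 'type & mask' expression a single name. */
#define GETTYPE_SKETCH(t) ((t) & INSTR_TYPE_MASK_SKETCH)

/* Hypothetical type codes, plus a flag bit that sits above the mask. */
enum { LOAD_SKETCH = 1, STORE_SKETCH = 2 };
#define UPDATE_FLAG_SKETCH 0x40

static int is_store_sketch(int type)
{
    assert(type >= 0);
    /* Flag bits above the mask do not affect the comparison. */
    return GETTYPE_SKETCH(type) == STORE_SKETCH;
}
```

Callers then compare GETTYPE_SKETCH(op_type) against a type code instead of
repeating the mask expression at every site.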
* Re: powerpc/32: Optimise __csum_partial()
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras, segher
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <484bcfaccc1ec3d91b74aeaaa26a0ae66fe0955a.1527160868.git.christophe.leroy@c-s.fr>
On Thu, 2018-05-24 at 11:22:27 UTC, Christophe Leroy wrote:
> Improve __csum_partial by interleaving loads and adds.
>
> On an 8xx, it brings neither improvement nor degradation.
> On a 83xx, it brings a 25% improvement.
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/373e098e1e788d7b89ec0f31765a6c
cheers
^ permalink raw reply
* Re: [v2, 01/13] powerpc/eeh: Add eeh_max_freezes to initial EEH log line
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Sam Bobroff, linuxppc-dev
In-Reply-To: <ac861431c98e0c259fec18a0d220994ad6b362ae.1527217866.git.sbobroff@linux.ibm.com>
On Fri, 2018-05-25 at 03:11:28 UTC, Sam Bobroff wrote:
> The current failure message includes the number of failures that have
> occurred in the last hour (for a device) but it does not indicate
> how many failures will be tolerated before the device is permanently
> disabled.
>
> Include the limit (eeh_max_freezes) to make this less surprising when
> it happens.
>
> Also remove the embedded newline from the existing message to make it
> easier to grep for.
>
> Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Series applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/796b9f5b317a46d1b744f661c38a62
cheers
^ permalink raw reply
* Re: [v4] powerpc: Implement csum_ipv6_magic in assembly
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras, segher
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <7a756b816161007903ca8b28aec662de27135c55.1527161282.git.christophe.leroy@c-s.fr>
On Thu, 2018-05-24 at 11:33:18 UTC, Christophe Leroy wrote:
> The generic csum_ipv6_magic() generates a pretty bad result
>
> 00000000 <csum_ipv6_magic>: (PPC32)
> 0: 81 23 00 00 lwz r9,0(r3)
> 4: 81 03 00 04 lwz r8,4(r3)
> 8: 7c e7 4a 14 add r7,r7,r9
> c: 7d 29 38 10 subfc r9,r9,r7
> 10: 7d 4a 51 10 subfe r10,r10,r10
> 14: 7d 27 42 14 add r9,r7,r8
> 18: 7d 2a 48 50 subf r9,r10,r9
> 1c: 80 e3 00 08 lwz r7,8(r3)
> 20: 7d 08 48 10 subfc r8,r8,r9
> 24: 7d 4a 51 10 subfe r10,r10,r10
> 28: 7d 29 3a 14 add r9,r9,r7
> 2c: 81 03 00 0c lwz r8,12(r3)
> 30: 7d 2a 48 50 subf r9,r10,r9
> 34: 7c e7 48 10 subfc r7,r7,r9
> 38: 7d 4a 51 10 subfe r10,r10,r10
> 3c: 7d 29 42 14 add r9,r9,r8
> 40: 7d 2a 48 50 subf r9,r10,r9
> 44: 80 e4 00 00 lwz r7,0(r4)
> 48: 7d 08 48 10 subfc r8,r8,r9
> 4c: 7d 4a 51 10 subfe r10,r10,r10
> 50: 7d 29 3a 14 add r9,r9,r7
> 54: 7d 2a 48 50 subf r9,r10,r9
> 58: 81 04 00 04 lwz r8,4(r4)
> 5c: 7c e7 48 10 subfc r7,r7,r9
> 60: 7d 4a 51 10 subfe r10,r10,r10
> 64: 7d 29 42 14 add r9,r9,r8
> 68: 7d 2a 48 50 subf r9,r10,r9
> 6c: 80 e4 00 08 lwz r7,8(r4)
> 70: 7d 08 48 10 subfc r8,r8,r9
> 74: 7d 4a 51 10 subfe r10,r10,r10
> 78: 7d 29 3a 14 add r9,r9,r7
> 7c: 7d 2a 48 50 subf r9,r10,r9
> 80: 81 04 00 0c lwz r8,12(r4)
> 84: 7c e7 48 10 subfc r7,r7,r9
> 88: 7d 4a 51 10 subfe r10,r10,r10
> 8c: 7d 29 42 14 add r9,r9,r8
> 90: 7d 2a 48 50 subf r9,r10,r9
> 94: 7d 08 48 10 subfc r8,r8,r9
> 98: 7d 4a 51 10 subfe r10,r10,r10
> 9c: 7d 29 2a 14 add r9,r9,r5
> a0: 7d 2a 48 50 subf r9,r10,r9
> a4: 7c a5 48 10 subfc r5,r5,r9
> a8: 7c 63 19 10 subfe r3,r3,r3
> ac: 7d 29 32 14 add r9,r9,r6
> b0: 7d 23 48 50 subf r9,r3,r9
> b4: 7c c6 48 10 subfc r6,r6,r9
> b8: 7c 63 19 10 subfe r3,r3,r3
> bc: 7c 63 48 50 subf r3,r3,r9
> c0: 54 6a 80 3e rotlwi r10,r3,16
> c4: 7c 63 52 14 add r3,r3,r10
> c8: 7c 63 18 f8 not r3,r3
> cc: 54 63 84 3e rlwinm r3,r3,16,16,31
> d0: 4e 80 00 20 blr
>
> 0000000000000000 <.csum_ipv6_magic>: (PPC64)
> 0: 81 23 00 00 lwz r9,0(r3)
> 4: 80 03 00 04 lwz r0,4(r3)
> 8: 81 63 00 08 lwz r11,8(r3)
> c: 7c e7 4a 14 add r7,r7,r9
> 10: 7f 89 38 40 cmplw cr7,r9,r7
> 14: 7d 47 02 14 add r10,r7,r0
> 18: 7d 30 10 26 mfocrf r9,1
> 1c: 55 29 f7 fe rlwinm r9,r9,30,31,31
> 20: 7d 4a 4a 14 add r10,r10,r9
> 24: 7f 80 50 40 cmplw cr7,r0,r10
> 28: 7d 2a 5a 14 add r9,r10,r11
> 2c: 80 03 00 0c lwz r0,12(r3)
> 30: 81 44 00 00 lwz r10,0(r4)
> 34: 7d 10 10 26 mfocrf r8,1
> 38: 55 08 f7 fe rlwinm r8,r8,30,31,31
> 3c: 7d 29 42 14 add r9,r9,r8
> 40: 81 04 00 04 lwz r8,4(r4)
> 44: 7f 8b 48 40 cmplw cr7,r11,r9
> 48: 7d 29 02 14 add r9,r9,r0
> 4c: 7d 70 10 26 mfocrf r11,1
> 50: 55 6b f7 fe rlwinm r11,r11,30,31,31
> 54: 7d 29 5a 14 add r9,r9,r11
> 58: 7f 80 48 40 cmplw cr7,r0,r9
> 5c: 7d 29 52 14 add r9,r9,r10
> 60: 7c 10 10 26 mfocrf r0,1
> 64: 54 00 f7 fe rlwinm r0,r0,30,31,31
> 68: 7d 69 02 14 add r11,r9,r0
> 6c: 7f 8a 58 40 cmplw cr7,r10,r11
> 70: 7c 0b 42 14 add r0,r11,r8
> 74: 81 44 00 08 lwz r10,8(r4)
> 78: 7c f0 10 26 mfocrf r7,1
> 7c: 54 e7 f7 fe rlwinm r7,r7,30,31,31
> 80: 7c 00 3a 14 add r0,r0,r7
> 84: 7f 88 00 40 cmplw cr7,r8,r0
> 88: 7d 20 52 14 add r9,r0,r10
> 8c: 80 04 00 0c lwz r0,12(r4)
> 90: 7d 70 10 26 mfocrf r11,1
> 94: 55 6b f7 fe rlwinm r11,r11,30,31,31
> 98: 7d 29 5a 14 add r9,r9,r11
> 9c: 7f 8a 48 40 cmplw cr7,r10,r9
> a0: 7d 29 02 14 add r9,r9,r0
> a4: 7d 70 10 26 mfocrf r11,1
> a8: 55 6b f7 fe rlwinm r11,r11,30,31,31
> ac: 7d 29 5a 14 add r9,r9,r11
> b0: 7f 80 48 40 cmplw cr7,r0,r9
> b4: 7d 29 2a 14 add r9,r9,r5
> b8: 7c 10 10 26 mfocrf r0,1
> bc: 54 00 f7 fe rlwinm r0,r0,30,31,31
> c0: 7d 29 02 14 add r9,r9,r0
> c4: 7f 85 48 40 cmplw cr7,r5,r9
> c8: 7c 09 32 14 add r0,r9,r6
> cc: 7d 50 10 26 mfocrf r10,1
> d0: 55 4a f7 fe rlwinm r10,r10,30,31,31
> d4: 7c 00 52 14 add r0,r0,r10
> d8: 7f 80 30 40 cmplw cr7,r0,r6
> dc: 7d 30 10 26 mfocrf r9,1
> e0: 55 29 ef fe rlwinm r9,r9,29,31,31
> e4: 7c 09 02 14 add r0,r9,r0
> e8: 54 03 80 3e rotlwi r3,r0,16
> ec: 7c 03 02 14 add r0,r3,r0
> f0: 7c 03 00 f8 not r3,r0
> f4: 78 63 84 22 rldicl r3,r3,48,48
> f8: 4e 80 00 20 blr
>
> This patch implements it in assembly for both PPC32 and PPC64
>
> Link: https://github.com/linuxppc/linux/issues/9
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/e9c4943a107b56696e4872cdffdba6
cheers
^ permalink raw reply
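For context, the arithmetic that both listings above implement is a ones'
complement sum with end-around carry, folded to 16 bits and inverted. The
following is a portable C sketch of that arithmetic only. It is not the
kernel's generic code and not the assembly from the patch; it treats every
input as an opaque 32-bit word (so byte order is deliberately ignored), and
all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Fold a 64-bit accumulator of 32-bit words down to 16 bits,
 * adding each carry back in (end-around carry). */
static uint16_t fold_to_16_sketch(uint64_t acc)
{
    acc = (acc & 0xffffffffu) + (acc >> 32);  /* may still carry into bit 32 */
    acc = (acc & 0xffffffffu) + (acc >> 32);  /* now fits in 32 bits */
    acc = (acc & 0xffffu) + (acc >> 16);      /* may still carry into bit 16 */
    acc = (acc & 0xffffu) + (acc >> 16);      /* now fits in 16 bits */
    return (uint16_t)acc;
}

/* Sum two 4-word addresses, a length word, a protocol word and an
 * initial partial sum, then return the inverted 16-bit fold. */
static uint16_t pseudo_header_csum_sketch(const uint32_t saddr[4],
                                          const uint32_t daddr[4],
                                          uint32_t len_word,
                                          uint32_t proto_word,
                                          uint32_t initial)
{
    uint64_t acc = initial;
    int i;

    assert(saddr != 0 && daddr != 0);

    /* A 64-bit accumulator collects all carries, so no per-add carry test. */
    for (i = 0; i < 4; i++) {
        acc += saddr[i];
        acc += daddr[i];
    }
    acc += len_word;
    acc += proto_word;

    return (uint16_t)~fold_to_16_sketch(acc);
}
```

Deferring the carries to a wide accumulator is one way to avoid the repeated
compare-and-add sequences seen in the compiler output; hand-written assembly
can instead use the CPU's carry flag directly.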
* Re: powerpc/Makefile: set -mcpu=860 flag for the 8xx
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20180528060834.72DE86F377@po14934vm.idsi0.si.c-s.fr>
On Mon, 2018-05-28 at 06:08:34 UTC, Christophe Leroy wrote:
> When compiled with GCC 8.1, vmlinux is significantly bigger than
> with GCC 4.8.
>
> When looking at the generated code with objdump, we notice that
> all functions and loops have a 16-byte alignment. This significantly
> increases the size of the kernel. It is pointless and even
> counterproductive as on the 8xx 'nop' also consumes one clock cycle.
>
> Size of vmlinux with GCC 4.8:
> text data bss dec hex filename
> 5801948 1626076 457796 7885820 7853fc vmlinux
>
> Size of vmlinux with GCC 8.1:
> text data bss dec hex filename
> 6764592 1630652 456476 8851720 871108 vmlinux
>
> Size of vmlinux with GCC 8.1 and this patch:
> text data bss dec hex filename
> 6331544 1631756 456476 8419776 8079c0 vmlinux
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/1c38976334c0efce1b285369a6037f
cheers
^ permalink raw reply
* Re: [v2] selftests/powerpc: Add perf breakpoint test
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Michael Neuling; +Cc: mikey, linuxppc-dev
In-Reply-To: <20180528232238.22495-1-mikey@neuling.org>
On Mon, 2018-05-28 at 23:22:38 UTC, Michael Neuling wrote:
> This tests perf hardware breakpoints (ie PERF_TYPE_BREAKPOINT) on
> powerpc.
>
> Signed-off-by: Michael Neuling <mikey@neuling.org>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/9c2d72d497a32788bf90f05610319a
cheers
^ permalink raw reply
* Re: [v2] powerpc/64: Fix build failure with GCC 8.1
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <f5f2de3e1a39613f7303bfdc0d2f2210d4c91910.1527573345.git.christophe.leroy@c-s.fr>
On Tue, 2018-05-29 at 06:03:53 UTC, Christophe Leroy wrote:
> CC arch/powerpc/kernel/nvram_64.o
> arch/powerpc/kernel/nvram_64.c: In function 'nvram_create_partition':
> arch/powerpc/kernel/nvram_64.c:1042:2: error: 'strncpy' specified bound 12 equals destination size [-Werror=stringop-truncation]
> strncpy(new_part->header.name, name, 12);
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> CC arch/powerpc/kernel/trace/ftrace.o
> In function 'make_field',
> inlined from 'ps3_repository_read_boot_dat_address' at arch/powerpc/platforms/ps3/repository.c:900:9:
> arch/powerpc/platforms/ps3/repository.c:106:2: error: 'strncpy' output truncated before terminating nul copying 8 bytes from a string of the same length [-Werror=stringop-truncation]
> strncpy((char *)&n, text, 8);
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/c95998811807d897ca112ea62d6671
cheers
^ permalink raw reply
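Some background on the errors quoted above: GCC 8 warns when strncpy() is
given a bound equal to the destination size, because the result may be left
without a terminating NUL. When the destination is a fixed-width field that
is intentionally not NUL-terminated, one common way to express that is a
length-bounded memcpy(). The sketch below shows that general approach under
that assumption; the helper name is hypothetical and this is not necessarily
what the applied patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill a fixed-width, zero-padded field that need not be NUL-terminated. */
static void fill_fixed_field_sketch(char *dst, size_t dst_size, const char *src)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL);

    /* Length of src, capped at the field width (a bounded strlen). */
    while (n < dst_size && src[n] != '\0')
        n++;

    memset(dst, 0, dst_size);  /* zero padding, as strncpy() would produce */
    memcpy(dst, src, n);       /* explicit length: no terminator is implied */
}
```

The resulting bytes match what strncpy() would have written, but the intent
(a field, not a C string) is explicit, so the truncation warning has nothing
to flag.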
* Re: powerpc/64s: Enhance the information in cpu_show_spectre_v1()
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Michal Suchanek, Benjamin Herrenschmidt, Paul Mackerras,
Michal Suchanek, Mauricio Faria de Oliveira, Nicholas Piggin,
Michael Neuling, linuxppc-dev, linux-kernel
In-Reply-To: <20180528131914.32231-1-msuchanek@suse.de>
On Mon, 2018-05-28 at 13:19:14 UTC, Michal Suchanek wrote:
> We now have barrier_nospec as mitigation so print it in
> cpu_show_spectre_v1 when enabled.
>
> Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/a377514519b9a20fa1ea9adddbb412
cheers
^ permalink raw reply
* Re: [V2, 1/4] powerpc/mm/hugetlb: Update huge_ptep_set_access_flags to call __ptep_set_access_flags directly
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Aneesh Kumar K.V, benh, paulus, npiggin; +Cc: Aneesh Kumar K.V, linuxppc-dev
In-Reply-To: <20180529142841.19428-1-aneesh.kumar@linux.ibm.com>
On Tue, 2018-05-29 at 14:28:38 UTC, "Aneesh Kumar K.V" wrote:
> In a later patch, we want to update __ptep_set_access_flags to take a page
> size arg. This makes ptep_set_access_flags only work with mmu_virtual_psize.
> To simplify the code make huge_ptep_set_access_flags directly call
> __ptep_set_access_flags so that we can compute the hugetlb page size in
> hugetlb function.
>
> Now that ptep_set_access_flags won't be called for hugetlb remove
> the is_vm_hugetlb_page() check and add the assert of pte lock
> unconditionally.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Series applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/f069ff396d657ac7bdb5de866c3ec2
cheers
^ permalink raw reply
* Re: powerpc/ptrace: Use copy_{from, to}_user() rather than open-coding
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Michael Ellerman, linuxppc-dev, viro; +Cc: malat
In-Reply-To: <20180529125738.24271-1-mpe@ellerman.id.au>
On Tue, 2018-05-29 at 12:57:38 UTC, Michael Ellerman wrote:
> From: Al Viro <viro@ZenIV.linux.org.uk>
>
> In PPC_PTRACE_GETHWDBGINFO and PPC_PTRACE_SETHWDEBUG we do an
> access_ok() check and then __copy_{from,to}_user().
>
> Instead we should just use copy_{from,to}_user() which does all that
> for us and is less error prone.
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> Reviewed-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Applied to powerpc next.
https://git.kernel.org/powerpc/c/6bcdd2972b9f6ebda9ae5c7075e2d5
cheers
^ permalink raw reply
* Re: [v6,1/2] powerpc/lib: optimise 32 bits __clear_user()
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras, segher
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <23d156759ff411f5fd932e167b8b5f5ecd6aa88b.1527663626.git.christophe.leroy@c-s.fr>
On Wed, 2018-05-30 at 07:06:13 UTC, Christophe Leroy wrote:
> Rewrite clear_user() on the same principle as memset(0), making use
> of dcbz to clear complete cache lines.
>
> This code is a copy/paste of memset(), with some modifications
> in order to retrieve remaining number of bytes to be cleared,
> as it needs to be returned in case of error.
>
> In the same way as done on PPC64 in commit 17968fbbd19f1
> ("powerpc: 64bit optimised __clear_user"), the patch moves
> __clear_user() into a dedicated file string_32.S
>
> On a MPC885, throughput is almost doubled:
>
> Before:
> ~# dd if=/dev/zero of=/dev/null bs=1M count=1000
> 1048576000 bytes (1000.0MB) copied, 18.990779 seconds, 52.7MB/s
>
> After:
> ~# dd if=/dev/zero of=/dev/null bs=1M count=1000
> 1048576000 bytes (1000.0MB) copied, 9.611468 seconds, 104.0MB/s
>
> On a MPC8321, throughput is multiplied by 2.12:
>
> Before:
> root@vgoippro:~# dd if=/dev/zero of=/dev/null bs=1M count=1000
> 1048576000 bytes (1000.0MB) copied, 6.844352 seconds, 146.1MB/s
>
> After:
> root@vgoippro:~# dd if=/dev/zero of=/dev/null bs=1M count=1000
> 1048576000 bytes (1000.0MB) copied, 3.218854 seconds, 310.7MB/s
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Series applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/f36bbf21e8b911b3c629fd36d4d217
cheers
^ permalink raw reply
* Re: [v3,1/3] powerpc/time: inline arch_vtime_task_switch()
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <2e937890abac10677aae3c1e345dd934a6794c37.1527610536.git.christophe.leroy@c-s.fr>
On Tue, 2018-05-29 at 16:19:14 UTC, Christophe Leroy wrote:
> arch_vtime_task_switch() is a small function which is called
> only from vtime_common_task_switch(), so it is worth inlining
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/60f1d2893ee6de65cdea609c84950b
cheers
^ permalink raw reply
* Re: [v3] powerpc: fix build failure by disabling attribute-alias warning
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras, segher
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <919de56550f431fda0e1073cd51519b2c2623294.1527609852.git.christophe.leroy@c-s.fr>
On Tue, 2018-05-29 at 16:06:41 UTC, Christophe Leroy wrote:
> The latest GCC version emits the following warnings
>
> As arch/powerpc code is built with -Werror, this breaks build with
> GCC 8.1
>
> This patch inhibits those warnings
>
> CC arch/powerpc/kernel/syscalls.o
> In file included from arch/powerpc/kernel/syscalls.c:24:
> ./include/linux/syscalls.h:233:18: error: 'sys_mmap2' alias between functions of incompatible types 'long int(long unsigned int, size_t, long unsigned int, long unsigned int, long unsigned int, long unsigned int)' {aka 'long int(long unsigned int, long unsigned int, long unsigned int, long unsigned int, long unsigned int, long unsigned int)'} and 'long int(long int, long int, long int, long int, long int, long int)' [-Werror=attribute-alias]
> asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
> ^~~
> ./include/linux/syscalls.h:222:2: note: in expansion of macro '__SYSCALL_DEFINEx'
> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
> ^~~~~~~~~~~~~~~~~
> ./include/linux/syscalls.h:216:36: note: in expansion of macro 'SYSCALL_DEFINEx'
> #define SYSCALL_DEFINE6(name, ...) SYSCALL_DEFINEx(6, _##name, __VA_ARGS__)
> ^~~~~~~~~~~~~~~
> arch/powerpc/kernel/syscalls.c:65:1: note: in expansion of macro 'SYSCALL_DEFINE6'
> SYSCALL_DEFINE6(mmap2, unsigned long, addr, size_t, len,
> ^~~~~~~~~~~~~~~
> ./include/linux/syscalls.h:238:18: note: aliased declaration here
> asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> ^~~~~~~~
> ./include/linux/syscalls.h:222:2: note: in expansion of macro '__SYSCALL_DEFINEx'
> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
> ^~~~~~~~~~~~~~~~~
> ./include/linux/syscalls.h:216:36: note: in expansion of macro 'SYSCALL_DEFINEx'
> #define SYSCALL_DEFINE6(name, ...) SYSCALL_DEFINEx(6, _##name, __VA_ARGS__)
> ^~~~~~~~~~~~~~~
> arch/powerpc/kernel/syscalls.c:65:1: note: in expansion of macro 'SYSCALL_DEFINE6'
> SYSCALL_DEFINE6(mmap2, unsigned long, addr, size_t, len,
> ^~~~~~~~~~~~~~~
> ./include/linux/syscalls.h:233:18: error: 'sys_mmap' alias between functions of incompatible types 'long int(long unsigned int, size_t, long unsigned int, long unsigned int, long unsigned int, off_t)' {aka 'long int(long unsigned int, long unsigned int, long unsigned int, long unsigned int, long unsigned int, long int)'} and 'long int(long int, long int, long int, long int, long int, long int)' [-Werror=attribute-alias]
> asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
> ^~~
> ./include/linux/syscalls.h:222:2: note: in expansion of macro '__SYSCALL_DEFINEx'
> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
> ^~~~~~~~~~~~~~~~~
> ./include/linux/syscalls.h:216:36: note: in expansion of macro 'SYSCALL_DEFINEx'
> #define SYSCALL_DEFINE6(name, ...) SYSCALL_DEFINEx(6, _##name, __VA_ARGS__)
> ^~~~~~~~~~~~~~~
> arch/powerpc/kernel/syscalls.c:72:1: note: in expansion of macro 'SYSCALL_DEFINE6'
> SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len,
> ^~~~~~~~~~~~~~~
> ./include/linux/syscalls.h:238:18: note: aliased declaration here
> asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> ^~~~~~~~
> ./include/linux/syscalls.h:222:2: note: in expansion of macro '__SYSCALL_DEFINEx'
> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
> ^~~~~~~~~~~~~~~~~
> ./include/linux/syscalls.h:216:36: note: in expansion of macro 'SYSCALL_DEFINEx'
> #define SYSCALL_DEFINE6(name, ...) SYSCALL_DEFINEx(6, _##name, __VA_ARGS__)
> ^~~~~~~~~~~~~~~
> arch/powerpc/kernel/syscalls.c:72:1: note: in expansion of macro 'SYSCALL_DEFINE6'
> SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len,
> ^~~~~~~~~~~~~~~
> CC arch/powerpc/kernel/signal_32.o
> In file included from arch/powerpc/kernel/signal_32.c:31:
> ./include/linux/compat.h:74:18: error: 'compat_sys_swapcontext' alias between functions of incompatible types 'long int(struct ucontext32 *, struct ucontext32 *, int)' and 'long int(long int, long int, long int)' [-Werror=attribute-alias]
> asmlinkage long compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
> ^~~~~~~~~~
> ./include/linux/compat.h:58:2: note: in expansion of macro 'COMPAT_SYSCALL_DEFINEx'
> COMPAT_SYSCALL_DEFINEx(3, _##name, __VA_ARGS__)
> ^~~~~~~~~~~~~~~~~~~~~~
> arch/powerpc/kernel/signal_32.c:1041:1: note: in expansion of macro 'COMPAT_SYSCALL_DEFINE3'
> COMPAT_SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, old_ctx,
> ^~~~~~~~~~~~~~~~~~~~~~
> ./include/linux/compat.h:79:18: note: aliased declaration here
> asmlinkage long __se_compat_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> ^~~~~~~~~~~~~~~
> ./include/linux/compat.h:58:2: note: in expansion of macro 'COMPAT_SYSCALL_DEFINEx'
> COMPAT_SYSCALL_DEFINEx(3, _##name, __VA_ARGS__)
> ^~~~~~~~~~~~~~~~~~~~~~
> arch/powerpc/kernel/signal_32.c:1041:1: note: in expansion of macro 'COMPAT_SYSCALL_DEFINE3'
> COMPAT_SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, old_ctx,
> ^~~~~~~~~~~~~~~~~~~~~~
> CC arch/powerpc/kernel/signal_64.o
> In file included from arch/powerpc/kernel/signal_64.c:27:
> ./include/linux/syscalls.h:233:18: error: 'sys_swapcontext' alias between functions of incompatible types 'long int(struct ucontext *, struct ucontext *, long int)' and 'long int(long int, long int, long int)' [-Werror=attribute-alias]
> asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
> ^~~
> ./include/linux/syscalls.h:222:2: note: in expansion of macro '__SYSCALL_DEFINEx'
> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
> ^~~~~~~~~~~~~~~~~
> ./include/linux/syscalls.h:213:36: note: in expansion of macro 'SYSCALL_DEFINEx'
> #define SYSCALL_DEFINE3(name, ...) SYSCALL_DEFINEx(3, _##name, __VA_ARGS__)
> ^~~~~~~~~~~~~~~
> arch/powerpc/kernel/signal_64.c:628:1: note: in expansion of macro 'SYSCALL_DEFINE3'
> SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, old_ctx,
> ^~~~~~~~~~~~~~~
> ./include/linux/syscalls.h:238:18: note: aliased declaration here
> asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> ^~~~~~~~
> ./include/linux/syscalls.h:222:2: note: in expansion of macro '__SYSCALL_DEFINEx'
> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
> ^~~~~~~~~~~~~~~~~
> ./include/linux/syscalls.h:213:36: note: in expansion of macro 'SYSCALL_DEFINEx'
> #define SYSCALL_DEFINE3(name, ...) SYSCALL_DEFINEx(3, _##name, __VA_ARGS__)
> ^~~~~~~~~~~~~~~
> arch/powerpc/kernel/signal_64.c:628:1: note: in expansion of macro 'SYSCALL_DEFINE3'
> SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, old_ctx,
> ^~~~~~~~~~~~~~~
> CC arch/powerpc/kernel/rtas.o
> In file included from arch/powerpc/kernel/rtas.c:29:
> ./include/linux/syscalls.h:233:18: error: 'sys_rtas' alias between functions of incompatible types 'long int(struct rtas_args *)' and 'long int(long int)' [-Werror=attribute-alias]
> asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
> ^~~
> ./include/linux/syscalls.h:222:2: note: in expansion of macro '__SYSCALL_DEFINEx'
> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
> ^~~~~~~~~~~~~~~~~
> ./include/linux/syscalls.h:211:36: note: in expansion of macro 'SYSCALL_DEFINEx'
> #define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
> ^~~~~~~~~~~~~~~
> arch/powerpc/kernel/rtas.c:1054:1: note: in expansion of macro 'SYSCALL_DEFINE1'
> SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
> ^~~~~~~~~~~~~~~
> ./include/linux/syscalls.h:238:18: note: aliased declaration here
> asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> ^~~~~~~~
> ./include/linux/syscalls.h:222:2: note: in expansion of macro '__SYSCALL_DEFINEx'
> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
> ^~~~~~~~~~~~~~~~~
> ./include/linux/syscalls.h:211:36: note: in expansion of macro 'SYSCALL_DEFINEx'
> #define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
> ^~~~~~~~~~~~~~~
> arch/powerpc/kernel/rtas.c:1054:1: note: in expansion of macro 'SYSCALL_DEFINE1'
> SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
> ^~~~~~~~~~~~~~~
> CC arch/powerpc/kernel/pci_64.o
> In file included from arch/powerpc/kernel/pci_64.c:23:
> ./include/linux/syscalls.h:233:18: error: 'sys_pciconfig_iobase' alias between functions of incompatible types 'long int(long int, long unsigned int, long unsigned int)' and 'long int(long int, long int, long int)' [-Werror=attribute-alias]
> asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
> ^~~
> ./include/linux/syscalls.h:222:2: note: in expansion of macro '__SYSCALL_DEFINEx'
> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
> ^~~~~~~~~~~~~~~~~
> ./include/linux/syscalls.h:213:36: note: in expansion of macro 'SYSCALL_DEFINEx'
> #define SYSCALL_DEFINE3(name, ...) SYSCALL_DEFINEx(3, _##name, __VA_ARGS__)
> ^~~~~~~~~~~~~~~
> arch/powerpc/kernel/pci_64.c:206:1: note: in expansion of macro 'SYSCALL_DEFINE3'
> SYSCALL_DEFINE3(pciconfig_iobase, long, which, unsigned long, in_bus,
> ^~~~~~~~~~~~~~~
> ./include/linux/syscalls.h:238:18: note: aliased declaration here
> asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> ^~~~~~~~
> ./include/linux/syscalls.h:222:2: note: in expansion of macro '__SYSCALL_DEFINEx'
> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
> ^~~~~~~~~~~~~~~~~
> ./include/linux/syscalls.h:213:36: note: in expansion of macro 'SYSCALL_DEFINEx'
> #define SYSCALL_DEFINE3(name, ...) SYSCALL_DEFINEx(3, _##name, __VA_ARGS__)
> ^~~~~~~~~~~~~~~
> arch/powerpc/kernel/pci_64.c:206:1: note: in expansion of macro 'SYSCALL_DEFINE3'
> SYSCALL_DEFINE3(pciconfig_iobase, long, which, unsigned long, in_bus,
> ^~~~~~~~~~~~~~~
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/2479bfc9bc600dcce7f932d52dcfa8
cheers
^ permalink raw reply
* Re: [v5, 1/4] powerpc/kbuild: set default generic machine type for 32-bit compile
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Nicholas Piggin, linux-kbuild
Cc: Masahiro Yamada, linuxppc-dev, Nicholas Piggin
In-Reply-To: <20180530121922.22122-2-npiggin@gmail.com>
On Wed, 2018-05-30 at 12:19:19 UTC, Nicholas Piggin wrote:
> Some 64-bit toolchains use the wrong ISA variant for compiling 32-bit
> kernels, even with -m32. Debian's powerpc64le is one such case, and
> that is because it is built with --with-cpu=power8.
>
> So when cross compiling a 32-bit kernel with a 64-bit toolchain, set
> -mcpu=powerpc initially, which is the generic 32-bit powerpc machine
> type and scheduling model. CPU and platform code can override this
> with subsequent -mcpu flags if necessary.
>
> This is not done for 32-bit toolchains otherwise it would override
> their defaults, which are presumably set appropriately for the
> environment (more so than a 64-bit cross compiler).
>
> This fixes a lot of build failures due to incompatible assembly when
> compiling a 32-bit kernel with the Debian powerpc64le 64-bit toolchain.
>
> Cc: Segher Boessenkool <segher@kernel.crashing.org>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/4bf4f42a2febb449a5cc5d79e7c58e
cheers
^ permalink raw reply
* Re: powerpc/64s: Fix compiler store ordering to SLB shadow area
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev; +Cc: Aneesh Kumar K . V, Nicholas Piggin
In-Reply-To: <20180530103122.27674-1-npiggin@gmail.com>
On Wed, 2018-05-30 at 10:31:22 UTC, Nicholas Piggin wrote:
> The stores to update the SLB shadow area must be made as they appear
> in the C code, so that the hypervisor does not see an entry with
> mismatched vsid and esid. Use WRITE_ONCE for this.
>
> GCC has been observed to elide the first store to esid in the update,
> which means that if the hypervisor interrupts the guest after storing
> to vsid, it could see an entry with old esid and new vsid, which may
> possibly result in memory corruption.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/926bc2f100c24d4842b3064b5af44a
cheers
^ permalink raw reply
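A user-space sketch of the store-ordering idea in the message above. The
struct, the helper and the three-step sequence are illustrative assumptions,
not the kernel's code; the helper only mimics the effect WRITE_ONCE is used
for here, by performing each store through a volatile-qualified pointer so
that the compiler neither drops nor reorders the stores relative to each
other (as GCC honors for such accesses).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-field entry that an outside observer may read at any time. */
struct shadow_entry_sketch {
    uint64_t esid;
    uint64_t vsid;
};

/* Store through a volatile pointer: the compiler must emit this store,
 * and must keep it in program order with other volatile accesses. */
static void store_once_u64_sketch(uint64_t *p, uint64_t v)
{
    *(volatile uint64_t *)p = v;
}

static void shadow_update_sketch(struct shadow_entry_sketch *e,
                                 uint64_t new_esid, uint64_t new_vsid)
{
    assert(e != 0);

    store_once_u64_sketch(&e->esid, 0);         /* 1. mark the entry invalid */
    store_once_u64_sketch(&e->vsid, new_vsid);  /* 2. update the payload */
    store_once_u64_sketch(&e->esid, new_esid);  /* 3. publish the new entry */
}
```

With plain assignments, a compiler may treat the first store to esid as dead
because it is overwritten in step 3, which is the elision the message
describes. Note that this sketch constrains only the compiler; it says
nothing about hardware memory ordering.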
* Re: [kernel] powerpc/powernv/ioda2: Remove redundant free of TCE pages
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Alexey Kardashevskiy, linuxppc-dev
Cc: Alexey Kardashevskiy, Oliver O'Halloran, Andrew Donnellan
In-Reply-To: <20180530092250.28981-1-aik@ozlabs.ru>
On Wed, 2018-05-30 at 09:22:50 UTC, Alexey Kardashevskiy wrote:
> When IODA2 creates a PE, it creates an IOMMU table with it_ops::free
> set to pnv_ioda2_table_free() which calls pnv_pci_ioda2_table_free_pages().
>
> Since iommu_tce_table_put() calls it_ops::free when the last reference
> to the table is released, explicit call to pnv_pci_ioda2_table_free_pages()
> is not needed so let's remove it.
>
> This should fix a double free in the case of PCI hotplug, as
> pnv_pci_ioda2_table_free_pages() resets neither
> iommu_table::it_base nor ::it_size.
>
> This was not exposed by SRIOV as it uses different code path via
> pnv_pcibios_sriov_disable().
>
> IODA1 does not initialize it_ops::free so it does not have this issue.
>
> Fixes: c5f7700bb "powerpc/powernv: Dynamically release PE"
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/98fd72fe82527fd26618062b60cfd3
cheers
^ permalink raw reply
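The ownership rule described in the message above can be sketched in plain C
as follows. All types and names here are hypothetical stand-ins, not the
kernel's iommu_table API: the point is only that when the last reference
drop already invokes the table's free callback, an additional explicit call
to the same free routine releases the pages twice.

```c
#include <assert.h>
#include <stddef.h>

struct table_sketch;

struct table_ops_sketch {
    void (*free)(struct table_sketch *tbl);
};

struct table_sketch {
    int refcount;
    const struct table_ops_sketch *ops;
    int free_calls;  /* counts how many times the pages were released */
};

/* Stand-in for the routine that releases the table's pages. */
static void table_free_pages_sketch(struct table_sketch *tbl)
{
    tbl->free_calls++;
}

static const struct table_ops_sketch ioda2_like_ops_sketch = {
    .free = table_free_pages_sketch,
};

/* Drop one reference; the last drop calls ops->free, if one is set. */
static void table_put_sketch(struct table_sketch *tbl)
{
    assert(tbl != NULL && tbl->refcount > 0);

    if (--tbl->refcount == 0 && tbl->ops != NULL && tbl->ops->free != NULL)
        tbl->ops->free(tbl);
}
```

In this model a caller that invokes table_free_pages_sketch() directly and
then drops the last reference would bump free_calls to 2, which corresponds
to the double free the patch removes. A table whose ops leave free unset
(as the message says of IODA1) is unaffected.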
* Re: [v5, 3/4] powerpc/kbuild: Use flags variables rather than overriding LD/CC/AS
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Nicholas Piggin, linux-kbuild
Cc: Masahiro Yamada, linuxppc-dev, Nicholas Piggin
In-Reply-To: <20180530121922.22122-4-npiggin@gmail.com>
On Wed, 2018-05-30 at 12:19:21 UTC, Nicholas Piggin wrote:
> The powerpc toolchain can compile combinations of 32/64 bit and
> big/little endian, so it's convenient to consider, e.g.,
>
> `CC -m64 -mbig-endian`
>
> to be the C compiler for the purpose of invoking it to build target
> artifacts. So overriding the CC variable to include these flags
> works for this purpose.
>
> Unfortunately that is not compatible with the way the proposed new
> Kconfig macro language will work.
>
> After previous patches in this series, these flags can be carefully
> passed in using flags instead.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/1421dc6d48296a9e91702743b31458
cheers
^ permalink raw reply
* Re: [v5, 2/4] powerpc/kbuild: remove CROSS32 defines from top level powerpc Makefile
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Nicholas Piggin, linux-kbuild
Cc: Masahiro Yamada, linuxppc-dev, Nicholas Piggin
In-Reply-To: <20180530121922.22122-3-npiggin@gmail.com>
On Wed, 2018-05-30 at 12:19:20 UTC, Nicholas Piggin wrote:
> Switch VDSO32 build over to use CROSS32_COMPILE directly, and have
> it pass in -m32 after the standard c_flags. This allows endianness
> overrides to be removed and the endian and bitness flags moved into
> standard flags variables.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/af3901cbbd3de182aafb8ee553c825
cheers
^ permalink raw reply
* Re: powerpc/mm/hash: Add missing update in slb update sequence.
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Aneesh Kumar K.V, benh, paulus, npiggin; +Cc: Aneesh Kumar K.V, linuxppc-dev
In-Reply-To: <20180530131804.1706-1-aneesh.kumar@linux.ibm.com>
On Wed, 2018-05-30 at 13:18:04 UTC, "Aneesh Kumar K.V" wrote:
> From ISA
>
> "For data accesses, the context synchronizing instruction before the slbie,
> slbieg, slbia, slbmte, tlbie, or tlbiel instruction ensures that all preceding
> instructions that access data storage have completed to a point at which they
> have reported all exceptions they will cause."
>
> Add the missing isync when updating Kernel stack slb entry.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/91d06971881f71d945910de1286580
cheers
^ permalink raw reply
* Re: [v2,20/21] powerpc/xmon: use match_string() helper
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Yisheng Xie, linux-kernel
Cc: Yisheng Xie, andy.shevchenko, Paul Mackerras, linuxppc-dev
In-Reply-To: <1527765086-19873-21-git-send-email-xieyisheng1@huawei.com>
On Thu, 2018-05-31 at 11:11:25 UTC, Yisheng Xie wrote:
> match_string() returns the index of the matching string in an array,
> which can be used instead of the open-coded variant.
>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/0abbf2bfdc9dec32e9832aa8d4522a
cheers
^ permalink raw reply
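For readers unfamiliar with the helper, the lookup it replaces can be sketched in userspace C. This is a minimal sketch of the behaviour described in the commit message, not the kernel source: toy_match_string() is a hypothetical name, and the details beyond "returns the index of the matching string" (stopping at a NULL entry, returning -EINVAL on a miss) are stated here as assumptions about how such a helper typically behaves.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace sketch: return the index of the first entry equal to
 * `string`, or -EINVAL when nothing matches. The scan stops early at a
 * NULL entry. Named toy_match_string to make clear that it is an
 * illustration, not the kernel helper itself.
 */
static int toy_match_string(const char * const *array, size_t n,
			    const char *string)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!array[i])
			break;
		if (strcmp(array[i], string) == 0)
			return (int)i;
	}
	return -EINVAL;
}
```

A caller can use the returned index directly as an enum value or table offset, which is what makes a hand-written strcmp() loop unnecessary.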
* Re: powerpc/mm/hash: hard disable irq in the SLB insert path
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Aneesh Kumar K.V, benh, paulus, npiggin; +Cc: Aneesh Kumar K.V, linuxppc-dev
In-Reply-To: <20180601082402.17192-1-aneesh.kumar@linux.ibm.com>
On Fri, 2018-06-01 at 08:24:02 UTC, "Aneesh Kumar K.V" wrote:
> When inserting SLB entries for EA above 512TB, we need to hard disable irq.
> This makes sure we don't take a PMU interrupt that can possibly touch
> a user space address via a stack dump.
>
> Also add a comment explaining why we don't need context synchronizing isync
> with slbmte.
>
> Fixes: f384796c4 ("powerpc/mm: Add support for handling > 512TB address in SLB miss")
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/a5db5060e0b2e27605df272224bfd4
cheers
^ permalink raw reply
* Re: powerpc/mm/hugetlb: Update hugetlb related locks
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Aneesh Kumar K.V, benh, paulus, npiggin; +Cc: Aneesh Kumar K.V, linuxppc-dev
In-Reply-To: <20180601082424.17393-1-aneesh.kumar@linux.ibm.com>
On Fri, 2018-06-01 at 08:24:24 UTC, "Aneesh Kumar K.V" wrote:
> With split pmd page table lock enabled, we don't use mm->page_table_lock when
> updating pmd entries. This patch updates the hugetlb path to use the right lock
> when inserting huge page directory entries into the page table.
>
> ex: if we are using hugepd and inserting a hugepd entry at the pmd level, we
> use pmd_lockptr, which, depending on config, can be the split pmd lock.
>
> For updating the huge page directory entries themselves we use mm->page_table_lock. We
> do have a helper huge_pte_lockptr() for that.
>
> Fixes: 675d99529 ("powerpc/book3s64: Enable split pmd ptlock")
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/ed515b6898c36775ddd99ff9ffeda4
cheers
^ permalink raw reply
* Re: [v4, 1/7] powerpc/64s/radix: do not flush TLB when relaxing access
From: Michael Ellerman @ 2018-06-04 14:11 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev; +Cc: Aneesh Kumar K . V, Nicholas Piggin
In-Reply-To: <20180601100121.393-2-npiggin@gmail.com>
On Fri, 2018-06-01 at 10:01:15 UTC, Nicholas Piggin wrote:
> Radix flushes the TLB when updating ptes to increase permissiveness
> of protection (increase access authority). Book3S does not require
> TLB flushing in this case, and it is not done on hash. This patch
> avoids the flush for radix.
>
> From Power ISA v3.0B, p.1090:
>
> Setting a Reference or Change Bit or Upgrading Access Authority
> (PTE Subject to Atomic Hardware Updates)
>
> If the only change being made to a valid PTE that is subject to
> atomic hardware updates is to set the Reference or Change bit to 1
> or to add access authorities, a simpler sequence suffices because
> the translation hardware will refetch the PTE if an access is
> attempted for which the only problems were reference and/or change
> bits needing to be set or insufficient access authority.
>
> The nest MMU on POWER9 does not re-fetch the PTE after such an access
> attempt before faulting, so address spaces with a coprocessor
> attached will continue to flush in these cases.
>
> This reduces tlbies for a kernel compile workload from 1.28M to 0.95M,
> tlbiels from 20.17M to 19.68M.
>
> fork --fork --exec benchmark improved 2.77% (12000->12300).
>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Series applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/e5f7cb58c2b77a0249c2028b6d1ec4
cheers
^ permalink raw reply