* [PATCH kernel v2] powerpc/kuap: Restore AMR after replaying soft interrupts
From: Alexey Kardashevskiy @ 2020-12-03 5:47 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Alexey Kardashevskiy, Nicholas Piggin
When interrupted in raw_copy_from_user()/... after user memory access
is enabled, a nested handler may also access user memory (perf is
one example) and when it does so, it calls prevent_read_from_user()
which prevents the upper handler from accessing user memory.
This saves/restores AMR when replaying interrupts.
get_kuap/set_kuap have stubs for disabled KUAP on RADIX but there are
none for hash-only configs (BOOK3E) so this adds stubs and moves
AMR_KUAP_BLOCK_xxx.
Found by syzkaller. More likely to break with enabled
CONFIG_DEBUG_ATOMIC_SLEEP, the call chain is
timer_interrupt -> ktime_get -> read_seqcount_begin -> local_irq_restore.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
Changes:
v2:
* fixed compile on hash
* moved get/set to arch_local_irq_restore
* block KUAP before replaying
---
This is an example:
------------[ cut here ]------------
Bug: Read fault blocked by AMR!
WARNING: CPU: 0 PID: 1603 at /home/aik/p/kernel/arch/powerpc/include/asm/book3s/64/kup-radix.h:145 __do_page_fau
Modules linked in:
CPU: 0 PID: 1603 Comm: amr Not tainted 5.10.0-rc6_v5.10-rc6_a+fstn1 #24
NIP: c00000000009ece8 LR: c00000000009ece4 CTR: 0000000000000000
REGS: c00000000dc63560 TRAP: 0700 Not tainted (5.10.0-rc6_v5.10-rc6_a+fstn1)
MSR: 8000000000021033 <SF,ME,IR,DR,RI,LE> CR: 28002888 XER: 20040000
CFAR: c0000000001fa928 IRQMASK: 1
GPR00: c00000000009ece4 c00000000dc637f0 c000000002397600 000000000000001f
GPR04: c0000000020eb318 0000000000000000 c00000000dc63494 0000000000000027
GPR08: c00000007fe4de68 c00000000dfe9180 0000000000000000 0000000000000001
GPR12: 0000000000002000 c0000000030a0000 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 bfffffffffffffff
GPR20: 0000000000000000 c0000000134a4020 c0000000019c2218 0000000000000fe0
GPR24: 0000000000000000 0000000000000000 c00000000d106200 0000000040000000
GPR28: 0000000000000000 0000000000000300 c00000000dc63910 c000000001946730
NIP [c00000000009ece8] __do_page_fault+0xb38/0xde0
LR [c00000000009ece4] __do_page_fault+0xb34/0xde0
Call Trace:
[c00000000dc637f0] [c00000000009ece4] __do_page_fault+0xb34/0xde0 (unreliable)
[c00000000dc638a0] [c00000000000c968] handle_page_fault+0x10/0x2c
--- interrupt: 300 at strncpy_from_user+0x290/0x440
LR = strncpy_from_user+0x284/0x440
[c00000000dc63ba0] [c000000000c3dcb0] strncpy_from_user+0x2f0/0x440 (unreliable)
[c00000000dc63c30] [c00000000068b888] getname_flags+0x88/0x2c0
[c00000000dc63c90] [c000000000662a44] do_sys_openat2+0x2d4/0x5f0
[c00000000dc63d30] [c00000000066560c] do_sys_open+0xcc/0x140
[c00000000dc63dc0] [c000000000045e10] system_call_exception+0x160/0x240
[c00000000dc63e20] [c00000000000da60] system_call_common+0xf0/0x27c
Instruction dump:
409c0048 3fe2ff5b 3bfff128 fac10060 fae10068 482f7a85 60000000 3c62ff5b
7fe4fb78 3863f250 4815bbd9 60000000 <0fe00000> 3c62ff5b 3863f2b8 4815c8b5
irq event stamp: 254
hardirqs last enabled at (253): [<c000000000019550>] arch_local_irq_restore+0xa0/0x150
hardirqs last disabled at (254): [<c000000000008a10>] data_access_common_virt+0x1b0/0x1d0
softirqs last enabled at (0): [<c0000000001f6d5c>] copy_process+0x78c/0x2120
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace ba98aec5151f3aeb ]---
---
arch/powerpc/include/asm/book3s/64/kup-radix.h | 3 ---
arch/powerpc/include/asm/kup.h | 10 ++++++++++
arch/powerpc/kernel/irq.c | 6 ++++++
3 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/kup-radix.h b/arch/powerpc/include/asm/book3s/64/kup-radix.h
index a39e2d193fdc..4ad607461b75 100644
--- a/arch/powerpc/include/asm/book3s/64/kup-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/kup-radix.h
@@ -5,9 +5,6 @@
#include <linux/const.h>
#include <asm/reg.h>
-#define AMR_KUAP_BLOCK_READ UL(0x4000000000000000)
-#define AMR_KUAP_BLOCK_WRITE UL(0x8000000000000000)
-#define AMR_KUAP_BLOCKED (AMR_KUAP_BLOCK_READ | AMR_KUAP_BLOCK_WRITE)
#define AMR_KUAP_SHIFT 62
#ifdef __ASSEMBLY__
diff --git a/arch/powerpc/include/asm/kup.h b/arch/powerpc/include/asm/kup.h
index 0d93331d0fab..e63930767a96 100644
--- a/arch/powerpc/include/asm/kup.h
+++ b/arch/powerpc/include/asm/kup.h
@@ -14,6 +14,10 @@
#define KUAP_CURRENT_WRITE 8
#define KUAP_CURRENT (KUAP_CURRENT_READ | KUAP_CURRENT_WRITE)
+#define AMR_KUAP_BLOCK_READ UL(0x4000000000000000)
+#define AMR_KUAP_BLOCK_WRITE UL(0x8000000000000000)
+#define AMR_KUAP_BLOCKED (AMR_KUAP_BLOCK_READ | AMR_KUAP_BLOCK_WRITE)
+
#ifdef CONFIG_PPC_BOOK3S_64
#include <asm/book3s/64/kup-radix.h>
#endif
@@ -64,6 +68,12 @@ bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
}
static inline void kuap_check_amr(void) { }
+static inline unsigned long get_kuap(void)
+{
+ return AMR_KUAP_BLOCKED;
+}
+
+static inline void set_kuap(unsigned long value) { }
/*
* book3s/64/kup-radix.h defines these functions for the !KUAP case to flush
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 7d0f7682d01d..d9fd46da04d6 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -314,6 +314,7 @@ void replay_soft_interrupts(void)
notrace void arch_local_irq_restore(unsigned long mask)
{
unsigned char irq_happened;
+ unsigned long kuap_state;
/* Write the new soft-enabled value */
irq_soft_mask_set(mask);
@@ -373,9 +374,14 @@ notrace void arch_local_irq_restore(unsigned long mask)
irq_soft_mask_set(IRQS_ALL_DISABLED);
trace_hardirqs_off();
+ kuap_state = get_kuap();
+ set_kuap(AMR_KUAP_BLOCKED);
+
replay_soft_interrupts();
local_paca->irq_happened = 0;
+ set_kuap(kuap_state);
+
trace_hardirqs_on();
irq_soft_mask_set(IRQS_ENABLED);
__hard_irq_enable();
--
2.17.1
^ permalink raw reply related
* Re: [powerpc:next-test 121/184] arch/powerpc/kernel/firmware.c:31:9-10: WARNING: return of 0/1 in function 'check_kvm_guest' with return type bool
From: Srikar Dronamraju @ 2020-12-03 6:12 UTC (permalink / raw)
To: kernel test robot; +Cc: linuxppc-dev, kbuild-all, Nicholas Piggin
In-Reply-To: <202012030759.zuEULDQ3-lkp@intel.com>
Hi,
Thanks for the report.
* kernel test robot <lkp@intel.com> [2020-12-03 07:25:06]:
> tree: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test
> head: fb003959777a635dea8910cf71109b612c7f940c
> commit: 77354ecf8473208a5cc5f20a501760f7d6d631cd [121/184] powerpc: Rename is_kvm_guest() to check_kvm_guest()
Sorry for the nit pick, but the commit should have been
Commit 107c55005fbd ("powerpc/pseries: Add KVM guest doorbell restrictions")
because thats the patch that introduced is_kvm_guest.
my patch was just renaming is_kvm_guest to check_kvm_guest.
> config: powerpc-randconfig-c003-20201202 (attached as .config)
> compiler: powerpc64le-linux-gcc (GCC) 9.3.0
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
>
> "coccinelle warnings: (new ones prefixed by >>)"
> >> arch/powerpc/kernel/firmware.c:31:9-10: WARNING: return of 0/1 in function 'check_kvm_guest' with return type bool
>
> Please review and possibly fold the followup patch.
>
But I agree we can still fold this the followup patch into the
Rename is_kvm_guest() to check_kvm_guest().
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
--
Thanks and Regards
Srikar Dronamraju
^ permalink raw reply
* Re: powerpc 5.10-rcN boot failures with RCU_SCALE_TEST=m
From: Michael Ellerman @ 2020-12-03 6:22 UTC (permalink / raw)
To: Uladzislau Rezki, Paul E . McKenney
Cc: rcu, linuxppc-dev, Paul E . McKenney, Daniel Axtens
In-Reply-To: <20201202143955.GA12300@pc636>
Uladzislau Rezki <urezki@gmail.com> writes:
> On Thu, Dec 03, 2020 at 01:03:32AM +1100, Michael Ellerman wrote:
...
>>
>> The SMP bringup stalls because _cpu_up() is blocked trying to take
>> cpu_hotplug_lock for writing:
>>
>> [ 401.403132][ T0] task:swapper/0 state:D stack:12512 pid: 1 ppid: 0 flags:0x00000800
>> [ 401.403502][ T0] Call Trace:
>> [ 401.403907][ T0] [c0000000062c37d0] [c0000000062c3830] 0xc0000000062c3830 (unreliable)
>> [ 401.404068][ T0] [c0000000062c39b0] [c000000000019d70] __switch_to+0x2e0/0x4a0
>> [ 401.404189][ T0] [c0000000062c3a10] [c000000000b87228] __schedule+0x288/0x9b0
>> [ 401.404257][ T0] [c0000000062c3ad0] [c000000000b879b8] schedule+0x68/0x120
>> [ 401.404324][ T0] [c0000000062c3b00] [c000000000184ad4] percpu_down_write+0x164/0x170
>> [ 401.404390][ T0] [c0000000062c3b50] [c000000000116b68] _cpu_up+0x68/0x280
>> [ 401.404475][ T0] [c0000000062c3bb0] [c000000000116e70] cpu_up+0xf0/0x140
>> [ 401.404546][ T0] [c0000000062c3c30] [c00000000011776c] bringup_nonboot_cpus+0xac/0xf0
>> [ 401.404643][ T0] [c0000000062c3c80] [c000000000eea1b8] smp_init+0x40/0xcc
>> [ 401.404727][ T0] [c0000000062c3ce0] [c000000000ec43dc] kernel_init_freeable+0x1e0/0x3a0
>> [ 401.404799][ T0] [c0000000062c3db0] [c000000000011ec4] kernel_init+0x24/0x150
>> [ 401.404958][ T0] [c0000000062c3e20] [c00000000000daf0] ret_from_kernel_thread+0x5c/0x6c
>>
>> It can't get it because kprobe_optimizer() has taken it for read and is now
>> blocked waiting for synchronize_rcu_tasks():
>>
>> [ 401.418808][ T0] task:kworker/0:1 state:D stack:13392 pid: 12 ppid: 2 flags:0x00000800
>> [ 401.418951][ T0] Workqueue: events kprobe_optimizer
>> [ 401.419078][ T0] Call Trace:
>> [ 401.419121][ T0] [c0000000062ef650] [c0000000062ef710] 0xc0000000062ef710 (unreliable)
>> [ 401.419213][ T0] [c0000000062ef830] [c000000000019d70] __switch_to+0x2e0/0x4a0
>> [ 401.419281][ T0] [c0000000062ef890] [c000000000b87228] __schedule+0x288/0x9b0
>> [ 401.419347][ T0] [c0000000062ef950] [c000000000b879b8] schedule+0x68/0x120
>> [ 401.419415][ T0] [c0000000062ef980] [c000000000b8e664] schedule_timeout+0x2a4/0x340
>> [ 401.419484][ T0] [c0000000062efa80] [c000000000b894ec] wait_for_completion+0x9c/0x170
>> [ 401.419552][ T0] [c0000000062efae0] [c0000000001ac85c] __wait_rcu_gp+0x19c/0x210
>> [ 401.419619][ T0] [c0000000062efb40] [c0000000001ac90c] synchronize_rcu_tasks_generic+0x3c/0x70
>> [ 401.419690][ T0] [c0000000062efbe0] [c00000000022a3dc] kprobe_optimizer+0x1dc/0x470
>> [ 401.419757][ T0] [c0000000062efc60] [c000000000136684] process_one_work+0x2f4/0x530
>> [ 401.419823][ T0] [c0000000062efd20] [c000000000138d28] worker_thread+0x78/0x570
>> [ 401.419891][ T0] [c0000000062efdb0] [c000000000142424] kthread+0x194/0x1a0
>> [ 401.419976][ T0] [c0000000062efe20] [c00000000000daf0] ret_from_kernel_thread+0x5c/0x6c
>>
>> But why is the synchronize_rcu_tasks() not completing?
>>
> I think that it is because RCU is not fully initialized by that time.
Yeah that would explain it :)
> The 36dadef23fcc ("kprobes: Init kprobes in early_initcall") patch
> switches to early_initcall() that has a higher priority sequence than
> core_initcall() that is used to complete an RCU setup in the rcu_set_runtime_mode().
I was looking at debug_lockdep_rcu_enabled(), which is:
noinstr int notrace debug_lockdep_rcu_enabled(void)
{
return rcu_scheduler_active != RCU_SCHEDULER_INACTIVE && debug_locks &&
current->lockdep_recursion == 0;
}
That is not firing any warnings for me because rcu_scheduler_active is:
(gdb) p/x rcu_scheduler_active
$1 = 0x1
Which is:
#define RCU_SCHEDULER_INIT 1
But that's different to RCU_SCHEDULER_RUNNING, which is set in
rcu_set_runtime_mode() as you mentioned:
static int __init rcu_set_runtime_mode(void)
{
rcu_test_sync_prims();
rcu_scheduler_active = RCU_SCHEDULER_RUNNING;
kfree_rcu_scheduler_running();
rcu_test_sync_prims();
return 0;
}
The comment on rcu_scheduler_active implies that once we're at
RCU_SCHEDULER_INIT things should work:
/*
* The rcu_scheduler_active variable is initialized to the value
* RCU_SCHEDULER_INACTIVE and transitions RCU_SCHEDULER_INIT just before the
* first task is spawned. So when this variable is RCU_SCHEDULER_INACTIVE,
* RCU can assume that there is but one task, allowing RCU to (for example)
* optimize synchronize_rcu() to a simple barrier(). When this variable
* is RCU_SCHEDULER_INIT, RCU must actually do all the hard work required
* to detect real grace periods. This variable is also used to suppress
* boot-time false positives from lockdep-RCU error checking. Finally, it
* transitions from RCU_SCHEDULER_INIT to RCU_SCHEDULER_RUNNING after RCU
* is fully initialized, including all of its kthreads having been spawned.
*/
So I'm not sure, the comments and the debug checks imply that it is OK
for kprobes to be using RCU this early.
I guess I'll keep digging.
cheers
^ permalink raw reply
* Re: [PATCH kernel v2] powerpc/kuap: Restore AMR after replaying soft interrupts
From: Aneesh Kumar K.V @ 2020-12-03 6:38 UTC (permalink / raw)
To: Alexey Kardashevskiy, linuxppc-dev; +Cc: Alexey Kardashevskiy, Nicholas Piggin
In-Reply-To: <20201203054724.44838-1-aik@ozlabs.ru>
Alexey Kardashevskiy <aik@ozlabs.ru> writes:
> When interrupted in raw_copy_from_user()/... after user memory access
> is enabled, a nested handler may also access user memory (perf is
> one example) and when it does so, it calls prevent_read_from_user()
> which prevents the upper handler from accessing user memory.
>
> This saves/restores AMR when replaying interrupts.
>
> get_kuap/set_kuap have stubs for disabled KUAP on RADIX but there are
> none for hash-only configs (BOOK3E) so this adds stubs and moves
> AMR_KUAP_BLOCK_xxx.
>
> Found by syzkaller. More likely to break with enabled
> CONFIG_DEBUG_ATOMIC_SLEEP, the call chain is
> timer_interrupt -> ktime_get -> read_seqcount_begin -> local_irq_restore.
Can you test this with https://github.com/kvaneesh/linux/commits/hash-kuap-reworked-2
We do save restore AMR on interrupt entry and exit.
-aneesh
^ permalink raw reply
* Re: [MOCKUP] x86/mm: Lightweight lazy mm refcounting
From: kernel test robot @ 2020-12-03 7:11 UTC (permalink / raw)
To: Andy Lutomirski, Nicholas Piggin
Cc: linux-arch, kbuild-all, Arnd Bergmann, Will Deacon, X86 ML, LKML,
Linux-MM, Mathieu Desnoyers, linuxppc-dev
In-Reply-To: <7c4bcc0a464ca60be1e0aeba805a192be0ee81e5.1606972194.git.luto@kernel.org>
[-- Attachment #1: Type: text/plain, Size: 2383 bytes --]
Hi Andy,
I love your patch! Yet something to improve:
[auto build test ERROR on tip/x86/core]
[also build test ERROR on tip/x86/mm soc/for-next linus/master v5.10-rc6 next-20201201]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Andy-Lutomirski/x86-mm-Lightweight-lazy-mm-refcounting/20201203-132706
base: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 238c91115cd05c71447ea071624a4c9fe661f970
config: m68k-randconfig-s032-20201203 (attached as .config)
compiler: m68k-linux-gcc (GCC) 9.3.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.3-179-ga00755aa-dirty
# https://github.com/0day-ci/linux/commit/0d9b23b22e621d8e588095b4d0f9f39110a57901
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Andy-Lutomirski/x86-mm-Lightweight-lazy-mm-refcounting/20201203-132706
git checkout 0d9b23b22e621d8e588095b4d0f9f39110a57901
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=m68k
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
m68k-linux-ld: kernel/fork.o: in function `__mmput':
>> fork.c:(.text+0x214): undefined reference to `arch_fixup_lazy_mm_refs'
m68k-linux-ld: kernel/fork.o: in function `mmput_async_fn':
fork.c:(.text+0x990): undefined reference to `arch_fixup_lazy_mm_refs'
m68k-linux-ld: kernel/fork.o: in function `mmput':
fork.c:(.text+0xa68): undefined reference to `arch_fixup_lazy_mm_refs'
m68k-linux-ld: kernel/fork.o: in function `dup_mm.isra.0':
fork.c:(.text+0x1192): undefined reference to `arch_fixup_lazy_mm_refs'
m68k-linux-ld: kernel/fork.o: in function `mm_access':
fork.c:(.text+0x13c6): undefined reference to `arch_fixup_lazy_mm_refs'
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 31262 bytes --]
^ permalink raw reply
* Re: [PATCH kernel v2] powerpc/kuap: Restore AMR after replaying soft interrupts
From: kernel test robot @ 2020-12-03 7:41 UTC (permalink / raw)
To: Alexey Kardashevskiy, linuxppc-dev
Cc: Alexey Kardashevskiy, kbuild-all, Nicholas Piggin
In-Reply-To: <20201203054724.44838-1-aik@ozlabs.ru>
[-- Attachment #1: Type: text/plain, Size: 9162 bytes --]
Hi Alexey,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on linus/master]
[also build test WARNING on v5.10-rc6 next-20201201]
[cannot apply to powerpc/next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Alexey-Kardashevskiy/powerpc-kuap-Restore-AMR-after-replaying-soft-interrupts/20201203-135010
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 34816d20f173a90389c8a7e641166d8ea9dce70a
config: powerpc-wii_defconfig (attached as .config)
compiler: powerpc-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/d2a99260ef6d241abe6fb961ee3ed84bbc5db7f5
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Alexey-Kardashevskiy/powerpc-kuap-Restore-AMR-after-replaying-soft-interrupts/20201203-135010
git checkout d2a99260ef6d241abe6fb961ee3ed84bbc5db7f5
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
In file included from arch/powerpc/include/asm/uaccess.h:9,
from include/linux/uaccess.h:11,
from include/linux/sched/task.h:11,
from include/linux/sched/signal.h:9,
from include/linux/rcuwait.h:6,
from include/linux/percpu-rwsem.h:7,
from include/linux/fs.h:33,
from include/linux/compat.h:17,
from arch/powerpc/kernel/asm-offsets.c:14:
arch/powerpc/include/asm/kup.h: In function 'get_kuap':
>> arch/powerpc/include/asm/kup.h:19:26: warning: conversion from 'long long unsigned int' to 'long unsigned int' changes value from '13835058055282163712' to '0' [-Woverflow]
19 | #define AMR_KUAP_BLOCKED (AMR_KUAP_BLOCK_READ | AMR_KUAP_BLOCK_WRITE)
| ^
arch/powerpc/include/asm/kup.h:73:9: note: in expansion of macro 'AMR_KUAP_BLOCKED'
73 | return AMR_KUAP_BLOCKED;
| ^~~~~~~~~~~~~~~~
--
In file included from arch/powerpc/include/asm/uaccess.h:9,
from include/linux/uaccess.h:11,
from include/linux/sched/task.h:11,
from include/linux/sched/signal.h:9,
from include/linux/rcuwait.h:6,
from include/linux/percpu-rwsem.h:7,
from include/linux/fs.h:33,
from include/linux/huge_mm.h:8,
from include/linux/mm.h:687,
from include/linux/ring_buffer.h:5,
from include/linux/trace_events.h:6,
from include/trace/trace_events.h:21,
from include/trace/define_trace.h:102,
from include/trace/events/sched.h:656,
from kernel/sched/core.c:10:
arch/powerpc/include/asm/kup.h: In function 'get_kuap':
>> arch/powerpc/include/asm/kup.h:19:26: warning: conversion from 'long long unsigned int' to 'long unsigned int' changes value from '13835058055282163712' to '0' [-Woverflow]
19 | #define AMR_KUAP_BLOCKED (AMR_KUAP_BLOCK_READ | AMR_KUAP_BLOCK_WRITE)
| ^
arch/powerpc/include/asm/kup.h:73:9: note: in expansion of macro 'AMR_KUAP_BLOCKED'
73 | return AMR_KUAP_BLOCKED;
| ^~~~~~~~~~~~~~~~
kernel/sched/core.c: In function 'ttwu_stat':
kernel/sched/core.c:2419:13: warning: variable 'rq' set but not used [-Wunused-but-set-variable]
2419 | struct rq *rq;
| ^~
--
In file included from arch/powerpc/include/asm/uaccess.h:9,
from include/linux/uaccess.h:11,
from include/linux/sched/task.h:11,
from include/linux/sched/signal.h:9,
from include/linux/sched/cputime.h:5,
from kernel/sched/sched.h:11,
from kernel/sched/fair.c:23:
arch/powerpc/include/asm/kup.h: In function 'get_kuap':
>> arch/powerpc/include/asm/kup.h:19:26: warning: conversion from 'long long unsigned int' to 'long unsigned int' changes value from '13835058055282163712' to '0' [-Woverflow]
19 | #define AMR_KUAP_BLOCKED (AMR_KUAP_BLOCK_READ | AMR_KUAP_BLOCK_WRITE)
| ^
arch/powerpc/include/asm/kup.h:73:9: note: in expansion of macro 'AMR_KUAP_BLOCKED'
73 | return AMR_KUAP_BLOCKED;
| ^~~~~~~~~~~~~~~~
kernel/sched/fair.c: At top level:
kernel/sched/fair.c:5368:6: warning: no previous prototype for 'init_cfs_bandwidth' [-Wmissing-prototypes]
5368 | void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b) {}
| ^~~~~~~~~~~~~~~~~~
kernel/sched/fair.c:11150:6: warning: no previous prototype for 'free_fair_sched_group' [-Wmissing-prototypes]
11150 | void free_fair_sched_group(struct task_group *tg) { }
| ^~~~~~~~~~~~~~~~~~~~~
kernel/sched/fair.c:11152:5: warning: no previous prototype for 'alloc_fair_sched_group' [-Wmissing-prototypes]
11152 | int alloc_fair_sched_group(struct task_group *tg, struct task_group *parent)
| ^~~~~~~~~~~~~~~~~~~~~~
kernel/sched/fair.c:11157:6: warning: no previous prototype for 'online_fair_sched_group' [-Wmissing-prototypes]
11157 | void online_fair_sched_group(struct task_group *tg) { }
| ^~~~~~~~~~~~~~~~~~~~~~~
kernel/sched/fair.c:11159:6: warning: no previous prototype for 'unregister_fair_sched_group' [-Wmissing-prototypes]
11159 | void unregister_fair_sched_group(struct task_group *tg) { }
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
--
In file included from arch/powerpc/include/asm/uaccess.h:9,
from include/linux/uaccess.h:11,
from include/linux/sched/task.h:11,
from include/linux/sched/signal.h:9,
from include/linux/sched/cputime.h:5,
from kernel/sched/sched.h:11,
from kernel/sched/rt.c:6:
arch/powerpc/include/asm/kup.h: In function 'get_kuap':
>> arch/powerpc/include/asm/kup.h:19:26: warning: conversion from 'long long unsigned int' to 'long unsigned int' changes value from '13835058055282163712' to '0' [-Woverflow]
19 | #define AMR_KUAP_BLOCKED (AMR_KUAP_BLOCK_READ | AMR_KUAP_BLOCK_WRITE)
| ^
arch/powerpc/include/asm/kup.h:73:9: note: in expansion of macro 'AMR_KUAP_BLOCKED'
73 | return AMR_KUAP_BLOCKED;
| ^~~~~~~~~~~~~~~~
kernel/sched/rt.c: At top level:
kernel/sched/rt.c:253:6: warning: no previous prototype for 'free_rt_sched_group' [-Wmissing-prototypes]
253 | void free_rt_sched_group(struct task_group *tg) { }
| ^~~~~~~~~~~~~~~~~~~
kernel/sched/rt.c:255:5: warning: no previous prototype for 'alloc_rt_sched_group' [-Wmissing-prototypes]
255 | int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent)
| ^~~~~~~~~~~~~~~~~~~~
kernel/sched/rt.c:668:6: warning: no previous prototype for 'sched_rt_bandwidth_account' [-Wmissing-prototypes]
668 | bool sched_rt_bandwidth_account(struct rt_rq *rt_rq)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
--
In file included from arch/powerpc/include/asm/uaccess.h:9,
from include/linux/uaccess.h:11,
from include/linux/sched/task.h:11,
from include/linux/sched/signal.h:9,
from include/linux/rcuwait.h:6,
from include/linux/percpu-rwsem.h:7,
from include/linux/fs.h:33,
from include/linux/compat.h:17,
from arch/powerpc/kernel/asm-offsets.c:14:
arch/powerpc/include/asm/kup.h: In function 'get_kuap':
>> arch/powerpc/include/asm/kup.h:19:26: warning: conversion from 'long long unsigned int' to 'long unsigned int' changes value from '13835058055282163712' to '0' [-Woverflow]
19 | #define AMR_KUAP_BLOCKED (AMR_KUAP_BLOCK_READ | AMR_KUAP_BLOCK_WRITE)
| ^
arch/powerpc/include/asm/kup.h:73:9: note: in expansion of macro 'AMR_KUAP_BLOCKED'
73 | return AMR_KUAP_BLOCKED;
| ^~~~~~~~~~~~~~~~
vim +19 arch/powerpc/include/asm/kup.h
16
17 #define AMR_KUAP_BLOCK_READ UL(0x4000000000000000)
18 #define AMR_KUAP_BLOCK_WRITE UL(0x8000000000000000)
> 19 #define AMR_KUAP_BLOCKED (AMR_KUAP_BLOCK_READ | AMR_KUAP_BLOCK_WRITE)
20
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 16271 bytes --]
^ permalink raw reply
* Re: [PATCH kernel v2] powerpc/kuap: Restore AMR after replaying soft interrupts
From: Christophe Leroy @ 2020-12-03 8:03 UTC (permalink / raw)
To: Alexey Kardashevskiy, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20201203054724.44838-1-aik@ozlabs.ru>
Le 03/12/2020 à 06:47, Alexey Kardashevskiy a écrit :
> When interrupted in raw_copy_from_user()/... after user memory access
> is enabled, a nested handler may also access user memory (perf is
> one example) and when it does so, it calls prevent_read_from_user()
> which prevents the upper handler from accessing user memory.
>
> This saves/restores AMR when replaying interrupts.
>
> get_kuap/set_kuap have stubs for disabled KUAP on RADIX but there are
> none for hash-only configs (BOOK3E) so this adds stubs and moves
> AMR_KUAP_BLOCK_xxx.
>
> Found by syzkaller. More likely to break with enabled
> CONFIG_DEBUG_ATOMIC_SLEEP, the call chain is
> timer_interrupt -> ktime_get -> read_seqcount_begin -> local_irq_restore.
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> Changes:
> v2:
> * fixed compile on hash
> * moved get/set to arch_local_irq_restore
> * block KUAP before replaying
>
>
> ---
>
> This is an example:
>
> ------------[ cut here ]------------
> Bug: Read fault blocked by AMR!
> WARNING: CPU: 0 PID: 1603 at /home/aik/p/kernel/arch/powerpc/include/asm/book3s/64/kup-radix.h:145 __do_page_fau
>
> Modules linked in:
> CPU: 0 PID: 1603 Comm: amr Not tainted 5.10.0-rc6_v5.10-rc6_a+fstn1 #24
> NIP: c00000000009ece8 LR: c00000000009ece4 CTR: 0000000000000000
> REGS: c00000000dc63560 TRAP: 0700 Not tainted (5.10.0-rc6_v5.10-rc6_a+fstn1)
> MSR: 8000000000021033 <SF,ME,IR,DR,RI,LE> CR: 28002888 XER: 20040000
> CFAR: c0000000001fa928 IRQMASK: 1
> GPR00: c00000000009ece4 c00000000dc637f0 c000000002397600 000000000000001f
> GPR04: c0000000020eb318 0000000000000000 c00000000dc63494 0000000000000027
> GPR08: c00000007fe4de68 c00000000dfe9180 0000000000000000 0000000000000001
> GPR12: 0000000000002000 c0000000030a0000 0000000000000000 0000000000000000
> GPR16: 0000000000000000 0000000000000000 0000000000000000 bfffffffffffffff
> GPR20: 0000000000000000 c0000000134a4020 c0000000019c2218 0000000000000fe0
> GPR24: 0000000000000000 0000000000000000 c00000000d106200 0000000040000000
> GPR28: 0000000000000000 0000000000000300 c00000000dc63910 c000000001946730
> NIP [c00000000009ece8] __do_page_fault+0xb38/0xde0
> LR [c00000000009ece4] __do_page_fault+0xb34/0xde0
> Call Trace:
> [c00000000dc637f0] [c00000000009ece4] __do_page_fault+0xb34/0xde0 (unreliable)
> [c00000000dc638a0] [c00000000000c968] handle_page_fault+0x10/0x2c
> --- interrupt: 300 at strncpy_from_user+0x290/0x440
> LR = strncpy_from_user+0x284/0x440
> [c00000000dc63ba0] [c000000000c3dcb0] strncpy_from_user+0x2f0/0x440 (unreliable)
> [c00000000dc63c30] [c00000000068b888] getname_flags+0x88/0x2c0
> [c00000000dc63c90] [c000000000662a44] do_sys_openat2+0x2d4/0x5f0
> [c00000000dc63d30] [c00000000066560c] do_sys_open+0xcc/0x140
> [c00000000dc63dc0] [c000000000045e10] system_call_exception+0x160/0x240
> [c00000000dc63e20] [c00000000000da60] system_call_common+0xf0/0x27c
> Instruction dump:
> 409c0048 3fe2ff5b 3bfff128 fac10060 fae10068 482f7a85 60000000 3c62ff5b
> 7fe4fb78 3863f250 4815bbd9 60000000 <0fe00000> 3c62ff5b 3863f2b8 4815c8b5
> irq event stamp: 254
> hardirqs last enabled at (253): [<c000000000019550>] arch_local_irq_restore+0xa0/0x150
> hardirqs last disabled at (254): [<c000000000008a10>] data_access_common_virt+0x1b0/0x1d0
> softirqs last enabled at (0): [<c0000000001f6d5c>] copy_process+0x78c/0x2120
> softirqs last disabled at (0): [<0000000000000000>] 0x0
> ---[ end trace ba98aec5151f3aeb ]---
> ---
> arch/powerpc/include/asm/book3s/64/kup-radix.h | 3 ---
> arch/powerpc/include/asm/kup.h | 10 ++++++++++
> arch/powerpc/kernel/irq.c | 6 ++++++
> 3 files changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/kup-radix.h b/arch/powerpc/include/asm/book3s/64/kup-radix.h
> index a39e2d193fdc..4ad607461b75 100644
> --- a/arch/powerpc/include/asm/book3s/64/kup-radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/kup-radix.h
> @@ -5,9 +5,6 @@
> #include <linux/const.h>
> #include <asm/reg.h>
>
> -#define AMR_KUAP_BLOCK_READ UL(0x4000000000000000)
> -#define AMR_KUAP_BLOCK_WRITE UL(0x8000000000000000)
> -#define AMR_KUAP_BLOCKED (AMR_KUAP_BLOCK_READ | AMR_KUAP_BLOCK_WRITE)
> #define AMR_KUAP_SHIFT 62
>
> #ifdef __ASSEMBLY__
> diff --git a/arch/powerpc/include/asm/kup.h b/arch/powerpc/include/asm/kup.h
> index 0d93331d0fab..e63930767a96 100644
> --- a/arch/powerpc/include/asm/kup.h
> +++ b/arch/powerpc/include/asm/kup.h
> @@ -14,6 +14,10 @@
> #define KUAP_CURRENT_WRITE 8
> #define KUAP_CURRENT (KUAP_CURRENT_READ | KUAP_CURRENT_WRITE)
>
> +#define AMR_KUAP_BLOCK_READ UL(0x4000000000000000)
> +#define AMR_KUAP_BLOCK_WRITE UL(0x8000000000000000)
> +#define AMR_KUAP_BLOCKED (AMR_KUAP_BLOCK_READ | AMR_KUAP_BLOCK_WRITE)
> +
Those macro are specific to BOOK3S_64, they have nothing to do in asm/kup.h, must stay in the file
included just below
> #ifdef CONFIG_PPC_BOOK3S_64
> #include <asm/book3s/64/kup-radix.h>
> #endif
> @@ -64,6 +68,12 @@ bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
> }
>
> static inline void kuap_check_amr(void) { }
> +static inline unsigned long get_kuap(void)
> +{
> + return AMR_KUAP_BLOCKED;
> +}
> +
The above is not generic, it is specific to book3s 64, AMR doesn't exist on book3s/32 or on 8xx.
> +static inline void set_kuap(unsigned long value) { }
>
> /*
> * book3s/64/kup-radix.h defines these functions for the !KUAP case to flush
> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index 7d0f7682d01d..d9fd46da04d6 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -314,6 +314,7 @@ void replay_soft_interrupts(void)
> notrace void arch_local_irq_restore(unsigned long mask)
> {
> unsigned char irq_happened;
> + unsigned long kuap_state;
>
> /* Write the new soft-enabled value */
> irq_soft_mask_set(mask);
> @@ -373,9 +374,14 @@ notrace void arch_local_irq_restore(unsigned long mask)
> irq_soft_mask_set(IRQS_ALL_DISABLED);
> trace_hardirqs_off();
>
> + kuap_state = get_kuap();
> + set_kuap(AMR_KUAP_BLOCKED);
> +
> replay_soft_interrupts();
> local_paca->irq_happened = 0;
>
> + set_kuap(kuap_state);
> +
> trace_hardirqs_on();
> irq_soft_mask_set(IRQS_ENABLED);
> __hard_irq_enable();
>
^ permalink raw reply
* Re: [PATCH kernel v2] powerpc/kuap: Restore AMR after replaying soft interrupts
From: Alexey Kardashevskiy @ 2020-12-03 8:10 UTC (permalink / raw)
To: Aneesh Kumar K.V, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <87zh2vjsxg.fsf@linux.ibm.com>
On 03/12/2020 17:38, Aneesh Kumar K.V wrote:
> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>
>> When interrupted in raw_copy_from_user()/... after user memory access
>> is enabled, a nested handler may also access user memory (perf is
>> one example) and when it does so, it calls prevent_read_from_user()
>> which prevents the upper handler from accessing user memory.
>>
>> This saves/restores AMR when replaying interrupts.
>>
>> get_kuap/set_kuap have stubs for disabled KUAP on RADIX but there are
>> none for hash-only configs (BOOK3E) so this adds stubs and moves
>> AMR_KUAP_BLOCK_xxx.
>>
>> Found by syzkaller. More likely to break with enabled
>> CONFIG_DEBUG_ATOMIC_SLEEP, the call chain is
>> timer_interrupt -> ktime_get -> read_seqcount_begin -> local_irq_restore.
>
> Can you test this with https://github.com/kvaneesh/linux/commits/hash-kuap-reworked-2
Yup, broken:
[ 8.813115] ------------[ cut here ]------------
[ 8.813499] Bug: Read fault blocked by AMR!
[ 8.813532] WARNING: CPU: 0 PID: 1113 at
/home/aik/p/kernel/arch/powerpc/include/asm/book3s/64/kup.h:369
__do_page_fault+0xd
34/0xdf0
[ 8.814248] Modules linked in:
[ 8.814459] CPU: 0 PID: 1113 Comm: amr Not tainted
5.10.0-rc4_aneesh--hash-kuap-reworked-2_a+fstn1 #61
[ 8.815075] NIP: c0000000000a4674 LR: c0000000000a4670 CTR:
0000000000000000
[ 8.815632] REGS: c00000000d44f530 TRAP: 0700 Not tainted
(5.10.0-rc4_aneesh--hash-kuap-reworked-2_a+fstn1)
> We do save restore AMR on interrupt entry and exit.
This is nested interrupt. we get interrupted while copying to/from user.
The first handler has AMR off, then it goes to strncpy_from_user,
enables AMR, page fault happens and another interrupt arrives - and if
the new interrupt also wants strncpy_from_user/co, it will enable AMR
again (which is ok), do copy and then disable AMR; and then we go back
and resume the first strncpy_from_user which thinks AMR is enabling
while it is not. I just see how strncpy_from_user is called while
strncpy_from_user is running. Nick understands the mechanics better.
I can give you an initramdisk which crashes in a millisecond, ping me in
slack.
--
Alexey
^ permalink raw reply
* Re: [PATCH kernel v2] powerpc/kuap: Restore AMR after replaying soft interrupts
From: Alexey Kardashevskiy @ 2020-12-03 8:13 UTC (permalink / raw)
To: Christophe Leroy, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <7d09d218-2703-a37e-bf47-1cc47ee467b7@csgroup.eu>
On 03/12/2020 19:03, Christophe Leroy wrote:
>
>
> Le 03/12/2020 à 06:47, Alexey Kardashevskiy a écrit :
>> When interrupted in raw_copy_from_user()/... after user memory access
>> is enabled, a nested handler may also access user memory (perf is
>> one example) and when it does so, it calls prevent_read_from_user()
>> which prevents the upper handler from accessing user memory.
>>
>> This saves/restores AMR when replaying interrupts.
>>
>> get_kuap/set_kuap have stubs for disabled KUAP on RADIX but there are
>> none for hash-only configs (BOOK3E) so this adds stubs and moves
>> AMR_KUAP_BLOCK_xxx.
>>
>> Found by syzkaller. More likely to break with enabled
>> CONFIG_DEBUG_ATOMIC_SLEEP, the call chain is
>> timer_interrupt -> ktime_get -> read_seqcount_begin -> local_irq_restore.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>> Changes:
>> v2:
>> * fixed compile on hash
>> * moved get/set to arch_local_irq_restore
>> * block KUAP before replaying
>>
>>
>> ---
>>
>> This is an example:
>>
>> ------------[ cut here ]------------
>> Bug: Read fault blocked by AMR!
>> WARNING: CPU: 0 PID: 1603 at
>> /home/aik/p/kernel/arch/powerpc/include/asm/book3s/64/kup-radix.h:145
>> __do_page_fau
>>
>> Modules linked in:
>> CPU: 0 PID: 1603 Comm: amr Not tainted 5.10.0-rc6_v5.10-rc6_a+fstn1 #24
>> NIP: c00000000009ece8 LR: c00000000009ece4 CTR: 0000000000000000
>> REGS: c00000000dc63560 TRAP: 0700 Not tainted
>> (5.10.0-rc6_v5.10-rc6_a+fstn1)
>> MSR: 8000000000021033 <SF,ME,IR,DR,RI,LE> CR: 28002888 XER: 20040000
>> CFAR: c0000000001fa928 IRQMASK: 1
>> GPR00: c00000000009ece4 c00000000dc637f0 c000000002397600
>> 000000000000001f
>> GPR04: c0000000020eb318 0000000000000000 c00000000dc63494
>> 0000000000000027
>> GPR08: c00000007fe4de68 c00000000dfe9180 0000000000000000
>> 0000000000000001
>> GPR12: 0000000000002000 c0000000030a0000 0000000000000000
>> 0000000000000000
>> GPR16: 0000000000000000 0000000000000000 0000000000000000
>> bfffffffffffffff
>> GPR20: 0000000000000000 c0000000134a4020 c0000000019c2218
>> 0000000000000fe0
>> GPR24: 0000000000000000 0000000000000000 c00000000d106200
>> 0000000040000000
>> GPR28: 0000000000000000 0000000000000300 c00000000dc63910
>> c000000001946730
>> NIP [c00000000009ece8] __do_page_fault+0xb38/0xde0
>> LR [c00000000009ece4] __do_page_fault+0xb34/0xde0
>> Call Trace:
>> [c00000000dc637f0] [c00000000009ece4] __do_page_fault+0xb34/0xde0
>> (unreliable)
>> [c00000000dc638a0] [c00000000000c968] handle_page_fault+0x10/0x2c
>> --- interrupt: 300 at strncpy_from_user+0x290/0x440
>> LR = strncpy_from_user+0x284/0x440
>> [c00000000dc63ba0] [c000000000c3dcb0] strncpy_from_user+0x2f0/0x440
>> (unreliable)
>> [c00000000dc63c30] [c00000000068b888] getname_flags+0x88/0x2c0
>> [c00000000dc63c90] [c000000000662a44] do_sys_openat2+0x2d4/0x5f0
>> [c00000000dc63d30] [c00000000066560c] do_sys_open+0xcc/0x140
>> [c00000000dc63dc0] [c000000000045e10] system_call_exception+0x160/0x240
>> [c00000000dc63e20] [c00000000000da60] system_call_common+0xf0/0x27c
>> Instruction dump:
>> 409c0048 3fe2ff5b 3bfff128 fac10060 fae10068 482f7a85 60000000 3c62ff5b
>> 7fe4fb78 3863f250 4815bbd9 60000000 <0fe00000> 3c62ff5b 3863f2b8 4815c8b5
>> irq event stamp: 254
>> hardirqs last enabled at (253): [<c000000000019550>]
>> arch_local_irq_restore+0xa0/0x150
>> hardirqs last disabled at (254): [<c000000000008a10>]
>> data_access_common_virt+0x1b0/0x1d0
>> softirqs last enabled at (0): [<c0000000001f6d5c>]
>> copy_process+0x78c/0x2120
>> softirqs last disabled at (0): [<0000000000000000>] 0x0
>> ---[ end trace ba98aec5151f3aeb ]---
>> ---
>> arch/powerpc/include/asm/book3s/64/kup-radix.h | 3 ---
>> arch/powerpc/include/asm/kup.h | 10 ++++++++++
>> arch/powerpc/kernel/irq.c | 6 ++++++
>> 3 files changed, 16 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/kup-radix.h
>> b/arch/powerpc/include/asm/book3s/64/kup-radix.h
>> index a39e2d193fdc..4ad607461b75 100644
>> --- a/arch/powerpc/include/asm/book3s/64/kup-radix.h
>> +++ b/arch/powerpc/include/asm/book3s/64/kup-radix.h
>> @@ -5,9 +5,6 @@
>> #include <linux/const.h>
>> #include <asm/reg.h>
>> -#define AMR_KUAP_BLOCK_READ UL(0x4000000000000000)
>> -#define AMR_KUAP_BLOCK_WRITE UL(0x8000000000000000)
>> -#define AMR_KUAP_BLOCKED (AMR_KUAP_BLOCK_READ | AMR_KUAP_BLOCK_WRITE)
>> #define AMR_KUAP_SHIFT 62
>> #ifdef __ASSEMBLY__
>> diff --git a/arch/powerpc/include/asm/kup.h
>> b/arch/powerpc/include/asm/kup.h
>> index 0d93331d0fab..e63930767a96 100644
>> --- a/arch/powerpc/include/asm/kup.h
>> +++ b/arch/powerpc/include/asm/kup.h
>> @@ -14,6 +14,10 @@
>> #define KUAP_CURRENT_WRITE 8
>> #define KUAP_CURRENT (KUAP_CURRENT_READ | KUAP_CURRENT_WRITE)
>> +#define AMR_KUAP_BLOCK_READ UL(0x4000000000000000)
>> +#define AMR_KUAP_BLOCK_WRITE UL(0x8000000000000000)
>> +#define AMR_KUAP_BLOCKED (AMR_KUAP_BLOCK_READ | AMR_KUAP_BLOCK_WRITE)
>> +
>
> Those macro are specific to BOOK3S_64, they have nothing to do in
> asm/kup.h, must stay in the file included just below
>
>> #ifdef CONFIG_PPC_BOOK3S_64
>> #include <asm/book3s/64/kup-radix.h>
>> #endif
>> @@ -64,6 +68,12 @@ bad_kuap_fault(struct pt_regs *regs, unsigned long
>> address, bool is_write)
>> }
>> static inline void kuap_check_amr(void) { }
>> +static inline unsigned long get_kuap(void)
>> +{
>> + return AMR_KUAP_BLOCKED;
>> +}
>> +
>
> The above is not generic, it is specific to book3s 64, AMR doesn't exist
> on book3s/32 or on 8xx.
ah ok, I did not want more ifdefs there but looks like it is
unavoidable, I'll repost if there are no other comments. Thanks,
--
Alexey
^ permalink raw reply
* Re: [MOCKUP] x86/mm: Lightweight lazy mm refcounting
From: Peter Zijlstra @ 2020-12-03 8:44 UTC (permalink / raw)
To: Andy Lutomirski
Cc: linux-arch, Arnd Bergmann, Rik van Riel, Will Deacon, X86 ML,
Dave Hansen, LKML, Nicholas Piggin, Linux-MM, Mathieu Desnoyers,
Catalin Marinas, linuxppc-dev
In-Reply-To: <7c4bcc0a464ca60be1e0aeba805a192be0ee81e5.1606972194.git.luto@kernel.org>
On Wed, Dec 02, 2020 at 09:25:51PM -0800, Andy Lutomirski wrote:
> power: same as ARM, except that the loop may be rather larger since
> the systems are bigger. But I imagine it's still faster than Nick's
> approach -- a cmpxchg to a remote cacheline should still be faster than
> an IPI shootdown.
While a single atomic might be cheaper than an IPI, the comparison
doesn't work out nicely. You do the xchg() on every unlazy, while the
IPI would be once per process exit.
So over the life of the process, it might do very many unlazies, adding
up to a total cost far in excess of what the single IPI would've been.
And while I appreciate all the work to get rid of the active_mm
accounting; the worry I have with pushing this all into arch code is
that it will be so very easy to get this subtly wrong.
^ permalink raw reply
* [PATCH] powerpc/hotplug: assign hot added LMB to the right node
From: Laurent Dufour @ 2020-12-03 10:15 UTC (permalink / raw)
To: linuxppc-dev
Cc: Nathan Lynch, Scott Cheloha, linux-kernel, stable, Paul Mackerras
This patch applies to 5.9 and earlier kernels only.
Since 5.10, this has been fortunately fixed by the commit
e5e179aa3a39 ("pseries/drmem: don't cache node id in drmem_lmb struct").
When LMBs are added to a running system, the node id assigned to the LMB is
fetched from the temporary DT node provided by the hypervisor.
However, LMBs added are always assigned to the first online node. This is a
mistake and this is because hot_add_drconf_scn_to_nid() called by
lmb_set_nid() is checking for the LMB flags DRCONF_MEM_ASSIGNED which is
set later in dlpar_add_lmb().
To fix this issue, simply set that flag earlier in dlpar_add_lmb().
Note, this code has been rewrote in 5.10 and thus this fix has no meaning
since this version.
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Scott Cheloha <cheloha@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: stable@vger.kernel.org
---
arch/powerpc/platforms/pseries/hotplug-memory.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index e54dcbd04b2f..92d83915c629 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -663,12 +663,14 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
return rc;
}
+ lmb->flags |= DRCONF_MEM_ASSIGNED;
lmb_set_nid(lmb);
block_sz = memory_block_size_bytes();
/* Add the memory */
rc = __add_memory(lmb->nid, lmb->base_addr, block_sz);
if (rc) {
+ lmb->flags &= ~DRCONF_MEM_ASSIGNED;
invalidate_lmb_associativity_index(lmb);
return rc;
}
@@ -676,10 +678,9 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
rc = dlpar_online_lmb(lmb);
if (rc) {
__remove_memory(lmb->nid, lmb->base_addr, block_sz);
+ lmb->flags &= ~DRCONF_MEM_ASSIGNED;
invalidate_lmb_associativity_index(lmb);
lmb_clear_nid(lmb);
- } else {
- lmb->flags |= DRCONF_MEM_ASSIGNED;
}
return rc;
--
2.29.2
^ permalink raw reply related
* Re: [PATCH] EDAC, mv64x60: Fix error return code in mv64x60_pci_err_probe()
From: Michael Ellerman @ 2020-12-03 11:27 UTC (permalink / raw)
To: Borislav Petkov, Benjamin Herrenschmidt, Paul Mackerras
Cc: cj.chengjian, linux-kernel, Wang ShaoBo, james.morse,
huawei.libin, mchehab, linuxppc-dev, linux-edac
In-Reply-To: <20201202112515.GC2951@zn.tnic>
Borislav Petkov <bp@alien8.de> writes:
> On Tue, Nov 24, 2020 at 02:30:09PM +0800, Wang ShaoBo wrote:
>> Fix to return -ENODEV error code when edac_pci_add_device() failed instaed
>> of 0 in mv64x60_pci_err_probe(), as done elsewhere in this function.
>>
>> Fixes: 4f4aeeabc061 ("drivers-edac: add marvell mv64x60 driver")
>> Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
>> ---
>> drivers/edac/mv64x60_edac.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
>> index 3c68bb525d5d..456b9ca1fe8d 100644
>> --- a/drivers/edac/mv64x60_edac.c
>> +++ b/drivers/edac/mv64x60_edac.c
>> @@ -168,6 +168,7 @@ static int mv64x60_pci_err_probe(struct platform_device *pdev)
>>
>> if (edac_pci_add_device(pci, pdata->edac_idx) > 0) {
>> edac_dbg(3, "failed edac_pci_add_device()\n");
>> + res = -ENODEV;
>> goto err;
>> }
>
> That driver depends on MV64X60 and I don't see anything in the tree
> enabling it and I can't select it AFAICT:
>
> config MV64X60
> bool
> select PPC_INDIRECT_PCI
> select CHECK_CACHE_COHERENCY
It was selected by PPC_C2K, but that was dropped in:
92c8c16f3457 ("powerpc/embedded6xx: Remove C2K board support")
> PPC folks, what do we do here?
>
> If not used anymore, I'd love to have one less EDAC driver.
It's dead code, so drop it.
I can send a patch if no one else wants to.
cheers
^ permalink raw reply
* Re: [PATCH] EDAC, mv64x60: Fix error return code in mv64x60_pci_err_probe()
From: Borislav Petkov @ 2020-12-03 11:39 UTC (permalink / raw)
To: Michael Ellerman
Cc: cj.chengjian, linux-kernel, Wang ShaoBo, Paul Mackerras,
huawei.libin, james.morse, mchehab, linuxppc-dev, linux-edac
In-Reply-To: <87pn3ruo2q.fsf@mpe.ellerman.id.au>
On Thu, Dec 03, 2020 at 10:27:25PM +1100, Michael Ellerman wrote:
> It's dead code, so drop it.
>
> I can send a patch if no one else wants to.
Yes please.
I love patches removing code! :-)
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply
* Re: [PATCH] powerpc/mm: Don't see NULL pointer dereference as a KUAP fault
From: Michael Ellerman @ 2020-12-03 11:55 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <8b865b93d25c15c8e6d41e71c368bfc28da4489d.1606816701.git.christophe.leroy@csgroup.eu>
Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> Sometimes, NULL pointer dereferences are expected. Even when they
> are accidental they are unlikely an exploit attempt because the
> first page is never mapped.
The first page can be mapped if mmap_min_addr is 0.
Blocking all faults to the first page would potentially break any
program that does that.
Also if there is something mapped at 0 it's a good chance it is an
exploit attempt :)
cheers
> The exemple below shows what we get when invoking the "show task"
> sysrq handler, by writing 't' in /proc/sysrq-trigger
>
> [ 1117.202054] ------------[ cut here ]------------
> [ 1117.202102] Bug: fault blocked by AP register !
> [ 1117.202261] WARNING: CPU: 0 PID: 377 at arch/powerpc/include/asm/nohash/32/kup-8xx.h:66 do_page_fault+0x4a8/0x5ec
> [ 1117.202310] Modules linked in:
> [ 1117.202428] CPU: 0 PID: 377 Comm: sh Tainted: G W 5.10.0-rc5-s3k-dev-01340-g83f53be2de31-dirty #4175
> [ 1117.202499] NIP: c0012048 LR: c0012048 CTR: 00000000
> [ 1117.202573] REGS: cacdbb88 TRAP: 0700 Tainted: G W (5.10.0-rc5-s3k-dev-01340-g83f53be2de31-dirty)
> [ 1117.202625] MSR: 00021032 <ME,IR,DR,RI> CR: 24082222 XER: 20000000
> [ 1117.202899]
> [ 1117.202899] GPR00: c0012048 cacdbc40 c2929290 00000023 c092e554 00000001 c09865e8 c092e640
> [ 1117.202899] GPR08: 00001032 00000000 00000000 00014efc 28082224 100d166a 100a0920 00000000
> [ 1117.202899] GPR16: 100cac0c 100b0000 1080c3fc 1080d685 100d0000 100d0000 00000000 100a0900
> [ 1117.202899] GPR24: 100d0000 c07892ec 00000000 c0921510 c21f4440 0000005c c0000000 cacdbc80
> [ 1117.204362] NIP [c0012048] do_page_fault+0x4a8/0x5ec
> [ 1117.204461] LR [c0012048] do_page_fault+0x4a8/0x5ec
> [ 1117.204509] Call Trace:
> [ 1117.204609] [cacdbc40] [c0012048] do_page_fault+0x4a8/0x5ec (unreliable)
> [ 1117.204771] [cacdbc70] [c00112f0] handle_page_fault+0x8/0x34
> [ 1117.204911] --- interrupt: 301 at copy_from_kernel_nofault+0x70/0x1c0
> [ 1117.204979] NIP: c010dbec LR: c010dbac CTR: 00000001
> [ 1117.205053] REGS: cacdbc80 TRAP: 0301 Tainted: G W (5.10.0-rc5-s3k-dev-01340-g83f53be2de31-dirty)
> [ 1117.205104] MSR: 00009032 <EE,ME,IR,DR,RI> CR: 28082224 XER: 00000000
> [ 1117.205416] DAR: 0000005c DSISR: c0000000
> [ 1117.205416] GPR00: c0045948 cacdbd38 c2929290 00000001 00000017 00000017 00000027 0000000f
> [ 1117.205416] GPR08: c09926ec 00000000 00000000 3ffff000 24082224
> [ 1117.206106] NIP [c010dbec] copy_from_kernel_nofault+0x70/0x1c0
> [ 1117.206202] LR [c010dbac] copy_from_kernel_nofault+0x30/0x1c0
> [ 1117.206258] --- interrupt: 301
> [ 1117.206372] [cacdbd38] [c004bbb0] kthread_probe_data+0x44/0x70 (unreliable)
> [ 1117.206561] [cacdbd58] [c0045948] print_worker_info+0xe0/0x194
> [ 1117.206717] [cacdbdb8] [c00548ac] sched_show_task+0x134/0x168
> [ 1117.206851] [cacdbdd8] [c005a268] show_state_filter+0x70/0x100
> [ 1117.206989] [cacdbe08] [c039baa0] sysrq_handle_showstate+0x14/0x24
> [ 1117.207122] [cacdbe18] [c039bf18] __handle_sysrq+0xac/0x1d0
> [ 1117.207257] [cacdbe48] [c039c0c0] write_sysrq_trigger+0x4c/0x74
> [ 1117.207407] [cacdbe68] [c01fba48] proc_reg_write+0xb4/0x114
> [ 1117.207550] [cacdbe88] [c0179968] vfs_write+0x12c/0x478
> [ 1117.207686] [cacdbf08] [c0179e60] ksys_write+0x78/0x128
> [ 1117.207826] [cacdbf38] [c00110d0] ret_from_syscall+0x0/0x34
> [ 1117.207938] --- interrupt: c01 at 0xfd4e784
> [ 1117.208008] NIP: 0fd4e784 LR: 0fe0f244 CTR: 10048d38
> [ 1117.208083] REGS: cacdbf48 TRAP: 0c01 Tainted: G W (5.10.0-rc5-s3k-dev-01340-g83f53be2de31-dirty)
> [ 1117.208134] MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 44002222 XER: 00000000
> [ 1117.208470]
> [ 1117.208470] GPR00: 00000004 7fc34090 77bfb4e0 00000001 1080fa40 00000002 7400000f fefefeff
> [ 1117.208470] GPR08: 7f7f7f7f 10048d38 1080c414 7fc343c0 00000000
> [ 1117.209104] NIP [0fd4e784] 0xfd4e784
> [ 1117.209180] LR [0fe0f244] 0xfe0f244
> [ 1117.209236] --- interrupt: c01
> [ 1117.209274] Instruction dump:
> [ 1117.209353] 714a4000 418200f0 73ca0001 40820084 73ca0032 408200f8 73c90040 4082ff60
> [ 1117.209727] 0fe00000 3c60c082 386399f4 48013b65 <0fe00000> 80010034 3860000b 7c0803a6
> [ 1117.210102] ---[ end trace 1927c0323393af3e ]---
>
> So, avoid the big KUAP warning by bailing out of bad_kernel_fault()
> before calling bad_kuap_fault() when address references the first
> page.
>
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
> arch/powerpc/mm/fault.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 0add963a849b..be2b4318206f 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -198,6 +198,10 @@ static bool bad_kernel_fault(struct pt_regs *regs, unsigned long error_code,
> {
> int is_exec = TRAP(regs) == 0x400;
>
> + // Kernel fault on first page is likely a NULL pointer dereference
> + if (address < PAGE_SIZE)
> + return true;
> +
> /* NX faults set DSISR_PROTFAULT on the 8xx, DSISR_NOEXEC_OR_G on others */
> if (is_exec && (error_code & (DSISR_NOEXEC_OR_G | DSISR_KEYFAULT |
> DSISR_PROTFAULT))) {
> --
> 2.25.0
^ permalink raw reply
* Re: [MOCKUP] x86/mm: Lightweight lazy mm refcounting
From: Matthew Wilcox @ 2020-12-03 12:31 UTC (permalink / raw)
To: Andy Lutomirski
Cc: linux-arch, Arnd Bergmann, Rik van Riel, Will Deacon, X86 ML,
Dave Hansen, LKML, Nicholas Piggin, Linux-MM, Mathieu Desnoyers,
Catalin Marinas, linuxppc-dev
In-Reply-To: <7c4bcc0a464ca60be1e0aeba805a192be0ee81e5.1606972194.git.luto@kernel.org>
On Wed, Dec 02, 2020 at 09:25:51PM -0800, Andy Lutomirski wrote:
> This code compiles, but I haven't even tried to boot it. The earlier
> part of the series isn't terribly interesting -- it's a handful of
> cleanups that remove all reads of ->active_mm from arch/x86. I've
> been meaning to do that for a while, and now I did it. But, with
> that done, I think we can move to a totally different lazy mm refcounting
> model.
I went back and read Documentation/vm/active_mm.rst recently.
I think it's useful to think about how this would have been handled if
we'd had RCU at the time. Particularly:
Linus wrote:
> To support all that, the "struct mm_struct" now has two counters: a
> "mm_users" counter that is how many "real address space users" there are,
> and a "mm_count" counter that is the number of "lazy" users (ie anonymous
> users) plus one if there are any real users.
And this just makes me think RCU freeing of mm_struct. I'm sure it's
more complicated than that (then, or now), but if an anonymous process
is borrowing a freed mm, and the mm is freed by RCU then it will not go
away until the task context switches. When we context switch back to
the anon task, it'll borrow some other task's MM and won't even notice
that the MM it was using has gone away.
^ permalink raw reply
* [PATCH AUTOSEL 5.9 09/39] powerpc: Drop -me200 addition to build flags
From: Sasha Levin @ 2020-12-03 13:28 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Sasha Levin, kernel test robot, Németh Márton,
Nick Desaulniers, Scott Wood, linuxppc-dev
In-Reply-To: <20201203132834.930999-1-sashal@kernel.org>
From: Michael Ellerman <mpe@ellerman.id.au>
[ Upstream commit e02152ba2810f7c88cb54e71cda096268dfa9241 ]
Currently a build with CONFIG_E200=y will fail with:
Error: invalid switch -me200
Error: unrecognized option -me200
Upstream binutils has never supported an -me200 option. Presumably it
was supported at some point by either a fork or Freescale internal
binutils.
We can't support code that we can't even build test, so drop the
addition of -me200 to the build flags, so we can at least build with
CONFIG_E200=y.
Reported-by: Németh Márton <nm127@freemail.hu>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Scott Wood <oss@buserror.net>
Link: https://lore.kernel.org/r/20201116120913.165317-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/powerpc/Makefile | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 3e8da9cf2eb9d..e6643d5699fef 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -249,7 +249,6 @@ KBUILD_CFLAGS += $(call cc-option,-mno-string)
cpu-as-$(CONFIG_40x) += -Wa,-m405
cpu-as-$(CONFIG_44x) += -Wa,-m440
cpu-as-$(CONFIG_ALTIVEC) += $(call as-option,-Wa$(comma)-maltivec)
-cpu-as-$(CONFIG_E200) += -Wa,-me200
cpu-as-$(CONFIG_E500) += -Wa,-me500
# When using '-many -mpower4' gas will first try and find a matching power4
--
2.27.0
^ permalink raw reply related
* [PATCH AUTOSEL 5.9 18/39] ibmvnic: skip tx timeout reset while in resetting
From: Sasha Levin @ 2020-12-03 13:28 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Sasha Levin, netdev, Dany Madden, Lijun Pan, Brian King,
Jakub Kicinski, linuxppc-dev
In-Reply-To: <20201203132834.930999-1-sashal@kernel.org>
From: Lijun Pan <ljp@linux.ibm.com>
[ Upstream commit 855a631a4c11458a9cef1ab79c1530436aa95fae ]
Sometimes it takes longer than 5 seconds (watchdog timeout) to complete
failover, migration, and other resets. In stead of scheduling another
timeout reset, we wait for the current one to complete.
Suggested-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/ibm/ibmvnic.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 81ec233926acb..37012aa594f41 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2368,6 +2368,12 @@ static void ibmvnic_tx_timeout(struct net_device *dev, unsigned int txqueue)
{
struct ibmvnic_adapter *adapter = netdev_priv(dev);
+ if (test_bit(0, &adapter->resetting)) {
+ netdev_err(adapter->netdev,
+ "Adapter is resetting, skip timeout reset\n");
+ return;
+ }
+
ibmvnic_reset(adapter, VNIC_RESET_TIMEOUT);
}
--
2.27.0
^ permalink raw reply related
* [PATCH AUTOSEL 5.9 26/39] soc: fsl: dpio: Get the cpumask through cpumask_of(cpu)
From: Sasha Levin @ 2020-12-03 13:28 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Yi Wang, Sasha Levin, Hao Si, Li Yang, Lin Chen, linuxppc-dev,
linux-arm-kernel
In-Reply-To: <20201203132834.930999-1-sashal@kernel.org>
From: Hao Si <si.hao@zte.com.cn>
[ Upstream commit 2663b3388551230cbc4606a40fabf3331ceb59e4 ]
The local variable 'cpumask_t mask' is in the stack memory, and its address
is assigned to 'desc->affinity' in 'irq_set_affinity_hint()'.
But the memory area where this variable is located is at risk of being
modified.
During LTP testing, the following error was generated:
Unable to handle kernel paging request at virtual address ffff000012e9b790
Mem abort info:
ESR = 0x96000007
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000007
CM = 0, WnR = 0
swapper pgtable: 4k pages, 48-bit VAs, pgdp = 0000000075ac5e07
[ffff000012e9b790] pgd=00000027dbffe003, pud=00000027dbffd003,
pmd=00000027b6d61003, pte=0000000000000000
Internal error: Oops: 96000007 [#1] PREEMPT SMP
Modules linked in: xt_conntrack
Process read_all (pid: 20171, stack limit = 0x0000000044ea4095)
CPU: 14 PID: 20171 Comm: read_all Tainted: G B W
Hardware name: NXP Layerscape LX2160ARDB (DT)
pstate: 80000085 (Nzcv daIf -PAN -UAO)
pc : irq_affinity_hint_proc_show+0x54/0xb0
lr : irq_affinity_hint_proc_show+0x4c/0xb0
sp : ffff00001138bc10
x29: ffff00001138bc10 x28: 0000ffffd131d1e0
x27: 00000000007000c0 x26: ffff8025b9480dc0
x25: ffff8025b9480da8 x24: 00000000000003ff
x23: ffff8027334f8300 x22: ffff80272e97d000
x21: ffff80272e97d0b0 x20: ffff8025b9480d80
x19: ffff000009a49000 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: 0000000000000000
x13: 0000000000000000 x12: 0000000000000040
x11: 0000000000000000 x10: ffff802735b79b88
x9 : 0000000000000000 x8 : 0000000000000000
x7 : ffff000009a49848 x6 : 0000000000000003
x5 : 0000000000000000 x4 : ffff000008157d6c
x3 : ffff00001138bc10 x2 : ffff000012e9b790
x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
irq_affinity_hint_proc_show+0x54/0xb0
seq_read+0x1b0/0x440
proc_reg_read+0x80/0xd8
__vfs_read+0x60/0x178
vfs_read+0x94/0x150
ksys_read+0x74/0xf0
__arm64_sys_read+0x24/0x30
el0_svc_common.constprop.0+0xd8/0x1a0
el0_svc_handler+0x34/0x88
el0_svc+0x10/0x14
Code: f9001bbf 943e0732 f94066c2 b4000062 (f9400041)
---[ end trace b495bdcb0b3b732b ]---
Kernel panic - not syncing: Fatal exception
SMP: stopping secondary CPUs
SMP: failed to stop secondary CPUs 0,2-4,6,8,11,13-15
Kernel Offset: disabled
CPU features: 0x0,21006008
Memory Limit: none
---[ end Kernel panic - not syncing: Fatal exception ]---
Fix it by using 'cpumask_of(cpu)' to get the cpumask.
Signed-off-by: Hao Si <si.hao@zte.com.cn>
Signed-off-by: Lin Chen <chen.lin5@zte.com.cn>
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/soc/fsl/dpio/dpio-driver.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/soc/fsl/dpio/dpio-driver.c b/drivers/soc/fsl/dpio/dpio-driver.c
index 7b642c330977f..7f397b4ad878d 100644
--- a/drivers/soc/fsl/dpio/dpio-driver.c
+++ b/drivers/soc/fsl/dpio/dpio-driver.c
@@ -95,7 +95,6 @@ static int register_dpio_irq_handlers(struct fsl_mc_device *dpio_dev, int cpu)
{
int error;
struct fsl_mc_device_irq *irq;
- cpumask_t mask;
irq = dpio_dev->irqs[0];
error = devm_request_irq(&dpio_dev->dev,
@@ -112,9 +111,7 @@ static int register_dpio_irq_handlers(struct fsl_mc_device *dpio_dev, int cpu)
}
/* set the affinity hint */
- cpumask_clear(&mask);
- cpumask_set_cpu(cpu, &mask);
- if (irq_set_affinity_hint(irq->msi_desc->irq, &mask))
+ if (irq_set_affinity_hint(irq->msi_desc->irq, cpumask_of(cpu)))
dev_err(&dpio_dev->dev,
"irq_set_affinity failed irq %d cpu %d\n",
irq->msi_desc->irq, cpu);
--
2.27.0
^ permalink raw reply related
* [PATCH AUTOSEL 5.9 27/39] sched/idle: Fix arch_cpu_idle() vs tracing
From: Sasha Levin @ 2020-12-03 13:28 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Mark Rutland, Sasha Levin, linux-ia64, linux-parisc, linux-s390,
Peter Zijlstra, linux-hexagon, linux-sh, linux-um, linux-mips,
linux-csky, sparclinux, openrisc, Sven Schnelle, linux-alpha,
uclinux-h8-devel, linux-riscv, linuxppc-dev, linux-arm-kernel
In-Reply-To: <20201203132834.930999-1-sashal@kernel.org>
From: Peter Zijlstra <peterz@infradead.org>
[ Upstream commit 58c644ba512cfbc2e39b758dd979edd1d6d00e27 ]
We call arch_cpu_idle() with RCU disabled, but then use
local_irq_{en,dis}able(), which invokes tracing, which relies on RCU.
Switch all arch_cpu_idle() implementations to use
raw_local_irq_{en,dis}able() and carefully manage the
lockdep,rcu,tracing state like we do in entry.
(XXX: we really should change arch_cpu_idle() to not return with
interrupts enabled)
Reported-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lkml.kernel.org/r/20201120114925.594122626@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/alpha/kernel/process.c | 2 +-
arch/arm/kernel/process.c | 2 +-
arch/arm64/kernel/process.c | 2 +-
arch/csky/kernel/process.c | 2 +-
arch/h8300/kernel/process.c | 2 +-
arch/hexagon/kernel/process.c | 2 +-
arch/ia64/kernel/process.c | 2 +-
arch/microblaze/kernel/process.c | 2 +-
arch/mips/kernel/idle.c | 12 ++++++------
arch/nios2/kernel/process.c | 2 +-
arch/openrisc/kernel/process.c | 2 +-
arch/parisc/kernel/process.c | 2 +-
arch/powerpc/kernel/idle.c | 4 ++--
arch/riscv/kernel/process.c | 2 +-
arch/s390/kernel/idle.c | 6 +++---
arch/sh/kernel/idle.c | 2 +-
arch/sparc/kernel/leon_pmc.c | 4 ++--
arch/sparc/kernel/process_32.c | 2 +-
arch/sparc/kernel/process_64.c | 4 ++--
arch/um/kernel/process.c | 2 +-
arch/x86/include/asm/mwait.h | 2 --
arch/x86/kernel/process.c | 12 +++++++-----
kernel/sched/idle.c | 28 +++++++++++++++++++++++++++-
23 files changed, 64 insertions(+), 38 deletions(-)
diff --git a/arch/alpha/kernel/process.c b/arch/alpha/kernel/process.c
index 7462a79110024..4c7b0414a3ff3 100644
--- a/arch/alpha/kernel/process.c
+++ b/arch/alpha/kernel/process.c
@@ -57,7 +57,7 @@ EXPORT_SYMBOL(pm_power_off);
void arch_cpu_idle(void)
{
wtint(0);
- local_irq_enable();
+ raw_local_irq_enable();
}
void arch_cpu_idle_dead(void)
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 8e6ace03e960b..9f199b1e83839 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -71,7 +71,7 @@ void arch_cpu_idle(void)
arm_pm_idle();
else
cpu_do_idle();
- local_irq_enable();
+ raw_local_irq_enable();
}
void arch_cpu_idle_prepare(void)
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 2da5f3f9d345f..f7c42a7d09b66 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -124,7 +124,7 @@ void arch_cpu_idle(void)
* tricks
*/
cpu_do_idle();
- local_irq_enable();
+ raw_local_irq_enable();
}
#ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/csky/kernel/process.c b/arch/csky/kernel/process.c
index f730869e21eed..69af6bc87e647 100644
--- a/arch/csky/kernel/process.c
+++ b/arch/csky/kernel/process.c
@@ -102,6 +102,6 @@ void arch_cpu_idle(void)
#ifdef CONFIG_CPU_PM_STOP
asm volatile("stop\n");
#endif
- local_irq_enable();
+ raw_local_irq_enable();
}
#endif
diff --git a/arch/h8300/kernel/process.c b/arch/h8300/kernel/process.c
index 83ce3caf73139..a2961c7b2332c 100644
--- a/arch/h8300/kernel/process.c
+++ b/arch/h8300/kernel/process.c
@@ -57,7 +57,7 @@ asmlinkage void ret_from_kernel_thread(void);
*/
void arch_cpu_idle(void)
{
- local_irq_enable();
+ raw_local_irq_enable();
__asm__("sleep");
}
diff --git a/arch/hexagon/kernel/process.c b/arch/hexagon/kernel/process.c
index dfd322c5ce83a..20962601a1b47 100644
--- a/arch/hexagon/kernel/process.c
+++ b/arch/hexagon/kernel/process.c
@@ -44,7 +44,7 @@ void arch_cpu_idle(void)
{
__vmwait();
/* interrupts wake us up, but irqs are still disabled */
- local_irq_enable();
+ raw_local_irq_enable();
}
/*
diff --git a/arch/ia64/kernel/process.c b/arch/ia64/kernel/process.c
index f19cb97c00987..1b2769260688d 100644
--- a/arch/ia64/kernel/process.c
+++ b/arch/ia64/kernel/process.c
@@ -252,7 +252,7 @@ void arch_cpu_idle(void)
if (mark_idle)
(*mark_idle)(1);
- safe_halt();
+ raw_safe_halt();
if (mark_idle)
(*mark_idle)(0);
diff --git a/arch/microblaze/kernel/process.c b/arch/microblaze/kernel/process.c
index a9e46e525cd0a..f99860771ff48 100644
--- a/arch/microblaze/kernel/process.c
+++ b/arch/microblaze/kernel/process.c
@@ -149,5 +149,5 @@ int dump_fpu(struct pt_regs *regs, elf_fpregset_t *fpregs)
void arch_cpu_idle(void)
{
- local_irq_enable();
+ raw_local_irq_enable();
}
diff --git a/arch/mips/kernel/idle.c b/arch/mips/kernel/idle.c
index 5bc3b04693c7d..18e69ebf5691d 100644
--- a/arch/mips/kernel/idle.c
+++ b/arch/mips/kernel/idle.c
@@ -33,19 +33,19 @@ static void __cpuidle r3081_wait(void)
{
unsigned long cfg = read_c0_conf();
write_c0_conf(cfg | R30XX_CONF_HALT);
- local_irq_enable();
+ raw_local_irq_enable();
}
static void __cpuidle r39xx_wait(void)
{
if (!need_resched())
write_c0_conf(read_c0_conf() | TX39_CONF_HALT);
- local_irq_enable();
+ raw_local_irq_enable();
}
void __cpuidle r4k_wait(void)
{
- local_irq_enable();
+ raw_local_irq_enable();
__r4k_wait();
}
@@ -64,7 +64,7 @@ void __cpuidle r4k_wait_irqoff(void)
" .set arch=r4000 \n"
" wait \n"
" .set pop \n");
- local_irq_enable();
+ raw_local_irq_enable();
}
/*
@@ -84,7 +84,7 @@ static void __cpuidle rm7k_wait_irqoff(void)
" wait \n"
" mtc0 $1, $12 # stalls until W stage \n"
" .set pop \n");
- local_irq_enable();
+ raw_local_irq_enable();
}
/*
@@ -257,7 +257,7 @@ void arch_cpu_idle(void)
if (cpu_wait)
cpu_wait();
else
- local_irq_enable();
+ raw_local_irq_enable();
}
#ifdef CONFIG_CPU_IDLE
diff --git a/arch/nios2/kernel/process.c b/arch/nios2/kernel/process.c
index 88a4ec03edab4..f5cc55a88d310 100644
--- a/arch/nios2/kernel/process.c
+++ b/arch/nios2/kernel/process.c
@@ -33,7 +33,7 @@ EXPORT_SYMBOL(pm_power_off);
void arch_cpu_idle(void)
{
- local_irq_enable();
+ raw_local_irq_enable();
}
/*
diff --git a/arch/openrisc/kernel/process.c b/arch/openrisc/kernel/process.c
index 0ff391f00334c..3c98728cce249 100644
--- a/arch/openrisc/kernel/process.c
+++ b/arch/openrisc/kernel/process.c
@@ -79,7 +79,7 @@ void machine_power_off(void)
*/
void arch_cpu_idle(void)
{
- local_irq_enable();
+ raw_local_irq_enable();
if (mfspr(SPR_UPR) & SPR_UPR_PMP)
mtspr(SPR_PMR, mfspr(SPR_PMR) | SPR_PMR_DME);
}
diff --git a/arch/parisc/kernel/process.c b/arch/parisc/kernel/process.c
index f196d96e2f9f5..a92a23d6acd93 100644
--- a/arch/parisc/kernel/process.c
+++ b/arch/parisc/kernel/process.c
@@ -169,7 +169,7 @@ void __cpuidle arch_cpu_idle_dead(void)
void __cpuidle arch_cpu_idle(void)
{
- local_irq_enable();
+ raw_local_irq_enable();
/* nop on real hardware, qemu will idle sleep. */
asm volatile("or %%r10,%%r10,%%r10\n":::);
diff --git a/arch/powerpc/kernel/idle.c b/arch/powerpc/kernel/idle.c
index 422e31d2f5a2b..8df35f1329a42 100644
--- a/arch/powerpc/kernel/idle.c
+++ b/arch/powerpc/kernel/idle.c
@@ -60,9 +60,9 @@ void arch_cpu_idle(void)
* interrupts enabled, some don't.
*/
if (irqs_disabled())
- local_irq_enable();
+ raw_local_irq_enable();
} else {
- local_irq_enable();
+ raw_local_irq_enable();
/*
* Go into low thread priority and possibly
* low power mode.
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 2b97c493427c9..308e1d95ecbf0 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -36,7 +36,7 @@ extern asmlinkage void ret_from_kernel_thread(void);
void arch_cpu_idle(void)
{
wait_for_interrupt();
- local_irq_enable();
+ raw_local_irq_enable();
}
void show_regs(struct pt_regs *regs)
diff --git a/arch/s390/kernel/idle.c b/arch/s390/kernel/idle.c
index f7f1e64e0d980..2b85096964f84 100644
--- a/arch/s390/kernel/idle.c
+++ b/arch/s390/kernel/idle.c
@@ -33,10 +33,10 @@ void enabled_wait(void)
PSW_MASK_IO | PSW_MASK_EXT | PSW_MASK_MCHECK;
clear_cpu_flag(CIF_NOHZ_DELAY);
- local_irq_save(flags);
+ raw_local_irq_save(flags);
/* Call the assembler magic in entry.S */
psw_idle(idle, psw_mask);
- local_irq_restore(flags);
+ raw_local_irq_restore(flags);
/* Account time spent with enabled wait psw loaded as idle time. */
raw_write_seqcount_begin(&idle->seqcount);
@@ -123,7 +123,7 @@ void arch_cpu_idle_enter(void)
void arch_cpu_idle(void)
{
enabled_wait();
- local_irq_enable();
+ raw_local_irq_enable();
}
void arch_cpu_idle_exit(void)
diff --git a/arch/sh/kernel/idle.c b/arch/sh/kernel/idle.c
index 0dc0f52f9bb8d..f59814983bd59 100644
--- a/arch/sh/kernel/idle.c
+++ b/arch/sh/kernel/idle.c
@@ -22,7 +22,7 @@ static void (*sh_idle)(void);
void default_idle(void)
{
set_bl_bit();
- local_irq_enable();
+ raw_local_irq_enable();
/* Isn't this racy ? */
cpu_sleep();
clear_bl_bit();
diff --git a/arch/sparc/kernel/leon_pmc.c b/arch/sparc/kernel/leon_pmc.c
index 065e2d4b72908..396f46bca52eb 100644
--- a/arch/sparc/kernel/leon_pmc.c
+++ b/arch/sparc/kernel/leon_pmc.c
@@ -50,7 +50,7 @@ static void pmc_leon_idle_fixup(void)
register unsigned int address = (unsigned int)leon3_irqctrl_regs;
/* Interrupts need to be enabled to not hang the CPU */
- local_irq_enable();
+ raw_local_irq_enable();
__asm__ __volatile__ (
"wr %%g0, %%asr19\n"
@@ -66,7 +66,7 @@ static void pmc_leon_idle_fixup(void)
static void pmc_leon_idle(void)
{
/* Interrupts need to be enabled to not hang the CPU */
- local_irq_enable();
+ raw_local_irq_enable();
/* For systems without power-down, this will be no-op */
__asm__ __volatile__ ("wr %g0, %asr19\n\t");
diff --git a/arch/sparc/kernel/process_32.c b/arch/sparc/kernel/process_32.c
index adfcaeab3ddc5..a023637359154 100644
--- a/arch/sparc/kernel/process_32.c
+++ b/arch/sparc/kernel/process_32.c
@@ -74,7 +74,7 @@ void arch_cpu_idle(void)
{
if (sparc_idle)
(*sparc_idle)();
- local_irq_enable();
+ raw_local_irq_enable();
}
/* XXX cli/sti -> local_irq_xxx here, check this works once SMP is fixed. */
diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c
index a75093b993f9a..6f8c7822fc065 100644
--- a/arch/sparc/kernel/process_64.c
+++ b/arch/sparc/kernel/process_64.c
@@ -62,11 +62,11 @@ void arch_cpu_idle(void)
{
if (tlb_type != hypervisor) {
touch_nmi_watchdog();
- local_irq_enable();
+ raw_local_irq_enable();
} else {
unsigned long pstate;
- local_irq_enable();
+ raw_local_irq_enable();
/* The sun4v sleeping code requires that we have PSTATE.IE cleared over
* the cpu sleep hypervisor call.
diff --git a/arch/um/kernel/process.c b/arch/um/kernel/process.c
index 26b5e243d3fc0..495f101792b3d 100644
--- a/arch/um/kernel/process.c
+++ b/arch/um/kernel/process.c
@@ -217,7 +217,7 @@ void arch_cpu_idle(void)
{
cpu_tasks[current_thread_info()->cpu].pid = os_getpid();
um_idle_sleep();
- local_irq_enable();
+ raw_local_irq_enable();
}
int __cant_sleep(void) {
diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index e039a933aca3c..29dd27b5a339d 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -88,8 +88,6 @@ static inline void __mwaitx(unsigned long eax, unsigned long ebx,
static inline void __sti_mwait(unsigned long eax, unsigned long ecx)
{
- trace_hardirqs_on();
-
mds_idle_clear_cpu_buffers();
/* "mwait %eax, %ecx;" */
asm volatile("sti; .byte 0x0f, 0x01, 0xc9;"
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index ba4593a913fab..145a7ac0c19aa 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -685,7 +685,7 @@ void arch_cpu_idle(void)
*/
void __cpuidle default_idle(void)
{
- safe_halt();
+ raw_safe_halt();
}
#if defined(CONFIG_APM_MODULE) || defined(CONFIG_HALTPOLL_CPUIDLE_MODULE)
EXPORT_SYMBOL(default_idle);
@@ -736,6 +736,8 @@ void stop_this_cpu(void *dummy)
/*
* AMD Erratum 400 aware idle routine. We handle it the same way as C3 power
* states (local apic timer and TSC stop).
+ *
+ * XXX this function is completely buggered vs RCU and tracing.
*/
static void amd_e400_idle(void)
{
@@ -757,9 +759,9 @@ static void amd_e400_idle(void)
* The switch back from broadcast mode needs to be called with
* interrupts disabled.
*/
- local_irq_disable();
+ raw_local_irq_disable();
tick_broadcast_exit();
- local_irq_enable();
+ raw_local_irq_enable();
}
/*
@@ -801,9 +803,9 @@ static __cpuidle void mwait_idle(void)
if (!need_resched())
__sti_mwait(0, 0);
else
- local_irq_enable();
+ raw_local_irq_enable();
} else {
- local_irq_enable();
+ raw_local_irq_enable();
}
__current_clr_polling();
}
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index f324dc36fc43d..dee807ffad11b 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -78,7 +78,7 @@ void __weak arch_cpu_idle_dead(void) { }
void __weak arch_cpu_idle(void)
{
cpu_idle_force_poll = 1;
- local_irq_enable();
+ raw_local_irq_enable();
}
/**
@@ -94,9 +94,35 @@ void __cpuidle default_idle_call(void)
trace_cpu_idle(1, smp_processor_id());
stop_critical_timings();
+
+ /*
+ * arch_cpu_idle() is supposed to enable IRQs, however
+ * we can't do that because of RCU and tracing.
+ *
+ * Trace IRQs enable here, then switch off RCU, and have
+ * arch_cpu_idle() use raw_local_irq_enable(). Note that
+ * rcu_idle_enter() relies on lockdep IRQ state, so switch that
+ * last -- this is very similar to the entry code.
+ */
+ trace_hardirqs_on_prepare();
+ lockdep_hardirqs_on_prepare(_THIS_IP_);
rcu_idle_enter();
+ lockdep_hardirqs_on(_THIS_IP_);
+
arch_cpu_idle();
+
+ /*
+ * OK, so IRQs are enabled here, but RCU needs them disabled to
+ * turn itself back on.. funny thing is that disabling IRQs
+ * will cause tracing, which needs RCU. Jump through hoops to
+ * make it 'work'.
+ */
+ raw_local_irq_disable();
+ lockdep_hardirqs_off(_THIS_IP_);
rcu_idle_exit();
+ lockdep_hardirqs_on(_THIS_IP_);
+ raw_local_irq_enable();
+
start_critical_timings();
trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
}
--
2.27.0
^ permalink raw reply related
* [PATCH AUTOSEL 5.4 05/23] powerpc: Drop -me200 addition to build flags
From: Sasha Levin @ 2020-12-03 13:29 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Sasha Levin, kernel test robot, Németh Márton,
Nick Desaulniers, Scott Wood, linuxppc-dev
In-Reply-To: <20201203132935.931362-1-sashal@kernel.org>
From: Michael Ellerman <mpe@ellerman.id.au>
[ Upstream commit e02152ba2810f7c88cb54e71cda096268dfa9241 ]
Currently a build with CONFIG_E200=y will fail with:
Error: invalid switch -me200
Error: unrecognized option -me200
Upstream binutils has never supported an -me200 option. Presumably it
was supported at some point by either a fork or Freescale internal
binutils.
We can't support code that we can't even build test, so drop the
addition of -me200 to the build flags, so we can at least build with
CONFIG_E200=y.
Reported-by: Németh Márton <nm127@freemail.hu>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Scott Wood <oss@buserror.net>
Link: https://lore.kernel.org/r/20201116120913.165317-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/powerpc/Makefile | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 37ac731a556b8..9f73fb6b1cc91 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -250,7 +250,6 @@ KBUILD_CFLAGS += $(call cc-option,-mno-string)
cpu-as-$(CONFIG_4xx) += -Wa,-m405
cpu-as-$(CONFIG_ALTIVEC) += $(call as-option,-Wa$(comma)-maltivec)
-cpu-as-$(CONFIG_E200) += -Wa,-me200
cpu-as-$(CONFIG_E500) += -Wa,-me500
# When using '-many -mpower4' gas will first try and find a matching power4
--
2.27.0
^ permalink raw reply related
* [PATCH AUTOSEL 5.4 12/23] ibmvnic: skip tx timeout reset while in resetting
From: Sasha Levin @ 2020-12-03 13:29 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Sasha Levin, netdev, Dany Madden, Lijun Pan, Brian King,
Jakub Kicinski, linuxppc-dev
In-Reply-To: <20201203132935.931362-1-sashal@kernel.org>
From: Lijun Pan <ljp@linux.ibm.com>
[ Upstream commit 855a631a4c11458a9cef1ab79c1530436aa95fae ]
Sometimes it takes longer than 5 seconds (watchdog timeout) to complete
failover, migration, and other resets. In stead of scheduling another
timeout reset, we wait for the current one to complete.
Suggested-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/ibm/ibmvnic.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index e53994ca3142c..af1d9d2150616 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2266,6 +2266,12 @@ static void ibmvnic_tx_timeout(struct net_device *dev)
{
struct ibmvnic_adapter *adapter = netdev_priv(dev);
+ if (test_bit(0, &adapter->resetting)) {
+ netdev_err(adapter->netdev,
+ "Adapter is resetting, skip timeout reset\n");
+ return;
+ }
+
ibmvnic_reset(adapter, VNIC_RESET_TIMEOUT);
}
--
2.27.0
^ permalink raw reply related
* [PATCH AUTOSEL 5.4 15/23] soc: fsl: dpio: Get the cpumask through cpumask_of(cpu)
From: Sasha Levin @ 2020-12-03 13:29 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Yi Wang, Sasha Levin, Hao Si, Li Yang, Lin Chen, linuxppc-dev,
linux-arm-kernel
In-Reply-To: <20201203132935.931362-1-sashal@kernel.org>
From: Hao Si <si.hao@zte.com.cn>
[ Upstream commit 2663b3388551230cbc4606a40fabf3331ceb59e4 ]
The local variable 'cpumask_t mask' is in the stack memory, and its address
is assigned to 'desc->affinity' in 'irq_set_affinity_hint()'.
But the memory area where this variable is located is at risk of being
modified.
During LTP testing, the following error was generated:
Unable to handle kernel paging request at virtual address ffff000012e9b790
Mem abort info:
ESR = 0x96000007
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000007
CM = 0, WnR = 0
swapper pgtable: 4k pages, 48-bit VAs, pgdp = 0000000075ac5e07
[ffff000012e9b790] pgd=00000027dbffe003, pud=00000027dbffd003,
pmd=00000027b6d61003, pte=0000000000000000
Internal error: Oops: 96000007 [#1] PREEMPT SMP
Modules linked in: xt_conntrack
Process read_all (pid: 20171, stack limit = 0x0000000044ea4095)
CPU: 14 PID: 20171 Comm: read_all Tainted: G B W
Hardware name: NXP Layerscape LX2160ARDB (DT)
pstate: 80000085 (Nzcv daIf -PAN -UAO)
pc : irq_affinity_hint_proc_show+0x54/0xb0
lr : irq_affinity_hint_proc_show+0x4c/0xb0
sp : ffff00001138bc10
x29: ffff00001138bc10 x28: 0000ffffd131d1e0
x27: 00000000007000c0 x26: ffff8025b9480dc0
x25: ffff8025b9480da8 x24: 00000000000003ff
x23: ffff8027334f8300 x22: ffff80272e97d000
x21: ffff80272e97d0b0 x20: ffff8025b9480d80
x19: ffff000009a49000 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: 0000000000000000
x13: 0000000000000000 x12: 0000000000000040
x11: 0000000000000000 x10: ffff802735b79b88
x9 : 0000000000000000 x8 : 0000000000000000
x7 : ffff000009a49848 x6 : 0000000000000003
x5 : 0000000000000000 x4 : ffff000008157d6c
x3 : ffff00001138bc10 x2 : ffff000012e9b790
x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
irq_affinity_hint_proc_show+0x54/0xb0
seq_read+0x1b0/0x440
proc_reg_read+0x80/0xd8
__vfs_read+0x60/0x178
vfs_read+0x94/0x150
ksys_read+0x74/0xf0
__arm64_sys_read+0x24/0x30
el0_svc_common.constprop.0+0xd8/0x1a0
el0_svc_handler+0x34/0x88
el0_svc+0x10/0x14
Code: f9001bbf 943e0732 f94066c2 b4000062 (f9400041)
---[ end trace b495bdcb0b3b732b ]---
Kernel panic - not syncing: Fatal exception
SMP: stopping secondary CPUs
SMP: failed to stop secondary CPUs 0,2-4,6,8,11,13-15
Kernel Offset: disabled
CPU features: 0x0,21006008
Memory Limit: none
---[ end Kernel panic - not syncing: Fatal exception ]---
Fix it by using 'cpumask_of(cpu)' to get the cpumask.
Signed-off-by: Hao Si <si.hao@zte.com.cn>
Signed-off-by: Lin Chen <chen.lin5@zte.com.cn>
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/soc/fsl/dpio/dpio-driver.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/soc/fsl/dpio/dpio-driver.c b/drivers/soc/fsl/dpio/dpio-driver.c
index 7b642c330977f..7f397b4ad878d 100644
--- a/drivers/soc/fsl/dpio/dpio-driver.c
+++ b/drivers/soc/fsl/dpio/dpio-driver.c
@@ -95,7 +95,6 @@ static int register_dpio_irq_handlers(struct fsl_mc_device *dpio_dev, int cpu)
{
int error;
struct fsl_mc_device_irq *irq;
- cpumask_t mask;
irq = dpio_dev->irqs[0];
error = devm_request_irq(&dpio_dev->dev,
@@ -112,9 +111,7 @@ static int register_dpio_irq_handlers(struct fsl_mc_device *dpio_dev, int cpu)
}
/* set the affinity hint */
- cpumask_clear(&mask);
- cpumask_set_cpu(cpu, &mask);
- if (irq_set_affinity_hint(irq->msi_desc->irq, &mask))
+ if (irq_set_affinity_hint(irq->msi_desc->irq, cpumask_of(cpu)))
dev_err(&dpio_dev->dev,
"irq_set_affinity failed irq %d cpu %d\n",
irq->msi_desc->irq, cpu);
--
2.27.0
^ permalink raw reply related
* [PATCH AUTOSEL 4.19 04/14] powerpc: Drop -me200 addition to build flags
From: Sasha Levin @ 2020-12-03 13:30 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Sasha Levin, kernel test robot, Németh Márton,
Nick Desaulniers, Scott Wood, linuxppc-dev
In-Reply-To: <20201203133010.931600-1-sashal@kernel.org>
From: Michael Ellerman <mpe@ellerman.id.au>
[ Upstream commit e02152ba2810f7c88cb54e71cda096268dfa9241 ]
Currently a build with CONFIG_E200=y will fail with:
Error: invalid switch -me200
Error: unrecognized option -me200
Upstream binutils has never supported an -me200 option. Presumably it
was supported at some point by either a fork or Freescale internal
binutils.
We can't support code that we can't even build test, so drop the
addition of -me200 to the build flags, so we can at least build with
CONFIG_E200=y.
Reported-by: Németh Márton <nm127@freemail.hu>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Scott Wood <oss@buserror.net>
Link: https://lore.kernel.org/r/20201116120913.165317-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/powerpc/Makefile | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 8954108df4570..f51e21ea53492 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -251,7 +251,6 @@ endif
cpu-as-$(CONFIG_4xx) += -Wa,-m405
cpu-as-$(CONFIG_ALTIVEC) += $(call as-option,-Wa$(comma)-maltivec)
-cpu-as-$(CONFIG_E200) += -Wa,-me200
cpu-as-$(CONFIG_E500) += -Wa,-me500
# When using '-many -mpower4' gas will first try and find a matching power4
--
2.27.0
^ permalink raw reply related
* Re: [PATCH 1/2] ASoC: fsl-asoc-card: Add support for si476x codec
From: Mark Brown @ 2020-12-03 14:18 UTC (permalink / raw)
To: devicetree, linux-kernel, perex, Shengjiu Wang, linuxppc-dev,
timur, Xiubo.Lee, nicoleotsuka, alsa-devel, lgirdwood, robh+dt,
tiwai, festevam
In-Reply-To: <1606708668-28786-1-git-send-email-shengjiu.wang@nxp.com>
On Mon, 30 Nov 2020 11:57:47 +0800, Shengjiu Wang wrote:
> The si476x codec is used for FM radio function on i.MX6
> auto board, it only supports recording function.
Applied to
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next
Thanks!
[1/2] ASoC: fsl-asoc-card: Add support for si476x codec
commit: 77f1ff751037fcd39c8fc37b3c3796fb139fb388
[2/2] ASoC: bindings: fsl-asoc-card: add compatible string for si476x codec
commit: 0b3355b070434f9901f641aac9000df93e2c96ad
All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying
to this mail.
Thanks,
Mark
^ permalink raw reply
* Re: powerpc 5.10-rcN boot failures with RCU_SCALE_TEST=m
From: Uladzislau Rezki @ 2020-12-03 14:34 UTC (permalink / raw)
To: Michael Ellerman
Cc: rcu, Uladzislau Rezki, linuxppc-dev, Paul E . McKenney,
Daniel Axtens
In-Reply-To: <87sg8nv277.fsf@mpe.ellerman.id.au>
On Thu, Dec 03, 2020 at 05:22:20PM +1100, Michael Ellerman wrote:
> Uladzislau Rezki <urezki@gmail.com> writes:
> > On Thu, Dec 03, 2020 at 01:03:32AM +1100, Michael Ellerman wrote:
> ...
> >>
> >> The SMP bringup stalls because _cpu_up() is blocked trying to take
> >> cpu_hotplug_lock for writing:
> >>
> >> [ 401.403132][ T0] task:swapper/0 state:D stack:12512 pid: 1 ppid: 0 flags:0x00000800
> >> [ 401.403502][ T0] Call Trace:
> >> [ 401.403907][ T0] [c0000000062c37d0] [c0000000062c3830] 0xc0000000062c3830 (unreliable)
> >> [ 401.404068][ T0] [c0000000062c39b0] [c000000000019d70] __switch_to+0x2e0/0x4a0
> >> [ 401.404189][ T0] [c0000000062c3a10] [c000000000b87228] __schedule+0x288/0x9b0
> >> [ 401.404257][ T0] [c0000000062c3ad0] [c000000000b879b8] schedule+0x68/0x120
> >> [ 401.404324][ T0] [c0000000062c3b00] [c000000000184ad4] percpu_down_write+0x164/0x170
> >> [ 401.404390][ T0] [c0000000062c3b50] [c000000000116b68] _cpu_up+0x68/0x280
> >> [ 401.404475][ T0] [c0000000062c3bb0] [c000000000116e70] cpu_up+0xf0/0x140
> >> [ 401.404546][ T0] [c0000000062c3c30] [c00000000011776c] bringup_nonboot_cpus+0xac/0xf0
> >> [ 401.404643][ T0] [c0000000062c3c80] [c000000000eea1b8] smp_init+0x40/0xcc
> >> [ 401.404727][ T0] [c0000000062c3ce0] [c000000000ec43dc] kernel_init_freeable+0x1e0/0x3a0
> >> [ 401.404799][ T0] [c0000000062c3db0] [c000000000011ec4] kernel_init+0x24/0x150
> >> [ 401.404958][ T0] [c0000000062c3e20] [c00000000000daf0] ret_from_kernel_thread+0x5c/0x6c
> >>
> >> It can't get it because kprobe_optimizer() has taken it for read and is now
> >> blocked waiting for synchronize_rcu_tasks():
> >>
> >> [ 401.418808][ T0] task:kworker/0:1 state:D stack:13392 pid: 12 ppid: 2 flags:0x00000800
> >> [ 401.418951][ T0] Workqueue: events kprobe_optimizer
> >> [ 401.419078][ T0] Call Trace:
> >> [ 401.419121][ T0] [c0000000062ef650] [c0000000062ef710] 0xc0000000062ef710 (unreliable)
> >> [ 401.419213][ T0] [c0000000062ef830] [c000000000019d70] __switch_to+0x2e0/0x4a0
> >> [ 401.419281][ T0] [c0000000062ef890] [c000000000b87228] __schedule+0x288/0x9b0
> >> [ 401.419347][ T0] [c0000000062ef950] [c000000000b879b8] schedule+0x68/0x120
> >> [ 401.419415][ T0] [c0000000062ef980] [c000000000b8e664] schedule_timeout+0x2a4/0x340
> >> [ 401.419484][ T0] [c0000000062efa80] [c000000000b894ec] wait_for_completion+0x9c/0x170
> >> [ 401.419552][ T0] [c0000000062efae0] [c0000000001ac85c] __wait_rcu_gp+0x19c/0x210
> >> [ 401.419619][ T0] [c0000000062efb40] [c0000000001ac90c] synchronize_rcu_tasks_generic+0x3c/0x70
> >> [ 401.419690][ T0] [c0000000062efbe0] [c00000000022a3dc] kprobe_optimizer+0x1dc/0x470
> >> [ 401.419757][ T0] [c0000000062efc60] [c000000000136684] process_one_work+0x2f4/0x530
> >> [ 401.419823][ T0] [c0000000062efd20] [c000000000138d28] worker_thread+0x78/0x570
> >> [ 401.419891][ T0] [c0000000062efdb0] [c000000000142424] kthread+0x194/0x1a0
> >> [ 401.419976][ T0] [c0000000062efe20] [c00000000000daf0] ret_from_kernel_thread+0x5c/0x6c
> >>
> >> But why is the synchronize_rcu_tasks() not completing?
> >>
> > I think that it is because RCU is not fully initialized by that time.
>
> Yeah that would explain it :)
>
> > The 36dadef23fcc ("kprobes: Init kprobes in early_initcall") patch
> > switches to early_initcall() that has a higher priority sequence than
> > core_initcall() that is used to complete an RCU setup in the rcu_set_runtime_mode().
>
> I was looking at debug_lockdep_rcu_enabled(), which is:
>
> noinstr int notrace debug_lockdep_rcu_enabled(void)
> {
> return rcu_scheduler_active != RCU_SCHEDULER_INACTIVE && debug_locks &&
> current->lockdep_recursion == 0;
> }
>
> That is not firing any warnings for me because rcu_scheduler_active is:
>
> (gdb) p/x rcu_scheduler_active
> $1 = 0x1
>
> Which is:
>
> #define RCU_SCHEDULER_INIT 1
>
Agree with that.
> But that's different to RCU_SCHEDULER_RUNNING, which is set in
> rcu_set_runtime_mode() as you mentioned:
>
> static int __init rcu_set_runtime_mode(void)
> {
> rcu_test_sync_prims();
> rcu_scheduler_active = RCU_SCHEDULER_RUNNING;
> kfree_rcu_scheduler_running();
> rcu_test_sync_prims();
> return 0;
> }
>
BTW, since you can reproduce it and have a test setup, could you please
check that:
<snip>
-core_initcall(rcu_set_runtime_mode);
+early_initcall(rcu_set_runtime_mode);
<snip>
Just in case. The the synchronize_rcu_tasks() gets blocked:
<snip>
void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
if (crcu_array[j] == crcu_array[i])
break;
if (j == i) {
wait_for_completion(&rs_array[i].completion); <--- here
destroy_rcu_head_on_stack(&rs_array[i].head);
}
...
<snip>
but that is obvious when looking at your full traces.
> The comment on rcu_scheduler_active implies that once we're at
> RCU_SCHEDULER_INIT things should work:
>
> /*
> * The rcu_scheduler_active variable is initialized to the value
> * RCU_SCHEDULER_INACTIVE and transitions RCU_SCHEDULER_INIT just before the
> * first task is spawned. So when this variable is RCU_SCHEDULER_INACTIVE,
> * RCU can assume that there is but one task, allowing RCU to (for example)
> * optimize synchronize_rcu() to a simple barrier(). When this variable
> * is RCU_SCHEDULER_INIT, RCU must actually do all the hard work required
> * to detect real grace periods. This variable is also used to suppress
> * boot-time false positives from lockdep-RCU error checking. Finally, it
> * transitions from RCU_SCHEDULER_INIT to RCU_SCHEDULER_RUNNING after RCU
> * is fully initialized, including all of its kthreads having been spawned.
> */
>
>
> So I'm not sure, the comments and the debug checks imply that it is OK
> for kprobes to be using RCU this early.
>
Sounds like it should be possible.
>
> I guess I'll keep digging.
>
Thank you! I also will dig further with that even though i do not have a setup
for reproducing it.
--
Vlad Rezki
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox