* [BISECTED] rcu_sched self-detected stall since 3.17
@ 2015-11-20 15:35 Vladimir Murzin
2015-12-01 11:50 ` Vladimir Murzin
2015-12-01 13:04 ` Peter Zijlstra
0 siblings, 2 replies; 5+ messages in thread
From: Vladimir Murzin @ 2015-11-20 15:35 UTC (permalink / raw)
To: linux-kernel
Cc: neilb, oleg, peterz, mark.rutland, linux-arm-kernel, linux-mm
Hi,
I've been getting rcu_sched self-detected stall on one of our test
environment with the following (or similar) log:
Linux 4.3
> INFO: rcu_sched self-detected stall on CPU
> 2: (2099 ticks this GP) idle=a73/140000000000001/0 softirq=277/283 fqs=2092
> (t=2100 jiffies g=-197 c=-198 q=230)
> Task dump for CPU 2:
> paranoia R running 0 423 1 0x00000003
> [<c0016f20>] (unwind_backtrace) from [<c0012fd0>] (show_stack+0x10/0x14)
> [<c0012fd0>] (show_stack) from [<c006c52c>] (rcu_dump_cpu_stacks+0xa4/0xc4)
> [<c006c52c>] (rcu_dump_cpu_stacks) from [<c006f188>] (rcu_check_callbacks+0x2dc/0x81c)
> [<c006f188>] (rcu_check_callbacks) from [<c0072278>] (update_process_times+0x38/0x64)
> [<c0072278>] (update_process_times) from [<c007e7b8>] (tick_periodic+0xa8/0xb8)
> [<c007e7b8>] (tick_periodic) from [<c007e904>] (tick_handle_periodic+0x28/0x88)
> [<c007e904>] (tick_handle_periodic) from [<c0223074>] (arch_timer_handler_virt+0x28/0x30)
> [<c0223074>] (arch_timer_handler_virt) from [<c0068008>] (handle_percpu_devid_irq+0x6c/0x84)
> [<c0068008>] (handle_percpu_devid_irq) from [<c0064244>] (generic_handle_irq+0x24/0x34)
> [<c0064244>] (generic_handle_irq) from [<c0064518>] (__handle_domain_irq+0x98/0xac)
> [<c0064518>] (__handle_domain_irq) from [<c0009444>] (gic_handle_irq+0x54/0x90)
> [<c0009444>] (gic_handle_irq) from [<c0013b14>] (__irq_svc+0x54/0x70)
> Exception stack(0xdb007c60 to 0xdb007ca8)
> 7c60: 00000000 00000000 00000000 00000001 dc3e9a20 00000000 00000000 00000001
> 7c80: dc3e9a20 0000001b 00080000 0000000b ffffffff db007cb0 c00ac8f0 c00aa96c
> 7ca0: 600d0113 ffffffff
> [<c0013b14>] (__irq_svc) from [<c00aa96c>] (free_pages_prepare+0x18c/0x218)
> [<c00aa96c>] (free_pages_prepare) from [<c00ac8f0>] (free_hot_cold_page+0x34/0x168)
> [<c00ac8f0>] (free_hot_cold_page) from [<c00c996c>] (handle_mm_fault+0x7b8/0xe60)
> [<c00c996c>] (handle_mm_fault) from [<c001c194>] (do_page_fault+0x12c/0x378)
> [<c001c194>] (do_page_fault) from [<c00092dc>] (do_DataAbort+0x38/0xb4)
> [<c00092dc>] (do_DataAbort) from [<c0013aa0>] (__dabt_svc+0x40/0x60)
> Exception stack(0xdb007e68 to 0xdb007eb0)
> 7e60: 0001b2c8 00000d30 00000000 00000055 00000051 dbb04600
> 7e80: 00000000 db007ec8 db8ea700 db8f3080 dba43240 0001b2c8 00000000 db007eb8
> 7ea0: c012c178 c0174ce8 200d0113 ffffffff
> [<c0013aa0>] (__dabt_svc) from [<c0174ce8>] (__clear_user_std+0x34/0x68)
> [<c0174ce8>] (__clear_user_std) from [<c012c178>] (padzero+0x58/0x74)
> [<c012c178>] (padzero) from [<c012cb4c>] (load_elf_binary+0x730/0x1250)
> [<c012cb4c>] (load_elf_binary) from [<c00ecd6c>] (search_binary_handler+0xa0/0x23c)
> [<c00ecd6c>] (search_binary_handler) from [<c00ed730>] (do_execveat_common+0x444/0x5d8)
> [<c00ed730>] (do_execveat_common) from [<c00ed8e8>] (do_execve+0x24/0x2c)
> [<c00ed8e8>] (do_execve) from [<c000f580>] (ret_fast_syscall+0x0/0x3c)
(I put report for Linux 3.17 under [2])
This rcu_sched self-detected stall is usually reproduced 2 or 3 times
out of 10 runs, but sometimes even more runs go without issue.
I bisected [1] it to commit
commit 743162013d40ca612b4cb53d3a200dff2d9ab26e
Author: NeilBrown <neilb@suse.de>
Date: Mon Jul 7 15:16:04 2014 +1000
sched: Remove proliferation of wait_on_bit() action functions
The only change I noticed is from (mm/filemap.c)
io_schedule();
fatal_signal_pending(current)
to (kernel/sched/wait.c)
signal_pending_state(current->state, current)
io_schedule();
and if I apply following diff I don't see stalls anymore.
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index a104879..2d68cdb 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -514,9 +514,10 @@ EXPORT_SYMBOL(bit_wait);
__sched int bit_wait_io(void *word)
{
+ io_schedule();
+
if (signal_pending_state(current->state, current))
return 1;
- io_schedule();
return 0;
}
EXPORT_SYMBOL(bit_wait_io);
Any ideas why it might happen and why diff above helps?
[1] Bisect log:
I checked 4.[0-3] and 3.1[7-9] manually
I have https://lkml.org/lkml/2015/8/26/649 for better report
> git bisect start '--no-checkout'
> # good: [19583ca584d6f574384e17fe7613dfaeadcdc4a6] Linux 3.16
> git bisect good 19583ca584d6f574384e17fe7613dfaeadcdc4a6
> # bad: [53ee983378ff23e8f3ff95ecf99dea7c6c221900] Merge tag 'staging-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
> git bisect bad 53ee983378ff23e8f3ff95ecf99dea7c6c221900
> # good: [2042088cd67d0064d18c52c13c69af2499907bb1] staging: comedi: ni_labpc: tidy up labpc_ai_scan_mode()
> git bisect good 2042088cd67d0064d18c52c13c69af2499907bb1
> # bad: [98959948a7ba33cf8c708626e0d2a1456397e1c6] Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect bad 98959948a7ba33cf8c708626e0d2a1456397e1c6
> # good: [c9b88e9581828bb8bba06c5e7ee8ed1761172b6e] Merge tag 'trace-3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
> git bisect good c9b88e9581828bb8bba06c5e7ee8ed1761172b6e
> # good: [5bda4f638f36ef4c4e3b1397b02affc3db94356e] Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect good 5bda4f638f36ef4c4e3b1397b02affc3db94356e
> # good: [2336ebc32676df5b794acfe0c980583ec6c05f34] Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/core
> git bisect good 2336ebc32676df5b794acfe0c980583ec6c05f34
> # good: [ef35ad26f8ff44d2c93e29952cdb336bda729d9d] Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect good ef35ad26f8ff44d2c93e29952cdb336bda729d9d
> # good: [1c5d3eb3759013bc7ee4197aa0a9f245bdb6eb90] sched/numa: Simplify task_numa_compare()
> git bisect good 1c5d3eb3759013bc7ee4197aa0a9f245bdb6eb90
> # good: [6e76ea8a8209386c3cc7ee5594e6ea5d25525cf2] sched: Remove extra static_key*() function indirection
> git bisect good 6e76ea8a8209386c3cc7ee5594e6ea5d25525cf2
> # bad: [c1221321b7c25b53204447cff9949a6d5a7ddddc] sched: Allow wait_on_bit_action() functions to support a timeout
> git bisect bad c1221321b7c25b53204447cff9949a6d5a7ddddc
> # good: [e720fff6341fe4b95e5a93c939bd3c77fa55ced4] sched/numa: Revert "Use effective_load() to balance NUMA loads"
> git bisect good e720fff6341fe4b95e5a93c939bd3c77fa55ced4
> # bad: [743162013d40ca612b4cb53d3a200dff2d9ab26e] sched: Remove proliferation of wait_on_bit() action functions
> git bisect bad 743162013d40ca612b4cb53d3a200dff2d9ab26e
> # good: [d26fad5b38e1c4667d4f2604936e59c837caa54d] Merge tag 'v3.16-rc5' into sched/core, to refresh the branch before applying bigger tree-wide changes
> git bisect good d26fad5b38e1c4667d4f2604936e59c837caa54d
[2] rcu_sched self-detected stall under Linux 3.17
> INFO: rcu_sched self-detected stall on CPU { 2} (t=2100 jiffies g=-199 c=-200 q=172)
> Task dump for CPU 2:
> paranoia R running 0 414 1 0x00000003
> [<c0015394>] (unwind_backtrace) from [<c00117dc>] (show_stack+0x10/0x14)
> [<c00117dc>] (show_stack) from [<c00602d0>] (rcu_dump_cpu_stacks+0xa4/0xc4)
> [<c00602d0>] (rcu_dump_cpu_stacks) from [<c0062aec>] (rcu_check_callbacks+0x280/0x6f8)
> [<c0062aec>] (rcu_check_callbacks) from [<c0065b9c>] (update_process_times+0x40/0x60)
> [<c0065b9c>] (update_process_times) from [<c007169c>] (tick_periodic+0xa8/0xb8)
> [<c007169c>] (tick_periodic) from [<c00717bc>] (tick_handle_periodic+0x28/0x88)
> [<c00717bc>] (tick_handle_periodic) from [<c01fc414>] (arch_timer_handler_virt+0x28/0x30)
> [<c01fc414>] (arch_timer_handler_virt) from [<c005d100>] (handle_percpu_devid_irq+0x6c/0x84)
> [<c005d100>] (handle_percpu_devid_irq) from [<c0059658>] (generic_handle_irq+0x2c/0x3c)
> [<c0059658>] (generic_handle_irq) from [<c000ee3c>] (handle_IRQ+0x7c/0x8c)
> [<c000ee3c>] (handle_IRQ) from [<c00085f8>] (gic_handle_irq+0x40/0x5c)
> [<c00085f8>] (gic_handle_irq) from [<c0012300>] (__irq_svc+0x40/0x54)
> Exception stack(0xdb063c60 to 0xdb063ca8)
> 3c60: 00000000 0001b000 dbbfc000 0000000b dbbfc000 0000000b dbbfc000 0001b000
> 3c80: 0001b000 dbb2d6e0 0000001b db03c380 0000006c db063ca8 c00b935c c00b935c
> 3ca0: a00d0113 ffffffff
> [<c0012300>] (__irq_svc) from [<c00b935c>] (do_cow_fault.isra.97+0x1c/0x198)
> [<c00b935c>] (do_cow_fault.isra.97) from [<c00b9804>] (handle_mm_fault+0x14c/0x900)
> [<c00b9804>] (handle_mm_fault) from [<c001ae38>] (do_page_fault+0x120/0x3d8)
> [<c001ae38>] (do_page_fault) from [<c00084c0>] (do_DataAbort+0x38/0x98)
> [<c00084c0>] (do_DataAbort) from [<c0012298>] (__dabt_svc+0x38/0x60)
> Exception stack(0xdb063e70 to 0xdb063eb8)
> 3e60: 0001b2c8 00000d30 00000000 00000000
> 3e80: db902a80 dbadc100 00000000 db063ec8 db062008 dbadc200 0001b2c8 0001b6cc
> 3ea0: 00000000 db063eb8 c011689c c015a884 200d0013 ffffffff
> [<c0012298>] (__dabt_svc) from [<c015a884>] (__clear_user_std+0x34/0x64)
> [<c015a884>] (__clear_user_std) from [<c011689c>] (padzero+0x44/0x58)
> [<c011689c>] (padzero) from [<c01182e8>] (load_elf_binary+0x778/0x138c)
> [<c01182e8>] (load_elf_binary) from [<c00da960>] (search_binary_handler+0x98/0x1d8)
> [<c00da960>] (search_binary_handler) from [<c00dbc2c>] (do_execve+0x35c/0x4bc)
> [<c00dbc2c>] (do_execve) from [<c000e520>] (ret_fast_syscall+0x0/0x30)
Thanks
Vladimir
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [BISECTED] rcu_sched self-detected stall since 3.17
2015-11-20 15:35 [BISECTED] rcu_sched self-detected stall since 3.17 Vladimir Murzin
@ 2015-12-01 11:50 ` Vladimir Murzin
2015-12-01 13:04 ` Peter Zijlstra
1 sibling, 0 replies; 5+ messages in thread
From: Vladimir Murzin @ 2015-12-01 11:50 UTC (permalink / raw)
To: linux-kernel
Cc: neilb, oleg, peterz, mark.rutland, linux-arm-kernel, linux-mm
On 20/11/15 15:35, Vladimir Murzin wrote:
> Hi,
>
> I've been getting rcu_sched self-detected stall on one of our test
> environment with the following (or similar) log:
>
> Linux 4.3
Anybody looking into this? Oleg, Peter, any ideas what's happening here?
Thanks
Vladimir
>
>> INFO: rcu_sched self-detected stall on CPU
>> 2: (2099 ticks this GP) idle=a73/140000000000001/0 softirq=277/283 fqs=2092
>> (t=2100 jiffies g=-197 c=-198 q=230)
>> Task dump for CPU 2:
>> paranoia R running 0 423 1 0x00000003
>> [<c0016f20>] (unwind_backtrace) from [<c0012fd0>] (show_stack+0x10/0x14)
>> [<c0012fd0>] (show_stack) from [<c006c52c>] (rcu_dump_cpu_stacks+0xa4/0xc4)
>> [<c006c52c>] (rcu_dump_cpu_stacks) from [<c006f188>] (rcu_check_callbacks+0x2dc/0x81c)
>> [<c006f188>] (rcu_check_callbacks) from [<c0072278>] (update_process_times+0x38/0x64)
>> [<c0072278>] (update_process_times) from [<c007e7b8>] (tick_periodic+0xa8/0xb8)
>> [<c007e7b8>] (tick_periodic) from [<c007e904>] (tick_handle_periodic+0x28/0x88)
>> [<c007e904>] (tick_handle_periodic) from [<c0223074>] (arch_timer_handler_virt+0x28/0x30)
>> [<c0223074>] (arch_timer_handler_virt) from [<c0068008>] (handle_percpu_devid_irq+0x6c/0x84)
>> [<c0068008>] (handle_percpu_devid_irq) from [<c0064244>] (generic_handle_irq+0x24/0x34)
>> [<c0064244>] (generic_handle_irq) from [<c0064518>] (__handle_domain_irq+0x98/0xac)
>> [<c0064518>] (__handle_domain_irq) from [<c0009444>] (gic_handle_irq+0x54/0x90)
>> [<c0009444>] (gic_handle_irq) from [<c0013b14>] (__irq_svc+0x54/0x70)
>> Exception stack(0xdb007c60 to 0xdb007ca8)
>> 7c60: 00000000 00000000 00000000 00000001 dc3e9a20 00000000 00000000 00000001
>> 7c80: dc3e9a20 0000001b 00080000 0000000b ffffffff db007cb0 c00ac8f0 c00aa96c
>> 7ca0: 600d0113 ffffffff
>> [<c0013b14>] (__irq_svc) from [<c00aa96c>] (free_pages_prepare+0x18c/0x218)
>> [<c00aa96c>] (free_pages_prepare) from [<c00ac8f0>] (free_hot_cold_page+0x34/0x168)
>> [<c00ac8f0>] (free_hot_cold_page) from [<c00c996c>] (handle_mm_fault+0x7b8/0xe60)
>> [<c00c996c>] (handle_mm_fault) from [<c001c194>] (do_page_fault+0x12c/0x378)
>> [<c001c194>] (do_page_fault) from [<c00092dc>] (do_DataAbort+0x38/0xb4)
>> [<c00092dc>] (do_DataAbort) from [<c0013aa0>] (__dabt_svc+0x40/0x60)
>> Exception stack(0xdb007e68 to 0xdb007eb0)
>> 7e60: 0001b2c8 00000d30 00000000 00000055 00000051 dbb04600
>> 7e80: 00000000 db007ec8 db8ea700 db8f3080 dba43240 0001b2c8 00000000 db007eb8
>> 7ea0: c012c178 c0174ce8 200d0113 ffffffff
>> [<c0013aa0>] (__dabt_svc) from [<c0174ce8>] (__clear_user_std+0x34/0x68)
>> [<c0174ce8>] (__clear_user_std) from [<c012c178>] (padzero+0x58/0x74)
>> [<c012c178>] (padzero) from [<c012cb4c>] (load_elf_binary+0x730/0x1250)
>> [<c012cb4c>] (load_elf_binary) from [<c00ecd6c>] (search_binary_handler+0xa0/0x23c)
>> [<c00ecd6c>] (search_binary_handler) from [<c00ed730>] (do_execveat_common+0x444/0x5d8)
>> [<c00ed730>] (do_execveat_common) from [<c00ed8e8>] (do_execve+0x24/0x2c)
>> [<c00ed8e8>] (do_execve) from [<c000f580>] (ret_fast_syscall+0x0/0x3c)
>
> (I put report for Linux 3.17 under [2])
>
> This rcu_sched self-detected stall is usually reproduced 2 or 3 times
> out of 10 runs, but sometimes even more runs go without issue.
>
> I bisected [1] it to commit
>
> commit 743162013d40ca612b4cb53d3a200dff2d9ab26e
> Author: NeilBrown <neilb@suse.de>
> Date: Mon Jul 7 15:16:04 2014 +1000
>
> sched: Remove proliferation of wait_on_bit() action functions
>
> The only change I noticed is from (mm/filemap.c)
>
> io_schedule();
> fatal_signal_pending(current)
>
> to (kernel/sched/wait.c)
>
> signal_pending_state(current->state, current)
> io_schedule();
>
> and if I apply following diff I don't see stalls anymore.
>
> diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
> index a104879..2d68cdb 100644
> --- a/kernel/sched/wait.c
> +++ b/kernel/sched/wait.c
> @@ -514,9 +514,10 @@ EXPORT_SYMBOL(bit_wait);
>
> __sched int bit_wait_io(void *word)
> {
> + io_schedule();
> +
> if (signal_pending_state(current->state, current))
> return 1;
> - io_schedule();
> return 0;
> }
> EXPORT_SYMBOL(bit_wait_io);
>
> Any ideas why it might happen and why diff above helps?
>
> [1] Bisect log:
>
> I checked 4.[0-3] and 3.1[7-9] manually
> I have https://lkml.org/lkml/2015/8/26/649 for better report
>
>> git bisect start '--no-checkout'
>> # good: [19583ca584d6f574384e17fe7613dfaeadcdc4a6] Linux 3.16
>> git bisect good 19583ca584d6f574384e17fe7613dfaeadcdc4a6
>> # bad: [53ee983378ff23e8f3ff95ecf99dea7c6c221900] Merge tag 'staging-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
>> git bisect bad 53ee983378ff23e8f3ff95ecf99dea7c6c221900
>> # good: [2042088cd67d0064d18c52c13c69af2499907bb1] staging: comedi: ni_labpc: tidy up labpc_ai_scan_mode()
>> git bisect good 2042088cd67d0064d18c52c13c69af2499907bb1
>> # bad: [98959948a7ba33cf8c708626e0d2a1456397e1c6] Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>> git bisect bad 98959948a7ba33cf8c708626e0d2a1456397e1c6
>> # good: [c9b88e9581828bb8bba06c5e7ee8ed1761172b6e] Merge tag 'trace-3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
>> git bisect good c9b88e9581828bb8bba06c5e7ee8ed1761172b6e
>> # good: [5bda4f638f36ef4c4e3b1397b02affc3db94356e] Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>> git bisect good 5bda4f638f36ef4c4e3b1397b02affc3db94356e
>> # good: [2336ebc32676df5b794acfe0c980583ec6c05f34] Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/core
>> git bisect good 2336ebc32676df5b794acfe0c980583ec6c05f34
>> # good: [ef35ad26f8ff44d2c93e29952cdb336bda729d9d] Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>> git bisect good ef35ad26f8ff44d2c93e29952cdb336bda729d9d
>> # good: [1c5d3eb3759013bc7ee4197aa0a9f245bdb6eb90] sched/numa: Simplify task_numa_compare()
>> git bisect good 1c5d3eb3759013bc7ee4197aa0a9f245bdb6eb90
>> # good: [6e76ea8a8209386c3cc7ee5594e6ea5d25525cf2] sched: Remove extra static_key*() function indirection
>> git bisect good 6e76ea8a8209386c3cc7ee5594e6ea5d25525cf2
>> # bad: [c1221321b7c25b53204447cff9949a6d5a7ddddc] sched: Allow wait_on_bit_action() functions to support a timeout
>> git bisect bad c1221321b7c25b53204447cff9949a6d5a7ddddc
>> # good: [e720fff6341fe4b95e5a93c939bd3c77fa55ced4] sched/numa: Revert "Use effective_load() to balance NUMA loads"
>> git bisect good e720fff6341fe4b95e5a93c939bd3c77fa55ced4
>> # bad: [743162013d40ca612b4cb53d3a200dff2d9ab26e] sched: Remove proliferation of wait_on_bit() action functions
>> git bisect bad 743162013d40ca612b4cb53d3a200dff2d9ab26e
>> # good: [d26fad5b38e1c4667d4f2604936e59c837caa54d] Merge tag 'v3.16-rc5' into sched/core, to refresh the branch before applying bigger tree-wide changes
>> git bisect good d26fad5b38e1c4667d4f2604936e59c837caa54d
>
> [2] rcu_sched self-detected stall under Linux 3.17
>
>> INFO: rcu_sched self-detected stall on CPU { 2} (t=2100 jiffies g=-199 c=-200 q=172)
>> Task dump for CPU 2:
>> paranoia R running 0 414 1 0x00000003
>> [<c0015394>] (unwind_backtrace) from [<c00117dc>] (show_stack+0x10/0x14)
>> [<c00117dc>] (show_stack) from [<c00602d0>] (rcu_dump_cpu_stacks+0xa4/0xc4)
>> [<c00602d0>] (rcu_dump_cpu_stacks) from [<c0062aec>] (rcu_check_callbacks+0x280/0x6f8)
>> [<c0062aec>] (rcu_check_callbacks) from [<c0065b9c>] (update_process_times+0x40/0x60)
>> [<c0065b9c>] (update_process_times) from [<c007169c>] (tick_periodic+0xa8/0xb8)
>> [<c007169c>] (tick_periodic) from [<c00717bc>] (tick_handle_periodic+0x28/0x88)
>> [<c00717bc>] (tick_handle_periodic) from [<c01fc414>] (arch_timer_handler_virt+0x28/0x30)
>> [<c01fc414>] (arch_timer_handler_virt) from [<c005d100>] (handle_percpu_devid_irq+0x6c/0x84)
>> [<c005d100>] (handle_percpu_devid_irq) from [<c0059658>] (generic_handle_irq+0x2c/0x3c)
>> [<c0059658>] (generic_handle_irq) from [<c000ee3c>] (handle_IRQ+0x7c/0x8c)
>> [<c000ee3c>] (handle_IRQ) from [<c00085f8>] (gic_handle_irq+0x40/0x5c)
>> [<c00085f8>] (gic_handle_irq) from [<c0012300>] (__irq_svc+0x40/0x54)
>> Exception stack(0xdb063c60 to 0xdb063ca8)
>> 3c60: 00000000 0001b000 dbbfc000 0000000b dbbfc000 0000000b dbbfc000 0001b000
>> 3c80: 0001b000 dbb2d6e0 0000001b db03c380 0000006c db063ca8 c00b935c c00b935c
>> 3ca0: a00d0113 ffffffff
>> [<c0012300>] (__irq_svc) from [<c00b935c>] (do_cow_fault.isra.97+0x1c/0x198)
>> [<c00b935c>] (do_cow_fault.isra.97) from [<c00b9804>] (handle_mm_fault+0x14c/0x900)
>> [<c00b9804>] (handle_mm_fault) from [<c001ae38>] (do_page_fault+0x120/0x3d8)
>> [<c001ae38>] (do_page_fault) from [<c00084c0>] (do_DataAbort+0x38/0x98)
>> [<c00084c0>] (do_DataAbort) from [<c0012298>] (__dabt_svc+0x38/0x60)
>> Exception stack(0xdb063e70 to 0xdb063eb8)
>> 3e60: 0001b2c8 00000d30 00000000 00000000
>> 3e80: db902a80 dbadc100 00000000 db063ec8 db062008 dbadc200 0001b2c8 0001b6cc
>> 3ea0: 00000000 db063eb8 c011689c c015a884 200d0013 ffffffff
>> [<c0012298>] (__dabt_svc) from [<c015a884>] (__clear_user_std+0x34/0x64)
>> [<c015a884>] (__clear_user_std) from [<c011689c>] (padzero+0x44/0x58)
>> [<c011689c>] (padzero) from [<c01182e8>] (load_elf_binary+0x778/0x138c)
>> [<c01182e8>] (load_elf_binary) from [<c00da960>] (search_binary_handler+0x98/0x1d8)
>> [<c00da960>] (search_binary_handler) from [<c00dbc2c>] (do_execve+0x35c/0x4bc)
>> [<c00dbc2c>] (do_execve) from [<c000e520>] (ret_fast_syscall+0x0/0x30)
>
> Thanks
> Vladimir
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>
>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [BISECTED] rcu_sched self-detected stall since 3.17
2015-11-20 15:35 [BISECTED] rcu_sched self-detected stall since 3.17 Vladimir Murzin
2015-12-01 11:50 ` Vladimir Murzin
@ 2015-12-01 13:04 ` Peter Zijlstra
2015-12-02 9:04 ` Vladimir Murzin
2015-12-15 16:56 ` Oleg Nesterov
1 sibling, 2 replies; 5+ messages in thread
From: Peter Zijlstra @ 2015-12-01 13:04 UTC (permalink / raw)
To: Vladimir Murzin
Cc: linux-kernel, neilb, oleg, mark.rutland, linux-arm-kernel,
linux-mm
Sorry for the delay and thanks for the reminder!
On Fri, Nov 20, 2015 at 03:35:38PM +0000, Vladimir Murzin wrote:
> commit 743162013d40ca612b4cb53d3a200dff2d9ab26e
> Author: NeilBrown <neilb@suse.de>
> Date: Mon Jul 7 15:16:04 2014 +1000
>
> sched: Remove proliferation of wait_on_bit() action functions
>
> The only change I noticed is from (mm/filemap.c)
>
> io_schedule();
> fatal_signal_pending(current)
>
> to (kernel/sched/wait.c)
>
> signal_pending_state(current->state, current)
> io_schedule();
>
> and if I apply following diff I don't see stalls anymore.
>
> diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
> index a104879..2d68cdb 100644
> --- a/kernel/sched/wait.c
> +++ b/kernel/sched/wait.c
> @@ -514,9 +514,10 @@ EXPORT_SYMBOL(bit_wait);
>
> __sched int bit_wait_io(void *word)
> {
> + io_schedule();
> +
> if (signal_pending_state(current->state, current))
> return 1;
> - io_schedule();
> return 0;
> }
> EXPORT_SYMBOL(bit_wait_io);
>
> Any ideas why it might happen and why diff above helps?
Yes, the code as presented is simply wrong. And in fact most of the code
it replaced was of the right form (with a few exceptions which would
indeed have been subject to the same problem you've observed.
Note how the late:
- cifs_sb_tcon_pending_wait
- fscache_wait_bit_interruptible
- sleep_on_page_killable
- wait_inquiry
- key_wait_bit_intr
All check the signal state _after_ calling schedule().
As opposed to:
- gfs2_journalid_wait
which follows the broken pattern.
Further notice that most expect a return of -EINTR, which also seems
correct given that this is a signal, those that do not return -EINTR
only check for a !0 return value so would work equally well with -EINTR.
The reason this is broken is that schedule() will no-op when there is a
pending signal, while raising a signal will also issue a wakeup.
Thus the right thing to do is check for the signal state after, that way
you handle both cases:
- calling schedule() with a signal pending
- receiving a signal while sleeping
As such, I would propose the below patch. Oleg, do you concur?
---
Subject: sched,wait: Fix signal handling in bit wait helpers
Vladimir reported getting RCU stall warnings and bisected it back to
commit 743162013d40. That commit inadvertently reversed the calls to
schedule() and signal_pending(), thereby not handling the case where the
signal receives while we sleep.
Fixes: 743162013d40 ("sched: Remove proliferation of wait_on_bit() action functions")
Fixes: cbbce8220949 ("SCHED: add some "wait..on_bit...timeout()" interfaces.")
Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/sched/wait.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index 052e02672d12..f10bd873e684 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -583,18 +583,18 @@ EXPORT_SYMBOL(wake_up_atomic_t);
__sched int bit_wait(struct wait_bit_key *word)
{
- if (signal_pending_state(current->state, current))
- return 1;
schedule();
+ if (signal_pending(current))
+ return -EINTR;
return 0;
}
EXPORT_SYMBOL(bit_wait);
__sched int bit_wait_io(struct wait_bit_key *word)
{
- if (signal_pending_state(current->state, current))
- return 1;
io_schedule();
+ if (signal_pending(current))
+ return -EINTR;
return 0;
}
EXPORT_SYMBOL(bit_wait_io);
@@ -602,11 +602,11 @@ EXPORT_SYMBOL(bit_wait_io);
__sched int bit_wait_timeout(struct wait_bit_key *word)
{
unsigned long now = READ_ONCE(jiffies);
- if (signal_pending_state(current->state, current))
- return 1;
if (time_after_eq(now, word->timeout))
return -EAGAIN;
schedule_timeout(word->timeout - now);
+ if (signal_pending(current))
+ return -EINTR;
return 0;
}
EXPORT_SYMBOL_GPL(bit_wait_timeout);
@@ -614,11 +614,11 @@ EXPORT_SYMBOL_GPL(bit_wait_timeout);
__sched int bit_wait_io_timeout(struct wait_bit_key *word)
{
unsigned long now = READ_ONCE(jiffies);
- if (signal_pending_state(current->state, current))
- return 1;
if (time_after_eq(now, word->timeout))
return -EAGAIN;
io_schedule_timeout(word->timeout - now);
+ if (signal_pending(current))
+ return -EINTR;
return 0;
}
EXPORT_SYMBOL_GPL(bit_wait_io_timeout);
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [BISECTED] rcu_sched self-detected stall since 3.17
2015-12-01 13:04 ` Peter Zijlstra
@ 2015-12-02 9:04 ` Vladimir Murzin
2015-12-15 16:56 ` Oleg Nesterov
1 sibling, 0 replies; 5+ messages in thread
From: Vladimir Murzin @ 2015-12-02 9:04 UTC (permalink / raw)
To: Peter Zijlstra
Cc: linux-kernel, neilb, oleg, mark.rutland, linux-arm-kernel,
linux-mm
On 01/12/15 13:04, Peter Zijlstra wrote:
> Sorry for the delay and thanks for the reminder!
>
> On Fri, Nov 20, 2015 at 03:35:38PM +0000, Vladimir Murzin wrote:
>> commit 743162013d40ca612b4cb53d3a200dff2d9ab26e
>> Author: NeilBrown <neilb@suse.de>
>> Date: Mon Jul 7 15:16:04 2014 +1000
>>
>> sched: Remove proliferation of wait_on_bit() action functions
>>
>> The only change I noticed is from (mm/filemap.c)
>>
>> io_schedule();
>> fatal_signal_pending(current)
>>
>> to (kernel/sched/wait.c)
>>
>> signal_pending_state(current->state, current)
>> io_schedule();
>>
>> and if I apply following diff I don't see stalls anymore.
>>
>> diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
>> index a104879..2d68cdb 100644
>> --- a/kernel/sched/wait.c
>> +++ b/kernel/sched/wait.c
>> @@ -514,9 +514,10 @@ EXPORT_SYMBOL(bit_wait);
>>
>> __sched int bit_wait_io(void *word)
>> {
>> + io_schedule();
>> +
>> if (signal_pending_state(current->state, current))
>> return 1;
>> - io_schedule();
>> return 0;
>> }
>> EXPORT_SYMBOL(bit_wait_io);
>>
>> Any ideas why it might happen and why diff above helps?
>
> Yes, the code as presented is simply wrong. And in fact most of the code
> it replaced was of the right form (with a few exceptions which would
> indeed have been subject to the same problem you've observed.
>
> Note how the late:
>
> - cifs_sb_tcon_pending_wait
> - fscache_wait_bit_interruptible
> - sleep_on_page_killable
> - wait_inquiry
> - key_wait_bit_intr
>
> All check the signal state _after_ calling schedule().
>
> As opposed to:
>
> - gfs2_journalid_wait
>
> which follows the broken pattern.
>
> Further notice that most expect a return of -EINTR, which also seems
> correct given that this is a signal, those that do not return -EINTR
> only check for a !0 return value so would work equally well with -EINTR.
>
> The reason this is broken is that schedule() will no-op when there is a
> pending signal, while raising a signal will also issue a wakeup.
>
Glad to hear confirmation on a problem. Thanks for detailed answer!
> Thus the right thing to do is check for the signal state after, that way
> you handle both cases:
>
> - calling schedule() with a signal pending
> - receiving a signal while sleeping
>
> As such, I would propose the below patch. Oleg, do you concur?
>
> ---
> Subject: sched,wait: Fix signal handling in bit wait helpers
>
> Vladimir reported getting RCU stall warnings and bisected it back to
> commit 743162013d40. That commit inadvertently reversed the calls to
> schedule() and signal_pending(), thereby not handling the case where the
> signal receives while we sleep.
>
> Fixes: 743162013d40 ("sched: Remove proliferation of wait_on_bit() action functions")
> Fixes: cbbce8220949 ("SCHED: add some "wait..on_bit...timeout()" interfaces.")
> Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
> kernel/sched/wait.c | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
> index 052e02672d12..f10bd873e684 100644
> --- a/kernel/sched/wait.c
> +++ b/kernel/sched/wait.c
> @@ -583,18 +583,18 @@ EXPORT_SYMBOL(wake_up_atomic_t);
>
> __sched int bit_wait(struct wait_bit_key *word)
> {
> - if (signal_pending_state(current->state, current))
> - return 1;
> schedule();
> + if (signal_pending(current))
> + return -EINTR;
> return 0;
> }
> EXPORT_SYMBOL(bit_wait);
>
> __sched int bit_wait_io(struct wait_bit_key *word)
> {
> - if (signal_pending_state(current->state, current))
> - return 1;
> io_schedule();
> + if (signal_pending(current))
> + return -EINTR;
> return 0;
> }
> EXPORT_SYMBOL(bit_wait_io);
> @@ -602,11 +602,11 @@ EXPORT_SYMBOL(bit_wait_io);
> __sched int bit_wait_timeout(struct wait_bit_key *word)
> {
> unsigned long now = READ_ONCE(jiffies);
> - if (signal_pending_state(current->state, current))
> - return 1;
> if (time_after_eq(now, word->timeout))
> return -EAGAIN;
> schedule_timeout(word->timeout - now);
> + if (signal_pending(current))
> + return -EINTR;
> return 0;
> }
> EXPORT_SYMBOL_GPL(bit_wait_timeout);
> @@ -614,11 +614,11 @@ EXPORT_SYMBOL_GPL(bit_wait_timeout);
> __sched int bit_wait_io_timeout(struct wait_bit_key *word)
> {
> unsigned long now = READ_ONCE(jiffies);
> - if (signal_pending_state(current->state, current))
> - return 1;
> if (time_after_eq(now, word->timeout))
> return -EAGAIN;
> io_schedule_timeout(word->timeout - now);
> + if (signal_pending(current))
> + return -EINTR;
> return 0;
> }
> EXPORT_SYMBOL_GPL(bit_wait_io_timeout);
>
I run it overnight on top of 4.3 and didn't see stalls. So in case it helps
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Cheers
Vladimir
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [BISECTED] rcu_sched self-detected stall since 3.17
2015-12-01 13:04 ` Peter Zijlstra
2015-12-02 9:04 ` Vladimir Murzin
@ 2015-12-15 16:56 ` Oleg Nesterov
1 sibling, 0 replies; 5+ messages in thread
From: Oleg Nesterov @ 2015-12-15 16:56 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Vladimir Murzin, linux-kernel, neilb, mark.rutland,
linux-arm-kernel, linux-mm
Sorry again for the huge delay.
And all I can say is that I am all confused.
On 12/01, Peter Zijlstra wrote:
>
> On Fri, Nov 20, 2015 at 03:35:38PM +0000, Vladimir Murzin wrote:
> > commit 743162013d40ca612b4cb53d3a200dff2d9ab26e
> > Author: NeilBrown <neilb@suse.de>
> > Date: Mon Jul 7 15:16:04 2014 +1000
That patch still looks correct to me.
> > and if I apply following diff I don't see stalls anymore.
> >
> > diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
> > index a104879..2d68cdb 100644
> > --- a/kernel/sched/wait.c
> > +++ b/kernel/sched/wait.c
> > @@ -514,9 +514,10 @@ EXPORT_SYMBOL(bit_wait);
> >
> > __sched int bit_wait_io(void *word)
> > {
> > + io_schedule();
> > +
> > if (signal_pending_state(current->state, current))
> > return 1;
> > - io_schedule();
> > return 0;
> > }
> > EXPORT_SYMBOL(bit_wait_io);
I can't understand why this change helps. But note that it actually removes
the signal_pending_state() check from bit_wait_io(), current->state is always
TASK_RUNNING after return from schedule(), signal_pending_state() will always
return zero.
This means that after this change wait_on_page_bit_killable() will spin in a
busy-wait loop if the caller is killed.
> The reason this is broken is that schedule() will no-op when there is a
> pending signal, while raising a signal will also issue a wakeup.
But why this is wrong? We should notice signal_pending_state() on the next
iteration.
> Thus the right thing to do is check for the signal state after,
I think this check should work on both sides. The only difference is that
you obviously can't use current->state after schedule().
I still can't understand the problem.
Oleg.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2015-12-15 16:56 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-11-20 15:35 [BISECTED] rcu_sched self-detected stall since 3.17 Vladimir Murzin
2015-12-01 11:50 ` Vladimir Murzin
2015-12-01 13:04 ` Peter Zijlstra
2015-12-02 9:04 ` Vladimir Murzin
2015-12-15 16:56 ` Oleg Nesterov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).