* 5.12-rc1 regression: freezing iou-mgr/wrk failed
From: Alex Xu (Hello71) @ 2021-03-02 0:57 UTC
To: linux-kernel, linux-block, Jens Axboe

Hi,

On Linux 5.12-rc1, I am unable to suspend to RAM. The system freezes for
about 40 seconds and then continues operation. The following messages
are printed to the kernel log:

[ 240.650300] PM: suspend entry (deep)
[ 240.650748] Filesystems sync: 0.000 seconds
[ 240.725605] Freezing user space processes ...
[ 260.739483] Freezing of tasks failed after 20.013 seconds (3 tasks refusing to freeze, wq_busy=0):
[ 260.739497] task:iou-mgr-446 state:S stack: 0 pid: 516 ppid: 439 flags:0x00004224
[ 260.739504] Call Trace:
[ 260.739507] ? sysvec_apic_timer_interrupt+0xb/0x81
[ 260.739515] ? pick_next_task_fair+0x197/0x1cde
[ 260.739519] ? sysvec_reschedule_ipi+0x2f/0x6a
[ 260.739522] ? asm_sysvec_reschedule_ipi+0x12/0x20
[ 260.739525] ? __schedule+0x57/0x6d6
[ 260.739529] ? del_timer_sync+0xb9/0x115
[ 260.739533] ? schedule+0x63/0xd5
[ 260.739536] ? schedule_timeout+0x219/0x356
[ 260.739540] ? __next_timer_interrupt+0xf1/0xf1
[ 260.739544] ? io_wq_manager+0x73/0xb1
[ 260.739549] ? io_wq_create+0x262/0x262
[ 260.739553] ? ret_from_fork+0x22/0x30
[ 260.739557] task:iou-mgr-517 state:S stack: 0 pid: 522 ppid: 439 flags:0x00004224
[ 260.739561] Call Trace:
[ 260.739563] ? sysvec_apic_timer_interrupt+0xb/0x81
[ 260.739566] ? pick_next_task_fair+0x16f/0x1cde
[ 260.739569] ? sysvec_apic_timer_interrupt+0xb/0x81
[ 260.739571] ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 260.739574] ? __schedule+0x5b7/0x6d6
[ 260.739578] ? del_timer_sync+0x70/0x115
[ 260.739581] ? schedule_timeout+0x211/0x356
[ 260.739585] ? __next_timer_interrupt+0xf1/0xf1
[ 260.739588] ? io_wq_check_workers+0x15/0x11f
[ 260.739592] ? io_wq_manager+0x69/0xb1
[ 260.739596] ? io_wq_create+0x262/0x262
[ 260.739600] ? ret_from_fork+0x22/0x30
[ 260.739603] task:iou-wrk-517 state:S stack: 0 pid: 523 ppid: 439 flags:0x00004224
[ 260.739607] Call Trace:
[ 260.739609] ? __schedule+0x5b7/0x6d6
[ 260.739614] ? schedule+0x63/0xd5
[ 260.739617] ? schedule_timeout+0x219/0x356
[ 260.739621] ? __next_timer_interrupt+0xf1/0xf1
[ 260.739624] ? task_thread.isra.0+0x148/0x3af
[ 260.739628] ? task_thread_unbound+0xa/0xa
[ 260.739632] ? task_thread_bound+0x7/0x7
[ 260.739636] ? ret_from_fork+0x22/0x30
[ 260.739647] OOM killer enabled.
[ 260.739648] Restarting tasks ... done.
[ 260.740077] PM: suspend exit

and then a set of similar messages except with s2idle instead of deep.

Reverting 5695e51619 ("Merge tag 'io_uring-worker.v3-2021-02-25' of
git://git.kernel.dk/linux-block") appears to resolve the issue. I have
not yet bisected further. Let me know which troubleshooting steps I
should perform next.

Thanks,
Alex.

* Re: 5.12-rc1 regression: freezing iou-mgr/wrk failed
From: Jens Axboe @ 2021-03-02 1:11 UTC
To: Alex Xu (Hello71), linux-kernel, linux-block

On 3/1/21 5:57 PM, Alex Xu (Hello71) wrote:
> Reverting 5695e51619 ("Merge tag 'io_uring-worker.v3-2021-02-25' of
> git://git.kernel.dk/linux-block") appears to resolve the issue. I have
> not yet bisected further. Let me know which troubleshooting steps I
> should perform next.

Can you try and pull in:

git://git.kernel.dk/linux-block io_uring-5.12

and see if that resolves it? I usually always run -git on my laptop as
well, but something broke it in the merge window so I need to figure
out what that is first...

What distro are you running?

-- 
Jens Axboe

* Re: 5.12-rc1 regression: freezing iou-mgr/wrk failed
From: Jens Axboe @ 2021-03-02 1:25 UTC
To: Alex Xu (Hello71), linux-kernel, linux-block

On 3/1/21 6:11 PM, Jens Axboe wrote:
> Can you try and pull in:
>
> git://git.kernel.dk/linux-block io_uring-5.12
>
> and see if that resolves it? I usually always run -git on my laptop as
> well, but something broke it in the merge window so I need to figure
> out what that is first...
>
> What distro are you running?

You probably want this on top...

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 1fdb2b621b51..a763e1b09073 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -567,7 +567,7 @@ static int task_thread(void *data, int index)
 	worker->task = current;
 
 	set_cpus_allowed_ptr(current, cpumask_of_node(wqe->node));
-	current->flags |= PF_NO_SETAFFINITY;
+	current->flags |= PF_NO_SETAFFINITY | PF_NOFREEZE;
 
 	raw_spin_lock_irq(&wqe->lock);
 	hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
@@ -722,7 +722,7 @@ static int io_wq_manager(void *data)
 
 	sprintf(buf, "iou-mgr-%d", wq->task_pid);
 	set_task_comm(current, buf);
-	current->flags |= PF_IO_WORKER;
+	current->flags |= PF_IO_WORKER | PF_NOFREEZE;
 	wq->manager = get_task_struct(current);
 
 	complete(&wq->started);
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2757675ab417..e7aaf56b4dea 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6679,6 +6685,7 @@ static int io_sq_thread(void *data)
 	set_task_comm(current, buf);
 	sqd->thread = current;
 	current->pf_io_worker = NULL;
+	current->flags |= PF_NOFREEZE;
 
 	if (sqd->sq_cpu != -1)
 		set_cpus_allowed_ptr(current, cpumask_of(sqd->sq_cpu));

-- 
Jens Axboe

* Re: 5.12-rc1 regression: freezing iou-mgr/wrk failed
From: Jens Axboe @ 2021-03-02 1:35 UTC
To: Alex Xu (Hello71), linux-kernel, linux-block

On 3/1/21 6:25 PM, Jens Axboe wrote:
> You probably want this on top...

And if you've verified that that one works OK, can you try this variant
instead?

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 1fdb2b621b51..fe004cf93c4b 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -16,6 +16,7 @@
 #include <linux/rculist_nulls.h>
 #include <linux/cpu.h>
 #include <linux/tracehook.h>
+#include <linux/freezer.h>
 
 #include "../kernel/sched/sched.h"
 #include "io-wq.h"
@@ -480,6 +481,7 @@ static int io_wqe_worker(void *data)
 	io_worker_start(worker);
 
 	while (!test_bit(IO_WQ_BIT_EXIT, &wq->state)) {
+		try_to_freeze();
 		set_current_state(TASK_INTERRUPTIBLE);
 loop:
 		raw_spin_lock_irq(&wqe->lock);
@@ -731,6 +733,7 @@ static int io_wq_manager(void *data)
 		set_current_state(TASK_INTERRUPTIBLE);
 		io_wq_check_workers(wq);
 		schedule_timeout(HZ);
+		try_to_freeze();
 		if (fatal_signal_pending(current))
 			set_bit(IO_WQ_BIT_EXIT, &wq->state);
 	} while (!test_bit(IO_WQ_BIT_EXIT, &wq->state));
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2757675ab417..03c42f1f9862 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -74,13 +74,11 @@
 #include <linux/fsnotify.h>
 #include <linux/fadvise.h>
 #include <linux/eventpoll.h>
-#include <linux/fs_struct.h>
 #include <linux/splice.h>
 #include <linux/task_work.h>
 #include <linux/pagemap.h>
 #include <linux/io_uring.h>
-#include <linux/blk-cgroup.h>
-#include <linux/audit.h>
+#include <linux/freezer.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/io_uring.h>
@@ -6744,6 +6748,7 @@ static int io_sq_thread(void *data)
 				io_ring_set_wakeup_flag(ctx);
 
 			schedule();
+			try_to_freeze();
 			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
 				io_ring_clear_wakeup_flag(ctx);
 		}

-- 
Jens Axboe

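The two patches in this thread take different routes: the first sets PF_NOFREEZE on the io_uring threads so the freezer does not wait for them, while the second has each thread call try_to_freeze() from its main loop. As a rough illustration of why either one would clear the "tasks refusing to freeze" report, here is a minimal userspace model. It is only a sketch under simplifying assumptions (a freezer that polls for a fixed number of rounds, one loop pass per round); `struct task`, `TASK_NOFREEZE`, `model_try_to_freeze()`, `run_once()` and `freeze_tasks()` are hypothetical names for this example, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define TASK_NOFREEZE 0x1u         /* stands in for PF_NOFREEZE */

struct task {
	const char *name;
	unsigned int flags;
	bool calls_try_to_freeze;  /* does the thread loop reach a freeze point? */
	bool frozen;
};

static bool freezing;              /* true while a freeze is being attempted */

/* Analogue of try_to_freeze(): park the caller if a freeze is in progress. */
void model_try_to_freeze(struct task *t)
{
	if (freezing && !(t->flags & TASK_NOFREEZE))
		t->frozen = true;
}

/* One pass through a worker loop; a frozen task does not run. */
void run_once(struct task *t)
{
	if (t->frozen)
		return;
	if (t->calls_try_to_freeze)
		model_try_to_freeze(t);
	/* A loop without a freeze point simply goes back to sleep here. */
}

/*
 * Analogue of the freezer: poll for up to max_rounds and return how many
 * tasks are still refusing to freeze. Tasks flagged TASK_NOFREEZE are not
 * counted. On failure the attempt is abandoned and frozen tasks are thawed.
 */
int freeze_tasks(struct task *tasks, size_t n, int max_rounds)
{
	int todo = 0;

	assert(tasks != NULL && max_rounds > 0);
	freezing = true;
	for (int round = 0; round < max_rounds; round++) {
		todo = 0;
		for (size_t i = 0; i < n; i++) {
			run_once(&tasks[i]);
			if (!(tasks[i].flags & TASK_NOFREEZE) && !tasks[i].frozen)
				todo++;
		}
		if (todo == 0)
			break;
	}
	if (todo != 0) {
		freezing = false;
		for (size_t i = 0; i < n; i++)
			tasks[i].frozen = false;
	}
	return todo;
}
```

In this model a task with neither the flag nor a freeze point keeps freeze_tasks() returning a non-zero count, which loosely corresponds to the failure in the report. Setting TASK_NOFREEZE makes the freezer ignore the task, and setting calls_try_to_freeze lets the task park itself. The real freezer also handles signals, wakeups and a wall-clock timeout, none of which is modelled here.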
* Re: 5.12-rc1 regression: freezing iou-mgr/wrk failed
From: Alex Xu (Hello71) @ 2021-03-02 22:13 UTC
To: Jens Axboe
Cc: linux-block, linux-kernel

I tried 29be7fc03d ("io_uring: ensure that threads freeze on suspend")
and it seems to work OK. The system suspends fine and no errors are
printed to the kernel log.

I am using Gentoo on the machine in question.

I didn't test the other patches you supplied. Let me know if there's
anything you would like me to test.

Thanks,
Alex.

* Re: 5.12-rc1 regression: freezing iou-mgr/wrk failed
From: Jens Axboe @ 2021-03-02 22:31 UTC
To: Alex Xu (Hello71)
Cc: linux-block, linux-kernel

On 3/2/21 3:13 PM, Alex Xu (Hello71) wrote:
> I tried 29be7fc03d ("io_uring: ensure that threads freeze on suspend")
> and it seems to work OK. The system suspends fine and no errors are
> printed to the kernel log.
>
> I am using Gentoo on the machine in question.
>
> I didn't test the other patches you supplied. Let me know if there's
> anything you would like me to test.

OK great, thanks. I'll add your reported/tested-by to the patch.

-- 
Jens Axboe