* 5.12-rc1 regression: freezing iou-mgr/wrk failed
From: Alex Xu (Hello71) @ 2021-03-02 0:57 UTC
To: linux-kernel, linux-block, Jens Axboe

Hi,

On Linux 5.12-rc1, I am unable to suspend to RAM. The system freezes for
about 40 seconds and then continues operation. The following messages
are printed to the kernel log:
[ 240.650300] PM: suspend entry (deep)
[ 240.650748] Filesystems sync: 0.000 seconds
[ 240.725605] Freezing user space processes ...
[ 260.739483] Freezing of tasks failed after 20.013 seconds (3 tasks refusing to freeze, wq_busy=0):
[ 260.739497] task:iou-mgr-446 state:S stack: 0 pid: 516 ppid: 439 flags:0x00004224
[ 260.739504] Call Trace:
[ 260.739507] ? sysvec_apic_timer_interrupt+0xb/0x81
[ 260.739515] ? pick_next_task_fair+0x197/0x1cde
[ 260.739519] ? sysvec_reschedule_ipi+0x2f/0x6a
[ 260.739522] ? asm_sysvec_reschedule_ipi+0x12/0x20
[ 260.739525] ? __schedule+0x57/0x6d6
[ 260.739529] ? del_timer_sync+0xb9/0x115
[ 260.739533] ? schedule+0x63/0xd5
[ 260.739536] ? schedule_timeout+0x219/0x356
[ 260.739540] ? __next_timer_interrupt+0xf1/0xf1
[ 260.739544] ? io_wq_manager+0x73/0xb1
[ 260.739549] ? io_wq_create+0x262/0x262
[ 260.739553] ? ret_from_fork+0x22/0x30
[ 260.739557] task:iou-mgr-517 state:S stack: 0 pid: 522 ppid: 439 flags:0x00004224
[ 260.739561] Call Trace:
[ 260.739563] ? sysvec_apic_timer_interrupt+0xb/0x81
[ 260.739566] ? pick_next_task_fair+0x16f/0x1cde
[ 260.739569] ? sysvec_apic_timer_interrupt+0xb/0x81
[ 260.739571] ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 260.739574] ? __schedule+0x5b7/0x6d6
[ 260.739578] ? del_timer_sync+0x70/0x115
[ 260.739581] ? schedule_timeout+0x211/0x356
[ 260.739585] ? __next_timer_interrupt+0xf1/0xf1
[ 260.739588] ? io_wq_check_workers+0x15/0x11f
[ 260.739592] ? io_wq_manager+0x69/0xb1
[ 260.739596] ? io_wq_create+0x262/0x262
[ 260.739600] ? ret_from_fork+0x22/0x30
[ 260.739603] task:iou-wrk-517 state:S stack: 0 pid: 523 ppid: 439 flags:0x00004224
[ 260.739607] Call Trace:
[ 260.739609] ? __schedule+0x5b7/0x6d6
[ 260.739614] ? schedule+0x63/0xd5
[ 260.739617] ? schedule_timeout+0x219/0x356
[ 260.739621] ? __next_timer_interrupt+0xf1/0xf1
[ 260.739624] ? task_thread.isra.0+0x148/0x3af
[ 260.739628] ? task_thread_unbound+0xa/0xa
[ 260.739632] ? task_thread_bound+0x7/0x7
[ 260.739636] ? ret_from_fork+0x22/0x30
[ 260.739647] OOM killer enabled.
[ 260.739648] Restarting tasks ... done.
[ 260.740077] PM: suspend exit
and then a set of similar messages except with s2idle instead of deep.

Reverting 5695e51619 ("Merge tag 'io_uring-worker.v3-2021-02-25' of
git://git.kernel.dk/linux-block") appears to resolve the issue. I have
not yet bisected further. Let me know which troubleshooting steps I
should perform next.

Thanks,
Alex.
* Re: 5.12-rc1 regression: freezing iou-mgr/wrk failed
From: Jens Axboe @ 2021-03-02 1:11 UTC
To: Alex Xu (Hello71), linux-kernel, linux-block
On 3/1/21 5:57 PM, Alex Xu (Hello71) wrote:
> Hi,
>
> On Linux 5.12-rc1, I am unable to suspend to RAM. The system freezes for
> about 40 seconds and then continues operation. The following messages
> are printed to the kernel log:
>
> [...]
>
> and then a set of similar messages except with s2idle instead of deep.
>
> Reverting 5695e51619 ("Merge tag 'io_uring-worker.v3-2021-02-25' of
> git://git.kernel.dk/linux-block") appears to resolve the issue. I have
> not yet bisected further. Let me know which troubleshooting steps I
> should perform next.
Can you try and pull in:

git://git.kernel.dk/linux-block io_uring-5.12

and see if that resolves it? I usually always run -git on my laptop as
well, but something broke it in the merge window so I need to figure
out what that is first...

What distro are you running?

--
Jens Axboe
* Re: 5.12-rc1 regression: freezing iou-mgr/wrk failed
From: Jens Axboe @ 2021-03-02 1:25 UTC
To: Alex Xu (Hello71), linux-kernel, linux-block
On 3/1/21 6:11 PM, Jens Axboe wrote:
> Can you try and pull in:
>
> git://git.kernel.dk/linux-block io_uring-5.12
>
> and see if that resolves it? I usually always run -git on my laptop as
> well, but something broke it in the merge window so I need to figure
> out what that is first...
>
> What distro are you running?
You probably want this on top...
diff --git a/fs/io-wq.c b/fs/io-wq.c
index 1fdb2b621b51..a763e1b09073 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -567,7 +567,7 @@ static int task_thread(void *data, int index)
worker->task = current;
set_cpus_allowed_ptr(current, cpumask_of_node(wqe->node));
- current->flags |= PF_NO_SETAFFINITY;
+ current->flags |= PF_NO_SETAFFINITY | PF_NOFREEZE;
raw_spin_lock_irq(&wqe->lock);
hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
@@ -722,7 +722,7 @@ static int io_wq_manager(void *data)
sprintf(buf, "iou-mgr-%d", wq->task_pid);
set_task_comm(current, buf);
- current->flags |= PF_IO_WORKER;
+ current->flags |= PF_IO_WORKER | PF_NOFREEZE;
wq->manager = get_task_struct(current);
complete(&wq->started);
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2757675ab417..e7aaf56b4dea 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6679,6 +6685,7 @@ static int io_sq_thread(void *data)
set_task_comm(current, buf);
sqd->thread = current;
current->pf_io_worker = NULL;
+ current->flags |= PF_NOFREEZE;
if (sqd->sq_cpu != -1)
set_cpus_allowed_ptr(current, cpumask_of(sqd->sq_cpu));
--
Jens Axboe
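For readers following the thread, the effect of the PF_NOFREEZE patch above can be pictured with a small userspace model. This is only a sketch under stated assumptions: the struct, the MODEL_* flag values, and the function names are all hypothetical and are not kernel code. It assumes the freezer skips any task that carries a no-freeze flag and counts every other task as "refusing to freeze" until that task has parked itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits for this model only; not copied from kernel headers. */
#define MODEL_PF_IO_WORKER 0x1u
#define MODEL_PF_NOFREEZE  0x2u

/* Hypothetical stand-in for the few task fields the model needs. */
struct model_task {
    const char *comm;     /* thread name, e.g. "iou-mgr-446" */
    unsigned int flags;   /* MODEL_PF_* bits */
    bool frozen;          /* true once the task has parked itself */
};

/* A task "refuses" when the freezer expects it to stop but it has not. */
static bool model_refuses_to_freeze(const struct model_task *t)
{
    if (t->flags & MODEL_PF_NOFREEZE)
        return false;     /* exempt: the freezer does not wait for it */
    return !t->frozen;
}

/* Count the tasks that would show up in a "tasks refusing to freeze" report. */
static size_t model_count_refusing(const struct model_task *tasks, size_t n)
{
    size_t refusing = 0;

    assert(tasks != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (model_refuses_to_freeze(&tasks[i]))
            refusing++;
    }
    return refusing;
}
```

In this model, three unflagged io threads that never park are all counted as refusing, which mirrors the "3 tasks refusing to freeze" line in the report. Setting the no-freeze bit on each brings the count to zero without the threads ever stopping.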
* Re: 5.12-rc1 regression: freezing iou-mgr/wrk failed
From: Jens Axboe @ 2021-03-02 1:35 UTC
To: Alex Xu (Hello71), linux-kernel, linux-block
On 3/1/21 6:25 PM, Jens Axboe wrote:
> You probably want this on top...
And if you've verified that that one works OK, can you try this variant
instead?
diff --git a/fs/io-wq.c b/fs/io-wq.c
index 1fdb2b621b51..fe004cf93c4b 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -16,6 +16,7 @@
#include <linux/rculist_nulls.h>
#include <linux/cpu.h>
#include <linux/tracehook.h>
+#include <linux/freezer.h>
#include "../kernel/sched/sched.h"
#include "io-wq.h"
@@ -480,6 +481,7 @@ static int io_wqe_worker(void *data)
io_worker_start(worker);
while (!test_bit(IO_WQ_BIT_EXIT, &wq->state)) {
+ try_to_freeze();
set_current_state(TASK_INTERRUPTIBLE);
loop:
raw_spin_lock_irq(&wqe->lock);
@@ -731,6 +733,7 @@ static int io_wq_manager(void *data)
set_current_state(TASK_INTERRUPTIBLE);
io_wq_check_workers(wq);
schedule_timeout(HZ);
+ try_to_freeze();
if (fatal_signal_pending(current))
set_bit(IO_WQ_BIT_EXIT, &wq->state);
} while (!test_bit(IO_WQ_BIT_EXIT, &wq->state));
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2757675ab417..03c42f1f9862 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -74,13 +74,11 @@
#include <linux/fsnotify.h>
#include <linux/fadvise.h>
#include <linux/eventpoll.h>
-#include <linux/fs_struct.h>
#include <linux/splice.h>
#include <linux/task_work.h>
#include <linux/pagemap.h>
#include <linux/io_uring.h>
-#include <linux/blk-cgroup.h>
-#include <linux/audit.h>
+#include <linux/freezer.h>
#define CREATE_TRACE_POINTS
#include <trace/events/io_uring.h>
@@ -6744,6 +6748,7 @@ static int io_sq_thread(void *data)
io_ring_set_wakeup_flag(ctx);
schedule();
+ try_to_freeze();
list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
io_ring_clear_wakeup_flag(ctx);
}
--
Jens Axboe
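The try_to_freeze() variant above takes the cooperative route: each thread checks for a pending freeze at a known point in its loop and parks itself there. A minimal single-threaded sketch of that idea follows. All names here (model_worker, model_try_to_freeze, model_freeze) are hypothetical, and the model only records that the worker parked, whereas the real try_to_freeze() blocks the caller until the system is thawed. The iteration limit stands in for the freezer's timeout.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical worker state for the model. */
struct model_worker {
    bool freeze_requested;     /* set by the model freezer */
    bool frozen;               /* set once the worker has parked */
    bool calls_try_to_freeze;  /* does the loop contain a freeze check? */
    int work_done;             /* loop iterations that did ordinary work */
};

/* Park the worker if a freeze is pending; returns true if it parked. */
static bool model_try_to_freeze(struct model_worker *w)
{
    if (!w->freeze_requested)
        return false;
    w->frozen = true;
    return true;
}

/* One pass through the worker loop. */
static void model_worker_iteration(struct model_worker *w)
{
    if (w->calls_try_to_freeze && model_try_to_freeze(w))
        return;                /* parked: no further work this pass */
    w->work_done++;            /* stand-in for sleeping or handling work */
}

/* Request a freeze and give the worker max_iterations passes to park. */
static bool model_freeze(struct model_worker *w, int max_iterations)
{
    w->freeze_requested = true;
    for (int i = 0; i < max_iterations && !w->frozen; i++)
        model_worker_iteration(w);
    return w->frozen;          /* false is the model's "freezing failed" */
}
```

A worker whose loop lacks the check keeps running until the limit is reached and the freeze attempt fails, which is the model's analogue of the timeout in the original report. With the check in place, the worker parks on its first pass.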
* Re: 5.12-rc1 regression: freezing iou-mgr/wrk failed
From: Alex Xu (Hello71) @ 2021-03-02 22:13 UTC
To: Jens Axboe; +Cc: linux-block, linux-kernel
I tried 29be7fc03d ("io_uring: ensure that threads freeze on suspend")
and it seems to work OK. The system suspends fine and no errors are
printed to the kernel log.

I am using Gentoo on the machine in question.

I didn't test the other patches you supplied. Let me know if there's
anything you would like me to test.

Thanks,
Alex.
* Re: 5.12-rc1 regression: freezing iou-mgr/wrk failed
From: Jens Axboe @ 2021-03-02 22:31 UTC
To: Alex Xu (Hello71); +Cc: linux-block, linux-kernel
On 3/2/21 3:13 PM, Alex Xu (Hello71) wrote:
> I tried 29be7fc03d ("io_uring: ensure that threads freeze on suspend")
> and it seems to work OK. The system suspends fine and no errors are
> printed to the kernel log.
>
> I am using Gentoo on the machine in question.
>
> I didn't test the other patches you supplied. Let me know if there's
> anything you would like me to test.
OK great, thanks. I'll add your reported/tested-by to the patch.
--
Jens Axboe