* problems with 4.14.6-rt7
From: Tim Sander @ 2018-01-08 14:56 UTC
To: linux-rt-users; +Cc: Sebastian Andrzej Siewior
Hi
I am currently using 4.14.6-rt7 on a De0-Nano-Soc (Intel/Altera ARMv7 dual core)
board. I have added a small driver which adds some functionality like IIO. As soon
as I have some realtime load on one core I see messages like:
Showing busy workqueues and worker pools:
workqueue events_freezable: flags=0x4
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
pending: mmc_rescan
workqueue events_power_efficient: flags=0x80
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
pending: phy_state_machine, neigh_periodic_work
workqueue mm_percpu_wq: flags=0x8
pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
pending: vmstat_update
BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 68s!
Showing busy workqueues and worker pools:
workqueue mm_percpu_wq: flags=0x8
pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
pending: vmstat_update
BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 98s!
Showing busy workqueues and worker pools:
workqueue mm_percpu_wq: flags=0x8
pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
pending: vmstat_update
On other occasions I have seen complete system lockups where the only message I get is:
INFO: task systemd:1 blocked for more than 120 seconds.
Tainted: G O 4.14.6-rt7 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
systemd D 0 1 0 0x00001000
[<80793c70>] (__schedule) from [<80793f30>] (schedule+0x64/0xfc)
[<80793f30>] (schedule) from [<80796940>] (__write_rt_lock+0x13c/0x214)
[<80796940>] (__write_rt_lock) from [<80798464>] (rt_write_lock+0x24/0x28)
[<80798464>] (rt_write_lock) from [<8011f4b4>] (copy_process.part.5+0xfc4/0x17d8)
[<8011f4b4>] (copy_process.part.5) from [<8011fe58>] (_do_fork+0xc8/0x474)
[<8011fe58>] (_do_fork) from [<80120320>] (SyS_clone+0x30/0x38)
[<80120320>] (SyS_clone) from [<80107e60>] (ret_fast_syscall+0x0/0x28)
INFO: task systemd:1 blocked for more than 120 seconds.
I know that I am really using up the entire realtime budget on one core of this
system, but shouldn't RT throttling kick in and give these starving processes
some time?
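For reference, the budget that RT throttling enforces is controlled by the
sysctls sched_rt_period_us and sched_rt_runtime_us under /proc/sys/kernel.
The following is only a minimal Python sketch of the arithmetic. The helper
names are hypothetical, and the values 1000000 and 950000 in the usage note
are the usual kernel defaults, which is an assumption here because the thread
does not state what this board is configured with.

```python
from pathlib import Path
from typing import Tuple


def non_rt_share(period_us: int, runtime_us: int) -> float:
    """Fraction of each period left for non-RT tasks when throttling applies."""
    if period_us <= 0:
        raise ValueError("period must be positive")
    if runtime_us < 0:
        # A runtime of -1 disables throttling: RT tasks may use the whole period.
        return 0.0
    return max(0.0, (period_us - runtime_us) / period_us)


def read_rt_budget(proc_dir: str = "/proc/sys/kernel") -> Tuple[int, int]:
    """Read (period_us, runtime_us) from the running kernel."""
    base = Path(proc_dir)
    period = int((base / "sched_rt_period_us").read_text())
    runtime = int((base / "sched_rt_runtime_us").read_text())
    return period, runtime
```

With the assumed defaults, non_rt_share(1000000, 950000) gives roughly 0.05,
i.e. about 50 ms per second would be reserved for non-RT tasks if throttling
were active.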
Best regards
Tim
* Re: problems with 4.14.6-rt7
From: Sebastian Andrzej Siewior @ 2018-01-24 14:18 UTC
To: Tim Sander; +Cc: linux-rt-users
On 2018-01-08 15:56:17 [+0100], Tim Sander wrote:
> Hi
Hi,
> I am currently using 4.14.6-rt7 on a De0-Nano-Soc (Intel/Altera ARMv7 dual core)
> board. I have added a small driver which adds some functionality like IIO. As soon
> as I have some realtime load on one core I see messages like:
…
> I know that I am really using up the entire realtime budget on one core of this
> system, but shouldn't RT throttling kick in and give these starving processes
> some time?
You should see something like
sched: RT throttling activated
once the RT tasks are throttled. Do you?
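One way to check a captured kernel log for that message is sketched below in
Python. This is only an illustration: the function name is hypothetical, and
it assumes the log has already been saved as text (for example from dmesg).

```python
THROTTLE_MSG = "sched: RT throttling activated"


def rt_throttling_seen(log_text: str) -> bool:
    """Return True if any line of the log contains the throttling message."""
    return any(THROTTLE_MSG in line for line in log_text.splitlines())
```

Applied to the log excerpts quoted in this thread, it would return False,
since they contain only the workqueue lockup reports.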
> Best regards
> Tim
Sebastian
* Re: problems with 4.14.6-rt7
From: Tim Sander @ 2018-01-26 16:01 UTC
To: Sebastian Andrzej Siewior; +Cc: linux-rt-users
Hi Sebastian
On Wednesday, 24 January 2018, 15:18:50 CET, Sebastian Andrzej Siewior wrote:
> On 2018-01-08 15:56:17 [+0100], Tim Sander wrote:
> > I am currently using 4.14.6-rt7 on a De0-Nano-Soc (Intel/Altera ARMv7
> > dual core) board. I have added a small driver which adds some
> > functionality like IIO. As soon
> > as I have some realtime load on one core I see messages like:
> …
>
> > I know that I am really using up the entire realtime budget on one core of
> > this system, but shouldn't RT throttling kick in and give these starving
> > processes some time?
>
> You should see something like
> sched: RT throttling activated
>
> once the RT tasks are throttled. Do you?
I just double-checked with your 4.14.15-rt12 release. I have not seen
any throttling messages. (The double "cut here" line is not a copy error of mine, but
verbatim kernel output.)
Here is the dmesg output of a testrun:
BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 56s!
Showing busy workqueues and worker pools:
workqueue mm_percpu_wq: flags=0x8
pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
pending: vmstat_update
------------[ cut here ]------------
------------[ cut here ]------------
WARNING: CPU: 1 PID: 215 at kernel/workqueue.c:1699 worker_thread+0x390/0x614
Modules linked in: serdes(O)
CPU: 1 PID: 215 Comm: kworker/0:2 Tainted: G O 4.14.15-rt12 #1
Hardware name: Altera SOCFPGA
[<801116d0>] (unwind_backtrace) from [<8010cae4>] (show_stack+0x20/0x24)
[<8010cae4>] (show_stack) from [<8077ef10>] (dump_stack+0x90/0xa4)
[<8077ef10>] (dump_stack) from [<80120e9c>] (__warn+0xf8/0x110)
[<80120e9c>] (__warn) from [<80120f84>] (warn_slowpath_null+0x30/0x38)
[<80120f84>] (warn_slowpath_null) from [<8013d76c>] (worker_thread+0x390/0x614)
[<8013d76c>] (worker_thread) from [<801437fc>] (kthread+0x13c/0x16c)
[<801437fc>] (kthread) from [<80107f1c>] (ret_from_fork+0x14/0x38)
---[ end trace 0000000000000002 ]---
WARNING: CPU: 0 PID: 3 at kernel/kthread.c:370 __kthread_bind_mask+0x74/0x78
Modules linked in: serdes(O)
CPU: 0 PID: 3 Comm: kworker/0:0 Tainted: G W O 4.14.15-rt12 #1
Hardware name: Altera SOCFPGA
[<801116d0>] (unwind_backtrace) from [<8010cae4>] (show_stack+0x20/0x24)
[<8010cae4>] (show_stack) from [<8077ef10>] (dump_stack+0x90/0xa4)
[<8077ef10>] (dump_stack) from [<80120e9c>] (__warn+0xf8/0x110)
[<80120e9c>] (__warn) from [<80120f84>] (warn_slowpath_null+0x30/0x38)
[<80120f84>] (warn_slowpath_null) from [<80143ee4>] (__kthread_bind_mask+0x74/0x78)
[<80143ee4>] (__kthread_bind_mask) from [<80144680>] (kthread_bind_mask+0x1c/0x20)
[<80144680>] (kthread_bind_mask) from [<8013ac4c>] (create_worker+0xec/0x180)
[<8013ac4c>] (create_worker) from [<8013d7d0>] (worker_thread+0x3f4/0x614)
[<8013d7d0>] (worker_thread) from [<801437fc>] (kthread+0x13c/0x16c)
[<801437fc>] (kthread) from [<80107f1c>] (ret_from_fork+0x14/0x38)
---[ end trace 0000000000000003 ]---
------------[ cut here ]------------
WARNING: CPU: 0 PID: 3 at kernel/workqueue.c:1657 worker_enter_idle+0x168/0x1d8
Modules linked in: serdes(O)
CPU: 0 PID: 3 Comm: kworker/0:0 Tainted: G W O 4.14.15-rt12 #1
Hardware name: Altera SOCFPGA
[<801116d0>] (unwind_backtrace) from [<8010cae4>] (show_stack+0x20/0x24)
[<8010cae4>] (show_stack) from [<8077ef10>] (dump_stack+0x90/0xa4)
[<8077ef10>] (dump_stack) from [<80120e9c>] (__warn+0xf8/0x110)
[<80120e9c>] (__warn) from [<80120f84>] (warn_slowpath_null+0x30/0x38)
[<80120f84>] (warn_slowpath_null) from [<80139964>] (worker_enter_idle+0x168/0x1d8)
[<80139964>] (worker_enter_idle) from [<8013ac78>] (create_worker+0x118/0x180)
[<8013ac78>] (create_worker) from [<8013d7d0>] (worker_thread+0x3f4/0x614)
[<8013d7d0>] (worker_thread) from [<801437fc>] (kthread+0x13c/0x16c)
[<801437fc>] (kthread) from [<80107f1c>] (ret_from_fork+0x14/0x38)
---[ end trace 0000000000000004 ]---
Best regards
Tim
* Re: problems with 4.14.6-rt7
From: Sebastian Andrzej Siewior @ 2018-02-20 18:48 UTC
To: Tim Sander; +Cc: linux-rt-users
On 2018-01-26 17:01:09 [+0100], Tim Sander wrote:
> Hi Sebastian
Hi Tim,
> I just double-checked with your 4.14.15-rt12 release. I have not seen
> any throttling messages. (The double "cut here" line is not a copy error of mine, but
> verbatim kernel output.)
>
> Here is the dmesg output of a testrun:
>
> BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 56s!
> Showing busy workqueues and worker pools:
> workqueue mm_percpu_wq: flags=0x8
> pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
> pending: vmstat_update
I don't see all of this, and I don't see the lockup detector trigger unless I
boot UP. This bothers me a little, let me look into why…
Sebastian
* Re: problems with 4.14.6-rt7
From: Sebastian Andrzej Siewior @ 2018-02-20 19:36 UTC
To: Tim Sander; +Cc: linux-rt-users, Peter Zijlstra, tglx
On 2018-02-20 19:48:06 [+0100], Sebastian Andrzej Siewior wrote:
> On 2018-01-26 17:01:09 [+0100], Tim Sander wrote:
> > Hi Sebastian
Hi Tim,
> > I just double-checked with your 4.14.15-rt12 release. I have not seen
> > any throttling messages. (The double "cut here" line is not a copy error of mine, but
> > verbatim kernel output.)
> >
> > Here is the dmesg output of a testrun:
> >
> > BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 56s!
> > Showing busy workqueues and worker pools:
> > workqueue mm_percpu_wq: flags=0x8
> > pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
> > pending: vmstat_update
>
> I don't see all of this, and I don't see the lockup detector trigger unless I
> boot UP. This bothers me a little, let me look into why…
As I just learned (or have been told): RT_RUNTIME_SHARE on SMP disables
the RT throttling if an RT thread goes mad. This can be undone via
/sys/kernel/debug/sched_features.
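As a rough illustration, the Python sketch below checks and clears a scheduler
feature through that file. The helper names are hypothetical. It assumes
debugfs is mounted at /sys/kernel/debug, that the caller is root, and that the
file lists features separated by whitespace with disabled ones prefixed by
NO_, which is how the kernel presents them.

```python
from pathlib import Path

SCHED_FEATURES = Path("/sys/kernel/debug/sched_features")


def feature_enabled(features_text: str, name: str) -> bool:
    """Report whether a feature is on, given the contents of sched_features."""
    tokens = features_text.split()
    if name in tokens:
        return True
    if "NO_" + name in tokens:
        return False
    raise KeyError(name)


def disable_feature(name: str, path: Path = SCHED_FEATURES) -> None:
    """Write NO_<name> to the features file, like echo NO_<name> > file."""
    path.write_text("NO_" + name + "\n")
```

For the case discussed here, disable_feature("RT_RUNTIME_SHARE") followed by
feature_enabled(SCHED_FEATURES.read_text(), "RT_RUNTIME_SHARE") should then
report False.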
Without throttling, the workqueue code never gets on the CPU and
complains that it is stuck. With throttling it wouldn't happen. However,
throttling stalls everything in the RT class, which means threaded
interrupts are not processed either, so a UP system remains "dead" (you
can't access it via network/serial, most timers won't expire, …). The
benefit is limited.
Sebastian