* [PATCH] Force processes to non-realtime before mm_exit
@ 2016-05-10 18:04 Brian Silverman
  2016-05-12  8:59 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 11+ messages in thread
From: Brian Silverman @ 2016-05-10 18:04 UTC (permalink / raw)
  To: linux-rt-users; +Cc: Brian Silverman
Without this, a realtime process which has called mlockall exiting
causes large latencies for other realtime processes at the same or
lower priorities. This seems like a fairly common use case too, because
realtime processes generally want their memory locked into RAM.
Signed-off-by: Brian Silverman <brian@peloton-tech.com>
---
 kernel/exit.c | 6 ++++++
 1 file changed, 6 insertions(+)
diff --git a/kernel/exit.c b/kernel/exit.c
index a0cf72b..68a97df 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -730,6 +730,12 @@ void do_exit(long code)
 	tsk->exit_code = code;
 	taskstats_exit(tsk, group_dead);
 
+	if (tsk->policy == SCHED_FIFO || tsk->policy == SCHED_RR) {
+		struct sched_param param = { .sched_priority = 0 };
+
+		sched_setscheduler_nocheck(current, SCHED_NORMAL, ¶m);
+	}
+
 	exit_mm(tsk);
 
 	if (group_dead)
-- 
2.1.4
^ permalink raw reply related	[flat|nested] 11+ messages in thread
* Re: [PATCH] Force processes to non-realtime before mm_exit
  2016-05-10 18:04 [PATCH] Force processes to non-realtime before mm_exit Brian Silverman
@ 2016-05-12  8:59 ` Sebastian Andrzej Siewior
  2016-05-16 21:05   ` Brian Silverman
  0 siblings, 1 reply; 11+ messages in thread
From: Sebastian Andrzej Siewior @ 2016-05-12  8:59 UTC (permalink / raw)
  To: Brian Silverman; +Cc: linux-rt-users
* Brian Silverman | 2016-05-10 14:04:24 [-0400]:
>Without this, a realtime process which has called mlockall exiting
>causes large latencies for other realtime processes at the same or
>lower priorities. This seems like a fairly common use case too, because
>realtime processes generally want their memory locked into RAM.
Can please explain more detailed what exactly causes the high latencies?
Sebastian
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] Force processes to non-realtime before mm_exit
  2016-05-12  8:59 ` Sebastian Andrzej Siewior
@ 2016-05-16 21:05   ` Brian Silverman
  2016-05-25 16:33     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 11+ messages in thread
From: Brian Silverman @ 2016-05-16 21:05 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: rt-users
On Thu, May 12, 2016 at 1:59 AM, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> * Brian Silverman | 2016-05-10 14:04:24 [-0400]:
>
>>Without this, a realtime process which has called mlockall exiting
>>causes large latencies for other realtime processes at the same or
>>lower priorities. This seems like a fairly common use case too, because
>>realtime processes generally want their memory locked into RAM.
>
> Can please explain more detailed what exactly causes the high latencies?
>
It happens when realtime (SCHED_RR/SCHED_FIFO) processes which have
called mlockall(MCL_CURRENT | MCL_FUTURE) exit.
I can easily reproduce the problem on both an ARM machine with
4.1.18-rt17 and a Intel i7 with 3.14.43-rt42. `cyclictest -p 10 -m -t
-a -i 2000 -n` on both machines for a few minutes gives a <150us max
latency. However, when a program at SCHED_RR 10 which has been pinned
to a single core and called mlockall exits, that goes up to about 11ms
on the ARM machine and around 3ms on the i7. Not pinning also gives
larger latencies, but they aren't as consistent.
Using ftrace reveals that most of this time is being spent in exit_mm.
There are a couple of loops that take a long time, and if the process
is running at realtime priority other processes at the same priority
never get a chance to run. Here's single cycles of the longer loops
from ftrace on the 4.1.18-rt17 box:
One of them:
 queue_cyclictes-5298  [000] ....1..   290.366030: remove_vma <-exit_mmap
 queue_cyclictes-5298  [000] ....1..   290.366032: __might_sleep <-remove_vma
 queue_cyclictes-5298  [000] ....1..   290.366033: ___might_sleep
<-__might_sleep
 queue_cyclictes-5298  [000] ....1..   290.366035: kmem_cache_free <-remove_vma
 queue_cyclictes-5298  [000] ....1..   290.366036: preempt_count_add
<-kmem_cache_free
 queue_cyclictes-5298  [000] ....2..   290.366037: preempt_count_sub
<-kmem_cache_free
 queue_cyclictes-5298  [000] ....1..   290.366039: __slab_free <-kmem_cache_free
 queue_cyclictes-5298  [000] d...1..   290.366040: preempt_count_add
<-__slab_free
 queue_cyclictes-5298  [000] d...2..   290.366041: preempt_count_sub
<-__slab_free
 queue_cyclictes-5298  [000] ....1..   290.366043: kmem_cache_free:
call_site=c0147054 ptr=ebb93cb8
Another one:
 queue_cyclictes-5298  [000] ....1..   290.364272: free_hot_cold_page
<-free_hot_cold_page_list
 queue_cyclictes-5298  [000] ....1..   290.364273: free_pages_prepare
<-free_hot_cold_page
 queue_cyclictes-5298  [000] ....1..   290.364274: mm_page_free:
page=ef056f20 pfn=667513 order=0
 queue_cyclictes-5298  [000] ....1..   290.364275: page_address
<-free_pages_prepare
 queue_cyclictes-5298  [000] ....1..   290.364277: page_address
<-free_pages_prepare
 queue_cyclictes-5298  [000] ....1..   290.364278:
get_pfnblock_flags_mask <-free_hot_cold_page
 queue_cyclictes-5298  [000] ....1..   290.364279: migrate_disable
<-free_hot_cold_page
 queue_cyclictes-5298  [000] ....1..   290.364280: preempt_count_add
<-migrate_disable
 queue_cyclictes-5298  [000] ....21.   290.364281: pin_current_cpu
<-migrate_disable
 queue_cyclictes-5298  [000] ....211   290.364283: preempt_count_sub
<-migrate_disable
 queue_cyclictes-5298  [000] ....111   290.364284: migrate_disable
<-free_hot_cold_page
 queue_cyclictes-5298  [000] ....112   290.364286: rt_spin_lock
<-free_hot_cold_page
 queue_cyclictes-5298  [000] ....112   290.364287: ___might_sleep <-rt_spin_lock
 queue_cyclictes-5298  [000] ....112   290.364288: preempt_count_add
<-free_hot_cold_page
 queue_cyclictes-5298  [000] ....212   290.364289: preempt_count_sub
<-free_hot_cold_page
 queue_cyclictes-5298  [000] ....112   290.364291: rt_spin_unlock
<-free_hot_cold_page
 queue_cyclictes-5298  [000] ....112   290.364292: migrate_enable
<-free_hot_cold_page
 queue_cyclictes-5298  [000] ....111   290.364293: migrate_enable
<-free_hot_cold_page
 queue_cyclictes-5298  [000] ....111   290.364295: preempt_count_add
<-migrate_enable
 queue_cyclictes-5298  [000] ....21.   290.364296: unpin_current_cpu
<-migrate_enable
 queue_cyclictes-5298  [000] ....21.   290.364297: preempt_count_sub
<-migrate_enable
 queue_cyclictes-5298  [000] ....1..   290.364299:
mm_page_free_batched: page=ef056f40 pfn=667514 order=0 cold=0
Another one (this one loops in blocks interspersed with the previous one):
 queue_cyclictes-5298  [000] ....111   290.361572: mm_page_pcpu_drain:
page=ef7005a0 pfn=885805 order=0 migratetype=2
 queue_cyclictes-5298  [000] ....111   290.361574:
__mod_zone_page_state <-free_pcppages_bulk
 queue_cyclictes-5298  [000] ....111   290.361575: preempt_count_add
<-__mod_zone_page_state
 queue_cyclictes-5298  [000] ....211   290.361576: preempt_count_sub
<-__mod_zone_page_state
Yet another one (also in blocks interspersed with the previous two):
 queue_cyclictes-5298  [000] ....111   290.333553: page_remove_rmap
<-unmap_single_vma
 queue_cyclictes-5298  [000] ....111   290.333555:
__mod_zone_page_state <-page_remove_rmap
 queue_cyclictes-5298  [000] ....111   290.333556: preempt_count_add
<-__mod_zone_page_state
 queue_cyclictes-5298  [000] ....211   290.333557: preempt_count_sub
<-__mod_zone_page_state
This is a simple program which creates the problem when run:
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <errno.h>
#define CHECK_SYSCALL(syscall)                                                 \
  do {                                                                         \
    if ((syscall) == -1) {                                                     \
      fprintf(stderr, "syscall failed with %d: %s\n", errno, strerror(errno)); \
      return 1;                                                                \
    }                                                                          \
  } while (0)
int main() {
  CHECK_SYSCALL(mlockall(MCL_CURRENT | MCL_FUTURE));
  {
    struct sched_param param;
    memset(¶m, 0, sizeof(param));
    param.sched_priority = 11;
    CHECK_SYSCALL(sched_setscheduler(0, SCHED_RR, ¶m));
  }
  {
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(0, &cpuset);
    CHECK_SYSCALL(sched_setaffinity(0, sizeof(cpuset), &cpuset));
  }
  return 0;
}
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] Force processes to non-realtime before mm_exit
  2016-05-16 21:05   ` Brian Silverman
@ 2016-05-25 16:33     ` Sebastian Andrzej Siewior
  2016-05-25 18:00       ` Brian Silverman
  0 siblings, 1 reply; 11+ messages in thread
From: Sebastian Andrzej Siewior @ 2016-05-25 16:33 UTC (permalink / raw)
  To: Brian Silverman; +Cc: rt-users
* Brian Silverman | 2016-05-16 14:05:54 [-0700]:
Thank you for the explanation.
>It happens when realtime (SCHED_RR/SCHED_FIFO) processes which have
>called mlockall(MCL_CURRENT | MCL_FUTURE) exit.
Why can't the application drop the RT priority before its exit? Wouldn't
that be appropriate?
>One of them:
> queue_cyclictes-5298  [000] ....1..   290.366030: remove_vma <-exit_mmap
> queue_cyclictes-5298  [000] ....1..   290.366032: __might_sleep <-remove_vma
> queue_cyclictes-5298  [000] ....1..   290.366033: ___might_sleep
><-__might_sleep
> queue_cyclictes-5298  [000] ....1..   290.366035: kmem_cache_free <-remove_vma
> queue_cyclictes-5298  [000] ....1..   290.366036: preempt_count_add
><-kmem_cache_free
> queue_cyclictes-5298  [000] ....2..   290.366037: preempt_count_sub
><-kmem_cache_free
> queue_cyclictes-5298  [000] ....1..   290.366039: __slab_free <-kmem_cache_free
> queue_cyclictes-5298  [000] d...1..   290.366040: preempt_count_add
><-__slab_free
> queue_cyclictes-5298  [000] d...2..   290.366041: preempt_count_sub
><-__slab_free
> queue_cyclictes-5298  [000] ....1..   290.366043: kmem_cache_free:
>call_site=c0147054 ptr=ebb93cb8
In this trace and the others you attached, you have a preempt count of
at least one. Is this a function trace? For events I recently figure out
that the counter in event tracing is recorded wrongly [0]. So my
question here is basically is this a long preempt disable section or
not.
[0] http://permalink.gmane.org/gmane.linux.kernel/2227774
Sebastian
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] Force processes to non-realtime before mm_exit
  2016-05-25 16:33     ` Sebastian Andrzej Siewior
@ 2016-05-25 18:00       ` Brian Silverman
  2016-05-25 19:54         ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 11+ messages in thread
From: Brian Silverman @ 2016-05-25 18:00 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: rt-users
Thanks for looking at this. :)
On Wed, May 25, 2016 at 9:33 AM, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> * Brian Silverman | 2016-05-16 14:05:54 [-0700]:
>
> Thank you for the explanation.
>
>>It happens when realtime (SCHED_RR/SCHED_FIFO) processes which have
>>called mlockall(MCL_CURRENT | MCL_FUTURE) exit.
>
> Why can't the application drop the RT priority before its exit? Wouldn't
> that be appropriate?
If it crashes or gets killed, it doesn't have a chance to drop priority.
>>One of them:
>> queue_cyclictes-5298  [000] ....1..   290.366030: remove_vma <-exit_mmap
>> queue_cyclictes-5298  [000] ....1..   290.366032: __might_sleep <-remove_vma
>> queue_cyclictes-5298  [000] ....1..   290.366033: ___might_sleep
>><-__might_sleep
>> queue_cyclictes-5298  [000] ....1..   290.366035: kmem_cache_free <-remove_vma
>> queue_cyclictes-5298  [000] ....1..   290.366036: preempt_count_add
>><-kmem_cache_free
>> queue_cyclictes-5298  [000] ....2..   290.366037: preempt_count_sub
>><-kmem_cache_free
>> queue_cyclictes-5298  [000] ....1..   290.366039: __slab_free <-kmem_cache_free
>> queue_cyclictes-5298  [000] d...1..   290.366040: preempt_count_add
>><-__slab_free
>> queue_cyclictes-5298  [000] d...2..   290.366041: preempt_count_sub
>><-__slab_free
>> queue_cyclictes-5298  [000] ....1..   290.366043: kmem_cache_free:
>>call_site=c0147054 ptr=ebb93cb8
>
> In this trace and the others you attached, you have a preempt count of
> at least one. Is this a function trace? For events I recently figure out
> that the counter in event tracing is recorded wrongly [0]. So my
> question here is basically is this a long preempt disable section or
> not.
>
> [0] http://permalink.gmane.org/gmane.linux.kernel/2227774
Yes, they are function traces without that patch applied. Every single
line in the file has a preempt count of at least 1.
Brian
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] Force processes to non-realtime before mm_exit
  2016-05-25 18:00       ` Brian Silverman
@ 2016-05-25 19:54         ` Sebastian Andrzej Siewior
  2016-06-03 23:33           ` Brian Silverman
  0 siblings, 1 reply; 11+ messages in thread
From: Sebastian Andrzej Siewior @ 2016-05-25 19:54 UTC (permalink / raw)
  To: Brian Silverman; +Cc: rt-users
On 05/25/2016 08:00 PM, Brian Silverman wrote:
>> Why can't the application drop the RT priority before its exit? Wouldn't
>> that be appropriate?
> 
> If it crashes or gets killed, it doesn't have a chance to drop priority.
That is correct. The task with the highest priority is usually one of
the most important ones. Usually if that task crashes or gets killed by
the OOM killer while in production you have usually bigger problems
than this.
I'm neither pro nor against this patch. This patch can go actually
upstream if accepted since it is not RT specific. If you have a good
use case please submit please post it upstream and CC me. Once accepted
I would pull it in -RT as well.
> Yes, they are function traces without that patch applied. Every single
> line in the file has a preempt count of at least 1.
What I tried to figure out whether dropping RT prio alone solves the
problem or if have additionally long preempt-disable regions. Based on
your feedback you don't have long preempt-disable regions.
> Brian
Sebastian
^ permalink raw reply	[flat|nested] 11+ messages in thread
* [PATCH] Force processes to non-realtime before mm_exit
@ 2016-06-03 23:18 Brian Silverman
  2016-06-05  0:28 ` Corey Minyard
  2016-07-14 17:24 ` Peter Zijlstra
  0 siblings, 2 replies; 11+ messages in thread
From: Brian Silverman @ 2016-06-03 23:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-rt-users, bigeasy, Brian Silverman
Without this, a realtime process which has called mlockall exiting
causes large latencies for other realtime processes at the same or
lower priorities. This seems like a fairly common use case too, because
realtime processes generally want their memory locked into RAM.
Signed-off-by: Brian Silverman <brian@peloton-tech.com>
---
 kernel/exit.c | 6 ++++++
 1 file changed, 6 insertions(+)
diff --git a/kernel/exit.c b/kernel/exit.c
index a0cf72b..68a97df 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -730,6 +730,12 @@ void do_exit(long code)
 	tsk->exit_code = code;
 	taskstats_exit(tsk, group_dead);
 
+	if (tsk->policy == SCHED_FIFO || tsk->policy == SCHED_RR) {
+		struct sched_param param = { .sched_priority = 0 };
+
+		sched_setscheduler_nocheck(current, SCHED_NORMAL, ¶m);
+	}
+
 	exit_mm(tsk);
 
 	if (group_dead)
-- 
2.1.4
^ permalink raw reply related	[flat|nested] 11+ messages in thread
* Re: [PATCH] Force processes to non-realtime before mm_exit
  2016-05-25 19:54         ` Sebastian Andrzej Siewior
@ 2016-06-03 23:33           ` Brian Silverman
  0 siblings, 0 replies; 11+ messages in thread
From: Brian Silverman @ 2016-06-03 23:33 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, LKML; +Cc: rt-users
Sebastian had some questions about this patch when I first sent it to rt-users.
On Wed, May 25, 2016 at 12:54 PM, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> On 05/25/2016 08:00 PM, Brian Silverman wrote:
>>> Why can't the application drop the RT priority before its exit? Wouldn't
>>> that be appropriate?
>>
>> If it crashes or gets killed, it doesn't have a chance to drop priority.
>
> That is correct. The task with the highest priority is usually one of
> the most important ones. Usually if that task crashes or gets killed by
> the OOM killer while in production you have usually bigger problems
> than this.
I sometimes use high priority for low latency rather than because a
process is important. Processes like that are designed to always run
quickly, but they need to run with low latency, so they're high
priority. However, they should never cause high latencies for other
processes, even if they crash.
Also, when something bad does happen and an important process crashes,
it's nice to have it not cause problems for other processes too.
Dumping core or freeing memory pages is never more important than
continuing to run the rest of the system.
> I'm neither pro nor against this patch. This patch can go actually
> upstream if accepted since it is not RT specific. If you have a good
> use case please submit please post it upstream and CC me. Once accepted
> I would pull it in -RT as well.
Done
Brian
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] Force processes to non-realtime before mm_exit
  2016-06-03 23:18 Brian Silverman
@ 2016-06-05  0:28 ` Corey Minyard
  2016-07-14 17:24 ` Peter Zijlstra
  1 sibling, 0 replies; 11+ messages in thread
From: Corey Minyard @ 2016-06-05  0:28 UTC (permalink / raw)
  To: Brian Silverman, linux-kernel; +Cc: linux-rt-users, bigeasy
On 06/03/2016 06:18 PM, Brian Silverman wrote:
> Without this, a realtime process which has called mlockall exiting
> causes large latencies for other realtime processes at the same or
> lower priorities. This seems like a fairly common use case too, because
> realtime processes generally want their memory locked into RAM.
Could this cause a subtle priority inversion for a process waiting
on this process to die?  I'm thinking that if this is a critical process,
it crashes, and the system is very busy with other RT processes,
it could take a long time before the process gets restarted when
it is expected to happen quickly.
I don't have another solution for you, and beyond speeding up the
memory reclamation process (which may not be possible or easy)
I'm not sure there is.  I'm just pointing out a possible side effect.
-corey
> Signed-off-by: Brian Silverman <brian@peloton-tech.com>
> ---
>   kernel/exit.c | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/kernel/exit.c b/kernel/exit.c
> index a0cf72b..68a97df 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -730,6 +730,12 @@ void do_exit(long code)
>   	tsk->exit_code = code;
>   	taskstats_exit(tsk, group_dead);
>   
> +	if (tsk->policy == SCHED_FIFO || tsk->policy == SCHED_RR) {
> +		struct sched_param param = { .sched_priority = 0 };
> +
> +		sched_setscheduler_nocheck(current, SCHED_NORMAL, ¶m);
> +	}
> +
>   	exit_mm(tsk);
>   
>   	if (group_dead)
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] Force processes to non-realtime before mm_exit
  2016-06-03 23:18 Brian Silverman
  2016-06-05  0:28 ` Corey Minyard
@ 2016-07-14 17:24 ` Peter Zijlstra
  2016-09-02 15:02   ` Thomas Gleixner
  1 sibling, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2016-07-14 17:24 UTC (permalink / raw)
  To: Brian Silverman
  Cc: linux-kernel, linux-rt-users, bigeasy, Thomas Gleixner,
	Ingo Molnar, Mike Galbraith
On Fri, Jun 03, 2016 at 04:18:44PM -0700, Brian Silverman wrote:
> Without this, a realtime process which has called mlockall exiting
> causes large latencies for other realtime processes at the same or
> lower priorities. This seems like a fairly common use case too, because
> realtime processes generally want their memory locked into RAM.
So I'm not too sure..  SCHED_FIFO/RR are a complete trainwreck and
provide absolutely no isolation from badly behaving tasks what so ever,
so I'm not too inclined to protect them from exit either, its just one
more way in which they can cause pain.
But aside from the, the patch has issues..
> +++ b/kernel/exit.c
> @@ -730,6 +730,12 @@ void do_exit(long code)
>  	tsk->exit_code = code;
>  	taskstats_exit(tsk, group_dead);
>  
> +	if (tsk->policy == SCHED_FIFO || tsk->policy == SCHED_RR) {
> +		struct sched_param param = { .sched_priority = 0 };
> +
> +		sched_setscheduler_nocheck(current, SCHED_NORMAL, ¶m);
> +	}
> +
>  	exit_mm(tsk);
That only does half a job. You forget about SCHED_DEADLINE and negative
nice tasks.
Something like the below perhaps... But yeah, unconvinced.
diff --git a/kernel/exit.c b/kernel/exit.c
index 84ae830234f8..25da16bcfb9b 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -812,6 +812,13 @@ void do_exit(long code)
 	tsk->exit_code = code;
 	taskstats_exit(tsk, group_dead);
 
+	/*
+	 * Drop all scheduler priority before exit_mm() as that
+	 * can involve a lot of work and we don't want a dying
+	 * task to interfere with healthy/running tasks.
+	 */
+	sched_normalize_task(tsk);
+
 	exit_mm(tsk);
 
 	if (group_dead)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d7f5376cfaac..14e1945c62e8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7585,14 +7585,29 @@ void ___might_sleep(const char *file, int line, int preempt_offset)
 EXPORT_SYMBOL(___might_sleep);
 #endif
 
-#ifdef CONFIG_MAGIC_SYSRQ
-void normalize_rt_tasks(void)
+void sched_normalize_task(struct task_struct *p)
 {
-	struct task_struct *g, *p;
 	struct sched_attr attr = {
 		.sched_policy = SCHED_NORMAL,
 	};
 
+	if (!dl_task(p) && !rt_task(p)) {
+		/*
+		 * Renice negative nice level tasks.
+		 */
+		if (task_nice(p) < 0)
+			set_user_nice(p, 0);
+		return;
+	}
+
+	__sched_setscheduler(p, &attr, false, false);
+}
+
+#ifdef CONFIG_MAGIC_SYSRQ
+void normalize_rt_tasks(void)
+{
+	struct task_struct *g, *p;
+
 	read_lock(&tasklist_lock);
 	for_each_process_thread(g, p) {
 		/*
@@ -7608,21 +7623,10 @@ void normalize_rt_tasks(void)
 		p->se.statistics.block_start	= 0;
 #endif
 
-		if (!dl_task(p) && !rt_task(p)) {
-			/*
-			 * Renice negative nice level userspace
-			 * tasks back to 0:
-			 */
-			if (task_nice(p) < 0)
-				set_user_nice(p, 0);
-			continue;
-		}
-
-		__sched_setscheduler(p, &attr, false, false);
+		sched_normalize_task(p);
 	}
 	read_unlock(&tasklist_lock);
 }
-
 #endif /* CONFIG_MAGIC_SYSRQ */
 
 #if defined(CONFIG_IA64) || defined(CONFIG_KGDB_KDB)
^ permalink raw reply related	[flat|nested] 11+ messages in thread
* Re: [PATCH] Force processes to non-realtime before mm_exit
  2016-07-14 17:24 ` Peter Zijlstra
@ 2016-09-02 15:02   ` Thomas Gleixner
  0 siblings, 0 replies; 11+ messages in thread
From: Thomas Gleixner @ 2016-09-02 15:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Brian Silverman, linux-kernel, linux-rt-users, bigeasy,
	Ingo Molnar, Mike Galbraith
On Thu, 14 Jul 2016, Peter Zijlstra wrote:
> On Fri, Jun 03, 2016 at 04:18:44PM -0700, Brian Silverman wrote:
> > Without this, a realtime process which has called mlockall exiting
> > causes large latencies for other realtime processes at the same or
> > lower priorities. This seems like a fairly common use case too, because
> > realtime processes generally want their memory locked into RAM.
> 
> So I'm not too sure..  SCHED_FIFO/RR are a complete trainwreck and
> provide absolutely no isolation from badly behaving tasks what so ever,
> so I'm not too inclined to protect them from exit either, its just one
> more way in which they can cause pain.
> 
> But aside from the, the patch has issues..
> 
> > +++ b/kernel/exit.c
> > @@ -730,6 +730,12 @@ void do_exit(long code)
> >  	tsk->exit_code = code;
> >  	taskstats_exit(tsk, group_dead);
> >  
> > +	if (tsk->policy == SCHED_FIFO || tsk->policy == SCHED_RR) {
> > +		struct sched_param param = { .sched_priority = 0 };
> > +
> > +		sched_setscheduler_nocheck(current, SCHED_NORMAL, ¶m);
> > +	}
> > +
> >  	exit_mm(tsk);
> 
> That only does half a job. You forget about SCHED_DEADLINE and negative
> nice tasks.
> 
> Something like the below perhaps... But yeah, unconvinced.
I agree that FIFO/RR can cause pain, but running exit_mm() with RT priority
or consuming DL time is silly.
 
FWIW: Acked-by-me
^ permalink raw reply	[flat|nested] 11+ messages in thread
end of thread, other threads:[~2016-09-02 15:04 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-05-10 18:04 [PATCH] Force processes to non-realtime before mm_exit Brian Silverman
2016-05-12  8:59 ` Sebastian Andrzej Siewior
2016-05-16 21:05   ` Brian Silverman
2016-05-25 16:33     ` Sebastian Andrzej Siewior
2016-05-25 18:00       ` Brian Silverman
2016-05-25 19:54         ` Sebastian Andrzej Siewior
2016-06-03 23:33           ` Brian Silverman
  -- strict thread matches above, loose matches on Subject: below --
2016-06-03 23:18 Brian Silverman
2016-06-05  0:28 ` Corey Minyard
2016-07-14 17:24 ` Peter Zijlstra
2016-09-02 15:02   ` Thomas Gleixner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).