* [RFC PATCH v1] sched/deadline: Add more reschedule cases to prio_changed_dl()
@ 2023-02-02 18:28 Valentin Schneider
  2023-02-03  7:06 ` Juri Lelli
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Valentin Schneider @ 2023-02-02 18:28 UTC
  To: linux-kernel
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Eder Zulian

I've been tracking down an issue on a ~5.17ish kernel where:

  CPUx                           CPUy

  <DL task p0 owns an rtmutex M>
  <p0 depletes its runtime, gets throttled>
  <rq switches to the idle task>
				 <DL task p1 blocks on M, boost/replenish p0>
				 <No call to resched_curr() happens here>

  [idle task keeps running here until *something*
   accidentally sets TIF_NEED_RESCHED]

On that kernel, it is quite easy to trigger using rt-tests's deadline_test
[1] with the test running on isolated CPUs (this reduces the chance of
something unrelated setting TIF_NEED_RESCHED on the idle tasks, making the
issue even more obvious as the hung task detector chimes in).

I haven't been able to reproduce this using a mainline kernel, even if I
revert

  2972e3050e35 ("tracing: Make trace_marker{,_raw} stream-like")

which gets rid of the lock involved in the above test, *but* I cannot
convince myself the issue isn't there from looking at the code.

Make prio_changed_dl() issue a reschedule if the current task isn't a
deadline one. While at it, ensure a reschedule is emitted when a
queued-but-not-current task gets boosted with an earlier deadline than
current's.

[1]: https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 kernel/sched/deadline.c | 45 ++++++++++++++++++++++++++---------------
 1 file changed, 29 insertions(+), 16 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 0d97d54276cc8..faa382ea084c1 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2663,17 +2663,28 @@ static void switched_to_dl(struct rq *rq, struct task_struct *p)
 static void prio_changed_dl(struct rq *rq, struct task_struct *p,
 			    int oldprio)
 {
-	if (task_on_rq_queued(p) || task_current(rq, p)) {
-#ifdef CONFIG_SMP
-		/*
-		 * This might be too much, but unfortunately
-		 * we don't have the old deadline value, and
-		 * we can't argue if the task is increasing
-		 * or lowering its prio, so...
-		 */
-		if (!rq->dl.overloaded)
-			deadline_queue_pull_task(rq);
+	if (!task_on_rq_queued(p))
+		return;
+
+	/*
+	 * We don't know if p has a earlier or later deadline, so let's blindly
+	 * set a (maybe not needed) rescheduling point.
+	 */
+	if (!IS_ENABLED(CONFIG_SMP)) {
+		resched_curr(rq);
+		return;
+	}
 
+	/*
+	 * This might be too much, but unfortunately
+	 * we don't have the old deadline value, and
+	 * we can't argue if the task is increasing
+	 * or lowering its prio, so...
+	 */
+	if (!rq->dl.overloaded)
+		deadline_queue_pull_task(rq);
+
+	if (task_current(rq, p)) {
 		/*
 		 * If we now have a earlier deadline task than p,
 		 * then reschedule, provided p is still on this
@@ -2681,14 +2692,16 @@ static void prio_changed_dl(struct rq *rq, struct task_struct *p,
 		 */
 		if (dl_time_before(rq->dl.earliest_dl.curr, p->dl.deadline))
 			resched_curr(rq);
-#else
+	} else {
 		/*
-		 * Again, we don't know if p has a earlier
-		 * or later deadline, so let's blindly set a
-		 * (maybe not needed) rescheduling point.
+		 * Current may not be deadline in case p was throttled but we
+		 * have just replenished it (e.g. rt_mutex_setprio()).
+		 *
+		 * Otherwise, if p was given an earlier deadline, reschedule.
 		 */
-		resched_curr(rq);
-#endif /* CONFIG_SMP */
+		if (!dl_task(rq->curr) ||
+		    dl_time_before(p->dl.deadline, rq->curr->dl.deadline))
+			resched_curr(rq);
 	}
 }
 
-- 
2.31.1
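
The check added in the else branch of the patch above can be modelled outside the kernel. The following is a minimal userspace sketch, not kernel code: struct demo_task, demo_time_before() and demo_needs_resched() are hypothetical names introduced for illustration, and the wraparound-safe comparison is assumed to mirror what dl_time_before() does. It covers only the case where p is queued but is not the current task.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the fields the check reads. */
struct demo_task {
	bool is_deadline;	/* models dl_task(task) */
	uint64_t deadline;	/* models task->dl.deadline */
};

/* Wraparound-safe "a is before b" (assumed to match dl_time_before()). */
static bool demo_time_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/*
 * p is queued but not current. Reschedule if current is not a deadline
 * task (e.g. the idle task ran because p was throttled), or if p now
 * has an earlier deadline than current.
 */
static bool demo_needs_resched(const struct demo_task *curr,
			       const struct demo_task *p)
{
	if (!curr->is_deadline)
		return true;

	return demo_time_before(p->deadline, curr->deadline);
}
```

In the scenario from the changelog, current is the idle task, so the first test alone requests the reschedule that was previously missing.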



* Re: [RFC PATCH v1] sched/deadline: Add more reschedule cases to prio_changed_dl()
  2023-02-02 18:28 [RFC PATCH v1] sched/deadline: Add more reschedule cases to prio_changed_dl() Valentin Schneider
@ 2023-02-03  7:06 ` Juri Lelli
  2023-02-06 12:46   ` Valentin Schneider
  2023-02-04 13:01 ` kernel test robot
  2023-02-04 18:29 ` kernel test robot
  2 siblings, 1 reply; 5+ messages in thread
From: Juri Lelli @ 2023-02-03  7:06 UTC
  To: Valentin Schneider
  Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Eder Zulian

Hi,

On 02/02/23 18:28, Valentin Schneider wrote:
> I've been tracking down an issue on a ~5.17ish kernel where:
> 
>   CPUx                           CPUy
> 
>   <DL task p0 owns an rtmutex M>
>   <p0 depletes its runtime, gets throttled>
>   <rq switches to the idle task>
> 				 <DL task p1 blocks on M, boost/replenish p0>
> 				 <No call to resched_curr() happens here>
> 
>   [idle task keeps running here until *something*
>    accidentally sets TIF_NEED_RESCHED]
> 
> On that kernel, it is quite easy to trigger using rt-tests's deadline_test
> [1] with the test running on isolated CPUs (this reduces the chance of
> something unrelated setting TIF_NEED_RESCHED on the idle tasks, making the
> issue even more obvious as the hung task detector chimes in).
> 
> I haven't been able to reproduce this using a mainline kernel, even if I
> revert
> 
>   2972e3050e35 ("tracing: Make trace_marker{,_raw} stream-like")
> 
> which gets rid of the lock involved in the above test, *but* I cannot
> convince myself the issue isn't there from looking at the code.
> 
> Make prio_changed_dl() issue a reschedule if the current task isn't a
> deadline one. While at it, ensure a reschedule is emitted when a
> queued-but-not-current task gets boosted with an earlier deadline than
> current's.

As discussed offline I agree this needs fixing, but .. :)

> [1]: https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git
> Signed-off-by: Valentin Schneider <vschneid@redhat.com>
> ---
>  kernel/sched/deadline.c | 45 ++++++++++++++++++++++++++---------------
>  1 file changed, 29 insertions(+), 16 deletions(-)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 0d97d54276cc8..faa382ea084c1 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -2663,17 +2663,28 @@ static void switched_to_dl(struct rq *rq, struct task_struct *p)
>  static void prio_changed_dl(struct rq *rq, struct task_struct *p,
>  			    int oldprio)
>  {
> -	if (task_on_rq_queued(p) || task_current(rq, p)) {
> -#ifdef CONFIG_SMP

Doesn't this break UP? I don't think earliest_dl etc. are defined in UP.

Thanks,
Juri



* Re: [RFC PATCH v1] sched/deadline: Add more reschedule cases to prio_changed_dl()
  2023-02-02 18:28 [RFC PATCH v1] sched/deadline: Add more reschedule cases to prio_changed_dl() Valentin Schneider
  2023-02-03  7:06 ` Juri Lelli
@ 2023-02-04 13:01 ` kernel test robot
  2023-02-04 18:29 ` kernel test robot
  2 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2023-02-04 13:01 UTC
  To: Valentin Schneider; +Cc: llvm, oe-kbuild-all

Hi Valentin,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on tip/sched/core]
[also build test ERROR on tip/master tip/auto-latest linus/master v6.2-rc6 next-20230203]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Valentin-Schneider/sched-deadline-Add-more-reschedule-cases-to-prio_changed_dl/20230203-023449
patch link:    https://lore.kernel.org/r/20230202182854.3696665-1-vschneid%40redhat.com
patch subject: [RFC PATCH v1] sched/deadline: Add more reschedule cases to prio_changed_dl()
config: x86_64-randconfig-a001 (https://download.01.org/0day-ci/archive/20230204/202302042016.wBoBvL60-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/83c5e4c04268bf314814c49022d874719a516dbe
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Valentin-Schneider/sched-deadline-Add-more-reschedule-cases-to-prio_changed_dl/20230203-023449
        git checkout 83c5e4c04268bf314814c49022d874719a516dbe
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from kernel/sched/build_policy.c:53:
>> kernel/sched/deadline.c:2684:14: error: no member named 'overloaded' in 'struct dl_rq'
           if (!rq->dl.overloaded)
                ~~~~~~ ^
>> kernel/sched/deadline.c:2693:29: error: no member named 'earliest_dl' in 'struct dl_rq'
                   if (dl_time_before(rq->dl.earliest_dl.curr, p->dl.deadline))
                                      ~~~~~~ ^
   2 errors generated.
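
Both errors come from the same cause: a branch guarded by IS_ENABLED() is still parsed and type-checked even when the condition is a constant 0, so every struct member it names must exist in that configuration. The sketch below reproduces the situation in userspace under stated assumptions. DEMO_SMP, DEMO_SMP_ENABLED, struct demo_dl_rq and demo_wants_pull() are hypothetical stand-ins, not kernel names, and the struct is not the real dl_rq layout.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for CONFIG_SMP: build with -DDEMO_SMP for SMP. */
#ifdef DEMO_SMP
#define DEMO_SMP_ENABLED 1
#else
#define DEMO_SMP_ENABLED 0
#endif

/* Simplified stand-in for struct dl_rq; not the real kernel layout. */
struct demo_dl_rq {
	unsigned int dl_nr_running;
#ifdef DEMO_SMP
	int overloaded;		/* member only exists in SMP builds */
#endif
};

/*
 * Rejected by the compiler when DEMO_SMP is unset, although the branch
 * can never run, because the member access is still type-checked:
 *
 *	if (DEMO_SMP_ENABLED && !dl->overloaded)
 *		return true;
 */

/* Returns true when a pull should be queued; builds in both configs. */
static bool demo_wants_pull(const struct demo_dl_rq *dl)
{
	assert(dl);
#ifdef DEMO_SMP
	return !dl->overloaded;
#else
	return false;		/* UP: no other runqueue to pull from */
#endif
}
```

Keeping the member access behind the preprocessor guard, or moving it into a helper that has an empty variant for the other configuration, lets both builds compile.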


vim +2684 kernel/sched/deadline.c

aab03e05e8f7e2 Dario Faggioli     2013-11-28  2658  
1baca4ce16b8cc Juri Lelli         2013-11-07  2659  /*
1baca4ce16b8cc Juri Lelli         2013-11-07  2660   * If the scheduling parameters of a -deadline task changed,
1baca4ce16b8cc Juri Lelli         2013-11-07  2661   * a push or pull operation might be needed.
1baca4ce16b8cc Juri Lelli         2013-11-07  2662   */
aab03e05e8f7e2 Dario Faggioli     2013-11-28  2663  static void prio_changed_dl(struct rq *rq, struct task_struct *p,
aab03e05e8f7e2 Dario Faggioli     2013-11-28  2664  			    int oldprio)
aab03e05e8f7e2 Dario Faggioli     2013-11-28  2665  {
83c5e4c04268bf Valentin Schneider 2023-02-02  2666  	if (!task_on_rq_queued(p))
83c5e4c04268bf Valentin Schneider 2023-02-02  2667  		return;
83c5e4c04268bf Valentin Schneider 2023-02-02  2668  
83c5e4c04268bf Valentin Schneider 2023-02-02  2669  	/*
83c5e4c04268bf Valentin Schneider 2023-02-02  2670  	 * We don't know if p has a earlier or later deadline, so let's blindly
83c5e4c04268bf Valentin Schneider 2023-02-02  2671  	 * set a (maybe not needed) rescheduling point.
83c5e4c04268bf Valentin Schneider 2023-02-02  2672  	 */
83c5e4c04268bf Valentin Schneider 2023-02-02  2673  	if (!IS_ENABLED(CONFIG_SMP)) {
83c5e4c04268bf Valentin Schneider 2023-02-02  2674  		resched_curr(rq);
83c5e4c04268bf Valentin Schneider 2023-02-02  2675  		return;
83c5e4c04268bf Valentin Schneider 2023-02-02  2676  	}
83c5e4c04268bf Valentin Schneider 2023-02-02  2677  
1baca4ce16b8cc Juri Lelli         2013-11-07  2678  	/*
1baca4ce16b8cc Juri Lelli         2013-11-07  2679  	 * This might be too much, but unfortunately
1baca4ce16b8cc Juri Lelli         2013-11-07  2680  	 * we don't have the old deadline value, and
1baca4ce16b8cc Juri Lelli         2013-11-07  2681  	 * we can't argue if the task is increasing
1baca4ce16b8cc Juri Lelli         2013-11-07  2682  	 * or lowering its prio, so...
1baca4ce16b8cc Juri Lelli         2013-11-07  2683  	 */
1baca4ce16b8cc Juri Lelli         2013-11-07 @2684  	if (!rq->dl.overloaded)
02d8ec9456f47b Ingo Molnar        2018-03-03  2685  		deadline_queue_pull_task(rq);
1baca4ce16b8cc Juri Lelli         2013-11-07  2686  
83c5e4c04268bf Valentin Schneider 2023-02-02  2687  	if (task_current(rq, p)) {
1baca4ce16b8cc Juri Lelli         2013-11-07  2688  		/*
1baca4ce16b8cc Juri Lelli         2013-11-07  2689  		 * If we now have a earlier deadline task than p,
1baca4ce16b8cc Juri Lelli         2013-11-07  2690  		 * then reschedule, provided p is still on this
1baca4ce16b8cc Juri Lelli         2013-11-07  2691  		 * runqueue.
1baca4ce16b8cc Juri Lelli         2013-11-07  2692  		 */
9916e214998a4a Peter Zijlstra     2015-06-11 @2693  		if (dl_time_before(rq->dl.earliest_dl.curr, p->dl.deadline))
8875125efe8402 Kirill Tkhai       2014-06-29  2694  			resched_curr(rq);
83c5e4c04268bf Valentin Schneider 2023-02-02  2695  	} else {
1baca4ce16b8cc Juri Lelli         2013-11-07  2696  		/*
83c5e4c04268bf Valentin Schneider 2023-02-02  2697  		 * Current may not be deadline in case p was throttled but we
83c5e4c04268bf Valentin Schneider 2023-02-02  2698  		 * have just replenished it (e.g. rt_mutex_setprio()).
83c5e4c04268bf Valentin Schneider 2023-02-02  2699  		 *
83c5e4c04268bf Valentin Schneider 2023-02-02  2700  		 * Otherwise, if p was given an earlier deadline, reschedule.
1baca4ce16b8cc Juri Lelli         2013-11-07  2701  		 */
83c5e4c04268bf Valentin Schneider 2023-02-02  2702  		if (!dl_task(rq->curr) ||
83c5e4c04268bf Valentin Schneider 2023-02-02  2703  		    dl_time_before(p->dl.deadline, rq->curr->dl.deadline))
8875125efe8402 Kirill Tkhai       2014-06-29  2704  			resched_curr(rq);
801ccdbf018ca5 Peter Zijlstra     2016-02-25  2705  	}
aab03e05e8f7e2 Dario Faggioli     2013-11-28  2706  }
aab03e05e8f7e2 Dario Faggioli     2013-11-28  2707  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


* Re: [RFC PATCH v1] sched/deadline: Add more reschedule cases to prio_changed_dl()
  2023-02-02 18:28 [RFC PATCH v1] sched/deadline: Add more reschedule cases to prio_changed_dl() Valentin Schneider
  2023-02-03  7:06 ` Juri Lelli
  2023-02-04 13:01 ` kernel test robot
@ 2023-02-04 18:29 ` kernel test robot
  2 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2023-02-04 18:29 UTC
  To: Valentin Schneider; +Cc: oe-kbuild-all

Hi Valentin,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on tip/sched/core]
[also build test ERROR on tip/master tip/auto-latest linus/master v6.2-rc6]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Valentin-Schneider/sched-deadline-Add-more-reschedule-cases-to-prio_changed_dl/20230203-023449
patch link:    https://lore.kernel.org/r/20230202182854.3696665-1-vschneid%40redhat.com
patch subject: [RFC PATCH v1] sched/deadline: Add more reschedule cases to prio_changed_dl()
config: arm-randconfig-r046-20230205 (https://download.01.org/0day-ci/archive/20230205/202302050231.eII2pEZn-lkp@intel.com/config)
compiler: arm-linux-gnueabi-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/83c5e4c04268bf314814c49022d874719a516dbe
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Valentin-Schneider/sched-deadline-Add-more-reschedule-cases-to-prio_changed_dl/20230203-023449
        git checkout 83c5e4c04268bf314814c49022d874719a516dbe
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from kernel/sched/build_policy.c:53:
   kernel/sched/deadline.c: In function 'prio_changed_dl':
>> kernel/sched/deadline.c:2684:20: error: 'struct dl_rq' has no member named 'overloaded'
    2684 |         if (!rq->dl.overloaded)
         |                    ^
>> kernel/sched/deadline.c:2693:42: error: 'struct dl_rq' has no member named 'earliest_dl'
    2693 |                 if (dl_time_before(rq->dl.earliest_dl.curr, p->dl.deadline))
         |                                          ^


vim +2684 kernel/sched/deadline.c

aab03e05e8f7e2 Dario Faggioli     2013-11-28  2658  
1baca4ce16b8cc Juri Lelli         2013-11-07  2659  /*
1baca4ce16b8cc Juri Lelli         2013-11-07  2660   * If the scheduling parameters of a -deadline task changed,
1baca4ce16b8cc Juri Lelli         2013-11-07  2661   * a push or pull operation might be needed.
1baca4ce16b8cc Juri Lelli         2013-11-07  2662   */
aab03e05e8f7e2 Dario Faggioli     2013-11-28  2663  static void prio_changed_dl(struct rq *rq, struct task_struct *p,
aab03e05e8f7e2 Dario Faggioli     2013-11-28  2664  			    int oldprio)
aab03e05e8f7e2 Dario Faggioli     2013-11-28  2665  {
83c5e4c04268bf Valentin Schneider 2023-02-02  2666  	if (!task_on_rq_queued(p))
83c5e4c04268bf Valentin Schneider 2023-02-02  2667  		return;
83c5e4c04268bf Valentin Schneider 2023-02-02  2668  
83c5e4c04268bf Valentin Schneider 2023-02-02  2669  	/*
83c5e4c04268bf Valentin Schneider 2023-02-02  2670  	 * We don't know if p has a earlier or later deadline, so let's blindly
83c5e4c04268bf Valentin Schneider 2023-02-02  2671  	 * set a (maybe not needed) rescheduling point.
83c5e4c04268bf Valentin Schneider 2023-02-02  2672  	 */
83c5e4c04268bf Valentin Schneider 2023-02-02  2673  	if (!IS_ENABLED(CONFIG_SMP)) {
83c5e4c04268bf Valentin Schneider 2023-02-02  2674  		resched_curr(rq);
83c5e4c04268bf Valentin Schneider 2023-02-02  2675  		return;
83c5e4c04268bf Valentin Schneider 2023-02-02  2676  	}
83c5e4c04268bf Valentin Schneider 2023-02-02  2677  
1baca4ce16b8cc Juri Lelli         2013-11-07  2678  	/*
1baca4ce16b8cc Juri Lelli         2013-11-07  2679  	 * This might be too much, but unfortunately
1baca4ce16b8cc Juri Lelli         2013-11-07  2680  	 * we don't have the old deadline value, and
1baca4ce16b8cc Juri Lelli         2013-11-07  2681  	 * we can't argue if the task is increasing
1baca4ce16b8cc Juri Lelli         2013-11-07  2682  	 * or lowering its prio, so...
1baca4ce16b8cc Juri Lelli         2013-11-07  2683  	 */
1baca4ce16b8cc Juri Lelli         2013-11-07 @2684  	if (!rq->dl.overloaded)
02d8ec9456f47b Ingo Molnar        2018-03-03  2685  		deadline_queue_pull_task(rq);
1baca4ce16b8cc Juri Lelli         2013-11-07  2686  
83c5e4c04268bf Valentin Schneider 2023-02-02  2687  	if (task_current(rq, p)) {
1baca4ce16b8cc Juri Lelli         2013-11-07  2688  		/*
1baca4ce16b8cc Juri Lelli         2013-11-07  2689  		 * If we now have a earlier deadline task than p,
1baca4ce16b8cc Juri Lelli         2013-11-07  2690  		 * then reschedule, provided p is still on this
1baca4ce16b8cc Juri Lelli         2013-11-07  2691  		 * runqueue.
1baca4ce16b8cc Juri Lelli         2013-11-07  2692  		 */
9916e214998a4a Peter Zijlstra     2015-06-11 @2693  		if (dl_time_before(rq->dl.earliest_dl.curr, p->dl.deadline))
8875125efe8402 Kirill Tkhai       2014-06-29  2694  			resched_curr(rq);
83c5e4c04268bf Valentin Schneider 2023-02-02  2695  	} else {
1baca4ce16b8cc Juri Lelli         2013-11-07  2696  		/*
83c5e4c04268bf Valentin Schneider 2023-02-02  2697  		 * Current may not be deadline in case p was throttled but we
83c5e4c04268bf Valentin Schneider 2023-02-02  2698  		 * have just replenished it (e.g. rt_mutex_setprio()).
83c5e4c04268bf Valentin Schneider 2023-02-02  2699  		 *
83c5e4c04268bf Valentin Schneider 2023-02-02  2700  		 * Otherwise, if p was given an earlier deadline, reschedule.
1baca4ce16b8cc Juri Lelli         2013-11-07  2701  		 */
83c5e4c04268bf Valentin Schneider 2023-02-02  2702  		if (!dl_task(rq->curr) ||
83c5e4c04268bf Valentin Schneider 2023-02-02  2703  		    dl_time_before(p->dl.deadline, rq->curr->dl.deadline))
8875125efe8402 Kirill Tkhai       2014-06-29  2704  			resched_curr(rq);
801ccdbf018ca5 Peter Zijlstra     2016-02-25  2705  	}
aab03e05e8f7e2 Dario Faggioli     2013-11-28  2706  }
aab03e05e8f7e2 Dario Faggioli     2013-11-28  2707  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


* Re: [RFC PATCH v1] sched/deadline: Add more reschedule cases to prio_changed_dl()
  2023-02-03  7:06 ` Juri Lelli
@ 2023-02-06 12:46   ` Valentin Schneider
  0 siblings, 0 replies; 5+ messages in thread
From: Valentin Schneider @ 2023-02-06 12:46 UTC
  To: Juri Lelli
  Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Eder Zulian

On 03/02/23 08:06, Juri Lelli wrote:
> Hi,
>
> On 02/02/23 18:28, Valentin Schneider wrote:
>> I've been tracking down an issue on a ~5.17ish kernel where:
>>
>>   CPUx                           CPUy
>>
>>   <DL task p0 owns an rtmutex M>
>>   <p0 depletes its runtime, gets throttled>
>>   <rq switches to the idle task>
>>                               <DL task p1 blocks on M, boost/replenish p0>
>>                               <No call to resched_curr() happens here>
>>
>>   [idle task keeps running here until *something*
>>    accidentally sets TIF_NEED_RESCHED]
>>
>> On that kernel, it is quite easy to trigger using rt-tests's deadline_test
>> [1] with the test running on isolated CPUs (this reduces the chance of
>> something unrelated setting TIF_NEED_RESCHED on the idle tasks, making the
>> issue even more obvious as the hung task detector chimes in).
>>
>> I haven't been able to reproduce this using a mainline kernel, even if I
>> revert
>>
>>   2972e3050e35 ("tracing: Make trace_marker{,_raw} stream-like")
>>
>> which gets rid of the lock involved in the above test, *but* I cannot
>> convince myself the issue isn't there from looking at the code.
>>
>> Make prio_changed_dl() issue a reschedule if the current task isn't a
>> deadline one. While at it, ensure a reschedule is emitted when a
>> queued-but-not-current task gets boosted with an earlier deadline than
>> current's.
>
> As discussed offline I agree this needs fixing, but .. :)
>
>> [1]: https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git
>> Signed-off-by: Valentin Schneider <vschneid@redhat.com>
>> ---
>>  kernel/sched/deadline.c | 45 ++++++++++++++++++++++++++---------------
>>  1 file changed, 29 insertions(+), 16 deletions(-)
>>
>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>> index 0d97d54276cc8..faa382ea084c1 100644
>> --- a/kernel/sched/deadline.c
>> +++ b/kernel/sched/deadline.c
>> @@ -2663,17 +2663,28 @@ static void switched_to_dl(struct rq *rq, struct task_struct *p)
>>  static void prio_changed_dl(struct rq *rq, struct task_struct *p,
>>                          int oldprio)
>>  {
>> -	if (task_on_rq_queued(p) || task_current(rq, p)) {
>> -#ifdef CONFIG_SMP
>
> Doesn't this break UP? I don't think earliest_dl etc. are defined in UP.
>

Indeed, I thought myself clever by getting rid of the ifdefs...

> Thanks,
> Juri


