public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Fix nohz_full vs rt bandwidth
@ 2023-08-21  9:49 Hao Jia
  2023-08-21  9:49 ` [PATCH 1/2] sched/core: Introduce sched_class::can_stop_tick() Hao Jia
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Hao Jia @ 2023-08-21  9:49 UTC (permalink / raw)
  To: mingo, peterz, mingo, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
	pauld
  Cc: linux-kernel, Hao Jia

Commit 88c56cfeaec4 ("sched/fair: Block nohz tick_stop when cfs
bandwidth in use") handles the conflict between NOHZ full and
cfs_bandwidth well, and the HZ_BW scheduler feature lets us choose
which of the two to prefer.

The same conflict exists between NOHZ full and rt_bandwidth;
these two patches handle it in a similar way.

patch1: Extracts a can_stop_tick() callback for each sched_class
out of sched_can_stop_tick(). This makes things clearer and also
makes it convenient to handle the conflict between NOHZ full
and rt_bandwidth.

patch2: If the HZ_BW scheduler feature is enabled and the RT task
about to run is constrained by the rt_bandwidth runtime, prevent
NOHZ full from stopping the tick.

Hao Jia (2):
  sched/core: Introduce sched_class::can_stop_tick()
  sched/rt: Block nohz tick_stop when rt bandwidth in use

 kernel/sched/core.c     | 67 +++++--------------------------
 kernel/sched/deadline.c | 16 ++++++++
 kernel/sched/fair.c     | 56 +++++++++++++++++++++++---
 kernel/sched/rt.c       | 89 ++++++++++++++++++++++++++++++++++++++++-
 kernel/sched/sched.h    |  5 ++-
 5 files changed, 168 insertions(+), 65 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/2] sched/core: Introduce sched_class::can_stop_tick()
  2023-08-21  9:49 [PATCH 0/2] Fix nohz_full vs rt bandwidth Hao Jia
@ 2023-08-21  9:49 ` Hao Jia
  2023-08-24  7:43   ` kernel test robot
  2023-08-21  9:49 ` [PATCH 2/2] sched/rt: Block nohz tick_stop when rt bandwidth in use Hao Jia
  2023-09-06  6:45 ` [PATCH 0/2] Fix nohz_full vs rt bandwidth Hao Jia
  2 siblings, 1 reply; 11+ messages in thread
From: Hao Jia @ 2023-08-21  9:49 UTC (permalink / raw)
  To: mingo, peterz, mingo, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
	pauld
  Cc: linux-kernel, Hao Jia

Extract a can_stop_tick() callback for each sched_class out of
sched_can_stop_tick(). This removes the cfs_bandwidth checks
from sched/core.c, moves them into their own sched_class,
and lets some functions become static.

It also makes it easier to deal with the
"nohz_full vs rt_bandwidth" case later.

No functional changes.

Signed-off-by: Hao Jia <jiahao.os@bytedance.com>
---
 kernel/sched/core.c     | 67 ++++++-----------------------------------
 kernel/sched/deadline.c | 16 ++++++++++
 kernel/sched/fair.c     | 56 +++++++++++++++++++++++++++++++---
 kernel/sched/rt.c       | 34 +++++++++++++++++++++
 kernel/sched/sched.h    |  5 +--
 5 files changed, 114 insertions(+), 64 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index efe3848978a0..1107ce6e4f6c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1191,68 +1191,21 @@ static void nohz_csd_func(void *info)
 #endif /* CONFIG_NO_HZ_COMMON */
 
 #ifdef CONFIG_NO_HZ_FULL
-static inline bool __need_bw_check(struct rq *rq, struct task_struct *p)
-{
-	if (rq->nr_running != 1)
-		return false;
-
-	if (p->sched_class != &fair_sched_class)
-		return false;
-
-	if (!task_on_rq_queued(p))
-		return false;
-
-	return true;
-}
-
 bool sched_can_stop_tick(struct rq *rq)
 {
-	int fifo_nr_running;
-
-	/* Deadline tasks, even if single, need the tick */
-	if (rq->dl.dl_nr_running)
-		return false;
-
-	/*
-	 * If there are more than one RR tasks, we need the tick to affect the
-	 * actual RR behaviour.
-	 */
-	if (rq->rt.rr_nr_running) {
-		if (rq->rt.rr_nr_running == 1)
-			return true;
-		else
-			return false;
-	}
-
-	/*
-	 * If there's no RR tasks, but FIFO tasks, we can skip the tick, no
-	 * forced preemption between FIFO tasks.
-	 */
-	fifo_nr_running = rq->rt.rt_nr_running - rq->rt.rr_nr_running;
-	if (fifo_nr_running)
-		return true;
-
-	/*
-	 * If there are no DL,RR/FIFO tasks, there must only be CFS tasks left;
-	 * if there's more than one we need the tick for involuntary
-	 * preemption.
-	 */
-	if (rq->nr_running > 1)
-		return false;
+	const struct sched_class *class;
+	int stop_next = 0;
+	bool ret = true;
 
-	/*
-	 * If there is one task and it has CFS runtime bandwidth constraints
-	 * and it's on the cpu now we don't want to stop the tick.
-	 * This check prevents clearing the bit if a newly enqueued task here is
-	 * dequeued by migrating while the constrained task continues to run.
-	 * E.g. going from 2->1 without going through pick_next_task().
-	 */
-	if (sched_feat(HZ_BW) && __need_bw_check(rq, rq->curr)) {
-		if (cfs_task_bw_constrained(rq->curr))
-			return false;
+	for_each_class(class) {
+		if (class->can_stop_tick) {
+			ret = class->can_stop_tick(rq, &stop_next);
+			if (stop_next)
+				break;
+		}
 	}
 
-	return true;
+	return ret;
 }
 #endif /* CONFIG_NO_HZ_FULL */
 #endif /* CONFIG_SMP */
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 58b542bf2893..0b461cb40408 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2715,6 +2715,19 @@ static int task_is_throttled_dl(struct task_struct *p, int cpu)
 }
 #endif
 
+#ifdef CONFIG_NO_HZ_FULL
+static bool can_stop_tick_dl(struct rq *rq, int *stop_next)
+{
+	/* Deadline tasks, even if single, need the tick */
+	if (rq->dl.dl_nr_running) {
+		*stop_next = 1;
+		return false;
+	}
+
+	return true;
+}
+#endif
+
 DEFINE_SCHED_CLASS(dl) = {
 
 	.enqueue_task		= enqueue_task_dl,
@@ -2750,6 +2763,9 @@ DEFINE_SCHED_CLASS(dl) = {
 #ifdef CONFIG_SCHED_CORE
 	.task_is_throttled	= task_is_throttled_dl,
 #endif
+#ifdef CONFIG_NO_HZ_FULL
+	.can_stop_tick		= can_stop_tick_dl,
+#endif
 };
 
 /* Used for dl_bw check and update, used under sched_rt_handler()::mutex */
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 128a78f3f264..7fa4892267f1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6267,7 +6267,8 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
 	rq_clock_stop_loop_update(rq);
 }
 
-bool cfs_task_bw_constrained(struct task_struct *p)
+#ifdef CONFIG_NO_HZ_FULL
+static inline bool cfs_task_bw_constrained(struct task_struct *p)
 {
 	struct cfs_rq *cfs_rq = task_cfs_rq(p);
 
@@ -6281,7 +6282,6 @@ bool cfs_task_bw_constrained(struct task_struct *p)
 	return false;
 }
 
-#ifdef CONFIG_NO_HZ_FULL
 /* called from pick_next_task_fair() */
 static void sched_fair_update_stop_tick(struct rq *rq, struct task_struct *p)
 {
@@ -6305,6 +6305,44 @@ static void sched_fair_update_stop_tick(struct rq *rq, struct task_struct *p)
 	if (cfs_task_bw_constrained(p))
 		tick_nohz_dep_set_cpu(cpu, TICK_DEP_BIT_SCHED);
 }
+
+static inline bool __need_bw_check(struct rq *rq, struct task_struct *p)
+{
+	if (rq->nr_running != 1)
+		return false;
+
+	if (p->sched_class != &fair_sched_class)
+		return false;
+
+	if (!task_on_rq_queued(p))
+		return false;
+
+	return true;
+}
+
+static bool can_stop_tick_fair(struct rq *rq, int *stop_next)
+{
+	if (rq->nr_running > 1) {
+		*stop_next = 1;
+		return false;
+	}
+
+	/*
+	 * If there is one task and it has CFS runtime bandwidth constraints
+	 * and it's on the cpu now we don't want to stop the tick.
+	 * This check prevents clearing the bit if a newly enqueued task here is
+	 * dequeued by migrating while the constrained task continues to run.
+	 * E.g. going from 2->1 without going through pick_next_task().
+	 */
+	if (sched_feat(HZ_BW) && __need_bw_check(rq, rq->curr)) {
+		if (cfs_task_bw_constrained(rq->curr)) {
+			*stop_next = 1;
+			return false;
+		}
+	}
+
+	return true;
+}
 #endif
 
 #else /* CONFIG_CFS_BANDWIDTH */
@@ -6348,10 +6386,15 @@ static inline struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg)
 static inline void destroy_cfs_bandwidth(struct cfs_bandwidth *cfs_b) {}
 static inline void update_runtime_enabled(struct rq *rq) {}
 static inline void unthrottle_offline_cfs_rqs(struct rq *rq) {}
-#ifdef CONFIG_CGROUP_SCHED
-bool cfs_task_bw_constrained(struct task_struct *p)
+#ifdef CONFIG_NO_HZ_FULL
+static bool can_stop_tick_fair(struct rq *rq, int *stop_next)
 {
-	return false;
+	if (rq->nr_running > 1) {
+		*stop_next = 1;
+		return false;
+	}
+
+	return true;
 }
 #endif
 #endif /* CONFIG_CFS_BANDWIDTH */
@@ -12864,6 +12907,9 @@ DEFINE_SCHED_CLASS(fair) = {
 #ifdef CONFIG_SCHED_CORE
 	.task_is_throttled	= task_is_throttled_fair,
 #endif
+#ifdef CONFIG_NO_HZ_FULL
+	.can_stop_tick		= can_stop_tick_fair,
+#endif
 
 #ifdef CONFIG_UCLAMP_TASK
 	.uclamp_enabled		= 1,
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 0597ba0f85ff..0b9e9467ef61 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1740,6 +1740,37 @@ static void check_preempt_curr_rt(struct rq *rq, struct task_struct *p, int flag
 #endif
 }
 
+#ifdef CONFIG_NO_HZ_FULL
+static bool can_stop_tick_rt(struct rq *rq, int *stop_next)
+{
+	int fifo_nr_running;
+
+	/*
+	 * If there are more than one RR tasks, we need the tick to affect the
+	 * actual RR behaviour.
+	 */
+	if (rq->rt.rr_nr_running) {
+		*stop_next = 1;
+		if (rq->rt.rr_nr_running == 1)
+			return true;
+		else
+			return false;
+	}
+
+	/*
+	 * If there's no RR tasks, but FIFO tasks, we can skip the tick, no
+	 * forced preemption between FIFO tasks.
+	 */
+	fifo_nr_running = rq->rt.rt_nr_running - rq->rt.rr_nr_running;
+	if (fifo_nr_running) {
+		*stop_next = 1;
+		return true;
+	}
+
+	return true;
+}
+#endif
+
 static inline void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first)
 {
 	struct sched_rt_entity *rt_se = &p->rt;
@@ -2732,6 +2763,9 @@ DEFINE_SCHED_CLASS(rt) = {
 #ifdef CONFIG_SCHED_CORE
 	.task_is_throttled	= task_is_throttled_rt,
 #endif
+#ifdef CONFIG_NO_HZ_FULL
+	.can_stop_tick		= can_stop_tick_rt,
+#endif
 
 #ifdef CONFIG_UCLAMP_TASK
 	.uclamp_enabled		= 1,
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 4d4b6f178e99..f464e7fff0ef 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -459,7 +459,6 @@ extern void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b, struct cfs_bandwidth
 extern void __refill_cfs_bandwidth_runtime(struct cfs_bandwidth *cfs_b);
 extern void start_cfs_bandwidth(struct cfs_bandwidth *cfs_b);
 extern void unthrottle_cfs_rq(struct cfs_rq *cfs_rq);
-extern bool cfs_task_bw_constrained(struct task_struct *p);
 
 extern void init_tg_rt_entry(struct task_group *tg, struct rt_rq *rt_rq,
 		struct sched_rt_entity *rt_se, int cpu,
@@ -495,7 +494,6 @@ static inline void set_task_rq_fair(struct sched_entity *se,
 #else /* CONFIG_CGROUP_SCHED */
 
 struct cfs_bandwidth { };
-static inline bool cfs_task_bw_constrained(struct task_struct *p) { return false; }
 
 #endif	/* CONFIG_CGROUP_SCHED */
 
@@ -2289,6 +2287,9 @@ struct sched_class {
 #ifdef CONFIG_SCHED_CORE
 	int (*task_is_throttled)(struct task_struct *p, int cpu);
 #endif
+#ifdef CONFIG_NO_HZ_FULL
+	bool (*can_stop_tick)(struct rq *rq, int *stop_next);
+#endif
 };
 
 static inline void put_prev_task(struct rq *rq, struct task_struct *prev)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/2] sched/rt: Block nohz tick_stop when rt bandwidth in use
  2023-08-21  9:49 [PATCH 0/2] Fix nohz_full vs rt bandwidth Hao Jia
  2023-08-21  9:49 ` [PATCH 1/2] sched/core: Introduce sched_class::can_stop_tick() Hao Jia
@ 2023-08-21  9:49 ` Hao Jia
  2023-09-06  6:45 ` [PATCH 0/2] Fix nohz_full vs rt bandwidth Hao Jia
  2 siblings, 0 replies; 11+ messages in thread
From: Hao Jia @ 2023-08-21  9:49 UTC (permalink / raw)
  To: mingo, peterz, mingo, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
	pauld
  Cc: linux-kernel, Hao Jia

Commit 88c56cfeaec4 ("sched/fair: Block nohz tick_stop
when cfs bandwidth in use") handles the conflict between
NOHZ full and cfs_bandwidth, and the HZ_BW scheduler
feature lets us choose which of the two to prefer.

The same problem exists between NOHZ full and rt_bandwidth.
Currently, when there is only one RR task, or only FIFO
tasks, on a NOHZ full cpu, NOHZ full stops the tick even
though those tasks are constrained by rt_bandwidth. This
makes it very easy for RT tasks to run past the rt_bandwidth
runtime limit: throttling no longer happens on time, and
CFS tasks can starve for a long time.

Similar to the commit above, add a check in
pick_next_task_rt() that updates the tick dependency
according to whether the task about to run is
constrained by rt_bandwidth.

Also add a check in sched_can_stop_tick() to cover
edge cases such as rq nr_running going from 2->1 without
going through pick_next_task_rt(), where the remaining
task is the running RT task.

Signed-off-by: Hao Jia <jiahao.os@bytedance.com>
---
 kernel/sched/rt.c | 61 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 57 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 0b9e9467ef61..f55ce6935a80 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1741,7 +1741,31 @@ static void check_preempt_curr_rt(struct rq *rq, struct task_struct *p, int flag
 }
 
 #ifdef CONFIG_NO_HZ_FULL
-static bool can_stop_tick_rt(struct rq *rq, int *stop_next)
+/*
+ * If the scheduler feature HZ_BW is enabled, we need to further
+ * check whether task @p is constrained by RT bandwidth to decide
+ * whether to stop tick.
+ */
+static inline bool rt_task_bw_constrained(struct task_struct *p)
+{
+	struct rt_rq *rt_rq;
+
+	if (!sched_feat(HZ_BW))
+		return false;
+
+	if (rt_bandwidth_enabled())
+		return true;
+
+	if (p->sched_class == &rt_sched_class && task_on_rq_queued(p)) {
+		rt_rq = rt_rq_of_se(&p->rt);
+		if (sched_rt_runtime(rt_rq) != RUNTIME_INF)
+			return true;
+	}
+
+	return false;
+}
+
+static bool __can_stop_tick_rt(struct rq *rq, struct task_struct *p, int *stop_next)
 {
 	int fifo_nr_running;
 
@@ -1751,7 +1775,7 @@ static bool can_stop_tick_rt(struct rq *rq, int *stop_next)
 	 */
 	if (rq->rt.rr_nr_running) {
 		*stop_next = 1;
-		if (rq->rt.rr_nr_running == 1)
+		if (rq->rt.rr_nr_running == 1 && !rt_task_bw_constrained(p))
 			return true;
 		else
 			return false;
@@ -1764,11 +1788,38 @@ static bool can_stop_tick_rt(struct rq *rq, int *stop_next)
 	fifo_nr_running = rq->rt.rt_nr_running - rq->rt.rr_nr_running;
 	if (fifo_nr_running) {
 		*stop_next = 1;
-		return true;
+		if (!rt_task_bw_constrained(p))
+			return true;
+		else
+			return false;
 	}
 
 	return true;
 }
+
+static bool can_stop_tick_rt(struct rq *rq, int *stop_next)
+{
+	bool ret;
+
+	ret = __can_stop_tick_rt(rq, rq->curr, stop_next);
+	if (*stop_next)
+		return ret;
+
+	return true;
+}
+static void sched_rt_update_stop_tick(struct rq *rq, struct task_struct *p)
+{
+	int cpu = cpu_of(rq);
+	int unused;
+
+	if (!sched_feat(HZ_BW) || !tick_nohz_full_cpu(cpu))
+		return;
+
+	if (!__can_stop_tick_rt(rq, p, &unused))
+		tick_nohz_dep_set_cpu(cpu, TICK_DEP_BIT_SCHED);
+}
+#else /* CONFIG_NO_HZ_FULL */
+static inline void sched_rt_update_stop_tick(struct rq *rq, struct task_struct *p) {}
 #endif
 
 static inline void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first)
@@ -1846,8 +1897,10 @@ static struct task_struct *pick_next_task_rt(struct rq *rq)
 {
 	struct task_struct *p = pick_task_rt(rq);
 
-	if (p)
+	if (p) {
+		sched_rt_update_stop_tick(rq, p);
 		set_next_task_rt(rq, p, true);
+	}
 
 	return p;
 }
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] sched/core: Introduce sched_class::can_stop_tick()
  2023-08-21  9:49 ` [PATCH 1/2] sched/core: Introduce sched_class::can_stop_tick() Hao Jia
@ 2023-08-24  7:43   ` kernel test robot
  0 siblings, 0 replies; 11+ messages in thread
From: kernel test robot @ 2023-08-24  7:43 UTC (permalink / raw)
  To: Hao Jia, mingo, peterz, mingo, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
	pauld
  Cc: oe-kbuild-all, linux-kernel, Hao Jia

Hi Hao,

kernel test robot noticed the following build warnings:

[auto build test WARNING on tip/sched/core]
[also build test WARNING on next-20230824]
[cannot apply to linus/master v6.5-rc7]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Hao-Jia/sched-core-Introduce-sched_class-can_stop_tick/20230821-175200
base:   tip/sched/core
patch link:    https://lore.kernel.org/r/20230821094927.51079-2-jiahao.os%40bytedance.com
patch subject: [PATCH 1/2] sched/core: Introduce sched_class::can_stop_tick()
config: arm64-randconfig-r122-20230824 (https://download.01.org/0day-ci/archive/20230824/202308241526.be4l9ROa-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 12.3.0
reproduce: (https://download.01.org/0day-ci/archive/20230824/202308241526.be4l9ROa-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202308241526.be4l9ROa-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)
   kernel/sched/fair.c:702:6: sparse: sparse: symbol 'update_entity_lag' was not declared. Should it be static?
   kernel/sched/fair.c:1140:34: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct sched_entity const *se @@     got struct sched_entity [noderef] __rcu * @@
   kernel/sched/fair.c:1140:34: sparse:     expected struct sched_entity const *se
   kernel/sched/fair.c:1140:34: sparse:     got struct sched_entity [noderef] __rcu *
   kernel/sched/fair.c:12087:9: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct sched_domain *[assigned] sd @@     got struct sched_domain [noderef] __rcu *parent @@
   kernel/sched/fair.c:12087:9: sparse:     expected struct sched_domain *[assigned] sd
   kernel/sched/fair.c:12087:9: sparse:     got struct sched_domain [noderef] __rcu *parent
   kernel/sched/fair.c:5674:22: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/fair.c:5674:22: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/fair.c:5674:22: sparse:    struct task_struct *
>> kernel/sched/fair.c:6336:56: sparse: sparse: incorrect type in argument 2 (different address spaces) @@     expected struct task_struct *p @@     got struct task_struct [noderef] __rcu *curr @@
   kernel/sched/fair.c:6336:56: sparse:     expected struct task_struct *p
   kernel/sched/fair.c:6336:56: sparse:     got struct task_struct [noderef] __rcu *curr
   kernel/sched/fair.c:6337:47: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct task_struct *p @@     got struct task_struct [noderef] __rcu *curr @@
   kernel/sched/fair.c:6337:47: sparse:     expected struct task_struct *p
   kernel/sched/fair.c:6337:47: sparse:     got struct task_struct [noderef] __rcu *curr
   kernel/sched/fair.c:6438:38: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct task_struct *curr @@     got struct task_struct [noderef] __rcu *curr @@
   kernel/sched/fair.c:6438:38: sparse:     expected struct task_struct *curr
   kernel/sched/fair.c:6438:38: sparse:     got struct task_struct [noderef] __rcu *curr
   kernel/sched/fair.c:7738:20: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct sched_domain *[assigned] sd @@     got struct sched_domain [noderef] __rcu *parent @@
   kernel/sched/fair.c:7738:20: sparse:     expected struct sched_domain *[assigned] sd
   kernel/sched/fair.c:7738:20: sparse:     got struct sched_domain [noderef] __rcu *parent
   kernel/sched/fair.c:7940:9: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct sched_domain *[assigned] tmp @@     got struct sched_domain [noderef] __rcu *parent @@
   kernel/sched/fair.c:7940:9: sparse:     expected struct sched_domain *[assigned] tmp
   kernel/sched/fair.c:7940:9: sparse:     got struct sched_domain [noderef] __rcu *parent
   kernel/sched/fair.c:8039:38: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct task_struct *curr @@     got struct task_struct [noderef] __rcu *curr @@
   kernel/sched/fair.c:8039:38: sparse:     expected struct task_struct *curr
   kernel/sched/fair.c:8039:38: sparse:     got struct task_struct [noderef] __rcu *curr
   kernel/sched/fair.c:8319:38: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct task_struct *curr @@     got struct task_struct [noderef] __rcu *curr @@
   kernel/sched/fair.c:8319:38: sparse:     expected struct task_struct *curr
   kernel/sched/fair.c:8319:38: sparse:     got struct task_struct [noderef] __rcu *curr
   kernel/sched/fair.c:9312:40: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct sched_domain *child @@     got struct sched_domain [noderef] __rcu *child @@
   kernel/sched/fair.c:9312:40: sparse:     expected struct sched_domain *child
   kernel/sched/fair.c:9312:40: sparse:     got struct sched_domain [noderef] __rcu *child
   kernel/sched/fair.c:9939:22: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/fair.c:9939:22: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/fair.c:9939:22: sparse:    struct task_struct *
   kernel/sched/fair.c:11359:9: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct sched_domain *[assigned] sd @@     got struct sched_domain [noderef] __rcu *parent @@
   kernel/sched/fair.c:11359:9: sparse:     expected struct sched_domain *[assigned] sd
   kernel/sched/fair.c:11359:9: sparse:     got struct sched_domain [noderef] __rcu *parent
   kernel/sched/fair.c:11018:44: sparse: sparse: incorrect type in initializer (different address spaces) @@     expected struct sched_domain *sd_parent @@     got struct sched_domain [noderef] __rcu *parent @@
   kernel/sched/fair.c:11018:44: sparse:     expected struct sched_domain *sd_parent
   kernel/sched/fair.c:11018:44: sparse:     got struct sched_domain [noderef] __rcu *parent
   kernel/sched/fair.c:11455:9: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct sched_domain *[assigned] sd @@     got struct sched_domain [noderef] __rcu *parent @@
   kernel/sched/fair.c:11455:9: sparse:     expected struct sched_domain *[assigned] sd
   kernel/sched/fair.c:11455:9: sparse:     got struct sched_domain [noderef] __rcu *parent
   kernel/sched/fair.c: note: in included file:
   kernel/sched/sched.h:2130:25: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2130:25: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2130:25: sparse:    struct task_struct *
   kernel/sched/sched.h:2296:9: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2296:9: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2296:9: sparse:    struct task_struct *
   kernel/sched/sched.h:2130:25: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2130:25: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2130:25: sparse:    struct task_struct *
   kernel/sched/sched.h:2130:25: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/sched/sched.h:2130:25: sparse:    struct task_struct [noderef] __rcu *
   kernel/sched/sched.h:2130:25: sparse:    struct task_struct *

vim +6336 kernel/sched/fair.c

  6321	
  6322	static bool can_stop_tick_fair(struct rq *rq, int *stop_next)
  6323	{
  6324		if (rq->nr_running > 1) {
  6325			*stop_next = 1;
  6326			return false;
  6327		}
  6328	
  6329		/*
  6330		 * If there is one task and it has CFS runtime bandwidth constraints
  6331		 * and it's on the cpu now we don't want to stop the tick.
  6332		 * This check prevents clearing the bit if a newly enqueued task here is
  6333		 * dequeued by migrating while the constrained task continues to run.
  6334		 * E.g. going from 2->1 without going through pick_next_task().
  6335		 */
> 6336		if (sched_feat(HZ_BW) && __need_bw_check(rq, rq->curr)) {
  6337			if (cfs_task_bw_constrained(rq->curr)) {
  6338				*stop_next = 1;
  6339				return false;
  6340			}
  6341		}
  6342	
  6343		return true;
  6344	}
  6345	#endif
  6346	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] Fix nohz_full vs rt bandwidth
  2023-08-21  9:49 [PATCH 0/2] Fix nohz_full vs rt bandwidth Hao Jia
  2023-08-21  9:49 ` [PATCH 1/2] sched/core: Introduce sched_class::can_stop_tick() Hao Jia
  2023-08-21  9:49 ` [PATCH 2/2] sched/rt: Block nohz tick_stop when rt bandwidth in use Hao Jia
@ 2023-09-06  6:45 ` Hao Jia
  2023-09-07 14:17   ` Phil Auld
  2 siblings, 1 reply; 11+ messages in thread
From: Hao Jia @ 2023-09-06  6:45 UTC (permalink / raw)
  To: mingo, peterz, mingo, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
	pauld
  Cc: linux-kernel


Friendly ping...

On 2023/8/21 Hao Jia wrote:
> Since the commit 88c56cfeaec4 ("sched/fair: Block nohz tick_stop
> when cfs bandwidth in use") was merged, it handles conflicts between
> NOHZ full and cfs_bandwidth well, and the scheduler feature HZ_BW
> allows us to choose which one to prefer.
> 
> This conflict also exists between NOHZ full and rt_bandwidth,
> these two patches try to handle it in a similar way.
> 
> patch1: Extracts a can_stop_tick() callback function for each
> sched_class from sched_can_stop_tick(), it will make things clearer
> and also convenient to handle the conflict between NOHZ full
> and rt_bandwidth.
> 
> patch2: If the HZ_BW scheduler feature is enabled, and the RT task
> to be run is constrained by rt_bandwidth runtime. Then it will
> prevent NO_HZ full from stopping tick.
> 
> Hao Jia (2):
>    sched/core: Introduce sched_class::can_stop_tick()
>    sched/rt: Block nohz tick_stop when rt bandwidth in use
> 
>   kernel/sched/core.c     | 67 +++++--------------------------
>   kernel/sched/deadline.c | 16 ++++++++
>   kernel/sched/fair.c     | 56 +++++++++++++++++++++++---
>   kernel/sched/rt.c       | 89 ++++++++++++++++++++++++++++++++++++++++-
>   kernel/sched/sched.h    |  5 ++-
>   5 files changed, 168 insertions(+), 65 deletions(-)
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] Fix nohz_full vs rt bandwidth
  2023-09-06  6:45 ` [PATCH 0/2] Fix nohz_full vs rt bandwidth Hao Jia
@ 2023-09-07 14:17   ` Phil Auld
  2023-09-08  2:57     ` [External] " Hao Jia
  0 siblings, 1 reply; 11+ messages in thread
From: Phil Auld @ 2023-09-07 14:17 UTC (permalink / raw)
  To: Hao Jia
  Cc: mingo, peterz, mingo, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
	linux-kernel

Hi Hao,

On Wed, Sep 06, 2023 at 02:45:39PM +0800 Hao Jia wrote:
> 
> Friendly ping...
> 
> On 2023/8/21 Hao Jia wrote:
> > Since the commit 88c56cfeaec4 ("sched/fair: Block nohz tick_stop
> > when cfs bandwidth in use") was merged, it handles conflicts between
> > NOHZ full and cfs_bandwidth well, and the scheduler feature HZ_BW
> > allows us to choose which one to prefer.
> > 
> > This conflict also exists between NOHZ full and rt_bandwidth,
> > these two patches try to handle it in a similar way.
> >

Are you actually hitting this in the real world?

We, for example, no longer enable RT_GROUP_SCHED so this is a non-issue
for our use cases.  I'd recommend considering that. (Does it even
work with cgroup2?)

In some ways what you have is a simplification of code, but it also
obfuscates the stop_tick conditions by hiding them all in the class
specific functions.  It was easier to see why the tick didn't stop
looking at the original code.

It would be better to do this only if it is really needed, in my opinion.


Cheers,
Phil

> > patch1: Extracts a can_stop_tick() callback function for each
> > sched_class from sched_can_stop_tick(), it will make things clearer
> > and also convenient to handle the conflict between NOHZ full
> > and rt_bandwidth.
> > 
> > patch2: If the HZ_BW scheduler feature is enabled, and the RT task
> > to be run is constrained by rt_bandwidth runtime. Then it will
> > prevent NO_HZ full from stopping tick.
> > 
> > Hao Jia (2):
> >    sched/core: Introduce sched_class::can_stop_tick()
> >    sched/rt: Block nohz tick_stop when rt bandwidth in use
> > 
> >   kernel/sched/core.c     | 67 +++++--------------------------
> >   kernel/sched/deadline.c | 16 ++++++++
> >   kernel/sched/fair.c     | 56 +++++++++++++++++++++++---
> >   kernel/sched/rt.c       | 89 ++++++++++++++++++++++++++++++++++++++++-
> >   kernel/sched/sched.h    |  5 ++-
> >   5 files changed, 168 insertions(+), 65 deletions(-)
> > 
> 

-- 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [External] Re: [PATCH 0/2] Fix nohz_full vs rt bandwidth
  2023-09-07 14:17   ` Phil Auld
@ 2023-09-08  2:57     ` Hao Jia
  2023-09-08 12:45       ` Phil Auld
  0 siblings, 1 reply; 11+ messages in thread
From: Hao Jia @ 2023-09-08  2:57 UTC (permalink / raw)
  To: Phil Auld
  Cc: mingo, peterz, mingo, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
	linux-kernel



On 2023/9/7 Phil Auld wrote:
> Hi Hao,
> 
> On Wed, Sep 06, 2023 at 02:45:39PM +0800 Hao Jia wrote:
>>
>> Friendly ping...
>>
>> On 2023/8/21 Hao Jia wrote:
>>> Since the commit 88c56cfeaec4 ("sched/fair: Block nohz tick_stop
>>> when cfs bandwidth in use") was merged, it handles conflicts between
>>> NOHZ full and cfs_bandwidth well, and the scheduler feature HZ_BW
>>> allows us to choose which one to prefer.
>>>
>>> This conflict also exists between NOHZ full and rt_bandwidth,
>>> these two patches try to handle it in a similar way.
>>>
> 
> Are you actually hitting this in the real world?
> 
> We, for example, no longer enable RT_GROUP_SCHED so this is a non-issue
> for our use cases.  I'd recommend considering that. (Does it even
> work with cgroup2?)
> 

Yes, it has always been there. rt bandwidth is enabled regardless of
whether RT_GROUP_SCHED is enabled. If RT_GROUP_SCHED is not enabled,
all rt tasks in the system form one group, with rt_runtime 950000 and
rt_period 1000000. So rt bandwidth is always enabled by default.


Thanks,
Hao

> In some ways what you have is a simplification of code, but it also
> obfuscates the stop_tick conditions by hiding them all in the class
> specific functions.  It was easier to see why the tick didn't stop
> looking at the original code.
> 
> It would be better to do this only if it is really needed, in my opinion.
> 
> 
> Cheers,
> Phil
> 
>>> patch1: Extracts a can_stop_tick() callback function for each
>>> sched_class from sched_can_stop_tick(), it will make things clearer
>>> and also convenient to handle the conflict between NOHZ full
>>> and rt_bandwidth.
>>>
>>> patch2: If the HZ_BW scheduler feature is enabled, and the RT task
>>> to be run is constrained by rt_bandwidth runtime. Then it will
>>> prevent NO_HZ full from stopping tick.
>>>
>>> Hao Jia (2):
>>>     sched/core: Introduce sched_class::can_stop_tick()
>>>     sched/rt: Block nohz tick_stop when rt bandwidth in use
>>>
>>>    kernel/sched/core.c     | 67 +++++--------------------------
>>>    kernel/sched/deadline.c | 16 ++++++++
>>>    kernel/sched/fair.c     | 56 +++++++++++++++++++++++---
>>>    kernel/sched/rt.c       | 89 ++++++++++++++++++++++++++++++++++++++++-
>>>    kernel/sched/sched.h    |  5 ++-
>>>    5 files changed, 168 insertions(+), 65 deletions(-)
>>>
>>
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [External] Re: [PATCH 0/2] Fix nohz_full vs rt bandwidth
  2023-09-08  2:57     ` [External] " Hao Jia
@ 2023-09-08 12:45       ` Phil Auld
  2023-09-11  3:39         ` Hao Jia
  0 siblings, 1 reply; 11+ messages in thread
From: Phil Auld @ 2023-09-08 12:45 UTC (permalink / raw)
  To: Hao Jia
  Cc: mingo, peterz, mingo, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
	linux-kernel

On Fri, Sep 08, 2023 at 10:57:26AM +0800 Hao Jia wrote:
> 
> 
> On 2023/9/7 Phil Auld wrote:
> > Hi Hao,
> > 
> > On Wed, Sep 06, 2023 at 02:45:39PM +0800 Hao Jia wrote:
> > > 
> > > Friendly ping...
> > > 
> > > On 2023/8/21 Hao Jia wrote:
> > > > Since the commit 88c56cfeaec4 ("sched/fair: Block nohz tick_stop
> > > > when cfs bandwidth in use") was merged, it handles conflicts between
> > > > NOHZ full and cfs_bandwidth well, and the scheduler feature HZ_BW
> > > > allows us to choose which one to prefer.
> > > > 
> > > > This conflict also exists between NOHZ full and rt_bandwidth,
> > > > these two patches try to handle it in a similar way.
> > > > 
> > 
> > Are you actually hitting this in the real world?
> > 
> > We, for example, no longer enable RT_GROUP_SCHED so this is a non-issue
> > for our use cases.  I'd recommend considering that. (Does it even
> > work with cgroup2?)
> > 
> 
> Yes, it has always been there. Regardless of whether RT_GROUP_SCHED is
> enabled or not, rt bandwidth is always enabled. If RT_GROUP_SCHED is not
> enabled, all rt tasks in the system are a group, and rt_runtime is 950000,
> and rt_period is 1000000. So rt bandwidth is always enabled by default.

Sure, there is that. But I think Daniel is actively trying to remove it.

Also I'm not sure you answered my question. Are you actually hitting this
in the real world?  I'd be tempted to think this is a mis-configuration or
mis-use of RT.  Plus you can disable that throttling and use stalld to catch
cases where the rt task goes out of control.

I'm not totally against doing this (for what my vote counts...), I just
wonder if it's really needed. It seems it may be over-engineering something
that is soon to be a non-problem.


Cheers,
Phil

> 
> 
> Thanks,
> Hao
> 
> > In some ways what you have is a simplification of code, but it also
> > obfuscates the stop_tick conditions by hiding them all in the class
> > specific functions.  It was easier to see why the tick didn't stop
> > looking at the original code.
> > 
> > It would be better to do this only if it is really needed, in my opinion.
> > 
> > 
> > Cheers,
> > Phil
> > 
> > > > patch1: Extracts a can_stop_tick() callback function for each
> > > > sched_class from sched_can_stop_tick(), it will make things clearer
> > > > and also convenient to handle the conflict between NOHZ full
> > > > and rt_bandwidth.
> > > > 
> > > > patch2: If the HZ_BW scheduler feature is enabled, and the RT task
> > > > to be run is constrained by rt_bandwidth runtime. Then it will
> > > > prevent NO_HZ full from stopping tick.
> > > > 
> > > > Hao Jia (2):
> > > >     sched/core: Introduce sched_class::can_stop_tick()
> > > >     sched/rt: Block nohz tick_stop when rt bandwidth in use
> > > > 
> > > >    kernel/sched/core.c     | 67 +++++--------------------------
> > > >    kernel/sched/deadline.c | 16 ++++++++
> > > >    kernel/sched/fair.c     | 56 +++++++++++++++++++++++---
> > > >    kernel/sched/rt.c       | 89 ++++++++++++++++++++++++++++++++++++++++-
> > > >    kernel/sched/sched.h    |  5 ++-
> > > >    5 files changed, 168 insertions(+), 65 deletions(-)
> > > > 
> > > 
> > 
> 

-- 



* Re: [External] Re: [PATCH 0/2] Fix nohz_full vs rt bandwidth
  2023-09-08 12:45       ` Phil Auld
@ 2023-09-11  3:39         ` Hao Jia
  2023-09-11 13:25           ` Phil Auld
  0 siblings, 1 reply; 11+ messages in thread
From: Hao Jia @ 2023-09-11  3:39 UTC (permalink / raw)
  To: Phil Auld
  Cc: mingo, peterz, mingo, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
	linux-kernel



On 2023/9/8 Phil Auld wrote:
> On Fri, Sep 08, 2023 at 10:57:26AM +0800 Hao Jia wrote:
>>
>>
>> On 2023/9/7 Phil Auld wrote:
>>> Hi Hao,
>>>
>>> On Wed, Sep 06, 2023 at 02:45:39PM +0800 Hao Jia wrote:
>>>>
>>>> Friendly ping...
>>>>
>>>> On 2023/8/21 Hao Jia wrote:
>>>>> Since the commit 88c56cfeaec4 ("sched/fair: Block nohz tick_stop
>>>>> when cfs bandwidth in use") was merged, it handles conflicts between
>>>>> NOHZ full and cfs_bandwidth well, and the scheduler feature HZ_BW
>>>>> allows us to choose which one to prefer.
>>>>>
>>>>> This conflict also exists between NOHZ full and rt_bandwidth,
>>>>> these two patches try to handle it in a similar way.
>>>>>
>>>
>>> Are you actually hitting this in the real world?
>>>
>>> We, for example, no longer enable RT_GROUP_SCHED so this is a non-issue
>>> for our use cases.  I'd recommend considering that. (Does it even
>>> work with cgroup2?)
>>>
>>
>> Yes, it has always been there. Regardless of whether RT_GROUP_SCHED is
>> enabled or not, rt bandwidth is always enabled. If RT_GROUP_SCHED is not
>> enabled, all rt tasks in the system are a group, and rt_runtime is 950000,
>> and rt_period is 1000000. So rt bandwidth is always enabled by default.
> 
> Sure, there is that. But I think Daniel is actively trying to remove it.
> 

Thank you for your reply. Maybe I'm missing something. Can you give me 
some links to discussions about it?

> Also I'm not sure you answered my question. Are you actually hitting this
> in the real world?  I'd be tempted to think this is a mis-configuration or
> mis-use of RT.  Plus you can disable that throttling and use stalld to catch
> cases where the rt task goes out of control.
> 

> Are you actually hitting this in the real world?

I tested on my machine using the default settings (rt_runtime = 950000, 
rt_period = 1000000). The rt task is supposed to be throttled after 
running for 0.95 seconds, but under the influence of NO_HZ_FULL it may 
only be throttled after running for about 1.4 seconds. This only delays 
the rt_bandwidth throttle; no warning is triggered.
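The delay can be sketched with a toy model (not kernel code): rt runtime is charged from the periodic tick, so when the tick is stopped the accounting, and hence the throttle, happens late. The no-tick check interval below is purely an assumed, illustrative number.

```python
# Toy model (not kernel code): rt bandwidth is charged when a tick (or
# some equivalent check) fires, so a stopped tick defers the accounting
# and therefore the throttle. All numbers are illustrative.

def throttle_time(runtime_us, tick_us, check_us):
    """Return when a busy rt task is actually throttled, given that
    runtime is only charged when a tick (or fallback check) fires."""
    charged = 0
    now = 0
    while charged < runtime_us:
        now += tick_us if tick_us else check_us
        charged = now          # busy task: all elapsed time is rt runtime
    return now

# With a 1000us tick, the 950000us budget is noticed right on time:
with_tick = throttle_time(950_000, 1_000, None)       # 950000us = 0.95s
# With the tick stopped and checks only every 450000us (assumed),
# the throttle lands well past the budget:
without_tick = throttle_time(950_000, 0, 450_000)     # 1350000us = 1.35s
```

With a 1 ms tick the budget is noticed at exactly 0.95 s; with checks every 0.45 s the throttle lands at 1.35 s, in the same ballpark as the ~1.4 s measured above.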


> Plus you can disable that throttling and use stalld to catch cases
> where the rt task goes out of control.

IIRC, if we disable rt_bandwidth, the rt task is always running, which 
may cause cfs task starvation and hung_task warnings. This may be the 
reason why rt_bandwidth is enabled by default (rt_runtime = 950000, 
rt_period = 1000000).
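For completeness, disabling the throttle goes through the documented sysctl: writing -1 to /proc/sys/kernel/sched_rt_runtime_us turns rt bandwidth off entirely, which is exactly the situation where a runaway rt task can starve cfs tasks. A sketch (`set_rt_runtime_us` is a hypothetical helper; writing the real procfs file needs root):

```python
from pathlib import Path

def set_rt_runtime_us(value, sysctl_dir="/proc/sys/kernel"):
    """Write sched_rt_runtime_us; -1 disables rt bandwidth throttling.
    Hypothetical helper: sysctl_dir is parameterized so the write can be
    exercised outside /proc, but on a real system it needs root."""
    Path(sysctl_dir, "sched_rt_runtime_us").write_text(f"{value}\n")

# set_rt_runtime_us(-1)      # disable throttling (root only)
# set_rt_runtime_us(950_000) # restore the kernel default
```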


Thanks,
Hao

> I'm not totally against doing this (for what my vote counts...), I just
> wonder if it's really needed. It seem it may be over-engineering something
> that is soon to be a non-problem.
> 
> 
> Cheers,
> Phil
> 
>>
>>
>> Thanks,
>> Hao
>>
>>> In some ways what you have is a simplification of code, but it also
>>> obfuscates the stop_tick conditions by hiding them all in the class
>>> specific functions.  It was easier to see why the tick didn't stop
>>> looking at the original code.
>>>
>>> It would be better to do this only if it is really needed, in my opinion.
>>>
>>>
>>> Cheers,
>>> Phil
>>>
>>>>> patch1: Extracts a can_stop_tick() callback function for each
>>>>> sched_class from sched_can_stop_tick(), it will make things clearer
>>>>> and also convenient to handle the conflict between NOHZ full
>>>>> and rt_bandwidth.
>>>>>
>>>>> patch2: If the HZ_BW scheduler feature is enabled, and the RT task
>>>>> to be run is constrained by rt_bandwidth runtime. Then it will
>>>>> prevent NO_HZ full from stopping tick.
>>>>>
>>>>> Hao Jia (2):
>>>>>      sched/core: Introduce sched_class::can_stop_tick()
>>>>>      sched/rt: Block nohz tick_stop when rt bandwidth in use
>>>>>
>>>>>     kernel/sched/core.c     | 67 +++++--------------------------
>>>>>     kernel/sched/deadline.c | 16 ++++++++
>>>>>     kernel/sched/fair.c     | 56 +++++++++++++++++++++++---
>>>>>     kernel/sched/rt.c       | 89 ++++++++++++++++++++++++++++++++++++++++-
>>>>>     kernel/sched/sched.h    |  5 ++-
>>>>>     5 files changed, 168 insertions(+), 65 deletions(-)
>>>>>
>>>>
>>>
>>
> 


* Re: [External] Re: [PATCH 0/2] Fix nohz_full vs rt bandwidth
  2023-09-11  3:39         ` Hao Jia
@ 2023-09-11 13:25           ` Phil Auld
  2023-09-12  2:35             ` Hao Jia
  0 siblings, 1 reply; 11+ messages in thread
From: Phil Auld @ 2023-09-11 13:25 UTC (permalink / raw)
  To: Hao Jia
  Cc: mingo, peterz, mingo, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
	linux-kernel


Hi Hao,

On Mon, Sep 11, 2023 at 11:39:02AM +0800 Hao Jia wrote:
> On 2023/9/8 Phil Auld wrote:
> > On Fri, Sep 08, 2023 at 10:57:26AM +0800 Hao Jia wrote:
> > > On 2023/9/7 Phil Auld wrote:
> > > > Hi Hao,

...

> > > > 
> > > > Are you actually hitting this in the real world?
> > > > 
> > > > We, for example, no longer enable RT_GROUP_SCHED so this is a non-issue
> > > > for our use cases.  I'd recommend considering that. (Does it even
> > > > work with cgroup2?)
> > > > 
> > > 
> > > Yes, it has always been there. Regardless of whether RT_GROUP_SCHED is
> > > enabled or not, rt bandwidth is always enabled. If RT_GROUP_SCHED is not
> > > enabled, all rt tasks in the system are a group, and rt_runtime is 950000,
> > > and rt_period is 1000000. So rt bandwidth is always enabled by default.
> > 
> > Sure, there is that. But I think Daniel is actively trying to remove it.
> > 
> 
> Thank you for your reply. Maybe I'm missing something. Can you give me some
> links to discussions about it?
>

Sure, try this one:
      https://lore.kernel.org/lkml/cover.1693510979.git.bristot@kernel.org/


> > Also I'm not sure you answered my question. Are you actually hitting this
> > in the real world?  I'd be tempted to think this is a mis-configuration or
> > mis-use of RT.  Plus you can disable that throttling and use stalld to catch
> > cases where the rt task goes out of control.
> > 
> 
> > Are you actually hitting this in the real world?
> 
> I tested on my machine using default settings (rt_runtime is 950000, and
> rt_period is 1000000.). The rt task is supposed to be throttled after
> running for 0.95 seconds, but due to the influence of NO_HZ_FULL, it may be
> throttled after running for about 1.4 seconds. This will only cause the
> rt_bandwidth throttle to be delayed, but no warning will be triggered.

Yes, you can hit this in testing.  I'm asking if it's causing your real-world
application issues or is this just a theoretical problem you can contrive a
test for?  Are you actually hitting this when running your workload?
From what you are showing (a test setup) I'm guessing no.

> 
> 
> > Plus you can disable that throttling and use stalld to catch cases where
> > the rt task goes out of control.
> 
> IIRC, if we disable rt_bandwidth. The rt task is always running, which may
> cause cfs task starvation and hung_task warnning. This may be the reason why
> rt_bandwidth is enabled by default (rt_runtime is 950000, and rt_period is
> 1000000).

That's what stalld is for.  Some rt applications don't like giving up 5% of
the cpu time when they don't really need to.


Cheers,
Phil


-- 



* Re: [External] Re: [PATCH 0/2] Fix nohz_full vs rt bandwidth
  2023-09-11 13:25           ` Phil Auld
@ 2023-09-12  2:35             ` Hao Jia
  0 siblings, 0 replies; 11+ messages in thread
From: Hao Jia @ 2023-09-12  2:35 UTC (permalink / raw)
  To: Phil Auld
  Cc: mingo, peterz, mingo, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
	linux-kernel



On 2023/9/11 Phil Auld wrote:
> 
> Hi Hao,
> 
> On Mon, Sep 11, 2023 at 11:39:02AM +0800 Hao Jia wrote:
>> On 2023/9/8 Phil Auld wrote:
>>> On Fri, Sep 08, 2023 at 10:57:26AM +0800 Hao Jia wrote:
>>>> On 2023/9/7 Phil Auld wrote:
>>>>> Hi Hao,
> 
> ...
> 
>>>>>
>>>>> Are you actually hitting this in the real world?
>>>>>
>>>>> We, for example, no longer enable RT_GROUP_SCHED so this is a non-issue
>>>>> for our use cases.  I'd recommend considering that. (Does it even
>>>>> work with cgroup2?)
>>>>>
>>>>
>>>> Yes, it has always been there. Regardless of whether RT_GROUP_SCHED is
>>>> enabled or not, rt bandwidth is always enabled. If RT_GROUP_SCHED is not
>>>> enabled, all rt tasks in the system are a group, and rt_runtime is 950000,
>>>> and rt_period is 1000000. So rt bandwidth is always enabled by default.
>>>
>>> Sure, there is that. But I think Daniel is actively trying to remove it.
>>>
>>
>> Thank you for your reply. Maybe I'm missing something. Can you give me some
>> links to discussions about it?
>>
> 
> Sure, try this one:
>        https://lore.kernel.org/lkml/cover.1693510979.git.bristot@kernel.org/
> 

Thanks for the information you shared.

> 
>>> Also I'm not sure you answered my question. Are you actually hitting this
>>> in the real world?  I'd be tempted to think this is a mis-configuration or
>>> mis-use of RT.  Plus you can disable that throttling and use stalld to catch
>>> cases where the rt task goes out of control.
>>>
>>
>>> Are you actually hitting this in the real world?
>>
>> I tested on my machine using default settings (rt_runtime is 950000, and
>> rt_period is 1000000.). The rt task is supposed to be throttled after
>> running for 0.95 seconds, but due to the influence of NO_HZ_FULL, it may be
>> throttled after running for about 1.4 seconds. This will only cause the
>> rt_bandwidth throttle to be delayed, but no warning will be triggered.
> 
> Yes, you can hit this in testing.  I'm asking if it's causing your real-world
> application issues or is this just a theoretical problem you can contrive a
> test for?  Are you actually hitting this when running your workload?
>  From what you are showing (a test setup) I'm guessing no.
> 

Yes, you're right: I don't see this issue in our production environment. 
The number of rt tasks there is very small, and their running time is 
very short, so the rt_bandwidth throttle will not be triggered unless an 
rt task goes out of control.

Thanks,
Hao

>>
>>
>>> Plus you can disable that throttling and use stalld to catch cases where
>>> the rt task goes out of control.
>>
>> IIRC, if we disable rt_bandwidth, the rt task is always running, which may
>> cause cfs task starvation and hung_task warnings. This may be the reason why
>> rt_bandwidth is enabled by default (rt_runtime is 950000, and rt_period is
>> 1000000).
> 
> That's what stalld is for.  Some rt applications don't like giving up 5% of
> the cpu time when they don't really need to.
> 
> 
> Cheers,
> Phil
> 
> 


end of thread  [newest: ~2023-09-12  3:36 UTC]

Thread overview: 11+ messages
2023-08-21  9:49 [PATCH 0/2] Fix nohz_full vs rt bandwidth Hao Jia
2023-08-21  9:49 ` [PATCH 1/2] sched/core: Introduce sched_class::can_stop_tick() Hao Jia
2023-08-24  7:43   ` kernel test robot
2023-08-21  9:49 ` [PATCH 2/2] sched/rt: Block nohz tick_stop when rt bandwidth in use Hao Jia
2023-09-06  6:45 ` [PATCH 0/2] Fix nohz_full vs rt bandwidth Hao Jia
2023-09-07 14:17   ` Phil Auld
2023-09-08  2:57     ` [External] " Hao Jia
2023-09-08 12:45       ` Phil Auld
2023-09-11  3:39         ` Hao Jia
2023-09-11 13:25           ` Phil Auld
2023-09-12  2:35             ` Hao Jia
