* [RFC][PATCH 0/3 v2] sched: Display deadline bandwidth and other SCHED_DEBUG clean up
@ 2016-02-03 16:56 Steven Rostedt
2016-02-03 16:56 ` [RFC][PATCH 1/3 v2] sched: Move sched_feature file setup into debug.c Steven Rostedt
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Steven Rostedt @ 2016-02-03 16:56 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Peter Zijlstra, Thomas Gleixner, Juri Lelli,
Clark Williams, Andrew Morton
I'm starting to play with SCHED_DEADLINE a bit and I'm able to cause
a bandwidth "leak". Then I realized there's no way to examine what bandwidths
are enabled on which CPUs. I added the bandwith ratios to the
/proc/sched_debug file.
I will be posting the SCHED_DEADLINE issue in a separate thread.
# grep dl /proc/sched_debug
dl_rq[0]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[1]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 104857
dl_rq[2]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[3]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[4]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[5]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[6]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[7]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
Before adding this code, I also realized there was a bit of
SCHED_DEBUG code in the kernel/sched/core.c file, and decided to move that
to kernel/sched/debug.c to clean the core.c file up a bit. Those patches
are mostly orthognal to the deadline_bw file, but decided to group them
together here.
Changes since v1:
o Added the bandwith to /proc/sched_debug instead of creating a new file.
o Removed patch: sched: Move sched_domain_debug into debug.c
As it was decided that debug.c shouldn't have domain debugging.
Steven Rostedt (Red Hat) (3):
sched: Move sched_feature file setup into debug.c
sched: Move sched_domain_sysctl to debug.c
sched: Add bandwidth ratio to /proc/sched_debug
----
kernel/sched/core.c | 311 --------------------------------------------------
kernel/sched/debug.c | 313 +++++++++++++++++++++++++++++++++++++++++++++++++++
kernel/sched/sched.h | 13 +++
3 files changed, 326 insertions(+), 311 deletions(-)
^ permalink raw reply [flat|nested] 7+ messages in thread
* [RFC][PATCH 1/3 v2] sched: Move sched_feature file setup into debug.c
2016-02-03 16:56 [RFC][PATCH 0/3 v2] sched: Display deadline bandwidth and other SCHED_DEBUG clean up Steven Rostedt
@ 2016-02-03 16:56 ` Steven Rostedt
2016-02-03 16:57 ` [RFC][PATCH 2/3 v2] sched: Move sched_domain_sysctl to debug.c Steven Rostedt
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2016-02-03 16:56 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Peter Zijlstra, Thomas Gleixner, Juri Lelli,
Clark Williams, Andrew Morton
[-- Attachment #1: 0001-sched-Move-sched_feature-file-setup-into-debug.c.patch --]
[-- Type: text/plain, Size: 6768 bytes --]
From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
As sched_feature is only created when SCHED_DEBUG is enabled, and the file
debug.c is only compiled when SCHED_DEBUG is enabled, it makes sense to move
sched_feature setup into that file and get rid of the #ifdef.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
kernel/sched/core.c | 133 ---------------------------------------------------
kernel/sched/debug.c | 131 ++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 131 insertions(+), 133 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 63d3a24e081a..c49e64c4675e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -66,7 +66,6 @@
#include <linux/pagemap.h>
#include <linux/hrtimer.h>
#include <linux/tick.h>
-#include <linux/debugfs.h>
#include <linux/ctype.h>
#include <linux/ftrace.h>
#include <linux/slab.h>
@@ -124,138 +123,6 @@ const_debug unsigned int sysctl_sched_features =
#undef SCHED_FEAT
-#ifdef CONFIG_SCHED_DEBUG
-#define SCHED_FEAT(name, enabled) \
- #name ,
-
-static const char * const sched_feat_names[] = {
-#include "features.h"
-};
-
-#undef SCHED_FEAT
-
-static int sched_feat_show(struct seq_file *m, void *v)
-{
- int i;
-
- for (i = 0; i < __SCHED_FEAT_NR; i++) {
- if (!(sysctl_sched_features & (1UL << i)))
- seq_puts(m, "NO_");
- seq_printf(m, "%s ", sched_feat_names[i]);
- }
- seq_puts(m, "\n");
-
- return 0;
-}
-
-#ifdef HAVE_JUMP_LABEL
-
-#define jump_label_key__true STATIC_KEY_INIT_TRUE
-#define jump_label_key__false STATIC_KEY_INIT_FALSE
-
-#define SCHED_FEAT(name, enabled) \
- jump_label_key__##enabled ,
-
-struct static_key sched_feat_keys[__SCHED_FEAT_NR] = {
-#include "features.h"
-};
-
-#undef SCHED_FEAT
-
-static void sched_feat_disable(int i)
-{
- static_key_disable(&sched_feat_keys[i]);
-}
-
-static void sched_feat_enable(int i)
-{
- static_key_enable(&sched_feat_keys[i]);
-}
-#else
-static void sched_feat_disable(int i) { };
-static void sched_feat_enable(int i) { };
-#endif /* HAVE_JUMP_LABEL */
-
-static int sched_feat_set(char *cmp)
-{
- int i;
- int neg = 0;
-
- if (strncmp(cmp, "NO_", 3) == 0) {
- neg = 1;
- cmp += 3;
- }
-
- for (i = 0; i < __SCHED_FEAT_NR; i++) {
- if (strcmp(cmp, sched_feat_names[i]) == 0) {
- if (neg) {
- sysctl_sched_features &= ~(1UL << i);
- sched_feat_disable(i);
- } else {
- sysctl_sched_features |= (1UL << i);
- sched_feat_enable(i);
- }
- break;
- }
- }
-
- return i;
-}
-
-static ssize_t
-sched_feat_write(struct file *filp, const char __user *ubuf,
- size_t cnt, loff_t *ppos)
-{
- char buf[64];
- char *cmp;
- int i;
- struct inode *inode;
-
- if (cnt > 63)
- cnt = 63;
-
- if (copy_from_user(&buf, ubuf, cnt))
- return -EFAULT;
-
- buf[cnt] = 0;
- cmp = strstrip(buf);
-
- /* Ensure the static_key remains in a consistent state */
- inode = file_inode(filp);
- inode_lock(inode);
- i = sched_feat_set(cmp);
- inode_unlock(inode);
- if (i == __SCHED_FEAT_NR)
- return -EINVAL;
-
- *ppos += cnt;
-
- return cnt;
-}
-
-static int sched_feat_open(struct inode *inode, struct file *filp)
-{
- return single_open(filp, sched_feat_show, NULL);
-}
-
-static const struct file_operations sched_feat_fops = {
- .open = sched_feat_open,
- .write = sched_feat_write,
- .read = seq_read,
- .llseek = seq_lseek,
- .release = single_release,
-};
-
-static __init int sched_init_debug(void)
-{
- debugfs_create_file("sched_features", 0644, NULL, NULL,
- &sched_feat_fops);
-
- return 0;
-}
-late_initcall(sched_init_debug);
-#endif /* CONFIG_SCHED_DEBUG */
-
/*
* Number of tasks to iterate in a single balance run.
* Limited because this is done with IRQs disabled.
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 641511771ae6..593178be8124 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -16,6 +16,7 @@
#include <linux/kallsyms.h>
#include <linux/utsname.h>
#include <linux/mempolicy.h>
+#include <linux/debugfs.h>
#include "sched.h"
@@ -58,6 +59,136 @@ static unsigned long nsec_low(unsigned long long nsec)
#define SPLIT_NS(x) nsec_high(x), nsec_low(x)
+#define SCHED_FEAT(name, enabled) \
+ #name ,
+
+static const char * const sched_feat_names[] = {
+#include "features.h"
+};
+
+#undef SCHED_FEAT
+
+static int sched_feat_show(struct seq_file *m, void *v)
+{
+ int i;
+
+ for (i = 0; i < __SCHED_FEAT_NR; i++) {
+ if (!(sysctl_sched_features & (1UL << i)))
+ seq_puts(m, "NO_");
+ seq_printf(m, "%s ", sched_feat_names[i]);
+ }
+ seq_puts(m, "\n");
+
+ return 0;
+}
+
+#ifdef HAVE_JUMP_LABEL
+
+#define jump_label_key__true STATIC_KEY_INIT_TRUE
+#define jump_label_key__false STATIC_KEY_INIT_FALSE
+
+#define SCHED_FEAT(name, enabled) \
+ jump_label_key__##enabled ,
+
+struct static_key sched_feat_keys[__SCHED_FEAT_NR] = {
+#include "features.h"
+};
+
+#undef SCHED_FEAT
+
+static void sched_feat_disable(int i)
+{
+ static_key_disable(&sched_feat_keys[i]);
+}
+
+static void sched_feat_enable(int i)
+{
+ static_key_enable(&sched_feat_keys[i]);
+}
+#else
+static void sched_feat_disable(int i) { };
+static void sched_feat_enable(int i) { };
+#endif /* HAVE_JUMP_LABEL */
+
+static int sched_feat_set(char *cmp)
+{
+ int i;
+ int neg = 0;
+
+ if (strncmp(cmp, "NO_", 3) == 0) {
+ neg = 1;
+ cmp += 3;
+ }
+
+ for (i = 0; i < __SCHED_FEAT_NR; i++) {
+ if (strcmp(cmp, sched_feat_names[i]) == 0) {
+ if (neg) {
+ sysctl_sched_features &= ~(1UL << i);
+ sched_feat_disable(i);
+ } else {
+ sysctl_sched_features |= (1UL << i);
+ sched_feat_enable(i);
+ }
+ break;
+ }
+ }
+
+ return i;
+}
+
+static ssize_t
+sched_feat_write(struct file *filp, const char __user *ubuf,
+ size_t cnt, loff_t *ppos)
+{
+ char buf[64];
+ char *cmp;
+ int i;
+ struct inode *inode;
+
+ if (cnt > 63)
+ cnt = 63;
+
+ if (copy_from_user(&buf, ubuf, cnt))
+ return -EFAULT;
+
+ buf[cnt] = 0;
+ cmp = strstrip(buf);
+
+ /* Ensure the static_key remains in a consistent state */
+ inode = file_inode(filp);
+ inode_lock(inode);
+ i = sched_feat_set(cmp);
+ inode_unlock(inode);
+ if (i == __SCHED_FEAT_NR)
+ return -EINVAL;
+
+ *ppos += cnt;
+
+ return cnt;
+}
+
+static int sched_feat_open(struct inode *inode, struct file *filp)
+{
+ return single_open(filp, sched_feat_show, NULL);
+}
+
+static const struct file_operations sched_feat_fops = {
+ .open = sched_feat_open,
+ .write = sched_feat_write,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+static __init int sched_init_debug(void)
+{
+ debugfs_create_file("sched_features", 0644, NULL, NULL,
+ &sched_feat_fops);
+
+ return 0;
+}
+late_initcall(sched_init_debug);
+
#ifdef CONFIG_FAIR_GROUP_SCHED
static void print_cfs_group_stats(struct seq_file *m, int cpu, struct task_group *tg)
{
--
2.6.4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [RFC][PATCH 2/3 v2] sched: Move sched_domain_sysctl to debug.c
2016-02-03 16:56 [RFC][PATCH 0/3 v2] sched: Display deadline bandwidth and other SCHED_DEBUG clean up Steven Rostedt
2016-02-03 16:56 ` [RFC][PATCH 1/3 v2] sched: Move sched_feature file setup into debug.c Steven Rostedt
@ 2016-02-03 16:57 ` Steven Rostedt
2016-02-03 16:57 ` [RFC][PATCH 3/3 v2] sched: Add bandwidth ratio to /proc/sched_debug Steven Rostedt
2016-02-10 15:13 ` [RFC][PATCH 0/3 v2] sched: Display deadline bandwidth and other SCHED_DEBUG clean up Steven Rostedt
3 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2016-02-03 16:57 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Peter Zijlstra, Thomas Gleixner, Juri Lelli,
Clark Williams, Andrew Morton
[-- Attachment #1: 0002-sched-Move-sched_domain_sysctl-to-debug.c.patch --]
[-- Type: text/plain, Size: 11900 bytes --]
From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
The sched_domain_sysctl setup is only enabled when SCHED_DEBUG is
configured. As debug.c is only compiled when SCHED_DEBUG is configured as
well, move the setup of sched_domain_sysctl into that file.
Note, the (un)register_sched_domain_sysctl() functions had to be changed
from static to allow access to them from core.c.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
kernel/sched/core.c | 178 ---------------------------------------------------
kernel/sched/debug.c | 173 +++++++++++++++++++++++++++++++++++++++++++++++++
kernel/sched/sched.h | 13 ++++
3 files changed, 186 insertions(+), 178 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c49e64c4675e..9c6d976d94f0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -70,7 +70,6 @@
#include <linux/ftrace.h>
#include <linux/slab.h>
#include <linux/init_task.h>
-#include <linux/binfmts.h>
#include <linux/context_tracking.h>
#include <linux/compiler.h>
@@ -5272,183 +5271,6 @@ static void migrate_tasks(struct rq *dead_rq)
}
#endif /* CONFIG_HOTPLUG_CPU */
-#if defined(CONFIG_SCHED_DEBUG) && defined(CONFIG_SYSCTL)
-
-static struct ctl_table sd_ctl_dir[] = {
- {
- .procname = "sched_domain",
- .mode = 0555,
- },
- {}
-};
-
-static struct ctl_table sd_ctl_root[] = {
- {
- .procname = "kernel",
- .mode = 0555,
- .child = sd_ctl_dir,
- },
- {}
-};
-
-static struct ctl_table *sd_alloc_ctl_entry(int n)
-{
- struct ctl_table *entry =
- kcalloc(n, sizeof(struct ctl_table), GFP_KERNEL);
-
- return entry;
-}
-
-static void sd_free_ctl_entry(struct ctl_table **tablep)
-{
- struct ctl_table *entry;
-
- /*
- * In the intermediate directories, both the child directory and
- * procname are dynamically allocated and could fail but the mode
- * will always be set. In the lowest directory the names are
- * static strings and all have proc handlers.
- */
- for (entry = *tablep; entry->mode; entry++) {
- if (entry->child)
- sd_free_ctl_entry(&entry->child);
- if (entry->proc_handler == NULL)
- kfree(entry->procname);
- }
-
- kfree(*tablep);
- *tablep = NULL;
-}
-
-static int min_load_idx = 0;
-static int max_load_idx = CPU_LOAD_IDX_MAX-1;
-
-static void
-set_table_entry(struct ctl_table *entry,
- const char *procname, void *data, int maxlen,
- umode_t mode, proc_handler *proc_handler,
- bool load_idx)
-{
- entry->procname = procname;
- entry->data = data;
- entry->maxlen = maxlen;
- entry->mode = mode;
- entry->proc_handler = proc_handler;
-
- if (load_idx) {
- entry->extra1 = &min_load_idx;
- entry->extra2 = &max_load_idx;
- }
-}
-
-static struct ctl_table *
-sd_alloc_ctl_domain_table(struct sched_domain *sd)
-{
- struct ctl_table *table = sd_alloc_ctl_entry(14);
-
- if (table == NULL)
- return NULL;
-
- set_table_entry(&table[0], "min_interval", &sd->min_interval,
- sizeof(long), 0644, proc_doulongvec_minmax, false);
- set_table_entry(&table[1], "max_interval", &sd->max_interval,
- sizeof(long), 0644, proc_doulongvec_minmax, false);
- set_table_entry(&table[2], "busy_idx", &sd->busy_idx,
- sizeof(int), 0644, proc_dointvec_minmax, true);
- set_table_entry(&table[3], "idle_idx", &sd->idle_idx,
- sizeof(int), 0644, proc_dointvec_minmax, true);
- set_table_entry(&table[4], "newidle_idx", &sd->newidle_idx,
- sizeof(int), 0644, proc_dointvec_minmax, true);
- set_table_entry(&table[5], "wake_idx", &sd->wake_idx,
- sizeof(int), 0644, proc_dointvec_minmax, true);
- set_table_entry(&table[6], "forkexec_idx", &sd->forkexec_idx,
- sizeof(int), 0644, proc_dointvec_minmax, true);
- set_table_entry(&table[7], "busy_factor", &sd->busy_factor,
- sizeof(int), 0644, proc_dointvec_minmax, false);
- set_table_entry(&table[8], "imbalance_pct", &sd->imbalance_pct,
- sizeof(int), 0644, proc_dointvec_minmax, false);
- set_table_entry(&table[9], "cache_nice_tries",
- &sd->cache_nice_tries,
- sizeof(int), 0644, proc_dointvec_minmax, false);
- set_table_entry(&table[10], "flags", &sd->flags,
- sizeof(int), 0644, proc_dointvec_minmax, false);
- set_table_entry(&table[11], "max_newidle_lb_cost",
- &sd->max_newidle_lb_cost,
- sizeof(long), 0644, proc_doulongvec_minmax, false);
- set_table_entry(&table[12], "name", sd->name,
- CORENAME_MAX_SIZE, 0444, proc_dostring, false);
- /* &table[13] is terminator */
-
- return table;
-}
-
-static struct ctl_table *sd_alloc_ctl_cpu_table(int cpu)
-{
- struct ctl_table *entry, *table;
- struct sched_domain *sd;
- int domain_num = 0, i;
- char buf[32];
-
- for_each_domain(cpu, sd)
- domain_num++;
- entry = table = sd_alloc_ctl_entry(domain_num + 1);
- if (table == NULL)
- return NULL;
-
- i = 0;
- for_each_domain(cpu, sd) {
- snprintf(buf, 32, "domain%d", i);
- entry->procname = kstrdup(buf, GFP_KERNEL);
- entry->mode = 0555;
- entry->child = sd_alloc_ctl_domain_table(sd);
- entry++;
- i++;
- }
- return table;
-}
-
-static struct ctl_table_header *sd_sysctl_header;
-static void register_sched_domain_sysctl(void)
-{
- int i, cpu_num = num_possible_cpus();
- struct ctl_table *entry = sd_alloc_ctl_entry(cpu_num + 1);
- char buf[32];
-
- WARN_ON(sd_ctl_dir[0].child);
- sd_ctl_dir[0].child = entry;
-
- if (entry == NULL)
- return;
-
- for_each_possible_cpu(i) {
- snprintf(buf, 32, "cpu%d", i);
- entry->procname = kstrdup(buf, GFP_KERNEL);
- entry->mode = 0555;
- entry->child = sd_alloc_ctl_cpu_table(i);
- entry++;
- }
-
- WARN_ON(sd_sysctl_header);
- sd_sysctl_header = register_sysctl_table(sd_ctl_root);
-}
-
-/* may be called multiple times per register */
-static void unregister_sched_domain_sysctl(void)
-{
- unregister_sysctl_table(sd_sysctl_header);
- sd_sysctl_header = NULL;
- if (sd_ctl_dir[0].child)
- sd_free_ctl_entry(&sd_ctl_dir[0].child);
-}
-#else
-static void register_sched_domain_sysctl(void)
-{
-}
-static void unregister_sched_domain_sysctl(void)
-{
-}
-#endif /* CONFIG_SCHED_DEBUG && CONFIG_SYSCTL */
-
static void set_rq_online(struct rq *rq)
{
if (!rq->online) {
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 593178be8124..a5625e793d15 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -189,6 +189,179 @@ static __init int sched_init_debug(void)
}
late_initcall(sched_init_debug);
+#ifdef CONFIG_SMP
+
+#ifdef CONFIG_SYSCTL
+
+static struct ctl_table sd_ctl_dir[] = {
+ {
+ .procname = "sched_domain",
+ .mode = 0555,
+ },
+ {}
+};
+
+static struct ctl_table sd_ctl_root[] = {
+ {
+ .procname = "kernel",
+ .mode = 0555,
+ .child = sd_ctl_dir,
+ },
+ {}
+};
+
+static struct ctl_table *sd_alloc_ctl_entry(int n)
+{
+ struct ctl_table *entry =
+ kcalloc(n, sizeof(struct ctl_table), GFP_KERNEL);
+
+ return entry;
+}
+
+static void sd_free_ctl_entry(struct ctl_table **tablep)
+{
+ struct ctl_table *entry;
+
+ /*
+ * In the intermediate directories, both the child directory and
+ * procname are dynamically allocated and could fail but the mode
+ * will always be set. In the lowest directory the names are
+ * static strings and all have proc handlers.
+ */
+ for (entry = *tablep; entry->mode; entry++) {
+ if (entry->child)
+ sd_free_ctl_entry(&entry->child);
+ if (entry->proc_handler == NULL)
+ kfree(entry->procname);
+ }
+
+ kfree(*tablep);
+ *tablep = NULL;
+}
+
+static int min_load_idx = 0;
+static int max_load_idx = CPU_LOAD_IDX_MAX-1;
+
+static void
+set_table_entry(struct ctl_table *entry,
+ const char *procname, void *data, int maxlen,
+ umode_t mode, proc_handler *proc_handler,
+ bool load_idx)
+{
+ entry->procname = procname;
+ entry->data = data;
+ entry->maxlen = maxlen;
+ entry->mode = mode;
+ entry->proc_handler = proc_handler;
+
+ if (load_idx) {
+ entry->extra1 = &min_load_idx;
+ entry->extra2 = &max_load_idx;
+ }
+}
+
+static struct ctl_table *
+sd_alloc_ctl_domain_table(struct sched_domain *sd)
+{
+ struct ctl_table *table = sd_alloc_ctl_entry(14);
+
+ if (table == NULL)
+ return NULL;
+
+ set_table_entry(&table[0], "min_interval", &sd->min_interval,
+ sizeof(long), 0644, proc_doulongvec_minmax, false);
+ set_table_entry(&table[1], "max_interval", &sd->max_interval,
+ sizeof(long), 0644, proc_doulongvec_minmax, false);
+ set_table_entry(&table[2], "busy_idx", &sd->busy_idx,
+ sizeof(int), 0644, proc_dointvec_minmax, true);
+ set_table_entry(&table[3], "idle_idx", &sd->idle_idx,
+ sizeof(int), 0644, proc_dointvec_minmax, true);
+ set_table_entry(&table[4], "newidle_idx", &sd->newidle_idx,
+ sizeof(int), 0644, proc_dointvec_minmax, true);
+ set_table_entry(&table[5], "wake_idx", &sd->wake_idx,
+ sizeof(int), 0644, proc_dointvec_minmax, true);
+ set_table_entry(&table[6], "forkexec_idx", &sd->forkexec_idx,
+ sizeof(int), 0644, proc_dointvec_minmax, true);
+ set_table_entry(&table[7], "busy_factor", &sd->busy_factor,
+ sizeof(int), 0644, proc_dointvec_minmax, false);
+ set_table_entry(&table[8], "imbalance_pct", &sd->imbalance_pct,
+ sizeof(int), 0644, proc_dointvec_minmax, false);
+ set_table_entry(&table[9], "cache_nice_tries",
+ &sd->cache_nice_tries,
+ sizeof(int), 0644, proc_dointvec_minmax, false);
+ set_table_entry(&table[10], "flags", &sd->flags,
+ sizeof(int), 0644, proc_dointvec_minmax, false);
+ set_table_entry(&table[11], "max_newidle_lb_cost",
+ &sd->max_newidle_lb_cost,
+ sizeof(long), 0644, proc_doulongvec_minmax, false);
+ set_table_entry(&table[12], "name", sd->name,
+ CORENAME_MAX_SIZE, 0444, proc_dostring, false);
+ /* &table[13] is terminator */
+
+ return table;
+}
+
+static struct ctl_table *sd_alloc_ctl_cpu_table(int cpu)
+{
+ struct ctl_table *entry, *table;
+ struct sched_domain *sd;
+ int domain_num = 0, i;
+ char buf[32];
+
+ for_each_domain(cpu, sd)
+ domain_num++;
+ entry = table = sd_alloc_ctl_entry(domain_num + 1);
+ if (table == NULL)
+ return NULL;
+
+ i = 0;
+ for_each_domain(cpu, sd) {
+ snprintf(buf, 32, "domain%d", i);
+ entry->procname = kstrdup(buf, GFP_KERNEL);
+ entry->mode = 0555;
+ entry->child = sd_alloc_ctl_domain_table(sd);
+ entry++;
+ i++;
+ }
+ return table;
+}
+
+static struct ctl_table_header *sd_sysctl_header;
+void register_sched_domain_sysctl(void)
+{
+ int i, cpu_num = num_possible_cpus();
+ struct ctl_table *entry = sd_alloc_ctl_entry(cpu_num + 1);
+ char buf[32];
+
+ WARN_ON(sd_ctl_dir[0].child);
+ sd_ctl_dir[0].child = entry;
+
+ if (entry == NULL)
+ return;
+
+ for_each_possible_cpu(i) {
+ snprintf(buf, 32, "cpu%d", i);
+ entry->procname = kstrdup(buf, GFP_KERNEL);
+ entry->mode = 0555;
+ entry->child = sd_alloc_ctl_cpu_table(i);
+ entry++;
+ }
+
+ WARN_ON(sd_sysctl_header);
+ sd_sysctl_header = register_sysctl_table(sd_ctl_root);
+}
+
+/* may be called multiple times per register */
+void unregister_sched_domain_sysctl(void)
+{
+ unregister_sysctl_table(sd_sysctl_header);
+ sd_sysctl_header = NULL;
+ if (sd_ctl_dir[0].child)
+ sd_free_ctl_entry(&sd_ctl_dir[0].child);
+}
+#endif /* CONFIG_SYSCTL */
+#endif /* CONFIG_SMP */
+
#ifdef CONFIG_FAIR_GROUP_SCHED
static void print_cfs_group_stats(struct seq_file *m, int cpu, struct task_group *tg)
{
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 10f16374df7f..b146f5298cb1 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3,6 +3,7 @@
#include <linux/sched/sysctl.h>
#include <linux/sched/rt.h>
#include <linux/sched/deadline.h>
+#include <linux/binfmts.h>
#include <linux/mutex.h>
#include <linux/spinlock.h>
#include <linux/stop_machine.h>
@@ -909,6 +910,18 @@ static inline unsigned int group_first_cpu(struct sched_group *group)
extern int group_balance_cpu(struct sched_group *sg);
+#if defined(CONFIG_SCHED_DEBUG) && defined(CONFIG_SYSCTL)
+void register_sched_domain_sysctl(void);
+void unregister_sched_domain_sysctl(void);
+#else
+static inline void register_sched_domain_sysctl(void)
+{
+}
+static inline void unregister_sched_domain_sysctl(void)
+{
+}
+#endif
+
#else
static inline void sched_ttwu_pending(void) { }
--
2.6.4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [RFC][PATCH 3/3 v2] sched: Add bandwidth ratio to /proc/sched_debug
2016-02-03 16:56 [RFC][PATCH 0/3 v2] sched: Display deadline bandwidth and other SCHED_DEBUG clean up Steven Rostedt
2016-02-03 16:56 ` [RFC][PATCH 1/3 v2] sched: Move sched_feature file setup into debug.c Steven Rostedt
2016-02-03 16:57 ` [RFC][PATCH 2/3 v2] sched: Move sched_domain_sysctl to debug.c Steven Rostedt
@ 2016-02-03 16:57 ` Steven Rostedt
2016-02-03 17:56 ` Juri Lelli
2016-02-10 15:13 ` [RFC][PATCH 0/3 v2] sched: Display deadline bandwidth and other SCHED_DEBUG clean up Steven Rostedt
3 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2016-02-03 16:57 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Peter Zijlstra, Thomas Gleixner, Juri Lelli,
Clark Williams, Andrew Morton
[-- Attachment #1: 0003-sched-Add-bandwidth-ratio-to-proc-sched_debug.patch --]
[-- Type: text/plain, Size: 4010 bytes --]
From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
Playing with SCHED_DEADLINE and cpusets, I found that I was unable to create
new SCHED_DEADLINE tasks, with the error of EBUSY as if the bandwidth was
already used up. I then realized there wa no way to see what bandwidth is
used by the runqueues to debug the issue.
By adding the dl_bw->bw and dl_bw->total_bw to the output of the deadline
info in /proc/sched_debug, this allows us to see what bandwidth has been
reserved and where a problem may exist.
For example, before the issue we see the ratio of the bandwidth:
# cat /proc/sys/kernel/sched_rt_runtime_us
950000
# cat /proc/sys/kernel/sched_rt_period_us
1000000
# grep dl /proc/sched_debug
dl_rq[0]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[1]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[2]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[3]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[4]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[5]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[6]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[7]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
Note: (950000 / 1000000) << 20 == 996147
After I played with cpusets and hit the issue, the result is now:
# grep dl /proc/sched_debug
dl_rq[0]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : -104857
dl_rq[1]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 104857
dl_rq[2]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 104857
dl_rq[3]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 104857
dl_rq[4]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : -104857
dl_rq[5]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : -104857
dl_rq[6]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : -104857
dl_rq[7]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : -104857
This shows that there is definitely a problem as we should never have a
negative total bandwidth.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
kernel/sched/debug.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index a5625e793d15..a1f161a7500e 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -562,8 +562,17 @@ void print_rt_rq(struct seq_file *m, int cpu, struct rt_rq *rt_rq)
void print_dl_rq(struct seq_file *m, int cpu, struct dl_rq *dl_rq)
{
+ struct dl_bw *dl_bw;
+
SEQ_printf(m, "\ndl_rq[%d]:\n", cpu);
SEQ_printf(m, " .%-30s: %ld\n", "dl_nr_running", dl_rq->dl_nr_running);
+#ifdef CONFIG_SMP
+ dl_bw = &cpu_rq(cpu)->rd->dl_bw;
+#else
+ dl_bw = &dl_rq->dl_bw;
+#endif
+ SEQ_printf(m, " .%-30s: %lld\n", "dl_bw->bw", dl_bw->bw);
+ SEQ_printf(m, " .%-30s: %lld\n", "dl_bw->total_bw", dl_bw->total_bw);
}
extern __read_mostly int sched_clock_running;
--
2.6.4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RFC][PATCH 3/3 v2] sched: Add bandwidth ratio to /proc/sched_debug
2016-02-03 16:57 ` [RFC][PATCH 3/3 v2] sched: Add bandwidth ratio to /proc/sched_debug Steven Rostedt
@ 2016-02-03 17:56 ` Juri Lelli
2016-02-03 18:10 ` Steven Rostedt
0 siblings, 1 reply; 7+ messages in thread
From: Juri Lelli @ 2016-02-03 17:56 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
Juri Lelli, Clark Williams, Andrew Morton
Hi,
On 03/02/16 11:57, Steven Rostedt wrote:
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
>
> Playing with SCHED_DEADLINE and cpusets, I found that I was unable to create
> new SCHED_DEADLINE tasks, with the error of EBUSY as if the bandwidth was
> already used up. I then realized there wa no way to see what bandwidth is
> used by the runqueues to debug the issue.
>
> By adding the dl_bw->bw and dl_bw->total_bw to the output of the deadline
> info in /proc/sched_debug, this allows us to see what bandwidth has been
> reserved and where a problem may exist.
>
> For example, before the issue we see the ratio of the bandwidth:
>
> # cat /proc/sys/kernel/sched_rt_runtime_us
> 950000
> # cat /proc/sys/kernel/sched_rt_period_us
> 1000000
>
> # grep dl /proc/sched_debug
> dl_rq[0]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 0
> dl_rq[1]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 0
> dl_rq[2]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 0
> dl_rq[3]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 0
> dl_rq[4]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 0
> dl_rq[5]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 0
> dl_rq[6]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 0
> dl_rq[7]:
> .dl_nr_running : 0
> .dl_bw->bw : 996147
> .dl_bw->total_bw : 0
>
I think this is already quite useful. But, do you think we can also
display information about the root_domain (like the span for example)?
Or do we have that some place else already?
Thanks,
- Juri
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC][PATCH 3/3 v2] sched: Add bandwidth ratio to /proc/sched_debug
2016-02-03 17:56 ` Juri Lelli
@ 2016-02-03 18:10 ` Steven Rostedt
0 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2016-02-03 18:10 UTC (permalink / raw)
To: Juri Lelli
Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
Juri Lelli, Clark Williams, Andrew Morton
On Wed, 3 Feb 2016 17:56:46 +0000
Juri Lelli <juri.lelli@arm.com> wrote:
> I think this is already quite useful. But, do you think we can also
> display information about the root_domain (like the span for example)?
> Or do we have that some place else already?
That's something I was planning on adding to that debugfs/sched
directory. But it may already be present. Perhaps it can be added to
the sched_debug file too if it isn't already there.
But that's a question for Peter.
-- Steve
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC][PATCH 0/3 v2] sched: Display deadline bandwidth and other SCHED_DEBUG clean up
2016-02-03 16:56 [RFC][PATCH 0/3 v2] sched: Display deadline bandwidth and other SCHED_DEBUG clean up Steven Rostedt
` (2 preceding siblings ...)
2016-02-03 16:57 ` [RFC][PATCH 3/3 v2] sched: Add bandwidth ratio to /proc/sched_debug Steven Rostedt
@ 2016-02-10 15:13 ` Steven Rostedt
3 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2016-02-10 15:13 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Peter Zijlstra, Thomas Gleixner, Juri Lelli,
Clark Williams, Andrew Morton
On Wed, 03 Feb 2016 11:56:58 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:
> I'm starting to play with SCHED_DEADLINE a bit and I'm able to cause
> a bandwidth "leak". Then I realized there's no way to examine what bandwidths
> are enabled on which CPUs. I added the bandwith ratios to the
> /proc/sched_debug file.
>
ping?
-- Steve
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2016-02-10 15:16 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-03 16:56 [RFC][PATCH 0/3 v2] sched: Display deadline bandwidth and other SCHED_DEBUG clean up Steven Rostedt
2016-02-03 16:56 ` [RFC][PATCH 1/3 v2] sched: Move sched_feature file setup into debug.c Steven Rostedt
2016-02-03 16:57 ` [RFC][PATCH 2/3 v2] sched: Move sched_domain_sysctl to debug.c Steven Rostedt
2016-02-03 16:57 ` [RFC][PATCH 3/3 v2] sched: Add bandwidth ratio to /proc/sched_debug Steven Rostedt
2016-02-03 17:56 ` Juri Lelli
2016-02-03 18:10 ` Steven Rostedt
2016-02-10 15:13 ` [RFC][PATCH 0/3 v2] sched: Display deadline bandwidth and other SCHED_DEBUG clean up Steven Rostedt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox