From: Bharata B Rao <bharata@amd.com>
To: <linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>
Cc: <mgorman@suse.de>, <peterz@infradead.org>, <mingo@redhat.com>,
<bp@alien8.de>, <dave.hansen@linux.intel.com>, <x86@kernel.org>,
<akpm@linux-foundation.org>, <luto@kernel.org>,
<tglx@linutronix.de>, <yue.li@memverge.com>,
<Ravikumar.Bangoria@amd.com>, Bharata B Rao <bharata@amd.com>
Subject: [RFC PATCH 3/5] x86/ibs: Enable per-process IBS from sched switch path
Date: Wed, 8 Feb 2023 13:05:31 +0530 [thread overview]
Message-ID: <20230208073533.715-4-bharata@amd.com> (raw)
In-Reply-To: <20230208073533.715-1-bharata@amd.com>
Program IBS for access profiling for threads from the
task sched switch path. IBS is programmed with a period
that corresponds to the incoming thread. Kernel threads are
excluded from this.
The sample period is currently kept at a fixed value of 10000.
Signed-off-by: Bharata B Rao <bharata@amd.com>
---
arch/x86/mm/ibs.c | 27 +++++++++++++++++++++++++++
include/linux/sched.h | 1 +
kernel/sched/core.c | 1 +
kernel/sched/fair.c | 1 +
kernel/sched/sched.h | 5 +++++
5 files changed, 35 insertions(+)
diff --git a/arch/x86/mm/ibs.c b/arch/x86/mm/ibs.c
index adbc587b1767..a479029e9262 100644
--- a/arch/x86/mm/ibs.c
+++ b/arch/x86/mm/ibs.c
@@ -8,6 +8,7 @@
#include <asm/perf_event.h> /* TODO: Move defns like IBS_OP_ENABLE into non-perf header */
#include <asm/apic.h>
+#define IBS_SAMPLE_PERIOD 10000
static u64 ibs_config __read_mostly;
struct ibs_access_work {
@@ -15,6 +16,31 @@ struct ibs_access_work {
u64 laddr, paddr;
};
+void hw_access_sched_in(struct task_struct *prev, struct task_struct *curr)
+{
+ u64 config = 0;
+ unsigned int period;
+
+ if (!static_branch_unlikely(&hw_access_hints))
+ return;
+
+ /* Disable IBS for kernel thread */
+ if (!curr->mm)
+ goto out;
+
+ if (curr->numa_sample_period)
+ period = curr->numa_sample_period;
+ else
+ period = IBS_SAMPLE_PERIOD;
+
+
+ config = (period >> 4) & IBS_OP_MAX_CNT;
+ config |= (period & IBS_OP_MAX_CNT_EXT_MASK);
+ config |= ibs_config;
+out:
+ wrmsrl(MSR_AMD64_IBSOPCTL, config);
+}
+
void task_ibs_access_work(struct callback_head *work)
{
struct ibs_access_work *iwork = container_of(work, struct ibs_access_work, work);
@@ -198,6 +224,7 @@ int __init ibs_access_profiling_init(void)
x86_amd_ibs_access_profile_startup,
x86_amd_ibs_access_profile_teardown);
+ static_branch_enable(&hw_access_hints);
pr_info("IBS access profiling setup for NUMA Balancing\n");
return 0;
}
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 19dd4ee07436..66c532418d38 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1254,6 +1254,7 @@ struct task_struct {
int numa_scan_seq;
unsigned int numa_scan_period;
unsigned int numa_scan_period_max;
+ unsigned int numa_sample_period;
int numa_preferred_nid;
unsigned long numa_migrate_retry;
/* Migration stamp: */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e838feb6adc5..1c13fed8bebc 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5165,6 +5165,7 @@ static struct rq *finish_task_switch(struct task_struct *prev)
prev_state = READ_ONCE(prev->__state);
vtime_task_switch(prev);
perf_event_task_sched_in(prev, current);
+ hw_access_sched_in(prev, current);
finish_task(prev);
tick_nohz_task_switch();
finish_lock_switch(rq);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c9b9e62da779..3f617c799821 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3094,6 +3094,7 @@ void init_numa_balancing(unsigned long clone_flags, struct task_struct *p)
p->node_stamp = 0;
p->numa_scan_seq = mm ? mm->numa_scan_seq : 0;
p->numa_scan_period = sysctl_numa_balancing_scan_delay;
+ p->numa_sample_period = 0;
p->numa_migrate_retry = 0;
/* Protect against double add, see task_tick_numa and task_numa_work */
p->numa_work.next = &p->numa_work;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 771f8ddb7053..953d16c802d6 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1723,11 +1723,16 @@ extern int migrate_task_to(struct task_struct *p, int cpu);
extern int migrate_swap(struct task_struct *p, struct task_struct *t,
int cpu, int scpu);
extern void init_numa_balancing(unsigned long clone_flags, struct task_struct *p);
+void hw_access_sched_in(struct task_struct *prev, struct task_struct *curr);
#else
static inline void
init_numa_balancing(unsigned long clone_flags, struct task_struct *p)
{
}
+static inline void hw_access_sched_in(struct task_struct *prev,
+ struct task_struct *curr)
+{
+}
#endif /* CONFIG_NUMA_BALANCING */
#ifdef CONFIG_SMP
--
2.25.1
next prev parent reply other threads:[~2023-02-08 7:36 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-08 7:35 [RFC PATCH 0/5] Memory access profiler(IBS) driven NUMA balancing Bharata B Rao
2023-02-08 7:35 ` [RFC PATCH 1/5] x86/ibs: In-kernel IBS driver for page access profiling Bharata B Rao
2023-02-08 7:35 ` [RFC PATCH 2/5] x86/ibs: Drive NUMA balancing via IBS access data Bharata B Rao
2023-02-08 7:35 ` Bharata B Rao [this message]
2023-02-08 7:35 ` [RFC PATCH 4/5] x86/ibs: Adjust access faults sampling period Bharata B Rao
2023-02-08 7:35 ` [RFC PATCH 5/5] x86/ibs: Delay the collection of HW-provided access info Bharata B Rao
2023-02-08 18:03 ` [RFC PATCH 0/5] Memory access profiler(IBS) driven NUMA balancing Peter Zijlstra
2023-02-08 18:12 ` Dave Hansen
2023-02-09 6:04 ` Bharata B Rao
2023-02-09 14:28 ` Dave Hansen
2023-02-10 4:28 ` Bharata B Rao
2023-02-10 4:40 ` Dave Hansen
2023-02-10 15:10 ` Bharata B Rao
2023-02-09 5:57 ` Bharata B Rao
2023-02-13 2:56 ` Huang, Ying
2023-02-13 3:23 ` Bharata B Rao
2023-02-13 3:34 ` Huang, Ying
2023-02-13 3:26 ` Huang, Ying
2023-02-13 5:52 ` Bharata B Rao
2023-02-13 6:30 ` Huang, Ying
2023-02-14 4:55 ` Bharata B Rao
2023-02-15 6:07 ` Huang, Ying
2023-02-24 3:28 ` Bharata B Rao
2023-02-16 8:41 ` Bharata B Rao
2023-02-17 6:03 ` Huang, Ying
2023-02-24 3:36 ` Bharata B Rao
2023-02-27 7:54 ` Huang, Ying
2023-03-01 11:21 ` Bharata B Rao
2023-03-02 8:10 ` Huang, Ying
2023-03-03 5:25 ` Bharata B Rao
2023-03-03 5:53 ` Huang, Ying
2023-03-06 15:30 ` Bharata B Rao
2023-03-07 2:33 ` Huang, Ying
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230208073533.715-4-bharata@amd.com \
--to=bharata@amd.com \
--cc=Ravikumar.Bangoria@amd.com \
--cc=akpm@linux-foundation.org \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=luto@kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
--cc=yue.li@memverge.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.