* [RFC PATCH 0/6] Provide vcpu dispatch statistics
@ 2019-05-06 9:43 Naveen N. Rao
2019-05-06 9:43 ` [RFC PATCH 1/6] powerpc/pseries: Use macros for referring to the DTL enable mask Naveen N. Rao
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: Naveen N. Rao @ 2019-05-06 9:43 UTC (permalink / raw)
To: Michael Ellerman, Nathan Lynch, Mingming Cao, Aneesh Kumar K.V
Cc: linuxppc-dev
This series adds a new procfs file, /proc/powerpc/vcpudispatch_stats,
that provides statistics on how the LPAR processors (vcpus) are
dispatched by the POWER Hypervisor in a shared-processor LPAR
environment. Patch 6/6 has more details on how the statistics are
gathered.
An example output:
$ sudo cat /proc/powerpc/vcpudispatch_stats
cpu0 6839 4126 0 2683 30 0 6821 18 0
cpu1 2515 1274 0 1229 12 0 2509 6 0
cpu2 2317 1198 0 1109 10 0 2312 5 0
cpu3 2259 1165 0 1088 6 0 2256 3 0
cpu4 2205 1143 0 1056 6 0 2202 3 0
cpu5 2165 1121 0 1038 6 0 2162 3 0
cpu6 2183 1127 0 1050 6 0 2180 3 0
cpu7 2193 1133 0 1052 8 0 2187 6 0
cpu8 2165 1115 0 1032 18 0 2156 9 0
cpu9 2301 1252 0 1033 16 0 2293 8 0
cpu10 2197 1138 0 1041 18 0 2187 10 0
cpu11 2273 1185 0 1062 26 0 2260 13 0
cpu12 2186 1125 0 1043 18 0 2177 9 0
cpu13 2161 1115 0 1030 16 0 2153 8 0
cpu14 2206 1153 0 1033 20 0 2196 10 0
cpu15 2163 1115 0 1032 16 0 2155 8 0
In the output above, for cpu0, we see that there have been 6839 vcpu
dispatches since statistics were enabled. The next 5 numbers represent
dispatch dispersions without taking the vcpu home node into account:
- 4126 dispatches were on the same physical cpu as the previous dispatch,
- 0 were on a different thread of the same core,
- 2683 were on a different core, but within the same chip,
- 30 were on a different chip compared to the last dispatch,
- and 0 were on a distant chip.
The next 3 numbers represent statistics in relation to the vcpu home
node: 6821 dispatches were on the vcpu home node, 18 were on a
different chip outside the home node, and 0 were on a distant chip or
drawer.
TODO:
- Consider the need for adding cond_resched() in some places.
- More testing, especially on larger machines.
- Naveen
Naveen N. Rao (6):
powerpc/pseries: Use macros for referring to the DTL enable mask
powerpc/pseries: Do not save the previous DTL mask value
powerpc/pseries: Factor out DTL buffer allocation and registration
routines
powerpc/pseries: Generalize hcall_vphn()
powerpc/pseries: Introduce helpers to gatekeep DTLB usage
powerpc/pseries: Provide vcpu dispatch statistics
arch/powerpc/include/asm/lppaca.h | 11 +
arch/powerpc/include/asm/plpar_wrappers.h | 4 +
arch/powerpc/include/asm/topology.h | 4 +
arch/powerpc/mm/book3s64/vphn.h | 8 +
arch/powerpc/mm/numa.c | 139 +++++-
arch/powerpc/platforms/pseries/dtl.c | 22 +-
arch/powerpc/platforms/pseries/lpar.c | 543 +++++++++++++++++++++-
arch/powerpc/platforms/pseries/setup.c | 34 +-
8 files changed, 690 insertions(+), 75 deletions(-)
--
2.21.0
^ permalink raw reply [flat|nested] 7+ messages in thread
* [RFC PATCH 1/6] powerpc/pseries: Use macros for referring to the DTL enable mask
2019-05-06 9:43 [RFC PATCH 0/6] Provide vcpu dispatch statistics Naveen N. Rao
@ 2019-05-06 9:43 ` Naveen N. Rao
2019-05-06 9:43 ` [RFC PATCH 2/6] powerpc/pseries: Do not save the previous DTL mask value Naveen N. Rao
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Naveen N. Rao @ 2019-05-06 9:43 UTC (permalink / raw)
To: Michael Ellerman, Nathan Lynch, Mingming Cao, Aneesh Kumar K.V
Cc: linuxppc-dev
Introduce macros to encode the DTL enable mask fields and use those
instead of hardcoding numbers.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/lppaca.h | 11 +++++++++++
arch/powerpc/platforms/pseries/dtl.c | 8 +-------
arch/powerpc/platforms/pseries/lpar.c | 2 +-
arch/powerpc/platforms/pseries/setup.c | 2 +-
4 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/include/asm/lppaca.h b/arch/powerpc/include/asm/lppaca.h
index 7c23ce8a5a4c..2c7e31187726 100644
--- a/arch/powerpc/include/asm/lppaca.h
+++ b/arch/powerpc/include/asm/lppaca.h
@@ -154,6 +154,17 @@ struct dtl_entry {
#define DISPATCH_LOG_BYTES 4096 /* bytes per cpu */
#define N_DISPATCH_LOG (DISPATCH_LOG_BYTES / sizeof(struct dtl_entry))
+/*
+ * Dispatch trace log event enable mask:
+ * 0x1: voluntary virtual processor waits
+ * 0x2: time-slice preempts
+ * 0x4: virtual partition memory page faults
+ */
+#define DTL_LOG_CEDE 0x1
+#define DTL_LOG_PREEMPT 0x2
+#define DTL_LOG_FAULT 0x4
+#define DTL_LOG_ALL (DTL_LOG_CEDE | DTL_LOG_PREEMPT | DTL_LOG_FAULT)
+
extern struct kmem_cache *dtl_cache;
/*
diff --git a/arch/powerpc/platforms/pseries/dtl.c b/arch/powerpc/platforms/pseries/dtl.c
index ef6595153642..051ea2de1e1a 100644
--- a/arch/powerpc/platforms/pseries/dtl.c
+++ b/arch/powerpc/platforms/pseries/dtl.c
@@ -40,13 +40,7 @@ struct dtl {
};
static DEFINE_PER_CPU(struct dtl, cpu_dtl);
-/*
- * Dispatch trace log event mask:
- * 0x7: 0x1: voluntary virtual processor waits
- * 0x2: time-slice preempts
- * 0x4: virtual partition memory page faults
- */
-static u8 dtl_event_mask = 0x7;
+static u8 dtl_event_mask = DTL_LOG_ALL;
/*
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 1034ef1fe2b4..23f2ac6793b7 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -126,7 +126,7 @@ void vpa_init(int cpu)
pr_err("WARNING: DTL registration of cpu %d (hw %d) "
"failed with %ld\n", smp_processor_id(),
hwcpu, ret);
- lppaca_of(cpu).dtl_enable_mask = 2;
+ lppaca_of(cpu).dtl_enable_mask = DTL_LOG_PREEMPT;
}
}
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index e4f0dfd4ae33..fabaefff8399 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -316,7 +316,7 @@ static int alloc_dispatch_logs(void)
pr_err("WARNING: DTL registration of cpu %d (hw %d) failed "
"with %d\n", smp_processor_id(),
hard_smp_processor_id(), ret);
- get_paca()->lppaca_ptr->dtl_enable_mask = 2;
+ get_paca()->lppaca_ptr->dtl_enable_mask = DTL_LOG_PREEMPT;
return 0;
}
--
2.21.0
* [RFC PATCH 2/6] powerpc/pseries: Do not save the previous DTL mask value
2019-05-06 9:43 [RFC PATCH 0/6] Provide vcpu dispatch statistics Naveen N. Rao
2019-05-06 9:43 ` [RFC PATCH 1/6] powerpc/pseries: Use macros for referring to the DTL enable mask Naveen N. Rao
@ 2019-05-06 9:43 ` Naveen N. Rao
2019-05-06 9:43 ` [RFC PATCH 3/6] powerpc/pseries: Factor out DTL buffer allocation and registration routines Naveen N. Rao
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Naveen N. Rao @ 2019-05-06 9:43 UTC (permalink / raw)
To: Michael Ellerman, Nathan Lynch, Mingming Cao, Aneesh Kumar K.V
Cc: linuxppc-dev
When CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is enabled, we always initialize
the DTL enable mask to DTL_LOG_PREEMPT (0x2). There are no other places
where the mask is changed. As such, when reading the DTL log buffer
through debugfs, there is no need to save and restore the previous mask
value: it can simply be reset to DTL_LOG_PREEMPT when tracing stops.
The same holds when CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not enabled,
so remove the saved mask field from the structure as well.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
arch/powerpc/platforms/pseries/dtl.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/dtl.c b/arch/powerpc/platforms/pseries/dtl.c
index 051ea2de1e1a..fb05804adb2f 100644
--- a/arch/powerpc/platforms/pseries/dtl.c
+++ b/arch/powerpc/platforms/pseries/dtl.c
@@ -55,7 +55,6 @@ struct dtl_ring {
struct dtl_entry *write_ptr;
struct dtl_entry *buf;
struct dtl_entry *buf_end;
- u8 saved_dtl_mask;
};
static DEFINE_PER_CPU(struct dtl_ring, dtl_rings);
@@ -105,7 +104,6 @@ static int dtl_start(struct dtl *dtl)
dtlr->write_ptr = dtl->buf;
/* enable event logging */
- dtlr->saved_dtl_mask = lppaca_of(dtl->cpu).dtl_enable_mask;
lppaca_of(dtl->cpu).dtl_enable_mask |= dtl_event_mask;
dtl_consumer = consume_dtle;
@@ -123,7 +121,7 @@ static void dtl_stop(struct dtl *dtl)
dtlr->buf = NULL;
/* restore dtl_enable_mask */
- lppaca_of(dtl->cpu).dtl_enable_mask = dtlr->saved_dtl_mask;
+ lppaca_of(dtl->cpu).dtl_enable_mask = DTL_LOG_PREEMPT;
if (atomic_dec_and_test(&dtl_count))
dtl_consumer = NULL;
--
2.21.0
* [RFC PATCH 3/6] powerpc/pseries: Factor out DTL buffer allocation and registration routines
2019-05-06 9:43 [RFC PATCH 0/6] Provide vcpu dispatch statistics Naveen N. Rao
2019-05-06 9:43 ` [RFC PATCH 1/6] powerpc/pseries: Use macros for referring to the DTL enable mask Naveen N. Rao
2019-05-06 9:43 ` [RFC PATCH 2/6] powerpc/pseries: Do not save the previous DTL mask value Naveen N. Rao
@ 2019-05-06 9:43 ` Naveen N. Rao
2019-05-06 9:43 ` [RFC PATCH 4/6] powerpc/pseries: Generalize hcall_vphn() Naveen N. Rao
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Naveen N. Rao @ 2019-05-06 9:43 UTC (permalink / raw)
To: Michael Ellerman, Nathan Lynch, Mingming Cao, Aneesh Kumar K.V
Cc: linuxppc-dev
Introduce new helpers for DTL buffer allocation and registration and
have the existing code use those.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/plpar_wrappers.h | 2 +
arch/powerpc/platforms/pseries/lpar.c | 66 ++++++++++++++++-------
arch/powerpc/platforms/pseries/setup.c | 34 +-----------
3 files changed, 52 insertions(+), 50 deletions(-)
diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
index cff5a411e595..d08feb1bc2bd 100644
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -88,6 +88,8 @@ static inline long register_dtl(unsigned long cpu, unsigned long vpa)
return vpa_call(H_VPA_REG_DTL, cpu, vpa);
}
+extern void register_dtl_buffer(int cpu);
+extern void alloc_dtl_buffers(void);
extern void vpa_init(int cpu);
static inline long plpar_pte_enter(unsigned long flags,
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 23f2ac6793b7..3375ca8cefb5 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -65,13 +65,58 @@ EXPORT_SYMBOL(plpar_hcall);
EXPORT_SYMBOL(plpar_hcall9);
EXPORT_SYMBOL(plpar_hcall_norets);
+void alloc_dtl_buffers(void)
+{
+ int cpu;
+ struct paca_struct *pp;
+ struct dtl_entry *dtl;
+
+ for_each_possible_cpu(cpu) {
+ pp = paca_ptrs[cpu];
+ dtl = kmem_cache_alloc(dtl_cache, GFP_KERNEL);
+ if (!dtl) {
+ pr_warn("Failed to allocate dispatch trace log for cpu %d\n",
+ cpu);
+ pr_warn("Stolen time statistics will be unreliable\n");
+ break;
+ }
+
+ pp->dtl_ridx = 0;
+ pp->dispatch_log = dtl;
+ pp->dispatch_log_end = dtl + N_DISPATCH_LOG;
+ pp->dtl_curr = dtl;
+ }
+}
+
+void register_dtl_buffer(int cpu)
+{
+ long ret;
+ struct paca_struct *pp;
+ struct dtl_entry *dtl;
+ int hwcpu = get_hard_smp_processor_id(cpu);
+
+ pp = paca_ptrs[cpu];
+ dtl = pp->dispatch_log;
+ if (dtl) {
+ pp->dtl_ridx = 0;
+ pp->dtl_curr = dtl;
+ lppaca_of(cpu).dtl_idx = 0;
+
+ /* hypervisor reads buffer length from this field */
+ dtl->enqueue_to_dispatch_time = cpu_to_be32(DISPATCH_LOG_BYTES);
+ ret = register_dtl(hwcpu, __pa(dtl));
+ if (ret)
+ pr_err("WARNING: DTL registration of cpu %d (hw %d) "
+ "failed with %ld\n", cpu, hwcpu, ret);
+ lppaca_of(cpu).dtl_enable_mask = DTL_LOG_PREEMPT;
+ }
+}
+
void vpa_init(int cpu)
{
int hwcpu = get_hard_smp_processor_id(cpu);
unsigned long addr;
long ret;
- struct paca_struct *pp;
- struct dtl_entry *dtl;
/*
* The spec says it "may be problematic" if CPU x registers the VPA of
@@ -112,22 +157,7 @@ void vpa_init(int cpu)
/*
* Register dispatch trace log, if one has been allocated.
*/
- pp = paca_ptrs[cpu];
- dtl = pp->dispatch_log;
- if (dtl) {
- pp->dtl_ridx = 0;
- pp->dtl_curr = dtl;
- lppaca_of(cpu).dtl_idx = 0;
-
- /* hypervisor reads buffer length from this field */
- dtl->enqueue_to_dispatch_time = cpu_to_be32(DISPATCH_LOG_BYTES);
- ret = register_dtl(hwcpu, __pa(dtl));
- if (ret)
- pr_err("WARNING: DTL registration of cpu %d (hw %d) "
- "failed with %ld\n", smp_processor_id(),
- hwcpu, ret);
- lppaca_of(cpu).dtl_enable_mask = DTL_LOG_PREEMPT;
- }
+ register_dtl_buffer(cpu);
}
#ifdef CONFIG_PPC_BOOK3S_64
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index fabaefff8399..b6995e5cc5c9 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -277,46 +277,16 @@ struct kmem_cache *dtl_cache;
*/
static int alloc_dispatch_logs(void)
{
- int cpu, ret;
- struct paca_struct *pp;
- struct dtl_entry *dtl;
-
if (!firmware_has_feature(FW_FEATURE_SPLPAR))
return 0;
if (!dtl_cache)
return 0;
- for_each_possible_cpu(cpu) {
- pp = paca_ptrs[cpu];
- dtl = kmem_cache_alloc(dtl_cache, GFP_KERNEL);
- if (!dtl) {
- pr_warn("Failed to allocate dispatch trace log for cpu %d\n",
- cpu);
- pr_warn("Stolen time statistics will be unreliable\n");
- break;
- }
-
- pp->dtl_ridx = 0;
- pp->dispatch_log = dtl;
- pp->dispatch_log_end = dtl + N_DISPATCH_LOG;
- pp->dtl_curr = dtl;
- }
+ alloc_dtl_buffers();
/* Register the DTL for the current (boot) cpu */
- dtl = get_paca()->dispatch_log;
- get_paca()->dtl_ridx = 0;
- get_paca()->dtl_curr = dtl;
- get_paca()->lppaca_ptr->dtl_idx = 0;
-
- /* hypervisor reads buffer length from this field */
- dtl->enqueue_to_dispatch_time = cpu_to_be32(DISPATCH_LOG_BYTES);
- ret = register_dtl(hard_smp_processor_id(), __pa(dtl));
- if (ret)
- pr_err("WARNING: DTL registration of cpu %d (hw %d) failed "
- "with %d\n", smp_processor_id(),
- hard_smp_processor_id(), ret);
- get_paca()->lppaca_ptr->dtl_enable_mask = DTL_LOG_PREEMPT;
+ register_dtl_buffer(smp_processor_id());
return 0;
}
--
2.21.0
* [RFC PATCH 4/6] powerpc/pseries: Generalize hcall_vphn()
2019-05-06 9:43 [RFC PATCH 0/6] Provide vcpu dispatch statistics Naveen N. Rao
` (2 preceding siblings ...)
2019-05-06 9:43 ` [RFC PATCH 3/6] powerpc/pseries: Factor out DTL buffer allocation and registration routines Naveen N. Rao
@ 2019-05-06 9:43 ` Naveen N. Rao
2019-05-06 9:43 ` [RFC PATCH 5/6] powerpc/pseries: Introduce helpers to gatekeep DTLB usage Naveen N. Rao
2019-05-06 9:43 ` [RFC PATCH 6/6] powerpc/pseries: Provide vcpu dispatch statistics Naveen N. Rao
5 siblings, 0 replies; 7+ messages in thread
From: Naveen N. Rao @ 2019-05-06 9:43 UTC (permalink / raw)
To: Michael Ellerman, Nathan Lynch, Mingming Cao, Aneesh Kumar K.V
Cc: linuxppc-dev
The H_HOME_NODE_ASSOCIATIVITY hcall can take two different flag values
and returns different associativity information in each case.
Generalize the existing hcall_vphn() function to take the flags as an
argument and to return the result. Update the only existing user to
pass the proper arguments.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
arch/powerpc/mm/book3s64/vphn.h | 8 ++++++++
arch/powerpc/mm/numa.c | 27 +++++++++++++--------------
2 files changed, 21 insertions(+), 14 deletions(-)
diff --git a/arch/powerpc/mm/book3s64/vphn.h b/arch/powerpc/mm/book3s64/vphn.h
index f0b93c2dd578..f7ff1e0c3801 100644
--- a/arch/powerpc/mm/book3s64/vphn.h
+++ b/arch/powerpc/mm/book3s64/vphn.h
@@ -11,6 +11,14 @@
*/
#define VPHN_ASSOC_BUFSIZE (VPHN_REGISTER_COUNT*sizeof(u64)/sizeof(u16) + 1)
+/*
+ * The H_HOME_NODE_ASSOCIATIVITY hcall takes two values for flags:
+ * 1 for retrieving associativity information for a guest cpu
+ * 2 for retrieving associativity information for a host/hypervisor cpu
+ */
+#define VPHN_FLAG_VCPU 1
+#define VPHN_FLAG_PCPU 2
+
extern int vphn_unpack_associativity(const long *packed, __be32 *unpacked);
#endif
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 57e64273cb33..57f006b6214b 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1087,6 +1087,17 @@ static void reset_topology_timer(void);
static int topology_timer_secs = 1;
static int topology_inited;
+static long hcall_vphn(unsigned long cpu, u64 flags, __be32 *associativity)
+{
+ long rc;
+ long retbuf[PLPAR_HCALL9_BUFSIZE] = {0};
+
+ rc = plpar_hcall9(H_HOME_NODE_ASSOCIATIVITY, retbuf, flags, cpu);
+ vphn_unpack_associativity(retbuf, associativity);
+
+ return rc;
+}
+
/*
* Change polling interval for associativity changes.
*/
@@ -1165,25 +1176,13 @@ static int update_cpu_associativity_changes_mask(void)
* Retrieve the new associativity information for a virtual processor's
* home node.
*/
-static long hcall_vphn(unsigned long cpu, __be32 *associativity)
-{
- long rc;
- long retbuf[PLPAR_HCALL9_BUFSIZE] = {0};
- u64 flags = 1;
- int hwcpu = get_hard_smp_processor_id(cpu);
-
- rc = plpar_hcall9(H_HOME_NODE_ASSOCIATIVITY, retbuf, flags, hwcpu);
- vphn_unpack_associativity(retbuf, associativity);
-
- return rc;
-}
-
static long vphn_get_associativity(unsigned long cpu,
__be32 *associativity)
{
long rc;
- rc = hcall_vphn(cpu, associativity);
+ rc = hcall_vphn(get_hard_smp_processor_id(cpu),
+ VPHN_FLAG_VCPU, associativity);
switch (rc) {
case H_FUNCTION:
--
2.21.0
* [RFC PATCH 5/6] powerpc/pseries: Introduce helpers to gatekeep DTLB usage
2019-05-06 9:43 [RFC PATCH 0/6] Provide vcpu dispatch statistics Naveen N. Rao
` (3 preceding siblings ...)
2019-05-06 9:43 ` [RFC PATCH 4/6] powerpc/pseries: Generalize hcall_vphn() Naveen N. Rao
@ 2019-05-06 9:43 ` Naveen N. Rao
2019-05-06 9:43 ` [RFC PATCH 6/6] powerpc/pseries: Provide vcpu dispatch statistics Naveen N. Rao
5 siblings, 0 replies; 7+ messages in thread
From: Naveen N. Rao @ 2019-05-06 9:43 UTC (permalink / raw)
To: Michael Ellerman, Nathan Lynch, Mingming Cao, Aneesh Kumar K.V
Cc: linuxppc-dev
Since we will be introducing a new user of the DTL buffer in a
subsequent patch, add helpers to gatekeep use of the DTL buffer. The
current usage of the DTL buffer from debugfs is at a per-cpu level
(corresponding to the cpu debugfs file that is opened). Subsequently, we
will have users enabling/accessing the DTL buffer for all online cpus.
These helpers allow any number of per-cpu users, or a single global user
exclusively.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/plpar_wrappers.h | 2 ++
arch/powerpc/platforms/pseries/dtl.c | 10 ++++++-
arch/powerpc/platforms/pseries/lpar.c | 36 +++++++++++++++++++++++
3 files changed, 47 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
index d08feb1bc2bd..ab7dd454b6eb 100644
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -88,6 +88,8 @@ static inline long register_dtl(unsigned long cpu, unsigned long vpa)
return vpa_call(H_VPA_REG_DTL, cpu, vpa);
}
+extern bool register_dtl_buffer_access(bool global);
+extern void unregister_dtl_buffer_access(bool global);
extern void register_dtl_buffer(int cpu);
extern void alloc_dtl_buffers(void);
extern void vpa_init(int cpu);
diff --git a/arch/powerpc/platforms/pseries/dtl.c b/arch/powerpc/platforms/pseries/dtl.c
index fb05804adb2f..dd28296c9903 100644
--- a/arch/powerpc/platforms/pseries/dtl.c
+++ b/arch/powerpc/platforms/pseries/dtl.c
@@ -193,11 +193,15 @@ static int dtl_enable(struct dtl *dtl)
if (dtl->buf)
return -EBUSY;
+ if (register_dtl_buffer_access(false))
+ return -EBUSY;
+
n_entries = dtl_buf_entries;
buf = kmem_cache_alloc_node(dtl_cache, GFP_KERNEL, cpu_to_node(dtl->cpu));
if (!buf) {
printk(KERN_WARNING "%s: buffer alloc failed for cpu %d\n",
__func__, dtl->cpu);
+ unregister_dtl_buffer_access(false);
return -ENOMEM;
}
@@ -214,8 +218,11 @@ static int dtl_enable(struct dtl *dtl)
}
spin_unlock(&dtl->lock);
- if (rc)
+ if (rc) {
+ unregister_dtl_buffer_access(false);
kmem_cache_free(dtl_cache, buf);
+ }
+
return rc;
}
@@ -227,6 +234,7 @@ static void dtl_disable(struct dtl *dtl)
dtl->buf = NULL;
dtl->buf_entries = 0;
spin_unlock(&dtl->lock);
+ unregister_dtl_buffer_access(false);
}
/* file interface */
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 3375ca8cefb5..6af5a2a11deb 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -65,6 +65,42 @@ EXPORT_SYMBOL(plpar_hcall);
EXPORT_SYMBOL(plpar_hcall9);
EXPORT_SYMBOL(plpar_hcall_norets);
+static DEFINE_SPINLOCK(dtl_buffer_refctr_lock);
+static unsigned int dtl_buffer_global_refctr, dtl_buffer_percpu_refctr;
+
+bool register_dtl_buffer_access(bool global)
+{
+ int rc = 0;
+
+ spin_lock(&dtl_buffer_refctr_lock);
+
+ if ((global && (dtl_buffer_global_refctr || dtl_buffer_percpu_refctr))
+ || (!global && dtl_buffer_global_refctr)) {
+ rc = -1;
+ } else {
+ if (global)
+ dtl_buffer_global_refctr++;
+ else
+ dtl_buffer_percpu_refctr++;
+ }
+
+ spin_unlock(&dtl_buffer_refctr_lock);
+
+ return rc;
+}
+
+void unregister_dtl_buffer_access(bool global)
+{
+ spin_lock(&dtl_buffer_refctr_lock);
+
+ if (global)
+ dtl_buffer_global_refctr--;
+ else
+ dtl_buffer_percpu_refctr--;
+
+ spin_unlock(&dtl_buffer_refctr_lock);
+}
+
void alloc_dtl_buffers(void)
{
int cpu;
--
2.21.0
* [RFC PATCH 6/6] powerpc/pseries: Provide vcpu dispatch statistics
2019-05-06 9:43 [RFC PATCH 0/6] Provide vcpu dispatch statistics Naveen N. Rao
` (4 preceding siblings ...)
2019-05-06 9:43 ` [RFC PATCH 5/6] powerpc/pseries: Introduce helpers to gatekeep DTLB usage Naveen N. Rao
@ 2019-05-06 9:43 ` Naveen N. Rao
5 siblings, 0 replies; 7+ messages in thread
From: Naveen N. Rao @ 2019-05-06 9:43 UTC (permalink / raw)
To: Michael Ellerman, Nathan Lynch, Mingming Cao, Aneesh Kumar K.V
Cc: linuxppc-dev
For Shared Processor LPARs, the POWER Hypervisor maintains a relatively
static mapping of the LPAR processors (vcpus) to physical processor
chips (representing the "home" node) and tries to always dispatch vcpus
on their associated physical processor chip. However, under certain
scenarios, vcpus may be dispatched on a different processor chip (away
from its home node). The actual physical processor number on which a
certain vcpu is dispatched is available to the guest in the
'processor_id' field of each DTL entry.
The guest can discover the home node of each vcpu through the
H_HOME_NODE_ASSOCIATIVITY(flags=1) hcall. The guest can also discover
the associativity of physical processors, as represented in the DTL
entry, through the H_HOME_NODE_ASSOCIATIVITY(flags=2) hcall.
These can then be compared to determine if the vcpu was dispatched on
its home node or not. If the vcpu was not dispatched on the home node,
it is possible to determine if the vcpu was dispatched in a different
chip, socket or drawer.
Introduce a procfs file /proc/powerpc/vcpudispatch_stats that can be
used to obtain these statistics. Writing '1' to this file enables
collecting the statistics, while writing '0' disables them. The
statistics themselves are available by reading the procfs file. By
default, the DTL buffer of each vcpu is processed 50 times a second so
as not to miss any entries. This processing frequency can be changed
through /proc/powerpc/vcpudispatch_stats_freq.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/topology.h | 4 +
arch/powerpc/mm/numa.c | 112 +++++++
arch/powerpc/platforms/pseries/lpar.c | 445 +++++++++++++++++++++++++-
3 files changed, 559 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index f85e2b01c3df..7c064731a0f2 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -93,6 +93,10 @@ extern int prrn_is_enabled(void);
extern int find_and_online_cpu_nid(int cpu);
extern int timed_topology_update(int nsecs);
extern void __init shared_proc_topology_init(void);
+extern int init_cpu_associativity(void);
+extern void destroy_cpu_associativity(void);
+extern int cpu_relative_dispatch_distance(int last_disp_cpu, int cur_disp_cpu);
+extern int cpu_home_node_dispatch_distance(int disp_cpu);
#else
static inline int start_topology_update(void)
{
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 57f006b6214b..b673f206fe43 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1086,6 +1086,17 @@ static int prrn_enabled;
static void reset_topology_timer(void);
static int topology_timer_secs = 1;
static int topology_inited;
+static __be32 *vcpu_associativity, *pcpu_associativity;
+static DEFINE_MUTEX(vphn_hcall_mutex);
+
+/*
+ * This represents the number of cpus in the hypervisor. Since there is no
+ * architected way to discover the number of processors in the host, we
+ * provision for dealing with NR_CPUS. This is currently 2048 by default, and
+ * is sufficient for our purposes. This will need to be tweaked if
+ * CONFIG_NR_CPUS is changed.
+ */
+#define NR_CPUS_H NR_CPUS
static long hcall_vphn(unsigned long cpu, u64 flags, __be32 *associativity)
{
@@ -1098,6 +1109,107 @@ static long hcall_vphn(unsigned long cpu, u64 flags, __be32 *associativity)
return rc;
}
+int init_cpu_associativity(void)
+{
+ vcpu_associativity = kcalloc(num_possible_cpus() / threads_per_core,
+ VPHN_ASSOC_BUFSIZE * sizeof(__be32), GFP_KERNEL);
+ pcpu_associativity = kcalloc(NR_CPUS_H / threads_per_core,
+ VPHN_ASSOC_BUFSIZE * sizeof(__be32), GFP_KERNEL);
+
+ if (!vcpu_associativity || !pcpu_associativity) {
+ pr_err("error allocating memory for associativity information\n");
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+void destroy_cpu_associativity(void)
+{
+ kfree(vcpu_associativity);
+ kfree(pcpu_associativity);
+ vcpu_associativity = pcpu_associativity = 0;
+}
+
+static __be32 *__get_cpu_associativity(int cpu, __be32 *cpu_assoc, int flag)
+{
+ __be32 *assoc;
+ int rc = 0;
+
+ assoc = &cpu_assoc[(int)(cpu / threads_per_core) * VPHN_ASSOC_BUFSIZE];
+ if (!assoc[0]) {
+ mutex_lock(&vphn_hcall_mutex);
+ /* did someone else already retrieve this? */
+ if (!assoc[0])
+ rc = hcall_vphn(cpu, flag, &assoc[0]);
+ mutex_unlock(&vphn_hcall_mutex);
+ if (rc)
+ return NULL;
+ }
+
+ return assoc;
+}
+
+static __be32 *get_pcpu_associativity(int cpu)
+{
+ return __get_cpu_associativity(cpu, pcpu_associativity, VPHN_FLAG_PCPU);
+}
+
+static __be32 *get_vcpu_associativity(int cpu)
+{
+ return __get_cpu_associativity(cpu, vcpu_associativity, VPHN_FLAG_VCPU);
+}
+
+static int calc_dispatch_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
+{
+ int i, index, dist;
+
+ for (i = 0, dist = 0; i < distance_ref_points_depth; i++) {
+ index = be32_to_cpu(distance_ref_points[i]);
+ if (cpu1_assoc[index] == cpu2_assoc[index])
+ break;
+ dist++;
+ }
+
+ return dist;
+}
+
+int cpu_relative_dispatch_distance(int last_disp_cpu, int cur_disp_cpu)
+{
+ __be32 *last_disp_cpu_assoc, *cur_disp_cpu_assoc;
+
+ if (last_disp_cpu >= NR_CPUS_H || cur_disp_cpu >= NR_CPUS_H)
+ return -EINVAL;
+
+ last_disp_cpu_assoc = get_pcpu_associativity(last_disp_cpu);
+ cur_disp_cpu_assoc = get_pcpu_associativity(cur_disp_cpu);
+
+ if (!last_disp_cpu_assoc || !cur_disp_cpu_assoc)
+ return -EIO;
+
+ return calc_dispatch_distance(last_disp_cpu_assoc, cur_disp_cpu_assoc);
+}
+
+int cpu_home_node_dispatch_distance(int disp_cpu)
+{
+ __be32 *disp_cpu_assoc, *vcpu_assoc;
+ int vcpu_id = smp_processor_id();
+
+ if (disp_cpu >= NR_CPUS_H) {
+ pr_debug_ratelimited("vcpu dispatch cpu %d > %d\n",
+ disp_cpu, NR_CPUS_H);
+ return -EINVAL;
+ }
+
+ disp_cpu_assoc = get_pcpu_associativity(disp_cpu);
+ vcpu_assoc = get_vcpu_associativity(vcpu_id);
+
+ if (!disp_cpu_assoc || !vcpu_assoc)
+ return -EIO;
+
+ return calc_dispatch_distance(disp_cpu_assoc, vcpu_assoc);
+}
+
/*
* Change polling interval for associativity changes.
*/
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 6af5a2a11deb..3dd02642a74b 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -30,6 +30,10 @@
#include <linux/jump_label.h>
#include <linux/delay.h>
#include <linux/stop_machine.h>
+#include <linux/spinlock.h>
+#include <linux/cpuhotplug.h>
+#include <linux/workqueue.h>
+#include <linux/proc_fs.h>
#include <asm/processor.h>
#include <asm/mmu.h>
#include <asm/page.h>
@@ -65,8 +69,43 @@ EXPORT_SYMBOL(plpar_hcall);
EXPORT_SYMBOL(plpar_hcall9);
EXPORT_SYMBOL(plpar_hcall_norets);
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
+static u8 dtl_mask = DTL_LOG_PREEMPT;
+#else
+static u8 dtl_mask;
+#endif
+
+struct dtl_worker {
+ struct delayed_work work;
+ int cpu;
+};
+
+struct vcpu_dispatch_data {
+ int last_disp_cpu;
+
+ int total_disp;
+
+ int same_thread_disp;
+ int same_core_disp;
+ int same_chip_disp;
+ int diff_chip_disp;
+ int far_chip_disp;
+
+ int numa_home_disp;
+ int numa_remote_disp;
+ int numa_far_disp;
+};
+
+static DEFINE_PER_CPU(struct vcpu_dispatch_data, vcpu_disp_data);
+static DEFINE_PER_CPU(u64, dtl_entry_ridx);
+static DEFINE_PER_CPU(struct dtl_worker, dtl_workers);
+static enum cpuhp_state dtl_worker_state;
+static DEFINE_MUTEX(dtl_worker_mutex);
+static unsigned int dtl_worker_refctr;
static DEFINE_SPINLOCK(dtl_buffer_refctr_lock);
static unsigned int dtl_buffer_global_refctr, dtl_buffer_percpu_refctr;
+static int vcpudispatch_stats_on __read_mostly;
+static int vcpudispatch_stats_freq = 50;
bool register_dtl_buffer_access(bool global)
{
@@ -101,19 +140,143 @@ void unregister_dtl_buffer_access(bool global)
spin_unlock(&dtl_buffer_refctr_lock);
}
+static void update_vcpu_disp_stat(int disp_cpu)
+{
+ struct vcpu_dispatch_data *disp;
+ int distance;
+
+ disp = this_cpu_ptr(&vcpu_disp_data);
+ if (disp->last_disp_cpu == -1) {
+ disp->last_disp_cpu = disp_cpu;
+ return;
+ }
+
+ disp->total_disp++;
+
+ if (disp->last_disp_cpu == disp_cpu)
+ disp->same_thread_disp++;
+ else if (cpu_first_thread_sibling(disp->last_disp_cpu) ==
+ cpu_first_thread_sibling(disp_cpu))
+ disp->same_core_disp++;
+ else {
+ distance = cpu_relative_dispatch_distance(disp->last_disp_cpu,
+ disp_cpu);
+ if (distance < 0)
+ pr_debug_ratelimited("vcpudispatch_stats: cpu %d: error determining associativity\n",
+ smp_processor_id());
+ else {
+ switch (distance) {
+ case 0:
+ disp->same_chip_disp++;
+ break;
+ case 1:
+ disp->diff_chip_disp++;
+ break;
+ case 2:
+ disp->far_chip_disp++;
+ break;
+ default:
+ pr_debug_ratelimited("vcpudispatch_stats: cpu %d (%d -> %d): unexpected relative dispatch distance %d\n",
+ smp_processor_id(),
+ disp->last_disp_cpu,
+ disp_cpu,
+ distance);
+ }
+ }
+ }
+
+ distance = cpu_home_node_dispatch_distance(disp_cpu);
+ if (distance < 0)
+ pr_debug_ratelimited("vcpudispatch_stats: cpu %d: error determining associativity\n",
+ smp_processor_id());
+ else {
+ switch (distance) {
+ case 0:
+ disp->numa_home_disp++;
+ break;
+ case 1:
+ disp->numa_remote_disp++;
+ break;
+ case 2:
+ disp->numa_far_disp++;
+ break;
+ default:
+ pr_debug_ratelimited("vcpudispatch_stats: cpu %d on %d: unexpected numa dispatch distance %d\n",
+ smp_processor_id(),
+ disp_cpu,
+ distance);
+ }
+ }
+
+ disp->last_disp_cpu = disp_cpu;
+}
+
+static void process_dtl_buffer(struct work_struct *work)
+{
+ struct dtl_entry dtle;
+ u64 i = __this_cpu_read(dtl_entry_ridx);
+ struct dtl_entry *dtl = local_paca->dispatch_log + (i % N_DISPATCH_LOG);
+ struct dtl_entry *dtl_end = local_paca->dispatch_log_end;
+ struct lppaca *vpa = local_paca->lppaca_ptr;
+ struct dtl_worker *d = container_of(work, struct dtl_worker, work.work);
+
+ if (!local_paca->dispatch_log)
+ return;
+
+ /* if we have been migrated away, we cancel ourself */
+ if (d->cpu != smp_processor_id()) {
+ pr_debug("vcpudispatch_stats: cpu %d worker migrated -- canceling worker\n",
+ smp_processor_id());
+ return;
+ }
+
+ if (i == be64_to_cpu(vpa->dtl_idx))
+ goto out;
+
+ while (i < be64_to_cpu(vpa->dtl_idx)) {
+ dtle = *dtl;
+ barrier();
+ if (i + N_DISPATCH_LOG < be64_to_cpu(vpa->dtl_idx)) {
+ /* buffer has overflowed */
+ pr_debug_ratelimited("vcpudispatch_stats: cpu %d lost %lld DTL samples\n",
+ d->cpu,
+ be64_to_cpu(vpa->dtl_idx) - N_DISPATCH_LOG - i);
+ i = be64_to_cpu(vpa->dtl_idx) - N_DISPATCH_LOG;
+ dtl = local_paca->dispatch_log + (i % N_DISPATCH_LOG);
+ continue;
+ }
+ update_vcpu_disp_stat(be16_to_cpu(dtle.processor_id));
+ ++i;
+ ++dtl;
+ if (dtl == dtl_end)
+ dtl = local_paca->dispatch_log;
+ }
+
+ __this_cpu_write(dtl_entry_ridx, i);
+
+out:
+ schedule_delayed_work_on(d->cpu, to_delayed_work(work),
+ HZ / vcpudispatch_stats_freq);
+}
+
void alloc_dtl_buffers(void)
{
int cpu;
struct paca_struct *pp;
struct dtl_entry *dtl;
+ /* TODO: do we need cond_resched() here? */
for_each_possible_cpu(cpu) {
pp = paca_ptrs[cpu];
+ if (pp->dispatch_log)
+ continue;
dtl = kmem_cache_alloc(dtl_cache, GFP_KERNEL);
if (!dtl) {
pr_warn("Failed to allocate dispatch trace log for cpu %d\n",
cpu);
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
pr_warn("Stolen time statistics will be unreliable\n");
+#endif
break;
}
@@ -124,6 +287,25 @@ void alloc_dtl_buffers(void)
}
}
+static void free_dtl_buffers(void)
+{
+#ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
+ int cpu;
+ struct paca_struct *pp;
+
+ for_each_possible_cpu(cpu) {
+ pp = paca_ptrs[cpu];
+ if (!pp->dispatch_log)
+ continue;
+ kmem_cache_free(dtl_cache, pp->dispatch_log);
+ pp->dtl_ridx = 0;
+ pp->dispatch_log = NULL;
+ pp->dispatch_log_end = NULL;
+ pp->dtl_curr = NULL;
+ }
+#endif
+}
+
void register_dtl_buffer(int cpu)
{
long ret;
@@ -133,7 +315,7 @@ void register_dtl_buffer(int cpu)
pp = paca_ptrs[cpu];
dtl = pp->dispatch_log;
- if (dtl) {
+ if (dtl && dtl_mask) {
pp->dtl_ridx = 0;
pp->dtl_curr = dtl;
lppaca_of(cpu).dtl_idx = 0;
@@ -144,10 +326,269 @@ void register_dtl_buffer(int cpu)
if (ret)
pr_err("WARNING: DTL registration of cpu %d (hw %d) "
"failed with %ld\n", cpu, hwcpu, ret);
- lppaca_of(cpu).dtl_enable_mask = DTL_LOG_PREEMPT;
+ lppaca_of(cpu).dtl_enable_mask = dtl_mask;
+ }
+}
+
+static int dtl_worker_online(unsigned int cpu)
+{
+ struct dtl_worker *d = &per_cpu(dtl_workers, cpu);
+
+ memset(d, 0, sizeof(*d));
+ INIT_DELAYED_WORK(&d->work, process_dtl_buffer);
+ d->cpu = cpu;
+
+#ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
+ per_cpu(dtl_entry_ridx, cpu) = 0;
+ register_dtl_buffer(cpu);
+#else
+ per_cpu(dtl_entry_ridx, cpu) = be64_to_cpu(lppaca_of(cpu).dtl_idx);
+#endif
+
+ schedule_delayed_work_on(cpu, &d->work, HZ / vcpudispatch_stats_freq);
+ return 0;
+}
+
+static int dtl_worker_offline(unsigned int cpu)
+{
+ struct dtl_worker *d = &per_cpu(dtl_workers, cpu);
+
+ cancel_delayed_work_sync(&d->work);
+
+#ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
+ unregister_dtl(get_hard_smp_processor_id(cpu));
+#endif
+
+ return 0;
+}
+
+static void set_global_dtl_mask(u8 mask)
+{
+ int cpu;
+
+ dtl_mask = mask;
+ for_each_present_cpu(cpu)
+ lppaca_of(cpu).dtl_enable_mask = dtl_mask;
+}
+
+static void reset_global_dtl_mask(void)
+{
+ int cpu;
+
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
+ dtl_mask = DTL_LOG_PREEMPT;
+#else
+ dtl_mask = 0;
+#endif
+ for_each_present_cpu(cpu)
+ lppaca_of(cpu).dtl_enable_mask = dtl_mask;
+}
+
+static int dtl_worker_enable(void)
+{
+ int rc = 0, state;
+
+ mutex_lock(&dtl_worker_mutex);
+
+ if (dtl_worker_refctr) {
+ dtl_worker_refctr++;
+ goto out;
+ }
+
+ if (register_dtl_buffer_access(1)) {
+ rc = -EBUSY;
+ goto out;
+ }
+
+ set_global_dtl_mask(DTL_LOG_ALL);
+
+ /* Set up DTL buffers and register them */
+ alloc_dtl_buffers();
+
+ state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "powerpc/dtl:online",
+ dtl_worker_online, dtl_worker_offline);
+ if (state < 0) {
+ pr_err("vcpudispatch_stats: unable to setup workqueue for DTL processing\n");
+ free_dtl_buffers();
+ reset_global_dtl_mask();
+ unregister_dtl_buffer_access(1);
+ rc = -EINVAL;
+ goto out;
+ }
+ dtl_worker_state = state;
+ dtl_worker_refctr++;
+
+out:
+ mutex_unlock(&dtl_worker_mutex);
+ return rc;
+}
+
+static void dtl_worker_disable(void)
+{
+ mutex_lock(&dtl_worker_mutex);
+ dtl_worker_refctr--;
+ if (!dtl_worker_refctr) {
+ cpuhp_remove_state(dtl_worker_state);
+ dtl_worker_state = 0;
+ free_dtl_buffers();
+ reset_global_dtl_mask();
+ unregister_dtl_buffer_access(1);
+ }
+ mutex_unlock(&dtl_worker_mutex);
+}
+
+static ssize_t vcpudispatch_stats_write(struct file *file, const char __user *p,
+ size_t count, loff_t *ppos)
+{
+ struct vcpu_dispatch_data *disp;
+ int rc, cmd, cpu;
+ char buf[16];
+
+ if (count > 15)
+ return -EINVAL;
+
+ if (copy_from_user(buf, p, count))
+ return -EFAULT;
+
+ buf[count] = 0;
+ rc = kstrtoint(buf, 0, &cmd);
+ if (rc || cmd < 0 || cmd > 1) {
+ pr_err("vcpudispatch_stats: please use 0 to disable or 1 to enable dispatch statistics\n");
+ return rc ? rc : -EINVAL;
+ }
+
+ if ((cmd == 0 && !vcpudispatch_stats_on) ||
+ (cmd == 1 && vcpudispatch_stats_on))
+ return count;
+
+ if (cmd) {
+ rc = init_cpu_associativity();
+ if (rc)
+ return rc;
+
+ for_each_possible_cpu(cpu) {
+ disp = per_cpu_ptr(&vcpu_disp_data, cpu);
+ memset(disp, 0, sizeof(*disp));
+ disp->last_disp_cpu = -1;
+ }
+
+ rc = dtl_worker_enable();
+ if (rc) {
+ destroy_cpu_associativity();
+ return rc;
+ }
+ } else {
+ dtl_worker_disable();
+ destroy_cpu_associativity();
+ }
+
+ vcpudispatch_stats_on = cmd;
+
+ return count;
+}
+
+static int vcpudispatch_stats_display(struct seq_file *p, void *v)
+{
+ int cpu;
+ struct vcpu_dispatch_data *disp;
+
+ if (!vcpudispatch_stats_on) {
+ seq_puts(p, "off\n");
+ return 0;
+ }
+
+ for_each_online_cpu(cpu) {
+ disp = per_cpu_ptr(&vcpu_disp_data, cpu);
+ seq_printf(p, "cpu%d", cpu);
+ seq_put_decimal_ull(p, " ", disp->total_disp);
+ seq_put_decimal_ull(p, " ", disp->same_thread_disp);
+ seq_put_decimal_ull(p, " ", disp->same_core_disp);
+ seq_put_decimal_ull(p, " ", disp->same_chip_disp);
+ seq_put_decimal_ull(p, " ", disp->diff_chip_disp);
+ seq_put_decimal_ull(p, " ", disp->far_chip_disp);
+ seq_put_decimal_ull(p, " ", disp->numa_home_disp);
+ seq_put_decimal_ull(p, " ", disp->numa_remote_disp);
+ seq_put_decimal_ull(p, " ", disp->numa_far_disp);
+ seq_puts(p, "\n");
}
+
+ return 0;
}
+static int vcpudispatch_stats_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, vcpudispatch_stats_display, NULL);
+}
+
+static const struct file_operations vcpudispatch_stats_proc_ops = {
+ .open = vcpudispatch_stats_open,
+ .read = seq_read,
+ .write = vcpudispatch_stats_write,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+static ssize_t vcpudispatch_stats_freq_write(struct file *file,
+ const char __user *p, size_t count, loff_t *ppos)
+{
+ int rc, freq;
+ char buf[16];
+
+ if (count > 15)
+ return -EINVAL;
+
+ if (copy_from_user(buf, p, count))
+ return -EFAULT;
+
+ buf[count] = 0;
+ rc = kstrtoint(buf, 0, &freq);
+ if (rc || freq < 1 || freq > HZ) {
+ pr_err("vcpudispatch_stats_freq: please specify a frequency between 1 and %d\n",
+ HZ);
+ return rc ? rc : -EINVAL;
+ }
+
+ vcpudispatch_stats_freq = freq;
+
+ return count;
+}
+
+static int vcpudispatch_stats_freq_display(struct seq_file *p, void *v)
+{
+ seq_printf(p, "%d\n", vcpudispatch_stats_freq);
+ return 0;
+}
+
+static int vcpudispatch_stats_freq_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, vcpudispatch_stats_freq_display, NULL);
+}
+
+static const struct file_operations vcpudispatch_stats_freq_proc_ops = {
+ .open = vcpudispatch_stats_freq_open,
+ .read = seq_read,
+ .write = vcpudispatch_stats_freq_write,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+static int __init vcpudispatch_stats_procfs_init(void)
+{
+ if (!lppaca_shared_proc(get_lppaca()))
+ return 0;
+
+ if (!proc_create("powerpc/vcpudispatch_stats", 0600, NULL,
+ &vcpudispatch_stats_proc_ops))
+ pr_err("vcpudispatch_stats: error creating procfs file\n");
+ else if (!proc_create("powerpc/vcpudispatch_stats_freq", 0600, NULL,
+ &vcpudispatch_stats_freq_proc_ops))
+ pr_err("vcpudispatch_stats_freq: error creating procfs file\n");
+
+ return 0;
+}
+
+machine_device_initcall(pseries, vcpudispatch_stats_procfs_init);
+
void vpa_init(int cpu)
{
int hwcpu = get_hard_smp_processor_id(cpu);
--
2.21.0