From: Michael Neuling <mikey@neuling.org>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org, Will Schmidt <will_schmidt@vnet.ibm.com>,
	Steven Munroe <sjmunroe@us.ibm.com>
Subject: Re: [RFC] Moving toward smarter disabling of FPRs, VRs, and VSRs in the MSR
Date: Mon, 16 Mar 2009 17:43:02 +1100
Message-ID: <7609.1237185782@neuling.org>
In-Reply-To: <1237164566.25062.114.camel@pasglop>

> > We can do some VMX testing on existing POWER6 machines.  The VSX
> > instruction set hasn't been fully implemented in GCC yet so we'll need
> > to wait a bit for that.  Does anyone have an idea for a good VMX/Altivec
> > benchmark?
> 
> Note that there are two aspects to the problem:
> 
>  - Lazy save/restore on SMP. This would avoid both the save and restore
> phases, thus is where the most potential gain is to be made. At the
> expense of some tricky IPI work when processes migrate between CPUs.
> 
> However, it will only be useful -if- a process using FP/VMX/VSX is
> "interrupted" by another process that isn't using them. For example, a
> kernel thread. So it's unclear whether that's worth it in practice, ie,
> does this happen that often ?
> 
>  - Always restoring the FP/VMX/VSX state on context switch "in" rather
> than taking a fault. This is reasonably simple, but at the potential
> expense of adding the save/restore overhead to applications that only
> seldom use these facilities. (Some heuristics might help here).
> 
> However, the question here is what does this buy us ?
> 
> IE, In the worst case scenario, which is HZ=1000, so every 1ms, the
> process would have the overhead of an interrupt to do the restore of the
> state. IE. The restore state itself doesn't count since it would be done
> either way (at context switch vs. in the unavailable interrupt), so all
> we win here is the overhead of the actual interrupt, which is
> implemented as a fast interrupt in assembly. So we have what here ? 1000
> cycles to be pessimistic ? On a 1GHz CPU, that is 1/1000 of the time
> slice, and both of these are rather pessimistic numbers.
> 
> So that leaves us with the possible case of 2 tasks using the facility
> and running a fraction of the timeslice each, for example because they
> are ping-ponging with each other.
> 
> Is that something that happens in practice to make it noticeable ?
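
The arithmetic in the estimate quoted above can be checked with a
small userspace sketch. The function names are made up for
illustration, and the inputs (1000 cycles per fault, 1GHz, HZ=1000)
are the pessimistic guesses from the quoted text, not measurements:

```c
#include <assert.h>
#include <stdint.h>

/* Cycles available in one scheduler tick: cpu_hz / tick_hz. */
uint64_t cycles_per_tick(uint64_t cpu_hz, uint64_t tick_hz)
{
	assert(tick_hz != 0);
	return cpu_hz / tick_hz;
}

/*
 * Fraction of one tick spent taking the "unavailable" interrupt.
 * Only the interrupt entry/exit is counted, since the register
 * restore itself happens in both schemes.
 */
double fault_overhead_fraction(uint64_t fault_cycles,
			       uint64_t cpu_hz, uint64_t tick_hz)
{
	return (double)fault_cycles /
	       (double)cycles_per_tick(cpu_hz, tick_hz);
}
```

With fault_cycles=1000, cpu_hz=1000000000 and tick_hz=1000 this gives
1000 cycles out of 1000000 per tick, i.e. the 1/1000 figure above.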

I hacked up the below to put stats in /proc/self/sched.

A quick grep through /proc on a RHEL 5.2 machine (egrep
'(fp_count|switch_count)' /proc/*/sched) shows a few apps use fp a few
dozen times but then stop.  These are only init-time apps like hald, so
I still need to check some real-world apps too.

Ryan: let me know if this allows you to collect some useful stats.  
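
For collecting the numbers, something along these lines might help.
It is only a sketch: sched_field() and fp_restore_ratio() are
hypothetical helper names, and it assumes the "name : value" layout
shown in the sample output further down.  It works on the text of one
/proc/<pid>/sched file that the caller has already read into a
buffer:

```c
#include <stdio.h>
#include <string.h>

/* Return the value of a "name : value" line in text, or -1 if absent. */
long long sched_field(const char *text, const char *name)
{
	size_t len = strlen(name);
	const char *line = text;

	while (line && *line) {
		if (strncmp(line, name, len) == 0) {
			const char *p = line + len;
			long long val;

			/* Skip the padding between the name and the colon. */
			while (*p == ' ' || *p == '\t')
				p++;
			if (*p == ':' && sscanf(p + 1, "%lld", &val) == 1)
				return val;
		}
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/*
 * fp_count divided by switch_count, or -1.0 if either field is
 * missing or switch_count is zero.
 */
double fp_restore_ratio(const char *text)
{
	long long fp = sched_field(text, "fp_count");
	long long sw = sched_field(text, "switch_count");

	if (fp < 0 || sw <= 0)
		return -1.0;
	return (double)fp / (double)sw;
}
```

A ratio near 1 would suggest the task reloads FP state after almost
every switch; a ratio near 0 would suggest it rarely touches FP.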

Subject: [PATCH] powerpc: add context switch, fpr & vr stats to /proc/self/sched.

Add per-task counters for task switches and for fp and vr unavailable
exceptions, and show them in /proc/self/sched.

[root@p5-20-p6-e0 ~]# cat /proc/3422/sched |tail -3
switch_count                       :                  559
fp_count                           :                  317
vr_count                           :                    0
[root@p5-20-p6-e0 ~]# 

Signed-off-by: Michael Neuling <mikey@neuling.org>
---
 arch/powerpc/include/asm/processor.h |    3 +++
 arch/powerpc/kernel/asm-offsets.c    |    3 +++
 arch/powerpc/kernel/fpu.S            |    3 +++
 arch/powerpc/kernel/process.c        |    3 +++
 arch/powerpc/kernel/setup-common.c   |   10 ++++++++++
 include/linux/seq_file.h             |   12 ++++++++++++
 kernel/sched_debug.c                 |   16 ++++------------
 7 files changed, 38 insertions(+), 12 deletions(-)

Index: linux-2.6-ozlabs/arch/powerpc/include/asm/processor.h
===================================================================
--- linux-2.6-ozlabs.orig/arch/powerpc/include/asm/processor.h
+++ linux-2.6-ozlabs/arch/powerpc/include/asm/processor.h
@@ -174,11 +174,13 @@ struct thread_struct {
 	} fpscr;
 	int		fpexc_mode;	/* floating-point exception mode */
 	unsigned int	align_ctl;	/* alignment handling control */
+	unsigned long	fp_count;	/* FP restore count */
 #ifdef CONFIG_PPC64
 	unsigned long	start_tb;	/* Start purr when proc switched in */
 	unsigned long	accum_tb;	/* Total accumilated purr for process */
 #endif
 	unsigned long	dabr;		/* Data address breakpoint register */
+	unsigned long	switch_count;	/* switch count */
 #ifdef CONFIG_ALTIVEC
 	/* Complete AltiVec register set */
 	vector128	vr[32] __attribute__((aligned(16)));
@@ -186,6 +188,7 @@ struct thread_struct {
 	vector128	vscr __attribute__((aligned(16)));
 	unsigned long	vrsave;
 	int		used_vr;	/* set if process has used altivec */
+	unsigned long	vr_count;	/* VMX restore count */
 #endif /* CONFIG_ALTIVEC */
 #ifdef CONFIG_VSX
 	/* VSR status */
Index: linux-2.6-ozlabs/arch/powerpc/kernel/asm-offsets.c
===================================================================
--- linux-2.6-ozlabs.orig/arch/powerpc/kernel/asm-offsets.c
+++ linux-2.6-ozlabs/arch/powerpc/kernel/asm-offsets.c
@@ -74,14 +74,17 @@ int main(void)
 	DEFINE(KSP, offsetof(struct thread_struct, ksp));
 	DEFINE(KSP_LIMIT, offsetof(struct thread_struct, ksp_limit));
 	DEFINE(PT_REGS, offsetof(struct thread_struct, regs));
+	DEFINE(THREAD_SWITCHCOUNT, offsetof(struct thread_struct, switch_count));
 	DEFINE(THREAD_FPEXC_MODE, offsetof(struct thread_struct, fpexc_mode));
 	DEFINE(THREAD_FPR0, offsetof(struct thread_struct, fpr[0]));
 	DEFINE(THREAD_FPSCR, offsetof(struct thread_struct, fpscr));
+	DEFINE(THREAD_FPCOUNT, offsetof(struct thread_struct, fp_count));
 #ifdef CONFIG_ALTIVEC
 	DEFINE(THREAD_VR0, offsetof(struct thread_struct, vr[0]));
 	DEFINE(THREAD_VRSAVE, offsetof(struct thread_struct, vrsave));
 	DEFINE(THREAD_VSCR, offsetof(struct thread_struct, vscr));
 	DEFINE(THREAD_USED_VR, offsetof(struct thread_struct, used_vr));
+	DEFINE(THREAD_VRCOUNT, offsetof(struct thread_struct, vr_count));
 #endif /* CONFIG_ALTIVEC */
 #ifdef CONFIG_VSX
 	DEFINE(THREAD_VSR0, offsetof(struct thread_struct, fpr));
Index: linux-2.6-ozlabs/arch/powerpc/kernel/fpu.S
===================================================================
--- linux-2.6-ozlabs.orig/arch/powerpc/kernel/fpu.S
+++ linux-2.6-ozlabs/arch/powerpc/kernel/fpu.S
@@ -102,6 +102,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_VSX)
 	ori	r12,r12,MSR_FP
 	or	r12,r12,r4
 	std	r12,_MSR(r1)
+	ld	r4,THREAD_FPCOUNT(r5)
+	addi	r4,r4,1
+	std	r4,THREAD_FPCOUNT(r5)
 #endif
 	lfd	fr0,THREAD_FPSCR(r5)
 	MTFSF_L(fr0)
Index: linux-2.6-ozlabs/arch/powerpc/kernel/process.c
===================================================================
--- linux-2.6-ozlabs.orig/arch/powerpc/kernel/process.c
+++ linux-2.6-ozlabs/arch/powerpc/kernel/process.c
@@ -744,17 +744,20 @@ void start_thread(struct pt_regs *regs, 
 #endif
 
 	discard_lazy_cpu_state();
+	current->thread.switch_count = 0;
 #ifdef CONFIG_VSX
 	current->thread.used_vsr = 0;
 #endif
 	memset(current->thread.fpr, 0, sizeof(current->thread.fpr));
 	current->thread.fpscr.val = 0;
+	current->thread.fp_count = 0;
 #ifdef CONFIG_ALTIVEC
 	memset(current->thread.vr, 0, sizeof(current->thread.vr));
 	memset(&current->thread.vscr, 0, sizeof(current->thread.vscr));
 	current->thread.vscr.u[3] = 0x00010000; /* Java mode disabled */
 	current->thread.vrsave = 0;
 	current->thread.used_vr = 0;
+	current->thread.vr_count = 0;
 #endif /* CONFIG_ALTIVEC */
 #ifdef CONFIG_SPE
 	memset(current->thread.evr, 0, sizeof(current->thread.evr));
Index: linux-2.6-ozlabs/arch/powerpc/kernel/setup-common.c
===================================================================
--- linux-2.6-ozlabs.orig/arch/powerpc/kernel/setup-common.c
+++ linux-2.6-ozlabs/arch/powerpc/kernel/setup-common.c
@@ -669,3 +669,13 @@ static int powerpc_debugfs_init(void)
 }
 arch_initcall(powerpc_debugfs_init);
 #endif
+
+void arch_proc_sched_show_task(struct task_struct *p, struct seq_file *m) {
+	SEQ_printf(m, "%-35s:%21Ld\n",
+		   "switch_count", (long long)p->thread.switch_count);
+	SEQ_printf(m, "%-35s:%21Ld\n",
+		   "fp_count", (long long)p->thread.fp_count);
+	SEQ_printf(m, "%-35s:%21Ld\n",
+		   "vr_count", (long long)p->thread.vr_count);
+}
+
Index: linux-2.6-ozlabs/include/linux/seq_file.h
===================================================================
--- linux-2.6-ozlabs.orig/include/linux/seq_file.h
+++ linux-2.6-ozlabs/include/linux/seq_file.h
@@ -95,4 +95,16 @@ extern struct list_head *seq_list_start_
 extern struct list_head *seq_list_next(void *v, struct list_head *head,
 		loff_t *ppos);
 
+/*
+ * This allows printing both to /proc/sched_debug and
+ * to the console
+ */
+#define SEQ_printf(m, x...)			\
+ do {						\
+	if (m)					\
+		seq_printf(m, x);		\
+	else					\
+		printk(x);			\
+ } while (0)
+
 #endif
Index: linux-2.6-ozlabs/kernel/sched_debug.c
===================================================================
--- linux-2.6-ozlabs.orig/kernel/sched_debug.c
+++ linux-2.6-ozlabs/kernel/sched_debug.c
@@ -17,18 +17,6 @@
 #include <linux/utsname.h>
 
 /*
- * This allows printing both to /proc/sched_debug and
- * to the console
- */
-#define SEQ_printf(m, x...)			\
- do {						\
-	if (m)					\
-		seq_printf(m, x);		\
-	else					\
-		printk(x);			\
- } while (0)
-
-/*
  * Ease the printing of nsec fields:
  */
 static long long nsec_high(unsigned long long nsec)
@@ -370,6 +358,9 @@ static int __init init_sched_debug_procf
 
 __initcall(init_sched_debug_procfs);
 
+void __attribute__ ((weak))
+arch_proc_sched_show_task(struct task_struct *p, struct seq_file *m) {}
+
 void proc_sched_show_task(struct task_struct *p, struct seq_file *m)
 {
 	unsigned long nr_switches;
@@ -473,6 +464,7 @@ void proc_sched_show_task(struct task_st
 		SEQ_printf(m, "%-35s:%21Ld\n",
 			   "clock-delta", (long long)(t1-t0));
 	}
+	arch_proc_sched_show_task(p, m);
 }
 
 void proc_sched_set_task(struct task_struct *p)
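
For anyone unfamiliar with the weak-symbol hook used in
kernel/sched_debug.c above, here is a minimal userspace sketch of the
same pattern.  It assumes GCC or Clang, and arch_show_task() and
show_task() are stand-in names, not kernel functions.  The generic
code provides a weak default that does nothing; an architecture can
override it by defining a normal (strong) function of the same name
in another object file:

```c
#include <stdio.h>

/*
 * Weak default hook: emits nothing and reports zero extra lines.
 * A strong definition elsewhere replaces it at link time.
 */
int __attribute__((weak)) arch_show_task(FILE *out)
{
	(void)out;
	return 0;
}

/*
 * Generic code: print its own line, then let the hook append
 * architecture-specific lines.  Returns the total lines printed.
 */
int show_task(FILE *out)
{
	int lines = 1;

	fprintf(out, "%-35s:%21lld\n", "clock-delta", 0LL);
	lines += arch_show_task(out);
	return lines;
}
```

The generic caller never needs an #ifdef: with no override linked in,
the call simply resolves to the empty default.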
