* Add prefetch switch stack hook in scheduler function
@ 2005-07-27 22:07 Chen, Kenneth W
2005-07-27 23:13 ` Andrew Morton
0 siblings, 1 reply; 40+ messages in thread
From: Chen, Kenneth W @ 2005-07-27 22:07 UTC (permalink / raw)
To: 'Ingo Molnar'; +Cc: linux-kernel, linux-ia64
I would like to propose adding a prefetch switch stack hook in
the scheduler function. For an architecture like ia64, the switch
stack structure is fairly large (currently 528 bytes). For context-
switch-intensive applications, we found that a significant number of
cache misses occurs in the switch_to() function. The following patch
adds a hook in the schedule() function to prefetch the switch stack
structure as soon as the 'next' task is determined. This allows maximum
overlap in prefetching cache lines for that structure.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
--- linux-2.6.12/include/linux/sched.h.orig 2005-07-27 14:43:49.321986290 -0700
+++ linux-2.6.12/include/linux/sched.h 2005-07-27 14:44:03.390345492 -0700
@@ -622,6 +622,11 @@ extern int groups_search(struct group_in
#define GROUP_AT(gi, i) \
((gi)->blocks[(i)/NGROUPS_PER_BLOCK][(i)%NGROUPS_PER_BLOCK])
+#ifdef ARCH_HAS_PREFETCH_SWITCH_STACK
+extern void prefetch_switch_stack(struct task_struct*);
+#else
+#define prefetch_switch_stack(task) do { } while (0)
+#endif
struct audit_context; /* See audit.c */
struct mempolicy;
--- linux-2.6.12/kernel/sched.c.orig 2005-07-27 14:43:49.391322226 -0700
+++ linux-2.6.12/kernel/sched.c 2005-07-27 14:44:03.394251742 -0700
@@ -3044,6 +3044,7 @@ switch_tasks:
if (next == rq->idle)
schedstat_inc(rq, sched_goidle);
prefetch(next);
+ prefetch_switch_stack(next);
clear_tsk_need_resched(prev);
rcu_qsctr_inc(task_cpu(prev));
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Add prefetch switch stack hook in scheduler function
2005-07-27 22:07 Add prefetch switch stack hook in scheduler function Chen, Kenneth W
@ 2005-07-27 23:13 ` Andrew Morton
2005-07-27 23:23 ` david mosberger
0 siblings, 1 reply; 40+ messages in thread
From: Andrew Morton @ 2005-07-27 23:13 UTC (permalink / raw)
To: Chen, Kenneth W; +Cc: mingo, linux-kernel, linux-ia64
"Chen, Kenneth W" <kenneth.w.chen@intel.com> wrote:
>
> +#ifdef ARCH_HAS_PREFETCH_SWITCH_STACK
> +extern void prefetch_switch_stack(struct task_struct*);
> +#else
> +#define prefetch_switch_stack(task) do { } while (0)
> +#endif
It is better to use
static inline void prefetch_switch_stack(struct task_struct *t) { }
in the second case, rather than a macro. It provides typechecking.
* Re: Add prefetch switch stack hook in scheduler function
2005-07-27 23:13 ` Andrew Morton
@ 2005-07-27 23:23 ` david mosberger
2005-07-28 7:41 ` Ingo Molnar
0 siblings, 1 reply; 40+ messages in thread
From: david mosberger @ 2005-07-27 23:23 UTC (permalink / raw)
To: Andrew Morton; +Cc: Chen, Kenneth W, mingo, linux-kernel, linux-ia64
Also, should this be called prefetch_stack() or perhaps even just
prefetch_task()? Not every architecture defines a switch_stack
structure.
--david
--
Mosberger Consulting LLC, voice/fax: 510-744-9372,
http://www.mosberger-consulting.com/
35706 Runckel Lane, Fremont, CA 94536
On 7/27/05, Andrew Morton <akpm@osdl.org> wrote:
> "Chen, Kenneth W" <kenneth.w.chen@intel.com> wrote:
> >
> > +#ifdef ARCH_HAS_PREFETCH_SWITCH_STACK
> > +extern void prefetch_switch_stack(struct task_struct*);
> > +#else
> > +#define prefetch_switch_stack(task) do { } while (0)
> > +#endif
>
> It is better to use
>
> static inline void prefetch_switch_stack(struct task_struct *t) { }
>
> in the second case, rather than a macro. It provides typechecking.
* Re: Add prefetch switch stack hook in scheduler function
2005-07-27 23:23 ` david mosberger
@ 2005-07-28 7:41 ` Ingo Molnar
2005-07-28 8:09 ` Keith Owens
0 siblings, 1 reply; 40+ messages in thread
From: Ingo Molnar @ 2005-07-28 7:41 UTC (permalink / raw)
To: David.Mosberger; +Cc: Andrew Morton, Chen, Kenneth W, linux-kernel, linux-ia64
* david mosberger <dmosberger@gmail.com> wrote:
> Also, should this be called prefetch_stack() or perhaps even just
> prefetch_task()? Not every architecture defines a switch_stack
> structure.
yeah. I too would suggest calling it prefetch_stack(), and not making it a
macro & hook but something defined on all arches, with for now only ia64
having any real code in the inline function.
i'm wondering, is the switch_stack at the same/similar place as
next->thread_info? If yes then we could simply do a
prefetch(next->thread_info).
Ingo
* Re: Add prefetch switch stack hook in scheduler function
2005-07-28 7:41 ` Ingo Molnar
@ 2005-07-28 8:09 ` Keith Owens
2005-07-28 8:16 ` Ingo Molnar
2005-07-28 8:31 ` Nick Piggin
0 siblings, 2 replies; 40+ messages in thread
From: Keith Owens @ 2005-07-28 8:09 UTC (permalink / raw)
To: Ingo Molnar
Cc: David.Mosberger, Andrew Morton, Chen, Kenneth W, linux-kernel,
linux-ia64
On Thu, 28 Jul 2005 09:41:18 +0200,
Ingo Molnar <mingo@elte.hu> wrote:
>
>* david mosberger <dmosberger@gmail.com> wrote:
>
>> Also, should this be called prefetch_stack() or perhaps even just
>> prefetch_task()? Not every architecture defines a switch_stack
>> structure.
>
>yeah. I too would suggest calling it prefetch_stack(), and not making it a
>macro & hook but something defined on all arches, with for now only ia64
>having any real code in the inline function.
>
>i'm wondering, is the switch_stack at the same/similar place as
>next->thread_info? If yes then we could simply do a
>prefetch(next->thread_info).
No, they can be up to 30K apart. See include/asm-ia64/ptrace.h.
thread_info is at ~0xda0, depending on the config. The switch_stack
can be as high as 0x7bd0 in the kernel stack, depending on why the task
is sleeping.
* Re: Add prefetch switch stack hook in scheduler function
2005-07-28 8:09 ` Keith Owens
@ 2005-07-28 8:16 ` Ingo Molnar
2005-07-28 9:09 ` Ingo Molnar
2005-07-28 8:31 ` Nick Piggin
1 sibling, 1 reply; 40+ messages in thread
From: Ingo Molnar @ 2005-07-28 8:16 UTC (permalink / raw)
To: Keith Owens
Cc: David.Mosberger, Andrew Morton, Chen, Kenneth W, linux-kernel,
linux-ia64
* Keith Owens <kaos@ocs.com.au> wrote:
> >yeah. I too would suggest calling it prefetch_stack(), and not making it a
> >macro & hook but something defined on all arches, with for now only ia64
> >having any real code in the inline function.
> >
> >i'm wondering, is the switch_stack at the same/similar place as
> >next->thread_info? If yes then we could simply do a
> >prefetch(next->thread_info).
>
> No, they can be up to 30K apart. See include/asm-ia64/ptrace.h.
> thread_info is at ~0xda0, depending on the config. The switch_stack
> can be as high as 0x7bd0 in the kernel stack, depending on why the
> task is sleeping.
is the switch_stack the same thing as the kernel stack? If yes then we
want to have something like:
prefetch(kernel_stack(next));
to make it more generic. By default kernel_stack(next) could be
next->thread_info (to make sure we prefetch something real). On e.g.
x86/x64, kernel_stack(next) should be something like next->thread.esp.
Ingo
* Re: Add prefetch switch stack hook in scheduler function
2005-07-28 8:09 ` Keith Owens
2005-07-28 8:16 ` Ingo Molnar
@ 2005-07-28 8:31 ` Nick Piggin
2005-07-28 8:35 ` Ingo Molnar
1 sibling, 1 reply; 40+ messages in thread
From: Nick Piggin @ 2005-07-28 8:31 UTC (permalink / raw)
To: Keith Owens
Cc: Ingo Molnar, David.Mosberger, Andrew Morton, Chen, Kenneth W,
linux-kernel, linux-ia64
Keith Owens wrote:
> On Thu, 28 Jul 2005 09:41:18 +0200,
> Ingo Molnar <mingo@elte.hu> wrote:
>>i'm wondering, is the switch_stack at the same/similar place as
>>next->thread_info? If yes then we could simply do a
>>prefetch(next->thread_info).
>
>
> No, they can be up to 30K apart. See include/asm-ia64/ptrace.h.
> thread_info is at ~0xda0, depending on the config. The switch_stack
> can be as high as 0x7bd0 in the kernel stack, depending on why the task
> is sleeping.
>
Just a minor point, I agree with David: I'd like it to be called
prefetch_task(), because some architecture may want to prefetch other
memory.
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
* Re: Add prefetch switch stack hook in scheduler function
2005-07-28 8:31 ` Nick Piggin
@ 2005-07-28 8:35 ` Ingo Molnar
2005-07-28 8:48 ` Nick Piggin
0 siblings, 1 reply; 40+ messages in thread
From: Ingo Molnar @ 2005-07-28 8:35 UTC (permalink / raw)
To: Nick Piggin
Cc: Keith Owens, David.Mosberger, Andrew Morton, Chen, Kenneth W,
linux-kernel, linux-ia64
* Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> >No, they can be up to 30K apart. See include/asm-ia64/ptrace.h.
> >thread_info is at ~0xda0, depending on the config. The switch_stack
> >can be as high as 0x7bd0 in the kernel stack, depending on why the task
> >is sleeping.
> >
>
> Just a minor point, I agree with David: I'd like it to be called
> prefetch_task(), because some architecture may want to prefetch other
> memory.
such as?
Ingo
* Re: Add prefetch switch stack hook in scheduler function
2005-07-28 8:35 ` Ingo Molnar
@ 2005-07-28 8:48 ` Nick Piggin
2005-07-28 9:16 ` Ingo Molnar
0 siblings, 1 reply; 40+ messages in thread
From: Nick Piggin @ 2005-07-28 8:48 UTC (permalink / raw)
To: Ingo Molnar
Cc: Keith Owens, David.Mosberger, Andrew Morton, Chen, Kenneth W,
linux-kernel, linux-ia64
Ingo Molnar wrote:
> * Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
>
>>>No, they can be up to 30K apart. See include/asm-ia64/ptrace.h.
>>>thread_info is at ~0xda0, depending on the config. The switch_stack
>>>can be as high as 0x7bd0 in the kernel stack, depending on why the task
>>>is sleeping.
>>>
>>
>>Just a minor point, I agree with David: I'd like it to be called
>>prefetch_task(), because some architecture may want to prefetch other
>>memory.
>
>
> such as?
>
Not sure. thread_info? Maybe next->timestamp or some other fields
in next, something in next->mm?
I didn't really have a concrete example, but in the interests of
being future proof...
--
SUSE Labs, Novell Inc.
* Re: Add prefetch switch stack hook in scheduler function
2005-07-28 8:16 ` Ingo Molnar
@ 2005-07-28 9:09 ` Ingo Molnar
2005-07-28 19:14 ` Chen, Kenneth W
0 siblings, 1 reply; 40+ messages in thread
From: Ingo Molnar @ 2005-07-28 9:09 UTC (permalink / raw)
To: Keith Owens
Cc: David.Mosberger, Andrew Morton, Chen, Kenneth W, linux-kernel,
linux-ia64
* Ingo Molnar <mingo@elte.hu> wrote:
> [...] If yes then we want to have something like:
>
> prefetch(kernel_stack(next));
>
> to make it more generic. By default kernel_stack(next) could be
> next->thread_info (to make sure we prefetch something real). On e.g.
> x86/x64, kernel_stack(next) should be something like next->thread.esp.
i.e. like the patch below. Boot-tested on x86. x86, x64 and ia64 have a
real kernel_stack() implementation; the other architectures all return
'next'. (I've also cleaned up a couple of other things in the
prefetch-next area, see the changelog below.)
Ken, would this patch generate a sufficient amount of prefetching on
ia64?
Ingo
-----
For an architecture like ia64, the switch stack structure is fairly large
(currently 528 bytes). For context-switch-intensive applications, we
found that a significant number of cache misses occurs in the switch_to()
function. The following patch adds a hook in the schedule() function to
prefetch the switch stack structure as soon as the 'next' task is determined.
This allows maximum overlap in prefetching cache lines for that structure.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
modifications:
- add a generic kernel_stack() function instead of a hook
- prefetch the next task's thread_info and kernel stack as well
- don't try to prefetch 'next' itself - we've already touched it
so the cacheline is present.
- do the prefetching as early as possible
currently covered architectures: ia64, x86, x64. (Other architectures
will have to replace the default 'return task' in their kernel_stack()
implementations.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
include/asm-alpha/mmu_context.h | 9 +++++++++
include/asm-arm/mmu_context.h | 9 +++++++++
include/asm-arm26/mmu_context.h | 9 +++++++++
include/asm-cris/mmu_context.h | 9 +++++++++
include/asm-frv/mmu_context.h | 9 +++++++++
include/asm-h8300/mmu_context.h | 9 +++++++++
include/asm-i386/mmu_context.h | 5 +++++
include/asm-ia64/mmu_context.h | 9 +++++++++
include/asm-m32r/mmu_context.h | 9 +++++++++
include/asm-m68k/mmu_context.h | 9 +++++++++
include/asm-m68knommu/mmu_context.h | 9 +++++++++
include/asm-mips/mmu_context.h | 9 +++++++++
include/asm-parisc/mmu_context.h | 10 ++++++++++
include/asm-ppc/mmu_context.h | 9 +++++++++
include/asm-ppc64/mmu_context.h | 9 +++++++++
include/asm-s390/mmu_context.h | 9 +++++++++
include/asm-sh/mmu_context.h | 9 +++++++++
include/asm-sh64/mmu_context.h | 9 +++++++++
include/asm-sparc/mmu_context.h | 9 +++++++++
include/asm-sparc64/mmu_context.h | 9 +++++++++
include/asm-um/mmu_context.h | 9 +++++++++
include/asm-v850/mmu_context.h | 9 +++++++++
include/asm-x86_64/mmu_context.h | 9 +++++++++
include/asm-xtensa/mmu_context.h | 9 +++++++++
kernel/sched.c | 9 ++++++++-
25 files changed, 221 insertions(+), 1 deletion(-)
Index: linux/include/asm-alpha/mmu_context.h
===================================================================
--- linux.orig/include/asm-alpha/mmu_context.h
+++ linux/include/asm-alpha/mmu_context.h
@@ -258,4 +258,13 @@ enter_lazy_tlb(struct mm_struct *mm, str
#undef __MMU_EXTERN_INLINE
#endif
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif /* __ALPHA_MMU_CONTEXT_H */
Index: linux/include/asm-arm/mmu_context.h
===================================================================
--- linux.orig/include/asm-arm/mmu_context.h
+++ linux/include/asm-arm/mmu_context.h
@@ -93,4 +93,13 @@ switch_mm(struct mm_struct *prev, struct
#define deactivate_mm(tsk,mm) do { } while (0)
#define activate_mm(prev,next) switch_mm(prev, next, NULL)
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif
Index: linux/include/asm-arm26/mmu_context.h
===================================================================
--- linux.orig/include/asm-arm26/mmu_context.h
+++ linux/include/asm-arm26/mmu_context.h
@@ -48,4 +48,13 @@ static inline void activate_mm(struct mm
cpu_switch_mm(next->pgd, next);
}
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif
Index: linux/include/asm-cris/mmu_context.h
===================================================================
--- linux.orig/include/asm-cris/mmu_context.h
+++ linux/include/asm-cris/mmu_context.h
@@ -21,4 +21,13 @@ static inline void enter_lazy_tlb(struct
{
}
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif
Index: linux/include/asm-frv/mmu_context.h
===================================================================
--- linux.orig/include/asm-frv/mmu_context.h
+++ linux/include/asm-frv/mmu_context.h
@@ -47,4 +47,13 @@ do { \
do { \
} while(0)
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif
Index: linux/include/asm-h8300/mmu_context.h
===================================================================
--- linux.orig/include/asm-h8300/mmu_context.h
+++ linux/include/asm-h8300/mmu_context.h
@@ -29,4 +29,13 @@ extern inline void activate_mm(struct mm
{
}
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif
Index: linux/include/asm-i386/mmu_context.h
===================================================================
--- linux.orig/include/asm-i386/mmu_context.h
+++ linux/include/asm-i386/mmu_context.h
@@ -69,4 +69,9 @@ static inline void switch_mm(struct mm_s
#define activate_mm(prev, next) \
switch_mm((prev),(next),NULL)
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return (void *) task->thread.esp;
+}
+
#endif
Index: linux/include/asm-ia64/mmu_context.h
===================================================================
--- linux.orig/include/asm-ia64/mmu_context.h
+++ linux/include/asm-ia64/mmu_context.h
@@ -169,5 +169,14 @@ activate_mm (struct mm_struct *prev, str
#define switch_mm(prev_mm,next_mm,next_task) activate_mm(prev_mm, next_mm)
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return (void *) task->thread.ksp;
+}
+
# endif /* ! __ASSEMBLY__ */
#endif /* _ASM_IA64_MMU_CONTEXT_H */
Index: linux/include/asm-m32r/mmu_context.h
===================================================================
--- linux.orig/include/asm-m32r/mmu_context.h
+++ linux/include/asm-m32r/mmu_context.h
@@ -167,4 +167,13 @@ static inline void switch_mm(struct mm_s
#endif /* __KERNEL__ */
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif /* _ASM_M32R_MMU_CONTEXT_H */
Index: linux/include/asm-m68k/mmu_context.h
===================================================================
--- linux.orig/include/asm-m68k/mmu_context.h
+++ linux/include/asm-m68k/mmu_context.h
@@ -150,5 +150,14 @@ static inline void activate_mm(struct mm
activate_context(next_mm);
}
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif
#endif
Index: linux/include/asm-m68knommu/mmu_context.h
===================================================================
--- linux.orig/include/asm-m68knommu/mmu_context.h
+++ linux/include/asm-m68knommu/mmu_context.h
@@ -30,4 +30,13 @@ extern inline void activate_mm(struct mm
{
}
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif
Index: linux/include/asm-mips/mmu_context.h
===================================================================
--- linux.orig/include/asm-mips/mmu_context.h
+++ linux/include/asm-mips/mmu_context.h
@@ -193,4 +193,13 @@ drop_mmu_context(struct mm_struct *mm, u
local_irq_restore(flags);
}
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif /* _ASM_MMU_CONTEXT_H */
Index: linux/include/asm-parisc/mmu_context.h
===================================================================
--- linux.orig/include/asm-parisc/mmu_context.h
+++ linux/include/asm-parisc/mmu_context.h
@@ -70,4 +70,14 @@ static inline void activate_mm(struct mm
switch_mm(prev,next,current);
}
+
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif
Index: linux/include/asm-ppc/mmu_context.h
===================================================================
--- linux.orig/include/asm-ppc/mmu_context.h
+++ linux/include/asm-ppc/mmu_context.h
@@ -195,5 +195,14 @@ static inline void switch_mm(struct mm_s
extern void mmu_context_init(void);
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif /* __PPC_MMU_CONTEXT_H */
#endif /* __KERNEL__ */
Index: linux/include/asm-ppc64/mmu_context.h
===================================================================
--- linux.orig/include/asm-ppc64/mmu_context.h
+++ linux/include/asm-ppc64/mmu_context.h
@@ -84,4 +84,13 @@ static inline void activate_mm(struct mm
local_irq_restore(flags);
}
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif /* __PPC64_MMU_CONTEXT_H */
Index: linux/include/asm-s390/mmu_context.h
===================================================================
--- linux.orig/include/asm-s390/mmu_context.h
+++ linux/include/asm-s390/mmu_context.h
@@ -51,4 +51,13 @@ extern inline void activate_mm(struct mm
set_fs(current->thread.mm_segment);
}
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif
Index: linux/include/asm-sh/mmu_context.h
===================================================================
--- linux.orig/include/asm-sh/mmu_context.h
+++ linux/include/asm-sh/mmu_context.h
@@ -202,5 +202,14 @@ static inline void disable_mmu(void)
#define disable_mmu() do { BUG(); } while (0)
#endif
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif /* __KERNEL__ */
#endif /* __ASM_SH_MMU_CONTEXT_H */
Index: linux/include/asm-sh64/mmu_context.h
===================================================================
--- linux.orig/include/asm-sh64/mmu_context.h
+++ linux/include/asm-sh64/mmu_context.h
@@ -206,4 +206,13 @@ enter_lazy_tlb(struct mm_struct *mm, str
#endif /* __ASSEMBLY__ */
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif /* __ASM_SH64_MMU_CONTEXT_H */
Index: linux/include/asm-sparc/mmu_context.h
===================================================================
--- linux.orig/include/asm-sparc/mmu_context.h
+++ linux/include/asm-sparc/mmu_context.h
@@ -37,4 +37,13 @@ BTFIXUPDEF_CALL(void, switch_mm, struct
#endif /* !(__ASSEMBLY__) */
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif /* !(__SPARC_MMU_CONTEXT_H) */
Index: linux/include/asm-sparc64/mmu_context.h
===================================================================
--- linux.orig/include/asm-sparc64/mmu_context.h
+++ linux/include/asm-sparc64/mmu_context.h
@@ -142,4 +142,13 @@ static inline void activate_mm(struct mm
#endif /* !(__ASSEMBLY__) */
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif /* !(__SPARC64_MMU_CONTEXT_H) */
Index: linux/include/asm-um/mmu_context.h
===================================================================
--- linux.orig/include/asm-um/mmu_context.h
+++ linux/include/asm-um/mmu_context.h
@@ -66,6 +66,15 @@ static inline void destroy_context(struc
CHOOSE_MODE((void) 0, destroy_context_skas(mm));
}
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif
/*
Index: linux/include/asm-v850/mmu_context.h
===================================================================
--- linux.orig/include/asm-v850/mmu_context.h
+++ linux/include/asm-v850/mmu_context.h
@@ -8,4 +8,13 @@
#define activate_mm(prev,next) ((void)0)
#define enter_lazy_tlb(mm,tsk) ((void)0)
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif /* __V850_MMU_CONTEXT_H__ */
Index: linux/include/asm-x86_64/mmu_context.h
===================================================================
--- linux.orig/include/asm-x86_64/mmu_context.h
+++ linux/include/asm-x86_64/mmu_context.h
@@ -76,4 +76,13 @@ static inline void switch_mm(struct mm_s
switch_mm((prev),(next),NULL)
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return (void *) task->thread.rsp;
+}
+
#endif
Index: linux/include/asm-xtensa/mmu_context.h
===================================================================
--- linux.orig/include/asm-xtensa/mmu_context.h
+++ linux/include/asm-xtensa/mmu_context.h
@@ -327,4 +327,13 @@ static inline void enter_lazy_tlb(struct
}
+/*
+ * Returns the current bottom of a task's kernel stack. Used
+ * by the scheduler to prefetch it.
+ */
+static inline void * kernel_stack(struct task_struct *task)
+{
+ return task;
+}
+
#endif /* _XTENSA_MMU_CONTEXT_H */
Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -2864,6 +2864,13 @@ go_idle:
queue = array->queue + idx;
next = list_entry(queue->next, task_t, run_list);
+ /*
+ * Cache-prefetch crucial memory areas of the next task,
+ * its thread_info and its kernel stack:
+ */
+ prefetch(next->thread_info);
+ prefetch(kernel_stack(next));
+
if (!rt_task(next) && next->activated > 0) {
unsigned long long delta = now - next->timestamp;
if (unlikely((long long)(now - next->timestamp) < 0))
@@ -2886,7 +2893,7 @@ go_idle:
switch_tasks:
if (next == rq->idle)
schedstat_inc(rq, sched_goidle);
- prefetch(next);
+
clear_tsk_need_resched(prev);
rcu_qsctr_inc(task_cpu(prev));
* Re: Add prefetch switch stack hook in scheduler function
2005-07-28 8:48 ` Nick Piggin
@ 2005-07-28 9:16 ` Ingo Molnar
2005-07-28 9:19 ` Ingo Molnar
2005-07-28 9:34 ` Nick Piggin
0 siblings, 2 replies; 40+ messages in thread
From: Ingo Molnar @ 2005-07-28 9:16 UTC (permalink / raw)
To: Nick Piggin
Cc: Keith Owens, David.Mosberger, Andrew Morton, Chen, Kenneth W,
linux-kernel, linux-ia64
* Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> >>Just a minor point, I agree with David: I'd like it to be called
> >>prefetch_task(), because some architecture may want to prefetch other
> >>memory.
> >
> >such as?
>
> Not sure. thread_info? Maybe next->timestamp or some other fields in
> next, something in next->mm?
next->thread_info we could and should prefetch - but from the generic
scheduler code (see the patch i just sent).
i'm not sure what you mean by prefetching next->timestamp, it's an
inline field to 'next', in the first cacheline of it, which we've
already used so it's present. (If you mean the value of next->timestamp,
that has no address meaning at all so would lead to unpredictable
results on some arches.)
next->mm we might want to prefetch, but it's probably not worth it
because we are referencing it too soon, in context_switch(). (The
kernel stack itself won't be referenced until the full context-switch is
done.) But it might be worth trying - even then, it should be done from
the generic code, like the thread_info and kernel-stack prefetching.
> I didn't really have a concrete example, but in the interests of being
> future proof...
i'd like to keep generic bits in generic code, and only move things to
per-arch include files if absolutely necessary. next->mm is generic.
Ingo
* Re: Add prefetch switch stack hook in scheduler function
2005-07-28 9:16 ` Ingo Molnar
@ 2005-07-28 9:19 ` Ingo Molnar
2005-07-28 9:34 ` Nick Piggin
1 sibling, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2005-07-28 9:19 UTC (permalink / raw)
To: Nick Piggin
Cc: Keith Owens, David.Mosberger, Andrew Morton, Chen, Kenneth W,
linux-kernel, linux-ia64
* Ingo Molnar <mingo@elte.hu> wrote:
> next->mm we might want to prefetch, but it's probably not worth it
> because we are referencing it too soon, in context_switch(). (while
> the kernel stack itself wont be referenced until the full
> context-switch is done) But might be worth trying - but even then, it
> should be done from the generic code, like the thread_info and
> kernel-stack prefetching.
the patch below adds next->mm prefetching too, ontop of the previous
patch.
Ingo
------
cache-prefetch next->mm too.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
kernel/sched.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletion(-)
Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -2866,10 +2866,11 @@ go_idle:
/*
* Cache-prefetch crucial memory areas of the next task,
- * its thread_info and its kernel stack:
+ * its thread_info, its kernel stack and mm:
*/
prefetch(next->thread_info);
prefetch(kernel_stack(next));
+ prefetch(next->mm);
if (!rt_task(next) && next->activated > 0) {
unsigned long long delta = now - next->timestamp;
* Re: Add prefetch switch stack hook in scheduler function
2005-07-28 9:16 ` Ingo Molnar
2005-07-28 9:19 ` Ingo Molnar
@ 2005-07-28 9:34 ` Nick Piggin
2005-07-28 10:04 ` Ingo Molnar
1 sibling, 1 reply; 40+ messages in thread
From: Nick Piggin @ 2005-07-28 9:34 UTC (permalink / raw)
To: Ingo Molnar
Cc: Keith Owens, David.Mosberger, Andrew Morton, Chen, Kenneth W,
linux-kernel, linux-ia64
Ingo Molnar wrote:
> * Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
>
>>>such as?
>>
>>Not sure. thread_info? Maybe next->timestamp or some other fields in
>>next, something in next->mm?
>
>
> next->thread_info we could and should prefetch - but from the generic
> scheduler code (see the patch i just sent).
>
Right. We're always testing the TIF_NEED_RESCHED field after the
switch.
> i'm not sure what you mean by prefetching next->timestamp, it's an
> inline field to 'next', in the first cacheline of it, which we've
> already used so it's present. (If you mean the value of next->timestamp,
> that has no address meaning at all so would lead to unpredictable
> results on some arches.)
>
No, I meant the cacheline holding the field of course. I guess I
could have looked for a field further down, but even so, ->timestamp
might be 96 bytes into the structure on a 64-bit arch, which may or
may not be the first cacheline... but you get the idea.
> next->mm we might want to prefetch, but it's probably not worth it
> because we are referencing it too soon, in context_switch(). (while the
> kernel stack itself wont be referenced until the full context-switch is
> done) But might be worth trying - but even then, it should be done from
> the generic code, like the thread_info and kernel-stack prefetching.
>
>
>>I didn't really have a concrete example, but in the interests of being
>>future proof...
>
>
> i'd like to keep generic bits in generic code, and only move things to
> per-arch include files if absolutely necessary. next->mm is generic.
>
Yeah, then a specific field _within_ next->mm or thread_info may
want to be fetched. In short, I don't see any argument why we
shouldn't call the function prefetch_task().
Secondly, I don't really like your prefetch(kernel_stack()) function
because it doesn't really give architectures enough control over
exactly what cachelines they get in memory.
prefetching and memory access patterns of all this stuff are fairly
architecture specific. I see nothing wrong with having a prefetch_task()
call. (Although I agree things like thread_info->flags and next->mm can
be done in generic code).
Nick
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Add prefetch switch stack hook in scheduler function
2005-07-28 9:34 ` Nick Piggin
@ 2005-07-28 10:04 ` Ingo Molnar
2005-07-28 10:29 ` Nick Piggin
0 siblings, 1 reply; 40+ messages in thread
From: Ingo Molnar @ 2005-07-28 10:04 UTC (permalink / raw)
To: Nick Piggin
Cc: Keith Owens, David.Mosberger, Andrew Morton, Chen, Kenneth W,
linux-kernel, linux-ia64
* Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> >i'm not sure what you mean by prefetching next->timestamp, it's an
> >inline field to 'next', in the first cacheline of it, which we've
> >already used so it's present. (If you mean the value of next->timestamp,
> >that has no address meaning at all so would lead to unpredictable
> >results on some arches.)
> >
>
> No, I meant the cacheline holding the field of course. I guess I could
> have looked for a field further down, but even so, ->timestamp might
> be 96 bytes into the structure on a 64-bit arch, which may or may not
> be the first cacheline... but you get the idea.
i'd rather wait with that one. A prefetch can generate a prefetch of the
next cacheline too, so it might or might not make sense to prefetch both
cachelines explicitly.
Also, explicitly prefetching &next->timestamp is ugly because it includes
a number of bad assumptions. If multi-cacheline prefetching becomes
necessary then this needs to be implemented via a generic extension to
prefetch.h:
prefetch_area(void *first_addr, void *last_addr)
(or as addr,len)
This way an architecture can decide how it prefetches contiguous
cachelines _without knowing anything about the structure being
prefetched_.
Also, the generic code using the prefetches could this way designate
'cache-hot' areas without having to know about cacheline size or
prefetching tactics of the architecture. (prefetch_area() could be done
compile-time so it would be equivalent to a single-cacheline prefetch on
arches with larger cachelines.)
> > i'd like to keep generic bits in generic code, and only move things
> > to per-arch include files if absolutely necessary. next->mm is
> > generic.
>
> Yeah, then a specific field _within_ next->mm or thread_info may want
> to be fetched. In short, I don't see any argument why we shouldn't
> call the function prefetch_task().
it's a fundamental thing: we _don't_ want to push generic code into
architectures, and there's nothing per-arch about next->mm.
thread_info is really small on most arches, and even if they are not,
the most important data is always grouped to the top of a structure.
(this is a pretty fundamental concept throughout the kernel)
> Secondly, I don't really like your prefetch(kernel_stack()) function
> because it doesn't really give architectures enough control over
> exactly what cachelines they get in memory.
my point is, it comes down to concrete examples, it may or may not make
sense to do things per-arch.
If this were per-arch, most arches wouldn't get this benefit, and any
improvements would likely stay stuck in whatever arch made them. We've
seen that happen too many times to want to repeat the mistake
again. We've got 24 architectures (and counting), so any code 'left to
the architecture to control' will propagate to other architectures only
very slowly.
try to look at it from another angle: probably only because i insisted
on doing this in the generic code and not per-arch did we end up
discovering the potential generic need to prefetch next->thread_info and
next->mm.
> prefetching and memory access patterns of all this stuff are fairly
> architecture specific.
true of course, but from this it does not follow at all that we should
move all task-switch related prefetching into a prefetch_task() call!
> [...] I see nothing wrong with having a prefetch_task() call.
> (Although I agree things like thread_info->flags and next->mm can be
> done in generic code).
great that we now agree wrt. thread_info and next->mm. My remaining
point is, once we prefetch ->thread_info, ->mm and the kernel stack,
nothing else significant remains! (It's still very much possible that
something needs to be prefetched per-arch, but i'd like to see a robust
case be made for it, instead of your global 'it might happen' argument.)
Ingo
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Add prefetch switch stack hook in scheduler function
2005-07-28 10:04 ` Ingo Molnar
@ 2005-07-28 10:29 ` Nick Piggin
0 siblings, 0 replies; 40+ messages in thread
From: Nick Piggin @ 2005-07-28 10:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: Keith Owens, David.Mosberger, Andrew Morton, Chen, Kenneth W,
linux-kernel, linux-ia64
Ingo Molnar wrote:
> * Nick Piggin <nickpiggin@yahoo.com.au> wrote:
[...]
> prefetch_area(void *first_addr, void *last_addr)
>
> (or as addr,len)
>
Yep. We have prefetch_range.
>>
>>Yeah, then a specific field _within_ next->mm or thread_info may want
>>to be fetched. In short, I don't see any argument why we shouldn't
>>call the function prefetch_task().
>
>
> it's a fundamental thing: we _dont_ want to push generic code into
> architectures, and there's nothing per-arch about next->mm.
>
Yeah, I mean within mm, ie. prefetch(&mm->random_cacheline).
>>Secondly, I don't really like your prefetch(kernel_stack()) function
>>because it doesn't really give architectures enough control over
>>exactly what cachelines they get in memory.
>
>
> my point is, it comes down to concrete examples, it may or may not make
> sense to do things per-arch.
>
I thought the concrete example there was ia64's switch_stack,
which looks to be over half a K... oh I see you've asked Ken
whether this will be sufficient. OK in that case let's wait and
see.
>
>>[...] I see nothing wrong with having a prefetch_task() call.
>>(Although I agree things like thread_info->flags and next->mm can be
>>done in generic code).
>
>
> great that we now agree wrt. thread_info and next->mm. My remaining
> point is, once we prefetch ->thread_info, ->mm and the kernel stack,
> nothing else significant remains! (It's still very much possible that
> something needs to be prefetched per-arch, but i'd like to see a robust
> case be made for it, instead of your global 'it might happen' argument.)
>
Well to be clear, I think we have always agreed, except that I
thought it 'did happen' with the ia64 example. If it turns out
that your prefetch is good enough then I will have been mistaken.
Actually to be even clearer, I was never really arguing about what
to prefetch or whether to prefetch from arch code or not. Just that
the name, if any, should be prefetch_task as opposed to
prefetch_stack :)
Nick
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 40+ messages in thread
* RE: Add prefetch switch stack hook in scheduler function
2005-07-28 9:09 ` Ingo Molnar
@ 2005-07-28 19:14 ` Chen, Kenneth W
2005-07-29 7:04 ` Ingo Molnar
0 siblings, 1 reply; 40+ messages in thread
From: Chen, Kenneth W @ 2005-07-28 19:14 UTC (permalink / raw)
To: 'Ingo Molnar', Keith Owens
Cc: David.Mosberger, Andrew Morton, linux-kernel, linux-ia64
> i.e. like the patch below. Boot-tested on x86. x86, x64 and ia64 have a
> real kernel_stack() implementation, the other architectures all return
> 'next'. (I've also cleaned up a couple of other things in the
> prefetch-next area, see the changelog below.)
>
> Ken, would this patch generate a sufficient amount of prefetching on
> ia64?
Sorry, this is not enough. The switch stack on ia64 is 528 bytes, so we need
to prefetch 5 lines. It should probably use prefetch_range(), but on ia64
prefetch_range strides by L1_CACHE_BYTES, where I really want to stride by the
L3 cache line size.
We also want to prefetch the switch stack for the current task, since
processor state is saved onto the stack for the outgoing process. And that
stack is almost guaranteed to be cold, because the switch stack is created
below the current stack pointer.
Can we just go back to prefetch_stack() or prefetch_task() (or use plural
name) and let each arch decide what to do with it?
- Ken
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Add prefetch switch stack hook in scheduler function
2005-07-28 19:14 ` Chen, Kenneth W
@ 2005-07-29 7:04 ` Ingo Molnar
2005-07-29 7:07 ` Ingo Molnar
` (2 more replies)
0 siblings, 3 replies; 40+ messages in thread
From: Ingo Molnar @ 2005-07-29 7:04 UTC (permalink / raw)
To: Chen, Kenneth W
Cc: Keith Owens, David.Mosberger, Andrew Morton, linux-kernel,
linux-ia64
* Chen, Kenneth W <kenneth.w.chen@intel.com> wrote:
> > i.e. like the patch below. Boot-tested on x86. x86, x64 and ia64 have a
> > real kernel_stack() implementation, the other architectures all return
> > 'next'. (I've also cleaned up a couple of other things in the
> > prefetch-next area, see the changelog below.)
> >
> > Ken, would this patch generate a sufficient amount of prefetching on
> > ia64?
>
> Sorry, this is not enough. Switch stack on ia64 is 528 bytes. We
> need to prefetch 5 lines. It probably should use prefetch_range().
ok, how about the additional patch below? Does this do the trick on
ia64? It makes complete sense on every architecture to prefetch from
below the current kernel stack, in the expectation of the next task
touching the stack. The only difference is that for ia64 the 'expected
minimum stack footprint' is larger, due to the switch_stack.
Ingo
-------
enable architectures to define a 'minimum number of kernel stack
bytes touched' - which will be prefetched from the scheduler.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
include/asm-alpha/mmu_context.h | 6 ++++++
include/asm-arm/mmu_context.h | 6 ++++++
include/asm-arm26/mmu_context.h | 6 ++++++
include/asm-cris/mmu_context.h | 6 ++++++
include/asm-frv/mmu_context.h | 6 ++++++
include/asm-h8300/mmu_context.h | 6 ++++++
include/asm-i386/mmu_context.h | 6 ++++++
include/asm-ia64/mmu_context.h | 6 ++++++
include/asm-m32r/mmu_context.h | 6 ++++++
include/asm-m68k/mmu_context.h | 6 ++++++
include/asm-m68knommu/mmu_context.h | 6 ++++++
include/asm-mips/mmu_context.h | 6 ++++++
include/asm-parisc/mmu_context.h | 6 ++++++
include/asm-ppc/mmu_context.h | 6 ++++++
include/asm-ppc64/mmu_context.h | 6 ++++++
include/asm-s390/mmu_context.h | 6 ++++++
include/asm-sh/mmu_context.h | 6 ++++++
include/asm-sh64/mmu_context.h | 6 ++++++
include/asm-sparc/mmu_context.h | 6 ++++++
include/asm-sparc64/mmu_context.h | 6 ++++++
include/asm-um/mmu_context.h | 6 ++++++
include/asm-v850/mmu_context.h | 6 ++++++
include/asm-x86_64/mmu_context.h | 5 +++++
include/asm-xtensa/mmu_context.h | 6 ++++++
kernel/sched.c | 9 ++++++++-
25 files changed, 151 insertions(+), 1 deletion(-)
Index: linux/include/asm-alpha/mmu_context.h
===================================================================
--- linux.orig/include/asm-alpha/mmu_context.h
+++ linux/include/asm-alpha/mmu_context.h
@@ -259,6 +259,12 @@ enter_lazy_tlb(struct mm_struct *mm, str
#endif
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-arm/mmu_context.h
===================================================================
--- linux.orig/include/asm-arm/mmu_context.h
+++ linux/include/asm-arm/mmu_context.h
@@ -94,6 +94,12 @@ switch_mm(struct mm_struct *prev, struct
#define activate_mm(prev,next) switch_mm(prev, next, NULL)
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-arm26/mmu_context.h
===================================================================
--- linux.orig/include/asm-arm26/mmu_context.h
+++ linux/include/asm-arm26/mmu_context.h
@@ -49,6 +49,12 @@ static inline void activate_mm(struct mm
}
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-cris/mmu_context.h
===================================================================
--- linux.orig/include/asm-cris/mmu_context.h
+++ linux/include/asm-cris/mmu_context.h
@@ -22,6 +22,12 @@ static inline void enter_lazy_tlb(struct
}
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-frv/mmu_context.h
===================================================================
--- linux.orig/include/asm-frv/mmu_context.h
+++ linux/include/asm-frv/mmu_context.h
@@ -48,6 +48,12 @@ do { \
} while(0)
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-h8300/mmu_context.h
===================================================================
--- linux.orig/include/asm-h8300/mmu_context.h
+++ linux/include/asm-h8300/mmu_context.h
@@ -30,6 +30,12 @@ extern inline void activate_mm(struct mm
}
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-i386/mmu_context.h
===================================================================
--- linux.orig/include/asm-i386/mmu_context.h
+++ linux/include/asm-i386/mmu_context.h
@@ -69,6 +69,12 @@ static inline void switch_mm(struct mm_s
#define activate_mm(prev, next) \
switch_mm((prev),(next),NULL)
+/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
static inline void * kernel_stack(struct task_struct *task)
{
return (void *) task->thread.esp;
Index: linux/include/asm-ia64/mmu_context.h
===================================================================
--- linux.orig/include/asm-ia64/mmu_context.h
+++ linux/include/asm-ia64/mmu_context.h
@@ -170,6 +170,12 @@ activate_mm (struct mm_struct *prev, str
#define switch_mm(prev_mm,next_mm,next_task) activate_mm(prev_mm, next_mm)
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT IA64_SWITCH_STACK_SIZE
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-m32r/mmu_context.h
===================================================================
--- linux.orig/include/asm-m32r/mmu_context.h
+++ linux/include/asm-m32r/mmu_context.h
@@ -168,6 +168,12 @@ static inline void switch_mm(struct mm_s
#endif /* __KERNEL__ */
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-m68k/mmu_context.h
===================================================================
--- linux.orig/include/asm-m68k/mmu_context.h
+++ linux/include/asm-m68k/mmu_context.h
@@ -151,6 +151,12 @@ static inline void activate_mm(struct mm
}
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-m68knommu/mmu_context.h
===================================================================
--- linux.orig/include/asm-m68knommu/mmu_context.h
+++ linux/include/asm-m68knommu/mmu_context.h
@@ -31,6 +31,12 @@ extern inline void activate_mm(struct mm
}
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-mips/mmu_context.h
===================================================================
--- linux.orig/include/asm-mips/mmu_context.h
+++ linux/include/asm-mips/mmu_context.h
@@ -194,6 +194,12 @@ drop_mmu_context(struct mm_struct *mm, u
}
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-parisc/mmu_context.h
===================================================================
--- linux.orig/include/asm-parisc/mmu_context.h
+++ linux/include/asm-parisc/mmu_context.h
@@ -72,6 +72,12 @@ static inline void activate_mm(struct mm
}
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-ppc/mmu_context.h
===================================================================
--- linux.orig/include/asm-ppc/mmu_context.h
+++ linux/include/asm-ppc/mmu_context.h
@@ -196,6 +196,12 @@ static inline void switch_mm(struct mm_s
extern void mmu_context_init(void);
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-ppc64/mmu_context.h
===================================================================
--- linux.orig/include/asm-ppc64/mmu_context.h
+++ linux/include/asm-ppc64/mmu_context.h
@@ -85,6 +85,12 @@ static inline void activate_mm(struct mm
}
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-s390/mmu_context.h
===================================================================
--- linux.orig/include/asm-s390/mmu_context.h
+++ linux/include/asm-s390/mmu_context.h
@@ -52,6 +52,12 @@ extern inline void activate_mm(struct mm
}
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-sh/mmu_context.h
===================================================================
--- linux.orig/include/asm-sh/mmu_context.h
+++ linux/include/asm-sh/mmu_context.h
@@ -203,6 +203,12 @@ static inline void disable_mmu(void)
#endif
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-sh64/mmu_context.h
===================================================================
--- linux.orig/include/asm-sh64/mmu_context.h
+++ linux/include/asm-sh64/mmu_context.h
@@ -207,6 +207,12 @@ enter_lazy_tlb(struct mm_struct *mm, str
#endif /* __ASSEMBLY__ */
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-sparc/mmu_context.h
===================================================================
--- linux.orig/include/asm-sparc/mmu_context.h
+++ linux/include/asm-sparc/mmu_context.h
@@ -38,6 +38,12 @@ BTFIXUPDEF_CALL(void, switch_mm, struct
#endif /* !(__ASSEMBLY__) */
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-sparc64/mmu_context.h
===================================================================
--- linux.orig/include/asm-sparc64/mmu_context.h
+++ linux/include/asm-sparc64/mmu_context.h
@@ -143,6 +143,12 @@ static inline void activate_mm(struct mm
#endif /* !(__ASSEMBLY__) */
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-um/mmu_context.h
===================================================================
--- linux.orig/include/asm-um/mmu_context.h
+++ linux/include/asm-um/mmu_context.h
@@ -67,6 +67,12 @@ static inline void destroy_context(struc
}
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-v850/mmu_context.h
===================================================================
--- linux.orig/include/asm-v850/mmu_context.h
+++ linux/include/asm-v850/mmu_context.h
@@ -9,6 +9,12 @@
#define enter_lazy_tlb(mm,tsk) ((void)0)
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/include/asm-x86_64/mmu_context.h
===================================================================
--- linux.orig/include/asm-x86_64/mmu_context.h
+++ linux/include/asm-x86_64/mmu_context.h
@@ -75,6 +75,11 @@ static inline void switch_mm(struct mm_s
#define activate_mm(prev, next) \
switch_mm((prev),(next),NULL)
+/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
/*
* Returns the current bottom of a task's kernel stack. Used
Index: linux/include/asm-xtensa/mmu_context.h
===================================================================
--- linux.orig/include/asm-xtensa/mmu_context.h
+++ linux/include/asm-xtensa/mmu_context.h
@@ -328,6 +328,12 @@ static inline void enter_lazy_tlb(struct
}
/*
+ * Minimum number of bytes a new task will touch on the
+ * kernel stack:
+ */
+#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
+
+/*
* Returns the current bottom of a task's kernel stack. Used
* by the scheduler to prefetch it.
*/
Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -2869,7 +2869,14 @@ go_idle:
* its thread_info, its kernel stack and mm:
*/
prefetch(next->thread_info);
- prefetch(kernel_stack(next));
+ /*
+ * Prefetch (at least) a cacheline below the current
+ * kernel stack (in expectation of any new task touching
+ * the stack at least minimally), and a cacheline above
+ * the stack:
+ */
+ prefetch_range(kernel_stack(next) - MIN_KERNEL_STACK_FOOTPRINT,
+ MIN_KERNEL_STACK_FOOTPRINT + L1_CACHE_BYTES);
prefetch(next->mm);
if (!rt_task(next) && next->activated > 0) {
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Add prefetch switch stack hook in scheduler function
2005-07-29 7:04 ` Ingo Molnar
@ 2005-07-29 7:07 ` Ingo Molnar
2005-07-29 8:30 ` Eric Dumazet
` (2 more replies)
2005-07-29 7:22 ` Chen, Kenneth W
2005-07-29 7:38 ` Keith Owens
2 siblings, 3 replies; 40+ messages in thread
From: Ingo Molnar @ 2005-07-29 7:07 UTC (permalink / raw)
To: Chen, Kenneth W
Cc: Keith Owens, David.Mosberger, Andrew Morton, linux-kernel,
linux-ia64
* Ingo Molnar <mingo@elte.hu> wrote:
> > Sorry, this is not enough. Switch stack on ia64 is 528 bytes. We
> > need to prefetch 5 lines. It probably should use prefetch_range().
>
> ok, how about the additional patch below? Does this do the trick on
> ia64? It makes complete sense on every architecture to prefetch from
> below the current kernel stack, in the expectation of the next task
> touching the stack. The only difference is that for ia64 the 'expected
> minimum stack footprint' is larger, due to the switch_stack.
the patch below unrolls the prefetch_range() loop manually, for up to 5
cachelines prefetched. This patch, ontop of the 4 previous patches,
should generate similar code to the assembly code in your original
patch. The full patch-series is:
patches/prefetch-next.patch
patches/prefetch-mm.patch
patches/prefetch-kstack-size.patch
patches/prefetch-unroll.patch
Ingo
---------
unroll prefetch_range() loops manually.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
include/linux/prefetch.h | 31 +++++++++++++++++++++++++++++--
1 files changed, 29 insertions(+), 2 deletions(-)
Index: linux/include/linux/prefetch.h
===================================================================
--- linux.orig/include/linux/prefetch.h
+++ linux/include/linux/prefetch.h
@@ -58,11 +58,38 @@ static inline void prefetchw(const void
static inline void prefetch_range(void *addr, size_t len)
{
#ifdef ARCH_HAS_PREFETCH
- char *cp;
+ char *cp = addr;
char *end = addr + len;
- for (cp = addr; cp < end; cp += PREFETCH_STRIDE)
+ /*
+ * Unroll aggressively:
+ */
+ if (len <= PREFETCH_STRIDE)
prefetch(cp);
+ else if (len <= 2*PREFETCH_STRIDE) {
+ prefetch(cp);
+ prefetch(cp + PREFETCH_STRIDE);
+ }
+ else if (len <= 3*PREFETCH_STRIDE) {
+ prefetch(cp);
+ prefetch(cp + PREFETCH_STRIDE);
+ prefetch(cp + 2*PREFETCH_STRIDE);
+ }
+ else if (len <= 4*PREFETCH_STRIDE) {
+ prefetch(cp);
+ prefetch(cp + PREFETCH_STRIDE);
+ prefetch(cp + 2*PREFETCH_STRIDE);
+ prefetch(cp + 3*PREFETCH_STRIDE);
+ }
+ else if (len <= 5*PREFETCH_STRIDE) {
+ prefetch(cp);
+ prefetch(cp + PREFETCH_STRIDE);
+ prefetch(cp + 2*PREFETCH_STRIDE);
+ prefetch(cp + 3*PREFETCH_STRIDE);
+ prefetch(cp + 4*PREFETCH_STRIDE);
+ } else
+ for (; cp < end; cp += PREFETCH_STRIDE)
+ prefetch(cp);
#endif
}
^ permalink raw reply [flat|nested] 40+ messages in thread
* RE: Add prefetch switch stack hook in scheduler function
2005-07-29 7:04 ` Ingo Molnar
2005-07-29 7:07 ` Ingo Molnar
@ 2005-07-29 7:22 ` Chen, Kenneth W
2005-07-29 7:45 ` Keith Owens
2005-07-29 8:28 ` Ingo Molnar
2005-07-29 7:38 ` Keith Owens
2 siblings, 2 replies; 40+ messages in thread
From: Chen, Kenneth W @ 2005-07-29 7:22 UTC (permalink / raw)
To: 'Ingo Molnar'
Cc: Keith Owens, David.Mosberger, Andrew Morton, linux-kernel,
linux-ia64
Ingo Molnar wrote on Friday, July 29, 2005 12:05 AM
> --- linux.orig/kernel/sched.c
> +++ linux/kernel/sched.c
> @@ -2869,7 +2869,14 @@ go_idle:
> * its thread_info, its kernel stack and mm:
> */
> prefetch(next->thread_info);
> - prefetch(kernel_stack(next));
> + /*
> + * Prefetch (at least) a cacheline below the current
> + * kernel stack (in expectation of any new task touching
> + * the stack at least minimally), and a cacheline above
> + * the stack:
> + */
> + prefetch_range(kernel_stack(next) - MIN_KERNEL_STACK_FOOTPRINT,
> + MIN_KERNEL_STACK_FOOTPRINT + L1_CACHE_BYTES);
> prefetch(next->mm);
Doctor, it still hurts :-(
On ia64, we have two kernel stacks: one for the outgoing task, and one for the
incoming task. For the outgoing task, we haven't called switch_to() yet,
so the switch stack structure for 'current' will be allocated immediately
below the current 'sp' pointer. The incoming task was fully switched out
previously, so its switch stack structure is immediately above kernel_stack(next).
It would be beneficial to prefetch both stacks.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Add prefetch switch stack hook in scheduler function
2005-07-29 7:04 ` Ingo Molnar
2005-07-29 7:07 ` Ingo Molnar
2005-07-29 7:22 ` Chen, Kenneth W
@ 2005-07-29 7:38 ` Keith Owens
2005-07-29 8:08 ` Chen, Kenneth W
2 siblings, 1 reply; 40+ messages in thread
From: Keith Owens @ 2005-07-29 7:38 UTC (permalink / raw)
To: Ingo Molnar
Cc: Chen, Kenneth W, David.Mosberger, Andrew Morton, linux-kernel,
linux-ia64
On Fri, 29 Jul 2005 09:04:48 +0200,
Ingo Molnar <mingo@elte.hu> wrote:
>ok, how about the additional patch below? Does this do the trick on
>ia64? It makes complete sense on every architecture to prefetch from
>below the current kernel stack, in the expectation of the next task
>touching the stack. The only difference is that for ia64 the 'expected
>minimum stack footprint' is larger, due to the switch_stack.
>...
>Index: linux/kernel/sched.c
>===================================================================
>--- linux.orig/kernel/sched.c
>+++ linux/kernel/sched.c
>@@ -2869,7 +2869,14 @@ go_idle:
> * its thread_info, its kernel stack and mm:
> */
> prefetch(next->thread_info);
>- prefetch(kernel_stack(next));
>+ /*
>+ * Prefetch (at least) a cacheline below the current
>+ * kernel stack (in expectation of any new task touching
>+ * the stack at least minimally), and a cacheline above
>+ * the stack:
>+ */
>+ prefetch_range(kernel_stack(next) - MIN_KERNEL_STACK_FOOTPRINT,
>+ MIN_KERNEL_STACK_FOOTPRINT + L1_CACHE_BYTES);
> prefetch(next->mm);
>
> if (!rt_task(next) && next->activated > 0) {
Surely the prefetch range has to depend on which direction the stack
grows. For stacks that grow down, we want to prefetch from esp/ksp upwards,
prefetch_range(kernel_stack(next),
MIN_KERNEL_STACK_FOOTPRINT + L1_CACHE_BYTES);
For stacks that grow up, we want to prefetch from esp/ksp downwards
prefetch_range(kernel_stack(next) - MIN_KERNEL_STACK_FOOTPRINT,
MIN_KERNEL_STACK_FOOTPRINT + L1_CACHE_BYTES);
BTW, for ia64 you may as well prefetch pt_regs, that is also quite
large.
#define MIN_KERNEL_STACK_FOOTPRINT (IA64_SWITCH_STACK_SIZE + IA64_PT_REGS_SIZE)
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Add prefetch switch stack hook in scheduler function
2005-07-29 7:22 ` Chen, Kenneth W
@ 2005-07-29 7:45 ` Keith Owens
2005-07-29 8:02 ` Chen, Kenneth W
2005-07-29 8:28 ` Ingo Molnar
1 sibling, 1 reply; 40+ messages in thread
From: Keith Owens @ 2005-07-29 7:45 UTC (permalink / raw)
To: Chen, Kenneth W
Cc: 'Ingo Molnar', David.Mosberger, Andrew Morton,
linux-kernel, linux-ia64
On Fri, 29 Jul 2005 00:22:43 -0700,
"Chen, Kenneth W" <kenneth.w.chen@intel.com> wrote:
>On ia64, we have two kernel stacks, one for outgoing task, and one for
>incoming task. for outgoing task, we haven't called switch_to() yet.
>So the switch stack structure for 'current' will be allocated immediately
>below current 'sp' pointer. For the incoming task, it was fully ctx'ed out
>previously, so switch stack structure is immediate above kernel_stack(next).
>It Would be beneficial to prefetch both stacks.
struct switch_stack for current is all write data, no reading is done.
Is it worth doing prefetchw() for current? IOW, is there any
measurable performance gain?
^ permalink raw reply [flat|nested] 40+ messages in thread
* RE: Add prefetch switch stack hook in scheduler function
2005-07-29 7:45 ` Keith Owens
@ 2005-07-29 8:02 ` Chen, Kenneth W
0 siblings, 0 replies; 40+ messages in thread
From: Chen, Kenneth W @ 2005-07-29 8:02 UTC (permalink / raw)
To: 'Keith Owens'
Cc: 'Ingo Molnar', David.Mosberger, Andrew Morton,
linux-kernel, linux-ia64
Keith Owens wrote on Friday, July 29, 2005 12:46 AM
> On Fri, 29 Jul 2005 00:22:43 -0700,
> "Chen, Kenneth W" <kenneth.w.chen@intel.com> wrote:
> >On ia64, we have two kernel stacks, one for the outgoing task and one for
> >the incoming task. For the outgoing task, we haven't called switch_to() yet,
> >so the switch stack structure for 'current' will be allocated immediately
> >below the current 'sp' pointer. The incoming task was fully context-switched
> >out previously, so its switch stack structure is immediately above
> >kernel_stack(next). It would be beneficial to prefetch both stacks.
>
> struct switch_stack for current is all write data, no reading is done.
> Is it worth doing prefetchw() for current?
Oh yes, very much so. The L2 is an out-of-order cache and it can only
queue a limited number of store operations. With the number of stores for
the switch stack structure, that hardware limit is easily exceeded.
> IOW, is there any measurable performance gain?
I don't have an exact breakdown of how much of the gain comes from
prefetching the outgoing process versus the incoming process, but I
believe both contribute to the performance gain.
- Ken
^ permalink raw reply [flat|nested] 40+ messages in thread
* RE: Add prefetch switch stack hook in scheduler function
2005-07-29 7:38 ` Keith Owens
@ 2005-07-29 8:08 ` Chen, Kenneth W
0 siblings, 0 replies; 40+ messages in thread
From: Chen, Kenneth W @ 2005-07-29 8:08 UTC (permalink / raw)
To: 'Keith Owens', Ingo Molnar
Cc: David.Mosberger, Andrew Morton, linux-kernel, linux-ia64
Keith Owens wrote on Friday, July 29, 2005 12:38 AM
> BTW, for ia64 you may as well prefetch pt_regs, that is also quite
> large.
>
> #define MIN_KERNEL_STACK_FOOTPRINT (IA64_SWITCH_STACK_SIZE + IA64_PT_REGS_SIZE)
This has to be done carefully, because you really don't want to exceed
the limit on outstanding L2 cache misses. The L2 has only a limited miss
queue, and once that is filled the cpu gets into the L2 recirculate
state. That is going to be very painful, because it translates into cpu
stall cycles.
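A bounded variant makes that concern concrete (a sketch; the cap of 8 lines is an illustrative assumption, not a measured Itanium queue depth, and `__builtin_prefetch` stands in for the kernel's `prefetch()`):

```c
#include <stddef.h>

#define PREFETCH_STRIDE    128
#define MAX_PREFETCH_LINES 8   /* illustrative cap, not a measured limit */

/* Prefetch at most MAX_PREFETCH_LINES cache lines of [addr, addr+len),
 * so a large range cannot flood the L2 miss queue. Returns the number
 * of prefetch hints actually issued. */
static int prefetch_range_capped(const void *addr, size_t len)
{
	const char *cp = addr;
	size_t lines = (len + PREFETCH_STRIDE - 1) / PREFETCH_STRIDE;
	int issued = 0;

	if (lines > MAX_PREFETCH_LINES)
		lines = MAX_PREFETCH_LINES;
	for (; issued < (int)lines; issued++, cp += PREFETCH_STRIDE)
		__builtin_prefetch(cp);
	return issued;
}
```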
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Add prefetch switch stack hook in scheduler function
2005-07-29 7:22 ` Chen, Kenneth W
2005-07-29 7:45 ` Keith Owens
@ 2005-07-29 8:28 ` Ingo Molnar
2005-07-29 9:02 ` Russell King
1 sibling, 1 reply; 40+ messages in thread
From: Ingo Molnar @ 2005-07-29 8:28 UTC (permalink / raw)
To: Chen, Kenneth W
Cc: Keith Owens, David.Mosberger, Andrew Morton, linux-kernel,
linux-ia64
* Chen, Kenneth W <kenneth.w.chen@intel.com> wrote:
> On ia64, we have two kernel stacks, one for the outgoing task and one
> for the incoming task. For the outgoing task, we haven't called
> switch_to() yet, so the switch stack structure for 'current' will be
> allocated immediately below the current 'sp' pointer. The incoming task
> was fully context-switched out previously, so its switch stack structure
> is immediately above kernel_stack(next). It would be beneficial to
> prefetch both stacks.
ok, could you apply the two patches below? The first one implements
prefetchw_range(), the second one makes use of it to prefetch the
outgoing stack for writes. We can actually do the current task
prefetching much earlier, because unlike the next task, we know its
stack pointer right away, so this patch could have some additional
benefits.
btw., i'm not totally convinced this is needed, because even in a
context-switch-intense workload (as DB workloads are), i'd expect the
outgoing task to still have a hot switch-stack, either due to having
context-switched recently, or due to having done some deeper kernel
function call recently. The next task will likely be cache-cold, but the
current task is usually cache-hot. Maybe not cache-hot to a depth of 528
bytes, but it needs to be measured: it would be nice to benchmark the
effects of this particular patch in isolation as well, so that we know
the breakdown.
Ingo
----
introduce prefetchw_range(addr, len).
Signed-off-by: Ingo Molnar <mingo@elte.hu>
include/linux/prefetch.h | 39 +++++++++++++++++++++++++++++++++++++++
1 files changed, 39 insertions(+)
Index: linux-prefetch-task/include/linux/prefetch.h
===================================================================
--- linux-prefetch-task.orig/include/linux/prefetch.h
+++ linux-prefetch-task/include/linux/prefetch.h
@@ -93,4 +93,43 @@ static inline void prefetch_range(void *
#endif
}
+static inline void prefetchw_range(void *addr, size_t len)
+{
+#ifdef ARCH_HAS_PREFETCH
+ char *cp = addr;
+ char *end = addr + len;
+
+ /*
+ * Unroll aggressively:
+ */
+ if (len <= PREFETCH_STRIDE)
+ prefetchw(cp);
+ else if (len <= 2*PREFETCH_STRIDE) {
+ prefetchw(cp);
+ prefetchw(cp + PREFETCH_STRIDE);
+ }
+ else if (len <= 3*PREFETCH_STRIDE) {
+ prefetchw(cp);
+ prefetchw(cp + PREFETCH_STRIDE);
+ prefetchw(cp + 2*PREFETCH_STRIDE);
+ }
+ else if (len <= 4*PREFETCH_STRIDE) {
+ prefetchw(cp);
+ prefetchw(cp + PREFETCH_STRIDE);
+ prefetchw(cp + 2*PREFETCH_STRIDE);
+ prefetchw(cp + 3*PREFETCH_STRIDE);
+ }
+ else if (len <= 5*PREFETCH_STRIDE) {
+ prefetchw(cp);
+ prefetchw(cp + PREFETCH_STRIDE);
+ prefetchw(cp + 2*PREFETCH_STRIDE);
+ prefetchw(cp + 3*PREFETCH_STRIDE);
+ prefetchw(cp + 4*PREFETCH_STRIDE);
+ } else
+ for (; cp < end; cp += PREFETCH_STRIDE)
+ prefetchw(cp);
+#endif
+}
+
+
#endif
-----
prefetch the current kernel stack for writing.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
kernel/sched.c | 12 +++++++++---
1 files changed, 9 insertions(+), 3 deletions(-)
Index: linux-prefetch-task/kernel/sched.c
===================================================================
--- linux-prefetch-task.orig/kernel/sched.c
+++ linux-prefetch-task/kernel/sched.c
@@ -2754,6 +2754,12 @@ asmlinkage void __sched schedule(void)
int cpu, idx, new_prio;
/*
+ * Prefetch the current stack for writing (we use switch_count's
+ * address to get to the stack pointer):
+ */
+ prefetchw_range((void *)&switch_count - MIN_KERNEL_STACK_FOOTPRINT,
+ MIN_KERNEL_STACK_FOOTPRINT + L1_CACHE_BYTES);
+ /*
* Test if we are atomic. Since do_exit() needs to call into
* schedule() atomically, we ignore that path for now.
* Otherwise, whine if we are scheduling when we should not be.
@@ -2872,10 +2878,10 @@ go_idle:
/*
* Prefetch (at least) a cacheline below the current
* kernel stack (in expectation of any new task touching
- * the stack at least minimally), and a cacheline above
- * the stack:
+ * the stack at least minimally), and at least a cacheline
+ * above the stack:
*/
- prefetch_range(kernel_stack(next) - MIN_KERNEL_STACK_FOOTPRINT,
+ prefetch_range(kernel_stack(next) - L1_CACHE_BYTES,
MIN_KERNEL_STACK_FOOTPRINT + L1_CACHE_BYTES);
prefetch(next->mm);
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Add prefetch switch stack hook in scheduler function
2005-07-29 7:07 ` Ingo Molnar
@ 2005-07-29 8:30 ` Eric Dumazet
2005-07-29 8:44 ` Ingo Molnar
2005-07-31 16:27 ` hashed spinlocks Daniel Walker
2005-07-29 8:30 ` Add prefetch switch stack hook in scheduler function Chen, Kenneth W
2005-07-29 9:17 ` Peter Zijlstra
2 siblings, 2 replies; 40+ messages in thread
From: Eric Dumazet @ 2005-07-29 8:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: Chen, Kenneth W, Keith Owens, David.Mosberger, Andrew Morton,
linux-kernel, linux-ia64
[-- Attachment #1: Type: text/plain, Size: 1647 bytes --]
Ingo Molnar wrote:
> unroll prefetch_range() loops manually.
>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
>
> include/linux/prefetch.h | 31 +++++++++++++++++++++++++++++--
> 1 files changed, 29 insertions(+), 2 deletions(-)
>
> Index: linux/include/linux/prefetch.h
> ===================================================================
> --- linux.orig/include/linux/prefetch.h
> +++ linux/include/linux/prefetch.h
> @@ -58,11 +58,38 @@ static inline void prefetchw(const void
> static inline void prefetch_range(void *addr, size_t len)
> {
> #ifdef ARCH_HAS_PREFETCH
> - char *cp;
> + char *cp = addr;
> char *end = addr + len;
>
> - for (cp = addr; cp < end; cp += PREFETCH_STRIDE)
> + /*
> + * Unroll aggressively:
> + */
> + if (len <= PREFETCH_STRIDE)
> prefetch(cp);
> + else if (len <= 2*PREFETCH_STRIDE) {
> + prefetch(cp);
> + prefetch(cp + PREFETCH_STRIDE);
> + }
> + else if (len <= 3*PREFETCH_STRIDE) {
> + prefetch(cp);
> + prefetch(cp + PREFETCH_STRIDE);
> + prefetch(cp + 2*PREFETCH_STRIDE);
> + }
> + else if (len <= 4*PREFETCH_STRIDE) {
> + prefetch(cp);
> + prefetch(cp + PREFETCH_STRIDE);
> + prefetch(cp + 2*PREFETCH_STRIDE);
> + prefetch(cp + 3*PREFETCH_STRIDE);
> + }
> + else if (len <= 5*PREFETCH_STRIDE) {
> + prefetch(cp);
> + prefetch(cp + PREFETCH_STRIDE);
> + prefetch(cp + 2*PREFETCH_STRIDE);
> + prefetch(cp + 3*PREFETCH_STRIDE);
> + prefetch(cp + 4*PREFETCH_STRIDE);
> + } else
> + for (; cp < end; cp += PREFETCH_STRIDE)
> + prefetch(cp);
> #endif
> }
>
> -
Please test that len is a constant; otherwise the inlining is too large for the non-constant case.
Thank you
[-- Attachment #2: prefetch_range --]
[-- Type: text/plain, Size: 881 bytes --]
static inline void prefetch_range(void *addr, size_t len)
{
char *cp = addr;
char *end = addr + len;
if (__builtin_constant_p(len) && (len <= 5*PREFETCH_STRIDE)) {
if (len <= PREFETCH_STRIDE)
prefetch(cp);
else if (len <= 2*PREFETCH_STRIDE) {
prefetch(cp);
prefetch(cp + PREFETCH_STRIDE);
}
else if (len <= 3*PREFETCH_STRIDE) {
prefetch(cp);
prefetch(cp + PREFETCH_STRIDE);
prefetch(cp + 2*PREFETCH_STRIDE);
}
else if (len <= 4*PREFETCH_STRIDE) {
prefetch(cp);
prefetch(cp + PREFETCH_STRIDE);
prefetch(cp + 2*PREFETCH_STRIDE);
prefetch(cp + 3*PREFETCH_STRIDE);
}
else if (len <= 5*PREFETCH_STRIDE) {
prefetch(cp);
prefetch(cp + PREFETCH_STRIDE);
prefetch(cp + 2*PREFETCH_STRIDE);
prefetch(cp + 3*PREFETCH_STRIDE);
prefetch(cp + 4*PREFETCH_STRIDE);
}
}
else
for (; cp < end; cp += PREFETCH_STRIDE)
prefetch(cp);
}
^ permalink raw reply [flat|nested] 40+ messages in thread
* RE: Add prefetch switch stack hook in scheduler function
2005-07-29 7:07 ` Ingo Molnar
2005-07-29 8:30 ` Eric Dumazet
@ 2005-07-29 8:30 ` Chen, Kenneth W
2005-07-29 8:35 ` Ingo Molnar
2005-07-29 9:17 ` Peter Zijlstra
2 siblings, 1 reply; 40+ messages in thread
From: Chen, Kenneth W @ 2005-07-29 8:30 UTC (permalink / raw)
To: 'Ingo Molnar'
Cc: Keith Owens, David.Mosberger, Andrew Morton, linux-kernel,
linux-ia64
Ingo Molnar wrote on Friday, July 29, 2005 12:07 AM
> the patch below unrolls the prefetch_range() loop manually, for up to 5
> cachelines prefetched. This patch, ontop of the 4 previous patches,
> should generate similar code to the assembly code in your original
> patch. The full patch-series is:
It generates slightly different code, because the previous patch asks for
a little over 5 cache lines' worth of bytes, so it always goes to the for loop.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Add prefetch switch stack hook in scheduler function
2005-07-29 8:30 ` Add prefetch switch stack hook in scheduler function Chen, Kenneth W
@ 2005-07-29 8:35 ` Ingo Molnar
2005-07-29 8:39 ` Chen, Kenneth W
0 siblings, 1 reply; 40+ messages in thread
From: Ingo Molnar @ 2005-07-29 8:35 UTC (permalink / raw)
To: Chen, Kenneth W
Cc: Keith Owens, David.Mosberger, Andrew Morton, linux-kernel,
linux-ia64
* Chen, Kenneth W <kenneth.w.chen@intel.com> wrote:
> Ingo Molnar wrote on Friday, July 29, 2005 12:07 AM
> > the patch below unrolls the prefetch_range() loop manually, for up to 5
> > cachelines prefetched. This patch, ontop of the 4 previous patches,
> > should generate similar code to the assembly code in your original
> > patch. The full patch-series is:
>
> It generates slightly different code, because the previous patch asks for
> a little over 5 cache lines' worth of bytes, so it always goes to the for loop.
ok - fix below. But i'm not that sure we want to unroll a 6-instruction
loop, and it's getting a bit ugly. (Also, it would be nice to have a gcc
extension for loops that says 'always unroll up to N entries'.)
Ingo
----
unroll the prefetch loop some more.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
include/linux/prefetch.h | 16 ++++++++++++++++
1 files changed, 16 insertions(+)
Index: linux-prefetch-task/include/linux/prefetch.h
===================================================================
--- linux-prefetch-task.orig/include/linux/prefetch.h
+++ linux-prefetch-task/include/linux/prefetch.h
@@ -87,6 +87,14 @@ static inline void prefetch_range(void *
prefetch(cp + 2*PREFETCH_STRIDE);
prefetch(cp + 3*PREFETCH_STRIDE);
prefetch(cp + 4*PREFETCH_STRIDE);
+ }
+ else if (len <= 6*PREFETCH_STRIDE) {
+ prefetch(cp);
+ prefetch(cp + PREFETCH_STRIDE);
+ prefetch(cp + 2*PREFETCH_STRIDE);
+ prefetch(cp + 3*PREFETCH_STRIDE);
+ prefetch(cp + 4*PREFETCH_STRIDE);
+ prefetch(cp + 5*PREFETCH_STRIDE);
} else
for (; cp < end; cp += PREFETCH_STRIDE)
prefetch(cp);
@@ -125,6 +133,14 @@ static inline void prefetchw_range(void
prefetchw(cp + 2*PREFETCH_STRIDE);
prefetchw(cp + 3*PREFETCH_STRIDE);
prefetchw(cp + 4*PREFETCH_STRIDE);
+ }
+ else if (len <= 6*PREFETCH_STRIDE) {
+ prefetchw(cp);
+ prefetchw(cp + PREFETCH_STRIDE);
+ prefetchw(cp + 2*PREFETCH_STRIDE);
+ prefetchw(cp + 3*PREFETCH_STRIDE);
+ prefetchw(cp + 4*PREFETCH_STRIDE);
+ prefetchw(cp + 5*PREFETCH_STRIDE);
} else
for (; cp < end; cp += PREFETCH_STRIDE)
prefetchw(cp);
^ permalink raw reply [flat|nested] 40+ messages in thread
* RE: Add prefetch switch stack hook in scheduler function
2005-07-29 8:35 ` Ingo Molnar
@ 2005-07-29 8:39 ` Chen, Kenneth W
0 siblings, 0 replies; 40+ messages in thread
From: Chen, Kenneth W @ 2005-07-29 8:39 UTC (permalink / raw)
To: 'Ingo Molnar'
Cc: Keith Owens, David.Mosberger, Andrew Morton, linux-kernel,
linux-ia64
Ingo Molnar wrote on Friday, July 29, 2005 1:36 AM
> * Chen, Kenneth W <kenneth.w.chen@intel.com> wrote:
> > It generates slightly different code, because the previous patch asks for
> > a little over 5 cache lines' worth of bytes, so it always goes to the for loop.
>
> ok - fix below. But i'm not that sure we want to unroll a 6-instruction
> loop, and it's getting a bit ugly.
Yeah, I agree. We probably won't see a difference whether the loop is unrolled
or not.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Add prefetch switch stack hook in scheduler function
2005-07-29 8:30 ` Eric Dumazet
@ 2005-07-29 8:44 ` Ingo Molnar
2005-07-31 16:27 ` hashed spinlocks Daniel Walker
1 sibling, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2005-07-29 8:44 UTC (permalink / raw)
To: Eric Dumazet
Cc: Chen, Kenneth W, Keith Owens, David.Mosberger, Andrew Morton,
linux-kernel, linux-ia64
* Eric Dumazet <dada1@cosmosbay.com> wrote:
> Please test that len is a constant, or else the inlining is too large
> for the non constant case.
yeah. fix below.
Ingo
-----
noticed by Eric Dumazet: unrolling should be dependent on a constant
length, otherwise inlining gets too large.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
include/linux/prefetch.h | 128 ++++++++++++++++++++++++-----------------------
1 files changed, 66 insertions(+), 62 deletions(-)
Index: linux-prefetch-task/include/linux/prefetch.h
===================================================================
--- linux-prefetch-task.orig/include/linux/prefetch.h
+++ linux-prefetch-task/include/linux/prefetch.h
@@ -64,37 +64,39 @@ static inline void prefetch_range(void *
/*
 * Unroll aggressively:
*/
- if (len <= PREFETCH_STRIDE)
- prefetch(cp);
- else if (len <= 2*PREFETCH_STRIDE) {
- prefetch(cp);
- prefetch(cp + PREFETCH_STRIDE);
- }
- else if (len <= 3*PREFETCH_STRIDE) {
- prefetch(cp);
- prefetch(cp + PREFETCH_STRIDE);
- prefetch(cp + 2*PREFETCH_STRIDE);
- }
- else if (len <= 4*PREFETCH_STRIDE) {
- prefetch(cp);
- prefetch(cp + PREFETCH_STRIDE);
- prefetch(cp + 2*PREFETCH_STRIDE);
- prefetch(cp + 3*PREFETCH_STRIDE);
- }
- else if (len <= 5*PREFETCH_STRIDE) {
- prefetch(cp);
- prefetch(cp + PREFETCH_STRIDE);
- prefetch(cp + 2*PREFETCH_STRIDE);
- prefetch(cp + 3*PREFETCH_STRIDE);
- prefetch(cp + 4*PREFETCH_STRIDE);
- }
- else if (len <= 6*PREFETCH_STRIDE) {
- prefetch(cp);
- prefetch(cp + PREFETCH_STRIDE);
- prefetch(cp + 2*PREFETCH_STRIDE);
- prefetch(cp + 3*PREFETCH_STRIDE);
- prefetch(cp + 4*PREFETCH_STRIDE);
- prefetch(cp + 5*PREFETCH_STRIDE);
+ if (__builtin_constant_p(len) && (len <= 6*PREFETCH_STRIDE)) {
+ if (len <= PREFETCH_STRIDE)
+ prefetch(cp);
+ else if (len <= 2*PREFETCH_STRIDE) {
+ prefetch(cp);
+ prefetch(cp + PREFETCH_STRIDE);
+ }
+ else if (len <= 3*PREFETCH_STRIDE) {
+ prefetch(cp);
+ prefetch(cp + PREFETCH_STRIDE);
+ prefetch(cp + 2*PREFETCH_STRIDE);
+ }
+ else if (len <= 4*PREFETCH_STRIDE) {
+ prefetch(cp);
+ prefetch(cp + PREFETCH_STRIDE);
+ prefetch(cp + 2*PREFETCH_STRIDE);
+ prefetch(cp + 3*PREFETCH_STRIDE);
+ }
+ else if (len <= 5*PREFETCH_STRIDE) {
+ prefetch(cp);
+ prefetch(cp + PREFETCH_STRIDE);
+ prefetch(cp + 2*PREFETCH_STRIDE);
+ prefetch(cp + 3*PREFETCH_STRIDE);
+ prefetch(cp + 4*PREFETCH_STRIDE);
+ }
+ else if (len <= 6*PREFETCH_STRIDE) {
+ prefetch(cp);
+ prefetch(cp + PREFETCH_STRIDE);
+ prefetch(cp + 2*PREFETCH_STRIDE);
+ prefetch(cp + 3*PREFETCH_STRIDE);
+ prefetch(cp + 4*PREFETCH_STRIDE);
+ prefetch(cp + 5*PREFETCH_STRIDE);
+ }
} else
for (; cp < end; cp += PREFETCH_STRIDE)
prefetch(cp);
@@ -110,37 +112,39 @@ static inline void prefetchw_range(void
/*
 * Unroll aggressively:
*/
- if (len <= PREFETCH_STRIDE)
- prefetchw(cp);
- else if (len <= 2*PREFETCH_STRIDE) {
- prefetchw(cp);
- prefetchw(cp + PREFETCH_STRIDE);
- }
- else if (len <= 3*PREFETCH_STRIDE) {
- prefetchw(cp);
- prefetchw(cp + PREFETCH_STRIDE);
- prefetchw(cp + 2*PREFETCH_STRIDE);
- }
- else if (len <= 4*PREFETCH_STRIDE) {
- prefetchw(cp);
- prefetchw(cp + PREFETCH_STRIDE);
- prefetchw(cp + 2*PREFETCH_STRIDE);
- prefetchw(cp + 3*PREFETCH_STRIDE);
- }
- else if (len <= 5*PREFETCH_STRIDE) {
- prefetchw(cp);
- prefetchw(cp + PREFETCH_STRIDE);
- prefetchw(cp + 2*PREFETCH_STRIDE);
- prefetchw(cp + 3*PREFETCH_STRIDE);
- prefetchw(cp + 4*PREFETCH_STRIDE);
- }
- else if (len <= 6*PREFETCH_STRIDE) {
- prefetchw(cp);
- prefetchw(cp + PREFETCH_STRIDE);
- prefetchw(cp + 2*PREFETCH_STRIDE);
- prefetchw(cp + 3*PREFETCH_STRIDE);
- prefetchw(cp + 4*PREFETCH_STRIDE);
- prefetchw(cp + 5*PREFETCH_STRIDE);
+ if (__builtin_constant_p(len) && (len <= 6*PREFETCH_STRIDE)) {
+ if (len <= PREFETCH_STRIDE)
+ prefetchw(cp);
+ else if (len <= 2*PREFETCH_STRIDE) {
+ prefetchw(cp);
+ prefetchw(cp + PREFETCH_STRIDE);
+ }
+ else if (len <= 3*PREFETCH_STRIDE) {
+ prefetchw(cp);
+ prefetchw(cp + PREFETCH_STRIDE);
+ prefetchw(cp + 2*PREFETCH_STRIDE);
+ }
+ else if (len <= 4*PREFETCH_STRIDE) {
+ prefetchw(cp);
+ prefetchw(cp + PREFETCH_STRIDE);
+ prefetchw(cp + 2*PREFETCH_STRIDE);
+ prefetchw(cp + 3*PREFETCH_STRIDE);
+ }
+ else if (len <= 5*PREFETCH_STRIDE) {
+ prefetchw(cp);
+ prefetchw(cp + PREFETCH_STRIDE);
+ prefetchw(cp + 2*PREFETCH_STRIDE);
+ prefetchw(cp + 3*PREFETCH_STRIDE);
+ prefetchw(cp + 4*PREFETCH_STRIDE);
+ }
+ else if (len <= 6*PREFETCH_STRIDE) {
+ prefetchw(cp);
+ prefetchw(cp + PREFETCH_STRIDE);
+ prefetchw(cp + 2*PREFETCH_STRIDE);
+ prefetchw(cp + 3*PREFETCH_STRIDE);
+ prefetchw(cp + 4*PREFETCH_STRIDE);
+ prefetchw(cp + 5*PREFETCH_STRIDE);
+ }
} else
for (; cp < end; cp += PREFETCH_STRIDE)
prefetchw(cp);
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Add prefetch switch stack hook in scheduler function
2005-07-29 8:28 ` Ingo Molnar
@ 2005-07-29 9:02 ` Russell King
2005-07-29 9:45 ` Ingo Molnar
0 siblings, 1 reply; 40+ messages in thread
From: Russell King @ 2005-07-29 9:02 UTC (permalink / raw)
To: Ingo Molnar
Cc: Chen, Kenneth W, Keith Owens, David.Mosberger, Andrew Morton,
linux-kernel, linux-ia64
On Fri, Jul 29, 2005 at 10:28:26AM +0200, Ingo Molnar wrote:
> @@ -2872,10 +2878,10 @@ go_idle:
> /*
> * Prefetch (at least) a cacheline below the current
> * kernel stack (in expectation of any new task touching
> - * the stack at least minimally), and a cacheline above
> - * the stack:
> + * the stack at least minimally), and at least a cacheline
> + * above the stack:
> */
> - prefetch_range(kernel_stack(next) - MIN_KERNEL_STACK_FOOTPRINT,
> + prefetch_range(kernel_stack(next) - L1_CACHE_BYTES,
> MIN_KERNEL_STACK_FOOTPRINT + L1_CACHE_BYTES);
This needs to ensure that we don't prefetch outside the page of the
kernel stack - otherwise we risk weird problems on architectures
which support prefetching but not DMA cache coherency.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Add prefetch switch stack hook in scheduler function
2005-07-29 7:07 ` Ingo Molnar
2005-07-29 8:30 ` Eric Dumazet
2005-07-29 8:30 ` Add prefetch switch stack hook in scheduler function Chen, Kenneth W
@ 2005-07-29 9:17 ` Peter Zijlstra
2005-07-29 10:52 ` Ingo Molnar
2 siblings, 1 reply; 40+ messages in thread
From: Peter Zijlstra @ 2005-07-29 9:17 UTC (permalink / raw)
To: Ingo Molnar
Cc: Chen, Kenneth W, Keith Owens, David.Mosberger, Andrew Morton,
linux-kernel, linux-ia64
> ---------
> unroll prefetch_range() loops manually.
>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
>
> include/linux/prefetch.h | 31 +++++++++++++++++++++++++++++--
> 1 files changed, 29 insertions(+), 2 deletions(-)
>
> Index: linux/include/linux/prefetch.h
> ===================================================================
> --- linux.orig/include/linux/prefetch.h
> +++ linux/include/linux/prefetch.h
> @@ -58,11 +58,38 @@ static inline void prefetchw(const void
> static inline void prefetch_range(void *addr, size_t len)
> {
> #ifdef ARCH_HAS_PREFETCH
> - char *cp;
> + char *cp = addr;
> char *end = addr + len;
>
> - for (cp = addr; cp < end; cp += PREFETCH_STRIDE)
> + /*
> + * Unroll agressively:
> + */
> + if (len <= PREFETCH_STRIDE)
> prefetch(cp);
> + else if (len <= 2*PREFETCH_STRIDE) {
> + prefetch(cp);
> + prefetch(cp + PREFETCH_STRIDE);
> + }
> + else if (len <= 3*PREFETCH_STRIDE) {
> + prefetch(cp);
> + prefetch(cp + PREFETCH_STRIDE);
> + prefetch(cp + 2*PREFETCH_STRIDE);
> + }
> + else if (len <= 4*PREFETCH_STRIDE) {
> + prefetch(cp);
> + prefetch(cp + PREFETCH_STRIDE);
> + prefetch(cp + 2*PREFETCH_STRIDE);
> + prefetch(cp + 3*PREFETCH_STRIDE);
> + }
> + else if (len <= 5*PREFETCH_STRIDE) {
> + prefetch(cp);
> + prefetch(cp + PREFETCH_STRIDE);
> + prefetch(cp + 2*PREFETCH_STRIDE);
> + prefetch(cp + 3*PREFETCH_STRIDE);
> + prefetch(cp + 4*PREFETCH_STRIDE);
> + } else
> + for (; cp < end; cp += PREFETCH_STRIDE)
> + prefetch(cp);
> #endif
> }
>
code like that always makes me think of Duff's device:
http://www.lysator.liu.se/c/duffs-device.html
although it might be that the compiler generates better code from the
current incarnation; just my .02 ;-)
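For reference, a Duff's-device-style version of that unrolled loop might look like the sketch below (user-space, with GCC's `__builtin_prefetch` standing in for the kernel's `prefetch()`; the issued-hint count is returned only so the dispatch can be checked, and whether the compiler generates better code than the if/else ladder would need measuring):

```c
#include <stddef.h>

#define PREFETCH_STRIDE 128

/* Duff's-device-style unrolled prefetch over [addr, addr+len):
 * jump into the middle of a 6-way unrolled loop based on the
 * remainder, then finish whole groups of 6. Returns the number
 * of prefetch hints issued. */
static int prefetch_range_duff(const void *addr, size_t len)
{
	const char *cp = addr;
	size_t n = (len + PREFETCH_STRIDE - 1) / PREFETCH_STRIDE;
	size_t groups = (n + 5) / 6;
	int issued = (int)n;

	if (!n)
		return 0;
	switch (n % 6) {
	case 0: do {	__builtin_prefetch(cp), cp += PREFETCH_STRIDE;
	case 5:		__builtin_prefetch(cp), cp += PREFETCH_STRIDE;
	case 4:		__builtin_prefetch(cp), cp += PREFETCH_STRIDE;
	case 3:		__builtin_prefetch(cp), cp += PREFETCH_STRIDE;
	case 2:		__builtin_prefetch(cp), cp += PREFETCH_STRIDE;
	case 1:		__builtin_prefetch(cp), cp += PREFETCH_STRIDE;
		} while (--groups > 0);
	}
	return issued;
}
```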
regards,
--
Peter Zijlstra <a.p.zijlstra@chello.nl>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Add prefetch switch stack hook in scheduler function
2005-07-29 9:02 ` Russell King
@ 2005-07-29 9:45 ` Ingo Molnar
0 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2005-07-29 9:45 UTC (permalink / raw)
To: Russell King
Cc: Chen, Kenneth W, Keith Owens, David.Mosberger, Andrew Morton,
linux-kernel, linux-ia64
* Russell King <rmk+lkml@arm.linux.org.uk> wrote:
> On Fri, Jul 29, 2005 at 10:28:26AM +0200, Ingo Molnar wrote:
> > @@ -2872,10 +2878,10 @@ go_idle:
> > /*
> > * Prefetch (at least) a cacheline below the current
> > * kernel stack (in expectation of any new task touching
> > - * the stack at least minimally), and a cacheline above
> > - * the stack:
> > + * the stack at least minimally), and at least a cacheline
> > + * above the stack:
> > */
> > - prefetch_range(kernel_stack(next) - MIN_KERNEL_STACK_FOOTPRINT,
> > + prefetch_range(kernel_stack(next) - L1_CACHE_BYTES,
> > MIN_KERNEL_STACK_FOOTPRINT + L1_CACHE_BYTES);
>
> This needs to ensure that we don't prefetch outside the page of the
> kernel stack - otherwise we risk weird problems on architectures which
> support prefetching but not DMA cache coherency.
ok, agreed. Since kernel_stack(next) defaults to 'next', we go below
that structure which has unknown coherency attributes. I guess the
easiest solution would be to default kernel_stack(next) to '(void *)next
+ L1_CACHE_BYTES'? That way the default prefetching would happen for the
[next...next+2*L1_BYTES] range.
Ingo
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Add prefetch switch stack hook in scheduler function
2005-07-29 9:17 ` Peter Zijlstra
@ 2005-07-29 10:52 ` Ingo Molnar
0 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2005-07-29 10:52 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Chen, Kenneth W, Keith Owens, David.Mosberger, Andrew Morton,
linux-kernel, linux-ia64
* Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> code like that always makes me think of duffs-device
> http://www.lysator.liu.se/c/duffs-device.html
>
> although it might be that the compiler generates better code from the
> current incarnation; just my .02 ;-)
yeah, will do that. First wanted to see whether it's worth it.
Ingo
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Add prefetch switch stack hook in scheduler function
@ 2005-07-29 15:18 linux
2005-07-29 15:49 ` Ingo Molnar
0 siblings, 1 reply; 40+ messages in thread
From: linux @ 2005-07-29 15:18 UTC (permalink / raw)
To: linux-kernel; +Cc: mingo
> include/asm-alpha/mmu_context.h | 6 ++++++
> include/asm-arm/mmu_context.h | 6 ++++++
> include/asm-arm26/mmu_context.h | 6 ++++++
> include/asm-cris/mmu_context.h | 6 ++++++
> include/asm-frv/mmu_context.h | 6 ++++++
> include/asm-h8300/mmu_context.h | 6 ++++++
> include/asm-i386/mmu_context.h | 6 ++++++
> include/asm-ia64/mmu_context.h | 6 ++++++
> include/asm-m32r/mmu_context.h | 6 ++++++
> include/asm-m68k/mmu_context.h | 6 ++++++
> include/asm-m68knommu/mmu_context.h | 6 ++++++
> include/asm-mips/mmu_context.h | 6 ++++++
> include/asm-parisc/mmu_context.h | 6 ++++++
> include/asm-ppc/mmu_context.h | 6 ++++++
> include/asm-ppc64/mmu_context.h | 6 ++++++
> include/asm-s390/mmu_context.h | 6 ++++++
> include/asm-sh/mmu_context.h | 6 ++++++
> include/asm-sh64/mmu_context.h | 6 ++++++
> include/asm-sparc/mmu_context.h | 6 ++++++
> include/asm-sparc64/mmu_context.h | 6 ++++++
> include/asm-um/mmu_context.h | 6 ++++++
> include/asm-v850/mmu_context.h | 6 ++++++
> include/asm-x86_64/mmu_context.h | 5 +++++
> include/asm-xtensa/mmu_context.h | 6 ++++++
> kernel/sched.c | 9 ++++++++-
> 25 files changed, 151 insertions(+), 1 deletion(-)
I think this pretty clearly points out the need for some arch-generic
infrastructure in Linux. An awful lot of arch hooks are for one
or two architectures with some peculiarities, and the other 90% of
the implementations are identical.
For example, this is 22 repetitions of
#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
with one different case.
It would be awfully nice if there was a standard way to provide a default
implementation that was automatically picked up by any architecture that
didn't explicitly override it.
One possibility is to use #ifndef:
/* asm-$PLATFORM/foo.h */
#define MIN_KERNEL_STACK_FOOTPRINT IA64_SWITCH_STACK_SIZE
inline void
prefetch_task(struct task_struct const *task)
{
...
}
#define prefetch_task prefetch_task
/* asm-generic/foo.h */
#include <asm/foo.h>
#ifndef MIN_KERNEL_STACK_FOOTPRINT
#define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
#endif
#ifndef prefetch_task
inline void prefetch_task(struct task_struct const *task) { }
/* The #define is OPTIONAL... */
#define prefetch_task prefetch_task
#endif
But both understanding and maintaining the arch code could be
much easier if the shared parts were collapsed. A comment in the
generic versions can explain what the assumptions are.
If there are cases where there is more than one implementation with
multiple users, it can be stuffed into a third category of headers.
E.g. <asm-generic/noiommu/foo.h> and <asm-generic/iommu/foo.h> or some
such, using the same duplicate-suppression technique and #included at
the end of <asm-$PLATFORM/foo.h>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: Add prefetch switch stack hook in scheduler function
2005-07-29 15:18 linux
@ 2005-07-29 15:49 ` Ingo Molnar
0 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2005-07-29 15:49 UTC (permalink / raw)
To: linux; +Cc: linux-kernel
* linux@horizon.com <linux@horizon.com> wrote:
> I think this pretty clearly points out the need for some arch-generic
> infrastructure in Linux. An awful lot of arch hooks are for one or
> two architectures with some peculiarities, and the other 90% of the
> implementations are identical.
>
> For example, this is 22 repetitions of
> #define MIN_KERNEL_STACK_FOOTPRINT L1_CACHE_BYTES
>
> with one different case.
this just primes all the architectures so that they build. Every
architecture should then adjust these parameters. Also, since the
patches are not final yet i didn't try to widen them too much.
> It would be awfully nice if there was a standard way to provide a
> default implementation that was automatically picked up by any
> architecture that didn't explicitly override it.
that used to be the ARCH_HAS_* flags & macros - but these days we prefer
clean inline functions defined per arch, no ifdefs.
If there is something that is truly shared between all arches then an
asm-generic/*.h file can be generated for it, and included from most
arches. I don't think the changes i did will necessitate that.
Ingo
^ permalink raw reply [flat|nested] 40+ messages in thread
* hashed spinlocks
2005-07-29 8:30 ` Eric Dumazet
2005-07-29 8:44 ` Ingo Molnar
@ 2005-07-31 16:27 ` Daniel Walker
2005-07-31 18:46 ` David S. Miller
1 sibling, 1 reply; 40+ messages in thread
From: Daniel Walker @ 2005-07-31 16:27 UTC (permalink / raw)
To: Eric Dumazet; +Cc: linux-kernel
From 2.6.13-rc4, this hunk
+#else
+# define rt_hash_lock_addr(slot) NULL
+# define rt_hash_lock_init()
+#endif
Doesn't work with the following,
+ spin_unlock(rt_hash_lock_addr(i));
Because you're spin-locking a NULL. I would give a patch, but I'm not sure
what should be done in this case.
Daniel
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: hashed spinlocks
2005-07-31 16:27 ` hashed spinlocks Daniel Walker
@ 2005-07-31 18:46 ` David S. Miller
2005-07-31 19:06 ` Daniel Walker
0 siblings, 1 reply; 40+ messages in thread
From: David S. Miller @ 2005-07-31 18:46 UTC (permalink / raw)
To: dwalker; +Cc: dada1, linux-kernel
From: Daniel Walker <dwalker@mvista.com>
Date: Sun, 31 Jul 2005 09:27:55 -0700
> From 2.6.13-rc4, this hunk
>
> +#else
> +# define rt_hash_lock_addr(slot) NULL
> +# define rt_hash_lock_init()
> +#endif
>
> Doesn't work with the following,
>
> + spin_unlock(rt_hash_lock_addr(i));
>
>
> Because you're spin-locking a NULL. I would give a patch, but I'm not sure
> what should be done in this case.
That spinlock debugging code is such a pain in the butt,
nothing at all should be happening with spinlocks on
a non-SMP build.
We should just change the route.c ifdef to check for
CONFIG_DEBUG_SPINLOCK as well as CONFIG_SMP, in order
to fix this.
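A sketch of the suggested shape for that ifdef (hypothetical; the real lock-array declaration in net/ipv4/route.c is abbreviated here):

```c
/* net/ipv4/route.c -- sketch only: keep real per-bucket locks whenever
 * either SMP or spinlock debugging is enabled, so rt_hash_lock_addr()
 * never hands a NULL to debugging spinlock code. */
#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
static spinlock_t *rt_hash_locks;	/* allocated by rt_hash_lock_init() */
# define rt_hash_lock_addr(slot) (&rt_hash_locks[(slot) & (RT_HASH_LOCK_SZ - 1)])
#else
# define rt_hash_lock_addr(slot) NULL	/* spin_lock()/unlock() are no-ops */
# define rt_hash_lock_init()
#endif
```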
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: hashed spinlocks
2005-07-31 18:46 ` David S. Miller
@ 2005-07-31 19:06 ` Daniel Walker
2005-07-31 19:11 ` David S. Miller
0 siblings, 1 reply; 40+ messages in thread
From: Daniel Walker @ 2005-07-31 19:06 UTC (permalink / raw)
To: David S. Miller; +Cc: dada1, linux-kernel
On Sun, 2005-07-31 at 11:46 -0700, David S. Miller wrote:
> From: Daniel Walker <dwalker@mvista.com>
> Date: Sun, 31 Jul 2005 09:27:55 -0700
>
> > From 2.6.13-rc4, this hunk
> >
> > +#else
> > +# define rt_hash_lock_addr(slot) NULL
> > +# define rt_hash_lock_init()
> > +#endif
> >
> > doesn't work with the following:
> >
> > + spin_unlock(rt_hash_lock_addr(i));
> >
> > because you're spin-locking a NULL. I would send a patch, but I'm not sure
> > what should be done in this case.
>
> That spinlock debugging code is such a pain in the butt; nothing at all
> should be happening with spinlocks on a non-SMP build.
>
> We should just change the route.c ifdef to check for
> CONFIG_DEBUG_SPINLOCK as well as CONFIG_SMP, in order
> to fix this.
The ifdef that switches between the two rt_hash_lock_addr() versions
already checks for CONFIG_SMP or CONFIG_DEBUG_SPINLOCK. I was compiling
UP, so I didn't get either.
It seems like you'll need an rt_hash_lock(slot) that replaces the
spin_lock calls.
Daniel
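[Editor's note: a sketch of the wrapper Daniel suggests here: hide the slot-to-lock mapping behind rt_hash_lock()/rt_hash_unlock() so callers never form the (possibly NULL) lock address at all. The helper names and the update_slot() caller are hypothetical; the kernel kept the rt_hash_lock_addr() form instead.]

```c
#include <assert.h>
#include <stddef.h>

#define CONFIG_SMP 0                    /* model the UP build Daniel hit */

#if CONFIG_SMP
typedef struct { volatile int locked; } spinlock_t;
# define RT_HASH_LOCK_SZ 16
static spinlock_t rt_hash_locks[RT_HASH_LOCK_SZ];
/* SMP: the wrapper resolves the slot to a real lock and takes it. */
# define rt_hash_lock(slot) \
        spin_lock(&rt_hash_locks[(slot) & (RT_HASH_LOCK_SZ - 1)])
# define rt_hash_unlock(slot) \
        spin_unlock(&rt_hash_locks[(slot) & (RT_HASH_LOCK_SZ - 1)])
#else
/* UP: the wrappers vanish entirely; no lock address is ever formed,
 * so there is nothing NULL-shaped to trip over. */
# define rt_hash_lock(slot)   do { (void)(slot); } while (0)
# define rt_hash_unlock(slot) do { (void)(slot); } while (0)
#endif

/* Hypothetical caller: update one hash chain under its slot lock. */
int update_slot(int slot)
{
    rt_hash_lock(slot);
    /* ... modify the chain for this slot ... */
    rt_hash_unlock(slot);
    return slot;
}
```

The trade-off is that the address-returning form composes with any spin_lock-family primitive (spin_lock_bh, spin_lock_irqsave, ...), while the wrapper form would need one wrapper per variant.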
* Re: hashed spinlocks
2005-07-31 19:06 ` Daniel Walker
@ 2005-07-31 19:11 ` David S. Miller
2005-07-31 19:16 ` Daniel Walker
0 siblings, 1 reply; 40+ messages in thread
From: David S. Miller @ 2005-07-31 19:11 UTC (permalink / raw)
To: dwalker; +Cc: dada1, linux-kernel
From: Daniel Walker <dwalker@mvista.com>
Date: Sun, 31 Jul 2005 12:06:47 -0700
> The ifdef that switches between the two rt_hash_lock_addr() versions
> already checks for CONFIG_SMP or CONFIG_DEBUG_SPINLOCK. I was compiling
> UP, so I didn't get either.
>
> It seems like you'll need an rt_hash_lock(slot) that replaces the
> spin_lock calls.
spin_lock(x) and spin_unlock(x) are both no-ops in this case, so what
is the problem with passing in a NULL? The argument is arbitrary and
should just be ignored, right?
If both CONFIG_SMP and CONFIG_DEBUG_SPINLOCK are disabled, we
end up with these definitions in linux/spinlock.h
#define spin_lock(lock) _spin_lock(lock)
#define _spin_lock(lock) \
do { \
preempt_disable(); \
_raw_spin_lock(lock); \
__acquire(lock); \
} while(0)
#define _raw_spin_lock(lock) do { (void)(lock); } while(0)
What kind of warning do you get?
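[Editor's note: a minimal standalone model of the definitions David quotes, showing why spin_lock(NULL) is harmless on such a build: the lock pointer is only evaluated, never dereferenced. preempt_disable()/preempt_enable() are stubbed out here for illustration; only the _raw_spin_lock shape matches the quoted spinlock.h.]

```c
#include <assert.h>
#include <stddef.h>

/* Stubs standing in for the real preemption hooks. */
#define preempt_disable() do { } while (0)
#define preempt_enable()  do { } while (0)

/* The UP (!CONFIG_SMP, !CONFIG_DEBUG_SPINLOCK) shape from the quoted
 * linux/spinlock.h: the argument is cast to void and discarded. */
#define _raw_spin_lock(lock)   do { (void)(lock); } while (0)
#define _raw_spin_unlock(lock) do { (void)(lock); } while (0)

#define spin_lock(lock) \
        do { preempt_disable(); _raw_spin_lock(lock); } while (0)
#define spin_unlock(lock) \
        do { _raw_spin_unlock(lock); preempt_enable(); } while (0)

/* Demonstration: locking a NULL pointer expands to (void)(lock) and
 * cannot fault, because nothing ever reads through the pointer. */
int lock_null_is_harmless(void)
{
    void *lock = NULL;
    spin_lock(lock);
    spin_unlock(lock);
    return 1;
}
```

This is exactly David's point: on a UP build the argument exists only to keep the call sites identical to the SMP build, so its value is irrelevant.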
* Re: hashed spinlocks
2005-07-31 19:11 ` David S. Miller
@ 2005-07-31 19:16 ` Daniel Walker
0 siblings, 0 replies; 40+ messages in thread
From: Daniel Walker @ 2005-07-31 19:16 UTC (permalink / raw)
To: David S. Miller; +Cc: dada1, linux-kernel
On Sun, 2005-07-31 at 12:11 -0700, David S. Miller wrote:
> From: Daniel Walker <dwalker@mvista.com>
> Date: Sun, 31 Jul 2005 12:06:47 -0700
>
> > The ifdef that switched between the two rt_hash_lock_addr() switched on
> > for CONFIG_SMP or CONFIG_DEBUG_SPINLOCK . I was compiling UP , so I
> > didn't get either.
> >
> > Seems like you'll need to have an rt_hash_lock(slot) that replaces the
> > spin_lock calls ..
>
> spin_lock(x) and spin_unlock(x) are both no-ops in this case, so what
> is the problem with passing in a NULL? The argument is arbitrary and
> should just be ignored, right?
True.
> If both CONFIG_SMP and CONFIG_DEBUG_SPINLOCK are disabled, we
> end up with these definitions in linux/spinlock.h
>
> #define spin_lock(lock) _spin_lock(lock)
>
> #define _spin_lock(lock) \
> do { \
> preempt_disable(); \
> _raw_spin_lock(lock); \
> __acquire(lock); \
> } while(0)
>
> #define _raw_spin_lock(lock) do { (void)(lock); } while(0)
>
> What kind of warning do you get?
It was an RT kernel, which isn't mainline. You're right, it shouldn't be
a problem.
Daniel
end of thread, newest: 2005-07-31 19:18 UTC
Thread overview: 40+ messages
2005-07-27 22:07 Add prefetch switch stack hook in scheduler function Chen, Kenneth W
2005-07-27 23:13 ` Andrew Morton
2005-07-27 23:23 ` david mosberger
2005-07-28 7:41 ` Ingo Molnar
2005-07-28 8:09 ` Keith Owens
2005-07-28 8:16 ` Ingo Molnar
2005-07-28 9:09 ` Ingo Molnar
2005-07-28 19:14 ` Chen, Kenneth W
2005-07-29 7:04 ` Ingo Molnar
2005-07-29 7:07 ` Ingo Molnar
2005-07-29 8:30 ` Eric Dumazet
2005-07-29 8:44 ` Ingo Molnar
2005-07-31 16:27 ` hashed spinlocks Daniel Walker
2005-07-31 18:46 ` David S. Miller
2005-07-31 19:06 ` Daniel Walker
2005-07-31 19:11 ` David S. Miller
2005-07-31 19:16 ` Daniel Walker
2005-07-29 8:30 ` Add prefetch switch stack hook in scheduler function Chen, Kenneth W
2005-07-29 8:35 ` Ingo Molnar
2005-07-29 8:39 ` Chen, Kenneth W
2005-07-29 9:17 ` Peter Zijlstra
2005-07-29 10:52 ` Ingo Molnar
2005-07-29 7:22 ` Chen, Kenneth W
2005-07-29 7:45 ` Keith Owens
2005-07-29 8:02 ` Chen, Kenneth W
2005-07-29 8:28 ` Ingo Molnar
2005-07-29 9:02 ` Russell King
2005-07-29 9:45 ` Ingo Molnar
2005-07-29 7:38 ` Keith Owens
2005-07-29 8:08 ` Chen, Kenneth W
2005-07-28 8:31 ` Nick Piggin
2005-07-28 8:35 ` Ingo Molnar
2005-07-28 8:48 ` Nick Piggin
2005-07-28 9:16 ` Ingo Molnar
2005-07-28 9:19 ` Ingo Molnar
2005-07-28 9:34 ` Nick Piggin
2005-07-28 10:04 ` Ingo Molnar
2005-07-28 10:29 ` Nick Piggin
-- strict thread matches above, loose matches on Subject: below --
2005-07-29 15:18 linux
2005-07-29 15:49 ` Ingo Molnar